OWL-DL Ontology for Classification of Lipids 1 Hong-Sang Low1, Christopher J.O. Baker2, Alexander Garcia3, Markus R. Wenk1, National University of Singapore, Singapore; 2University of New Brunswick, Saint John, Canada; 3 University of Bremen, Bremen, Germany. Abstract Lipids can be systematically classified according to functional properties, structural features, biochemical origin or biological system. However Lipid nomenclature has yet to become a robust research tool since no rigorous definitions exist for membership of specific lipid classes. Lipids need to be defined in a manner that is systematic yet at the same time semantically explicit. We report the reuse of existing lipid nomenclature, ontology describing chemical structure and the extension of the OWL-DL Lipid Ontology in order to support the classification of lipid molecules. We applied class definitions, DLaxioms, to describe lipid classes and illustrate suitability of the ontology for the classification of Fatty Acyl lipids and Mycolic acids. Introduction IUPAC-IUBMB proposed a systematic nomenclature for lipids which received limited adoption by the lipid community2. The proposed classification was complicated and prone to erroneous application by scientists2. Moreover the naming scheme was not extended and does not adequately represent many novel lipid classes discovered in the recent decades. As a result lipids still lack systematic classification and a nomenclature that is universally adopted by the biomedical research community. The LIPIDMAPS consortium aims to solve this problem by developing a scientifically robust, comprehensive and extensible classification system adapted from the IUPAC nomenclature2. This classification scheme organizes lipids from different phylogeny and synthetic domains. Still, uptake by the scientific community has been gradual and lipid literature is steeped with instances of lipid synonyms that do not follow the new nomenclature. In addition to that, LIPIDMAPS nomenclature is still a cumbersome naming scheme. primarily used is-a relationships3 to categorize lipids and definitions for lipid classes in LIPIDMAPS remained implicit. Moreover they are often dependent on a chemical diagram in the form a molecular graphic file that can only be accurately classified by a trained lipid expert. No rigorous definition, independent of a graphical diagram, exists and the graphical definitions are not flexible, nor are they extensible. Changes in such definitions require the redrawing of the chemical diagram/definition. Subsequently, communicating, storing and transferring of such structural definitions in the current format is inefficient and there is much reliance on trained experts. There is therefore a need for lipids to be defined in a manner that is systematic and explicit (following LIPIDMAPS hierarchical structure). A rigorous definition would involve a minimal necessary and sufficient declaration for class membership that could adequately describe a lipid without a molecular structure diagram being necessary. Description logics (DL) describes a domain using class descriptions according to a logic based semantic previously used to represent chemical knowledge4,5. Using DL, it is possible to define a lipid with necessary conditions where epoxy fatty acid is defined as a lipid that at least have a carboxylic acid group and an epoxy group. Moreover, we can define with necessary and sufficient conditions such where epoxy fatty acid is defined as a lipid that can only have epoxy group and carboxylic acid group. Any molecules that have functional groups other than epoxy group and carboxylic acid group cannot be considered as an epoxy fatty acid. O OH O Figure 1. An example of epoxy fatty acid Hierarchical Classification of Lipid Nomenclature Lipids are organic compounds and can be systematically classified in many ways such as using atomic connectivity, physicochemical properties, presence of functional groups, or the type of bioactivities. Albeit an important contribution, the LIPIDMAPS central repository of lipids has The Lipid Ontology The first reported ontology was exclusively developed to conceptualize and capture knowledge in the domain of lipids through the use of concepts, relations, instances and constraints on concepts6. It is a resource that provides a common terminology for the lipid domain and a basis for interoperability between information systems. It had also been used to support navigation of lipid literature in conjunction with text mining. Full details of the initial conceptualization are given in earlier publication6. This ontology has been extended to describe the LIPIDMAPS nomenclature classification explicitly using description logics (OWL-DL) and to support reasoning and inference tasks. Prior to extending the ontology for classification tasks we reviewed existing ontologies for reusable components. We reviewed the Chemical Ontology7 for reuse of the Organic_Group hierarchy and the Lipid Ontology was enriched with 32 functional groups from Chemical Ontology. An additional 63 new concepts were added into the Organic_Group super-concept in the current iteration of Lipid Ontology. The Organic_Group hierarchy was reorganized and asserted with new is-a relationships. Moreover, in order to describe the lipids with complex chemical moieties, we rename the Organic_Group concept into Simple_Organic_Group and position it together with newly created Complex_Organic_Group and Chain_Group concepts under a new Organic_Group concept. In addition to that, we have also created the new Ring_System concept to describe lipids with ring structure. In Chemical Ontology, Organic_Compound are concepts with hasPart relationship to concepts under Organic_Group. The same property is used in Lipid Ontology to relate concepts under Lipid to concepts under Organic_Group. Inversely, an inverse property partOf is used to relate concepts under Organic_Group with concepts under Lipid. For the latest iteration of Lipid Ontology, we added 400 DL definitions into the ontology for all lipid classes, except for Polyketides. (See Table 1 for complete summary of Lipid Ontology) Functional Groups used in Lipid Classification Lipids are complicated biomolecules and most lipids can only be adequately classified with more than one distinct functional group. Lipids are defined by multiple sets of organic groups and these definitions are used to restrict the membership of individual lipids to specific classes of lipids. DL rules with greater complexity than described in Chemical Ontology7 are needed to describe lipids. Primarily we re-use, from the Chemical Ontology, the axiom “Organic_Compound hasPart Organic_Group” to relate Lipid concepts to Organic_Group concepts. We then define in Lipid Ontology, two concepts to describe lipid structures, namely Organic_Group and Ring_System. Organic_Group has the sub-groups (i) Simple_Organic_Group, (ii) Complex_Organic_Group, (iii) Chain_Group. Simple_Organic_Group subsumes concepts that describe basic functional groups whereas Complex_Organic_Group encapsulates glycans and amino acids. Glycans, in particular, are used to classify lipids such as sacharrolipid, and other sugar- linked lipids such as sphingolipids. Chain_Group consists of the Carbon_Chain_Group and the Sphingoid_Base_Chain_Group. The Sphingoid_Base_Chain_Group is used exclusively for Sphingolipid whereas Carbon_Chain_Group is applied to other lipid classes accordingly. The Ring_System consists of (i) Isoprenoid_ring_derivative, (ii) Monocyclic_Ring_ Group and (iii) Polycyclic_Ring_System. These concepts are used to define lipids that have at least one or more rings. Specifically, they are used mainly for Sterol_Lipid, Prenol_Lipid and other ring lipids. In the Lipid Ontology these concepts are extensively used to provide the necessary structural descriptions to define the identity of lipid-based compounds. Subject Total No. of Classes Primitive Class Defined Class No. of Lipid Classes No. of Lipid Classes with DL axioms Total No. of Restrictions Total No. of Properties DL Expressivity No. 715 447 268 428 400 901 46 ALCHIQ(D) Table 1. Summary of the current Lipid Ontology Hierarchical Classification of Lipids Lipid concepts are organized hierarchically with the super-classes restricted by generic necessary conditions. More specific necessary conditions are used to define membership requirements for sub classes of lipid. At the end of hierarchy, lipid classes are restricted by necessary and sufficient conditions and closure axioms. Super-classes are not closed by closure axiom to avoid inconsistency among disjointed sibling classes. More specific lipid classes are defined this way. The first involves specifying the subclass of the present class to restrict the definition of a lipid. Necessary conditions such a “hasPart some Carboxylic_Acid_ derivative_Group” can be further specified by the subclass of Carboxylic_Acid_ Derivative_Group, eg. an aldehyde. The second approach uses a Cardinality Axiom. The axiom can be applied to concepts at any level. Once declared, the cardinality axiom restricts the number of a particular concept to be allowed in a restriction. Lipid classes can be defined by the number of certain organic group concept or Chain_ Group concept. For example, Triacylglycerol is an Acylglycerol with 3 acyl chains. Its superclass is restricted with an existential axiom “has some Acyl_ Chain”. This is further specified with the following cardinality axiom “hasAcyl_Chain exactly 3”. DL Axioms for the definition of Fatty_Acyl Fatty acyls are a diverse group synthesized by chainelongation of an acetyl-CoA primer with malonylCoA/methylmalonyl-CoA groups2. We define a Fatty_Acyl as a lipid that has at least one Carboxylic_Acid_derivative_Group and at least one Acyl_Chain. Docosanoid is a subclass of Fatty_Acyl that inherits from Fatty_Acyl, the Carboxylic_Acid_ derivative_Group as well as Acyl_Chain. The Carboxylic_Acid_derivative_Group in Docosanoid is further specified to be a Carboxylic_ Acid, whereas the Acyl_Chain of is specified with a cardinality axiom and the property hasAcyl_ Chain. Consequently, Docosanoid is defined to have only 1 Acyl_Chain. Moreover, Docosanoid has multiple and distinct functional groups such as Carboxylic_Acid, Alkenyl_Group, Alcohol and Cyclopentenone. These functional groups are made to relate with Dosocanoid via the property “hasPart” in conjuction with the existential axiom “some”. A closure axiom is needed to restrict the type of relationship constraints allowed for a lipid class. The closure axiom is applied to Docosanoids so that lipids of this class can only have the following functional groups, namely, Carboxylic_Acid, Alkenyl_Group, Alcohol, Cyclopentenone and Acyl_Chain. (see Figure 2) DL Axioms for unusual Fatty_Acyl classes As LIPIDMAPS nomenclature classifies lipids based on their chemical structure or biosynthetic origin, lipids such as Fatty_alcohol, Fatty_Nitrille, Fatty_ether and Hydrocarbon are classified as a member of Fatty_Acyl although they do not have an Acyl_Group. We reconcile this by expanding the definition of Fatty_Acyl to include Alkyl_Chain, a characteristic structure of these un-usual Fatty_Acyl classes. A Fatty_alcohol inherits an Alkyl_Chain from Fatty_Acyl and is further defined to have only 1 Alkyl_Chain in the necessary and sufficient condition. This necessary and sufficient condition also includes a “hasPart” property that connects Fatty_alcohol to an Alcohol class. This definition enables us to include lipids without an Acyl_Group as a member of Fatty_Acyl (see Figure 2). In addition we create a new lipid class, namely Fatty_Acyl_ derivative, a subclass of Fatty _Acyl. Extension of Mycolic Acid Class Mycolic acids are a key component of the cell wall of Mycobacterium tubeculosis sps. and are implicated mycobacterial disease. By 1998 there existed 500 or more known chemical structures of related mycolates8 and yet LIPIDMAPS currently contains only 3 mycolic acid records. Consequently many mycolic acids with known structures have yet to be systematically classified. Classification of these lipids is critical for system-level analysis of mycobacterial pathogenesis. We illustrate extension of the ontology to include new Mycolic_Acid classes and demonstrate the assignment of an alpha mycolate (see Figure 1) to the appropriate Lipid class. Based on LIPIDMAPS nomenclature, we assign Mycolic_acid as a member of Fatty_Acid and extend Mycolic_acid classification with 9 defined subclasses (see Table 2). Figure 2. DL definitions of Docosanoid, Fatty alcohol, Alpha mycolic acid These are distributed among five primitive classes. General_mycolic_acid, General_ methylated_ mycolic_ acid, General_ alpha_ mycolic_ acid, Oxygenated_mycolic_acid, General_ methoxy_ mycolic_ acid. classification of lipids. Ontology driven classification has been applied to proteins9 and small molecules7 through the coordination of protein domain or pharmacophore analysis, OWL-DL ontology, an instance store, and DL reasoning. By adding precisely defined DL-axioms to the lipid ontology we can apply as smiliar approach for the automated classification of lipids. Our approach is extensible to accommodate novel lipids and we have extended the use of DL-axioms to classify all lipid classes (except for polyketides). In support of mycobacterial diseases it is imperative to further extend lipid nomenclature and classification. Our goal is to make available formalized DL definitions of lipids organized according to LIPIDMAPS nomenclature and to establish a reusable standard for the lipid researchers and the bioinformatics community interested in lipidomics . Acknowledgements Table 2. Known classes of mycolic acid Alpha mycolic acid is a mycolic acid that has carboxylic acid, cyclopropane and an alpha-hydroxyl acid group. The carboxylic acid group is a member of the acyl group and it is not an ester group. Therefore, according to the classification scheme below, alpha mycolic acid must be a member of Fatty_Acyl. Among members of Fatty_Acyl, only Octadecanoid, Docosanoid, Eisocsanoid and Fatty_Acid have Carboxylic_Acid. Alpha_mycolic_acid does not have a cycloketone group and therefore, it cannot be Docosanoid, Eicosanoid or Octadecanoid. Therefore, it is a member of Fatty_Acid. Among members of Fatty_Acid, only Mycolic_acid has Alpha-Hydroxy _Acid_Group and a Meromycolic_Chain. Therefore, alpha mycolic acid is classified under this class of Fatty_Acid. Because Alpha_mycolic_acid is the only class that accepts mycolic acid with Cyclopropane, the lipid example in Figure 1 is classified as a member of Alpha_mycolic_acid. (see Figure 2) NUS Office of Life Science (R-183-000-607-712), ARF (R-183-000-160-112), BMRC A*STAR (R-183-000-134305), New Brunswick Innovation Fund, Canada. References 1. 2. 3. 4. 5. 6. Conclusion Lipid research is increasingly integrated within systems level biology such as lipidomics1 where lipid classification is required before appropriate annotation of chemical functions can be applied. In this paper we have sought to address the ongoing challenge of classifying lipids through the adoption of W3C standard knowledge representation and the application of DL axioms. We initiated the process of defining lipids according to appropriate functional groups with the intent of using the ontology for 7. 8. 9. Wenk MR: The emerging field of Lipidomics. Nat. Rev. Drug Discov. 2005, 4: 594-610. Fahy E, etal. A comprehensive classification system for lipids. J. Lipid Res. 2005, 46: 839-62. Sud M, Fahy E, Cotter D, Brown A, Dennis EA, etal. LMSD: LIPID MAPS structure database. Nucl. Acids Res. 2007, 35: D527-32. Baader FI, Horrocks I, Sattler U: Description logics as ontology languages for the semantic web. In Festschrift Jörg Siekmann, LNAI 2003. Villanueva-Rosales N, Dumontier M: Describing chemical functional groups in OWL-DL for the classification of chemical compounds. OWLED 2007, co-loc. ESWC2007 Innsbruck, Austria. Baker CJO, etal.: Towards ontology-driven navigation of the lipid bibliosphere. BMC Bioinformatics. 2008, 9 (Suppl 1):S5. Feldman HJ, etal: CO: A Chemical Ontology for Identification of Functional Groups and Semantic Comparison of Small Molecules. FEBS Letters. 2005, 579:4685-4691. Barry CE 3rd, Lee RE, Mdluli K,etal.: Mycolic acids: structure, biosynthesis and physiological functions. Prog. Lip. Res. 1998, 37:143-179. Wolstencroft K, etal: Protein classification using ontology classification. Bioinf. 2006, 22: e530-8.
© Copyright 2026 Paperzz