Identifying and Resolving Inconsistencies in Biological Pathway Resources

Lucy L Wang†, John H Gennari, Neil F Abernethy
Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA
† Corresponding author: [email protected]

Overview

• Biological pathways are useful tools for studying genetic and molecular interactions
• Integrating knowledge from multiple pathway resources allows us to take advantage of the strengths of different resources and gives us access to more complementary data, helping to improve our analysis and understanding of biology
• Inconsistencies exist between popular pathway knowledge bases, making integration difficult
• Although standards in pathway exchange (BioPAX, SBML, etc.) exist, there are still differences in knowledge representation and content

Typology of Mismatches

• Annotation: different entity string names and/or identifiers
• Existence: missing or extraneous physical entities, reactions, or relationships
• Reaction semantics: different representation of participants, direction, and stoichiometry
• Granularity: entities or processes represented in different degrees of detail
• Assertion: a resource explicitly contradicts a statement in another resource
• Level of evidence: different external citations are used to support a statement

Glycolysis Pathway in Four Resources

[Figure: the glycolysis pathway as drawn in four resources, panels numbered 1 to 4. Entities related to the reaction by the BioPAX left property are red, and entities related by the BioPAX right property are green.]

Example Mismatches in One Reaction

Existence
  Is H+ part of the equation?

Semantics
  LEFT / RIGHT versus RIGHT / LEFT
  Left or right? Reactant or product?

Annotation
  ATP: ChEBI:30616 or ChEBI:15422
  pyruvate: ChEBI:15361 or ChEBI:32816
  Which is the correct identifier?

Assertion
  2 ATP + Pyruvate OR 1 ATP + Pyruvate
  Which is correct?

Granularity
  One reaction or two?

Evidence
  Are citations… Contemporary or historical? Few or many?

*author example

References & Acknowledgements

1. P. Romero, J. Wagg, M. Green, D. Kaiser, M. Krummenacker, and P. Karp. Computational prediction of human metabolic pathways from the complete human genome. Genome Biology, 6(R2):1–17, 2004.
2. M. Kanehisa and S. Goto. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 28:27–30, 2000.
3. P. Thomas, M. Campbell, A. Kejariwal, et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res, 13:2129–2141, 2003.
4. D. Croft, A. Mundo, R. Haw, et al. The Reactome pathway knowledgebase. Nucleic Acids Res, 42(Database issue):D472–477, 2013.

This research was funded by the NLM under training grant T15LM007442. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Future Work

• Identify inconsistencies in exemplar resources. We have begun by identifying annotation mismatches between two resources: HumanCyc and Reactome.
• Perform computational alignment of portions of pathway resources using the typology of mismatches as guidance
• Use known mismatches to identify areas of uncertainty in existing resources and guide integration of content for pathway analysis applications

Matches from HumanCyc proteins to Reactome proteins

                       Name match   No name match   Total
Identifier match             1264             759    2023
No identifier match            55             659     714
Total                        1319            1418    2737
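A two-by-two table like the HumanCyc to Reactome match counts can be tallied with a few set lookups. The poster does not describe how names or identifiers were compared, so the sketch below is only one plausible approach: it assumes each protein carries a single name and a single cross-reference identifier, and that names are compared after lowercasing and collapsing whitespace. The `Protein` type, the `normalize` and `match_table` helpers, and all records are hypothetical and invented for illustration.

```python
from collections import Counter
from typing import NamedTuple


class Protein(NamedTuple):
    name: str        # display name as given by the resource
    identifier: str  # cross-reference accession (placeholder values below)


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(name.lower().split())


def match_table(source, target):
    """Count source proteins by (identifier match, name match) against target."""
    target_names = {normalize(p.name) for p in target}
    target_ids = {p.identifier for p in target}
    counts = Counter()
    for p in source:
        id_match = p.identifier in target_ids
        name_match = normalize(p.name) in target_names
        counts[(id_match, name_match)] += 1
    return counts


# Toy records, invented for illustration only (not data from the poster).
humancyc = [
    Protein("Pyruvate kinase", "ACC001"),
    Protein("hexokinase 1", "ACC002"),
    Protein("Enolase 1", "ACC900"),
    Protein("made-up protein", "ACC901"),
]
reactome = [
    Protein("pyruvate kinase", "ACC001"),
    Protein("HK1", "ACC002"),
    Protein("enolase 1", "ACC003"),
]

table = match_table(humancyc, reactome)
for id_match in (True, False):
    row = [table[(id_match, name_match)] for name_match in (True, False)]
    label = "Identifier match" if id_match else "No identifier match"
    print(f"{label:<20} name match: {row[0]}  no name match: {row[1]}")
```

The off-diagonal cells are the ones worth inspecting: an identifier match without a name match suggests a naming (annotation) difference, while a name match without an identifier match may point to differing cross-references or to distinct entities that happen to share a name.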