ontoProc - Ontology interfaces for Bioconductor

Author(s): Sara H Stankiewicz, Vincent James Carey

Affiliation(s): Brigham and Women's Hospital, Channing Division of Network Medicine

Social media: https://www.linkedin.com/in/sara-stankiewicz-bb992135/

The goal of the ontoProc package is to advance the adoption and application of ontological discipline in Bioconductor-oriented data analysis. The package currently provides 14 formal ontologies in RDA format, covering vocabularies and concept relationships in the domains of human anatomy, proteins, human diseases, chemicals, and more. Three major barriers stand in the way of adopting biological ontologies. First, there is often a gap between a concept of interest and the terms available in ontologies. Second, there is the practical problem of decoding ontology identifiers: a GO tag or a CL tag is convenient for programming, but it is clumsy to keep each tag paired with its natural-language term or phrase. Third, adopters are likely to disagree about the terms to use for conditions observed at the edge of knowledge. To address the first problem, ontoProc provides a function called liberalMap, which searches an ontology for one or more terms and returns a data frame of the results. For the second problem, we introduce helpers for ontology visualization and term mapping derived from the ontologyIndex suite of Daniel Greene. The third problem is hard to avoid and requires community engagement to resolve disputes. We hope that future editions of the ontoProc package will help genomic data scientists take advantage of ontologies more readily in their analyses.
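
To make the idea of liberal term mapping concrete, here is a minimal sketch in Python of searching a tiny ontology for query terms and returning one row per match. It is not the ontoProc implementation, and it is not R. The identifiers, labels, function name `liberal_search`, and the `fuzzy` and `cutoff` parameters are all hypothetical, chosen only for illustration. Approximate matching uses `difflib` from the Python standard library.

```python
import difflib

# Toy stand-in for an ontology: identifier -> label.
# These identifiers and labels are invented for illustration only.
TOY_ONTOLOGY = {
    "TOY:0001": "T cell",
    "TOY:0002": "B cell",
    "TOY:0003": "natural killer cell",
    "TOY:0004": "regulatory T cell",
}


def liberal_search(terms, ontology, fuzzy=False, cutoff=0.8):
    """Return one row (a dict) per match of each query term against the labels.

    A label matches when it contains the query as a case-insensitive
    substring. With fuzzy=True, a label also matches when one of its words
    is approximately equal to the query (similarity >= cutoff).
    """
    rows = []
    for term in terms:
        query = term.lower()
        for ident, label in ontology.items():
            lowered = label.lower()
            hit = query in lowered
            if not hit and fuzzy:
                # Compare the query against each word of the label.
                close = difflib.get_close_matches(
                    query, lowered.split(), n=1, cutoff=cutoff
                )
                hit = bool(close)
            if hit:
                rows.append({"query": term, "id": ident, "label": label})
    return rows


# Example: exact substring search, then a misspelled query with fuzzy matching.
exact_rows = liberal_search(["t cell"], TOY_ONTOLOGY)
fuzzy_rows = liberal_search(["killr"], TOY_ONTOLOGY, fuzzy=True)
```

Each returned row pairs the query with both the identifier and its natural-language label, which also speaks to the second barrier: the identifier stays available for programming while the label stays visible to the analyst.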