Late breaking poster abstracts

Odd-numbered posters will be presented on Monday, 8th April, and even-numbered posters on Tuesday, 9th April.

Posters 195 - 200.

195 Integrating ontologies in comparative analyses of trait evolution

Wasila Dahdul (University of South Dakota), James Balhoff (UNC Chapel Hill), Hilmar Lapp (Duke University), Paula Mabee (University of South Dakota), Sergei Tarasov (Virginia Tech), Josef Uyeda (Virginia Tech), and Todd Vision (UNC Chapel Hill)

Corresponding author: Wasila Dahdul (University of South Dakota) ; wasila.dahdul@usd.edu

Understanding the patterns and drivers of trait evolution is a fundamental aim of evolutionary biology, and sophisticated methods and tools exist for comparative trait analysis. Ontologies encode knowledge about how anatomical structures and phenotypes are related, which is crucial information for modeling trait evolution but has not previously been accessible to comparative analysis tools. Phenoscape, in its current Semantic Comparative Analyses for Trait Evolution (SCATE; http://scate.phenoscape.org) project, is developing tools that use the computable knowledge in ontologies to improve models of trait evolution. These new capabilities will enable researchers to use R packages such as RPhenoscape to access a knowledgebase of computable phenotypes (kb.phenoscape.org) and build trait matrices based on anatomical dependencies and the semantic similarity of phenotype descriptions across taxa. By using these tools to combine and analyze trait data from different studies, users can better model trait evolution and generate hypotheses about the drivers of trait adaptation.

 

196 Ontology based text mining of gene-phenotype associations: application to candidate gene prediction

Şenay Kafkas and Robert Hoehndorf

Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal 23955, Kingdom of Saudi Arabia

Gene-phenotype associations play an important role in understanding disease mechanisms, which is a requirement for treatment development. A portion of gene-phenotype associations are mainly observed experimentally and made publicly available through several standard resources such as MGI. However, a vast number of gene-phenotype associations remain buried in the biomedical literature. Given the large amount of literature data, we need automated text mining tools to alleviate the burden of manual curation of gene-phenotype associations and to develop comprehensive resources. In this study, we present an ontology-based approach, combined with statistical methods, to text mine gene-phenotype associations from the literature. Our method achieved AUC values of 0.90 and 0.75 in recovering known gene-phenotype associations from HPO and MGI respectively. We posit that candidate genes and their relevant diseases should be expressed with similar phenotypes in publications. Thus, we demonstrate the utility of our approach by predicting disease candidate genes based on the semantic similarities of phenotypes associated with genes and diseases. To the best of our knowledge, this is the first study using an ontology-based approach to extract gene-phenotype associations from the literature. We evaluated our disease candidate prediction model on the gene-disease associations from MGI. Our model achieved AUC values of 0.90 and 0.87 on OMIM (human) and MGI (mouse) datasets of gene-disease associations respectively. Our manual analysis of the text-mined data revealed that our method can accurately extract gene-phenotype associations which are not currently covered by the existing public gene-phenotype resources. Overall, the results indicate that our method can precisely extract known as well as new gene-phenotype associations from the literature. All the data and methods are available at https://github.com/bio-ontology-research-group/genepheno.
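As an illustration of the AUC metric reported above, the following is a minimal sketch of how an area under the ROC curve can be computed from ranked predictions. It is not the authors' evaluation code; the function name roc_auc and the scores and labels are invented for illustration only.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical association scores; label 1 marks a known association.
scores = [0.9, 0.8, 0.7, 0.4, 0.3]
labels = [1, 0, 1, 1, 0]
auc = roc_auc(scores, labels)
```

An AUC of 1.0 would mean every known association is ranked above every non-association; 0.5 corresponds to a random ranking.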

 

197 Harnessing community expertise to fight plant disease

Helder Pedro, Nishadi De Silva, Manuel Carbajo, Andy Yates

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom

Plant pathogens continue to threaten food security and have a significant economic impact. Combating them requires an understanding of gene function and pathogen-host interactions, underpinned by accurate and comprehensive annotation of genomic sequence.

The collection of pathogenic species within Ensembl (bacteria, fungi and protists) has continued to grow, accumulating genomic, transcriptomic, variation, pathogen-host interaction and comparative data. However, it became clear that some key plant pathogens had incomplete and inconsistent gene sets. Through PhytoPath, a BBSRC-funded project in collaboration with Rothamsted Research, we were able to initiate successful community annotation projects to rectify this problem. In the first instance, we facilitated the collaboration of over 40 members of the Botrytis cinerea community spread across 8 countries. Using the gene editing tool Apollo, web-based training and infrastructure support from Ensembl Fungi, this group systematically reviewed the entire gene set. The new gene set was checked and disseminated via Ensembl Fungi.

Following this, a similar project was completed for another important pathogen, Blumeria graminis, also leading to a refined gene set and a publication. Currently, we are engaging with the community working on Zymoseptoria tritici (a growing threat to wheat) to enable them to achieve the same.

While the tangible outcome of these projects appears to be a better gene set, it is apparent that the inherent agreement and dialogue among community members as they undergo this curation process is also consequential to the acceleration of the field. With increasing volumes of easily-accessible data, the role of unifying both the divergent data sets and the associated research teams is paramount. We anticipate that by continuing to be a hub for community-driven efforts, the microbial portals within Ensembl can pave the way for a new kind of collaboration among geographically dispersed pathogen research communities.

 

198 Towards an Ontologies Mapping Service

Ian Harrow, Jane Lomax, Yasmin Alam-Faruque, Ernesto Jimenez-Ruiz, Thomas Liener, Simon Jupp and The Ontologies Mapping Project

The Ontologies Mapping Project aims to enable better tools and services for mapping between ontologies and to establish best practices for ontology management in the Life Sciences. It is now focussed on the development of an ontology mapping service (OMS). Three major requirements or deliverables were identified: 1) development of an algorithm for mapping between ontologies hosted by the Ontology Lookup Service (OLS) and the Ontology Mapping Repository (OxO) at EMBL-EBI; 2) prediction of 10 ontology mappings between 5 public ontologies in the phenotype and disease domain; 3) evaluation of the predicted ontology mappings by comparison to a silver standard and via random sampling and assessment by experts. By the end of the prototype service period, the algorithm parameters had been optimised for recall (63-82%) and precision (32-69%). In addition, the 4 mappings between MeSH and the 4 public ontologies were optimised for recall (81-100%) and precision (>90%). The predicted mappings generated by this service will become openly accessible via the OxO repository. We are now planning to optimise the parameters of the algorithm to predict mappings between ontologies in the biological and chemical laboratory domain. This will test the generic applicability of the new ontology matching algorithm.
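For readers unfamiliar with the evaluation measures quoted above, the following minimal sketch shows how recall and precision of predicted mappings can be computed against a silver standard. It is not the project's algorithm or evaluation code; the function name precision_recall and the identifier pairs are hypothetical.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) of predicted mappings against a reference set.

    Each mapping is a (source_term, target_term) pair.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical identifier pairs, invented purely for illustration.
silver_standard = {("A:1", "B:1"), ("A:2", "B:2"), ("A:3", "B:3"), ("A:4", "B:4")}
predicted = {("A:1", "B:1"), ("A:2", "B:2"), ("A:3", "B:9")}

precision, recall = precision_recall(predicted, silver_standard)
```

Here two of the three predicted mappings appear in the silver standard (precision 2/3) and two of the four silver-standard mappings were recovered (recall 1/2). Tuning an algorithm typically trades one measure against the other.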

 

199 Histories of curation

Jenny Bangham, University of Cambridge

How do scientific communities come together to produce community resources? What kinds of professional experts do those resources produce? How have those professionals historically negotiated and shaped the needs of communities? I am a historian of science, and my Wellcome-funded project addresses these questions by tracing the early history of ‘FlyBase’. I am exploring the politics, infrastructures and professional expertise produced by database technologies, and investigating what difference these have made to biology and biomedicine.

Part of this project is to chart the history of curation in biology and specifically the practices and professional lives of curators (of FlyBase, and other databases). How have people become biocurators? What kinds of work do curators do (in the past and today)? How have their decisions shaped biological knowledge? I am keen to talk to curators about their views and experiences!

 

200 People + Technology + Data + Credit: Developing a sustainable community-driven approach to attribution

Nicole Vasilevsky, Marijane White, Karen Gutzman, David Eichmann, Annie Wescott, Patty Smith, Sara Gonzales, Melissa Haendel and Kristi Holmes

Open science, team science, and a drive to understand meaningful outcomes have transformed research at all levels. It is not sufficient to consider scholarship simply from the perspective of papers written, citations garnered, and grant dollars awarded. Biocurators contribute to research and scholarship in ways that are not always traditionally recognized. We need a more nuanced characterization and contextualization of contributions of varying types and intensities that are critical to powering research. Unfortunately, little infrastructure exists to identify, aggregate, present, and understand the impact of these non-traditional contributions. Moreover, these challenges are technical as well as social and require an approach that assimilates cultural perspectives for investigators and organizations alike. Here we will present ongoing work through the US National Center for Data to Health (CD2H) to address these challenges, including the creation of the Contribution Role Ontology, a structured representation of scholarly contributions and research outputs. By using contributor roles and research object types to develop infrastructure to understand the scholarly ecosystem, we can better understand, leverage, and credit a diverse workforce and community.

Session topic: Data standards and ontologies: Making data FAIR