BMC Bioinformatics. 2011;12 Suppl 4(Suppl 4):S6.
doi: 10.1186/1471-2105-12-S4-S6. Epub 2011 Jul 5.

Deploying mutation impact text-mining software with the SADI Semantic Web Services framework


Alexandre Riazanov et al. BMC Bioinformatics. 2011.

Abstract

Background: Mutation impact extraction is an important task designed to harvest relevant annotations from scientific documents for reuse in multiple contexts. Our previous work on text mining for mutation impacts resulted in (i) the development of a GATE-based pipeline that mines texts for information about the impacts of mutations on proteins, (ii) the population of our OWL DL mutation impact ontology with this information, and (iii) the establishment of an experimental semantic database for storing the results of text mining.

Results: This article explores the possibility of using the SADI framework as a medium for publishing our mutation impact software and data. SADI is a set of conventions for creating web services with semantic descriptions that facilitate automatic discovery and orchestration. We present a case study that demonstrates the utility of the SADI approach in our context. We describe several SADI services that we built on our text mining API and data, and show how they can be used in a number of biologically meaningful scenarios through a SPARQL interface (SHARE) to SADI services. In all cases we pay special attention to the integration of mutation impact services with external SADI services that provide information about related biological entities, such as proteins, pathways, and drugs.

Conclusion: We found that SADI provides an effective way of exposing our mutation impact data so that it can be used by a variety of stakeholders across multiple use cases. The solutions we provide for our use cases can serve as examples to potential SADI adopters trying to solve similar integration problems.
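
To make the idea of querying mutation impact services through a SPARQL interface more concrete, the following is a minimal Python sketch that assembles a query in the spirit of use case 1 (publications in, mutations with wild-type proteins and impacts out). It is not the query from the paper: the namespace `http://example.org/mutation-impact#` and every predicate name (`hasPubMedId`, `mentionsMutation`, `appliesToWildType`, `hasImpact`, `affectsProperty`) are hypothetical placeholders, and the `VALUES` clause assumes a SPARQL 1.1 endpoint, which the SHARE client may or may not support.

```python
from textwrap import dedent

# Hypothetical namespace, invented for illustration only. The real mutation
# impact ontology and the SADI services define their own IRIs.
EXAMPLE_NS = "http://example.org/mutation-impact#"


def build_use_case_1_query(pubmed_ids):
    """Build a SPARQL query string for a list of PubMed identifiers.

    All predicate names in the query are assumptions made for this sketch.
    """
    if not pubmed_ids:
        raise ValueError("at least one publication identifier is required")
    for pmid in pubmed_ids:
        # Accept digits only, so that nothing unexpected ends up in the query.
        if not str(pmid).isdigit():
            raise ValueError(f"not a numeric PubMed identifier: {pmid!r}")

    # Render the identifiers as a space-separated list of string literals.
    values = " ".join(f'"{pmid}"' for pmid in pubmed_ids)

    return dedent(f"""\
        PREFIX ex: <{EXAMPLE_NS}>
        SELECT ?document ?mutation ?protein ?impact ?property
        WHERE {{
          VALUES ?pmid {{ {values} }}
          ?document ex:hasPubMedId ?pmid ;
                    ex:mentionsMutation ?mutation .
          ?mutation ex:appliesToWildType ?protein ;
                    ex:hasImpact ?impact .
          ?impact ex:affectsProperty ?property .
        }}
        """)


if __name__ == "__main__":
    # Placeholder identifiers, not real publications from the study.
    print(build_use_case_1_query(["111", "222"]))
```

In a SADI setting, each triple pattern would be resolved by discovering a service whose semantic description says it can attach that property to its input, which is why the query can mix mutation impact predicates with predicates served by external providers.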

Figures

Figure 1
Mutation impact ontology structure. Visualization of the top-level concepts Mutation Specification, Protein, Mutation Impact and Protein Property, which are connected through object properties.
Figure 2
Listing of the final SPARQL query for use case 1. This SPARQL query formalises “Given a list of publications, identify mutations studied in the papers with their wildtype proteins and impacts on protein properties”.
Figure 3
Listing of the baseline SPARQL query for use case 2. This SPARQL query formalises “Find all mutations and the structure images of wild type proteins that were mutated, where the impact of the mutation is an enhanced haloalkane dehalogenase activity”.
Figure 4
Screenshot of a Jmol rendering of the structure of P51698 with L248I. This image was obtained by running the Jmol viewer on a PDB file representing the amino acid sequence of the protein with the UniProt ID P51698. The highlighted amino acid is the wild-type residue of the point mutation L248I.
Figure 5
Listing of the extended functionality query for use case 2. Improves on the baseline query in Figure 3 by requesting that mutations be shown on the protein 3D structure.
Figure 6
Listing of the SPARQL query for use case 3. This SPARQL query formalises “Find all pathways, together with the corresponding pathway images, that might have been altered by a mutation of the protein Fibroblast growth factor receptor 3”.
Figure 7
Listing of the SPARQL query for use case 4. This SPARQL query formalises “Find all drugs related to mutated proteins, together with their interaction partners, where the mutation impact is a decreased carbonic anhydrase activity”.
Figure 8
Listing of the SPARQL query for use case 5. This SPARQL query formalises “From the literature find all reported mutations of the protein with the nsSNP rs2305178”.
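
The captions above refer to point mutations in the compact notation wild-type residue, position, mutant residue (for example L248I). As an illustration of that notation only, and not of the GATE-based pipeline described in the abstract, here is a minimal Python sketch that parses such a string and checks it against a protein sequence. The names `PointMutation`, `parse_point_mutation` and `matches_wild_type` are hypothetical, and the sequence used in the usage lines is a toy string rather than the real sequence of P51698.

```python
import re
from typing import NamedTuple

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Wild-type residue, 1-based position without leading zeros, mutant residue.
_MUTATION_RE = re.compile(rf"^([{AMINO_ACIDS}])([1-9]\d*)([{AMINO_ACIDS}])$")


class PointMutation(NamedTuple):
    wild_type: str
    position: int  # 1-based position in the protein sequence
    mutant: str


def parse_point_mutation(text: str) -> PointMutation:
    """Parse a mention such as 'L248I' into its three components."""
    match = _MUTATION_RE.match(text.strip().upper())
    if match is None:
        raise ValueError(f"not a point mutation in one-letter notation: {text!r}")
    wild_type, position, mutant = match.groups()
    return PointMutation(wild_type, int(position), mutant)


def matches_wild_type(mutation: PointMutation, sequence: str) -> bool:
    """Return True if the sequence has the wild-type residue at the position."""
    index = mutation.position - 1  # convert to a 0-based string index
    return 0 <= index < len(sequence) and sequence[index] == mutation.wild_type


if __name__ == "__main__":
    mutation = parse_point_mutation("L248I")
    print(mutation)

    # Toy sequence for demonstration; residue 3 is L.
    print(matches_wild_type(parse_point_mutation("L3I"), "MKLV"))
```

A check of this kind is one simple way to decide whether a mutation mention is compatible with a candidate protein sequence; the actual grounding method used by the authors is not described in this abstract.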


