BMC Bioinformatics. 2010 Apr 29;11:220.
doi: 10.1186/1471-2105-11-220.

Integration of open access literature into the RCSB Protein Data Bank using BioLit

Andreas Prlić et al. BMC Bioinformatics. 2010.

Abstract

Background: Biological data have traditionally been stored and made publicly available through a variety of on-line databases, whereas biological knowledge has traditionally been found in the printed literature. With journals now on-line and providing an increasing amount of open access content, often free of copyright restriction, this distinction between database and literature is blurring. To exploit this opportunity we present the integration of open access literature with the RCSB Protein Data Bank (PDB).

Results: BioLit provides an enhanced view of articles, adding semantic markup and links to biological databases based on the content of each article. For example, words that match terms in existing biological ontologies are highlighted, and database identifiers are linked to their database of origin. Among other functions, BioLit identifies PDB IDs mentioned in the open access literature by parsing the full text of all research articles in PubMed Central (PMC), and exposes the results as simple XML Web Services. Here, we integrate BioLit results with the RCSB PDB website by using these services to find PDB IDs mentioned in research articles and then retrieving the abstract, figures, and text excerpts for those articles. A new RCSB PDB literature view permits browsing through the figures and abstracts of the articles that mention a given structure. The BioLit Web Services that provide the underlying data are publicly accessible, and a Java client library is provided for querying them.
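The idea of finding PDB ID mentions in full text can be illustrated with a minimal sketch. This is not BioLit's actual implementation, which the abstract does not describe. It assumes classic four-character PDB IDs (a digit from 1 to 9 followed by three alphanumeric characters) and assumes that a set of valid IDs, here the hypothetical parameter `known_ids`, is available to filter out look-alike tokens such as "10th". The function name `find_pdb_ids` is also illustrative.

```python
import re

# Assumption: classic PDB IDs are one digit (1-9) followed by three
# alphanumeric characters. The lookahead requires at least one letter among
# those three characters, so plain numbers such as years do not match.
PDB_ID_PATTERN = re.compile(
    r"\b[1-9](?=[A-Za-z0-9]{0,2}[A-Za-z])[A-Za-z0-9]{3}\b"
)


def find_pdb_ids(text, known_ids):
    """Return distinct PDB IDs mentioned in text, in order of first mention.

    known_ids is a hypothetical set of valid, upper-case PDB IDs used to
    reject tokens that merely look like IDs (for example "10th").
    """
    found = []
    for match in PDB_ID_PATTERN.finditer(text):
        token = match.group(0).upper()
        if token in known_ids and token not in found:
            found.append(token)
    return found
```

Validating candidates against a list of known IDs matters because the pattern alone is noisy: many ordinary tokens have the same shape as a PDB ID. In this sketch, a mention counts whether or not the structure is formally cited, which mirrors the behaviour the abstract describes.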

Conclusions: The integration between literature and websites, as demonstrated here with the RCSB PDB, provides a broader view of how a given structure has been analyzed and used. This approach detects the mention of a PDB structure even if it is not formally cited in the paper. Other structures related through the same literature references can also be identified, possibly providing new scientific insight. To our knowledge, this is the first time that database and literature have been integrated in this way, and it speaks to the opportunities afforded by open and free access to both database and literature content.
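Identifying structures related through shared literature amounts to a co-occurrence count, sketched below. The sketch assumes that the mentions have already been collected into a mapping from article identifier to the set of PDB IDs each article mentions. The function name `co_occurring_ids` and the data layout are illustrative assumptions, not part of the BioLit services.

```python
from collections import Counter


def co_occurring_ids(article_to_ids, query_id):
    """Count how often other PDB IDs share an article with query_id.

    article_to_ids is a hypothetical mapping from an article identifier
    to the set of PDB IDs mentioned in that article.
    """
    counts = Counter()
    for ids in article_to_ids.values():
        if query_id in ids:
            # Count every other ID mentioned in the same article.
            counts.update(other for other in ids if other != query_id)
    return counts
```

For example, with placeholder article keys and placeholder IDs (only 1KX3 comes from the text above), `co_occurring_ids({"article-A": {"1KX3", "1ABC"}, "article-B": {"1KX3", "1ABC", "2XYZ"}}, "1KX3")` counts `1ABC` twice and `2XYZ` once.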

Figures

Figure 1
Literature tab view for PDB ID 1KX3 from the RCSB PDB. The view provides the following data fields: Primary Citation for the protein structure; Publication Details, with MeSH keywords for the article and related citations from iHOP and GeneRIF; Related Citations in the PDB entry, as provided by the depositors of the structure; PubMed Central articles, identified by BioLit as mentioning the PDB ID; and Other PDB IDs (not shown) that co-occur with 1KX3 in PubMed Central articles.
