J Cheminform. 2013 May 8;5(1):23.
doi: 10.1186/1758-2946-5-23.

The ChEMBL database as linked open data

Egon L Willighagen et al. J Cheminform.

Abstract

Background: Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis.

Results: This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying.

Conclusions: We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the decision support use case presented here.

Figures

Figure 1
The various resource types found in the ChEMBL-RDF triples. Some entities, such as target:t100122, are subclasses of common classes and are linked with the rdfs:subClassOf predicate, while others, such as assay:a17, are instances linked with the rdf:type predicate. The predicates between classes are also provided, showing how the resources are semantically linked.
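The distinction this caption draws between subclass and instance statements can be sketched with a tiny in-memory set of triples. This is only an illustration in plain Python, not the ChEMBL-RDF tooling: the class names ex:Target and ex:Assay and the helper match are hypothetical, and only target:t100122, assay:a17, rdfs:subClassOf and rdf:type come from the caption.

```python
# Minimal sketch: RDF-style triples as (subject, predicate, object) tuples.
# ex:Target and ex:Assay are hypothetical class names used for illustration only.

RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("target:t100122", RDFS_SUBCLASS_OF, "ex:Target"),  # entity modelled as a subclass
    ("assay:a17", RDF_TYPE, "ex:Assay"),                # entity modelled as an instance
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Select subjects by the predicate that links them to their class.
subclasses = [s for s, _, _ in match(triples, p=RDFS_SUBCLASS_OF)]
instances = [s for s, _, _ in match(triples, p=RDF_TYPE)]

print(subclasses)
print(instances)
```

A query over the real data would have to account for both predicates, since a pattern on rdf:type alone would not return entities that are modelled as subclasses.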
Figure 2
The links out of the ChEMBL-RDF data into the Linked Open Data cloud versions of various external databases. Edges are labeled by the predicates making the links.
Figure 3
Screenshot of the CitedIn web service showing a 2010 Nature paper cited 94671 times in the ChEMBL database. The Details button on the webpage links to a ChEMBL webpage with details on which parts of the database are linked to that paper.
Figure 4
Screenshot from the Bioclipse Decision Support with results from a combined ChemSpider and ChEMBL-RDF search. The top left canvas contains the query structure, carbamazepine, and the top right canvas shows its near neighbors found in ChemSpider for which ChEMBL-RDF data exists. The lower right canvas shows the chemical structure selected in the top right canvas, and the lower left canvas shows the available activities in ChEMBL-RDF for this compound.