Review
J Cheminform. 2014 Apr 28:6:17.
doi: 10.1186/1758-2946-6-17. eCollection 2014.

Chemical named entities recognition: a review on approaches and applications

Safaa Eltyeb et al. J Cheminform.

Abstract

The rapid growth of published digital information in all disciplines has created a pressing need for techniques that simplify the use of this information. The chemistry literature is rich in information about chemical entities. Extracting molecules and their related properties and activities from the scientific literature, and then text mining the extracted data to determine contextual relationships, helps research scientists, particularly those in drug development. One of the most important challenges in chemical text mining is the recognition of chemical entities mentioned in texts. In this review, we briefly introduce the fundamental concepts of chemical literature mining, the textual contents of chemical documents, and the methods of naming chemicals in documents. We sketch out dictionary-based, rule-based, machine learning, and hybrid chemical named entity recognition approaches, along with their applied solutions. We end with an outlook on the pros and cons of these approaches and the types of chemical entities extracted.

Keywords: Chemical entities; Chemical names; Information extraction.
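
To make the distinction between the dictionary-based and rule-based approaches named in the abstract concrete, the following is a minimal sketch and not a method taken from the review. The lexicon, the suffix pattern, and the function names (`dictionary_matches`, `rule_matches`, `recognize`) are all hypothetical and chosen only for illustration. Real systems rely on far larger name resources and much richer rules, and the machine learning approaches the review covers are not shown here.

```python
import re

# Hypothetical toy lexicon (an assumption for this sketch).
CHEMICAL_LEXICON = {"aspirin", "ethanol", "acetic acid", "sodium chloride"}

# Illustrative rule only: a token ending in a few common chemical suffixes.
# This will also match non-chemical words such as "membrane".
SUFFIX_RULE = re.compile(
    r"\b[A-Za-z0-9,\-]*(?:ane|ene|yne|anol|amide|oate)\b", re.IGNORECASE
)


def dictionary_matches(text, lexicon=CHEMICAL_LEXICON):
    """Dictionary-based step: exact, case-insensitive lookup of known names."""
    spans = []
    for name in lexicon:
        pattern = r"\b" + re.escape(name) + r"\b"
        for m in re.finditer(pattern, text, re.IGNORECASE):
            spans.append((m.start(), m.end(), m.group(0), "dictionary"))
    return spans


def rule_matches(text):
    """Rule-based step: flag tokens whose shape looks like a systematic name."""
    return [(m.start(), m.end(), m.group(0), "rule")
            for m in SUFFIX_RULE.finditer(text)]


def recognize(text):
    """Naive combination: keep dictionary hits, add non-overlapping rule hits."""
    found = dictionary_matches(text)
    for cand in rule_matches(text):
        overlaps = any(cand[0] < end and start < cand[1]
                       for start, end, _, _ in found)
        if not overlaps:
            found.append(cand)
    return sorted(found)


if __name__ == "__main__":
    sentence = "Aspirin was dissolved in ethanol and 2-methylpropane."
    for start, end, mention, source in recognize(sentence):
        print(start, end, mention, source)
```

In this sketch the dictionary step can only find names it already lists, while the suffix rule can pick up an unlisted systematic name such as 2-methylpropane at the cost of false positives.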

Figures

Figure 1. Example of classes of chemical entities (bolded) extracted by different systems from the chemical literature.
Figure 2. The main steps for developing the chemical NER system.
Figure 3. Types of NER systems with some related techniques.
