J Chem Inf Model. 2024 May 13;64(9):3640-3649.
doi: 10.1021/acs.jcim.3c02082. Epub 2024 Feb 15.

Image2InChI: Automated Molecular Optical Image Recognition

Da-Zhou Li et al. J Chem Inf Model.

Abstract

Accurate identification and analysis of chemical structures in molecular images are prerequisites for applying artificial intelligence to drug discovery, so molecular images must be converted into machine-readable representations efficiently and automatically. In this paper, we propose Image2InChI, an automated molecular optical image recognition model based on deep learning. Image2InChI introduces a novel feature fusion network with attention to integrate image patches and InChI prediction. An improved Swin Transformer serves as the encoder and a Transformer decoder with patch embedding serves as the decoder, predicting the corresponding InChI from the image features. In our experiments, Image2InChI achieved an InChI accuracy (InChI acc) of 99.8%, a Morgan fingerprint (Morgan FP) score of 94.1%, a maximum common structure accuracy (MCS acc) of 94.8%, and a longest common subsequence accuracy (LCS acc) of 96.2%. These results demonstrate that Image2InChI improves the accuracy and efficiency of molecular image recognition and provide a valuable reference for optical chemical structure recognition targeting InChI.
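To make two of the reported metrics concrete, the following is a minimal sketch of how an exact-match InChI accuracy and a longest common subsequence score could be computed over predicted and reference InChI strings. The abstract does not define these metrics precisely, so the function names and the normalization by the longer string's length are assumptions for illustration, not the authors' evaluation code.

```python
from typing import Sequence


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of two strings (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_score(predicted: str, reference: str) -> float:
    """LCS length divided by the longer string's length.

    Assumption: the paper may normalize differently.
    """
    if not predicted and not reference:
        return 1.0
    return lcs_length(predicted, reference) / max(len(predicted), len(reference))


def exact_match_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that are character-for-character identical
    to their reference InChI (assumed meaning of "InChI acc")."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    ethanol = "InChI=1S/C2H6O/c1-2-3/h3H,1-2H3"
    methanol = "InChI=1S/CH4O/c1-2/h2H,1H3"
    print(exact_match_accuracy([ethanol, methanol], [ethanol, ethanol]))
    print(lcs_score(methanol, ethanol))
```

Exact match is strict: a single wrong character counts as a miss. A subsequence-based score gives partial credit to near-correct predictions, which is presumably why the abstract reports both kinds of metric.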
