iScience. 2024 May 21;27(6):110041.
doi: 10.1016/j.isci.2024.110041. eCollection 2024 Jun 21.

PredCoffee: A binary classification approach specifically for coffee odor

Yi He et al. iScience.

Abstract

Compared with traditional methods, using machine learning to assess or predict the odor of molecules can reduce costs. Our research aims to collect molecules with a coffee odor, summarize the patterns these molecules share, and ultimately build a binary classifier that determines whether a molecule has a coffee odor. In this study, a total of 371 coffee-odor molecules and 9,700 non-coffee-odor molecules were collected. Five models were trained on this dataset: the Knowledge-guided Pre-training of Graph Transformer (KPGT), support vector machine (SVM), random forest (RF), multi-layer perceptron (MLP), and message-passing neural network (MPNN). The best-performing model was selected as the basis of the predictor. The prediction accuracy of the KPGT model exceeded 0.84, and the predictor has been deployed as the webserver PredCoffee.

Keywords: Chemistry; Computer science; Food science.
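
As an illustration of how a binary coffee/non-coffee classifier might be scored, the following minimal sketch computes accuracy, precision, and recall from predicted labels. The abstract reports only accuracy; the other two metrics, the function name `binary_metrics`, and the label encoding (1 = coffee odor, 0 = non-coffee odor) are assumptions made for this example and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for 0/1 labels.

    Hypothetical helper: 1 is assumed to mean "coffee odor" and
    0 "non-coffee odor"; the paper's own encoding is not given here.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels, invented for illustration only.
scores = binary_metrics([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

With far fewer coffee than non-coffee molecules in the dataset, reporting precision and recall alongside accuracy can help show how well the minority class is recognized.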

Conflict of interest statement

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Figures

Graphical abstract
Figure 1
The workflow of our study.
Figure 2
Clustering analysis of 271 coffee odor molecules. The light gray circles represent the threshold radius of each molecule (1/48 of the distance between the two farthest molecules), and two molecules whose radii intersect are placed in the same group. The colors of the molecules represent different categories: groups containing fewer than 7 molecules are shown in gray, and groups containing more than 7 molecules are shown in color.
Figure 3
HOMO and LUMO of the four representative molecules. (A) Difurfuryl disulfide. (B) 2-Isopropyl-5-methylpyrazine. (C) (S)-2-methyl butyraldehyde. (D) Hexahydrophenol.
Figure 4
Molecular docking results of OR51E2 and the four representative molecules. (A) Difurfuryl disulfide docking with OR51E2 and the active residues around it. (B) 2-Isopropyl-5-methylpyrazine docking with OR51E2 and the active residues around it. (C) (S)-2-methyl butyraldehyde docking with OR51E2 and the active residues around it. (D) Hexahydrophenol docking with OR51E2 and the active residues around it.
Figure 5
Performance of 5 models on 6 performance metrics.
Figure 6
Most important Morgan fingerprint substructures learned by the MLP. (A) Difurfuryl disulfide group. (B) 2-Isopropyl-5-methylpyrazine group. (C) (S)-2-methyl butyraldehyde group. (D) Hexahydrophenol group. Blue highlighting indicates that the atom or molecular fragment corresponds directly to the activated bit in the Morgan fingerprint. Yellow highlighting indicates atoms that are adjacent to blue-highlighted atoms or that contribute to generating the fingerprint bit but do not directly determine its activation. Uncolored atoms do not contribute directly to the Morgan fingerprint bit in focus.
Figure 7
Factor analysis of the coffee/non-coffee dataset. (A) Scree plot of eigenvalue against factor number. (B) Radar chart of the screened descriptors contributing to the 3 factors. (C) Heatmap of the factor loading matrix. (D) The 12 normalized molecular properties of coffee and non-coffee molecules.
Figure 8
Significant difference analysis of 12 properties of coffee and non-coffee molecules. The p-value measures the difference between the two samples: the smaller the p-value, the more significant the difference.
Figure 9
Webserver of PredCoffee. (A) Website homepage of PredCoffee. (B) Submit page of PredCoffee. (C) Result page of PredCoffee. (D) Chemical space of coffee molecules.
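
The grouping rule described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each molecule has already been mapped to a 2D coordinate (the caption does not say how), treats two circles as intersecting when their centers are at most two radii apart, and merges groups transitively with a union-find structure. The function name `cluster_by_radius` and the toy coordinates are hypothetical.

```python
import math
from itertools import combinations


def cluster_by_radius(points, fraction=1 / 48):
    """Group 2D points whose threshold circles intersect.

    Assumption: the radius is `fraction` of the largest pairwise
    distance, and circles intersect when centers are <= 2 * radius apart.
    Returns lists of point indices, one list per group.
    """
    n = len(points)
    if n < 2:
        return [[i] for i in range(n)]

    max_dist = max(math.dist(p, q) for p, q in combinations(points, 2))
    radius = max_dist * fraction

    parent = list(range(n))  # union-find forest, one tree per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if math.dist(points[i], points[j]) <= 2 * radius:
            parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=lambda g: g[0])


# Toy coordinates, invented for illustration: the radius is 48 / 48 = 1,
# so points 0-1 and 1-2 (1.5 apart) link up, while point 3 stays alone.
toy_groups = cluster_by_radius([(0, 0), (1.5, 0), (3, 0), (48, 0)])
```

Because merging is transitive, points 0 and 2 end up in the same group even though their own circles do not intersect; whether the original analysis chains groups in this way is an assumption of the sketch.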
