Proc Natl Acad Sci U S A. 2022 Apr 12;119(15):e2116576119.
doi: 10.1073/pnas.2116576119. Epub 2022 Apr 4.

Transport features predict if a molecule is odorous

Emily J Mayhew et al. Proc Natl Acad Sci U S A.

Abstract

In studies of vision and audition, stimuli can be chosen to span the visible or audible spectrum; in olfaction, the axes and boundaries defining the analogous odorous space are unknown. As a result, the population of olfactory space is likewise unknown, and anecdotal estimates of 10,000 odorants have endured. The journey a molecule must take to reach olfactory receptors (ORs) and produce an odor percept suggests some chemical criteria for odorants: a molecule must 1) be volatile enough to enter the air phase, 2) be nonvolatile and hydrophilic enough to sorb into the mucous layer coating the olfactory epithelium, 3) be hydrophobic enough to enter an OR binding pocket, and 4) activate at least one OR. Here, we develop a simple and interpretable quantitative model that reliably predicts whether a molecule is odorous or odorless based solely on the first three criteria. Applying our model to a database of all possible small organic molecules, we estimate that at least 40 billion possible compounds are odorous, six orders of magnitude larger than current estimates of 10,000. With this model in hand, we can define the boundaries of olfactory space in terms of molecular volatility and hydrophobicity, enabling representative sampling of olfactory stimulus space.
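The "six orders of magnitude" comparison in the abstract can be checked with a few lines of arithmetic. This is only a sanity check on the two figures quoted above (10,000 and 40 billion); the variable names are illustrative and not taken from the paper.

```python
import math

# Figures quoted in the abstract.
anecdotal_estimate = 10_000           # long-standing estimate of odorants
model_estimate = 40_000_000_000       # "at least 40 billion" from the model

# Ratio of the two estimates and its size in orders of magnitude.
ratio = model_estimate / anecdotal_estimate
orders_of_magnitude = math.log10(ratio)

print(f"ratio = {ratio:.0f}, log10(ratio) = {orders_of_magnitude:.2f}")
```

The ratio is 4 million, so its base-10 logarithm lies between 6 and 7, consistent with the abstract's rounded statement.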

Keywords: machine learning; odor space; olfaction; physical transport.

Conflict of interest statement

Competing interest statement: J.D.M. received research funding from Ajinomoto Co., Inc.

Figures

Fig. 1.
A model can accurately classify molecules as odorous or odorless based only on transport features. (A) Schematic of the transport process that molecules must complete to act as olfactory stimuli. To elicit an odor, molecules must reach the olfactory epithelium (OE), adsorb into the olfactory mucosa, enter OR binding pockets, and trigger OR neuron (ORN) activation. (B) Transport-feature ML model-generated odorous probabilities for all molecules in the dataset. Each dot represents one molecule colored by the ground truth, and the width of the violin plot is the density of molecules at a given prediction value. (C) Odorous and odorless molecules in transport space. LogP and log(vapor pressure [mmHg]) are plotted for each molecule in the dataset; odorous molecules are represented by circles, and odorless are represented by crosses; molecules are colored by transport-feature ML model-generated odorous probabilities. An LR-generated 50% odorous probability boundary for solids or liquids (Eq. 1) is plotted as a solid line, and the boundary for gases (Eq. 2) is plotted as a dashed line; increasing the value of any feature by X increases the log odds of odorousness by kX, where k is the corresponding model coefficient. (D) Density of odorous and odorless molecules in transport space defined by molecular weight and number of heteroatoms. Each successive contour line indicates a step increase in density (odorous, red = 0.05%; odorless, blue = 0.01%). Each molecule has an integer number of heteroatoms, but these values are jittered along the y axis to better show density. Plotted within the black box, molecules that obey the rule of three are generally odorous. (E) Heat map of mean AUROC generated by the transport ML, many-feature ML, and the rule of three models for molecules of common chemical classes (number of matching molecules in parentheses).
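The statement in panel C that raising a feature by X raises the log odds by kX is the defining property of a logistic regression (LR). The sketch below illustrates that property and the 50% boundary line in the (log P, log vapor pressure) plane. The intercept and coefficients are arbitrary placeholders chosen for readability; they are not the values in Eq. 1 or Eq. 2, which do not appear on this page, and the function names are hypothetical.

```python
import math

# Placeholder parameters for illustration only.
# These are NOT the coefficients of the paper's Eq. 1 or Eq. 2.
INTERCEPT = 1.0
COEFFS = {"log_p": 0.5, "log_vp": 2.0}


def log_odds(features):
    """Log odds of being odorous: intercept plus a weighted sum of features."""
    return INTERCEPT + sum(k * features[name] for name, k in COEFFS.items())


def odorous_probability(features):
    """Logistic transform of the log odds into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(features)))


def boundary_log_vp(log_p):
    """log(vapor pressure) at which the probability is 50% (log odds = 0)."""
    return -(INTERCEPT + COEFFS["log_p"] * log_p) / COEFFS["log_vp"]


base = {"log_p": 2.0, "log_vp": -1.0}
shifted = {"log_p": 2.0, "log_vp": -1.0 + 3.0}   # raise log_vp by X = 3

# The change in log odds equals k * X for the shifted feature.
delta = log_odds(shifted) - log_odds(base)
print(delta, COEFFS["log_vp"] * 3.0)
```

Setting the log odds to zero and solving for one feature gives a straight line in the plane of the two features, which is why the 50% boundaries in panel C are drawn as lines.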
Fig. 2.
Common inaccuracies in data impact model performance. (A) Difference between experimentally determined BP values and BP values calculated using the Burnop (9) and Banks (8) methods. (B and C) Odor classification predictions by transport-feature ML models using BP values calculated by the (B) Burnop or (C) Banks method. (D) Human subject-classified molecules in transport space defined by BP and log P. Many clearly nonvolatile molecules were initially classified as odors due to odorous contaminants. (E) Transport-feature ML model odor predictions for human subject-classified molecules. Chemical compounds that are odorless but had odorous contaminants are correctly predicted to be odorless by the model.
Fig. 3.
The transport model can be used to predict the population of odor space. (A) Proportion of molecules predicted by the transport ML model to be odorous as a function of HAC. Red circles show the mean probability generated for HAC tranches from the GDB database (13) with SE indicated. (B) Estimated number of possible molecules and predicted odorous molecules from the GDB databases as a function of HAC. (C) Cumulative estimates of possible molecules and odorous molecules with increasing HAC on a logarithmic scale. The red data point at HAC 17 reflects our conservative estimate of 40 billion odorous molecules.
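One plausible reading of panels A to C is that each heavy atom count (HAC) tranche contributes its number of possible molecules times its mean predicted odorous proportion, and that these contributions are then summed cumulatively. The sketch below shows that bookkeeping. The tranche counts and proportions are invented toy numbers, not GDB figures or model outputs, and the function names are hypothetical.

```python
from itertools import accumulate

# Toy tranche data for illustration only (NOT values from GDB or the paper):
# (HAC, number of possible molecules, mean predicted odorous proportion)
tranches = [
    (1, 10, 1.0),
    (2, 100, 0.5),
    (3, 1000, 0.25),
]


def odorous_per_tranche(rows):
    """Estimated odorous molecules in each tranche: count times proportion."""
    return [count * proportion for _, count, proportion in rows]


def cumulative(values):
    """Running total across tranches of increasing HAC."""
    return list(accumulate(values))


per_tranche = odorous_per_tranche(tranches)
running_total = cumulative(per_tranche)

for (hac, _, _), total in zip(tranches, running_total):
    print(f"HAC <= {hac}: about {total:.0f} predicted odorous molecules")
```

With real data, the last entry of the running total at the largest HAC considered would play the role of the cumulative estimate shown in panel C.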
Fig. 4.
Visualization of olfactory space highlights understudied regions. (A) UMAP plot of known odorous molecules (green) and possible molecules from GDB-17 colored by their transport ML-predicted odorous probability. Many regions dense with probable odors are sparsely represented by known odors. (B) Eugenol, a known odorant. (C–E) Example molecules from GDB-17 and their transport ML-predicted probability of being odorous (p_odor).

References

    1. Buck L., Axel R., A novel multigene family may encode odorant receptors: A molecular basis for odor recognition. Cell 65, 175–187 (1991).
    2. Jaubert J.-N., Tapiero C., Dore J.-C., The field of odors: Toward a universal language for odor relationships. Perfum. Flavorist 20, 1–16 (1995).
    3. Hahn I., Scherer P. W., Mozell M. M., A mass transport model of olfaction. J. Theor. Biol. 167, 115–128 (1994).
    4. Boelens H., Structure-activity relationships in chemoreception by human olfaction. Trends Pharmacol. Sci. 4, 421–426 (1983).
    5. Ruddigkeit L., Awale M., Reymond J. L., Expanding the fragrance chemical space for virtual screening. J. Cheminform. 6, 27 (2014).
