Review
N Engl J Med. 2025 May 8;392(18):1836-1845.
doi: 10.1056/NEJMra2413243.

Chemical Complexity of Food and Implications for Therapeutics

Giulia Menichetti et al. N Engl J Med. 2025.
No abstract available

Figures

Figure 1. Structural Characterization of the Molecules in the NDM Database.
Panel A shows the chemical diversity in the Nutrition Dark Matter (NDM) database. With the use of contextual embedding vectors generated by the MoLFormer chemical language model and reduced to two dimensions with UMAP (Uniform Manifold Approximation and Projection), the space of food molecules in the NDM database can be visualized. Molecules are color-coded on the basis of structural subclasses (ClassyFire chemical taxonomy), each containing at least 500 compounds.
Panel B shows the chemical compounds in garlic. The same map highlights the 6802 molecules documented in garlic, categorized by structural subclasses. Three key chemical compounds, allicin and ajoene (organosulfur compounds) and p-coumaric acid (a polyphenol), are emphasized for their relevance to human health, yet they are often overlooked in food composition databases.
Resources:
Djoumbou Feunang Y, Eisner R, Knox C, et al. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy. J Cheminform 2016;8:61.
McInnes L, Healy J, Melville J. UMAP: uniform manifold approximation and projection for dimension reduction. Version 3. September 18, 2020 (https://arxiv.org/abs/1802.03426). Preprint.
Figure 2. Comparison of Molecules in the NDM Database and DrugBank.
Panel A shows drug compounds clustered in a distinct region of the NDM database. With the use of MoLFormer and UMAP, DrugBank small molecules are visualized within the NDM space. Two antiplatelet agents, rosmarinic acid (a polyphenol) and clopidogrel (a synthetic drug), coexist in the same neighborhood of the map.
Panel B shows the localization of 115 small molecules classified as adrenergic antagonists in DrugBank, whether naturally occurring or synthetic.
Panel C shows that, as compared with synthetic drugs, polyphenols have a wider range of structures and features, spreading across a broad region of the chemical space rather than clustering in a narrow neighborhood.
Figure 3. Protein Targets of NDM Mapped onto the Human Interactome.
Using node2vec to compress the complex wiring of the human interactome into a 64-dimensional protein-embedding vector, we further apply UMAP to visualize the relative positions of 18,659 human proteins on the basis of their 354,659 physical interactions in a 2-dimensional space. Proteins that frequently co-occur through random walks on the interactome cluster closely together in this space. To facilitate visualization, we have not drawn links between the proteins. Proteins that have experimentally validated binding interactions with small molecules from the NDM library are highlighted in dark pink, revealing that food molecules target nearly half the interactome (8997 proteins).
Resources:
Grover A, Leskovec J. node2vec: scalable feature learning for networks. KDD 2016;2016:855–64.
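The random walks mentioned in the Figure 3 caption can be illustrated with a minimal sketch of the biased second-order walk described by Grover and Leskovec. This is not the authors' pipeline: the toy network `toy_ppi`, the node names `P1` to `P6`, the function name `node2vec_walk`, and the values of `p` and `q` are all hypothetical, and the step that turns walks into 64-dimensional vectors (a skip-gram model) and the UMAP projection are omitted.

```python
import random

# Hypothetical toy interaction network (undirected, unweighted).
# Node names are placeholders, not real proteins.
toy_ppi = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3", "P4"},
    "P3": {"P1", "P2"},
    "P4": {"P2", "P5", "P6"},
    "P5": {"P4", "P6"},
    "P6": {"P4", "P5"},
}

def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one biased random walk of up to `length` nodes.

    p is the return parameter and q the in-out parameter: a step back to
    the previous node has weight 1/p, a step to a neighbor of the previous
    node has weight 1, and a step farther away has weight 1/q.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = sorted(graph[cur])  # sorted for reproducibility
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so it is uniform.
            walk.append(rng.choice(neighbors))
            continue
        prev = walk[-2]
        weights = []
        for nxt in neighbors:
            if nxt == prev:
                weights.append(1.0 / p)   # return to previous node
            elif nxt in graph[prev]:
                weights.append(1.0)       # stay close to previous node
            else:
                weights.append(1.0 / q)   # move outward
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(node2vec_walk(toy_ppi, "P1", 8, p=0.5, q=2.0, rng=rng))
```

In the full method, many such walks per node are treated like sentences, and nodes that appear together in the same walks end up with similar embedding vectors, which is why co-occurring proteins cluster in the map.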
Figure 4. Co-Occurrence of Food Molecules and Overlapping Mechanisms of Action.
Panel A shows a Spearman correlation matrix of nutrient concentrations for 108 raw fruits and vegetables, profiled with the same resolution by the U.S. Department of Agriculture (USDA). The clusters of highly correlated nutrients indicate compounds likely to have similar production by source plant metabolism. For example, this is the case for vitamin K1 and lutein, which show a high correlation (r=0.6349), despite a difference of at least one order of magnitude in the average concentration. In contrast, delphinidin, an anthocyanidin, belongs to a distinct cluster, indicating no shared biosynthetic pathway. Vitamin K1, lutein, and delphinidin are highlighted in bold.
As shown in Panel B, food molecules produced together tend to target similar regions of the human interactome. Anthocyanidins and anthocyanins, such as delphinidin, cyanidin, malvidin, and pelargonidin, which differ primarily with respect to the presence of sugar groups, share experimentally validated targets within a common neighborhood of the interactome. The 10 protein targets are color-coded on the basis of whether one molecule or multiple molecules bind to them.
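The point in the Figure 4 caption that two nutrients can correlate strongly despite very different average concentrations follows from how Spearman correlation works: it compares ranks, not raw values. The sketch below is a plain-Python illustration under stated assumptions. The function names and the concentration values are hypothetical and are not USDA data.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))

if __name__ == "__main__":
    # Hypothetical concentrations in five foods (illustrative values only).
    nutrient_a = [10, 20, 30, 40, 50]
    nutrient_b = [100, 250, 200, 900, 1200]
    print(spearman(nutrient_a, nutrient_b))
    # Rescaling one nutrient by a constant leaves the ranks unchanged.
    print(spearman(nutrient_a, [v * 1000 for v in nutrient_b]))
```

Computing this value for every pair of nutrients across the profiled foods yields a correlation matrix of the kind shown in Panel A.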

