Proc Natl Acad Sci U S A. 2023 Oct 10;120(41):e2307149120.
doi: 10.1073/pnas.2307149120. Epub 2023 Sep 25.

A robust, agnostic molecular biosignature based on machine learning

H James Cleaves 2nd et al. Proc Natl Acad Sci U S A.

Abstract

The search for definitive biosignatures (unambiguous markers of past or present life) is a central goal of paleobiology and astrobiology. We used pyrolysis-gas chromatography coupled to mass spectrometry to analyze chemically disparate samples, including living cells, geologically processed fossil organic material, carbon-rich meteorites, and laboratory-synthesized organic compounds and mixtures. Data from each sample were employed as training and test subsets for machine-learning methods, which resulted in a model that can identify the biogenicity of both contemporary and ancient geologically processed samples with ~90% accuracy. These machine-learning methods do not rely on precise compound identification; rather, the relational aspects of chromatographic and mass peaks provide the needed information, which underscores this method's utility for detecting alien biology.
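
The training/test workflow described in the abstract can be illustrated with a minimal sketch. The abstract does not name the learning algorithm, so the nearest-centroid classifier below is a stand-in chosen for brevity, not the authors' method. The function names, the two-dimensional toy feature vectors, and the 25% test fraction are all assumptions made for illustration; real inputs would be features derived from the pyrolysis-GC-MS data.

```python
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, label); hypothetical layout


def stratified_split(samples: List[Sample], test_fraction: float = 0.25,
                     seed: int = 0) -> Tuple[List[Sample], List[Sample]]:
    """Hold out a share of each class for testing (assumes >= 2 samples per class)."""
    rng = random.Random(seed)
    by_label: Dict[str, List[Sample]] = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    train: List[Sample] = []
    test: List[Sample] = []
    for group in by_label.values():
        group = group[:]  # copy so the caller's list is not reordered
        rng.shuffle(group)
        n_test = max(1, int(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_centroids(train: List[Sample]) -> Dict[str, List[float]]:
    """Stand-in model: the mean feature vector of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids: Dict[str, List[float]], x: Sequence[float]) -> str:
    """Assign the label whose centroid is nearest in squared Euclidean distance."""
    def dist2(c: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: dist2(centroids[y]))


def accuracy(centroids: Dict[str, List[float]], test: List[Sample]) -> float:
    """Fraction of held-out samples classified correctly."""
    hits = sum(predict(centroids, x) == y for x, y in test)
    return hits / len(test)


# Toy, made-up feature vectors; not data from the study.
toy: List[Sample] = [
    ([1.0, 0.1], "biotic"), ([0.9, 0.0], "biotic"),
    ([1.1, 0.2], "biotic"), ([0.8, 0.1], "biotic"),
    ([0.1, 1.0], "abiotic"), ([0.0, 0.9], "abiotic"),
    ([0.2, 1.1], "abiotic"), ([0.1, 0.8], "abiotic"),
]
train, test = stratified_split(toy)
model = fit_centroids(train)
print(accuracy(model, test))
```

Splitting within each class keeps both biotic and abiotic examples in the training subset, which matters when the sample set is small.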

Keywords: biosignatures; carbonaceous meteorites; machine learning; organic chemistry; taphonomy.

Conflict of interest statement

The authors declare no competing interest.

Figures

Fig. 1.
Combined three-dimensional Pyr-GC-EI-MS data for complex organic mixtures in carbonaceous meteorites (A) and microbial samples (B). These graphs display peak intensities (vertical scale, normalized to the highest peak intensity) for 3,000 elution time bins (right-hand scale) and their mass spectra over 150 m/z bins (left-hand scale). Green circles with vertical “stems” do not represent intensity values, but rather features the machine-learning algorithm recognizes as important discriminants among samples.
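
The data layout described in the Fig. 1 caption (3,000 elution-time bins by 150 m/z bins, with intensities normalized to the highest peak) might be built roughly as follows. This is a sketch under stated assumptions: the function name, the input format of `(time, m/z, intensity)` tuples, the `t_max` and `mz_max` range parameters, and the equal-width binning rule are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Tuple

N_TIME_BINS = 3000  # elution-time bins, per the Fig. 1 caption
N_MZ_BINS = 150     # m/z bins, per the Fig. 1 caption


def bin_and_normalize(
    points: Iterable[Tuple[float, float, float]],
    t_max: float,
    mz_max: float,
    n_time_bins: int = N_TIME_BINS,
    n_mz_bins: int = N_MZ_BINS,
) -> List[List[float]]:
    """Sum (time, m/z, intensity) readings into a grid, then scale the peak to 1.

    Assumes non-negative times and m/z values no larger than t_max and mz_max.
    """
    grid = [[0.0] * n_mz_bins for _ in range(n_time_bins)]
    for t, mz, intensity in points:
        # Equal-width bins; a reading on the upper edge falls into the last bin.
        i = min(int(t / t_max * n_time_bins), n_time_bins - 1)
        j = min(int(mz / mz_max * n_mz_bins), n_mz_bins - 1)
        grid[i][j] += intensity
    peak = max(max(row) for row in grid)
    if peak > 0:
        # Normalize to the highest binned intensity, as in the caption.
        grid = [[value / peak for value in row] for row in grid]
    return grid


# Made-up readings for illustration only.
readings = [(0.0, 10.5, 5.0), (30.0, 150.0, 20.0), (30.0, 150.0, 20.0)]
grid = bin_and_normalize(readings, t_max=60.0, mz_max=150.0)
```

After normalization every cell lies between 0 and 1, so samples with different absolute signal levels can be compared on the same scale.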
Fig. 2.
Grouping of samples according to the machine-learning methods explored here. Biologically derived samples (green/blue) are distinguished from abiotic samples (orange). Taphonomically altered biological samples (blue) lie along a trend distinct from that of contemporary biological samples (green).
