Document Type
Article
Publication Date
2025
DOI
10.1073/pnas.2514534122
Publication Title
Proceedings of the National Academy of Sciences
Volume
122
Issue
47
Pages
e2514534122
Abstract
Throughout Earth’s history, organic molecules from both abiogenic and biogenic sources have been buried in sedimentary rocks. Most of these organic molecules have been significantly altered by geologic processes through deep time. Nonetheless, the nature and distribution of those ancient fragmentary organic remains have the potential to reveal diagnostic biomolecular information after billions of years of burial. Here, we analyzed 406 fossil, modern biological, meteoritic, and synthetic samples using pyrolysis gas chromatography and mass spectrometry. We explored these analytical data via supervised machine-learning methods to discriminate samples of biogenic vs. abiogenic origin, plant vs. animal phylogenetic affinity, and photosynthetic vs. nonphotosynthetic physiology. Dividing 272 samples with known phylogenetic affinity and physiology into 9 categories, each further divided into 75% training and 25% testing sets, our random forest models accurately predict pairwise assignments of modern vs. fossil or meteoritic organics (100% correct assignments), fossil plant tissues vs. meteoritic organics (97%), modern vs. fossil plant tissues (98%), and modern plants vs. animal tissues (95%). Pairwise comparisons between fossil biogenic samples vs. abiogenic samples resulted in 93% correct classifications, while analysis of modern and ancient photosynthetic vs. nonphotosynthetic samples also resulted in 93% correct assignments. Our analyses demonstrate that molecular biosignatures can survive in ancient fossils and allow for the identification of organismal origins and traits. Consistent with previous morphological and isotopic inferences, we present evidence for biogenic molecular assemblages in Paleoarchean rocks (3.33 Ga) and for photoautotrophy in Neoarchean rocks (2.52 Ga).
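The abstract describes dividing each category of samples into 75% training and 25% testing sets and reporting the percentage of correct assignments. The following is a minimal sketch of those two bookkeeping steps only, not the authors' random forest pipeline (their code is in the repository cited under Data Availability). The function names, the example labels, the fixed seed, and the rounding rule for the split point are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_category(labels, train_fraction=0.75, seed=0):
    """Split sample indices into train/test sets, one category at a time.

    Hypothetical helper: the abstract states a 75%/25% split per category
    but not how samples were drawn, so the shuffling here is an assumption.
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for index, label in enumerate(labels):
        by_category[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_category):
        indices = by_category[label][:]
        rng.shuffle(indices)
        # Assumed rounding rule for categories whose size is not divisible by 4.
        cut = round(len(indices) * train_fraction)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)


def fraction_correct(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Made-up labels for illustration; these are not the study's data.
labels = ["modern plant"] * 8 + ["fossil plant"] * 12
train_idx, test_idx = split_by_category(labels)
```

Splitting within each category keeps the same category proportions in the training and testing sets, which matters when some categories contain only a few samples.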
Rights
© 2025 The Authors.
This open access article is distributed under a Creative Commons Attribution-NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND).
Data Availability
Article states: "The data generated and analyzed in this manuscript can be found on the Open Science Framework repository titled “Organic geochemical evidence for life in Archean rocks identified by py–GC–MS and supervised machine learning” (https://doi.org/10.17605/OSF.IO/G93CS). The code for the paper can be found at https://github.com/PrabhuLab/PyGCMS-Biosign-ML. All data, code, and materials used in the analysis are available to any researcher for purposes of reproducing or extending the analysis. Licenses for the data and code usage and relevant attribution information will be updated on the respective repositories."
Original Publication Citation
Wong, M. L., Prabhu, A., Alexander, C. O., Cleaves, H. J., 2nd, Cody, G. D., Hystad, G., Bermanec, M., Bleeker, W., Boyce, C. K., Corpolongo, A., Czaja, A. D., Das, S., Gaines, R. R., Gregory, D. D., Jaszczak, J. A., Javaux, E. J., Jodder, J., Knoll, A. H., Van Kranendonk, M.,…Hazen, R. M. (2025). Organic geochemical evidence for life in Archean rocks identified by pyrolysis-GC-MS and supervised machine learning. Proceedings of the National Academy of Sciences, 122(47), Article e2514534122. https://doi.org/10.1073/pnas.2514534122
Repository Citation
Wong, Michael L.; Prabhu, Anirudh; Alexander, Conel O'D.; Cleaves II, H. James; Cody, George D.; Hystad, Grethe; Bermanec, Marko; Bleeker, Wouter; Boyce, C. Kevin; Corpolongo, Andrea; Czaja, Andrew D.; Das, Souvik; Gaines, Robert R.; Gregory, Daniel D.; Jaszczak, John A.; Javaux, Emmanuelle J.; Jodder, Jaganmoy; Knoll, Andrew H.; Van Kranendonk, Martin; Maloney, Katie M.; Noffke, Nora; Rainbird, Robert; Slaughter, Emersyn; Stüeken, Eva E.; Summons, Roger E.; Westall, Frances; Wiemann, Jasmina; Xiao, Shuhai; and Hazen, Robert M., "Organic Geochemical Evidence for Life in Archean Rocks Identified by Pyrolysis-GC-MS and Supervised Machine Learning" (2025). OES Faculty Publications. 563.
https://digitalcommons.odu.edu/oeas_fac_pubs/563
Included in
Artificial Intelligence and Robotics Commons, Biochemistry, Biophysics, and Structural Biology Commons, Organic Chemistry Commons, Physiology Commons