Comment
Microb Genom. 2023 Aug;9(8):mgen001088.
doi: 10.1099/mgen.0.001088.

Caution regarding the specificities of pan-cancer microbial structure

Abraham Gihawi et al. Microb Genom. 2023 Aug.

Abstract

Results published in an article by Poore et al. (Nature. 2020;579:567-574) suggested that machine learning models can almost perfectly distinguish between tumour types based on their microbial composition. Whilst we believe that there is the potential for microbial composition to be used in this manner, we have concerns with the paper that make us question the certainty of the conclusions drawn. We believe there are issues in four areas: the contribution of contamination, the handling of batch effects, false positive classifications and limitations in the machine learning approaches used. This makes it difficult to determine whether the authors have identified true biological signal and how robust these models would be in use as clinical biomarkers. We commend Poore et al. on their approach to open data and reproducibility that has enabled this analysis. We hope that this discourse assists the future development of machine learning models and hypothesis generation in microbiome research.

Keywords: bacteria; cancer; contamination; machine learning; microbiome; viruses.

Conflict of interest statement

Colin S. Cooper, Daniel S. Brewer and Abraham Gihawi are co-inventors on a patent application (UK Patent Application No. 2200682.9) from the University of East Anglia/UEA Enterprises Limited regarding the application of biomarker bacterial genera in prostate cancer.

Figures

Fig. 1.
(a) Voom-SNM normalized TCGA samples (n=17 624) that were negative for the crustacean virus Hepandensovirus, with zero classified reads in the original Kraken dataset under the most stringent decontamination approach. One sample contained two sequencing reads for Hepandensovirus and has been omitted from this figure to illustrate inappropriate variation introduced by SNM. The colour of each point indicates the centre where the sample was sequenced and from where the resulting data were submitted [University of North Carolina, Harvard Medical School, Canada’s Michael Smith Genome Sciences Centre, Broad Institute MIT and Harvard, Baylor College of Medicine, Washington University School of Medicine, MD Anderson – Institute for Applied Cancer Science, Johns Hopkins/University of Southern California, MD Anderson RPPA Core Facility (Proteomics)]. The x-axis shows cancer types using TCGA abbreviations as in Poore et al. [1]. This is a prominent concern, especially given how closely linked sequencing centre and disease type are (Table S3). Raw (b) and Voom-SNM normalized (c) values for Ignicoccus, which was deemed the most important feature for predicting prostate cancer (PCa) from all other cancer types (n=13 883 primary tumours). Median values are as follows: Kraken raw other 0, Kraken raw PCa 1, normalized other 4.49, normalized PCa 5.82. In both the raw and normalized cases, the distributions are significantly different (Wilcoxon signed rank-sum test P<2.2×10⁻¹⁶).
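The check behind panel (a) can be sketched in a few lines: samples with zero classified reads for a taxon carry no signal for it, so any spread in their normalized values, particularly spread that tracks sequencing centre, was introduced by the normalization itself. The snippet below is a minimal illustration only. The function name `zero_read_spread`, the record layout and the toy values are all assumptions made for this example; they are not taken from the authors' code or from the TCGA data.

```python
from collections import defaultdict
from statistics import median


def zero_read_spread(records):
    """Summarize normalized values of zero-read samples per sequencing centre.

    records: iterable of (centre, raw_reads, normalized_value) tuples.
    Returns {centre: (n_samples, min, median, max)} using only the samples
    whose raw read count for the taxon is exactly zero.
    """
    by_centre = defaultdict(list)
    for centre, raw_reads, normalized in records:
        if raw_reads == 0:
            by_centre[centre].append(normalized)
    return {c: (len(v), min(v), median(v), max(v)) for c, v in by_centre.items()}


# Toy, made-up values for illustration only (hypothetical, not real data).
toy = [
    ("centre_A", 0, 1.0),
    ("centre_A", 0, 2.0),
    ("centre_B", 0, 4.0),
    ("centre_B", 0, 5.0),
    ("centre_B", 2, 6.0),  # excluded: this sample has classified reads
]

summary = zero_read_spread(toy)
for centre, stats in sorted(summary.items()):
    print(centre, stats)
```

If the per-centre summaries differ markedly even though every included sample had zero raw reads, a downstream classifier could be learning the sequencing centre rather than biology.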

References

    1. Poore GD, Kopylova E, Zhu Q, Carpenter C, Fraraccio S, et al. Microbiome analyses of blood and tissues suggest cancer diagnostic approach. Nature. 2020;579:567–574. doi: 10.1038/s41586-020-2095-1.
    2. Whalen S, Schreiber J, Noble WS, Pollard KS. Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet. 2022;23:169–181. doi: 10.1038/s41576-021-00434-9.
    3. Cotmore SF, Agbandje-McKenna M, Chiorini JA, Mukha DV, Pintel DJ, et al. The family Parvoviridae. Arch Virol. 2014;159:1239–1247. doi: 10.1007/s00705-013-1914-1.
    4. Hosoya S, Adachi K, Kasai H. Thalassomonas actiniarum sp. nov. and Thalassomonas haliotis sp. nov., isolated from marine animals. Int J Syst Evol Microbiol. 2009;59:686–690. doi: 10.1099/ijs.0.000539-0.
    5. Liu T, Zhang Y, Zhang X, Zhou L, Meng C, et al. Leucothrix sargassi sp. nov., isolated from a marine alga [Sargassum natans (L.) Gaillon]. Int J Syst Evol Microbiol. 2019;69:3857–3862. doi: 10.1099/ijsem.0.003694.