Sci Rep. 2025 Jul 21;15(1):26486.
doi: 10.1038/s41598-025-10999-8.

Nanopore full length 16S rRNA gene sequencing increases species resolution in bacterial biomarker discovery

Pablo Aja-Macaya et al. Sci Rep.

Abstract

Discovery of disease-related bacterial biomarkers could be a useful approach for the early prevention or diagnosis of various conditions, such as colorectal cancer. This typically involves analyzing small regions of the 16S rRNA gene (e.g. V3V4) with short-read technologies such as Illumina, which yields genus-level results. However, recent developments in third-generation sequencing, such as the new R10.4.1 chemistry from Oxford Nanopore Technologies (ONT) and its improved basecalling models, are beginning to allow a more complete and accessible species-level analysis through full-length 16S rRNA gene sequencing (spanning regions V1-V9). The goal of this study was therefore to compare and evaluate both approaches, using colorectal cancer biomarker discovery as a representative case. To this end, feces from 123 subjects were analyzed, comparing both methods (Illumina-V3V4 with DADA2 and QIIME2 vs. ONT-V1V9 with Emu), multiple Dorado basecalling models (fast, hac and sup) and multiple databases (SILVA vs. Emu's Default database). The basecalling models produced broadly similar taxonomic output, but the lower the basecalling quality, the significantly higher the number of observed species and the more the taxonomic identification differed (p-value < 0.05). Database choice with Emu greatly influenced the identified species, with Emu's Default database yielding significantly higher diversity and more identified species than SILVA (p-value < 0.05). However, because of its database structure, it at times overconfidently classified what should be an unknown species as the closest match. Bacterial abundance at the genus level correlated well between Illumina-V3V4 and ONT-V1V9 (R² ≥ 0.8). Nanopore sequencing identified more specific bacterial biomarkers for colorectal cancer than Illumina, such as Parvimonas micra, Fusobacterium nucleatum, Peptostreptococcus stomatis, Peptostreptococcus anaerobius, Gemella morbillorum, Clostridium perfringens, Bacteroides fragilis and Sutterella wadsworthensis. Prediction of colorectal cancer through manual feature selection and machine learning resulted in an AUC of 0.87 with 14 species, or 0.82 with just 4 species (P. micra, F. nucleatum, B. fragilis and Agathobaculum butyriciproducens). Full-length 16S rRNA V1V9 sequencing with Oxford Nanopore and its new R10.4.1 chemistry achieved accurate species-level bacterial identification, facilitating the discovery of more precise disease-related biomarkers and increasing the taxonomic fidelity of future microbiome analyses.
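
The AUC values reported above summarize how well a model's scores separate cancer from control samples. As a rough illustration only, the sketch below computes AUC from its rank-based definition: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name `auc_from_scores`, the scores and the labels are hypothetical and are not taken from the study, which does not describe its model at this level of detail in the abstract.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: model outputs, higher meaning "more likely cancer" (assumption).
    labels: 1 for cancer, 0 for control (hypothetical encoding).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical scores for four samples (two cancer, two control).
example_auc = auc_from_scores([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
print(example_auc)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a cohort of this size; a library routine would normally be preferred in practice.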

Keywords: 16S rRNA; Biomarkers; Colorectal cancer; Gut microbiome; Illumina; Metabarcoding; Oxford Nanopore; V1V9.

Conflict of interest statement

Declarations. Competing interests: All the authors declare no competing interests. Ethics approval and consent to participate: This study adhered to the standards of clinical practice and research regulations (Law of Biomedical Research 14/2007), in agreement with the Declaration of Helsinki and the Convention on Human Rights and Biomedicine. Compliance with the protection of non-public personal data of all those involved within the RGPD – UE 2016/679, LOPDGDD 3/2018 Law 41/2002 and its implementing regulations, Royal Decree 1720/2007, was enforced. This project (PI20/00413), granted by Carlos III Health Institute (ISCIII; Spain), was supervised by the local ethics committee, the Research Ethical Committee of Galicia (code CEIm-G 2018/609, Galicia, Spain), and by the Spanish Agency for Medicines and Healthcare Products (AEMPS) for the use of CRC patients’ samples from CHUAC (A Coruña, Galicia, Spain). The experimental protocols were approved by the Ethical Committee of Galicia (CEIm-G 2018/609, Galicia, Spain). Informed consent for Biobank (CHUAC, A Coruña, Galicia, Spain, UNE-EN ISO 9001-2015) was signed by all individuals grouped in this study. Anonymized clinical data used during the study for CRC patients was obtained from the Galician Health Service (SERGAS). All individuals recruited in this project (cancer patients and controls) signed a formal consent form for the publication of scientific and clinical results in scientific articles.

Figures

Fig. 1
Comparison of ONT-V1V9 basecalling models. (A) Distribution of the average quality of reads per basecalling model. (B) β-diversity analysis using SILVA database and MDS+JSD (Multidimensional Scaling and Jensen-Shannon Divergence). The same samples with different models are connected by lines. (C) α-diversity analysis using SILVA and Emu’s Default database. Significance levels are given through Wilcoxon rank sum tests and comparisons between groups are not shown.
Fig. 2
Comparison of Illumina-V3V4 and ONT-V1V9 approaches using SILVA. (A) α-diversity indexes at the genus level, comparing the two volunteer groups (cancer vs. control). (B) β-diversity analysis at the genus level. (C) β-diversity analysis at the species level. (D) Mean Centered Log Ratio (CLR) abundance correlation between approaches and for each group (cancer vs. control). Each point represents a different genus, and color indicates whether that taxon appears, on average, in both, none or only one of the approaches. Three relevant genera, which contain multiple species related to colorectal cancer, are highlighted (Fusobacterium, Parvimonas and Peptostreptococcus).
Fig. 3
Comparison of Illumina-V3V4 and ONT-V1V9 approaches with both databases for specific genera in each subject group. Three important genera, which contain multiple species associated with colorectal cancer, are highlighted (Fusobacterium, Parvimonas and Peptostreptococcus). (A) Centered Log Ratio (CLR) abundance of each genus per group, using Wilcoxon rank sum tests to assess significance. (B) Percentage of samples where each genus is present per group.
Fig. 4
Colorectal cancer biomarkers obtained with ONT-V1V9 and both databases. (A) Differential abundance analysis (DAA) through ANCOM-BC using ONT-V1V9 and SILVA. Control subjects are the reference group, meaning a higher Log Fold Change (LFC) indicates higher abundance of a taxon in the cancer group. (B) DAA through ANCOM-BC using ONT-V1V9 and Emu’s Default database. Control subjects are the reference group, meaning a higher Log Fold Change (LFC) indicates higher abundance of a taxon in the cancer group. (C) Centered Log Ratio (CLR) abundance of species with significant differences between cancer and control groups, indicated through Wilcoxon rank sum tests. (D) AUC of two models (Illumina-V3V4 or ONT-V1V9) using manually selected features based on read identity to reference, ANCOM-BC and CLR abundance. Sample size (n) refers to the number of features included in each model. A complete list of species and genera included in each combination is found in Supplementary Table 1.
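
Several of the captions above rely on Centered Log Ratio (CLR) abundance and on the correlation of CLR values between the two sequencing approaches. The sketch below is a minimal illustration of both ideas, assuming a pseudocount of 0.5 to handle zero counts and using made-up genus counts for a single sample; the study's actual pseudocount, software and per-group averaging are not stated in this text, and the names `clr` and `r_squared` are hypothetical.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    The pseudocount is an assumption made here so that zero counts
    can be log-transformed; it is not a value taken from the study.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [value - mean_log for value in logs]


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical genus-level counts for one sample under each approach.
illumina_counts = [120, 30, 5, 0, 845]
nanopore_counts = [110, 42, 9, 1, 838]

r2 = r_squared(clr(illumina_counts), clr(nanopore_counts))
print(round(r2, 3))
```

CLR values within a sample sum to zero by construction, which is why they are compared across approaches through correlation rather than as raw proportions.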
