BMC Genomics. 2024 Sep 12;25(1):856.
doi: 10.1186/s12864-024-10778-1.

Optimizing next-generation sequencing efficiency in clinical settings: analysis of read length impact on cost and performance

Pedro Milet Meirelles et al. BMC Genomics.

Abstract

Background: The expansion of sequencing technologies in response to the COVID-19 pandemic has enabled pathogen (meta)genomics to be deployed as a routine component of surveillance in many countries. Scaling genomic surveillance, however, incurs costs in both equipment and sequencing reagents, which should be optimized. Here, we evaluate the cost efficiency and performance of different read lengths for identifying pathogens in metagenomic samples, comparing performance metrics, costs, and time requirements for read lengths of 75, 150 and 300 base pairs (bp).

Results: Moving from 75 bp to 150 bp reads approximately doubles both cost and sequencing time. Opting for 300 bp reads leads to approximately two- and three-fold increases in cost and sequencing time, respectively, compared to 75 bp reads. For viral pathogen detection, median sensitivity ranged from 99% with 75 bp reads to 100% with 150 bp and 300 bp reads. Bacterial pathogen detection, however, was less effective with shorter reads: median sensitivity was 87% with 75 bp, 95% with 150 bp, and 97% with 300 bp reads. These findings were consistent across different levels of taxa abundance. The precision of pathogen detection with shorter reads was comparable to that with longer reads across most viral and bacterial taxa.
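For readers less familiar with the metrics reported above, the following is a minimal sketch of how sensitivity, precision, specificity, and accuracy are conventionally computed from detection counts. It uses the standard textbook definitions; the `Counts` class and the example numbers are hypothetical and are not taken from the study's data or pipeline.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical per-taxon detection counts (not from the study)."""
    tp: int  # pathogen present and detected
    fp: int  # pathogen absent but reported
    tn: int  # pathogen absent and not reported
    fn: int  # pathogen present but missed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def sensitivity(c: Counts) -> float:
    # Fraction of true pathogens that were detected.
    return _ratio(c.tp, c.tp + c.fn)


def precision(c: Counts) -> float:
    # Fraction of reported pathogens that were truly present.
    return _ratio(c.tp, c.tp + c.fp)


def specificity(c: Counts) -> float:
    # Fraction of absent pathogens that were correctly not reported.
    return _ratio(c.tn, c.tn + c.fp)


def accuracy(c: Counts) -> float:
    # Fraction of all calls that were correct.
    return _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn)


if __name__ == "__main__":
    example = Counts(tp=87, fp=3, tn=900, fn=13)  # illustrative values only
    print(f"sensitivity={sensitivity(example):.3f}")
    print(f"precision={precision(example):.3f}")
    print(f"specificity={specificity(example):.3f}")
    print(f"accuracy={accuracy(example):.3f}")
```

Under these definitions, a sensitivity of 87% means that 13 of every 100 truly present pathogens were missed, regardless of how many false positives were reported.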

Conclusions: In disease outbreak situations, when pathogens must be identified swiftly, we suggest prioritizing 75 bp read lengths, especially when the aim is to detect viral pathogens. This practical approach allows better use of resources, enabling more samples to be sequenced with streamlined workflows while maintaining a reliable response capability.
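The resource trade-off behind this recommendation can be illustrated with simple arithmetic. The sketch below assumes the approximate fold-changes quoted in the Results act as per-sample multipliers relative to 75 bp reads (cost: 1x, 2x, 2x; time: 1x, 2x, 3x for 75, 150, and 300 bp). That reading, the function names, and the budget value are assumptions made for illustration, not figures or methods from the study.

```python
# Approximate multipliers relative to 75 bp reads, read off the abstract.
# Treating them as exact per-sample factors is an assumption.
RELATIVE_COST = {75: 1.0, 150: 2.0, 300: 2.0}
RELATIVE_TIME = {75: 1.0, 150: 2.0, 300: 3.0}


def samples_within_budget(budget_units: float, read_length: int) -> int:
    """Number of samples affordable, with the budget expressed in units
    of the cost of sequencing one sample at 75 bp (hypothetical unit)."""
    return int(budget_units // RELATIVE_COST[read_length])


def relative_run_time(read_length: int) -> float:
    """Sequencing time relative to a 75 bp run."""
    return RELATIVE_TIME[read_length]


if __name__ == "__main__":
    budget = 96.0  # hypothetical budget: 96 samples' worth at 75 bp
    for length in (75, 150, 300):
        n = samples_within_budget(budget, length)
        t = relative_run_time(length)
        print(f"{length} bp: {n} samples, {t:.0f}x sequencing time")
```

Under these assumptions, a fixed budget covers about twice as many samples at 75 bp as at 150 bp or 300 bp, which is the sense in which shorter reads allow more samples to be sequenced.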

Keywords: Cost efficiency; Health surveillance; Metagenomics; Pathogen detection.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Sensitivity and precision of viral (A and B) and bacterial (C and D) pathogen species identification for 75, 150, and 300 bp read lengths. Each boxplot represents the distribution of the metric, and the black horizontal line within each box indicates its median value at each read length.
Fig. 2
Accuracy, precision, sensitivity, and specificity of viral pathogen species identification for 75, 150, and 300 bp read lengths. Each row in the heatmap corresponds to a specific viral pathogen species. The color gradient indicates the mean percentage value of each metric, calculated for the taxon over the samples in which it was present. Dark red represents higher values and white represents lower values.
Fig. 3
Sensitivity for viral pathogens. Samples were categorized as WoCS (Without Close Strain) or WiCS (With Close Strain). This plot reports the results for the 300 bp read length. Each column represents a sample. Gray indicates that the taxon is absent; the color gradient indicates the sensitivity for the taxon, with dark red representing higher values and white representing lower values. The WoCS and WiCS groups were composed of the mock metagenomes from PNPT-WoCS plus NPT-WoCS and PNPT-WiCS plus NPT-WiCS, respectively. Only samples containing at least one virus from the pathogen list are shown, so the number of WoCS and WiCS samples in the plot may differ from the total number of samples in these groups.
