Navigating bottlenecks and trade-offs in genomic data analysis
- PMID: 36476810
- PMCID: PMC10204111
- DOI: 10.1038/s41576-022-00551-z
Abstract
Genome sequencing and analysis allow researchers to decode the functional information hidden in DNA sequences and to study cell-to-cell variation within a cell population. Traditionally, the primary bottleneck in genomic analysis pipelines has been the sequencing itself, which has been much more expensive than the computational analyses that follow. However, an important consequence of the continued drive to expand the throughput of sequencing platforms at lower cost is that analytical pipelines often struggle to keep up with the sheer amount of raw data produced. Computational cost and efficiency have thus become ever more important. Recent methodological advances, such as data sketching, accelerators and domain-specific libraries and languages, promise to address these modern computational challenges. However, despite being more efficient, these innovations come with a new set of trade-offs. Some are expected, such as accuracy versus memory and expense versus time; others are more subtle, including the human expertise needed to use non-standard programming interfaces and to set up complex infrastructure. In this Review, we discuss how to navigate these new methodological advances and their trade-offs.
© 2022. Springer Nature Limited.
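As a rough illustration of the accuracy-versus-memory trade-off that the abstract attributes to data sketching, the sketch below estimates the Jaccard similarity of two k-mer sets from a bottom-k MinHash sketch. It is not taken from the Review: the function names (`kmers`, `bottom_sketch`, `estimate_jaccard`, `mutate`), the choice of k = 21, the BLAKE2b hash and the synthetic sequences are all assumptions made for illustration only.

```python
import hashlib
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer):
    """Map a k-mer to a 64-bit integer (BLAKE2b is an arbitrary choice here)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bottom_sketch(kmer_set, size):
    """Keep only the `size` smallest hash values: this is the whole sketch."""
    return sorted(hash64(x) for x in kmer_set)[:size]


def estimate_jaccard(sketch_a, sketch_b, size):
    """Bottom-k estimator: among the `size` smallest hashes of the union,
    count the fraction that occur in both sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0  # convention for two empty inputs (an assumption)
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


def exact_jaccard(a, b):
    """Exact Jaccard similarity; needs both full k-mer sets in memory."""
    return len(a & b) / len(a | b)


def mutate(seq, rate, rng):
    """Redraw each base at random with probability `rate` (toy mutation model)."""
    return "".join(rng.choice("ACGT") if rng.random() < rate else base
                   for base in seq)


if __name__ == "__main__":
    rng = random.Random(0)
    genome_a = "".join(rng.choice("ACGT") for _ in range(20000))
    genome_b = mutate(genome_a, 0.02, rng)
    set_a, set_b = kmers(genome_a, 21), kmers(genome_b, 21)
    print("exact:", round(exact_jaccard(set_a, set_b), 3))
    for size in (50, 200, 1000):
        sk_a, sk_b = bottom_sketch(set_a, size), bottom_sketch(set_b, size)
        est = estimate_jaccard(sk_a, sk_b, size)
        print(f"sketch size {size}: estimate {est:.3f}")
```

Under these assumptions, a smaller sketch stores fewer hash values per genome but gives a noisier estimate, whereas the exact computation needs every k-mer of both genomes in memory.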
Conflict of interest statement
Competing interests
The authors declare no competing interests.
Related links
- BEETL-fastq: https://github.com/BEETL/BEETL
