Diagnostics (Basel). 2025 Sep 10;15(18):2294.
doi: 10.3390/diagnostics15182294.

Optimization of DNA Fragmentation Techniques to Maximize Coverage Uniformity of Clinically Relevant Genes Using Whole Genome Sequencing

Vanessa Process et al. Diagnostics (Basel). 2025.

Abstract

Background: Coverage uniformity is pivotal in whole genome sequencing (WGS), as uneven read distributions can obscure clinically relevant variants and compromise downstream analyses. Enzyme-based fragmentation methods for WGS library preparation are widely used, but they can introduce sequence-specific biases that disproportionately affect high-GC or low-GC regions. Here, we compare four PCR-free WGS library preparation workflows (one employing mechanical fragmentation and three based on enzymatic fragmentation) to assess their impact on coverage uniformity and variant detection.

Results: Libraries were generated with Coriell NA12878 DNA and with DNA isolated from blood, saliva, and formalin-fixed paraffin-embedded (FFPE) samples. Sequencing was performed on an Illumina NovaSeq 6000, followed by alignment to the human reference genome (GRCh38/hg38) and local realignment. We assessed coverage at both chromosomal and gene levels, including the 504 clinically relevant genes included in the TruSight™ Oncology 500 (TSO500) panel. Additionally, we examined the relationship between GC content and normalized coverage, as well as variant detection across high- and low-GC regions.

Conclusions: Our findings show that mechanical fragmentation yields a more uniform coverage profile across different sample types and across the GC spectrum. Enzymatic workflows, on the other hand, showed more pronounced coverage imbalances, particularly in high-GC regions, potentially affecting the sensitivity of variant detection. This effect was evident in analyses focusing on the TSO500 gene set, where uniform coverage is critical for accurate identification of disease-associated variants and for minimizing false negatives. Downsampling experiments further revealed that mechanical fragmentation maintained lower single nucleotide polymorphism (SNP) false-negative and false-positive rates at reduced sequencing depths, highlighting the advantages of consistent coverage for resource-efficient WGS. This study introduces a novel framework for evaluating WGS coverage uniformity, providing guidance for optimizing library preparation protocols in clinical and translational research. By quantifying how fragmentation strategies influence coverage depth and variant calling accuracy, laboratories can refine their sequencing workflows to ensure more reliable detection of clinically actionable variants, especially in high-GC regions often implicated in hereditary disease and oncology.

Keywords: GC-bias; PCR-free library preparation; adaptive focused acoustics (AFA) fragmentation; chromosomal coverage; coverage uniformity; enzymatic fragmentation; library preparation; next-generation sequencing (NGS); variant detection; whole genome sequencing (WGS).
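
The relationship between GC content and normalized coverage described in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the function names, the 5% bin width, and the definition of normalized coverage (mean depth in a GC bin divided by mean depth over all windows) are assumptions made for illustration only. A normalized value of 1 in every bin, and an R² near 0 for coverage regressed on %GC, would both indicate little GC bias.

```python
from collections import defaultdict


def gc_percent(seq):
    """Percent G/C among unambiguous (A/C/G/T) bases of a window."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    gc = seq.count("G") + seq.count("C")
    return 100 * gc / acgt if acgt else 0.0


def normalized_coverage_by_gc(windows, bin_width=5):
    """Assumed definition: mean depth per GC bin / mean depth overall.

    windows: iterable of (sequence, mean_depth) pairs, one per window.
    Returns {bin_start_percent: normalized_coverage}.
    """
    windows = list(windows)
    overall = sum(depth for _, depth in windows) / len(windows)
    bins = defaultdict(list)
    for seq, depth in windows:
        # Clamp 100% GC into the last bin instead of opening a new one.
        start = min(int(gc_percent(seq) // bin_width) * bin_width,
                    100 - bin_width)
        bins[start].append(depth)
    return {b: (sum(d) / len(d)) / overall for b, d in sorted(bins.items())}


def r_squared(xs, ys):
    """R^2 of a simple least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no linear relationship to measure
    return (sxy * sxy) / (sxx * syy)
```

In practice the windows would come from a reference FASTA and a per-base depth file; here they are passed in directly so the sketch stays self-contained.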

Conflict of interest statement

Several authors (Vanessa Process, Madan Ambavaram, Sameer Vasantgadkar, Sushant Khanal, Martina Werner, Greg Endress, Ulrich Thomann, and Eugenio Daviso) are employees of Covaris LLC, a PerkinElmer company. This study compares DNA fragmentation methods, including the truCOVER PCR-free Library Prep Kit, which is developed and manufactured by Covaris LLC. Covaris LLC provided funding and resources for this research. The findings indicate superior performance for mechanical fragmentation, which is the technology employed by Covaris’s truCOVER product.

Figures

Figure 1
Base composition of the first 20 bases of sequenced reads with Coriell DNA. Line plots show the percentage of each base at the first 20 positions of Read 1 and Read 2 for each library prep kit. Each line represents a different nucleotide base.
Figure 2
GC-bias line plots showing normalized coverage as a function of GC content across the human genome, calculated in 100 base pair windows. The normalized coverage values shown are from the average of the three technical replicates. Each line represents a different sample type. A normalized coverage value of 1 (shown as a black line) would indicate no GC bias.
Figure 3
Chromosomal coverage distribution normalized by total coverage. (A) Representative line plot of normalized coverage across the autosomal chromosomes for saliva samples, with each line representing a different library kit. The x-axis shows the 22 autosomal chromosomes. (B) Scatterplots showing the relationship between chromosomal %GC content and the normalized coverage for each library kit. Linear regression models were fitted for the different sample types. A normalized coverage value of 1 (shown as a gray line) would indicate no GC bias. (C) Bar plots of R² values for the linear regression models shown in (B). The plots summarize the relationship between the average %GC of chromosomes and the corresponding normalized coverage. An R² value close to 0 indicates minimal correlation between GC content and coverage.
Figure 4
Linear regression analysis of GC content and normalized coverage of the TSO500 genes. (A) Linear regression models overlaid to highlight coverage trends; each line represents a different sample type. (B) Bar plots showing the R² values for the linear regression lines in (A), highlighting the relationship between GC content and coverage. An R² value close to 0 indicates minimal correlation between GC content and coverage.
Figure 5
Normalized coverage distributions across the TSO500 genes. (A) Histogram plots of average normalized coverage using kernel density estimation, separated by library kit and sample type. (B) Full width at half maximum (FWHM) for each density curve shown in (A), used as a metric for uniformity of coverage. The lowest FWHM values are observed with the Covaris library prep across sample types, indicating high coverage uniformity with minimal impact of GC content on coverage.
Figure 6
NA12878 Variant Performance within TSO500 regions. Comparison of true and false SNP calls within the TSO500 regions for NA12878 at two different coverage depths across various library kits. The Y-axis represents the count of variant calls. (A) Displays true positive SNP calls out of 549, with the top bar plot representing 10X coverage and the bottom bar plot representing 20X coverage. (B) Illustrates false negative and false positive SNP calls for the same coverage depths.
Figure 7
NA12878 variant performance across GC content regions, compared between 10X and 20X coverage for the different library kits. The X-axis represents GC content regions ranging from 15 to 85%, with most bins spanning 5% GC intervals, except for the 30–55% GC bin, which covers 25% GC. The individual bins indicate ≥100 bp hg38 regions that fall within the specified GC range, based on the GIAB stratification files. (A) F1-score of SNPs. (B) F1-score of indels.
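
The FWHM uniformity metric described for Figure 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the Gaussian kernel, the Silverman rule-of-thumb bandwidth, the grid resolution, and the function name `fwhm` are all choices made here for the example. A narrower FWHM means per-gene normalized coverage values cluster more tightly around the peak, that is, coverage is more uniform.

```python
import math
import statistics


def fwhm(values, bandwidth=None, grid_points=2001):
    """Full width at half maximum of a Gaussian KDE of `values`.

    Assumptions (not from the source): Gaussian kernel; bandwidth from
    Silverman's rule of thumb unless one is given explicitly. For a
    multimodal density, the width spans the outermost half-max crossings.
    """
    values = list(values)
    n = len(values)
    if bandwidth is None:
        # Needs at least two distinct values, otherwise stdev is 0 or undefined.
        bandwidth = 1.06 * statistics.stdev(values) * n ** (-0.2)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    def density(x):
        # Average of Gaussian kernels centred on each observation.
        total = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        return total / (n * bandwidth * math.sqrt(2 * math.pi))

    # Pad the grid by 3 bandwidths so the density is far below half-max
    # at both ends of the grid.
    lo = min(values) - 3 * bandwidth
    hi = max(values) + 3 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    xs = [lo + i * step for i in range(grid_points)]
    ys = [density(x) for x in xs]

    half_max = max(ys) / 2
    above = [x for x, y in zip(xs, ys) if y >= half_max]
    return above[-1] - above[0]
```

As a sanity check, a single observation with a fixed bandwidth h yields a plain Gaussian, whose FWHM is 2·sqrt(2·ln 2)·h, roughly 2.355·h; the grid-based estimate should agree to within the grid spacing.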

