Review
Hum Genomics. 2024 Nov 5;18(1):120.
doi: 10.1186/s40246-024-00684-8.

Best practices for germline variant and DNA methylation analysis of second- and third-generation sequencing data

Ferdinando Bonfiglio et al. Hum Genomics.

Abstract

This comprehensive review provides insights and suggested strategies for the analysis of germline variants using second- and third-generation sequencing technologies (SGS and TGS). It addresses the critical stages of data processing, from alignment and preprocessing to quality control, variant calling, and the removal of artifacts. The review emphasizes the importance of meticulous data handling, highlighting advanced methodologies for annotating variants and identifying structural variations and methylated DNA sites. Special attention is given to the inspection of problematic variants, a step that is crucial for ensuring the accuracy of the analysis, particularly in clinical settings where genetic diagnostics can inform patient care. Additionally, the review covers the use of various bioinformatics tools and software that enhance the precision and reliability of these analyses. It outlines best practices for the annotation of variants, including considerations for problematic genetic alterations such as those in the human leukocyte antigen region, runs of homozygosity, and mitochondrial DNA alterations. The review also explores the complexities associated with identifying structural variants and copy number variations, underscoring the challenges posed by these large-scale genomic alterations. The objective is to offer a comprehensive framework for researchers and clinicians, ensuring that genetic analyses conducted with SGS and TGS are both accurate and reproducible. By following these best practices, the review aims to increase the diagnostic accuracy for hereditary diseases, facilitating early diagnosis, prevention, and personalized treatment strategies. It serves as a resource for both novices and experts in the field, providing insights into the latest advancements and methodologies in genetic analysis. It also aims to encourage the adoption of these practices in diverse research and clinical contexts, promoting consistency and reliability across studies.

Keywords: Bioinformatics; DNA methylation; Genetic diagnostics; Germline variants; Hereditary diseases; NGS.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Read alignment within a coding region of the DCLK3 gene showing a putative T > G variant that is poorly supported by the aligned reads
Fig. 2
Read alignment from an SGS experiment showing a potential T > G variant whose call is affected by strand bias
Fig. 3
Alignment of an SGS experiment in a trio within an intronic region flanking an exon of the TBC1D12 gene highlights misalignment issues
Fig. 4
Alignment of an SGS experiment in a single sample around a low-mappability region
Fig. 5
Anomalies in mapped reads and complications affecting the detection of SVs. (A) Anomalies in mapped reads for different types of SVs. Sequencing reads are represented as arrows, with paired reads connected by lines. For discordant reads, a short or long insert is indicated by a red line, and an unexpected orientation of reads is indicated by red arrows. For split/clip reads, the clipped portion of the read is marked in orange. A split read is a single read mapped to two distinct regions, and corresponding clipped reads are also marked in orange. For simplicity, only one forward mapped read is shown for split/clip reads. (B) Complications in SV detection. Repetitive sequences are indicated as red boxes, whereas inserted sequences absent from the reference genome are indicated as orange boxes. These could come from population-specific sequences, mobile elements, or viral sequences. Adapted from Yi et al. [32]
Fig. 6
Homozygosity Mapping in recessive diseases. An individual affected by an autosomal recessive disease whose parents are consanguineous will most likely be homozygous (identical) by descent for the disease allele, as it can pass from a common ancestor through both the paternal and maternal lines, making the child homozygous for the mutation. The chromosomal segments surrounding the disease gene locus are shown with 3 marker positions on both sides. The different marker alleles are represented by different colors. Although for each parent‒child succession, there is the possibility of a crossover (dashed line) occurring in the parents' gametes, there is a high probability that in the affected child, the consecutive markers surrounding the mutation have not recombined and are identical (homozygous) by descent (from Hildebrandt et al. [117])
Fig. 7
Summary of the variant calling process. Graphical outline of the proposed computational analyses for germline variant calling in short-read sequencing. Created in BioRender. BioRender.com/b98g706
Fig. 8
Summary of processes related to targeted analyses, third-generation sequencing, and DNA methylation. Graphical outline describing short-read sequencing targeted approaches, third-generation sequencing based on long reads, and approaches for analyzing genome-wide DNA methylation. Created in BioRender. BioRender.com/b98g706
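
The homozygosity-mapping idea in the Fig. 6 caption (consecutive markers around a recessive disease allele tend to be homozygous by descent) can be illustrated with a minimal sketch. This is not the method used in the review: it is a naive scan for stretches of consecutive homozygous genotype calls on a single chromosome. The function name `find_homozygous_runs`, the `Run` record, the input layout, and the thresholds are all hypothetical choices made for illustration. Dedicated tools such as PLINK or `bcftools roh` use more robust models that tolerate genotyping errors.

```python
from typing import List, NamedTuple, Tuple


class Run(NamedTuple):
    """A stretch of consecutive homozygous markers (hypothetical record)."""
    start: int      # position of the first homozygous marker in the run
    end: int        # position of the last homozygous marker in the run
    n_markers: int  # number of markers in the run


def find_homozygous_runs(
    calls: List[Tuple[int, Tuple[str, str]]],
    min_markers: int = 3,
    min_length_bp: int = 0,
) -> List[Run]:
    """Return runs of consecutive homozygous calls on one chromosome.

    `calls` is assumed to be sorted by position, each entry being
    (position, (allele_1, allele_2)). Any heterozygous call ends the
    current run; this sketch does not tolerate genotyping errors.
    """
    runs: List[Run] = []
    run_start = 0
    run_end = 0
    count = 0

    def flush() -> None:
        # Keep the current run only if it passes both thresholds.
        if count >= min_markers and (run_end - run_start) >= min_length_bp:
            runs.append(Run(run_start, run_end, count))

    for pos, (allele_1, allele_2) in calls:
        if allele_1 == allele_2:
            if count == 0:
                run_start = pos
            run_end = pos
            count += 1
        else:
            flush()
            count = 0
    flush()  # close a run that extends to the last marker
    return runs


if __name__ == "__main__":
    # Toy genotypes, invented for illustration only.
    toy_calls = [
        (100, ("A", "G")),
        (200, ("C", "C")),
        (300, ("T", "T")),
        (400, ("G", "G")),
        (500, ("A", "A")),
        (600, ("C", "T")),
        (700, ("G", "G")),
        (800, ("A", "A")),
    ]
    for run in find_homozygous_runs(toy_calls, min_markers=3):
        print(run)
```

In practice the thresholds would be set in base pairs or centimorgans rather than marker counts alone, and candidate regions would then be intersected with the variants retained after filtering.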

References

    1. Matthijs G, Souche E, Alders M, Corveleyn A, Eck S, Feenstra I, et al. Guidelines for diagnostic next-generation sequencing. Eur J Hum Genet. 2016;24:2–5.
    2. Hu T, Chitnis N, Monos D, Dinh A. Next-generation sequencing technologies: an overview. Hum Immunol. 2021;82:801–11.
    3. Johnson S, Lee K, Riccitelli N. A comparison of Illumina and Element Biosciences sequencing platforms. Cancer Res. 2024;327(6_Supplement):327.
    4. Kumar KR, Cowley MJ, Davis RL. Next-generation sequencing and emerging technologies. Semin Thromb Hemost. 2019;45:661–73.
    5. Pedersen BS, Collins RL, Talkowski ME, Quinlan AR. Indexcov: fast coverage quality control for whole-genome sequencing. Gigascience. 2017;6:1–6.
