Nat Commun. 2022 Oct 6;13(1):5902.
doi: 10.1038/s41467-022-33530-3.

A method for multiplexed full-length single-molecule sequencing of the human mitochondrial genome

Ieva Keraite et al. Nat Commun. 2022.

Abstract

Methods to reconstruct the mitochondrial DNA (mtDNA) sequence using short-read sequencing come with an inherent bias due to amplification and mapping. They can fail to determine the phase of variants, to capture multiple deletions and to cover the mitochondrial genome evenly. Here we describe a method to target, multiplex and sequence at high coverage full-length human mitochondrial genomes as native single molecules, utilizing the RNA-guided DNA endonuclease Cas9. By using the Cas9-induced breaks, which define the beginning and end of the mtDNA sequencing reads, as barcodes, we achieve high demultiplexing specificity and delineation of the full length of the mtDNA, regardless of the structural variant pattern. The long-read sequencing data are analysed with a pipeline in which our custom-developed software, baldur, efficiently detects single-nucleotide heteroplasmy to below 1%, physically determines phase and can accurately disentangle complex deletions. Our workflow is a tool for studying mtDNA variation and will accelerate mitochondrial research.
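To make the "heteroplasmy below 1%" figure concrete, the following is a minimal sketch of how a heteroplasmy fraction can be computed from the bases observed at one mtDNA position. It is not the algorithm used by baldur, which the abstract does not describe; the function name `heteroplasmy` and the example pileup are hypothetical and chosen for illustration only.

```python
from collections import Counter


def heteroplasmy(bases, reference_base):
    """Return {allele: fraction of reads} for each non-reference allele.

    `bases` is the list of base calls covering a single position,
    one entry per read. This is an illustrative helper, not baldur's method.
    """
    counts = Counter(bases)
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {
        allele: n / depth
        for allele, n in counts.items()
        if allele != reference_base
    }


# Hypothetical pileup: 995 reads support the reference A, 5 reads support G.
pileup = ["A"] * 995 + ["G"] * 5
fractions = heteroplasmy(pileup, "A")
```

In this invented example the G allele is carried by 5 of 1000 reads, a fraction of 0.5%. Distinguishing such a fraction from sequencing error requires both high depth and an error model, which this sketch deliberately leaves out.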

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Cas9-mtDNA-enrichment, barcoding, pooling and demultiplexing approach for long-read sequencing.
A schematic overview of full-length mtDNA targeting with selected dual-guide Cas9 cut-sites. After optional treatment of gDNA with Exonuclease V (Exo V) and dephosphorylation, each sample is split into two or more aliquots for dual-guide targeted cleavage. Each cut-site serves downstream as a barcode in the analysis pipeline. The circular mtDNA molecules are opened and Cas9 is removed by Proteinase K (Prot K) digestion, followed by mtDNA dA-tailing and pooling of all of the aliquots. An ONT library is prepared from the pooled samples and sequenced on a nanopore flow cell, followed by basecalling (Guppy v5.0.16). Figure created with BioRender.com.
Fig. 2. Flowchart detailing informatics analysis pipeline steps for Cas9-mtDNA-enrichment sequencing.
After sequencing and basecalling, FASTQ reads are mapped against the reference GRCh38. Afterwards the reads are filtered and demultiplexed. The second mapping against a custom reference is critical for avoiding split reads. Then the baldur software does the variant calling for each demultiplexed sample. Resulting VCFs are used for downstream analysis for the annotation of the variants. Created in Lucidchart (www.lucidchart.com).
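The demultiplexing step can be sketched as follows, assuming that a full-length read from a linearized circular molecule starts and ends close to the Cas9 cut site that opened it. The guide positions are taken from the Fig. 3 caption; the 16,569 bp length is that of the human mtDNA reference sequence; the `tolerance` of 20 bp, the function names, and the use of the caption positions as exact cut coordinates are assumptions made for illustration, not details of the published pipeline.

```python
MT_LENGTH = 16569  # length of the human mtDNA reference sequence in bp

# Guide positions as given in the Fig. 3 caption (treated here as cut sites).
CUT_SITES = {"mt3": 3127, "mt5": 5142, "mt11": 11239}


def circular_distance(a, b, length=MT_LENGTH):
    """Shortest distance between two positions on a circular genome."""
    d = abs(a - b) % length
    return min(d, length - d)


def assign_guide(read_start, read_end, cut_sites=CUT_SITES, tolerance=20):
    """Return the guide whose cut site is near BOTH read ends, else None.

    `tolerance` (bp) is a hypothetical parameter; a real pipeline would
    tune it to the observed spread of read ends.
    """
    matches = [
        name
        for name, site in cut_sites.items()
        if circular_distance(read_start, site) <= tolerance
        and circular_distance(read_end, site) <= tolerance
    ]
    # Reads matching no guide, or more than one, are left unassigned.
    return matches[0] if len(matches) == 1 else None


# Usage with invented read coordinates:
example = assign_guide(3130, 3120)
```

Requiring both ends to match one site is what lets a cut site act as a barcode: a read that starts at one site and ends elsewhere is either fragmented or ambiguous, so this sketch discards it rather than guessing.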
Fig. 3. Multiple mtDNA deletions in a clinical sample.
a, b Circos plot of the lrPCR products of sample AW6506 showing three full-length lrPCR amplicons – two deletions and wild type, sequenced by Illumina short-read (a) and ONT long-read (b) instruments. White arrows at positions m.2120 and m.2119 represent forward and reverse primers, respectively. c–e Circos plots of sample AW6506 targeted with the Cas9-mtDNA-enrichment sequenced on a GridIon flow cell showing three populations of mtDNA. c Targeted with the gRNA mt3 (m.3127), all three populations can be observed – two deletions and the wild type; d gRNA mt5 (m.5142) results in two populations – the small deletion and the wild type; e gRNA mt11 (m.11239) results in capturing the wild-type population only. In the circos plots, the reference circle colours denote genes encoding protein subunits of complex I (green), III (sky blue), IV (royal blue), V (light steel blue), ribosomal RNAs (ocean blue), transfer RNAs (ivory), and non-coding region D-loop (grey). f Summary of mean coverage of selected full-length reads and SV proportions in lrPCR amplicons, sequenced on Illumina and GridIon, and Cas9-mtDNA-enriched native molecules, sequenced on GridIon. Source data are provided as a Source Data file.
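One way to read panels c to e is that a guide can only linearize molecules that still contain its target site, so a deletion spanning that site removes the molecule from the captured pool. The sketch below illustrates this logic. The guide positions come from the caption, but the deletion intervals are invented placeholders chosen only to reproduce the qualitative pattern described; they are not the breakpoints of sample AW6506, which the caption does not give.

```python
# Guide positions as given in the caption.
CUT_SITES = {"mt3": 3127, "mt5": 5142, "mt11": 11239}

# HYPOTHETICAL deletion intervals (start, end), for illustration only.
# None means no deletion. Deletions spanning the origin are not handled.
HYPOTHETICAL_POPULATIONS = {
    "wild type": None,
    "small deletion": (8000, 13000),
    "large deletion": (4500, 14000),
}


def is_captured(cut_site, deletion):
    """A molecule is captured only if the cut site is not deleted."""
    if deletion is None:
        return True
    start, end = deletion
    return not (start <= cut_site <= end)


def populations_for(guide):
    """List the populations a given guide would be able to linearize."""
    site = CUT_SITES[guide]
    return [
        name
        for name, deletion in HYPOTHETICAL_POPULATIONS.items()
        if is_captured(site, deletion)
    ]


captured_by_guide = {guide: populations_for(guide) for guide in CUT_SITES}
```

With these placeholder intervals, mt3 captures all three populations, mt5 captures the wild type and the small deletion, and mt11 captures the wild type only, matching the pattern in panels c, d and e. Under this reading, targeting each sample with more than one guide helps keep deleted molecules from being missed.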
