2018 May 30;18(1):80.
doi: 10.1186/s12862-018-1198-x.

Mind the gap! The mitochondrial control region and its power as a phylogenetic marker in echinoids

Omri Bronstein et al. BMC Evol Biol. 2018.

Abstract

Background: In Metazoa, mitochondrial markers are the most commonly used targets for inferring species-level molecular phylogenies due to their extremely low rate of recombination, maternal inheritance, ease of use and fast substitution rate in comparison to nuclear DNA. The mitochondrial control region (CR) is the main non-coding area of the mitochondrial genome and contains the mitochondrial origin of replication and transcription. While sequences of the cytochrome oxidase subunit 1 (COI) and 16S rRNA genes are the prime mitochondrial markers in phylogenetic studies, the highly variable CR is typically ignored and not targeted in such analyses. However, the higher substitution rate of the CR can be harnessed to infer the phylogeny of closely related species, and the use of a non-coding region alleviates biases resulting from both directional and purifying selection. Additionally, complete mitochondrial genome assemblies utilizing next generation sequencing (NGS) data often show exceptionally low coverage at specific regions, including the CR. This can only be resolved by targeted sequencing of this region.

Results: Here we provide novel sequence data for the echinoid mitochondrial control region in over 40 species across the echinoid phylogenetic tree. We demonstrate the advantages of directly targeting the CR and adjacent tRNAs to facilitate complementing low coverage NGS data from complete mitochondrial genome assemblies. Finally, we test the performance of this region as a phylogenetic marker both in the lab and in phylogenetic analyses, and demonstrate its superior performance over the other available mitochondrial markers in echinoids.

Conclusions: Our target region of the mitochondrial CR (1) facilitates the first thorough investigation of this region across a wide range of echinoid taxa, (2) provides a tool for complementing missing data in NGS experiments, and (3) identifies the CR as a powerful, novel marker for phylogenetic inference in echinoids due to its high variability, lack of selection, and high compatibility across the entire class, outperforming conventional mitochondrial markers.

Keywords: Control region; Echinoidea; Mitochondrial markers; Molecular phylogeny; NGS; Sea urchins.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Representation of echinoid complete mitochondrial genomes assembled from NGS data, showing gene annotation and coverage. The annotated genomes are represented by four echinoid species: Hemicentrotus pulcherrimus, Strongylocentrotus fragilis, Mesocentrotus franciscanus, and Strongylocentrotus intermedius, corresponding to GenBank accession numbers: KC898202, KC898198, KC898199, and KC898200, respectively. Annotations are given at the outer margin of the external circle. Concentric circles represent the corresponding coverage for each of the represented species mitogenomes. Data was obtained from Kober and Bernardi [86, 87]. Enlarged segment illustrates the position of the various primers used in the current study. Coverage was calculated in BRIG [88], after read mapping with Bowtie2 [89] (using the predefined alignment threshold “very-sensitive”). Annotations are based on those for H. pulcherrimus (GenBank accession no. NC_023771) and radial plots generated using BRIG
Fig. 2
Pairwise tree comparisons for phylogenetic trees based on commonly used mitochondrial markers. Trees include the two most commonly used phylogenetic mitochondrial markers: a fragment of the cytochrome c oxidase subunit 1 (a) gene and a fragment of the 16S ribosomal RNA (c) as well as the novel tRNAs and control region (e). To facilitate independent comparisons, the genetically inferred trees were restricted to the 35 publicly available complete echinoid mitochondrial genomes. Genera represented by more than one species were collapsed and are depicted by single branches. Supporting values (> 0.85 posterior probabilities and > 75% ML bootstrap values) are shown next to nodes. Topological comparisons between the genetically inferred trees and current classification (b, d, f) (see text for details) were visualised using Phylo.io [62]. Colour scale for the comparison metric (a variant of the Jaccard Index as implemented in Phylo.io) ranges from 0 (subtrees completely different) to 1 (subtree structure of the respective node is identical)
Fig. 3
Substitution saturation plot of the CRA marker based on the CRA-All dataset. The number of transitions (s) and transversions (v) is plotted against F84 genetic distance. A linear correlation is sustained for both transitions and transversions as expected in the absence of saturation
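As a rough illustration of the quantities plotted in a saturation plot like this one, the sketch below counts transitions and transversions between two aligned sequences. It is not the authors' pipeline: the function name `count_ts_tv` is hypothetical, the input is assumed to be a pre-aligned pair of equal length, and the F84 distance on the x-axis is not computed here.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Sites with a gap or an ambiguous base in either sequence are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Ignore identical sites and anything that is not an unambiguous base.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A transition stays within purines (A<->G) or pyrimidines (C<->T).
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

Applying this to every sequence pair and plotting the two counts against a pairwise distance would give the kind of scatter described in the caption; a roughly linear trend in both series is what the caption interprets as absence of saturation.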
Fig. 4
Phylogenetic tree reconstruction of the echinoid control region and adjacent areas (CRA). The BI tree presented is based on 86 unique haplotypes retrieved from a total of 110 sequences, 405 bp long (see Table 1 for details on the sequences used for this tree). Supporting values (> 0.5 posterior probabilities and > 50% ML bootstrap values) are shown above the nodes
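The reduction from 110 sequences to 86 unique haplotypes can be pictured as grouping identical aligned sequences. The following is a minimal sketch under simple assumptions: `collapse_haplotypes` is a hypothetical helper, sequences are compared as exact strings after upper-casing, and no special handling of gaps or ambiguity codes is attempted, which the authors' actual procedure may well include.

```python
from collections import defaultdict


def collapse_haplotypes(records):
    """Group sequence names by identical sequence.

    `records` maps a sequence name to its aligned sequence string.
    Returns a dict mapping each unique sequence to the list of names sharing it.
    """
    groups = defaultdict(list)
    for name, seq in records.items():
        # Case-insensitive exact match; identical strings form one haplotype.
        groups[seq.upper()].append(name)
    return dict(groups)


# Toy usage with made-up sequences (not data from the study).
example = {"s1": "ACGT", "s2": "acgt", "s3": "ACGA"}
haplotypes = collapse_haplotypes(example)
```

In this toy input, `s1` and `s2` collapse into one haplotype and `s3` forms a second.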
Fig. 5
Coverage (orange curve) and GC content (black curve; 200 bp sliding window, 10 bp step width) through the mitogenome of Hemicentrotus pulcherrimus (GenBank accession no. KC898202), illustrating a moderate (R² = 0.335) but highly significant correlation (t-test, p < 10⁻¹⁰⁰) between the two graphs. Note the extreme drop in coverage towards the end of the CR (highlighted in grey), which coincides with a slight decrease in GC content but shows a much stronger negative excursion than other GC-poor areas in the mitogenome of this species (e.g. at nucleotide positions 4.4, 8.5, or 12.6 kb)
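The GC curve described in this caption uses a 200 bp window moved in 10 bp steps. A minimal sketch of such a calculation is given below; the function name `gc_sliding_window` is hypothetical, and the sequence is treated as linear, so windows do not wrap around the circular mitogenome as a dedicated tool might.

```python
def gc_sliding_window(sequence, window=200, step=10):
    """Return (start, gc_fraction) for each full window along the sequence.

    Defaults follow the window and step sizes stated in the caption.
    Only complete windows are reported; the sequence is treated as linear.
    """
    seq = sequence.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        results.append((start, gc_count / window))
    return results


# Toy usage: 200 bp of pure GC followed by 200 bp of pure AT (made-up data).
toy = "GC" * 100 + "AT" * 100
profile = gc_sliding_window(toy)
```

The resulting per-window values could then be compared with per-window read coverage, for example by a correlation coefficient, to reproduce the kind of relationship the caption reports.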

References

    1. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977;74(12):5463–5467. doi: 10.1073/pnas.74.12.5463.
    2. Liu L, Li Y, Li S, Hu N, He Y, Pong R, Lin D, Lu L, Law M. Comparison of next-generation sequencing systems. J Biomed Biotechnol. 2012;2012:1–11.
    3. Goodwin S, McPherson JD, McCombie WR. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet. 2016;17(6):333–351. doi: 10.1038/nrg.2016.49.
    4. Wang W, Wei Z, Lam T-W, Wang J. Next generation sequencing has lower sequence coverage and poorer SNP-detection capability in the regulatory regions. Sci Rep. 2011;1:55. doi: 10.1038/srep00055.
    5. Ekblom R, Smeds L, Ellegren H. Patterns of sequencing coverage bias revealed by ultra-deep sequencing of vertebrate mitochondria. BMC Genomics. 2014;15(1):467. doi: 10.1186/1471-2164-15-467.
