BMC Genomics. 2024 Jan 17;25(1):68.
doi: 10.1186/s12864-024-09996-4.

Thirteen complete chloroplast genomes of the Costaceae family: insights into genome structure, selective pressure and phylogenetic relationships

Dong-Mei Li et al. BMC Genomics. 2024.

Abstract

Background: Costaceae, commonly known as the spiral ginger family, consists of approximately 120 species distributed in the tropical regions of South America, Africa, and Southeast Asia, some of which have important ornamental, medicinal and ecological value. Previous studies of the phylogeny and taxonomy of Costaceae based on nuclear internal transcribed spacer (ITS) and chloroplast genome fragment data had low resolution. Additionally, the structure, variation and molecular evolution of complete chloroplast genomes in Costaceae remain unclear. Herein, a total of 13 complete chloroplast genomes of Costaceae, including 8 newly sequenced and 5 from the NCBI GenBank database and representing all three distribution regions of this family, were comprehensively analyzed for comparative genomics and phylogenetic relationships.

Results: The 13 complete chloroplast genomes of Costaceae possessed typical quadripartite structures with lengths from 166,360 to 168,966 bp, comprising a large single copy (LSC, 90,802–92,189 bp), a small single copy (SSC, 18,363–20,124 bp) and a pair of inverted repeats (IRs, 27,982–29,203 bp). These genomes encoded 111–113 different genes, including 79 protein-coding genes, 4 rRNA genes and 28–30 tRNA genes. Gene order, gene content, amino acid frequencies and codon usage within Costaceae were highly conserved, but several variations in intron loss, long repeats, simple sequence repeats (SSRs) and gene expansion at the IR/SC boundaries were also found among these 13 genomes. Comparative genomics within Costaceae identified five highly divergent regions: ndhF, ycf1-D2, ccsA-ndhD, rps15-ycf1-D2 and rpl16-exon2-rpl16-exon1. Five combined DNA regions (ycf1-D2 + ndhF, ccsA-ndhD + rps15-ycf1-D2, rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1, ccsA-ndhD + rpl16-exon2-rpl16-exon1, and ccsA-ndhD + rps15-ycf1-D2 + rpl16-exon2-rpl16-exon1) could be used as potential markers for future phylogenetic analyses and species identification in Costaceae. Positive selection was found in eight protein-coding genes: cemA, clpP, ndhA, ndhF, petB, psbD, rps12 and ycf1. Maximum likelihood and Bayesian phylogenetic trees based on chloroplast genome sequences had identical topologies, with high support for the relationships between species of Costaceae. Costaceae was divided into three clades: the Asian clade, the Costus clade and the South American clade. Within the Asian clade, Tapeinochilos was sister to Hellenia, and Parahellenia was sister to the cluster of Tapeinochilos + Hellenia, with strong support. Molecular dating showed that the crown age of Costaceae was about 30.5 Mya (95% HPD: 14.9–49.3 Mya), and that the family started to diverge into the Costus clade and Asian clade around 23.8 Mya (95% HPD: 10.1–41.5 Mya). The Asian clade diverged into Hellenia and Parahellenia at approximately 10.7 Mya (95% HPD: 3.5–25.1 Mya).
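The reported figures can be cross-checked with simple arithmetic. The sketch below is only an illustration, not part of the study: it assumes that a quadripartite genome's length equals LSC + SSC + 2 × IR, and it uses the range endpoints quoted above. The function name is hypothetical. Because the minimum (or maximum) of each region may come from a different species, the endpoint sums only bracket the reported total lengths rather than reproduce them.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Range endpoints quoted in the abstract (bp).
lower_bound = quadripartite_length(lsc=90_802, ssc=18_363, ir=27_982)
upper_bound = quadripartite_length(lsc=92_189, ssc=20_124, ir=29_203)

# Reported total genome lengths (bp).
reported_min, reported_max = 166_360, 168_966

# The reported totals should fall inside the bracket built from the endpoints.
lengths_consistent = lower_bound <= reported_min and reported_max <= upper_bound

# Gene counts: 79 protein-coding + 4 rRNA + 28 to 30 tRNA genes.
gene_count_min = 79 + 4 + 28
gene_count_max = 79 + 4 + 30

print(lower_bound, upper_bound, lengths_consistent)
print(gene_count_min, gene_count_max)
```

The gene-count sums match the reported 111–113 genes, and the reported genome lengths lie within the bracket implied by the region ranges.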

Conclusion: The complete chloroplast genomes can resolve the phylogenetic relationships of Costaceae and provide new insights into genome structure, variation and evolution. The divergent DNA regions identified here would be useful for species identification and phylogenetic inference in Costaceae.

Keywords: Chloroplast genome; Comparative genomics; Costaceae; Divergence time; Genome evolution; Phylogenetic relationships.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Chloroplast genome map of C. barbatus (GenBank accession number: OP712648; the outermost three rings) and CGView comparison of thirteen complete chloroplast genomes in the Costaceae family (the inner rings with different colors). Genes shown on the outside of the outermost ring are transcribed counter-clockwise and those on the inside clockwise. The second outermost ring (darker gray) corresponds to the GC content and the third outermost ring (lighter gray) to the AT content of the C. barbatus chloroplast genome, as drawn by OGDRAW. The gray arrowheads indicate the direction of the genes. LSC, large single copy region; IR, inverted repeat; SSC, small single copy region. The innermost black ring indicates the chloroplast genome size of C. barbatus. The second and third innermost rings indicate GC content and GC skew deviations in the chloroplast genome of C. barbatus, respectively: GC skew + indicates G > C, and GC skew − indicates G < C. The CGView comparison of the thirteen complete chloroplast genomes in Costaceae is displayed from the fourth innermost colored ring outward to the 16th ring, in order: C. barbatus OP712648, C. beckii OP712653, C. dubius OP712651, C. speciosus Guangdong OP712649, C. speciosus var. marginatus OP712652, C. tonkinensis Yunnan OP712650, C. viridis MK262733, C. woodsonii OP712654, H. speciosa Guizhou OK641589, M. uniflorus OP712655, H. lacera ON598391, H. speciosa Yunnan ON598392, and C. tonkinensis ON598393. Similar and highly divergent locations in the chloroplast genomes are represented by continuous and interrupted track lines, respectively. The species in bold are sequenced in this study.
Fig. 2
Analysis of long repeats in thirteen complete chloroplast genomes of the Costaceae family. (A), Total numbers and different types of long repeats in each chloroplast genome. (B), Numbers of long repeats more than 30 bp long in each chloroplast genome. * indicates chloroplast genome of the species sequenced in this study
Fig. 3
Analysis of SSRs in thirteen complete chloroplast genomes of the Costaceae family. (A), Total numbers and different types of SSRs detected in each chloroplast genome. (B), Frequencies of the identified SSRs in different motifs. (C), Frequencies of the identified SSRs in the LSC, SSC and IR regions. (D), SSR distribution in protein-coding regions, introns and intergenic regions detected in each chloroplast genome. * indicates chloroplast genome of the species sequenced in this study
Fig. 4
Heat map analysis for relative synonymous codon usage (RSCU) values of all protein-coding genes of thirteen complete chloroplast genomes in the Costaceae family. Red indicates higher RSCU values and blue indicates lower RSCU values. The species in bold are sequenced in this study
Fig. 5
Comparisons of border distances between adjacent genes and junctions of the LSC, SSC and two IR regions among thirteen complete chloroplast genomes of the Costaceae family. Numbers above or near the colored genes indicate the distances between the genes and the boundary sites. The figure is not drawn to scale for sequence length and only shows relative changes at or near the IR/SC boundaries. The species in bold are sequenced in this study.
Fig. 6
Visualized alignment of thirteen complete chloroplast genome sequences of the Costaceae family using mVISTA. The C. barbatus chloroplast genome sequence was used as a reference. Gray arrows and thick black lines indicate gene orientation. Purple bars represent exons, sky-blue bars represent untranslated regions (UTRs), red bars represent conserved non-coding sequences (CNS), gray bars represent mRNA and white regions represent sequence differences among all analyzed chloroplast genomes. The horizontal axis indicates the coordinates within the chloroplast genome. The vertical scale represents the percentage identity, ranging from 50% to 100%. The species in bold are sequenced in this study.
Fig. 7
Comparisons of nucleotide diversity (Pi) values among thirteen complete chloroplast genomes of the Costaceae family. (A), Protein-coding genes. Protein-coding genes with Pi values > 0.007 are labeled with gene names. (B), Intergenic regions. Intergenic regions with Pi values > 0.025 are labeled with intergenic region names
Fig. 8
Phylogenetic relationships of Costaceae species based on chloroplast genome sequences, reconstructed using the maximum likelihood (ML) and Bayesian inference (BI) methods. (A), ML tree. (B), BI tree. The species in bold are sequenced in this study.
Fig. 9
Divergence time estimation of Costaceae species based on nucleotide sequences of 75 single-copy protein-coding genes shared in 22 chloroplast genomes of Costaceae. The fossil and calibration taxa are indicated with red points on the corresponding nodes. Mean divergence times are shown at the nodes in blue. The numbers inside each blue bracket after the mean divergence time give the 95% highest posterior density (HPD) interval of the estimated divergence time, as minimum and maximum values, respectively. The species in bold are sequenced in this study.
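Fig. 7 compares nucleotide diversity (Pi) values across genes and intergenic regions. This page does not state how Pi was computed, so the following is a minimal sketch of the textbook definition only: the average number of differences per site over all pairs of aligned sequences. The function name, the toy alignment and the choice to skip any column containing a gap or ambiguous base are assumptions made for illustration, not the authors' pipeline.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise differences per site for equal-length aligned sequences.

    Assumption: columns containing anything other than A, C, G or T
    (gaps, ambiguity codes) are excluded from the calculation.
    """
    seqs = [s.upper() for s in aligned]
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # Keep only columns where every sequence has an unambiguous base.
    columns = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not columns:
        raise ValueError("no usable alignment columns")

    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a[i] != b[i] for i in columns) for a, b in pairs)
    return differences / (len(pairs) * len(columns))


# Toy alignment (hypothetical data, not from the study).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
pi_toy = nucleotide_diversity(toy_alignment)
print(round(pi_toy, 4))
```

In the toy alignment the three pairs differ at 1, 1 and 2 of 8 sites, so Pi is 4 / (3 × 8). Applied per gene or per intergenic region, values computed this way could be compared against thresholds such as the 0.007 and 0.025 cutoffs mentioned in the Fig. 7 caption.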
