Plants (Basel). 2020 Nov 28;9(12):1671.
doi: 10.3390/plants9121671.

Comparative Analysis of Complete Chloroplast Genome Sequences of Wild and Cultivated Bougainvillea (Nyctaginaceae)

Mary Ann C Bautista et al. Plants (Basel). 2020.

Abstract

Bougainvillea (Nyctaginaceae) is a popular ornamental plant group primarily grown for its striking colorful bracts. However, despite its established horticultural value, limited genomic resources and molecular studies have been reported for this genus. To address this gap, complete chloroplast (cp) genomes of four species (Bougainvillea glabra, Bougainvillea peruviana, Bougainvillea pachyphylla, Bougainvillea praecox) and one Bougainvillea cultivar were sequenced and characterized. The Bougainvillea cp genomes range from 153,966 bp to 154,541 bp in length, comprising a large single-copy (LSC) region (85,159-85,708 bp) and a small single-copy (SSC) region (18,014-18,078 bp) separated by a pair of inverted repeats (IRs; 25,377-25,427 bp). All sequenced plastomes have 131 annotated genes, including 86 protein-coding, eight rRNA, and 37 tRNA genes. These five newly sequenced Bougainvillea cp genomes were compared to the Bougainvillea spectabilis cp genome deposited in GenBank. The results showed that all cp genomes have highly similar structures, contents, and organization. They all exhibit quadripartite structures and have the same numbers of genes and introns. Codon usage, RNA editing site, and repeat analyses also revealed highly similar results for the six cp genomes. The amino acid leucine has the highest proportion, and almost all favored synonymous codons end in either A or U. Likewise, of the 42 predicted RNA editing sites, most conversions were from serine (S) to leucine (L). The majority of the simple sequence repeats detected were A/T mononucleotides, making the cp genomes A/T-rich. The contractions and expansions of the IR boundaries were minimal as well, and hence contributed very little to the differences in genome size. In addition, sequence variation analyses showed that Bougainvillea cp genomes share nearly identical genomic profiles, though several potential barcodes, such as ycf1, ndhF, and rpoA, were identified. Higher variation was observed in both B. peruviana and B. pachyphylla cp sequences based on SNP and indel analyses. Phylogenetic reconstructions further showed that these two species appear to be the basal taxa of Bougainvillea. The rarely cultivated and wild species of Bougainvillea (B. pachyphylla, B. peruviana, B. praecox) diverged earlier than the commonly cultivated species and cultivar (B. spectabilis, B. glabra, B. cv.). Overall, the results of this study provide additional genetic resources that can aid further phylogenetic and evolutionary studies in Bougainvillea. Moreover, genetic information from this study is potentially useful in identifying Bougainvillea species and cultivars, which is essential for both taxonomic and plant breeding studies.
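
The lengths reported in the abstract can be checked with simple arithmetic. A quadripartite plastome contains one LSC, one SSC, and two IR copies, so its total length should equal LSC + SSC + 2 × IR. The minimal Python sketch below tests whether the reported total-length range lies within the bounds implied by the reported region ranges, and whether the gene categories sum to 131. The function and variable names are illustrative and do not come from the paper. The abstract does not give per-genome region lengths, so only the bounds can be checked.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# Region length ranges (bp) as reported in the abstract: (minimum, maximum).
lsc_range = (85_159, 85_708)
ssc_range = (18_014, 18_078)
ir_range = (25_377, 25_427)
reported_total = (153_966, 154_541)

# Smallest and largest totals that the region ranges would allow.
lower = quadripartite_length(lsc_range[0], ssc_range[0], ir_range[0])
upper = quadripartite_length(lsc_range[1], ssc_range[1], ir_range[1])

# The reported totals should fall inside these bounds.
consistent = lower <= reported_total[0] <= reported_total[1] <= upper

# Gene categories reported in the abstract: protein-coding, rRNA, tRNA.
gene_total = 86 + 8 + 37

print(lower, upper, consistent, gene_total)
```

With the abstract's figures, the bounds are 153,927 bp and 154,640 bp, so the reported range of 153,966 bp to 154,541 bp is consistent with them, and the gene categories sum to the stated 131.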

Keywords: Bougainvillea; Nyctaginaceae; chloroplast genome; phylogeny.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
Circular gene map of five newly sequenced Bougainvillea chloroplast genomes. The genes drawn outside the circle are transcribed clockwise, while the genes on the inside are transcribed counterclockwise. Genes belonging to different functional groups are color-coded. The dark gray plot in the inner circle represents the GC (guanine-cytosine) content, whereas the light gray plot corresponds to the AT (adenine-thymine) content.
Figure 2
Percentages of amino acids in the protein-coding regions of six Bougainvillea chloroplast genomes.
Figure 3
Analysis of simple sequence repeats (SSRs) in six Bougainvillea chloroplast genomes. (A) Simple sequence repeats detected in coding and non-coding regions of six Bougainvillea cp genomes. (B) Simple sequence repeats distributions in the LSC, SSC, and IR regions of Bougainvillea cp genomes. (C) Numbers of different types of SSRs identified in the Bougainvillea cp genomes. (D) Frequency of various SSR types identified in six Bougainvillea cp genomes.
Figure 4
Tandem repeat analysis in six Bougainvillea cp genomes. (A) Frequency of tandem repeats in the non-coding and coding regions of six cp genomes. (B) Distributions of the detected tandem repeats in LSC, SSC, and IR regions. (C) Lengths of the identified tandem repeats in all six Bougainvillea cp genomes.
Figure 5
Comparisons of LSC, SSC, and IR junctions among the six chloroplast genomes.
Figure 6
Sequence identity plots (mVISTA) among Bougainvillea species. Alignments of the five Bougainvillea plastomes, with Bougainvillea glabra as the reference genome. Genes are color-coded, whereby pink regions represent conserved non-coding sequences (CNS) and purple regions indicate protein-coding sequences. Grey arrows above the alignments indicate gene directions. The y-axis denotes the percentages of identity, ranging between 50% and 100%.
Figure 7
Nucleotide diversity (Pi) of various regions in Bougainvillea chloroplast genomes. (A) Nucleotide diversity values in the protein-coding regions. (B) Nucleotide diversity values in the non-coding regions.
Figure 8
Summary of SNPs detected in the five Bougainvillea chloroplast genomes. (A) Frequency of SNPs in the coding and non-coding regions. (B) Protein-coding genes with highest numbers of synonymous and non-synonymous SNPs.
Figure 9
Summary of insertions and deletions found in five Bougainvillea cp genomes. (A) Total number of indels in five Bougainvillea species. (B) Numbers of indels located in the protein-coding genes. (C) Lengths of indels identified in five cp genomes.
Figure 10
Maximum Likelihood (ML) and Bayesian Inference (BI) consensus tree based on the 79 concatenated protein-coding regions of 14 Nyctaginaceae cp genomes. Species from Petiveriaceae were used as outgroups. Numbers on each node represent bootstrap support and Bayesian posterior probability (BPP) values. Branches with bootstrap values > 75 and BPP values > 95 are considered highly supported.
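
Figure 7 reports nucleotide diversity (Pi), which in its basic form is the average number of pairwise differences per site among aligned sequences. The minimal Python sketch below computes Pi for a toy alignment. The sequences are invented for illustration and are not Bougainvillea data, and the function name is hypothetical. This page does not state which software or window settings the authors used, so the sketch shows only the basic definition and ignores gaps and ambiguous bases.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average number of pairwise differences per site (Pi).

    Assumes equal-length, gap-free aligned sequences.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(aligned, 2))
    # Count mismatched positions for every pair of sequences.
    total_differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return total_differences / (len(pairs) * length)


# Toy alignment (invented): four sequences, ten sites.
toy_alignment = [
    "ATGCATGCAT",
    "ATGCATGCAT",
    "ATGAATGCAT",
    "ATGAATGCTT",
]

pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```

In the toy alignment, the six sequence pairs differ at 7 positions in total across 10 sites, giving Pi = 7/60. Applying the same calculation region by region is what allows more variable regions to be compared with conserved ones.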
