Viruses. 2023 Jul 28;15(8):1643.
doi: 10.3390/v15081643.

Discovery and Analyses of Caulimovirid-like Sequences in Upland Cotton (Gossypium hirsutum)

Nina Aboughanem-Sabanadzovic et al. Viruses.

Abstract

Analyses of Illumina-based high-throughput sequencing data generated during characterization of the cotton leafroll dwarf virus population in Mississippi (2020–2022) consistently yielded contigs varying in size (most frequently from 4 to 7 kb) with identical nucleotide content and sharing similarities with reverse transcriptases (RTases) encoded by extant plant pararetroviruses (family Caulimoviridae). Initial data prompted an in-depth study involving molecular and bioinformatic approaches to characterize the nature and origins of these caulimovirid-like sequences. As a result, here, we report on endogenous viral elements (EVEs) related to extant members of the family Caulimoviridae, integrated into the genome of upland cotton (Gossypium hirsutum), for which we propose the provisional name "endogenous cotton pararetroviral elements" (eCPRVE). Our investigations pinpointed a ~15 kbp-long locus on the A04 chromosome consisting of head-to-head oriented tandem copies located on positive- and negative-sense DNA strands (eCPRVE+ and eCPRVE-). Sequences of the eCPRVE+ comprised nearly complete and slightly decayed genome information, including ORFs coding for the viral movement protein (MP), coat protein (CP), RTase, and transactivator/viroplasm protein (TA). Phylogenetic analyses of major viral proteins suggest that the eCPRVE+ may have been initially derived from a genome of a cognate virus belonging to a putative new genus within the family. Unexpectedly, an identical 15 kb-long locus composed of two eCPRVE copies was also detected in a newly recognized species G. ekmanianum, shedding some light on the relatively recent evolution within the cotton family.

Keywords: Caulimoviridae; cotton; endogenous form; episomal form; genome integration; pararetrovirus; virus.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Figures

Figure 1
Maximum likelihood phylogenetic tree reconstructed with IQtree [42] from the amino acid alignments of reverse-transcriptases encoded by genomes of recognized and putative members of the family Caulimoviridae and manually adjusted sequences of the endogenous cotton pararetroviral element—eCPRVE (red font). The representative tree was visualized with iTOL v5 [44]. Different colors denote the 11 currently recognized genera in the family. Clades with statistical support of >90% are indicated by a red circle.
Figure 2
PCR results in 1.5% TAE agarose gels visualized by GelRed. The eCPRVE-specific primers used in reactions are presented in the upper gel portion (A), while the PCR results obtained with Nad5-specific primers from the same samples are presented in the lower part (B). Lanes 1–8: various non-upland cotton samples; lanes 10–16: diverse upland cotton genotypes; lanes 9 and 17: positive control (cultivar PhytoGen 490 W3FE); lane W: water control; lane M: 1 kb Plus DNA Ladder.
Figure 3
Schematic representation of a tandem insertion of contig DEC02-76 sequences into chromosome A04 of the Gossypium hirsutum nuclear genome. Numbers below the figure indicate positions of extreme 5′ and 3′ terminal nucleotides on the A04 chromosome of the G. hirsutum TM1 UTX v2 genome used as a reference. Notice the slightly different organization between the two EVE copies and the presence of ORFs 7a and 7b in the sense copy (eCPRVE+).
Figure 4
Schematic representation of BLAST+ search results of the representative genomes of Gossypium hirsutum and G. ekmanianum for endogenous cotton pararetroviral elements. Panels (A,B) contain results of BLAST+ analyses of chromosomes A04 and D03, respectively. Red bars represent genome sequences with high similarity with the query. The analysis was performed in the CottonGen database (cottongen.org) with contig DEC02-76 as a query.
Figure 5
Schematic representation of the evolutionary history of tetraploid Gossypium spp. with an estimated timeline. The possible original eCPRVE integration event is depicted by the black circle. The two species containing identical 15 kb loci with tandem copies of the eCPRVE are reported in red font.

References

    1. Weiss R.A. The discovery of endogenous retroviruses. Retrovirology. 2006;3:67. doi: 10.1186/1742-4690-3-67.
    2. Lander E.S., Linton L.M., Birren B., Nusbaum C., Zody M.C., Baldwin J., Devon K., Dewar K., Doyle M., FitzHugh W., et al. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860–921. doi: 10.1038/35087627.
    3. Bejarano E.R., Khashoggi A., Witty M., Lichtenstein C. Integration of multiple repeats of geminiviral DNA into the nuclear genome of tobacco during evolution. Proc. Natl. Acad. Sci. USA. 1996;93:759–764. doi: 10.1073/pnas.93.2.759.
    4. Tanne E., Sela I. Occurrence of a DNA sequence of a non-retro RNA virus in a host plant genome and its expression: Evidence for recombination between viral and host RNAs. Virology. 2005;332:614–622. doi: 10.1016/j.virol.2004.11.007.
    5. Chiba S., Kondo H., Tani A., Saisho D., Sakamoto W., Kanematsu S., Suzuki N. Widespread endogenization of genome sequences of non-retroviral RNA viruses into plant genomes. PLoS Pathog. 2011;7:e1002146. doi: 10.1371/journal.ppat.1002146.
