Review
Viruses. 2020 Dec 6;12(12):1395.
doi: 10.3390/v12121395.

Advances in the Bioinformatics Knowledge of mRNA Polyadenylation in Baculovirus Genes

Iván Gabriel Peros et al. Viruses.

Abstract

Baculoviruses are a group of insect viruses with large circular dsDNA genomes that are exploited in numerous biotechnological applications, such as the biological control of agricultural pests, the expression of recombinant proteins and the delivery of therapeutic sequences to mammals. Their genomes encode between 80 and 200 proteins, of which 38 are shared by all reported species. Thanks to multi-omic studies, a remarkable amount of information is available about the baculoviral proteome and the timing of viral gene expression. As a result, some functional elements of the genome, such as promoters and open reading frames, are very well described. However, less is known about transcription termination signals and, consequently, the limits of the transcriptional units in baculovirus genomes and the processing of the 3′ end of viral mRNAs remain imprecisely defined. In this review, we provide an update on the characteristics of the DNA signals involved in this process and contribute to their correct prediction through an exhaustive analysis that combines bibliographic information, data mining, RNA structure and a comprehensive study of the core gene 3′ ends from 180 baculovirus genomes.

Keywords: Baculoviridae; RNA structure; mRNA; pattern-searching; polyadenylation process.

Conflict of interest statement

The authors declare no conflict of interest. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Figures

Figure 1
Baculovirus infective phenotypes. The illustration represents the typical characteristics of baculoviruses. The occlusion body (OB) of betabaculoviruses (not illustrated) is granular in shape (composed of a granulin network) and usually contains 1 occlusion-derived virus (ODV) with 1 nucleocapsid. In the other genera, OBs have a polyhedral shape (as illustrated in the figure) and ODVs can be “multiple” (containing several nucleocapsids, as represented in the illustration) or “single” (containing 1 nucleocapsid). The multiprotein complex named PIF (Per os Infectivity Factor) is responsible for the primary infection in the host. Budded viruses (BVs) contain 1 nucleocapsid. The fusogenic proteins GP64 (Group I alphabaculoviruses) or F (remaining baculoviruses) mediate the entry of BVs into the larval cells (secondary infection). The lipid envelopes of the two virion types (ODVs and BVs) differ in composition. The genome is a cccdsDNA of 80–180 kbp containing 100–200 protein genes (38 of which are shared by all baculoviral species). RNAP II: RNA polymerase II (from host); vRNAP: viral RNA polymerase (encoded by baculoviruses); TT: Transcription terminator. This review focuses on the 3′ end of baculoviral protein genes.
Figure 2
Eukaryotic polyadenylation process. DNA sequence elements involved in polyadenylation (a); and the polyadenylation mechanism (b) in mammalian pre-mRNA. The different protein factors involved in the process are identified by their names most frequently used in the literature. PAS: Polyadenylation signal; USE: Upstream sequence element; DSE: Downstream sequence element; CS: Cleavage site; CPSF: Cleavage and polyadenylation specific factor; CstF: Cleavage stimulation factor; CFI and CFII: Cleavage factors I and II; RNAP II: RNA polymerase II; PAP: Polyadenylate polymerase; RBP: Retinoblastoma-binding protein 6; PABPII: Poly(A) binding protein II.
Figure 3
RNA structures. RNA is a very important molecule for many processes within the cell and its activity is largely determined by its structure (the way it folds on itself). Although in most cases RNA is a single-stranded molecule, the most stable conformation of a nucleic acid is double-stranded, which is why RNA molecules tend to adopt secondary and tertiary structures, and even quaternary structures, by means of intramolecular interactions among the ribonucleotide bases of the primary sequence. The illustrated examples are the common structures that RNA molecules adopt in cells.
Figure 4
Workflow summary. A workflow diagram summarizing the bioinformatic analysis performed on the downstream gene regions (DGRs: −100 to +350 relative to the stop codon) of all predicted baculoviral genes is shown. 180 complete genomes (GenBank) were used, of which 53 were alphabaculoviruses Group I, 90 alphabaculoviruses Group II and 37 betabaculoviruses. The programs used in each step are indicated in parentheses. Alpha I: alphabaculoviruses Group I; Alpha II: alphabaculoviruses Group II; Beta: betabaculoviruses. PAS: Polyadenylation signal; USE: Upstream sequence element; DSE: Downstream sequence element; CS: Cleavage site; Aux-DSE: Auxiliary downstream sequence element; UGUA: motif that is occasionally found upstream of the PAS and the transcription end site.
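The DGR window described in the Figure 4 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (`extract_dgr`, `reverse_complement`) and the `stop_last` parameter are hypothetical, and the coordinate convention is an assumption (the window covers the 100 nt ending at the last base of the stop codon plus the 350 nt that follow). Because baculovirus genomes are circular, indices wrap around the end of the sequence.

```python
# Sketch only: names and coordinate convention are assumptions, not the authors' code.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_dgr(genome: str, stop_last: int, strand: str = "+",
                upstream: int = 100, downstream: int = 350) -> str:
    """Extract a downstream gene region (DGR) from a circular genome.

    genome     -- forward-strand sequence of the circular genome
    stop_last  -- 0-based forward-strand index of the last base of the
                  stop codon, read in the direction of transcription
    strand     -- "+" or "-" (coding strand of the gene)
    upstream   -- bases kept up to and including stop_last (assumed 100)
    downstream -- bases kept after the stop codon (assumed 350)
    """
    n = len(genome)
    if strand == "+":
        indices = range(stop_last - upstream + 1, stop_last + downstream + 1)
        return "".join(genome[i % n] for i in indices)  # modulo wraps the circle
    if strand == "-":
        # On the reverse strand, "downstream" lies at lower forward coordinates.
        indices = range(stop_last - downstream, stop_last + upstream)
        forward = "".join(genome[i % n] for i in indices)
        return reverse_complement(forward)
    raise ValueError("strand must be '+' or '-'")
```

With the default arguments each window is 450 nt long; a gene on the reverse strand yields its DGR in the same 5′ to 3′ orientation as a forward-strand gene, so both can be scanned with the same patterns.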
Figure 5
Comprehensive bioinformatic analysis of the 3′ end of baculoviral genes. (a) Sequence logos showing the nucleotide context of the main polyadenylation signals identified at DGR core genes. Each logo contains 237 sequences that correspond to DGR core genes in which the 6 motifs were detected (98 sequences from alphabaculoviruses Group I, 99 from alphabaculoviruses Group II and 37 from betabaculoviruses). PAS: Polyadenylation signal; USE: Upstream sequence element; DSE: Downstream sequence element; CS: Cleavage site; Aux-DSE: Auxiliary downstream sequence element. (b) Structural context of the polyadenylation signals. The secondary structure of the same sequences mentioned in (a) was determined and, after comparison, several conserved structures were detected; the 6 structures shown are the conserved ones adopted by most of the sequences used for the analysis. The number of genes in which the structures were identified is indicated in brackets. Alpha I: Alphabaculovirus Group I; Alpha II: Alphabaculovirus Group II; Beta: Betabaculovirus. (c) Sequence elements involved in the polyadenylation mechanism in baculoviral genes. The positions of the elements involved and the ranges of distances between them are indicated, according to the improved model proposed by our working group. The different cellular factors that would be involved are also indicated. UGUA: motif that is occasionally found upstream of the PAS and the transcription end site; CPSF: Cleavage and polyadenylation specific factor; CstF: Cleavage stimulation factor; CFI and CFII: Cleavage factors I and II; Fip1: Pre-mRNA 3′ end-processing factor. (d) Detection of the improved model in different data sets for validation. Searches were carried out on the DGRs of core (DGR core) and non-core (DGR non core) genes, in addition to random sequences and all coding regions (nucleotides between the start and stop codons) of all genes of the genomes used. The number of sequences in each data set is shown in brackets. The same colors are used in all panels: USE, green; PAS, red; CS, blue; DSE, yellow.
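The pattern-searching step named in the keywords and in Figure 5 can be sketched as a simple motif scan. This is an illustration under stated assumptions, not the model proposed in the review: the PAS is taken to be the canonical eukaryotic hexamer AAUAAA (written AATAAA in DNA), the UGUA motif is written TGTA, and `max_gap` is a hypothetical distance limit, since the caption does not state the actual distance ranges.

```python
import re

# Assumptions: canonical hexamer and a DNA-alphabet UGUA; not the authors' full model.
PAS = "AATAAA"
UGUA = "TGTA"


def find_all(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (overlapping) motif match."""
    pattern = f"(?={re.escape(motif)})"  # lookahead so overlapping hits are kept
    return [m.start() for m in re.finditer(pattern, seq.upper())]


def pas_with_upstream_ugua(dgr: str, max_gap: int = 50) -> list[tuple[int, list[int]]]:
    """Pair each PAS hit with the UGUA motifs located upstream of it.

    max_gap is a hypothetical limit on the distance (in nt) between the
    start of a UGUA motif and the start of the PAS.
    """
    ugua_sites = find_all(dgr, UGUA)
    results = []
    for pas_pos in find_all(dgr, PAS):
        upstream = [u for u in ugua_sites
                    if u + len(UGUA) <= pas_pos and pas_pos - u <= max_gap]
        results.append((pas_pos, upstream))
    return results
```

Positions are offsets within the scanned window; converting them to coordinates relative to the stop codon depends on how the window was cut, which is left to the caller.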
