Review
Wiley Interdiscip Rev RNA. 2022 Mar;13(2):e1679.
doi: 10.1002/wrna.1679. Epub 2021 Jun 21.

Compositional biases in RNA viruses: Causes, consequences and applications


Eleanor R Gaunt et al. Wiley Interdiscip Rev RNA. 2022 Mar.

Abstract

If each of the four nucleotides were represented equally in the genomes of viruses and the hosts they infect, each base would occur at a frequency of 25%. However, this is not observed in nature. Similarly, the order of nucleotides is not random (e.g., in the human genome, guanine follows cytosine at a frequency of ~0.0125, or a quarter of the number of times predicted by random representation). Codon usage and codon order are also nonrandom. Furthermore, nucleotide and codon biases vary between species. Such biases have various drivers, including cellular proteins that recognize specific patterns in nucleic acids and that, once triggered, induce mutations or invoke intrinsic or innate immune responses. In this review we examine the types of compositional biases identified in viral genomes and the current understanding of the evolutionary mechanisms underpinning these trends. Finally, we consider the potential for large-scale synonymous recoding strategies to engineer vaccines against RNA viruses, including those with pandemic potential, such as influenza A virus and severe acute respiratory syndrome coronavirus 2. This article is categorized under: RNA in Disease and Development > RNA in Disease; RNA Evolution and Genomics > Computational Analyses of RNA; RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition.
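
To make the notion of dinucleotide under-representation concrete, the sketch below computes an observed/expected ratio for a single dinucleotide, taking the expected frequency as the product of the two base frequencies. This is one commonly used definition and is an assumption here, not necessarily the measure used in the review; the function name `dinucleotide_odds_ratio` and the toy sequence are hypothetical. With equal base frequencies, every dinucleotide would be expected at 0.25 × 0.25 = 0.0625.

```python
from collections import Counter


def dinucleotide_odds_ratio(seq: str, dinuc: str) -> float:
    """Return f(XY) / (f(X) * f(Y)) for dinucleotide XY in seq.

    Values below 1 suggest under-representation; values above 1
    suggest over-representation. Illustrative definition only.
    """
    seq = seq.upper()
    dinuc = dinuc.upper()
    if len(dinuc) != 2 or len(seq) < 2:
        raise ValueError("need a 2-base motif and a sequence of length >= 2")
    x, y = dinuc
    base_counts = Counter(seq)
    if base_counts[x] == 0 or base_counts[y] == 0:
        raise ValueError("both bases of the dinucleotide must occur in seq")
    # Count overlapping windows of length 2 along the sequence.
    pair_counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    f_x = base_counts[x] / len(seq)
    f_y = base_counts[y] / len(seq)
    f_xy = pair_counts[dinuc] / (len(seq) - 1)
    return f_xy / (f_x * f_y)


# Expected frequency of any dinucleotide if all four bases were equally common.
UNIFORM_EXPECTATION = 0.25 * 0.25

# Toy sequence (not real data): each base occurs 5 times out of 20.
toy = "ACGU" * 5
cpg_ratio = dinucleotide_odds_ratio(toy, "CG")
gpc_ratio = dinucleotide_odds_ratio(toy, "GC")
```

In the toy sequence, CG is strongly over-represented and GC never occurs, so the two ratios fall on opposite sides of 1. Real genomes would be assessed over far longer sequences.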

Keywords: dinucleotides; mutation bias; selection bias; viral genome composition.


Conflict of interest statement

The authors have declared no conflicts of interest for this article.

Figures

FIGURE 1
(a) There are 16 possible dinucleotide compositions in RNA. (b) Schematic of CpG motif, with “p” referring to the phosphate bridge (green) joining the cytosine (C) (blue) and guanine (G) (red) bases
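
The count of 16 follows from pairing each of the four RNA bases with each of the four bases (4 × 4). As a minimal sketch, the dinucleotides can be enumerated and tallied in a sequence as follows; the helper name `count_dinucleotides` and the example input are hypothetical.

```python
from collections import Counter
from itertools import product

RNA_BASES = "ACGU"

# All ordered pairs of bases: AA, AC, AG, AU, CA, ..., UU.
ALL_DINUCLEOTIDES = ["".join(pair) for pair in product(RNA_BASES, repeat=2)]


def count_dinucleotides(seq: str) -> dict:
    """Count overlapping occurrences of each of the 16 RNA dinucleotides."""
    # Accept DNA-style input by treating T as U.
    seq = seq.upper().replace("T", "U")
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Report every dinucleotide, including those that never occur.
    return {d: counts.get(d, 0) for d in ALL_DINUCLEOTIDES}


example_counts = count_dinucleotides("ACGCG")
```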
FIGURE 2
GC content vs CpG ratio for various invertebrate (blue circle) and vertebrate (pink circle) species. In blue from left to right: Spodoptera exempta (African armyworm), Drosophila melanogaster (fruit fly), Bombus bombus (bumble bee), Anopheles gambiae (mosquito). In pink from left to right: Danio rerio (zebrafish), Halichoerus spp (seals), Phocoena spp (porpoise), Didelphis virginiana (opossum), Homo sapiens (human), Rattus norvegicus (brown rat), Takifugu rubripes (pufferfish), Ornithorhynchus anatinus (platypus)
FIGURE 3
Under‐representation of CpG dinucleotides (a) and UpA dinucleotides (b) in the genomes of representative viruses. Abbreviations: Adeno, human adenovirus 2; HCMV, human cytomegalovirus; HSV‐1, herpes simplex virus 1; parvo, parvovirus; BTV, bluetongue virus; HCV, hepatitis C virus; FMDV, foot and mouth disease virus; SARS2, severe acute respiratory syndrome coronavirus 2; EBOV, Ebola virus; IAV, influenza A virus; RSV, respiratory syncytial virus; HIV‐1, human immunodeficiency virus 1. Baltimore classifications: I, dsDNA; II, ssDNA; III, dsRNA; IV, +ssRNA; V, –ssRNA; VI, rtRNA
FIGURE 4
Four types of bias are described in the genomes of organisms and the viruses that infect them
FIGURE 5
Compositional biases in viral genomes may be driven by three types of evolutionary pressure: translational, selection and mutational. Translationally derived biases arise from the different translational efficiencies of transcripts with varying composition in different cell conditions (e.g., resting vs. stress). Biases driven by selection arise when viral genomes avoid encoding specific motifs that may be recognized by components of the innate immune response. Biases driven by mutation arise through editing of viral genomes or transcripts by host cell proteins
FIGURE 6
Possible mechanisms by which ZAP activity leads to viral transcript degradation. CpG motifs in viral RNA (red) are bound by the cytoplasmic PRR ZAP, which can lead to recruitment of 5′ decapping enzymes (Dcp1/2 complex), the 3′ deadenylation enzyme PARN and potentially the KHNYN RNA endonuclease, followed by 5′–3′ degradation mediated by Xrn1 and/or 3′–5′ degradation mediated by the RNA exosome. Interactions between ZAP and RIG‐I and/or TRIM25 may also lead to innate immune signaling
FIGURE 7
Comparison of CpG and UpA suppression in the genomes of various viruses. RNA viruses: BTV, bluetongue virus; EBOV, ebola virus; FMDV, foot and mouth disease virus; HCV, hepatitis C virus; RSV, respiratory syncytial virus; SARS2, severe acute respiratory syndrome coronavirus 2. DNA viruses: adeno, adenovirus; HCMV, human cytomegalovirus; HSV‐1, herpes simplex virus 1; Parvo, canine parvovirus 2

