Prediction of two novel overlapping ORFs in the genome of SARS-CoV-2
- PMID: 34339929
- PMCID: PMC8317007
- DOI: 10.1016/j.virol.2021.07.011
Abstract
Six candidate overlapping genes have been detected in SARS-CoV-2, yet current methods struggle to detect overlapping genes that originated recently. Such genes might nevertheless encode proteins beneficial to the virus, and they provide a model system for understanding gene birth. To complement existing detection methods, I first demonstrated that selection pressure to avoid stop codons in alternative reading frames is a driving force in the origin and retention of overlapping genes. I then built a detection method, CodScr, based on this selection pressure. Finally, I combined CodScr with methods that detect other properties of overlapping genes, such as biased nucleotide and amino acid composition. I detected two novel ORFs (ORF-Sh and ORF-Mh), overlapping the spike and membrane genes respectively, which are under selection pressure and may be beneficial to SARS-CoV-2. ORF-Sh and ORF-Mh are present, as ORFs uninterrupted by stop codons, in 100% and 95% of SARS-CoV-2 genomes, respectively.
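The central idea in the abstract, that an overlapping gene requires a stretch free of stop codons in an alternative reading frame, can be illustrated with a minimal Python sketch. This is not CodScr, whose scoring is not described here. It only finds stop-free codon runs in a chosen frame and reports the fraction of sequences in which a given window stays uninterrupted. The function names (`stop_free_runs`, `longest_stop_free_run`, `fraction_intact`) and the toy sequences are hypothetical, and the sketch assumes the standard stop codons TAA, TAG and TGA and sequences that share the same coordinates.

```python
# Illustrative sketch only: NOT the CodScr method from the paper.
# Assumes the standard genetic code stop codons and pre-aligned sequences.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_free_runs(seq, frame):
    """Return (start, end) nucleotide spans of maximal stop-free codon runs.

    `frame` is the 0-based offset (0, 1 or 2) at which codons are read.
    Spans are half-open: seq[start:end] contains no in-frame stop codon.
    """
    seq = seq.upper().replace("U", "T")
    runs = []
    run_start = frame
    pos = frame
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            if pos > run_start:
                runs.append((run_start, pos))
            run_start = pos + 3  # next run begins after the stop codon
        pos += 3
    if pos > run_start:
        runs.append((run_start, pos))
    return runs


def longest_stop_free_run(seq, frame):
    """Return the longest stop-free span in `frame`, or None if there is none."""
    runs = stop_free_runs(seq, frame)
    if not runs:
        return None
    return max(runs, key=lambda r: r[1] - r[0])


def fraction_intact(sequences, start, end):
    """Fraction of sequences whose window [start, end) has no in-frame stop.

    Codons are read from `start`, so `start` fixes the reading frame.
    """
    intact = 0
    for s in sequences:
        window = s[start:end].upper().replace("U", "T")
        codons = [window[i:i + 3] for i in range(0, len(window) - 2, 3)]
        if not any(c in STOP_CODONS for c in codons):
            intact += 1
    return intact / len(sequences)


if __name__ == "__main__":
    toy = "ATGAAATAGCCC"  # hypothetical toy sequence, not viral data
    for frame in range(3):
        print(frame, stop_free_runs(toy, frame))
    print(fraction_intact(["ATGAAACCC", "ATGTAACCC"], 0, 9))
```

A statistic such as "present, uninterrupted by stop codons, in 95% of genomes" corresponds to what `fraction_intact` computes for the candidate ORF's coordinates, under the assumption that all genomes are aligned to a common reference.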
Keywords: Codon usage; Membrane protein; Multivariate statistics; Overlapping reading frame; Selection pressure; Spike protein; Virus evolution.
Copyright © 2021 Elsevier Inc. All rights reserved.
Conflict of interest statement
None.