BMC Genomics. 2007 Oct 15:8:371.
doi: 10.1186/1471-2164-8-371.

Frame disruptions in human mRNA transcripts, and their relationship with splicing and protein structures

Paul Harrison et al. BMC Genomics.

Abstract

Background: Efforts to gather genomic evidence for the processes of gene evolution are ongoing, and are closely coupled to improved gene annotation methods. Such annotation is complicated by the occurrence of disrupted mRNAs (dmRNAs), harbouring frameshifts and premature stop codons, which can be considered indicators of decay into pseudogenes.

Results: We have derived a procedure to annotate dmRNAs, and have applied it to human data. Subsequences are generated by parsing each transcript at key frame-disruption positions, and are required to align significantly within any original protein homology. We find 419 high-quality human dmRNAs (3% of total). Significant dmRNA subpopulations include zinc-finger-containing transcription factors with long disrupted exons, and antisense homologies to distal genes. We analysed the distribution of initial frame disruptions in dmRNAs with respect to the positions of (i) protein domains, (ii) alternatively-spliced exons, and (iii) regions susceptible to nonsense-mediated decay (NMD). We find significant avoidance of protein-domain disruption (indicating a selection pressure against such disruption), and highly significant overrepresentation of disruptions in alternatively-spliced exons and in 'non-NMD' regions. We do not find any evidence for evolution of novelty in protein structures through frameshifting.

Conclusion: Our results indicate largely negative selection pressures related to frame disruption during gene evolution.

Figures

Figure 1
Three examples of dmRNAs. The translated dmRNA sequence is shown along with the corresponding nucleotide sequence; the aligning protein sequence is shown above these in each case. They are as follows: (a) a multiply-disrupted example (homologous to a cytochrome P450); (b) a multiply-disrupted example from a zinc-finger-containing transcription factor family; (c) an alternative splicing of the gene C20orf59, which appears to be a transmembrane sugar transporter.
Figure 2
Numbers of paralogs. The distribution of the number of paralogs for all genes, and for genes yielding dmRNAs. The bin labeled x contains all values N such that x-5 < N ≤ x.
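The binning convention in this caption can be illustrated with a minimal Python sketch. It assumes that the upper bound of each bin is inclusive (x-5 < N ≤ x) and that the bin width is 5; the function names `bin_label` and `histogram` are hypothetical and are not taken from the paper.

```python
import math


def bin_label(n, width=5):
    """Return the label x of the bin containing n, where x - width < n <= x.

    Assumes an inclusive upper bound, as the caption appears to intend.
    """
    if n < 0:
        raise ValueError("paralog counts must be non-negative")
    # Round n up to the nearest multiple of the bin width.
    return math.ceil(n / width) * width


def histogram(counts, width=5):
    """Count how many values fall into each labeled bin."""
    bins = {}
    for n in counts:
        x = bin_label(n, width)
        bins[x] = bins.get(x, 0) + 1
    # Return the bins ordered by label for easier plotting.
    return dict(sorted(bins.items()))
```

Under this convention a gene with exactly 5 paralogs falls in the bin labeled 5, while a gene with 6 paralogs falls in the bin labeled 10.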
Figure 3
Numbers of frame disruptions. The number of frame disruptions in dmRNAs plotted versus the total occurrences of this number, on a log-log scale. This distribution is governed by a power law relationship, with the parameters for this linear relationship indicated on the plot.
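A power law y = a·x^b appears as a straight line on a log-log plot, because log y = log a + b·log x. The sketch below shows one common way to estimate such parameters, an ordinary least-squares fit in log-log space. This is an assumption for illustration only: the abstract does not state how the authors fitted the relationship, and the function name `fit_power_law` is hypothetical.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y).

    Returns (a, b). All xs and ys must be strictly positive.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    # Intercept of the line is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b
```

Here `xs` would be the number of frame disruptions and `ys` the number of dmRNAs with that many disruptions; counts of zero must be excluded before taking logarithms.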
Figure 4
Distribution of frame-disrupted and non-frame-disrupted exon lengths in the disrupted mRNAs. The exon lengths are in bins labelled at either end of the bin with the upper (≤) and lower (>) bounds, with occurrences in each bin on the y axis. The percentage of exons >1000 nucleotides is given for each data set. The upper left panel is for the whole set of exons; the lower left panel for 5' exons, the upper right for internal exons, and the lower right for 3' exons.
Figure 5
Pipeline for annotating dmRNAs. The steps discussed in Methods are illustrated schematically.
