PLoS One. 2008 Aug 29;3(8):e3093.
doi: 10.1371/journal.pone.0003093.

Longer first introns are a general property of eukaryotic gene structure

Keith R Bradnam et al. PLoS One.

Abstract

While many properties of eukaryotic gene structure are well characterized, differences in the form and function of introns that occur at different positions within a transcript are less well understood. In particular, how intron length varies with intron position has received relatively little attention. This study analyzes all available data on intron lengths in GenBank and finds a significant trend of increased length in first introns across a wide range of species. The trend is even stronger in high-confidence gene annotation data for three model organisms (Arabidopsis thaliana, Caenorhabditis elegans, and Drosophila melanogaster), which show that the first intron in the 5′ UTR is, on average, significantly longer than all downstream introns within a gene. A partial explanation for increased first intron length in A. thaliana is suggested by the increased frequency of certain motifs in first introns. The phenomenon of longer first introns can potentially be used to improve gene prediction software and to detect errors in existing gene annotations.
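The legend of Figure 1 states that Z-tests were used to assess whether first introns are significantly longer. As a rough illustration only, the comparison could be sketched as a two-sample Z-test on mean intron length. The exact form of the test the authors used is not given here, so the unpooled standard error, the function names, and the toy lengths below are all assumptions.

```python
import math
from statistics import mean, variance


def two_sample_z(first, later):
    """Z statistic for the difference in mean length between two intron groups.

    Assumes an unpooled standard error; this is an illustrative choice,
    not necessarily the formulation used in the study.
    """
    se = math.sqrt(variance(first) / len(first) + variance(later) / len(later))
    return (mean(first) - mean(later)) / se


def two_sided_p(z):
    """Two-sided p-value for z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical intron lengths in base pairs (not data from the paper).
first_introns = [500, 620, 480, 710, 560]
later_introns = [120, 150, 90, 200, 130, 110]

z = two_sample_z(first_introns, later_introns)
p = two_sided_p(z)
```

A positive `z` with a small `p` would indicate that first introns are longer on average in the sample. With real data, each group would hold thousands of lengths per species, which is where a Z-test is most appropriate.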

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. First introns are the longest introns in most species.
Results shown for all species in GenBank release 164 which have at least 500 CDSs that specify multiple introns. Z-tests were used to determine significance and color denotes level of significance (see legend, N.S. = not significant).
Figure 2. Intron size variation for selected species with different numbers of introns.
Intron lengths are shown for species with CDSs that contain 4, 6, 7 or 9 introns (in D. melanogaster, A. thaliana, C. elegans, and H. sapiens respectively). Bars on graph show standard error of the mean. Numbers of CDSs used for each species are shown.
Figure 3. Intron length variation in three model organisms.
Mean intron length is calculated for the first intron in the 5′ UTR (position −1, in blue) and for the first eight introns of the coding sequence (in red) for three named species. Error bars indicate standard error of the mean. Bottom right panel shows the occurrence of a potential IME motif (pictured) in A. thaliana introns. Motif density (%) is calculated by concatenating all introns in each category and then calculating the fraction of the total sequence that is occupied by the motif.
Figure 4. Incorrect C. elegans gene annotation determined by inspection of intron lengths.
This gene prediction contained an incorrect in-frame intron sequence in the first exon. Transcript evidence, homology evidence from C. briggsae, and an alternative gene prediction (Twinscan) suggested that the first intron is an annotation error. Image taken from Genome Browser display of WormBase release WS180 (http://ws180.wormbase.org).
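The motif density described in the Figure 3 legend (concatenate all introns in a category, then take the fraction of the total sequence occupied by the motif) can be sketched as follows. This is a minimal illustration under stated assumptions: the motif is treated as an exact string, whereas the pictured motif may be a degenerate pattern, and the function name, the sequences, and the motif `GATC` are hypothetical.

```python
def motif_density(introns, motif):
    """Percent of the concatenated intron sequence covered by motif matches.

    Assumes exact string matching; overlapping matches are counted once
    per base so that coverage never exceeds 100%.
    """
    sequence = "".join(introns).upper()
    motif = motif.upper()
    if not sequence or not motif:
        return 0.0

    covered = [False] * len(sequence)
    start = sequence.find(motif)
    while start != -1:
        for i in range(start, start + len(motif)):
            covered[i] = True
        # Advance by one base so overlapping matches are also found.
        start = sequence.find(motif, start + 1)

    return 100.0 * sum(covered) / len(sequence)


# Hypothetical sequences and motif, for illustration only.
density = motif_density(["AAGATCTG", "CCCC"], "GATC")
```

Because the introns are concatenated, as the legend describes, a match could in principle span the junction between two introns. A stricter variant would scan each intron separately and sum the covered bases.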
