Tandem repeat copy-number variation in protein-coding regions of human genes
- PMID: 16086851
- PMCID: PMC1273636
- DOI: 10.1186/gb-2005-6-8-r69
Abstract
Background: Tandem repeat variation in protein-coding regions will alter protein length and may introduce frameshifts. Tandem repeat variants are associated with variation in pathogenicity in bacteria and with human disease. We characterized tandem repeat polymorphism in human proteins, using the UniGene database, and tested whether these were associated with host defense roles.
Results: Protein-coding tandem repeat copy-number polymorphisms were detected in 249 tandem repeats found in 218 UniGene clusters; observed length differences ranged from 2 to 144 nucleotides, with unit copy lengths ranging from 2 to 57. This corresponded to 1.59% (218/13,749) of proteins investigated carrying detectable polymorphisms in the copy-number of protein-coding tandem repeats. We found no evidence that tandem repeat copy-number polymorphism was significantly elevated in defense-response proteins (p = 0.882). An association with the Gene Ontology term 'protein-binding' remained significant after covariate adjustment and correction for multiple testing. Combining this analysis with previous experimental evaluations of tandem repeat polymorphism, we estimate the approximate mean frequency of tandem repeat polymorphisms in human proteins to be 6%. Because 13.9% of the polymorphisms were not a multiple of three nucleotides, up to 1% of proteins may contain frameshifting tandem repeat polymorphisms.
Conclusion: Around 1 in 20 human proteins are likely to contain tandem repeat copy-number polymorphisms within coding regions. Such polymorphisms are not more frequent among defense-response proteins; their prevalence among protein-binding proteins may reflect lower selective constraints on their structural modification. The impact of frameshifting and longer copy-number variants on protein function and disease merits further investigation.
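The percentages quoted in the Results can be checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes that the "up to 1%" frameshift figure is the product of the estimated 6% polymorphism frequency and the 13.9% non-triplet fraction, which the abstract implies but does not state explicitly.

```python
# Figures taken directly from the abstract.
polymorphic_clusters = 218           # UniGene clusters with a detected polymorphism
proteins_investigated = 13_749
estimated_polymorphism_rate = 0.06   # combined estimate across studies
non_triplet_fraction = 0.139         # polymorphisms not a multiple of 3 nt


def is_frameshifting(length_difference_nt: int) -> bool:
    """A length change shifts the reading frame unless it is a multiple of 3."""
    return length_difference_nt % 3 != 0


# Share of investigated proteins with a detected polymorphism.
detected_rate = polymorphic_clusters / proteins_investigated

# Assumed derivation of the frameshift estimate: product of the two rates.
frameshift_rate = estimated_polymorphism_rate * non_triplet_fraction

print(f"Detected polymorphism rate: {detected_rate:.2%}")
print(f"Estimated frameshift rate:  {frameshift_rate:.2%}")
```

Under that assumption the product comes to a little under 1%, consistent with the abstract's "up to 1%". The helper `is_frameshifting` classifies the extremes of the observed range: a 2-nucleotide difference shifts the frame, whereas a 144-nucleotide difference does not.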
Similar articles
- Diversity in coding tandem repeats in related Neisseria spp. BMC Microbiol. 2003 Nov 12;3:23. doi: 10.1186/1471-2180-3-23. PMID: 14611665. Free PMC article.
- Digital genotyping of macrosatellites and multicopy genes reveals novel biological functions associated with copy number variation of large tandem repeats. PLoS Genet. 2014 Jun 19;10(6):e1004418. doi: 10.1371/journal.pgen.1004418. PMID: 24945355. Free PMC article.
- TRPing up the genome: Tandem repeat polymorphisms as dynamic sources of genetic variability in health and disease. Discov Med. 2010 Oct;10(53):314-21. PMID: 21034672. Review.
- Tandem repeats in protein coding regions of primate genes. Genome Res. 2002 Jun;12(6):909-15. doi: 10.1101/gr.138802. PMID: 12045144. Free PMC article.
- Coding repeats and evolutionary "agility". Bioessays. 2005 Jun;27(6):581-7. doi: 10.1002/bies.20248. PMID: 15892112. Review.
Cited by
- Mutation and selection processes regulating short tandem repeats give rise to genetic and phenotypic diversity across species. J Evol Biol. 2023 Feb;36(2):321-336. doi: 10.1111/jeb.14106. Epub 2022 Oct 26. PMID: 36289560. Free PMC article. Review.
- Patterned sequence in the transcriptome of vascular plants. BMC Genomics. 2007 Jun 15;8:173. doi: 10.1186/1471-2164-8-173. PMID: 17573970. Free PMC article.
- Tools for the identification of variable and potentially variable tandem repeats. BMC Genomics. 2006 Nov 15;7:290. doi: 10.1186/1471-2164-7-290. PMID: 17107618. Free PMC article.
- Repetitive DNA sequence detection and its role in the human genome. Commun Biol. 2023 Sep 19;6(1):954. doi: 10.1038/s42003-023-05322-y. PMID: 37726397. Free PMC article. Review.
- Mutation dynamics of CpG dinucleotides during a recent event of vertebrate diversification. Epigenetics. 2019 Jul;14(7):685-707. doi: 10.1080/15592294.2019.1609868. Epub 2019 May 9. PMID: 31070073. Free PMC article.