Analysis of sequence conservation at nucleotide resolution
- PMID: 18166073
- PMCID: PMC2230682
- DOI: 10.1371/journal.pcbi.0030254
Abstract
One of the major goals of comparative genomics is to understand the evolutionary history of each nucleotide in the human genome sequence, and the degree to which it is under selective pressure. Ascertainment of selective constraint at nucleotide resolution is particularly important for predicting the functional significance of human genetic variation and for analyzing the sequence substructure of cis-regulatory sequences and other functional elements. Current methods for analysis of sequence conservation are focused on delineation of conserved regions comprising tens or even hundreds of consecutive nucleotides. We therefore developed a novel computational approach designed specifically for scoring evolutionary conservation at individual base-pair resolution. Our approach estimates the rate at which each nucleotide position is evolving, computes the probability of neutrality given this rate estimate, and summarizes the result in a Sequence CONservation Evaluation (SCONE) score. We computed SCONE scores in a continuous fashion across 1% of the human genome for which high-quality sequence information from up to 23 genomes is available. We show that SCONE scores are clearly correlated with the allele frequency of human polymorphisms in both coding and noncoding regions. We find that the majority of noncoding conserved nucleotides lie outside of longer conserved elements predicted by other conservation analyses, and are experiencing ongoing selection in modern humans, as is evident from the allele frequency spectrum of human polymorphism. We also applied SCONE to analyze the distribution of conserved nucleotides within functional regions. These regions are markedly enriched in individually conserved positions and short (<15 bp) conserved "chunks." Our results collectively suggest that the majority of functionally important noncoding conserved positions are highly fragmented and reside outside of canonically defined long conserved noncoding sequences. A small subset of these fragmented positions may be identified with high confidence.
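To make the notion of individually conserved positions and short (<15 bp) conserved "chunks" concrete, the following is a minimal sketch of how per-base scores might be grouped into runs. It does not reproduce the SCONE scoring itself. The input `p_neutral` is a hypothetical list of per-position probabilities of neutrality, the significance cutoff `alpha = 0.05` is an assumption chosen for illustration, and the function names are invented here. Only the 15 bp chunk boundary comes from the abstract.

```python
from typing import Dict, List, Tuple

Run = Tuple[int, int]  # (start index, length in bp)


def conserved_runs(p_neutral: List[float], alpha: float = 0.05) -> List[Run]:
    """Return maximal runs of consecutive positions with p_neutral < alpha.

    alpha is an illustrative cutoff, not a value taken from the paper.
    """
    runs: List[Run] = []
    start = None
    for i, p in enumerate(p_neutral):
        if p < alpha:
            if start is None:
                start = i  # open a new run of conserved positions
        elif start is not None:
            runs.append((start, i - start))  # close the current run
            start = None
    if start is not None:
        runs.append((start, len(p_neutral) - start))  # run reaches the end
    return runs


def classify_runs(runs: List[Run], chunk_max: int = 15) -> Dict[str, List[Run]]:
    """Split runs into single positions, short chunks (<chunk_max bp), and long elements."""
    return {
        "single": [r for r in runs if r[1] == 1],
        "chunk": [r for r in runs if 1 < r[1] < chunk_max],
        "long": [r for r in runs if r[1] >= chunk_max],
    }


if __name__ == "__main__":
    # Hypothetical per-base probabilities of neutrality for a 7 bp window.
    example = [0.5, 0.01, 0.6, 0.01, 0.02, 0.03, 0.9]
    print(classify_runs(conserved_runs(example)))
```

In this toy window, position 1 would count as an individually conserved position and positions 3 to 5 as a 3 bp chunk. A real analysis would need the per-base probabilities from the published method and a cutoff justified by the data.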