Comparative Study
Genome Res. 2009 Jan;19(1):136-42.
doi: 10.1101/gr.083634.108. Epub 2008 Nov 24.

Fast and flexible simulation of DNA sequence data

Gary K Chen et al. Genome Res. 2009 Jan.

Abstract

Simulation of genomic sequences under the coalescent with recombination has conventionally been impractical for regions beyond tens of megabases. This work presents an algorithm, implemented as the program MaCS (Markovian Coalescent Simulator), that can efficiently simulate haplotypes under any arbitrary model of population history. We present several metrics comparing the performance of MaCS with other available simulation programs. Practical usage of MaCS is demonstrated through a comparison of measures of linkage disequilibrium between generated program output and real genotype data from populations considered to be structured.

Figures

Figure 1.
Average r² values were computed across all pairs of nearby (maximum 1 Mb) SNPs for a 270-Mb region. Values for simulated data and HapMap YRI (Chromosome 1) are indicated by the green and red lines, respectively.
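As an illustration of the linkage disequilibrium measure in this caption, the following is a minimal sketch of how r² between two biallelic SNPs, and its average over pairs within a maximum distance, might be computed from phased haplotypes. It uses the standard definition r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. The function names, the 0/1 allele coding, and the input layout are assumptions made for this example; they are not taken from MaCS or from the authors' analysis pipeline.

```python
from itertools import combinations


def r_squared(a, b):
    """r^2 between two biallelic SNPs.

    a and b are equal-length lists of 0/1 alleles, one entry per
    haplotype (hypothetical coding chosen for this sketch).
    Returns None if either SNP is monomorphic, where r^2 is undefined.
    """
    n = len(a)
    p_a = sum(a) / n                                 # frequency of allele 1 at SNP A
    p_b = sum(b) / n                                 # frequency of allele 1 at SNP B
    p_ab = sum(x & y for x, y in zip(a, b)) / n      # frequency of the 1-1 haplotype
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b                             # disequilibrium coefficient D
    return d * d / denom


def mean_r_squared(positions, haplotypes, max_dist=1_000_000):
    """Average r^2 over all SNP pairs at most max_dist base pairs apart.

    positions[i] is the coordinate of SNP i; haplotypes[i] is its allele
    list. Pairs involving a monomorphic SNP are skipped.
    """
    values = []
    for i, j in combinations(range(len(positions)), 2):
        if abs(positions[j] - positions[i]) > max_dist:
            continue
        r2 = r_squared(haplotypes[i], haplotypes[j])
        if r2 is not None:
            values.append(r2)
    return sum(values) / len(values) if values else None
```

The all-pairs loop is quadratic in the number of SNPs, which is fine for a small example; for a region of hundreds of megabases, a sliding window over position-sorted SNPs would be the more practical choice.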
Figure 2.
A demonstration of the algorithm behind MaCS for a sample of three sequences with the tree-retention parameter set to k = 2. The algorithm proceeds from the left end of the region to be simulated toward the right end. Each vertical edge is labeled to its immediate right with the ID of the most recent tree that it belongs to.
