Complexity reduction of polymorphic sequences (CRoPS): a novel approach for large-scale polymorphism discovery in complex genomes

Nathalie J van Orsouw¹, René C J Hogers, Antoine Janssen, Feyruz Yalcin, Sandor Snoeijers, Esther Verstege, Harrie Schneiders, Hein van der Poel, Jan van Oeveren, Harold Verstegen, Michiel J T van Eijk

PMID: 18000544
PMCID: PMC2048665
DOI: 10.1371/journal.pone.0001172

PLoS One. 2007 Nov 14;2(11):e1172.

Affiliation

¹ Keygene NV, Wageningen, The Netherlands. nathalie.van-orsouw@keygene.com

Abstract

Application of single nucleotide polymorphisms (SNPs) is revolutionizing human biomedical research. However, discovery of polymorphisms in species with low levels of polymorphism is still a challenging and costly endeavor, despite widespread availability of Sanger sequencing technology. We present CRoPS as a novel approach for polymorphism discovery that combines the reproducible genome complexity reduction of AFLP with Genome Sequencer (GS) 20/GS FLX next-generation sequencing technology. With CRoPS, hundreds of thousands of sequence reads derived from complexity-reduced genome sequences of two or more samples are processed and mined for SNPs using a fully automated bioinformatics pipeline. We show that over 75% of putative maize SNPs discovered using CRoPS are successfully converted to SNPWave assays, confirming them to be true SNPs derived from unique (single-copy) genome sequences. By using CRoPS, polymorphism discovery will become affordable in organisms with high levels of repetitive DNA in the genome and/or low levels of polymorphism in the (breeding) germplasm, without the need for prior sequence information.

Conflict of interest statement

Competing Interests: Paid employment: All authors and co-authors with the exception of S. Snoeijers are employees of Keygene N.V. S. Snoeijers was a Keygene N.V. employee during the execution of this project but left Keygene N.V. on December 31, 2006. Patent application: The CRoPS technology is subject to patent applications owned by Keygene N.V. See also acknowledgements.

Figures

**Figure 1. Bioinformatics pipeline for high-throughput analysis of CRoPS sequence runs.**

**Figure 2. Example of a multiple sequence alignment (MSA) with SNP and sample related properties.**
SNP properties include sequence depth (sd), the count of the number of reads at the polymorphic position, the relative position of the SNP on the consensus sequence, the distance to the neighboring SNP, flanking sequence size and homopolymeric region information. Sample related properties were derived from the Oracle database. The ratio of sample sequence depth to MSA sequence depth is calculated.
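
The per-SNP properties named in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `snp_column_properties`, the toy reads and the choice to take "MSA sequence depth" as the total number of reads in the alignment are all assumptions made for illustration.

```python
from collections import Counter

def snp_column_properties(reads, pos):
    """Summarize one column of a toy multiple sequence alignment.

    reads: list of (sample, aligned_sequence) pairs of equal length;
           '-' marks a gap or a position the read does not cover.
    pos:   0-based column index of the candidate SNP.
    """
    # Bases observed at the column, skipping gaps and uncovered positions.
    observed = [seq[pos] for _, seq in reads if seq[pos] != "-"]
    alleles = Counter(observed)

    # Assumption: "MSA sequence depth" is the number of reads in the alignment.
    msa_depth = len(reads)
    reads_per_sample = Counter(sample for sample, _ in reads)

    return {
        "reads_at_position": len(observed),
        "alleles": dict(alleles),
        "is_polymorphic": len(alleles) > 1,
        # Ratio of each sample's read count to the depth of the whole MSA.
        "sample_ratio": {s: n / msa_depth for s, n in reads_per_sample.items()},
    }

# Hypothetical reads from two samples; column 3 differs between them.
reads = [
    ("B73",  "ACGTACGT"),
    ("B73",  "ACGTACGT"),
    ("B73",  "ACGTACGT"),
    ("Mo17", "ACGAACGT"),
    ("Mo17", "ACGAACG-"),
]
props = snp_column_properties(reads, 3)
```

A strongly skewed sample ratio could then be used to flag alignments dominated by a single sample, although the thresholds actually applied are not given in this record.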

**Figure 3. Number of putative SNPs and indels as a function of the minimal length of flanking sequences surrounding the SNP and the minimal interval devoid of additional SNPs/indels.**
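
The relationship plotted in Figure 3 can be sketched as a simple filter over variant positions on a consensus sequence. This is an illustrative assumption, not the published implementation: `filter_snps`, the example positions, and the definition of "interval" as the number of bases between neighboring variants are hypothetical.

```python
def filter_snps(positions, consensus_length, min_flank, min_interval):
    """Keep variant positions (0-based) that satisfy both thresholds.

    min_flank:    minimal number of bases required on each side of the variant.
    min_interval: minimal number of variant-free bases between the variant
                  and its nearest neighboring variant (assumed definition).
    """
    positions = sorted(positions)
    kept = []
    for i, pos in enumerate(positions):
        left_flank = pos
        right_flank = consensus_length - pos - 1
        if min(left_flank, right_flank) < min_flank:
            continue

        # Variant-free bases to the previous and next variant, where present.
        gaps = []
        if i > 0:
            gaps.append(pos - positions[i - 1] - 1)
        if i < len(positions) - 1:
            gaps.append(positions[i + 1] - pos - 1)
        if gaps and min(gaps) < min_interval:
            continue

        kept.append(pos)
    return kept

# Hypothetical variant positions on a 100-base consensus sequence.
positions = [5, 20, 24, 60, 95]
kept = filter_snps(positions, 100, min_flank=10, min_interval=12)

# Number of retained variants as the thresholds are tightened together.
counts = {t: len(filter_snps(positions, 100, t, t)) for t in (0, 5, 10)}
```

Tightening either threshold can only reduce the number of retained variants, which is the trade-off between assay-design requirements and SNP yield that the figure describes.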

**Figure 4. Pseudo-gel image visualizations of two SNPWave assays in maize detected by capillary electrophoresis.**
Left panel: 13-plex SNPWave assay; right panel: 10-plex SNPWave assay. Numbers 1-9 represent different recombinant inbred line offspring of B73 and Mo17.

**Figure 5. Composition and hypothesized cause of “mixed fragments”.**
“Mixed fragments” are characterized by the occurrence of the sample identification tag of sample 1 on one side and the sample identification tag of sample 2 on the other side. (A) Schematic representation of observed homoduplex and heteroduplex fragment types containing expected tags and “mixed fragments”. (B) “Mixed fragments” are formed when (1) a heteroduplex is formed between complementary strands of samples 1 and 2, (2) 3′-5′ exonuclease activity of T4 DNA polymerase removes the sequence tags at the 3′ ends, (3) polymerase activity of T4 DNA polymerase extends the 3′ ends using the opposite strand as template, resulting in incorporation of the “wrong” sequence tag, i.e. the observation of “mixed fragments”.
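
Detecting "mixed fragments" as defined in this caption amounts to comparing the sample tag found at each end of a read. The sketch below assumes that a full-length read begins with a sample tag and ends with the reverse complement of a sample tag; the tag sequences, the sample names and `classify_fragment` are hypothetical and are not taken from the paper.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def classify_fragment(read, tags):
    """Classify a read by the sample tags at its two ends.

    tags: dict mapping sample name -> tag sequence (hypothetical values).
    Returns 'expected' if both ends carry the same sample's tag,
    'mixed' if the ends carry tags of different samples,
    and 'untagged' if either end has no recognizable tag.
    """
    start = next((s for s, t in tags.items() if read.startswith(t)), None)
    end = next((s for s, t in tags.items() if read.endswith(revcomp(t))), None)
    if start is None or end is None:
        return "untagged"
    return "expected" if start == end else "mixed"

# Hypothetical 4-base tags chosen to be non-palindromic.
tags = {"sample1": "ACAC", "sample2": "TCAG"}
insert = "GGATTC"

same_sample = classify_fragment("ACAC" + insert + revcomp("ACAC"), tags)
mixed = classify_fragment("ACAC" + insert + revcomp("TCAG"), tags)
no_tag = classify_fragment(insert, tags)
```

In this toy setup a read is only callable as mixed when both ends are sequenced, so reads that stop before the far-end tag fall into the "untagged" class.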

**Figure 6. Protocol modification to avoid “mixed fragments”.**
(A) Blunt-end adapter ligation as per the original GS 20 library preparation protocol. (B) T/A ligation as applied in the CRoPS protocol. Amplification using a polymerase lacking 3′-5′ exonuclease (proofreading) activity is performed resulting in A-addition to the AFLP fragments, after which the T-adapters can be ligated. (C) Flowcharts of the original GS 20 library preparation protocol and the CRoPS library preparation protocol.
