PLoS One. 2012;7(1):e29437.
doi: 10.1371/journal.pone.0029437. Epub 2012 Jan 9.

Generation of long insert pairs using a Cre-LoxP Inverse PCR approach

Ze Peng et al. PLoS One. 2012.

Abstract

Large insert mate pair reads have a major impact on the overall success of de novo assembly and the discovery of inherited and acquired structural variants. The positional information of mate pair reads generally improves genome assembly by resolving repeat elements and/or ordering contigs. Currently available methods for building such libraries have one or more limitations, such as relatively small insert size, inability to distinguish the junction of the two ends, and/or low throughput. We developed a new approach, Cre-LoxP Inverse PCR Paired-End (CLIP-PE), which exploits the advantages of (1) the Cre-LoxP recombination system to efficiently circularize large DNA fragments, (2) inverse PCR to enrich for the desired products that contain both ends of the large DNA fragments, and (3) the use of restriction enzymes to introduce a recognizable junction site between ligated fragment ends and to improve the self-ligation efficiency. We have successfully created CLIP-PE libraries up to 22 kb that are rich in informative read pairs and low in small fragment background. These libraries have demonstrated the ability to improve genome assemblies. The CLIP-PE methodology can be implemented with existing and future next-generation sequencing platforms.
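
As a rough illustration of why a recognizable junction site is useful, the sketch below trims a read at the first occurrence of a restriction site so that bases beyond the junction are not mistaken for sequence from the same fragment end. It is not the authors' pipeline. The function name `trim_at_junction`, the default site `CATG` (the NlaIII recognition sequence), and the assumption that digestion was complete, so that the first site in a read marks the junction, are all illustrative assumptions.

```python
from typing import Tuple


def trim_at_junction(read: str, site: str = "CATG") -> Tuple[str, bool]:
    """Trim a read at the first occurrence of the junction site.

    Returns the trimmed read and a flag saying whether a junction was found.
    The site itself is kept, because those bases belong to this fragment end.
    Assumption: the first occurrence of `site` in the read is the ligation
    junction (true only if the restriction digest was complete).
    """
    idx = read.upper().find(site.upper())
    if idx == -1:
        # No junction in this read: keep it unchanged.
        return read, False
    return read[: idx + len(site)], True


if __name__ == "__main__":
    trimmed, found = trim_at_junction("AAACCCCATGGGTTT")
    print(trimmed, found)
```

With a randomly sheared library there is no such site to search for, which is the practical difference between the two fragmentation options mentioned in the Figure 1 caption.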

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. A schematic representation of the CLIP-PE library construction strategy.
Following fragmentation, the DNA molecules are end-repaired and ligated with LoxP-P1 and LoxP-P2 adaptors, which incorporate the Illumina P1 or P2 sequences. After separation and size selection, the DNA is circularized by Cre recombinase, and non-circularized DNA is removed by exonuclease digestion. A restriction enzyme is then used to digest and fragment the DNA. (Alternatively, circularized DNA can be fragmented by random shearing to 400–500 bp followed by end-repair.) The DNA is then self-ligated. Inverse PCR with Illumina P1 and P2 PCR primers is used to enrich the mate-paired molecules for sequencing. The final libraries consist of short fragments made up of two DNA segments that were originally separated by 5–22 kb.
Figure 2. Histogram of insert lengths from the Haloterrigena turkmenica VKM, DSM 5511 5 kb mate pair libraries.
A: CLIP-PE method, B: Illumina jumping method. The distribution of insert lengths was determined by aligning the reads to the reference genome.
Figure 3. Histogram of insert sizes from Saccharomyces cerevisiae Illumina 12 kb CLIP-PE libraries.
The distribution of insert lengths was determined by aligning the reads to the reference genome.
Figure 4. Histogram of insert sizes from Saccharomyces cerevisiae Illumina 22 kb CLIP-PE libraries.
A: cut with NlaIII, B: cut with HpyCH4IV, C: random shearing approach. The distribution of insert lengths was determined by aligning the reads to the reference genome.
Figure 5. Assembly metrics for Saccharomyces cerevisiae Illumina CLIP-PE libraries.
std refers to the standard Illumina 250 bp library, sim 12 kb to a simulated 12 kb mate pair library, and sim 22 kb to a simulated 22 kb mate pair library.
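
The captions for Figures 2–4 state that insert-length distributions were determined by aligning the reads to the reference genome. A minimal sketch of that calculation, assuming both mates map to the same reference sequence and that coordinates are 0-based and half-open, might look like the following. The function names `insert_length` and `histogram` and the 1 kb bin width are hypothetical choices for illustration, not taken from the paper.

```python
from collections import Counter
from typing import Iterable


def insert_length(start1: int, end1: int, start2: int, end2: int) -> int:
    """Span covered by two aligned mates on the same reference sequence.

    Assumption: coordinates are 0-based, half-open, and both mates map to
    the same linear reference (wrap-around on circular genomes is ignored).
    """
    return max(end1, end2) - min(start1, start2)


def histogram(lengths: Iterable[int], bin_width: int = 1000) -> Counter:
    """Count insert lengths per bin, keyed by the lower edge of each bin."""
    return Counter((length // bin_width) * bin_width for length in lengths)


if __name__ == "__main__":
    pairs = [
        (100, 200, 12000, 12100),  # mates roughly 12 kb apart
        (500, 600, 5400, 5500),    # mates roughly 5 kb apart
    ]
    lengths = [insert_length(*p) for p in pairs]
    print(lengths)
    print(sorted(histogram(lengths).items()))
```

Pairs whose computed span falls far below the target size would show up in the low bins of such a histogram, which is one way to see the small fragment background mentioned in the abstract.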
