Nucleic Acids Res. 2013 Jul;41(12):6119-38.
doi: 10.1093/nar/gkt345. Epub 2013 Apr 29.

Single-cell paired-end genome sequencing reveals structural variation per cell cycle

Thierry Voet et al. Nucleic Acids Res. 2013 Jul.

Abstract

The nature and pace of genome mutation is largely unknown. Because standard methods sequence DNA from populations of cells, the genetic composition of individual cells is lost, de novo mutations in cells are concealed within the bulk signal and per cell cycle mutation rates and mechanisms remain elusive. Although single-cell genome analyses could resolve these problems, such analyses are error-prone because of whole-genome amplification (WGA) artefacts and are limited in the types of DNA mutation that can be discerned. We developed methods for paired-end sequence analysis of single-cell WGA products that enable (i) detecting multiple classes of DNA mutation, (ii) distinguishing DNA copy number changes from allelic WGA-amplification artefacts by the discovery of matching aberrantly mapping read pairs among the surfeit of paired-end WGA and mapping artefacts and (iii) delineating the break points and architecture of structural variants. By applying the methods, we capture DNA copy number changes acquired over one cell cycle in breast cancer cells and in blastomeres derived from a human zygote after in vitro fertilization. Furthermore, we were able to discover and fine-map a heritable inter-chromosomal rearrangement t(1;16)(p36;p12) by sequencing a single blastomere. The methods will expedite applications in basic genome research and provide a stepping stone to novel approaches for clinical genetic diagnosis.

Figures

Figure 1.
Single-cell DNA copy number profiling by focal read-depth analysis. (A) A tree of the single-cell–derived subclones and isolated HCC38-tumour cells. (B) Concordances of the DNA copy number profiles of the MDA-WGAed cells (blue), the PicoPlex-WGAed cells (red) and the non-WGAed subclones (green) with the reference B8FF4C copy number profile. The copy number concordance between a sample and B8FF4C was calculated by comparing the copy number states of each 10-kb bin genome wide following focal sequence-depth analyses. The y-axis represents the copy number concordance, the x-axis the γ penalty parameter of the PCF algorithm used for segmentation (‘Materials and Methods’ section). The mean copy number concordance is depicted as a line, the standard deviation as a shaded region. Two vertical dashed lines indicate the γ values of 25 and 150, respectively. (C) Complementary DNA copy number changes on chromosome 5 in two sister cells related by one cell cycle. Orange lines, representing the B8FF4C copy number segments, are overlaid on top of the red lines, which represent the single-cell PicoPlex copy number segments. (Top) Cell ‘PicoPlex-sc9’, (bottom) cell ‘PicoPlex-sc10’. (D) Segments of integer DNA copy number states following focal sequence-depth analyses using 10-kb bins and PCF segmentation (γ = 25) across all autosomes and the X chromosome. The integer DNA copy number is depicted as a heat map of which a color legend has been integrated in the figure. The profiles of the non-WGA single-cell–derived subclone samples (A6GD7A, A6GE4F and B8FB3A) and the reference B8FF4C sample are shown, followed by the four PicoPlex-amplified single cells (PicoPlex-sc1, PicoPlex-sc2, PicoPlex-sc9 and PicoPlex-sc10) and the four MDA-amplified single cells (mda-sc82, mda-sc83, mda-sc1 and mda-sc2).
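The copy number concordance described for panel B can be illustrated with a minimal sketch. It assumes that each profile has already been reduced to one integer copy number state per 10-kb bin and that concordance is simply the fraction of bins with identical states; the function name and the toy profiles are hypothetical, and the handling of bins without a call is not specified in the caption.

```python
from typing import Sequence


def copy_number_concordance(sample: Sequence[int], reference: Sequence[int]) -> float:
    """Return the fraction of bins whose integer copy number matches the reference.

    Assumption: both profiles list one integer state per bin, in the same bin order.
    """
    if len(sample) != len(reference):
        raise ValueError("profiles must cover the same bins")
    if len(sample) == 0:
        raise ValueError("profiles must contain at least one bin")
    matches = sum(1 for s, r in zip(sample, reference) if s == r)
    return matches / len(sample)


# Toy profiles (hypothetical values, not taken from the paper).
reference_bins = [2, 2, 3, 3, 1, 2, 2, 4]
single_cell_bins = [2, 2, 3, 2, 1, 2, 2, 4]

concordance = copy_number_concordance(single_cell_bins, reference_bins)
print(f"copy number concordance: {concordance:.3f}")
```

In the figure this value would be recomputed for each γ penalty, since the segmentation, and therefore the per-bin states, change with γ.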
Figure 2.
Sensitivity and positive predictive value of single-cell paired-end maps. (A) Sensitivity of the single-cell paired-end maps as a function of thresholds on the minimum number of discordant read pairs that had to support a rearrangement signature. A set of 24 deletion-, 124 tandem duplication-, 18 inversion- and 31 inter-chromosomal signatures confirmed by PCR in HCC38 were scored for their presence in the single-cell paired-end maps. The mean sensitivity across the non-WGA subclone, the single-cell MDA and the single-cell PicoPlex paired-end maps is depicted on the y-axis. Sensitivities for deletion, tandem duplication, inter-chromosomal rearrangement and inversion signatures are shown separately. For the computations, the refined paired-end maps were used that contained only rearrangement signatures supported by a minimum threshold number of read pairs (= x-axis). (B) Positive predictive values of the single-cell paired-end maps for deletion, tandem duplication, inter-chromosomal rearrangement and inversion signatures as a function of thresholds on the minimum number of discordant read pairs that had to support a rearrangement signature. The positive predictive values (= y-axis) were computed as the number of single-cell rearrangements with a matching rearrangement signature in the reference B8FF4C paired-end map, divided by the total number of single-cell rearrangements present in the respective single-cell paired-end map. Refined paired-end maps that contained only rearrangement signatures supported by a minimum threshold number of discordant read pairs (= x-axis) and which encompassed >5 kb (except for putative inter-chromosomal events) were used for all calculations. The reference B8FF4C paired-end map consisted of signatures that encompassed >5 kb (except for putative inter-chromosomal events) and that were supported by two or more discordantly mapping read pairs. 
(C) A Circos-plot depicting confirmed HCC38 rearrangements identified in single cell ‘mda-sc82’ following paired-end sequencing of the MDA product. From the outside to the inside of the Circos-plot: (i) chromosome ideograms, (ii) the integer DNA copy number heat map (using 10-kb bins and γ = 25) of the non-WGA B8FF4C subclone, (iii) the integer DNA copy number heat map (using 10-kb bins and γ = 25) of the single-cell ‘mda-sc82’ sample, (iv) the amount of read pairs supporting each single-cell rearrangement is depicted by a bar (scale 2–30) at the start of each rearrangement signature and (v) confirmed HCC38 rearrangements identified in single cell ‘mda-sc82’ following paired-end sequencing. Color legends for the rearrangements and the copy number heat map are indicated. (D) A Circos-plot depicting confirmed HCC38 rearrangements identified in single cell ‘PicoPlex-sc2’ following paired-end sequencing. From the outside to the inside of the Circos-plot: (i) chromosome ideograms, (ii) the integer DNA copy number heat map (using 10-kb bins and γ = 25) of the non-WGA B8FF4C subclone, (iii) the integer DNA copy number heat map (using 10-kb bins and γ = 25) of the single-cell ‘PicoPlex-sc2’ sample, (iv) the amount of read pairs supporting each single-cell rearrangement is depicted by a bar (scale 2–30) at the start of each rearrangement signature and (v) confirmed HCC38 rearrangements identified in single cell ‘PicoPlex-sc2’ following paired-end sequencing. Color legends for the rearrangements and the copy number heat map are indicated. Circos-plots depicting confirmed HCC38 rearrangements that are identified in all non-WGA subclone and single-cell paired-end maps individually are presented in Supplementary Figure S8.
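The sensitivity and positive predictive value calculations in panels A and B can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: signatures are matched by a shared hypothetical identifier (the real analysis would match by genomic coordinates), the >5 kb size filter is omitted, and all names and counts below are invented toy data.

```python
from typing import Dict, Optional, Set


def refine(paired_end_map: Dict[str, int], min_pairs: int) -> Set[str]:
    """Keep only signatures supported by at least `min_pairs` discordant read pairs."""
    return {sig for sig, n in paired_end_map.items() if n >= min_pairs}


def sensitivity(single_cell_map: Dict[str, int], confirmed: Set[str], min_pairs: int) -> float:
    """Fraction of confirmed signatures present in the refined single-cell map."""
    called = refine(single_cell_map, min_pairs)
    return len(called & confirmed) / len(confirmed)


def positive_predictive_value(
    single_cell_map: Dict[str, int],
    reference_map: Dict[str, int],
    min_pairs: int,
) -> Optional[float]:
    """Fraction of refined single-cell signatures that match the reference map.

    The reference map keeps signatures with two or more supporting read pairs,
    as stated in the caption. Returns None when the refined map is empty.
    """
    called = refine(single_cell_map, min_pairs)
    if not called:
        return None
    reference = refine(reference_map, 2)
    return len(called & reference) / len(called)


# Toy data: signature identifier -> number of supporting discordant read pairs.
single_cell = {"sigA": 11, "sigB": 18, "sigC": 9, "sigD": 2, "sigE": 3}
reference = {"sigA": 16, "sigB": 24, "sigC": 17, "sigE": 1, "sigF": 27}
confirmed = {"sigA", "sigB", "sigC", "sigF"}

for threshold in (2, 10):
    sens = sensitivity(single_cell, confirmed, threshold)
    ppv = positive_predictive_value(single_cell, reference, threshold)
    print(f"min read pairs {threshold}: sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

Raising the threshold in this toy example lowers sensitivity and raises the positive predictive value, which is the trade-off the two panels plot against the x-axis.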
Figure 3.
Detection of imbalanced structural variants by paired-end mapping of single-cell MDA-sequences. Integration of focal read-depth anomalies with aberrantly mapping read pairs allows accurate copy number variant detection in single cells and discloses the structure of the DNA imbalances. Read pair signatures typical for tandem duplications, deletions or inter-chromosomal lesions are depicted in the centre of the Circos-plot in green, red and purple, respectively. The amount of read pairs supporting each rearrangement is depicted by a bar (scale 2–30) at the start of each rearrangement signature in the outer circle of the Circos-plot. Subsequently, the logR values are shown on a grid (logR values above zero are depicted in green, below zero in red). Dark blue lines depict the B8FF4C reference logR segments determined from sequences of a non-WGA DNA sample; yellow lines indicate the single-cell MDA logR segments (segmentation penalty γ = 150). The top shows the data of the B8FF4C reference subclone, the bottom four panels depict the single cells ‘mda-sc1’, ‘mda-sc2’, ‘mda-sc82’ and ‘mda-sc83’, respectively. For these samples, the following rearrangements are shown: (i) a 1.7-Mb tandem duplication signature on chromosome 1 (read pair count in the reference = 24, read pair count in the single cells: 3, 14, 8 and 47, respectively). (ii) An inter-chromosomal rearrangement between chromosomes 2 and 6 (a minimum read pair count of nine was applied for putative inter-chromosomal events, if this threshold was not reached a faded purple line represents the rearrangement). (iii) A 1.7-Mb tandem duplication signature on chromosome 3 (read pair count in the reference = 24, read pair count in the single cells: 18, 14, 6 and 21, respectively). (iv) A 46-Mb tandem duplication signature on chromosome 5 (read pair count in the reference = 17, read pair count in the single cells: 9, 6, 4 and 11, respectively). 
(v) A 1.3-Mb tandem duplication signature on chromosome 6 (read pair count in the reference = 10, read pair count in the single cells: 4, 5, 3 and 4, respectively). (vi) A 4.5-Mb deletion signature on chromosome 10 (read pair count in the reference = 16, read pair count in the single cells: 11, 3, 2 and 12, respectively). (vii) A 1.6-Mb tandem duplication signature on chromosome 11 (read pair count in the reference = 12, read pair count in the single cells: 30, 10, 12 and 0, respectively). (viii) An 8.6-Mb deletion signature on chromosome 18 (read pair count in the reference = 27, read pair count in the single cells: 25, 2, 5 and 5, respectively). Circos-plots for all non-WGA subclone, single-cell MDA and single-cell PicoPlex samples depicting the same loci can be found in Supplementary Figure S9.
Figure 4.
Detection of imbalanced structural variants by paired-end mapping of single-cell PicoPlex-sequences. Aberrantly mapping read pairs typical for tandem-duplication (green), deletion (red) and inter-chromosomal rearrangement (purple) signatures were captured from the refined pool of aberrantly mapping read pairs using a ∼50-kb radius around the single-cell PicoPlex logR break points. For intra-chromosomal rearrangements, only those encompassing >5 kb are depicted in the centre of the Circos-plot. The amount of read pairs supporting each rearrangement is depicted by a bar (scale 2–30) at the start of each rearrangement signature in the outer circle of the Circos-plot. Subsequently, the logR values are shown on a grid (logR values above zero are depicted in green, below zero in red). Dark blue lines depict the B8FF4C reference logR segments determined from sequences of a non-WGA DNA sample, yellow lines the single-cell PicoPlex logR-segments (γ = 150 for the rearrangement on chromosome 10 and γ = 25 for all other rearrangements). The top shows the data of the reference subclone B8FF4C, the bottom four panels depict the single cells ‘PicoPlex-sc1’, ‘PicoPlex-sc2’, ‘PicoPlex-sc9’ and ‘PicoPlex-sc10’, respectively. For these samples, the following rearrangements are shown: (i) a 98-kb deletion signature on chromosome 1 (read pair count in the reference = 6, read pair count in the single cells: 6, 2, 4 and 9, respectively). In PicoPlex-sc10, the discordant read pair signature was present, yet not captured by baiting as the logR segmentation missed the deletion in this cell (shown by a faded red line). (ii) A 2.3-Mb tandem duplication signature on chromosome 2 (read pair count in the reference = 6, read pair count in the single cells: 3, 5, 11 and 1 (shown faded), respectively). (iii) A 1.7-Mb tandem duplication signature on chromosome 3 (read pair count in the reference = 24, read pair count in the single cells: 11, 17, 6 and 4, respectively). 
(iv) A 1-Mb tandem duplication signature on chromosome 5 (read pair count in the reference = 18, read pair count in the single cells: 5, 7, 10 and 15, respectively). (v) A 1.1-Mb tandem duplication signature on chromosome 6 (read pair count in the reference = 12, read pair count in the single cells: 70, 52, 48 and 83, respectively). (vi) A 62.3-Mb deletion signature on chromosome 10 (read pair count in the reference = 19, read pair count in the single cells: 19, 71, 48 and 59, respectively). The Circos-plots for all non-WGA subclone, single-cell MDA and single-cell PicoPlex samples depicting the same loci can be found in Supplementary Figure S10.
Figure 5.
Accuracy of WGA nucleotide copying and genotyping. (A) Nucleotide mismatch frequency with the hg19 reference genome at each base of the read. Only bases with a base-call quality of ≥30 in reads having a minimum mapping quality of 30 were considered. The PicoPlex WGA method clearly introduces significantly more WGA nucleotide errors than MDA. (B and C) Approximately 450 000 SNPs, which were heterozygous in the sequences of two HCC38 subclones (B8FF4C and B8FB3A), were genotyped in the single-cell sequences. (B) Single-cell SNP zygosity concordance with the reference genotype (y-axis) as a function of read depth across the SNPs (x-axis). (C) Single-cell SNP call-rate (y-axis) as a function of read depth across the SNPs (x-axis).
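The two genotyping metrics in panels B and C can be sketched as follows, under explicit assumptions: every SNP in the input is heterozygous in the reference, a missing call is represented as `None`, and SNPs are grouped by their exact read depth. The genotype caller and the actual depth binning are not described in the caption, so the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def summarize_by_depth(
    snps: Iterable[Tuple[int, Optional[str]]],
) -> Dict[int, Tuple[float, Optional[float]]]:
    """Map read depth -> (call rate, zygosity concordance).

    Each input item is (read depth, single-cell genotype), where the genotype is
    "het", "hom", or None for no call. Concordance is the fraction of called SNPs
    that are heterozygous, matching the heterozygous reference genotype.
    """
    total: Dict[int, int] = defaultdict(int)
    called: Dict[int, int] = defaultdict(int)
    het: Dict[int, int] = defaultdict(int)
    for depth, genotype in snps:
        total[depth] += 1
        if genotype is not None:
            called[depth] += 1
            if genotype == "het":
                het[depth] += 1
    summary: Dict[int, Tuple[float, Optional[float]]] = {}
    for depth in sorted(total):
        call_rate = called[depth] / total[depth]
        concordance = het[depth] / called[depth] if called[depth] else None
        summary[depth] = (call_rate, concordance)
    return summary


# Toy genotype calls (hypothetical, not taken from the paper).
toy_snps = [
    (5, "het"), (5, "hom"), (5, None), (5, None),
    (20, "het"), (20, "het"), (20, "het"), (20, "hom"),
]
summary = summarize_by_depth(toy_snps)
for depth, (call_rate, concordance) in summary.items():
    print(depth, call_rate, concordance)
```

A heterozygous reference SNP called homozygous in a single cell, as in the toy data, would be consistent with allelic dropout or preferential amplification of one allele during WGA.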
Figure 6.
De novo structural variants acquired over a single tumour cell cycle and cleavage cell divisions in a human embryo. (A) Tumour cells related by one cell cycle. The single-cell genomes were amplified by PicoPlex technology. Chromosome 2 is shown. Single-cell DNA copy number signals are depicted in black and single-cell DNA copy number segments in red. Note that a pericentric DNA gain in cell ‘PicoPlex-sc9’ is not compensated by a deletion in the sister cell ‘PicoPlex-sc10’. (B) Genome-wide integer DNA copy number heat maps and BAF of three sister blastomeres of a biopsied human cleavage stage embryo following IVF. The blastomere genomes were amplified by MDA. From the outer to the inner side of the Circos-plot, the DNA copy number heat map and BAF profile of three blastomeres ‘mda-sc1113’, ‘mda-sc1116’ and ‘mda-sc1117’ are shown consecutively. The following de novo DNA imbalances were detected across the cell’s genomes (using 50-kb genomic bins for focal read analysis; PCF segmentation penalty γ = 150 for cells ‘mda-sc1113’ and ‘mda-sc1116’; γ = 200 for cell ‘mda-sc1117’, which received lower sequencing coverage; notice that all genuine deletions are corroborated by a loss-of-heterozygosity signature in the BAF): (i) a ∼21-Mb 1pter deletion in blastomere ‘mda-sc1117’ with reciprocal duplications of the same locus in blastomeres ‘mda-sc1113’ and ‘mda-sc1116’. Cell ‘mda-sc1117’ in addition contains a ∼54-Mb duplication flanking the 1pter deletion. (ii) Blastomere ‘mda-sc1116’ carries a 1q-arm deletion with a reciprocal DNA gain in cell ‘mda-sc1113’. (iii) Blastomere ‘mda-sc1117’ has a 4qter deletion with a reciprocal amplification of this locus in cell ‘mda-sc1113’ (notice the clear distortion of the BAF across the nine DNA copies of this locus). The remaining part of chromosome 4 in ‘mda-sc1117’ shows a DNA gain. (iv) Blastomere ‘mda-sc1116’ carries a monosomy 7 with reciprocal trisomy in cell ‘mda-sc1113’. (v) Blastomere ‘mda-sc1117’ carries a 10q-arm duplication. 
(vi) The monosomy X of this male embryo is detected in all cells. Apparent DNA losses at pericentromeric and telomeric loci, not corroborated by LOH in the BAF (e.g. chromosomes 15 and 19), were interpreted as false positives.
Figure 7.
Paired-end sequence analysis of a single cell allows the characterization of an unmapped inter-chromosomal rearrangement to base resolution. By paired-end sequence analysis of a single cell ‘mda-sc124’ biopsied of a human cleavage stage embryo that was derived from a PGD-IVF cycle for a balanced translocation t(1;16)(p36;p12), we were able to pinpoint and characterize the break points on the derivative chromosomes der(1) and der(16) segregating in the family. The male individual of this couple opting for PGD carried the balanced translocation t(1;16)(p36;p12). (A) A Circos-plot for the chromosomes 1 and 16 representing (from the outside to the inside): (i) a chromosome ideogram, (ii) the logR values derived from an SNP array analysis performed on the DNA of the affected sibling (γ = 25, orange line), which indicates that the sibling is carrier of the der(1) chromosome, (iii) the BAF derived from the affected sibling’s SNP array analysis supports the DNA imbalances caused by the der(1) in the sibling, (iv) the logR derived from the paired-end sequence data of single cell ‘mda-sc124’ (50-kb bins, PCF segmentation penalty γ = 300, orange line) indicates that cell ‘mda-sc124’ carries the der(16) chromosome and (v) the inter-chromosomal rearrangement read pair signature (purple curve, amount of supporting read pairs = 6) corroborating the der(16) break point in cell ‘mda-sc124’ is shown following single-cell paired-end sequencing and mapping. This single-cell rearrangement is also in line with the der(1) break point in the affected sibling. (B) Gel electrophoresis images of the PCR products across the break points. Primers to amplify over the break points were designed based on the paired-end sequence data of the single blastomere ‘mda-sc124’. As expected, the der(16) break point is present in the father carrying the balanced translocation, as well as in cell ‘mda-sc124’, but not in the mother or the affected child. 
In contrast, the der(1) break point could only be amplified in the father and the affected sibling. A control PCR for a fragment on chromosome 16p confirmed the quality of our DNA samples. (C) Capillary sequencing of the PCR products obtained for the single cell ‘mda-sc124’, the father and the affected sibling confirmed the translocation break points to base resolution. At the translocation break point, a single base pair deletion was observed.
