Proc Natl Acad Sci U S A. 2017 Nov 21;114(47):12512-12517.
doi: 10.1073/pnas.1707609114. Epub 2017 Oct 24.

Ultraaccurate genome sequencing and haplotyping of single human cells

Wai Keung Chu et al. Proc Natl Acad Sci U S A.

Abstract

Accurate detection of variants and long-range haplotypes in genomes of single human cells remains very challenging. Common approaches require extensive in vitro amplification of genomes of individual cells using DNA polymerases and high-throughput short-read DNA sequencing. These approaches have two notable drawbacks. First, polymerase replication errors could generate tens of thousands of false-positive calls per genome. Second, relatively short sequence reads contain little to no haplotype information. Here we report a method, which is dubbed SISSOR (single-stranded sequencing using microfluidic reactors), for accurate single-cell genome sequencing and haplotyping. A microfluidic processor is used to separate the Watson and Crick strands of the double-stranded chromosomal DNA in a single cell and to randomly partition megabase-size DNA strands into multiple nanoliter compartments for amplification and construction of barcoded libraries for sequencing. The separation and partitioning of large single-stranded DNA fragments of the homologous chromosome pairs allows for the independent sequencing of each of the complementary and homologous strands. This enables the assembly of long haplotypes and reduction of sequence errors by using the redundant sequence information and haplotype-based error removal. We demonstrated the ability to sequence single-cell genomes with error rates as low as 10⁻⁸ and average 500-kb-long DNA fragments that can be assembled into haplotype contigs with N50 greater than 7 Mb. The performance could be further improved with more uniform amplification and more accurate sequence alignment. The ability to obtain accurate genome sequences and haplotype information from single cells will enable applications of genome sequencing for diverse clinical needs.

Keywords: haplotyping; microfluidics; mutation detection; single-cell sequencing.
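
The abstract reports haplotype contigs with an N50 greater than 7 Mb. As a reminder of what that statistic measures, the following is a minimal sketch of the usual N50 definition: the length of the contig at which the running total, taken from longest to shortest, first reaches half of the summed contig length. The function name `n50` and the example lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Contigs are sorted from longest to shortest; the N50 is the length
    of the contig at which the cumulative sum first reaches at least
    half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical haplotype contig lengths in bp (illustrative only,
# not taken from the SISSOR data set).
example_contigs = [9_000_000, 7_500_000, 4_000_000, 2_000_000, 500_000]
example_n50 = n50(example_contigs)
```

Because N50 weights contigs by length, a few long haplotype contigs dominate the statistic, which is why it is commonly quoted alongside the typical fragment length.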

Conflict of interest statement

Conflict of interest statement: X.H. and K.Z. are listed as inventors for a patent application related to the method disclosed in this manuscript. K.Z. is a cofounder and equity holder of Singlera Genomics Inc. V. Bafna is a cofounder, has an equity interest, and receives income from Digital Proteomics, LLC. The terms of this arrangement have been reviewed and approved by the University of California, San Diego in accordance with its conflict of interest policies. Digital Proteomics, LLC was not involved in the research presented here.

Figures

Fig. 1.
An overview of the experimental process of SISSOR technology. A single cell in suspension was identified by imaging and captured. The cell was lysed, and chromosomal DNA molecules were separated into single-stranded form using ALS. The single-stranded DNA molecules were randomly distributed and partitioned in 24 chambers. Each partition was pushed into an air-filled MDA chamber using a neutralization buffer, followed by an MDA reaction solution. MDA reaction was carried out by heating the entire device at 30 °C overnight. The amplified product in each individual chamber was collected out of the device and processed into the barcoded sequencing library.
Fig. 2.
Haplotyping of single-stranded DNA fragments using sequencing reads from a single-cell genome amplified using a SISSOR device with 24 chambers. Large subhaploid SISSOR fragments were first computed per chamber and then phased into haplotype 1 and haplotype 2 with HapCUT2 (13). SISSOR fragments could be visualized by mapping the sequencing reads to a reference genome. Some fragments were not phased due to either the lack of heterozygous SNPs or the presence of mixed sequences from two or more strands.
Fig. 3.
Error rate analysis of base consensus in phased SISSOR fragments. (A) Base sequences in single-stranded DNA fragments were constructed by variant calling of the mapped MDA products in each individual chamber, and the complementary strands were identified by comparing the haplotypes of the single-stranded fragments from different chambers. (B) Matching variant calls in the contigs from the same haplotype between two cells (cross-cell), representing the PGP1-specific sequence, were validated by the PGP1/WGS reference. Common MDA and library preparation error was defined by the mismatches of variant calls between two matching phased haplotypes within the same cell (position 1). Single-cell de novo mutation was defined by matching variant calls between two matching phased haplotypes, together with a matching variant call from at least one chamber in the other cell to the PGP1 reference (position 2). The rates of single-chamber MDA-based sequencing error (10⁻⁵) and single-cell de novo mutation (10⁻⁷) were calculated for SISSOR. Cross-cell consensus, where de novo variants were removed, was defined by the matching variant calls between phased haplotypes in two different cells (positions 3–7). The mismatch consensus to the PGP1 reference call (position 5) represented the discordance rate for SISSOR technology (10⁻⁸).
Fig. 4.
Differences between allele calls in PGP1 reference and SISSOR consensus. The consensus of CGI and Illumina WGS was used as the PGP1 reference. Positions lacking coverage in both PGP1 reference and SISSOR consensus were discarded.
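
The strand-matching idea in the Fig. 3 caption, where a call is trusted only when independently amplified strands of the same haplotype agree, can be illustrated with a toy sketch. This is a deliberate simplification under stated assumptions and not the authors' pipeline: the function name `strand_consensus`, the `min_support` parameter, and the use of `None` for an uncovered position are all hypothetical.

```python
def strand_consensus(calls, min_support=2):
    """Toy consensus for one position of one haplotype.

    `calls` holds one base call per chamber/strand (e.g. "A"), or None
    where that strand has no coverage. A base is returned only when at
    least `min_support` strands cover the position and all of them
    agree; otherwise None is returned.
    """
    observed = [c for c in calls if c is not None]
    if len(observed) < min_support:
        return None  # too few independent strands to confirm a call
    if len(set(observed)) != 1:
        return None  # strands disagree: likely an amplification error
    return observed[0]


# Hypothetical calls (illustrative only).
agreeing = strand_consensus(["A", "A"])      # both strands agree
conflicting = strand_consensus(["A", "G"])   # mismatch between strands
uncovered = strand_consensus(["A", None])    # only one strand covered
```

The design choice here is conservative: a disagreement discards the position instead of picking a majority base, mirroring the caption's treatment of within-cell mismatches as amplification or library preparation error.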

Comment in

  • Haplotype resolution at the single-cell level.
    Adey AC. Proc Natl Acad Sci U S A. 2017 Nov 21;114(47):12362-12364. doi: 10.1073/pnas.1717798114. Epub 2017 Nov 7. PMID: 29114045.

References

    1. Ke X, Taylor MS, Cardon LR. Singleton SNPs in the human genome and implications for genome-wide association studies. Eur J Hum Genet. 2008;16:506–515.
    2. Collins FS, Brooks LD, Chakravarti A. A DNA polymorphism discovery resource for research on human genetic variation. Genome Res. 1998;8:1229–1231.
    3. Fu Y, et al. Uniform and accurate single-cell sequencing based on emulsion whole-genome amplification. Proc Natl Acad Sci USA. 2015;112:11923–11928.
    4. Wheeler DA, et al. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008;452:872–876.
    5. Wang J, et al. The diploid genome sequence of an Asian individual. Nature. 2008;456:60–65.
