Highly multiplexed, fast and accurate nanopore sequencing for verification of synthetic DNA constructs and sequence libraries

Andrew Currin^{1,2}, Neil Swainston^{1,2,3}, Mark S Dunstan^{1,2}, Adrian J Jervis^{1,2}, Paul Mulherin^{1,2}, Christopher J Robinson^{1,2}, Sandra Taylor^{1,2}, Pablo Carbonell^{1,2}, Katherine A Hollywood^{1,2}, Cunyu Yan^{1,2}, Eriko Takano^{1,2}, Nigel S Scrutton^{1,2}, Rainer Breitling^{1,2}

Affiliations

¹ Manchester Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, The University of Manchester, Manchester M1 7DN, UK.
² School of Natural Sciences, Department of Chemistry, Faculty of Science and Engineering, The University of Manchester, Manchester M13 9PL, UK.
³ Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, UK.

PMID: 32995546
PMCID: PMC7445882
DOI: 10.1093/synbio/ysz025

Andrew Currin et al. Synth Biol (Oxf). 2019 Oct 29;4(1):ysz025. doi: 10.1093/synbio/ysz025. eCollection 2019.

Abstract

Synthetic biology utilizes the Design-Build-Test-Learn pipeline for the engineering of biological systems. Typically, this requires the construction of specifically designed, large and complex DNA assemblies. The availability of cheap DNA synthesis and automation enables high-throughput assembly approaches, generating a heavy demand for DNA sequencing to verify correctly assembled constructs. Next-generation sequencing is ideally positioned to perform this task; however, with expensive hardware costs and bespoke data analysis requirements, few laboratories utilize this technology in-house. Here a workflow for highly multiplexed sequencing is presented, capable of fast and accurate sequence verification of DNA assemblies using nanopore technology. A novel sample barcoding system using polymerase chain reaction is introduced, and sequencing data are analyzed through a bespoke analysis algorithm. Crucially, this algorithm overcomes the problem of high-error-rate nanopore data (which typically prevents identification of single nucleotide variants) through statistical analysis of strand bias, permitting accurate sequence analysis with single-base resolution. As an example, 576 constructs (6 × 96-well plates) were processed in a single workflow in 72 h (from Escherichia coli colonies to analyzed data). Given its low hardware costs and highly multiplexed capability, the workflow provides cost-effective access to powerful DNA sequencing for any laboratory, with applications beyond synthetic biology including directed evolution, single nucleotide polymorphism analysis and gene synthesis.

Keywords: DNA assembly; nanopore sequencing; next-generation sequencing; strand bias; synthetic biology.

Figures

**Figure 1.**
Allocation of primer pairs to enable the identification of individual wells from highly multiplexed samples, using well B1 from plate 1 as an example. Each well is allocated a forward primer, identifying the source plate, and a reverse primer, identifying the well. This enables the accurate identification of each individual well by data analysis after sequencing.
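The allocation scheme in this caption can be illustrated with a minimal Python sketch. It assumes a standard 8 × 12 plate layout numbered in row-major order; the primer identifiers (`F01`, `R13`, and so on) and the function names `allocate` and `identify` are hypothetical and are not taken from the paper. The point is only that one forward primer per plate plus one reverse primer per well gives a unique pair for every well.

```python
from __future__ import annotations
from string import ascii_uppercase

# Assumed layout: rows A-H, columns 1-12, numbered in row-major order.
WELLS = [f"{row}{col}" for row in ascii_uppercase[:8] for col in range(1, 13)]


def allocate(plate: int, well: str) -> tuple[str, str]:
    """Return hypothetical (forward, reverse) primer IDs for a plate and well.

    The forward primer encodes the source plate; the reverse primer
    encodes the well position within that plate.
    """
    if well not in WELLS:
        raise ValueError(f"unknown well: {well}")
    return f"F{plate:02d}", f"R{WELLS.index(well) + 1:02d}"


def identify(forward_id: str, reverse_id: str) -> tuple[int, str]:
    """Invert allocate(): recover (plate, well) from a primer pair."""
    plate = int(forward_id[1:])
    well = WELLS[int(reverse_id[1:]) - 1]
    return plate, well


if __name__ == "__main__":
    pair = allocate(1, "B1")
    print(pair, identify(*pair))
```

Under this scheme, six plates need 6 forward and 96 reverse primers (102 in total) to label all 576 wells, rather than one unique barcode per well.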

**Figure 2.**
Overview of the construct-sequencing workflow. Colonies harbouring assembled plasmids are first (A) picked and (B) cultured in deep well plates, prior to (C) dilution to create the PCR template. (D) PCR amplification of the construct generates 5′ (red) and 3′ (green) barcoded amplicons which are (E) analyzed by capillary electrophoresis. (F) Pooled amplicons are prepared for sequencing by adapter ligation and (G) sequenced using the MinION device. (H) Bioinformatics processing of data identifies mutations and removes systematic errors by probabilistic analysis and (I) data metrics are output.

**Figure 3.**
Examples of strand bias and a genuine SNV in nanopore data. Each row corresponds to a different read. Bases correctly aligned to the target sequence are shown in gray and potential mutations are highlighted in color (A = yellow, G = blue, T = pink, C = green, deletion = red). The consensus basecall is identified at the relevant position. Strand bias is shown by the inconsistent SNV basecalling between the (A) forward and (B) reverse strand reads from the same sample. Our statistical analysis of this alignment prevents erroneous SNV identification. In contrast, genuine SNVs are identified by an agreement between the (C) forward and (D) reverse strand read data.
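The distinction drawn in this caption can be sketched in Python as follows. The statistical test used by the authors is not specified here, so this sketch substitutes a two-sided Fisher's exact test on forward/reverse × variant/reference read counts, a common way to detect strand bias. The function names, the `alpha` threshold and the `min_alt_fraction` cutoff are illustrative assumptions, not the published method.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def classify_site(fwd_alt: int, fwd_ref: int, rev_alt: int, rev_ref: int,
                  alpha: float = 0.01, min_alt_fraction: float = 0.5) -> str:
    """Label one alignment position (thresholds are assumed, not published)."""
    depth = fwd_alt + fwd_ref + rev_alt + rev_ref
    if depth == 0:
        return "no coverage"
    # Strands disagree on the basecall: treat as systematic error.
    if fisher_exact_two_sided(fwd_alt, fwd_ref, rev_alt, rev_ref) < alpha:
        return "strand bias"
    # Strands agree: call a variant only if most reads support it.
    if (fwd_alt + rev_alt) / depth >= min_alt_fraction:
        return "variant"
    return "reference"


if __name__ == "__main__":
    print(classify_site(18, 2, 1, 19))   # variant seen on forward reads only
    print(classify_site(18, 2, 17, 3))   # variant seen on both strands
```

In this sketch, a position where the variant base appears almost exclusively on one strand is flagged as strand bias and not reported, while a position supported by both strands is reported as a variant.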

Grants and funding

BB/M017702/1/BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
