Nat Biotechnol. 2024 Jan;42(1):132-138.
doi: 10.1038/s41587-023-01750-7. Epub 2023 May 25.

Sequencing by avidity enables high accuracy with low reagent consumption

Sinan Arslan #  1 Francisco J Garcia #  1 Minghao Guo #  1 Matthew W Kellinger #  1 Semyon Kruglyak #  1 Jake A LeVieux #  1 Adeline H Mah #  1 Haosen Wang #  1 Junhua Zhao #  1 Chunhong Zhou #  1 Andrew Altomare  1 John Bailey  1 Matthew B Byrne  1 Chiting Chang  1 Steve X Chen  1 Byungrae Cho  1 Claudia N Dennler  1 Vivian T Dien  1 Derek Fuller  1 Ryan Kelley  1 Omid Khandan  1 Michael G Klein  1 Michael Kim  1 Bryan R Lajoie  1 Bill Lin  1 Yu Liu  1 Tyler Lopez  1 Peter T Mains  1 Andrew D Price  1 Samantha R Robertson  1 Hermes Taylor-Weiner  1 Ramreddy Tippana  1 Austin B Tomaney  1 Su Zhang  1 Minna Abtahi  1 Mark R Ambroso  1 Rosita Bajari  1 Ava M Bellizzi  1 Chris B Benitez  1 Daniel R Berard  1 Lorenzo Berti  1 Kelly N Blease  1 Angela P Blum  1 Andrew M Boddicker  1 Leo Bondar  1 Chris Brown  1 Chris A Bui  1 Juan Calleja-Aguirre  1 Kevin Cappa  1 Joshua Chan  1 Victor W Chang  1 Katherine Charov  1 Xiyi Chen  1 Rodger M Constandse  1 Weston Damron  1 Mariam Dawood  1 Nicole DeBuono  1 John D Dimalanta  1 Laure Edoli  1 Keerthana Elango  1 Nikka Faustino  1 Chao Feng  1 Matthew Ferrari  1 Keith Frankie  1 Adam Fries  1 Anne Galloway  1 Vlad Gavrila  1 Gregory J Gemmen  1 James Ghadiali  1 Arash Ghorbani  1 Logan A Goddard  1 Adriana Roginski Guetter  1 Garren L Hendricks  1 Jendrik Hentschel  1 Daniel J Honigfort  1 Yun-Ting Hsieh  1 Yu-Hsien Hwang Fu  1 Scott K Im  1 Chaoyi Jin  1 Shradha Kabu  1 Daniel E Kincade  1 Shawn Levy  1 Yu Li  1 Vincent K Liang  1 William H Light  1 Jonathan B Lipsher  1 Tsung-Li Liu  1 Grace Long  1 Rui Ma  1 John M Mailloux  1 Kyle A Mandla  1 Anyssa R Martinez  1 Max Mass  1 Daniel T McKean  1 Michael Meron  1 Edmund A Miller  1 Celyne S Moh  1 Rachel K Moore  1 Juan Moreno  1 Jordan M Neysmith  1 Cassandra S Niman  1 Jesus M Nunez  1 Micah T Ojeda  1 Sara Espinosa Ortiz  1 Jenna Owens  1 Geoffrey Piland  1 Daniel J Proctor  1 Josua B Purba  1 Michael Ray  1 Daisong Rong  1 Virginia M Saade  1 Sanchari Saha  1 Gustav Santo 
Tomas  1 Nicholas Scheidler  1 Luqmanal H Sirajudeen  1 Samantha Snow  1 Gudrun Stengel  1 Ryan Stinson  1 Michael J Stone  1 Keoni J Sundseth  1 Eileen Thai  1 Connor J Thompson  1 Marco Tjioe  1 Christy L Trejo  1 Greg Trieger  1 Diane Ni Truong  1 Ben Tse  1 Benjamin Voiles  1 Henry Vuong  1 Jennifer C Wong  1 Chiung-Ting Wu  1 Hua Yu  1 Yingxian Yu  1 Ming Yu  1 Xi Zhang  1 Da Zhao  1 Genhua Zheng  1 Molly He  1 Michael Previte  2
Sinan Arslan et al. Nat Biotechnol. 2024 Jan.

Abstract

We present avidity sequencing, a sequencing chemistry that separately optimizes the processes of stepping along a DNA template and that of identifying each nucleotide within the template. Nucleotide identification uses multivalent nucleotide ligands on dye-labeled cores to form polymerase-polymer-nucleotide complexes bound to clonal copies of DNA targets. These polymer-nucleotide substrates, termed avidites, decrease the required concentration of reporting nucleotides from micromolar to nanomolar and yield negligible dissociation rates. Avidity sequencing achieves high accuracy, with 96.2% and 85.4% of base calls having an average of one error per 1,000 and 10,000 base pairs, respectively. We show that the average error rate of avidity sequencing remained stable following a long homopolymer.
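The accuracy figures above map onto the Phred quality scale, where a score Q corresponds to an error probability of 10^(-Q/10): one error per 1,000 bases is Q30 and one per 10,000 is Q40. The following is a minimal Python sketch of that conversion; the function names are illustrative and are not taken from the paper.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred quality score."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


def fraction_at_or_above(quals, threshold):
    """Fraction of base calls whose quality score is >= threshold."""
    quals = list(quals)
    if not quals:
        raise ValueError("no quality scores supplied")
    return sum(1 for q in quals if q >= threshold) / len(quals)
```

Under this convention, the abstract's statement reads as 96.2% of base calls at or above Q30 and 85.4% at or above Q40.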

Conflict of interest statement

All authors are current or former employees of Element Biosciences. All authors may hold stock options in the company.

Figures

Fig. 1. Avidity sequencing workflow and scheme.
a, Sequencing by avidity. A reagent containing multivalent avidite substrates and an engineered polymerase are combined with DNA polonies inside a flowcell. The engineered polymerase binds to the free 3′ ends of the primer-template of a polony and selects the correct cognate avidite via base-pairing discrimination. The multivalent avidite interacts with multiple polymerases on one polony to create avidity binding that reduces the effective Kd of the avidite substrates 100-fold compared with a monovalent dye-labeled nucleotide, allowing productive binding of nanomolar concentrations. Multiple polymerase-mediated binding events per avidite ensure a long signal persistence time. Imaging of fluorescent, bound avidites enables base classification. Following detection, avidites are removed from the polonies. Extension by one base using an engineered polymerase incorporates an unlabeled, blocked nucleotide. A terminal 3′ hydroxyl is regenerated on the DNA strand, allowing repetition of the cycle. b, Rendering of a single avidite bound to a DNA polony via polymerase-mediated selection. The initial surface primer used for library hybridization and extension during polony formation is shown in blue. Sequencing primers (red) are shown annealed to the single-strand DNA polony (gray). Each arm of the avidite (black) connects the avidite core containing multiple fluorophores (green) to a nucleotide substrate. The polymerase bound to the sequencing primer selects the correct nucleotide to base pair with the templating base (inset). The result is multiple base-mediated anchor points noncovalently attaching the avidite to the DNA polony. c, Rendering of multiple DNA polonies with template-specific avidites bound during the binding step of the cycle (polymerase not shown for simplicity). Many avidites bind to each DNA polony generating a fluorescent signal during detection. Multiple long, flexible polymer linkers connect the core to the nucleotide substrates.
Fig. 2. Nucleotide and avidite binding kinetics.
a, Monovalent fluorophore-labeled nucleotide concentration dependence of the observed rate of incorporation. Time series were performed at each concentration and fit to a single exponential equation to derive a rate. Observed rates were plotted as a function of concentration and fit to a hyperbolic equation, deriving a value of kpol = 0.86 ± 0.14 s⁻¹ and Kd,app = 1.6 ± 0.6 µM. b,c, Real-time association kinetics of signal generation resulting from reacting multivalent avidite substrates (b) and monovalent nucleotides (c) with DNA polonies. d,e, Real-time measurement of signal decay following flow cell washing for imaging of multivalent avidite substrates (d) and monovalent nucleotides (e).
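The two-stage fit described in panel a can be sketched in Python as follows. The caption does not give the exact equations, so this assumes the common forms: a single exponential approach to completion, 1 - exp(-k_obs·t), and a hyperbola, k_obs = kpol·[N] / (Kd,app + [N]). The constants are the fitted values quoted in the caption; the function names are hypothetical.

```python
import math

K_POL = 0.86     # s^-1, fitted maximum rate quoted in the caption
KD_APP_UM = 1.6  # µM, fitted apparent dissociation constant from the caption


def observed_rate(conc_um, k_pol=K_POL, kd_app_um=KD_APP_UM):
    """Hyperbolic dependence of the observed rate on nucleotide concentration.

    Assumed form: k_obs = k_pol * [N] / (Kd_app + [N]), with [N] in µM.
    """
    if conc_um < 0:
        raise ValueError("concentration must be non-negative")
    return k_pol * conc_um / (kd_app_um + conc_um)


def fraction_reacted(t_s, k_obs):
    """Assumed single exponential time course: 1 - exp(-k_obs * t)."""
    return 1.0 - math.exp(-k_obs * t_s)
```

In this form the observed rate is half of kpol when the concentration equals Kd,app, and it approaches kpol at saturating concentrations.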
Fig. 3. Predicted and observed quality scores for a 2 × 150-bp sequencing run of human genome HG002.
a, Read 1 (R1). b, Read 2 (R2). Points on the diagonal indicate that predicted scores match observed scores. The histograms show that the majority of the data points are >Q40.
Fig. 4. Post-homopolymer performance across platforms.
Mismatch percentages of AVITI, NovaSeq 6000 and NextSeq 2000 reads before and after homopolymers of length 12 or greater.
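One way such a before-and-after comparison could be tabulated is sketched below in Python. This is not the authors' pipeline: it assumes each read is already aligned to the reference without gaps (equal-length strings), and the 10-base flanking window and all function names are hypothetical choices made for illustration.

```python
import re


def find_homopolymers(ref, min_len=12):
    """Return (start, end) spans of single-base runs of at least min_len in ref."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.end()) for m in pattern.finditer(ref)]


def mismatch_pct(read, ref, start, end):
    """Percent of mismatching positions in [start, end), or None if the window is empty."""
    start, end = max(start, 0), min(end, len(ref))
    n = end - start
    if n <= 0:
        return None
    mismatches = sum(1 for i in range(start, end) if read[i] != ref[i])
    return 100.0 * mismatches / n


def flank_mismatch_pct(read, ref, min_len=12, window=10):
    """For each homopolymer, report mismatch percent in the window before and after it."""
    if len(read) != len(ref):
        raise ValueError("sketch assumes an ungapped, equal-length alignment")
    results = []
    for start, end in find_homopolymers(ref, min_len):
        before = mismatch_pct(read, ref, start - window, start)
        after = mismatch_pct(read, ref, end, end + window)
        results.append((start, end, before, after))
    return results
```

A real analysis would work from alignment records and handle indels and soft clipping, which this sketch deliberately leaves out.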
Fig. 5. Comparison of mismatch rate following homopolymers of lengths between 4 and 29.
The difference in mismatch percentage between avidity sequencing and SBS increases with homopolymer length. The box plot shows the median and quartiles; whiskers extend to 1.5× the interquartile range.
Fig. 6. Performance of a 300-cycle E. coli sequencing run.
a, Percentage Q30 by cycle. The overall Q30 percentage exceeds 96%, and the end of the read has 85% Q30. b, E. coli error rate as a function of cycle. Alignment settings strongly discourage soft clipping, and >99% of reads pass filter. Final cycle error rate was 0.019.
Extended Data Fig. 1. Model of an avidite.
(a) Side and top views of a modeled avidite. The protein core consists of fluorophore-labeled streptavidin. The monomers of tetrameric streptavidin are colored red, blue, green, and yellow. Dye conjugation sites through lysine-NHS chemistry are denoted in the surface rendering as magenta. Fluorophores are not pictured. Avidite arms are associated via a biotin interaction with the core streptavidin protein. Arms are mixed stoichiometrically to achieve averages of three nucleotide-containing arms and one linker to additional cores. Molecules conjugated to have been shortened in this representation. (b) Structure of an avidite arm. (c) Structure of the 4-arm linker connecting avidite cores.
Extended Data Fig. 2. Percentage of instances in which a k-mer contained at least one mismatch, compared across three instruments.
Panels a, b, and c display 1-mers, 2-mers, and 3-mers, respectively. The bars are sorted by AVITI contexts from most to least accurate.
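The per-context metric in this figure can be sketched in Python as follows. The sketch assumes ungapped, equal-length read and reference strings and takes the k-mer context from the reference; both choices, and the function name, are assumptions made for illustration rather than the authors' implementation.

```python
from collections import defaultdict


def kmer_mismatch_pct(pairs, k):
    """Percent of instances of each reference k-mer that contain >= 1 mismatch.

    pairs: iterable of (read, ref) strings of equal length, aligned without gaps.
    """
    totals = defaultdict(int)
    with_mismatch = defaultdict(int)
    for read, ref in pairs:
        if len(read) != len(ref):
            raise ValueError("sketch assumes an ungapped, equal-length alignment")
        for i in range(len(ref) - k + 1):
            kmer = ref[i:i + k]
            totals[kmer] += 1
            # Any difference inside the window counts the instance once.
            if read[i:i + k] != kmer:
                with_mismatch[kmer] += 1
    return {kmer: 100.0 * with_mismatch[kmer] / n for kmer, n in totals.items()}
```

Sorting the resulting dictionary by value would give the most-to-least accurate ordering used for the bars.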
Extended Data Fig. 3. Histogram of pairwise error differences.
Difference was selected as the metric to cancel the effects of human variants from the mismatch percent.
Extended Data Fig. 4. IGV display of homopolymer loci at the 5th, 50th, and 95th percentile of AVITI minus NovaSeq mismatch percent (corresponding to the dashed lines of Extended Data Fig. 3).
The red bar at the top indicates the homopolymer. Colors within the IGV read stack correspond to mismatches and soft clipping. Only mismatches contribute to the error rate calculation; soft-clipped bases are ignored.
Extended Data Fig. 5. Comparison of read number vs genomic coverage computed via Picard for PCR-free whole genome data.
AVITI most closely matches the 45-degree line due to the low duplicate rate.
Extended Data Fig. 6. F1 Score of SNPs and indels across GiaB stratifications.
F1 score for SNPs and indels stratified by all GiaB regions with at least 100 variants in the 4.2.1 truth set of sample HG002.
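For reference, the F1 score is the harmonic mean of precision and recall, which reduces to 2·TP / (2·TP + FP + FN). The Python sketch below computes it per stratification and applies the 100-variant cutoff mentioned in the caption. It assumes the number of truth variants in a region equals TP + FN, and the function names and input layout are hypothetical.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there are no calls or truth variants."""
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def stratified_f1(counts, min_variants=100):
    """Compute F1 per region, keeping regions with at least min_variants truth variants.

    counts: dict mapping region name -> (tp, fp, fn).
    Assumption: truth variants in a region = tp + fn.
    """
    return {
        region: f1_score(tp, fp, fn)
        for region, (tp, fp, fn) in counts.items()
        if tp + fn >= min_variants
    }
```

The 0.0 return for an empty region is a convention chosen here to avoid division by zero; such regions are excluded by the cutoff in any case.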
