[Preprint]. 2023 Jan 13:2022.10.14.22280783.
doi: 10.1101/2022.10.14.22280783.

Development of an amplicon-based sequencing approach in response to the global emergence of human monkeypox virus

Nicholas F G Chen  1 Chrispin Chaguza  1 Luc Gagne  2 Matthew Doucette  2 Sandra Smole  2 Erika Buzby  2 Joshua Hall  2 Stephanie Ash  2 Rachel Harrington  2 Seana Cofsky  2 Selina Clancy  2 Curtis J Kapsak  3 Joel Sevinsky  3 Kevin Libuit  3 Daniel J Park  4 Peera Hemarajata  5 Jacob M Garrigues  5 Nicole M Green  5 Sean Sierra-Patev  6 Kristin Carpenter-Azevedo  6 Richard C Huard  6 Claire Pearson  7 Kutluhan Incekara  7 Christina Nishimura  7 Jian Ping Huang  7 Emily Gagnon  7 Ethan Reever  7 Jafar Razeq  7 Anthony Muyombwe  7 Vítor Borges  8 Rita Ferreira  8 Daniel Sobral  8 Silvia Duarte  9 Daniela Santos  9 Luís Vieira  9 João Paulo Gomes  8   10 Carly Aquino  11 Isabella M Savino  11 Karinda Felton  11 Moneeb Bajwa  11 Nyjil Hayward  11 Holly Miller  11 Allison Naumann  11 Ria Allman  11 Neel Greer  11 Amary Fall  12 Heba H Mostafa  12 Martin P McHugh  13   14 Daniel M Maloney  13   15 Rebecca Dewar  13 Juliet Kenicer  13 Abby Parker  13 Katharine Mathers  13 Jonathan Wild  13 Seb Cotton  13 Kate E Templeton  13 George Churchwell  16 Philip A Lee  16 Maria Pedrosa  16 Brenna McGruder  16 Sarah Schmedes  16 Matthew R Plumb  17 Xiong Wang  17 Regina Bones Barcellos  18 Fernanda M S Godinho  18 Richard Steiner Salvato  18 Aimee Ceniseros  19 Mallery I Breban  1 Nathan D Grubaugh  1   20 Glen R Gallagher  2   6 Chantal B F Vogels  1

Nicholas F G Chen et al. medRxiv.

Update in

  • Development of an amplicon-based sequencing approach in response to the global emergence of mpox.
    Chen NFG, Chaguza C, Gagne L, Doucette M, Smole S, Buzby E, Hall J, Ash S, Harrington R, Cofsky S, Clancy S, Kapsak CJ, Sevinsky J, Libuit K, Park DJ, Hemarajata P, Garrigues JM, Green NM, Sierra-Patev S, Carpenter-Azevedo K, Huard RC, Pearson C, Incekara K, Nishimura C, Huang JP, Gagnon E, Reever E, Razeq J, Muyombwe A, Borges V, Ferreira R, Sobral D, Duarte S, Santos D, Vieira L, Gomes JP, Aquino C, Savino IM, Felton K, Bajwa M, Hayward N, Miller H, Naumann A, Allman R, Greer N, Fall A, Mostafa HH, McHugh MP, Maloney DM, Dewar R, Kenicer J, Parker A, Mathers K, Wild J, Cotton S, Templeton KE, Churchwell G, Lee PA, Pedrosa M, McGruder B, Schmedes S, Plumb MR, Wang X, Barcellos RB, Godinho FMS, Salvato RS, Ceniseros A, Breban MI, Grubaugh ND, Gallagher GR, Vogels CBF. PLoS Biol. 2023 Jun 13;21(6):e3002151. doi: 10.1371/journal.pbio.3002151. eCollection 2023 Jun. PMID: 37310918. Free PMC article.

Abstract

The 2022 multi-country monkeypox (mpox) outbreak concurrent with the ongoing COVID-19 pandemic has further highlighted the need for genomic surveillance and rapid pathogen whole genome sequencing. While metagenomic sequencing approaches have been used to sequence many of the early mpox infections, these methods are resource-intensive and require samples with high viral DNA concentrations. Given the atypical clinical presentation of cases associated with the outbreak and uncertainty regarding viral load across both the course of infection and anatomical body sites, there was an urgent need for a more sensitive and broadly applicable sequencing approach. Highly multiplexed amplicon-based sequencing (PrimalSeq) was initially developed for sequencing of Zika virus, and later adapted as the main sequencing approach for SARS-CoV-2. Here, we used PrimalScheme to develop a primer scheme for human monkeypox virus that can be used with many sequencing and bioinformatics pipelines implemented in public health laboratories during the COVID-19 pandemic. We sequenced clinical samples that tested presumptive positive for human monkeypox virus with amplicon-based and metagenomic sequencing approaches. We found notably higher genome coverage across the virus genome, with minimal amplicon drop-outs, when using the amplicon-based sequencing approach, particularly in higher PCR cycle threshold (lower DNA titer) samples. Further testing demonstrated that Ct value correlated with the number of sequencing reads and influenced the percent genome coverage. To maximize genome coverage when resources are limited, we recommend selecting samples with a PCR cycle threshold (Ct) value below 31 and generating 1 million sequencing reads per sample. To support national and international public health genomic surveillance efforts, we sent out primer pool aliquots to 10 laboratories across the United States, United Kingdom, Brazil, and Portugal. These public health laboratories successfully implemented the human monkeypox virus primer scheme in various amplicon sequencing workflows and with different sample types across a range of Ct values. Thus, we show that amplicon-based sequencing can provide a rapidly deployable, cost-effective, and flexible approach to pathogen whole genome sequencing in response to newly emerging pathogens. Importantly, through the implementation of our primer scheme into existing SARS-CoV-2 workflows and across a range of sample types and sequencing platforms, we further demonstrate the potential of this approach for rapid outbreak response.

Conflict of interest statement

Competing interests

NDG is a consultant for Tempus Labs and the National Basketball Association for work related to COVID-19. All other authors declare no competing interests.

Figures

Figure 1. Comparison of percent genome coverage at 10X of clinical specimens sequenced with amplicon-based and metagenomic sequencing approaches.
DNA was manually extracted from 10 clinical samples with the QIAamp DSP DNA Blood Mini kit, and PCR cycle threshold (Ct) values were determined with the non-variola Orthopox real-time PCR assay. Libraries were prepared with amplicon-based and metagenomic sequencing approaches and sequenced on the Illumina MiSeq (2×150 bp) with a targeted 0.5–1 million reads per library for amplicon-based sequencing and 1.5–3 million reads per library for metagenomic sequencing. A negative template control was included during library prep for each sequencing run. For amplicon-based sequencing, consensus genomes were generated at a read depth coverage of 10X, and percent genome coverage relative to the reference genome (MT903345) was determined using the TheiaCoV_Illumina_PE Workflow Series on Terra.bio. For metagenomic sequencing, genomes were generated with the Broad Institute’s viral-pipelines on Terra.bio, using both the assemble_refbased and assemble_denovo workflows.
Figure 2. Percent genome coverage at 10X for clinical specimens sequenced with the amplicon-based sequencing approach.
A. Clinical specimens (n=123) consisting of 115 lesion swabs, 5 oropharyngeal swabs in the absence of lesions, and 3 oropharyngeal swabs in the presence of lesions, sequenced by the Massachusetts State Public Health Laboratory (MASPHL). Libraries were prepared using the Illumina DNA prep kit and sequenced on the MiSeq with 0.5–1 million reads per sample. A negative template control was included during library prep for each sequencing run. B. Lesion swabs (n=22) obtained from 12 individuals through the Connecticut Department of Public Health (CDPH) and sequenced by the Yale School of Public Health (YSPH). Libraries were prepared using the Illumina COVIDSeq test (RUO version) and sequenced on the NovaSeq with an average of 12 million reads per sample. A negative template control was included during library prep. Bioinformatic analyses were unified between both laboratories using iVar with a minimum read depth of 10.
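To make the "percent genome coverage at 10X" metric in these captions concrete, the following is a minimal Python sketch of one way to compute it. It is not the iVar or TheiaCoV implementation. It assumes a hypothetical input list with one read-depth value per reference position, and it assumes the denominator is the full reference length; the function name and toy data are illustrative only.

```python
def percent_genome_coverage(depths, min_depth=10):
    """Percentage of reference positions whose read depth is >= min_depth.

    `depths` is assumed to hold one depth value per reference position
    (an assumption of this sketch, not a format described in the source).
    """
    if not depths:
        raise ValueError("depths must contain at least one position")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-position depths for a toy 10-position genome.
toy_depths = [0, 5, 9, 10, 11, 250, 300, 12, 3, 40]
print(percent_genome_coverage(toy_depths))
```

In practice the per-position depths would come from an alignment against the reference genome; that step is outside the scope of this sketch.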
Figure 3. Percent genome coverage at 10X mapped read depth for 22 clinical specimens after randomly down-sampling to a specific number of sequencing reads.
To further investigate the combined effects of Ct value and number of sequencing reads per sample, we randomly down-sampled the CDPH/YSPH data to 2, 1.5, 1, and 0.5 million reads per sample. We used a logistic function analysis to plot the fitted lines indicating the decrease in percent genome coverage with higher Ct values.
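The random down-sampling step described above can be sketched as follows. The caption does not say which tool was used, so this is only an illustration of the idea: reads are drawn without replacement until a target count is reached. The function name, the read identifiers, and the seed are hypothetical, and each identifier is assumed to stand for one read (or one read pair for paired-end data).

```python
import random


def downsample_reads(read_ids, target, seed=0):
    """Randomly keep `target` reads, sampled without replacement.

    If the sample has fewer reads than `target`, all reads are kept.
    A fixed seed makes the selection reproducible.
    """
    if target < 0:
        raise ValueError("target must be non-negative")
    reads = list(read_ids)
    if target >= len(reads):
        return reads
    rng = random.Random(seed)
    return rng.sample(reads, target)


# Toy example: 20 hypothetical read IDs down-sampled to several depths.
toy_reads = [f"read_{i}" for i in range(20)]
for target in (15, 10, 5):
    subset = downsample_reads(toy_reads, target, seed=42)
    print(target, len(subset))
```

Percent genome coverage would then be recomputed on each subset to see how it changes with the number of reads.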
Figure 4. Depth of coverage by genome position for samples representing a range of Ct values and randomly down-sampled to different numbers of sequencing reads.
We determined the depth of coverage at each nucleotide position for selected samples that represented a range in Ct values from 16.4–35.2, and for which the number of raw sequencing reads was randomly down-sampled to 2, 1.5, 1, and 0.5 million sequencing reads. Each row represents a single specimen, ranked by Ct value from high (low DNA titer) to low (high DNA titer). Highlighted in yellow are positions of the genome with a depth of coverage below 10X.
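As an illustration of how positions with a depth of coverage below 10X could be located, here is a minimal Python sketch that groups consecutive low-depth positions into intervals. It is not taken from the authors' pipeline. The input is assumed to be a hypothetical list with one depth value per genome position, and intervals are reported as 1-based, inclusive coordinates.

```python
def low_coverage_intervals(depths, min_depth=10):
    """Return (start, end) intervals, 1-based and inclusive, where depth < min_depth."""
    intervals = []
    start = None
    for pos, depth in enumerate(depths, start=1):
        if depth < min_depth:
            if start is None:
                start = pos  # open a new low-coverage interval
        elif start is not None:
            intervals.append((start, pos - 1))  # close the open interval
            start = None
    if start is not None:
        intervals.append((start, len(depths)))  # interval runs to the last position
    return intervals


# Hypothetical per-position depths for a toy 10-position genome.
toy_depths = [12, 3, 4, 20, 30, 9, 9, 9, 15, 0]
print(low_coverage_intervals(toy_depths))
```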
Figure 5. Geographical distribution of public health laboratories that implemented the human monkeypox virus primer scheme with their established amplicon-based sequencing workflows.
Public health laboratories contributing data to this study include: Connecticut Department of Public Health (CDPH), Centro Estadual de Vigilância em Saúde (CEVS), Delaware Public Health Lab (DPHL), Florida Department of Health (FDH), Idaho Bureau of Laboratories (IBL), Johns Hopkins Medical Institutions (JHMI), Los Angeles County Public Health Lab (LACPHL), Massachusetts State Public Health Laboratory (MASPHL), Minnesota Department of Health (MDH), National Health Service Lothian (NHS Lothian), National Institute of Health Dr. Ricardo Jorge (INSA), and Rhode Island State Health Laboratory (RISHL).
Figure 6. Percent genome coverage at 10X read depth for clinical specimens sequenced with the amplicon-based sequencing approach.
A. Lesion swabs (n=10) sequenced by the DPHL. Data are fitted with a logistic function and the dashed line corresponds to 80% genome coverage at a threshold of Ct 28.4. B. Dry vesicle swabs (n=56) from various anatomical sites sequenced by the RIDOH RISHL. Data are fitted with a logistic function and the dashed line corresponds to 80% genome coverage at a threshold of Ct 31.2. C. Lesion swabs (n=6) sequenced by the FDH. Data are fitted with a logistic function and the dashed line corresponds to 80% genome coverage at a threshold of Ct 30.6. D. Lesion swabs (n=9) sequenced by IBL. Data are fitted with a logistic function. E. Lesion swabs (n=27) sequenced by the LACPHL. Data are fitted with a logistic function. F. Lesion swabs (n=25) from various anatomical sites sequenced by the MDH. Data are fitted with a logistic function. G. Clinical specimens (n=19) consisting of lesion swabs and crusts of healing lesions sequenced by the CEVS. Data are fitted with a logistic function. H. Clinical specimens (n=78) consisting of lesion swabs from various anatomical sites as well as oropharyngeal swabs sequenced by INSA. I. Vesicle swabs (n=34) from various anatomical sites tested in parallel on Illumina and Oxford Nanopore Technologies (ONT) sequencing platforms by NHS Lothian. Data from both sequencing platforms are fitted with a logistic function and the dashed lines correspond to 80% genome coverage at a threshold of Ct 25.0 on the Illumina platform and Ct 24.4 on the ONT platform. J. Lesion swabs (n=8) sequenced on the ONT platform by JHMI.
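The captions report the Ct value at which a fitted logistic curve crosses 80% genome coverage but do not give the parameterization of the curve. The sketch below assumes a simple decreasing three-parameter logistic, coverage(Ct) = upper / (1 + exp(slope × (Ct − midpoint))), and shows how such a threshold can be read off a fitted curve by inverting it. The function names and the parameter values are hypothetical and are not fitted to any data from the study.

```python
import math


def logistic_coverage(ct, upper=100.0, slope=1.0, midpoint=30.0):
    """Percent genome coverage predicted by a decreasing logistic in Ct (assumed form)."""
    return upper / (1.0 + math.exp(slope * (ct - midpoint)))


def ct_at_coverage(target, upper=100.0, slope=1.0, midpoint=30.0):
    """Invert the logistic: the Ct value at which predicted coverage equals `target`."""
    if slope <= 0:
        raise ValueError("slope must be positive for a decreasing curve")
    if not 0.0 < target < upper:
        raise ValueError("target must lie strictly between 0 and upper")
    return midpoint + math.log(upper / target - 1.0) / slope


# Hypothetical parameters, chosen only to illustrate the calculation.
threshold = ct_at_coverage(80.0, upper=100.0, slope=1.0, midpoint=30.0)
print(round(threshold, 2), round(logistic_coverage(threshold), 2))
```

With real data, the three parameters would first be estimated by a curve-fitting routine; only the inversion step is shown here.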
