Comparative Study
Sci Rep. 2021 Sep 7;11(1):17758.
doi: 10.1038/s41598-021-97158-x.

Nanopore sequencing and de novo assembly of a misidentified Camelpox vaccine reveals putative epigenetic modifications and alternate protein signal peptides

Zack Saud et al. Sci Rep. 2021.

Abstract

DNA viruses can exploit host cellular epigenetic processes to their advantage; however, the epigenome status of most DNA viruses remains undetermined. Third-generation sequencing technologies allow modified nucleotides to be identified from sequencing experiments without specialized sample preparation, permitting the detection of non-canonical epigenetic modifications that may distinguish viral nucleic acid from that of the host and thus identify attractive targets for advanced therapeutics and diagnostics. We present a novel nanopore de novo assembly pipeline used to assemble a misidentified Camelpox vaccine. Two confirmed deletions in this vaccine strain relative to the closely related Vaccinia virus strain Modified Vaccinia Ankara make it one of the smallest non-vector-derived orthopoxvirus genomes to be reported. Annotation of the assembly revealed a previously unreported signal peptide at the start of protein A38 and several predicted signal peptides that differ from those previously described. Putative epigenetic modifications around various motifs were identified, and the assembly confirmed previous work showing that the vaccine genome most closely resembles that of Vaccinia virus strain Modified Vaccinia Ankara. The pipeline may be used for other DNA viruses, increasing the understanding of DNA virus evolution, virulence, host preference, and epigenomics.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Read mapping coverage of genome assemblies for (a) the initial Flye assembly and (b) the final polished assembly. Read coverage was more uniformly distributed in the final assembly than in the initial assembly (the Flye assembly built from the > 3000 base viral DNA read set), which had uneven read coverage at the contig ends. This indicates that the final polished assembly contains terminal repeat sequences whose lengths more closely match those of the ground truth.
Figure 2
Dot plot comparison of the Ducapox long-read assembly with the closest matching viral genome, that of VACV Acambis 3000 MVA. Genomic deletions of 5449 bp and 916 bp are illustrated. The VACV Acambis 3000 MVA genome was also found to be 227 bp and 435 bp longer at its ends than the Ducapox genome.
Figure 3
Annotated Ducapox gene map. The genome contained a total of 186 predicted genes.
Figure 4
Statistical plot and sequence logo of the AGAAGRC motif. The statistical plot is based on 17 regions within the genome that contain the motif sequence. Signal fluctuation away from the canonical model can be seen around the central AAG nucleotides.
Figure 5
Statistical plot and sequence logo of the AARRRGATKH motif. The statistical plot is based on 42 regions within the genome that contain the motif sequence. Signal fluctuation away from the canonical model can be seen around the central GA nucleotides.
Figure 6
Statistical plot and sequence logo of the WWAATGWC motif. The statistical plot is based on 77 regions within the genome that contain the motif sequence. Signal fluctuation away from the canonical model can be seen around the central TGT nucleotides.
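The three motifs in Figures 4 to 6 are written with IUPAC ambiguity codes (R = A or G, W = A or T, K = G or T, H = A, C or T). As a minimal sketch of how the regions containing such a motif could be located in a genome sequence, the Python below expands a motif into a regular expression and reports forward-strand start positions. The function names and the toy sequence are hypothetical and are not taken from the paper; the code only finds motif sites and does not perform the signal-level comparison shown in the statistical plots.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(sequence, motif):
    """Return 0-based start positions of every forward-strand match of the motif."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Toy sequence for illustration only (not part of the Ducapox genome).
toy = "TTAGAAGACGTAAATGTCAAGAAGGCTT"

agaagrc_sites = find_motif(toy, "AGAAGRC")
wwaatgwc_sites = find_motif(toy, "WWAATGWC")
aarrrgatkh_sites = find_motif(toy, "AARRRGATKH")
```

Scanning the reverse strand as well would require repeating the search on the reverse complement of the sequence, which this sketch leaves out.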
Figure 7
Bioinformatics pipeline used for the long-read only assembly of the Ducapox genome. Basecalling of reads was performed using Guppy v4.0.11. Adapter sequences in reads were removed using Porechop v0.2.4. Reads were subsequently filtered to a minimum length of 3000 bases using NanoFilt v2.6.0. An initial assembly was performed using Flye v2.8 (using reads containing both viral and non-viral DNA sequences), after which a BLAST search for each contig generated was performed against the NCBI nucleotide database. A file containing all non-viral reads was used to generate an exclusive viral read set by mapping reads to the non-viral contigs using Minimap2 v2.17-r941, followed by extraction of the unmapped reads using Samtools v1.7. A Flye assembly was performed on the exclusive viral read set, which was subsequently polished with TandemTools, followed by 3 rounds of Racon v1.4.13 polishing and a final polishing round using Medaka v0.11.5 to generate a 159,696 bp genome. An incorrect insertion within an adenine homopolymer region of this assembly was corrected, producing a final genome sequence length of 159,695 bp.
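Two of the read-selection steps in the pipeline above (the 3000-base minimum length filter and the exclusion of reads that map to non-viral contigs) can be illustrated with a short pure-Python sketch. This is only a stand-in for the logic of those steps, not the implementation used by NanoFilt, Minimap2 or Samtools; the function names, the `host_mapped_ids` parameter and the demo reads are all hypothetical.

```python
def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from an iterable of 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        read_id = header.strip()[1:].split()[0]  # drop '@' and any description
        yield read_id, seq, qual


def select_viral_reads(records, min_length=3000, host_mapped_ids=frozenset()):
    """Keep reads of at least min_length bases that did not map to non-viral contigs.

    host_mapped_ids is an assumed input: the IDs of reads that aligned to
    contigs identified as non-viral (for example, from a prior mapping step).
    """
    return [
        (read_id, seq, qual)
        for read_id, seq, qual in records
        if len(seq) >= min_length and read_id not in host_mapped_ids
    ]


# Made-up reads for illustration: read2 is too short, read3 is flagged as non-viral.
demo = [
    "@read1 extra", "A" * 3500, "+", "I" * 3500,
    "@read2", "C" * 1200, "+", "I" * 1200,
    "@read3", "G" * 4000, "+", "I" * 4000,
]
kept = select_viral_reads(parse_fastq(demo), host_mapped_ids={"read3"})
kept_ids = [read_id for read_id, _, _ in kept]
```

In the published pipeline these selections are made on alignment output rather than on an in-memory ID set, so the sketch shows only the filtering criteria and not the file handling.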
