mSphere. 2021 Apr 21;6(2):e01336-20.
doi: 10.1128/mSphere.01336-20.

k-mer-Based Metagenomics Tools Provide a Fast and Sensitive Approach for the Detection of Viral Contaminants in Biopharmaceutical and Vaccine Manufacturing Applications Using Next-Generation Sequencing

Madolyn L MacDonald et al. mSphere.

Abstract

Adventitious agent detection during the production of vaccines and biotechnology-based medicines is critical to ensuring that the final product is free from viral contamination. Increasing the speed and accuracy of viral detection helps accelerate development timelines and ensure patient safety. Here, several rapid viral metagenomics approaches were tested on simulated next-generation sequencing (NGS) data sets and on existing data sets from virus spike-in studies done in CHO-K1 and HeLa cell lines. For these data sets, the rapid methods had sensitivity comparable to that of the full-read alignment methods used for NGS viral detection, but their specificity could be improved. Among the approaches tested, the preferred one for detecting nonlatent and nonendogenous viruses first filters host reads using KrakenUniq and then selects the virus classification tool based on the number of remaining reads. This approach shows reasonable sensitivity and specificity for the data sets examined and requires less time and memory than full-read alignment methods.

IMPORTANCE Next-generation sequencing (NGS) has been proposed as a method, complementary to current in vivo and in vitro methods, for detecting adventitious viruses in the production of biotherapeutics and vaccines. Before NGS can be established in industry as a main viral detection technology, the bioinformatics analyses required to identify and classify viral NGS reads need further investigation. In this study, the ability of rapid metagenomics tools to detect viruses in biopharmaceutically relevant samples is tested and compared in order to recommend an efficient approach. The results showed that, for the data sets examined, KrakenUniq can quickly and accurately filter host sequences and classify viral reads, with sensitivity and specificity comparable to those of slower full-read alignment approaches such as BLASTn.
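The two-stage approach described in the abstract (filter host reads, then pick a virus classification tool from the number of reads left) can be illustrated with a minimal sketch. Everything named here is hypothetical: the abstract does not give the read-count threshold or say which tool is used on each side of it, so `read_threshold`, `tool_for_few_reads`, and `tool_for_many_reads` are placeholders, and `is_host` stands in for a real host classifier such as KrakenUniq run against a host genome.

```python
from typing import Callable, List, Sequence, Tuple


def filter_host_reads(reads: Sequence[str],
                      is_host: Callable[[str], bool]) -> List[str]:
    """Keep only the reads that the host classifier does not flag as host."""
    return [read for read in reads if not is_host(read)]


def select_classifier(n_remaining: int,
                      read_threshold: int,
                      tool_for_few_reads: str,
                      tool_for_many_reads: str) -> str:
    """Pick a tool name from the number of non-host reads.

    The threshold and both tool names are assumptions supplied by the
    caller; the abstract does not specify them.
    """
    if n_remaining <= read_threshold:
        return tool_for_few_reads
    return tool_for_many_reads


def plan_detection(reads: Sequence[str],
                   is_host: Callable[[str], bool],
                   read_threshold: int,
                   tool_for_few_reads: str,
                   tool_for_many_reads: str) -> Tuple[List[str], str]:
    """Run the host filter, then choose the classifier for what remains."""
    remaining = filter_host_reads(reads, is_host)
    tool = select_classifier(len(remaining), read_threshold,
                             tool_for_few_reads, tool_for_many_reads)
    return remaining, tool


if __name__ == "__main__":
    # Toy data: reads starting with "ACGT" are treated as host reads.
    toy_reads = ["ACGTACGT", "TTTTGGGG", "ACGTTTTT", "GGGGCCCC"]
    remaining, tool = plan_detection(
        toy_reads,
        is_host=lambda read: read.startswith("ACGT"),
        read_threshold=10,                 # hypothetical cutoff
        tool_for_few_reads="classifier_a",  # placeholder name
        tool_for_many_reads="classifier_b", # placeholder name
    )
    print(len(remaining), tool)
```

Keeping the threshold and tool names as parameters reflects that the abstract reports the selection rule only in outline; the concrete values would have to come from the full paper.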

Keywords: Chinese hamster ovary cells; HeLa cells; adventitious agent testing; next-generation sequencing; vaccine; viral metagenomics; virus detection.

Figures

FIG 1
Estimated abundances from various virus classification tools for the simulation 2 (a) and simulation 6 (b) data sets. The “Expected” category reflects the simulated abundance of each virus. The “Off-target Hits” category comprises reads that mapped to species that were not simulated.
FIG 2
Estimated species abundances from various virus classification tools for the Lab A mixed sample (HeLa cell lysate with virus spiked in at 1 genome copy per cell) from Khan et al. (45). “HF” in the legend signifies that host filtering with KrakenUniq was done before classification against U-RVDB16. The “Unclassified” category refers to reads that could not be mapped to the host genome or U-RVDB16. The “Viral Off-targets Hits” category comprises reads that mapped to viruses other than the spiked-in viruses and human endogenous viruses. For Centrifuge and KrakenUniq, the “Host (Homo sapiens)” category comprises reads that mapped to the human genome that was included in the reference database along with U-RVDB16. For KrakenUniq-HF and BLAST-HF, it comprises reads that mapped to the human genome during the filtering step.
FIG 3
Estimated species abundances from various virus classification tools for the LabC-1 sample (HeLa whole cells with viruses spiked in at 100 genome copies per cell) from Khan et al. (45). The “Unclassified” category refers to reads that could not be mapped to the host genome or RVDBv16.
FIG 4
Runtimes for the tested tools when applied to samples from the HeLa cell viral spike-in study (45). Each tool was run three times on each sample using 16 threads on an AMD Opteron 6386 SE processor with 256 GB of RAM. KrakenUniq-HF and BLAST-HF are the times required by KrakenUniq and BLAST, respectively, after host reads were filtered.
FIG 5
Estimated species abundances from various virus classification tools for the LabB-6 HeLa cell lysate sample with 0.1 genome copies of each virus per cell (a) and the LabB-5 HeLa cell lysate sample with 3 genome copies of each virus per cell (b). The “Unclassified” category refers to reads that could not be mapped to the host genome or RVDBv16.
FIG 6
Estimated species abundances from various virus classification tools for the LabB-2 HeLa whole cell sample with three genome copies of each virus per cell. The “Unclassified” category refers to reads that could not be mapped to the host genome or RVDBv16.
FIG 7
Viral read counts per million total sequencing reads for the HeLa whole cell samples from Lab B that received different amounts of viral spike-ins (Lab B samples 1 to 3). These spike-in samples consisted of RSV, FeLV, EBV, and Reo1, each spiked in at the concentration shown on the x axis.
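The normalization used in FIG 7 (viral reads per million total sequencing reads) is simple arithmetic, sketched below. The function name and the example counts are hypothetical and are not taken from the study; the source does not describe how the authors computed the value in practice.

```python
def reads_per_million(viral_reads: int, total_reads: int) -> float:
    """Scale a viral read count to reads per million total sequencing reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if viral_reads < 0 or viral_reads > total_reads:
        raise ValueError("viral_reads must be between 0 and total_reads")
    # Multiply first so the division happens once, on integers.
    return viral_reads * 1_000_000 / total_reads


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(reads_per_million(viral_reads=50, total_reads=2_000_000))
```

Normalizing by total reads lets samples sequenced to different depths be compared on the same axis.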
FIG 8
Estimated percent abundances from various virus classification tools for the reovirus-3 spike-in (experiment 1) NGS data set (46). The host reference genome was either the 2018 CH genome (CriGri-PICR, GCF_003668045.1) or both the 2018 CH and 2011 CHO-K1 (CriGri_1.0, GCF_000223135.1) genomes, as distinguished in parentheses in the legend. “HF” in the legend signifies that host filtering with KrakenUniq was done before classification against U-RVDBv16. The “Viral Off-target Hits” category comprises reads mapping to viruses other than the spiked-in virus (reovirus) and the possible cross-contamination viruses (EMCV and VSV).
