Nat Commun. 2025 Jul 31;16(1):7041.
doi: 10.1038/s41467-025-62469-4.

vPro-MS enables identification of human-pathogenic viruses from patient samples by untargeted proteomics


Marica Grossegesse et al. Nat Commun.

Abstract

Viral infections are commonly diagnosed by the detection of viral genome fragments or proteins using targeted methods such as PCR and immunoassays. In contrast, metagenomics enables the untargeted identification of viral genomes, extending its applicability to a broader spectrum of viruses. In this study, we introduce proteomics as a complementary approach for the untargeted identification of human-pathogenic viruses from patient samples. The viral proteomics workflow (vPro-MS) is based on an in-silico derived peptide library covering the human virome in UniProtKB (331 viruses, 20,386 genomes, 121,977 peptides). A scoring algorithm (vProID score) is developed to assess the confidence of virus identification from proteomics data (https://github.com/RKI-ZBS/vPro-MS). In combination with diaPASEF-based data acquisition, this workflow enables the analysis of up to 60 samples per day. The specificity is determined to be >99.9% in an analysis of 221 plasma, swab and cell culture samples covering 17 different viruses. The sensitivity of this approach for the detection of SARS-CoV-2 in nasopharyngeal swabs corresponds to a PCR cycle threshold of 27, with quantitative accuracy comparable to metagenomics. vPro-MS enables the integration of untargeted virus identification in large-scale proteomic studies of biofluids such as human plasma to detect previously undiscovered virus infections in patient specimens.


Conflict of interest statement

Competing interests: The authors declare no competing interests.

Figures

Fig. 1. Overview of the vPro-MS workflow for virus identification by untargeted proteomics.
The first step of the vPro-MS workflow is sample preparation: proteins are digested into tryptic peptides using S-Trap micro columns and loaded onto Evotips. Peptides are then analyzed for 24 min per sample, corresponding to a throughput of 60 samples per day (SPD), using diaPASEF on an Evosep One coupled to a timsTOF HT mass spectrometer. Peptide sequences are identified from the MS data using DIA-NN (v1.8.1) with the vPro peptide library, which is based on UniProt. These peptide sequences are further analyzed by the vPro-MS R script to identify human-pathogenic viruses and generate the vPro-MS report. The confidence of virus identification is assessed by the vProID score. (Created in BioRender. Doellinger, J. (2025) https://BioRender.com/8611aej).
Fig. 2. Data flow chart of vPro-MS for virus identification by proteomics.
Data processing in the vPro-MS workflow is split into two parts: peptide library construction and virus detection. First, a peptide library is generated from UniProt protein sequences. The UniProt database (release 2023_01) contains >1.4 million protein sequences from human-pathogenic viruses. Structural virus proteins are extracted from these sequences and used to predict a viral peptide spectral library. Peptides are further filtered for detectability (m/z, iRT, IM) and taxonomic specificity. The remaining peptides form the vPro peptide library, to which human and contaminant peptide sequences are added. This library is used to identify peptides from DIA-MS data using DIA-NN. The peptide sequences are analyzed using the vPro-MS R script to identify human-pathogenic viruses. vPro-MS controls the reliability of virus detection by calculating a confidence score (vProID) and summarizes the results in a tabular report. (Created in BioRender. Doellinger, J. (2025) https://BioRender.com/j84lltq).
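The library-construction idea in this caption can be illustrated with a minimal Python sketch. It assumes the common trypsin rule (cleavage C-terminal to K or R, but not before P) and treats a peptide as taxonomically specific when it occurs in exactly one species. The function names, length limits and toy sequences are hypothetical; the detectability filters (m/z, iRT, IM) and the spectral prediction used by vPro-MS are not reproduced here.

```python
import re
from collections import defaultdict

def tryptic_digest(sequence, min_len=7, max_len=30):
    """Split a protein after K or R (not before P); keep peptides within the length limits."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]

def species_specific_peptides(proteins_by_species):
    """Return, per species, the tryptic peptides that occur in that species only."""
    species_by_peptide = defaultdict(set)
    for species, proteins in proteins_by_species.items():
        for protein in proteins:
            for peptide in tryptic_digest(protein):
                species_by_peptide[peptide].add(species)
    unique = defaultdict(list)
    for peptide, species in species_by_peptide.items():
        if len(species) == 1:  # shared peptides cannot distinguish species
            unique[next(iter(species))].append(peptide)
    return {sp: sorted(peps) for sp, peps in unique.items()}

# Toy sequences (hypothetical, not real viral proteins)
toy_proteins = {
    "virusA": ["MKAAAAAAAKCCCCCCCR"],
    "virusB": ["AAAAAAAKDDDDDDDDK"],
}
library = species_specific_peptides(toy_proteins)
```

In this toy input the peptide shared by both species is discarded, so each species keeps only the peptide that can identify it unambiguously.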
Fig. 3. Impact of the vProID score on the sensitivity and specificity of virus identification.
The vPro peptide library covering the human virome (331 viruses) was used to identify human and viral peptides in 66 nasal swab samples, of which 58 were positive for SARS-CoV-2 (Ct range 18–35) and 8 were negative. This corresponds to 21,846 individual virus tests (66 samples tested for 331 viruses) within a dataset consisting of 808,704 (redundant) peptide identifications. Sixteen parameter sets, including variations of FDR, min. CScore and min. peptides per species, either with or without applying a vProID score threshold, were used to filter the DIA-NN report (see Table 1 for details). Initially, DIA-NN reported the identification of 2361 peptides from 188 different viruses. Applying the vProID score filter to this dataset (parameter set 10) increased the specificity of virus detection to 100%. The reduction of false positive peptides per virus is visualized in the heatmap (A), and the corresponding vProID score distribution is shown as a scatter plot (B). The influence of the vProID score on the percentage reduction of false positive virus peptides is shown for all parameter sets in (C). The improvement in sensitivity attributable to the vProID score, at the highest specificity achievable for each parameter set, is shown in (D). For this comparison, sensitivities were compared at the minimum number of virus peptides that achieved the highest specificity in each case.
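As a rough illustration of the kind of report filtering this caption describes, the sketch below applies an FDR (q-value) cutoff, a minimum CScore and a minimum number of distinct peptides per species. It is only a sketch under stated assumptions: the row keys, the threshold values and the toy rows are hypothetical and are not the parameter sets from Table 1, and the vProID score itself is not implemented because its formula is not given here.

```python
from collections import defaultdict

def filter_identifications(rows, max_q=0.01, min_cscore=0.9, min_peptides=2):
    """Keep species supported by enough confident, distinct peptides.

    rows: iterable of dicts with the hypothetical keys
          'species', 'peptide', 'q_value' and 'cscore'.
    """
    peptides_by_species = defaultdict(set)
    for row in rows:
        if row["q_value"] <= max_q and row["cscore"] >= min_cscore:
            peptides_by_species[row["species"]].add(row["peptide"])
    return {
        species: sorted(peptides)
        for species, peptides in peptides_by_species.items()
        if len(peptides) >= min_peptides
    }

# Toy rows (hypothetical values)
rows = [
    {"species": "virusA", "peptide": "PEPTIDEK", "q_value": 0.001, "cscore": 0.99},
    {"species": "virusA", "peptide": "SAMPLER", "q_value": 0.005, "cscore": 0.95},
    {"species": "virusB", "peptide": "LONELYK", "q_value": 0.002, "cscore": 0.97},
    {"species": "virusC", "peptide": "WEAKHITK", "q_value": 0.050, "cscore": 0.60},
]
accepted = filter_identifications(rows)
```

Requiring several distinct peptides per species removes single-peptide hits such as the toy `virusB` entry, which is the general motivation for a per-species minimum.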
Fig. 4. Evaluation of the specificity of vPro-MS for virus identification.
The specificity of the vPro-MS workflow was evaluated by analyzing 221 samples from 4 sources covering 17 different human-pathogenic virus species (A). Two sample panels were analyzed by MS for this study, and the raw data from two further studies of patient samples were downloaded from PRIDE. The sample types included cell-culture supernatants, respiratory swabs and plasma. MS data of all samples were analyzed using the vPro-MS data workflow covering 331 human-pathogenic viruses. This corresponded to the analysis of 73,151 individual virus tests (221 samples tested for 331 viruses). The specificities ranged from 99.97–100% at the virus level and from 95.56–100% at the sample level. The plasma study (PRIDE) does not contain negative samples, so no sample-level specificity can be calculated (B). The results of the specificity panel are further visualized in a heatmap, which displays the vProID score for each species and sample. A minimum vProID score of 2 was required for virus identification (C). In two orthopoxvirus samples (T8 and T19), closely related species were identified together with the correct species. The correct species has the highest vProID score in both cases. Wrong virus identifications are outlined in red.
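The test counts in this caption follow from multiplying samples by viruses, and specificity is conventionally defined as TN / (TN + FP). A minimal Python sketch of both calculations follows; the function names are hypothetical, and the true-negative and false-positive counts in the last line are made-up illustration values, not results from the study.

```python
def count_virus_tests(n_samples, n_viruses):
    """Each sample is tested against every virus in the library."""
    return n_samples * n_viruses

def specificity(true_negatives, false_positives):
    """Fraction of negative tests that were correctly reported as negative."""
    return true_negatives / (true_negatives + false_positives)

specificity_panel_tests = count_virus_tests(221, 331)  # 73151
swab_panel_tests = count_virus_tests(66, 331)          # 21846

# Hypothetical counts, for illustration only
example = specificity(true_negatives=9999, false_positives=1)  # 0.9999
```

Because almost all of the sample-virus pairs are true negatives, a handful of false positives changes virus-level specificity only in the second decimal place, while the same errors weigh far more heavily at the sample level.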
Fig. 5. Evaluation of the sensitivity of vPro-MS for the identification of SARS-CoV-2 in nasal swabs.
The sensitivity of vPro-MS for the identification of SARS-CoV-2 was evaluated in 66 respiratory swab samples. The panel included 8 negative samples and 58 SARS-CoV-2 positive samples covering a Ct range of 18–35 and three different variants (alpha, delta, omicron). The quantitative values for proteomics were calculated as the sum of the three most intense peptides (Top3) and compared to qPCR (Ct). Results were compared when using the vPro peptide library (A) and a library constructed from the analysis of synthetic SARS-CoV-2 peptides (B). The limits of detection are displayed at 95% confidence.
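The Top3 quantity mentioned in this caption, the sum of the three most intense peptides, can be sketched in a few lines of Python. The function name and the intensity values are hypothetical, and the behavior for fewer than three peptides (summing what is available) is an assumption rather than something the caption specifies.

```python
def top3_intensity(peptide_intensities):
    """Sum of the three highest peptide intensities (fewer if fewer are given)."""
    return sum(sorted(peptide_intensities, reverse=True)[:3])

# Toy intensities (hypothetical values)
virus_signal = top3_intensity([5.0, 1.0, 9.0, 3.0])  # 9.0 + 5.0 + 3.0 = 17.0
```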
