Mol Cell Proteomics. 2025 Jul;24(7):101004.
doi: 10.1016/j.mcpro.2025.101004. Epub 2025 May 29.

P4PP: A Universal Shotgun Proteomics Data Analysis Pipeline for Virus Identification


Armand Paauw et al. Mol Cell Proteomics. 2025 Jul.

Abstract

Humans can be infected by a wide variety of virus species. We developed a data analysis approach for shotgun proteomic data to detect these viruses. A proteome for pandemic preparedness (P4PP) pipeline, a corresponding database (P4PP v01), and a web application (P4PP) were constructed. Based on multiple identified discriminatory peptides, the P4PP pipeline enables the identification of 1896 virus species from the 32 virus families in which at least one human-infectious virus is described. P4PP was evaluated using different datasets of cell-cultivated viruses, generated at different institutes, measured with different instruments, and prepared with different sample preparation methods. In total, 174 mass spectrometry datasets were analyzed: 160 protein trypsin digests of virus-infected cell lines and 14 of noninfected cell lines. Of the 160 infected samples, 146 were correctly identified at the species level, and an additional four samples were identified at the family level. In the remaining 10 samples, no virus was detected. However, all 10 of these samples came from time series in which follow-up samples obtained later tested positive, indicating that the number of virus-derived peptides was initially too low in the samples obtained at the start of the experiment. Furthermore, the results show that influenza A and severe acute respiratory syndrome coronavirus 2 can be subtyped if enough discriminative peptides of the virus are identified. In the noninfected cell lines, no virus was detected except in one sample, in which the virus studied in that experiment was detected. Shotgun proteomics, in combination with the developed data analysis approach, can identify all types of virus species after cultivation in a cell line. Implementing this agnostic virus proteome analysis capability in viral diagnostic laboratories has the potential to improve their capabilities to cope with unexpected, mutated, or re-emerging viruses.
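The idea of calling a virus from multiple discriminatory peptides can be illustrated with a minimal sketch. This is not the P4PP implementation: the function name `call_species`, the peptide-to-species lookup, the toy peptide strings, and the threshold of two peptides (one reading of "multiple") are all assumptions made for illustration.

```python
from collections import Counter


def call_species(identified_peptides, peptide_to_species, min_peptides=2):
    """Return species supported by at least `min_peptides` discriminatory peptides.

    A peptide counts as discriminatory here only if the (hypothetical) lookup
    maps it to exactly one species. The threshold is an assumed value.
    """
    counts = Counter()
    for peptide in set(identified_peptides):  # count each peptide once
        species = peptide_to_species.get(peptide, frozenset())
        if len(species) == 1:  # shared peptides cannot discriminate
            counts[next(iter(species))] += 1
    return {sp: n for sp, n in counts.items() if n >= min_peptides}


# Toy lookup with made-up peptide strings and placeholder species names.
toy_lookup = {
    "AAAK": frozenset({"Virus A"}),
    "CCCR": frozenset({"Virus A"}),
    "DDDK": frozenset({"Virus A", "Virus B"}),  # shared, so ignored
    "EEEK": frozenset({"Virus B"}),
}

calls = call_species(["AAAK", "CCCR", "DDDK", "EEEK"], toy_lookup)
print(calls)
```

In this toy case only "Virus A" is called, because "Virus B" is supported by a single discriminatory peptide and the shared peptide is not counted for either species.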

Keywords: P4PP pipeline; pandemic preparedness; peptides; proteomics; virus identification.


Conflict of interest statement

The authors declare no competing interests.

Figures

Schematic overview of shotgun proteomics-based virus identification of cultivated viruses. Virus cell culture, sample preparation, data acquisition, peptide identification, and virus identification.
Fig. 1
Analysis of samples using the P4PP application. A, overview of the 174 different sample sets tested. The black bar indicates the virus-infected cell culture samples, and the gray bar represents the negative controls (noninfected cell cultures). B, results of the 160 virus-infected samples. The black bar shows the number of viruses correctly identified at the species level, the gray bar indicates the number of viruses correctly identified at the family level, and the white bar represents samples in which no virus was identified. These negative samples were taken shortly after infection or were infected with a low MOI. All 10 samples eventually tested positive, suggesting that the viral protein levels were initially below the detection limit of our method and data analysis approach. C, the test results for the negative controls (noninfected cell cultures). The black bar represents the number of correct results (no virus detected), while the red bar indicates a false positive result, likely due to carryover effects. The virus identified in this false positive sample was the same species used in this study. MOI, multiplicity of infection; P4PP, proteome for pandemic preparedness.
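The counts reported in panels A to C can be turned into rates with a short calculation. The sketch below uses only the numbers stated in the abstract and caption (146, 4, and 10 of 160 infected samples; 1 false positive among 14 controls); the function name and dictionary keys are illustrative choices, not part of P4PP.

```python
def detection_summary(species_level=146, family_level=4, not_detected=10,
                      controls=14, control_false_positives=1):
    """Summarize the reported identification counts as fractions."""
    infected = species_level + family_level + not_detected
    return {
        "infected_samples": infected,
        # Fraction of infected samples identified at the species level.
        "species_level_rate": species_level / infected,
        # Fraction identified at the species or family level.
        "family_or_better_rate": (species_level + family_level) / infected,
        # Fraction of noninfected controls with no virus detected.
        "control_correct_rate": (controls - control_false_positives) / controls,
    }


summary = detection_summary()
for name, value in summary.items():
    print(name, value if isinstance(value, int) else f"{value:.2%}")
```

These fractions describe cell-cultivated samples only and say nothing about performance on other sample types.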
Fig. 2
Number of peptide identifications and distribution across sample concentrations. A, number of identified peptides in isolates tested at different concentrations. ∗Estimated concentrations; the number of genome copies per milliliter was not determined. B, distribution of the species-specific peptides identified in samples. Yellow, percentage of species-specific peptides identified only in the highest concentration tested; red, percentage identified only in the lowest concentration tested; orange, percentage identified in both samples.
Fig. 3
Identified SARS-CoV-2 peptides from cell cultures infected with SARS-CoV-2 strain BetaCoV/Netherlands/01 at varying viral loads. A, the Venn diagram shows the overlap of peptides identified from SARS-CoV-2 strain BetaCoV/Netherlands/01 in cell cultures infected with 1 × 10^9, 1 × 10^8, and 1 × 10^7 gc/ml. The number of specific peptides identified in each isolate is given in brackets. B, the sequences of the seven SARS-CoV-2-derived peptides identified in all three concentrations analyzed. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Fig. 4
Relationships of the identified peptides of the four SARS-CoV-2 strains analyzed in dataset 2. A, Venn diagram of SARS-CoV-2 strains (Alpha B.1.1.7, Beta B.1.351, Gamma P.1, and Delta B.1.617.2). The number of species-specific peptides identified for each tested isolate is given in brackets. B, the sequences of the 20 peptides that were identified in all four SARS-CoV-2 strains. SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Fig. 5
Average number of unique peptides identified for human alphaherpesvirus 1 in dataset 5 at the family level (black line) and species level (dotted line).
Fig. 6
Average number of unique peptides identified for human alphaherpesvirus 3 in dataset 5 at the family level (black line) and species level (dotted line).
Fig. 7
Venn diagram of the number of peptides identified in the four analyzed samples (B11, B12, B12, and B14) of RSV and infected A540 cells of MSV000080032. A, number of species-discriminative peptides in each sample. B, number of family-discriminative peptides in each sample. The number of species-unique peptides identified in each sample is given in brackets. RSV, respiratory syncytial virus.
Fig. 8
Increase in the average number of unique peptides identified for SARS-CoV-2 in dataset 7. Vero E6 cells infected at MOI 0.01, at the family level (black line) and species level (large dotted line), and at MOI 0.001, at the family level (small dotted line) and species level (irregular line). MOI, multiplicity of infection; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
Fig. 9
Venn diagram of the number of peptides identified in the three analyzed Mpox samples (E02292, E02293, and E02294) of PXD034494. A, number of species-discriminative peptides in each sample. B, number of family-discriminative peptides in each sample. The number of species-unique peptides in each sample is given in brackets.
Fig. 10
Increase over time in the average number of unique peptides identified in influenza A strains H1N1 (black line), H3N2 (large dotted line), and H5N1 (small dotted line) grown in normal human bronchial epithelial (NHBE) cells. NHBE, normal human bronchial epithelial.
Fig. 11
Distribution of the number of species-level false positive peptides in each sample.
