Review
J Proteome Res. 2023 Apr 7;22(4):1024-1042.
doi: 10.1021/acs.jproteome.2c00498. Epub 2022 Nov 1.

The 2022 Report on the Human Proteome from the HUPO Human Proteome Project

Gilbert S Omenn et al. J Proteome Res.

Abstract

The 2022 Metrics of the Human Proteome from the HUPO Human Proteome Project (HPP) show that protein expression has now been credibly detected (neXtProt PE1 level) for 18 407 (93.2%) of the 19 750 predicted proteins coded in the human genome, a net gain of 50 since 2021 from data sets generated around the world and reanalyzed by the HPP. Conversely, the number of neXtProt PE2, PE3, and PE4 missing proteins has been reduced by 78 from 1421 to 1343. This represents continuing experimental progress on the human proteome parts list across all the chromosomes, as well as significant reclassifications. Meanwhile, applying proteomics in a vast array of biological and clinical studies continues to yield significant findings and growing integration with other omics platforms. We present highlights from the Chromosome-Centric HPP, Biology and Disease-driven HPP, and HPP Resource Pillars, compare features of mass spectrometry and Olink and Somalogic platforms, note the emergence of translation products from ribosome profiling of small open reading frames, and discuss the launch of the initial HPP Grand Challenge Project, "A Function for Each Protein".

Keywords: Biology and Disease-HPP (B/D-HPP); Grand Challenge Project; Human Protein Atlas; Human Proteome Project (HPP); Mass Spectrometry Interactive Virtual Environment (MassIVE); PeptideAtlas; Ribo-Seq; chromosome-centric HPP (C-HPP); missing proteins (MP); neXtProt protein existence (PE metrics); non-MS PE1 proteins; small open reading frames (smORFs); uncharacterized protein existence 1 (uPE1).
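
The headline counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch and not part of the report: the constant and function names are illustrative assumptions, and the implied 2021 values are derived here from the quoted figures rather than stated in the abstract.

```python
# Counts quoted in the abstract (PE5 entries are excluded from HPP metrics).
PE1_2022 = 18_407         # proteins with credible detection (neXtProt PE1)
PREDICTED_TOTAL = 19_750  # predicted proteins coded in the human genome
MISSING_2021 = 1_421      # PE2 + PE3 + PE4 missing proteins, 2021
MISSING_2022 = 1_343      # PE2 + PE3 + PE4 missing proteins, 2022
PE1_NET_GAIN = 50         # net gain in PE1 since 2021


def summarize_metrics():
    """Recompute the derived figures from the quoted counts."""
    pe1_share_pct = round(100 * PE1_2022 / PREDICTED_TOTAL, 1)
    missing_reduction = MISSING_2021 - MISSING_2022
    # PE1 plus missing proteins should account for every predicted protein.
    partition_ok = PE1_2022 + MISSING_2022 == PREDICTED_TOTAL
    # Derived (not quoted in the abstract): implied 2021 PE1 count and total.
    implied_pe1_2021 = PE1_2022 - PE1_NET_GAIN
    implied_total_2021 = implied_pe1_2021 + MISSING_2021
    return {
        "pe1_share_pct": pe1_share_pct,
        "missing_reduction": missing_reduction,
        "partition_ok": partition_ok,
        "implied_pe1_2021": implied_pe1_2021,
        "implied_total_2021": implied_total_2021,
        "implied_total_change": PREDICTED_TOTAL - implied_total_2021,
    }


if __name__ == "__main__":
    for name, value in summarize_metrics().items():
        print(f"{name}: {value}")
```

Under these assumptions, the missing-protein count fell by more than PE1 grew, so the implied total of predicted proteins also shrank between the two releases. That is consistent with the abstract's mention of reclassifications, although the abstract itself does not give the 2021 total.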

Figures

Figure 1.
Flow diagram of the changes in numbers of PE1, PE2, PE3, PE4 (and PE5) classes of predicted proteins from neXtProt release 2021-02 to 2022-02. Blue solid arrows indicate increases to PE1; red arrows indicate demotions to PE5 from PE1 or from the PE2, PE3, and PE4 missing proteins (MPs); black arrows show increases to MPs. Dashed arrows indicate continuation in the same status. In the margins we show examples of the 12 new entries to and 32 removals from neXtProt (see discussion later about uORFs and smORFs). Note that PE5 represents a category of dubious or uncertain genes, mostly pseudogenes, which we have excluded from the HPP metrics since 2013.
Figure 2.
Pie chart depicting the 2022 status of proteins in the human proteome. PE1 proteins that have MS validation are shown in blue. PE1 validated via means other than MS are shown in orange. Missing proteins (PE2+PE3+PE4) are shown in gray.
Figure 3.
Bar charts summarizing the 868 PE1 proteins that have MS evidence in neither PeptideAtlas nor MassIVE meeting HPP Guidelines (designated non-MS PE1). A) Number of different categories of evidence for these proteins, with 484 proteins having only a single type. B) Distribution of the 484 proteins with only a single category of evidence across the 10 different categories. Protein-protein interactions (PPI) from IntAct far outweigh all other categories. C) Distribution of all evidence categories (many proteins have multiple categories of evidence). D) Similar to panel C but within each major type (depicted with the outline of the appropriate color), each of the 10 categories is depicted by a narrow bar, thereby showing the overlap between categories. Of the 551 proteins with “Gold Binary PPI” evidence, 194 also have “Curated Interactions” (blue), and, of those, 62 have “Mutagenesis” evidence as well (red). A complete listing of the individual proteins is provided in Supplementary Table 1.
Figure 4.
This bar chart documents progress in reducing PE2,3,4 missing proteins (grey) and increasing PE1 proteins (blue + orange). Within the PE1 proteins, there is a continuing shift of non-MS PE1 proteins (orange) to MS-based PE1 proteins (blue). The bulge in 2021 for non-MS reflects the inclusion and then removal of 112 PPI-based entries, as noted above.
Figure 5.
Primary papers that generated ten or more new PE1 proteins in Protein Atlas-2022-01 and neXtProt-2022-02. These eight reports include CPTAC proteogenomic studies of gastric, lung adeno, and lung squamous carcinomas; proteins associated with inflammation, proteolysis, and splicing in aging skeletal muscle; hundreds of non-canonical shared and tumor-specific HLA peptides from non-coding regions; 10,701 proteins identified across 4 layers and 9 cell types in skin; and 12 missing proteins from a medulloblastoma cell line confirmed with synthetic peptides in PRM, enhanced to 35 MPs upon review by PeptideAtlas.
Figure 6.
Personalized outcome risk prediction of colon cancers is a specific example of high value “clinical need” with possible proteomics solutions. (a) Two patients with colon cancers appear – to the best of today’s diagnostic capabilities – identical by demographics, histo-morphology, stage, and genomics, yet have vastly different survival outcomes (patient on left cured by surgery vs. patient on right died of progressive disease). (b) Proteomics analyses may reveal subtypes of previously indistinguishable cancers (as in CPTAC). The patients from (a) fall into two different proteomic subtypes of microsatellite stable (MSS) colon cancer (green and orange clusters, see arrows), illustrating that proteomics “sees” more than traditional molecular diagnostics. For comparison, clusters of other cancer types (pancreatic adenocarcinoma and neuroendocrine tumor, microsatellite unstable (MSI) colon cancer, metastatic colon cancer) and benign parenchyma (colon, pancreas, liver) are shown.
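
The counts in the Figure 2 and Figure 3 captions can be related to the abstract's totals with simple arithmetic. The Python sketch below is illustrative only and not part of the report. It assumes that the 868 non-MS PE1 proteins of Figure 3 form the orange slice of Figure 2, so the MS-based PE1 count is derived here by subtraction rather than quoted from the source. All names are hypothetical.

```python
# Counts quoted in the abstract and in the Figure 3 caption.
PE1_TOTAL = 18_407          # all PE1 proteins (abstract)
MISSING = 1_343             # PE2 + PE3 + PE4 missing proteins (abstract)
NON_MS_PE1 = 868            # PE1 without qualifying MS evidence (Figure 3)
SINGLE_CATEGORY = 484       # non-MS PE1 with one evidence category (Figure 3A)
GOLD_PPI = 551              # non-MS PE1 with "Gold Binary PPI" evidence
GOLD_AND_CURATED = 194      # ...that also have "Curated Interactions"
GOLD_CURATED_MUTAGEN = 62   # ...that also have "Mutagenesis" evidence


def pie_slices():
    """Return (count, percent) per Figure 2 slice; MS-based PE1 is derived."""
    total = PE1_TOTAL + MISSING
    counts = {
        "ms_pe1": PE1_TOTAL - NON_MS_PE1,  # assumption: derived by subtraction
        "non_ms_pe1": NON_MS_PE1,
        "missing": MISSING,
    }
    return {k: (v, round(100 * v / total, 1)) for k, v in counts.items()}


def figure3_breakdown():
    """Derive the complementary counts and nested shares for Figure 3."""
    return {
        "multiple_categories": NON_MS_PE1 - SINGLE_CATEGORY,
        "gold_ppi_share_of_non_ms": GOLD_PPI / NON_MS_PE1,
        "curated_share_of_gold_ppi": GOLD_AND_CURATED / GOLD_PPI,
        "mutagenesis_share_of_curated": GOLD_CURATED_MUTAGEN / GOLD_AND_CURATED,
    }


if __name__ == "__main__":
    for name, (count, pct) in pie_slices().items():
        print(f"{name}: {count} ({pct}%)")
    for name, value in figure3_breakdown().items():
        print(f"{name}: {value}")
```

If the subtraction assumption holds, the three slices sum to the 19 750 predicted proteins, and the non-MS PE1 proteins with more than one category of evidence are the 868 minus the 484 single-category entries.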
