Sci Adv. 2021 Nov 12;7(46):eabk0734.
doi: 10.1126/sciadv.abk0734. Epub 2021 Nov 12.

The Human Proteoform Project: Defining the human proteome

Lloyd M Smith et al. Sci Adv. 2021.

Abstract

Proteins are the primary effectors of function in biology, and thus, complete knowledge of their structure and properties is fundamental to deciphering function in basic and translational research. The chemical diversity of proteins is expressed in their many proteoforms, which result from combinations of genetic polymorphisms, RNA splice variants, and posttranslational modifications. This knowledge is foundational for the biological complexes and networks that control biology yet remains largely unknown. We propose here an ambitious initiative to define the human proteome, that is, to generate a definitive reference set of the proteoforms produced from the genome. Several examples of the power and importance of proteoform-level knowledge in disease-based research are presented along with a call for improved technologies in a two-pronged strategy for the Human Proteoform Project.

Figures

Fig. 1. Proteoforms: Distinct protein forms arising from a single gene.
Fig. 2. Proteoforms in human disease.
Five important clinical areas of interest are depicted and serve as examples where proteoforms have been identified and linked to the progression of human disease; they are discussed at length in an extended preprint version of this Perspective (32). mAb, monoclonal antibody.
Fig. 3. Approach to creating an integrated Human Proteoform Atlas.
The upper path illustrates the use of protein affinity reagents to capture proteoform families derived from targeted genes. The lower path illustrates the in-depth analysis of human cell types for proteoform discovery and characterization. Relative abundance refers to the ratio of a given proteoform to the sum of all proteoforms in that family.
Fig. 4. The most studied proteins have essential proteoforms that contain common PTMs such as phosphorylation, methylation, and acetylation, as well as other important variations of primary structure such as disulfide bond formation, metal attachment, and proteolytic processing.
TNF, tumor necrosis factor; TAU, tubulin-associated unit; HB, hemoglobin subunits (HbA, HbB, etc.); CASP, cysteine-aspartic proteases (Casp1 to Casp9); SOD, superoxide dismutases (SOD-1, SOD-2, and SOD-3); EGFR, epidermal growth factor receptor; CYTC, cytochrome C; TN, troponin (Tn-C, Tn-I, and Tn-T); APOE, apolipoprotein E; CA, carbonic anhydrase; CREB, cyclic adenosine 5′-monophosphate response element–binding protein; TP53, cellular tumor antigen p53. Citations are from the Web of Science Core Collection from 1975 to 2020. Citations per year and a history of research trends have been chronicled for a subset of these proteins (46).
Fig. 5. Once proteoforms have been identified, affinity reagents and targeted assays will enable emergent strategies to delineate the spatial distribution and temporal dynamics of proteoforms and their PTMs.
CyTOF, mass cytometry; CODEX, CO-Detection by indEXing.
Fig. 6. Projected interactions and impact from the Human Proteoform Project.
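
The Fig. 3 caption defines relative abundance as the ratio of a given proteoform to the sum of all proteoforms in its family. A minimal sketch of that arithmetic follows; the function name, the proteoform labels, and the intensity values are hypothetical and chosen only for illustration, not taken from the paper.

```python
def relative_abundances(intensities):
    """Return each proteoform's share of its family's total signal.

    `intensities` maps a proteoform label to a non-negative measured
    quantity (hypothetical units); all entries belong to one family.
    """
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("family total must be positive")
    return {name: value / total for name, value in intensities.items()}


# Hypothetical measurements for three proteoforms from a single gene.
family = {"unmodified": 60.0, "phospho": 30.0, "acetyl": 10.0}
shares = relative_abundances(family)
```

By construction the shares within one family sum to 1, so values are comparable across families even when the absolute signal differs.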

References

    1. Dranke N., What is the human genome worth? Nature (2011).
    2. Collins F. S., Green E. D., Guttmacher A. E., Guyer M. S.; US National Human Genome Research Institute, A vision for the future of genomics research. Nature 422, 835–847 (2003).
    3. Smith L. M., Kelleher N. L.; Consortium for Top Down Proteomics, Proteoform: A single term describing protein complexity. Nat. Methods 10, 186–187 (2013).
    4. Aebersold R., Agar J. N., Amster I. J., Baker M. S., Bertozzi C. R., Boja E. S., Costello C. E., Cravatt B. F., Fenselau C., Garcia B. A., Ge Y., Gunawardena J., Hendrickson R. C., Hergenrother P. J., Huber C. G., Ivanov A. R., Jensen O. N., Jewett M. C., Kelleher N. L., Kiessling L. L., Krogan N. J., Larsen M. R., Loo J. A., Ogorzalek Loo R. R., Lundberg E., MacCoss M. J., Mallick P., Mootha V. K., Mrksich M., Muir T. W., Patrie S. M., Pesavento J. J., Pitteri S. J., Rodriguez H., Saghatelian A., Sandoval W., Schlüter H., Sechi S., Slavoff S. A., Smith L. M., Snyder M. P., Thomas P. M., Uhlén M., van Eyk J. E., Vidal M., Walt D. R., White F. M., Williams E. R., Wohlschlager T., Wysocki V. H., Yates N. A., Young N. L., Zhang B., How many human proteoforms are there? Nat. Chem. Biol. 14, 206–214 (2018).
    5. Venter J. C., Adams M. D., Myers E. W., Li P. W., Mural R. J., Sutton G. G., Smith H. O., Yandell M., Evans C. A., Holt R. A., Gocayne J. D., Amanatides P., Ballew R. M., Huson D. H., Wortman J. R., Zhang Q., Kodira C. D., Zheng X. H., Chen L., Skupski M., Subramanian G., Thomas P. D., Zhang J., Gabor Miklos G. L., Nelson C., Broder S., Clark A. G., Nadeau J., McKusick V. A., Zinder N., Levine A. J., Roberts R. J., Simon M., Slayman C., Hunkapiller M., Bolanos R., Delcher A., Dew I., Fasulo D., Flanigan M., Florea L., Halpern A., Hannenhalli S., Kravitz S., Levy S., Mobarry C., Reinert K., Remington K., Abu-Threideh J., Beasley E., Biddick K., Bonazzi V., Brandon R., Cargill M., Chandramouliswaran I., Charlab R., Chaturvedi K., Deng Z., Francesco V. D., Dunn P., Eilbeck K., Evangelista C., Gabrielian A. E., Gan W., Ge W., Gong F., Gu Z., Guan P., Heiman T. J., Higgins M. E., Ji R. R., Ke Z., Ketchum K. A., Lai Z., Lei Y., Li Z., Li J., Liang Y., Lin X., Lu F., Merkulov G. V., Milshina N., Moore H. M., Naik A. K., Narayan V. A., Neelam B., Nusskern D., Rusch D. B., Salzberg S., Shao W., Shue B., Sun J., Wang Z. Y., Wang A., Wang X., Wang J., Wei M. H., Wides R., Xiao C., Yan C., Yao A., Ye J., Zhan M., Zhang W., Zhang H., Zhao Q., Zheng L., Zhong F., Zhong W., Zhu S. C., Zhao S., Gilbert D., Baumhueter S., Spier G., Carter C., Cravchik A., Woodage T., Ali F., An H., Awe A., Baldwin D., Baden H., Barnstead M., Barrow I., Beeson K., Busam D., Carver A., Center A., Cheng M. L., Curry L., Danaher S., Davenport L., Desilets R., Dietz S., Dodson K., Doup L., Ferriera S., Garg N., Gluecksmann A., Hart B., Haynes J., Haynes C., Heiner C., Hladun S., Hostin D., Houck J., Howland T., Ibegwam C., Johnson J., Kalush F., Kline L., Koduru S., Love A., Mann F., May D., McCawley S., McIntosh T., McMullen I., Moy M., Moy L., Murphy B., Nelson K., Pfannkoch C., Pratts E., Puri V., Qureshi H., Reardon M., Rodriguez R., Rogers Y. H., Romblad D., Ruhfel B., Scott R., Sitter C., Smallwood M., Stewart E., Strong R., Suh E., Thomas R., Tint N. N., Tse S., Vech C., Wang G., Wetter J., Williams S., Williams M., Windsor S., Winn-Deen E., Wolfe K., Zaveri J., Zaveri K., Abril J. F., Guigó R., Campbell M. J., Sjolander K. V., Karlak B., Kejariwal A., Mi H., Lazareva B., Hatton T., Narechania A., Diemer K., Muruganujan A., Guo N., Sato S., Bafna V., Istrail S., Lippert R., Schwartz R., Walenz B., Yooseph S., Allen D., Basu A., Baxendale J., Blick L., Caminha M., Carnes-Stine J., Caulk P., Chiang Y. H., Coyne M., Dahlke C., Mays A. D., Dombroski M., Donnelly M., Ely D., Esparham S., Fosler C., Gire H., Glanowski S., Glasser K., Glodek A., Gorokhov M., Graham K., Gropman B., Harris M., Heil J., Henderson S., Hoover J., Jennings D., Jordan C., Jordan J., Kasha J., Kagan L., Kraft C., Levitsky A., Lewis M., Liu X., Lopez J., Ma D., Majoros W., McDaniel J., Murphy S., Newman M., Nguyen T., Nguyen N., Nodell M., Pan S., Peck J., Peterson M., Rowe W., Sanders R., Scott J., Simpson M., Smith T., Sprague A., Stockwell T., Turner R., Venter E., Wang M., Wen M., Wu D., Wu M., Xia A., Zandieh A., Zhu X., The sequence of the human genome. Science 291, 1304–1351 (2001).