Review

Proteomes. 2021 Aug 31;9(3):38.

doi: 10.3390/proteomes9030038.

Proteomes Are of Proteoforms: Embracing the Complexity

Katrina Carbonara¹, Martin Andonovski¹, Jens R Coorssen¹

Affiliation

¹ Faculties of Applied Health Sciences and Mathematics & Science, Departments of Health Sciences and Biological Sciences, Brock University, 1812 Sir Isaac Brock Way, St. Catharines, ON L2S 3A1, Canada.

PMID: 34564541
PMCID: PMC8482110
DOI: 10.3390/proteomes9030038

Katrina Carbonara et al. Proteomes. 2021.

Abstract

Proteomes are complex, much more so than genomes or transcriptomes. Thus, simplifying their analysis does not simplify the issue. Proteomes are of proteoforms, not canonical proteins. While having a catalogue of amino acid sequences provides invaluable information, this is the Proteome-lite. To dissect biological mechanisms and identify critical biomarkers/drug targets, we must assess the myriad of proteoforms that arise at any point before, after, and between translation and transcription (e.g., isoforms, splice variants, and post-translational modifications [PTM]), as well as newly defined species. Numerous analytical methods are currently used to address proteome depth, and here we critically evaluate these in terms of the current 'state-of-the-field'. We thus discuss both the pros and cons of available approaches and where improvements or refinements are needed to quantitatively characterize proteomes. To enable a next-generation approach, we suggest that advances lie in transdisciplinarity via integration of current proteomic methods to yield a unified discipline that capitalizes on the strongest qualities of each. Such a necessary (if not revolutionary) shift cannot be accomplished by a continued primary focus on proteo-genomics/-transcriptomics. We must embrace the complexity. Yes, these are the hard questions, and this will not be easy… but where is the fun in easy?

Keywords: Western blotting; bottom-up; immunoassay; mass spectrometry; proteomics; top-down; two-dimensional gel electrophoresis.

Conflict of interest statement

The authors declare no conflict of interest.

Figures

**Figure 1**
Schematic illustration of proteoform synthesis. Depicted are a handful of (but not all) factors contributing to final proteoform configuration that are not seen in, nor predicted by, the central dogma. The PTM noted are but examples of the hundreds of currently identified native modifications [23,24]. Each modification that occurs throughout the development and lifespan of a given amino acid backbone will yield multiple different proteoforms, each differing in its biological localization and/or function.

**Figure 2**
Proteomics: discovery and targeted approaches. Discovery proteomics is defined by two main approaches: top-down (resolution of intact protein species) and bottom-up (peptide mass spectrometry (MS) of proteolytic digests). Targeted proteomics involves either antibody- or MS-dependent approaches. Data dependent acquisition (DDA) and data independent acquisition (DIA) were initially developed for discovery but can be modified to also serve in a targeted approach.

**Figure 3**
Top-down versus bottom-up proteomics. This schematic depicts a general description of the workflows for these two discovery approaches. While both rely on final MS analysis for identifications (not to oversimplify the analysis of intact proteoforms), the main differences lie in the up-front analytical approaches. Top-down resolves intact proteoforms prior to MS while bottom-up generally bypasses any initial separation technique. Thus, top-down provides proteoform information while bottom-up can only provide (limited) amino acid sequence information. Nonetheless, perhaps the most important point to immediately emphasize is the critical importance of high quality/high resolution MS to proteomics as an integrative discipline, now and into the future.

**Figure 4**
Schematic of MS/MS. A basic overview of the four main systems of MS/MS and the different methods for each. Peptides undergo separation via LC prior to ionization. Peptides are then transformed into ions before entering the mass filter where precursor ions are then selected prior to collision-induced dissociation. The resulting fragment ions are then separated and transmitted to the detector. The mass filter measures the mass of the ions and the detector counts the ions. This information can then be combined to determine the mass-to-charge ratio (m/z), leading to identification of a peptide.
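
The link between ion mass, charge, and m/z mentioned in the Figure 4 caption can be sketched in a few lines. This is a minimal illustration and is not taken from the paper: the function names `mz_from_mass` and `mass_from_mz` and the example mass are hypothetical, and the sketch assumes positive-mode ions whose charge comes solely from added protons.

```python
PROTON_MASS = 1.007276466  # Da, mass of a single proton

def mz_from_mass(neutral_mass: float, charge: int) -> float:
    """Return m/z for a peptide of the given neutral monoisotopic mass.

    Assumes the ion carries `charge` protons and nothing else (hypothetical helper).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge

def mass_from_mz(mz: float, charge: int) -> float:
    """Invert mz_from_mass: recover the neutral mass from an observed m/z."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)

# The same (illustrative) 1000 Da peptide appears at a different m/z per charge state.
for z in (1, 2, 3):
    print(z, round(mz_from_mass(1000.0, z), 4))
```

Because one peptide can appear at several m/z values, the charge state has to be known or inferred before a mass can be assigned.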

**Figure 5**
Peptide MS. This illustrates the information obtained via routine peptide MS. (A) Canonical protein (primary amino acid sequence); (B) PTM = Proteolytic cleavage; (C) PTM = Ubiquitination; (D) PTM = Two phosphorylations; (E) PTM = Phosphorylation and methylation. As only peptides are being sequenced, the ‘canonical protein’ identifications are based on inference; thus, as shown in (B), even though there has been a native proteolytic cleavage to generate another proteoform (i.e., likely to modify the biological activity of the canonical protein—Proteoform 1), it will not be detected by inference identification. Notably, other than potentially identifying SNP, no proteoform information is obtained via peptide MS without specific additional processing and assays.
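
The inference problem described for panel (B) can be illustrated with a toy in silico digest. This is a sketch under stated assumptions, not the authors' workflow: the sequence is made up, `tryptic_digest` is a hypothetical helper using the simplified trypsin rule (cleave after K or R unless followed by P), and the native cleavage is assumed to coincide with a tryptic site.

```python
import re

def tryptic_digest(sequence: str) -> list[str]:
    """Split after K or R, except when the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]

# Hypothetical canonical sequence and a proteoform lacking its first 10 residues.
canonical = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
truncated = canonical[10:]  # assumed native proteolytic cleavage after "...QR"

canonical_peptides = set(tryptic_digest(canonical))
truncated_peptides = set(tryptic_digest(truncated))

# Every peptide from the truncated proteoform also belongs to the canonical
# protein, so peptide-level evidence alone points back to the canonical entry.
all_map = truncated_peptides <= canonical_peptides
print(sorted(truncated_peptides), all_map)
```

In this toy case the only difference between the two digests is the absence of the N-terminal peptides, and a missing peptide is not by itself evidence of a distinct proteoform.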

**Figure 6**
Integrative top-down proteomics via 2DE and PTM post-staining. (A) PTM = two phosphorylation sites; (B) Phosphorylation and methylation; (C) Glycosylation. Different PTM can change the pI and MW of a protein species, thus altering its final resolution in a 2D gel, which can be seen using a total staining method. Additional selective staining (e.g., phospho- and glyco-protein staining) can be used to identify these proteoforms prior to digestion and MS. Phosphorylation yields more acidic species and sugar groups increase MW [113]. A chain of protein species seen in the 2D gel is often indicative of an identical canonical protein with varying modifications.

**Figure 7**
Integrative and MS-intensive proteome analysis. This schematic depicts the workflows of these two top-down approaches. Integrative MS involves the separation of intact protein species via 2DE prior to peptide MS. Additionally, spots of high abundance, areas at the pH extremes, and unresolved small peptides in the migrating front can be further subjected to a third electrophoretic separation. MS-intensive involves separation of intact protein species, currently mainly via GELFrEE, prior to intact protein MS. The dashed line represents the potential combination of integrative and MS-intensive approaches, which has not yet been pursued.

**Figure 8**
Antibodies and proteoforms. As antibodies are mainly raised to identify amino acid epitopes, it is possible that a PTM at, or near, the epitope will interfere with binding of the antibody. This may prevent the detection of the target. (A) Antibody binding without any interference; (B) Antibody binding without interference from the phosphate group; (C) Antibody binding blocked by methyl group; (D) Phosphate and sugar groups adjacent to the epitope affect/block antibody binding.

**Figure 9**
MS-based targeted proteomics. Shown are the different acquisition modes commonly used for targeted detection of protein species with MS. (A) SRM—quantifies specific, predetermined ions from peptide of interest; (B) PRM—simultaneously analyzes all fragment ions of the pre-selected peptides of interest; (C) DIA—analyzes all peptide mass ranges within the window without pre-selection.

References

1. Fey S.J., Larsen P.M. 2D or not 2D. Curr. Opin. Chem. Biol. 2001;5:26–33. doi: 10.1016/S1367-5931(00)00167-8.
2. Wilkins M.R., Sanchez J.-C., Gooley A.A., Appel R.D., Humphery-Smith I., Hochstrasser D.F., Williams K.L. Progress with proteome projects: Why all proteins expressed by a genome should be identified and how to do it. Biotechnol. Genet. Eng. Rev. 1996;13:19–50. doi: 10.1080/02648725.1996.10647923.
3. Duncan M.W., Yergey A.L., Gale P.J., Kate Y. Quantifying proteins by mass spectrometry. LC-GC N. Am. 2014;32:726–735.
4. Jungblut P.R., Holzhütter H., Apweiler R., Schlüter H. The speciation of the proteome. Chem. Cent. J. 2008;2:16. doi: 10.1186/1752-153X-2-16.
5. Jungblut P.R., Thiede B., Schlüter H. Towards deciphering proteomes via the proteoform, protein speciation, moonlighting and protein code concepts. J. Proteom. 2016;134:1–4. doi: 10.1016/j.jprot.2016.01.012.

Grants and funding

2019-04324/Natural Sciences and Engineering Research Council of Canada
