Essays Biochem. 2022 Aug 10;66(3):255-285.
doi: 10.1042/EBC20200108.

Uncovering protein function: from classification to complexes

Rhiannon Morris et al. Essays Biochem.

Abstract

Almost all interactions and reactions that occur in living organisms involve proteins. The various biological roles of proteins include, but are not limited to, signal transduction, gene transcription, cell death, immune function, structural support, and catalysis of all the chemical reactions that enable organisms to survive. The varied roles of proteins have led to them being dubbed 'the workhorses of all living organisms'. This article discusses the functions of proteins and how protein function is studied in a laboratory setting. We begin by examining the functions of protein domains, followed by a discussion of some of the major classes of proteins based on their function. We then consider in detail protein binding, which is central to protein function. Next, we examine how protein function can be altered through various mechanisms, including post-translational modification, changes to environment, oligomerisation and mutation. Finally, we consider a handful of the techniques employed in the laboratory to understand and measure the function of proteins.

Keywords: biochemical techniques and resources; post translational modification; protein binding.

Conflict of interest statement

The authors declare that there are no competing interests associated with the manuscript.

Figures

Figure 1. Classification of human proteins by function
Proteins tend to be classified based on the biological function they perform. These groups can then be further divided into subsections. Enzymes can be further divided into groups based on the reactions they catalyse, and binding proteins based on their ligands, or binding partners. The subcategories for enzymes are from the internationally agreed enzyme classification system from the Enzyme Commission (EC).
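The enzyme subcategories mentioned in this caption can be illustrated with a short lookup. This is a minimal sketch, not taken from the figure itself: the seven top-level class names are the standard EC classes as I understand the nomenclature, and the function name `ec_top_level_class` is hypothetical.

```python
# Top-level Enzyme Commission (EC) classes, keyed by the first field of an EC number.
# Assumption: names follow the standard EC nomenclature, not the figure image.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}


def ec_top_level_class(ec_number: str) -> str:
    """Return the top-level EC class for a string such as 'EC 2.7.1.40'."""
    text = ec_number.strip()
    # Drop an optional leading "EC" label before reading the first field
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    first_field = text.split(".")[0]
    return EC_CLASSES[int(first_field)]
```

For example, pyruvate kinase (EC 2.7.1.40) falls in class 2, the transferases, because it transfers a phosphate group.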
Figure 2. Primary, secondary, tertiary, and quaternary protein structure
Primary structure is the sequence of a chain of amino acids. Secondary structure is due to hydrogen bonding of the peptide backbone, allowing the peptide to fold into a repeating structure such as α-helices or β-sheets. Tertiary structure is the 3D fold of a protein that occurs due to side chain interactions and core packing. Quaternary structure occurs when multiple chains, or subunits, interact to form a functional complex. Here, each polypeptide chain is depicted in a unique colour in a quaternary complex.
Figure 3. The three distinct domains of pyruvate kinase
Pyruvate kinase is arranged into three domains, shown here in blue (domain A), green (domain B), and white (domain C). The domains are illustrated through a schematic diagram (left) or with structural models (right). Domain C is the regulatory domain and binding of fructose-1,6-bisphosphate to domain C activates the catalytic activity of pyruvate kinase due to a conformational change. Together domains A and B form an enzymatic active site and perform a catalytic function. Despite the different functions of the individual domains of pyruvate kinase (regulatory and catalytic), the overall function of the protein is typically considered to be enzymatic; PDB ID: 1PKN.
Figure 4. Protein domains that bind modified residues
Schematic diagram of select examples of domains that bind modified residues and examples of proteins that contain this type of domain. Binding depends on a match in both shape and chemical nature between the domain and the modified residue, which is represented by shape and colour in this figure. SH2 and PTB domains bind phosphotyrosine residues. WW domains mediate binding to phosphoserine and phosphothreonine residues, while chromodomains bind methylated lysine residues. UBA domains bind ubiquitin molecules and bromodomains allow interactions with acetylated lysine residues.
Figure 5. LNK (SH2B3) SH2 domain with phosphopeptide bound
Cartoon representation of the backbone of an SH2 domain with the secondary structural features indicated beneath a surface representation. LNK SH2 domain structure with the JAK2 pY813 peptide shown; PDB ID: 7R8W.
Figure 6. Protein domains that bind other biological molecules
Select examples of protein domains binding other macromolecules. (A) The PX domain of a yeast sorting nexin (orange) binding to the headgroup of a lipid, PtdIns3P (blue), PDB ID: 1OCU. (B) The glucocorticoid receptor, a homodimer, (purple, red) bound to a DNA molecule (tan helix), PDB ID: 1R4R. (C) The Trp RNA-binding attenuation protein (TRAP) (green) bound to an RNA molecule (pink helix), PDB ID: 1C9S.
Figure 7. Examples of the diverse functions of proteins and the domains they comprise
Many proteins are made up of multiple functional domains that come together to give rise to the overall function of the protein. Shown above is a schematic representation of how catalytic, regulatory, binding, structural and oligomerisation domains can come together to give rise to different functions in different proteins. This figure aims to show that different functional domains work together to control different elements of a protein’s function, rather than to accurately depict the protein’s structure.
Figure 8. Regulation of metabolism by phosphofructokinase-2/fructosebisphosphatase-2
Glycolysis and gluconeogenesis are two key metabolic processes that regulate energy levels in the body through cellular respiration. Glycolysis is the process whereby glucose is broken down and is the first step in respiration in eukaryotes, whereas gluconeogenesis allows glucose to be made from simpler precursors such as lactate or pyruvate. Phosphofructokinase-2/fructosebisphosphatase-2 is a bifunctional enzyme that is a key regulator of glycolysis and gluconeogenesis. Both phosphofructokinase-2 (green, left) and fructosebisphosphatase-2 (blue, right) are part of the same 55 kDa polypeptide chain that contains an N-terminal regulatory domain, a kinase domain and a phosphatase domain. Phosphofructokinase-2/fructosebisphosphatase-2 catalyses the formation and degradation of the key allosteric regulator fructose-2,6-bisphosphate, which acts as a switch between glycolysis and gluconeogenesis. In brief, when blood glucose is high (depicted on the left of the figure), insulin is produced. A downstream effect of insulin signalling is dephosphorylation of the PFK-2/FBPase-2 complex and increased PFK-2 kinase activity. This results in increased fructose-2,6-bisphosphate levels and increased glycolysis. Alternatively, when blood glucose is low (depicted on the right of the figure), the hormone glucagon is produced. This results in phosphorylation of the complex and increased FBPase-2 phosphatase activity. This decreases fructose-2,6-bisphosphate levels, slowing glycolysis and increasing gluconeogenesis.
Figure 9. Examples of a non-symmetric transient multi-protein complex and some symmetric obligatory homooligomers
The TFIID–TFIIA–TBP initiation complex is formed by many subunits that come together in a non-symmetric way to carry out a specific function. Ferritin, alcohol dehydrogenase and the Satellite Tobacco Necrosis Virus coat protein are examples of symmetric homooligomers, i.e. they comprise repeating subunits that form the functional unit. Each protein chain (subunit) in each example has a unique colour to highlight each component. PDB ID from left to right: 6MZM, 1HRS, 2OHX, 2BUK.
Figure 10. Two types of protein interaction
On the left is an example of an obligatory homodimer between two folded monomers that creates an active site (coloured purple) between the subunits. On the right is an example of a transient multifunctional complex between the homodimer and an unstructured region containing two phosphorylated residues (indicated by the letter P) within a second multi-domain protein. The formation of the multifunctional complex causes the two monomers within the dimer to rotate to reveal secondary binding sites (white) and subsequently close the original active site (purple). This is an example of allostery, where binding of one protein changes the conformation of another, which may be controlled according to the presence/absence of post-translational modifications. The thermodynamic and kinetic parameters are different for the two types of interactions as indicated. The obligatory interaction may occur in one compartment, after which the homodimer moves to another compartment (as indicated by the dashed line) to engage in the transient interaction.
Figure 11. Electrostatics and the encounter complex in protein interactions
A protein domain (grey) may have positively and negatively charged regions that generate electric field lines, which guide and orient the positively charged peptide onto the negatively charged binding site. This allows the peptide to dock in multiple ways on the surface of the protein via long-range, looser interactions in an encounter complex that facilitates a fast on-rate for the interaction (each peptide position is shown in a different colour). In the second step, once the correctly oriented peptide position is encountered, the peptide goes on to bind as the final complex. The encounter complex is a stable intermediate, as shown in the energy diagram that describes a possible pathway from free peptide/domain to the encounter complex to the final domain-peptide complex. See the uncovering protein structure review in this series for more information on these types of energy diagrams.
Figure 12. Proteins can find their partners faster by searching in 1D or 2D
A multi-domain protein may first associate non-specifically with the DNA predominantly through one domain facilitated by long-range electrostatics to the charged DNA backbone and then ‘slide’ along DNA until its target DNA site (green) is found where the second domain can now also bind, locking it in place. Another multi-domain protein may bind a disordered region of a protein that is enriched in charged and hydrophobic residues which would allow non-specific binding and then slide or hop along the disordered region until its binding site (green) is found where again the second domain can now also bind, locking it in place. Another multi-domain protein may bind a lipid bilayer and diffuse until the second domain binds to a target membrane protein (green). In all cases, the search would be quicker since the protein is only scanning possible areas of the cell that are more likely to lead to the correct target. This is facilitated by having a separate domain that specialises in binding either DNA, disordered protein regions or lipid bilayers possibly through forming an encounter complex. In some cases, a single domain could achieve the same effect by having two distinct regions, a non-specific and specific binding surface.
Figure 13. Free energy of specificity
The protein P can bind a non-target A and two different targets B and C. The free energies of the protein complexes PA, PB, and PC are lower than the free energy of the components before binding and thus indicate favourable binding in all cases. However, the binding of B and C is more favourable than the binding of A. The difference between the binding free energy changes, also known as the free energy of ‘specificity’, indicates how much better B and C bind than A. The bigger the difference, the more specific the target is for the protein. C is more specific than B due to additional unique contacts with the protein. On the right is an example of an SH3 domain bound to an extended peptide that uses the common surface I (red) as well as an additional surface II (blue), which gives the interaction greater affinity and specificity (PDB ID: 2KXC).
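The comparison described in this caption can be written out as a short worked relation. This is a sketch under stated assumptions: the symbols $\Delta G_{PX}$ and $\Delta\Delta G_{\mathrm{spec}}$ are introduced here for illustration and do not appear in the original, and the last line assumes the standard equilibrium relation $\Delta G^{\circ} = RT \ln (K_d / c^{\circ})$ between binding free energy and dissociation constant.

```latex
\begin{align}
\Delta G_{PX} &= G_{PX} - \left(G_P + G_X\right) < 0,
  & X &\in \{A, B, C\} \\
\Delta\Delta G_{\mathrm{spec}}(X) &= \Delta G_{PX} - \Delta G_{PA},
  & X &\in \{B, C\} \\
\Delta\Delta G_{\mathrm{spec}}(C) &< \Delta\Delta G_{\mathrm{spec}}(B) < 0 \\
\Delta\Delta G_{\mathrm{spec}}(X) &= RT \ln \frac{K_d^{PX}}{K_d^{PA}}
\end{align}
```

With this sign convention, a more negative $\Delta\Delta G_{\mathrm{spec}}$ means a more specific target. As an illustration with hypothetical numbers, a target whose $K_d$ is ten times lower than that of the non-target at 298 K gives $\Delta\Delta G_{\mathrm{spec}} = RT \ln(0.1) \approx -5.7$ kJ/mol.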
Figure 14. Mechanism of alternative splicing
Genes consist of non-coding (introns) and coding (exons) regions. Each exon in this figure is indicated by a colour (blue, red, yellow). Different arrangements of exons are ‘spliced’ together to make varying mRNA transcripts that give rise to proteins with different structural elements that may perform different functions.
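The idea that one gene can yield several transcripts can be sketched in a few lines of code. This is a deliberately simplified illustration, assuming exon skipping is the only splicing event and that exon order is always preserved; real splicing is regulated and does not produce every combination. The exon sequences and the function names `spliced_transcripts` and `join_exons` are hypothetical.

```python
from itertools import combinations


def spliced_transcripts(exon_names):
    """Return every in-order, non-empty selection of exons (exon skipping only)."""
    transcripts = []
    for size in range(1, len(exon_names) + 1):
        # combinations() keeps the input order, mirroring 5'-to-3' exon order
        transcripts.extend(combinations(exon_names, size))
    return transcripts


def join_exons(transcript, exon_sequences):
    """Concatenate the sequences of the retained exons into one mRNA string."""
    return "".join(exon_sequences[name] for name in transcript)


# Hypothetical exon sequences, named after the colours used in the figure
EXONS = {"blue": "AUGGCU", "red": "GAAUUC", "yellow": "GGAUAA"}

isoforms = spliced_transcripts(list(EXONS))
skipped_red = join_exons(("blue", "yellow"), EXONS)
```

Here `isoforms` holds the candidate exon arrangements for the three coloured exons, and `skipped_red` is the transcript obtained when the red exon is skipped.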
Figure 15. Overview of the various types of post translational modifications
The different types of post-translational modifications can be classified depending on the modification type. Small molecule modifications include addition of small moieties such as methyl, phosphate, or acetyl groups to proteins. More complex modifications include glycosylation and lipidation. Other proteins such as SUMO and ubiquitin can be added to proteins. Proteolysis is a form of non-reversible cleavage, and finally disulphide formation and nitrosation involve oxidation of proteins.
Figure 16. Four forms of mutation
Silent mutations are due to a single base change that does not alter the final protein sequence, whereas a missense mutation is a single base change that results in a different amino acid being incorporated into the protein sequence. A nonsense mutation introduces a premature stop codon into the coding sequence, truncating the protein. For ease of interpretation, green squares indicate the presence of a mutation in the DNA sequence and, where relevant, downstream changes to the mRNA or protein sequences based on the mutation.
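The three single-base changes described in this caption can be distinguished by translating the codon before and after the change. The sketch below assumes the standard genetic code and DNA coding-strand codons; the function name `classify_point_mutation` is hypothetical, and a change that removes a stop codon is reported as missense for simplicity.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_point_mutation(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base substitution as silent, missense or nonsense."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    changed = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if changed != 1:
        raise ValueError("expected exactly one base change")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"      # same amino acid (or same stop signal)
    if alt_aa == "*":
        return "nonsense"    # new premature stop codon
    return "missense"        # different amino acid
```

For instance, GAA to GAG keeps glutamate (silent), GAA to GTA swaps glutamate for valine (missense), and CAA to TAA replaces glutamine with a stop codon (nonsense).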
Figure 17. General process of biopanning with phage display libraries
The first step involves generating and expressing a library of DNA that will encode the peptides or proteins to be displayed. This library will then be expressed in phage and tested for target binding. The target is immobilised and exposed to the phage library to allow binding to occur. During the binding step, the physical, chemical and biological parameters can be altered. Unbound phage are then washed away and the remaining bound phage are eluted. These phage can then be used to infect bacteria and generate more copies of the phage, from which the DNA sequence can be determined to identify the binding peptide or protein.
Figure 18. Overview of a typical FRET assay
Two proteins of interest are labelled with fluorophores that form a FRET pair. One is a donor fluorophore (green) and the other is an acceptor (red). Each fluorophore has an absorption spectrum (shown on the left side in each colour) and an emission spectrum (shown on the right side in each colour). If the proteins are in close physical proximity, exciting the donor fluorophore will enable transfer of energy to the acceptor fluorophore, exciting it in turn, and enabling the interaction to be measured as emitted light from the acceptor fluorophore.
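The dependence of energy transfer on proximity can be made concrete with the standard Förster relation, E = 1 / (1 + (r / R0)^6), which is not stated in the caption and is added here as background. In this minimal sketch, `r` is the donor-acceptor distance, `r0` is the Förster radius at which transfer is 50% efficient, and the 5 nm value used in the example is an assumed, illustrative figure rather than one taken from the article.

```python
def fret_efficiency(r: float, r0: float) -> float:
    """Energy transfer efficiency for donor-acceptor distance r and Forster radius r0.

    Both distances must be positive and expressed in the same unit (e.g. nm).
    """
    if r <= 0 or r0 <= 0:
        raise ValueError("distances must be positive")
    return 1.0 / (1.0 + (r / r0) ** 6)


# Illustrative values only: an assumed Forster radius of 5 nm
R0_NM = 5.0
close_pair = fret_efficiency(2.5, R0_NM)     # fluorophores well within R0
distant_pair = fret_efficiency(10.0, R0_NM)  # fluorophores at twice R0
```

The sixth-power dependence is why the acceptor signal reports on interaction: efficiency is close to 1 at half the Förster radius and falls below 2% at twice that radius.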
Figure 19. Overview of a liposome assay
A liposome forms an artificial membrane environment that contains a sodium channel (blue) and encapsulates a sodium-sensitive fluorescent dye (green crescents). Sodium ions are transported by the channel into the liposomal lumen, where they interact with the fluorescent dye, resulting in an increased fluorescent readout (bottom).
Figure 20. Schematic outlining the IL-6 JAK-STAT signalling pathway and regulation
The JAK proteins are constitutively associated with the GP130 receptor. IL-6 binds IL6RA and GP130, allowing the JAKs to come into close proximity to one another. The JAKs are then activated by transphosphorylation, and subsequently phosphorylate nearby tyrosine residues including those found on the GP130 receptor tail. The four distal phosphotyrosine sites act as docking sites for STATs. Activated STATs then translocate to the nucleus where they bind to specific regions on target DNA and up-regulate gene transcription. These changes in gene transcription dictate the response the cell has to the initial cytokine binding. There are several proteins involved in the regulation of JAK-STAT signalling, including the SOCS proteins and phosphatases which control ubiquitination and dephosphorylation of the proteins, respectively. Additionally, drugs such as JAK and IL-6 inhibitors have been developed as therapeutics to block signalling via the JAK-STAT signalling pathway.
