Review
Biochimie. 2019 Nov:166:52-76.
doi: 10.1016/j.biochi.2019.09.004. Epub 2019 Sep 7.

Surface loops of trypsin-like serine proteases as determinants of function

Peter Goettig et al. Biochimie. 2019 Nov.

Abstract

Trypsin and chymotrypsin-like serine proteases from family S1 (clan PA) constitute the largest protease group in humans and more generally in vertebrates. The prototypes chymotrypsin, trypsin and elastase represent simple digestive proteases in the gut, where they cleave nearly any protein. Multidomain trypsin-like proteases are key players in the tightly controlled blood coagulation and complement systems, as are related proteases that are secreted from diverse immune cells. Some serine proteases are expressed in nearly all tissues and fluids of the human body, such as the human kallikreins and kallikrein-related peptidases with specialization for often unique substrates and accurate timing of activity. HtrA and membrane-anchored serine proteases fulfill important physiological tasks with emerging roles in cancer. The high diversity of all family members, which share the tandem β-barrel architecture of the chymotrypsin fold in the catalytic domain, is conferred by large differences in the eight surface loops surrounding the active site. The length of these loops varies through insertions and deletions, resulting in remarkably different three-dimensional arrangements. In addition, metal binding sites for Na+, Ca2+ and Zn2+ serve as regulatory elements, as do N-glycosylation sites. Depending on the individual tasks of the protease, the surface loops determine substrate specificity, control the turnover and allow regulation of activation, activity and degradation by other proteins, which are often serine proteases themselves. Most intriguingly, in some serine proteases, the surface loops interact as an allosteric network, partially tuned by protein co-factors. Knowledge of these subtle and complicated molecular motions may now allow new and specific pharmaceutical or medical approaches.

Keywords: Allosteric network; Chymotrypsin fold; Regulatory mechanism; Structure-function relationship; Surface loops; Zymogenicity.

Figures

Figure 1. The chymotrypsin/trypsin fold consisting of two β-barrels and eight surface loops.
1A: The left panel shows the crystal structure of the proform or zymogen of bovine chymotrypsinogen A (PDB code 1CHG [288]) and the right panel displays the active form (2CHA [289]), both in ribbon representation. The catalytic triad residues Ser195, His57, and Asp102 are depicted as stick models, as well as the specificity determining Ser189 and the five disulfide bridges. Upon activation with formation of a salt bridge between the new N-terminus Ile16 and Asp194, large rearrangements are seen around the active site, which rigidifies and shapes the S1 pocket.
1B: Close-up view of the active site. Hydrogen bonds are shown as dotted lines. Residue 189 at the bottom of the S1 subsite is the crucial residue that determines the primary specificity of serine proteases, while the oxyanion hole, based on the N-H groups of Ser195 and Gly193, stabilizes the transition state of substrates at the scissile bond.
1C: Stereo representation of chymotrypsin with the eight surface loops in color around the active site. The numbering scheme is mainly based on the central residue of the loop, as derived from bovine chymotrypsinogen A (bCTRA) or by the most common designations. All these loops show significant variations among the S1 family proteases and can serve as regulatory elements.
1D: Molecular surface of bCTRA with the eight colored active site loops and alternative numbering schemes, e.g. I-VIII, or as proposed by Perona and Craik (A–E and 1-3) [14].
Figure 2. Trypsin-like serine proteases in the digestive tract of the small intestine.
2A: Activation scheme. The membrane-anchored, mosaic enteropeptidase (EK, trypsin-like serine protease domain in red, with additional domains) activates trypsin (TRY) by cleavage of the N-terminal propeptide, whereby the neo-N-terminus inserts in the activation pocket, depicted as N within the active protease domain in red. Trypsin activates chymotrypsins (CTR), elastases (ELA), carboxypeptidases (CBP) and tissue kallikrein 1 (KLK1).
2B: Overlay of bovine chymotrypsin B (green, PDB code 1CBW [290]), human trypsin-1 (red, 1TRN [291]) and porcine elastase 2A (yellow, 1BRU). These proteases are highly conserved in mammals, whereby trypsin is seven residues shorter than chymotrypsin, while elastase is eight residues longer. The most distinguishing features are the primary specificity determining Asp189 of trypsin and Ser189 in chymotrypsin and elastase. The latter possesses a Ser226 instead of Gly226, which excludes the binding of large S1 side-chains. In addition, trypsin exhibits three Glu residues in the 75-loop, which bind the activity-enhancing Ca2+.
2C: Molecular surface of chymotrypsin, trypsin and elastase with electrostatic potential. Owing to the overall short surface loops, most specificity pockets, except the S1 subsite, are not well defined and allow interaction with various substrate side-chains. Trypsin has a strong negative potential (red) in the S1 pocket and binds basic P1-Arg and Lys residues. Chymotrypsin prefers hydrophobic, aromatic P1-residues, such as Phe, Tyr and Trp, whereas elastase accepts shorter side-chains, such as Ala, Val, Ile, and Thr.
Figure 3. Trypsin-like proteases in blood coagulation and fibrinolysis.
3A: Scheme of the blood coagulation system. Intrinsic and extrinsic pathways lead to the common pathway with thrombin as central protease that cleaves fibrinogen monomers, which assemble to fibrin polymers. The protease activation cascades include active tryptic proteases (red), participating non-protease regulators (white), and the crosslinking transglutaminase fXIIIa (grey). Plasmin removes fibrin clots to restore blood vessels.
3B: Regulation of blood coagulation depends on distinct conformational changes of the active site loops, such as in fXII, whose zymogen conformation involves allostery of the N-terminus with the undefined 189- and 220-loops (dots) (4XE4 [34]). The color scheme for the loops corresponds to Figs. 1C and 1D, e.g. the 99-loop is depicted yellow, the 75-loop is green, etc.
3C: A comparison of fIXa with and without Ca2+ bound in the 75-loop (green) confirmed an allosteric communication line (arrow) to the N-terminus and the 189-loop (pink) (2WPI [40]); same color scheme as in Fig. 3B.
3D: A complex of snake venom homologs is a good model of the fVa-fXa complex or prothrombinase, and resembles the Xase complex of fVIIIa-fIXa (4BXS [47]). The crucial interaction is between the A2 domain (green) and the c175-loop (orange), resulting in an open “unlocked” conformation of the protease.
3E: Bovine thrombin depicted with electrostatic surface potential and fibrinogen peptide bound to the active site (1UCY [292]). Two anion binding exosites mediate interactions with various substrates and regulators.
3F: The structure of full-length plasminogen type II (4DUR [79]) reveals the arrangement of the five kringle domains (various colors), which contribute to fibrin clot and receptor binding. The N-terminal PAp domain (blue) maintains the closed and inactive conformation, as well as the unprocessed propeptide (black), while the catalytic domain (white) is in the zymogen state (catalytic triad residues shown as spheres).
Figure 4. Trypsin-like proteases of the complement system.
4A: Scheme of the complement cascade and the three major pathways. The classical pathway involves proteases C1r and C1s, which activate the C2a component of the C3 convertase. C2a forms the protease component of the C5 convertase as well. The lectin pathway is initiated by mannan binding lectin (MBL) or ficolin, whereby the complex with the MASP proteases generates the C3 convertase C4bC2a. The alternative pathway generates the C5 convertase with factors CD and CB corresponding functionally to C1s and C2. Proteases are displayed red, thioesterase complexes black, and other protein cofactors white. All pathways are more complicated due to complement receptors, additional functions of cleavage products and various interconnections [88].
4B: The MASP2-C4 Michaelis complex as an example of the initiating step of the cascade (4FXG [175]). The trypsin-like protease domain with additional domains is shown in red shades, while the substrate C4 domains are shown in green and yellow. The active site bound segment and the catalytic triad residues are shown as spheres. Except for the alternative pathway, interaction of the serine protease with pathogen binding molecules is required.
4C: Factor C2a, the protease component of the C3 and C5 convertase of the classical pathway, consists of a serine protease domain (white), with many alterations compared to trypsin, and a regulatory VWA domain (green). The regulator helix α7 (dark green) sits at the domain interface with an N-glycan (spheres) at the linker (black) and can, together with helix α1, switch to an active, open conformation (2I6Q and 2I6S [99]). Active C2a is still in a partial zymogen-like state, due to the flipped peptide of Lys656-Gly657 of the oxyanion hole, albeit activated by the unusual salt bridge of Arg696 (c224) in the 220-loop (purple) to Glu658 (c194).
4D: Factor CI, a major control element of the complement cascades, circulates as an active protease, although in a zymogen-like state, as corroborated by a largely disordered N-terminal segment up to residue 325 (c19), including the 75- (green), 148- (brown), 189- (pink), and 220-loops (magenta), which are depicted with their defined residues as spheres, like His57 (2XRC [103]). By contrast, the N-terminal FIMAC, LDLRA1/2 and SRCR domains, which mediate contact to the major substrate C3b, are well defined.
4E: Factor CB structure with three N-terminal sushi or CCP domains (labeled 1 to 3), which require cleavage at the Arg234-Lys235 bond for activation (2OK5 [108]). Active CBb of the alternative pathway largely resembles factor C2a, with a regulatory VWA domain and similar characteristics of the active site, including the unusual salt bridge (c224-c194) and the same oxyanion hole conformation.
Figure 5. Kallikreins and kallikrein-related peptidases.
5A: Tissue kallikrein and plasma kallikrein KLKB are kininogen cleaving regulators of blood pressure, while the prostatic KLKs 2, 3, 4, 5, and 11 are either part of an activation cascade and/or degrade gel forming proteins to enhance sperm motility for impregnation. In brain, KLK6 and 8 participate in the regulation of glia cell and synaptic remodeling, whereas the skin related KLKs 5, 7, and 14 activate themselves and cleave cell connecting molecules for skin cell shedding.
5B: KLK2 is glycosylated at Asn95, depicted as core glycan, which favors the closed conformation of the 99-loop (yellow) with an 11-residue insertion with respect to bCTRA. KLK3 has an N-glycan linked to Asn61 (not shown) and a 99-loop, which can cover the non-prime-side from S4 to S2 like a lid, as in KLK2 (lower panel). The eKLK3 structure represents an E* form, with closed 99- (yellow), 148- (brown), 189- (pink), and 220-loops (magenta), but intact catalytic triad and Ile16-Asp194 salt bridge (1GVZ [165]), color scheme as in Figs. 1C and 1D.
5C: KLK4 exhibits a unique Zn2+ binding site that connects the 75-loop with the N-terminal segment via the ligands Glu74 and His25, with an additional cation binding Glu74* ligand from a neighbor molecule (2BDH [174]). Zn2+ binding causes a conformational change to a zymogen-like conformation of the active site via an exposed N-terminus, perhaps mediated by the disulfide Cys22-Cys157, as suggested for fVIIa.
5D: KLK5, 7, and 8 possess inhibitory Zn2+ binding sites, involving His99 and other ligands, such as His96 (2PSX and 2PSY [203]). The inhibition process is depicted in three steps: the two His side chains of the 99-loop in KLK5 are ready to bind Zn2+ in solution (1); an intermediate state with His96 and His99 bound to Zn2+ was observed for KLK5 (2); eventually, the inactive state with a disrupted catalytic triad (3) is represented by tonin, a KLK2 ortholog from rat, where Zn2+ is bound by His57 (1TON [180]).
5E: KLK8 exhibits both a stimulatory Ca2+ binding site in the 75-loop and the aforementioned Zn2+ inhibition site in the 99-loop. Structure based molecular dynamics simulation corroborated that these loops are connected by a variant of the communication line of the coagulation factors (5MS3 [193]). Significant concerted loop movements upon Ca2+ removal with respect to the crystal structure are depicted as spheres for the 37- (red), 75- (green) and 99-loops (olive).
Figure 6. Various trypsin-like serine proteases with allosteric loop networks. Significant loops are labeled.
6A: Human HtrA1 displays a zymogen-like conformation with a highly disordered region around the 175-loop (L3), since it lacks an N-terminal activating salt bridge. Trimerization and substrate binding induce conformational changes, including the 189- and 220-loops (L1 and L2) and the catalytic triad (3NWU, 3NZI [211]).
6B: Matriptase of the TMPRSS family is a membrane-anchored protease related to enterokinase. A locked zymogen mutant Arg614Ala (c15) displays 3% activity, due to a missing activating salt bridge to the N-terminus, as well as closed 148-, 175-, and 220-loops (transparent surfaces), while the mature protease adopts standard conformations (cartoon representation) (1EAX, 5LYO [227, 293]).
6C: Streptogrisin B from a fungus represents a minimal version of trypsin-like serine proteases with only 186 residues. It contains extremely short loops with respect to trypsin, while only the 189- and 220-loops reach standard lengths (1DS2 [294]).
Figure 7. Sequence alignment for stretches of the loops in selected trypsin-like serine protease domains.
Human trypsin-1 (TRY1) is included with bovine chymotrypsinogen (bCTRA) as numbering reference (underlined residues), while a gap is left for residues 107 to 130. Residues in bold font define the loops according to the reference chymotrypsin or have special functions, such as the catalytic triad residues His57, Asp102 and Ser195 or the specificity determining c189 and c190 residues. Negatively charged residues are shown in red, positively charged ones in blue. Glycosylation sites or sequons are shown in green, while additional metal binding residues for Na+, Ca2+ and Zn2+ are displayed in magenta. Most mature trypsin-like protease domains possess the standard N-terminus with a hydrophobic residue (c16), such as Ile or Val. Blood coagulation factors fVII, fIX, fX, fXI, fXII and thrombin (Thrb) are partially aligned. The nine-residue insertion with an N-glycosylation site in the 60-loop makes thrombin unique among the blood coagulation proteases, since it considerably shields the active site. Insertions in the usually short 37-loop influence substrate binding in the prime side and can enhance the specificity for plasmin (Plasm) as in the plasminogen activators uPA and tPA, which is accompanied by extended 60-loops. Partial protease domain segments of complement factors C1s, C2a, human neutrophil elastase (ELNE), and the human pancreatic elastase PE IIA (CEL2A) are displayed as well as of KLKs 2, 3, 4, 5, 8 and 10. Apart from zymogen forms, KLK10 is only active with a three-residue extension, beginning with Leu13, while KLK13 can be active with a standard c16 N-terminus or with a Leu5.
Figure 8. Overall scheme of the allosteric loop network in trypsin-like serine proteases.
The allosterically interacting N-terminus and loops are colored and labeled for the respective proteases, as well as proteinaceous and ionic cofactors. Several examples of crosstalk between distinct loops have been discovered and confirm that serine proteases are dynamic and flexible molecular machines, with adaptations to their specific tasks. For example, in thrombin, Na+ binding changes the 220- and 189-loop conformations to the fast form, while thrombomodulin binding at the 37-, 60-, and 75-loops alters the substrate specificity (section 2.2.3). In the case of KLK10, a zymogen-like conformation of the N-terminus and the 75- and 148-loops, involving the active site, is capable of binding substrates and most likely rearranges to an active state (see section 2.4.5). Detailed knowledge of the individual allosteric network will facilitate the development of new modulatory compounds, biologics or synthetic molecules, which can be exploited in novel therapeutic approaches.

References

    1. Rawlings ND, Barrett AJ. Evolutionary families of peptidases. Biochem J. 1993;290:205–218.
    2. Matthews BW, Sigler PB, Henderson R, Blow DM. Three-dimensional Structure of Tosyl-α-chymotrypsin. Nature. 1967;214:652–656.
    3. Hartley BS, Neurath H. Homologies in Serine Proteinases [and Discussion]. Phil Trans R Soc B. 1970;257:77–87.
    4. Rawlings ND, Barrett AJ, Thomas PD, Huang X, Bateman A, Finn RD. The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res. 2018;46:D624–D632.
    5. Gorbalenya AE, Donchenko AP, Blinov VM, Koonin EV. Cysteine proteases of positive strand RNA viruses and chymotrypsin-like serine proteases. FEBS Lett. 1989;243:103–114.
