Biomolecules. 2022 Jun 7;12(6):793.
doi: 10.3390/biom12060793.

The Repeating, Modular Architecture of the HtrA Proteases

Matthew Merski et al. Biomolecules.

Abstract

A conserved, 26-residue sequence [AA(X2)[A/G][G/L](X2)GDV[I/L](X2)[V/L]NGE(X1)V(X6)] and a corresponding repeating structural module were identified within the HtrA protease family using a non-redundant set (N = 20) of publicly available structures. Although the repeats were far from perfect in sequence, their conservation was statistically significant. Three or more repetitions were identified within each protein, even though the repeat would be statistically expected to occur by chance only once per 1031 residues. This sequence repeat was associated with a six-stranded antiparallel β-barrel module, two of which are present in the core of the structures of the PA clan of serine proteases, while a modified version of this module could be identified in the PDZ-like domains. Automated structural alignment methods had difficulty superimposing these β-barrels, but using a human HtrA2 structure as the target showed that these modules had an RMSD across the set of structures of less than 2 Å (both mean and median). Our findings support Dayhoff's hypothesis that complex proteins arose through duplication of simpler peptide motifs and domains.

Keywords: HtrA protease; PA clan; protein evolution; protein repeat; serine protease.
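The canonical repeat quoted in the abstract can be read as a position-by-position template: 13 constrained positions (four of which accept either of two residues) and 13 unconstrained "X" positions. The sketch below is one possible way to count how many constrained positions a 26-residue window matches. It is only an illustration, not the scoring or statistical procedure used by the authors; the names `CANONICAL`, `count_matches`, and `scan`, and the `min_matches` threshold, are hypothetical.

```python
# Canonical 26-residue repeat from the abstract:
# AA(X2)[A/G][G/L](X2)GDV[I/L](X2)[V/L]NGE(X1)V(X6)
# Each entry is the set of residues accepted at that position; None means any residue (X).
CANONICAL = (
    [{"A"}, {"A"}] + [None] * 2
    + [{"A", "G"}, {"G", "L"}] + [None] * 2
    + [{"G"}, {"D"}, {"V"}, {"I", "L"}] + [None] * 2
    + [{"V", "L"}, {"N"}, {"G"}, {"E"}] + [None]
    + [{"V"}] + [None] * 6
)


def count_matches(window):
    """Count the constrained positions of a 26-residue window that match the template."""
    if len(window) != len(CANONICAL):
        raise ValueError("window must be 26 residues long")
    return sum(
        1
        for residue, allowed in zip(window, CANONICAL)
        if allowed is not None and residue in allowed
    )


def scan(sequence, min_matches):
    """Return (start_index, matches) for each window reaching min_matches.

    min_matches is an illustrative cutoff chosen by the caller, not a value from the paper.
    """
    hits = []
    for start in range(len(sequence) - len(CANONICAL) + 1):
        matches = count_matches(sequence[start:start + len(CANONICAL)])
        if matches >= min_matches:
            hits.append((start, matches))
    return hits
```

Because the repeats are described as far from sequence perfect, a count of matching positions (rather than an exact pattern match) is the more natural fit here; how many matches should count as a hit is left to the caller.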

Conflict of interest statement

The authors declare no conflict of interest.

Figures

Figure 1
An illustration of Dayhoff’s hypothesis about the origin of proteins [9,17]. From left to right: starting from individual, spontaneously formed amino acids in the Archaean seas, short oligopeptides formed spontaneously, then organized into homogeneous complexes, and eventually fused into a single transcript module, probably after being encoded in the genome. Duplication and repetition of these modules, along with drift in their sequence and function, eventually gave rise to complex, globular proteins.
Figure 2
The active site in the HtrA proteases is split between the modules. Cartoon diagram of human HtrA2 (PDB ID 5m3n [26]) showing the N-terminal protease (blue), C-terminal protease (cyan), and PDZ-like (green) modules. The catalytic triad residues His198, Asp228, and Ser306 are shown as sticks with light grey carbon, blue nitrogen, and red oxygen atoms. Residues that correspond to conserved canonical repeat residues are indicated in purple (Figure S1).
Figure 3
Identification of the sequence repeats in the HtrA proteases. The canonical sequence [AA(X2)[A/G][G/L](X2)GDV[I/L](X2)[V/L]NGE(X1)V(X6)] is shown on top in bold, with an additional residue shown for the four positions which have two possible canonical residues. Residues that match the canonical sequence are highlighted in green. The PDB ID, species, and protein name are given along with the module in which the sequence is located. When a module has two copies of the sequence repeat, the most N-terminal is denoted as A and the other as B. The beginning sequence position (using the PDB numbering) is to the left of the sequence while the ending position is given to the right of the sequence.
Figure 4
Cartoon diagram of the modules from HhoA, an HtrA protease from Synechocystis sp. PCC 6803 (PDB ID 5t69) showing the conserved structures of the HtrA modules (RMSD to PDB ID 7co3: mean = 1.748 Å, median = 1.816 Å). Strands are colored orange, yellow, green, blue, and magenta in order from N to C in the protease modules and in the equivalent spatial position in the PDZ-like module. Helices are colored red and coil regions are white. Top-down views of the (A) N-terminal protease module, (B) C-terminal protease module, (C) PDZ-like module and (D) all three modules superimposed. Side views of the (E) N-terminal protease module, (F) C-terminal protease module, (G) PDZ-like module, and (H) all three modules superimposed.
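The RMSD values quoted in the Figure 4 caption summarize how far equivalent atoms lie from each other after two modules are superimposed. As a minimal sketch of the quantity itself, the function below computes the RMSD between two lists of paired 3D coordinates. It assumes the structures have already been superimposed and that atoms are paired one-to-one; it does not perform the superposition, and it is not the authors' alignment pipeline. The function name `rmsd` and the toy coordinates in the usage note are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, pre-superimposed (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For example, two atoms each displaced by 1 Å along z, `rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])`, give an RMSD of 1.0 Å. Per-structure values computed this way could then be summarized with `statistics.mean` and `statistics.median`, matching the mean and median reported in the caption.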

References

    1. Radó-Trilla N., Albà M. Dissecting the role of low-complexity regions in the evolution of vertebrate proteins. BMC Evol. Biol. 2012;12:155. doi: 10.1186/1471-2148-12-155.
    2. Andrade M.A., Perez-Iratxeta C., Ponting C.P. Protein repeats: Structures, functions, and evolution. J. Struct. Biol. 2001;134:117–131. doi: 10.1006/jsbi.2001.4392.
    3. Espada R., Parra R.G., Sippl M.J., Mora T., Walczak A.M., Ferreiro D.U. Repeat proteins challenge the concept of structural domains. Biochem. Soc. Trans. 2015;43:844–849. doi: 10.1042/BST20150083.
    4. Kajava A.V. Tandem repeats in proteins: From sequence to structure. J. Struct. Biol. 2011;179:279–288. doi: 10.1016/j.jsb.2011.08.009.
    5. Burley S.K., Berman H.M., Bhikadiya C., Bi C.X., Chen L., Di Costanzo L., Christie C., Dalenberg K., Duarte J.M., Dutta S., et al. RCSB Protein Data Bank: Biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res. 2019;47:D464–D474. doi: 10.1093/nar/gky1004.