Hum Genet. 2022 Nov;141(11):1705-1722.
doi: 10.1007/s00439-022-02435-y. Epub 2022 Feb 5.

Evolutionary history of type II transmembrane serine proteases involved in viral priming

Diego Forni et al. Hum Genet. 2022 Nov.

Abstract

Type II transmembrane serine proteases (TTSPs) are a family of trypsin-like membrane-anchored serine proteases that play key roles in the regulation of some crucial processes in physiological conditions, including cardiac function, digestion, cellular iron homeostasis, epidermal differentiation, and immune responses. However, some of them, in particular TTSPs expressed in the human airways, were identified as host factors that promote the proteolytic activation and spread of respiratory viruses such as influenza virus, human metapneumovirus, and coronaviruses, including SARS-CoV-2. Given their involvement in viral priming, we hypothesized that members of the TTSP family may represent targets of positive selection, possibly as the result of virus-driven pressure. Thus, we investigated the evolutionary history of sixteen TTSP genes in mammals. Evolutionary analyses indicate that most of the TTSP genes that have a verified role in viral proteolytic activation present signals of pervasive positive selection, suggesting that viral infections represent a selective pressure driving the evolution of these proteases. We also evaluated genetic diversity in human populations and we identified targets of balancing selection in TMPRSS2 and TMPRSS4. This scenario may be the result of an ancestral and still ongoing host-pathogen arms race. Overall, our results provide evolutionary information about candidate functional sites and polymorphic positions in TTSP genes.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1
Evolutionary Analysis of TTSPs in Mammals. A Schematic representation of Type II Transmembrane Serine Proteases (TTSPs). TTSPs are grouped into four subfamilies based on domain composition and phylogenetic analyses (Szabo and Bugge 2008). The names and locations of the domains in the protein sequences are taken from the UniProt database (https://www.uniprot.org/). LDL-receptor class A: Low-density lipoprotein receptor domain class A; SRCR: Scavenger receptor Cys-rich; SEA: sea urchin sperm protein, enterokinase, agrin; CUB: C1r, C1s, uEGF, and bone morphogenetic protein; MAM: domain in meprin, A5, receptor protein tyrosine phosphatase mu; FZ: Frizzled. Red arrowheads indicate the positions of positively selected sites (see Table 1). The CORIN regions obtained on the basis of the recombination breakpoints are also reported. Viruses that are targets of proteolytic activation by TTSPs are also reported (see Supplementary Table S2). B Comparison of dN/dS values. Boxplot comparison of average dN/dS values calculated for TTSP genes. p values from the Wilcoxon rank sum test and average dN/dS values for each gene are reported. TTSPs involved (red) or not involved (blue) in the activation of viral glycoproteins are indicated. C Branch-site analysis of positive selection. aBS-REL analysis for TMPRSS2 in mammals identified one branch showing evidence of positive selection (pink box). Branch lengths are scaled to the expected number of substitutions per nucleotide. The p value (red) from aBS-REL is reported. See Supplementary Table 1 for the full name of the species reported in the tree. See also Figure S1.
Fig. 2
Positive selection acting on TTSP serine protease domains. Cartoon 3D representation of the serine protease domain for TMPRSS11E (a), TMPRSS15 (b), and TMPRSS2 (c). Positively selected sites are colored in red and refer to the human protein; the catalytic triad (Ser-His-Asp) is shown in blue. TMPRSS11E, TMPRSS15, and TMPRSS2 serine protease domain structures are derived from the Protein Data Bank (PDB) IDs 2OQ5, 4DGJ, and 7MEQ, respectively. For all other TTSP domains with positively selected sites (see Fig. 1A and Table 1), no 3D structures were available in the PDB.
Fig. 3
Human population genetic analysis of TMPRSS2. A Screenshot from the UCSC genome browser (http://genome.ucsc.edu/, GRCh37/hg19). The panel shows the analyzed genomic region (chr21:42816232–42900085), polymorphic variants associated with COVID-19 and with potential functional effects (Asselta et al.; Piva et al.; Vargas-Alarcón et al. 2020), as well as a track with genetic variants likely affecting proximal gene expression in human tissues from the Genotype-Tissue Expression project (GTEx, V6 data release). The eQTL item color indicates the effect size attributed to the eQTL: red, high positive; light red, moderate positive; light blue, moderate negative; blue, high negative. B Nucleotide diversity analysis in human populations. Data from the 1000 Genomes Project were used to calculate θW and π in sliding windows of 5 kb moving with a step of 250 bp. Horizontal lines represent the 95th percentiles in the distribution of θW (blue) and π (red). Vertical dashed lines indicate the start and the end of TMPRSS2. Regions with θW and π greater than the 95th percentile in at least two populations are shaded in gray.
Fig. 4
Human population genetic analysis of TMPRSS4. A Screenshot from the UCSC genome browser (http://genome.ucsc.edu/, GRCh37/hg19). The panel shows the analyzed genomic region (chr11:117927793–118012605) and a track with genetic variants likely affecting proximal gene expression in human tissues from the Genotype-Tissue Expression project (GTEx, V6 data release). The eQTL item color indicates the effect size attributed to the eQTL: red, high positive; light red, moderate positive; light blue, moderate negative; blue, high negative. B Nucleotide diversity analysis in human populations. Data from the 1000 Genomes Project were used to calculate θW and π in sliding windows of 5 kb moving with a step of 250 bp. Horizontal lines represent the 95th percentiles in the distribution of θW (blue) and π (red). Vertical dashed lines indicate the start and the end of TMPRSS4. Regions with θW and π greater than the 95th percentile in at least two populations are shaded in gray.
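
The captions for Fig. 3B and Fig. 4B describe θW and π computed in sliding windows of 5 kb with a 250 bp step. The following is a minimal sketch of how such window estimates are commonly obtained from biallelic allele counts; it is not the authors' pipeline. The function names, the input format (a list of position and alternate-allele-count pairs), and the per-site normalization by window length are all assumptions made for illustration.

```python
from typing import List, Tuple


def harmonic_number(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i, the denominator of Watterson's estimator."""
    return sum(1.0 / i for i in range(1, n))


def sliding_window_diversity(
    sites: List[Tuple[int, int]],  # hypothetical input: (position, alternate allele count)
    n: int,                        # number of sampled chromosomes
    start: int,
    end: int,
    win: int = 5000,               # window size from the caption (5 kb)
    step: int = 250,               # step size from the caption (250 bp)
) -> List[Tuple[int, float, float]]:
    """Return (window_start, theta_W, pi) per window, both scaled per site.

    Windows are half-open intervals [w, w + win); only windows that fit
    entirely inside [start, end] are reported (an assumption of this sketch).
    """
    a_n = harmonic_number(n)
    n_pairs = n * (n - 1) / 2
    results = []
    w = start
    while w + win <= end:
        # Keep only segregating sites (0 < count < n) that fall in the window.
        counts = [k for pos, k in sites if w <= pos < w + win and 0 < k < n]
        # Watterson's theta: number of segregating sites divided by a_n.
        theta_w = len(counts) / a_n / win
        # pi: mean number of pairwise differences, summed over sites.
        pi = sum(k * (n - k) / n_pairs for k in counts) / win
        results.append((w, theta_w, pi))
        w += step
    return results
```

With the default arguments, a call such as `sliding_window_diversity(sites, n, 42816232, 42900085)` would scan the TMPRSS2 region given in Fig. 3A. Windows whose θW and π both exceed the 95th percentile of a reference distribution could then be flagged, as the captions describe.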

References

    1. 1000 Genomes Project Consortium. Durbin RM, Abecasis GR, Altshuler DL, Auton A, Brooks LD, Durbin RM, Gibbs RA, Hurles ME, McVean GA. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–1073. doi: 10.1038/nature09534.
    2. Alshahawey M, Raslan M, Sabri N. Sex-mediated effects of ACE2 and TMPRSS2 on the incidence and severity of COVID-19; the need for genetic implementation. Curr Res Transl Med. 2020;68:149–150. doi: 10.1016/j.retram.2020.08.002.
    3. Anisimova M, Bielawski JP, Yang Z. Accuracy and power of Bayes prediction of amino acid sites under positive selection. Mol Biol Evol. 2002;19:950–958. doi: 10.1093/oxfordjournals.molbev.a004152.
    4. Anisimova M, Nielsen R, Yang Z. Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003;164:1229–1236. doi: 10.1093/genetics/164.3.1229.
    5. Antalis TM, Buzza MS, Hodge KM, Hooper JD, Netzel-Arnett S. The cutting edge: membrane-anchored serine protease activities in the pericellular microenvironment. Biochem J. 2010;428:325–346. doi: 10.1042/BJ20100046.