Insect Biochem Mol Biol. 2018 Dec;103:53-69.
doi: 10.1016/j.ibmb.2018.10.006. Epub 2018 Oct 24.

Building a platform for predicting functions of serine protease-related proteins in Drosophila melanogaster and other insects

Xiaolong Cao et al. Insect Biochem Mol Biol. 2018 Dec.

Abstract

Serine proteases (SPs) and serine protease homologs (SPHs) play essential roles in insect physiological processes including digestion, defense and development. Studies of insect genomes, transcriptomes and proteomes have generated a vast amount of information on these proteins, dwarfing the biological data acquired from a few model species. The large number and high diversity of homologous sequences make it a challenge to use the limited functional information for making predictions across a broad taxonomic group of insects. In this work, we have extensively updated the framework of knowledge on the SP-related proteins in Drosophila melanogaster by identifying 52 new SPs/SPHs, classifying the 257 proteins into four groups (CLIP, gut, single- and multi-domain SPs/SPHs), and detecting inherent connections among phylogenetic relationships, genomic locations and expression profiles for 99 of the genes. Information on the existence of specific proteins in eggs, larvae, pupae and adults is presented to facilitate future research. More importantly, we have developed an approach to reveal close homologous or orthologous relationships among SPs/SPHs from D. melanogaster, Anopheles gambiae, Apis mellifera, Manduca sexta, and Tribolium castaneum, thus inspiring functional studies in these and other holometabolous insects. This approach is useful for tackling similar problems on large and diverse protein families in other groups of organisms.

Keywords: Chromosomal location; Clip domain; Expression profiling; Gene duplication; Hemolymph protein; Insect immunity; Phylogenetic analysis; Serine protease cascade.


Figures

Fig. 1.
Domain organization of 66 multi-domain SPs and SPHs in D. melanogaster. Signal peptide and other structural elements (see symbols in inset) were predicted. The schematic diagrams are not drawn to scale.
Fig. 2.
Phylogenetic relationships of the 52 CLIPs (A), 65 GP(H)s (B), and other 140 SP(H)s (including 21 multi-domain proteins lacking clip domain) (C) in D. melanogaster. Complete sequences of the proteins in each group were aligned and a phylogenetic tree was constructed using MrBayes v3.2.6. Probability values for branches are indicated near the branching points, with “*” representing 100%. Those on red background with the capital letters (A–Y) represent branches (probability ≥55%) on the tree, and other branches with lower probabilities are assigned with lower case letters (a–q) on blue background. Group name (C for clip, G for gut, M for other multi-domain, S for other single domain SP(H)), chromosomal location (A–Z, 1–6, and a–z in Fig. 3), tree position (A–Y and a–q in this figure), and expression profile (A–N and a–D in Fig. 4) are used to generate G-C-T-E identification code for each SP-related gene. These IDs, in various colors depending on their categories, are listed before the corresponding systemic names.
Fig. 3.
Chromosomal locations of the 257 genes coding for D. melanogaster SP-related proteins. As indicated by the scale bar, positions of the genes are plotted in proportion on chromosomes, with “+” and “−” indicating positive and negative strands on the left and right, respectively. The G-C-T-E identification code for each SP-related gene is defined in the legend to Fig. 2, indicated along with the systemic name, and linked to its location by a straight line. Adjacent genes with high sequence similarities are grouped by lines in the same color and marked by a series of location IDs (A–Z and 1–6) on red background for different chromosomal segments. Regions in between are labeled with IDs on blue background (a–z).
Fig. 4.
Transcript profiles of the 52 CLIPs (A), 65 GPs/GPHs (B) and 140 SPs/SPHs (including multi-domain SP-related ones) (C) in D. melanogaster. The mRNA levels in 52 kinds of tissue samples, as represented by log2(FPKM+1) values, are shown in the hierarchically clustered gradient heatmap from blue (0) to maroon (≥10). The values of 0–0.49, 0.50–1.49, 1.50–2.49, … 8.50–9.49, 9.50–10.49, 10.50–11.49, … 15.50–16.49 are labeled in the color blocks as 0, 1, 2 … 9, A, B, … G, respectively. Groups of genes with similar expression patterns are shown in different colors. The branches on red background with the capital letters A–N represent reliable assemblies, while other branches are assigned with lower case letters a–D on blue background. Systemic names are listed on the right along with the G-C-T-E IDs (see definition in the legend to Fig. 2). Briefly, in the first position, C (in red) for CLIPs, G (in green) for gut SPs/SPHs, M (in pink) for other multi-domain SP-like proteins, and S (in blue) for other single domain SP(H)s. In the 2nd to 4th positions, chromosomal location (Fig. 3), tree position (Fig. 2), and expression assembly (Fig. 4) are shown in various colors as defined in the corresponding figures. The library names are abbreviated using: W, whole insect; E, embryo; L, larval; preP, prepupal; P, pupal; A, adult; M, male; F, female; h, hour; D, day; G, gut; FB, fat body; ID, imaginal discs; SG, salivary glands; C, carcass; AG-AmM, accessory glands of adult mated male; T, testes; O, ovaries of adult mated female; AvF, adult virgin female; MT, Malpighian tubules.
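The labeling scheme in the Fig. 4 legend amounts to rounding log2(FPKM+1) to the nearest integer and writing that integer as one character (0 to 9, then A to G for 10 to 16). The sketch below illustrates that mapping only; the function name `heatmap_label` and the choice to raise an error above the last listed bin are assumptions, not part of the paper.

```python
import math

# Characters for bins 0..16: digits 0-9, then A-G, as listed in the legend.
LABELS = "0123456789ABCDEFG"

def heatmap_label(fpkm: float) -> str:
    """Return the one-character label for an FPKM value (hypothetical helper)."""
    value = math.log2(fpkm + 1)
    # Bins are centred on integers: 0-0.49 -> 0, 0.50-1.49 -> 1, and so on.
    index = math.floor(value + 0.5)
    if index >= len(LABELS):
        # The legend lists no bin above 15.50-16.49; treating this as an
        # error is an assumption made for this sketch.
        raise ValueError("value lies above the highest bin in the legend")
    return LABELS[index]
```

For example, an FPKM of 1023 gives log2(1024) = 10, which falls in the 9.50–10.49 bin and is labeled A.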
Fig. 5.
Correlations of the D. melanogaster SP-related genes in clusters on chromosome (C), branches of phylogenetic tree (T), and assemblies of expression (E). (A) Venn diagram of numbers of the genes located in clusters, branches, and assemblies of the entire S1A SP/SPH family. (B) Two- and three-way matches of the genes in the C(LIP), G(ut), M(ulti-domain), and S(ingle domain) groups. For each CTE triangle, numbers of 2- (C-T, T-E, or E-C) and 3- (C-T-E) way matched genes are indicated on the sides and center. Each match has at least two genes. For instance, in the G group, 42 C-T, 53 T-E, 37 E-C, and 37 C-T-E matches are found, involving a total of 65 SPs/SPHs. (C) A list of SP/SPH names with C-T-E matches in the C, G, M and S groups. Among the 52 CLIPs, 7 belong to three 3-way matches: cSPH69-231, cSP32-44-59, and cSP26-115. For the G group, there are ten C-T-E matches with 2–6 members in each and 37 proteins in total.
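One way to read the two- and three-way matches in Fig. 5 is as groups of at least two genes that share the same chromosomal cluster (C), tree branch (T), and/or expression assembly (E). The sketch below illustrates that reading only and is not the authors' procedure. The function name `find_matches`, the gene names, and the ID values are all hypothetical, and `None` is assumed to mark a gene that lies outside any cluster, branch, or assembly.

```python
from collections import defaultdict

def find_matches(gene_ids, positions):
    """Group genes that share IDs at the given positions, e.g. "CT" or "CTE".

    gene_ids maps a gene name to a dict with keys "C", "T", "E".
    Only groups with at least two genes are returned, following the legend's
    statement that each match has at least two genes.
    """
    groups = defaultdict(list)
    for gene, ids in gene_ids.items():
        key = tuple(ids[p] for p in positions)
        if None in key:
            continue  # gene is not assigned at one of the requested positions
        groups[key].append(gene)
    return {key: sorted(genes) for key, genes in groups.items() if len(genes) >= 2}

# Hypothetical input, not taken from the paper.
example = {
    "geneA": {"C": "A", "T": "B", "E": "C"},
    "geneB": {"C": "A", "T": "B", "E": "C"},
    "geneC": {"C": "A", "T": "B", "E": "D"},
    "geneD": {"C": None, "T": "F", "E": "D"},
}
ct_matches = find_matches(example, "CT")
cte_matches = find_matches(example, "CTE")
```

With this input, `ct_matches` holds one group of three genes and `cte_matches` one group of two, since geneC differs from the others only in its expression assembly.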
Fig. 6.
Phylogenetic trees of the CLIPAs (A), CLIPBs (B), CLIPCs (C), CLIPDs (D), and CLIPEs (E) in the five insects. Based on the initial analysis of 247 CLIPA–D’s (Fig. S1), entire protein sequences in each subgroup were aligned for building a phylogenetic tree using MrBayes v3.2.6. Probability values are indicated near the branching points, with “*” representing 100 and colored branches representing various sets of potential orthologous genes (probability >80, 3–5 species). The bold branches represent 1:1:1, 1:1:1:1 or 1:1:1:1:1 orthology, except for the set containing closely linked MsHP1a and MsHP1b. As shown in the inset, the protein names are in different colors.
Fig. 7.
Predicted functions of the CLIPs and other multi-domain SPs and SPHs in the insect SP-SPH pathways. (A) The SP cascade that establishes the dorsal-ventral axis of D. melanogaster embryo is used as a template to identify ortholog sets for proposing similar pathways in the other insects. When an ortholog is not identified in the phylogenetic analysis, its closest homolog(s) are listed to assist functional exploration: for example, A. mellifera cSP9/10/14 under cSP26/Snk and T. castaneum cSP136–8 under cSP24/Ea. (B) Members of the SP-SPH network that mediates proPO and proSpätzle-1 activation in M. sexta are presented along with their orthologs or close homologs, as revealed by the serial phylogenetic analyses and domain structure comparison (Fig. 6, Table 2). The protein names are in blue (Dm), red (Ag), black (Ms), green (Tc), and brown (Am) fonts. Circles with a “+” sign represent positive feedback mechanisms, one being auto-activation of proPAP3 by PAP3 (Wang et al., 2014) and the other being indirect activation of proHP6 by PAP1 (dashed arrow) and direct activation of proPAP1 by HP6 (Wang and Jiang, 2008). “?” indicates that the step (i.e. cleavage activation of proHP6 by uncut but active proHP1) is partially established (He et al., 2017).
Fig. 8.
Abundances of the 110 SP-related proteins in D. melanogaster at various developmental stages. Relative protein levels in the 16 egg, 16 larval, 20 pupal, 8 female adult, and 8 male adult samples at different time points, as represented by log2(LFQ/5×10⁶ + 1) values, are shown in the hierarchically clustered gradient heat map from blue (0) to maroon (≥10). The values of 0–0.49, 0.50–1.49, 1.50–2.49, … 8.50–9.49, 9.50–10.49, 10.50–11.49, and 11.50–12.49 are labeled in the color blocks as 0, 1, 2 … 9, A, B, and C, respectively. Proteins identified in two or fewer of the 68 samples are eliminated. Due to high sequence identity, no distinction can be made in SP51-117, SP122-143, SPH195-196b, cSP7-10, cSP4-229, and cSPH69-231 pairs. The datasets or library names are abbreviated using: E, embryo; L, larval; P, pupal; M, male; F, female; L3c, crawling third instar larva; h, hour; D, day.
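The two numerical steps in the Fig. 8 legend, the log transform of LFQ intensities and the removal of rarely detected proteins, can be sketched as follows. This illustrates the stated rules rather than the authors' code. It assumes that the scaling term means division by 5×10⁶ and that "identified" means a non-zero LFQ value in a sample; the function names are hypothetical.

```python
import math
from typing import Sequence

def lfq_to_heatmap_value(lfq: float) -> float:
    """Transform an LFQ intensity as log2(LFQ / 5e6 + 1).

    Assumes the legend's scaling term is a division by 5 x 10^6.
    """
    return math.log2(lfq / 5e6 + 1)

def keep_protein(lfq_by_sample: Sequence[float], min_samples: int = 3) -> bool:
    """Keep a protein only if it is detected in at least min_samples samples.

    The legend drops proteins identified in two or fewer of the 68 samples,
    so the default threshold is 3. "Detected" is taken here to mean a
    non-zero LFQ value, which is an assumption.
    """
    detected = sum(1 for value in lfq_by_sample if value > 0)
    return detected >= min_samples
```

Under these assumptions, an LFQ intensity of 5×10⁶ maps to a heat map value of 1, and a protein with non-zero LFQ in only two samples is dropped.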

References

    Akam ME, Carlson JR, 1985. The detection of Jonah gene transcripts in Drosophila by in situ hybridization. EMBO J. 4, 155–161.
    An C, Budd A, Kanost MR, Michel K, 2011. Characterization of a regulatory unit that controls melanization and affects longevity of mosquitoes. Cell. Mol. Life Sci. 68, 1929–1939.
    An C, Ishibashi J, Ragan E, Jiang H, Kanost MR, 2009. Functions of Manduca sexta hemolymph proteinases HP6 and HP8 in two innate immune pathways. J. Biol. Chem. 284, 19716–19726.
    An C, Jiang H, Kanost MR, 2010. Proteolytic activation and function of the cytokine Spätzle in innate immune response of a lepidopteran insect, Manduca sexta. FEBS J. 277, 148–162.
    An C, Zhang M, Chu Y, Zhao Z, 2013. Serine protease MP2 activates prophenoloxidase in the melanization immune response of Drosophila melanogaster. PLoS One 8, e79533.
