STAR Protoc. 2023 Mar 17;4(1):101905.

doi: 10.1016/j.xpro.2022.101905. Epub 2022 Dec 17.

Prediction and verification of glycosyltransferase activity by bioinformatics analysis and protein engineering

Dietlind L Gerloff¹, Elena I Ilina², Camille Cialini², Uxue Mata Salcedo², Michel Mittelbronn³, Tanja Müller⁴

Affiliations

¹ Foundation for Applied Molecular Evolution (FfAME), Alachua, FL 32615, USA.
² Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg.
³ Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg; National Center of Pathology (NCP), Laboratoire National de Santé (LNS), 3555 Dudelange, Luxembourg; Department of Life Sciences and Medicine (DLSM), University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg; Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg; Faculty of Science, Technology and Medicine (FSTM), University of Luxembourg, 4365 Esch-sur-Alzette, Luxembourg.
⁴ Department of Cancer Research (DoCR), Luxembourg Institute of Health (LIH), 1526 Luxembourg, Luxembourg; Luxembourg Centre of Neuropathology (LCNP), 1526 Luxembourg, Luxembourg. Electronic address: tanja.mueller@lih.lu.

PMID: 36528856
PMCID: PMC9792956
DOI: 10.1016/j.xpro.2022.101905

Dietlind L Gerloff et al. STAR Protoc. 2023.

Abstract

Many proteins remain annotated as functionally uncharacterized. In this protocol, we describe how to use protein family multiple sequence alignments and structural bioinformatics resources to design loss-of-function mutations in previously uncharacterized proteins of the glycosyltransferase family. We detail approaches for locating the active site of a target protein by three-dimensional modeling. We then generate active-site mutants and quantify changes in enzymatic function with a glycosyltransferase assay. With modifications, this protocol could be applied to other metal-dependent enzymes. For complete details on the use and execution of this protocol, please refer to Ilina et al. (2022).¹

Keywords: Bioinformatics; Molecular Biology; Protein Biochemistry; Protein expression and purification; Sequence analysis.
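The alignment-based part of the design idea can be illustrated with a minimal sketch. It assumes a toy, hand-made alignment and a simple rule (fully conserved Asp, Glu, or His columns as candidate metal ligands); the sequences, the function names `column_conservation` and `candidate_positions`, the residue set, and the threshold are all illustrative assumptions, not the authors' procedure, which relies on a curated family alignment and three-dimensional modeling.

```python
from collections import Counter

def column_conservation(msa):
    """Return, per alignment column, (most common character, fraction of sequences carrying it)."""
    if len({len(seq) for seq in msa}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*msa):
        residue, count = Counter(column).most_common(1)[0]
        result.append((residue, count / len(column)))
    return result

def candidate_positions(msa, target_index=0, residues="DEH", threshold=1.0):
    """List (target residue number, residue) for conserved columns holding
    residue types that commonly coordinate metal ions (assumed set: D, E, H)."""
    target = msa[target_index]
    hits = []
    residue_number = 0  # 1-based numbering in the ungapped target sequence
    for column_index, (residue, fraction) in enumerate(column_conservation(msa)):
        if target[column_index] != "-":
            residue_number += 1
        if residue in residues and fraction >= threshold and target[column_index] == residue:
            hits.append((residue_number, residue))
    return hits

# Hypothetical toy alignment; the first sequence plays the role of the target.
toy_msa = [
    "MK-DADVLA",
    "MRQDGDILA",
    "MKSDSDVIA",
]
print(candidate_positions(toy_msa))
```

Positions returned by such a filter are only starting points; whether a conserved residue actually lies in the active site still has to be checked against a structural model before a mutant is designed.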

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

**Figure 1**
Schematic illustration of analysis described in major steps 1–4 (steps 1–15)

**Figure 2**
Phylogenetic consistency checking (step 10). (A) Example of a very well-balanced tree that is consistent with species evolution; it was derived from the MSA we used to design GLT8D1 mutants. No further corrections are required. ***Note:*** Minor topological differences, like those between the GLT8D1 and GLT8D2 mammalian groups, are common and tolerated because the amount of sequence variation is insufficient to expect stable, accurate positions in these subtrees. (B) An illustrative example constructed with Glycogenin (GYG) sequences. Arrows mark inconsistencies in the original tree. Together with MSA inspection, these could be resolved by removing a single (likely erroneous) sequence (Bmaa) and by correcting a mislabeled paralog specification (Tnig1|2). Inconsistencies may point to erroneous sequences, misalignments, or mislabeling (they could also be caused by exceptional evolutionary rates, but this is rare). ***Note:*** Examining the target phylogeny will rarely turn up new errors if good sequence resources were used to generate the MSA and if sequences were inspected in MSA context (steps 2–9). Even then, gaining an overview of the protein family in this way is recommended as an informative scientific best practice.

**Figure 3**
Bridging undefined template regions (step 13d). (A) Modification of the target-template alignment for modeling human GLT8D1 on human Glycogenin-1 (PDB:3QVB). Coordinates between P126 and P129 (ends are 4.8 Å apart) are resolved in the template but extremely variable, as superposition reveals. To treat this as if the connection were undefined, G127-W128 are de-matched in addition to the protocol instructions (blue font). In the resulting model, the (Gly)₃ bridge (red font) will align with these 2 residues in the template and replace 37 residues in the target that cannot be modeled (D201-S237, red font). It marks their insertion site. ***Note:*** (1) Alternatively, G127-W128 could be deleted manually from the coordinate file for 3QVB, and SWISS-MODEL run with this user-provided template and a target-template alignment modified exactly as per step 14d. (2) This region is known in GT-A glycosyltransferases for its conformational diversity between families (“HV2”). (B) 3D close-up showing the resulting model (lilac/red) with two template structures (blue, green). Part of the structure is removed to emphasize the conserved metal site (black stick representation; numbering is for GLT8D1).

**Figure 4**
Schematic illustration of experiments described in major step 5 (steps 16–18)

**Figure 5**
Regulatory elements included in our expression plasmid that are useful for increased protein expression. The human elongation factor promoter (EF1α) is known to drive high and efficient gene expression; the woodchuck hepatitis virus posttranscriptional regulatory element (WPRE) is a DNA sequence that, upon transcription, creates a tertiary structure enhancing insert expression; the SV40 poly-A sequence promotes transcript stability; the origin of replication (ORI) from SV40 enables the construct to be replicated in cells that express the SV40 large T antigen, such as the 293T cells used in the described protocol.

**Figure 6**
Schematic illustration of experiments described in major step 6 (steps 19–23)

**Figure 7**
Exemplary set-up of a 96-well plate for luminescence measurement after the glycosyltransferase activity assay (step 23). Each peptide reaction at the respective substrate concentration is applied in duplicate. Reactions incubated without peptide but with varying substrate concentrations serve as the no-peptide controls required for later quantification. The UDP standard, ranging from 0 μM to 25 μM, is also applied to the plate in duplicate.

**Figure 8**
Workflow: determination of the substrate turnover rate based on the results of the glycosyltransferase assay (step 23)
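
The quantification implied by the plate layout (duplicate wells, a UDP standard series, and no-peptide controls) can be sketched as follows. This is a minimal illustration that assumes a linear relationship between luminescence and UDP concentration over the standard range; all readings are invented numbers, the helper names `fit_line` and `udp_from_luminescence` are hypothetical, and the authors' actual workflow may differ.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def udp_from_luminescence(sample_rlu, control_rlu, slope):
    """Convert duplicate readings to UDP (uM) after subtracting the no-peptide control.

    The intercept of the standard curve cancels because it is contained in both
    the sample and the control reading, so only the slope is needed.
    """
    net_signal = mean(sample_rlu) - mean(control_rlu)
    return net_signal / slope

# Hypothetical UDP standard (uM) with duplicate luminescence readings (RLU).
standard_um = [0, 5, 10, 25]
standard_rlu = [(500, 500), (5400, 5600), (10500, 10500), (25600, 25400)]

slope, intercept = fit_line(standard_um, [mean(pair) for pair in standard_rlu])

# Hypothetical duplicate wells: reaction with peptide vs. matching no-peptide control.
udp_um = udp_from_luminescence((8400, 8600), (2400, 2600), slope)
print(slope, intercept, udp_um)
```

The control should be the no-peptide reaction run at the same substrate concentration as the sample, since the plate layout pairs each substrate concentration with its own control.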

References

1. Ilina E.I., Cialini C., Gerloff D.L., Duarte Garcia-Escudero M., Jeanty C., Thézénas M.L., Lesur A., Puard V., Bernardin F., Moter A., et al. Enzymatic activity of glycosyltransferase GLT8D1 promotes human glioblastoma cell migration. iScience. 2022;25:103842. doi: 10.1016/j.isci.2022.103842.
2. Robin X., Haas J., Gumienny R., Smolinski A., Tauriello G., Schwede T. Continuous Automated Model EvaluatiOn (CAMEO)-perspectives on the future of fully automated evaluation of structure prediction methods. Proteins. 2021;89:1977–1986. doi: 10.1002/prot.26213.
3. UniProt Consortium. UniProt: a worldwide hub of protein knowledge. Nucleic Acids Res. 2019;47:D506–D515. doi: 10.1093/nar/gky1049.
4. Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. The protein data bank. Nucleic Acids Res. 2000;28:235–242. doi: 10.1093/nar/28.1.235.
5. Chang A., Jeske L., Ulbrich S., Hofmann J., Koblitz J., Schomburg I., Neumann-Schaal M., Jahn D., Schomburg D. BRENDA, the ELIXIR core data resource in 2021: new developments and updates. Nucleic Acids Res. 2021;49:D498–D508. doi: 10.1093/nar/gkaa1025.
