Nat Commun. 2022 Apr 7;13(1):1900.
doi: 10.1038/s41467-022-29530-y.

Glyco-Decipher enables glycan database-independent peptide matching and in-depth characterization of site-specific N-glycosylation

Zheng Fang et al. Nat Commun. 2022.

Abstract

Glycopeptides with unusual glycans or poor peptide backbone fragmentation in tandem mass spectrometry are unaccounted for in typical site-specific glycoproteomics analysis and thus remain unidentified. Here, we develop a glycoproteomics tool, Glyco-Decipher, to address these issues. Glyco-Decipher conducts glycan database-independent peptide matching and exploits the fragmentation pattern of shared peptide backbones in glycopeptides to improve the spectrum interpretation. We benchmark Glyco-Decipher on several large-scale datasets, demonstrating that it identifies more peptide-spectrum matches than Byonic, MSFragger-Glyco, StrucGP and pGlyco 3.0, with a 33.5%-178.5% increase in the number of identified glycopeptide spectra. The database-independent and unbiased profiling of attached glycans enables the discovery of 164 modified glycans in mouse tissues, including glycans with chemical or biological modifications. By enabling in-depth characterization of site-specific protein glycosylation, Glyco-Decipher is a promising tool for advancing glycoproteomics analysis in biological research.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Workflow of Glyco-Decipher.
Glyco-Decipher contains three modules: (1) Glycan database-independent peptide matching. The in silico deglycosylated spectra are searched against the protein database without setting any glycans as modifications, which determines the peptide backbone for glycopeptide spectra with rich peptide fragment ions. Then, the fragmentation patterns of the peptide backbones are extracted and utilized to match spectra that remain unannotated. This step, termed “spectrum expansion”, enables the identification of peptide backbones of glycopeptide spectra with poor peptide fragmentation. (2) Glycan annotation. The mass of the glycan part is precisely derived by the mass difference between the precursor and the peptide backbone. The mass profile of glycans in a system is constructed without the use of glycan databases. For glycan annotation, the experimental B/Y ions in glycopeptide spectra are first matched to their theoretical fragment ions of database glycans to identify glycans. For glycans that do not match any database entries, Glyco-Decipher performs monosaccharide stepping to reveal the composition of modified glycans and potential modification moiety on them. (3) Quantification. The quantification module based on the elution profiles of glycopeptides is embedded in Glyco-Decipher and allows the computation of the abundance distributions of site-specific glycans.
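The glycan mass derivation in module (2) can be illustrated with a minimal sketch. It assumes unmodified residues (no carbamidomethylation or other peptide modifications), monoisotopic masses, and a precursor given as m/z plus charge; the function names are hypothetical and are not taken from Glyco-Decipher.

```python
# Illustrative sketch only: glycan mass = neutral precursor mass - peptide backbone mass.
# Function names are hypothetical; peptide modifications are ignored (assumption).
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da

# Monoisotopic residue masses of the 20 standard amino acids, Da
RESIDUE = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER


def neutral_mass(mz, charge):
    """Neutral mass of a protonated precursor observed at m/z with the given charge."""
    return (mz - PROTON) * charge


def glycan_mass(precursor_mz, charge, sequence):
    """Mass of the glycan part as precursor mass minus peptide backbone mass."""
    return neutral_mass(precursor_mz, charge) - peptide_mass(sequence)


# Round trip with the peptide backbone from Fig. 2 carrying Hex(9)HexNAc(2)
backbone = "MHLNGSNVQVLHR"
hex9_hexnac2 = 9 * 162.052824 + 2 * 203.079373
mz_3plus = (peptide_mass(backbone) + hex9_hexnac2) / 3 + PROTON
print(round(glycan_mass(mz_3plus, 3, backbone), 4))
```

Because the glycan mass is obtained purely by subtraction, no glycan database is needed at this step; a database is only consulted afterwards to annotate the derived mass.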
Fig. 2. Peptide fragmentation patterns improve spectrum interpretation and glycopeptide identification.
a Tandem mass spectrum example of glycopeptide “MHLNGSNVQVLHRLTIR-Hex(9)HexNAc(2)” (top) and the in silico deglycosylation result of the spectrum (bottom). Blue: b ions of peptide; red: y ions of peptide; purple: b/y ions with HexNAc residue; green: B ions; orange: Y ions of glycan with intact peptide backbone attached. b Fragmentation patterns of the peptide backbone “MHLNGSNVQVLHR” modified by different glycans and/or with different precursor charge states in six glycopeptide spectra. c Distributions of similarities between the peptide fragmentation pattern in each peptide-spectrum match (PSM) from in silico deglycosylation and the averaged pattern of the corresponding peptide backbone. The quartiles of the distributions are indicated by inner dashed lines. The medians and the spectrum numbers are labeled in the plot. The similarity value of 0.9 is indicated by an outer dashed line. Source data are provided as a Source Data file. d Distribution of matching scores for the peptide backbone “MHLNGSNVQVLHR” with glycopeptide spectra across the entire data acquisition time window. Note that the scores of PSMs from in silico deglycosylation were recalculated to take the peptide fragmentation pattern into account. Inset: The score distribution of PSMs after score filtration and core structure peak matching in spectrum expansion. The dashed line indicates the PSM score threshold of 52.75 derived from the e-value filtration method. Green box: target PSMs obtained by in silico deglycosylation; blue circle: target PSMs obtained in spectrum expansion; orange cross: decoy PSMs generated in spectrum expansion for quality control. e Comparison of identifications before (pale blue) and after (blue) spectrum expansion. The glycan part was matched with the GlyTouCan database to obtain glycopeptide-spectrum match (GPSM, PSM with definite glycan composition) identifications.
Site-specific glycans were classified into three categories based on the glycan composition: truncated glycans (Hex(<4)HexNAc(<3)Fuc(<2)), oligo-mannose glycans (Hex(>3)HexNAc(2)Fuc(<2)) and complex/hybrid glycans. f Performance comparison between Glyco-Decipher and MSFragger (V3.1.1) in open search mode. Green: the consistently identified spectra. Orange: the spectra commonly identified, but matched to different peptides. Gray: spectra specifically matched by MSFragger.
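The three-way classification stated in this caption can be written out as a short sketch. The thresholds come directly from the caption; the function name and the `other` argument (counting any monosaccharide besides Hex, HexNAc and Fuc, which is treated here as complex/hybrid) are assumptions made for illustration.

```python
# Illustrative sketch of the composition-based categories from the Fig. 2 caption.
# The handling of monosaccharides other than Hex/HexNAc/Fuc is an assumption.
def classify_glycan(hex_n, hexnac_n, fuc_n=0, other=0):
    """Assign a glycan composition to one of three categories."""
    if other == 0 and fuc_n < 2:
        # Truncated: Hex(<4)HexNAc(<3)Fuc(<2)
        if hex_n < 4 and hexnac_n < 3:
            return "truncated"
        # Oligo-mannose: Hex(>3)HexNAc(2)Fuc(<2)
        if hex_n > 3 and hexnac_n == 2:
            return "oligo-mannose"
    # Everything else falls into the complex/hybrid category
    return "complex/hybrid"


print(classify_glycan(9, 2))           # Hex(9)HexNAc(2)
print(classify_glycan(3, 2, fuc_n=1))  # Hex(3)HexNAc(2)Fuc(1)
print(classify_glycan(5, 4, fuc_n=1))  # Hex(5)HexNAc(4)Fuc(1)
```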
Fig. 3. Elucidation of the modification moieties on glycans.
a Mass histogram of glycans in glycopeptides of the mouse liver dataset. The glycan masses were obtained by the glycan database-independent pipeline in Glyco-Decipher. Blue: PSMs with glycan mass that matched GlyTouCan glycans; Orange: PSMs with glycan mass that could not be annotated by GlyTouCan glycans. The bin width was set to 1 Da and centered at integer values. b A monosaccharide stepping method was designed for the deduction of the composition and the mass of modification moiety on modified glycans. c Percentage of PSMs with glycan parts annotated by database glycans and modified glycans. d Mass values of the ten most abundant modification moieties on glycans from glycopeptide spectra of the five mouse tissues. e Mass profiles of the modification moieties with known chemical compositions on modified glycans. f Examples of mass profiles of glycan modification moieties with unannotated composition. Source data are provided as a Source Data file.
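The idea behind monosaccharide stepping (panel b) can be sketched in a heavily simplified form. This is not the tool's algorithm: it assumes Y ions have already been converted to neutral masses, uses a hypothetical fixed tolerance of 0.02 Da, considers only five monosaccharides, and simply walks from the peptide mass through peaks spaced by monosaccharide masses. Whatever part of the glycan mass is not explained by the longest walk is reported as the candidate modification moiety.

```python
# Simplified illustration of monosaccharide stepping; NOT Glyco-Decipher's implementation.
# Tolerance, monosaccharide set and function name are assumptions.
MONO = {
    "Hex": 162.052824, "HexNAc": 203.079373, "Fuc": 146.057909,
    "NeuAc": 291.095417, "NeuGc": 307.090331,
}


def monosaccharide_stepping(y_masses, peptide_mass, glycan_mass, tol=0.02):
    """Return (composition, residual mass) for the heaviest chain of Y ions found.

    y_masses: neutral masses of observed Y ions (peptide + partial glycan).
    """
    best_comp, best_total = {}, 0.0
    stack = [(peptide_mass, {})]
    while stack:
        mass, comp = stack.pop()
        total = sum(MONO[name] * n for name, n in comp.items())
        if total > best_total:
            best_comp, best_total = comp, total
        for name, delta in MONO.items():
            target = mass + delta
            # Step forward only if a peak supports the next monosaccharide
            if any(abs(peak - target) <= tol for peak in y_masses):
                extended = dict(comp)
                extended[name] = extended.get(name, 0) + 1
                stack.append((target, extended))
    return best_comp, glycan_mass - best_total


# Synthetic example: HexNAc(2)Hex(3) core plus an unexplained 79.966331 Da moiety
pep = 1000.0
ladder = [pep + 203.079373, pep + 406.158746, pep + 568.211570,
          pep + 730.264394, pep + 892.317218]
composition, residual = monosaccharide_stepping(ladder, pep, 892.317218 + 79.966331)
print(composition, round(residual, 4))
```

The search terminates because every step increases the mass and the peak list is finite. A production implementation would also need to handle charge states, isotope peaks and ambiguous paths.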
Fig. 4. Comparison between Glyco-Decipher and other software tools.
a Comparison of the performance between Glyco-Decipher, StrucGP, pGlyco 3.0, MSFragger-Glyco and Byonic using the dataset of mouse tissues. Top: Comparison between Glyco-Decipher and other software tools in the performance of glycopeptide-spectrum interpretation. Green: the spectra matched to identical peptide backbones in Glyco-Decipher and another tool. Orange: the spectra commonly identified in Glyco-Decipher and another tool but matched to different peptide backbones. Gray: spectra specifically matched by another tool. Bottom: Comparison between Glyco-Decipher and other software tools in peptide sequence identification. Green: the peptide sequence commonly identified by Glyco-Decipher and another tool in pair comparison. Gray: the peptide sequence specifically matched by another tool in pair comparison. b FDR analysis using the 13C/15N metabolically labeled yeast dataset. The isotope-based FDR was calculated by matching 13C/15N isotopic peak pairs in MS1 spectra (line). Red line: FDR analysis based on the total GPSM results of each tool; blue line: FDR analysis based on the GPSM results that overlapped with Glyco-Decipher; purple line: FDR analysis based on the GPSMs specifically identified by each tool. The proportions of GPSMs of oligo-mannose glycans with composition of Hex(n)HexNAc(2) (green pie), NeuAc (orange pie) or NeuGc (red pie) containing glycans identified by each tool are shown in the bottom table. Source data are provided as a Source Data file. c Distributions of GPSMs of ammonium-adducted glycans reported by Glyco-Decipher and pGlyco 3.0. d Number of ammonium adduction GPSMs with/without oligo-mannose glycan composition. e Comparison of glycopeptide identification results between Glyco-Decipher and StrucGP/pGlyco 3.0. The additional (gain), overlap and lost identifications of Glyco-Decipher compared to other software tools are indicated by blue, green and orange bars, respectively. 
f Distributions of mannose-6-phosphate (M6P) GPSMs (bars) and intact glycopeptides (dots) across mouse tissues reported by Glyco-Decipher and pGlyco 3.0. All M6P identifications were validated by the diagnostic oxonium ion (phosphorylated hexose, m/z = 243.0269) in the glycopeptide spectra.
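The M6P validation described in panel f amounts to checking each spectrum for a peak at the diagnostic m/z. A minimal sketch follows; the m/z value is taken from the caption, while the 20 ppm tolerance and the function name are assumptions.

```python
# Illustrative check for the phosphorylated-hexose oxonium ion (m/z 243.0269).
# The 20 ppm tolerance is an assumed value, not one stated in the source.
M6P_OXONIUM_MZ = 243.0269


def has_diagnostic_ion(mz_values, target=M6P_OXONIUM_MZ, tol_ppm=20.0):
    """True if any observed peak lies within tol_ppm of the target m/z."""
    tolerance = target * tol_ppm * 1e-6
    return any(abs(mz - target) <= tolerance for mz in mz_values)


print(has_diagnostic_ion([204.0867, 243.0271, 366.1395]))
print(has_diagnostic_ion([204.0867, 243.0500, 366.1395]))
```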
Fig. 5. Quantitative analysis of site-specific glycosylation on prosaposin by Glyco-Decipher.
Relative abundance distribution of glycans at each glycosite in prosaposin across five mouse tissues (heat map). Compositions and possible structure illustration of the high-abundance glycans are annotated at the top. For a more intuitive demonstration, the abundance distributions of glycans at each glycosite are listed in the right radial diagrams. In each radial diagram, nodes around the circle denote glycans linked to prosaposin, and donuts in the center denote glycosites identified in prosaposin. Linkage between the node and the center donut indicates that the glycosite was modified by the corresponding glycan. The percentage value in each donut indicates the relative abundance for a certain type of glycan. All M6P identifications were validated by the diagnostic oxonium ion (phosphorylated hexose, m/z = 243.0269) in glycopeptide spectra. H Hex, N HexNAc, A NeuAc, G NeuGc, F Fuc, P Phosphorylation. See Supplementary Data 7 for detailed information of glycan nodes.
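The relative abundance values shown per glycosite can be thought of as each glycan's share of the summed abundance at that site. The sketch below assumes abundances are already available as one number per (glycosite, glycan) pair; the site labels, glycan labels and intensities are made up for illustration, and the function name is hypothetical.

```python
# Illustrative normalization of glycan abundances within each glycosite.
# Input values and labels are invented; only the normalization idea is shown.
from collections import defaultdict


def relative_abundance(abundances):
    """Map {(site, glycan): abundance} to {(site, glycan): percent of site total}."""
    site_totals = defaultdict(float)
    for (site, _glycan), value in abundances.items():
        site_totals[site] += value
    return {
        (site, glycan): 100.0 * value / site_totals[site]
        for (site, glycan), value in abundances.items()
        if site_totals[site] > 0
    }


example = {
    ("site_1", "H9N2"): 300.0,
    ("site_1", "H6N2P1"): 100.0,
    ("site_2", "H5N4A2"): 50.0,
}
for key, percent in relative_abundance(example).items():
    print(key, percent)
```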

