Clin Proteomics. 2019 Sep 7:16:35.
doi: 10.1186/s12014-019-9254-0. eCollection 2019.

N-GlycositeAtlas: a database resource for mass spectrometry-based human N-linked glycoprotein and glycosylation site mapping

Shisheng Sun et al. Clin Proteomics. 2019.

Abstract

Background: N-linked glycoproteins are a highly interesting class of proteins for clinical and biological research. Large-scale characterization of N-linked glycoproteins by mass spectrometry-based glycoproteomics has provided valuable insights into the interdependence of glycoprotein structure and protein function. However, these studies have focused mainly on specific sample types and have not integrated glycoproteomic data across different tissues, body fluids, or cell types.

Methods: In this study, we collected human glycosite-containing peptides, identified by mass spectrometry in their de-glycosylated forms, from over 100 publications and unpublished datasets generated in our laboratory. A database resource termed N-GlycositeAtlas was created and then used to analyze the distribution of glycoproteins among different human cells, tissues, and body fluids. Finally, a web interface to N-GlycositeAtlas was created to maximize the utility and value of the database.

Results: The N-GlycositeAtlas database contains more than 30,000 glycosite-containing peptides (representing > 14,000 N-glycosylation sites) from more than 7200 N-glycoproteins. These were drawn from over 100 studies covering diverse biological sources, including human-derived tissues, body fluids, and cell lines.

Conclusions: The entire human N-glycoproteome database as well as 22 sub-databases associated with individual tissues or body fluids can be downloaded from the N-GlycositeAtlas website at http://nglycositeatlas.biomarkercenter.org.
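
The distribution analysis described in the Methods amounts to grouping glycosite records by biological source and comparing the resulting sets. The following is a minimal sketch, assuming a hypothetical in-memory record layout of (accession, site position, source). The field layout, the function names, and the sample rows are invented for illustration; they are not the schema or contents of N-GlycositeAtlas.

```python
from collections import defaultdict

# Hypothetical records: (protein accession, glycosite position, source).
# These rows are illustrative placeholders, not data from the database.
records = [
    ("ACC_A", 62, "serum"),
    ("ACC_A", 62, "liver"),
    ("ACC_A", 107, "liver"),
    ("ACC_B", 45, "urine"),
    ("ACC_B", 45, "serum"),
]


def sites_by_source(rows):
    """Group unique (accession, position) glycosites by source."""
    grouped = defaultdict(set)
    for accession, position, source in rows:
        grouped[source].add((accession, position))
    return dict(grouped)


def proteins_in_common(rows, source_a, source_b):
    """Return the accessions observed in both sources."""
    by_source = defaultdict(set)
    for accession, _position, source in rows:
        by_source[source].add(accession)
    return by_source[source_a] & by_source[source_b]


grouped = sites_by_source(records)
shared = proteins_in_common(records, "liver", "serum")
```

Sets are used so that a glycosite reported by several studies of the same source is counted once, which matches the notion of unique glycosites per tissue or body fluid.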

Conflict of interest statement

Competing interests: The authors declare that they have no competing interests.

Figures

Fig. 1
Assembly of the mass spectrometry-based human glycoprotein and glycosite database (N-GlycositeAtlas). The identified glycosite-containing peptides were collected from 85 publications and 19 newly generated or unpublished datasets. The peptides were then matched to a UniProt protein database. The relevant information for each glycosite was then extracted for glycosite and glycoprotein database development
Fig. 2
Overview of the human glycoprotein and glycosite database. (a) The identification frequency of each glycosite in the database, determined from its identification in different samples, different studies, or glycosite-containing peptides of various lengths (due to different enzyme digestion or different missed cleavages). (b) Accumulation of identified glycosite-containing peptides, unique glycosites and glycoproteins with time. (c) Classification of glycosites according to their year of publication.
Fig. 3
Distribution of identified glycosites and glycoproteins across different human tissues or body fluids. The blue columns represent the glycosites (a) and glycoproteins (b) identified from human tissues or body fluids; the orange columns represent the glycosites (a) and glycoproteins (b) identified from the related cell lines. CSF, cerebrospinal fluid; PBMC, peripheral blood mononuclear cell.
Fig. 4
Glycoproteins identified in common between tissues and serum (a) or urine (b)
Fig. 5
Representative N-GlycositeAtlas web interface output showing an N-linked glycoprotein and its glycosites. Endoplasmin (HSP90B1) is used as an example. (a) The database search. The database can be searched online by protein accession number, gene name, protein name, glycosylation site location, glycosite-containing peptide, N-glycosylation motif (N-X-S/T), name of tissue/liquid/cell line, year of publication, and/or reference. Multiple search parameters can be used for a combined or parameter-specific search. (b) The search results are shown in the first display page, which lists the glycoprotein accession number (UniProt), gene name, protein name and glycosylation site location. The detailed information for each glycoprotein can be accessed in the second display page by clicking the glycoprotein accession number. The second display page shows: (c) glycoprotein information; (d) glycosite and glycosite-containing peptide information of the glycoprotein as well as their references; and (e) glycosites (red) and glycosite-containing peptides (bold font) highlighted in the protein sequence.
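
The N-X-S/T motif listed among the search parameters is the canonical N-glycosylation sequon, in which X is conventionally any residue except proline. The following is a minimal sketch of locating sequons in a protein sequence and reporting which of them fall inside a given peptide. The function names and the toy sequence are hypothetical, the peptide is assumed to be given as an unmodified sequence, and this is not the web interface's own search code.

```python
import re

# A lookahead is used so that overlapping sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]


def sites_in_peptide(sequence, peptide):
    """Return 1-based protein positions of sequon Asn residues covered by the peptide.

    Only the first occurrence of the peptide in the sequence is considered.
    """
    start = sequence.find(peptide)  # 0-based offset, or -1 if absent
    if start == -1:
        return []
    end = start + len(peptide)
    return [p for p in find_sequons(sequence) if start < p <= end]


toy_sequence = "MKNATLLNPSQNGTK"  # invented sequence, not a real protein
all_sites = find_sequons(toy_sequence)
peptide_sites = sites_in_peptide(toy_sequence, "QNGTK")
```

In the toy sequence the asparagine followed by proline is skipped, which is the behaviour the `[^P]` class is there to enforce.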
