Nucleic Acids Res. 2018 Jan 4;46(D1):D1237-D1247.
doi: 10.1093/nar/gkx664.

The SysteMHC Atlas project


Wenguang Shao et al. Nucleic Acids Res.

Abstract

Mass spectrometry (MS)-based immunopeptidomics investigates the repertoire of peptides presented at the cell surface by major histocompatibility complex (MHC) molecules. The broad clinical relevance of MHC-associated peptides, e.g. in precision medicine, provides a strong rationale for the large-scale generation of immunopeptidomic datasets and recent developments in MS-based peptide analysis technologies now support the generation of the required data. Importantly, the availability of diverse immunopeptidomic datasets has resulted in an increasing need to standardize, store and exchange this type of data to enable better collaborations among researchers, to advance the field more efficiently and to establish quality measures required for the meaningful comparison of datasets. Here we present the SysteMHC Atlas (https://systemhcatlas.org), a public database that aims at collecting, organizing, sharing, visualizing and exploring immunopeptidomic data generated by MS. The Atlas includes raw mass spectrometer output files collected from several laboratories around the globe, a catalog of context-specific datasets of MHC class I and class II peptides, standardized MHC allele-specific peptide spectral libraries consisting of consensus spectra calculated from repeat measurements of the same peptide sequence, and links to other proteomics and immunology databases. The SysteMHC Atlas project was created and will be further expanded using a uniform and open computational pipeline that controls the quality of peptide identifications and peptide annotations. Thus, the SysteMHC Atlas disseminates quality controlled immunopeptidomic information to the public domain and serves as a community resource toward the generation of a high-quality comprehensive map of the human immunopeptidome and the support of consistent measurement of immunopeptidomic sample cohorts.


Figures

Figure 1.
Overview of the SysteMHC Atlas project. (A) The SysteMHC Atlas aims to be a long-term data-driven project that serves the community. It is linked to other repositories of proteomic data and consists of two main components: (i) a uniform computational pipeline for processing raw MS files and (ii) a web interface with storage, searching and browsing capabilities. First, shotgun/DDA-MS experimental data generated for specific projects are submitted by the data producers to PRIDE. Raw MS data are then uploaded into the SysteMHC Atlas and processed through a consistent and open computational pipeline (B) that controls the quality of peptide identification and peptide annotation to specific HLA alleles. Spectral libraries are generated and can be converted into high-quality HLA allele-specific peptide assay libraries, also available at SWATHAtlas. All the results generated by the computational pipeline are made available to the public domain via the SysteMHC Atlas web-based interface, which provides links to the Immune Epitope Database (IEDB) for accessing lists of peptides originally identified and published by the data producers. (B) Current computational pipeline used for generating the immunopeptidome- and spectral database for different HLA allotypes. MS output files generated from several types of instruments are first converted into mzXML file format and then searched using several open-source database search engines. The resulting peptide identifications are combined and statistically scored using PeptideProphet and iProphet within the Trans-Proteomic Pipeline (TPP) (30,31). The identified peptides are next annotated to their respective HLA allele in a fully automated fashion using the stand-alone software package of NetMHCcons 1.1 (29). Spectral libraries are generated using SpectraST (32). Allele-specific peptide spectral libraries are generated from multiple samples—an example for HLA-A03 is highlighted in red. 
Each HLA peptide is labeled with a unique and permanent library identifier (LibID). Details regarding the computational pipeline and how the data were processed are available at the SysteMHC Atlas website in the ‘ABOUT’ section.
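The allele annotation step in panel (B) can be illustrated with a minimal sketch. It assumes that a binding predictor has already produced a percentile rank for every peptide against every allele of the sample (lower rank meaning stronger predicted binding). The function name, the `max_rank` cutoff, the peptide strings and all rank values below are hypothetical and are not taken from the SysteMHC pipeline or from NetMHCcons output.

```python
from collections import defaultdict


def annotate_peptides(predicted_rank, max_rank=2.0):
    """Group peptides by the allele with the best (lowest) predicted rank.

    predicted_rank: {peptide: {allele: percentile rank}}
    max_rank: assumed cutoff; peptides whose best rank exceeds it
              are collected under "unassigned".
    """
    libraries = defaultdict(list)
    for peptide, ranks in predicted_rank.items():
        best_allele = min(ranks, key=ranks.get)
        if ranks[best_allele] <= max_rank:
            libraries[best_allele].append(peptide)
        else:
            libraries["unassigned"].append(peptide)
    return dict(libraries)


# Invented placeholder peptides and ranks, for illustration only.
EXAMPLE = {
    "AAAAAAAAK": {"HLA-A02": 40.0, "HLA-A03": 0.3},
    "LLLLLLLLV": {"HLA-A02": 0.2, "HLA-A03": 25.0},
    "GGGGGGGGG": {"HLA-A02": 60.0, "HLA-A03": 55.0},
}

if __name__ == "__main__":
    for allele, peptides in annotate_peptides(EXAMPLE).items():
        print(allele, peptides)
```

Each per-allele list stands in for the set of peptides that would feed one allele-specific spectral library; the actual pipeline additionally applies statistical scoring of the identifications before this step.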
Figure 2.
Immunopeptidomics datasets used for building the first version of the SysteMHC Atlas. Data from 23 projects that collectively generated 1184 raw MS files constitute the initial contents of the SysteMHC Atlas. Each project is labeled with a unique SYSMHC identifier and linked to its corresponding PubMed, PRIDE and IEDB ID. For unpublished projects, IDs are not applicable (NA).
Figure 3.
Cumulative number of MS/MS spectra versus cumulative number of distinct peptides for HLA class I alleles at 1% FDR. (A) All HLA class I peptides combined. (B) HLA class I alleles that were frequently found across datasets. The curves are expected to saturate once most observable peptides have been cataloged at 1% peptide FDR.
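A curve of this kind can be computed from an ordered list of identified spectra, as in the sketch below. It assumes the identifications have already been filtered to the desired FDR and that each spectrum is represented only by its peptide sequence; the function name and the toy input are hypothetical.

```python
def saturation_curve(psm_peptides):
    """Return (cumulative spectra, cumulative distinct peptides) pairs.

    psm_peptides: peptide sequences, one per identified MS/MS spectrum,
    in the order the spectra were added to the collection.
    """
    seen = set()
    curve = []
    for n_spectra, peptide in enumerate(psm_peptides, start=1):
        seen.add(peptide)  # repeated peptides do not increase the count
        curve.append((n_spectra, len(seen)))
    return curve


if __name__ == "__main__":
    # Toy input: six spectra mapping to three distinct peptides.
    print(saturation_curve(["AAA", "BBB", "AAA", "CCC", "BBB", "AAA"]))
```

The second value of each pair stops growing when new spectra only re-identify known peptides, which is the flattening the caption describes.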
Figure 4.
Explore page in the SysteMHC Atlas web-based interface. HLA allele-specific peptide spectral libraries can be downloaded here. The web interface can also be used to query the SysteMHC Atlas for specific information. (A) As an example, the source protein BIRC6 was searched and the Atlas returned all HLA-associated peptides originating from this protein, together with the context (i.e. SysteMHC ID, Sample ID, iProphet score, HLA annotation score, spectral counts, assigned HLA type and class) in which each peptide was observed. The user can then click on a specific Sample ID hyperlink and be redirected to the corresponding raw MS files and metadata (e.g. tissue type, cell type, culture condition, purification method, antibody used, mass spectrometer used). (B) The peptide RLLDYVATV was searched and the Atlas returned the datasets in which this peptide was observed. By clicking on the peptide sequence hyperlink, the user is redirected to a new page in which the LibID information is available for MS/MS spectra visualization. Information can be downloaded as .csv files for further analysis.
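A downloaded .csv export could be post-processed along the following lines. The column names (`peptide`, `systemhc_id`, `sample_id`, `iprophet_score`, `spectral_count`), the score cutoff and the rows are all assumptions made for illustration; the actual export headers should be checked against a real download.

```python
import csv
import io
from collections import defaultdict

# Invented rows with hypothetical column names, standing in for a download.
SAMPLE_CSV = """peptide,systemhc_id,sample_id,iprophet_score,spectral_count
AAAAAAAAK,SYSMHC00001,sample_1,0.99,4
AAAAAAAAK,SYSMHC00002,sample_7,0.95,2
LLLLLLLLV,SYSMHC00001,sample_1,0.80,1
"""


def samples_by_peptide(csv_text, min_score=0.9):
    """Map each peptide to the sorted sample IDs in which it was observed.

    Rows whose iProphet score is below min_score (an assumed cutoff)
    are ignored.
    """
    observed = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if float(row["iprophet_score"]) >= min_score:
            observed[row["peptide"]].add(row["sample_id"])
    return {peptide: sorted(ids) for peptide, ids in observed.items()}


if __name__ == "__main__":
    print(samples_by_peptide(SAMPLE_CSV))
```

For a file on disk, the same function can be fed the file contents read as text.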
Figure 5.
Data storage and visualization. To access information about specific datasets, the user selects a specific SYSMHC ID/Project name (e.g. SYSMHC00005) and clicks on ‘view dataset’ at the bottom left of the screen. The samples related to this project are then listed and linked to the number of replicates, organism, tissue and cell type of origin as well as the HLA typing information (upper panel). The user can then click on a specific Sample ID to visualize the metadata and to download the raw or converted mzXML MS files (red squares). A list of sample-specific HLA-associated peptides can be visualized at 1% peptide-level FDR (green squares). Sample-specific spectral libraries, including consensus fragment ion spectra, can be visualized and downloaded (orange and blue squares). Heat maps (black squares) are used to visualize the annotation of individual peptides to their respective HLA allele (dark blue peptides are predicted to be strong HLA binders according to NetMHCcons).

