MassIVE.quant: a community resource of quantitative mass spectrometry-based proteomics datasets

Meena Choi¹, Jeremy Carver², Cristina Chiva³,⁴, Manuel Tzouros⁵, Ting Huang¹, Tsung-Heng Tsai¹, Benjamin Pullman², Oliver M Bernhardt⁶, Ruth Hüttenhain⁷, Guo Ci Teo⁸, Yasset Perez-Riverol⁹, Jan Muntel⁶, Maik Müller¹⁰, Sandra Goetze¹⁰,¹¹, Maria Pavlou¹⁰, Erik Verschueren⁷, Bernd Wollscheid¹⁰,¹¹, Alexey I Nesvizhskii⁸, Lukas Reiter⁶, Tom Dunkley⁵, Eduard Sabidó³,⁴, Nuno Bandeira¹², Olga Vitek¹³

Affiliations

¹ Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA.
² Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA.
³ Proteomics Unit, Center for Genomics Regulation, The Barcelona Institute of Science and Technology, Barcelona, Spain.
⁴ Proteomics Unit, Universitat Pompeu Fabra, Barcelona, Spain.
⁵ Roche Pharma Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Hoffmann-La Roche Ltd, Basel, Switzerland.
⁶ Biognosys, Zurich, Switzerland.
⁷ Department of Molecular and Cellular Pharmacology, University of California, San Francisco, San Francisco, CA, USA.
⁸ Department of Pathology, University of Michigan, Ann Arbor, MI, USA.
⁹ Proteomics Services, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
¹⁰ Department of Health Sciences and Technology, Institute of Translational Medicine, ETH, Zurich, Switzerland.
¹¹ Swiss Institute of Bioinformatics, Lausanne, Switzerland.
¹² Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA, USA. bandeira@ucsd.edu.
¹³ Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA. o.vitek@northeastern.edu.

PMID: 32929271
PMCID: PMC7541731
DOI: 10.1038/s41592-020-0955-0

Meena Choi et al. Nat Methods. 2020 Oct;17(10):981-984. doi: 10.1038/s41592-020-0955-0. Epub 2020 Sep 14.

Abstract

MassIVE.quant is a repository infrastructure and data resource for reproducible quantitative mass spectrometry-based proteomics, which is compatible with all mass spectrometry data acquisition types and computational analysis tools. A branch structure enables MassIVE.quant to systematically store raw experimental data, metadata of the experimental design, scripts of the quantitative analysis workflow, intermediate input and output files, as well as alternative reanalyses of the same dataset.

Conflict of interest statement

Competing financial interests

O.M.B., J.M. and L.R. are employees of Biognosys AG. Spectronaut is a trademark of Biognosys AG. M.T. and T.D. are employees of Hoffmann-La Roche Ltd. All other authors declare no competing financial interests.

Figures

**Figure 1. Outline of MassIVE.quant repository structure, and reanalysis of three DDA-based experiments.**
Each step can be performed with multiple algorithms and software tools, generating tool-specific files in diverse formats. For the experiments in the figure, MassIVE.quant stores the intermediate outputs from combinations of algorithms and tools for peptide ion identification and quantification. For example, DDA:Choi2017 was processed with eight combinations of parameter settings in Skyline. Each reanalysis is saved with a unique reanalysis ID, prefixed by RMSV, under the experiment repository prefixed by MSV in MassIVE.quant.
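The nesting described in this caption, with reanalyses stored under an experiment repository, can be pictured with a small data model. The following is a minimal sketch for illustration only, not MassIVE.quant's actual schema or API: the class names, field names, and the placeholder identifiers `MSV000000001` and `RMSV000000001` are all assumptions; only the `MSV` and `RMSV` prefixes come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Reanalysis:
    """One reanalysis branch (hypothetical model, not the real schema)."""
    reanalysis_id: str                      # must carry the RMSV prefix
    tool: str                               # e.g. "Skyline"
    settings: str = ""                      # free-text parameter description
    intermediate_files: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.reanalysis_id.startswith("RMSV"):
            raise ValueError("reanalysis IDs are prefixed by RMSV")


@dataclass
class ExperimentRepository:
    """One experiment repository holding raw data and its reanalyses."""
    dataset_id: str                         # must carry the MSV prefix
    raw_files: List[str] = field(default_factory=list)
    reanalyses: Dict[str, Reanalysis] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.dataset_id.startswith("MSV"):
            raise ValueError("experiment repositories are prefixed by MSV")

    def add_reanalysis(self, reanalysis: Reanalysis) -> None:
        # Each reanalysis is stored under a unique ID.
        if reanalysis.reanalysis_id in self.reanalyses:
            raise ValueError("duplicate reanalysis ID")
        self.reanalyses[reanalysis.reanalysis_id] = reanalysis


# Placeholder identifiers, chosen only for the example.
repo = ExperimentRepository("MSV000000001", raw_files=["run01.raw"])
repo.add_reanalysis(Reanalysis("RMSV000000001", tool="Skyline"))
```

Keying the reanalyses by ID keeps alternative analyses of the same raw data side by side without overwriting one another, which is the property the caption attributes to the unique reanalysis IDs.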

**Figure 2. Reanalyses of DIA:Selevsek2015, profiling changes in proteome abundance of *S. cerevisiae* in response to osmotic stress over six time points: T0 (0 min), T1 (15 min), T2 (30 min), T3 (60 min), T4 (90 min) and T5 (120 min); n = 3 biologically independent samples per time point (RMSV000000251).**
(a)–(d), Discrepancies in quantification of protein YKL096W across data processing tools. Gray lines, fragments reported by each tool; red lines, protein quantification summarized by MSstats. (a) Skyline:lowCV used Skyline to quantify a subset of the fragments with low coefficient of variation; (b) Skyline:all used Skyline to quantify all detectable peptides, with a maximum of six fragments each; (c) data processed by Spectronaut; (d) data processed by DIA-Umpire. (e)–(h), Discrepancies in detecting differential abundance for protein YKL096W across data processing tools, with statistical analysis by MSstats: Skyline:lowCV (e), Skyline:all (f), Spectronaut (g) and DIA-Umpire (h). Dark red dots (centers of the error bars), model-based estimates of log2(fold change) of protein abundance, as determined by MSstats. Error bars, 95% confidence intervals for the log2(fold change), as determined by MSstats. *Adjusted P < 0.05. (i)–(l), Volcano plots summarizing differential abundance between T5 and T0: Skyline:lowCV (i), Skyline:all (j), Spectronaut (k) and DIA-Umpire (l). Dashed line, FDR = 0.05; blue dots, significantly down-regulated proteins; red dots, significantly up-regulated proteins (counts are shown at the top left corner; other time points are shown in Supplementary Fig. 3). (m) Number of differentially abundant proteins across all time points and all tools, FDR = 0.05. (n) Venn diagram of differentially abundant proteins between the two Skyline processing approaches, comparing T5 versus T0. (o) Venn diagram of differentially abundant proteins across all tools, comparing T5 versus T0 (other time points are shown in Supplementary Fig. 4).
