J Proteome Res. 2025 Feb 7;24(2):685-695.

doi: 10.1021/acs.jproteome.4c00788. Epub 2025 Jan 7.

Integrated View of Baseline Protein Expression in Human Tissues Using Public Data Independent Acquisition Data Sets

Ananth Prakash¹, Andrew Collins², Liora Vilmovsky¹, Silvie Fexova¹, Andrew R Jones², Juan Antonio Vizcaino¹

Affiliations

¹ European Molecular Biology Laboratory-European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, U.K.
² Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, U.K.

PMID: 39764611
PMCID: PMC11811993
DOI: 10.1021/acs.jproteome.4c00788

Ananth Prakash et al. J Proteome Res. 2025.


Abstract

The PRIDE database is the largest public data repository of mass spectrometry-based proteomics data and currently stores more than 40,000 data sets covering a wide range of organisms, experimental techniques, and biological conditions. During the past few years, PRIDE has seen a significant increase in the amount of submitted data-independent acquisition (DIA) proteomics data sets. This provides an excellent opportunity for large-scale data reanalysis and reuse. We have reanalyzed 15 public label-free DIA data sets across various healthy human tissues to provide a state-of-the-art view of the human proteome in baseline conditions (without any perturbations). We computed baseline protein abundances and compared them across various tissues, samples, and data sets. Our second aim was to compare the protein abundances obtained here with the results of previous analyses using human baseline data-dependent acquisition (DDA) data sets. We observed a good correlation across some tissues, especially in the liver and colon, but weak correlations were found in others, such as the lung and pancreas. The reanalyzed results, including protein abundance values and curated metadata, are made available to view and download from the resource Expression Atlas.
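The DIA-versus-DDA comparison described above can be illustrated with a minimal sketch. It assumes each study is reduced to a mapping from protein identifier to abundance for one tissue, that only proteins present in both studies are compared, and that a Pearson correlation on log2-transformed values is used. The abstract does not state the correlation measure or the transformation, so both are assumptions, and the function name, identifiers, and values are hypothetical.

```python
import math


def correlate_common_proteins(dia, dda):
    """Return (n, r): the number of common proteins and the Pearson
    correlation of their log2 abundances.

    `dia` and `dda` map protein identifiers to abundances for one tissue.
    Proteins missing from either study, or with non-positive abundance,
    are skipped (an assumption made for this sketch).
    """
    common = sorted(
        p for p in dia.keys() & dda.keys() if dia[p] > 0 and dda[p] > 0
    )
    n = len(common)
    if n < 2:
        raise ValueError("need at least two common proteins")

    x = [math.log2(dia[p]) for p in common]
    y = [math.log2(dda[p]) for p in common]
    mean_x, mean_y = sum(x) / n, sum(y) / n

    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant abundances")

    return n, sxy / math.sqrt(sxx * syy)


# Hypothetical abundances, not taken from the study.
dia_liver = {"PROT_A": 8.0, "PROT_B": 16.0, "PROT_C": 32.0, "PROT_D": 5.0}
dda_liver = {"PROT_A": 2.0, "PROT_B": 4.0, "PROT_C": 8.0, "PROT_E": 7.0}

n_common, r = correlate_common_proteins(dia_liver, dda_liver)
print(n_common, round(r, 3))
```

Here `n_common` plays the role of the n reported in each panel of Figure 5, the number of common canonical proteins behind each correlation value.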

Keywords: Expression Atlas; PRIDE; baseline expression; data independent acquisition; data reanalysis; mass spectrometry; proteomics.

Conflict of interest statement

The authors declare no competing financial interest.

Figures

**Figure 1**
Overview of DIA data sets’ reanalysis pipeline. EA = Expression Atlas.

**Figure 2**
Distribution of protein identification and abundances across tissues and data sets. (A) Number of canonical proteins identified across different tissues and data sets. (B) iBAQ protein abundances of canonical proteins across different tissues and data sets.

**Figure 3**
Heatmap of binned protein abundances across all samples between various tissues and data sets. Brain samples clustered together are highlighted using a black border. S: skin, Sm: skeletal muscle, B: brain, Li: liver, and P: pancreas.

**Figure 4**
(A) Distribution of decoy and target protein groups present across all data sets identified using the *Arabidopsis thaliana* entrapment search database. The protein FDR values of the target proteins present in common across different numbers of data sets are shown in parentheses. The calculation of the FDR is described in the “Methods” section.

**Figure 5**
Correlation of protein abundances between human baseline DIA and the DDA data sets from a previous study. n shows the number of data points (common canonical proteins) considered in each panel.

Grants and funding

WT_/Wellcome Trust/United Kingdom
