. 2019 Feb 18;15(2):e8503.

doi: 10.15252/msb.20188503.

A deep proteome and transcriptome abundance atlas of 29 healthy human tissues

Dongxue Wang¹, Basak Eraslan^{2

3}, Thomas Wieland⁴, Björn Hallström⁵, Thomas Hopf⁴, Daniel Paul Zolg¹, Jana Zecha¹, Anna Asplund⁶, Li-Hua Li¹, Chen Meng¹, Martin Frejno¹, Tobias Schmidt¹, Karsten Schnatbaum⁷, Mathias Wilhelm¹, Frederik Ponten⁶, Mathias Uhlen⁵, Julien Gagneur⁸, Hannes Hahne⁹, Bernhard Kuster^{10

11}

Affiliations

¹ Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany.
² Computational Biology, Department of Informatics, Technical University of Munich, Garching bei München, Germany.
³ Department of Biochemistry, Quantitative Biosciences Munich, Gene Center, Ludwig Maximilian Universität, München, Germany.
⁴ OmicScouts GmbH, Freising, Germany.
⁵ Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden.
⁶ Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden.
⁷ JPT Peptide Technologies GmbH, Berlin, Germany.
⁸ Computational Biology, Department of Informatics, Technical University of Munich, Garching bei München, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
⁹ OmicScouts GmbH, Freising, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
¹⁰ Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
¹¹ Center for Integrated Protein Science Munich (CIPSM), Munich, Germany.

PMID: 30777892
PMCID: PMC6379049
DOI: 10.15252/msb.20188503

A deep proteome and transcriptome abundance atlas of 29 healthy human tissues

Dongxue Wang et al. Mol Syst Biol. 2019.

. 2019 Feb 18;15(2):e8503.

doi: 10.15252/msb.20188503.

Authors

Affiliations

¹ Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany.
² Computational Biology, Department of Informatics, Technical University of Munich, Garching bei München, Germany.
³ Department of Biochemistry, Quantitative Biosciences Munich, Gene Center, Ludwig Maximilian Universität, München, Germany.
⁴ OmicScouts GmbH, Freising, Germany.
⁵ Science for Life Laboratory, KTH - Royal Institute of Technology, Stockholm, Sweden.
⁶ Science for Life Laboratory, Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden.
⁷ JPT Peptide Technologies GmbH, Berlin, Germany.
⁸ Computational Biology, Department of Informatics, Technical University of Munich, Garching bei München, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
⁹ OmicScouts GmbH, Freising, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
¹⁰ Chair of Proteomics and Bioanalytics, Technische Universität München, Freising, Germany gagneur@in.tum.de hannes.hahne@omicscouts.com kuster@tum.de.
¹¹ Center for Integrated Protein Science Munich (CIPSM), Munich, Germany.

PMID: 30777892
PMCID: PMC6379049
DOI: 10.15252/msb.20188503

Abstract

Genome-, transcriptome- and proteome-wide measurements provide insights into how biological systems are regulated. However, fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we generated a quantitative proteome and transcriptome abundance atlas of 29 paired healthy human tissues from the Human Protein Atlas project representing human genes by 18,072 transcripts and 13,640 proteins including 37 without prior protein-level evidence. The analysis revealed that hundreds of proteins, particularly in testis, could not be detected even for highly expressed mRNAs, that few proteins show tissue-specific expression, that strong differences between mRNA and protein quantities within and across tissues exist and that protein expression is often more stable across tissues than that of transcripts. Only 238 of 9,848 amino acid variants found by exome sequencing could be confidently detected at the protein level showing that proteogenomics remains challenging, needs better computational methods and requires rigorous validation. Many uses of this resource can be envisaged including the study of gene/protein expression regulation and biomarker specificity evaluation.

Keywords: RNA‐Seq; human proteome; human transcriptome; proteogenomics; quantitative mass spectrometry.

PubMed Disclaimer

Figures

**Figure 1. Comprehensive proteomic and transcriptomic analysis of 29 human tissues from healthy donors**
Body map of analysed tissues.
Number of genes detected on protein and mRNA level in each tissue. The colouring of the bars indicates the fractions of transcripts and proteins that are expressed everywhere or enriched in certain tissues. The full classification is provided in the text.
Abundance distribution of all transcripts detected in all tissues (grey); the fraction of detected proteins is shown in blue and the fraction of transcripts for which no protein was detected is shown in orange.
Relative distribution and absolute numbers of transcripts and proteins in selected functional classes across the expression categories shown in panel (B). Colours are the same as in panel (B).

**Figure EV1. Further characterization of human proteomes and transcriptomes**
Number of identified protein groups for each of the 29 tissues.
Number of genes in all tissues that were detected at the transcript with higher than average expression but not detected at the protein level. Note the very high number of such cases in testis.
Abundance distribution of all proteins detected in human brain (grey). Proteins in blue are expressed in all 29 tissues, and proteins in orange show elevated expression in brain.
Clustering of gene ontology terms (biological process) for proteins and transcripts that show the most divergent expression across all tissue. Boxes give examples of GO terms for four different tissues (Appendix, brain, heart and testis).

**Figure 2. Analysis of protein and transcript expression levels within and across tissues**
Distribution of global transcript and protein abundance in all tissues. It is apparent that the dynamic range of protein expression (iBAQ scale) exceeds that of mRNA expression (FPKM scale; see Appendix Fig S10 for the corresponding plot for RNA and protein copy numbers).
Protein‐to‐mRNA abundance plot for brain tissue. The slope of the regression line indicates that high‐abundance mRNAs give rise to more protein copies per mRNA than low‐abundance mRNAs.
Ranked abundance plot of proteins and transcripts in human heart. While the 10 most abundant transcripts cover almost 70% of all transcripts in this tissue, the corresponding proteins only represent about 20% of the total protein.
Analysis of the number of genes that are shared among the 100 most abundant transcripts and proteins. Regardless of the tissue, the fraction of shared genes rarely exceeds 20%.
Correlation analysis of protein‐to‐RNA abundance (in log10 scale) across tissues, resulting in almost 90% positive correlations. The proteins highlighted in the next panel are marked.
Examples for proteins that show high (SYK, left panel) or no (EIF4A3, right panel) correlation of protein/RNA ratios across tissues. While the former indicates that different tissues express different quantities of SYK, EIF4A3 expression appears to be similar in all tissues.

**Figure EV2. Relationships between mRNA and protein expression**
A
Slopes of the regression line in protein versus mRNA abundance plots (see main Fig 2B) for each tissue.
B
Number of genes that are shared among the 5,000 most abundant transcripts or proteins in each tissue.
C, D
Ranked abundance plots for transcripts and proteins in spleen and lung showing different characteristics in the abundance distributions (see also main Fig 2C and Appendix Fig S12 for all tissues).
E
Clustering of protein abundances across all tissues. It is apparent that many proteins have similar expression levels across several/many tissues.

**Figure 3. Correlation analysis of protein and transcript expression levels**
Global correlation analysis of proteomes versus proteomes and transcriptomes versus transcriptomes across human tissues. It is apparent that proteomes correlate stronger across tissues than transcriptomes.
Co‐inertia analysis of transcriptome and proteome levels of all 29 tissues (arrow base: transcriptome; arrow head: proteome) showing that the information carried by transcriptomes and proteomes was closer to each other in the same than across different tissues. Grey lines are used to aid identifying tissue names for the respective arrows. Shaded areas highlight tissues that are related by their molecular profiles.
Average cellular compositions of tissues highlighted in panel (B) showing that the molecular similarities in their transcriptomes and proteomes are driven by similarities in cell types.

**Figure EV3. Proteogenomic characterization of human tissues**
Distribution of peptide sequence coverage values obtained for proteins by mass spectrometry in each tissue. Horizontal lines: median. Box limits: 50% quantiles. Whiskers: all data points outside the box.
Analysis of the number of isoforms detected by transcriptomics or proteomics in each tissue.
Distribution of peptide sequence coverage obtained for proteins by mass spectrometry in tonsil tissue broken down by protease and fragmentation method used. Horizontal lines: median. Box limits: 50% quantiles. Whiskers: all data points outside the box.
Number of identified proteins in tonsil broken down by protease and fragmentation method used. Proteins covered by all workflows are marked in blue. Proteins identified by two or more workflows are indicated in grey and proteins exclusively identified by a single workflow in orange. The line connecting open circles indicates the cumulative number of proteins when adding data from the individual workflows.
Same as panel (D) but for peptides. Peptides covered by more than one workflow are marked in grey, and those exclusive for one workflow are marked in orange.
Number of experimental versus synthetic peptide reference spectra comparisons for candidate aTIS peptides (only the spectra with the highest spectral angle of each peptide were plotted) after database searching using Mascot as a function of the spectral angle or Pearson correlation coefficient. Dotted grey lines mark spectral angles of 0.7, 0.8 and 0.9.
Same as panel (F) but showing only candidate peptides that were identified by both Mascot and Andromeda.

**Figure 4. Proteogenomics exploration for protein‐level detection of isoforms, single amino acid variants and alternative translation sites**
Searching the tonsil proteomic data (trypsin alone or all enzymes) against a tissue‐specific sequence database constructed from RNA‐Seq data drastically reduces the number of individual protein sequences in protein groups compared to searches against Ensembl, allowing for the more efficient detection of protein isoforms.
Number of single amino acid variants detected by whole exome sequencing and DNA, by RNA‐Seq at the mRNA and by mass spectrometry at the protein level as well as confirmed candidates by validation using synthetic peptide spectra comparisons. It is apparent that only a very small fraction of all variants detected at the DNA or RNA level can be detected at the proteome level using current technology.
Analysis of which proteomic workflow contributed to the detection and confirmation of single amino acid variants.
Results of the detection of non‐canonical coding regions using proteomics data (left panel) and different alternative start codons identified by acetylated N‐terminal peptides (right panel). The majority of cases are N‐terminal extensions of annotated genes. All but one of the detected alternative translation start sites correspond to point mutations of the first base of the classical AUG codon.
Validation of a novel translation start site for the protein PLS3. The upper panel shows the novel translation site position within the amino acid sequence context, and the lower panel shows a mirror plot of the tandem mass spectra of the endogenous N‐terminally acetylated peptide (peaks pointing upwards) and the corresponding synthetic peptide spectrum (peaks pointing downwards). Y‐type sequence ions are coloured in red, b‐type ions in blue, and the intact peptide as well as neutral losses thereof are marked in yellow.

See this image and copyright information in PMC

References

1. Bánfai B, Jia H, Khatun J, Wood E, Risk B, Gundling WE Jr, Kundaje A, Gunawardena HP, Yu Y, Xie L, Krajewski K, Strahl BD, Chen X, Bickel P, Giddings MC, Brown JB, Lipovich L (2012) Long noncoding RNAs are rarely translated in two human cell lines. Genome Res 22: 1646–1657 - PMC - PubMed
1. Becher I, Werner T, Doce C, Zaal EA, Tögel I, Khan CA, Rueger A, Muelbaier M, Salzer E, Berkers CR, Fitzpatrick PF, Bantscheff M, Savitski MM (2016) Thermal profiling reveals phenylalanine hydroxylase as an off‐target of panobinostat. Nat Chem Biol 12: 908–910 - PubMed
1. Blomen VA, Májek P, Jae LT, Bigenzahn JW, Nieuwenhuis J, Staring J, Sacco R, van Diemen FR, Olk N, Stukalov A, Marceau C, Janssen H, Carette JE, Bennett KL, Colinge J, Superti‐Furga G, Brummelkamp TR (2015) Gene essentiality and synthetic lethality in haploid human cells. Science 350: 1092–1096 - PubMed
1. Branca RMM, Orre LM, Johansson HJ, Granholm V, Huss M, Pérez‐Bercoff Å, Forshed J, Käll L, Lehtiö J (2014) HiRIEF LC‐MS enables deep proteome coverage and unbiased proteogenomics. Nat Methods 11: 59–62 - PubMed
1. Calvo SE, Clauser KR, Mootha VK (2016) MitoCarta2.0: an updated inventory of mammalian mitochondrial proteins. Nucleic Acids Res 44: D1251–D1257 - PMC - PubMed

Publication types

Actions

MeSH terms

Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions
Actions

Substances

Actions
Actions

LinkOut - more resources

Full Text Sources
Other Literature Sources
- The Lens - Patent Citations Database
Molecular Biology Databases
- GlyGen glycoinformatics resource
- NIAID Data Ecosystem - Find datasets on Infectious and Immune-mediated Diseases

Save citation to file

Email citation

Add to Collections

Add to My Bibliography

Your saved search

Create a file for external citation management software

Your RSS Feed

A deep proteome and transcriptome abundance atlas of 29 healthy human tissues

Affiliations

A deep proteome and transcriptome abundance atlas of 29 healthy human tissues

Authors

Affiliations

Abstract

Figures

References

Publication types

MeSH terms

Substances

LinkOut - more resources

Full Text Sources

Other Literature Sources

Molecular Biology Databases