Portraying the Expression Landscapes of B-Cell Lymphoma - Intuitive Detection of Outlier Samples and of Molecular Subtypes

Lydia Hopp¹, Kathrin Lembcke², Hans Binder³, Henry Wirth⁴

Affiliations

¹ Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig 04107, Germany. hopp@izbi.uni-leipzig.de.
² Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig 04107, Germany. lembcke@izbi.uni-leipzig.de.
³ Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig 04107, Germany. binder@izbi.uni-leipzig.de.
⁴ Interdisciplinary Centre for Bioinformatics, Universität Leipzig, Härtelstr. 16-18, Leipzig 04107, Germany. wirth@izbi.uni-leipzig.de.

PMID: 24833231
PMCID: PMC4009791
DOI: 10.3390/biology2041411

Lydia Hopp et al. Biology (Basel). 2013 Dec 2;2(4):1411-37. doi: 10.3390/biology2041411.

Abstract

We present an analytic framework based on Self-Organizing Map (SOM) machine learning to study large-scale patient data sets. The power of the approach is demonstrated in a case study using gene expression data of more than 200 mature aggressive B-cell lymphoma patients. The method portrays each sample with individual resolution, characterizes the subtypes, disentangles the expression patterns into distinct modules, extracts their functional context using enrichment techniques, and enables investigation of the similarity relations between the samples. The method also allows outliers caused by contamination to be detected and corrected. Based on our analysis, we propose a refined classification of B-cell lymphoma into four molecular subtypes, which differ in their functional and clinical characteristics.
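The SOM portraying idea can be illustrated with a minimal sketch. It assumes, as is common in expression portraying but not stated in the abstract, that gene profiles across samples serve as training vectors and that each map unit's weight vector plays the role of a metagene. The names `train_som` and `sample_portrait`, the learning-rate and radius schedules, and the toy data are all hypothetical; this is not the authors' implementation.

```python
import math
import random

def train_som(data, rows, cols, epochs=20, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialize each unit (hypothetical "metagene") from a random training vector.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / steps
            lr = 0.5 * (1.0 - frac)  # linearly decaying learning rate (assumed schedule)
            radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5
            # Best-matching unit: smallest squared Euclidean distance to x.
            bmu = min(weights, key=lambda u: sum(
                (w - v) ** 2 for w, v in zip(weights[u], x)))
            for (r, c), w in weights.items():
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = lr * math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += h * (x[k] - w[k])
            step += 1
    return weights

def sample_portrait(weights, rows, cols, sample_index):
    """Portrait of one sample: the unit values for that sample, arranged as a grid."""
    return [[weights[(r, c)][sample_index] for c in range(cols)]
            for r in range(rows)]

# Toy data (hypothetical): 4 gene profiles measured in 3 samples.
genes = [[0.0, 1.0, 2.0], [0.1, 1.1, 2.1], [2.0, 1.0, 0.0], [2.1, 0.9, 0.1]]
som = train_som(genes, rows=3, cols=3)
portrait = sample_portrait(som, 3, 3, sample_index=0)
```

Each sample then gets its own grid of unit values, which is what a mosaic image of this kind would color-code.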

Figures

**Figure 1**
Self-organizing map (SOM) gallery of lymphoma subtypes with a resolution of 50 × 50 metagenes: The small mosaic images refer to selected individual tumor samples assigned to the mBL, non-mBL and intermediate subtypes. The larger images represent the respective mean subtype portraits (see Methods section). Dark red and dark blue metagenes refer to the 90th and 10th percentile of expression in each sample, respectively. The complete gallery of all sample portraits is available in Supplementary File 2.

**Figure 2**
Spot module characteristics: (a) The over-expression summary map collects all over-expression spots observed in the individual portraits into one map. Subtypes frequently showing the respective spots are indicated. (b) The over-expression spot map defines the spots used for further analysis. Regions beyond the 98th-percentile threshold of metagene expression are selected. The spots are labeled with capital letters. The blue rectangles include highly correlated spots (r > 0.7). The blue and red dashed lines connect correlated (0.4 < r < 0.7) and anti-correlated (r < −0.6) spots, respectively. (c) The over-expression heatmap shows the mean expression of the spots across all samples in the data set. The samples are sorted according to their subtype. (d) The under-expression summary map collects all under-expression spots observed in the individual portraits. Note the antagonistic nature of mBL and non-mBL expression: spots over-expressed in mBL become under-expressed in non-mBL and *vice versa* (compare with panel a).
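The percentile-based spot selection described in panel (b) can be sketched as follows. This is a simplified illustration applied to a single toy map rather than to the summary map used in the paper; the helper names `percentile` and `overexpression_mask` and the toy values are hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a list of numbers."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def overexpression_mask(metagene_map, q=98.0):
    """Mark map tiles whose value exceeds the q-th percentile of the whole map."""
    flat = [v for row in metagene_map for v in row]
    threshold = percentile(flat, q)
    return [[v > threshold for v in row] for row in metagene_map]

# Toy 10 x 10 map (hypothetical values): flat background plus a two-tile "spot".
toy_map = [[0.0] * 10 for _ in range(10)]
toy_map[2][3] = 5.0
toy_map[2][4] = 4.0

mask = overexpression_mask(toy_map)
spot_tiles = [(r, c) for r in range(10) for c in range(10) if mask[r][c]]
```

Adjacent tiles in the mask would then be grouped into one spot; that grouping step is omitted here.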

**Figure 3**
Functional analysis: (a) The functional context of the most abundant spots is assigned according to the topmost overexpressed gene sets in each of the spots. (b–d) GSZ-profiles and population maps are shown for gene sets accumulating in the mBL and non-mBL specific overexpression spots as indicated by the red ellipses (panel b), for mBL-vs-non-mBL signature sets published previously [10] (c) and for sets accumulating in rare spots (d).

**Figure 4**
Sample similarity analysis: (a) Independent component analysis (ICA) of lymphoma samples. The distribution of the samples is shown in the space spanned by the two leading independent components. (b) The neighbor-joining tree projects the sample similarity relations into a dendrogram. The bush-like structures reveal a finer granularity of subtypes beyond the three classes considered so far.

**Figure 5**
Pairwise correlation analysis of all lymphoma samples: (a) The pairwise correlation map (PCM) visualizes the correlation coefficients for all pairs of samples. The samples are arranged according to their subtype membership as indicated by the color bars. In the heatmap, red colors indicate positive, blue colors negative correlations between the samples. (b) The correlation network (CN) translates the PCM into a graph structure. The nodes are given by the samples and the edges connect positively correlated sample pairs (r > 0.5). Mean subtype portraits are given within the figure (large maps). Outlier nodes are highlighted by arrows. The SOM portraits of the respective samples are shown by small maps. The red circles and the spot letters indicate the outlier spots differing from the subtype specific patterns (compare these individual sample portraits with the mean subtype portraits).
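A minimal sketch of how a pairwise correlation map can be turned into a correlation network with the r > 0.5 edge rule from panel (b). It assumes Pearson correlation on per-sample profiles, which the caption does not specify; the function names and the toy profiles are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_network(profiles, r_min=0.5):
    """Return edges (a, b) for sample pairs whose correlation exceeds r_min."""
    names = list(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) > r_min:
                edges.append((a, b))
    return edges

# Toy profiles (hypothetical): s1 and s2 are similar, s3 is anti-correlated.
profiles = {
    "s1": [1.0, 2.0, 3.0, 4.0],
    "s2": [1.1, 1.9, 3.2, 3.8],
    "s3": [4.0, 3.0, 2.0, 1.0],
}
edges = correlation_network(profiles)
```

In such a graph, a sample with few or no edges to its own subtype would stand out as a candidate outlier.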

**Figure 6**
Correction of outlier samples contaminated with healthy lymph node tissue. The left and right parts of the figure refer to the uncorrected and corrected data, respectively. (a) GSZ-profile and population map of the ‘tonsil’ gene set: The signature is not characteristic of any one subtype, and its genes accumulate in spot ‘S’ of the map. (b) Correlation network of the lymphoma data set. (c) SOM portraits of selected outlier samples. The arrows point to the position of these samples in the CN and in the GSZ-profile. After correction, the expression landscape of the selected samples reveals subtype-specific signatures.

**Figure 7**
k-Means clustering into four subtypes: (a) Mean expression portraits of the four new subtypes. The green arrows indicate the spot pattern transitions from mBL to non-mBL via intermediate A or B. (b) CN colored according to the new subtypes obtained.

**Figure 8**
Consensus clustering: (a–c) Cluster-heatmaps of the consensus matrices for class numbers ranging from two to four, respectively. Pairs of samples frequently found in one joint class accumulate in the blue regions along the diagonal of the map. (d) Cumulative distribution function (CDF) for class numbers ranging from two to six.
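The aggregation step behind a consensus matrix and its CDF can be sketched as below. The sketch assumes the repeated clustering runs on subsamples have already been performed and are given as label dictionaries; the clustering itself, the function names, and the toy runs are hypothetical.

```python
from itertools import combinations

def consensus_matrix(runs, samples):
    """Fraction of runs in which a sample pair shares a cluster,
    counted only over runs in which both samples were drawn."""
    pairs = list(combinations(samples, 2))
    together = {pair: 0 for pair in pairs}
    sampled = {pair: 0 for pair in pairs}
    for labels in runs:
        for a, b in pairs:
            if a in labels and b in labels:
                sampled[(a, b)] += 1
                if labels[a] == labels[b]:
                    together[(a, b)] += 1
    return {p: together[p] / sampled[p] for p in pairs if sampled[p]}

def consensus_cdf(consensus, grid):
    """Empirical CDF of the consensus values, evaluated at the grid points."""
    values = list(consensus.values())
    return [sum(v <= c for v in values) / len(values) for c in grid]

# Toy runs (hypothetical): each dict maps the subsampled items to cluster labels.
runs = [
    {"s1": 0, "s2": 0, "s3": 1},
    {"s1": 1, "s2": 1, "s4": 0},
    {"s1": 0, "s3": 1, "s4": 1},
    {"s2": 0, "s3": 1, "s4": 0},
]
samples = ["s1", "s2", "s3", "s4"]
cm = consensus_matrix(runs, samples)
cdf = consensus_cdf(cm, [0.0, 0.5, 1.0])
```

A CDF that is flat between 0 and 1 indicates that most pairs are either always or never grouped together, which is the usual sign of a stable class number.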

**Figure 9**
Kaplan-Meier survival curves for the original three-subtype classification (a) and the new four-subtype classification (b). Tick marks indicate patients alive at the time of last follow-up. Subtype-specific survival curves are compared using the log-rank test, and the respective p-values are indicated within the figures.
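For readers unfamiliar with the estimator, a minimal sketch of the Kaplan-Meier product-limit calculation follows. The follow-up data are invented for illustration, the function name is hypothetical, and the log-rank comparison is not included.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.
    times: follow-up times; events: 1 = death observed, 0 = censored (alive)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at t still count as at risk at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Toy follow-up data (hypothetical): five patients, two censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

For this toy input the survival estimate drops to 0.8, 0.6 and 0.3 at times 1, 2 and 3; the censored patients only reduce the number at risk.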

Cited by

Epigenetic Heterogeneity of B-Cell Lymphoma: Chromatin Modifiers.
Hopp L, Nersisyan L, Löffler-Wirth H, Arakelyan A, Binder H. Genes (Basel). 2015 Oct 21;6(4):1076-112. doi: 10.3390/genes6041076. PMID: 26506391.
A modular transcriptome map of mature B cell lymphomas.
Loeffler-Wirth H, Kreuz M, Hopp L, Arakelyan A, Haake A, Cogliatti SB, Feller AC, Hansmann ML, Lenze D, Möller P, Müller-Hermelink HK, Fortenbacher E, Willscher E, Ott G, Rosenwald A, Pott C, Schwaenen C, Trautmann H, Wessendorf S, Stein H, Szczepanowski M, Trümper L, Hummel M, Klapper W, Siebert R, Loeffler M, Binder H; German Cancer Aid consortium Molecular Mechanisms for Malignant Lymphoma. Genome Med. 2019 Apr 30;11(1):27. doi: 10.1186/s13073-019-0637-7. PMID: 31039827.
Transcriptional states of CAR-T infusion relate to neurotoxicity - lessons from high-resolution single-cell SOM expression portraying.
Loeffler-Wirth H, Rade M, Arakelyan A, Kreuz M, Loeffler M, Koehl U, Reiche K, Binder H. Front Immunol. 2022 Sep 28;13:994885. doi: 10.3389/fimmu.2022.994885. eCollection 2022. PMID: 36248848.
Variation of RNA Quality and Quantity Are Major Sources of Batch Effects in Microarray Expression Data.
Fasold M, Binder H. Microarrays (Basel). 2014 Dec 16;3(4):322-39. doi: 10.3390/microarrays3040322. PMID: 27600351.
Mapping heterogeneity in patient-derived melanoma cultures by single-cell RNA-seq.
Gerber T, Willscher E, Loeffler-Wirth H, Hopp L, Schadendorf D, Schartl M, Anderegg U, Camp G, Treutlein B, Binder H, Kunz M. Oncotarget. 2017 Jan 3;8(1):846-862. doi: 10.18632/oncotarget.13666. PMID: 27903987.

References

1. Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008;455:1061–1068. doi: 10.1038/nature07385.
2. Cancer Genome Atlas Research Network. Comprehensive molecular characterization of human colon and rectal cancer. Nature. 2012;487:330–337.
3. Barretina J., Caponigro G., Stransky N., Venkatesan K., Margolin A.A., Kim S., Wilson C.J., Lehár J., Kryukov G.V., Sonkin D., et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature. 2012;483:603–607. doi: 10.1038/nature11003.
4. Hudson T.J., Anderson W., Artez A., Barker A.D., Bell C., Bernabé R.R., Bhan M.K., Calvo F., Eerola I., Gerhard D.S., et al. International network of cancer genome projects. Nature. 2010;464:993–998. doi: 10.1038/nature08987.
5. Fernald G.H., Capriotti E., Daneshjou R., Karczewski K.J., Altman R.B. Bioinformatics challenges for personalized medicine. Bioinformatics. 2011;27:1741–1748. doi: 10.1093/bioinformatics/btr295.
