PLoS One. 2008 Jul 16;3(7):e2696.
doi: 10.1371/journal.pone.0002696.

Molecular subsets in the gene expression signatures of scleroderma skin

Ausra Milano et al. PLoS One. 2008.

Erratum in

  • PLoS ONE. 2008;3(10). doi: 10.1371/annotation/05bed72c-c6f6-4685-a732-02c78e5f66c2

Abstract

Background: Scleroderma is a clinically heterogeneous disease with a complex phenotype. The disease is characterized by vascular dysfunction, tissue fibrosis, internal organ dysfunction, and immune dysfunction resulting in autoantibody production.

Methodology and findings: We analyzed the genome-wide patterns of gene expression with DNA microarrays in skin biopsies from distinct scleroderma subsets including 17 patients with systemic sclerosis (SSc) with diffuse scleroderma (dSSc), 7 patients with SSc with limited scleroderma (lSSc), 3 patients with morphea, and 6 healthy controls. 61 skin biopsies were analyzed in a total of 75 microarray hybridizations. Analysis by hierarchical clustering demonstrates nearly identical patterns of gene expression in 17 out of 22 of the forearm and back skin pairs of SSc patients. Using this property of the gene expression, we selected a set of 'intrinsic' genes and analyzed the inherent data-driven groupings. Distinct patterns of gene expression separate patients with dSSc from those with lSSc and both are easily distinguished from normal controls. Our data show three distinct patient groups among the patients with dSSc and two groups among patients with lSSc. Each group can be distinguished by unique gene expression signatures indicative of proliferating cells, immune infiltrates and a fibrotic program. The intrinsic groups are statistically significant (p<0.001) and each has been mapped to clinical covariates of modified Rodnan skin score, interstitial lung disease, gastrointestinal involvement, digital ulcers, Raynaud's phenomenon and disease duration. We report a 177-gene signature that is associated with severity of skin disease in dSSc.

Conclusions and significance: Genome-wide gene expression profiling of skin biopsies demonstrates that the heterogeneity in scleroderma can be measured quantitatively with DNA microarrays. The diversity in gene expression demonstrates multiple distinct gene expression programs in the skin of patients with scleroderma.

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Figure 1. Gene expression signatures in scleroderma.
4,149 probes that changed at least 2-fold from their median value on at least two microarrays were selected from 75 microarray hybridizations representing 61 biopsies. Probes and microarrays were ordered by 2-dimensional average linkage hierarchical clustering. This clustering shows that the dSSc, lSSc, and morphea samples form distinct groups largely stratified by their clinical diagnosis. A. The unsupervised hierarchical clustering dendrogram shows the relationship among the samples using this list of 4,149 probes. Sample names have been color-coded by their clinical diagnosis: dSSc in red, lSSc in orange, morphea and EF in black, and healthy controls (Nor) in green. Forearm (FA) and Back (B) are indicated for each sample. Solid arrows indicate the 14 of 22 forearm-back pairs that cluster next to one another; dashed arrows indicate the additional 3 forearm-back pairs that cluster with only a single sample between them. Technical replicates are indicated by the labels (a), (b) or (c). 9 out of 14 technical replicates cluster immediately beside one another. B. Overview of the gene expression profiles for the 4,149 probes. Each probe has been centered on its median expression value across all samples analyzed. Measurements that are above the median are colored red and those below the median are colored green. The intensity of the color is directly proportional to the fold change. Groups of genes on the right hand side indicated with colored bars are shown in greater detail in panels C–H. C. Immunoglobulin genes expressed highly in a subset of patients with dSSc and in patients with morphea. D. Proliferation signature. E. Collagen and extracellular matrix components. F. Genes typically associated with the presence of T-lymphocytes and macrophages. G. Genes showing low expression in dSSc. H. Heterogeneous expression cluster that is high in lSSc and a subset of dSSc. In each case only a subset of the genes in each cluster is shown.
The precise location of each gene in the cluster can be viewed in Supplemental Figure S1.
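The probe selection rule in this caption (at least 2-fold from the probe's median on at least two arrays) can be illustrated with a minimal sketch. It assumes the measurements are log2 ratios and that a 2-fold change in either direction counts; the probe names and values are hypothetical and are not data from the study.

```python
import math
from statistics import median

def median_center(log2_values):
    """Subtract the probe's median from each of its measurements."""
    m = median(log2_values)
    return [v - m for v in log2_values]

def passes_filter(log2_values, fold=2.0, min_arrays=2):
    """True if the probe deviates at least `fold` from its median
    (in either direction) on at least `min_arrays` arrays."""
    threshold = math.log2(fold)  # a 2-fold change is 1.0 in log2 units
    centered = median_center(log2_values)
    hits = sum(1 for v in centered if abs(v) >= threshold)
    return hits >= min_arrays

# Hypothetical log2 ratios: one list per probe, one value per array.
probes = {
    "probe_a": [0.0, 0.1, -0.1, 1.5, -1.2],
    "probe_b": [0.0, 0.2, -0.3, 0.4, 1.1],
    "probe_c": [0.0, 0.0, 0.0, 0.0, 0.0],
}
selected = [name for name, vals in probes.items() if passes_filter(vals)]
print(selected)
```

Only the probes that pass this filter would go on to the hierarchical clustering step, which is not shown here.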
Figure 2. Cluster analysis using the scleroderma intrinsic gene set.
The 995 most ‘intrinsic’ genes selected from 75 microarray hybridizations analyzing 34 individuals. Two major branches of the dendrogram tree are evident which divide a subset of the dSSc samples from all other samples. Within these major groups are smaller branches with identifiable biological themes, which have been colored accordingly: blue for diffuse 1, red for diffuse 2, purple for inflammatory, orange for limited and green for normal-like. Statistically significant clusters (p<0.001) identified by SigClust are indicated by an asterisk (*) at the lowest significant branch. A. Experimental sample hierarchical clustering dendrogram. Black bars indicate forearm-back pairs which cluster together based on this analysis. B. Scaled down overview of the intrinsic gene expression signatures. C. Limited SSc gene expression cluster. D. Proliferation cluster. E. Immunoglobulin gene expression cluster. F. T-lymphocyte and IFNγ gene expression cluster. The full figure with all gene names can be viewed in Supplemental Figure S2.
Figure 3. Robustness of sample classification.
The robustness of the sample classifications was analyzed by consensus clustering, which uses multiple iterations of K-means clustering with random restart. 500 subsets of the data were sampled without replacement. The results of consensus clustering and Principal Component Analysis (PCA) applied to the 75 arrays and 995 intrinsic genes are shown. A. Consensus matrices are shown for K = 4, 5 and 6. Cluster numbers are shown and cluster assignments are summarized in Table 3. B. Empirical consensus distribution function (CDF) plots corresponding to K = 2, 3, 4, …, 10. The ideal number of clusters can be identified when the area under the curve shows minimal increases with increasing K. C. Proportion increase Δ(K) in the area under the CDF. D. PCA was performed using TIGR MeV software; principal components 1 and 2 are plotted in 2-dimensional space. Samples (points in space) have been colored according to Figure 2. Normal-like are green, limited orange, diffuse-proliferation in red and inflammatory in black. Circles indicate groups of samples distinguished by the top two principal components. E. Principal components 1 and 3 were plotted in two-dimensional space and show distinction between two groups within the diffuse-proliferation, normal-like and limited scleroderma.
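The quantities in panels B and C can be sketched as follows. This is an illustration only: it assumes a consensus matrix is already available for each K (entries are the fraction of resampling runs in which two samples co-clustered), approximates the area under the empirical CDF on a fixed grid, and indexes Δ(K) as the relative change from the previous K, which is one common convention and may differ from the one used for the figure. The function names are hypothetical.

```python
def cdf_area(consensus, n_bins=100):
    """Approximate area under the empirical CDF of the off-diagonal
    consensus values, evaluated on a grid of n_bins points in (0, 1]."""
    n = len(consensus)
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    area = 0.0
    for b in range(1, n_bins + 1):
        c = b / n_bins
        cdf = sum(1 for v in values if v <= c) / len(values)
        area += cdf / n_bins
    return area

def proportion_increase(areas):
    """Map each K to the relative increase in CDF area over the previous K.
    The smallest K simply reports its own area."""
    ks = sorted(areas)
    delta = {ks[0]: areas[ks[0]]}
    for prev, k in zip(ks, ks[1:]):
        delta[k] = (areas[k] - areas[prev]) / areas[prev]
    return delta

# Hypothetical 3-sample consensus matrix with perfectly stable clusters.
perfect = [[1.0, 1.0, 0.0],
           [1.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]
print(cdf_area(perfect))
print(proportion_increase({2: 0.5, 3: 0.6, 4: 0.63}))
```

A K beyond which the proportion increase stays small would be read as a reasonable number of clusters, matching the criterion described for panel B.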
Figure 4. Scleroderma Module Map.
A. Module map of the Gene Ontology (GO) Biological Processes differentially expressed among the scleroderma samples is shown. Each column represents a single microarray and each row represents a single GO Biological Process. Patient samples are organized as described in Figure 2. Only modules that were significantly enriched (minimum 2-fold change, p<0.05) on at least 4 microarrays are shown. The average expression of the gene hits from each enriched gene set is displayed here. Only gene sets that show significant differences after multiple hypothesis testing were included. Select GO Biological Processes are shown. The entire figure with all biological processes can be viewed in Supplementary Figure S4. B. Module map using gene list created from an experimental identification of all cell cycle-regulated genes and genes found to be expressed in specific lymphocyte subsets.
Figure 5. Correlation between gene expression and clinical covariates.
A. Shown is the color-coded heatmap of the 75 arrays and 995 intrinsic genes. The graph on the right of the heat map shows disease duration for each sample. Disease duration was set to zero for normal controls and morphea samples. B. Pearson correlations were calculated between skin score and the expression values for each gene in the list. The moving average of the Pearson correlation (10-gene window) was plotted. Regions of high negative and high positive correlations to the three different clinical parameters are indicated (regions I–III shaded grey). C. Moving average of the Pearson correlation coefficients (10-gene window) between the self-reported Raynaud's severity score and the expression of each gene. D. Moving average of the Pearson correlations (10-gene window) between extent of skin involvement and a diagnosis vector (see Methods) for dSSc (red), lSSc (orange) and healthy controls (green). E. Box plot of disease duration for dSSc patients. The patients included in the diffuse-proliferation group had a disease duration of 8.4±6.4 years. The dSSc patients that fell into the inflammatory or normal-like groups had a disease duration of 3.2±3.9 years (p<0.12, t-test). F. Genes that ideally discriminate ‘Diffuse 1’ and ‘Diffuse 2’ groups were selected using Significance Analysis of Microarrays (SAM). 329 genes were selected with an FDR<1%. Pearson correlation coefficients were calculated between each clinical parameter and the expression for each gene and plotted as a 10-gene moving window.
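The 10-gene moving average of per-gene Pearson correlations used in panels B–D and F can be sketched as below. This is a minimal illustration, assuming genes are already in their clustered order and each gene has one expression value per sample; the helper names are hypothetical, and returning 0.0 for a constant vector is a choice made here, not something stated in the caption.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def moving_average(values, window=10):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

def correlation_profile(ordered_gene_rows, clinical, window=10):
    """Correlate each gene (in clustered order) with a clinical covariate,
    then smooth the per-gene correlations with a moving window."""
    per_gene = [pearson(row, clinical) for row in ordered_gene_rows]
    return moving_average(per_gene, window)
```

For example, `correlation_profile(rows, skin_scores)` would give the smoothed curve for skin score, where `rows` and `skin_scores` stand in for the expression matrix and the per-sample covariate.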
Figure 6. Genes correlated with MRSS.
We selected the genes from the 995 intrinsic list that had a correlation greater than 0.5 or less than −0.5 to the MRSS. This list of 177 genes was then used to organize the skin biopsies. Forearm-back pairs from 14 patients with dSSc (mean MRSS of 26.34±9.42) clustered onto one branch of the dendrogram tree. The forearm-back pairs of 4 patients with dSSc (mean MRSS of 18.11±6.45) clustered onto a different branch of the dendrogram tree. The difference in skin score between these two groups is statistically significant (p<0.0197).
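The selection rule in this caption (correlation with MRSS above 0.5 or below −0.5) amounts to a threshold on the absolute Pearson correlation. A minimal sketch, using hypothetical gene names and made-up values rather than the study data:

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_correlated(expression, scores, cutoff=0.5):
    """Keep genes whose correlation with the score is > cutoff or < -cutoff."""
    return [gene for gene, values in expression.items()
            if abs(pearson(values, scores)) > cutoff]

# Hypothetical skin scores (one per biopsy) and expression values.
mrss = [10, 20, 30, 40]
expression = {
    "gene_up": [0.1, 0.4, 0.9, 1.3],     # rises with skin score
    "gene_down": [1.0, 0.6, 0.1, -0.4],  # falls with skin score
    "gene_flat": [0.2, -0.2, -0.2, 0.2], # no linear relationship
}
selected = select_correlated(expression, mrss)
print(selected)
```

The resulting gene list would then be used to re-cluster the biopsies, as the caption describes.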
Figure 7. Quantitative Real Time PCR analysis of representative biopsies.
The mRNA levels of three genes, TNFRSF12A (A), CD8A (B) and WIF1 (C), were analyzed by Taqman quantitative real time PCR. Each was analyzed in two representative forearm skin biopsies from each of the major subsets of proliferation, inflammatory, limited and normal controls. In the case of TNFRSF12A, patient dSSc11 was replaced by patient dSSc10; the two cluster next to one another in the intrinsic subsets and show similar clinical characteristics (Table 1). Each qRT-PCR assay was performed in triplicate for each sample. The level of each gene was then normalized against triplicate measurements of GAPDH to control for total mRNA levels (see Materials and Methods). The relative expression values are displayed as the fold change for each gene relative to the median value of the eight samples analyzed.
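The normalization described here can be sketched under stated assumptions. The caption does not give the exact formula, so this example assumes the raw measurements are threshold cycle (Ct) values, averages the triplicates, and uses the common 2^(−ΔCt) form with perfect amplification efficiency. The function names are hypothetical.

```python
from statistics import mean, median

def relative_level(target_ct, gapdh_ct):
    """GAPDH-normalized level from triplicate Ct values,
    assuming one cycle corresponds to a 2-fold difference."""
    delta_ct = mean(target_ct) - mean(gapdh_ct)
    return 2.0 ** (-delta_ct)

def fold_vs_median(levels):
    """Express each sample's level as fold change relative to the
    median level across all samples analyzed."""
    m = median(levels.values())
    return {sample: value / m for sample, value in levels.items()}
```

With eight samples, `fold_vs_median` divides by the mean of the two middle values, since the median of an even number of values falls between them.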
