Arthritis Rheumatol. 2018 May;70(5):690-701.
doi: 10.1002/art.40428. Epub 2018 Apr 2.

Identification of Three Rheumatoid Arthritis Disease Subtypes by Machine Learning Integration of Synovial Histologic Features and RNA Sequencing Data

Dana E Orange et al. Arthritis Rheumatol. 2018 May.

Abstract

Objective: In this study, we sought to refine histologic scoring of rheumatoid arthritis (RA) synovial tissue by training with gene expression data and machine learning.

Methods: Twenty histologic features were assessed in 129 synovial tissue samples (n = 123 RA patients and n = 6 osteoarthritis [OA] patients). Consensus clustering was performed on gene expression data from a subset of 45 synovial samples. Support vector machine learning was used to predict gene expression subtypes, using histologic data as the input. Corresponding clinical data were compared across subtypes.

Results: Consensus clustering of gene expression data revealed 3 distinct synovial subtypes, including a high inflammatory subtype characterized by extensive infiltration of leukocytes, a low inflammatory subtype characterized by enrichment in pathways including transforming growth factor β, glycoproteins, and neuronal genes, and a mixed subtype. Machine learning applied to histologic features, with gene expression subtypes serving as labels, generated an algorithm for the scoring of histologic features. Patients with the high inflammatory synovial subtype exhibited higher levels of markers of systemic inflammation and autoantibodies. C-reactive protein (CRP) levels were significantly correlated with the severity of pain in the high inflammatory subgroup but not in the others.

Conclusion: Gene expression analysis of RA and OA synovial tissue revealed 3 distinct synovial subtypes. These labels were used to generate a histologic scoring algorithm in which the histologic scores were found to be associated with parameters of systemic inflammation, including the erythrocyte sedimentation rate, CRP level, and autoantibody levels. Comparison of gene expression patterns to clinical features revealed a potentially clinically important distinction: mechanisms of pain may differ in patients with different synovial subtypes.

Figures

Figure 1.
Study overview.
Figure 2.
Histologic features of the arthritic synovium. A, Frequency distribution of histologic scores of 20 features assessed in 129 synovial samples from patients with rheumatoid arthritis or osteoarthritis. B, Interrater reliability of the assessment of 15 histologic features by 2 pathologists in 40 synovial samples. Results are the mean ± SD kappa statistic. C and D, Representative images of hematoxylin and eosin-stained synovium, showing 10 synovial histologic features (arrows) that were retained for modeling based on a frequency of >5% and fair interrater reliability. Original magnification × 100 (bars = 200 μm) in C; × 20 (bars = 20 μm) in D.
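The interrater reliability in panel B is reported as a kappa statistic. The caption does not say which variant was used, so the following is only a minimal sketch assuming unweighted Cohen's kappa for two raters. The function name and the example scores are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same samples."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of samples")
    n = len(rater_a)
    # Observed agreement: fraction of samples given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over scores.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Illustrative binary scores (feature present = 1, absent = 0) for 10 samples.
pathologist_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
pathologist_2 = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(pathologist_1, pathologist_2)
```

For ordinal features a weighted kappa may be more appropriate; this sketch treats every disagreement as equally severe.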
Figure 3.
Identification of 3 distinct synovial subtypes by gene expression analysis of 45 synovial tissue samples. A, Consensus clustering heatmaps using the top 500 most variable genes show clusters constrained to k = 2, k = 3, and k = 4. Red denotes samples that clustered together consistently, blue denotes samples that never clustered together, and white denotes samples that showed inconsistent co-clustering. P represents the probability score for co-clustering of individual samples. Samples were labeled green (low inflammatory [L]), orange (high inflammatory [H]), and yellow (mixed [M]) according to partitioning obtained when constraining the clustering algorithm to k = 3 clusters. B, Principal components analysis (PCA) plot of the RNA-seq data, based on the top 500 most variable genes. Samples are color-coded according to cluster as defined in A. C, Heatmaps of normalized gene expression of the top 1,000 and top 50 differentially expressed genes (DEGs) are shown across the 3 clusters (low, mixed, and high inflammatory). Red denotes increased gene expression (Z score), and green denotes decreased gene expression.
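The co-clustering probability shown in the consensus heatmaps can be illustrated with a short sketch. It assumes the usual consensus clustering recipe, in which samples are repeatedly subsampled and clustered, and each pair's score is the fraction of runs containing both samples in which they shared a cluster. The clustering step itself is left out, and the function name, sample names, and partitions below are hypothetical.

```python
from itertools import combinations


def consensus_matrix(samples, partitions):
    """Pairwise co-clustering frequency across repeated clustering runs.

    `partitions` is a list of dicts, one per run, mapping each sample that was
    drawn in that run to its cluster label. Entry [i][j] is the fraction of
    runs containing both samples in which they were assigned to the same
    cluster, or None if the pair was never drawn together.
    """
    index = {s: i for i, s in enumerate(samples)}
    n = len(samples)
    together = [[0] * n for _ in range(n)]
    drawn = [[0] * n for _ in range(n)]
    for partition in partitions:
        for a, b in combinations(partition, 2):
            i, j = index[a], index[b]
            drawn[i][j] += 1
            drawn[j][i] += 1
            if partition[a] == partition[b]:
                together[i][j] += 1
                together[j][i] += 1
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                matrix[i][j] = 1.0
            elif drawn[i][j]:
                matrix[i][j] = together[i][j] / drawn[i][j]
    return matrix


# Three illustrative runs over three samples; run 2 did not draw s3.
samples = ["s1", "s2", "s3"]
runs = [
    {"s1": 0, "s2": 0, "s3": 1},
    {"s1": 1, "s2": 1},
    {"s1": 1, "s2": 0, "s3": 0},
]
consensus = consensus_matrix(samples, runs)
```

In heatmap terms, a value of 1 corresponds to red (always co-clustered), 0 to blue (never co-clustered), and intermediate values to the paler cells.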
Figure 4.
Gene expression characteristics of the 3 synovial subtypes. A, Enrichment scores of functional annotation clusters of genes with increased expression in the high inflammatory subtype compared to the other subtypes. B, Enrichment scores of functional annotation clusters of genes with increased expression in the low inflammatory subtype compared to the other subtypes. C, CIBERSORT-inferred fraction of leukocyte cell types according to the 3 synovial subtypes. * = P < 0.05, by Kruskal-Wallis test. MHCI/II = major histocompatibility complex class I/class II; ITIM = immunoreceptor tyrosine–based inhibition motif; TNF = tumor necrosis factor; CTL = cytotoxic T lymphocyte; GPI = glycoprotein I; NK = natural killer.
Figure 5.
Machine learning classification of histologic features using RNA-seq-defined synovial subgroups. A, Receiver operating characteristic curves (ROCs) for the histologic scoring algorithm, with area under the ROC (AUC) for the 3 synovial subtypes. B, Mean absolute weights for 10 histologic features, showing separation of the high inflammatory subtype from the other subtypes classified by the histologic scoring algorithm. Results are the median (interquartile range). C, Frequency distribution of raw histologic scores among the 3 synovial subtypes in rheumatoid arthritis (RA) patients, classified by either RNA-seq clustering (left) or histologic scoring algorithm (right). D, Frequency distribution of clinical laboratory features in RA patients (rank score: 0 = minimum, 1 = 25th percentile, 2 = 50th percentile, 3 = 75th percentile, 4 = maximum) among the 3 synovial subtypes classified by either gene expression clustering (left) or histologic scoring algorithm (right). ESR = erythrocyte sedimentation rate; CRP = C-reactive protein; CCP = anti–cyclic citrullinated peptide antibodies; RF = rheumatoid factor.
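For a 3-class problem such as this one, a per-subtype AUC is commonly obtained by treating each subtype in turn as the positive class (one versus rest). The caption does not state how the curves were computed, so the following is a sketch under that assumption. It uses the rank-based definition of AUC (the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half). All names and numbers are hypothetical.

```python
def roc_auc(is_positive, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(true_subtypes, scores_by_subtype):
    """One AUC per subtype, treating that subtype as positive and the rest as negative."""
    return {
        subtype: roc_auc([t == subtype for t in true_subtypes], scores)
        for subtype, scores in scores_by_subtype.items()
    }


# Illustrative classifier scores for four samples (not study data).
true_subtypes = ["high", "high", "low", "mixed"]
scores_by_subtype = {
    "high": [0.9, 0.4, 0.5, 0.1],
    "low": [0.1, 0.2, 0.8, 0.3],
}
aucs = one_vs_rest_auc(true_subtypes, scores_by_subtype)
```

The pairwise comparison is quadratic in the number of samples, which is acceptable at the scale of a few dozen to a few hundred samples.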
Figure 6.
Comparison of clinical and histologic features in 123 patients with rheumatoid arthritis (RA) classified by synovial subtype according to the histologic scoring algorithm. A, Histologic scores of various features identified in RA patients in the low, mixed, and high inflammatory subtypes. Top, Ordinal features; scores are the median (interquartile range). Bottom, Binary features; scores are the mean ± SD. Top, ** = P < 0.01; **** = P < 0.0001, by Kruskal-Wallis test with Dunn’s test for multiple comparisons. Bottom, * = P < 0.05; ** = P < 0.01; **** = P < 0.0001, by chi-square test. B, Clinical features of the RA patients in the low, mixed, and high inflammatory subtypes. Values are the median (interquartile range). * = P < 0.05; ** = P < 0.01; *** = P < 0.001; **** = P < 0.0001, by Kruskal-Wallis test with Dunn’s test for multiple comparisons. C, Log2-transformed plasma levels of antibodies to putative RA-associated autoantigens. Results are the mean fluorescence intensity (MFI). Antibody levels were significantly different among the 3 synovial subtypes, as determined by analysis of variance with Tukey’s test for multiple comparisons. Values in parentheses are the number of samples from patients assigned to each synovial subtype. D, Spearman’s correlation (rho) coefficients for the assessment of correlations between pain severity scores and levels of acute-phase reactants (erythrocyte sedimentation rate [ESR] and C-reactive protein [CRP]) across the 3 synovial subtypes. RF = rheumatoid factor; CCP = anti-cyclic citrullinated peptide antibodies; SJC = swollen joint count; TJC = tender joint count.
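The per-subtype correlation in panel D can be sketched as follows: Spearman's rho is the Pearson correlation of the ranks, computed separately within each synovial subtype. This is an illustration only; the function names, record layout, and values are hypothetical and do not reproduce the study data.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values.

    Raises ZeroDivisionError if either variable is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def rho_by_subtype(records):
    """Compute rho between marker level and pain score within each subtype."""
    grouped = defaultdict(lambda: ([], []))
    for subtype, marker, pain in records:
        grouped[subtype][0].append(marker)
        grouped[subtype][1].append(pain)
    return {s: spearman_rho(m, p) for s, (m, p) in grouped.items()}


# Illustrative (subtype, CRP level, pain score) records.
records = [
    ("high", 2.0, 3), ("high", 8.0, 6), ("high", 15.0, 9),
    ("low", 1.0, 7), ("low", 2.0, 2), ("low", 3.0, 5),
]
rho = rho_by_subtype(records)
```

Because the statistic depends only on ranks, it is insensitive to the skewed distributions typical of CRP and ESR values.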
