Sci Rep. 2017 Jul 24;7(1):6236.
doi: 10.1038/s41598-017-06516-1.

Novel risk genes for systemic lupus erythematosus predicted by random forest classification

Jonas Carlsson Almlöf et al. Sci Rep. 2017.

Abstract

Genome-wide association studies have identified risk loci for systemic lupus erythematosus (SLE), but a large proportion of the genetic contribution to SLE remains unexplained. To detect novel risk genes and to predict an individual's SLE risk, we designed a random forest classifier using SNP genotype data generated on the "Immunochip" from 1,160 patients with SLE and 2,711 controls. Using gene importance scores defined by the random forest classifier, we identified 15 potential novel risk genes for SLE. Of these, 12 are associated with autoimmune diseases other than SLE, whereas three genes (ZNF804A, CDK1, and MANF) have not previously been associated with autoimmunity. Random forest classification also allowed prediction of patients at risk for lupus nephritis with an area under the curve of 0.94. By allele-specific gene expression analysis, we detected cis-regulatory SNPs that affect the expression levels of six of the top 40 genes identified by the random forest analysis, indicating a regulatory role for the identified risk variants. The top 40 genes from the prediction were enriched for genes differentially expressed between B and T cells according to RNA sequencing of samples from five healthy donors, with higher expression in B cells than in T cells being more frequent.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Prediction accuracy. Prediction accuracy measured by the area under the curve (AUC) using genotype data from the Immunochip. All data from 1,160 SLE patients and 2,711 controls were used for the prediction of SLE disease status by random forests (RF) and using a risk score based on the single SNP association analysis. The random forest classification was also applied to the subgroup of the SLE patients diagnosed with lupus nephritis (n = 274) together with all control samples.
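As a rough illustration of the two quantities named in this caption, the sketch below computes an additive SNP risk score and evaluates it with the AUC. It is a minimal sketch under stated assumptions, not the authors' pipeline: the weighting by log odds ratios, the function names `risk_score` and `auc`, and all genotypes and odds ratios are hypothetical toy values, not data from the study.

```python
import math


def risk_score(genotypes, log_odds):
    """Additive score: risk-allele counts (0, 1 or 2) weighted by per-SNP log odds ratios.

    Assumption: the record does not say how the study's risk score was weighted.
    """
    return sum(g * w for g, w in zip(genotypes, log_odds))


def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (equivalent to the Mann-Whitney U statistic).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: three SNPs, three cases (label 1), three controls (label 0).
log_odds = [math.log(1.5), math.log(1.2), math.log(1.3)]
genotypes = [
    [2, 1, 1],
    [1, 2, 0],
    [0, 1, 2],
    [0, 0, 1],
    [1, 1, 1],
    [0, 1, 0],
]
labels = [1, 1, 1, 0, 0, 0]

scores = [risk_score(g, log_odds) for g in genotypes]
toy_auc = auc(scores, labels)
print(f"toy AUC: {toy_auc:.3f}")
```

The same `auc` function can score any classifier that outputs a per-individual probability or vote fraction, which is how a random forest and a risk score can be compared on one scale.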
Figure 2
Overlapping genes. Genes overlapping between the top 40 genes defined by the random forest prediction and the regular single SNP association analysis.
Figure 3
Over-representation of genes with allele-specific expression (ASE) in disease-associated genes. Fold difference of expressed predicted SLE genes and autoimmunity-associated genes with ASE in more than 80% of the individuals compared to all other genes. The risk genes in T cells were significantly overrepresented in all gene sets, with the top 40 genes from random forest classification (p = 0.0079), top 40 genes from logistic regression (p = 0.015), and autoimmunity-associated genes (p < 0.0001). Additionally, the enrichment of the autoimmunity-associated risk genes in B cells was also significant (p = 0.007).
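One common way to quantify an over-representation of this kind is a fold difference together with a one-sided hypergeometric test. The caption does not state which test the authors used, so the sketch below is only an assumed illustration; the function names and all counts are hypothetical and not taken from the study.

```python
from math import comb


def fold_difference(hits_in_set, set_size, hits_in_rest, rest_size):
    """Ratio of the ASE fraction in the gene set to the ASE fraction in all other genes."""
    return (hits_in_set / set_size) / (hits_in_rest / rest_size)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from N genes,
    K of which show ASE. Used here as an assumed enrichment test."""
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts: 1,000 expressed genes, 100 with ASE;
# a 40-gene set of interest contains 10 of them.
N, K, n, k = 1000, 100, 40, 10

fold = fold_difference(k, n, K - k, N - n)
p_value = hypergeom_upper_tail(k, n, K, N)
print(f"fold difference: {fold:.2f}, one-sided p: {p_value:.4g}")
```

With these toy counts, the gene set has an ASE fraction of 10/40 against 90/960 in the remaining genes, so the fold difference is about 2.67.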
Figure 4
Over-representation of differentially expressed genes. Over-representation of differentially expressed genes between B cells and T cells among the top 40 genes from the random forest prediction of SLE at different significance cutoffs. Blue curve shows all genes; green curve shows genes that were expressed at a higher level in B cells than in T cells; yellow curve shows genes that were expressed at a higher level in T cells than in B cells.
