Am J Hum Genet. 2011 Oct 7;89(4):496-506.
doi: 10.1016/j.ajhg.2011.09.002. Epub 2011 Sep 29.

Integrating autoimmune risk loci with gene-expression data identifies specific pathogenic immune cell subsets

Xinli Hu et al. Am J Hum Genet.

Erratum in

  • Am J Hum Genet. 2011 Nov 11;89(5):682

Abstract

Although genome-wide association studies have implicated many individual loci in complex diseases, identifying the exact causal alleles and the cell types within which they act remains a major challenge. To ultimately understand disease mechanisms, researchers must carefully design functional studies in relevant pathogenic cell types to demonstrate the cellular impact of disease-associated genetic variants. This challenge is highlighted in autoimmune diseases, such as rheumatoid arthritis, where any of a broad range of immunological cell types might be affected by genetic variation to cause disease. To this end, we developed a statistical approach to identify potentially pathogenic cell types in autoimmune diseases by using a gene-expression data set of 223 sorted murine immune cell types from the Immunological Genome Consortium. We found enrichment of transitional B cell genes in systemic lupus erythematosus (p = 5.9 × 10⁻⁶) and epithelial-associated stimulated dendritic cell genes in Crohn disease (p = 1.6 × 10⁻⁵). Finally, we demonstrated enrichment of CD4+ effector memory T cell genes within rheumatoid arthritis loci (p < 10⁻⁶). To further validate the role of CD4+ effector memory T cells within rheumatoid arthritis, we identified 436 loci that were not yet known to be associated with the disease but that had a statistically suggestive association in a recent genome-wide association study (GWAS) meta-analysis (p_GWAS < 0.001). Even among these putative loci, we noted a significant enrichment for genes specifically expressed in CD4+ effector memory T cells (p = 1.25 × 10⁻⁴). These cell types are primary candidates for future functional studies to reveal the role of risk alleles in autoimmunity. Our approach can be applied to other phenotypes, outside of autoimmunity, for which many loci have been discovered and high-quality cell-type-specific gene-expression data are available.

Figures

Figure 1
Statistical Approach. (A) Normalize gene expression data. We normalized the expression profile by dividing the expression value of each gene in each tissue by the Euclidean norm of the gene's expression across all tissues in order to emphasize tissue-specific expression. Scores were converted to nonparametric percentiles. (B) For each tissue, identify the most specifically expressed gene in a locus. For each SNP, we first defined genes implicated by the SNP based on LD. For a specific tissue, we identified the most specifically expressed gene in the locus and then scored the SNP based on that gene's nonparametric specific expression score after adjusting for multiple genes within the locus. (C) With permutations, assess the significance of each tissue across loci. We first calculated a score for the tissue by taking the average of the log adjusted-percentiles across loci from B. Then, we randomly selected matched SNP sets and scored them similarly. The proportion of random SNP sets with tissue scores exceeding that of the actual set of SNPs being tested was reported as the p value of the tissue.
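The three steps in this caption can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the adjustment for multiple genes is assumed to be 1 - (1 - p)^n for a locus with n genes, the tissue score uses the negative log so that larger means more specific, and "matched" random SNP sets are approximated by random loci with the same number of genes. The toy expression matrix is made up for illustration.

```python
import math
import random

def specificity_percentiles(expr):
    """expr[g][t] is the expression of gene g in tissue t (step A)."""
    n_genes, n_tissues = len(expr), len(expr[0])
    normalized = []
    for row in expr:
        # Divide each gene's profile by its Euclidean norm across tissues.
        norm = math.sqrt(sum(v * v for v in row)) or 1.0  # guard all-zero genes
        normalized.append([v / norm for v in row])
    pct = [[0.0] * n_tissues for _ in range(n_genes)]
    for t in range(n_tissues):
        # Rank genes within a tissue; the most specific gene gets the smallest percentile.
        order = sorted(range(n_genes), key=lambda g: -normalized[g][t])
        for rank, g in enumerate(order, start=1):
            pct[g][t] = rank / n_genes
    return pct

def locus_score(pct_tissue, genes):
    """Step B: best percentile in the locus, adjusted for the number of genes.
    The 1 - (1 - p)^n adjustment is an assumption of this sketch."""
    p_min = min(pct_tissue[g] for g in genes)
    return 1.0 - (1.0 - p_min) ** len(genes)

def tissue_score(pct_tissue, loci):
    """Step C: average negative log adjusted percentile across loci."""
    return sum(-math.log(locus_score(pct_tissue, genes)) for genes in loci) / len(loci)

def permutation_p(pct_tissue, loci, n_perm=1000, seed=0):
    """Fraction of random locus sets scoring at least as high as the observed set.
    Random loci are matched only on gene count (a simplification)."""
    rng = random.Random(seed)
    observed = tissue_score(pct_tissue, loci)
    n_genes = len(pct_tissue)
    exceed = 0
    for _ in range(n_perm):
        random_loci = [rng.sample(range(n_genes), len(genes)) for genes in loci]
        if tissue_score(pct_tissue, random_loci) >= observed:
            exceed += 1
    return exceed / n_perm

# Toy data: 6 genes x 3 tissues (hypothetical values).
expr = [[10, 0, 0], [1, 1, 1], [0, 5, 0], [2, 2, 1], [0, 0, 3], [1, 2, 2]]
pct = specificity_percentiles(expr)
tissue0 = [row[0] for row in pct]
loci = [[0, 1], [3]]  # each locus is a list of gene indices
p_value = permutation_p(tissue0, loci, n_perm=200)
```

In a real analysis the random SNP sets would presumably be matched on more than gene count, and the number of permutations would need to be large enough to resolve p values near the significance threshold.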
Figure 2
Application to Autoimmune Diseases. We evaluated SNPs associated with systemic lupus erythematosus, rheumatoid arthritis, and Crohn disease for cell-specific gene enrichment in 223 murine immune-cell types. The Bonferroni-corrected p value threshold is shown by a dotted line. In each case, we labeled cell types that are significant after correction for multiple hypothesis testing (p < 2.2 × 10⁻⁴), and we show the single most significant cell type in bold. (A) In lupus, B cells, especially transitional B cells in the spleen (B.T2.Sp), showed significant enrichment of genes within disease loci. (B) In Crohn disease, epithelial-associated stimulated CD103- dendritic cells (DC.103-11b+PolyIC.Lu) achieved the highest statistical significance. (C) In rheumatoid arthritis, the four subsets of CD4+ effector memory T cells in both the spleen and lymph nodes (T.4Mem.LN, T.4Mem.Sp, T.4Mem44h62l.LN, and T.4Mem44h62l.Sp) showed the most significant gene enrichment (p < 10⁻⁶).
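The significance threshold in this caption is consistent with a Bonferroni correction over the 223 cell types. The short check below assumes a family-wise error rate of 0.05, which is not stated in the text; the p values are the ones reported in the abstract.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

# alpha = 0.05 is an assumption; 223 is the number of cell types tested.
threshold = bonferroni_threshold(0.05, 223)
print(f"{threshold:.1e}")  # prints 2.2e-04

# Exact p values reported in the abstract.
reported = {
    "lupus, transitional B cells": 5.9e-6,
    "Crohn disease, stimulated dendritic cells": 1.6e-5,
    "RA putative loci, CD4+ effector memory T cells": 1.25e-4,
}
significant = {name: p < threshold for name, p in reported.items()}
```

All three reported values fall below the threshold, so each would remain significant after correction under this assumption.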
Figure 3
Validating Enrichment of Cell-Specific Expression in RA Loci. We evaluated 436 putative loci containing SNPs nominally associated with rheumatoid arthritis (p_GWAS < 0.001) for cell-specific gene enrichment in 223 murine immune cell types. The Bonferroni-corrected p value threshold is shown by a dotted line. (A) We labeled cell types that are significant after correction for multiple hypothesis testing (p < 2.2 × 10⁻⁴). Only one of the four CD4+ effector memory cell types (T.4Mem44h62l.LN) is significant. (B) We aggregated the expression specificity scores for the four different types of CD4+ effector memory T cells and calculated empirical locus p values for each of the 436 loci. These p values assessed the degree of specificity that the most highly specific CD4+ effector memory T cell genes in each locus achieved. In red, we plotted the histogram of these empirical locus p values, whereas in gray, we plotted the expected histogram of empirical locus p values. We plotted the ratio of those two values at each significance interval. We noted modest deflation at higher p values (p > 0.5) and inflation at lower p values (p < 0.1). (C) We grouped loci by their association statistics (p_GWAS) into 50 bins ranging from p_GWAS < 0.001 (as in panels A and B) to 0.049 ≤ p_GWAS < 0.05. Then, using aggregated specificity scores for the four types of CD4+ effector memory T cells, we evaluated these groups to see if they were enriched for specifically expressed genes. For each bin, we plotted the observed-to-expected ratio of loci with lower empirical locus p values (p < 0.1, blue, right axis) and the statistical significance of enrichment (red, left axis).
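The binning in panel C can be illustrated with a small sketch. It assumes 50 equal-width bins of 0.001 and that, under the null, empirical locus p values are uniform, so the expected fraction of loci with p < 0.1 is 0.1. The function name and the example loci are hypothetical.

```python
from collections import defaultdict

def observed_to_expected(loci, cutoff=0.1, width=0.001, n_bins=50):
    """loci: iterable of (p_gwas, empirical_locus_p) pairs.
    Returns {bin index: observed/expected ratio of loci with locus p < cutoff}."""
    counts = defaultdict(lambda: [0, 0])  # bin -> [n below cutoff, n total]
    for p_gwas, p_locus in loci:
        # Rounding guards against float error at bin edges (e.g. 0.049 / 0.001).
        b = int(round(p_gwas / width, 9))
        if 0 <= b < n_bins:
            counts[b][1] += 1
            if p_locus < cutoff:
                counts[b][0] += 1
    # Under a uniform null, the expected fraction below the cutoff equals the cutoff.
    return {b: (low / total) / cutoff for b, (low, total) in sorted(counts.items())}

# Hypothetical loci: (GWAS p value, empirical locus p value).
example = [(0.0005, 0.05), (0.0007, 0.5), (0.0015, 0.02), (0.0495, 0.9), (0.06, 0.01)]
ratios = observed_to_expected(example)
```

A ratio above 1 in a bin means more loci with low empirical locus p values than the uniform null predicts; the last example locus is ignored because its GWAS p value lies outside the 50 bins.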
