Comput Math Methods Med. 2022 Aug 1:2022:5412627.
doi: 10.1155/2022/5412627. eCollection 2022.

Identification of Potential Biomarkers and Immune Infiltration Characteristics in Ulcerative Colitis by Combining Results from Two Machine Learning Algorithms

Minchun Bu et al. Comput Math Methods Med.

Abstract

Objective: This study was designed to identify potential biomarkers for ulcerative colitis (UC) and analyze the immune infiltration characteristics in UC.

Methods: Datasets containing human UC and normal control tissues (GSE87466, GSE107597, and GSE13367) were downloaded from the GEO database. The GSE87466 and GSE107597 datasets were merged, and the differentially expressed genes (DEGs) between UC and normal control tissues were identified with the limma R package. LASSO regression and support vector machine recursive feature elimination (SVM-RFE) were then applied to select the best candidate biomarkers. The GSE13367 dataset served as the validation cohort, and receiver operating characteristic (ROC) curves were used to evaluate diagnostic performance. Finally, the immune infiltration characteristics in UC were explored with CIBERSORT, and the correlations between the potential biomarkers and different immune cells were analyzed.
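The DEG screening step can be illustrated with a minimal sketch. It assumes that per-gene log2 fold changes and raw P values are already available (limma derives these from moderated t-statistics, which are not reproduced here), that P values are adjusted with the Benjamini-Hochberg procedure, and that the cutoffs are |log2FC| >= 1 and adjusted P < 0.05. The abstract does not state the thresholds, so these values, the function names, and the toy gene labels are all hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(genes, log_fc, pvalues, fc_cutoff=1.0, alpha=0.05):
    """Keep genes passing both cutoffs; cutoffs are assumed, not from the paper."""
    adj = benjamini_hochberg(pvalues)
    hits = []
    for gene, lfc, q in zip(genes, log_fc, adj):
        if abs(lfc) >= fc_cutoff and q < alpha:
            direction = "up" if lfc > 0 else "down"
            # -log10(adjusted P) is the usual y-axis value of a volcano plot.
            hits.append((gene, direction, -math.log10(q)))
    return hits


# Toy input: made-up gene labels and statistics, not data from the study.
genes = ["g1", "g2", "g3", "g4"]
log_fc = [2.5, -1.8, 0.3, 1.2]
pvalues = [0.0001, 0.004, 0.03, 0.20]

degs = screen_degs(genes, log_fc, pvalues)
for gene, direction, neg_log_q in degs:
    print(gene, direction, round(neg_log_q, 2))
```

In this toy input, g3 is dropped for its small fold change and g4 for its large adjusted P value, which is the kind of two-sided filter a volcano plot visualizes.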

Results: A total of 76 DEGs were identified, of which 56 were upregulated and 20 were downregulated. Functional enrichment analysis showed that these DEGs were mainly involved in immune response, chemokine signaling, IL-17 signaling, cytokine receptor interactions, and inflammatory bowel disease, among others. ABCG2, HSPB3, SLC6A14, and VNN1 were identified as potential biomarkers for UC and validated in the GSE13367 dataset (AUC = 0.889, 95% CI: 0.797–0.961). Immune infiltration analysis by CIBERSORT revealed significant differences in immune infiltration characteristics between UC and normal control tissues. Higher levels of memory B cells, γδ T cells, activated mast cells, M1 macrophages, and neutrophils, among others, were found in the UC group, whereas higher levels of M2 macrophages, resting mast cells, eosinophils, and CD8+ T cells, among others, were found in the normal control group.
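An AUC with a 95% confidence interval, as reported above, can be computed along the following lines. This is only a sketch: the abstract does not say how the four genes were combined into a single diagnostic score or how the interval was obtained, so the percentile bootstrap used here, the function names, and the toy scores and labels are assumptions for illustration.

```python
import random


def auc(scores, labels):
    """AUC as the chance that a random case outscores a random control.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi


# Toy data: hypothetical diagnostic scores, 1 = UC, 0 = normal control.
scores = [0.9, 0.8, 0.75, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

point = auc(scores, labels)
lo, hi = bootstrap_ci(scores, labels)
print(f"AUC = {point:.3f}, 95% CI: {lo:.3f}-{hi:.3f}")
```

With only ten toy samples the interval is wide; the narrower interval in the abstract reflects a larger validation cohort.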

Conclusion: ABCG2, HSPB3, SLC6A14, and VNN1 were identified as potential biomarkers for UC. Immune infiltration differed markedly between UC and normal control tissues, which may help guide individualized treatment and suggest new research directions.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Figure 1
Flowchart of the analysis process.
Figure 2
DEGs between UC and normal control tissues. LogFC is the log2 fold change, and adj.P.Value is the adjusted P value, which is more reliable than the unadjusted P value. Because a smaller P value indicates a more significant difference, the adjusted P values are shown as −log10(adj.P.Value), so a larger transformed value indicates a more significant difference.
Figure 3
The heatmap of the DEGs.
Figure 4
(a) The GO enrichment analysis of DEGs. (b) The KEGG enrichment analysis of DEGs. (c) The DO enrichment analysis of DEGs. (d) The GSEA enrichment analysis of DEGs in the UC group.
Figure 5
(a) The LASSO regression model used 10-fold cross-validation and the minimum absolute shrinkage criterion to identify the optimal penalty coefficient λ. (b) Screening out feature genes by SVM-RFE algorithm. (c) Venn diagram of intersection feature genes between the LASSO regression model and SVM-RFE algorithm. (d) The expression levels of the four genes between UC group (red) and normal control group (blue) in the validation cohort. (e) ROC curve of the four feature genes in the discovery cohort. (f) ROC curve of the four feature genes in the validation cohort.
Figure 6
(a) The relative percentage of 22 immune cells in each sample of the discovery cohort. (b) The difference in immune infiltration between the UC and normal control groups with red representing the UC group and blue representing the normal control group. (c) The correlation heatmap between 22 immune cells with red representing positive correlation and blue representing negative correlation. The darker the color, the stronger the correlation.
Figure 7
(a) The correlation between ABCG2 and immune cells. (b) The correlation between HSPB3 and immune cells. (c) The correlation between SLC6A14 and immune cells. (d) The correlation between VNN1 and immune cells.
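The gene-versus-immune-cell correlations summarized in Figure 7 can be sketched as a rank correlation between one gene's expression and each cell type's estimated fraction across samples. The abstract does not state which correlation coefficient was used, so Spearman's coefficient is an assumption here, and the gene values, cell-type labels, and fractions are invented toy data rather than results from the study.

```python
def average_ranks(values):
    """1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: one hypothetical gene and two hypothetical cell types, 5 samples.
expression = [5.1, 6.3, 7.8, 8.0, 9.4]
fractions = {
    "cell_type_A": [0.02, 0.05, 0.09, 0.08, 0.15],
    "cell_type_B": [0.20, 0.18, 0.11, 0.12, 0.05],
}

correlations = {
    cell: spearman(expression, frac) for cell, frac in fractions.items()
}
for cell, rho in correlations.items():
    print(cell, round(rho, 2))
```

Repeating the same calculation for every pair of cell types, instead of gene against cell type, would give the kind of matrix shown as a correlation heatmap in Figure 6(c).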
