Respir Res. 2022 May 14;23(1):125.
doi: 10.1186/s12931-022-02035-4.

Diagnosis of pulmonary tuberculosis via identification of core genes and pathways utilizing blood transcriptional signatures: a multicohort analysis

Qian Qiu et al. Respir Res. 2022.

Abstract

Background: Blood transcriptomics can be used for confirmation of tuberculosis diagnosis or sputumless triage, and a comparison of their practical diagnostic accuracy is needed to assess their usefulness. In this study, we investigated potential biomarkers to improve our understanding of the pathogenesis of active pulmonary tuberculosis (PTB) using bioinformatics methods.

Methods: Differentially expressed genes (DEGs) between PTB patients and healthy controls (HCs) were analyzed based on two microarray datasets. Pathways and functional annotations of the DEGs were identified, and ten hub genes were selected. The hub genes were further analyzed and screened, and the selected genes were then verified with an independent sample set. Finally, their diagnostic power was evaluated for distinguishing PTB from HCs and from other diseases.

Results: A total of 62 DEGs were identified. They were mostly related to the type I IFN pathway and the IFN-γ-mediated pathway, among other GO terms and immune processes, and especially to the RIG-I-like receptor pathway. Among them, OAS1, IFIT1 and IFIT3 were upregulated and were the main risk factors for predicting PTB, with adjusted risk ratios of 1.36, 3.10, and 1.32, respectively. Verification in an independent sample set confirmed that peripheral blood mRNA expression levels of OAS1, IFIT1 and IFIT3 were significantly higher in PTB patients than in HCs (all P < 0.01). The combination of these three genes (three-gene set) outperformed all pairwise combinations in discriminating TB from HCs, with a mean AUC of 0.975, a sensitivity of 94.4% and a specificity of 100%. This discriminative capacity was further evaluated in 7 independent datasets, yielding an AUC of 0.902, a mean sensitivity of 87.9% and a mean specificity of 90.2%. With regard to discriminating PTB from other diseases (i.e., those initially considered to be possible TB but rejected in differential diagnosis), the three-gene set likewise showed a strong overall ability to separate the two, with an AUC of 0.999 (sensitivity: 99.0%; specificity: 100%) in the training set and an AUC of 0.974 (sensitivity: 96.4%; specificity: 98.6%) in the test set.
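The AUC, sensitivity and specificity reported above can be illustrated with a minimal sketch. It assumes that each sample has already been reduced to a single combined score for the three-gene set; how the study derived such a score is not stated here, so the scores, the threshold and the function names below are hypothetical and the values are not study data.

```python
from typing import Sequence, Tuple


def auc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; tied pairs count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(
    scores_pos: Sequence[float],
    scores_neg: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Call a sample PTB when its score is >= threshold."""
    true_pos = sum(s >= threshold for s in scores_pos)
    true_neg = sum(s < threshold for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical combined scores (illustrative values only).
ptb_scores = [2.9, 3.4, 1.1, 2.2, 3.8]
hc_scores = [0.4, 1.3, 0.9, 0.2, 0.7]

roc_auc = auc(ptb_scores, hc_scores)
sens, spec = sensitivity_specificity(ptb_scores, hc_scores, threshold=1.5)
print(f"AUC={roc_auc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The AUC does not depend on the threshold, whereas sensitivity and specificity do, which is why a single AUC can be reported alongside one chosen sensitivity/specificity pair.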

Conclusion: The described commonalities and unique signatures in the blood profiles of PTB and the other control samples have considerable implications for PTB biosignature design and future diagnosis, and provide insights into the biological processes underlying PTB.

Keywords: Expression profiling data; Hub genes; IFIT1; IFIT3; OAS1; Pulmonary tuberculosis.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Flow diagram of procedure of data collection
Fig. 2
All genes and DEGs common in the two GEO datasets. A Volcano plot of all genes in GSE42834 gene chip. Red and green dots denote upregulated and downregulated genes, respectively. Black dots show the remaining genes without significant changes in expression. B Volcano plot of all genes in GSE83456 gene chip. C Venn diagram of upregulated genes common in the two GEO datasets. D Venn diagram of downregulated genes common in the two GEO datasets. E Heatmap of DEGs in the two GEO datasets (60 upregulated and 2 downregulated genes). Legend represents gene expression. Red and blue colors denote upregulation and downregulation, respectively
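The logic behind the volcano plots and Venn diagrams in this caption can be sketched as follows: genes are labelled up- or downregulated per dataset, and the labels are then intersected across datasets. The cutoffs (|log2 fold change| >= 1, adjusted P < 0.05), the gene names and all numbers are assumptions chosen for illustration; the study's actual thresholds are not given here.

```python
from typing import Dict, Set, Tuple

GeneStats = Dict[str, Tuple[float, float]]  # gene -> (log2 fold change, adjusted P)


def classify(
    gene_stats: GeneStats,
    lfc_cutoff: float = 1.0,   # assumed cutoff, not taken from the study
    p_cutoff: float = 0.05,    # assumed cutoff, not taken from the study
) -> Tuple[Set[str], Set[str]]:
    """Split genes into upregulated and downregulated sets."""
    up: Set[str] = set()
    down: Set[str] = set()
    for gene, (log2_fc, adj_p) in gene_stats.items():
        if adj_p >= p_cutoff:
            continue  # not significant: a black dot in the volcano plot
        if log2_fc >= lfc_cutoff:
            up.add(gene)
        elif log2_fc <= -lfc_cutoff:
            down.add(gene)
    return up, down


# Hypothetical statistics for two datasets (placeholder genes and values).
dataset_a: GeneStats = {
    "GENE_A": (1.8, 0.001),
    "GENE_B": (2.4, 0.0005),
    "GENE_C": (-1.5, 0.01),
    "GENE_D": (0.3, 0.6),
}
dataset_b: GeneStats = {
    "GENE_A": (1.2, 0.004),
    "GENE_B": (2.0, 0.002),
    "GENE_C": (-0.4, 0.2),
    "GENE_D": (1.1, 0.03),
}

up_a, down_a = classify(dataset_a)
up_b, down_b = classify(dataset_b)

# The overlapping regions of the two Venn diagrams.
common_up = up_a & up_b
common_down = down_a & down_b
print(sorted(common_up), sorted(common_down))
```

Requiring a gene to pass the cutoffs in both datasets is stricter than pooling the data, so a gene such as the placeholder GENE_C, which is significant in only one dataset, is dropped.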
Fig. 3
A Significantly enriched GO terms and KEGG pathways of DEGs with P < 0.05 and gene counts ≥ 3. Font colors of Y-axis label correspond to different enriched terms (BP, CC, MF terms of GO or KEGG pathways) of DEGs. Legend indicates the significance of the term (−log10 P-value). B Results of Gene Set Enrichment Analysis (GSEA) in GO and KEGG terms showed differential enrichment of genes in PTB and HC groups. “1” represents PTB, “0” represents HCs
Fig. 4
PPI network of DEGs and module analysis. A DEG PPI network constructed by STRING online database. Circular nodes represent proteins. Edges indicate the interaction between two proteins. The color intensity of the nodes or edges is based on the degree or combined score in the DEGs. B–E Module analysis via Cytoscape software (degree cut-off = 2, node score cut-off = 0.2, k-core = 2, and max. depth = 100) and four central modules were built based on the PPI network. F Interrelation analysis between immune system pathways. All genes from all pathways were noted in red. G Count number of genes involved in the identified pathways
Fig. 5
The GO enrichment results of the top ten hub genes. The bubble plot and chord diagram show the GO term (BP) (A) related to the top ten hub genes (B)
Fig. 6
Expression of the ten hub genes and screening in an independent dataset (GSE56153). A Scatter plot from multiple t tests. mRNA expression levels of ISG15, OAS1, IFIT3, OAS3, IFIT1, OASL, IFI44L, RSAD2, XAF1 and IFI44 in the peripheral blood of PTB patients were compared with those in the HC group. Data were analyzed with multiple t tests and the scatter plot was created with GraphPad Prism 8.0.2. Each dot represents one gene and red dots denote genes with significant changes in expression (q < 0.01). The X axis is the difference between the PTB and HC group means for each gene. The Y axis is the negative logarithm of the q-value. A dotted grid line is shown at Y = −log10(0.01). B Correlation between the expression levels of ISG15, OAS1, IFIT3, OAS3, IFIT1, OASL, IFI44L, RSAD2, XAF1 and IFI44 and the PTB variable. Correlations were analyzed using point-biserial correlation tests and the correlation coefficients of these genes were plotted. Red circles denote genes with correlation coefficient > 0.5 and P < 0.01. C Forest plot of the association between the ten hub genes and PTB in the GSE56153 dataset. The ten hub genes in PTB patients and HCs were analyzed using multivariate Poisson regression with robust variance estimates. A meta-analysis was conducted, and the adjusted RR, 95% CI and corresponding P value of each gene were calculated and plotted in the forest plot. D mRNA expression values of the three genes (OAS1, IFIT1 and IFIT3) in PTB and HC from an independent sample set, measured by qRT-PCR. The mRNA values of the evaluated genes were normalized to the housekeeping gene GAPDH. The numbers of participants in the validation test were as follows: PTB, n = 20; HC, n = 20. **P < 0.01, ***P < 0.001
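The point-biserial correlation used in panel B is the Pearson correlation between a binary variable (here PTB = 1, HC = 0) and a continuous one (expression). A minimal sketch of the standard formula follows; the function name and the expression values are hypothetical and are not study data.

```python
import math
from typing import Sequence


def point_biserial(expression: Sequence[float], is_ptb: Sequence[int]) -> float:
    """r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2),
    where s_n is the population standard deviation over all samples."""
    ones = [x for x, g in zip(expression, is_ptb) if g == 1]
    zeros = [x for x, g in zip(expression, is_ptb) if g == 0]
    n1, n0 = len(ones), len(zeros)
    n = n1 + n0
    mean_all = sum(expression) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in expression) / n)
    mean_1 = sum(ones) / n1
    mean_0 = sum(zeros) / n0
    return (mean_1 - mean_0) / sd_all * math.sqrt(n1 * n0 / n ** 2)


# Hypothetical expression values for one gene (illustrative only).
expression = [5.0, 6.0, 7.0, 2.0, 3.0, 4.0]
group = [1, 1, 1, 0, 0, 0]  # 1 = PTB, 0 = HC

r_pb = point_biserial(expression, group)
print(f"r_pb = {r_pb:.3f}")
```

A coefficient above the caption's 0.5 cutoff means that higher expression tracks PTB status; the accompanying P value would come from a separate significance test, which this sketch omits.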
Fig. 7
Efficacy evaluation of the three-gene set for predicting PTB versus HCs by receiver operating characteristic (ROC) curve analysis. A ROC curves of each single gene and of the possible combinations within the three-gene set for discriminating PTB from HCs in the independent GSE56153 dataset. The 95% confidence interval (CI) of the AUC is shown in the legend, and different colors represent different genes or combinations. B, C ROC curves of the three-gene set in the other seven independent datasets (GSE19491, GSE28623, GSE34608, GSE54992, GSE62525, GSE147964, GSE147690). The 95% CI of the AUC is shown in the legend, and different colors represent different datasets
Fig. 8
The discriminative performance of the three-gene set in discriminating TB from other diseases based on the Random Forest (RF) predictions in independent GSE37250 dataset. A Importance plot of the variables. Total pixel was the three-gene set, followed by OAS1, IFIT1, and IFIT3. B ROC of RF prediction model based on the three-gene set in the training set and test set from the GSE37250 dataset. AUC was also shown in the legend area
