Sci Rep. 2018 May 3;8(1):6951.
doi: 10.1038/s41598-018-24758-5.

Applications of Bayesian network models in predicting types of hematological malignancies

Rupesh Agrahari et al. Sci Rep. 2018.

Abstract

Network analysis is the preferred approach for detecting subtle but coordinated changes in the expression of an interacting and related set of genes. We introduce a novel method based on the analysis of coexpression networks and Bayesian networks, and we use it to classify two types of hematological malignancies: acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366), which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model performs remarkably well on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful for classification and for correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package, which can be used to conveniently fit a Bayesian network to gene expression data, is publicly available through Bioconductor.
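For readers less familiar with the three metrics quoted above, the following is a minimal sketch of how accuracy, precision, and recall are derived from true and predicted labels. The function name, the toy labels, and the choice of which class counts as "positive" are assumptions made for illustration; the abstract does not state which disease was treated as the positive class, and the numbers below are not the study's data.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for a binary classifier.

    `positive` marks the label treated as the positive class (an assumption
    here; the source does not say which disease plays that role).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    return {
        # Fraction of all samples that were classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Fraction of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
        # Fraction of true positives that were recovered.
        "recall": tp / (tp + fn) if (tp + fn) else float("nan"),
    }


# Toy labels for illustration only (not taken from the paper).
toy_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]
metrics = classification_metrics(toy_true, toy_pred)
```

A classifier can trade precision against recall, which is why the abstract reports all three values for each dataset rather than accuracy alone.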

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Schematic view of the methodology. (A) The input is the gene expression profile (matrix). (B) We applied WGCNA to build the coexpression network and to identify gene modules (clusters). (C) PCA is used to summarize the biological information of each gene module into an eigengene. (D) A BN is fitted to the eigengenes to delineate the relationships between modules. We also used the fitted BN as a probabilistic predictive model. The tools used for each step are highlighted in red.
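Step (C) of the schematic can be illustrated with a short sketch: an eigengene is taken here to be the first principal component of the standardized expression values of the genes in one module. This is not the Pigengene or WGCNA implementation (both are R packages); the function name, the sign convention, and the synthetic data are assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def module_eigengene(expr):
    """Summarize one gene module by its first principal component.

    expr: array of shape (n_samples, n_genes) holding the expression of the
    genes in a single module. Returns (eigengene, explained_fraction), where
    eigengene has one value per sample.
    """
    expr = np.asarray(expr, dtype=float)

    # Standardize each gene (column) to zero mean and unit variance.
    centered = expr - expr.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # leave constant genes at zero instead of dividing by 0
    scaled = centered / std

    # PCA via SVD: sample scores on the first component are u[:, 0] * s[0].
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    eigengene = u[:, 0] * s[0]

    # The sign of a principal component is arbitrary. As an assumed convention,
    # orient it to correlate positively with the module's average expression.
    if np.corrcoef(eigengene, scaled.mean(axis=1))[0, 1] < 0:
        eigengene = -eigengene

    explained_fraction = s[0] ** 2 / np.sum(s ** 2)
    return eigengene, explained_fraction


# Synthetic module: five genes driven by one shared latent signal plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=50)
loadings = np.array([1.0, 0.8, 1.2, 0.9, 1.1])
toy_expr = latent[:, None] * loadings + 0.1 * rng.normal(size=(50, 5))
toy_eigengene, toy_fraction = module_eigengene(toy_expr)
```

One eigengene per module would then form the feature matrix to which a Bayesian network is fitted in step (D); that step is not sketched here.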
Figure 2
Expression of eigengenes in the MILE dataset. Each row corresponds to a sample. Modules (columns) are clustered based on the similarity of expression in the MILE dataset. The majority of eigengenes show a different pattern of expression in the two diseases. The green strip at the top shows the adjusted p-values of Welch’s t-tests on a logarithmic scale (base 10). The adjusted p-values are on the order of 10^−60 to 10^−10, which indicates that the eigengenes are highly discriminative features.
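As a reminder of what Welch’s t-test compares, the sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom for one feature measured in two groups with possibly unequal variances. The function name and the toy values are assumptions for illustration. The sketch stops short of the p-value and of any multiple-testing adjustment, since the caption does not say which adjustment was used; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the p-value directly.

```python
import math
import statistics


def welch_t(a, b):
    """Return (t statistic, degrees of freedom) for Welch's unequal-variance t-test."""
    n_a, n_b = len(a), len(b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = statistics.variance(a) / n_a
    se2_b = statistics.variance(b) / n_b

    t_stat = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Toy values for one hypothetical eigengene in two groups (not study data).
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
```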
Figure 3
Consensus BN structures. Each yellow node represents an eigengene. The Effect node is a binary variable that models the disease type. Its parents are denoted by red circles. The directed edges (arcs) model the probabilistic dependencies between nodes. Although these consensus networks are obtained from 500 (A) and 5,000 (B) BNs, they have fairly similar structures.
Figure 4
ROC curves. The predictions from the Bayesian network approach (red) lead to the highest AUC. The curve corresponding to the SVM predictions (green) is close to the best curve when eigengenes are used as features.
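The AUC used to compare the curves can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition, counting ties as one half. The function name, labels, and scores are assumptions for illustration and are not the study's predictions.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels; scores: predicted scores where a
    higher value means "more likely positive".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (not study data): three of four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

This quadratic-time form is meant for clarity; rank-based implementations give the same value more efficiently on large datasets.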
