AMIA Annu Symp Proc. 2003;2003:21-5.

HITON: a novel Markov Blanket algorithm for optimal variable selection

C F Aliferis et al. AMIA Annu Symp Proc. 2003.

Abstract

We introduce HITON, a novel, sound, sample-efficient, and highly scalable algorithm for variable selection in classification, regression, and prediction. The algorithm works by inducing the Markov Blanket of the variable to be classified or predicted. For the empirical evaluation we used a wide variety of biomedical tasks with different characteristics: (i) bioactivity prediction for drug discovery, (ii) clinical diagnosis of arrhythmias, (iii) bibliographic text categorization, (iv) lung cancer diagnosis from gene expression array data, and (v) proteomics-based prostate cancer detection. State-of-the-art algorithms for each domain were selected for baseline comparison.

Results: (1) HITON reduces the number of variables in the prediction models by three orders of magnitude relative to the original variable set while improving or maintaining accuracy. (2) In the selected tasks and datasets, HITON outperforms the baseline algorithms, selecting variable sets more than two orders of magnitude smaller than those of the baselines.
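To make the notion of a Markov Blanket concrete, here is a minimal sketch in Python. It is not the HITON algorithm: HITON induces the blanket from data, whereas this sketch assumes the causal graph is already known and simply reads the blanket off it (the target's parents, its children, and the other parents of those children). The function name `markov_blanket`, the dictionary representation, and the toy network are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a known DAG.

    `parents` maps every node to the list of its direct parents.
    The blanket is: parents of target, children of target, and the
    other parents ("spouses") of those children.
    """
    direct_parents = set(parents.get(target, []))
    children = {node for node, ps in parents.items() if target in ps}
    spouses = {p for child in children for p in parents[child]}
    return (direct_parents | children | spouses) - {target}


# Hypothetical toy network, assumed for illustration only.
toy_dag = {
    "Smoking": [],
    "Genetics": [],
    "Asbestos": [],
    "LungCancer": ["Smoking", "Genetics"],
    "Cough": ["LungCancer", "Asbestos"],
    "Fatigue": ["Cough"],
}

blanket = markov_blanket(toy_dag, "LungCancer")
print(sorted(blanket))
```

In this toy graph, `Fatigue` is left out of the blanket of `LungCancer` because it depends on `LungCancer` only through `Cough`. This is the property that lets a Markov Blanket serve as a compact variable set for predicting the target.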

Figures

Figure 1. Pseudo-code for algorithm HITON.
Figure 2. Dataset characteristics.
Figure 3. Task-specific and overall model reduction performance (best performance per row in bold; asterisks mark algorithms that yield the best model or one not statistically significantly worse than the best).

