Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
Review
. 2019:1986:65-85.
doi: 10.1007/978-1-4939-9442-7_4.

A Review of Microarray Datasets: Where to Find Them and Specific Characteristics

Affiliations
Review

A Review of Microarray Datasets: Where to Find Them and Specific Characteristics

Amparo Alonso-Betanzos et al. Methods Mol Biol. 2019.

Abstract

The advent of DNA microarray datasets has stimulated a new line of research both in bioinformatics and in machine learning. This type of data is used to collect information from tissue and cell samples regarding gene expression differences that could be useful for disease diagnosis or for distinguishing specific types of tumor. Microarray data classification is a difficult challenge for machine learning researchers due to its high number of features and the small sample sizes. This chapter is devoted to reviewing the microarray databases most frequently used in the literature. We also make the interested reader aware of the problematic of data characteristics in this domain, such as the imbalance of the data, their complexity, and the so-called dataset shift.

Keywords: Dataset shift; High dimensionality; Microarray data; Unbalanced data.

PubMed Disclaimer

LinkOut - more resources