Cancer Med. 2013 Apr;2(2):253-65.
doi: 10.1002/cam4.69. Epub 2013 Feb 27.

Identification of potential biomarkers from microarray experiments using multiple criteria optimization

Matilde L Sánchez-Peña et al. Cancer Med. 2013 Apr.

Abstract

Microarray experiments can measure the relative expression of tens of thousands of genes simultaneously, producing very large databases. Analyzing these databases and extracting biologically relevant knowledge from them are challenging tasks. Identifying potential cancer biomarker genes is one of the most important aims of microarray analysis and, as such, has been widely targeted in the literature. However, identifying a set of these genes consistently across different experiments, studies, microarray platforms, or cancer types remains elusive. Besides the inherent difficulty of the large and nonconstant variability in these experiments and the incommensurability between different microarray technologies, users must adjust a series of parameters that significantly affect the outcome of the analyses yet have no biological or medical meaning. In this study, the identification of potential cancer biomarkers from microarray data is cast as a multiple criteria optimization (MCO) problem. The efficient solutions to this problem, found here through data envelopment analysis (DEA), are associated with genes that are proposed as potential cancer biomarkers. The method does not require any parameter adjustment by the user, and thus fosters repeatability. The approach also allows the analysis of different microarray experiments, microarray platforms, and cancer types simultaneously. The results include the analysis of three publicly available microarray databases related to cervical cancer. This study points to the feasibility of modeling the selection of potential cancer biomarkers from microarray data as an MCO problem and solving it using DEA. Using MCO offers a new perspective on the identification of potential cancer biomarkers: it requires neither a threshold value to establish significance for a particular gene nor a normalization procedure to compare different experiments.

Keywords: Cancer biomarkers; cervical cancer; data envelopment analysis; microarray data analysis; multiple criteria optimization.

Figures

Figure 1
Schematic example of how to obtain a P-value. The example shows how one P-value is obtained for a particular gene in a microarray experiment with l = 3 healthy tissues as controls and m = 3 tissues with cancer. If a statistical comparison is carried out for each gene, the result is n genes, each with an associated P-value.
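The per-gene comparison described in this caption can be sketched as follows. The caption does not say which statistical test is used, so the exact two-sided permutation test on the difference in means below is an assumption chosen only because it needs no external library; the gene names and expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(controls, cases):
    """Exact two-sided permutation P-value for a difference in group means.

    Assumption: the source does not name its test; this is one possible choice.
    """
    pooled = list(controls) + list(cases)
    n_controls = len(controls)
    observed = abs(mean(cases) - mean(controls))
    hits = 0
    total = 0
    # Enumerate every way of relabelling n_controls samples as "controls".
    for idx in combinations(range(len(pooled)), n_controls):
        chosen = set(idx)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_b) - mean(group_a)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


# Hypothetical data: gene -> (l = 3 control values, m = 3 cancer values).
expression = {
    "gene_A": ([1.0, 1.1, 0.9], [2.0, 2.2, 2.1]),
    "gene_B": ([1.0, 2.0, 1.5], [1.2, 1.8, 1.6]),
}

# One P-value per gene, as in the caption: n genes yield n P-values.
p_values = {
    gene: permutation_p_value(controls, cases)
    for gene, (controls, cases) in expression.items()
}
```

With three samples per group there are only 20 possible relabellings, so the smallest two-sided P-value this particular test can return is 2/20 = 0.1.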
Figure 2
Pareto-efficient frontier. Because the performance measures conflict, different genes are attractive when they lie on the southwest envelope of the gene set. In multiple criteria optimization (MCO), that envelope is called the Pareto-efficient frontier, and it is formed by Pareto-efficient solutions.
Figure 3
The two performance measures for each gene. This figure schematically shows a case with genes characterized by two performance measures: an untransformed P-value and one transformed with equation (1). Following the proposed method, at this point it is recommended to identify the first 10 efficient frontiers. This can be done by identifying the genes in the first efficient frontier through data envelopment analysis (DEA), removing them from the set, and running a second DEA iteration on the remaining genes. The procedure is repeated until the tenth frontier is identified. A method to determine the adequate number of frontiers to analyze is currently under development by our research group.
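The frontier-peeling loop described in this caption can be illustrated with a minimal sketch. It swaps in a plain pairwise Pareto-dominance check where the article uses DEA, so it may flag a somewhat different set of genes than a DEA model would; it also assumes that both measures are to be minimized (the "southwest" direction). The gene names, the values, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every measure and strictly better on one.

    Assumption: smaller values are better on all measures.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def peel_frontiers(genes, n_frontiers=10):
    """Return up to n_frontiers successive non-dominated sets of gene names.

    Note: this uses Pareto dominance, not DEA, to find each frontier.
    """
    remaining = dict(genes)
    frontiers = []
    while remaining and len(frontiers) < n_frontiers:
        # A gene is on the current frontier if no other remaining gene dominates it.
        front = [
            name
            for name, point in remaining.items()
            if not any(
                dominates(other, point)
                for other_name, other in remaining.items()
                if other_name != name
            )
        ]
        frontiers.append(sorted(front))
        # Remove the frontier and repeat on what is left.
        for name in front:
            del remaining[name]
    return frontiers


# Hypothetical genes: name -> (measure 1, measure 2).
genes = {
    "g1": (0.01, 0.20),
    "g2": (0.05, 0.05),
    "g3": (0.20, 0.01),
    "g4": (0.10, 0.10),
    "g5": (0.30, 0.30),
}

first_two = peel_frontiers(genes, n_frontiers=2)
```

In this toy data, g1, g2, and g3 form the first frontier because none of them is beaten on both measures by another gene; g4 only becomes efficient once they are removed.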
