Genet Epidemiol. 2017 Dec;41(8):844-865.
doi: 10.1002/gepi.22089. Epub 2017 Nov 8.

Integrative sparse principal component analysis of gene expression data

Mengque Liu et al. Genet Epidemiol. 2017 Dec.

Abstract

In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular is perhaps principal component analysis (PCA). To generate more reliable and more interpretable results, sparse PCA (SPCA) has been developed. Because gene expression data are characterized by "small sample size, high dimensionality," results generated from a single dataset are often unsatisfactory. In contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis, other multi-dataset techniques, and single-dataset analysis. In this study, we conduct integrative analysis by developing the integrative SPCA (iSPCA) method. iSPCA selects and estimates sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance.

Keywords: contrasted penalization; gene expression data; integrative analysis; sparse PCA.
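
The abstract's idea of selecting sparse loadings jointly across datasets with a group penalty can be illustrated with a small sketch. This is not the authors' iSPCA algorithm: it is a generic rank-one sparse PCA by alternating updates in which each gene's loadings across the M datasets are shrunk as one group, so a gene is kept or dropped in all datasets together. The function names `group_soft_threshold` and `group_sparse_pc1`, the tuning parameter `lam`, and the simulated data are all hypothetical, and the contrasted penalties are not implemented.

```python
import numpy as np

def group_soft_threshold(Z, lam):
    """Shrink each column of Z (rows: datasets, columns: genes) as one group."""
    norms = np.linalg.norm(Z, axis=0)
    # Columns whose group norm is below lam are set exactly to zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return Z * scale

def group_sparse_pc1(datasets, lam, n_iter=100):
    """First sparse loading vector per dataset, with a shared zero pattern.

    Illustrative assumption only: alternates a group-thresholded loading
    update with a normalized score update, one pair per dataset.
    """
    Xs = [X - X.mean(axis=0) for X in datasets]  # center each gene
    # Initialize scores with the leading left singular vector of each dataset.
    us = [np.linalg.svd(X, full_matrices=False)[0][:, 0] for X in Xs]
    V = np.zeros((len(Xs), Xs[0].shape[1]))
    for _ in range(n_iter):
        Z = np.vstack([X.T @ u for X, u in zip(Xs, us)])
        V = group_soft_threshold(Z, lam)
        for m, X in enumerate(Xs):
            s = X @ V[m]
            nrm = np.linalg.norm(s)
            if nrm > 0:
                us[m] = s / nrm
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    return V / np.where(norms > 0, norms, 1.0)  # unit-length loadings

# Simulated example: three datasets measuring the same 30 genes,
# with a shared latent factor driving the first 5 genes.
rng = np.random.default_rng(0)
p, n_signal = 30, 5
datasets = []
for n in (40, 50, 60):
    factor = rng.normal(size=(n, 1))
    X = rng.normal(size=(n, p))
    X[:, :n_signal] += 3.0 * factor
    datasets.append(X)

V = group_sparse_pc1(datasets, lam=25.0)
selected = np.flatnonzero((V != 0).any(axis=0))  # genes kept in the loadings
```

Because the shrinkage factor is computed once per gene from the norm of its loadings across datasets, the selected gene set is the same in every dataset, while the loading magnitudes remain free to differ. In practice `lam` would be tuned, for example by cross-validation.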

Figures

Figure 1. ROC curves for Scenario 1 (M=4). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
Figure 2. ROC curves for Scenario 2 (M=4). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
Figure 3. ROC curves for Scenario 3 (M=4). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
Figure 4. ROC curves for Scenario 1 (M=8). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
Figure 5. ROC curves for Scenario 2 (M=8). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
Figure 6. ROC curves for Scenario 3 (M=8). meta-PCA: purple; meta-SPCA: blue; pooled-SPCA: pink; iSPCA: red; iSPCAM: orange; iSPCAS: green.
