Promoting Similarity of Sparsity Structures in Integrative Analysis with Penalization
- PMID: 30100648
- PMCID: PMC6086364
- DOI: 10.1080/01621459.2016.1139497
Abstract
For data with high-dimensional covariates but small sample sizes, the analysis of single datasets often generates unsatisfactory results. The integrative analysis of multiple independent datasets provides an effective way of pooling information and outperforms single-dataset methods and several alternative multi-dataset methods. Under many scenarios, multiple datasets are expected to share common important covariates; that is, the corresponding models have similar sparsity structures. However, the existing methods do not have a mechanism to promote similarity in sparsity structures in integrative analysis. In this study, we consider penalized variable selection and estimation in integrative analysis. We develop an L0-penalty based method, which explicitly promotes similarity in sparsity structures. Computationally, it is realized using a coordinate descent algorithm. Theoretically, it has the selection and estimation consistency properties. Under a wide spectrum of simulation scenarios, it has identification and estimation performance comparable to or better than the alternatives. In the analysis of three lung cancer datasets with gene expression measurements, it identifies genes with sound biological implications and has satisfactory prediction performance.
Keywords: L0 penalization; cancer genomic data; integrative analysis; sparsity structure; variable selection.
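As a rough illustration of what "similarity in sparsity structures" means, the sketch below counts how often a covariate is selected (nonzero coefficient) in one dataset but not in another. This is not the penalty or the algorithm from the paper; the function names `support` and `sparsity_mismatch`, the tolerance `tol`, and the example coefficients are all hypothetical and chosen only for illustration.

```python
def support(beta, tol=0.0):
    """Return the sparsity pattern of one coefficient vector as 0/1 flags.

    A covariate counts as selected when |coefficient| > tol (assumed rule).
    """
    return tuple(1 if abs(b) > tol else 0 for b in beta)


def sparsity_mismatch(betas, tol=0.0):
    """Count disagreements in selection status across datasets.

    For each covariate, every pair of datasets in which the covariate is
    selected in one and not in the other contributes 1 to the total.
    """
    patterns = [support(b, tol) for b in betas]
    n_datasets = len(patterns)
    n_covariates = len(patterns[0])
    total = 0
    for j in range(n_covariates):
        selected = sum(p[j] for p in patterns)
        # Number of dataset pairs that disagree on covariate j.
        total += selected * (n_datasets - selected)
    return total


# Hypothetical coefficient estimates: three datasets, four covariates.
betas = [
    [1.2, 0.0, -0.5, 0.0],
    [0.8, 0.0, 0.0, 0.0],
    [1.0, 0.3, -0.4, 0.0],
]
print(sparsity_mismatch(betas))  # 4: covariates 2 and 3 each disagree in two pairs
```

A mismatch count of zero would mean that all datasets select exactly the same covariates, even if the coefficient magnitudes differ; a method that promotes similar sparsity structures would tend to favor estimates with a small count of this kind.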
Similar articles
- Promoting similarity of model sparsity structures in integrative analysis of cancer genetic data. Stat Med. 2017 Feb 10;36(3):509-559. doi: 10.1002/sim.7138. Epub 2016 Sep 25. PMID: 27667129. Free PMC article.
- Integrative analysis of high-throughput cancer studies with contrasted penalization. Genet Epidemiol. 2014 Feb;38(2):144-51. doi: 10.1002/gepi.21781. Epub 2014 Jan 6. PMID: 24395534. Free PMC article.
- Integrative Analysis of Cancer Diagnosis Studies with Composite Penalization. Scand Stat Theory Appl. 2014 Mar 1;41(1):87-103. doi: 10.1111/j.1467-9469.2012.00816.x. PMID: 24578589. Free PMC article.
- Integrative sparse principal component analysis of gene expression data. Genet Epidemiol. 2017 Dec;41(8):844-865. doi: 10.1002/gepi.22089. Epub 2017 Nov 8. PMID: 29114920. Free PMC article.
- Incorporating network structure in integrative analysis of cancer prognosis data. Genet Epidemiol. 2013 Feb;37(2):173-83. doi: 10.1002/gepi.21697. Epub 2012 Nov 17. PMID: 23161517. Free PMC article.
Cited by
- Structured Analysis of the High-dimensional FMR Model. Comput Stat Data Anal. 2020 Apr;144:106883. doi: 10.1016/j.csda.2019.106883. Epub 2019 Nov 13. PMID: 32863493. Free PMC article.
- A New Semiparametric Approach to Finite Mixture of Regressions using Penalized Regression via Fusion. Stat Sin. 2020 Apr;30(2):783-807. doi: 10.5705/ss.202016.0531. PMID: 34824523. Free PMC article.
- Estimation of multiple networks with common structures in heterogeneous subgroups. J Multivar Anal. 2024 Jul;202:105298. doi: 10.1016/j.jmva.2024.105298. Epub 2024 Feb 13. PMID: 38433779. Free PMC article.
- Regression Trees With Fused Leaves. Stat Med. 2024 Dec 30;43(30):5872-5884. doi: 10.1002/sim.10272. Epub 2024 Nov 20. PMID: 39567228.
- Structured gene-environment interaction analysis. Biometrics. 2020 Mar;76(1):23-35. doi: 10.1111/biom.13139. Epub 2019 Oct 9. PMID: 31424088. Free PMC article.