J Am Stat Assoc. 2012 Mar 1;107(497):214-222.

doi: 10.1080/01621459.2012.656014. Epub 2012 Jun 11.

Quantile Regression for Analyzing Heterogeneity in Ultra-high Dimension

Lan Wang¹, Yichao Wu, Runze Li

PMID: 23082036
PMCID: PMC3471246
DOI: 10.1080/01621459.2012.656014

Lan Wang et al. J Am Stat Assoc. 2012.

Affiliation

¹ School of Statistics, University of Minnesota, Minneapolis, MN 55455.

Abstract

Ultra-high dimensional data often display heterogeneity due to either heteroscedastic variance or other forms of non-location-scale covariate effects. To accommodate heterogeneity, we advocate a more general interpretation of sparsity, which assumes that only a small number of covariates influence the conditional distribution of the response variable given all candidate covariates; however, the sets of relevant covariates may differ when we consider different segments of the conditional distribution. In this framework, we investigate the methodology and theory of nonconvex penalized quantile regression in ultra-high dimension. The proposed approach has two distinctive features: (1) it enables us to explore the entire conditional distribution of the response variable given the ultra-high dimensional covariates and provides a more realistic picture of the sparsity pattern; (2) it requires substantially weaker conditions than alternative methods in the literature; thus, it greatly alleviates the difficulty of model checking in the ultra-high dimension. In the theoretical development, it is challenging to deal with both the nonsmooth loss function and the nonconvex penalty function in an ultra-high dimensional parameter space. We introduce a novel sufficient optimality condition that relies on a convex differencing representation of the penalized loss function and the subdifferential calculus. Exploiting this optimality condition enables us to establish the oracle property for sparse quantile regression in the ultra-high dimension under relaxed conditions. The proposed method greatly enhances existing tools for ultra-high dimensional data analysis. Monte Carlo simulations demonstrate the usefulness of the proposed procedure. The real data example we analyzed demonstrates that the new approach reveals substantially more information than alternative methods.
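To make the terms "nonsmooth loss", "nonconvex penalty", and "convex differencing representation" concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the SCAD penalty with a = 3.7 as one example of a nonconvex penalty (the abstract does not name a specific penalty), and all function names are hypothetical. It shows the quantile check loss, the penalty, and a split of the penalty into a difference of two convex functions.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0}); nonsmooth at u = 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty (an assumed example of a nonconvex penalty); requires a > 2."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def scad_convex_part(beta, lam, a=3.7):
    """Convex function H with scad_penalty(beta) = lam * |beta| - H(beta)."""
    b = abs(beta)
    if b <= lam:
        return 0.0
    if b <= a * lam:
        return (b - lam) ** 2 / (2 * (a - 1))
    return lam * b - (a + 1) * lam * lam / 2


def penalized_objective(y, X, beta, tau, lam):
    """Average check loss of the residuals plus the SCAD penalty on each coefficient."""
    n = len(y)
    loss = 0.0
    for yi, xi in zip(y, X):
        fitted = sum(xj * bj for xj, bj in zip(xi, beta))
        loss += check_loss(yi - fitted, tau)
    return loss / n + sum(scad_penalty(bj, lam) for bj in beta)


if __name__ == "__main__":
    # Toy data (made up for illustration): 3 observations, 2 covariates.
    y = [1.0, 2.0, 3.0]
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(penalized_objective(y, X, beta=[1.0, 2.0], tau=0.5, lam=0.1))
```

Because the check loss and `lam * |beta|` are convex and `scad_convex_part` is convex, the penalized objective is a difference of two convex functions, which is the structure the abstract's optimality condition relies on. Changing `tau` lets the same objective target different segments of the conditional distribution.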

Figures

**Figure 1**
Lack-of-fit diagnosis QQ plot for the real data.

Cited by

The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies.
Liu Y, Ren J, Ma S, Wu C. Stat Med. 2024 Nov 20;43(26):4928-4983. doi: 10.1002/sim.10196. Epub 2024 Sep 11. PMID: 39260448
The lasso for high dimensional regression with a possible change point.
Lee S, Seo MH, Shin Y. J R Stat Soc Series B Stat Methodol. 2016 Jan;78(1):193-210. doi: 10.1111/rssb.12108. Epub 2015 Feb 15. PMID: 27656104
Covariate Information Number for Feature Screening in Ultrahigh-Dimensional Supervised Problems.
Nandy D, Chiaromonte F, Li R. J Am Stat Assoc. 2022;117(539):1516-1529. doi: 10.1080/01621459.2020.1864380. Epub 2021 Feb 10. PMID: 36172297
Quantile Regression Forests to Identify Determinants of Neighborhood Stroke Prevalence in 500 Cities in the USA: Implications for Neighborhoods with High Prevalence.
Hu L, Ji J, Li Y, Liu B, Zhang Y. J Urban Health. 2021 Apr;98(2):259-270. doi: 10.1007/s11524-020-00478-y. PMID: 32888155
Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic.
Guo X, Li R, Liu J, Zeng M. J Econom. 2023 Jul;235(1):166-179. doi: 10.1016/j.jeconom.2022.03.001. Epub 2022 Apr 8. PMID: 36568314
