Variable selection under multicollinearity using modified log penalty

Van Cuong Nguyen¹, Chi Tim Ng¹

PMID: 35706515
PMCID: PMC9041714
DOI: 10.1080/02664763.2019.1637829

J Appl Stat. 2019 Jul 3;47(2):201-230. doi: 10.1080/02664763.2019.1637829. eCollection 2020.

Affiliation

¹ Department of Statistics, Chonnam National University, Gwangju, Republic of Korea.

Abstract

To handle multicollinearity in regression analysis, this paper describes a class of 'strictly concave penalty functions'. As an example, a new penalty function called the 'modified log penalty' is introduced. The penalized estimator based on strictly concave penalties enjoys the oracle property under certain regularity conditions discussed in the literature. For multicollinear cases where such conditions do not apply, the behavior of the strictly concave penalties is examined through examples involving strongly correlated covariates. Real data examples and simulation studies show the finite-sample performance of the modified log penalty in terms of prediction error under scenarios exhibiting multicollinearity.

Keywords: Grouping effect; modified log penalty; multicollinearity; penalized regression; strictly concave penalty function.
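
For orientation, the kind of estimator the abstract refers to is usually written as a penalized least-squares problem. The form below is a generic sketch only: the $1/(2n)$ scaling and the symbols $y$, $X$, $\beta$, $d$ and $p_\lambda$ are assumptions for illustration, and the abstract states neither the paper's exact objective nor the formula of the modified log penalty.

$$
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^{d}} \; \frac{1}{2n} \lVert y - X\beta \rVert_2^2 + \sum_{j=1}^{d} p_\lambda(|\beta_j|)
$$

Here $p_\lambda$ is taken to be strictly concave on $[0, \infty)$, in contrast to the LASSO penalty $\lambda |\beta_j|$, which is linear there.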

Conflict of interest statement

No potential conflict of interest was reported by the authors.

Figures

**Figure 1.**
Plots of the modified log (MLOG) penalty functions and their thresholding rules. (a) The MLOG penalties: MLOG1 uses $\lambda = 1$, MLOG2 uses $\lambda = 0.01$ and MLOG3 uses $\lambda = 4$. (b) The thresholding rules: MLOG1 uses $\lambda = 1$, MLOG2 uses $\lambda = 0.01$ and MLOG3 uses $\lambda = 4$.

**Figure A.1.**
Simulation results of Example 4.1 – The mean number of false positives (FP) and false negatives (FN). Panels (a), (c), (e) show the results of Scenario (a). Panels (b), (d), (f) show the results of Scenario (b).

**Figure A.2.**
Simulation results of Example 4.2 – The mean number of false positives (FP) and false negatives (FN): Panels (a), (b) and (c) show the results of Scenario (a). Panels (d), (e) and (f) show the results of Scenario (b). Panels (g), (h) and (i) show the results of Scenario (c).

**Figure A.3.**
Simulation results of Example 4.3 – The mean number of false positives (FP) and false negatives (FN): Panels (a), (b) and (c) show the results of Scenario (a). Panels (d), (e) and (f) show the results of Scenario (b). Panels (g), (h) and (i) show the results of Scenario (c).

**Figure A.4.**
Simulation results of Example 4.4 – The mean number of false positives (FP) and false negatives (FN): Panel (a) shows the results of Model (a). Panel (b) shows the results of Model (b). Panel (c) shows the results of Model (c).
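
The appendix figures report mean numbers of false positives (FP) and false negatives (FN). The sketch below shows how such counts are conventionally computed, assuming that an FP is a covariate selected although its true coefficient is zero and an FN is a covariate dropped although its true coefficient is nonzero. The function names, the tolerance `tol` and the example numbers are hypothetical and are not taken from the paper.

```python
def count_fp_fn(beta_true, beta_hat, tol=0.0):
    """Return (FP, FN) for one estimated coefficient vector.

    A covariate counts as selected when |beta_hat_j| > tol
    (tol is an illustrative assumption, not a setting from the paper).
    """
    if len(beta_true) != len(beta_hat):
        raise ValueError("coefficient vectors must have the same length")
    fp = fn = 0
    for b_true, b_hat in zip(beta_true, beta_hat):
        truly_active = b_true != 0
        selected = abs(b_hat) > tol
        if selected and not truly_active:
            fp += 1  # selected, but the true coefficient is zero
        elif truly_active and not selected:
            fn += 1  # dropped, but the true coefficient is nonzero
    return fp, fn


def mean_fp_fn(beta_true, beta_hats, tol=0.0):
    """Average FP and FN over the estimates from repeated simulation runs."""
    counts = [count_fp_fn(beta_true, b, tol) for b in beta_hats]
    n_runs = len(counts)
    mean_fp = sum(c[0] for c in counts) / n_runs
    mean_fn = sum(c[1] for c in counts) / n_runs
    return mean_fp, mean_fn


# Made-up example: two simulation runs with five covariates.
beta_true = [3.0, 1.5, 0.0, 0.0, 2.0]
beta_hats = [
    [2.8, 0.0, 0.1, 0.0, 1.9],  # one FP (third covariate), one FN (second)
    [3.0, 1.4, 0.0, 0.0, 2.1],  # correct support
]
print(mean_fp_fn(beta_true, beta_hats))
```

Averaging the per-run counts gives one point of the kind plotted in such figures. The estimator that produces each `beta_hat` is left out on purpose.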
