J Appl Stat. 2019 Jul 3;47(2):201-230.
doi: 10.1080/02664763.2019.1637829. eCollection 2020.

Variable selection under multicollinearity using modified log penalty

Van Cuong Nguyen et al. J Appl Stat.

Abstract

To handle multicollinearity in regression analysis, this paper describes a class of 'strictly concave penalty functions'. As an example, a new penalty function called the 'modified log penalty' is introduced. Penalized estimators based on strictly concave penalties enjoy the oracle property under certain regularity conditions discussed in the literature. For multicollinearity cases where these conditions do not apply, the behavior of strictly concave penalties is examined through examples involving strongly correlated covariates. Real data examples and simulation studies show the finite-sample performance of the modified log penalty, in terms of prediction error, under scenarios exhibiting multicollinearity.

Keywords: Grouping effect; modified log penalty; multicollinearity; penalized regression; strictly concave penalty function.
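
As a rough illustration of why strongly correlated covariates are troublesome, the sketch below compares the condition number of the Gram matrix for weakly and strongly correlated designs. It is not taken from the paper: the helper names `correlated_design` and `gram_condition_number`, the sample size, and the correlation values are all assumptions chosen for illustration.

```python
import numpy as np

def correlated_design(n, rho, seed=0):
    """Draw two Gaussian covariates with population correlation rho (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    x1 = z1
    # Mix z1 and z2 so that corr(x1, x2) = rho in the population.
    x2 = rho * z1 + np.sqrt(1.0 - rho**2) * z2
    return np.column_stack([x1, x2])

def gram_condition_number(X):
    """Condition number of the scaled Gram matrix X^T X / n."""
    n = X.shape[0]
    return np.linalg.cond(X.T @ X / n)

# Illustrative settings only; not the scenarios used in the paper.
cond_weak = gram_condition_number(correlated_design(500, rho=0.1))
cond_strong = gram_condition_number(correlated_design(500, rho=0.99))

print(f"condition number, rho=0.10: {cond_weak:.2f}")
print(f"condition number, rho=0.99: {cond_strong:.2f}")
```

A large condition number means that least-squares coefficients are very sensitive to small perturbations of the data, which is the setting in which penalized estimators are typically considered.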

Conflict of interest statement

No potential conflict of interest was reported by the authors.

Figures

Figure 1.
Plots of the modified log (MLOG) penalty function and the corresponding thresholding rules. (a) The MLOG penalties: MLOG1 uses λ=1, MLOG2 uses λ=0.01, and MLOG3 uses λ=4. (b) The thresholding rules: MLOG1 uses λ=1, MLOG2 uses λ=0.01, and MLOG3 uses λ=4.
Figure A.1.
Simulation results of Example 4.1 – The mean number of false positives (FP) and false negatives (FN). Panels (a), (c), (e) show the results of Scenario (a). Panels (b), (d), (f) show the results of Scenario (b).
Figure A.2.
Simulation results of Example 4.2 – The mean number of false positives (FP) and false negatives (FN): Panels (a), (b) and (c) show the results of Scenario (a). Panels (d), (e) and (f) show the results of Scenario (b). Panels (g), (h) and (i) show the results of Scenario (c).
Figure A.3.
Simulation results of Example 4.3 – The mean number of false positives (FP) and false negatives (FN): Panels (a), (b) and (c) show the results of Scenario (a). Panels (d), (e) and (f) show the results of Scenario (b). Panels (g), (h) and (i) show the results of Scenario (c).
Figure A.4.
Simulation results of Example 4.4 – The mean number of false positives (FP) and false negatives (FN): Panel (a) shows the results of Model (a). Panel (b) shows the results of Model (b). Panel (c) shows the results of Model (c).
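
The appendix figures report the mean number of false positives (FP) and false negatives (FN). A minimal sketch of how such counts are commonly computed is given below, assuming the usual definitions: an FP is a selected variable whose true coefficient is zero, and an FN is an unselected variable whose true coefficient is nonzero. The function names, the tolerance `tol`, and the toy coefficient vectors are hypothetical and are not taken from the paper.

```python
import numpy as np

def selection_errors(beta_true, beta_hat, tol=1e-8):
    """Return (FP, FN) counts for one estimated coefficient vector."""
    beta_true = np.asarray(beta_true, dtype=float)
    beta_hat = np.asarray(beta_hat, dtype=float)
    truly_active = beta_true != 0          # variables in the true model
    selected = np.abs(beta_hat) > tol      # variables kept by the estimator
    fp = int(np.sum(selected & ~truly_active))
    fn = int(np.sum(~selected & truly_active))
    return fp, fn

def mean_selection_errors(beta_true, estimates, tol=1e-8):
    """Average the FP and FN counts over several simulation replicates."""
    counts = np.array(
        [selection_errors(beta_true, b, tol) for b in estimates], dtype=float
    )
    return counts[:, 0].mean(), counts[:, 1].mean()

# Toy data for illustration only.
beta_true = [3.0, 1.5, 0.0, 0.0, 2.0]
estimates = [
    [2.8, 1.2, 0.0, 0.4, 1.9],  # one spurious variable selected
    [3.1, 0.0, 0.0, 0.0, 2.2],  # one true variable missed
]
mean_fp, mean_fn = mean_selection_errors(beta_true, estimates)
print(mean_fp, mean_fn)
```

Averaging these counts over replicates gives one point per method and scenario, which is the kind of summary the FP/FN panels display.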
