J Appl Stat. 2021 Jul 26;49(14):3564-3590.
doi: 10.1080/02664763.2021.1957789. eCollection 2022.

A new GEE method to account for heteroscedasticity using asymmetric least-square regressions

Amadou Barry et al. J Appl Stat.

Abstract

Generalized estimating equations (GEE) are widely used to analyze longitudinal data; however, they are not appropriate for heteroscedastic data, because they estimate regressor effects only on the mean response and therefore do not account for data heterogeneity. Here, we combine the GEE with asymmetric least squares (expectile) regression to derive a new class of estimators, which we call generalized expectile estimating equations (GEEE). The GEEE model estimates regressor effects on the expectiles of the response distribution, which provides a detailed view of regressor effects across the entire response distribution. In addition to capturing data heteroscedasticity, the GEEE extends the various working correlation structures to account for within-subject dependence. We derive the asymptotic properties of the GEEE estimators and propose a robust estimator of their covariance matrix for inference (see our R package, github.com/AmBarry/expectgee). Our simulations show that the GEEE estimator is unbiased and efficient, and our real data analysis shows that it captures heteroscedasticity.
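
For readers unfamiliar with expectiles, the standard asymmetric least-squares criterion can be written as below. This is the textbook definition under an independence working correlation, not the GEEE estimating equation from the paper; the symbols y_ij, x_ij and β are notation assumed here for illustration (response and regressors for observation j of subject i).

```latex
% Asymmetric squared loss at level \tau \in (0, 1)
\rho_\tau(u) = \bigl|\tau - \mathbb{1}(u < 0)\bigr| \, u^{2}

% Expectile regression coefficients (independence case)
\hat{\beta}_\tau
  = \arg\min_{\beta} \sum_{i=1}^{n} \sum_{j=1}^{m_i}
    \rho_\tau\bigl(y_{ij} - x_{ij}^{\top}\beta\bigr)

% At \tau = 0.5 the loss is u^{2}/2, so the criterion reduces to
% ordinary least squares, i.e. regression on the mean.
```

Negative residuals receive weight 1 − τ and non-negative residuals receive weight τ, so values of τ above 0.5 pull the fitted line toward the upper part of the response distribution.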

Keywords: Expectile regression; GEE working correlation; cluster data; longitudinal data; quantile regression.

Conflict of interest statement

No potential conflict of interest was reported by the author(s).

Figures

Figure 1.
Comparison of the GEE model (left) and GEEE model (right) for a heteroscedastic sample: (a) shows a GEE mean regression line fitted to the data, and (b) shows the five GEEE regression lines, τ ∈ (0.1, 0.25, 0.5, 0.75, 0.9), fitted to the data. The sample (n = 90) is generated from a heteroscedastic linear model, y = 6 + 0.025x + ϵ, where ϵ ∼ N(0, σ²) and σ = 1 + 0.05x.
Figure 2.
Bias distribution of β̂₂ represented as an error plot according to the sample size n ∈ (50, 100, 250), the degree of correlation ρ ∈ (0.1, 0.5, 0.9), the expectiles τ ∈ (0.5, 0.6, 0.7, 0.9) and the error term ϵ ∼ N(0, 1) in a location-shift scenario. (A,B) represent the results for the balanced (m = 4) and unbalanced panel (m ∼ U(3, 7)), respectively.
Figure 3.
Bias distribution of β̂₂ represented as an error plot according to the sample size n ∈ (50, 100, 250), the degree of correlation ρ ∈ (0.1, 0.5, 0.9), the expectiles τ ∈ (0.5, 0.6, 0.7, 0.9) and the error term ϵ ∼ N(0, 1) in a location-scale-shift scenario. (C,D) represent the results for the balanced (m = 4) and unbalanced panel (m ∼ U(3, 7)), respectively.
Figure 4.
Boxplot of the labor pain score for the placebo and the pain medication groups. The solid lines connect the medians and the dashed lines connect the means.
Figure 5.
Multiple imputation parameter estimates and 95% confidence interval of the GEEE independent, exchangeable and AR(1) correlation models at τ=(0.05, 0.25, 0.5, 0.75, 0.95). The classical GEE corresponds to the GEEE model with τ=0.5.
Figure 6.
Boxplot of the labor pain score and the multiple imputation estimated curves of the expectile functions τ=(0.05, 0.25, 0.5, 0.75, 0.95) for the placebo group and the pain medication group using the GEEE and AR(1) correlation model. The classical GEE corresponds to the GEEE model with τ=0.5.
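
The heteroscedastic setup described in the Figure 1 caption can be reproduced with a small simulation. The sketch below is not the authors' expectgee package and ignores within-subject correlation: it fits plain expectile regression by iteratively reweighted least squares with NumPy. The function name `expectile_fit`, the random seed, and the range of x (uniform on 0 to 100) are assumptions made for illustration; the caption does not state how x was drawn.

```python
import numpy as np


def expectile_fit(X, y, tau, n_iter=100, tol=1e-10):
    """Expectile regression via iteratively reweighted least squares.

    Hypothetical helper: independence case only, no working correlation.
    """
    # Start from the ordinary least-squares solution.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta
        # Asymmetric weights: 1 - tau below the fit, tau at or above it.
        w = np.where(resid < 0, 1.0 - tau, tau)
        XtW = X.T * w  # scales column i of X.T by weight w[i]
        new_beta = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Simulate the model from the Figure 1 caption:
# y = 6 + 0.025 x + eps, eps ~ N(0, sigma^2), sigma = 1 + 0.05 x, n = 90.
rng = np.random.default_rng(0)        # seed is an arbitrary choice
n = 90
x = rng.uniform(0.0, 100.0, size=n)   # assumption: range of x is not given
sigma = 1.0 + 0.05 * x
y = 6.0 + 0.025 * x + rng.normal(0.0, sigma)

X = np.column_stack([np.ones(n), x])  # intercept and slope columns
taus = (0.1, 0.25, 0.5, 0.75, 0.9)
fits = {tau: expectile_fit(X, y, tau) for tau in taus}

for tau, (intercept, slope) in fits.items():
    print(f"tau={tau:<4}  intercept={intercept:8.3f}  slope={slope:8.4f}")
```

At τ = 0.5 all weights are equal, so the fit coincides with ordinary least squares, which mirrors the captions' remark that the classical GEE corresponds to the GEEE model with τ = 0.5.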
