PLoS Comput Biol. 2026 Mar 13;22(3):e1013922.
doi: 10.1371/journal.pcbi.1013922. eCollection 2026 Mar.

An approximate-copula distribution for statistical modeling

Sarah S Ji et al. PLoS Comput Biol. 2026.

Abstract

Copulas, generalized estimating equations, and generalized linear mixed models promote the analysis of grouped data where non-normal responses are correlated. Unfortunately, parameter estimation remains challenging in these three frameworks. Based on prior work of Tonda, we derive a new class of probability density functions that allow explicit calculation of moments, marginal and conditional distributions, and the score and observed information needed in maximum likelihood estimation. We also illustrate how the new distribution flexibly models longitudinal data following a non-Gaussian distribution. Finally, we conduct a trivariate genome-wide association analysis on dichotomized systolic and diastolic blood pressure and body mass index data from the UK Biobank, showcasing the modeling potential and computational scalability of the new distributional family.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1. Simulation study under the longitudinal model.
The top panel shows the MSE of β and θ under the Simulation I setting with a Poisson base (left) and a negative binomial base (right). The bottom panel shows the MSE of β and θ under the Simulation II setting with a Poisson base (left) and a negative binomial base (right). Here cluster size refers to d_i, the number of observations per sample. AC abbreviates approximate-copula and GLMM abbreviates generalized linear mixed model.
Fig 2. Power simulation for the proposed multivariate GWAS routine in Algorithm (2.8).
Here AC denotes approximate-copula, IHT denotes iterative hard thresholding, a penalized sparse regression method [3], and GEMMA implements a multivariate linear mixed model [23]. The colored bands represent ±1 standard deviation.
Fig 3. A 3-trait multivariate GWAS on BMI, dichotomized SBP, and dichotomized DBP.
The black horizontal dotted line indicates the genome-wide significance threshold of 5×10⁻⁸. The most significant SNP within a 1 Mb window is labeled and colored purple. All other significant SNPs are colored blue and unlabeled. The legend on the right shows chromosome density.
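
The labeling rule in the Fig 3 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' plotting code: the caption does not say how the 1 Mb windows are formed, so the sketch assumes a greedy pass over significant SNPs in order of ascending p-value, where a SNP is labeled only if no already-labeled SNP on the same chromosome lies within 1 Mb. The names `Snp` and `lead_snps` and the SNP identifiers in the usage example are hypothetical.

```python
from typing import NamedTuple

GENOME_WIDE_THRESHOLD = 5e-8  # dotted line in Fig 3
WINDOW_BP = 1_000_000         # 1 Mb window from the caption


class Snp(NamedTuple):
    name: str      # hypothetical identifier, for illustration only
    chrom: int
    pos: int       # base-pair position on the chromosome
    pvalue: float


def lead_snps(snps, threshold=GENOME_WIDE_THRESHOLD, window=WINDOW_BP):
    """Split significant SNPs into labeled leads and the remaining hits.

    Assumption (not from the paper): greedy selection by ascending p-value.
    A SNP becomes a lead only if no previously chosen lead on the same
    chromosome lies within `window` base pairs of it.
    """
    significant = sorted(
        (s for s in snps if s.pvalue < threshold),
        key=lambda s: s.pvalue,
    )
    leads, others = [], []
    for s in significant:
        near_lead = any(
            lead.chrom == s.chrom and abs(lead.pos - s.pos) <= window
            for lead in leads
        )
        (others if near_lead else leads).append(s)
    return leads, others


# Made-up example data; rsD does not pass the threshold.
example = [
    Snp("rsA", 1, 1_000_000, 1e-12),
    Snp("rsB", 1, 1_400_000, 3e-9),
    Snp("rsC", 1, 5_000_000, 2e-8),
    Snp("rsD", 2, 1_000_000, 1e-6),
]
leads, others = lead_snps(example)
print([s.name for s in leads])   # ['rsA', 'rsC']  (would be labeled, purple)
print([s.name for s in others])  # ['rsB']         (significant, blue, unlabeled)
```

Other window definitions, such as fixed non-overlapping 1 Mb bins, would give slightly different labels near bin boundaries; the greedy rule is used here only because it is simple to state.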

References

    1. Bates D, Mächler M, Bolker B, Walker S. Fitting linear mixed-effects models using lme4. J Stat Soft. 2015;67(1). doi: 10.18637/jss.v067.i01
    2. Breslow NE, Clayton DG. Approximate inference in generalized linear mixed models. Journal of the American Statistical Association. 1993;88(421):9–25. doi: 10.1080/01621459.1993.10594284
    3. Chu BB, Ko S, Zhou JJ, Jensen A, Zhou H, Sinsheimer JS, et al. Multivariate genome-wide association analysis by iterative hard thresholding. Bioinformatics. 2023;39(4):btad193. doi: 10.1093/bioinformatics/btad193
    4. Cohen BB. Plan and operation of the NHANES I epidemiologic followup study, 1982–84. US Department of Health and Human Services, Public Health Service, National; 1987.
    5. Fitzmaurice GM, Laird NM, Ware JH. Applied longitudinal analysis. John Wiley & Sons; 2012.
