Stat Med. 2025 Mar 30;44(7):e70078.
doi: 10.1002/sim.70078.

Frequentist Grouped Weighted Quantile Sum Regression for Correlated Chemical Mixtures

Daniel Rud et al. Stat Med.

Abstract

As individuals are exposed to a myriad of potentially harmful pollutants every day, it is important to determine which actors have the greatest influence on health outcomes. However, jointly modeling the associations of multiple pollutant exposures is often hindered by the presence of highly correlated chemicals originating from a common source. A popular approach to analyzing associations between a disease outcome and several highly correlated exposures is Weighted Quantile Sum Regression (WQSR) modeling. WQSR provides increased stability in estimating model parameters but requires data splitting to estimate individual and group effects of chemicals, which reduces the power of the approach. A recent Bayesian implementation of WQSR provides a model fitting procedure that avoids data splitting at the cost of high computational expense on large data. In this paper, we introduce a Frequentist Grouped Weighted Quantile Sum Regression (FGWQSR) model that can be fitted efficiently to large datasets without requiring data splitting. FGWQSR produces estimates of the joint effect of mixture groups and of individual chemicals, and likelihood-ratio-based tests that account for FGWQSR's non-standard asymptotics. We demonstrate that FGWQSR is well calibrated for type-I errors while outperforming both Bayesian Grouped Weighted Quantile Sum Regression (BGWQSR) and Quantile Logistic Regression (QLR) in terms of statistical power to detect the effects of mixture groups and individual chemicals. In addition, we show that FGWQSR is robust to model misspecification and can be fitted on large datasets in a fraction of the time required for BGWQSR. We apply FGWQSR to a dataset of 317 767 mother-child pairs with exposure profiles generated by chemical transport models to study the associations between several components found in particulate matter with an aerodynamic diameter smaller than 2.5 μm (PM2.5) and child Autism Spectrum Disorder (ASD) diagnosis before age 5. PM2.5 copper and PM2.5 crustal material are found to be statistically significantly associated with ASD diagnosis by five years of age.
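
As a rough illustration of the weighted quantile sum idea that WQSR-type models build on, the sketch below scores each chemical into quantile bins and combines the scores with non-negative weights that sum to 1 within a mixture group. It is a minimal sketch under stated assumptions, not the FGWQSR fitting procedure: the function names `quantile_scores` and `wqs_index`, the use of quartiles, and the toy exposure values are all hypothetical, and the weights are supplied by hand rather than estimated.

```python
from bisect import bisect_right
from statistics import quantiles


def quantile_scores(values, n_bins=4):
    """Map each value to a quantile bin 0..n_bins-1 (quartiles by default)."""
    cuts = quantiles(values, n=n_bins)  # n_bins - 1 interior cut points
    return [bisect_right(cuts, v) for v in values]


def wqs_index(exposures, weights, n_bins=4):
    """Weighted quantile sum for one mixture group.

    exposures: one list of measurements per chemical (all the same length).
    weights:   one non-negative weight per chemical, summing to 1.
    """
    if len(exposures) != len(weights):
        raise ValueError("need exactly one weight per chemical")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    scored = [quantile_scores(col, n_bins) for col in exposures]
    n_subjects = len(exposures[0])
    return [
        sum(w * col[i] for w, col in zip(weights, scored))
        for i in range(n_subjects)
    ]


# Toy data (made up): two chemicals measured on eight subjects.
chem_1 = [1, 2, 3, 4, 5, 6, 7, 8]
chem_2 = [8, 7, 6, 5, 4, 3, 2, 1]
print(wqs_index([chem_1, chem_2], [0.5, 0.5]))
```

In a regression setting the resulting index would enter the model as a single covariate per group, so the group coefficient describes the joint effect and the weights describe each chemical's share of it.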

Keywords: autism spectrum disorder; chemical mixture modeling; constrained optimization; group sign constrained regression; non‐regular likelihood asymptotics; pollutant mixture modeling; weighted quantile sum regression.

Conflict of interest statement

The authors declare no conflicts of interest.

Figures

FIGURE A1
Results of correctly specified simulations comparing the performance of FGWQSR, BGWQSR, and QLR over 1 000 replicates of simulated datasets with size n = 500 and a case‐control ratio of approximately 1:1. Columns 1–4 correspond to mixture group effect power, individual chemical power, individual chemical MSE, and individual chemical bias, respectively. In the presented simulations, CCS is fixed to (0.9, 0.5) and the weight distribution of chemicals within groups is set as follows: wA = (1/3, 1/3, 1/3, 0, 0); wB = (1/2, 1/2, 0, 0); and wC = (1/3, 1/3, 1/3, 0, 0). Each row denoted (A)–(E) corresponds to one of five levels of SOA: SOA1 = (1, 1, 1); SOA2 = (0.9, 1.1, 1.1); SOA3 = (0.85, 1.15, 1.15); SOA4 = (0.8, 1.2, 1.2); and SOA5 = (0.7, 1.3, 1.3). Lines and points in green, mustard, and light blue correspond to metrics computed from FGWQSR, BGWQSR, and QLR models, respectively. Dashed horizontal purple and black lines in power plots denote powers of 0.04, 0.05, and 0.06 as a reference for type 1 errors in the presence of null mixture group effects and null individual chemical effects. Individual chemical effects colored in dark red denote null individual chemicals with weights of 0. Note that row (A) corresponds to the case where all group effects are null, which implies that all individual chemical effects are also null.
FIGURE A2
Results of moderate misspecification simulations comparing the performance of FGWQSR, BGWQSR, and QLR over 1 000 replicates of simulated datasets with size n = 500 and a case‐control ratio of approximately 1:1. Columns 1–4 correspond to mixture group effect power, individual chemical power, individual chemical MSE, and individual chemical bias, respectively. In the presented simulations, CCS is fixed to (0.9, 0.5) and the weight distribution of chemicals within groups is set as follows: wA = (1/3, 1/3, 1/3, 0, 0); wB = (1/6, 1/6, 1/6, 1/6, 1/6, 1/6); and wC = (1/3, 1/3, 1/3, 0, 0). Each row denoted (A)–(E) corresponds to one of five levels of SOA: SOA1 = (1, 1, 1); SOA2 = (0.9, 1.1, 1.1); SOA3 = (0.85, 1.15, 1.15); SOA4 = (0.8, 1.2, 1.2); and SOA5 = (0.7, 1.3, 1.3). Lines and points in green, mustard, and light blue correspond to metrics computed from FGWQSR, BGWQSR, and QLR models, respectively. Dashed horizontal purple and black lines in power plots denote powers of 0.04, 0.05, and 0.06 as a reference for type 1 errors in the presence of null mixture group effects and null individual chemical effects. Individual chemical effects colored in dark red denote null individual chemicals with weights of 0. Individual chemicals in group B with a dark blue shading (B1–B4) have a positive weight of size 1/6, while chemicals with an orange shading (B5 and B6) have a negative weight of size 1/6. Note that row (A) corresponds to the case where all group effects are null, which implies that all individual chemical effects are also null.
FIGURE A3
Results of runtime comparison simulations between FGWQSR and BGWQSR. Runtimes are resolved at dataset sizes of n = {100, 500, 1 000, 2 000, 5 000, 7 500, 10 000, 20 000, 50 000, 100 000} with a case‐control ratio of approximately 1:1, and CCS fixed to (0.7, 0.3). Simulations are run for 3‐group and 6‐group mixture configurations, with SOA = (0.8, 1, 1.2) for the 3‐group configuration and SOA = (0.8, 1, 1.2, 0.8, 1, 1.2) for the 6‐group configuration. Weight distributions are as follows. 3‐group: w1 = (1/3, 1/3, 1/3, 0, 0), w2 = (1/2, 1/2, 0, 0), and w3 = (1/3, 1/3, 1/3, 0, 0). 6‐group: w1 = (1/3, 1/3, 1/3, 0, 0), w2 = (1/2, 1/2, 0, 0), w3 = (1/3, 1/3, 1/3, 0, 0), w4 = (1/3, 1/3, 1/3, 0, 0), w5 = (1/2, 1/2, 0, 0), and w6 = (1/3, 1/3, 1/3, 0, 0). Lines and points in green and orange correspond to the averaged runtimes of FGWQSR and BGWQSR models, respectively, over 10 replicates. Solid lines correspond to runtime simulations under a 3‐group configuration, while two‐dash lines correspond to runtime simulations under a 6‐group configuration. Model runtimes are plotted on a log base 60 scale of minutes, with labels at selected runtimes.
FIGURE A4
Distribution of pregnancy‐averaged PM2.5 component exposure generated from Source Oriented Chemical Transport models. Red points denote mean PM2.5 component exposure levels. Pollutant concentrations are displayed in log(μg/m³) units.
FIGURE A5
Correlation plot of PM2.5 components. Red boxes denote the clusters resulting from hierarchical clustering on 1 − C, where C denotes the correlation matrix, using complete linkage and a cutpoint value chosen to establish 3 clusters.
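
The grouping step described in the Figure A5 caption can be sketched as agglomerative clustering on the dissimilarity 1 − C with complete linkage, stopped once three clusters remain. This is a minimal, dependency-free sketch under stated assumptions: the function name `complete_linkage_clusters` and the toy correlation matrix are hypothetical and are not taken from the study, and stopping at a target cluster count is used here in place of choosing an explicit cutpoint on the dendrogram.

```python
def complete_linkage_clusters(corr, n_clusters):
    """Agglomerative clustering on 1 - corr with complete linkage.

    corr:       symmetric correlation matrix as a list of lists.
    n_clusters: number of clusters at which merging stops.
    Returns a list of clusters, each a sorted list of variable indices.
    """
    p = len(corr)
    dist = [[1.0 - corr[i][j] for j in range(p)] for i in range(p)]
    clusters = [[i] for i in range(p)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is
                # the largest pairwise distance between their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy correlation matrix (made up): components 0-1 and 2-3 are strongly
# correlated within pairs, component 4 is only weakly correlated with the rest.
toy_corr = [
    [1.0, 0.9, 0.1, 0.1, 0.0],
    [0.9, 1.0, 0.1, 0.1, 0.0],
    [0.1, 0.1, 1.0, 0.8, 0.0],
    [0.1, 0.1, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
print(complete_linkage_clusters(toy_corr, 3))
```

Complete linkage tends to produce compact groups in which every pair of members is highly correlated, which suits the goal of grouping components that plausibly share a source.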
