Environ Health Perspect. 2021 Jun;129(6):67006.
doi: 10.1289/EHP8610. Epub 2021 Jun 23.

Mining of Consumer Product Ingredient and Purchasing Data to Identify Potential Chemical Coexposures

Zachary Stanfield et al. Environ Health Perspect. 2021 Jun.

Abstract

Background: Chemicals in consumer products are a major contributor to human chemical coexposures. Consumers purchase and use a wide variety of products containing potentially thousands of chemicals. There is a need to identify potential real-world chemical coexposures to prioritize in vitro toxicity screening. However, due to the vast number of potential chemical combinations, this identification has been a major challenge.

Objectives: We aimed to develop and implement a data-driven procedure for identifying prevalent chemical combinations to which humans are exposed through purchase and use of consumer products.

Methods: We applied frequent itemset mining to an integrated data set linking consumer product chemical ingredient data with product purchasing data from 60,000 households to identify chemical combinations resulting from co-use of consumer products.

Results: We identified co-occurrence patterns of chemicals over all households as well as those specific to demographic groups based on race/ethnicity, income, education, and family composition. We also identified chemicals with the highest potential for aggregate exposure by identifying chemicals occurring in multiple products used by the same household. Last, a case study of chemicals active in estrogen and androgen receptor in silico models revealed priority chemical combinations co-targeting receptors involved in important biological signaling pathways.

Discussion: Integration and comprehensive analysis of household purchasing data and product-chemical information provided a means to assess human near-field exposure and inform selection of chemical combinations for high-throughput screening in in vitro assays. https://doi.org/10.1289/EHP8610.

Figures

Figure 1 is a flow chart with six steps.
Step 1: Consumer Product Purchase (CPP) data (4,674,292 purchases; 133,966 products; 60,476 households) lead to unmapped data, a direct match with 10,719 UPCs, and a fuzzy match with 20,656 UPCs. CPDat (230,407 products; 1,082 chemicals) with UPC mapping leads to the fuzzy match.
Step 2: The direct match (10,719 UPCs) and the fuzzy match (20,656 UPCs) lead to mapped data (2,499,829 purchases; 31,585 products; 60,324 households).
Step 3: Mapped data, filtered for compliance (12 or more total purchases by a household over 1 year), lead to noncompliant households and compliant households (2,351,560 purchases; 31,375 products; 783 chemicals (DTXSIDs); 53,525 households).
Step 4: Compliant households, combined with FUSe (14,272 chemicals; 137 functional uses), with purchases aggregated by month and functional uses incorporated, lead to chemicals introduced to individual households (539,857 transactions; 50 functional uses; 623 chemicals with known use).
Step 5: Chemicals introduced to individual households, combined with TSCA (31,478 chemicals) as chemicals of interest, lead to broad chemicals (649 chemicals).
Step 6: Broad chemicals, combined with CERAPP (1,142 ER positive), CoMPARA (16,112 AR positive), and Dodson et al. 2012 (55 curated) as case study chemicals, lead to EDCs (65 chemicals).
Figure 1.
Data processing pipeline for frequent itemset mining of chemicals in consumer purchasing. Consumer purchasing data was obtained from Nielsen, mapped with a database linking chemicals to products and integrated with chemical functional use information, and purchases were aggregated by month to focus on chemical coexposure. For analysis, chemicals were limited to a broad set from the nonconfidential Toxic Substances Control Act (TSCA) inventory and a smaller pathway-based case study (endocrine-disrupting chemicals).
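The compliance filter and monthly aggregation described for Figure 1 can be illustrated with a minimal sketch. This is not the authors' code: the record layout, the `build_transactions` helper, and the toy product-to-chemical map are all hypothetical, and the 12-purchase threshold is applied here to whatever purchase records are passed in.

```python
from collections import defaultdict


def build_transactions(purchases, product_chemicals, min_purchases=12):
    """Turn purchase records into household-month chemical sets.

    purchases: iterable of (household_id, month, product_id) tuples (hypothetical layout).
    product_chemicals: dict mapping product_id to a set of chemical names.
    Households with fewer than min_purchases records are treated as noncompliant.
    """
    # Count purchases per household for the compliance filter.
    counts = defaultdict(int)
    for household, _month, _product in purchases:
        counts[household] += 1

    # Pool the chemicals of all products a compliant household bought in a month.
    pooled = defaultdict(set)
    for household, month, product in purchases:
        if counts[household] < min_purchases:
            continue
        pooled[(household, month)].update(product_chemicals.get(product, ()))

    # Drop household-months whose products had no mapped chemicals.
    return {key: frozenset(chems) for key, chems in pooled.items() if chems}


# Toy data, invented for illustration only.
product_chemicals = {
    "soap": {"glycerol", "sodium hydroxide"},
    "spray": {"ethanol", "propane"},
}
purchases = [("H1", f"2012-{m:02d}", "soap") for m in range(1, 13)]
purchases.append(("H1", "2012-01", "spray"))
purchases += [("H2", "2012-01", "spray")] * 3  # only 3 purchases: noncompliant

transactions = build_transactions(purchases, product_chemicals)
```

Each resulting household-month set plays the role of one transaction for the mining step, so chemicals from different products bought in the same month appear together.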
Figure 2 is a heat map. The left y-axis gives prevalence (bottom to top: 0.119, 0.121, 0.125, 0.125, 0.128, 0.131, 0.131, 0.133, 0.154, 0.154, 0.186, 0.19, 0.21, 0.222, 0.228, 0.24, 0.242, 0.26, 0.332, and 0.517). The right y-axis lists the chemicals: citric acid; titanium dioxide; ethanolamine; carrageenan, native; sodium hypochlorite; C10-16-alkyldimethylamines oxides; diethylenetriaminepentaacetic acid pentasodium salt; d-limonene; sodium chloride; sodium [dodecanoyl(methyl)amino]acetate; sodium carbonate; propane; sodium hydroxide; poly(oxy-1,2-ethanediyl), .alpha.-sulfo-.omega.-hydroxy-, C10-16 alkyl ethers, sodium salts; sulfuric acid, mono-C10-16 alkyl esters, sodium salts; isobutane; sodium dodecyl sulfate; 1,2-propylene glycol; glycerol; and ethanol. The x-axis lists functional use, then Asian, African American, Hispanic, White, grade and high school, college, post college, no child, under 6, under 13, under 18, lower, mid lower, mid higher, higher, non-childbearing, and childbearing. The rank difference scale ranges from -10 to 5 in increments of 5. Functional use has five categories: ubiquitous, fragrance, surfactant, pH stabilizer, and antimicrobial.
Figure 2.
Prevalence and ranking of individual chemicals. Heat map illustrating the ranked support for the 20 most prevalent chemicals. For a demographic, green color denotes a downward shift of rank relative to the global (lower priority/potential exposure), and red denotes an upward shift (higher priority/potential exposure). Cell numbers quantify the unit change in rank (for households in that demographic) relative to the global rank (all households). Note that ranks are not comparable across demographics as a quantitative measure but are intended to suggest shifts in potential exposure for different demographics with respect to all households. Column annotations indicate demographic categories, and row annotations indicate harmonized functional use of chemicals.
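The rank shift shown in each heat map cell can be sketched as follows. This is an illustration under assumptions, not the published procedure: the function names are hypothetical, ties are broken alphabetically, and the sign convention (positive means the chemical ranks higher in the demographic than globally) is a choice made here.

```python
def rank_by_support(transactions, chemicals):
    """Return {chemical: rank}, where rank 1 is the highest support.

    Support is the fraction of transactions containing the chemical.
    Ties are broken alphabetically (an assumption of this sketch).
    """
    n = len(transactions)
    supports = {c: sum(c in t for t in transactions) / n for c in chemicals}
    ordered = sorted(chemicals, key=lambda c: (-supports[c], c))
    return {c: position + 1 for position, c in enumerate(ordered)}


def rank_shift(global_transactions, group_transactions, chemicals):
    """Global rank minus demographic rank; positive means an upward shift."""
    global_rank = rank_by_support(global_transactions, chemicals)
    group_rank = rank_by_support(group_transactions, chemicals)
    return {c: global_rank[c] - group_rank[c] for c in chemicals}


# Toy transactions tagged with a made-up demographic label.
tagged = [
    ("A", {"ethanol", "glycerol"}),
    ("A", {"ethanol"}),
    ("A", {"ethanol", "glycerol"}),
    ("B", {"propane"}),
    ("B", {"propane", "ethanol"}),
    ("B", {"propane", "glycerol"}),
]
chemicals = ["ethanol", "glycerol", "propane"]
all_tx = [t for _, t in tagged]
group_b = [t for g, t in tagged if g == "B"]

shifts = rank_shift(all_tx, group_b, chemicals)
```

In this toy example propane moves from global rank 3 to rank 1 within group B, a shift of +2, while ethanol and glycerol each drop one place. As the caption notes, such shifts suggest relative changes in potential exposure and are not a quantitative measure across demographics.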
Figure 3 is a heat map. Rows (y-axis) are chemical combinations, with members separated by a vertical bar, for example ethanol | sodium hydroxide; propane | isobutane; ethanol | glycerol; ethanol | 1,2-propylene glycol; sodium dodecyl sulfate | glycerol; and ethanol | sodium dodecyl sulfate. The remaining rows combine sulfuric acid, mono-C10-16 alkyl esters, sodium salts and poly(oxy-1,2-ethanediyl), .alpha.-sulfo-.omega.-hydroxy-, C10-16 alkyl ethers, sodium salts with each other and with ethanol, 1,2-propylene glycol, or C10-16 alkyldimethylamines oxides. Columns (x-axis) are the functional use annotations (antimicrobial, pH stabilizer, fragrance, ubiquitous, surfactant) followed by grade and high school, African American, mid higher, lower, mid lower, White, no child, non-childbearing, Asian, post college, childbearing, higher, under 13, under 18, college, Hispanic, and under 6. The rank difference scale ranges from -15 to 10 in increments of 5. The scale for number of chemicals with function ranges from 0 to 2 in unit steps. Demographic has five categories: race or ethnicity, education, income, family composition, and female age.
Figure 3.
Ranking of co-occurring chemicals. Heat map illustrating the ranked support for the 20 most prevalent chemical combinations. For a demographic, green color denotes a downward shift of rank relative to the global (lower priority), and red denotes an upward shift (higher priority). Cell numbers quantify the unit change in rank relative to the global rank. Note that ranks are not comparable across demographics as a quantitative measure but are intended to suggest shifts in potential exposure for different demographics with respect to all households. Column annotations indicate demographic categories and row annotations indicate harmonized functional use of chemicals. (A) and (B) annotate groups with similar co-occurrence patterns across demographics as discussed in the main text. Rows and columns were clustered using complete linkage hierarchical clustering based on correlation of rank departures.
Figure 4 is a set of 25 stacked bar graphs arranged in five groups of five. Each group plots the number of chemical combinations (y-axis) across Education (grade and high school), Family comp (no child, under 6, under 13, and under 16), Female Age (not childbearing and childbearing), Income (lower, mid lower, mid higher, and higher), and Race or Ethnicity (Asian, African American, Hispanic, and White). Y-axis ranges by group: A, Detergents, 0 to 4000 in increments of 1000; B, Fresheners and deodorizers, 0 to 600 in increments of 200; C, Deodorant, 0 to 300 in increments of 100; D, Skin care preparations, 0 to 400 in increments of 200; E, Cosmetics, 0 to 600 in increments of 200.
Figure 4.
Total number of frequent chemical combinations across demographic groups for five product groups. For each combination of product group and demographic, the purchasing data were reduced to only those chemicals in products contained in the product group and only those households matching the specific demographic category. Parameters for frequent itemset mining: minimum support=0.1%, minimum set length=2, and maximum set length=10.
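To make the three mining parameters concrete, here is a small level-wise (Apriori-style) search over sets of chemicals. It is a naive sketch for illustration and not necessarily the algorithm or software the authors used; the `frequent_itemsets` name and the toy transactions are hypothetical. The defaults mirror the caption: minimum support 0.1%, set length 2 to 10.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.001, min_len=2, max_len=10):
    """Return {frozenset of chemicals: support} for all frequent sets.

    transactions: list of sets of chemical names.
    min_support: minimum fraction of transactions that must contain the set.
    """
    n = len(transactions)
    min_count = min_support * n

    # Level 1: count single chemicals.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}

    result = {}
    k = 1
    while current and k <= max_len:
        if k >= min_len:
            result.update({s: c / n for s, c in current.items()})
        if k == max_len:
            break
        # Join frequent k-sets into (k+1)-candidates and keep only those
        # whose every k-subset is itself frequent (Apriori pruning).
        frequent = set(current)
        candidates = set()
        for a, b in combinations(frequent, 2):
            union = a | b
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent for sub in combinations(union, k)
            ):
                candidates.add(union)
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        k += 1
    return result


# Toy transactions, invented for illustration.
toy = [
    {"ethanol", "glycerol", "propane"},
    {"ethanol", "glycerol"},
    {"ethanol", "glycerol", "propane"},
    {"propane"},
]
found = frequent_itemsets(toy, min_support=0.5)
pairs_only = frequent_itemsets(toy, min_support=0.5, max_len=2)
```

Counting the entries returned for a given product group and demographic subset would correspond to one bar in Figure 4. On real purchasing data an optimized implementation would be preferable, since this version rescans every transaction for each candidate.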
Figure 5 is a heat map titled "Endocrine active chemicals rank by demographic group." The left y-axis gives the prevalence ratio (bottom to top: 0.007, 0.008, 0.008, 0.01, 0.011, 0.012, 0.012, 0.013, 0.016, 0.018, 0.027, 0.029, 0.031, 0.035, 0.043, 0.044, 0.045, 0.066, and 0.081). The right y-axis lists the chemicals (top to bottom): decamethylcyclopentasiloxane; propylparaben; 2-hydroxy-4-methoxybenzophenone; linalool; 1-cedr-8-en-9-ylethanone; 1-tetradecanamine, N,N-dimethyl-, N-oxide; limonene; diphenyl oxide; methylparaben; benzyl acetate; FD&C Blue No. 1; dl-tocopherol mixture; dimethyldioctadecylammonium chloride; benzethonium chloride; methyl salicylate; diazolidinyl urea; phytonadione; octabenzone; quaternary ammonium compounds, di-C14-18-alkyldimethyl, Me sulfates; and behentrimonium methosulfate. The x-axis lists childbearing, non-childbearing, higher, mid higher, mid lower, lower, under 18, under 13, under 6, no child, post college, college, grade and high school, White, Hispanic, African American, Asian, functional use, and target receptor. The rank difference scale ranges from -4 to 4 in increments of 2. Target receptors are androgen, estrogen, and other. Functional uses are fragrance, surfactant, antimicrobial, masking agent, hair conditioner, colorant, preservative, UV absorber, emollient, and unknown.
Figure 5.
Prevalence and ranking of individual endocrine active chemicals (EACs). Heat map illustrating the ranked support for the 20 most prevalent EACs (0.1% prevalence threshold). For a demographic, green color denotes a downward shift of rank relative to the global (lower priority), and red denotes an upward shift (higher priority). Cell numbers quantify the unit change in rank relative to the global rank. Note that ranks are not comparable across demographics as a quantitative measure but are intended to suggest shifts in potential exposure for different demographics with respect to all households. Column annotations indicate demographic categories and row annotations indicate harmonized functional use of chemicals and their predicted target receptor.
Figure 6 is a heat map titled "Co-occurring endocrine active chemicals rank by demographic group." The left y-axis lists the chemical combinations (top to bottom): limonene | propylparaben; propylparaben | methylparaben | ethylparaben; propylparaben | FD&C Blue No. 1; limonene | FD&C Blue No. 1; limonene | propylparaben | FD&C Blue No. 1; diphenyl oxide | linalool; 2-hydroxy-4-methoxybenzophenone | propylparaben | benzophenone; dl-tocopherol mixture | phytonadione; decamethylcyclopentasiloxane | propylparaben; 2-hydroxy-4-methoxybenzophenone | methylparaben | ethylparaben | benzophenone; 2-hydroxy-4-methoxybenzophenone | propylparaben | methylparaben | ethylparaben | benzophenone; decamethylcyclopentasiloxane | 2-hydroxy-4-methoxybenzophenone | benzophenone; decamethylcyclopentasiloxane | linalool; diazolidinyl urea | propylparaben; 1-cedr-8-en-9-ylethanone | decamethylcyclopentasiloxane; 2-hydroxy-4-methoxybenzophenone | linalool | benzophenone; linalool | limonene; linalool | 2-phenylethanol; 1-cedr-8-en-9-ylethanone | propylparaben; and decamethylcyclopentasiloxane | limonene. The x-axis lists (right to left) childbearing, non-childbearing, higher, mid higher, mid lower, lower, under 18, under 13, under 6, no child, post college, college, grade and high school, White, Hispanic, African American, Asian, fragrance, colorant, UV absorber, masking agent, unknown, emollient, preservative, estrogen disruptor, androgen disruptor, and other. The rank difference scale ranges from -15 to 5 in increments of 5. The scales for number of chemicals with receptor activity and number of chemicals with functions each range from 0 to 3 in unit increments.
Figure 6.
Ranking of co-occurring endocrine active chemicals (EACs). Heat map illustrating the ranked support for the 20 most prevalent EAC combinations. For a demographic, green color denotes a downward shift of rank relative to the global (lower priority), and red denotes an upward shift (higher priority). Cell numbers quantify the unit change in rank relative to the global rank. Note that ranks are not comparable across demographics as a quantitative measure but are intended to suggest shifts in potential exposure for different demographics with respect to all households. Column annotations indicate demographic categories and row annotations indicate harmonized functional use of chemicals and their predicted target receptors.
