Occup Environ Med. 2013 Mar;70(3):203-10.
doi: 10.1136/oemed-2012-100918. Epub 2012 Nov 15.

Inside the black box: starting to uncover the underlying decision rules used in a one-by-one expert assessment of occupational exposure in case-control studies

David C Wheeler et al. Occup Environ Med. 2013 Mar.

Abstract

Objectives: Evaluating occupational exposures in population-based case-control studies often requires exposure assessors to review each study participant's reported occupational information job-by-job to derive exposure estimates. Although such assessments likely have underlying decision rules, they usually lack transparency, are time-consuming and have uncertain reliability and validity. We aimed to identify the underlying rules to enable documentation, review and future use of these expert-based exposure decisions.

Methods: Classification and regression trees (CART, predictions from a single tree) and random forests (predictions from many trees) were used to identify the underlying rules from the questionnaire responses and an expert's exposure assignments for occupational diesel exhaust exposure. Four metrics were modelled: binary exposure probability, and ordinal exposure probability, intensity and frequency. Data were split into training (n=10 488 jobs), testing (n=2247) and validation (n=2248) datasets.
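The three dataset sizes above add up to 14 983 jobs and correspond to roughly a 70/15/15 split. The abstract does not say how jobs were allocated, so the following is only a minimal sketch that assumes a simple random shuffle; the function name `split_jobs`, the fractions and the seed are illustrative choices, not the authors' procedure.

```python
import random

def split_jobs(jobs, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle job records and cut them into training, testing and validation lists.

    Assumption: a plain random split; the fractions are inferred from the
    reported counts, not stated in the abstract.
    """
    shuffled = list(jobs)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(fractions[0] * n)
    n_test = int(fractions[1] * n)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains
    return train, test, validation

# Placeholder job identifiers standing in for the 14 983 job records.
jobs = list(range(14983))
train, test, validation = split_jobs(jobs)
print(len(train), len(test), len(validation))
```

With 14 983 placeholder records, these fractions reproduce the reported sizes of 10 488, 2247 and 2248.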

Results: The CART and random forest models' predictions agreed with 92-94% of the expert's binary probability assignments. For ordinal probability, intensity and frequency metrics, the two models extracted decision rules more successfully for unexposed and highly exposed jobs (86-90% and 57-85%, respectively) than for low or medium exposed jobs (7-71%).
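The agreement figures above can be read as the share of jobs for which the model's prediction equals the expert's assignment, overall and within each expert-assigned category. The sketch below shows that calculation on made-up labels; that agreement was computed per expert-assigned category is an assumption, and the function names and toy data are hypothetical.

```python
from collections import defaultdict

def agreement(expert, predicted):
    """Percent of jobs where the prediction equals the expert's assignment."""
    if len(expert) != len(predicted):
        raise ValueError("expert and predicted must have the same length")
    matches = sum(e == p for e, p in zip(expert, predicted))
    return 100.0 * matches / len(expert)

def agreement_by_category(expert, predicted):
    """Percent agreement within each expert-assigned category."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for e, p in zip(expert, predicted):
        totals[e] += 1
        hits[e] += int(e == p)
    return {cat: 100.0 * hits[cat] / totals[cat] for cat in totals}

# Toy ordinal labels (0 = unexposed ... 3 = high); not data from the study.
expert = [0, 0, 0, 0, 1, 1, 2, 2, 3, 3]
predicted = [0, 0, 0, 0, 1, 0, 2, 1, 3, 3]

overall = agreement(expert, predicted)
per_category = agreement_by_category(expert, predicted)
print(overall, per_category)
```

In this toy example the extreme categories agree fully while the middle categories agree half the time, the same qualitative pattern the results describe.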

Conclusions: CART and random forest models extracted decision rules and accurately predicted an expert's exposure decisions for the majority of jobs, and identified questionnaire response patterns that would require further expert review if the rules were applied to other jobs in the same or a different study. This approach makes the exposure assessment process in case-control studies more transparent, and creates a mechanism to efficiently replicate exposure decisions in future studies.

Figures

Figure 1
Illustrative decision tree for the classification of 100 jobs by diesel exhaust exposure. The terminal nodes at the bottom of the tree are leaves with labels for exposure classification (0 = unexposed, 1 = exposed), number of jobs in leaf, and percent agreement of tree-based classifications with exposure status assigned by an expert.
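The three leaf labels described in this caption can be derived from the expert's assignments for the jobs that reach a leaf. The sketch below assumes the leaf predicts the most common expert-assigned class, as a standard classification tree does with equal misclassification costs; the function name and example labels are hypothetical.

```python
from collections import Counter

def summarize_leaf(expert_labels):
    """Return (predicted class, number of jobs, percent agreement) for one leaf.

    Assumption: the leaf predicts the majority expert-assigned class.
    """
    counts = Counter(expert_labels)
    label, n_match = counts.most_common(1)[0]
    n_jobs = len(expert_labels)
    return label, n_jobs, 100.0 * n_match / n_jobs

# Four jobs reaching one leaf: three rated exposed (1), one unexposed (0).
leaf = summarize_leaf([1, 1, 1, 0])
print(leaf)
```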
Figure 2
CART decision tree classifying jobs into unexposed (0) and exposed (1) categories. The labels in each leaf, in order, are the predicted exposure category, the number of jobs in the leaf, and the percent of predictions in the leaf that agree with the expert estimate. Variables from the occupational history are designated OH; variables from the modules are designated M.
Figure 3
CART prediction errors in the validation data set as the size of the training set varies for four exposure metrics: binary exposure probability, ordinal probability, intensity, and frequency of exposure. Each boxplot is based on 100 randomly selected training sets to estimate the model using all variables (complexity parameter = 0.01), with the prediction error estimated on the validation set.
