Detecting departures from the conditional independence assumption in diagnostic latent class models: a simulation study

BMC Med Res Methodol. 2024 Dec 5;24(1):299. doi: 10.1186/s12874-024-02432-x.

Yasin Okkaoglu¹, Nicky J Welton², Hayley E Jones²

Affiliations

¹ Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK. yasin.okkaoglu@bristol.ac.uk.
² Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.

PMID: 39639189
PMCID: PMC11619692
DOI: 10.1186/s12874-024-02432-x

Yasin Okkaoglu et al. BMC Med Res Methodol. 2024.

Abstract

Background: Latent class models can be used to estimate diagnostic accuracy without a gold standard test. Early studies often assumed that tests are independent given the true disease state; however, this can lead to biased estimates when there are inter-test dependencies. Residual correlation plots and chi-squared statistics are commonly used to assess the validity of the conditional independence assumption and, when it does not hold, to identify which test pairs are conditionally dependent. We aimed to assess the performance of these tools in a simulation study covering a wide range of scenarios.
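For reference, the conditional independence assumption can be written out explicitly. This is the standard two-class formulation, not notation taken from the paper: the symbols $\theta$ (prevalence), $Se_j$ and $Sp_j$ (sensitivity and specificity of test $j$), and $y_j \in \{0, 1\}$ (result of test $j$ out of $J$ tests) are assumptions made here for illustration.

$$
P(Y_1 = y_1, \dots, Y_J = y_J)
  = \theta \prod_{j=1}^{J} Se_j^{\,y_j} (1 - Se_j)^{1 - y_j}
  + (1 - \theta) \prod_{j=1}^{J} (1 - Sp_j)^{y_j} \, Sp_j^{\,1 - y_j}
$$

Each product factorises over tests within a disease class, which is exactly what fails when two tests remain correlated among the diseased or among the non-diseased.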

Methods: We generated data sets from a model with four tests and a dependence between tests 1 and 2 within the diseased group. We varied sample size, prevalence, covariance, sensitivity and specificity across 504 combinations in total, generating 1000 data sets for each combination. We fitted the conditional independence model in a Bayesian framework and reported absolute bias, coverage, and how often residual correlation plots and the $G^{2}$ and $\chi^{2}$ statistics indicated lack of fit, either globally or for each test pair.
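A data-generating step of this kind might be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the dependence is expressed as an additive covariance on the joint probabilities of tests 1 and 2 among the diseased, and the function name `simulate_tests` and every parameter value used with it are hypothetical.

```python
import numpy as np

def simulate_tests(n_obs, prev, sens, spec, cov_d, rng):
    """Simulate four binary tests; tests 1 and 2 covary among the diseased.

    Assumed parameterisation (not confirmed by the abstract): cov_d is added
    to the concordant cells and subtracted from the discordant cells of the
    joint distribution of tests 1 and 2 in the diseased group.
    """
    sens = np.asarray(sens, dtype=float)
    spec = np.asarray(spec, dtype=float)
    diseased = rng.random(n_obs) < prev

    # Start from conditionally independent draws for all four tests.
    p_pos = np.where(diseased[:, None], sens, 1.0 - spec)
    results = (rng.random((n_obs, 4)) < p_pos).astype(int)

    # Joint distribution of (test 1, test 2) among the diseased.
    s1, s2 = sens[0], sens[1]
    joint = np.array([
        (1 - s1) * (1 - s2) + cov_d,  # (0, 0)
        (1 - s1) * s2 - cov_d,        # (0, 1)
        s1 * (1 - s2) - cov_d,        # (1, 0)
        s1 * s2 + cov_d,              # (1, 1)
    ])
    if np.any(joint < 0):
        raise ValueError("covariance is outside the feasible range")

    # Overwrite tests 1 and 2 for diseased subjects with correlated draws.
    cells = rng.choice(4, size=int(diseased.sum()), p=joint)
    results[diseased, 0] = cells // 2
    results[diseased, 1] = cells % 2
    return results, diseased

# Illustrative values only; they are not taken from the paper's scenarios.
rng = np.random.default_rng(0)
results, diseased = simulate_tests(
    20000, 0.5, [0.7, 0.7, 0.9, 0.9], [0.9, 0.9, 0.9, 0.9], 0.1, rng
)
```

The marginal sensitivities of tests 1 and 2 are unchanged by the covariance term, so any bias in a fitted conditional independence model comes from the ignored dependence rather than from altered test accuracy.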

Results: Across all settings, residual correlation plots, pairwise $G^{2}$ and pairwise $\chi^{2}$ statistics detected the correct correlated pair of tests (tests 1 and 2) only 12.1%, 10.3%, and 10.3% of the time, respectively, but incorrectly suggested dependence between tests 3 and 4 in 64.9%, 49.7%, and 49.5% of cases, respectively. These rates varied across parameter settings, and the tools appeared to perform more as intended when tests 3 and 4 were both much more accurate than tests 1 and 2. Residual correlation plots, $G^{2}$ and $\chi^{2}$ statistics identified a lack of overall fit in 74.3%, 64.5% and 67.5% of models, respectively. The conditional independence model tended to overestimate the sensitivities of the correlated tests (median bias across all scenarios 0.094; 2.5th and 97.5th percentiles -0.003 and 0.397) and to underestimate prevalence and the specificities of the uncorrelated tests.
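The global $G^{2}$ and $\chi^{2}$ statistics compare observed counts of test-result patterns with the counts expected under the fitted model. The sketch below uses the textbook definitions of the likelihood-ratio and Pearson statistics. How the expected counts were obtained from the Bayesian fit, and what threshold was used to flag lack of fit, are not stated in the abstract, so the function name `fit_statistics` and the toy counts are purely illustrative.

```python
import numpy as np

def fit_statistics(observed, expected):
    """Return (G2, Pearson chi-squared) for observed vs expected cell counts."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)

    # Pearson statistic: sum of squared residuals scaled by expected counts.
    pearson = np.sum((observed - expected) ** 2 / expected)

    # Likelihood-ratio statistic; empty observed cells contribute zero.
    nonzero = observed > 0
    g2 = 2.0 * np.sum(
        observed[nonzero] * np.log(observed[nonzero] / expected[nonzero])
    )
    return g2, pearson

# Made-up counts for two cells, only to show the call.
g2, pearson = fit_statistics([10, 20], [15, 15])
```

The same function can be applied to a 2 x 2 table for a single pair of tests to obtain a pairwise version of each statistic.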

Conclusions: Residual correlation plots and chi-squared statistics cannot be relied upon to identify which tests are conditionally dependent, and also have relatively low power to detect lack of overall fit. This is important since failure to account for conditional dependence can lead to highly biased parameter estimates.

Keywords: Conditional independence; Diagnostic accuracy; Goodness of fit; Latent class model; Model selection; Residual correlation plots.

Conflict of interest statement

Declarations. Ethics approval and consent to participate: Not applicable. Consent for publication: Not applicable. Competing interests: The authors declare no competing interests.

Figures

**Fig. 1**
Bias jitter plots for the combinations with sensitivity = HHHH and varying specificities ($\pi = 0.5$, $\omega = 0.5$, $n_{obs} = 2000$). The grey points represent median bias, and the error bars show 2.5th and 97.5th percentiles. The blue, pink, and green points represent bias in prevalence, sensitivity, and specificity estimates, respectively.

**Fig. 2**
Bias jitter plots for the combinations with sensitivity = LLHH and varying specificities ($\pi = 0.5$, $\omega = 0.5$, $n_{obs} = 2000$). The grey points represent median bias, and the error bars show 2.5th and 97.5th percentiles. The blue, pink, and green points represent bias in prevalence, sensitivity, and specificity estimates, respectively.

Grants and funding

MR/T044594/1/MRC-NIHR New Investigator Research Grant
