2024 Jan 11:2023:1175-1182. eCollection 2023.

Outliers in diagnosis ratios: A clue toward possibly absent data

Dmitry Morozyuk et al. AMIA Annu Symp Proc.

Abstract

Evaluating the completeness of real-world data is a particularly challenging component of data quality assessment because the degree of truly versus erroneously absent data is unknown. Among inpatient data sets, absolute counts of admissions having specific categories of diagnoses in the principal or any position may vary with hospital size; we hypothesized, however, that the ratio of these two counts would be preserved across sites, with outliers suggesting potentially erroneously absent data. For several categories of clinical conditions assigned to inpatient admissions, we analyzed the ratio of their recording as the principal diagnosis versus any diagnosis across several hospitals and compared the ratios against a national benchmark. Our analysis showed ratios that matched clinical expectations, with reasonable preservation of ratios across sites. However, some conditions exhibited more variability in the ratios, and some sites had many outliers, possibly reflecting data quality issues that warrant further attention.
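
The ratio comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how outliers were identified, so the relative-deviation rule and the `tolerance` parameter below are assumptions, and the names `SiteCounts`, `diagnosis_ratio`, and `flag_outliers`, along with all counts and the benchmark value, are hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Admission counts for one condition at one site (hypothetical structure)."""
    site: str
    principal: int      # admissions with the condition as the principal diagnosis
    any_position: int   # admissions with the condition in any diagnosis position


def diagnosis_ratio(principal: int, any_position: int) -> float:
    """Ratio of principal-position recordings to any-position recordings."""
    if any_position <= 0:
        raise ValueError("any_position must be positive")
    return principal / any_position


def flag_outliers(counts, benchmark_ratio, tolerance=0.5):
    """Return {site: ratio} for sites deviating from the benchmark.

    ASSUMPTION: a site is flagged when its ratio differs from the benchmark
    by more than `tolerance` in relative terms. The source does not specify
    the outlier criterion; this rule is only a placeholder.
    """
    flagged = {}
    for c in counts:
        ratio = diagnosis_ratio(c.principal, c.any_position)
        relative_deviation = abs(ratio - benchmark_ratio) / benchmark_ratio
        if relative_deviation > tolerance:
            flagged[c.site] = ratio
    return flagged


# Invented counts for a single condition across three hypothetical sites.
counts = [
    SiteCounts("site_A", principal=120, any_position=1000),
    SiteCounts("site_B", principal=55, any_position=500),
    SiteCounts("site_C", principal=10, any_position=900),
]

# Invented benchmark ratio standing in for a national reference value.
flagged = flag_outliers(counts, benchmark_ratio=0.10)
print(flagged)
```

In this toy input only `site_C` is flagged: its ratio is far below the benchmark, which under the hypothesis would prompt a check for missing principal-diagnosis records rather than prove that data are absent.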

Figures

Figure 1. Diagnosis ratios by condition and by site for 2018
Figure 2a. Diagnosis ratios for Diabetes by site 2018-2020
Figure 2b. Diagnosis ratios for COPD by site 2018-2020
Figure 2c. Diagnosis ratios for Stroke by site 2018-2020
Figure 2d. Diagnosis ratios for CHF by site 2018-2020
Figure 2e. Diagnosis ratios for MI by site 2018-2020
