AMIA Annu Symp Proc. 2011;2011:19-27.
Epub 2011 Oct 22.

Methods to identify standard data elements in clinical and public health forms

Neil F Abernethy et al. AMIA Annu Symp Proc. 2011.

Abstract

The fragmentation of clinical and public health systems results in divergent information collection practices, presenting challenges to standardization and EHR certification efforts. Data forms employed in public health jurisdictions nationwide reflect these differences in patient treatment, monitoring and evaluation, and follow-up, presenting challenges for data integration. To study these variations, we surveyed tuberculosis contact investigation forms from all fifty states, three municipalities and two countries. We apply statistics and cluster analysis to analyze the divergent content of contact investigation forms with the goal of characterizing normative practices and identifying a common core of data fields. We found widespread variation in data elements between states in the study, with the "Name" field being the only ubiquitous data element. Our method reveals distinct groupings of data fields employed in certain regions, allowing the simultaneous identification of core standard data fields as well as variations in practice.

Figures

Figure 1.
Summary of national contact investigation form composition. (A) This chart depicts the percentage of fields from each category on an average form (excluding Comments). Identification fields comprise the largest portion. (B) This chart depicts the average frequency in 60 forms of fields from each category (excluding Comments). Testing fields are very likely to be present and hence are more standardized. (C) The frequency of each field. The X axis is the frequency of the field among 60 forms; Category and field names are shown on the Y axis. The distribution of field frequencies reflects the wide diversity observed in forms. Fields are ranked in order of descending frequency within categories.
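The per-field frequencies summarized in panel (C) can be illustrated with a minimal sketch. This is not the authors' code: the form contents, the state labels, and the function names `field_frequencies` and `core_fields` are hypothetical, chosen only to show how a "core" set of fields might be read off from presence counts.

```python
from collections import Counter

def field_frequencies(forms):
    """Return {field: fraction of forms containing it}, most frequent first."""
    # set() ensures a field listed twice on one form is counted once.
    counts = Counter(field for fields in forms.values() for field in set(fields))
    n_forms = len(forms)
    return {field: count / n_forms for field, count in counts.most_common()}

def core_fields(forms, threshold=1.0):
    """Return the fields present on at least `threshold` of all forms, sorted by name."""
    freqs = field_frequencies(forms)
    return sorted(field for field, share in freqs.items() if share >= threshold)

# Hypothetical toy data, not taken from the surveyed forms.
forms = {
    "State A": ["Name", "Date of birth", "TST result"],
    "State B": ["Name", "TST result", "Phone"],
    "State C": ["Name", "Date of birth"],
}

print(core_fields(forms))                 # fields found on every form
print(core_fields(forms, threshold=0.6))  # fields found on at least 60% of forms
```

Lowering `threshold` trades strictness for coverage: at 1.0 only ubiquitous fields qualify, while smaller values admit fields that are common but not universal.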
Figure 2.
This is a hierarchical clustering of the form elements using complete-linkage (furthest-neighbor) and the Manhattan distance metric. This dendrogram identifies states with similar forms that could more easily share data or standardize their data models.
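As a rough illustration of the technique named in this caption, the sketch below runs complete-linkage agglomerative clustering with the Manhattan distance on binary presence/absence vectors. It is an assumption-laden toy, not the authors' implementation: the four forms "A" to "D", their vectors, and the function names are hypothetical. For 0/1 vectors the Manhattan distance is simply the number of fields on which two forms differ.

```python
from itertools import combinations

def manhattan(a, b):
    """Sum of absolute coordinate differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cluster_distance(vectors, a, b):
    """Complete linkage: the largest distance between any member of a and any member of b."""
    return max(manhattan(vectors[i], vectors[j]) for i in a for j in b)

def complete_linkage(vectors):
    """Merge the two closest clusters repeatedly; return (cluster_a, cluster_b, distance) per merge."""
    clusters = [(name,) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: cluster_distance(vectors, *pair))
        merges.append((a, b, cluster_distance(vectors, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges

# Hypothetical presence (1) / absence (0) of four fields on four forms.
presence = {
    "A": [1, 1, 1, 0],
    "B": [1, 1, 0, 0],
    "C": [1, 0, 0, 1],
    "D": [1, 0, 1, 1],
}

for left, right, dist in complete_linkage(presence):
    print(left, right, dist)
```

The merge list is the information a dendrogram draws: each entry is one join, and its distance is the height of that join. Forms merged at a small distance differ in few fields and are the more plausible candidates for sharing a data model.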
Figure 3.
A simultaneous hierarchical clustering of states (rows) and fields (columns). Presence or absence of a field is shown with a red (1) or white (0) cell. The core group of fields segregates clearly on the right. Group 1 and Group 2 states, which often cluster together, are shown on the left. Recurring motifs of fields are highlighted on the field dendrogram at the top: History (shaded blue), Treatment monitoring (shaded yellow) and Identity and Relationship (shaded grey). In turn, the grouping of states using these motifs is shown with bounded boxes in the same color scheme.
Figure 4.
(A) A scatterplot of contact investigation form complexity vs. TB case rate (states only) shows a very weak correlation (slope = 0.87; 95% CI = [0.24–1.5]; R² = 0.11). (B) A greater apparent correlation is seen with date of form revision, for the subset of forms for which this date was available (slope = 0.39; 95% CI = [0.05–0.73]; R² = 0.21). In both cases the slope is positive at the 95% confidence level; however, case rate and date of form revision each account for only a small share of the total variation in form complexity, as measured by the total number of form fields.
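The three quantities reported in this caption (slope, 95% confidence interval, and R²) can be obtained from an ordinary least-squares fit. The sketch below is a generic illustration under stated assumptions, not a reconstruction of the authors' analysis: the data points are made up, the function name `simple_ols` is hypothetical, and the critical t value must be supplied by the caller because the Python standard library has no t-distribution.

```python
import math

def simple_ols(x, y, t_crit):
    """Fit y = intercept + slope * x; return (slope, (ci_low, ci_high), r_squared).

    t_crit is the two-sided critical t value for n - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and the share of variation explained by x.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / syy

    # Standard error of the slope, then the symmetric confidence interval.
    se_slope = math.sqrt(ss_res / (n - 2) / sxx)
    return slope, (slope - t_crit * se_slope, slope + t_crit * se_slope), r_squared

# Made-up illustrative points: x could stand for a case rate, y for a field count.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# 3.182 is the two-sided 95% critical t value for 3 degrees of freedom (n = 5).
slope, (ci_low, ci_high), r_squared = simple_ols(x, y, t_crit=3.182)
print(f"slope = {slope:.2f}; 95% CI = [{ci_low:.2f}, {ci_high:.2f}]; R2 = {r_squared:.2f}")
```

A confidence interval that excludes zero, as in both panels of the figure, indicates a slope distinguishable from zero at that level, while a low R² indicates that the predictor still explains little of the spread in the outcome. The two statements are compatible.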
