Eur J Cardiothorac Surg. 2016 May;49(5):1470-5.
doi: 10.1093/ejcts/ezv385. Epub 2015 Dec 5.

The European thoracic data quality project: An Aggregate Data Quality score to measure the quality of international multi-institutional databases

Michele Salati et al. Eur J Cardiothorac Surg. 2016 May.

Abstract

Objectives: To describe a methodology for developing data quality metrics in multi-institutional databases and for deriving a cumulative data quality score, the Aggregate Data Quality (ADQ) score. The European Society of Thoracic Surgeons (ESTS) database was used to create and apply the metrics, and the Units contributing to it were ranked by the quality of their uploaded data using the ADQ.

Methods: We analysed data from 96 Units, each contributing at least 100 major lung resections (January 2007 to December 2014). The Units were anonymized by assigning each a random numeric code. The following metrics were developed to measure the data quality of each Unit: (i) record Completeness (COM): the rate of present values across 16 expected variables for all the records uploaded {[1 - ('null values'/total expected values for the Unit)] × 100; the concept of 'null value' was defined for each variable}; (ii) record Reliability (REL): the rate of consistent checks across 9 checks tested for all the records uploaded {[1 - (valid controls/total possible controls for the Unit)] × 100; specific reliability control queries were defined}. These two metrics were rescaled using the mean and standard deviation of the entire dataset and summed, obtaining: (iii) the ADQ score [COM rescaled + REL rescaled], which measures the cumulative data quality of a given dataset. The ADQ was used to rank the contributors.
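The three metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several points are assumptions: the function names are hypothetical, the REL numerator is read as the number of checks that flagged an inconsistency, "rescaled" is taken to mean a standard z-score, and the population standard deviation is used because the abstract does not say which form was applied. The per-Unit values are invented for illustration only.

```python
from statistics import mean, pstdev


def completeness(null_values, expected_values):
    """COM (%): share of expected values that are not null."""
    return (1 - null_values / expected_values) * 100


def reliability(flagged_checks, possible_checks):
    """REL (%): share of checks that did not flag an inconsistency.

    Assumption: the numerator counts checks that found a problem.
    """
    return (1 - flagged_checks / possible_checks) * 100


def rescale(values):
    """Z-score each value against the mean and SD of all Units.

    Assumption: population SD; the abstract does not specify.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all Units have the same value; cannot rescale")
    return [(v - mu) / sigma for v in values]


def adq_scores(com, rel):
    """ADQ per Unit: rescaled COM plus rescaled REL."""
    return [c + r for c, r in zip(rescale(com), rescale(rel))]


# Hypothetical COM and REL percentages for three Units.
com = [90.0, 80.0, 70.0]
rel = [100.0, 90.0, 80.0]
scores = adq_scores(com, rel)
```

Because both components are centred on the dataset mean, an ADQ near zero marks a Unit of average data quality, and the scores of all Units sum to zero. This matches the reported range, in which the best Unit is positive and the worst is strongly negative.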

Results: The COM of ESTS database contributors ranged from 98.6% to 43% and the REL from 100% to 69%. Combining the rescaled metrics, the resulting ADQ ranged from 2.67 (highest data quality) to -7.85 (lowest data quality). When the ranking based on COM alone was compared with the ranking based on the ADQ, 93% of Units changed position. The largest change was a drop of 66 positions in the ADQ ranking.
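The comparison between the COM-only ranking and the ADQ ranking can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the function names and Unit codes are hypothetical, the scores are invented, and ties are broken by input order because the abstract does not describe tie handling.

```python
def rank_positions(scores):
    """Map each Unit to its 1-based position, highest score first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {unit: pos for pos, unit in enumerate(ordered, start=1)}


def compare_rankings(com, adq):
    """Return (% of Units whose position changed, largest drop in positions)."""
    by_com = rank_positions(com)
    by_adq = rank_positions(adq)
    # Positive shift: the Unit sits lower in the ADQ list than in the COM list.
    shifts = {unit: by_adq[unit] - by_com[unit] for unit in com}
    changed = sum(1 for s in shifts.values() if s != 0)
    return changed / len(shifts) * 100, max(shifts.values())


# Hypothetical scores for four anonymized Units.
com = {"U1": 95.0, "U2": 90.0, "U3": 85.0, "U4": 60.0}
adq = {"U1": -0.5, "U2": 1.2, "U3": 0.9, "U4": -1.6}

pct_changed, largest_drop = compare_rankings(com, adq)
print(pct_changed, largest_drop)  # 75.0 2
```

In this toy data, U1 leads on completeness but falls two places once reliability is taken into account, the same kind of shift the study reports on a larger scale.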

Conclusions: We described a reproducible method for data quality assessment in clinical multi-institutional databases. The ADQ is a single indicator able to describe data quality and to compare it among centres. It has the potential to objectively guide data quality management and improvement projects.

Keywords: Data quality; Database management systems; Quality indicators; Registry.
