Radiother Oncol. 2013 Jul;108(1):174-9.
doi: 10.1016/j.radonc.2012.09.019. Epub 2013 Feb 5.

Benefits of a clinical data warehouse with data mining tools to collect data for a radiotherapy trial

Erik Roelofs et al. Radiother Oncol. 2013 Jul.

Abstract

Introduction: Collecting trial data in a medical environment is at present mostly performed manually and is therefore time-consuming, prone to errors and, given the complexity of the data, often incomplete. Faster and more accurate methods are needed to improve data quality and to shorten data collection times, particularly where information is scattered over multiple data sources. The purpose of this study is to investigate the possible benefit of modern data warehouse technology in the radiation oncology field.

Material and methods: In this study, a Computer Aided Theragnostics (CAT) data warehouse combined with automated tools for feature extraction was benchmarked against the regular manual data-collection processes. Two sets of clinical parameters were compiled for non-small cell lung cancer (NSCLC) and rectal cancer, using 27 patients per disease. Data collection times and inconsistencies were compared between the manual and the automated extraction method.
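The comparison of inconsistencies described above can be sketched as a field-by-field check between the two collections. The abstract does not describe how discrepancies were counted, so this is only a minimal illustration under stated assumptions: the function name `discrepancy_rate`, the patient identifiers, the field names and all values are hypothetical, and a value missing from the automated set is counted as a mismatch.

```python
from typing import Any, Dict

Record = Dict[str, Any]  # field name -> collected value (hypothetical layout)


def discrepancy_rate(manual: Dict[str, Record], automated: Dict[str, Record]) -> float:
    """Return the fraction of manually collected values that differ from
    the automated extraction, compared per patient and per field."""
    total = 0
    mismatched = 0
    for patient_id, manual_fields in manual.items():
        auto_fields = automated.get(patient_id, {})
        for field, manual_value in manual_fields.items():
            total += 1
            # Assumption: a field absent from the automated set counts as a mismatch.
            if auto_fields.get(field) != manual_value:
                mismatched += 1
    return mismatched / total if total else 0.0


# Made-up example data, not taken from the study.
manual_data = {
    "P01": {"stage": "IIIA", "total_dose_gy": 60.0, "fractions": 30},
    "P02": {"stage": "IIB", "total_dose_gy": 66.0, "fractions": 33},
}
automated_data = {
    "P01": {"stage": "IIIA", "total_dose_gy": 60.0, "fractions": 30},
    "P02": {"stage": "IIB", "total_dose_gy": 64.0, "fractions": 33},
}

rate = discrepancy_rate(manual_data, automated_data)
print(f"Discrepancy rate: {rate:.1%}")
```

A mismatch found this way only says that the two methods disagree; deciding which value is correct would still require checking the source records.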

Results: The average time per case to collect the NSCLC data was 10.4 ± 2.1 min manually and 4.3 ± 1.1 min with the automated method (p<0.001). For rectal cancer, these times were 13.5 ± 4.1 and 6.8 ± 2.4 min, respectively (p<0.001). In 3.2% of the data collected for NSCLC and 5.3% for rectal cancer, there was a discrepancy between the manual and automated method.
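To put the reported mean times in perspective, the relative time saving per case can be derived from them directly. The snippet below is a small arithmetic sketch using only the means stated above; the helper name `relative_reduction` is introduced here for illustration, and the resulting percentages are derived figures, not values reported in the abstract.

```python
def relative_reduction(manual_min: float, automated_min: float) -> float:
    """Fraction of the manual collection time saved by the automated method."""
    return (manual_min - automated_min) / manual_min


# Mean collection times per case (minutes) as reported in the Results.
mean_times = {
    "NSCLC": (10.4, 4.3),
    "rectal cancer": (13.5, 6.8),
}

reductions = {
    disease: relative_reduction(manual, automated)
    for disease, (manual, automated) in mean_times.items()
}

for disease, reduction in reductions.items():
    print(f"{disease}: {reduction:.1%} less time per case")
```

These are ratios of means only; they ignore the reported standard deviations and say nothing about per-case variability.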

Conclusions: Aggregating multiple data sources in a data warehouse, combined with tools for extracting relevant parameters, reduces data collection times and offers the ability to improve data quality. The initial investments in digitizing the data are expected to be offset by the flexibility of the data analysis. Furthermore, successive investigations can easily select trial candidates and extract new parameters from the existing databases.

Keywords: Clinical trials; Data quality; Data warehouse; Efficiency.

Figures

Fig. 1
Schematic overview of the CAT data warehouse/research portal. The system synchronizes data from clinical data sources and custom services. It is also capable of collecting data for trials and data collected for other research purposes. For data export, several modules exist in the system and are easily accessible by web-technology (i.e. the patient browser, query builder and an electronic case report form XML export).
Fig. 2
Average manual versus CAT collection times (in min) for the (a) NSCLC and (b) rectum cases. The parameters that were looked up in the EMR and R&V system are displayed in medium grey and labelled “Lookup”. In dark grey (labelled “Recalc.”), the parameters are shown that were recalculated. The error bars show the standard deviations. For the rectum cases, the collection times for SUV data only show the large variability in the contribution to the recalculated parameters (in light grey and labelled “SUV”).
