Appl Clin Inform. 2023 Aug;14(4):763-771.
doi: 10.1055/a-2130-2197. Epub 2023 Jul 17.

The Detection of Date Shifting in Real-World Data

Laura Evans et al. Appl Clin Inform. 2023 Aug.

Abstract

Objectives: Analysis of health care real-world data (RWD) provides an opportunity to observe the actual patient diagnostic, treatment, and outcome events. However, researchers should understand the possible limitations of RWD. In particular, the dates in these data may be shifted from their actual values, which might affect the validity of study conclusions.

Methods: A methodology for detecting the presence of shifted dates in RWD was developed by considering various approaches to confirm the expected occurrences of medical events, including unique temporal occurrences as well as recurring seasonal or weekday patterns in diagnoses or procedures. Diagnosis and procedure data were obtained from 71 U.S. health care data provider organizations (HCOs) that are members of the TriNetX global research network. Synthetic data were generated for the diagnoses and procedures studied, showing the patterns that result when various degrees of date shifting (including no shift) are applied. These patterns were compared with those observed for each HCO to predict the presence and degree of date shifting. The predictions were then compared with the originating HCOs' own statements about date shifting to determine the predictive accuracy of the methods studied.
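The synthetic-data step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the per-patient uniform random offset, and the event counts are all assumptions made for illustration, since the abstract does not specify how shifts were simulated. The sketch generates events that occur only on weekdays, applies a shift, and reports the resulting day-of-week distribution.

```python
import random
from collections import Counter
from datetime import date, timedelta


def synthetic_weekday_events(n_patients, events_per_patient, start, end, seed=0):
    """Generate (patient_id, date) events that fall only on Monday-Friday."""
    rng = random.Random(seed)
    span = (end - start).days
    events = []
    for pid in range(n_patients):
        made = 0
        while made < events_per_patient:
            d = start + timedelta(days=rng.randint(0, span))
            if d.weekday() < 5:  # 0 = Monday ... 4 = Friday
                events.append((pid, d))
                made += 1
    return events


def shift_dates(events, max_shift_days, seed=0):
    """Shift all of a patient's dates by one offset drawn from [-max, +max].

    Assumption: one uniform random offset per patient. The paper may have
    simulated shifting differently.
    """
    rng = random.Random(seed)
    offsets = {}
    shifted = []
    for pid, d in events:
        if pid not in offsets:
            offsets[pid] = rng.randint(-max_shift_days, max_shift_days)
        shifted.append((pid, d + timedelta(days=offsets[pid])))
    return shifted


def weekday_distribution(events):
    """Return the fraction of events on each day, Monday (0) to Sunday (6)."""
    counts = Counter(d.weekday() for _, d in events)
    total = len(events)
    return [counts.get(i, 0) / total for i in range(7)]


if __name__ == "__main__":
    events = synthetic_weekday_events(200, 5, date(2022, 1, 1), date(2022, 12, 31))
    for max_shift in (0, 7, 30):
        dist = weekday_distribution(shift_dates(events, max_shift))
        print(max_shift, [round(p, 2) for p in dist])
```

With no shift, the weekend share is zero by construction. Once each patient receives a different offset, events spread onto Saturday and Sunday, which is the kind of pattern change the comparison against each HCO's data relies on.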

Results: Twenty-eight of the 71 HCOs analyzed were predicted by the methodology and confirmed by their data providers to have shifted data. Likewise, 39 were predicted and confirmed not to have shifted data. For the remaining four HCOs, the predicted and stated date-shifting status did not agree. For these U.S. HCOs, the occurrence of routine medical exams, which happen only on weekdays, was most predictive (0.92 correlation coefficient) of the presence or absence of date shifting.
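As a quick arithmetic check on the counts reported above, the following short Python snippet tallies the HCOs and derives the overall agreement rate. The agreement rate is computed here from the stated counts and is not a figure quoted in the abstract; the variable names are illustrative.

```python
# Counts as reported in the Results.
confirmed_shifted = 28     # predicted and confirmed to have shifted data
confirmed_unshifted = 39   # predicted and confirmed not to have shifted data
conflict = 4               # prediction and provider statement disagreed

total = confirmed_shifted + confirmed_unshifted + conflict
agreement = (confirmed_shifted + confirmed_unshifted) / total

print(total, round(agreement, 3))  # prints: 71 0.944
```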

Conclusion: The presence of date shifting for U.S. HCOs may be reliably detected by assessing whether routine exams occur only on weekdays, as expected.
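A minimal sketch of such a weekday check is shown below. It is an illustrative heuristic, not the study's procedure: the function names and the 5% weekend threshold are assumptions, and a real threshold would need calibrating against data with known shift status.

```python
from datetime import date


def weekend_fraction(dates):
    """Fraction of encounter dates that fall on Saturday or Sunday."""
    if not dates:
        raise ValueError("no encounter dates supplied")
    weekend = sum(1 for d in dates if d.weekday() >= 5)  # 5 = Sat, 6 = Sun
    return weekend / len(dates)


def looks_date_shifted(dates, threshold=0.05):
    """Flag a dataset when routine exams appear on weekends too often.

    Assumption: the 0.05 threshold is hypothetical and chosen only to
    make the example concrete.
    """
    return weekend_fraction(dates) > threshold


if __name__ == "__main__":
    # 2023-01-02 is a Monday, so these five dates are Monday through Friday.
    unshifted = [date(2023, 1, 2 + i) for i in range(5)]
    # Seven consecutive days include one Saturday and one Sunday.
    spread = [date(2023, 1, 2 + i) for i in range(7)]
    print(looks_date_shifted(unshifted), looks_date_shifted(spread))
```

The check takes only the dates of routine exam encounters as input, so it can be run on a dataset without knowing anything else about how the provider processed it.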

Conflict of interest statement

The authors are either employed by TriNetX, LLC, funder of this study (Evans and Palchuk), or receive consultant compensation (London).

Figures

Fig. 1
Patients treated with hydroxychloroquine show a spike at a large academic institution in 2020.
Fig. 2
Weekly pattern using synthetic data. The number above each graph indicates the number of days shifted.
Fig. 3
One-time drop caused by a sentinel event using synthetic data. The number above each graph indicates the number of days shifted.
Fig. 4
Yearly pattern of seasonal events using synthetic data. The number above each graph indicates the number of days shifted.
Fig. 5
Outline of the procedure used to detect the presence and degree of date shifting in the datasets studied.
Fig. 6
Observed presence of date shifting by study methodology, which was in agreement with data provider's description of their dataset. Datasets where there was disagreement between the study's methodology and the provider on the presence of date shifting are shown as “conflict.”
Fig. 7
Distribution of the magnitude of the date shift (in days) for the 28 health care organizations with confirmed observed date shifting.
Fig. 8
Distribution by day of week of routine medical checkup encounters for the health care organizations studied.
