Review
Cancer Res. 2023 Apr 14;83(8):1175-1182.
doi: 10.1158/0008-5472.CAN-22-1274.

Challenges to Using Big Data in Cancer

Shawn M Sweeney et al. Cancer Res. 2023.

Abstract

Big data in healthcare can enable unprecedented understanding of diseases and their treatment, particularly in oncology. These data may include electronic health records, medical imaging, genomic sequencing, payor records, and data from pharmaceutical research, wearables, and medical devices. The ability to combine datasets and use data across many analyses is critical to the successful use of big data and is a concern for those who generate and use the data. Interoperability and data quality continue to be major challenges when working with different healthcare datasets. Mapping terminology across datasets, missing and incorrect data, and varying data structures make combining data an onerous and largely manual undertaking. Data privacy is another concern addressed by the Health Insurance Portability and Accountability Act, the Common Rule, and the General Data Protection Regulation. The use of big data is now included in the planning and activities of the FDA and the European Medicines Agency. The willingness of organizations to share data in a precompetitive fashion, agreements on data quality standards, and institution of universal and practical tenets on data privacy will be crucial to fully realizing the potential for big data in medicine.
