JCO Clin Cancer Inform. 2020 Mar;4:184-200.
doi: 10.1200/CCI.19.00047.

Systematic Review of Privacy-Preserving Distributed Machine Learning From Federated Databases in Health Care

Fadila Zerka et al. JCO Clin Cancer Inform. 2020 Mar.

Abstract

Big data for health care is one of the potential solutions to deal with the numerous challenges of health care, such as rising cost, aging population, precision medicine, universal health coverage, and the increase of noncommunicable diseases. However, data centralization for big data raises privacy and regulatory concerns.

Covered topics include (1) an introduction to privacy of patient data and distributed learning as a potential solution to preserving these data, a description of the legal context for patient data research, and a definition of machine/deep learning concepts; (2) a presentation of the adopted review protocol; (3) a presentation of the search results; and (4) a discussion of the findings, limitations of the review, and future perspectives.

Distributed learning from federated databases makes data centralization unnecessary. Distributed algorithms iteratively analyze separate databases, essentially sharing research questions and answers between databases instead of sharing the data. In other words, one can learn from separate and isolated datasets without patient data ever leaving the individual clinical institutes.

Distributed learning promises great potential to facilitate big data for medical application, in particular for international consortiums. Our purpose is to review the major implementations of distributed learning in health care.
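The idea of sharing "questions and answers" instead of data can be illustrated with a minimal sketch. It is not the algorithm of any study covered by the review: it assumes a simple linear model fitted by full-batch gradient descent, and the names `local_gradient`, `train`, and `hospitals`, as well as the toy records, are hypothetical. The master sends the current parameters (the question), each site returns only aggregated gradients and a record count (the answer), and the master checks a convergence criterion.

```python
from typing import List, Tuple

# (x, y) records that never leave one hospital (toy data, assumption)
Site = List[Tuple[float, float]]


def local_gradient(site: Site, w: float, b: float) -> Tuple[float, float, int]:
    """Runs at a site: summed squared-error gradients plus the record count."""
    gw = sum((w * x + b - y) * x for x, y in site)
    gb = sum((w * x + b - y) for x, y in site)
    return gw, gb, len(site)


def train(sites: List[Site], lr: float = 0.1, tol: float = 1e-10,
          max_rounds: int = 10_000) -> Tuple[float, float, int]:
    """Runs at the master: send (w, b), aggregate the replies, test convergence."""
    w, b = 0.0, 0.0
    round_no = 0
    for round_no in range(1, max_rounds + 1):
        # In a real network each call would be a remote request to a site.
        replies = [local_gradient(site, w, b) for site in sites]
        n = sum(r[2] for r in replies)
        gw = sum(r[0] for r in replies) / n
        gb = sum(r[1] for r in replies) / n
        w, b = w - lr * gw, b - lr * gb
        # Stop once the parameter update becomes negligible.
        if abs(lr * gw) < tol and abs(lr * gb) < tol:
            break
    return w, b, round_no


# Three sites whose pooled records follow y = 2x + 1.
hospitals = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0)],
    [(4.0, 9.0)],
]
w, b, rounds = train(hospitals)
```

Because the summed gradients of the sites equal the gradient over the pooled data, this particular sketch reaches the same fit as centralized training would. Note that aggregated gradients can still leak information in some settings, so a sketch like this does not by itself guarantee privacy.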

Conflict of interest statement

Fadila Zerka

Employment: Oncoradiomics

Research Funding: PREDICT

Samir Barakat

Employment: ptTheragnostic

Leadership: ptTheragnostic

Sean Walsh

Employment: Oncoradiomics

Leadership: Oncoradiomics

Stock and Other Ownership Interests: Oncoradiomics

Research Funding: Varian Medical Systems (Inst)

Ralph T. H. Leijenaar

Employment: Oncoradiomics

Leadership: Oncoradiomics

Stock and Other Ownership Interests: Oncoradiomics

Patents, Royalties, Other Intellectual Property: Image analysis method supporting illness development prediction for a neoplasm in a human or animal body (PCT/NL2014/050728)

Arthur Jochems

Stock and Other Ownership Interests: Oncoradiomics, Medical Cloud Company

Benjamin Miraglio

Employment: Oncoradiomics

Philippe Lambin

Employment: Convert Pharmaceuticals

Leadership: DNAmito

Stock and Other Ownership Interests: BHV, Oncoradiomics, Convert Pharmaceuticals, The Medical Cloud Company

Honoraria: Varian Medical

Consulting or Advisory Role: BHV, Oncoradiomics

Research Funding: ptTheragnostic

Patents, Royalties, Other Intellectual Property: Co-inventor of two issued patents with royalties on radiomics (PCT/NL2014/050248, PCT/NL2014/050728) licensed to Oncoradiomics and one issued patent on mtDNA (PCT/EP2014/059089) licensed to ptTheragnostic/DNAmito, three nonpatentable inventions (software) licensed to ptTheragnostic/DNAmito, Oncoradiomics, and Health Innovation Ventures.

Travel, Accommodations, Expenses: ptTheragnostic, Elekta, Varian Medical

David Townend

Consulting or Advisory Role: Newron Pharmaceuticals (I)

No other potential conflicts of interest were reported.

Figures

FIG 1.
Relationship between artificial intelligence, machine learning, and deep learning.
FIG 2.
Schematic representation of the processes in a transparent distributed learning network. (A) Data preparation steps. (B) Distributed learning network, which is composed of three hospitals, each of which is equipped with a learning machine that can communicate with a master machine responsible for sending model parameters and checking convergence criteria. (C) Flowchart of the distributed learning network described in B. (D) Example of an action that can be tracked by blockchain (designed and implemented according to needs agreed among network members) and keep all network participants aware of any new activity taken in the network. DB, database; FAIR, findable, accessible, interoperable, reusable.
FIG 3.
Description of findable, accessible, interoperable, reusable (FAIR) principles.
FIG 4.
Visual representation of blockchain. Adapted from Rennock et al.
FIG A1.
Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2009 flow diagram.
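
The activity tracking mentioned in the captions of FIG 2 (panel D) and FIG 4 can be sketched as a hash-linked log. This is only an illustration of tamper evidence, under the assumption that each network action is stored as a text entry; it has no consensus protocol, signatures, or replication across participants, and the names `block_hash`, `append_block`, `is_valid`, and `ledger` are hypothetical.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder previous-hash for the first block


def block_hash(block: Dict) -> str:
    """SHA-256 over the block's index, action, and previous hash."""
    payload = json.dumps(
        {k: block[k] for k in ("index", "action", "prev_hash")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain: List[Dict], action: str) -> None:
    """Add an action and link it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    block = {"index": len(chain), "action": action, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    chain.append(block)


def is_valid(chain: List[Dict]) -> bool:
    """Recompute every hash and check each link to the preceding block."""
    prev = GENESIS
    for block in chain:
        if block["prev_hash"] != prev or block["hash"] != block_hash(block):
            return False
        prev = block["hash"]
    return True


ledger: List[Dict] = []
append_block(ledger, "hospital A joined the network")
append_block(ledger, "master sent model parameters, round 1")
append_block(ledger, "hospital B returned a local update, round 1")

ok_before = is_valid(ledger)      # chain is intact
ledger[1]["action"] = "tampered"  # silently edit a past entry
ok_after = is_valid(ledger)       # the stored hash no longer matches
```

Editing any past entry changes its recomputed hash, so the check fails for that block; this is the property that lets participants notice altered history.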
