Review
Balkan Med J. 2018 Jan 20;35(1):8-17.
doi: 10.4274/balkanmedj.2017.0966. Epub 2017 Sep 13.

Patient Privacy in the Era of Big Data

Mehmet Kayaalp. Balkan Med J.

Abstract

Privacy was defined as a fundamental human right in the Universal Declaration of Human Rights at the 1948 United Nations General Assembly. However, there is still no consensus on what constitutes privacy. In this review, we look at the evolution of privacy as a concept from the era of Hippocrates to the era of social media and big data. To appreciate the modern measures of patient privacy protection and correctly interpret the current regulatory framework in the United States, we need to analyze and understand the concepts of individually identifiable information, individually identifiable health information, protected health information, and de-identification. The Privacy Rule of the Health Insurance Portability and Accountability Act defines the regulatory framework and strikes a balance between protective measures and access to health information for secondary (scientific) use. The rule defines the conditions under which health information is protected by law and how protected health information can be de-identified for secondary use. With the advent of artificial intelligence and computational linguistics, computational text de-identification algorithms produce de-identified results nearly as good as those produced by human experts, but much faster, more consistently, and at virtually no cost. Modern clinical text de-identification systems now pave the road to big data and enable scientists to access de-identified clinical information while firmly protecting patient privacy. However, clinical text de-identification is not a perfect process. To maximize the protection of patient privacy and to free clinical and scientific information from the confines of electronic healthcare systems, all stakeholders, including patients, health institutions and institutional review boards, scientists and the scientific communities, as well as regulatory and law enforcement agencies, must collaborate closely.
On the one hand, public health laws and privacy regulations define rules and responsibilities such as requesting and granting only the amount of health information that is necessary for the scientific study. On the other hand, developers of de-identification systems provide guidelines on using different modes of operation to maximize the effectiveness of their tools and the success of de-identification. Institutions with clinical repositories need to follow these rules and guidelines closely to successfully protect patient privacy. To open the gates of big data to scientific communities, healthcare institutions need to be supported in their de-identification and data sharing efforts by the public, scientific communities, and local, state, and federal legislators and government agencies.

Keywords: Health Insurance Portability and Accountability Act; confidentiality; data anonymization; data sharing; medical informatics; personally identifiable information; privacy.

Figures

Figure 1a. Relationship between the set C of all elementary clinical information and the set PID of all elementary personal identifiers. Their subsets are health information with no personal identifiers H, demographic information and clinical personal identifiers D, non-clinical personal identifiers P, and three hypothetical records R1, R2, and R3.
Figure 1b. Relationship among H, D, and P and the representation of three hypothetical records using the set notation.
Figure 1c. Protected health information (PHI) is the intersection of health information (HI) and personally identifying information (PII). Members of all sets are compound information such as R1, a clinical report with personal identifiers; R2, a de-identified clinical report; and R3, a table of personal identifiers with no clinical connections.
Figure 2. Graphical representation of HIPAA Privacy Rule de-identification methods. Source: Office for Civil Rights, Department of Health and Human Services (39). HIPAA: Health Insurance Portability and Accountability Act.
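
The set relationships described in the Figure 1 captions can be illustrated with a minimal Python sketch. This is only one reading of the captions, not the article's own formalization: the element labels (such as "diagnosis" or "phone"), the derivation of H, D, and P by set difference and intersection, and the helper `is_phi` are all assumptions made for illustration.

```python
# Hypothetical elementary items; these labels are illustrative only
# and do not come from the article.
C = {"diagnosis", "lab_result", "date_of_birth", "medical_record_number"}
PID = {"date_of_birth", "medical_record_number", "name", "phone"}

# Assumed reading of Figure 1a:
H = C - PID   # health information with no personal identifiers
D = C & PID   # demographic information and clinical personal identifiers
P = PID - C   # non-clinical personal identifiers


def is_phi(record: set) -> bool:
    """Treat a record as PHI if it is both health information (contains a
    clinical item) and personally identifying (contains an identifier)."""
    is_health_information = bool(record & C)
    is_identifying = bool(record & PID)
    return is_health_information and is_identifying


# Three hypothetical records mirroring the roles of R1, R2, and R3.
R1 = {"diagnosis", "name", "date_of_birth"}  # clinical report with identifiers
R2 = {"diagnosis", "lab_result"}             # de-identified clinical report
R3 = {"name", "phone"}                       # identifiers, no clinical content

for label, record in (("R1", R1), ("R2", R2), ("R3", R3)):
    print(label, "is PHI:", is_phi(record))
```

Under these assumptions, only R1 falls in the intersection of HI and PII, which matches the caption's description of R2 as de-identified and R3 as having no clinical connections.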
