Review
J Invest Dermatol. 2017 Aug;137(8):e153-e158.
doi: 10.1016/j.jid.2017.04.019.

Research Techniques Made Simple: An Introduction to Use and Analysis of Big Data in Dermatology

Mackenzie R Wehner et al. J Invest Dermatol. 2017 Aug.

Abstract

Big data is a term used for any collection of datasets whose size and complexity exceed the capabilities of traditional data processing applications. Big data repositories, including those for molecular, clinical, and epidemiology data, offer unprecedented research opportunities to help guide scientific advancement. Advantages of big data can include ease and low cost of collection, ability to approach prospectively and retrospectively, utility for hypothesis generation in addition to hypothesis testing, and the promise of precision medicine. Limitations include cost and difficulty of storing and processing data; need for advanced techniques for formatting and analysis; and concerns about accuracy, reliability, and security. We discuss sources of big data and tools for its analysis to help inform the treatment and management of dermatologic diseases.

Conflict of interest statement

Conflict of Interest: Dr. Asgari has received research funding to her institution from Pfizer Inc. and Valeant Pharmaceuticals, but these associations have not influenced our work on this paper. The authors have no other potential conflicts of interest to disclose.

Figures

Figure 1
The 3 V’s of big data: volume (amount of data), velocity (speed at which data is generated), and variety (number of types of data), all of which have been growing rapidly. After “The 3Vs that define Big Data,” Diya Soubra, Data Science Central, http://www.datasciencecentral.com/forum/topics/the-3vs-that-define-big-data.
Figure 2
Logarithmic scale depicting volume of big data.
Figure 3
Hypothetical example illustrating the utility of decision-tree learning for melanoma mortality prediction showing “leaves” (independent variables) such as tumor thickness, ulceration, and tumor location, and probability of survival (outcome).
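
The decision-tree idea in the Figure 3 caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' method: the records are invented toy values (not clinical data), only two of the caption's variables (tumor thickness and ulceration) are used, and the names RECORDS, FEATURES, build, and predict are hypothetical. The sketch uses a generic CART-style learner with Gini impurity, in which each internal node tests one variable against a threshold and each terminal node stores the fraction of training records that survived.

```python
# Toy decision-tree learner (CART-style, Gini impurity), standard library only.
# All records are INVENTED for illustration; they are not clinical data.

# Each record: (thickness_mm, ulcerated 0/1, survived 0/1)
RECORDS = [
    (0.4, 0, 1), (0.8, 0, 1), (0.9, 1, 1), (1.2, 0, 1), (1.5, 1, 0),
    (2.5, 0, 1), (3.0, 1, 0), (4.2, 1, 0), (4.5, 0, 0), (5.0, 1, 0),
]
FEATURES = ["thickness_mm", "ulcerated"]  # hypothetical feature names


def survival_rate(rows):
    """Fraction of records in `rows` whose outcome is 'survived'."""
    return sum(r[-1] for r in rows) / len(rows)


def gini(rows):
    """Gini impurity for a binary outcome: 0 when the group is pure."""
    p = survival_rate(rows)
    return 2 * p * (1 - p)


def best_split(rows):
    """Return the (score, feature, threshold, left, right) with lowest impurity."""
    best = None
    for f in range(len(FEATURES)):
        for t in sorted({r[f] for r in rows}):
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            if not left or not right:
                continue  # a split must send records to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    return best


def build(rows, depth=0, max_depth=2):
    """Recursively grow a tree; terminal nodes hold a survival probability."""
    leaf = {"p_survive": survival_rate(rows), "n": len(rows)}
    if depth >= max_depth or gini(rows) == 0.0:
        return leaf
    split = best_split(rows)
    if split is None:
        return leaf
    _, f, t, left, right = split
    return {
        "feature": f,
        "threshold": t,
        "left": build(left, depth + 1, max_depth),
        "right": build(right, depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the tests from the root down to a terminal node."""
    while "feature" in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["p_survive"]


if __name__ == "__main__":
    tree = build(RECORDS)
    print("thin, non-ulcerated:", predict(tree, (0.5, 0)))
    print("thick, ulcerated:   ", predict(tree, (4.8, 1)))
```

The depth limit (max_depth=2) is an arbitrary choice that keeps the toy tree small. A real analysis would use a validated dataset, held-out evaluation, and an established library rather than this hand-rolled learner.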
