Data Integration for Heterogenous Datasets
- PMID: 25553272
- PMCID: PMC4276119
- DOI: 10.1089/big.2014.0068
Abstract
Data analysts increasingly need to use data that lie outside the control of their own organizations. The growing amount of data available on the Web, new technologies for linking data across datasets, and the growing need to integrate structured and unstructured data are all driving this trend. In this article, we provide a technical overview of the emerging "broad data" area, in which the variety of heterogeneous data being used, rather than the scale of the data being analyzed, is the limiting factor in data analysis efforts. The article explores some of the emerging themes in data discovery, data integration, linked data, and the combination of structured and unstructured data.