Review
Front Artif Intell. 2023 Jun 23:6:1085754.
doi: 10.3389/frai.2023.1085754. eCollection 2023.

A review of data abstraction

Gianluca Cima et al. Front Artif Intell.

Abstract

It is well known that Artificial Intelligence (AI), and Machine Learning (ML) in particular, is not effective without good data preparation, as the recent wave of data-centric AI has also pointed out. Data preparation is the process of gathering, transforming, and cleaning raw data prior to processing and analysis. Since data nowadays often reside in distributed and heterogeneous sources, the first activity of data preparation is collecting data from suitable data sources and data services. It is thus essential that providers describe their data services in a way that makes them compliant with the FAIR guiding principles, i.e., automatically Findable, Accessible, Interoperable, and Reusable. The notion of data abstraction has been introduced precisely to meet this need. Abstraction is a kind of reverse engineering task that automatically provides a semantic characterization of a data service made available by a provider. The goal of this paper is to review the results obtained so far in data abstraction by presenting the formal framework for its definition, reporting on the decidability and complexity of the main theoretical problems concerning abstraction, and discussing open issues and interesting directions for future research.

Keywords: abstraction; automated reasoning; data integration; data preparation; knowledge representation.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

