J Autom Reason. 2022;66(4):989-1030.
doi: 10.1007/s10817-022-09632-4. Epub 2022 Jul 27.

A Formalization of SQL with Nulls

Wilmer Ricciotti et al. J Autom Reason. 2022.

Abstract

SQL is the world's most popular declarative language, forming the basis of the multi-billion-dollar database industry. Although SQL has been standardized, the full standard is based on ambiguous natural language rather than formal specification. Commercial SQL implementations interpret the standard in different ways, so that, given the same input data, the same query can yield different results depending on the SQL system it is run on. Even for a particular system, mechanically checked formalization of all widely used features of SQL remains an open problem. The lack of a well-understood formal semantics makes it very difficult to validate the soundness of database implementations. Although formal semantics for fragments of SQL were designed in the past, they usually did not support set and bag operations, lateral joins, nested subqueries, and, crucially, null values. Null values complicate SQL's semantics in profound ways analogous to null pointers or side-effects in other programming languages. Since certain SQL queries are equivalent in the absence of null values, but produce different results when applied to tables containing incomplete data, semantics that ignore null values are able to prove query equivalences that are unsound in realistic databases. A formal semantics of SQL supporting all the aforementioned features was only proposed recently. In this paper, we report on our mechanization of SQL semantics covering set/bag operations, lateral joins, nested subqueries, and nulls, written in the Coq proof assistant, and describe the validation of key metatheoretic properties. Additionally, we are able to use the same framework to formalize the semantics of a flat relational calculus (with null values), and show a certified translation of its normal forms into SQL.

Keywords: Coq; Formalization; Nulls; SQL; Semantics.
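
The abstract's point that some queries agree on null-free data but diverge once nulls appear can be illustrated with a small sketch. The example below is not taken from the paper: the tables `r` and `s`, the two queries, and the helper `run_both` are all hypothetical, and SQLite (through Python's standard `sqlite3` module) is assumed only as a convenient engine to run them on. It compares a `NOT IN` query with a `NOT EXISTS` query that look interchangeable when no nulls are present.

```python
import sqlite3

# Two queries that are often treated as interchangeable (illustrative only).
NOT_IN = "SELECT x FROM r WHERE x NOT IN (SELECT y FROM s) ORDER BY x"
NOT_EXISTS = (
    "SELECT x FROM r "
    "WHERE NOT EXISTS (SELECT 1 FROM s WHERE s.y = r.x) ORDER BY x"
)


def run_both(r_rows, s_rows):
    """Load r(x) and s(y) into a fresh in-memory database and run both queries.

    Python None is stored as SQL NULL. Returns a pair of result lists:
    (rows from the NOT IN query, rows from the NOT EXISTS query).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE r (x INTEGER)")
    conn.execute("CREATE TABLE s (y INTEGER)")
    conn.executemany("INSERT INTO r VALUES (?)", [(v,) for v in r_rows])
    conn.executemany("INSERT INTO s VALUES (?)", [(v,) for v in s_rows])
    not_in = [row[0] for row in conn.execute(NOT_IN)]
    not_exists = [row[0] for row in conn.execute(NOT_EXISTS)]
    conn.close()
    return not_in, not_exists


if __name__ == "__main__":
    # No nulls: both queries return the same rows.
    print(run_both([1, 2], [2]))
    # A single NULL in s: the two results differ.
    print(run_both([1, 2], [2, None]))
```

With a null in `s`, the comparison `1 NOT IN (2, NULL)` evaluates to unknown rather than true, so the `NOT IN` query keeps no rows, while the `NOT EXISTS` query still returns `1`. A semantics that ignores nulls would accept the two queries as equivalent, which is the kind of unsound equivalence the abstract describes.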

Figures

Fig. 1. Well-formed SQL syntax
Fig. 2. Three-valued logic truth tables
Fig. 3. Formal semantics of SQL (types)
Fig. 4. Translation from 3VL-SQL to 2VL-SQL
Fig. 5. Relational Calculus normal forms
Fig. 6. Formal semantics of the Relational Calculus (types)
Fig. 7. Relational Calculus translation to SQL (types)
Fig. 8. Relational Calculus translation to SQL
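
The figure images are not reproduced on this page. As a rough guide to what truth tables for three-valued logic contain, the sketch below encodes the standard Kleene connectives that SQL conventionally uses, with Python's `None` standing in for the unknown truth value. The function names `and3`, `or3`, `not3`, and `eq3` are hypothetical, and it is an assumption that the paper's Fig. 2 matches these standard tables.

```python
from typing import Optional

# True, False, or None (unknown): an assumed encoding of SQL's three truth values.
Truth = Optional[bool]


def and3(a: Truth, b: Truth) -> Truth:
    """Kleene conjunction: false dominates, then unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Truth, b: Truth) -> Truth:
    """Kleene disjunction: true dominates, then unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def not3(a: Truth) -> Truth:
    """Negation leaves unknown as unknown."""
    return None if a is None else not a


def eq3(x, y) -> Truth:
    """Equality test where a null operand (None) makes the result unknown."""
    if x is None or y is None:
        return None
    return x == y


if __name__ == "__main__":
    values = (True, False, None)
    for a in values:
        for b in values:
            print(a, b, "AND:", and3(a, b), "OR:", or3(a, b))
```

Under this encoding a `WHERE` clause keeps a row only when its condition evaluates to `True`, so a condition such as `not3(eq3(1, None))` is unknown and the row is dropped.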

