A Formal Semantics of SQL Queries, Its Validation, and Applications

Research output: Contribution to journalArticlepeer-review


While formal semantics of theoretical languages underlying SQL have been provided in the past, they all made simplifying assumptions ranging from changes in the syntax to omitting bag semantics and nulls. This situation is reminiscent of what happens in the field of programming languages, where semantics of formal calculi underlying main features of languages are abundant, but formal semantics of real languages that people use are few and far between.
We take the basic class of SQL queries – essentially SELECT-FROM-WHERE queries with subqueries, set/bag operations, and nulls – and define a formal semantics for it, without any departures from the real language. Already this fragment requires decisions related to the data model and handling variable names that are normally disregarded by simplified semantics. To justify our choice of the semantics, we validate it experimentally on a large number of randomly generated queries and databases.
We give two applications of the semantics. One is the first formal proof of the equivalence of basic SQL and relational algebra that extends to bag semantics and nulls. The other application looks at the 3-valued logic employed by SQL that is universally assumed to be necessary to handle nulls. We prove however that this is not so, as three-valued logic does not add power: every SQL query in our fragment can be evaluated under the usual two-valued Boolean semantics of conditions.
Original languageEnglish
Pages (from-to)27-39
Number of pages13
JournalProceedings of the VLDB Endowment (PVLDB)
Issue number1
Publication statusPublished - 1 Sep 2017


Dive into the research topics of 'A Formal Semantics of SQL Queries, Its Validation, and Applications'. Together they form a unique fingerprint.

Cite this