Edinburgh Research Explorer

A Formal Semantics of SQL Queries, Its Validation, and Applications

Research output: Contribution to journalArticle

Related Edinburgh Organisations

Access status

Open

Documents

Original languageEnglish
Number of pages13
JournalProceedings of the VLDB Endowment (PVLDB)
StateAccepted/In press - 15 May 2017

Abstract

While formal semantics of theoretical languages underlying SQL have been provided in the past, they all made simplifying assumptions ranging from changes in the syntax to omitting bag semantics and nulls. This situation is reminiscent of what happens in the field of programming languages, where semantics of formal calculi underlying main features of languages are abundant, but formal semantics of real languages that people use are few and far between.
We take the basic class of SQL queries – essentially SELECT-FROM-WHERE queries with subqueries, set/bag operations, and nulls – and define a formal semantics for it, without any departures from the real language. Already this
fragment requires decisions related to the data model and handling variable names that are normally disregarded by simplified semantics. To justify our choice of the semantics, we validate it experimentally on a large number of randomly generated queries and databases.
We give two applications of the semantics. One is the first formal proof of the equivalence of basic SQL and relational algebra that extends to bag semantics and nulls. The other application looks at the 3-valued logic employed by SQL
that is universally assumed to be necessary to handle nulls. We prove however that this is not so, as three-valued logic does not add power: every SQL query in our fragment can be evaluated under the usual two-valued Boolean semantics
of conditions.

Download statistics

No data available

ID: 36217683