last updated 9/22/05
Although relational algebra is useful in the analysis of query evaluation, SQL is actually based on a different query language: relational calculus
There are two relational calculi:
Both TRC and DRC are:
{ t | cond(t) } is a simple relational calculus form
t is a tuple variable (sort of like a loop control variable, an iterator, or an "element of" reference).
The tuple variable on the left of the "|" can also be a list of rangeVar.Attr dotted pairs. This effectively represents the projection operation.
The condition cond is a logical condition which effectively represents the selection.
As part of the condition we include the predicate relation(t) specify the relation that the variable "ranges" over. This predicate holds true if t is in the relation. Person(p) says that it's true that p is a tuple from the relation Person.
Bound variables are used to make assertions about tuples in database (used in conditions)
Free variables designate the tuples to be returned by the query (used in targets)
{S | Student(S) AND
($ T Î
Transcript
(S.Id = T.StudId AND
T.CrsCode = ‘IT376’)) }
When a value is substituted for S the condition has value true or false
There can be only one free variable in a condition (the one that appears in the target)
{t.Pres, t.State | Presidents(t) and t.Party="Democrat"}
π_{Pres,State}(σ_{Party="Democrat"}(Presidents))
SELECT Pres,State
FROM Presidents
WHERE Party = "Democrat"
Projection of attributes, Selection of rows, and Tables are all specified
List the names of all professors who have taught MGT123
In TRC: | In SQL: |
{P.Name | Professor(P)
AND $ T Î Teaching (P.Id = T.ProfId AND T.CrsCode = ‘MGT123’) } |
SELECT
P.Name FROM Professor P, Teaching T WHERE P.Id = T.ProfId AND T.CrsCode = ‘MGT123’ |
The core language of SQL is a simplified language of TRC.
Well formed formulas (WFF) composed of combinations of the following
Predicate Calculus Atoms
R(t)
t.A OP s.B or c OP t.A or t.A OP c
Every atom is a formula
If F and G are formulas, then so are
(F and G), (F or G), (not F), and (not G)
If F is a formula, then so are
($
t)(F) and ("
t) (F)
A variable is bound if it is appears in an existential or univeral quantifier clause
Otherwise the variable is free
($t)(F) is TRUE if F is true for at least one tuple assigned to free occurences of t in F
This is required for joining of relations in relational calculus
"List the capitals of states that have had presidents"
{s.Capital,s.Sname | States(s)
and ($ p) (Presidents(p)
and p.State = s.Sname)}
(" t) (F) is TRUE if F is true for all tuples assigned to free occurences of t in F
Also used to cross reference data in separate tables
"List states that have not had presidents"
{s.Sname | States(s)
and (" p) (not
Presidents(p) or s.Sname ¹ p.State)}
("t) (F) == not ($t)( not F)
($ t)(F) == not ("t) ( not F)
you can substitute quantifier by negating the condition and negating the formula
DeMorgan's theorem
not (A and B) = (not A) or (not B)
not (A or B) = (not A) and (not B)
SQL has no quantifiers: how come? It uses conventions:
Convention 1. Universal quantifiers are not allowed (but SQL:1999 introduced a limited form of explicit " )
Convention 2. Make existential quantifiers implicit: Any tuple variable that does not occur in SELECT is assumed to be implicitly quantified with $
Compare:
{P.Name | Professor(P) AND $ T Î Teaching … }
and
SELECT
P.Name
FROM Professor P, Teaching T
"List states that have not had presidents"
{s.Sname | States(s)
and ("p)
(not Presidents(p) or s.Sname ¹
p.State)}
{s.Sname | States(s)
and not ($ p) (Presidents(p)
and s.Sname = p.State)}
Must guarantee that the resulting relation is finite
{t | not Presidents(p)} is unsafe