Relational Algebra

last updated 13-sep-18

Relational Query Languages

Languages for describing queries on a relational database

Structured Query Language (SQL)

Relational Algebra

 

 


What is an Algebra?

 


Relational Algebra

 

 

 


Relational Algebra Operations

Below are fundamental operations that are "complete".  That is, this set of operations alone can define any retrieval.

Convenient, natural additions to the set of operations makes


Select Operation

Choose a subset of tuples from a relation based on some criteria, results in another relation called a "result set"

Notation uses lower case sigma:

σcondition(relation)

condition is a boolean expression in which rows are selected/kept/included where the condition is true

 

 


Examples

Id Name Address Hobby
1123 John 123 Main stamps
1123 John 123 Main coins
5556 Mary 5 Lake Dr. hiking
9876 Bart 10Pine St. stamps


Project Operation

Notation is lower case pi

πattribute- list(relation)

Project examples


Expressions

Compose these operations into a sequence to represent a single query.

Id Name Address Hobby
1123 John 123 Main stamps
1123 John 123 Main coins
5556 Mary 5 Lake Dr. hiking
9876 Bart 10Pine St. stamps

πId,Name(σHobby=stamps' OR Hobby='coins’(Person))

Result
Id Name
1123 John
9876 Bart

It is common to do the select first, followed by project.

"List the states that have living Presidents"

 

 

 

 


Set Operations

Union: r ∪ s ⇒ a row is in the result set if it is a row from r or from s; or the set of all tuples that belong to either r or s.

[Intersection: r ∩ s ⇒ a row is in the result set if it is a row from both r and from s.]

Set Difference: r - s ⇒ the set of all tuples in r that are not in s

Both operations require that the sets r and s are union compatible:

Union compatible relations can be combined using union ∪, intersection, and set difference -.

Example:

The following tables are NOT union compatible:

However,  πName (Person) and πName (Professor) ARE union compatible so that

πName (Person) - πName (Professor) makes sense.

Intersection is more likely solved using AND in the select operation. AND is in the definition of set intersection.

Union is more likely solved using OR in the select operation.

Difference is actually more useful to "remove" entries.

 


Cartesian Product

If R and S are two relations, R × S is the set of all concatenated tuples <x,y>, where x is a tuple in R and y is a tuple in S

R × S is expensive to compute--quadratic algorithm:

Cartesian Product
R   S   R X S
a b c d a b c d
xl
x2
y1
y2
x1
x2
y1
y2
x3
x4
y3
y4
x1
x2
y3
y4
x3
x4
y1
y2
x3
x4
y3
y4

From here we could do select and project to reduce the combination, but we consider this below.

 


Renaming

Result of an expression evaluation is a relation, called the result set.

Attributes of relation must have distinct names. This is no longer guaranteed with Cartesian product. Suppose in the previous example attributes a and c were really the same name, that is, R × S would have attributes: a,b,a,d 

Renaming operator tidies this up. To assign the names A1, A2,… An to the attributes of the n column relation produced by

expression expr, use the form relation-expression [A1, A2, … An]

Example

πStudId, CrsCode (Transcript) [StudId, CrsCode1] ×  πProfId, CrsCode (Teaching) [ProfId, CrsCode2]

 

 


Join

This is a derived operation, i.e., it is based on the basic operations of the relational algebra.  It is a convenience operation because it is done so much.

A (general or theta θ) join of R and S is the expression
R join-condition S

where join-condition is a conjunction of terms Ai oper Bi in which Ai is an attribute of R and Bi is an attribute of S and oper is one of =, <, >, ≤, ≥, ≠

The meaning is essentially:

σjoin.-condition(R X S)

where join-condition in both case are the same, except for possible renamings of attributes.

Join and Renaming

Problem: R and S might have attributes with the same name – in which case the Cartesian product is not defined

Solution:


Theta Join – example

Output the names of all employees that earn more than their managers

πEmployee Name(Employee|X|MgrId=Id AND Salary>Salary Manager)

the join yields a table with attributes: Employee.Name, Employee.Id, Employee.Salary, Employee.MngrId, Manager.Name, Manager.Id, Manager.Salary

 

Equijoin Join - Example

Equijoin: Join condition is a conjunction of equalities

πName, CrsCode (Student|X|Id=StudId AND Grade = `A' Transcript )

Equijoin
Student
 
Transcript
Id Name Address Status StudId CrsCode Sem Grade
111 John     111 CS305 F03 B
222 Mary     222 CS306 F03 A
333 Bill     333 CS304 S04 A
444 Joe            
Result
Mary CS306
Bill CS304

 

 


Natural Join

Special, but most useful, case of equijoin:

 

Transcript (StudId, CrsCode, Sem, Grade )
Teaching (ProfId, CrsCode, Sem)

Transcript|X|Teaching =
StudId, Transcript.CrsCode Transcript.Sem, Grade, ProfId (Transcript|X|CrsCode=CrsCode AND Sem = Sem Teaching ))
[StudId,CrsCode, Sem, Grade, ProfId]

More generally:

R|X|S = πattr-list (σjoin.-condition R|X|S )

where attr-list = attributes (R) ∪ attributes(S)
(duplicates are eliminated )and join-condition has the form: A1=A1 and ...and An=An where {A1 ... An} = attributes (R) ∩ attributes(S)


Natural Join Example

List all ID's of students who took at least two different courses:

πStudId (σCrsCodeCrsCode2 (Transcript|X|Transcript [StudId, CrsCode2, Sem2, Grade2]) )

 


Division

Goal: Produce the tuples in one relation, r, that match all tuples in another relation, s

Can be expressed in terms of projection, set difference, and cross-product

 

 

 


Division - Example

List the Ids of students who have passed all courses that were taught in spring 2000

Numerator:

Denominator:

Result is numerator/denominator