2020-02-27 14:25
Page 1

# Databases

## Lecture 5: The Relational Data Model

• Lecture notes: chapter 4
• Book: chapter 2
Page 2

# A mathematical perspective on relational databases

• Concepts from discrete mathematics
• Sets
• Cartesian products, A × B
• Relations
• Functions
• (Disjoint unions, A + B)
• Introduction to
• Functional Dependencies (to be continued next time)
Page 3

# Sets

• Sets are "collections of things".
• x∊A means x is a member of the set A.
• Mathematicans like infinite sets.
• Well-known infinite sets:
• ℕ = {0, 1, 2, 3, …} the set of natural numbers
• ℤ = {…, -2, -1, 0, 1, 2, 3, …}: the set of integers
• ℝ: the set of real numbers
Page 4

# Properties of sets

## Subsets

• A ⊆ B means that A is a subset of B.
• If x∊A then x∊B.
• Example: ℕ ⊆ ℤ ⊆ ℝ.
Page 5

# Operations on sets

• Union: Intersection: Difference: A ∪ B = {x | x∊A or x∊B} A ∩ B = {x | x∊A and x∊B} A – B = {x | x∊A and x∉B}
• Often illustrated with Venn diagrams:

Page 6

# Cartesian Products

•  Cartesian product: A × B = {⟨a,b⟩ | a∊A, b∊B}
• The size of a cartisan product: |A × B| = |A|·|B|
• Projections from a pair p = ⟨a,b⟩:
• π1p = a
• π2p = b
• Well-known cartesian products:
• 2 = ℝ×ℝ: points in 2D space.
• 3 = ℝ×ℝ×ℝ: points in 3D space. (Notation: ⟨x,y,z⟩ ∊ ℝ3)
Page 7

# Mathematical Relations

• A relation is a subset of a cartesian product of two or more sets:
• R ⊆ A × B
• R ⊆ A1 × … × An
• A binary relation R ⊆ A × B is a set of pairs.
• x R y ⇔ ⟨x,y⟩∊R
• A well-know binary relation:
• (≤) ⊆ ℝ2
• (≤) = {⟨x,y⟩ | ⟨x,y⟩∊ ℝ2, x≤y}
Page 8

# Properties of Relations

• Assume we have a relation R ⊆ A × A.
• R is reflexive: for all x∊A, x R x.
• R is transitive: for all x,y,z∊A, x R y and y R z ⇒ x R z.
• R is symmetric: for all x,y∊A, x R y ⇒ y R x.
• Example:
• (≤) is both reflexive and transitive, but not symmetric.
Page 9

# Operations on Relations

• Since relations are sets:
• Union: Intersection: Difference: R ∪ S = {t | t∊R or t∊S} R ∩ S = {t | t∊R and t∊S} R – S = {t | t∊R and t∉S} R × S = {⟨x,y,u,v⟩ | ⟨x,y⟩∊R, ⟨u,v⟩∊S}
• Note: no nested tuples: ⟨x,y,u,v⟩ instead of ⟨⟨x,y⟩,⟨u,v⟩⟩.
Page 10

# Functions

## Some relations are functions

• A function f : A→B is a relation f ⊆ A×B
• f(x) = y means ⟨x,y⟩∊f.
• Restriction: for each x∊A there is at most one y∊B such that ⟨x,y⟩∊f.
• The input x of a function determines the output y.
Page 11

# Functions and Tables

## Comparing to the Entity-Relationship model

• Tables corresponding to entities are typically functions.
• Example: `Countries(name,capital,area,population,continent,currency)`
• Input: `name` (the primary key). Output: the other attributes.
• But database queries are not limited to looking up the output for a given input…
• Tables corresponding to many-to-many relationships are not functions
• Example:
• ```UsedIn(country,language)   country -> Countries.name   language -> Languages.code```
• Even when a table is not a function there might be functional dependencies between the attributes…
Page 12

# Relational schemas and tables

• A relational schema:
• R(a1, …, a1)
• can be augmented with the domain (type) of each attribute:
• R(a1:T1,…,an:Tn)
• The relational signature of a schema is the corresponding cartesian product:
• T1 × … × Tn
• Given a relational schema R(a1, …, an) with signature T1 × … × Tn
• A table of schema R is a subset of the cartesian product T1 × … × Tn.
• A row in the table is an element of the cartesian product: t ∊ T1 × … × Tn.
Page 13

## Example

• ```CREATE TABLE Countries(
country TEXT,
capital TEXT,
currency TEXT
)
```
• Relational schema: `Countries(country:Text, capital:Text, currenct:Text)`
• Relation signature: Text × Text × Text
Page 14

# Tables vs mathematical relations

• Countries
countrycapitalcurrency
FinlandHelsinkiEUR
EstoniaTallinnEUR
SwedenStockholmSEK
• Mathematical notation:
• Countries = {⟨Finland, Helsinki, EUR⟩, ⟨Estonia, Tallinn, EUR⟩, ⟨Sweden, Stockholm, SEK⟩}
• A table as an array of records:
• ```Countries =   [{country="Finland", capital="Helsinki",  currency="EUR"),    {country="Estonia", capital="Tallinn",   currency="EUR"},    {country="Sweden",  capital="Stockholm", currency="SEK"}]```
Page 15

# Attribute names vs tuple components

• The elements of cartesian products T1 × … × Tn are tuples t = ⟨t1, …, tn⟩.
• The components of a tuple are accessed by their index: ti.
• We will also use dot notation t.a for the projection of values from a row t.
• We assume that each attribute name corresponds a specific component:
• t.a = ti(a)
• where i is an indexing function.
• Simultaneous projection:
• If A={a1,…,an} is a set of attributes,
• then t.A = ⟨t.a1, …, t.an
Page 16

# Sets vs arrays and lists

• Order? noOrder? yes
Duplicates? noSetOrdered set
Duplicates? yesMultisetList or Array
• With sets (and relations), order and duplication is irrelevant.
• Sorting makes no sense, for example.
• With tables, order and duplication makes a difference.
• SQL has DISTINCT and ORDER BY.
• The set operations UNION, INTERSECT, EXCEPT discard duplicates.
• There is also UNION ALL, INTERSECT ALL, EXCEPT ALL that preserve duplicates.
Page 17

# Duplicate rows in tables

• `Countries(name,capital,area,population,continent,currency)`
• A table with 252 rows
• Since there is a primary key, there are no duplicate rows in the Countries table.
• Query results are tables too, and they can contain duplicates. Example:
• ```SELECT continent,currency FROM Countries;
```
• 252 rows.
• ```SELECT DISTINCT continent,currency FROM Countries;
```
• Reduced to 168 rows when duplicates are omitted.
Page 18

# Relational Algebra

• We have now seen many of the ingredients of Relational Algebra,
• a concise mathematical notation in which we can express relations and queries.
• Below is a quick overview, we will return to this in a later lecture.
Page 19

## Notation and correspondences

• ConceptRel AlgebraSet theorySQL
domain (attribute values)TType
relationRTable
cartesian product of sets T1 × … × Tn {⟨t1,…,tn⟩ | ti ∊ Ti} table schema
labelaattribute name
componentt.ati(a)value of attribute
tuple{a=t1,…,b=tn} ⟨t1,…,tnrow
{ti | ⟨…,ti,…⟩∊R}column
Page 20
##### Relational Algebra → Notation and correspondences
• ConceptRel AlgebraSet theorySQL
unionR ∪ S{t | t∊R or t∊S}R UNION S
intersectionR ∩ S{t | t∊R and t∊S}R INTERSECT S
differenceR – S{t | t∊R and t∉S}R EXCEPT S
Page 21
##### Relational Algebra → Notation and correspondences
• ConceptRel AlgebraSet theorySQL
selectionσcR{t | t∊R, C}WHERE C
projectionπa,b,cR {⟨t.a,t.b,t.c⟩ | t∊R}SELECT a,b,c
cartesian product of relations R × S {⟨x,y,u,v⟩ | ⟨x,y⟩∊R, ⟨u,v⟩∊S} FROM R, S
theta joinRc S   = σc(R × S) R INNER JOIN S ON C
natural joinRS R NATURAL JOIN S
Page 22
##### Relational Algebra → Notation and correspondences
• ConceptRel AlgebraSet theorySQL
duplicatesδ RDISTINCT
sortingτa RORDER BY a
Page 23

# Functional Dependencies

• Assume a relational schema R(a1,…,an),
• S = {a1, …, an} is the set of attribues of R.
• Let X ⊆ S, A ⊆ S
• Functional Dependency: X → A, "X determines A", iff
• For all t,u∊R: t.X = u.X ⇒ t.A = u.A
Page 24

# Properties of Functional Dependencies

• Reflexivity (trivial dependency): X → X
• Transitivity: if X → Y and Y → Z then X → Z
• Closure: X+ = {A | X→A}
• the set of attributes determined by X
• Superkey: X such that X+ ⊇ S
• Key: minimal superkey
Page 25

# Functional Dependencies

## Example 1a

• Schema: `Countries(country,capital,currency)`
• Functional Dependencies:
• country→capital
country→currency
• Closures:
• country+ = {country,capital,currency}
• capital+ = {capital}
• curency+ = {currency}
• Superkeys: {country}, {country,capital}, {country,currency}, {country,capital,currency}
• Key: {country}
Page 26

## Example 1b

• Schema: `Countries(country,capital,currency)`
• Functional Dependencies:
• country→capital
country→currency
capital→country
• Closures:
• country+ = {country,capital,currency}
• capital+ = {capital,country,currency}
• currency+ = {currency}
• Superkeys: {country}, {capital}, {capital,currency}, {country,capital}, {country,currency}, {country,capital,currency}
• Keys: {country}, {capital}
Page 27
##### Functional Dependencies
• SQL code for example 1a
• ```CREATE TABLE Countries(
country TEXT PRIMARY KEY,
capital TEXT NOT NULL,
currency TEXT NOT NULL);
```
• SQL code for example 1b
• ```CREATE TABLE Countries(
country TEXT PRIMARY KEY,
capital TEXT NOT NULL UNIQUE,
currency TEXT NOT NULL);
```
Page 28

## Example 2

• Schema: `Countries(country,currency,value)`
• Functional Dependencies:
• country→currency
currency→value
• country→value (implied by transitivity)
• Closures:
• country+ = {country,currency,value}
curency+ = {currency,value}
value+ = {value}
• Superkeys: {country}, {country,currency}, {country,value}, {country,currency,value}
• Key: {country}
• Non-trivial FD that is not a superkey: currency→value
Page 29

### Split tables to avoid redundancy

• Countries
countrycurrencyvalue
FinlandEUR1.09
EstoniaEUR1.09
SwedenSEK0.12
• JustCountries
countrycurrency
FinlandEUR
EstoniaEUR
SwedenSEK
Currencies
currencyvalue
EUR1.09
SEK0.12
• Countries = JustCountries ⨝ Currencies
Page 30