2020-02-27 14:25

Page 1
- Lecture notes: chapter 4
- Book: chapter 2

- Concepts from discrete mathematics
- Sets
- Cartesian products, A × B
- Relations
- Functions
- (Disjoint unions, A + B)

- Introduction to
- Relational Algebra (which we return to in a later lecture)
- Functional Dependencies (to be continued next time)

- Sets are "collections of things".
- x∊A means x is a member of the set A.
- Mathematicans like infinite sets.
- Well-known infinite sets:
- ℕ = {0, 1, 2, 3, …} the set of natural numbers
- ℤ = {…, -2, -1, 0, 1, 2, 3, …}: the set of integers
- ℝ: the set of real numbers

- A ⊆ B means that A is a subset of B.
- If x∊A then x∊B.
- Example: ℕ ⊆ ℤ ⊆ ℝ.

Union: A ∪ B = {x | x∊A or x∊B} Intersection: A ∩ B = {x | x∊A and x∊B} Difference: A – B = {x | x∊A and x∉B} - Often illustrated with Venn diagrams:

Cartesian product: A × B = {⟨a,b⟩ | a∊A, b∊B} - The size of a cartisan product: |A × B| = |A|·|B|
*Projections*from a pair p = ⟨a,b⟩:- π
_{1}p = a - π
_{2}p = b

- π
- Well-known cartesian products:
- ℝ
^{2}= ℝ×ℝ: points in 2D space. - ℝ
^{3}= ℝ×ℝ×ℝ: points in 3D space. (Notation: ⟨x,y,z⟩ ∊ ℝ^{3})

- ℝ

- A relation is a subset of a cartesian product of two or more sets:
- R ⊆ A × B
- R ⊆ A
_{1}× … × A_{n}

- A binary relation R ⊆ A × B is a set of pairs.
- x R y ⇔ ⟨x,y⟩∊R

- A well-know binary relation:
- (≤) ⊆ ℝ
^{2} - (≤) = {⟨x,y⟩ | ⟨x,y⟩∊ ℝ
^{2}, x≤y}

- (≤) ⊆ ℝ

- Assume we have a relation R ⊆ A × A.
- R is
*reflexive*: for all x∊A, x R x. - R is
*transitive*: for all x,y,z∊A, x R y and y R z ⇒ x R z. - R is
*symmetric*: for all x,y∊A, x R y ⇒ y R x. - Example:
- (≤) is both reflexive and transitive, but not symmetric.

- Since relations are sets:
Union: R ∪ S = {t | t∊R or t∊S} Intersection: R ∩ S = {t | t∊R and t∊S} Difference: R – S = {t | t∊R and t∉S} Cartesian product: R × S = {⟨x,y,u,v⟩ | ⟨x,y⟩∊R, ⟨u,v⟩∊S} - Note: no nested tuples: ⟨x,y,u,v⟩ instead of ⟨⟨x,y⟩,⟨u,v⟩⟩.

- A function f : A→B is a relation f ⊆ A×B
- f(x) = y means ⟨x,y⟩∊f.
- Restriction: for each x∊A there is at most one y∊B such that ⟨x,y⟩∊f.
- The input
`x`of a function*determines*the output`y`.

- Tables corresponding to entities are typically functions.
- Example:
`Countries(`

__name__,capital,area,population,continent,currency) - Input:
`name`

(the primary key). Output: the other attributes. - But database queries are not limited to looking up the output for a given input…

- Example:
- Tables corresponding to many-to-many relationships are
not functions
- Example:
`UsedIn(`

__country__,__language__)

country -> Countries.name

language -> Languages.code

- Example:
- Even when a table is not a function there might be
*functional dependencies*between the attributes…

- A relational schema:
- R(a
_{1}, …, a_{1})

- R(a
- can be augmented with the
*domain*(type) of each attribute:- R(a
_{1}:T_{1},…,a_{n}:T_{n})

- R(a
- The
*relational signature*of a schema is the corresponding cartesian product:- T
_{1}× … × T_{n}

- T
- Given a relational schema R(a
_{1}, …, a_{n}) with signature T_{1}× … × T_{n}- A table of schema R is a subset of the cartesian product
T
_{1}× … × T_{n}. - A
*row*in the table is an element of the cartesian product:`t`∊ T_{1}× … × T_{n}.

- A table of schema R is a subset of the cartesian product
T

**CREATE****TABLE**Countries**(**country TEXT**,**capital TEXT**,**currency TEXT**)**- Relational schema:
`Countries(country:Text, capital:Text, currenct:Text)`

*Relation signature*: Text × Text × Text

Countries country capital currency Finland Helsinki EUR Estonia Tallinn EUR Sweden Stockholm SEK - Mathematical notation:
- Countries = {⟨Finland, Helsinki, EUR⟩, ⟨Estonia, Tallinn, EUR⟩, ⟨Sweden, Stockholm, SEK⟩}

- A table as an array of records:
`Countries =`

[{country="Finland", capital="Helsinki", currency="EUR"),

{country="Estonia", capital="Tallinn", currency="EUR"},

{country="Sweden", capital="Stockholm", currency="SEK"}]

- The elements of cartesian products
T
_{1}× … × T_{n}are tuples t = ⟨t_{1}, …, t_{n}⟩.- The components of a tuple are accessed by their index:
`t`._{i}

- The components of a tuple are accessed by their index:
- We will also use dot notation
`t.a`for the*projection*of values from a row`t`.- We assume that each attribute name corresponds a specific component:
`t.a`=`t`_{i(a)}- where
`i`is an indexing function.

- where

*Simultaneous projection:*- If A={a
_{1},…,a_{n}} is a set of attributes, - then
`t.A`= ⟨t.a_{1}, …, t.a_{n}⟩

- If A={a

Order? no Order? yes Duplicates? no Set Ordered set Duplicates? yes Multiset List or Array - With sets (and relations), order and duplication is irrelevant.
- Sorting makes no sense, for example.

- With tables, order and duplication makes a difference.
- SQL has DISTINCT and ORDER BY.
- The set operations UNION, INTERSECT, EXCEPT discard duplicates.
- There is also UNION ALL, INTERSECT ALL, EXCEPT ALL that preserve duplicates.

`Countries(`

__name__,capital,area,population,continent,currency)- A table with 252 rows

- Since there is a primary key, there are no duplicate rows in the Countries table.
- Query results are tables too, and they can contain duplicates. Example:
**SELECT**continent**,**currency**FROM**Countries**;**- 252 rows.
**SELECT****DISTINCT**continent**,**currency**FROM**Countries**;**- Reduced to 168 rows when duplicates are omitted.

- We have now seen many of the ingredients of
*Relational Algebra*,- a concise mathematical notation in which we can express relations and queries.

- Below is a quick overview, we will return to this in a later lecture.

Concept Rel Algebra Set theory SQL domain (attribute values) T Type relation R Table cartesian product of sets T _{1}× … × T_{n}{⟨t _{1},…,t_{n}⟩ | t_{i}∊ T_{i}}table schema label a attribute name component t.a t _{i(a)}value of attribute tuple {a=t _{1},…,b=t_{n}}⟨t _{1},…,t_{n}⟩row {t _{i}| ⟨…,t_{i},…⟩∊R}column

Concept Rel Algebra Set theory SQL union R ∪ S {t | t∊R or t∊S} R UNION S intersection R ∩ S {t | t∊R and t∊S} R INTERSECT S difference R – S {t | t∊R and t∉S} R EXCEPT S

Concept Rel Algebra Set theory SQL selection σ _{c}`R`{t | t∊ `R,``C`}WHERE `C`projection π _{a,b,c}`R`{⟨t.a,t.b,t.c⟩ | t∊ `R`}SELECT a,b,c cartesian product of relations `R`×`S`{⟨x,y,u,v⟩ | ⟨x,y⟩∊ `R`, ⟨u,v⟩∊`S`}FROM `R`,`S`theta join `R`⨝_{c}`S`= σ_{c}(`R`×`S`)`R`INNER JOIN`S`ON`C`natural join `R`⨝`S`… `R`NATURAL JOIN`S`

Concept Rel Algebra Set theory SQL duplicates δ `R`DISTINCT sorting τ _{a}`R`ORDER BY a

- Assume a relational schema R(a
_{1},…,a_{n}),- S = {a
_{1}, …, a_{n}} is the set of attribues of R. - Let X ⊆ S, A ⊆ S

- S = {a
*Functional Dependency*: X → A, "X*determines*A", iff- For all t,u∊R: t.X = u.X ⇒ t.A = u.A

*Reflexivity*(trivial dependency): X → X*Transitivity*: if X → Y and Y → Z then X → Z*Closure*: X^{+}= {A | X→A}- the set of attributes determined by X

*Superkey*: X such that X^{+}⊇ S*Key*: minimal superkey

- Schema:
`Countries(country,capital,currency)`

- Functional Dependencies:
- country→capital

country→currency

- country→capital
- Closures:
- country
^{+}= {country,capital,currency} - capital
^{+}= {capital} - curency
^{+}= {currency}

- country
- Superkeys: {country}, {country,capital}, {country,currency}, {country,capital,currency}
- Key: {country}

- Schema:
`Countries(country,capital,currency)`

- Functional Dependencies:
- country→capital

country→currency*capital→country*

- country→capital
- Closures:
- country
^{+}= {country,capital,currency} - capital
^{+}= {capital,*country,currency*} - currency
^{+}= {currency}

- country
- Superkeys: {country},
*{capital}*,*{capital,currency}*, {country,capital}, {country,currency}, {country,capital,currency} - Keys: {country},
*{capital}*

- SQL code for example 1a
**CREATE****TABLE**Countries**(**country TEXT**PRIMARY****KEY****,**capital TEXT**NOT****NULL****,**currency TEXT**NOT****NULL****)****;**- SQL code for example 1b
**CREATE****TABLE**Countries**(**country TEXT**PRIMARY****KEY****,**capital TEXT**NOT****NULL****UNIQUE****,**currency TEXT**NOT****NULL****)****;**

- Schema:
`Countries(country,currency,value)`

- Functional Dependencies:
- country→currency

currency→value - country→value (implied by transitivity)

- country→currency
- Closures:
- country
^{+}= {country,currency,value}

curency^{+}= {currency,value}

value+ = {value}

- country
- Superkeys: {country}, {country,currency}, {country,value}, {country,currency,value}
- Key: {country}
*Non-trivial FD that is not a superkey*: currency→value

Countries country currency value Finland EUR 1.09 Estonia EUR 1.09 Sweden SEK 0.12 JustCountries country currency Finland EUR Estonia EUR Sweden SEK Currencies currency value EUR 1.09 SEK 0.12 - Countries = JustCountries ⨝ Currencies