Normalization Review
First, lets have a quick review of the previous lecture's definitions and the four normal forms that
we discussed.
- Functional Dependency (FD): attribute(s) that uniquely determines another attribute(s), \(X \rightarrow Y\)
- Candidate Key: a minimal set of attributes that can identify a record/row
- Superkey: a set of attributes that includes a candidate key (and possibly additional attributes)
A Table is in 1NF
With 1NF our goal is for our table to just be a clean and well defined table,
at this point our tables need to meet the following requirements:
- Each column contains data about one attribute
- All entries in a column are of the same kind (ex. Alpha (INT), Name (VARCHAR), Date (DATE))
- Each column has a unique name
- Cells contain only one value
- Each row is unique
- The order of the columns and rows does not matter
A Table is in 2NF
- The table is already in 1NF
- All non-key columns are determined only by the ENTIRE key,
- when given \(R(\underline{A},\underline{B},C,D)\)
- all other columns are determined by the primary key, \((A,B) \rightarrow (C,D)\)
- There are no partial (functional) dependencies,
\(B \rightarrow \ D\), is not allowed.
A Table is in 3NF
- The table is already in 2NF
- There are no transitive dependencies,
\(A \rightarrow B \rightarrow C\), is not allowed.
- If you have a question about the transitive property, you can check to see if
a table is in 3NF by checking that,
for every functional
dependency X → A , one of the following
statements is true:
- A ∈ X (it is a trivial FD)
- X is a superkey or
- A is part of some key for R
A Table is in Boyce-Codd Normal Form (BCNF)
- The table is already in 3NF
- For every functional dependency \(X \rightarrow Y\), the determinant \(X\) is a candidate key.
- Simplified,
If any attribute (or set of attributes) determines another attribute,
then that determining attribute(s) must be capable of uniquely identifying a record in the table
(i.e., it must be a candidate key).
Normally we will put in the work and effort to ensure that all of our tables are normalized into Boyce-Codd Normal Form.
Lets review an example of a table that is in 3NF but not in BCNF
Imagine we have a table \( R(A, B, C) \) with the following
functional dependencies:
- \( A, B \rightarrow C \)
- \( C \rightarrow A\)
and with the following candidate keys:
Reviewing our steps:
- 1NF - This is a valid table.
-
2NF - All non-key columns are determined by the entire key.
- \( \{A, B\} \), \(C \) requires the entire candidate key and therefore \( \{A, B\} \) is a valid primary key.
-
2NF - There are no partial (functional) dependencies
- \( \{A, B\} \), no partial (functional) dependencies found.
-
2NF - Using key \( \{A, B\} \) results in a table that satisfies 2NF
- 3NF - There are no transitive dependencies
- \(A \) is a part of the key \( \{A,B\} \), therefore \( C \rightarrow A\) does not violate this rule.
- BCNF - Verify that for all \(X \rightarrow Y\), the determinant \(X\) is a candidate key
- \( C \rightarrow A\) violates BCNF, \( C\) is not a candidate key or superkey.
This table can not be decomposed further into BCNF and we would leave it in 3NF.
Practice Problem Sets
The purpose of this lecture is to practice normalization problems, please form up into small groups and attempt the following:
- Problem Set #1 (Practice Normalization)
- Problem Set #2 (Practice Determining Functional Dependencies / Candidate Keys)