Important! Mathematical objects have
"type", just like objects in programs, and a set is a type.
So, 4 and \( \{4\} \) are different! The first is the number
four, and the second is the set whose only element is
the number four.
So "\( 4 \)" is a number and "\( \{4\} \)" is
a set. They have different type and therefore cannot
be the same.
By the same reason, \( \{4\} \) and \( \{\{4\}\} \) are not the same.
Also by the same reason, \( 0 \) and \( \emptyset \) and "" are all
different, because the first is a number, the
second is a set, and the third is a string.
Important! is \(\emptyset\) a subset of \(\{1,2,3,4\}\)? Yes! Think like a mid trying to get out of trouble: can you show me a single element of \(\emptyset\) that isn't in \(\{1,2,3,4\}\)? Nope! Therefore \(\emptyset\) must be a subset of \(\{1,2,3,4\}\). In fact, for any set \(A\), we have \(emptyset\) is a subset of \(A\). Also note that every set is a subset of itself.
We have nice notations for subset related stuff: \(A \subseteq B\) means that \(A\) is a subset of \(B\). \(A \subset B\) means that \(A\) is a proper subset of \(B\), i.e. \(A \neq B\). Ex: \(\mathbb{Z} \subset \mathbb{R}\). If \(A\) is a set, we denote by \(2^A\) the set of all subsets of \(A\). Ex: If \(A = \{1,2,3\}\), then \(2^A = \{ \emptyset, \{1\}, \{2\}, \{3\}, \{1,2\}, \{2,3\}, \{1,3\}, \{1,2,3\} \}\).def cardinality(A):
k = 0
for x in A:
k = k + 1
return k
A set is a collection of distinct objects. In databases, a set can be thought of as a table and each row represents an element of the set. As an example, a table named Midn where each row contains a student’s information can be considered a set of Midn and each row is an individual element.
Lets now review the results of two SELECT statements.
SELECT Alpha, First, Last FROM Midn;
+--------+--------+-----------+
| Alpha | First | Last |
+--------+--------+-----------+
| 261000 | Eddard | Stark |
| 261001 | Rob | Stark |
| 261002 | Bran | Stark |
| 261234 | Tyrian | Lannister |
| 261235 | Cersei | Lannister |
| 261236 | Tywin | Lannister |
| 261237 | Jaime | Lannister |
+--------+--------+-----------+
7 rows in set (0.02 sec)
SELECT Alpha, AcYr, Course, EndOfTerm FROM Grades;
+--------+------+--------+-----------+
| Alpha | AcYr | Course | EndOfTerm |
+--------+------+--------+-----------+
| 261001 | 2023 | NL110 | C |
| 261002 | 2023 | NL110 | A |
| 261234 | 2023 | HE111 | B |
| 261234 | 2023 | HH104 | A |
| 261234 | 2023 | NL110 | A |
| 261234 | 2023 | NS101 | B |
| 261234 | 2023 | SC111 | C |
+--------+------+--------+-----------+
7 rows in set (0.01 sec)
You can consider any queries we make against a table to be a subset of the table,
remember a subset can be the entire table or any smaller combination of the possible elements. If we had
performed a query like SELECT * FROM Grades WHERE Course='NL110', then this would have
created a subset of the table Grades only showing elements related to NL110.
Remember each row is an element of the larger set (table), and each element is also a subset of the larger set. A single element in our case would be:
+--------+------+--------+-----------+
| Alpha | AcYr | Course | EndOfTerm |
+--------+------+--------+-----------+
| 261001 | 2023 | NL110 | C |
+--------+------+--------+-----------+
If we were to JOIN the two tables without specifying what columns we wanted to use, our results would be Cartesian Product (as described above) where the results would be a new set of all possible pairs of the elements of the original two sets.
SELECT Midn.Alpha, First, Last, AcYr, Course, EndOfTerm FROM Midn, Grades;
+--------+--------+-----------+------+--------+-----------+
| Alpha | First | Last | AcYr | Course | EndOfTerm |
+--------+--------+-----------+------+--------+-----------+
| 261237 | Jaime | Lannister | 2023 | NL110 | C |
| 261236 | Tywin | Lannister | 2023 | NL110 | C |
| 261235 | Cersei | Lannister | 2023 | NL110 | C |
| 261234 | Tyrian | Lannister | 2023 | NL110 | C |
| 261002 | Bran | Stark | 2023 | NL110 | C |
| 261001 | Rob | Stark | 2023 | NL110 | C |
| 261000 | Eddard | Stark | 2023 | NL110 | C |
| 261237 | Jaime | Lannister | 2023 | NL110 | A |
| 261236 | Tywin | Lannister | 2023 | NL110 | A |
| 261235 | Cersei | Lannister | 2023 | NL110 | A |
| 261234 | Tyrian | Lannister | 2023 | NL110 | A |
| 261002 | Bran | Stark | 2023 | NL110 | A |
| 261001 | Rob | Stark | 2023 | NL110 | A |
| 261000 | Eddard | Stark | 2023 | NL110 | A |
| 261237 | Jaime | Lannister | 2023 | HE111 | B |
| 261236 | Tywin | Lannister | 2023 | HE111 | B |
| 261235 | Cersei | Lannister | 2023 | HE111 | B |
| 261234 | Tyrian | Lannister | 2023 | HE111 | B |
| 261002 | Bran | Stark | 2023 | HE111 | B |
| 261001 | Rob | Stark | 2023 | HE111 | B |
| 261000 | Eddard | Stark | 2023 | HE111 | B |
| 261237 | Jaime | Lannister | 2023 | HH104 | A |
| 261236 | Tywin | Lannister | 2023 | HH104 | A |
| 261235 | Cersei | Lannister | 2023 | HH104 | A |
| 261234 | Tyrian | Lannister | 2023 | HH104 | A |
| 261002 | Bran | Stark | 2023 | HH104 | A |
| 261001 | Rob | Stark | 2023 | HH104 | A |
| 261000 | Eddard | Stark | 2023 | HH104 | A |
| 261237 | Jaime | Lannister | 2023 | NL110 | A |
| 261236 | Tywin | Lannister | 2023 | NL110 | A |
| 261235 | Cersei | Lannister | 2023 | NL110 | A |
| 261234 | Tyrian | Lannister | 2023 | NL110 | A |
| 261002 | Bran | Stark | 2023 | NL110 | A |
| 261001 | Rob | Stark | 2023 | NL110 | A |
| 261000 | Eddard | Stark | 2023 | NL110 | A |
| 261237 | Jaime | Lannister | 2023 | NS101 | B |
| 261236 | Tywin | Lannister | 2023 | NS101 | B |
| 261235 | Cersei | Lannister | 2023 | NS101 | B |
| 261234 | Tyrian | Lannister | 2023 | NS101 | B |
| 261002 | Bran | Stark | 2023 | NS101 | B |
| 261001 | Rob | Stark | 2023 | NS101 | B |
| 261000 | Eddard | Stark | 2023 | NS101 | B |
| 261237 | Jaime | Lannister | 2023 | SC111 | C |
| 261236 | Tywin | Lannister | 2023 | SC111 | C |
| 261235 | Cersei | Lannister | 2023 | SC111 | C |
| 261234 | Tyrian | Lannister | 2023 | SC111 | C |
| 261002 | Bran | Stark | 2023 | SC111 | C |
| 261001 | Rob | Stark | 2023 | SC111 | C |
| 261000 | Eddard | Stark | 2023 | SC111 | C |
+--------+--------+-----------+------+--------+-----------+
49 rows in set (0.00 sec)
Of course this is not what we want to retrieve from the database as all of our students grades were mixed,
but it is the effect of what happens when we don't properly join the tables. In reality this
is a CROSS JOIN that would be the result of something similar to SELECT * FROM Midn CROSS JOIN Grades;.
What we really wanted to find was the intersection between the two tables, but a traditional set intersection
would only show elements of the set (rows in our case) that were identical between the two queries. If we only
wanted to find the Alphas this could be accomplished by using INTERSECT in MySQL.
SELECT Alpha FROM Midn INTERSECT SELECT Alpha FROM Grades;
+--------+
| Alpha |
+--------+
| 261001 |
| 261002 |
| 261234 |
+--------+
3 rows in set (0.00 sec)
In our case, we want to define what fields (or columns) to compare, in essense we are performing a MySQL intersection by running a operation over sets with our two tables.
SELECT Midn.Alpha, First, Last, AcYr, Course, EndOfTerm FROM Midn INNER JOIN Grades USING(Alpha);
+--------+--------+-----------+------+--------+-----------+
| Alpha | First | Last | AcYr | Course | EndOfTerm |
+--------+--------+-----------+------+--------+-----------+
| 261001 | Rob | Stark | 2023 | NL110 | C |
| 261002 | Bran | Stark | 2023 | NL110 | A |
| 261234 | Tyrian | Lannister | 2023 | HE111 | B |
| 261234 | Tyrian | Lannister | 2023 | HH104 | A |
| 261234 | Tyrian | Lannister | 2023 | NL110 | A |
| 261234 | Tyrian | Lannister | 2023 | NS101 | B |
| 261234 | Tyrian | Lannister | 2023 | SC111 | C |
+--------+--------+-----------+------+--------+-----------+
7 rows in set (0.00 sec)
We can think of our INNER JOIN as the intersection over our two resulting sets above, \(A\) and \(B\),
and their respective rows \(x\) and \(y\), performed by the operation over the appropriate columns.
$$ C = \sum_{x \in A} \sum_{y \in B} f(x,y,*columns) $$
Again, this action is really just the intersection of the rows of the tables as we are allowing
ourselves a bit of leeway in the requirements for the elements (or comparison of the elements) in the set to support
intersection, or
if you dislike that concept, imagine that the database is finding the proper intersection using the identical columns
and then adding the additionally requested columns.
Continuing this thought, when we are comparing these two tables via the Alpha column, it is as if the other columns in the respective tables are dependent on this particular column as if the remaining columns support this column or could be defined by it. This is actually the concept of functional dependency that we will begin exploring in the next lecture.
INNER JOINOUTER JOINLEFT JOINJOINNATURAL JOIN, CROSS JOIN, and RIGHT JOIN, and why should we avoid these?