Thus far, we have been using the FROM and WHERE clauses to join two tables.
SELECT m.First, m.Last, g.Course
FROM Midn m, Grades g
WHERE g.Alpha = m.Alpha;
You can imagine this pictorially,
SELECT First, Last, Course
FROM Midn INNER JOIN Grades ON Grades.Alpha = Midn.Alpha;
Theses three methods are equivalent:
SELECT m.Alpha, First, Last, Course
FROM Midn m, Grades g WHERE g.Alpha = m.Alpha;
SELECT m.Alpha, First, Last, Course
FROM Midn m INNER JOIN Grades g ON g.Alpha = m.Alpha;
SELECT m.Alpha, First, Last, Course
FROM Midn m JOIN Grades g ON g.Alpha = m.Alpha;
With the JOIN keyword, in MySQL, we have the option to use either ON or USING
SELECT Alpha, First, Last, Course
FROM Midn JOIN Grades USING(Alpha);
USING makes it easier to specify the columns we join on, and assumes
that the column(s) with that name exists in both tables, and the values must
be equal. It removes the ambiguity that we have been working around when
we select a field like Alpha (we have had to select what table to pull
Alpha from, ex. Midn.Alpha).
NOTE: Rows with NULL values in the join column(s) will
not be part of the join answer, as NULL will not match any other value including NULL!
While INNER JOINs return the those rows that intersect the two tables,
OUTER JOINs can be used to return all values from a table and to include
those records from the secondary joined table when they intersect, or NULL values
(for the columns in the joined secondary table) otherwise.
SELECT First, Last, Course
FROM Midn LEFT JOIN Grades ON Grades.Alpha = Midn.Alpha
ORDER BY Last, First;
| First | Last | Course |
|---|---|---|
| Cersei | Lannister | NULL |
| Jaime | Lannister | NULL |
| Tyrian | Lannister | HE111 |
| Tyrian | Lannister | HH104 |
| Tyrian | Lannister | NL110 |
| Tyrian | Lannister | NS101 |
| Tyrian | Lannister | SC111 |
| Tywin | Lannister | NULL |
| Bran | Stark | NL110 |
| Eddard | Stark | NULL |
| Rob | Stark | NL110 |
SELECT First, Last, Course
FROM Midn LEFT JOIN Grades ON Grades.Alpha = Midn.Alpha
WHERE Course IS NULL
ORDER BY Last, First;
NATURAL JOINs are similiar to INNER JOIN except that it will use matching column names automatically for you. While this is the most lazy way possible, there is considerable risk if you are not paying attention and columns with the same name in the two tables have different meanings.
SELECT First, Last, Course
FROM Midn NATURAL JOIN Grades;
We have shown a LEFT JOIN, but similarly there is a RIGHT JOIN that uses the second table as the primary and will retrieve all rows of the right table and those in the left that match.
The set operators can be used to work with the results from multiple queries. The UNION operator allows us to combine the results of two SELECT statements, though they must have the same number of columns and data types. Duplicative rows are removed by default (use UNION ALL to retain duplicates). Below is an example of using UNION. Note this is not the most efficiant way to write that particular query. (the best way is to use just one select))
SELECT First, Last, Course
FROM Midn JOIN Grades USING(Alpha)
WHERE Course = 'NL110'
UNION
SELECT First, Last, Course
FROM Midn JOIN Grades USING(Alpha)
WHERE Course = 'HH104';
There are additional set operators supported by some Relational DBMS: INTERSECT will
return only rows that return
identical values from both SELECT statements, and EXCEPT takes the two results
and only returns those rows that exist in the first result but not in the
second (similar with a minus in math). INTERSECT and EXCEPT are
not implemented in MySQL.
We have seen before that we can have a SELECT query within another SELECT query. The "interior" queries are called subqueries. We can have subqueries within subqueries as well, as many as needed to answer the use question. The subqueries are always enclosed in parantheses ().
We can use a subquery that returns a value anywhere a value is needed (for example in SELECT, WHERE, or HAVING clauses). For example, the query below returns all the rows in the Grades table with the minimum (or maximum in real value, as an A will be the highest grade at USNA but the minimum value as a string) course grade:
SELECT *
FROM Grades
WHERE CourseGrade = (SELECT MIN(CourseGrade) FROM Grades);
We can use a subquery that returns a list of values everywhere a list is required, for example in a WHERE clause using IN or NOT IN. The query below returns all the students who never got an A in a class:
SELECT Alpha, Last, First
FROM Midn
WHERE Alpha NOT IN (SELECT Alpha
FROM Grades
WHERE CourseGrade = 'A');
We can use a subquery that returns a set of rows (one columns or more) as a temporary table, anywhere a table can be used, for example in the FROM clause. The query below returns a list of the lowest performers in each course:
SELECT First, Last, g.Course, g.CourseGrade
FROM Midn m, Grades g, (SELECT Course, MAX(CourseGrade) AS MinGrade
FROM Grades
GROUP BY Course) AS newtable
WHERE m.Alpha = g.Alpha
AND g.Course = newtable.Course
AND g.CourseGrade = newtable.MinGrade;
Bottom line is that subqueries can be used in a multitude of situations. Note however, that using joins is often more efficient than using subqueries, if both of the options are available to answer a question.
Sometimes you want to return only a few rows from the result of a query, instead of returning all rows that satisfy the query. MySQL has a LIMIT clause in the SELECT statement that can be used for that. Below is an example that returns up to 100 rows from the Midn table:
SELECT *
FROM Midn
WHERE Alpha like '25____'
LIMIT 100;
One can also specify an offset or number of rows to skip (2 in the example below), before returning the specified number of rows (100 in the example below).
SELECT *
FROM Midn
WHERE Alpha like '25____'
LIMIT 2,100;
More information about LIMIT can be found in the MySQL SELECT documentation.
As discussed in a previous lecture, LOAD DATA is a very useful MySQL command used to load a text file into a table. More information can be found in the MySQL LOAD DATA documentation. Below is an example of using LOAD DATA to insert data from a sample testData.csv file stored on the local computer into a table. The data has some missing values, which will be replaced by NULL in the table:
CREATE TABLE Test(column1 INT NOT NULL, column2 VARCHAR(50), column3 INT);
LOAD DATA LOCAL INFILE 'testData.csv'
INTO TABLE Test
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(column1, @vcol2, @vcol3)
SET column2 = NULLIF(@vcol2,''),
column3 = NULLIF(@vcol3,'');
Note that you might need to configure your mysql client to accept uploading local files to the database server. In Unix, you can do that by creating a file ~/.my.cnf with the content:
!includedir /etc/mysql/conf.d/
[client]
local-infile=1
Here is some additional data that we can use towards problems, take a look at a newer version of insert_Midn_data.sql which has two additional tables.
LEFT JOIN the Grades table and add AcYr, Course, Title, and EndOfTerm columns.
LEFT JOIN the Sections table and add the InstID column.
LEFT JOIN the Prof table and add the Professors first and last name (remember to deconflict with the Midn Table for First and Last.)