In this lesson we'll work with the Midshipmen table. If you would like to work with the sample data, download this file Midshipmen.sql, and either cut-and-paste it into the MySQL window, or you can redirect the input into the MySQL command line client.
mysql -h db.cs.usna.edu -u m26xxxx -p m26xxxx -fcvvv < Midshipmen.sql
if you are connected to the CS Server or on a lab workstation, the following line will also work:mysql -h db.cs.usna.edu -u mxxxx -p m26xxxx -fcvvv < /data/SD321/Midshipmen.sql
or you could open the MySQL command line clientmysql -h db.cs.usna.edu -u m26xxxx -p m26xxxx -vvv
and then source the file from within the client viasource Midshipmen.sql
This will create the table and populate it with the same rows as seen in these examples.
The SELECT statement is used to retrieve information from a database. The basic syntax is given below, with [] marking optional parts:
SELECT [DISTINCT] column_name(s)|*
FROM table_name(s)
[WHERE conditions]
If we wanted to see all of the rows and all of the columns in our Midshipmen table, this could be accomplished using the asterisk * (all columns), example:
SELECT * FROM Midshipmen;
| Alpha | FirstName | LastName | Company | Room | Phone | Major | |
|---|---|---|---|---|---|---|---|
| 260190 | John | Smith | 5 | 54321 | 410-293-0001 | m260190@usna.edu | CompSci |
| 260312 | Jane | Doe | 9 | 98765 | 410-293-0003 | m260312@usna.edu | ElecEng |
| 270673 | Jane | Doe | 7 | 76543 | 410-293-0002 | m260673@usna.edu | DataSci |
| 260892 | Jason | Jones | 19 | 19333 | 410-293-0019 | m260892@usna.edu | CompSci |
| 260111 | Joe | Schmoe | 27 | 72543 | 410-293-0008 | m260111@usna.edu | DataSci |
| 260567 | Bill | West | 14 | 48765 | 410-293-0011 | m260567@usna.edu | ElecEng |
When performing our queries we can be much more specific with what data we want to receive, specifying both what columns we want returned and what data we would like back. Instead of using the star *, we will specify specific columns, and we will add the WHERE clause to look for specific rows.
SELECT Alpha, LastName, FirstName, Major
FROM Midshipmen
WHERE Major = 'DataSci';
| Alpha | LastName | FirstName | Major |
|---|---|---|---|
| 260111 | Schmoe | Joe | DataSci |
| 270673 | Doe | Jane | DataSci |
In the database we have the requirement that all rows must not be duplicated, but it is possible create a query that returns multiple identical rows, as an example:
SELECT LastName FROM Midshipmen;
| LastName |
|---|
| Smith |
| Doe |
| Doe |
| Jones |
| Schmoe |
| West |
We can prevent duplicate rows in the responses via the DISTINCT keyword, example
SELECT DISTINCT LastName FROM Midshipmen;
| LastName |
|---|
| Smith |
| Doe |
| Jones |
| Schmoe |
| West |
We can be more specific with the rows selected by using the WHERE clause. Just like traditional if-statements in the programming language of your choice, we have the ability to check for multiple conditions that have to be satisfied by a row in SQL. This is achieved with AND, OR, IN, NOT IN, BETWEEN, and LIKE. SQL-92 also supports multiple wild cards, with _ representing exactly one character and % being any set of characters (0 or more).
SELECT Alpha, LastName, Major
FROM Midshipmen
WHERE Alpha LIKE '26%'
AND Major IN ('DataSci', 'CompSci');
| Alpha | LastName | Major |
|---|---|---|
| 260190 | Smith | CompSci |
| 260673 | Doe | DataSci |
| 260892 | Jones | CompSci |
Knowledge Check: Be careful with OR! Run the query below, what's wrong with the results? What's the correct way to use OR to make our results look like the query above? (Reveal answer after attempting!)
SELECT Alpha, LastName, Major
FROM Midshipmen
WHERE Alpha LIKE '26%'
AND Major = 'DataSci'
OR Major = 'CompSci';
This meme sums it up nicely:-) The IN and NOT IN keywords can take lists as inputs, as seen above, or they can take the results of a single column result from a subquery. The DBMS will process the interior queries and provide that as values to the outer/exterior SQL query.
SELECT Alpha, LastName, Major
FROM Midshipmen
WHERE Alpha LIKE '26%'
AND Major IN (SELECT Major FROM Midshipmen WHERE Company = 5);
It is convenient to have the DBMS return the results in a sorted form, vice having to perform the sorting after the data is returned. To do that, we must add another clause to the SELECT statement.
SELECT [DISTINCT] column_name(s)|*
FROM table_name(s)
[WHERE conditions]
[ORDER BY column_name(s)[ASC/DESC]]
We will use ORDER BY to determine which columns will be used for ordering the results, and using ASC or DESC we can set the results to be displayed in either ascending or descending order respectively. A complete example:
SELECT Alpha, LastName, Major
FROM Midshipmen
WHERE Alpha LIKE '26%' AND Major IN ('DataSci', 'CompSci')
ORDER BY Major ASC, Alpha DESC;
| Alpha | LastName | Major |
|---|---|---|
| 260892 | Jones | CompSci |
| 260190 | Smith | CompSci |
| 260111 | Schmoe | DataSci |
In past lessons we discussed when to choose the data type as numeric versus CHARs, and the general recommendation was that if the value is not used in arithmetic computations, it could be a char. For these problems we did choose to store Alpha and Company as INTs, and we are doing math operations on them so we can learn to do math operations on simple data. But this will probably be the only time you are ever asked to find the mids whose company number is greater than 6.
If you would like to explore a real data set, the thetvdb.com provides their entire database online for download, and a local copy has been placed in the database schema SD321 on db. You are currently using the schema that was created for you on the database server, when you log on i
mysql -u <username> -p<password> -h <host> <schema> -vvv
you have been logging on to the schema that was created for your use (such as m261234, but you are able
to connect to any that you have permissions to, try:
mysql -u <username> -p<password> -h <host> SD321 -vvv
If you are already logged on, you can change schemas via USE SD321. For the examples below, connect
to the SD321 schema; we will be working with the tvseries table.