DUE: April 7, 2009, BEFORE start of lab

This lab should get you familiarized with indexing and query efficiency as well as loading a table in the database with data from a .cvs file.

Create a file called yourlastname_Lab7.txt that will contain all your answers for this lab. Write your name at top of the file.

Exercise 1: Consider a disk with average I/O time of 15 milliseconds per page and page size = 4096 bytes

Consider a table Products(BarCode, PName, Price) with 200,000 rows of 100 bytes each. The table is stored on disk such that no row spans 2 pages.

1. Compute the number of pages needed to store the entire table on disk

1. Compute the time (in seconds) needed to evaluate the following query, assuming no indexes are used:

SELECT * FROM Products

Consider the following query:

SELECT * FROM Products WHERE Price between 10 and 100

Assume that an unclustered B+tree index exists on Price, and 75% of the products satisfy the condition in the WHERE clause (Price between 10 and 100).

1. Compute the approximate time (in seconds) needed to evaluate the query using the B+tree index. State any assumptions that you make.

1. Assume now that the B+tree index is clustered. Compute the approximate time needed to evaluate the query using the clustered index.

1. Compute the approximate time (in seconds) needed to evaluate the query without using any index.

1. How many B+tree indexes can be created for table Products (assume there are no indexes currently created)? Justify your answer.

·        Connect to your MySQL database: Runà All Programs à MySQLàMySQL Query Browser àenter schooner.cs.usna.edu as server, mXXXXXX as user, enter your password, and enter mXXXXXX as the default schema.

·        Consider a table with the following schema

StudentsEnroll(StudentID, CourseID, ACYearEnd, Semester). Execute the following CREATE TABLE statement to create the table in your MySQL database.

CREATE TABLE StudentsEnroll(

StudentID char(6) NOT NULL,

CourseID char(6) NOT NULL,

AcYearEnd int NOT NULL,

Semester char(6) NOT NULL,

CONSTRAINT PK_StudentsEnroll PRIMARY KEY(StudentID, CourseID)

);

·        I’ve created a file with student enrollment data (file is also on Blackboard) and placed in my Unix account at /home/adina/Teaching/StudentsEnrollData.csv. The file is accessible by all, including the MySQL server running on schooner.cs.usna.edu.

·        Load that data into the StudentsEnroll table that you created, by using the following LOAD DATA statement in your Query Browser (most database systems provide some statement to load data from a file into a table in a database; load data infile is for MySQL):

INTO TABLE StudentsEnroll

FIELDS TERMINATED BY ','

LINES TERMINATED BY '\n';

To check that everything went well, execute a “SELECT * FROM StudentsEnroll” in your Query Browser. You should see all the data in your table (1773 rows).

Exercise 2:

a) Execute the following query:

SELECT CourseID, Count(*) AS NumStudents

FROM StudentsEnroll

WHERE ACYearEnd = 2009 and Semester = 'SPRING'

GROUP BY CourseID;

a.1) What are the steps used by the database system to answer the query? (Click the “Explain” button in the menu, and look at the “Possible keys”, “Key”, “Extra” column in the pane that opens; look in the textbook – page 296-300, in particular Table 12.9 on page 300)

a.2) What was the time needed to complete the query? (Time and number of rows fetched by each select query is reported on the same screen as the results of the query, on the bottom).

a.3) Re-execute the above select statement three times, and report the average time.

b) What index (if any) would you create to improve the efficiency of the query?  Justify your answer.

c) Write the SQL statement to create the above index in MySQL. Execute the statement in MySQL.

d) Execute again the query

SELECT CourseID, Count(*) AS NumStudents

FROM StudentsEnroll

WHERE ACYearEnd = 2009 and Semester = 'SPRING'

GROUP BY CourseID;

d.1) What are the steps used by the database system to answer the query now? (Click the “Explain” button in the menu, and look at the “Possible Keys”, “Key”, “Extra” column in the pane that opens) If the database did not use the index you just created, try to explain why.

d.2) What was the time needed to complete the query? (Time and number of rows fetched by each select query is reported on the same screen as the results of the query).

d.3) Re-execute the above select statement three times, and report the average time. If the execution time is higher than the execution time you obtained in a.3), try to explain why.

Turn in (due before start of lab on April 7, 2009):

Electronic:

1. Upload yourlastname_Lab7.txt with all your answers to the Lab 7 assignment on the blackboard.

Hard-copies:

1. The completed assignment coversheet. Your comments will help us improve the course.
2. A hard copy of your yourlastname_Lab7.txt containing the answers for each exercise.