R offers support for MySQL via multiple packages. In this course we will be using the RMariaDB package to connect from R to a MySQL database server and run queries. MariaDB is MySQL are related, and the MariaDB driver is the one currently used for connections to both MySQL and MariaDB. For more information on RMariaDB see the official documentation.
# Lets install the required package for R
# We only need to do this once.
install.packages("RMariaDB")
The code to connect to the database is similar with what we used from Python, but we need to specify as first parameter an object of class MariaDBDriver:
# Lets connect to the database
library(RMariaDB)
db = dbConnect(RMariaDB::MariaDB(),
user='username',
password = 'dbpass',
host = 'dbhost',
dbname = 'databasename')
To test that connection was successful, you could execute a command to list
all tables:
dbListTables(db)
To disconnect, when all queries are completed, usually at the end of
your R script:
dbDisconnect(db)
In practice, this method is not what is normally used
directly. Please review the next section!
Since your username and password (for the database) is needed for the connection, we'll create a small mydb.R file which stores the parameters for the connection. This file should normally be stored outside of the R working directory, in a place where other configurations settings are. For now, we will store it in the same dorectory as our other R scripts.
# mydb.R - This file stores the configuration for the database connection.
user = "m2xxxxx"
password = "YOURPASSWORD"
host = "db.cs.usna.edu"
database = "m2xxxxx"
# Import the Library
library(RMariaDB)
# Connect to the database
db = dbConnect(RMariaDB::MariaDB(),
user = user,
password = password,
host = host,
dbname = database )
Then, we'll connect to the database using the configuration file instead of
specifying the username/password explicitly in the script:
# Source our mydb file
source("mydb.R")
There are a few basic steps needed when performing SQL queries via R and MySQL. In the examples below simple queries will be presented on the left, while dynamic queries are shown on the right.
query = "SELECT Alpha, Course
FROM Grades
WHERE Course = 'NL110'"
query2 = "UPDATE Grades
SET 6Week = 'A'
WHERE Course = 'NL110'"
query = "SELECT Alpha, Course
FROM Grades
WHERE Course = ?"
query2 = "UPDATE Grades
SET 6week = ?
WHERE Course = ?"
# This step not required for simple query!
course = 'NL110' # Just as an example
values = c(course)
values2 = c('A',course)
res = dbSendQuery(db, query)
res = dbSendStatement(db, query2)
res = dbSendQuery(db, query, values)
res = dbSendStatement(db, query2, values2)
row_count = dbGetRowsAffected(res) # Number of rows affected by last insert, delete, or update
columns_info = dbColumnInfo(res) # Description of columns selected
statement = dbGetStatement(res) # Last SQL statement executed -useful for debugging
# Option 1: Retrieving one row at a time with a loop
row = dbFetch(res,n=1) # Fetch a single row
print(row) # Do some work
while (!dbHasCompleted(res)){
row = db.Fetch(res,n=1) # Retrieve the next row
print(row) # Do some work
}
# Option 2: Retrieve all of the rows at once.
allresults = dbFetch(res)
print(allresults)
dbClearResult(res)
dbGetQuery() for
SELECT queries, and dbExecute() for INSERT, UDPATE, DELETE queries,
and these methods have the send-fetch-clear steps build-in.res = dbGetQuery(db, query, values) # For SELECT
res2 = dbExecute(db, query2, values2) # For INSERT, UPDATE, DELETE
mydb.R file based on the "Connecting (in
practice)"mydb.R