In the previous lecture we discussed interfaces for traditional data devices, and introduced the concept of RAIDing drives to support reliability and to increase data transfer speeds. Lets discuss how our systems and MySQL help further improve performance.
In your computer, data is stored long term in persistent storage using Hard Disk Drives and/or Solid State Drives. These devices provide data to the CPU at the rates that were mentioned in the previous class, but these speeds are considerably slower than when we use memory that is connected directly to the motherboard or cache that is attached to the CPU itself.
Random Access Memory (RAM) in a computer is a form of volatile (non-persistent) storage that is used to store data and machine code currently being used or processed by the CPU.
When a program is run, data is transferred from the hard drive to RAM. This data includes the program's executable code and any files or data it needs to access. The CPU accesses this data from RAM for processing since accessing data from RAM is much faster than from a hard drive or SSD. Since RAM is volatile, it's constantly refreshed to maintain the data. When new tasks are started, old data is overwritten.
RAM is typically installed in modules, such as DIMMs (for desktops) or SO-DIMMs (for laptops). The amount of RAM in a computer can significantly affect its performance, especially in tasks that require a lot of data to be stored temporarily. Modern systems tend to support dual or quad-channel mode, which allows two (or four) RAM modules to workin in parallel which will double (or quadruple) performance.
| Standard | Introduced | Speed | Voltage | Max Module Size | Advantages |
|---|---|---|---|---|---|
| DDR3 | 2007 | 800MHz to 2133MHz | 1.5V | 32GB | |
| DDR4 | 2014 | 2133MHz to 3200MHz | 1.2V | 64GB-128GB | Offers higher bandwidth and faster data rates compared to DDR3, along with improved energy efficiency and stability. |
| DDR5 | 2020 | 4800MHz to 5200MHz (expected to hit 8400 in the future) | 1.1 - 1.2V | 128GB-256GB | DDR5 provides improved performance, lower power consumption, and greater data integrity features compared to DDR4. It also introduces new features like on-die ECC (Error Correcting Code) for better reliability. |
Memory comes in different speed grades, often labeled as DDR3-800, DDR3-1066, DDR3-1333, DDR3-1600, DDR3-1866, and DDR3-2133. The number after "DDR3-" represents the maximum transfer rate in megatransfers per second (MT/s).
For example, DDR3-1600 refers to a peak transfer rate of 1600 MT/s.
The actual data rate in megabytes per second (MB/s) can be calculated by multiplying the transfer rate by 8
(since DDR memory transfers 8 bytes per cycle). For instance, DDR3-1600 has a peak data rate of
1600 × 8 = 12,800 MB/s.. Remember that this is per module, and systems use their memory in parallel.
The cache on a CPU is a small amount of very fast memory located on or very close to the CPU chip. It is used to temporarily store instructions and data that the CPU is likely to reuse. This improves the efficiency and speed of data access, as accessing data from the cache is much faster than fetching it from the main memory (RAM).
Some of the benefits of a CPU cache include reducing the average time to access data and decreasing the load on the main memory bus of the computer. There are limitations though, since a larger cache will consume more power and surface area on the CPU, and after a certain point increasing the size of a cache does not proportionaly increase the performance.
This kind of behavior is typical in real-world scenarios. Initially, increasing the cache size significantly improves the hit rate, but beyond a certain point, the returns diminish. This plateau effect reflects the principle of locality: most programs tend to access only a limited set of data and instructions repeatedly. Once the cache is large enough to hold this set, further increases in cache size result in smaller improvements in hit rate.
MySQL implements caching to improve performance and efficiency, with various places in the pipeline where we can use this to our advantage.
MySQL supports multiple storage engines, these define specific methods on which to store and access information. We will discuss two such engines, the default InnoDB and the Memory engine.
Considering the descriptions of the two database engines above, you could imagine that nearly all of your data would be maintained in the default InnoDB engine, and for tables that include session data (authentication, where someone is in a website, etc) that a memory table could be used. In most web environments the state is refreshed each time a user visits a new page.
Creating a table using a different engine is as simple as specifying the engine
in the CREATE TABLE statement. Lets see a couple quick examples
from the MySQL Documentation.
CREATE TABLE tablename (
fieldOne INT
) ENGINE=MEMORY;
CREATE TABLE tablename ENGINE=MEMORY
SELECT * FROM othertable;
CREATE TABLE tablename (
fieldOne INT,
fieldTwo INT,
INDEX USING HASH (fieldOne)
) ENGINE=MEMORY;
CREATE TABLE tablename (
fieldOne INT,
fieldTwo INT,
INDEX USING BTREE (fieldOne)
) ENGINE=MEMORY;
/home drive
that was constructed from 9 drives in RAID6, this drive
is mounted as /home on every lab workstation as well as ssh.cs.usna.edu
and desktop.cs.usna.edu.
df -BG
Filesystem 1G-blocks Used Available Use% Mounted on
tmpfs 2G 1G 2G 1% /run
/dev/sda2 234G 41G 181G 19% /
tmpfs 8G 1G 8G 1% /dev/shm
tmpfs 1G 1G 1G 1% /run/lock
/dev/sda1 1G 1G 1G 2% /boot/efi
nfs.cs.usna.edu:/home ?????G ???G ?????G ??% /home
9 drives in RAID6, what is the size
of each of the 9 drives?
500MB/s, lets imagine that
our RAID controller adds no latency and can stripe across the data drives perfectly.
What is the maximum possible write speed to this array?
Filesystem 1G-blocks Used Available Use% Mounted on
backups.cs.usna.edu:/snapshot/shared 204081G 13097G 190984G 7% /backups
Has an array of 12 Hard Disk Drives (with about 150MB/s sustained write speeds) in RAID6.
What are the sizes of the individual drives and whats the total possible throughput on the system?