Sizing your storage system is always a crucial step when you virtualize your business-critical applications, because I/O characteristics differ significantly between applications depending on their access patterns.
The first step in determining the requirements for your storage system is to understand the I/O pattern of your application. The storage system has to receive and process every read and write the application sends, so you need to know how large those requests are and how often they arrive before you can size that system properly.
Application I/O characteristics influence both the overall performance of the storage system and the design of the storage solution.
Random and Sequential
I/O is characterized as either random or sequential. Random I/O refers to successive read-write operations from noncontiguous addresses – accesses that are spread across the addressable capacity of the LUN. Examples of applications that largely generate random I/O include messaging and online transaction processing (OLTP) applications.
Sequential I/O refers to successive read-write operations from contiguous addresses – one logical block address after another. In sequential I/O access, disk seek time is reduced because the read-write head moves little to access the next block. Examples of sequential I/O include data backup.
Reads and Writes
Another aspect of the I/O workload is the ratio of read I/Os to write I/Os generated by an application. The sum of the read rate and the write rate is the I/O rate (the number of I/O operations per second, or IOPS). The application's I/O rate is one of the important factors that determine the minimum number of disks required for the application. Cache in the storage system also plays an important role in improving performance.
Now let us go through an example situation and its solution.
My client is deploying a new business application in their environment. The new application requires 1 TB of storage space for business and application data. During peak workload the application is expected to generate 4900 IOPS (I/Os per second) with a typical I/O size of 4 KB.
The available disk drive option is a 15,000 rpm drive with 100 GB capacity.
Other specifications of the drive are:
Average seek time = 5 ms
Data transfer rate = 40 MB/s
Now we have to calculate the required number of disk drives that can meet both capacity and performance requirements of this application.
1. Calculate the time required to perform one I/O, which depends on the disk service time.
Disk service time = average seek time + rotational latency + data transfer time
Average seek time (given) = 5 ms.
Rotational latency is half of the time taken for a full rotation. Since the rotation speed is given as 15,000 revolutions per minute, one revolution takes 1 / (15000/60) sec = 4 ms.
Therefore the time taken for half a revolution is 0.5 / (15000/60) sec = 2 ms.
The data transfer rate is 40 MB/s, so the transfer of a 4 KB I/O takes 4 KB / (40 MB/s) = 0.1 ms.
Therefore, the time required to perform one I/O is 5 ms + 2 ms + 0.1 ms = 7.1 ms.
2. Now calculate the maximum number of IOPS a disk can perform, which is 1 / 7.1 ms = 140.8, or about 140 IOPS.
For acceptable response time, disk controller utilization must be less than 70%, so the maximum number of IOPS a disk can perform at 70% utilization is 140 × 0.7 = 98 IOPS.
3. Now calculate the number of disks required to meet:
a. The application's performance requirement = 4900 / 98 = 50 disks
b. The application's capacity requirement = 1 TB / 100 GB = 10 disks
4. Finally, disks required = Maximum (Capacity, Performance)
= Maximum (10, 50) = 50 disks
So you see in this exercise that although 10 disks would have been enough from a capacity perspective, we need 50 disks to meet the performance requirement.
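The same exercise can be written as a short Python sketch, which makes it easy to rerun with other drive specifications. The function names (disk_service_time_ms, ceil_div, disks_required) are my own for illustration, and the sketch assumes decimal units (1 TB = 1000 GB, 1 MB = 1000 KB) and rounds the per-disk IOPS down, as the hand calculation above does.

```python
def disk_service_time_ms(seek_ms, rpm, io_kb, transfer_mb_s):
    """Service time for one I/O: seek + rotational latency + transfer."""
    # Rotational latency is half of one full revolution (60,000 ms per minute).
    rotational_latency_ms = 0.5 * 60_000 / rpm
    # With decimal units, KB divided by MB/s gives milliseconds directly.
    transfer_ms = io_kb / transfer_mb_s
    return seek_ms + rotational_latency_ms + transfer_ms


def ceil_div(a, b):
    """Integer division rounded up, so a partial disk counts as a whole disk."""
    return -(-a // b)


def disks_required(capacity_gb, peak_iops, disk_gb, service_ms, utilization_pct=70):
    """Return (disks for capacity, disks for performance, disks to buy)."""
    raw_iops = int(1000 // service_ms)                 # 1 / 7.1 ms, rounded down
    usable_iops = raw_iops * utilization_pct // 100    # stay below 70% utilization
    for_performance = ceil_div(peak_iops, usable_iops)
    for_capacity = ceil_div(capacity_gb, disk_gb)
    return for_capacity, for_performance, max(for_capacity, for_performance)


# Values from the example: 1 TB, 4900 IOPS, 4 KB I/Os, 15,000 rpm 100 GB drives.
service_ms = disk_service_time_ms(seek_ms=5, rpm=15000, io_kb=4, transfer_mb_s=40)
print(disks_required(capacity_gb=1000, peak_iops=4900, disk_gb=100, service_ms=service_ms))
# prints (10, 50, 50)
```

Changing a single input, for example the peak IOPS or the utilization ceiling, shows quickly whether capacity or performance drives the disk count.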
Let us look at the typical read/write ratios and I/O request sizes.
Typical read versus write ratios for common business applications are as follows:
- Online transaction processing (OLTP) — 67 percent reads and 33 percent writes.
- Decision support system (DSS) — also referred to as data warehouse or business intelligence. I/O load is 80 percent to 90 percent reads to data tables including frequent table scans (sequential reads).
- Backup — As long as the file system is not fragmented, file-based backups are sequential.
I/O Request Size
The size of the I/Os generated by an application varies with the type of application. Some of the overhead of executing an I/O is fixed. If data exists in large chunks, it is more efficient to transmit larger blocks, because a host can move data faster by using larger I/Os than smaller I/Os. The response time of each large transfer is longer than the response time of a single small transaction, but the combined service time of many smaller transactions is greater than that of a single transaction that contains the same amount of data.
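To put rough numbers on this, here is a minimal Python sketch that reuses the drive from the sizing example. It assumes that the fixed overhead is the 5 ms seek plus the 2 ms rotational latency, that every I/O pays this overhead in full (as it would for random access), and that 64 KB is the "large" request size; these figures are illustrative assumptions, not measurements.

```python
# Assumed fixed cost per I/O: 5 ms average seek + 2 ms rotational latency.
FIXED_OVERHEAD_MS = 5 + 2
# Transfer rate of the example drive, in MB/s (decimal units).
TRANSFER_MB_S = 40


def service_time_ms(io_kb):
    """Fixed overhead plus transfer time; KB divided by MB/s gives ms."""
    return FIXED_OVERHEAD_MS + io_kb / TRANSFER_MB_S


one_large = service_time_ms(64)        # a single 64 KB I/O
many_small = 16 * service_time_ms(4)   # the same 64 KB as sixteen 4 KB I/Os

print(f"one 64 KB I/O:     {one_large:.1f} ms")
print(f"sixteen 4 KB I/Os: {many_small:.1f} ms")
```

Under these assumptions the single large I/O takes longer than any one small I/O (about 8.6 ms versus 7.1 ms), yet moving the same 64 KB in 4 KB pieces costs roughly 113.6 ms in total, because the fixed overhead is paid sixteen times.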
UPDATE: One of my fellow vExperts, Michael Webster (VCDX #66), has written about the other storage considerations when you virtualize your business-critical applications. Have a look at his article; it covers the whole gamut.