EMC FAST Cache – Behind the scene

I know VCDX defense session is going on at this time when I am writing this and many of the aspiring candidates who have defended and going to defend in the future, may have their design based on EMC Storage. They might go for the EMC FAST in their design. I thought to use this opportunity and discuss the concept of EMC FAST Cache and the technology that works behind the scene.

I am not going to talk about what is EMC FAST Cache. Read more about it here.

Fast cache uses a map of pages, which is stored in the flash drives. Your hot data get promoted to flash devices from normal HDD and will be written to FAST cache, which improves response time and avoids FAST cache misses. In a write back operation when the data become cold, that data in FAST cache is then asynchronously written back to hard disk drives. FAST cache provides a much larger, scalable cache.

You can either enable FAST cache on traditional LUNs, or on all LUNs that are part of a storage pool. You can max allocate 2 TB of space to FAST cache. Now let me take a step ahead and show you the components in a FAST cache.

First major component in FAST Cache is memory map. It maintains information on the state of 64 KB chunks of storage and their contents in FAST cache. The size of the memory map increases linearly with the size of FAST cache being created.

Second component is a policy engine. It Manages the flow of I/O through the FAST cache. When a chunk of HDD data is accessed frequently, it is copied to flash drives and they are copied back to HDDs when other data becomes more heavily used. The decision on when these operations should take place is made by the policy engine.

If the host I/O request is for a read operation, and the target data is in the DRAM cache, the data is read from the DRAM cache. If the data is not in DRAM cache but in FAST cache, the I/O is redirected by the policy engine and the data is read from FAST cache; it is also put in the DRAM cache. However, if the data is not in FAST cache as well, the I/O path follows the same path it would if the storage array had no FAST cache.

If the host I/O request is a write operation, the data is written to DRAM cache first. When dirty pages in DRAM cache need to be flushed, they are written to the FAST cache if that data has already been promoted. Since data is written to Flash drives instead of HDDs, this operation is very fast and may help limit the number of dirty pages in DRAM cache. Data that is in FAST cache and used less frequently is asynchronously written back or demoted to HDD.

Data on HDDs that is frequently accessed is promoted to FAST cache, which is an asynchronous process. Data promotion into FAST cache depends on the number of accesses (read/write) within a 64 KB chunk of storage. Applications with a high locality rate will see a significant increase in throughput and response time performance.

You must have heard that picture says 1000 words :-). So let me show you four operations that happens in the FAST Cache. Following image shows a Cache Hit operation.

Cache-Hit

Cache hit happens when I/O request is received and forwarded to memory map and match is found and then the I/O request is served by FAST Cache.

Following image shows how cache miss happens in FAST Cache. It happens when there is no match found on the memory map and then the I/O is served by the HDD.

Cache-Miss
Following image shows the cache promotion event. It happens when the policy engine decides that some chunk of data needs to be promoted to the cache. Then the data from HDD get copied over to the cache and memory map reflects the changes in the cache and update its table.

Cache-Promotion

 

Following image shows Cache Write-Back operation. When the data become cold and not much hit happens for that set of data, policy engine pick those data blocks and ask cache driver to write those data blocks back to the HDD. Then the memory map update its table.

Cache-WriteBack

 

About Prasenjit Sarkar

Prasenjit Sarkar is a Product Manager at Oracle for their Public Cloud with primary focus on Cloud Strategy, Oracle Openstack, PaaS, Cloud Native Applications and API Platform. His primary focus is driving Oracle’s Cloud Computing business with commercial and public sector customers; helping to shape and deliver on a strategy to build broad use of Oracle’s Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) offerings such as Compute, Storage, Java as a Service, and Database as a Service. He is also responsible for developing public/private cloud integration strategies, customer’s Cloud Computing architecture vision, future state architectures, and implementable architecture roadmaps in the context of the public, private, and hybrid cloud computing solutions Oracle can offer.