As we know, VMware vSAN is a software-defined storage platform that is fully integrated with vSphere. vSAN creates an object-based distributed storage solution by aggregating the locally attached storage of the hosts that are part of the vSAN cluster. The vSAN SPBM (Storage Policy Based Management) framework lets availability and performance requirements be defined through storage policies. Instead of reading data directly from the capacity disk, vSAN first checks whether the requested data is available on the caching disk. Similarly, vSAN writes data to the caching disk first before writing it back to the capacity disk. In this post I will discuss how the caching algorithms differ for read and write caching, as well as for hybrid and all-flash configurations.

Read Cache

VMware vSAN uses the SSD device of each disk group as the “performance tier”, that is, for caching. The purpose of the Read Cache is to serve the highest possible ratio of read operations from data staged on the SSD and to minimize the read operations that must be served by the capacity disks. Read caching is used only in hybrid configurations, where the capacity tier consists of magnetic disks. In an all-flash configuration, the capacity SSDs can serve a large number of IOPS even for random workloads, so read operations are served directly by the capacity SSDs and the caching SSD is used only as a Write Cache.

In a hybrid vSAN configuration, 70% of the caching SSD is used as Read Cache by default. VMware does not recommend changing this default, but the value can be adjusted in certain scenarios if VMware support recommends it after analyzing the use case and workloads.

The Read Cache is organized in terms of “cache lines”, currently 1 MB in size. Data is fetched into the Read Cache (RC) and evicted, when needed, at the granularity of a cache line. Depending on the memory available in the system, vSAN also maintains a small in-memory read cache in addition to the Read Cache on SSD. To track the state of both the in-memory and the SSD Read Cache, vSAN maintains in-memory metadata. These metadata structures are compressed to avoid imposing any substantial CPU overhead or memory usage. vSAN does not keep track of the Read Cache contents across a host power cycle. After a power cycle, the Read Cache is repopulated from scratch.
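
To make the cache-line granularity concrete, here is a minimal sketch of how a read request maps onto 1 MB cache lines. The function name and the byte-offset addressing are assumptions made for illustration only; they are not vSAN internals.

```python
CACHE_LINE_BYTES = 1024 * 1024  # 1 MB cache line, as described above


def cache_lines_for_read(offset: int, length: int) -> list[int]:
    """Return the indices of the cache lines touched by a read.

    offset and length are in bytes (hypothetical addressing scheme).
    """
    if length <= 0:
        return []
    first = offset // CACHE_LINE_BYTES
    last = (offset + length - 1) // CACHE_LINE_BYTES
    return list(range(first, last + 1))


# A 4 KB read that sits entirely inside one cache line.
print(cache_lines_for_read(offset=3 * CACHE_LINE_BYTES + 8192, length=4096))

# A 128 KB read that straddles a cache-line boundary touches two lines.
print(cache_lines_for_read(offset=CACHE_LINE_BYTES - 65536, length=131072))
```

Even a small read is tracked at the level of a whole cache line, which is why a miss on a few kilobytes leads to a full 1 MB line being brought into the cache.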

How Read Cache Works

When a read operation hits the vSAN layer, vSAN checks the in-memory data structures to find out whether the requested logical block is available in the Read Cache. If the requested block is partially or fully missing from the Read Cache, 1 MB of space is allocated in the Read Cache for the new cache line, evicting existing cache lines if needed. Each missing cache line is read from the HDD as multiple 64 KB chunks rather than as a single 1 MB read. This fragmentation reduces the queuing penalty incurred by other operations sent to the HDD just after the RC miss. vSAN issues the reads for the 64 KB chunks that contain the referenced data first, so that the original operation can complete immediately. The rest of the cache line in the RC is populated asynchronously. Once the entire cache line has been read from the HDD and written to the RC, the 1 MB in-memory buffer is added to the in-memory RC.
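
The chunk ordering described above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the request is assumed to fall inside a single cache line, and the order of the background chunks (ascending) is my own choice, since only the "referenced data first" rule is described here.

```python
CACHE_LINE_BYTES = 1024 * 1024  # 1 MB cache line
CHUNK_BYTES = 64 * 1024         # 64 KB read unit toward the HDD


def plan_cache_line_fill(offset_in_line: int, length: int) -> tuple[list[int], list[int]]:
    """Split the 16 chunks of one cache line into (urgent, background).

    urgent:     chunks that hold the requested bytes, read first.
    background: the remaining chunks, filled asynchronously afterwards.
    """
    if length <= 0 or offset_in_line < 0 or offset_in_line + length > CACHE_LINE_BYTES:
        raise ValueError("request must fall inside a single cache line")
    first = offset_in_line // CHUNK_BYTES
    last = (offset_in_line + length - 1) // CHUNK_BYTES
    urgent = list(range(first, last + 1))
    total_chunks = CACHE_LINE_BYTES // CHUNK_BYTES
    # Assumption: background chunks are fetched in ascending order.
    background = [c for c in range(total_chunks) if c < first or c > last]
    return urgent, background


# An 8 KB read at offset 200 KB inside the cache line.
urgent, background = plan_cache_line_fill(offset_in_line=200 * 1024, length=8 * 1024)
print("read first:", urgent)
print("fill later:", background)
```

In this example only one 64 KB chunk has to come back from the HDD before the guest read can be answered; the other 15 chunks are fetched in the background, so other I/O queued on the HDD is not stuck behind a single 1 MB read.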

Write Cache

VMware vSAN uses a write cache in both hybrid and all-flash configurations. In a hybrid configuration, 30% of the caching-tier SSD is used for Write Cache by default. In an all-flash configuration, 100% of the caching-tier SSD is used for Write Cache.
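
As a quick sizing aid, the split can be written as a small calculation. This is a sketch only: the function and parameter names are hypothetical, the 70/30 defaults are the hybrid defaults given in this post, and the 600 GB ceiling is the all-flash write-buffer maximum mentioned in this post for all-flash disk groups.

```python
def split_cache_tier(cache_ssd_gb: float, all_flash: bool,
                     read_percent: int = 70,
                     write_buffer_cap_gb: float = 600.0) -> dict[str, float]:
    """Return how a caching-tier SSD is divided between read cache and write buffer."""
    if cache_ssd_gb < 0:
        raise ValueError("cache_ssd_gb must not be negative")
    if all_flash:
        # All-flash: no read cache, whole device (up to the cap) buffers writes.
        return {"read_cache_gb": 0.0,
                "write_buffer_gb": float(min(cache_ssd_gb, write_buffer_cap_gb))}
    # Hybrid: default 70% read cache, remaining 30% write buffer.
    read_gb = cache_ssd_gb * read_percent / 100
    return {"read_cache_gb": read_gb, "write_buffer_gb": cache_ssd_gb - read_gb}


print(split_cache_tier(400, all_flash=False))
print(split_cache_tier(800, all_flash=True))
```

For a 400 GB caching SSD in a hybrid disk group this gives 280 GB of read cache and 120 GB of write buffer. For an 800 GB caching SSD in an all-flash disk group only 600 GB is used as write buffer.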

In a virtualized environment, the storage workload is almost always random. Magnetic disks (HDDs) perform decently under sequential workloads but badly under random workloads, where IOPS and latency are the key performance metrics. For a random workload, sending write operations directly to spinning disks is not a good idea. vSAN therefore uses write-back caching in both hybrid and all-flash configurations.

In a hybrid vSAN configuration, write-back caching is used purely for performance. vSAN stages all write operations in the write-back buffer section of the SSD. The key objective of staging writes there is to de-stage the written data in a way that produces a near-sequential write workload for the HDDs that form the capacity tier of the disk group.
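
The idea of turning random writes into a near-sequential stream can be illustrated with a toy write-back buffer. This is not vSAN's actual de-staging algorithm, which is not detailed here. The class, its methods, and the "sort by logical block address and merge neighbours" strategy are assumptions chosen only to show the principle.

```python
class WriteBackBuffer:
    """Toy write-back buffer keyed by logical block address (LBA)."""

    def __init__(self) -> None:
        self._dirty: dict[int, bytes] = {}

    def write(self, lba: int, data: bytes) -> None:
        # A write is staged in the buffer; rewriting the same block
        # simply replaces the staged copy, so it is de-staged only once.
        self._dirty[lba] = data

    def destage(self) -> list[tuple[int, list[bytes]]]:
        """Drain the buffer as runs of consecutive LBAs in ascending order.

        Each run is (start_lba, [block, block, ...]) and could be issued
        to the HDD as one sequential write.
        """
        runs: list[tuple[int, list[bytes]]] = []
        for lba in sorted(self._dirty):
            if runs and lba == runs[-1][0] + len(runs[-1][1]):
                runs[-1][1].append(self._dirty[lba])  # extends the current run
            else:
                runs.append((lba, [self._dirty[lba]]))  # starts a new run
        self._dirty.clear()
        return runs


buf = WriteBackBuffer()
for lba in [42, 7, 43, 8, 9, 7]:  # random arrival order, block 7 written twice
    buf.write(lba, b"data")

for start, blocks in buf.destage():
    print(f"sequential write: start LBA {start}, {len(blocks)} blocks")
```

Six random writes from the guest become two sequential writes toward the HDD (three blocks starting at LBA 7 and two blocks starting at LBA 42), which is the kind of workload spinning disks handle well.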

In all-flash disk groups, the entire caching-tier SSD, up to a maximum of 600 GB, is used as the write-back buffer. Unlike in the hybrid configuration, the purpose of the write buffer here is not performance but endurance. The write buffer absorbs the highest rate of write operations on a high-endurance (caching-tier) SSD, so that only a small stream of data needs to be written to the capacity flash tier. This approach allows cheaper, lower-endurance SSDs to be used for capacity and high-endurance SSDs for caching. Because capacity-tier SSDs can serve a large number of read IOPS, the caching-tier SSD does not need to be used for read caching.


I hope this has helped you understand how read and write caching work in VMware vSAN hybrid and all-flash architectures. Thanks for reading!