Storage Spaces Direct Explained – ReFS, Multi-Tier Volumes and Erasure Coding


Here’s where we dive in and get dirty…but I promise that by the end of my series, you will be smiling like my friend here. I am planning a surprise with special guest bloggers. Stay tuned. Now on to the show…


The NEW ReFS File System, Multi-Tier Volumes and Erasure Coding


Like S2D, the ReFS file system isn’t actually new either; Microsoft has been working on it for several releases now. In Windows Server 2016, it finally drops the tech preview label and is ready for production. There are a lot of benefits. For example, volume creation doesn’t have to spend 10 minutes zeroing out the volume as it does with NTFS. It’s just a metadata operation that is effectively instantaneous. Here, I’m just going to focus on the couple of benefits that ReFS has for S2D.

For those not familiar with erasure coding (EC), and to prepare you for the next part: EC is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations.

The original goal of EC was to enable data that becomes corrupted at some point in the storage process to be reconstructed by using information about the data that’s stored elsewhere.  Erasure codes are great, because of their ability to reduce the time and overhead required to reconstruct data. The drawback of erasure coding is that it can be more CPU-intensive, and that can translate into increased latency.
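To make the fragment-and-rebuild idea concrete, here is a minimal sketch using a single XOR parity fragment, which is about the simplest erasure code there is. It is only an illustration of the principle and is not the code S2D uses; the function names and the three sample fragments are made up for the example.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments: list[bytes]) -> bytes:
    """Compute one parity fragment as the XOR of all data fragments."""
    return reduce(xor_bytes, fragments)


def rebuild(survivors: list[bytes], parity: bytes) -> bytes:
    """Recover the single missing data fragment from the survivors plus parity."""
    return reduce(xor_bytes, survivors, parity)


# Three equal-size data fragments (illustrative values only).
data = [b"S2D-", b"ReFS", b"-EC!"]
parity = encode(data)

# Pretend the drive holding the second fragment failed, then rebuild it
# from the two surviving fragments and the parity fragment.
recovered = rebuild([data[0], data[2]], parity)
assert recovered == data[1]
```

One parity fragment like this survives exactly one lost fragment. Real erasure codes add more redundant pieces so they can tolerate more failures, which is where the extra CPU cost comes from.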

Now all that being said, classic erasure codes were designed and optimized more for communication, not for storage. Naively applying classic erasure codes in storage is okay, but it misses enormous efficiencies. Microsoft has developed its own erasure codes optimized for storage, called Local Reconstruction Codes (LRC). I will cover this briefly further down in the post.

Now back to S2D… For data protection, S2D uses either 3-way mirroring or distributed parity with EC. Mirroring gives you great write performance, but only 33% data efficiency. EC gives you good data efficiency, but random write performance isn’t great for hot data. ReFS supports the ability to combine different disk tiers using different resiliency schemes (mirror and parity) in the same vDisk. This allows S2D to do real-time data tiering by writing new data to the mirror tier and then automatically rotating cold data out to the parity tier, applying the erasure code on data rotation.
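Here is a rough sketch of that write-then-rotate behaviour and what it does to raw capacity. The 3-way mirror comes from the text above; the 4+2 parity layout, the `TieredVolume` class, and the block names are assumptions picked for illustration, not the actual S2D layout or API.

```python
from dataclasses import dataclass, field

MIRROR_COPIES = 3                    # 3-way mirror, as described above
PARITY_DATA, PARITY_EXTRA = 4, 2     # hypothetical 4+2 parity layout


@dataclass
class TieredVolume:
    mirror: dict[str, int] = field(default_factory=dict)  # block id -> size in GB
    parity: dict[str, int] = field(default_factory=dict)

    def write(self, block_id: str, size_gb: int) -> None:
        """New or rewritten data always lands in the mirror tier."""
        self.parity.pop(block_id, None)
        self.mirror[block_id] = size_gb

    def rotate(self, cold_ids: set[str]) -> None:
        """Move cold blocks to the parity tier (erasure coding is applied here)."""
        for block_id in cold_ids & self.mirror.keys():
            self.parity[block_id] = self.mirror.pop(block_id)

    def raw_gb(self) -> float:
        """Raw capacity consumed across both tiers."""
        mirror_raw = sum(self.mirror.values()) * MIRROR_COPIES
        parity_raw = (sum(self.parity.values())
                      * (PARITY_DATA + PARITY_EXTRA) / PARITY_DATA)
        return mirror_raw + parity_raw

    def efficiency(self) -> float:
        """Usable data divided by raw capacity consumed."""
        raw = self.raw_gb()
        usable = sum(self.mirror.values()) + sum(self.parity.values())
        return usable / raw if raw else 0.0


vol = TieredVolume()
vol.write("a", 100)
vol.write("b", 100)
print(f"all hot:  {vol.raw_gb():.0f} GB raw, {vol.efficiency():.0%} efficient")

vol.rotate({"a"})  # block "a" has gone cold
print(f"one cold: {vol.raw_gb():.0f} GB raw, {vol.efficiency():.0%} efficient")
```

With everything in the mirror tier, 200 GB of data costs 600 GB raw, which is the 33% figure. Rotating half of it to the assumed 4+2 parity tier drops the raw footprint to 450 GB while new writes still get mirror performance.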

It is important to note that ReFS does not currently support deduplication. There was a question on this in every session, and Microsoft says that this is what the ReFS team is currently focused on. So we’ll expect to see it land in ReFSv3. For now, customers can get dedupe with S2D by using NTFS. 🙁


Note that if you only have two types of storage, the highest performing type is used for the cache, while the other type is divided between performance and capacity, with the different resiliency options (mirror vs. parity) providing the performance/capacity difference between the tiers. If you only have one type of storage, the cache is disabled and the disks are divided between performance and capacity as in the previous case.
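The rules above can be sketched as a small lookup. This is only my reading of the behaviour described in this post, not how S2D implements it: the `assign_roles` function and the speed ordering are assumptions, and the three-type case (fastest as cache, then performance, then capacity) is implied by the text rather than spelled out.

```python
from typing import Optional

# Fastest to slowest; this ordering is an assumption for the sketch.
SPEED_ORDER = ("NVMe", "SSD", "HDD")


def assign_roles(media_present: set[str]) -> dict[str, Optional[str]]:
    """Map the media types found in the servers to cache/performance/capacity roles."""
    ranked = [m for m in SPEED_ORDER if m in media_present]
    if not ranked:
        raise ValueError("no recognised media types")

    if len(ranked) == 1:
        # One type: no cache; mirror vs. parity separates the two tiers.
        only = ranked[0]
        return {"cache": None, "performance": only, "capacity": only}

    if len(ranked) == 2:
        # Two types: fastest is cache; the other is split into mirror and parity tiers.
        fast, slow = ranked
        return {"cache": fast, "performance": slow, "capacity": slow}

    # Three types (implied case): one media type per role.
    fastest, middle, slowest = ranked
    return {"cache": fastest, "performance": middle, "capacity": slowest}


for media in ({"HDD"}, {"SSD", "HDD"}, {"NVMe", "SSD", "HDD"}):
    print(sorted(media), "->", assign_roles(media))
```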

For non-Storage Spaces Direct deployments, only two tiers of storage are supported, as in Windows Server 2012 R2 (i.e., SSD and HDD), and there is no cache. If you had NVMe storage, that could be the “hot” tier while the rest of the storage (SSD, HDD) could be the “cold” tier (you can name the tiers whatever you want), but you cannot use three tiers.


During Ignite 2016, Microsoft took many shots at VMware. Microsoft said that there’s a right way and a wrong way to do erasure coding.  “When you do it the wrong way, performance sucks and you have to limit it to all-flash configurations.”

Microsoft Research is using a new technique called “Local Reconstruction Codes”. It uses smaller groups within the vDisk, which allows S2D to recover from failures much faster by not having to reconstruct data from across the entire pool. This, combined with multi-tier volumes, gives S2D good performance, even on hybrid systems. Sounds like a technology that I have seen before. Hmmm…I wonder where… 😉
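The “smaller groups” idea can be shown with a toy example. This sketch keeps only the locality part of LRC: each group of data fragments gets its own XOR parity, so a single lost fragment is rebuilt by reading just its group. Real LRC also adds global parities to survive multiple failures, which I leave out here; the group size, fragment values, and function names are all assumptions for illustration.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def local_parities(fragments: list[bytes], group_size: int) -> list[bytes]:
    """Compute one XOR parity per consecutive group of fragments."""
    groups = [fragments[i:i + group_size]
              for i in range(0, len(fragments), group_size)]
    return [reduce(xor_bytes, group) for group in groups]


def rebuild_local(fragments: list[bytes], parities: list[bytes],
                  group_size: int, lost_index: int) -> tuple[bytes, int]:
    """Rebuild one lost fragment from its own group; also return fragments read."""
    group = lost_index // group_size
    start = group * group_size
    end = min(start + group_size, len(fragments))
    peers = [fragments[i] for i in range(start, end) if i != lost_index]
    rebuilt = reduce(xor_bytes, peers, parities[group])
    return rebuilt, len(peers) + 1  # peers plus the group's parity


# Six data fragments in two local groups of three (illustrative values).
fragments = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
parities = local_parities(fragments, group_size=3)

rebuilt, reads = rebuild_local(fragments, parities, group_size=3, lost_index=4)
assert rebuilt == fragments[4]
print(f"rebuilt fragment 4 with {reads} reads instead of {len(fragments)}")
```

With a single parity over all six fragments, the same rebuild would have to read the five surviving fragments plus the parity. The local group cuts that to three reads, which is the faster-recovery point being made above.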

OK, that’s all for now. Next up: Fault Tolerance and Multisite Replication with S2D…

Until Next time, Rob….

Storage Spaces Direct Basics – Explained




Like anything else, I’m going to start with the basics of the stack and then dive into details of each component over the next few blog posts. There’s a lot to digest…So let’s get rolling…

As mentioned in my previous post, S2D can be deployed either as a more traditional disaggregated compute model or as a hyperconverged model, as shown below:


Here are the basic components of the stack…

Failover Clustering – The built-in clustering feature of Windows Server is used to connect the servers.

Software Storage Bus – The Software Storage Bus is new in S2D. The bus spans the cluster and establishes a software-defined storage fabric where all the servers can see all of each other’s local drives.

Storage Bus Layer Cache – The Software Storage Bus dynamically binds the fastest drives present (typically  SSDs) to slower HDDs to provide server-side read/write caching. The cache is independent of pools and vDisks, always-on, and requires no configuration.

Storage Pool – When an IT admin enables Storage Spaces Direct, all of the eligible drives (excluding boot drives, etc.) discovered by the storage bus are grouped together to form a pool. It’s created automatically on setup, and by default, there is only one pool per cluster. IT admins can configure additional pools, but Microsoft recommends against it.

Storage Spaces – From the pool, Microsoft carves out ‘storage spaces’, or essentially virtual disks. The vDisks can be defined as a simple space (no protection), a mirrored space (distributed 2-way or 3-way mirroring), or a parity space (distributed erasure coding). You can think of it as distributed, software-defined RAID using the drives in the pool. IT admins can choose to use the new ReFS file system (more on this later) or traditional NTFS.

Resilient File System (ReFS) – ReFS is the purpose-built filesystem for virtualization. This includes dramatic accelerations for .vhdx file operations such as creation, expansion, and checkpoint merging. It also has built-in checksums to detect and correct bit errors. ReFS also introduces real-time tiers. This allows the rotation of data between so-called “hot” and “cold” storage tiers in real time based on usage.

Cluster Shared Volumes – Each vDisk is a cluster shared volume that exists within a single namespace so that every volume appears to each host server as being mounted locally.

Scale-Out File Server – The scale-out file server only exists in converged deployments and provides remote file access via SMB3.

Networking Hardware – Storage Spaces Direct uses SMB3, including SMB Direct and SMB Multichannel, over Ethernet to communicate between servers. Microsoft strongly recommends 10+ GbE with remote direct memory access (RDMA). IT admins can use either iWARP or RoCE (RDMA over Converged Ethernet).

In Windows Server 2016, Microsoft has also incorporated Storage Replica, Storage QoS, and a new Health Service. I’ll cover each of these areas in a little more detail in a later post with regards to S2D.


Storage Hardware

Microsoft supports hybrid or all-flash configurations. Each server must have at least 2 SSDs and 4 additional drives. Microsoft has support for NVMe in the product today. IT admins can use a mixture of NVMe, SSD, or HDD drives in a variety of tiering models. The SATA and SAS devices should be behind a host-bus adapter (HBA) and SAS expander.


Now that we have covered the basics, next I will dive into how each of the components works. Next up: ReFS, Multi-Tier Volumes, Erasure Coding and tigers, oh my… 🙂

Until next time, Rob…