Storage Spaces Direct Explained – Applications & Performance


Microsoft SQL Server product group announced that SQL Server, either virtual or bare metal, is fully supported on Storage Spaces Direct. The Exchange Team did not have a clear endorsement for Exchange on S2D and clearly still prefers that Exchange is deployed on physical servers with local JBODs using Exchange Database Availability Groups or that customers simply move to O365.

Microsoft showed all kinds of performance #s but these are using all NVMe SSD systems and real-world workloads like 100% 4k Random Reads.
Much like VSAN, Storage Spaces is implemented in-kernel. Their messaging is very similar as well, claiming more efficient IO path, and CPU consumption typically way less than 10% of system CPU. Like VSAN, the exact overhead of S2D is difficult to measure.
Microsoft is pushing NVMe Flash Devices for S2D and here are some examples of their positioning.
Their guidance was to avoid NVMe devices if your primary requirement is capacity as today you will pay a significant premium $/GB.
image034 image035
Where NVMe shines is on reduced latency and increased performance with NVMe systems driving 3.4x more IOPs than a similar SATA SSD on S2D.
image036 image037
There is also a significant benefit to CPU consumption with NVMe consuming nearly 50% less CPU than SATA SSDs on S2D.
I also want to point out that the Azure Storage team is working very closely with Intel and Micron and will be moving parts of Azure to 3D Xpoint as soon as possible. This will filter down to S2D at some point, and we should expect them to be close to the bleeding edge for supporting new storage class memory technologies.

Storage Spaces Direct will scale up to 16 nodes. In earlier Tech Preview releases they supported a minimum cluster size of 4 nodes. Recently they dropped that to 3 nodes and this week at Ignite they announced support for 2-node configurations. The 2-node configurations will use 2-way mirroring and require a separate witness that can be deployed on-premise or as a remote witness in Azure. Support for min 2 node configs does give them an advantage in ROBO and mid-market especially when low-cost is more important than high availability.

S2D will support both scale-up (adding additional local disk) and scale-out (with support for adding nodes in increments of 1).
image039 image040 image041 image042

Product Positioning

Microsoft’s guidance is for customers to use smaller hyper-converged configurations for ROBO and small departmental workloads where cost efficiency is the primary driver. For larger enterprises and hosters/service providers, Microsoft recommends a converged model that allows the independent scaling of compute and storage resources.image043
So How Do Customers Buy Storage Spaces Direct?

Storage Spaces Direct is a feature of Windows Server 2016 and customers get it for free with DataCenter Edition. Customers will have the option of DIY or purchase one of the new Storage Spaces Direct reference architecture solutions from one of 12 different partners.
With previous storage spaces offerings in Server 2012 and 2012R2, Microsoft put the technology out there for the DIY crowd and hoped that the server vendors would find the technology interesting enough to add to their portfolios. The problem was it needed JBOD shelves and in most server vendor organizations, JBODs fell under the storage teams, not the server teams. There was no way that any storage team was going to jeopardize their high margin traditional storage business by offering low margin Storage Spaces based JBOD solutions. Most vendors didn’t even want to sell JBODs at all. For example, Dell typically overpriced their JBODs to make EqualLogic look like a good deal at just a 15% uplift from a basic JBOD shelf…. much like movie theaters get us to buy the large popcorn for 50 cents more.

With Storage Spaces Direct, Microsoft is now dealing with the server part of these organizations… and all these guys care about is selling more servers. So Spaces went from having no partner interest to having support from all of the major server vendors.

However, since S2D is free with Windows and channel partners only get paid for the server sale, there is little incentive for them push S2D over other HCI options on these platforms. Therefore, I suspect that the majority of S2D adoption should come from customers asking to buy it rather than partners pushing it as an option.
So here is what the partner ecosystem looks like today.
image045 image046

To formalize this, Microsoft created a new program called Windows Server Software Defined (WSSD) allowing partners to submit validated WSSD Reference Architectures. Microsoft provides the validation tools and methodology and the partner does the testing. They get a Windows Server 2016 Certified Logo plus SDDC Additional Qualifiers.
image047 image048

Partners can offer their choice of Hyper-Converged or Converged configurations. Here’s where the classic Microsoft unnecessary complexity comes in… Within Hyper-Converged there are two additional options – Standard and Premium. Premium has some additional SDN and Security features turned on, but it’s simply a configuration thing. All of these come with Datacenter Edition so there is no cost or licensing difference.
image049 image050

Here are a few examples of the offerings. S2D offerings will be available starting in mid-October as soon as Server 2016 goes GA.
image050 image051 image052
You may be asking who is responsible for support? Because it’s just a reference architecture, there is a split support model. Customers will call the server vendor for hardware issues and Microsoft for software issues.


Storage Spaces has come a long way since Server 2012 and will be considered a viable option for customers looking at software-defined storage solutions. Some of the customers perceived advantages of S2D will be… low cost, min 2-node config, a broad choice of hardware vendors, storage QoS, NVMe support, single vendor software stack, and choice of deployment model (Hyper-Converged or Converged). Probably the most important of those is the price. Understanding the differences will be key. It’s tough to compete against ‘good enough’ and ‘free’.

Microsoft has not been very successful driving Storage Spaces adoption in the last 2 releases. Part of this is due to product immaturity, but most of this is because they didn’t build any real sales program around it. This hasn’t really changed with the WSSD Reference Architecture program. The big boys like Dell, HP, and Cisco are not going to position S2D over their own HCI offerings and the smaller players like SuperMicro, DataON and RAID Inc will never drive any significant adoption. Regardless of hardware platform, there is a very little incentive for the channel to sell S2D reference architectures over other HCI solutions (where they get paid for both the SW+HW sale). So without a strong sales program, I don’t believe that we will see S2D emerge as a big market share anytime soon.

Until next time, Rob.

Storage Spaces Direct Explained – Management & Operations

Management & Operations
Good day everyone. It been a few weeks, like busy with work and such. Anyways, this post will go into how Management & Operations are done in S2D.  Now, my biggest pet peeve is complex GUI management and yet again, Microsoft doesn’t disappoint.  It still a number of steps in different interfaces to bring up S2D, Check out Aidan Finns blog post on disaggregated management from last year.  It still rings true to this day with the release of 2016. It shouldn’t be this complex IMO 🙁 That being said, let move to the details.

Management & OperationsManagement & Operations

Microsoft is pushing everyone to use PowerShell as the primary management tool for Storage Spaces, but you can also manage it with a combination of Windows Failover Cluster Manager, SCVMM, and SCOM as mentioned above. So if you are good at Powershell, management is fairly simple. If not, then you have the classic switching between different tools management experience :(. This is why everyone really needs to start their PowerShell training now, to survive as an architect in Microsoft land going forward ;).

There is a Health Service built into Windows Server 2016 that provides some decent system health and status information for Storage Spaces. I just saw a few demos at ignite16 and have not played with it yet, so I’ll have to dig into this further and see how they stack up in a future post.
Management & OperationsManagement & Operations
S2D supports cluster aware updating that integrates with the Windows Update Service. Like VSAN, because they run in kernel, they need to live migrate VMs off the host server, perform the update, reboot, and then migrate everything back. I’ll note that this is only the case for the hyper-converged deployment model. In a converged model where the VMs are on a separate compute tier, you can update the storage controllers one at a time fairly seamlessly without impacting VMs on the separate compute tier.

While I am not a big fan of the management,  this could give rise to tools like 5nine if they decide to support S2D management. Next up. Application and Performance, Until next time, Rob.

Storage Spaces Direct Explained – Storage QOS & Networking

Storage QOS & NetworkingYo everyone…This is going to be a short blog post in this series. I am just covering Networking and Storage QoS as it pertains to S2D. There are the technologies the bind S2D together.
Storage QoS

S2D is using the Storage (QoS) Quality of Service that ships with Windows Server 2016 which provides standard min/max IOPS and bandwidth control. QoS policy can be applied at the VHD, VM, Groups of VMs, or Tenant Level. Benefits include:

  • Mitigate noisy neighbor issues. By default, Storage QoS ensures that a single virtual machine cannot consume all storage resources and starve other virtual machines of storage bandwidth.
  • Monitor end to end storage performance. As soon as virtual machines stored on a Scale-Out File Server are started, their performance is monitored. Performance details of all running virtual machines and the configuration of the Scale-Out File Server cluster can be viewed from a single location
  • Manage Storage I/O per workload business needs Storage QoS policies define performance minimums and maximums for virtual machines and ensures that they are met. This provides consistent performance to virtual machines, even in dense and overprovisioned environments. If policies cannot be met, alerts are available to track when VMs are out of policy or have invalid policies assigned.

Storage QOS & NetworkingWhat’s New in Networking with S2D?
In Windows Server 2016, they added Remote Direct Memory Access (RDMA) support to the Hyper-V virtual switch.
For those that don’t know what RMDA is it technology that allows direct memory access from one computer to another, bypassing TCP layer, CPU , OS layer and driver layer. Allowing for low latency and high-throughput connections. This is done with hardware transport offloads on network adapters that support RDMA.
Back to Hyper-V virtual switch support for RDMA.  This allows you to configure regular or RDMA enabled vNICs on top of a pair of RDMA capable physical NICs. They also added embedded NIC teaming or Switch Embedded Teaming (SET).
SET is where NIC teaming and the Hyper-V switch is a single entity and can now be used in conjunction with RDMA NICs, wherein Windows 2012 Server you needed to have separate NIC teams for RDMA and Hyper-V Switch.
The images below illustrates the architecture changes between Windows Server 2012 R2 and Windows Server 2016.
Storage QOS & Networking
Storage QOS & NetworkingNext up…Management and Operations…

Until next time, Rob

Storage Spaces Direct Explained – Fault Tolerance and Multisite Replication

funniest-construction-mistakes-25Fault Tolerance…What does it mean?  Let me break it down simply. Pictured above is just a bad design, not fault tolerance. This is not really what fault tolerance means. Having two or more of something is one factor, but how it’s implanted is just as important.  Fault Tolerance incorporates two very important principles, High availability and Redundancy.
Now if we had a few toilets side by side and kept only 1 open and the other 2 on standby. Also, if it could move the user automatically to another toilet during a failure, then it technically it would be fault tolerant. Anyways, let’s move on from toilets to the real world. 🙂
stalls-3Simply, Fault Tolerance is the ability to continue non-stop when a hardware failure occurs. A fault-tolerant system is designed from the ground up for reliability by building multiples of all critical components, such as CPUs, memories, disks and power supplies into the same computer. In the event one component fails, another takes over without skipping a beat.
Many systems are designed to recover from a failure by detecting the failed component and switching to another computer system. These systems, although sometimes called fault tolerant, are more widely known as “high availability” systems, requiring that the software re-submits the job when the second system is available.
True fault tolerant systems with redundant hardware are the most costly because the additional components add to the overall system cost. However, fault tolerant systems provide the same processing capacity after a failure as before, whereas high availability systems often provide reduced capacity. Ok, let move on to fault tolerance in S2D.
Fault Tolerance in S2D

Storage Space Direct (S2D) uses 3-way mirroring and will spread those mirrors across 3 different servers in the cluster. S2D supports full chassis and rack awareness and gives you the option to distribute data copies across these fault domains.
For disk failures, S2D also uses a self-healing approach… in basic terms, S2D offlines the disk and rebuilds the data copy on another node in the cluster. Replacing a drive adds capacity back into the system.  This is important note as not all HCI vendors support self-healing, For example, on VSAN and some other vendors, disk failures take out entire vDisks.
Fault Tolerance Fault Tolerance Fault Tolerance Fault Tolerance
Multisite Replication

S2D uses Storage Replica (that ships with Windows Server 2016) for synchronous or async replication. They support both stretched clusters and cluster to cluster DR. Storage Replica is part of Windows Server  can be used for other data replication needs outside of S2D.
Fault ToleranceFault ToleranceOk…Next up, Storage QOS and Networking..

Until next time, Rob….

Storage Spaces Direct Explained – ReFS, Multi-Tier Volumes and Erasure Coding

Here’s where we dive in and get dirty…but I promise by the end of my series, you will smiling like my friend here. I am planning a surprise with special guest bloggers. Stayed Tuned. Now one to the show…..
Storage Spaces Direct Explained ReFS

The NEW ReFS File System, Multi-Tier Volumes and Erasure Coding

Storage Spaces Direct Explained ReFSLike S2D, the ReFS file system actually isn’t new either, they have been working on it for several releases now also.  In Windows Server 2016, it finally drops the tech preview label and is now ready for production.  And there is a lot of benefits… like volume creation doesn’t have to zero out the volume for 10 minutes like NTFS. It’s just a metadata operation that is effectively instantaneous now, I’m just going to focus on the couple of benefits that ReFS has for S2D.
For those not familiar Erasure coding (EC) and to prepare you for the next part, EC is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces and stored across a set of different locations.
The original goal of EC was to enable data that becomes corrupted at some point in the storage process to be reconstructed by using information about the data that’s stored elsewhere.  Erasure codes are great, because of their ability to reduce the time and overhead required to reconstruct data. The drawback of erasure coding is that it can be more CPU-intensive, and that can translate into increased latency.
Now all that being said, classic erasure codes were designed and optimized more for communication, not for storage. Naively applying classic erasure codes in storage is okay, but is missing enormous efficiencies. Microsoft has developed their own erasure codes optimized for storage called Local Reconstruction Codes (LRC). I will cover this brieifly further down in the post.
Now back on to S2D…For data protection, S2D uses either 3-way mirroring or distributed parity with EC.  Mirroring gives you great write performance, but only 33% data efficiency.  EC gives you good data efficiency, but random write performance isn’t great for hot data.  ReFS supports the ability to combine different disk tiers using different parity schemes in the same vDisk. This allows S2D to do real-time data tiering by writing new data to the mirror tier and then automatically rotating cold data out to the parity tier and applying the erasure code on data rotation.
It is important to note that ReFS does not currently support Deduplication.  There was a question on this in every session and MSFT says that this is all the ReFS is currently focused on. So we’ll expect to see it land in ReFSv3. For now, customers can get dedupe with S2D by using NTFS. 🙁
Storage Spaces Direct Explained ReFS Storage Spaces Direct Explained ReFSNote if you only have two types of storage then the highest performing is used for the cache while the other type will be divided between performance and capacity with the different resiliency option (mirror vs parity) providing the performance/capacity difference between the tiers. If you only have one type of storage then the cache is disabled and the disks divided between performance and capacity like the previously mentioned case.
For non-Storage Spaces Direct only two tiers, of storage are supported like Windows Server 2012 R2, i.e. SSD and HDD, there is no cache. If you had NVMe storage that could be the “hot” tier while the rest of storage (SSD, HDD) could be the “cold” tier (you name the tiers whatever you want) but you cannot use three tiers.
Storage Spaces Direct Explained ReFS Storage Spaces Direct Explained ReFSStorage Spaces Direct Explained ReFSDuring Ignite 2016, Microsoft took many shots at VMware. Microsoft said that there’s a right way and a wrong way to do erasure coding.  “When you do it the wrong way, performance sucks and you have to limit it to all-flash configurations.”
Microsoft research is using a new technique called “Local Reconstruction Codes”. It uses smaller groups within the vDisk that allows them to recover from failures much faster by not having to reconstruct data from across the entire pool. This combined with multi-tier volumes gives S2D good performance, even on hybrid systems. Sounds like a technology that I seen before. Hmmm..I wonder where…….  😉
Storage Spaces Direct Explained ReFSOk, that’s all for now. next up, Fault Tolerance and Multisite Replication with S2D….

Until Next time, Rob….

Storage Spaces Direct Basics – Explained

'Steno Keypads 50% OFF' 'So, would you like the model that only types verbs, or the one that only types nouns?'Storage Spaces Direct BasicsStorage Spaces Direct BasicsLike anything else, I’m going to start with the basics of the stack and then dive into details of each component over the next few blog posts. There’s a lot to digest…So let’s get rolling…
As mentioned in my previous post, S2D can be deployed in either a more traditional disaggregated compute model or as a Hyperconverged model as shown below:
Storage Spaces Direct Basics

Here are the basic components of the stack…

Failover Clustering The built-in clustering feature of Windows Server is used to connect the servers.

Software Storage Bus – The Software Storage Bus is new in S2D. The bus spans the cluster and establishes a software-defined storage fabric where all the servers can see all of each other’s local drives.

Storage Bus Layer Cache – The Software Storage Bus dynamically binds the fastest drives present (typically  SSDs) to slower HDDs to provide server-side read/write caching. The cache is independent of pools and vDisks, always-on, and requires no configuration.

Storage Pool – When an IT Admin enables storage spaces, all of the eligible drives (excludes boot drives, etc.) discovered by the storage bus. Disks are grouped together to form a pool.  It’s created automatically on setup, and by default, there is only one pool per cluster.  IT Admin’s can configure additional pools, but Microsoft recommends against it.

Storage Spaces – From the pool, Microsoft’s carves out ‘storage spaces’ or essentially virtual disks. The vDisks can be defined as a simple space (no protection), mirrored space (distributed 2-way or 3-way mirroring), or a parity space (distributed erasure coding). You can think of it as distributed, software-defined RAID using the drives in the pool.  IT Admin’s can choose to use the new ReFS file system (more on this later) or traditional NTFS.

Resilient File System (ReFS)  ReFS is the purpose-built filesystem for virtualization. This includes dramatic accelerations for .vhdx file operations such as creation, expansion, and checkpoint merging. It also has built-in checksums to detect and correct bit errors. ReFS also introduces real-time tiers. This allows the rotation data between so-called “hot” and “cold” storage tiers in real-time based on usage.

Cluster Shared Volumes – Each vDisk is a cluster shared volume that exists within a single namespace so that every volume appears to each host server as being mounted locally.

Scale-Out File Server – The scale-out file server only exists in converged deployments and provides remote file access via SMB3.

Networking Hardware  Storage Spaces Direct uses SMB3, including SMB Direct and SMB Multichannel, over Ethernet to communicate between servers. Microsoft strongly recommends 10+ GbE with remote-direct memory access (RDMA). IT Admin’s can either use iWARP or RoCE (RDMA over Converged Ethernet).

In Windows Server 2016, Microsoft has also incorporated Storage Replica, Storage QoS, and a new Health Service. I’ll cover each of these areas in a little more detail in a later post with regards to S2D.
Storage Spaces Direct Basics Storage Spaces Direct BasicsStorage Hardware

Microsoft supports hybrid or all-flash configurations.  Each server must have at least 2 SSDs and 4 additional drives. Microsoft has support for NVMe in the product today.  IT Admin’s can use a mixture of NVME, SSD, or HDDs in a variety of tiering models. The SATA and SAS devices should be behind a host-bus adapter (HBA) and SAS expander.
Storage Spaces Direct Basics Storage Spaces Direct Basics
Now that we have covered the basics, next I will dive into how each of the components work.  Next up, ReFS, Multi-Tier Volumes, Erasure Coding and tigers oh my… 🙂

Until next time, Rob…