Storage Spaces Direct Explained – ReFS, Multi-Tier Volumes and Erasure Coding

Here’s where we dive in and get dirty…but I promise by the end of my series, you will be smiling like my friend here. I am planning a surprise with special guest bloggers. Stay tuned. Now on to the show…..

The NEW ReFS File System, Multi-Tier Volumes and Erasure Coding

Like S2D, the ReFS file system isn’t actually new either; Microsoft has been working on it for several releases now.  In Windows Server 2016, it finally drops the tech preview label and is ready for production.  There are a lot of benefits, like volume creation not having to zero out the volume for 10 minutes the way NTFS does; it’s just a metadata operation that is effectively instantaneous. Here I’m just going to focus on the couple of benefits that ReFS has for S2D.
For those not familiar with erasure coding (EC), and to prepare you for the next part: EC is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations.
The original goal of EC was to enable data that becomes corrupted at some point in the storage process to be reconstructed by using information about the data that’s stored elsewhere.  Erasure codes are great, because of their ability to reduce the time and overhead required to reconstruct data. The drawback of erasure coding is that it can be more CPU-intensive, and that can translate into increased latency.
Now all that being said, classic erasure codes were designed and optimized more for communication, not for storage. Naively applying classic erasure codes in storage is okay, but misses enormous efficiencies. Microsoft has developed its own erasure codes optimized for storage, called Local Reconstruction Codes (LRC). I will cover this briefly further down in the post.
Now back on to S2D…For data protection, S2D uses either 3-way mirroring or distributed parity with EC.  Mirroring gives you great write performance, but only 33% data efficiency.  EC gives you good data efficiency, but random write performance isn’t great for hot data.  ReFS supports the ability to combine different disk tiers using different resiliency schemes (mirror and parity) in the same vDisk. This allows S2D to do real-time data tiering by writing new data to the mirror tier, then automatically rotating cold data out to the parity tier and applying the erasure code on data rotation.
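To put numbers on the mirror vs. parity trade-off, here is a minimal sketch of the efficiency arithmetic. The 33% figure for 3-way mirroring comes from the paragraph above; the 4+2 parity layout and the 1 TB / 9 TB tier split are assumptions picked purely for illustration, not S2D defaults.

```python
def mirror_efficiency(copies: int) -> float:
    """Fraction of raw capacity that holds unique data under N-way mirroring."""
    return 1 / copies


def parity_efficiency(data_fragments: int, parity_fragments: int) -> float:
    """Fraction of raw capacity that holds unique data under erasure coding."""
    return data_fragments / (data_fragments + parity_fragments)


def raw_footprint_tb(mirror_tb: float, parity_tb: float,
                     copies: int = 3,
                     data_fragments: int = 4,     # assumed layout, illustration only
                     parity_fragments: int = 2) -> float:
    """Raw capacity consumed by a volume that has a mirror tier and a parity tier."""
    return (mirror_tb / mirror_efficiency(copies)
            + parity_tb / parity_efficiency(data_fragments, parity_fragments))


if __name__ == "__main__":
    print(f"3-way mirror efficiency: {mirror_efficiency(3):.0%}")
    print(f"4+2 parity efficiency:   {parity_efficiency(4, 2):.0%}")
    raw = raw_footprint_tb(mirror_tb=1, parity_tb=9)
    print(f"1 TB mirror + 9 TB parity consumes {raw:.1f} TB raw "
          f"({10 / raw:.0%} overall efficiency)")
```

The point of the multi-tier volume is visible in the last line: keeping only a small hot slice on the mirror tier pulls the overall efficiency well above what a pure 3-way mirror gives you.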
It is important to note that ReFS does not currently support deduplication.  There was a question on this in every session, and MSFT says that this is all that ReFS is currently focused on, so we’ll expect to see it land in ReFSv3. For now, customers can get dedupe with S2D by using NTFS. 🙁
Note that if you only have two types of storage, the highest performing type is used for the cache, while the other type is divided between performance and capacity, with the different resiliency options (mirror vs. parity) providing the performance/capacity difference between the tiers. If you only have one type of storage, the cache is disabled and the disks are divided between performance and capacity as in the previous case.
Outside of Storage Spaces Direct, only two tiers of storage are supported, as in Windows Server 2012 R2 (i.e. SSD and HDD), and there is no cache. If you had NVMe storage, that could be the “hot” tier while the rest of the storage (SSD, HDD) could be the “cold” tier (you can name the tiers whatever you want), but you cannot use three tiers.
During Ignite 2016, Microsoft took many shots at VMware. Microsoft said that there’s a right way and a wrong way to do erasure coding.  “When you do it the wrong way, performance sucks and you have to limit it to all-flash configurations.”
Microsoft Research is using a new technique called “Local Reconstruction Codes”. It uses smaller groups within the vDisk, which allows recovery from failures to happen much faster because data does not have to be reconstructed from across the entire pool. This, combined with multi-tier volumes, gives S2D good performance, even on hybrid systems. Sounds like a technology that I have seen before. Hmmm..I wonder where…….  😉
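To make the “smaller groups” idea concrete, here is a toy sketch. It is not Microsoft’s LRC implementation: it uses plain XOR parity per local group and leaves out the global parities a real LRC also carries. The function names and the fragment layout are made up for illustration. What it does show is that a lost fragment can be rebuilt by reading only its own group rather than every fragment in the stripe.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_all(fragments: list) -> bytes:
    """XOR a non-empty list of equally sized fragments together."""
    return reduce(xor_bytes, fragments)


def encode_local_groups(fragments: list, group_size: int) -> list:
    """Split fragments into local groups and attach one XOR parity to each group."""
    groups = [fragments[i:i + group_size]
              for i in range(0, len(fragments), group_size)]
    return [(group, xor_all(group)) for group in groups]


def reconstruct(group: list, parity: bytes, lost_index: int) -> bytes:
    """Rebuild one lost fragment from the survivors of its own group plus the local parity."""
    survivors = [f for i, f in enumerate(group) if i != lost_index]
    return xor_all(survivors + [parity])


if __name__ == "__main__":
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
    encoded = encode_local_groups(data, group_size=3)
    group, parity = encoded[0]
    # Pretend fragment 1 of the first group is lost; only that group is read.
    rebuilt = reconstruct(group, parity, lost_index=1)
    print(rebuilt, "rebuilt from", len(group) - 1, "survivors plus 1 local parity")
```

With six data fragments in two groups of three, the rebuild touches two survivors and one parity instead of the five survivors a single stripe-wide parity would need.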
Ok, that’s all for now. Next up, Fault Tolerance and Multisite Replication with S2D….

Until Next time, Rob….

Microsoft Exchange Best Practices on Nutanix

To continue on my last blog post on Exchange...

As I mentioned previously, I support SEs from all over the world. And again today, I got asked what the best practices are for running Exchange on Nutanix. Funny enough, this question comes in quite often.  Well, I am going to help resolve that. There’s a lot of great info out there, especially from my friend Josh Odgers, who has been leading the charge on this for a long time.  Some of his posts can be controversial, but the truth is always there.  He’s getting a point across.

This blog post will be updated on a regular basis as things change. It will also be moved to a permanent part of the netwatch.me resources section.  This is meant to be a general best practice guide to help with planning and maintaining a healthy Exchange environment on Nutanix.  I will call out hypervisor specifics when required.  Now on to the post…..

Let’s start out with the basics…

MS Exchange on Nutanix Support

Nutanix provides a 100% supported solution for MS Exchange running on vSphere, Hyper-V or Acropolis Hypervisor using iSCSI (Block storage)
Here is a breakdown of supported configurations by hypervisor:

vSphere (ESXi) Use In-Guest iSCSI (Volume Groups) for full support
Hyper-V Use SMB 3.0
AHV Use native vDisks (iSCSI) – SVVP Certification for AHV

Also, check out Josh’s post “Fight the FUD – Support for MS Exchange on Nutanix”, which outlines this very topic.  In summary, the customer has the choice to deploy in multiple configurations to suit their needs. But one of the questions I get most often is, “does your SVVP Certification cover running Exchange on all your supported hypervisors?”  The answer is not simple.  The SVVP was submitted for the Acropolis Hypervisor. While this does not cover all of them, we are technically supported for all hypervisors as per Microsoft’s supported storage architectures.  Microsoft does not specifically mention hyperconverged; it only mentions iSCSI in regards to SAN.  IMO, that covers ESXi and AHV.

Now let me explain….SANs are one of the biggest modern datacenter bottlenecks. Data has gravity, so co-locating storage and compute eliminates network bottlenecks = Hyperconverged is way better than SAN and hence SUPPORTED IMO 😉

To end this topic and move on, a Nutanix customer has the choice to deploy in multiple configurations to suit their needs.  Pushing a customer to one particular hypervisor is not always in their best interest.  Having choices now and later is a much better approach, with the overall goal of simplifying the datacenter.   As Josh said in one of his blog posts, “Running a standard platform and storage protocol for all workloads is a simple model which reduces the unnecessary complexity of multiple protocols and/or in-guest storage configurations.” I can’t agree more with that statement. 🙂

Exchange Performance on Nutanix

Now this subject will always be controversial and potentially subject to criticism.  Internal testing performed by the Nutanix Performance and Engineering team shows that AHV and Hyper-V performance are roughly the same from a hypervisor perspective, and ESXi was 10% higher. That being said, the next question is usually how performance compares to traditional SAN/NAS.  And again, I have to point out, it’s all about Data Locality. Can’t change the laws of physics. Data has gravity, hence we will always beat traditional SAN architecture.

Check out Josh’s post “Peak Performance vs Real World – Exchange on Nutanix Acropolis Hypervisor”.  It gives you a better understanding of what realistic benchmarks of Exchange look like, in general and on Nutanix. I wholeheartedly agree with Josh when he says “Benchmarks are of little value without context specific to customer requirements!”  Having spent roughly 15 years building and maintaining Exchange systems, I learned one hard fact: no generic simulator (like JetStress) can show real world metrics.

Data Reduction Technologies with Exchange on Nutanix

Recommendation:
1 vDisk per Database, 1 vDisk per DB Logs
1 Container with RF2, In-Line Compression & EC-X for Databases
1 Container with RF2 for Logs
Do not use Dedupe with MS Exchange!
Reference: https://technet.microsoft.com/en-us/library/ee832792(v=exchg.150).aspx
Microsoft does not support Data deduplication (Note: Underlying storage deduplication such as Nutanix dedupe is not mentioned, but implied)

Data Reduction Estimates:

Rule of thumb: Always size without data reduction if possible.
Conservative assumption for compression for Exchange = 1.3:1
Aggressive assumption for compression for Exchange = 1.6:1
Conservative assumption for EC-X for Exchange = 1.1:1
Aggressive assumption for EC-X for Exchange = 1.25:1
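As a rough sketch of how these estimates play out, the snippet below applies the ratios above to a database container. Multiplying the compression and EC-X ratios together is my simplifying assumption (it treats the two savings as independent), and the 10 TiB figure is just a placeholder. Per the rule of thumb, the “no reduction” number is the one to size against; the other two only show potential headroom.

```python
# Ratios taken from the estimates listed above.
COMPRESSION = {"conservative": 1.3, "aggressive": 1.6}
ECX = {"conservative": 1.1, "aggressive": 1.25}


def combined_ratio(assumption: str) -> float:
    """Combined data reduction ratio, assuming the two savings simply multiply."""
    return COMPRESSION[assumption] * ECX[assumption]


def capacity_range_tib(usable_tib: float) -> dict:
    """Effective capacity of a database container under each assumption."""
    return {
        "no reduction": usable_tib,
        "conservative": usable_tib * combined_ratio("conservative"),
        "aggressive": usable_tib * combined_ratio("aggressive"),
    }


if __name__ == "__main__":
    # 10 TiB of usable capacity is a placeholder value for illustration.
    for label, tib in capacity_range_tib(10).items():
        print(f"{label:>13}: {tib:.1f} TiB effective")
```

This only applies to the database container; the log container in the recommendation above is RF2 with no compression or EC-X, so it should be sized at face value.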

Questions to ask yourself when planning an Exchange Environment:

How many Users? e.g.: 10000, etc.
How many user profiles do you need? e.g.: 2 , Standard and Executives
How large Mailbox (excluding archiving) per User? e.g.: 1GB, 2GB , 5GB
How many messages per day do you want to support per user? Light = 50 , Medium = 100 , Heavy = 150+

Do you require site resiliency?
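One way to keep the answers to these questions organized before they go into a sizing tool is a small structure like the sketch below. The class, field names and example numbers are all hypothetical, and the calculation is deliberately naive: users times mailbox size per profile, with no database copies, overhead or growth factored in.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Answers to the planning questions for one group of users (hypothetical fields)."""
    name: str
    users: int
    mailbox_gb: float
    messages_per_day: int


def raw_mailbox_gb(profiles: list) -> float:
    """Raw mailbox data only: users x mailbox size, summed over all profiles."""
    return sum(p.users * p.mailbox_gb for p in profiles)


if __name__ == "__main__":
    # Example values are made up for illustration.
    profiles = [
        UserProfile("Standard", users=9500, mailbox_gb=2, messages_per_day=100),
        UserProfile("Executives", users=500, mailbox_gb=5, messages_per_day=150),
    ]
    site_resilient = True  # recorded as an input; not used in this naive total
    print(f"Raw mailbox data: {raw_mailbox_gb(profiles):,.0f} GB")
```

Treat the result as a sanity check on the inputs, not as a sizing answer.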

These are some of the basic questions you need to answer.  This is where the Exchange Server Role Calculator comes in. It’s a great tool, but like any tool, you do need to give it good input to get good output. The function of the tool is as the name implies.

Exchange Server Role Calculator Defined

Now, at the time of this writing, version 7.8 is the latest and greatest. Do note, I would not call this tool perfect, but it gets you pretty close. Like anything else, the Exchange team is still learning real world behavior, and this is where a good experienced Exchange engineer comes into play.

IMO..there is an Art and Science to sizing Exchange.  The days of Exchange just being a simple mail server are long over. These days, it’s much more complex, supporting multiple forms of ingress and egress traffic for different functions (Mobile, Web, SMTP, Skype Integration, etc.). Each of these different functions has varying load considerations and supports more visible features like Outlook Web Access and Exchange ActiveSync. Also, I am still of the opinion that the calculator does not take into consideration the number of devices that one mailbox services.
Considering this complexity, you can see that undersizing or oversizing can happen easily.  If you size correctly at the beginning with Nutanix, then it is just an easy scale-out, buy-as-you-need-it situation. Then you know what happens: finally, for the first time, predictability in your budgets.  I remember the days, not that long ago, when I had to have a client retire a SAN, not for space constraints, but for IO constraints.  And at the time, all I got from the client was “can’t we use it for something else” and ya, I replied with “use it as a WSUS repository for patching the Exchange environment” 😉

During my next post, I will dive into the Exchange Role Calculator much more and go over some examples of sizing on Exchange. We’ll mainly focus on mailbox storage and then move on to other role sizing considerations.  I also plan to cover the other aspects to maintain a healthy Exchange environment (i.e. Message Hygiene, Global and Local Load balancing, Integrations and End User Experience) in subsequent posts.
Below are the Official Best Practice Guides from Nutanix and some public case studies.

Until next time, Rob…..

Nutanix Official Best Practice Guides
MS Exchange on Nutanix / vSphere Best practice guide: http://go.nutanix.com/VirtualizingMicrosoftExchangeonWeb-ScaleConvergedInfrastructure.html

Public Case Studies for Nutanix customers using Exchange
Richter: http://go.nutanix.com/rs/nutanix/images/Nutanix-Case-Study-Richter.pdf
Riverside: http://www.nutanix.com/resource/riverside-for-riversides-server-and-storage-consolidation-nutanix-fits-like-a-glove/