Building Nutanix Ready…What does it mean to be “Ready”?

Before we go into what “Ready” really means, remember that every great journey has a story behind it. This will be a multi-part series, starting with how I joined Nutanix and evolved to build a world-class program called “Nutanix Ready”. Stay tuned, Part 1 coming very soon!  Rob

CPS Standard on Nutanix Released

Fun and crazy days here at Nutanix. I’ve been busy fielding a lot of calls around our new offering, CPS Standard on Nutanix. Now if you don’t know what CPS is, it stands for Cloud Platform System.

So what is Microsoft CPS anyways?

Simply put, Microsoft CPS is a software stack of Windows Server, System Center, and Windows Azure Pack.  CPS delivers a self-service cloud environment for Windows and Linux applications with an optimized deployment of Windows Azure Pack.
Currently based on Windows Server 2012 R2, System Center 2012 R2, and Windows Azure Pack, CPS provides an Azure-consistent experience by leveraging Azure services to deliver business continuity (through Azure Site Recovery) for the virtualized Windows and Linux workloads in your hybrid cloud. For more details on Windows Azure Pack, check out my blog series on WAP.

As you’ll know if you have read my WAP blog series, building your own cloud can be a complex undertaking. Integrating the hardware, installing and configuring the software, and optimizing the overall solution for usability, performance, scale, and reliability is where many cloud deployments fall short.

Introducing Microsoft CPS on Nutanix, an easier way to deploy WAP

The solution is the result of co-engineering and joint validation efforts between Microsoft and Nutanix. Getting the solution up and running is fast, accelerating your time to value.
The joint effort goes beyond initial deployment. Once the Microsoft/Nutanix CPS solution is up and running, you get a single point of contact for support, plus simplified patching and updating across the entire stack of firmware and software. And as an added benefit, you get the ability to scale the environment with all the Nutanix goodness.

The bits are installed at the factory, so when you get your Nutanix block, it’s as easy as running a wizard to get up and running.  Below is a video that my buddy @mcghee did on the install and initial configuration of CPS. The video brings you right up to the admin and tenant portals and gives you a brief tour.

Enjoy…Until next time, Rob….

Microsoft Exchange Best Practices on Nutanix

To continue on my last blog post on Exchange…

As I mentioned previously, I support SEs from all over the world, and again today I got asked what the best practices are for running Exchange on Nutanix. Funny enough, this question comes in quite often, so I am going to help resolve that. There’s a lot of great info out there, especially from my friend Josh Odgers, who has been leading the charge on this for a long time.  Some of his posts can be controversial, but the truth is always there.  He’s getting a point across.

This blog post will be updated on a regular basis as things change, and it will be moved to a permanent part of the netwatch.me resources section.  It is meant to be a general best practice guide to help with planning and maintaining a healthy Exchange environment on Nutanix.  I will call out hypervisor specifics when required.  Now on to the post…


Let’s start out with the basics…

MS Exchange on Nutanix Support

Nutanix provides a 100% supported solution for MS Exchange running on vSphere, Hyper-V, or the Acropolis Hypervisor (AHV), using block (iSCSI) or file (SMB 3.0) storage as appropriate.
Here is a breakdown of supported configurations by hypervisor:

  • vSphere (ESXi): Use in-guest iSCSI (Volume Groups) for full support
  • Hyper-V: Use SMB 3.0
  • AHV: Use native vDisks (iSCSI) – SVVP certification for AHV

Also, check out Josh’s post “Fight the FUD – Support for MS Exchange on Nutanix,” which outlines this very topic.  In summary, the customer has the choice to deploy in multiple configurations to suit their needs. But one of the questions I get most often is, “does your SVVP certification cover running Exchange on all your supported hypervisors?”  The answer is not simple.  The SVVP was submitted for the Acropolis Hypervisor, and while that does not cover all of them, we are technically supported on all hypervisors per Microsoft’s supported storage architectures.  Microsoft does not specifically mention hyperconverged; it only mentions iSCSI in regard to SAN.  IMO, that covers ESXi and AHV.

Now let me explain… SANs are one of the biggest modern datacenter bottlenecks. Data has gravity, so co-locating storage and compute eliminates network bottlenecks = hyperconverged is way better than SAN, and hence SUPPORTED IMO 😉

To end this topic and move on: a Nutanix customer has the choice to deploy in multiple configurations to suit their needs.  Pushing a customer to one particular hypervisor is not always in their best interest; having choices now and later is a much better approach, with the overall goal of simplifying the datacenter.   As Josh said in one of his blog posts, “Running a standard platform and storage protocol for all workloads is a simple model which reduces the unnecessary complexity of multiple protocols and/or in-guest storage configurations.” I can’t agree more with that statement. 🙂

Exchange Performance on Nutanix

Now, this subject will always be controversial and potentially subject to criticism.  Internal testing performed by the Nutanix Performance and Engineering team shows that AHV and Hyper-V performance are roughly the same from a hypervisor perspective, with ESXi about 10% higher. That being said, the next question is usually how performance compares to traditional SAN/NAS.  And again, I have to point out, it’s all about data locality. You can’t change the laws of physics: data has gravity, hence we will always beat traditional SAN architecture.

Check out Josh’s post “Peak Performance vs Real World – Exchange on Nutanix Acropolis Hypervisor”.  It gives you a better understanding of what realistic benchmarks for Exchange look like, both in general and on Nutanix. I wholeheartedly agree with Josh when he says, “Benchmarks are of little value without context specific to customer requirements!”  Having spent over 15 years building and maintaining Exchange systems, I learned one hard fact: no generic simulator (like Jetstress) can show real-world metrics.

Data Reduction Technologies with Exchange on Nutanix

Recommendation:
  • 1 vDisk per database, 1 vDisk per DB logs
  • 1 container with RF2, in-line compression & EC-X for databases
  • 1 container with RF2 for logs
  • Do not use dedupe with MS Exchange! Microsoft does not support data deduplication (note: underlying storage deduplication such as Nutanix dedupe is not mentioned, but implied).
Reference: https://technet.microsoft.com/en-us/library/ee832792(v=exchg.150).aspx

Data Reduction Estimates:

Rule of thumb: always size without data reduction if possible.
  • Conservative assumption for compression for Exchange = 1.3:1
  • Aggressive assumption for compression for Exchange = 1.6:1
  • Conservative assumption for EC-X for Exchange = 1.1:1
  • Aggressive assumption for EC-X for Exchange = 1.25:1
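To put those ratios in context, here is a minimal Python sketch of the back-of-the-envelope math, assuming the conservative and aggressive figures above. This is my own illustration, not an official Nutanix sizing tool; treating compression and EC-X as independent multipliers is a simplification, and the raw capacity figure is hypothetical.

# A minimal sizing sketch using the rule-of-thumb ratios from this post.
# Not an official Nutanix sizing tool; combining compression and EC-X
# as independent multipliers is a simplification.

def effective_db_capacity_tb(raw_tb, compression=1.3, ecx=1.1, rf=2):
    """Estimate usable TB for the database container: divide raw
    capacity by the replication factor (RF2 stores two copies), then
    credit back the assumed compression and EC-X savings."""
    return (raw_tb / rf) * compression * ecx

raw_tb = 40.0  # hypothetical raw capacity available to the container
print("Conservative: %.1f TB" % effective_db_capacity_tb(raw_tb))             # 1.3:1 and 1.1:1
print("Aggressive:   %.1f TB" % effective_db_capacity_tb(raw_tb, 1.6, 1.25))  # 1.6:1 and 1.25:1

Even on the aggressive assumptions, notice that RF2 dominates the math, which is why the rule of thumb is to size without data reduction whenever possible.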

Questions to ask yourself when planning an Exchange Environment:

  • How many users? e.g.: 10,000, etc.
  • How many user profiles do you need? e.g.: 2, Standard and Executive
  • How large a mailbox (excluding archiving) per user? e.g.: 1GB, 2GB, 5GB
  • How many messages per day do you need to support per user? Light = 50, Medium = 100, Heavy = 150+
  • Do you require site resiliency?

These are among the basic questions you need to answer, and this is where the Exchange Server Role Calculator comes in. It’s a great tool, but like any tool, you need to give it good input to get good output. The function of the tool is as the name implies.
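Before we get to the calculator itself, here is a deliberately simplified Python sketch of the core mailbox storage math, just to show the input/output relationship involved. The overhead factor and database copy count are illustrative assumptions of mine; the real calculator models far more (transaction logs, content indexes, deleted item retention, growth).

# Back-of-the-envelope mailbox storage estimate. The 20% overhead
# (white space, dumpster, etc.) and the two database copies are
# assumptions for this sketch, not calculator defaults.

def estimate_db_storage_gb(users, mailbox_quota_gb, overhead=1.2, db_copies=2):
    """Rough total database storage across all database copies."""
    return users * mailbox_quota_gb * overhead * db_copies

# Example: 10,000 users with a 2 GB mailbox quota
total_gb = estimate_db_storage_gb(10000, 2.0)
print("Estimated database storage: %.1f TB" % (total_gb / 1024.0))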

Exchange Server Role Calculator Defined

At the time of this writing, version 7.8 is the latest and greatest. Do note, I would not call this tool perfect, but it gets you pretty close. Like anything else, the Exchange team is still learning real-world behavior, and this is where a good, experienced Exchange engineer comes into play.

IMO, there is an art and a science to sizing Exchange.  The days of Exchange being just a simple mail server are long over. These days, it’s much more complex, supporting multiple forms of ingress and egress traffic for different functions (mobile, web, SMTP, Skype integration, etc.). Each of these functions has varying load considerations and backs more visible features like Outlook Web Access and Exchange ActiveSync. Also, I am still of the opinion that the calculator does not take into consideration the number of devices that one mailbox services.
Considering this complexity, you can see that undersizing or oversizing can happen easily.  If you size correctly at the beginning with Nutanix, then it’s just an easy scale-out, buy-as-you-need-it situation. And then, you know what happens: finally, for the first time, predictability in your budgets.  I remember the days, not that long ago, when I had to have a client retire a SAN, not for space constraints, but for IO constraints.  All I got from the client at the time was “can’t we use it for something else?”, and yes, I replied with “use it as a WSUS repository for patching the Exchange environment” 😉

In my next post, I will dive into the Exchange Role Calculator in much more depth and go over some examples of Exchange sizing. We’ll mainly focus on mailbox storage and then move on to other role sizing considerations.  I also plan to cover the other aspects of maintaining a healthy Exchange environment (i.e. message hygiene, global and local load balancing, integrations, and end-user experience) in subsequent posts.
Below are the official best practice guides from Nutanix and some public case studies.

Until next time, Rob…..

Nutanix Official Best Practice Guides
MS Exchange on Nutanix / vSphere Best practice guide: http://go.nutanix.com/VirtualizingMicrosoftExchangeonWeb-ScaleConvergedInfrastructure.html

Public Case Studies for Nutanix customers using Exchange
Richter: http://go.nutanix.com/rs/nutanix/images/Nutanix-Case-Study-Richter.pdf
Riverside: http://www.nutanix.com/resource/riverside-for-riversides-server-and-storage-consolidation-nutanix-fits-like-a-glove/

Nutanix App for Splunk – Just Released

Here is a video walkthrough of the installation and configuration of the Nutanix App for Splunk, along with a demo.  Also included is a demo of Splunk Mobile running the Nutanix App versus Safari running Prism.  To learn more about Splunk, and for details on this app, check out Andre Leibovici’s (@andreleibovici) blog post.  Happy Splunking 🙂

Until next time, Rob…

Microsoft SQL Server High Availability Options on Nutanix

To give credit, this content was taken from my buddy Mike McGhee’s blog, and I added some more color to it, but his content is right on.

In general, modern versions of Microsoft SQL Server (MSSQL) support several high availability (HA) options at both the host and storage level.  For the purposes of this post, I will only address the HA options that leverage native Windows Server Failover Clustering (WSFC) in some form.  SQL Server also provides transactional replication through the use of a publisher and subscriber model, which some consider an HA option, but that’s a topic (and debate) for another post with Mike McGhee.
Starting with MSSQL 2012, Microsoft introduced AlwaysOn, which is a combination of existing and new functionality.  Under the AlwaysOn umbrella fall two main options: Failover Cluster Instances (FCI) and Availability Groups (AAG).

Nutanix has long supported and recommended the use of AlwaysOn Availability Groups.  AAG leverages a combination of WSFC and native database-level replication to create either an HA or disaster recovery solution between instances of MSSQL.  The MSSQL instances supporting the AAG can be either standalone or clustered (in the case of Nutanix, these would be standalone instances today).

The following figure provides a logical overview of an AlwaysOn Availability Group.
Microsoft SQL High Availability
An AAG performs replication at the database level, creating a “primary” and one or more “secondary” database copies.  The secondary copies are replicated using either synchronous or asynchronous commit mode, as specified by an administrator.  Asynchronous commit is intended more as a disaster recovery or reporting solution, as it implies the potential for data loss.  So for HA scenarios as we’re discussing them here, we should assume synchronous commit.  Because database replication is used, shared storage is not required, and each MSSQL instance within the AAG can use its own local devices.  Additional details on AlwaysOn Availability Groups can be found here: https://msdn.microsoft.com/en-us/library/hh510230.aspx

AAGs can take advantage of the secondary databases for the purpose of read-only transactions or for backup operations.  In the context of a scale-out architecture like Nutanix, leveraging multiple copies across hypervisor hosts for distributing these kinds of operations creates an excellent solution.

While AAGs are a great solution and fit nicely with the Nutanix architecture, they may not be a good fit or even possible for certain environments.  Some of the limiting factors for adopting AAGs can include:

  • Space utilization:  Because a secondary database copy is created, additional storage space will be consumed.  Some administrators may prefer a single database copy where server HA is the primary use case.
  • Synchronous commit performance:  The synchronous replication of transactions (insert/update/delete…) needed for AAG replication (in the context of an HA solution) does have a performance overhead.  Administrators of latency-sensitive applications may prefer not to have the additional response time of waiting for transactions to be committed to multiple SQL instances.
  • Distributed transactions:  Some applications perform distributed transactions across databases and MSSQL instances.  Microsoft does not support the use of distributed transactions with AAGs, and by extension application vendors will not support their software where distributed transactions are used and AAGs are present.
  • SQL Server versions:  Some environments simply cannot yet upgrade to SQL 2012 or higher.  Whether due to current business requirements or application qualification requirements, many administrators have to stick with SQL 2008 (and I hope not, but maybe even earlier versions) for the time being.

In the above cases, MSSQL Failover Cluster Instances are likely the better solution.  FCIs have long been used as the primary means of HA for MSSQL.  FCI can be leveraged with all current versions of MSSQL and relies on shared storage to support the MSSQL instances.  The following figure provides a logical overview of Failover Cluster Instances.
Microsoft SQL High Availability
The shared storage used can be block (LUN) based or, starting with MSSQL 2012, SMB (file) based.  In the case of LUN-based shared storage, SCSI-3 persistent reservations are used to arbitrate ownership of the shared disk resources between nodes.  The MSSQL instance utilizing specific LUNs is made dependent on those disk resources.  Additional details on AlwaysOn Failover Cluster Instances can be found here:  https://msdn.microsoft.com/en-us/library/ms189134.aspx

Until very recently, Nutanix did not support MSSQL FCI within virtual machines, whether they reside on ESXi, Hyper-V, or the Nutanix Acropolis Hypervisor (AHV).  But starting with the Nutanix 4.5 release (with technical preview support in the 4.1.5 release), MSSQL FCI is supported across all three of the aforementioned hypervisors.  Nutanix supports this form of clustering using iSCSI from within the virtual machines.  In essence, Nutanix virtual disks (vDisks) that support SCSI-3 persistent reservations are created within a Nutanix container.  These vDisks are presented directly to virtual machines as LUNs, leveraging the Nutanix Controller Virtual Machines (CVMs) as iSCSI targets.  The virtual machines utilize the Microsoft iSCSI Initiator service and the Multipath I/O (MPIO) capabilities native to the Windows operating system for connectivity and path failover.  An overview of this configuration can be seen in the following diagram.
Microsoft SQL High Availability
The association between virtual machine iSCSI initiators and the vDisks is managed via the concept of a volume group.  A volume group acts as a mapping that determines which virtual disks can be accessed by one or multiple (in the case of clustering) iSCSI initiators.   Additional information on volume groups can be found under the Volumes API section of the Nutanix Bible: http://stevenpoitras.com/the-nutanix-bible/
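To make the concept concrete, here is a hedged Python sketch of creating a volume group and whitelisting each SQL node’s iSCSI initiator through the Prism REST API. The endpoint path, payload field names, address, and credentials below are assumptions for illustration only; consult the Volumes API documentation for the exact schema in your release.

# Illustrative sketch only: the endpoint path and payload fields are
# assumptions, not a verified Prism API reference. The address and
# credentials are placeholders.
import requests
from requests.auth import HTTPBasicAuth

PRISM = "https://prism.example.local:9440"
AUTH = HTTPBasicAuth("admin", "password")

def create_volume_group(name, initiator_iqns, disk_size_gb):
    """Create a volume group with one vDisk and grant access to the
    given iSCSI initiator IQNs (one per SQL cluster node)."""
    body = {
        "name": name,
        "iscsi_initiators": initiator_iqns,     # assumed field name
        "disks": [{"size_gb": disk_size_gb}],   # assumed field name
    }
    resp = requests.post(PRISM + "/api/volume_groups", json=body,
                         auth=AUTH, verify=False)
    resp.raise_for_status()
    return resp.json()

vg = create_volume_group(
    "sql-fci-data",
    ["iqn.1991-05.com.microsoft:sqlnode1",
     "iqn.1991-05.com.microsoft:sqlnode2"],
    disk_size_gb=500,
)
print(vg)

Whitelisting both nodes’ IQNs against the same volume group is what allows WSFC to arbitrate LUN ownership with SCSI-3 persistent reservations, as described above.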
Like AAGs, MSSQL FCI may not be best suited for all environments.  Some of its drawbacks can include:

  • Shared storage complexity:  The configuration and maintenance of shared storage is often more complex to manage than standalone environments.
  • Planned or unplanned downtime:  FCI can generally take more time to transition operation between cluster nodes than a similar AAG configuration.  Part of this downtime is to commit transactions which may have been in-flight prior to failover.  This can be somewhat mitigated with the recovery interval setting or using indirect checkpoints (https://msdn.microsoft.com/en-us/library/ms189573.aspx).
  • Separation of workloads:  AAG configurations can create multiple database copies across SQL instances for the purposes of distributed reporting or for backup offload.  An FCI cannot offer this functionality natively, although such configurations are possible via intelligent cloning methodologies that the Nutanix platform can offer.

As mentioned earlier, it’s possible to configure both FCI and AAG as part of the same solution.  So, for example, if the HA capabilities of FCI are preferred but the replication capabilities of AAG are desired for the purposes of reporting, backup offload, or disaster recovery, a blended configuration can be deployed.

With the support of shared storage clustering in 4.5, Nutanix can provide the full range of options necessary to support the broad set of use cases SQL Server can require.  Mike McGhee will have follow-on posts on his blog detailing how to configure volume group based clustering for Microsoft SQL Server.

Until next time, Rob.

NPP Training series – How does it work – CVM – Software Defined

To continue the NPP training series, here is my next topic:  How does it work – CVM – Software Defined

If you missed the other parts of my series, check out the links below:
Part 1 – NPP Training series – Nutanix Terminology
Part 2 – NPP Training series – Nutanix Terminology
Cluster Architecture with Hyper-V
Data Structure on Nutanix with Hyper-V
I/O Path Overview
Drive Breakdown

To give credit, most of this content was taken from Steve Poitras’s “Nutanix Bible” blog, as his content is the most accurate, and then I put a Hyper-V lean on it. Also, he just rocks… other than being a Seahawks fan :).

Software-Defined
Nutanix CVM

As mentioned before (likely numerous times), the Nutanix platform is a software-based solution which ships as a bundled software + hardware appliance.  The Controller VM, or what we call the Nutanix CVM, is where the vast majority of the Nutanix software and logic sits, and it was designed from the beginning to be an extensible and pluggable architecture. A key benefit of being software-defined and not relying upon any hardware offloads or constructs is extensibility.  As with any product life-cycle, advancements and new features will always be introduced.

By not relying on any custom ASIC/FPGA or hardware capabilities, Nutanix can develop and deploy new features through a simple software update.  This means that a new feature (e.g., deduplication) can be rolled out by upgrading the current version of the Nutanix software.  It also allows newer-generation features to be deployed on legacy hardware models. For example, say you’re running a workload on an older version of the Nutanix software on a prior-generation hardware platform (e.g., 2400).  The running software version doesn’t provide deduplication capabilities, which your workload could benefit from greatly.  To get these features, you perform a rolling upgrade of the Nutanix software version while the workload is running, and you now have deduplication.  It’s really that easy.

Similar to features, the ability to create new “adapters” or interfaces into the Distributed Storage Fabric is another key capability.  When the product first shipped, it solely supported iSCSI for I/O from the hypervisor; this has since grown to include NFS and SMB 3.0 for Hyper-V.  In the future, there is the ability to create new adapters for various workloads and hypervisors (HDFS, etc.).  And again, all of this can be deployed via a software update. This is contrary to most legacy infrastructures, where a hardware upgrade or software purchase is normally required to get the “latest and greatest” features.  With Nutanix, it’s different: since all features are deployed in software, they can run on any hardware platform, on any hypervisor, and be deployed through simple software upgrades.

The following figure shows a logical representation of what this software-defined controller framework (Nutanix CVM) looks like:
Nutanix CVM
Next up, NPP Training Series – How does it all work – Disk Balancing

Until next time, Rob…