Nutanix NOS 4.6 Released…

On February 16, 2016, Nutanix announced the Acropolis NOS 4.6 release, and last week it became available for download. Along with the many enhancements, I wanted to highlight several items, including some tech preview features.

Also, check out this excellent video with Nutanix’s Tim Isaacs and Raghu Nandan, in which they go into more detail on the updates included in Acropolis 4.6; the interviewer is my buddy Chris Brown.

Tim Isaacs and Raghu Nandan from Nutanix HQ discuss some of the important updates in Acropolis 4.6.

1-Click Upgrades – BIOS and BMC Firmware
The 1-Click upgrade for BIOS and BMC firmware feature is available for Acropolis hypervisor (AHV) and ESXi hypervisor host environments running on NX-xxxx G4 (Haswell) platforms only.

Acropolis App Mobility Fabric: Windows or Linux Guest Customization
Customize or clone Windows or Linux guest VMs hosted by AHV. Includes automated OS installation and custom ISOs by using Sysprep (Windows) or cloud-init (Linux).
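For a concrete feel, cloud-init user-data on the Linux side can be as simple as a shell script that runs once at first boot (cloud-init also accepts #cloud-config YAML). This is a sketch only; the hostname and package below are hypothetical:

    #!/bin/sh
    # Hypothetical cloud-init user-data: runs once at first boot of the clone.
    hostnamectl set-hostname web01.example.local   # assumed hostname
    yum -y install httpd                           # assumed package
    systemctl enable httpd
    systemctl start httpd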

Acropolis Drivers for OpenStack
These drivers facilitate consuming the Nutanix Acropolis infrastructure as a cloud service or for use in a data center. For example, an OpenStack implementation might require using features such as single sign-on, orchestration, role-based access control, and so on. Drivers include Acropolis compute, image, volume, and network drivers.

Convert Cluster Redundancy Factor from RF-2 to RF-3
Convert a cluster created with redundancy factor 2 (RF-2) to RF-3 through the ncli cluster set-redundancy-state command. This increases the cluster’s fault tolerance from one simultaneous component failure to two.
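For example, from any Controller VM the conversion is a one-liner (a sketch from the documented syntax as I recall it; note that RF-3 also requires a minimum of five nodes in the cluster):

    ncli cluster get-redundancy-state                              # check the current setting
    ncli cluster set-redundancy-state desired-redundancy-factor=3  # convert RF-2 to RF-3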

Cross Hypervisor Disaster Recovery
Cross-hypervisor disaster recovery provides the ability to migrate VMs from one hypervisor to another (ESXi to AHV or AHV to ESXi) by using the protection domain semantics of protecting VMs, taking snapshots, replicating the snapshots, and then recovering the VMs from the snapshots. To perform these operations, you need to install and configure Nutanix Guest Tools (NGT) on all VMs.

Guest VM VLAN Trunking
AHV supports guest VM VLAN tagging, where the tag passes through a single port from the physical network to a VM. It allows the VLAN ID tags to be included in an Ethernet packet to be passed to the guest VM. Guest VM operating systems can use this feature to enable Virtual Guest Tagging (VGT) and simulate multiple virtual NICs.
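From the guest’s point of view, VGT just means tagged frames reach the VM and the guest OS terminates the VLANs itself. In a Linux guest, for example, that is plain 802.1Q subinterface configuration (eth0, the VLAN IDs, and the address are illustrative):

    # Terminate two VLANs inside the guest on a trunked vNIC
    ip link add link eth0 name eth0.10 type vlan id 10
    ip link add link eth0 name eth0.20 type vlan id 20
    ip addr add 10.0.10.5/24 dev eth0.10
    ip link set eth0.10 up
    ip link set eth0.20 up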

More Backup and Data Recovery/Replication Features

  • Snapshot and Async DR for volume groups.
  • Application-consistent snapshots on AHV and ESXi by using the Nutanix native in-guest Volume Shadow Copy Service (VSS) agent for all VMs that support Microsoft’s VSS. Nutanix Guest Tools provides application-consistent snapshot support for Linux VMs by running specific pre-freeze and post-thaw scripts on VM quiesce (see the sample script after this list).
  • Integrated snapshot management from an AHV cluster to a CommVault solution.
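To give a feel for those Linux quiesce hooks, here is a minimal pre-freeze sketch. The service name is hypothetical, and the exact script locations NGT expects are covered in the Nutanix documentation rather than shown here:

    #!/bin/sh
    # Sketch of a pre-freeze hook: quiesce the application and flush buffers
    # before the snapshot; a matching post-thaw script reverses these steps.
    systemctl stop myapp    # hypothetical application service
    sync                    # flush dirty pages to disk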

Nutanix Guest Tools

  • Nutanix Guest Agent (NGA) service. Communicates with the Nutanix Controller VM.
  • File Level Restore (FLR) CLI. Performs self-service file-level recovery from the VM snapshots.
  • Nutanix VM Mobility Drivers. Facilitates distribution of drivers required for VM migration between ESXi and AHV, in-place hypervisor conversion, and cross-hypervisor disaster recovery (CH-DR) features.
  • VSS requestor and hardware provider for Windows VMs. Enables application-consistent snapshots of AHV or ESXi Windows VMs.
  • Application-consistent snapshot for Linux VMs. Supports application-consistent snapshots for Linux VMs by running specific scripts on VM quiesce.

Self-Service Restore
Self-service restore allows a user to restore a file within a virtual machine from a Nutanix-protected snapshot with minimal Nutanix administrator intervention. This feature is supported only on Nutanix clusters running the ESXi and Acropolis hypervisors.

Tech Preview Features

In-Place Hypervisor Conversion
This 1-click feature available through the Prism web console allows you to convert your cluster from using ESXi hosts to using AHV hosts. Guest VMs are converted to the hypervisor target format, and cluster network configurations are stored and then restored as part of the conversion process.

Native File Services
Provides file server capability within a Nutanix AHV cluster, as one or more network-attached VMs, to form a virtual file server.

To download the update, go to my.nutanix.com and navigate to the Support > Downloads section, or upgrade to 4.6 from within Prism.  Until next time, Rob.

Microsoft SQL Server High Availability Options on Nutanix

To give credit, this content was taken from my buddy Mike McGhem’s blog; I added some more color to it, but his content is right on.

In general, modern versions of Microsoft SQL Server (MSSQL) support several High Availability (HA) options at both the host and storage level.  For the purposes of this post I will only be addressing the HA options which leverage native Windows Server Failover Clustering (WSFC) in some form.  SQL Server also provides transactional replication through the use of a publisher and subscriber model, which some consider an HA option, but that’s a topic (and debate) for another post with Mike McGhem.

Starting with MSSQL 2012, Microsoft introduced AlwaysOn, which is a combination of some existing and new functionality.  Under the AlwaysOn umbrella fall two main options: Failover Cluster Instances (FCI) and Availability Groups (AAG).

Nutanix has long supported and recommended the use of AlwaysOn Availability Groups.  AAG leverages a combination of WSFC and native database level replication to create either an HA or disaster recovery solution between instances of MSSQL.  The instances of MSSQL leveraged to support the AAG can be either standalone or clustered (in the case of Nutanix these would be standalone instances today).   The following figure provides a logical overview of an AlwaysOn Availability Group.

Figure: Logical overview of an AlwaysOn Availability Group

An AAG performs replication at the database level, creating a “primary” copy and one or more “secondary” copies of each database.  The secondary copies are replicated using either synchronous or asynchronous commit mode, as specified by an administrator.  Asynchronous commit is intended more as a disaster recovery or reporting solution, as it implies the potential for data loss.  So for HA scenarios as we’re discussing them here, we should assume synchronous commit.  Because database replication is used, shared storage is not required, and each MSSQL instance within the AAG can use its own local devices.  Additional details on AlwaysOn Availability Groups can be found here: https://msdn.microsoft.com/en-us/library/hh510230.aspx
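As a rough sketch, a two-replica synchronous-commit AAG boils down to T-SQL like the following, run on the intended primary. Server names, endpoints, and the database are hypothetical, and the WSFC, database mirroring endpoints, and an initial backup/restore of the database must already be in place:

    -- create_ag.sql: run as  sqlcmd -S SQLA -i create_ag.sql  on the primary
    CREATE AVAILABILITY GROUP [AG1]
      FOR DATABASE [SalesDB]
      REPLICA ON
        N'SQLA' WITH (ENDPOINT_URL = N'TCP://sqla.corp.local:5022',
                      AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
                      FAILOVER_MODE = AUTOMATIC),
        N'SQLB' WITH (ENDPOINT_URL = N'TCP://sqlb.corp.local:5022',
                      AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
                      FAILOVER_MODE = AUTOMATIC);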

AAGs can take advantage of the secondary databases for the purpose of read-only transactions or for backup operations.  In the context of a scale-out architecture like Nutanix, leveraging multiple copies across hypervisor hosts for distributing these kinds of operations creates an excellent solution.
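Enabling that pattern is a one-line change per replica; for example, to allow read-only connections on the secondary (using the hypothetical AG1/SQLB names from the sketch above):

    sqlcmd -S SQLA -Q "ALTER AVAILABILITY GROUP [AG1] MODIFY REPLICA ON N'SQLB' WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));"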

While AAGs are a great solution and fit nicely with the Nutanix architecture, they may not be a good fit or even possible for certain environments.  Some of the limiting factors for adopting AAGs can include:

  • Space utilization: Because a secondary database copy is created, additional storage space is consumed.  Some administrators may prefer a single database copy where server HA is the primary use case.
  • Synchronous commit performance: The synchronous replication of transactions (insert/update/delete, and so on) needed for AAG replication in an HA context has a performance overhead.  Administrators of latency-sensitive applications may prefer not to add the response time of waiting for transactions to be committed on multiple SQL instances.
  • Distributed transactions: Some applications perform distributed transactions across databases and MSSQL instances.  Microsoft does not support the use of distributed transactions with AAGs, and by extension application vendors will not support their software using distributed transactions where AAGs are present.
  • SQL Server versions: Some environments simply cannot yet upgrade to SQL 2012 or higher.  Whether due to current business requirements or application qualification, many administrators have to stick with SQL 2008 (and I hope not, but maybe even earlier versions) for the time being.

In the above cases MSSQL Failover Cluster Instances are likely the better solution.  FCIs have long been used as the primary means of HA for MSSQL.  FCIs can be leveraged with all current versions of MSSQL and rely on shared storage to support the MSSQL instances.  The following figure provides a logical overview of Failover Cluster Instances.

Figure: Logical overview of SQL Server Failover Cluster Instances

The shared storage used can be block (LUN) based or, starting with MSSQL 2012, SMB (file) based.  In the case of LUN-based shared storage, SCSI-3 persistent reservations are used to arbitrate ownership of the shared disk resources between nodes.  The MSSQL instance utilizing specific LUNs is made dependent on those disk resources.  Additional details on AlwaysOn Failover Cluster Instances can be found here:  https://msdn.microsoft.com/en-us/library/ms189134.aspx

Until very recently, Nutanix had not supported MSSQL FCI within virtual machines, whether they reside on ESXi, Hyper-V or the Nutanix Acropolis Hypervisor (AHV).  But starting with the Nutanix 4.5 release (with technical preview support in the 4.1.5 release), MSSQL FCI is supported across all three of the aforementioned hypervisors.  Nutanix supports this form of clustering using iSCSI from within the virtual machines.  In essence, Nutanix virtual disks (vdisks) that support SCSI-3 persistent reservations are created within a Nutanix container.  These vdisks are presented directly to virtual machines as LUNs, leveraging the Nutanix Controller Virtual Machines (CVMs) as iSCSI targets.  The virtual machines utilize the Microsoft iSCSI initiator service and the Multipath I/O (MPIO) capabilities native to the Windows operating system for connectivity and path failover.  An overview of this configuration can be seen in the following diagram.

Figure: In-guest iSCSI connectivity to Nutanix volume groups
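For a sense of the in-guest plumbing, the Windows-native pieces can be driven from an elevated command prompt roughly like this. The portal address and target IQN are hypothetical; the iSCSI device string for mpclaim is the Microsoft-documented constant, but verify it for your Windows version:

    rem Enable MPIO and claim iSCSI-attached devices (requires a reboot)
    dism /online /enable-feature /featurename:MultipathIo
    mpclaim -r -i -d "MSFT2005iSCSIBusType_0x9"

    rem Point the initiator at the cluster and log in to the target
    iscsicli QAddTargetPortal 10.0.0.50
    iscsicli ListTargets
    iscsicli QLoginTarget iqn.2010-06.com.nutanix:example-target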

The association between virtual machine iSCSI initiators and the vdisks is managed via the concept of a Volume Group.  A volume group acts as a mapping to determine the virtual disks which can be accessed by one or multiple (in the case of clustering) iSCSI initiators.   Additional information on volume groups can be found under the Volumes API section of the Nutanix Bible: http://stevenpoitras.com/the-nutanix-bible/
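On the Nutanix side, the moving parts map to a handful of acli operations, roughly along these lines. This is a sketch only; the command and flag names are from memory and vary by AOS version, and all of this can be done in Prism instead:

    acli vg.create sqlvg                                        # create the volume group
    acli vg.disk_create sqlvg container=ctr1 create_size=200G   # carve out a vdisk (LUN)
    acli vg.attach_external sqlvg iqn.1991-05.com.microsoft:sql-node1   # whitelist an initiator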

Like AAGs, MSSQL FCIs may not be best suited to all environments.  Some of their drawbacks include:

  • Shared storage complexity: The configuration and maintenance of shared storage is often more complex to manage than standalone environments.
  • Planned or unplanned downtime: FCI can generally take more time to transition operation between cluster nodes than a similar AAG configuration.  Part of this downtime is spent recovering transactions that were in flight prior to failover.  This can be somewhat mitigated with the recovery interval setting or by using indirect checkpoints (https://msdn.microsoft.com/en-us/library/ms189573.aspx); see the example after this list.
  • Separation of workloads: AAG configurations can create multiple database copies across SQL instances for the purposes of distributed reporting or for backup offload.  An FCI cannot offer this functionality natively, although such configurations are possible via intelligent cloning methodologies that the Nutanix platform can offer.
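The indirect checkpoint knob mentioned above is itself a single database setting; for example (SalesDB and the 60-second target are illustrative):

    sqlcmd -S SQLA -Q "ALTER DATABASE [SalesDB] SET TARGET_RECOVERY_TIME = 60 SECONDS;"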

As mentioned earlier, it’s possible to configure both FCI and AAG as part of the same solution.  So for example, if the HA capabilities of FCI are preferred, but the replication capabilities of AAG are desired for the purposes of reporting, backup offload or disaster recovery, a blended configuration can be deployed.

With the support of shared storage clustering in 4.5, Nutanix can provide the full range of options necessary to support the broad range of use cases SQL Server deployments can present.  Mike McGhem will have follow-on posts on his blog detailing how to configure volume group based clustering for Microsoft SQL Server.  Until next time, Rob.