Building Nutanix Ready…What does it mean to be “Ready”?

Before we go into what “Ready” really means, know that every great journey has a story behind it. This will be a multi-part series, starting with how I joined Nutanix and grew into building a world-class program called “Nutanix Ready”. Stay tuned, Part 1 coming very soon!  Rob

Microsoft Exchange Best Practices on Nutanix

To continue from my last blog post on Exchange...

As I mentioned previously, I support SEs from all over the world, and again today I was asked what the best practices are for running Exchange on Nutanix. Funny enough, this question comes in quite often, so I am going to help resolve that. There’s a lot of great info out there, especially from my friend Josh Odgers, who has been leading the charge on this for a long time.  Some of his posts can be controversial, but the truth is always in there; he’s getting a point across.

This blog post will be updated on a regular basis as things change, and it will also be moved to a permanent part of the netwatch.me resources section.  It is meant to be a general best practice guide to help with planning and maintaining a healthy Exchange environment on Nutanix.  I will call out hypervisor specifics when required.  Now on to the post…


Let’s start out with the basics…

MS Exchange on Nutanix Support

Nutanix provides a 100% supported solution for MS Exchange running on vSphere, Hyper-V, or the Acropolis Hypervisor using iSCSI (block storage) or SMB 3.0.
Here is a breakdown of supported configurations by hypervisor:

vSphere (ESXi): use in-guest iSCSI (Volume Groups) for full support
Hyper-V: use SMB 3.0
AHV: use native vDisks (iSCSI) – SVVP certification for AHV

Also, check out Josh’s post “Fight the FUD – Support for MS Exchange on Nutanix”, which outlines this very topic.  In summary, the customer has the choice to deploy in multiple configurations to suit their needs. One of the most frequent questions I get is, “Does your SVVP certification cover running Exchange on all your supported hypervisors?”  The answer is not simple.  The SVVP was submitted for the Acropolis Hypervisor, and while that does not cover all of them, we are technically supported on all hypervisors per Microsoft’s supported storage architectures.  Microsoft does not specifically mention hyperconverged; it only mentions iSCSI in regard to SAN.  IMO, that covers ESXi and AHV.

Now let me explain… SANs are one of the biggest modern datacenter bottlenecks. Data has gravity, so co-locating storage and compute eliminates network bottlenecks = hyperconverged is way better than SAN and hence SUPPORTED, IMO 😉

To end this topic and move on: a Nutanix customer has the choice to deploy in multiple configurations to suit their needs.  Pushing a customer toward one particular hypervisor is not always in their best interest; having choices now and later is a much better approach, with the overall goal of simplifying the datacenter.   As Josh said in one of his blog posts, ”Running a standard platform and storage protocol for all workloads is a simple model which reduces the unnecessary complexity of multiple protocols and/or in-guest storage configurations.” I can’t agree more with that statement. 🙂

Exchange Performance on Nutanix

Now this subject will always be controversial and potentially subject to criticism.  Internal testing performed by the Nutanix Performance and Engineering team shows that AHV and Hyper-V performance are roughly the same from a hypervisor perspective, with ESXi about 10% higher. That said, the next question is usually how performance compares to a traditional SAN/NAS.  And again, I have to point out that it’s all about data locality; you can’t change the laws of physics. Data has gravity, hence we will always beat a traditional SAN architecture.

Check out Josh’s post “Peak Performance vs Real World – Exchange on Nutanix Acropolis Hypervisor”.  It gives you a better understanding of what realistic benchmarks look like for Exchange in general and on Nutanix. I wholeheartedly agree with Josh when he says, “Benchmarks are of little value without context specific to customer requirements!”  Having spent over 15 years building and maintaining Exchange systems, I learned one hard fact: no generic simulator (like Jetstress) can show real-world metrics.

Data Reduction Technologies with Exchange on Nutanix

Recommendation:
1 vDisk per Database, 1 vDisk per DB Logs
1 Container with RF2, In-Line Compression & EC-X for Databases
1 Container with RF2 for Logs
Do not use Dedupe with MS Exchange!
Reference: https://technet.microsoft.com/en-us/library/ee832792(v=exchg.150).aspx
Microsoft does not support Data deduplication (Note: Underlying storage deduplication such as Nutanix dedupe is not mentioned, but implied)

Data Reduction Estimates:

Rule of thumb: always size without data reduction if possible; a quick example of applying the ratios below follows after this list.
Conservative assumption for compression for Exchange = 1.3:1
Aggressive assumption for compression for Exchange = 1.6:1
Conservative assumption for EC-X for Exchange = 1.1:1
Aggressive assumption for EC-X for Exchange = 1.25:1
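
To make these assumptions concrete, here is a minimal sizing sketch in Python. The 20 TB database footprint is just an example number chosen for illustration; the math simply applies the conservative and aggressive ratios above, and as the rule of thumb says, size the cluster on the unreduced figure and treat any savings as upside.

  # Quick sketch: apply the data reduction assumptions above to a raw
  # Exchange database footprint. Example values only, not a sizing tool.

  def effective_size_gb(raw_gb, compression_ratio, ecx_ratio):
      """Approximate on-disk footprint after compression and EC-X savings."""
      return raw_gb / (compression_ratio * ecx_ratio)

  raw_db_gb = 20000  # example: ~20 TB of Exchange databases before reduction

  conservative = effective_size_gb(raw_db_gb, 1.3, 1.1)   # ~13,986 GB
  aggressive   = effective_size_gb(raw_db_gb, 1.6, 1.25)  # 10,000 GB

  print("Size without data reduction: %d GB" % raw_db_gb)
  print("Conservative estimate:       %d GB" % conservative)
  print("Aggressive estimate:         %d GB" % aggressive)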

Questions to ask yourself when planning an Exchange Environment:

How many users? e.g.: 10,000, etc.
How many user profiles do you need? e.g.: 2, Standard and Executives
How large a mailbox (excluding archiving) per user? e.g.: 1 GB, 2 GB, 5 GB
How many messages per day do you want to support per user? Light = 50, Medium = 100, Heavy = 150+

Do you require site resiliency?

These are some of the basic questions you need to answer, and this is where the Exchange Server Role Calculator comes in. It’s a great tool, but like any tool, you need to give it good input to get good output. The function of the tool is as the name implies.
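
Before you even open the calculator, a quick back-of-the-envelope estimate helps you sanity-check its output. Here is a minimal sketch in the spirit of the questions above; the user counts, quotas, and the 20% overhead allowance (for things like database whitespace, dumpster, and content indexing) are my own working assumptions, not official Microsoft or Nutanix figures.

  # Rough estimate of raw Exchange mailbox database capacity.
  # Example inputs only; use the Exchange Server Role Calculator for real sizing.

  profiles = {
      # profile: (number of users, mailbox quota in GB)
      "Standard":   (9000, 2),
      "Executives": (1000, 5),
  }

  overhead_factor = 1.2  # assumed ~20% for whitespace, dumpster, content index

  quota_gb = sum(users * quota for users, quota in profiles.values())
  sized_gb = quota_gb * overhead_factor

  print("Mailbox data (quotas only): %d GB" % quota_gb)   # 23,000 GB
  print("With 20%% overhead:          %d GB" % sized_gb)  # 27,600 GB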

Exchange Server Role Calculator Defined

Now, at the time of this writing, version 7.8 is the latest and greatest. Do note, I would not call this tool perfect, but it gets you pretty close. Like anything else, the Exchange team is still learning real-world behavior, and this is where a good, experienced Exchange engineer comes into play.

IMO, there is an art and a science to sizing Exchange.  The days of Exchange being just a simple mail server are long over. These days it’s much more complex, supporting multiple forms of ingress and egress traffic for different functions (mobile, web, SMTP, Skype integration, etc.). Each of these functions has varying load considerations and supports more visible features like Outlook Web Access and Exchange ActiveSync. Also, I am still of the opinion that the calculator does not take into consideration the number of devices that one mailbox services.
Considering this complexity, you can see that undersizing or oversizing can happen easily.  If you size correctly at the beginning with Nutanix, then it’s just an easy scale-out, buy-as-you-need-it situation. And then you know what happens: finally, for the first time, predictability in your budgets.  I remember the days, not that long ago, when I had a client retire a SAN, not for space constraints but for IO constraints.  All I got from the client was “can’t we use it for something else?”, and yes, I replied with “use it as a WSUS repository for patching the Exchange environment” 😉

In my next post, I will dive into the Exchange Role Calculator much more and go over some examples of sizing Exchange. We’ll mainly focus on mailbox storage and then move on to other role sizing considerations.  I also plan to cover the other aspects of maintaining a healthy Exchange environment (i.e., message hygiene, global and local load balancing, integrations, and end-user experience) in subsequent posts.
Below are the official best practice guides from Nutanix and some public case studies.

Until next time, Rob…..

Nutanix Official Best Practice Guides
MS Exchange on Nutanix / vSphere Best practice guide: http://go.nutanix.com/VirtualizingMicrosoftExchangeonWeb-ScaleConvergedInfrastructure.html

Public Case Studies for Nutanix customers using Exchange
Richter: http://go.nutanix.com/rs/nutanix/images/Nutanix-Case-Study-Richter.pdf
Riverside: http://www.nutanix.com/resource/riverside-for-riversides-server-and-storage-consolidation-nutanix-fits-like-a-glove/

Nutanix NOS 4.6 Released….

On February 16, 2016, Nutanix announced the Acropolis NOS 4.6 release, and last week it became available for download. Along with many enhancements, I wanted to highlight several items, including some tech preview features.
Also, check out this excellent video in which Tim Isaacs and Raghu Nandan from Nutanix HQ go into more detail on some of the important updates included in Acropolis 4.6, interviewed by my buddy Chris Brown.

1-Click Upgrades – BIOS and BMC Firmware
The 1-Click upgrade for BIOS and BMC firmware feature is available for Acropolis hypervisor (AHV) and ESXi hypervisor host environments running on NX-xxxx G4 (Haswell) platforms only.
Acropolis App Mobility Fabric: Windows or Linux Guest Customization
Customize or clone Windows or Linux guest VMs hosted by AHV. Includes automated OS installation and custom ISOs by using Sysprep (Windows) or cloud-init (Linux).
Acropolis Drivers for OpenStack
These drivers facilitate consuming the Nutanix Acropolis infrastructure as a cloud service or for use in a data center. For example, an OpenStack implementation might require using features such as single sign-on, orchestration, role-based access control, and so on. Drivers include Acropolis compute, image, volume, and network drivers.
Convert Cluster Redundancy Factor from RF-2 to RF-3
Convert a cluster created with redundancy factor 2 (RF-2) to RF-3 through the ncli cluster set-redundancy-state command. This increases the cluster fault tolerance.
Cross Hypervisor Disaster Recovery
Cross-hypervisor disaster recovery provides an ability to migrate the VMs from one hypervisor to another (ESXi to AHV or AHV to ESXi) by using the protection domain semantics of protecting VMs, taking snapshots, replicating the snapshots, and then recovering the VMs from the snapshots. To perform these operations, you need to install and configure NGT on all VMs.
Guest VM VLAN Trunking
AHV supports guest VM VLAN tagging, where the tag passes through a single port from the physical network to a VM. It allows the VLAN ID tags to be included in an Ethernet packet to be passed to the guest VM. Guest VM operating systems can use this feature to enable Virtual Guest Tagging (VGT) and simulate multiple virtual NICs.
More Backup and Data Recovery/Replication Features

  • Snapshot and Async DR for volume groups.
  • Application-consistent snapshots on AHV and ESXi by using the Nutanix native in-guest Volume Shadow Copy Service (VSS) agent for all VMs that support Microsoft’s VSS. Nutanix Guest Tools provides application-consistent snapshot support for Linux VMs by running specific pre-freeze and post-thaw scripts on VM quiesce.
  • Integrated snapshot management from an AHV cluster to a CommVault solution

Nutanix Guest Tools

  • Nutanix Guest Agent (NGA) service. Communicates with the Nutanix Controller VM.
  • File Level Restore (FLR) CLI. Performs self-service file-level recovery from the VM snapshots.
  • Nutanix VM Mobility Drivers. Facilitates distribution of drivers required for VM migration between ESXi and AHV, in-place hypervisor conversion, and cross-hypervisor disaster recovery (CH-DR) features.
  • VSS requestor and hardware provider for Windows VMs. Enables application-consistent snapshots of AHV or ESXi Windows VMs.
  • Application-consistent snapshot for Linux VMs. Supports application-consistent snapshots for Linux VMs by running specific scripts on VM quiesce.

Self-Service Restore
Self-service restore allows a user to restore a file within a virtual machine from the Nutanix protected snapshot with minimal Nutanix administrator intervention. This feature is supported on Nutanix clusters running the ESXi and Acropolis hypervisors only.

Tech Preview Features
In-Place Hypervisor Conversion
This 1-click feature available through the Prism web console allows you to convert your cluster from using ESXi hosts to using AHV hosts. Guest VMs are converted to the hypervisor target format, and cluster network configurations are stored and then restored as part of the conversion process.
Native File Services
Provides file server capability within a Nutanix AHV cluster, as one or more network-attached VMs, to form a virtual file server.
To download the update, go to the Support > Downloads section at my.nutanix.com, or upgrade to 4.6 from within Prism.  Until next time, Rob

Microsoft SQL Server High Availability Options on Nutanix

To give credit, this content was taken from my buddy Mike McGhem’s blog; I added some more color to it, but his content is right on.

In general, modern versions of Microsoft SQL Server (MSSQL) support several High Availability (HA) options at both the host and storage level.  For the purposes of this post I will only be addressing the HA options which leverage native Windows Server Failover Clustering (WSFC) in some form.  SQL Server also provides transactional replication through the use of a publisher and subscriber model, which some consider an HA option, but that’s a topic (and debate) for another post with Mike McGhem.
Starting with MSSQL 2012, Microsoft introduced AlwaysOn, which is a combination of some existing and new functionality.  Under the AlwaysOn umbrella fall two main options: Failover Cluster Instances (FCI) and Availability Groups (AAG).

Nutanix has long supported and recommended the use of AlwaysOn Availability Groups.  AAG leverages a combination of WSFC and native database level replication to create either an HA or disaster recovery solution between instances of MSSQL.  The instances of MSSQL leveraged to support the AAG can be either standalone or clustered (in the case of Nutanix these would be standalone instances today).

The following figure provides a logical overview of an AlwaysOn Availability Group.
Microsoft SQL High Availability
An AAG performs replication at the database level creating “primary” and one or more “secondary” database copies.  The secondary copies are replicated using  either synchronous or asynchronous commit mode, as specified by an administrator.  Asynchronous commit is intended more as a disaster recovery or reporting solution as it implies the potential for data loss.  So for HA scenarios as we’re discussing them here, we should assume synchronous commit.  Because database replication is used, shared storage is not required and each MSSQL instance within the AAG can use its own local devices.  Additional details on AlwaysOn Availability Groups can be found here: https://msdn.microsoft.com/en-us/library/hh510230.aspx

AAGs can take advantage of the secondary databases for the purpose of read-only transactions or for backup operations.  In the context of a scale-out architecture like Nutanix, leveraging multiple copies across hypervisor hosts for distributing these kinds of operations creates an excellent solution.

While AAGs are a great solution and fit nicely with the Nutanix architecture, they may not be a good fit or even possible for certain environments.  Some of the limiting factors for adopting AAGs can include:

  • Space utilization:  Because a secondary database copy is created, additional storage space will be consumed (a quick worked example follows after this list).  Some administrators may prefer a single database copy where server HA is the primary use case.
  • Synchronous commit performance:  The synchronous replication of transactions (Insert/Update/Delete…) needed for AAG replication (in the context of an HA solution) does have a performance overhead.  Administrators of latency-sensitive applications may prefer not to have the additional response time of waiting for transactions to be committed to multiple SQL instances.
  • Distributed Transactions:  Some applications perform distributed transactions across databases and MSSQL instances.  Microsoft does not support the use of distributed transactions with AAGs, and by extension application vendors will not support their software if it uses distributed transactions where AAGs are present.
  • SQL Server versions:  Some environments can simply not yet upgrade to SQL 2012 or higher.  Whether it be due to current business requirements or application requirements based on qualification, many administrators have to stick with SQL 2008 (and I hope not, but maybe even earlier versions) for the time being.
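
To put the space utilization point above in perspective, here is a quick bit of arithmetic. The 2 TB database and RF2 are example assumptions only, and this is before any compression or EC-X savings: each synchronous secondary is a full logical copy of the database, and each copy is then protected again by the storage layer’s replication factor.

  # Example arithmetic for the space utilization consideration above.
  # 2 TB database, one synchronous secondary, Nutanix RF2; assumptions only.

  db_gb = 2000            # primary database size
  aag_copies = 2          # primary + one synchronous secondary replica
  replication_factor = 2  # RF2 at the Nutanix storage layer

  logical_gb = db_gb * aag_copies                # 4,000 GB across SQL instances
  physical_gb = logical_gb * replication_factor  # 8,000 GB written to disk

  print("Logical space across AAG copies: %d GB" % logical_gb)
  print("Physical space at RF2:           %d GB" % physical_gb)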

In the above cases, MSSQL Failover Cluster Instances are likely the better solution.  FCIs have long been used as the primary means of HA for MSSQL.  FCI can be leveraged with all current versions of MSSQL and relies on shared storage to support the MSSQL instances.  The following figure provides a logical overview of Failover Cluster Instances.
Microsoft SQL High Availability
The shared storage used can be block (LUN) based or, starting with MSSQL 2012, SMB (file) based.  In the case of LUN-based shared storage, SCSI-3 persistent reservations are used to arbitrate ownership of the shared disk resources between nodes.  The MSSQL instance utilizing specific LUNs is made dependent on those disk resources.  Additional details on AlwaysOn Failover Cluster Instances can be found here:  https://msdn.microsoft.com/en-us/library/ms189134.aspx

Until very recently, Nutanix has not supported MSSQL FCI within virtual machines, whether they reside on ESXi, Hyper-V, or the Nutanix Acropolis Hypervisor (AHV).  But starting with the Nutanix 4.5 release (with technical preview support in the 4.1.5 release), MSSQL FCI will be supported across all three of the aforementioned hypervisors.  Nutanix will support this form of clustering using iSCSI from within the virtual machines.  In essence, Nutanix virtual disks (vdisks) which support SCSI-3 persistent reservations are created within a Nutanix container.  These vdisks are presented directly to virtual machines as LUNs, leveraging the Nutanix Controller Virtual Machines (CVMs) as iSCSI targets.  The virtual machines utilize the Microsoft iSCSI initiator service and the Multipath I/O (MPIO) capabilities native to the Windows operating system for connectivity and path failover.  An overview of this configuration can be seen in the following diagram.
Microsoft SQL High Availability
The association between virtual machine iSCSI initiators and the vdisks is managed via the concept of a Volume Group.  A volume group acts as a mapping to determine the virtual disks which can be accessed by one or multiple (in the case of clustering) iSCSI initiators.   Additional information on volume groups can be found under the Volumes API section of the Nutanix Bible: http://stevenpoitras.com/the-nutanix-bible/
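
Conceptually, a volume group is just a named mapping between a set of vdisks and the iSCSI initiators allowed to reach them. The little sketch below is purely illustrative of that relationship; it is not the Nutanix API or its actual data model, and the volume group and vdisk names are made up (the IQNs simply follow the default Windows iSCSI initiator naming convention).

  # Illustrative model of the volume group concept: a named set of vdisks
  # plus the guest iSCSI initiator IQNs allowed to access them.
  # Not the Nutanix API or data model; names are hypothetical.

  from dataclasses import dataclass, field

  @dataclass
  class VolumeGroup:
      name: str
      vdisks: list = field(default_factory=list)    # shared LUNs for the FCI
      initiators: set = field(default_factory=set)  # whitelisted guest IQNs

      def can_access(self, initiator_iqn):
          # Every clustered SQL node must be whitelisted to share the LUNs.
          return initiator_iqn in self.initiators

  vg = VolumeGroup(
      name="sql-fci-vg",
      vdisks=["sql-data-01", "sql-log-01", "sql-quorum"],
      initiators={"iqn.1991-05.com.microsoft:sqlnode1",
                  "iqn.1991-05.com.microsoft:sqlnode2"},
  )

  print(vg.can_access("iqn.1991-05.com.microsoft:sqlnode1"))  # True
  print(vg.can_access("iqn.1991-05.com.microsoft:sqlnode3"))  # False, not mapped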
Like AAGs, MSSQL FCI may not be best suited for all environments.  Some of its drawbacks can include:

  • Shared storage complexity:  The configuration and maintenance of shared storage is often more complex to manage than standalone environments
  • Planned or unplanned downtime:  FCI can generally take more time to transition operation between cluster nodes than a similar AAG configuration.  Part of this downtime is to commit transactions which may have been in-flight prior to failover.  This can be somewhat mitigated with the recovery interval setting or using indirect checkpoints (https://msdn.microsoft.com/en-us/library/ms189573.aspx).
  • Separation of workloads:  AAG configurations can create multiple database copies across SQL instances for the purposes of distributed reporting or for backup offload.  An FCI cannot offer this functionality natively, although such configurations are possible via intelligent cloning methodologies that the Nutanix platform can offer.

As mentioned earlier it’s possible to configure both FCI and AAG as a part of the same solution.  So for example, if the HA capabilities of FCI are preferred, but the  replication capabilities of AAG are desired for the purposes of reporting, backup offload or disaster recovery, a blended configuration can be deployed.

With the support of shared storage clustering in 4.5, Nutanix can provide the full range of options necessary to support the broad number of use cases SQL Server can require.  Mike McGhem will have follow-on posts on his blog to detail how to configure volume group based clustering for Microsoft SQL Server.

Until next time, Rob.

NPP Training series – How does it work – CVM – Software Defined

To continue NPP training series here is my next topic:  How does it work – CVM – Software Defined

If you missed other parts of my series, check out links below:
Part 1 – NPP Training series – Nutanix Terminology
Part 2 – NPP Training series – Nutanix Terminology
Cluster Architecture with Hyper-V

Data Structure on Nutanix with Hyper-V
I/O Path Overview
Drive Breakdown

To give credit, most of this content was taken from Steve Poitras’s “Nutanix Bible” blog, as his content is the most accurate, and then I put a Hyper-V lean on it. Also, he just rocks… other than being a Seahawks fan :).

Software-Defined
Nutanix CVM

As mentioned before (likely numerous times), the Nutanix platform is a software-based solution which ships as a bundled software + hardware appliance.  The Controller VM, or what we call the Nutanix CVM, is where the vast majority of the Nutanix software and logic sits, and it was designed from the beginning to be an extensible and pluggable architecture. A key benefit of being software-defined and not relying upon any hardware offloads or constructs is extensibility.  As with any product life cycle, advancements and new features will always be introduced.

By not relying on any custom ASIC/FPGA or hardware capabilities, Nutanix can develop and deploy these new features through a simple software update.  This means that a new feature (e.g., deduplication) can be deployed by upgrading the current version of the Nutanix software.  This also allows newer-generation features to be deployed on legacy hardware models. For example, say you’re running a workload on an older version of Nutanix software on a prior-generation hardware platform (e.g., 2400).  The running software version doesn’t provide deduplication capabilities, which your workload could benefit greatly from.  To get these features, you perform a rolling upgrade of the Nutanix software version while the workload is running, and you now have deduplication.  It’s really that easy.

Similar to features, the ability to create new “adapters” or interfaces into the Distributed Storage Fabric is another key capability.  When the product first shipped, it solely supported iSCSI for I/O from the hypervisor; this has now grown to include NFS and SMB for Hyper-V.  In the future, there is the ability to create new adapters for various workloads and hypervisors (HDFS, etc.).

And again, all of this can be deployed via a software update. This is contrary to most legacy infrastructures, where a hardware upgrade or software purchase is normally required to get the “latest and greatest” features.  With Nutanix, it’s different. Since all features are deployed in software, they can run on any hardware platform, any hypervisor, and be deployed through simple software upgrades.

The following figure shows a logical representation of what this software-defined controller framework (Nutanix CVM) looks like.

Next up: NPP Training Series – How does it all work – Disk Balancing

Until next time, Rob…

Nutanix NOS 4.5 Released…

Hi all… It’s been a few weeks since my last blog post. I’ve been busy with some travel to Microsoft Technology Centers and working on the Nutanix Ready Program.  Yesterday, Nutanix released NOS 4.5.  This exciting upgrade adds some great features.  Sit back and get ready to enjoy the ride… release notes below.


Table 1. Terminology Updates
  • Acropolis base software (formerly: Nutanix operating system, NOS)
  • Acropolis hypervisor, AHV (formerly: Nutanix KVM hypervisor)
  • Acropolis API (formerly: Nutanix API and Acropolis API)
  • Acropolis App Mobility Fabric (formerly: Acropolis virtualization management and administration)
  • Acropolis Distributed Storage Fabric, DSF (formerly: Nutanix Distributed Filesystem, NDFS)
  • Prism Element (formerly: web console for cluster management; also known as the Prism web console; a cluster managed by Prism Central)
  • Prism Central (for multicluster management; unchanged)
  • Block fault tolerance (formerly: block awareness)

What’s New in Acropolis base software 4.5

Bandwidth Limit on Schedule

  • The bandwidth throttling policy provides you with an option to set the maximum limit of the network bandwidth. You can specify the policy depending on the usage of your network.

Note: You can configure bandwidth throttling only while updating the remote site. This option is not available during the configuration of remote site.

Cloud Connect for Azure

  • The cloud connect feature for Azure enables you to back up and restore copies of virtual machines and files to and from an on-premise cluster and a Nutanix Controller VM located on the Microsoft Azure cloud. Once configured through the Prism web console, the remote site cluster is managed and monitored through the Data Protection dashboard like any other remote site you have created and configured. This feature is currently supported for ESXi hypervisor environments only.

Common Access Card Authentication

  • You can configure two-factor authentication for web console users that have an assigned role and use a Common Access Card (CAC).

Default Container and Storage Pool Upon Cluster Creation

  • When you create a cluster, the Acropolis base software automatically creates a container and storage pool for you.

Erasure Coding

  • Complementary to deduplication and compression, erasure coding increases the effective or usable cluster storage capacity. [FEAT-1096]

Hyper-V Configuration through Prism Web Console

  • After creating a Nutanix Hyper-V cluster environment, you can use the Prism web console to join the hosts to the domain, create the Hyper-V failover cluster, and also enable Kerberos.

Image Service Now Available in the Prism Web Console

  • The Prism web console Image Configuration workflow enables a user to upload ISO or disk images (in ESXi or Hyper-V format) to a Nutanix AHV cluster by specifying a remote repository URL or by uploading a file from a local machine.

MPIO Access to iSCSI Disks (Windows Guest VMs)

  • An Acropolis base software 4.5 feature that helps enforce access control to volume groups and exposes volume group disks as dual namespace disks.

Network Mapping

  • Network mapping allows you to control network configuration for the VMs when they are started on the remote site. This feature enables you to specify network mapping between the source cluster and the destination cluster. The remote site wizard includes an option to create one or more network mappings and allows you to select source and destination network from the drop-down list. You can also modify or remove network mappings as part of modifying the remote sites.

Nutanix Cluster Check

  • Acropolis base software 4.5 includes Nutanix Cluster Check (NCC) 2.1, which includes many new checks and functionality.
  • NCC 2.1 Release Notes

NX-6035C Clusters Usable as a Target for Replication

  • You can use a Nutanix NX-6035C cluster as a target for Nutanix native replication and snapshots, created by source Nutanix clusters in your environment. You can configure the NX-6035C as a target for snapshots, set a longer retention policy than on the source cluster (for example), and restore snapshots to the source cluster as needed. The source cluster hypervisor environment can be AHV, Hyper-V, or ESXi. See Nutanix NX-6035C Replication Target in Notes and Cautions.

Note: You cannot use an NX-6035C cluster as a backup target with third-party backup software.

Prism Central Can Now Be Deployed on the Acropolis Hypervisor (AHV)

  • Nutanix has introduced a Prism Central OVA which can be deployed on an AHV cluster by leveraging Image Service features. See the Web Console Guide for installation details.
  • Prism Central 4.5 Release Notes

Prism Central Scalability

Simplified Add Node Workflow

  • This release leverages Foundation 3.0 imaging capabilities and automates the manual steps previously required for expanding a cluster through the Prism web console.

SNMP

  • The Nutanix SNMP MIB database includes the following changes:
    • The database includes tables for monitoring hypervisor instances and virtual machines.
    • The service status table named serviceStatusTable is obsolete. Analogous information is available in a new table named controllerStatusTable. The new table has a smaller number of MIB fields for displaying the status of only essential services in the Acropolis base software.
    • The disk status table (diskStatusTable), storage pool table (storagePoolInformationTable), and cluster information table include one or more new MIB fields.
  • The SNMP feature also includes the following enhancements:
    • From the web console, you can trigger test alerts that are sent to all configured SNMP trap receivers.
    • SNMP service logs are now written to the following log file: /home/nutanix/data/logs/snmp_manager.out

Support for Minor Release Upgrades for ESXi Hosts

  • Acropolis base software 4.5 enables you to patch upgrade ESXi hosts with minor release versions of ESXi host software through the Controller VM cluster command. Nutanix qualifies specific VMware updates and provides a related JSON metadata upgrade file for one-click upgrade, but now customers can patch hosts by using the offline bundle and md5sum checksum available from VMware, and using the Controller VM cluster command.

Note: Nutanix supports the ability to patch upgrade ESXi hosts with minor versions that are greater than or released after the Nutanix qualified version, but Nutanix might not have qualified those minor releases. Please see the Nutanix hypervisor support statement in our Support FAQ.

VM High Availability in Acropolis

  • In case of a node failure, VM High Availability (VM-HA) ensures that VMs running on the node are automatically restarted on the remaining nodes within the cluster. VM-HA can optionally be configured to reserve spare failover capacity. This capacity reservation can be distributed across the nodes in chunks known as “segments” to provide better overall resource utilization.

Windows Guest VM Failover Clustering

  • Acropolis base software 4.5 supports configuring Windows guest VMs as a failover cluster. This clustering type enables applications on a failed VM to fail over to and run on another guest VM on the same or different host. This release supports this feature on Hyper-V hosts with in-guest VM iSCSI and SCSI 3 Persistent Reservation (PR).

Tech Preview Features

Note: Do not use tech preview features on production systems, on storage used by production systems, or on data stored on production systems.

File Level Restore

  • The file level restore feature allows a virtual machine user to restore a file within a virtual machine from the Nutanix protected snapshot with minimal Nutanix administrator intervention.

Note: This feature should be used only after upgrading all nodes in the cluster to Acropolis base software 4.5.

What’s New in Prism Central

Prism Central for Acropolis Hypervisor (AHV)

Nutanix has introduced a Prism Central VM which is compatible with AHV to enable multicluster management in this environment. Prism Central now supports all three major hypervisors: AHV, Hyper-V, and ESXi.

Prism Central Scalability

The Prism Central VM requires the following resources to support the clusters and VMs listed:

  • Prism Central vCPUs: 4
  • Prism Central memory (GB, default): 8
  • Total storage required for the Prism Central VM (GB): 256
  • Clusters supported: 50
  • VMs supported (across all clusters): 5,000
  • Virtual disks per VM: 2

Release Notes | NCC 2.1

Learn More About NCC Health Checks

You can learn more about the Nutanix Cluster Check (NCC) health checks on the Nutanix support portal. The portal includes a series of Knowledge Base articles describing most NCC health checks run by the ncc health_checks command.

What’s New in NCC 2.1

NCC 2.1 includes support for:

  • Acropolis base software 4.5 or later
  • NOS 4.1.3 or later only
  • All Nutanix NX Series models
  • Dell XC Series of Web-scale Converged Appliances

Tech Preview Features

The following features are available as a Tech Preview in NCC 2.1.

Run NCC health checks in parallel

  • You can specify the number of NCC health checks to run in parallel to reduce the amount of time it takes for all checks to complete. For example, the command ncc health_checks run_all --parallel=25 will run 25 of the health checks in parallel.

Use npyscreen to display NCC status

  • You can specify npyscreen as part of the ncc command to display status to the terminal window. Specify --npyscreen=true as part of the ncc health_checks command.

New Checks in This Release

Check name: description (KB article)
  • check_disks: Check whether disks are discoverable by the host. Pass if the disks are discovered. (KB 2712)
  • check_pending_reboot: Check if the host has pending reboots. Pass if the host does not have pending reboots. (KB 2713)
  • check_storage_heavy_node: Verify that nodes such as the storage-heavy NX-6025C are running a service VM and no guest VMs (KB 2726), and that such nodes are running the Acropolis hypervisor only (KB 2727).
  • check_utc_clock: Check if the UTC clock is enabled. (KB 2711)
  • cluster_version_check: Verify that the cluster is running a released version of NOS or the Acropolis base software. This check returns an INFO status and the version if the cluster is running a pre-release version. (KB 2720)
  • compression_disabled_check: Verify if compression is enabled. (KB 2725)
  • data_locality_check: Check if VMs that are part of a cluster with metro availability are in two different datastores (that is, fetching local data). (KB 2732)
  • dedup_and_compression_enabled_containers_check: Check if any containers have deduplication and compression enabled together. (KB 2721)
  • dimm_same_speed_check: Check that all DIMMs have the same speed. (KB 2723)
  • esxi_ivybridge_performance_degradation_check: Check for the Ivy Bridge performance degradation scenario on ESXi clusters. (KB 2729)
  • gpu_driver_installed_check: Check the version of the installed GPU driver. (KB 2714)
  • quad_nic_driver_version_check: Check the version of the installed quad-port NIC driver. (KB 2715)
  • vmknics_subnet_check: Check if any vmknics have the same subnet (different subnets are not supported). (KB 2722)

Foundation Release 3.0

This release includes the following enhancements and changes:

  • A major new implementation that allows for node imaging and cluster creation through the Controller VM for factory-prepared nodes on the same subnet. This process significantly reduces network complications and simplifies the workflow. (The existing workflow remains for imaging bare metal nodes.) The new implementation includes the following enhancements:
    • A Java applet that automatically discovers factory-prepared nodes on the subnet and allows you to select the first one to image.
    • A simplified GUI to select and configure the nodes, define the cluster, select the hypervisor and Acropolis base software versions to use, and monitor the imaging and cluster creation process.

Customers may create a cluster using the new Controller VM-based implementation in Foundation 3.0. Imaging bare metal nodes is still restricted to Nutanix sales engineers, support engineers, and partners.

  • The new implementation is incorporated in the Acropolis base software version 4.5 to allow for node imaging when adding nodes to an existing cluster through the Prism GUI.
  • The cluster creation workflow does not use IPMI, and for both cluster creation and bare-metal imaging, the host operating system install is done within an “installer VM” in Phoenix.
  • To see the progress of a host operating system installation, point a VNC console at the node’s Controller VM IP address on port 5901.
  • Foundation no longer offers the option to run diagnostics.py as a post-imaging test.  Should you wish to run this test, you can download it from the Tools & Firmware page on the Nutanix support portal.
  • There is no Foundation upgrade path to the new Controller VM implementation; you must download the Java applet from the Foundation 3.0 download page on the support portal. However, you can upgrade Foundation 2.1.x to 3.0 for the bare metal workflow as follows:
      • Copy the Foundation tarball (foundation-version#.tar.gz) from the support portal to /home/nutanix in your VM.
      • Navigate to /home/nutanix.
      • Enter the following five commands:
        • $ sudo service foundation_service stop      # stop the running Foundation service
        • $ rm -rf foundation                         # remove the old Foundation directory
        • $ tar xzf foundation-version#.tar.gz        # extract the new Foundation tarball
        • $ sudo yum install python-scp               # install the python-scp dependency
        • $ sudo service foundation_service restart   # restart the Foundation service
    • If the first command (foundation_service stop) is skipped or the commands are not run in order, the user may get bizarre errors after upgrading. To fix this situation, enter the following two commands:
  • $ sudo pkill -9 foundation
  • $ sudo service foundation_service restart

Release Notes for each of these products are located at:

Download URLs:

Until next time, Rob…