Azure Stack 101: The Definitive Introduction

Azure Stack

Microsoft’s Azure Stack is an excellent toolset that allows enterprises to run a hybrid cloud right in their own datacenters, giving them additional cloud options.

But to really use it to its best advantage, IT pros should know the ins and outs of Azure Stack so they can use it within their business IT infrastructures to better manage, speed up and control their Azure cloud deployments and workloads.

A good place to start is with a primer on Azure Stack itself to give business users a broad look at what’s under the hood of their IT infrastructure.

Continue reading

Storage Spaces Direct Explained – Applications & Performance

Applications

The Microsoft SQL Server product group announced that SQL Server, either virtual or bare metal, is fully supported on Storage Spaces Direct. The Exchange team did not give a clear endorsement for Exchange on S2D and still prefers that Exchange be deployed on physical servers with local JBODs using Exchange Database Availability Groups, or that customers simply move to O365.

Continue reading

Storage Spaces Direct Explained – ReFS, Multi-Tier Volumes and Erasure Coding

Here’s where we dive in and get dirty…but I promise that by the end of my series, you will be smiling like my friend here. I am planning a surprise with special guest bloggers. Stay tuned. Now on to the show…

The NEW ReFS File System, Multi-Tier Volumes and Erasure Coding

Like S2D, the ReFS file system isn’t actually new either; Microsoft has been working on it for several releases. In Windows Server 2016, it finally drops the tech preview label and is ready for production. And there are a lot of benefits. For example, volume creation no longer has to zero out the volume for 10 minutes like NTFS; it’s just a metadata operation that is effectively instantaneous. Here, though, I’m just going to focus on the couple of benefits that ReFS has for S2D.
For those not familiar with erasure coding (EC), and to prepare you for the next part: EC is a method of data protection in which data is broken into fragments, expanded and encoded with redundant data pieces, and stored across a set of different locations.
The original goal of EC was to enable data that becomes corrupted at some point in the storage process to be reconstructed by using information about the data that’s stored elsewhere.  Erasure codes are great, because of their ability to reduce the time and overhead required to reconstruct data. The drawback of erasure coding is that it can be more CPU-intensive, and that can translate into increased latency.
Now, all that being said, classic erasure codes were designed and optimized more for communication, not for storage. Naively applying classic erasure codes to storage is okay, but it misses enormous efficiencies. Microsoft has developed its own erasure codes optimized for storage, called Local Reconstruction Codes (LRC). I will cover this briefly further down in the post.
Now, back to S2D… For data protection, S2D uses either 3-way mirroring or distributed parity with EC.  Mirroring gives you great write performance, but only 33% data efficiency.  EC gives you good data efficiency, but random write performance isn’t great for hot data.  ReFS supports combining different disk tiers using different parity schemes in the same vDisk. This allows S2D to do real-time data tiering by writing new data to the mirror tier and then automatically rotating cold data out to the parity tier, applying the erasure code on data rotation.
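If you want to see why those efficiency numbers matter, here is a quick back-of-the-envelope comparison in Python. The 4+2 parity layout is just an assumption for illustration; the scheme S2D actually picks depends on how many fault domains your cluster has.

```python
# Back-of-the-envelope storage efficiency: 3-way mirror vs erasure coding.
# The 4+2 parity layout below is illustrative; the actual scheme S2D uses
# depends on the number of fault domains in the cluster.

def mirror_efficiency(copies: int = 3) -> float:
    """Usable fraction of raw capacity when every block is stored `copies` times."""
    return 1 / copies

def parity_efficiency(data_fragments: int = 4, parity_fragments: int = 2) -> float:
    """Usable fraction of raw capacity for an erasure code with k data + m parity fragments."""
    return data_fragments / (data_fragments + parity_fragments)

raw_tb = 100  # example raw pool capacity in TB

print(f"3-way mirror : {mirror_efficiency():.0%} efficient -> {raw_tb * mirror_efficiency():.0f} TB usable")
print(f"4+2 parity   : {parity_efficiency():.0%} efficient -> {raw_tb * parity_efficiency():.0f} TB usable")
# 3-way mirror : 33% efficient -> 33 TB usable
# 4+2 parity   : 67% efficient -> 67 TB usable
```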
It is important to note that ReFS does not currently support deduplication.  There was a question on this in every session, and MSFT says this is something the ReFS team is focused on, so we’ll expect to see it land in a later ReFS version (ReFSv3). For now, customers can get dedupe with S2D by using NTFS. 🙁
Note: if you only have two types of storage, the highest-performing type is used for the cache while the other type is divided between performance and capacity, with the different resiliency options (mirror vs parity) providing the performance/capacity difference between the tiers. If you only have one type of storage, the cache is disabled and the disks are divided between performance and capacity as in the previously mentioned case.
For non-Storage Spaces Direct deployments, only two tiers of storage are supported, as in Windows Server 2012 R2 (i.e. SSD and HDD), and there is no cache. If you have NVMe storage, it could be the “hot” tier while the rest of the storage (SSD, HDD) is the “cold” tier (you can name the tiers whatever you want), but you cannot use three tiers.
During Ignite 2016, Microsoft took many shots at VMware. Microsoft said that there’s a right way and a wrong way to do erasure coding: “When you do it the wrong way, performance sucks and you have to limit it to all-flash configurations.”
Microsoft Research developed a new technique called “Local Reconstruction Codes” (LRC). It uses smaller groups within the vDisk, which allows recovery from failures to happen much faster because data doesn’t have to be reconstructed from across the entire pool. This, combined with multi-tier volumes, gives S2D good performance, even on hybrid systems. Sounds like a technology I’ve seen before. Hmmm… I wonder where… 😉
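To make the “smaller groups” idea a little more concrete, here is a toy Python sketch of local reconstruction using simple XOR parity. This is my own simplification, not Microsoft’s actual LRC math (which uses more sophisticated codes); it just shows why a local parity lets you rebuild a lost fragment by reading a few neighbors instead of the whole stripe.

```python
# Toy illustration of "local" reconstruction: fragments are split into small
# groups, each with its own XOR parity. Losing one fragment only requires
# reading the other members of its group, not the entire stripe.
# This is a didactic simplification, not Microsoft's actual LRC construction.
from functools import reduce

def xor_parity(fragments: list[bytes]) -> bytes:
    """XOR all fragments together byte-by-byte (fragments must be equal length)."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), fragments)

# Six data fragments arranged in two local groups of three.
group_a = [b"AAAA", b"BBBB", b"CCCC"]
group_b = [b"DDDD", b"EEEE", b"FFFF"]
parity_a = xor_parity(group_a)   # local parity for group A
parity_b = xor_parity(group_b)   # local parity for group B

# Simulate losing group_a[1]: rebuild it from the 2 surviving group members
# plus the local parity -- 3 reads instead of touching all 6 data fragments.
rebuilt = xor_parity([group_a[0], group_a[2], parity_a])
assert rebuilt == b"BBBB"
print("Rebuilt fragment:", rebuilt, "using only group A reads")
```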
OK, that’s all for now. Next up: Fault Tolerance and Multisite Replication with S2D…

Until Next time, Rob….

Microsoft World Wide Partner Conference 2015…Picture Highlights


This gallery contains 29 photos.

The Microsoft Worldwide Partner Conference (WPC) is the largest event for Microsoft partners. When it comes to meeting the right people in the right place, bigger is better. WPC brings together over 15,000 attendees … Continue reading

Nutanix SCOM Management Pack – Monitor Your Nutanix Infrastructure

As a Microsoft Evangelist at Nutanix, I am always asked: “How would you monitor your Nutanix infrastructure, and can I use the System Center suite?” And my answer always is, “YES, with SCOM”… What is SCOM, you ask?
System Center Operations Manager (SCOM) is designed to be a monitoring tool for the datacenter. Think of a datacenter with multiple vendors representing multiple software and hardware products. Consequently, SCOM was developed to be extensible using the concept of management packs. Vendors typically develop one or more management packs for every product they want plugged into SCOM.

To facilitate these management packs, SCOM supports standard discovery and data collection mechanisms like SNMP, but also affords vendors the flexibility of native API driven data collection.  Nutanix provides management packs that support using the Microsoft System Center Operations Manager (SCOM) to monitor a Nutanix cluster.

Nutanix SCOM Management Pack

The management packs collect information about software (cluster) elements through SNMP and hardware elements through ipmiutil (Intelligent Platform Management Interface utility) and REST API calls, and then package that information for SCOM to digest. Note: The Hardware Elements Management Pack leverages the ipmiutil program to gather information from the Nutanix block for fans, power supplies and temperature.
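To give you a feel for the kind of data the cluster management pack pulls over REST, here is a minimal Python sketch that queries the Prism REST API for cluster-wide stats. The endpoint path and field names follow the Prism v2.0 API and may differ by AOS version, and the management pack does this natively inside SCOM, so treat this purely as an illustration.

```python
# Minimal sketch: pull cluster-wide stats from the Nutanix Prism REST API.
# Endpoint path and field names follow the Prism v2.0 API and may vary by
# AOS version; the SCOM management pack performs equivalent collection natively.
import requests

PRISM = "https://prism.example.local:9440"   # hypothetical cluster virtual IP
AUTH = ("monitoring_user", "secret")         # read-only Prism account (placeholder)

resp = requests.get(
    f"{PRISM}/PrismGateway/services/rest/v2.0/cluster",
    auth=AUTH,
    verify=False,        # lab only -- use proper CA-signed certificates in production
    timeout=10,
)
resp.raise_for_status()
cluster = resp.json()

stats = cluster.get("stats", {})
print("Cluster name :", cluster.get("name"))
print("IOPS         :", stats.get("num_iops"))
print("Avg latency  :", stats.get("avg_io_latency_usecs"), "usecs")
```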

Nutanix provides two management packs:

  • Cluster Management Pack – This management pack collects information about software elements of a cluster including Controller VMs, storage pools, and containers.
  • Hardware Management Pack – This management pack collects information about hardware elements of a cluster including fans, power supplies, disks, and nodes.

Installing and configuring the management packs involves the following simple steps:

  1. Install and configure SCOM on the Windows server system, if not already installed (I will blog a post on this topic soon)
  2. Uninstall existing Nutanix management packs (if present)
  3. Open the IPMI-related ports (if not open). IPMI access is required for the hardware management pack
  4. Install the Nutanix management packs
  5. Configure the management packs using the SCOM discovery and template wizards (an optional SNMP pre-flight check is sketched below)
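Before running the discovery wizards in step 5, it can save some head-scratching to confirm the cluster actually answers SNMP from the SCOM management server. Here is a rough Python sketch using pysnmp; the SNMPv3 user and keys are placeholders and must match whatever you configured under Prism’s SNMP settings.

```python
# Optional pre-flight check before step 5: verify the cluster answers SNMP
# from the SCOM management server. Nutanix clusters typically use SNMPv3
# (configured in Prism); the user and keys below are placeholders.
# Requires: pip install pysnmp
from pysnmp.hlapi import (
    getCmd, SnmpEngine, UsmUserData, UdpTransportTarget, ContextData,
    ObjectType, ObjectIdentity,
    usmHMACSHAAuthProtocol, usmAesCfb128Protocol,
)

CLUSTER_VIP = "10.0.0.50"   # hypothetical cluster virtual IP

error_indication, error_status, error_index, var_binds = next(getCmd(
    SnmpEngine(),
    UsmUserData("scom_monitor", authKey="authPass123", privKey="privPass123",
                authProtocol=usmHMACSHAAuthProtocol,
                privProtocol=usmAesCfb128Protocol),
    UdpTransportTarget((CLUSTER_VIP, 161), timeout=5, retries=1),
    ContextData(),
    ObjectType(ObjectIdentity("SNMPv2-MIB", "sysDescr", 0)),
))

if error_indication:
    print("SNMP check failed:", error_indication)
else:
    for name, value in var_binds:
        print(f"{name} = {value}")
```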

After the management packs have been installed and configured, you can use SCOM to monitor a variety of Nutanix objects, including cluster, alert, and performance views, as shown in the examples below. Also, check out this great video produced by my pal @mcghem. He shows a great demo of the SCOM management pack… Kudos, Mike… and also check out his blog.


Views and Objects Snapshots

Cluster Monitoring Snapshots

Cluster Performance Monitoring


Hardware Monitoring Snapshots

In the following diagram views, users can navigate to the components with failures.


Nutanix Objects Available for Monitoring via SCOM

The following provides a high-level overview of a Nutanix cluster and its components:

[Image: Nutanix cluster and components overview]

The following sections describe the Nutanix cluster objects monitored by this version of the management packs:

Cluster

  • Version – Current cluster version. This is the nutanix-core package version expected on all the Controller VMs.
  • Status – Current status of the cluster, usually either started or stopped.
  • TotalStorageCapacity – Total storage capacity of the cluster.
  • UsedStorageCapacity – Number of bytes of storage used on the cluster.
  • Iops – (Performance) Cluster-wide average IO operations per second.
  • Latency – (Performance) Cluster-wide average latency.

CVM Resource Monitoring

  • ControllerVMId – Nutanix Controller VM ID.
  • Memory – Total memory assigned to the CVM.
  • NumCpus – Total number of CPUs allocated to the CVM.

Storage

Storage Pool

A storage pool is a group of physical disks from the SSD and/or HDD tier.

  • PoolId – Storage pool ID.
  • PoolName – Name of the storage pool.
  • TotalCapacity – Total capacity of the storage pool. Note: an alert on a drop in capacity may indicate a bad disk.
  • UsedCapacity – Number of bytes used in the storage pool.

Performance parameters:

  • IOPerSecond – Number of IO operations served per second from this storage pool.
  • AvgLatencyUsecs – Average IO latency for this storage pool in microseconds.

Containers

A container is a subset of available storage within a storage pool. Containers hold the virtual disks (vDisks) used by virtual machines. Selecting a storage pool for a new container defines the physical disks where the vDisks will be stored.

  • ContainerId – Container ID.
  • ContainerName – Name of the container.
  • TotalCapacity – Total capacity of the container.
  • UsedCapacity – Number of bytes used in the container.

Performance parameters:

  • IOPerSecond – Number of IO operations served per second from this container.
  • AvgLatencyUsecs – Average IO latency for this container in microseconds.

Hardware Objects

Cluster

  • Discovery IP Address – IP address used for discovery of the cluster.
  • Cluster Incarnation ID – Unique ID of the cluster.
  • CPU Usage – CPU usage for all the nodes of the cluster.
  • Memory Usage – Memory usage for all the nodes of the cluster.
  • Node IP address – External IP address of the node.
  • System Temperature – System temperature.

Disk

  • Disk State/health – State as returned by PRISM [REST /hosts “state” attribute].
  • Disk ID – ID assigned to the disk.
  • Disk Name – Name of the disk (full path where metadata is stored).
  • Disk Serial Number – Serial number of the disk.
  • Hypervisor IP – Host OS IP where the disk is installed.
  • Tier Name – Disk tier.
  • CVM IP – Controller VM IP which controls the disk.
  • Total Capacity – Total disk capacity.
  • Used Capacity – Total disk space used.
  • Online – Whether the disk is online or offline.
  • Location – Disk location.
  • Cluster Name – Disk cluster name.
  • Discovery IP address – IP address through which the disk was discovered.
  • Disk Status – Status of the disk.

Node

  • Node State/health – Node state as returned by PRISM [REST /hosts “state” attribute].
  • Node IP address – External IP address of the node.
  • IPMI Address – IPMI IP address of the node.
  • Block Model – Hardware model of the block.
  • Block Serial Number – Serial number of the block.
  • CPU Usage % – CPU usage for the node.
  • Memory Usage % – Memory usage for the node.
  • Fan Count – Total number of fans.
  • Power Supply Count – Total number of power supplies.
  • System Temperature – System temperature.

Fan

  • Fan number – Fan number.
  • Fan speed – Fan speed in RPM.

Power supply

  • Power supply number – Power supply number.
  • Power supply status – Power supply status (present or absent).

If you would like to check out the Nutanix management pack on your SCOM instance, please go to our portal to download the management pack and documentation.
This management pack was developed by our awesome engineering team @ Nutanix. Kudos to Yogi and team for a job well done!!! 😉  I hope I gave you a good feel for Nutanix monitoring using SCOM. As always, if you have any questions or comments, please leave them below…

Until next time….Rob

NPP Training series – Drive Breakdown

To continue the NPP training series, here is my next topic: Drive Breakdown
If you missed other parts of my series, check out links below:
Part 1 – NPP Training series – Nutanix Terminology
Part 2 – NPP Training series – Nutanix Terminology
Cluster Architecture with Hyper-V

Data Structure on Nutanix with Hyper-V
I/O Path Overview

To give credit, most of the content was taken from Steve Poitras’s “Nutanix Bible” blog as his content is the most accurate and then I put a Hyper-V lean to it.

Drive Breakdown

In this section I’ll cover how the various storage devices (SSD / HDD) are broken down, partitioned and utilized by the Nutanix platform. NOTE: All of the capacities used are in Base2 Gibibyte (GiB) instead of the Base10 Gigabyte (GB).  Formatting of the drives with a filesystem and associated overheads has also been taken into account.

SSD Devices

SSD devices store a few key items which are explained in greater detail above:

  • Nutanix Home (CVM core)
  • Cassandra (metadata storage) – MORE
  • OpLog (persistent write buffer) – MORE
  • Extent Store (persistent storage) – MORE

Below we show an example of the storage breakdown for a Nutanix node’s SSD(s):
[Image: Nutanix node SSD storage breakdown]
NOTE: The sizing for OpLog is done dynamically as of release 4.0.1, which allows the extent store portion to grow dynamically.  The values used assume a completely utilized OpLog.  Graphics and proportions aren’t drawn to scale.  When evaluating the Remaining GiB capacities, do so from the top down.  For example, the Remaining GiB to be used for the OpLog calculation would be after Nutanix Home and Cassandra have been subtracted from the formatted SSD capacity.
Most models ship with 1 or 2 SSDs; however, the same construct applies for models shipping with more SSD devices. For example, if we apply this to an example 3060 or 6060 node which has 2 x 400GB SSDs, this would give us 100GiB of OpLog, 40GiB of Content Cache and ~440GiB of Extent Store SSD capacity per node.  Storage for Cassandra is a minimum reservation and may be larger depending on the quantity of data.
[Image: SSD breakdown for a 3060/6060 node with 2 x 400GB SSDs]
For a 3061 node which has 2 x 800GB SSDs this would give us 100GiB of OpLog, 40GiB of Content Cache and ~1.1TiB of Extent Store SSD capacity per node.
[Image: SSD breakdown for a 3061 node with 2 x 800GB SSDs]

HDD Devices

Since HDD devices are primarily used for bulk storage, their breakdown is much simpler:

  • Curator Reservation (Curator storage) – MORE
  • Extent Store (persistent storage)

[Image: Nutanix node HDD storage breakdown]
For example, if we apply this to an example 3060 node which has 4 x 1TB HDDs this would give us 80GiB reserved for Curator and ~3.4TiB of Extent Store HDD capacity per node.
[Image: HDD breakdown for a 3060 node with 4 x 1TB HDDs]
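If you want to sanity-check the numbers above, the math is just base-10 GB to base-2 GiB conversion minus the reservations. The sketch below reproduces the 3060 example; note that the Nutanix Home and Cassandra reservations and the filesystem formatting overhead aren’t spelled out here, so the SSD figure is shown before those are taken out.

```python
# Rough sanity check of the drive-breakdown numbers above.
# Only the reservations the post states (OpLog, Content Cache, Curator) are
# used; Nutanix Home / Cassandra sizes and filesystem overhead are not given
# here, so the SSD result is shown "before those reservations".

GIB = 2**30  # bytes per GiB (base-2), as the post notes

def to_gib(marketing_bytes: int) -> float:
    """Convert a base-10 drive size (e.g. 400 * 10**9) to GiB."""
    return marketing_bytes / GIB

# 3060 node: 2 x 400GB SSD, 4 x 1TB HDD
ssd_raw_gib = 2 * to_gib(400 * 10**9)      # ~745 GiB
hdd_raw_gib = 4 * to_gib(1 * 10**12)       # ~3725 GiB

oplog_gib, content_cache_gib, curator_gib = 100, 40, 80

ssd_after_known = ssd_raw_gib - oplog_gib - content_cache_gib
hdd_extent_store = hdd_raw_gib - curator_gib

print(f"SSD raw: {ssd_raw_gib:.0f} GiB; after OpLog + cache: {ssd_after_known:.0f} GiB "
      "(Nutanix Home, Cassandra and formatting bring this down to ~440 GiB)")
print(f"HDD raw: {hdd_raw_gib:.0f} GiB; after Curator: {hdd_extent_store:.0f} GiB "
      f"(~{hdd_extent_store / 1024:.1f} TiB before formatting overhead; the post quotes ~3.4 TiB)")
```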
NOTE: the above values are accurate as of 4.0.1 and may vary by release.
Next up, I figured we would look at some of the cool software technologies that run on our CVM (Controller Virtual Machine), starting with the Elastic Dedupe Engine.

Until next time, Rob

Nutanix Community Edition – Public Beta – Now Available – Build Your Own Nutanix Test Lab for Free


Nutanix Community Edition

Another very exciting announcement at our inaugural .NEXT conference on June 9th, 2015 was Nutanix Community Edition (CE). So, what is it? Our website describes it best: “Community Edition is a 100% software solution enabling technology enthusiasts to easily evaluate the latest hyperconvergence technology at zero cost.”  In other words, you can use your own hardware to test out Nutanix.  Very cool.  This is great for building a lab and gaining a hands-on understanding of hyperconvergence.
Nutanix is offering a hardware compatibility list (HCL) to users that includes the minimum requirements to run the software; essentially, any standard x86 server can be used….
And to quote our CEO and co-founder Dheeraj Pandey,
“From our very first software release in 2012, Nutanix has been dedicated to open architectures and technologies, offering unprecedented customer choice and flexibility,” “Community Edition is the next step in democratizing hyperconverged infrastructure technology, enabling anyone to experience the transformative benefits of our software. Only by eliminating the requirement for proprietary hardware and embracing off-the-shelf platforms can the next revolution of datacenter technologies be fully realized.”
As the name implies, the support for the CE will come from the community through Nutanix’s NEXT online portal. Users will be able to log in, ask questions and get answers from the community.
CE also allows you to check out our new hypervisor based on KVM and Acropolis. Check out Josh Odgers’ blog to learn more about Acropolis.
Join the beta…And don’t forget my NPP training series that helps you with all the concepts around hyperconvergence.
Currently, I am getting started with Nutanix CE installation and will be posting my experiences in a later blog post with how I build my Nutanix Lab @ Home. 🙂

Until next time….Rob

NPP Training series – I/O Path Overview

To continue the NPP training series, here is my next topic: I/O Path Overview
If you missed other parts of my series, check out links below:
Part 1 – NPP Training series – Nutanix Terminology
Part 2 – NPP Training series – Nutanix Terminology
Cluster Architecture with Hyper-V

Data Structure on Nutanix with Hyper-V

To give credit, most of the content was taken from Steve Poitras’s “Nutanix Bible” blog as his content is the most accurate, and then I put a Hyper-V lean to it.

IO Path Overview

The Nutanix IO path is composed of the following high-level components:
[Image: Nutanix I/O path overview]

OpLog

  • Key Role: Persistent write buffer
  • Description: The OpLog is similar to a filesystem journal and is built to handle bursty writes, coalesce them and then sequentially drain the data to the extent store.  Upon a write, the OpLog is synchronously replicated to another n number of CVMs’ OpLogs before the write is acknowledged for data availability purposes.  All CVM OpLogs partake in the replication and are dynamically chosen based upon load.  The OpLog is stored on the SSD tier on the CVM to provide extremely fast write I/O performance, especially for random I/O workloads.  For sequential workloads the OpLog is bypassed and the writes go directly to the extent store.  If data is currently sitting in the OpLog and has not been drained, all read requests will be directly fulfilled from the OpLog until it has been drained, at which point it is served by the extent store/content cache.  For containers where fingerprinting (aka dedupe) has been enabled, all write I/Os will be fingerprinted using a hashing scheme, allowing them to be deduped based upon fingerprint in the content cache.  (A simplified sketch of this write path follows below.)
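Here is the simplified write-path sketch promised above, in Python. The replication factor, sequential-detection threshold and data structures are all illustrative; this is a mental model of the behavior described above, not actual Nutanix code.

```python
# Simplified sketch of the NDFS write path described above. Thresholds,
# replication factor and data structures are illustrative only.

REPLICATION_FACTOR = 2          # OpLog replicas required before ack (illustrative)
SEQUENTIAL_THRESHOLD_KB = 1024  # treat large contiguous writes as sequential (illustrative)

def handle_write(key, data, contiguous_kb, local_oplog, peer_oplogs, extent_store):
    """Route a write: sequential I/O bypasses the OpLog, random I/O is buffered
    in the SSD OpLog and synchronously replicated to peer CVMs before ack."""
    if contiguous_kb >= SEQUENTIAL_THRESHOLD_KB:
        extent_store[key] = data                       # sequential: straight to extent store
    else:
        local_oplog[key] = data                        # random/bursty: buffer on SSD
        for peer in peer_oplogs[:REPLICATION_FACTOR - 1]:
            peer[key] = data                           # synchronous replica for availability
    return "ack"                                       # acknowledged only after persistence

def serve_read(key, local_oplog, extent_store):
    """Reads hit the OpLog until data is drained, then the extent store."""
    return local_oplog.get(key, extent_store.get(key))

# Tiny usage example
oplog, peers, store = {}, [{}, {}], {}
handle_write("vm1-blk42", b"\x00" * 4096, contiguous_kb=4, local_oplog=oplog,
             peer_oplogs=peers, extent_store=store)
assert serve_read("vm1-blk42", oplog, store) == b"\x00" * 4096
```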

Extent Store

  • Key Role: Persistent data storage
  • Description: The Extent Store is the persistent bulk storage of NDFS and spans SSD and HDD and is extensible to facilitate additional devices/tiers.  Data entering the extent store is either being A) drained from the OpLog or B) is sequential in nature and has bypassed the OpLog directly.  Nutanix ILM will determine tier placement dynamically based upon I/O patterns and will move data between tiers.

Content Cache

  • Key Role: Dynamic read cache
  • Description: The Content Cache (aka “Elastic Dedupe Engine”) is a deduped read cache which spans both the CVM’s memory and SSD.  Upon a read request of data not in the cache (or based upon a particular fingerprint), the data will be placed into the single-touch pool of the content cache, which sits completely in memory, where it will use LRU until it is ejected from the cache.  Any subsequent read request will “move” (no data is actually moved, just cache metadata) the data into the memory portion of the multi-touch pool, which consists of both memory and SSD.  From here there are two LRU cycles, one for the in-memory piece, upon which eviction will move the data to the SSD section of the multi-touch pool where a new LRU counter is assigned.  Any read request for data in the multi-touch pool will cause the data to go to the peak of the multi-touch pool, where it will be given a new LRU counter.  Fingerprinting is configured at the container level and can be configured via the UI.  By default fingerprinting is disabled.  (A toy model of this two-pool behavior follows the diagram below.)
  • Below we show a high-level overview of the Content Cache:

[Image: Content Cache single-touch and multi-touch pools]
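And here is the toy model of the two LRU pools. Pool sizes and eviction details are invented for illustration; the real content cache also spans SSD and works on fingerprints, which this sketch glosses over.

```python
# Toy model of the content cache's two LRU pools: first touch goes to the
# single-touch pool; a second touch promotes data to the multi-touch pool.
# Pool sizes and eviction policy details are invented for illustration.
from collections import OrderedDict

class TwoPoolLRU:
    def __init__(self, single_size=4, multi_size=4):
        self.single = OrderedDict()   # single-touch pool (in-memory, LRU)
        self.multi = OrderedDict()    # multi-touch pool (memory + SSD in reality)
        self.single_size, self.multi_size = single_size, multi_size

    def read(self, fingerprint, fetch):
        if fingerprint in self.multi:               # repeat hit: move to MRU position
            self.multi.move_to_end(fingerprint)
            return self.multi[fingerprint]
        if fingerprint in self.single:              # second touch: promote
            data = self.single.pop(fingerprint)
            self._insert(self.multi, fingerprint, data, self.multi_size)
            return data
        data = fetch(fingerprint)                   # miss: read from the extent store
        self._insert(self.single, fingerprint, data, self.single_size)
        return data

    @staticmethod
    def _insert(pool, key, value, limit):
        pool[key] = value
        if len(pool) > limit:
            pool.popitem(last=False)                # evict the LRU entry

cache = TwoPoolLRU()
cache.read("fp-1", lambda fp: f"data-for-{fp}")     # first touch -> single-touch pool
cache.read("fp-1", lambda fp: f"data-for-{fp}")     # second touch -> multi-touch pool
assert "fp-1" in cache.multi
```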

Extent Cache

  • Key Role: In-memory read cache
  • Description: The Extent Cache is an in-memory read cache that is completely in the CVM’s memory.  This stores non-fingerprinted extents for containers where fingerprinting and dedupe are disabled.

Drive Breakdown

In this section I’ll cover how the various storage devices (SSD / HDD) are broken down, partitioned and utilized by the Nutanix platform. NOTE: All of the capacities used are in Base2 Gibibyte (GiB) instead of the Base10 Gigabyte (GB).  Formatting of the drives with a filesystem and associated overheads has also been taken into account.

SSD Devices

SSD devices store a few key items which are explained in greater detail above:

  • Nutanix Home (CVM core)
  • Cassandra (metadata storage) – MORE
  • OpLog (persistent write buffer)
  • Extent Store (persistent storage)

Below we show an example of the storage breakdown for a Nutanix node’s SSD(s):
[Image: Nutanix node SSD storage breakdown]
NOTE: The sizing for OpLog is done dynamically as of release 4.0.1, which allows the extent store portion to grow dynamically.  The values used assume a completely utilized OpLog.  Graphics and proportions aren’t drawn to scale.  When evaluating the Remaining GiB capacities, do so from the top down.

For example the Remaining GiB to be used for the OpLog calculation would be after Nutanix Home and Cassandra have been subtracted from the formatted SSD capacity. Most models ship with 1 or 2 SSDs, however the same construct applies for models shipping with more SSD devices. For example, if we apply this to an example 3060 or 6060 node which has 2 x 400GB SSDs this would give us 100GiB of OpLog, 40GiB of Content Cache and ~440GiB of Extent Store SSD capacity per node.  Storage for Cassandra is a minimum reservation and may be larger depending on the quantity of data.
[Image: SSD breakdown for a 3060/6060 node with 2 x 400GB SSDs]
For a 3061 node which has 2 x 800GB SSDs this would give us 100GiB of OpLog, 40GiB of Content Cache and ~1.1TiB of Extent Store SSD capacity per node.
[Image: SSD breakdown for a 3061 node with 2 x 800GB SSDs]

HDD Devices

Since HDD devices are primarily used for bulk storage, their breakdown is much simpler:

  • Curator Reservation (Curator storage) – MORE
  • Extent Store (persistent storage)

[Image: Nutanix node HDD storage breakdown]
For example, if we apply this to an example 3060 node which has 4 x 1TB HDDs this would give us 80GiB reserved for Curator and ~3.4TiB of Extent Store HDD capacity per node.
[Image: HDD breakdown for a 3060 node with 4 x 1TB HDDs]
For a 6060 node which has 4 x 4TB HDDs, this would give us 80GiB reserved for Curator and ~14TiB of Extent Store HDD capacity per node.
[Image: HDD breakdown for a 6060 node with 4 x 4TB HDDs]
NOTE: the above values are accurate as of 4.0.1 and may vary by release.
Next up, Drive Breakdown on Nutanix

Until next time, Rob….