Microsoft Azure Cloud Series – What is Cloud? – Part 1

Hi All, its Rob again and I decided to write a series on Azure Cloud.  Since Azure Stack is months away from GA, its good to understand Azure Cloud for a few reasons.  The API is the consistent across Azure Cloud and Azure Stack. And building a hybrid environment is the future for IT to use features like DR, Application Portability and Backup.
So, let’s start from the beginning and go over the fundamental terms.

What is Cloud? Simply put, it’s a large number of devices connected through wide communication network.

What are the benefits of Cloud?

  • Provide Services at much lower cost
  • Flexibility on technology that gives the customer a competitive advantage
  • Helps IT to be more efficient on operations
  • Pays as they go and for what they use
  • Move to OPEX model from CAPEX model
  • Faster deployment

Types of Clouds and Examples

Private

  • A private cloud is dedicated to the customer’s organization.
  • On-premises (their own data center) or in a partner’s hosting facility.
  • More control over the level of security, privacy, customization, and governance of your software and services.
  • More cost-effective data center operations using the customer’s existing investments in people and equipment.
  • Example: Customer needs dedicated resources and wants to restrict access to all content in the cloud

Public

  • The public cloud is where cloud services are provided in a virtualized environment.
  • Resources are available on demand.
  • Centralized operation and resource management are shared.
  • Customers can access the resource they need, and then only pay for what they use.
  • Many services are available that are ready to be accessed from any internet enabled device.
  • Example: Customer can share resources and wants to pay when they use the resource

Hybrid

  • A hybrid cloud is an integrated service utilizing on-premises resources, private clouds and public clouds.
  • Moving non-sensitive functions to the public cloud allows an organization to benefit from public cloud scalability while reducing the demands on a private cloud.
  • The availability of secure resources and scalable cost-effective public resources provides organizations with options.
  • Example: Customer has secure and non-secure data and they have made some investment in their own infrastructure and they want to use them

Industry Tends

Industrytrendsazure Cloud

As we look at the IT industry today, a number of important trends are changing the way software is being purchased, deployed and used in the organization.

Data Explosion

The volume of data in the workplace is exploding. According to IDC, digital data will grow more than 40x over the next decade. As more and more data is created digitally, we expect to see ever greater demands being placed on our data platforms to store, secure, process and manage these large volumes of data.

Consumerization of IT

Today we see an increasing trend toward the “consumerization” of IT—creating the demand for Web 2.0 experiences in the business environment. As consumers, we are accustomed to powerful user-friendly experiences, whether searching the Internet on a mobile device to find information instantly, or quickly accessing our personal financial data. In the workplace, however, we are often unable to answer even the most basic questions about our business. Increasingly, users demand business experiences that more closely mirror the convenience and ease of use in consumer life.

Private and Public Cloud

Cloud computing is changing the way data is accessed and processed, and it is creating whole new models for the way applications are delivered. According to IDC, Cloud services will account for 46% of net-new growth in overall IT spending. With private and public cloud infrastructure, organizations have an opportunity to reduce TCO dramatically as data volume increases. As we see an evolution toward greater use of private and public clouds, we see an increasing need for solutions that support hybrid cloud scenarios.

Azure at a Glance

azure-overview Cloud
So this picture, or at least the big blocks above, are how Microsoft thinks about the main capability buckets of their platform. As we go though this series, we will start to get more specific about these buckets (Compute, data, networks, and app services).
Well, I hope you enjoyed this brief introduction on Azure Cloud.  Stay tuned to the rest of the series. Next up, more fundamentals and use cases and then we dive into some fundamentals, like Azure Resource Manager.

Until next time, Rob

Microsoft SQL Server High Availability Options on Nutanix

To give credit, this content was taken from my buddy Mike McGhem’s blog and I added some more color to it, but his content is right on.

In General, modern versions Microsoft SQL Server (MSSQL) supports several High Availability (HA) options at both the host and storage level.  For the purposes of this post I will only be addressing the HA options which leverage native Windows Server Failover Clustering (WSFC)  in some form.  SQL Server also provides transactional replication through the use of a publisher and subscriber model, which some consider an HA option, but that’s a topic (and debate) for another post with Mike McGhem.
Starting with MSSQL 2012, Microsoft introduced AlwaysOn which is a combination of some existing and new functionality.  Under the AlwaysOn umbrella falls two main options, Failover Cluster Instances (FCI) and Availability Groups (AAG).

Nutanix has long supported and recommended the use of AlwaysOn Availability Groups.  AAG leverages a combination of WSFC and native database level replication to create either an HA or disaster recovery solution between instances of MSSQL.  The instances of MSSQL leveraged to support the AAG can be either standalone or clustered (in the case of Nutanix these would be standalone instances today).

The following figure provides a logical overview of an AlwaysOn Availability Group.
Microsoft SQL High Availability
An AAG performs replication at the database level creating “primary” and one or more “secondary” database copies.  The secondary copies are replicated using  either synchronous or asynchronous commit mode, as specified by an administrator.  Asynchronous commit is intended more as a disaster recovery or reporting solution as it implies the potential for data loss.  So for HA scenarios as we’re discussing them here, we should assume synchronous commit.  Because database replication is used, shared storage is not required and each MSSQL instance within the AAG can use its own local devices.  Additional details on AlwaysOn Availability Groups can be found here: https://msdn.microsoft.com/en-us/library/hh510230.aspx

AAGs can take advantage of the secondary databases for the purpose of read-only transactions or for backup operations.  In the context of a scale-out architecture like Nutanix, leveraging multiple copies across hypervisor hosts for distributing these kinds of operations creates an excellent solution.

While AAGs are a great solution and fit nicely with the Nutanix architecture, they may not be a good fit or even possible for certain environments.  Some of the limiting factors for adopting AAGs can include:

  • Space utilization:  Because a secondary database copy is created additional storage space will be consumed.  Some administrators may prefer a single database copy where server HA is the primary use case.
  • Synchronous commit performance:  The synchronous replication of transactions (Insert/Update/Delete…) needed for AAG replication (in the context of an HA solution) do have a performance overhead.  Administrators of latency sensitive applications may prefer not to have the additional response time of waiting for transactions to be committed to multiple SQL instances.
  • Distributed Transactions:  Some applications perform distributed transactions across databases and MSSQL instances.  Microsoft does not support the use of distributed transactions with AAGs, and by extension application vendors will not support their software which utilize distributed transactions where AAGs are present.
  • SQL Server versions:  Some environments can simply not yet upgrade to SQL 2012 or higher.  Whether it be due to current business requirements or application requirements based on qualification, many administrators have to stick with SQL 2008 (and I hope not, but maybe even earlier versions) for the time being.

In the above cases MSSQL Failover Cluster Instances are likely the better solution.  FCI have long been used as the primary means for HA with MSSQL.  FCI can be leveraged with all current versions of MSSQL and relies on shared storage to support the MSSQL instances.  The following figure provides a logical overview of Failover Cluster Instances.
Microsoft SQL High Availability
The shared storage used can be block (LUN) based or, starting with MSSQL 2012, SMB (file) based.  In the case of LUN based shared storage, SCSI-3 persistent reservations are used to arbitrate ownership of the shared disk resources between nodes.  The MSSQL instance utilizing specific LUNs is made dependent against those disk resources.  Additional details on AlwaysOn Failover Cluster Instances can be found here:  https://msdn.microsoft.com/en-us/library/ms189134.aspx

Until very recently Nutanix has not supported MSSQL FCI within virtual machines, whether they reside on ESXi, Hyper-V or the Nutanix Acropolis Hypervisor (AHV).  But starting with the Nutanix 4.5 release (with technical preview support in posted 4.1.5 release), MSSQL FCI will be supported across all three of the aforementioned hypervisors.  Nutanix will support this form of clustering using iSCSI from within the virtual machines.  In essence Nutanix virtual disks (vdisks) which support SCSI-3 persistent reservations are created within a Nutanix container.  These vdisks will be presented directly to virtual machines as LUNs, leveraging the Nutanix Controller Virtual Machines (CVM) as iSCSI targets.  The virtual machines will utilize the Microsoft iSCSI initiator service and the Multipath I/O (MPIO) capabilities native to the Windows Operating System for connectivity and path failover.  An overview of this configuration can be seen in the following diagram.
Microsoft SQL High Availability
The association between virtual machine iSCSI initiators and the vdisks is managed via the concept of a Volume Group.  A volume group acts as a mapping to determine the virtual disks which can be accessed by one or multiple (in the case of clustering) iSCSI initiators.   Additional information on volume groups can be found under the Volumes API section of the Nutanix Bible: http://stevenpoitras.com/the-nutanix-bible/
Like AAG’s, MSSQL FCI may not be best suited for all environments.  Some of its drawback can include:

  • Shared storage complexity:  The configuration and maintenance of shared storage is often more complex to manage than standalone environments
  • Planned or unplanned downtime:  FCI can generally take more time to transition operation between cluster nodes than a similar AAG configuration.  Part of this downtime is to commit transactions which may have been in-flight prior to failover.  This can be somewhat mitigated with the recovery interval setting or using indirect checkpoints (https://msdn.microsoft.com/en-us/library/ms189573.aspx).
  • Separation of workloads:  AAG configurations can create multiple database copies across SQL instances for the purposes of distributed reporting or for backup offload.  An FCI cannot offer this functionality natively, although such configurations are possible via intelligent cloning methodologies that the Nutanix platform can offer.

As mentioned earlier it’s possible to configure both FCI and AAG as a part of the same solution.  So for example, if the HA capabilities of FCI are preferred, but the  replication capabilities of AAG are desired for the purposes of reporting, backup offload or disaster recovery, a blended configuration can be deployed.

With the support of shared storage clustering in 4.5, Nutanix can provide the full range of options necessary to support the broad number of use cases SQL Server can require.  Mike McGhem will have follow-on posts on his blog to detail how to configure volume group based clustering for Microsoft SQL Server.

Until next time, Rob.

SQL 2012 AlwaysON Feature….What is it? How does it work?

As a Microsoft Solutions Architect, part of my job is to help the teams with solutions around the Microsoft stack.  Today, a colleague of mine reached out to me about the new SQL Server AlwaysOn feature that part of SQL 2012 and how it compared to SQL 2008 clustering….So I started with this topic to bring some understanding around it:

SQL Server AlwaysON
Prior to SQL Server 2012, SQL Server had several high availability and disaster recovery solutions for an enterprise’s mission critical databases such as failover clustering, database mirroring, log shipping or combinations of these. Each solution typically has a major limitation, in the case of failover clustering for example, its configuration is very tedious and complex and you arguably have single shared storage or single point of failure. Database mirroring is relatively easy to configure in comparison with failover clustering, but you can have only one database in a single mirroring setup and you cannot read from the mirrored database. Log shipping does not provide automatic failover (higher availability) though it be used for disaster recovery with some expected data loss.

SQL Server AlwaysOn

SQL 2012 AlwaysOn Diagram

SQL Server 2012  introduced a new feature called AlwaysOn which combines the best of failover clustering and database mirroring and overcomes major of the limitations imposed in failover clustering or a database mirroring setup.

AlwaysOn is a High Availability (HA) and Disaster Recovery (DR) solution in SQL Server 2012 which improves high availability and protects data of your mission critical applications. AlwaysOn is the common name for two high availability and disaster recovery solutions:

AlwaysOn Failover Cluster Instance (FCI)
This is an enhancement to the existing SQL Server failover clustering (which is based on Windows Server Failover Cluster (WSFC)) which provides higher availability of SQL Server instance after failover. Some of the enhancements in AlwaysOn Failover Cluster Instance over the existing SQL Server failover clustering are:

  • Multisite failover clustering
  • Flexible failover policies to better control instance failover
  • Improved diagnostics capabilities out of the box

AlwaysOn Availability Group (AG)

This is a new HA/DR feature  in SQL 2012 and combines best of failover clustering and database mirroring. It allows you to create a group of databases which failover together as a unit from one replica/instance of SQL Server to another replica/instance of SQL Server in the same availability group. Each availability group that we create, allows you to create one (and only) availability group listener which is nothing but a Virtual Network Name (VNN) to be used by clients to connect to the availability group.
The AlwaysOn availability group is based on Windows Server Failover Cluster (WSFC) and hence you need to install the failover clustering feature on each server/replica and create a failover cluster adding all these server/replicas before you can start enabling/creating the availability group.

Availability Groups Compared To Traditional SQL Server Failover Clustering
In a typical SQL Server failover cluster (at the instance level), you will have two nodes/instances (Active-Passive or Active-Active) connected to shared storage drives. Though SQL Server failover clustering has been good and is used in many deployments for higher availability and disaster recovery, it has several limitations and pain points, such as:

  • The process of setting up SQL Server failover clustering is tedious and complex – there are some 30-40 steps that you have to perform missing any of those steps can result in hours of additional work. This is why setting up SQL Server failover clustering is only recommended to be performed by highly experienced professionals.
  • Both the nodes are connected to a shared storage drive; though these drives might have their own failover mechanisms we still can have a single point of failure.
    One of the nodes is idle all the time in case of Active-Passive cluster (recommended) and hence resources are underutilized. Though you have an Active-Active failover cluster this is not recommended as after failover one node will have double the load from both the cluster setup/applications.
  • The infrastructure and configuration of each node should be exactly same as other nodes and mimic each other.
  • You cannot distribute or load balance your read-write load from read only load on multiple nodes.
  • An AlwaysOn availability group is superior to SQL Server failover clustering because the configuration, deployment and management is relatively simple and all the nodes/replicas will a copy of the databases and hence there is no shared storage or a single point of failure. You can have readable secondary and hence you can route your read-only load to a secondary replica and the read-write load to primary replica and hence have better utilization of your hardware resources.

How Availability Group differs from database mirroring
Database mirroring (at database level) can be set up in either synchronous mode or asynchronous mode but not both in a single mirroring setup.

  1. Synchronous Commit mode (high-safety) : The transaction logs are hardened at both the principal server as well as at the mirror server before commit acknowledgement is returned to the client; it may introduce some latency but ensures no data loss after failover. In this mode you can also set automatic failover and for that you need another instance which will work as a witness and performs the job of role switching.
  2. Asynchronous Commit mode (high performance) : The principal server hardens the transaction log at the principal server and returns the commit acknowledgement to client without waiting for transaction log hardening acknowledgement to be received from the mirror server. Transaction log hardening at the mirror server happens in an asynchronous manner.

These all sound like good solutions, but like SQL Server failover clustering, it has also several limitations:

  1. You can have only one database in a single mirroring session/setup, though you can define multiple mirroring sessions/setups (one for each database) but it is not possible to have a group of databases failover together.
  2. Databases on mirror server are always in recovery mode and hence you cannot read from a mirrored database (though you can create a database snapshot and read from it but but would only reflect data till the particular point in time when it was created).
  3. You cannot load balance your read-write requests on one server and read-only on another server.
  4. You can have only one mirror server; you cannot have one for higher availability (synchronous commit mode) and one for disaster recovery (asynchronous commit mode) in one single mirroring session, although you can combine it with log shipping for disaster recovery.

An AlwaysOn availability group is recommended over database mirroring as this overcomes several limitations imposed in database mirroring, for example with an AlwaysOn availability group:

  1. You can have multiple mirrored instance/nodes/replicas (up to four secondaries apart from one primary replica) with a combination of synchronous commit mode and asynchronous commit mode both at the same time. The replica set up in synchronous commit mode can be used for higher availability (or for automatic failover) and the replica set up in asynchronous commit mode can be used for disaster recovery.
  2. You can combine two or more database together and failover them as a unit, you don’t need to do it for each database separately as you were doing in case of database mirroring.
  3. You can offload the read-only load from the primary replica to the secondary by configuring the secondary as readable. In this way you can have better utilization of secondary replica’s hardware resources.
  4. You can also offload backup operations from the primary replica to the secondary replica and hence have less workload/IO on the primary replica and better utilization of the secondary replica’s hardware.

So…as you can see….it is a welcomed feature in SQL Clustering technologies.  It reminds me a lot of the Exchange Available Groups for DB introduced in Exchange 2010.  There are some new upcoming features being announced for SQL AlwaysON at Ignite from what I hear.  Do we have SQL Azure integration coming?

Until next time, Rob…