Item: Windows Server Failover Clustering
Rating: 8
Author: Verified User

Overall Satisfaction with Windows Server Failover Clustering

Use Cases and Deployment Scope

We use Windows Server Failover Clustering for our Hyper-V environment to improve the availability of the VMs. The most important problem Failover Clustering addresses is downtime for server maintenance: standalone Hyper-V hypervisors tend to need a lot of downtime for Windows updates. The entire organization uses VMs running on top of the Failover Cluster. We use a hyper-converged solution with Failover Clustering to avoid the need to purchase expensive SANs, so the cost of improved uptime is relatively low.

Pros and Cons

Pros

Reduced outages for server maintenance. VMs can be live migrated from the node being taken down for maintenance to avoid outages. With Cluster-Aware Updating (CAU) it is possible to run Windows Update on cluster nodes automatically.
Very fast live migration and failover. With hyper-converged DAS, live migration is so fast that it is hard to notice any VM interruption in an RDP session.
Inexpensive. Failover Clustering is included in Windows Server. For educational organizations, Windows Server licenses are extremely cheap.
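
Draining a node for maintenance only avoids an outage if the remaining nodes can hold every VM. A minimal sketch of that check follows, assuming memory is the limiting resource. The function name and all figures are hypothetical, and the check ignores CPU, storage, hypervisor overhead, and per-VM placement.

```python
from typing import Sequence


def can_drain_any_node(node_memory_gb: Sequence[int], vm_memory_gb: Sequence[int]) -> bool:
    """Return True if all VMs still fit when the largest node is offline.

    Simplified model: compares total VM memory against the total memory
    of the remaining nodes, so it is an optimistic estimate.
    """
    if len(node_memory_gb) < 2:
        # A single node has nowhere to migrate its VMs.
        return False
    # Worst case: the node with the most memory is the one taken down.
    remaining = sum(node_memory_gb) - max(node_memory_gb)
    return sum(vm_memory_gb) <= remaining


# Hypothetical example: six 64 GB VMs on nodes with 256 GB each.
three_nodes_ok = can_drain_any_node([256, 256, 256], [64] * 6)
two_nodes_ok = can_drain_any_node([256, 256], [64] * 6)
```

With these made-up numbers, the three-node cluster can lose a node and the two-node cluster cannot, which is the extra failover capacity that has to be budgeted for.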

Cons

iSCSI configuration can be confusing. To achieve redundancy, each node in the cluster must have redundant (multi-path) access to storage (iSCSI, Fibre Channel, etc.). Configuring iSCSI multi-path correctly can take several tries.
The configuration is time-consuming. The Cluster Validation Wizard is verbose, so it takes a while to read through and check all the issues. It is still very important to go through all of the information, though. It is easy to configure a cluster that seems fine but does not fail over when needed.
Not really a drawback, but effort must be made to understand quorum configuration if a cluster has an even number of nodes. I would suggest doing multiple failover tests before using the cluster for production, including pulling power cables from nodes and disconnecting network cables to simulate switch failure.
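
To show why an even node count needs attention, here is a rough sketch of simple majority voting. It is a simplified static model, not the actual cluster logic: Windows Server also adjusts votes dynamically, which this ignores. The function names are made up for illustration.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def side_survives_even_split(nodes: int, witness: bool) -> bool:
    """Can either half keep quorum when the network splits the nodes 50/50?

    Assumes an even node count, one vote per node, and an optional
    witness that is reachable by exactly one half.
    """
    if nodes % 2 != 0:
        raise ValueError("an even split needs an even number of nodes")
    total_votes = nodes + (1 if witness else 0)
    # The better-off half holds its own nodes plus the witness vote, if any.
    best_side_votes = nodes // 2 + (1 if witness else 0)
    return best_side_votes >= votes_needed(total_votes)


# Four nodes split 2/2: without a witness neither side has 3 of 4 votes.
without_witness = side_survives_even_split(4, witness=False)
with_witness = side_survives_even_split(4, witness=True)
```

In this model a witness turns an even vote count into an odd one, so one half of a split cluster can still reach a majority. Pulling network cables during testing is a practical way to confirm the real cluster behaves this way.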

Return on Investment

Maintenance windows were dramatically reduced as a result of Failover Clustering implementation, reducing overtime.
Hypervisor failure is no longer expected to result in an outage of critical services. I suggest doing realistic failover testing by pulling the power cable from one of the nodes.
It is much faster to migrate VMs between cluster nodes than between stand-alone hypervisors, so less time is spent on VM provisioning.

Usability

Usability of Failover Clustering on Windows Server is generally good. The Failover Clustering console is not hard to understand, given the complexity of the product. Most tasks on the cluster can be done via PowerShell, so automation is possible and not hard (PowerShell is very intuitive). Configuring storage is the hardest and most confusing task during cluster configuration, so storage configuration should be planned in advance. The Cluster Validation Wizard is verbose, but most of the errors are easy to understand.

Support Rating

We have never contacted Microsoft regarding Failover Clustering. Documentation is mostly good and readily available. Not much can be said here.

Alternatives Considered

Obvious competitors are VMware ESXi clustering and Linux KVM clustering. Windows Server Clustering was selected instead as most of our server environment is Windows Server based. Windows Server licenses were readily available, so Windows Clustering was an inexpensive option. ESXi would have been out of our budget. KVM was not used to keep the environment simple and unified on the Windows platform. Various pre-built hyper-converged appliances from vendors like Scale Computing were ruled out as we wanted to re-use existing servers.

Other Software Used

Darktrace, G Suite, Fortinet FortiGate

Likelihood to Recommend

Windows Failover Clustering is a good fit for medium to large organizations with a predominantly Windows Server environment. VMware and Linux shops have their own clustering options. A cost-benefit analysis should be done before deploying a cluster, as extra capacity for failover is an additional expense. As servers are quite reliable, stand-alone hypervisors can be a better fit for a small business that can tolerate outages for maintenance. While the Failover Clustering feature itself is included in the Windows Server license, the cost of extra servers and especially SANs (if used) is significant. The organization must calculate whether reduced downtime is worth the expense, especially considering that clustering by itself does not guarantee high availability.
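
The cost-benefit calculation can be sketched roughly as below. Every number is a made-up placeholder rather than a figure from our deployment, and the function name is only illustrative. Real inputs would come from the organization's own downtime history and hardware quotes.

```python
def clustering_net_benefit(
    standalone_downtime_hours: float,
    clustered_downtime_hours: float,
    downtime_cost_per_hour: float,
    extra_annual_cluster_cost: float,
) -> float:
    """Yearly value of avoided downtime minus the yearly cost of the cluster.

    A positive result suggests clustering pays for itself; a negative one
    suggests stand-alone hypervisors may be the better fit.
    """
    hours_avoided = standalone_downtime_hours - clustered_downtime_hours
    return hours_avoided * downtime_cost_per_hour - extra_annual_cluster_cost


# Hypothetical larger organization: downtime is expensive.
large_org = clustering_net_benefit(24, 2, 500, 8000)

# Hypothetical small business: outages for maintenance are tolerable.
small_business = clustering_net_benefit(24, 2, 100, 8000)
```

With these placeholder inputs, the larger organization comes out ahead and the small business does not, which matches the recommendation above.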

Comments

Great choice for Windows shops