Great choice for Windows shops
August 19, 2019
Score 8 out of 10
Overall Satisfaction with Windows Server Failover Clustering
We use Windows Server Failover Clustering for our Hyper-V environment to improve the availability of the VMs. The most important problem Failover Clustering addresses is downtime for server maintenance: standalone Hyper-V hypervisors tend to need a lot of downtime for Windows updates. The entire organization uses VMs running on top of the Failover Cluster. We use a hyper-converged solution with Failover Clustering to avoid the need to purchase expensive SANs, so the cost of improved uptime is relatively low.
- Reduced outages for server maintenance. VMs can be live migrated from the node being taken down for maintenance to avoid outages. With Cluster-Aware Updating (CAU) it is possible to run Windows Update on cluster nodes automatically.
- Very fast live migration and failover. With hyper-converged DAS, live migration is so fast that it is hard to notice any VM outage in an RDP session.
- Inexpensive. Failover Clustering is included in Windows Server. For educational organizations, Windows Server licenses are extremely cheap.
- iSCSI configuration can be confusing. To achieve redundancy, each node in the cluster must have redundant (multi-path) access to storage (iSCSI, Fibre Channel, etc.). Configuring iSCSI multi-path correctly can take several tries.
- The configuration is time-consuming. The Cluster Validation Wizard is verbose, and it takes a while to read through its report and check all the issues. It is still very important to go through all of the information, though. It is easy to configure a cluster that seems fine but does not fail over when needed.
- Not really a drawback, but an effort must be made to understand quorum configuration if a cluster has an even number of nodes. I would suggest doing multiple failover tests before using the cluster for production, including pulling power cables from nodes and disconnecting network cables to simulate switch failure.
- Maintenance windows were dramatically reduced as a result of Failover Clustering implementation, reducing overtime.
- Hypervisor failure is no longer expected to result in an outage of critical services. I suggest doing realistic failover testing by pulling the power cable from one of the nodes.
- It is much faster to migrate VMs between cluster nodes than between stand-alone hypervisors, so less time is spent on VM provisioning.
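On the quorum point above: the reason an even number of nodes needs extra care can be shown with simple vote counting. The sketch below is only an illustration in Python, not how Failover Clustering is implemented. The function names are made up for this example, and it assumes static votes (one per node, plus one for an optional witness). It ignores features such as dynamic quorum, which adjust votes at run time.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the configured votes."""
    return total_votes // 2 + 1


def has_quorum(nodes_up: int, total_nodes: int,
               witness: bool = False, witness_up: bool = True) -> bool:
    """Return True if the surviving partition holds a majority of votes.

    Hypothetical helper: each node has one vote, and an optional
    witness (disk, file share, etc.) adds one more.
    """
    total_votes = total_nodes + (1 if witness else 0)
    votes_present = nodes_up + (1 if witness and witness_up else 0)
    return votes_present >= votes_needed(total_votes)


if __name__ == "__main__":
    # Four nodes split 2/2 by a switch failure, no witness:
    # each half holds 2 of 4 votes, and 3 are needed.
    print(has_quorum(nodes_up=2, total_nodes=4))
    # Same split, but this half can still reach the witness:
    # 3 of 5 votes.
    print(has_quorum(nodes_up=2, total_nodes=4, witness=True))
```

Under these assumptions, a four-node cluster split evenly has no surviving majority unless a witness breaks the tie, which is exactly the scenario that disconnecting network cables during testing is meant to expose.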
Obvious competitors are VMware ESXi clustering and Linux KVM clustering. We selected Windows Server Clustering instead because most of our server environment is based on Windows Server. Windows Server licenses were readily available, so Windows Clustering was an inexpensive option. ESXi would have been out of our budget. We decided against KVM in order to keep the environment simple and unified on the Windows platform. Various pre-built hyper-converged appliances from vendors like Scale Computing were ruled out as we wanted to re-use existing servers.