What to know before implementing Failover Cluster
August 30, 2016

Marc-Olivier Turgeon-Ferland | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User

Overall Satisfaction with Windows Server Failover Clustering

Windows Server Failover Clustering is used on most of our production infrastructure. We use it for our general file server storage, Scale-Out File Server storage, and Hyper-V clusters.

Because it is used for our VM environment, it is used by the whole organization.

It provides high availability for those services.
  • Live Migration of VMs between hosts. If you have sufficient network bandwidth, it is fast, and I have never had a failed live migration break or kill a VM. In the worst case the live migration fails (not enough RAM on the target host, for example), but the VM has always stayed up.
  • Windows Server Failover Clustering enables Scale-Out File Server storage, which is probably the coolest feature Microsoft has to offer at the moment. It gives you active-active SMB file shares, which can now be used by most Microsoft services such as MS SQL and Hyper-V, and by clients running Windows 8 or later.
  • Cluster Validation is thorough and easy to understand. It gives you comprehensive error messages that help you diagnose and fix problems quickly, so you can get your Failover Cluster running in no time.
  • Storage pool and virtual disk (VD) management through Failover Cluster is confusing. You sometimes need to start a task from Failover Cluster Manager (to have the right permissions), but it just opens the new Server Manager console. Failover Cluster Manager also shows information such as the number of columns of a VD, but not the deduplication stats. It would be nice to at least have all the information available in both consoles, or to eliminate one of them.
  • General file server switchover between nodes is slow and causes timeouts during the switch. Failover Cluster doesn't seem to manage VD ownership very well. I even had a case where a VD was locked by a node that had gone down (bluescreen), which brought the whole cluster down.
  • DLL locking also doesn't seem to be handled well. We had multiple cases where the Hyper-V cluster crashed because updates waiting for a restart had locked a DLL.
  • Failover Cluster lets us apply updates and upgrade or change hardware without causing an outage, which means we don't have to work night shifts.
  • Creating one cluster with all of our Hyper-V servers lets us move VMs between hosts via live migration to balance RAM usage. Before, this was time consuming and slow over the network.
  • It created some problems that took us quite some time to investigate before we found the cause. We encountered DLL locking that caused Failover Cluster to force-restart a host. Logs are really not the strong point of Failover Cluster Manager, and even Microsoft Support wasn't able to help much. We had to find the problem ourselves.
If you are already running Windows Server and are using a compatible role and hardware (e.g. shared storage), Windows Failover Clustering is free and doesn't require much effort to put in place, even as an afterthought.

It isn't the best on the market if you NEED high availability, but it's basically free and offers nice features on top of other Windows features.
It is well suited for redundancy during Windows Updates, hardware maintenance, or any planned outage where you are present in case something goes wrong.

It is not well suited for redundancy during power outages, bluescreens, hardware failures, etc., because I have seen Failover Cluster bring the whole cluster down in all of those cases. It can sometimes even increase the chance of bringing services down (DLL locking, VD locking).