Item: Windows Server Failover Clustering
Rating: 10
Author: Verified User

Overall Satisfaction with Windows Server Failover Clustering

Use Cases and Deployment Scope

We use Windows Server Failover Clustering for two primary reasons: high availability and simpler systems maintenance. Our failover clustering allows critical applications to continue with only a minor interruption in service if a needed system resource fails. It also allows systems administrators to fail over an application to a passive node in order to perform scheduled or unscheduled maintenance on the other node, and then fail back if necessary, all with minimal interruption of critical business applications such as Microsoft SQL Server and BMC's Control-M Workload Automation.

Pros and Cons

Pros

Windows Failover Clustering is well suited to keeping critical applications online with only a brief outage in services during the actual failover. In some cases, it will disconnect user applications during the failover. That isn't a good thing, but it is better than taking the entire application down for a longer period of time to shut down one server and bring another online.
Windows Failover Clustering can be easily configured to manage individual cluster resources. For example, we use BMC Control-M/Enterprise and Control-M/Server. Our gateway resources for distributed systems and mainframe (z/OS) are managed well as individual resources within the cluster, allowing us to take a single resource offline when necessary without having to take the entire cluster down.
When used in combination with Microsoft PowerShell (now also available on Linux systems), it provides tremendous ability to monitor, query, report, configure and deploy systems in high availability (HA) infrastructures.

Cons

The disconnection of services or users -- brief though it may be -- is a drawback to a seamless failover. The failover process is generally quick, and in many cases invisible to the business end user community, but given the variety of applications and how they interact with Windows Failover Clustering, sometimes there is a brief outage (seconds) that does NOT go unnoticed.
Windows Server Failover Clustering in a Hyper-V environment can be a little tricky if the Hyper-V infrastructure is not properly configured at the cluster level for affinity. If you are considering using Windows Failover Clustering in combination with Hyper-V, be sure to set your affinity rules so that both nodes of a cluster never run on the same Hyper-V host.
Error reporting is quite detailed, if you know where to look. What appears in the Critical Events list for a cluster, and even in the Windows Event Logs, can lead one to think that Microsoft overlooked that critical area. You have to dig deeper into the Windows logs -- not just the usual three of Application, System and Security -- to get meaningful and helpful detailed error data.
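
The Hyper-V affinity point above can be sanity-checked with a small script. What follows is only a minimal Python sketch, assuming you can export an inventory of which Hyper-V host each guest cluster node currently runs on. The function name `find_affinity_violations`, the `placement` mapping, and every node, cluster and host name are hypothetical and are not part of any Microsoft tooling.

```python
from collections import defaultdict


def find_affinity_violations(placement):
    """Find hosts that run two or more nodes of the same guest cluster.

    placement maps a guest node name to a (guest_cluster, hyperv_host) tuple.
    Returns {(guest_cluster, hyperv_host): [node names]} for each violation.
    """
    by_cluster_host = defaultdict(list)
    for node, (cluster, host) in placement.items():
        by_cluster_host[(cluster, host)].append(node)
    # Only cluster/host pairs holding more than one node are a problem.
    return {
        key: sorted(nodes)
        for key, nodes in by_cluster_host.items()
        if len(nodes) > 1
    }


# Hypothetical inventory: both SQL nodes have landed on the same host.
placement = {
    "sqlnode1": ("sql-cluster", "hvhost-a"),
    "sqlnode2": ("sql-cluster", "hvhost-a"),
    "ctmnode1": ("ctm-cluster", "hvhost-a"),
    "ctmnode2": ("ctm-cluster", "hvhost-b"),
}

violations = find_affinity_violations(placement)
for (cluster, host), nodes in violations.items():
    print(f"{cluster}: {', '.join(nodes)} share host {host}")
```

A check like this only reports the problem; the actual fix is still the affinity configuration on the Hyper-V cluster itself.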

Return on Investment

Windows Server Failover Clustering has enabled us to provide better adherence to SLAs while still keeping company data resources properly protected. Examples include patching the operating system, repairing corrupted antivirus definitions, and the like.
Windows Server Failover Clustering also allows us to be more proactive in the area of system resources. If we see from our server monitoring that disk capacity is growing, we can take a node down, add resources to it (disk, CPU, memory) and then bring it back online -- all without the end users being aware that it was being done. In other words, no outage. SLAs remain high and IT management is happier.
Using Windows Server Failover Clustering on Hyper-V hosts enabled us to SIGNIFICANTLY reduce the cost of licensing Microsoft SQL Server, and by that I mean by over $100,000 annually.

Alternatives Considered

DoubleTake

Several years ago we began using DoubleTake to cover our highly critical application, Control-M/Enterprise and Control-M/Server. We configured it to perform an automatic failover in the event of a critical failure. In that scenario, the mirrored system that came online assumed the full identity of the original server. It also resulted in a short outage window, but at least the application and its data were not lost, and service was restored quickly. One downside was that it did not scale well from a licensing perspective for use on many servers. The major downside -- other than cost -- was that if a system failed and DoubleTake performed a full system failover, the old server had to be completely rebuilt from scratch.

Other Software Used

Hyper-V, Atlassian Confluence, JIRA Service Desk, Symantec Endpoint Protection, Windows Server, Microsoft SQL Server

Likelihood to Recommend

Windows Server Failover Clustering works very well for applications that can sustain a short disconnect when failing over. It works, and works well, in providing HA for single-node applications, meaning an active/passive setup. It is not a load balancing solution. Use NLB for that. Another area where it works well is in combination with Hyper-V. We set our Hyper-V hosts up as clusters, and those clusters also host clusters for SQL Server and other enterprise class applications like BMC's Control-M/Enterprise and Control-M/Server.

Using Windows Server Failover Clustering

Users and Roles

Business Intelligence
Database Administration
Production Control
Product and Procurement
PeopleSoft HR
PeopleSoft Finance
Core Services

Support Headcount Required

5 - Supporting Windows Server Failover Clustering requires the expertise of a trained Windows Administrator, preferably someone with certification as an MCSE or MCITP. Windows Server Failover Clustering will not be well or properly supported by someone who does not have a depth of knowledge of both Microsoft Windows Server and Windows Server Failover Clustering.

Business Processes Supported

Microsoft SQL Server - ALL of our important databases run on Windows Server Failover Clustering in order to provide HA.
BMC Control-M/Enterprise and Control-M/Server. This enterprise class workload automation product is extremely critical to our business. Windows Server Failover Clustering provides us with the ability to meet SLAs for this application.

Future Planned Uses

We are investigating ways to eliminate the need to install individual instances of Control-M modules on client servers by having them linked back to clustered module servers.

Likelihood to Renew

It has proven its value to us both for maintaining SLAs and for providing the ability to perform much-needed, regular systems maintenance without taking applications offline for more than a few seconds.

Using Windows Server Failover Clustering

Usability

With adequate knowledge, it is pretty easy to work with and manage a Windows Server Failover Cluster. It can, however, be very confusing to the neophyte in combination with Hyper-V. For example, it takes time to learn when to use the Hyper-V Manager and when to use the Failover Cluster Manager.

Usability Pros and Cons

Pros	Cons
Like to use	Requires technical support
Well integrated	Slow to learn
Consistent	Lots to learn

Easy Tasks

Until you have the knowledge of how clustering works, and particularly how Windows clustering works, you will only end up banging your head. It is critical that a neophyte to Windows Server Failover Clustering learn and understand how it all works before embarking on a project as complicated as this can be.

Difficult Tasks

Setting up the initial cluster can be very tricky. It isn't a case of just accepting the defaults and clicking on the "Next" button. You have to know what you're doing. For example, you have to create a cluster resource with its own IP address, separate from those of the nodes: one IP for node 1, one IP for node 2, and one IP for the cluster. I would also suggest using a CNAME in DNS that points to the cluster name. That way, no matter which node is the active node, you can still get to it.
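
The addressing requirement above (one IP per node plus a separate IP for the cluster) can be checked before you start the setup wizard. This is a minimal Python sketch using only the standard `ipaddress` module. It assumes all three addresses live in a single subnet, which will not hold for every cluster design, and the function name `validate_cluster_ip_plan`, the subnet, and the addresses are hypothetical examples.

```python
import ipaddress


def validate_cluster_ip_plan(subnet, node_ips, cluster_ip):
    """Return a list of problems with the plan; an empty list means no problems found.

    Assumption: nodes and cluster share one subnet (not true for every design).
    """
    network = ipaddress.ip_network(subnet)
    addresses = {
        f"node {i}": ipaddress.ip_address(ip)
        for i, ip in enumerate(node_ips, start=1)
    }
    addresses["cluster"] = ipaddress.ip_address(cluster_ip)

    problems = []
    for label, addr in addresses.items():
        if addr not in network:
            problems.append(f"{label} address {addr} is outside {network}")
    # The cluster needs its own address, distinct from every node.
    if len(set(addresses.values())) != len(addresses):
        problems.append("node and cluster addresses must all be distinct")
    return problems


# Hypothetical plan: IP for node 1, IP for node 2, IP for the cluster.
good_plan = validate_cluster_ip_plan(
    "10.0.0.0/24", ["10.0.0.11", "10.0.0.12"], "10.0.0.20"
)
# Hypothetical mistake: the cluster reuses node 1's address.
bad_plan = validate_cluster_ip_plan(
    "10.0.0.0/24", ["10.0.0.11", "10.0.0.12"], "10.0.0.11"
)
print(good_plan)
print(bad_plan)
```

The DNS CNAME suggestion is not covered here; it would be created in your DNS zone once the cluster name exists.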


Keeping It Up - Windows Server Failover Clustering for HA Applications
