Migrate Windows Failover Clusters Between Domains

There are numerous tools that can be used for migrating servers between domains, but what happens when you have invested in Windows Failover Clusters? If you’re right up to date with Windows Server, Microsoft can now help you out with a new domain migration process, documented at https://docs.microsoft.com/en-us/windows-server/failover-clustering/cluster-domain-migration. This will likely help a couple of people, but everyone else is left with two options: upgrade to Windows Server 2019 first, or do it the manual way.

In Microsoft’s words the manual way “involves destroying the cluster and rebuilding it in the new domain.”

OK, so that’s not as bad as it sounds at first. As long as you have a process and make sure you document a few things, it should go reasonably smoothly.

The not so bad process

The migration process involves shutting down all clustered roles and resources and removing them from the cluster. This then allows the cluster to be destroyed. Once you’re at that point you can migrate the individual servers to the new domain. Finally, you re-create the cluster in the new domain and set up all the resources again.

This might sound onerous, and it may be, depending on the type of Windows Failover Cluster that you have, but some workloads are easier than others. For instance, if you’re running a Hyper-V cluster then you can easily remove all VMs so that they are no longer clustered. If, on the other hand, this is a SQL cluster, well, that’s a little more difficult. But let’s look at the example of a Hyper-V cluster to see how the process works.

Now, Failover Clusters are a real special snowflake, with no two being alike, so first you will need to sit down and get to know your particular version.

You will also need to think about a few other things. Is it managed by VMM? What types of storage do you have? What type of quorum do you have? What features are you using? Start by documenting it all if you don’t already have this.

Minimum Windows Failover Cluster Documentation

As a cheat sheet, this is the minimum information that you will need to migrate a Windows Failover Cluster between domains.

  • Cluster Name

  • Cluster IP

  • Cluster Quorum Type

  • Cluster Networks (Include name and IP ranges)

  • Cluster Quorum Disk/Location

  • List all Cluster Disks (Include Name, Disk number, disk letter or mount point)
    • Of particular note: be very aware of the cluster disk mount location. This is easy to forget about but hard to discover after the fact. If you get it wrong then the new cluster won’t be able to find the VM information or components, leaving the VMs in an Off-Critical state.
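
If it helps, the cheat sheet above can be captured as a small structured record so that gaps show up before the outage window rather than during it. What follows is only a sketch in Python: the names `ClusterRecord`, `ClusterDisk` and `missing_items` are my own and purely illustrative, and the sample values are made up. It does not query a real cluster; you would still fill in the values by hand or from your own tooling.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterDisk:
    name: str
    disk_number: int
    mount_point: str  # drive letter or mount path, as documented


@dataclass
class ClusterRecord:
    name: str = ""
    ip_address: str = ""
    quorum_type: str = ""
    quorum_location: str = ""
    networks: dict = field(default_factory=dict)  # network name -> IP range
    disks: list = field(default_factory=list)     # list of ClusterDisk


def missing_items(record):
    """Return the checklist items that are still blank in the record."""
    checks = [
        ("Cluster Name", record.name),
        ("Cluster IP", record.ip_address),
        ("Cluster Quorum Type", record.quorum_type),
        ("Cluster Networks", record.networks),
        ("Cluster Quorum Disk/Location", record.quorum_location),
        ("Cluster Disks", record.disks),
    ]
    missing = [label for label, value in checks if not value]
    # The mount location is the item that is hardest to recover later.
    for disk in record.disks:
        if not disk.mount_point:
            missing.append(f"Mount point for {disk.name}")
    return missing


# Made-up example values, for illustration only.
record = ClusterRecord(
    name="HVCLUSTER01",
    ip_address="10.0.0.50",
    quorum_type="Node and Disk Majority",
    quorum_location="",
    networks={"Management": "10.0.0.0/24"},
    disks=[
        ClusterDisk("Cluster Disk 1", 2, r"C:\ClusterStorage\Volume1"),
        ClusterDisk("Cluster Disk 2", 3, ""),
    ],
)
print(missing_items(record))
```

An empty list from `missing_items` means every item on the cheat sheet has at least some value recorded; it says nothing about whether the values are correct.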

Migration Time!!

Once you have this all documented you can move on to the actual migration. Below are the minimum steps that will need to be performed:

1) Disable the computer account of the cluster in the destination domain if it exists
2) Stop and delete any cluster-specific services (Replication Brokers etc.)
3) Shut down all VMs
4) Set VM startup to manual
5) Remove all VMs from the cluster, checking first that they are all located on a single node (this removes the VMs from the cluster; it does not delete the actual VMs)
6) Move all disks to a single node
7) Evict one node from the cluster
8) Remove all CSV disks
9) Destroy the cluster
10) Change DNS on all cluster nodes to use the new domain’s DCs
11) Change the domain membership of all cluster nodes
12) Create the new cluster on one node
13) Run the cluster validation
14) Enter the old cluster name and IP address
15) Add the Quorum Disk and configure quorum settings
16) Add the additional disks in the correct order so that the CSV volume names are correct. Correct manually if required.
17) Check that the Hyper-V machines are now Off and not Off-Critical
18) Create the Quorum disk for the cluster
19) Add the subsequent nodes to the cluster, making sure to run the cluster validation again
20) Test moving the cluster name between nodes
21) Test moving all disks between nodes
22) Import VMs
23) Start VMs
24) Configure Auto-start for VMs
25) Test moving VMs between nodes
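
Step 16 is the one that tends to bite, so here is a rough Python sketch of how the documented mount points could be turned into an "add in this order" list, and how the result could be compared afterwards. It rests on an assumption that you should verify for your own environment: that new CSVs are mounted as C:\ClusterStorage\Volume1, Volume2 and so on, in the order they are added. The function names `csv_add_order` and `find_mismatches`, and the disk names used, are hypothetical.

```python
import re


def csv_add_order(documented):
    """Sort disk names by the number at the end of their documented CSV path.

    documented: dict of disk name -> documented mount path.
    """
    def volume_number(path):
        match = re.search(r"Volume(\d+)$", path.rstrip("\\"))
        if match is None:
            # A renamed CSV folder will need to be corrected by hand.
            raise ValueError(f"Not a default-style CSV path: {path}")
        return int(match.group(1))

    return sorted(documented, key=lambda disk: volume_number(documented[disk]))


def find_mismatches(documented, actual):
    """Return {disk: (documented path, actual path)} where the two differ."""
    mismatches = {}
    for disk, path in documented.items():
        found = actual.get(disk)
        # Windows paths are case-insensitive, so compare in lower case.
        if found is None or found.lower() != path.lower():
            mismatches[disk] = (path, found)
    return mismatches


# Made-up values: what was documented before the cluster was destroyed.
documented = {
    "Cluster Disk 3": r"C:\ClusterStorage\Volume1",
    "Cluster Disk 1": r"C:\ClusterStorage\Volume2",
    "Cluster Disk 2": r"C:\ClusterStorage\Volume10",
}
print(csv_add_order(documented))
```

Note that the sort is numeric, so Volume10 comes after Volume2. If the documented numbers have gaps (for example Volume1 and Volume3 but no Volume2), adding the disks in order will not reproduce the old names on its own, which is where the "correct manually" part of step 16 comes in. `find_mismatches` can be fed whatever you read back from the rebuilt cluster to list the disks that still need fixing.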

In practice I’ve scheduled a total cluster outage of two hours for this type of migration, but remember that your special snowflake may require more time.