Issues getting output from Hyper-V Tools or Failover Cluster Manager

Hello. A few weeks ago I had some issues at a client where Failover Cluster Manager would not respond when I tried to do anything. It would just hang and eventually fail.

Even Hyper-V Manager and the normal PowerShell tools were hanging and taking forever. A simple Get-VM, Get-StorageJob, or any other PowerShell command related to Hyper-V or Storage Spaces Direct would hang for ages and maybe return a response after 10-20 minutes.

Identifying the root cause is somewhat hard, and the logs are not really your best friend here. The common ground I have found is that rebooting the “right” node solves the issue. But if you have many nodes, cycling through all of them takes a while, especially if you have Storage Spaces Direct. The issue comes down to the Cluster service not being “healthy” on one node for some reason. It might be running, but it is in a state that causes issues.
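
One way to narrow down which node is the “right” one without rebooting them all is to query the cluster through each node in turn, with a timeout. This is only a rough sketch, not a proven diagnostic: the node names and the 60-second timeout are placeholders I made up, and a node that times out is a candidate for the unhealthy one, not proof.

```powershell
# Hypothetical node names and timeout; replace with your own values.
$Nodes = 'Node01', 'Node02', 'Node03'
$TimeoutSeconds = 60

foreach ($Node in $Nodes) {
    # Run the query in a background job so a hung node cannot block the console.
    $Job = Start-Job -ArgumentList $Node -ScriptBlock {
        param($Name)
        Import-Module FailoverClusters
        # Connect to the cluster through this specific node.
        Get-ClusterNode -Cluster $Name | Out-Null
    }

    # Wait-Job returns the job if it finished in time, and nothing on timeout.
    if (Wait-Job -Job $Job -Timeout $TimeoutSeconds) {
        Write-Output "$Node answered (job state: $($Job.State))"
    }
    else {
        Write-Output "$Node did not answer within $TimeoutSeconds seconds"
        Stop-Job -Job $Job
    }
    Remove-Job -Job $Job -Force
}
```

Using a background job means the console stays usable even when a query never returns.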

Now, to keep you calm and happy: the VMs are still running and are not affected. But if someone reboots a VM, that might cause an issue. So I suggest you schedule an outage window, shut all nodes down, and reboot them.

To do this, follow these steps. They will work for any Hyper-V failover cluster scenario, since this can happen on a normal, hyperconverged, or converged cluster.

  1. Stop all VMs on the cluster.
  2. Once all VMs are stopped, run these commands.
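# Run from Windows PowerShell 5.1 with the FailoverClusters module installed.
# (Get-Service -ComputerName is not available in PowerShell 7.)
$Cluster = "Clustername"   # replace with the name of your cluster

# Capture the node names before stopping the cluster, while they can still be queried
$Servers = (Get-ClusterNode -Cluster $Cluster).Name

Stop-Cluster -Cluster $Cluster

foreach ($Server in $Servers) {

    # Make sure the Cluster service is stopped and stays down across the reboot
    Get-Service -Name ClusSvc -ComputerName $Server | Stop-Service -Force
    Get-Service -Name ClusSvc -ComputerName $Server | Set-Service -StartupType Disabled

}

# Once all nodes are rebooted, run these commands

foreach ($Server in $Servers) {

    # Re-enable the Cluster service and start it again
    Get-Service -Name ClusSvc -ComputerName $Server | Set-Service -StartupType Automatic
    Get-Service -Name ClusSvc -ComputerName $Server | Start-Service

}

Get-Cluster -Name $Cluster
# If the cluster has not started, run this
Start-Cluster -Name $Cluster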
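
For step 1 above, stopping all VMs, something like the following might work as long as the cluster cmdlets still respond. It is a sketch, not a tested procedure: "Clustername" is the same placeholder as in the script, and if the cluster commands hang you may have to shut the guests down from inside the VMs instead.

```powershell
$Cluster = 'Clustername'   # placeholder; replace with the name of your cluster

# Clustered VMs show up as cluster groups of type VirtualMachine.
# Stop-ClusterGroup takes each VM role offline, which shuts the VM down.
Get-ClusterGroup -Cluster $Cluster |
    Where-Object { $_.GroupType -eq 'VirtualMachine' } |
    Stop-ClusterGroup
```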

Once your cluster is back up, verify that all storage is online and all VMs can start.
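
A quick verification pass could look roughly like this. It assumes the cluster uses Cluster Shared Volumes and that the last two commands are run on a cluster node with Storage Spaces Direct; skip whatever does not apply to your setup.

```powershell
$Cluster = 'Clustername'   # placeholder; replace with the name of your cluster

# All nodes should report State = Up
Get-ClusterNode -Cluster $Cluster | Select-Object Name, State

# Cluster Shared Volumes should be Online
Get-ClusterSharedVolume -Cluster $Cluster | Select-Object Name, State, OwnerNode

# VM roles and their current state
Get-ClusterGroup -Cluster $Cluster |
    Where-Object { $_.GroupType -eq 'VirtualMachine' } |
    Select-Object Name, State, OwnerNode

# Storage Spaces Direct only: virtual disk health and any running repair jobs
Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus
Get-StorageJob
```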

Now you can start to figure out what happened. Go through all the event logs for Hyper-V and clustering and look for the root cause. Check whether the Health Service had been restarting and jumping from server to server. This might be an indication of where it stopped and caused the cluster to stop responding to commands.
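
To avoid clicking through Event Viewer on every node, the logs can be pulled together with Get-WinEvent. This is a minimal sketch: the node names, the seven-day window, and the C:\Temp\ClusterLogs folder are placeholders I picked, and the two log names are just the ones I would start with, so you may well need others.

```powershell
# Hypothetical values; adjust to your environment.
$Cluster = 'Clustername'
$Nodes   = 'Node01', 'Node02', 'Node03'
$Since   = (Get-Date).AddDays(-7)
$Logs    = 'Microsoft-Windows-FailoverClustering/Operational',
           'Microsoft-Windows-Hyper-V-VMMS-Admin'

# Collect critical (1), error (2) and warning (3) events from every node.
$Events = foreach ($Node in $Nodes) {
    Get-WinEvent -ComputerName $Node -ErrorAction SilentlyContinue -FilterHashtable @{
        LogName   = $Logs
        StartTime = $Since
        Level     = 1, 2, 3
    }
}

# One timeline across all nodes makes it easier to see where things started.
$Events | Sort-Object TimeCreated |
    Select-Object TimeCreated, MachineName, LogName, Id, LevelDisplayName, Message |
    Format-Table -AutoSize -Wrap

# Optionally dump the detailed cluster log from every node (last 24 hours).
New-Item -Path 'C:\Temp\ClusterLogs' -ItemType Directory -Force | Out-Null
Get-ClusterLog -Cluster $Cluster -Destination 'C:\Temp\ClusterLogs' -TimeSpan 1440 -UseLocalTime
```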

As a general FYI, I do recommend rebooting nodes at a regular interval, both to apply patches and to clear out anything that might cause the Cluster service to hang.
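
A rolling reboot, one node at a time, could be sketched like this. Treat it as a starting point only: the names are placeholders, there is no retry logic if the Cluster service is slow to come back after the reboot, and the storage job wait at the end only makes sense with Storage Spaces Direct.

```powershell
# Hypothetical names; replace with your own cluster and node names.
$Cluster = 'Clustername'
$Nodes   = 'Node01', 'Node02', 'Node03'

foreach ($Node in $Nodes) {
    # Drain the roles off the node and wait until the drain has finished
    Suspend-ClusterNode -Name $Node -Cluster $Cluster -Drain -Wait

    # Reboot and block until PowerShell remoting answers again
    Restart-Computer -ComputerName $Node -Force -Wait -For PowerShell

    # Bring the node back into the cluster and move its roles back
    Resume-ClusterNode -Name $Node -Cluster $Cluster -Failback Immediate

    # Storage Spaces Direct only: wait for repair jobs before touching the next node
    while (Get-StorageJob -CimSession $Node | Where-Object { $_.JobState -ne 'Completed' }) {
        Start-Sleep -Seconds 30
    }
}
```

Doing one node at a time keeps the VMs running on the remaining nodes, which is the whole point compared with the full outage described above.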

If you are not able to identify the issue yourself, Microsoft Support might be able to help. But for a root cause analysis you will need a Premier Support ticket.
