Wednesday, October 7, 2015

vSphere HA host partition issue

If vSphere HA reports one of your hosts as being in a different network partition from the others, I am pretty sure you have already googled your way to the articles below:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1002117
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2033250
But hey, wait. Before you try those, you might want to take a look at this.
I recently ran into this situation and the above articles did not help. A network partition usually means there are two different sets of hosts in the cluster that cannot fail over to each other: hosts in partition 1 cannot fail over VMs to partition 2 and vice versa. It usually means that the host shown in a different partition (the one with the error) has different network settings for communicating with the other hosts in the cluster. But that does not necessarily mean it is the host that needs fixing, and that is exactly what happened in my case.

The other host, the one with no partition error, had the load balancing policy on its management VMkernel (vmk) port set to virtual port ID, whereas it should have been IP hash. When I changed it on that host, the error on the other host vanished automagically.

I hope this helps. Just make sure the network settings (vSwitch and port group settings) are identical across the cluster. In future, when you want to set load balancing across a cluster, I suggest you use PowerCLI to loop over the hosts, vSwitches and port groups so that you don't miss any of them; otherwise you are likely to run into this error.
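
To illustrate the "identical across the cluster" check, here is a minimal sketch in Python. It is not PowerCLI and it does not talk to vCenter. It assumes you have already exported each host's teaming settings by some means into a plain dictionary. The host names, setting keys and values (such as "ip_hash" and "port_id") are hypothetical labels I made up for the example. The function only groups hosts by the value they have for each setting, so it shows you where the cluster disagrees without guessing which host is the wrong one.

```python
def find_mismatches(settings_by_host):
    """Group hosts by value for every setting that is not identical cluster-wide.

    settings_by_host: {host_name: {setting_name: value}}
    Returns {setting_name: {value: [host_name, ...]}}, containing only the
    settings for which at least two different values were found.
    """
    # Collect every setting name seen on any host, so a setting that is
    # missing on one host is also reported as a difference.
    all_keys = set()
    for settings in settings_by_host.values():
        all_keys.update(settings)

    mismatches = {}
    for key in sorted(all_keys):
        hosts_by_value = {}
        for host, settings in settings_by_host.items():
            value = settings.get(key, "<missing>")
            hosts_by_value.setdefault(value, []).append(host)
        if len(hosts_by_value) > 1:
            mismatches[key] = hosts_by_value
    return mismatches


# Hypothetical data: esx02 has a different load balancing policy on its
# management port group than the other two hosts.
cluster = {
    "esx01": {"vSwitch0/Management Network/load_balancing": "ip_hash"},
    "esx02": {"vSwitch0/Management Network/load_balancing": "port_id"},
    "esx03": {"vSwitch0/Management Network/load_balancing": "ip_hash"},
}

for setting, hosts_by_value in find_mismatches(cluster).items():
    print(setting)
    for value, hosts in hosts_by_value.items():
        print("  ", value, "->", ", ".join(hosts))
```

With the sample data above, only the load balancing setting is reported, with esx01 and esx03 in one group and esx02 in the other. As in my case, the host in the smaller group is not necessarily the one showing the partition error, so check both groups against what your physical switches expect.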