Friday, May 4, 2012

VM Heartbeat, RVTools and ESX 4.1

Issue : The customer uses RVTools, a free tool, to monitor the VMs.
In RVTools, the heartbeat status of the VMs intermittently switches between gray, green and red, even though vCenter Server shows the VMs running and working fine.
Resolution :
The environment is a 3-node cluster running ESX 4.1 build 260247 with vCenter 4.1 build 258902.
Removed the NIC from one of the VMs. The VM did not restart automatically, and RVTools still showed its heartbeat as green, which indicates that the heartbeat check does not rely on a network ping.
vmware-cmd -l listed all the running VMs, including the one whose heartbeat status was switching between gray, red and green in RVTools.
The heartbeat of a VM is conveyed to vCenter through the vpxa agent by the hostd process of the host running the VM, irrespective of the VM's network status, so it can be safely assumed that the VM heartbeats are working fine. Heartbeats are sent out every second by default. If hostd stops receiving them, VM Monitoring checks for any network or storage I/O before acting, to avoid false positives. Please make sure VM Monitoring is turned on under HA.
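The decision described above (a missing heartbeat alone is not enough; I/O activity is checked first) can be sketched roughly as follows. This is only an illustration of the logic, not VMware's implementation: the names VmSample and should_reset, and the 30-second failure interval, are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """Hypothetical snapshot of what the host knows about one VM."""
    seconds_since_heartbeat: float  # time since the last heartbeat was received
    io_bytes_in_window: int         # network + storage I/O seen in the check window


def should_reset(sample: VmSample, failure_interval: float = 30.0) -> bool:
    """Return True only if the VM looks hung.

    failure_interval is an assumed threshold for this sketch, not a
    value taken from the cluster discussed in this post.
    """
    # Heartbeats are still arriving within the interval: VM is healthy.
    if sample.seconds_since_heartbeat < failure_interval:
        return False
    # Heartbeats stopped, but the VM is still doing network or storage I/O:
    # treat the missing heartbeat as a false positive and leave the VM alone.
    if sample.io_bytes_in_window > 0:
        return False
    # No heartbeat and no I/O: the VM is considered failed.
    return True


if __name__ == "__main__":
    print(should_reset(VmSample(seconds_since_heartbeat=5, io_bytes_in_window=0)))
    print(should_reset(VmSample(seconds_since_heartbeat=60, io_bytes_in_window=4096)))
    print(should_reset(VmSample(seconds_since_heartbeat=60, io_bytes_in_window=0)))
```

Only the third sample, with no heartbeat and no I/O, would be treated as a failed VM in this sketch.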
Neither the customer nor I knew how RVTools calculates the heartbeat, but vCenter was receiving the heartbeats. The best way to verify that is to reset a host running a non-critical VM whose heartbeat column shows gray/red in RVTools.