Monday, December 31, 2012

How to find a driver for a device for VMware ESXi OS

Use this method to find the driver for any device used with a VMware OS (ESXi 4, 5, etc.).
Concern: need a driver for the HP 366m NIC adapter on ESXi 4.1 U2.
  1. go to
  2. Select the type of search that you want to do, in this case IO Devices.
  3. Optionally select the partner name (not strictly necessary, since almost all device names are unique across partners), in this case HP.
  4. Enter the name of the device in the Keyword box and hit Enter; here it is 366m.
  5. Select the appropriate result and proceed to the next page.
  6. There, copy the driver version listed for your version of the operating system. In this scenario it is: ESX / ESXi 4.1 U2, igb version 3.2.10.
  7. Then go to the generic download link given below that version, enter the copied driver version in the search box, and hit Enter. The first result should be that version of the driver.
(This was written for a friend of mine who often finds it puzzling how to find drivers for the VMware OS.)

Friday, December 21, 2012

vCenter is slow and sluggish after upgrading to 5.1

Scenario: vCenter Server was upgraded from 5 to 5.1a, and after that it became very sluggish.
The Inventory Service was also taking too long to load.
In Task Manager the memory usage was almost 100%,
and most of the memory was consumed by multiple java.exe processes.

What to do when vCenter is very sluggish with high memory usage by multiple java.exe processes:
vCenter Server
Change the setting in the wrapper.conf file located in <installation_directory>\VMware\Infrastructure\tomcat\conf folder.
  1. Open the file and search for
  2. Change the setting to desired value, in my case 512 MB.
  3. Save and close the file.

vCenter Single Sign On
Also here, change the setting in the wrapper.conf file located in <installation_directory>\VMware\Infrastructure\SSOServer\conf folder.
  1. Open the file and search for
  2. Change the setting to desired value, in my case once again 512 MB.
  3. Save and close the file.
vCenter Inventory Service
Change the setting in wrapper.conf located in the <installation_directory>\VMware\Infrastructure\Inventory Service\conf folder.
  1. Open the file and search for
  2. Change the setting to desired value, in my case 1024 MB.
  3. Save and close the file.
Profile-Driven Storage Service
Also here, change the setting in wrapper.conf located in the <installation_directory>\VMware\Infrastructure\Profile-Driven Storage\conf folder.
  1. Open the file and search for
  2. Change the setting to desired value, in my case 512 MB.
  3. Save and close the file.
Now that you have changed the settings, restart the services or the vCenter Server for the changes to take effect. Experiment with the settings to find out what best suits your environment.
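The edit described above is the same for all four wrapper.conf files: find one `key=value` line and change its value. A minimal Python sketch of that step follows. The post does not name the setting to search for, so the key is passed in by the caller; `example.memory.setting` below is a placeholder, not the real setting name, and the plain-number value format is an assumption.

```python
import re


def set_conf_value(conf_text: str, key: str, value: str) -> str:
    """Return conf_text with every uncommented `key=...` line set to value.

    Lines starting with '#' are left alone because the pattern only allows
    spaces or tabs before the key.
    """
    pattern = re.compile(
        r"^([ \t]*" + re.escape(key) + r"[ \t]*=[ \t]*)[^\r\n]*",
        re.MULTILINE,
    )
    # Keep the original "key = " prefix and swap in the new value.
    return pattern.sub(lambda m: m.group(1) + value, conf_text)


# Placeholder key name: the real setting is whatever you searched for
# in wrapper.conf.
sample = (
    "# example.memory.setting=128\n"
    "example.memory.setting=256\n"
    "other.setting=1\n"
)
updated = set_conf_value(sample, "example.memory.setting", "512")
print(updated, end="")
```

Reading each wrapper.conf into a string, passing it through a function like this, and writing it back (after keeping a backup copy) would cover steps 1 to 3 for each service.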
sincere thanks to :

vcenter inventory service being too slow to start :

Tuesday, December 18, 2012

vSphere Boot Options

SD/USB requires setup of separate scratch area in shared storage to have persistent logs and crash dumps.  This is an additional configuration step and needs to be remembered and documented.  Personally, I do not recommend SD/USB, as maintenance of this infrastructure does not align well with established server administration methods.

BFS (Boot from SAN) is great for large environments, where there is generally an experienced SAN administrator who is able to maintain the BFS infrastructure.  BFS gives you wonderful flexibility, as the blades themselves keep no state other than BIOS settings.  In addition, there is a considerable saving in eliminating 2 physical local disks from each ESXi server.  If you have 200+ of these servers, then this is significant capex that is eliminated.  The cost of shared storage allocated for BFS is insignificant, as this is only ~6 GB of RAID5 storage per host.

Also note that BFS is not an option when your shared storage is NFS only (e.g. NetApp arrays).

BFS from iSCSI arrays is possible with hardware initiators, but it is rather complex to set up and maintain, so it does not mesh very well with the small scale of typical iSCSI installations and the way they are operated.

So, BFS is really mainly for FC SAN.

On the other hand, the mirrored pair of local disks is very simple to set up and the boot from it is very simple to maintain.  All server administrators are familiar with it.  It nicely decouples the boot problems from the configuration of shared storage.  I generally recommend this setup for smaller environments, especially the less sophisticated ones, where there may not be a specialized SAN administrator.  The simplicity of management compensates for increased capex.

So my rules of thumb are very simple:

·  Auto-deploy          - no, too complex for little benefit compared with BFS
·  SD/USB               - no, too awkward to maintain
·  Mirrored local disk  - yes, for small unsophisticated environments and where shared storage is NFS or iSCSI
·  BFS                  - yes, for large environments with FC shared storage and good FC SAN skills

copied from a colleague :)
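The rules of thumb above can be written down as a small decision function. This is only a sketch of the post's reasoning, not a sizing tool: the function name and parameters are made up for illustration, and the fallback to mirrored local disks for cases the rules do not cover (for example a large FC environment without SAN skills) is an assumption.

```python
def recommend_boot_option(
    large_environment: bool,
    shared_storage: str,
    has_fc_san_skills: bool,
) -> str:
    """Pick a boot option following the rules of thumb in the post.

    shared_storage is expected to be one of "FC", "iSCSI" or "NFS".
    Auto-deploy and SD/USB are never returned, since the post rejects both.
    """
    storage = shared_storage.strip().upper()
    if large_environment and storage == "FC" and has_fc_san_skills:
        return "BFS"
    # Assumption: everything else falls back to the simple option.
    return "mirrored local disk"


print(recommend_boot_option(True, "FC", True))
print(recommend_boot_option(False, "NFS", False))
```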

Monday, December 17, 2012

Unable to migrate VMs between ESXi hosts

Issue: unable to migrate VMs from one host to another in a 2-node cluster with a dvSwitch.
“network card 'network adapter 1' has a DVPort backing, which is not supported. This could be because the host does not support VDS or  because the host has not joined VDS”

Background (may be irrelevant): the source host had been disconnected from the vCenter Server.
Restarted the services inside the ESXi 5 host ( restart) and added the host back to vCenter.

Added the host to the dvSwitch using the wizard.
Issue resolved.

ESX 4.1 host disconnected

Issue: both hosts of the cluster are in a disconnected state.
What didn't work:
Updated the IP address of the vCenter Server under VC > Administration > Runtime Settings.
Removed the hosts and added them back to the vCenter Server, but no go.
Opened the iLO for another host and renamed the host in lowercase,
 and made sure the entries in the files
Reinstalled the management agents and restarted the vCenter service on the host, but only one host stayed connected; the other one kept getting disconnected after 1 or 2 minutes.
On the non-working host, tried:
service mgmt-vmware stop && service vmware-vpxa stop && service vmware-vmkauthd stop && service xinetd restart && rpm -qa | grep -i vpxa | awk '{print $1}' | xargs rpm -ef $1 && userdel vpxuser && rpm -qa | grep -i aam | awk '{print $1}' | xargs rpm -ef $1 && service mgmt-vmware start && service vmware-vmkauthd start
Made sure the services were started again and tried to connect, but no go.
nslookup works from the host to the other hosts and the vCenter Server, and vice versa;
reverse nslookup works too.
Updated the /etc/opt/vmware/vpxa/vpxa.cfg file with the correct vCenter IP address, but no go.
Checked the port requirements for the host from the vCenter Server using telnet:
 443, 902, 80, and 5989 were open, but 623 (for DPM) wasn't.
added the
but no go.
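The telnet port check above can also be scripted, which helps when several hosts need testing. Below is a minimal Python sketch, assuming plain TCP connects are an acceptable stand-in for telnet. The function names are made up for illustration, and the default port list simply mirrors the ports named in this post.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_ports(host: str, ports=(443, 902, 80, 5989, 623)) -> dict:
    """Map each port to True (open) or False (closed or unreachable)."""
    return {port: tcp_port_open(host, port) for port in ports}
```

Calling `check_ports("esx-host.example.local")` (a hypothetical host name) from the vCenter Server returns a dictionary of port to open/closed, the same information as the manual telnet test. As this troubleshooting log shows, a local firewall on the vCenter Server can still be the real culprit even when the ports look open.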

What worked / what I forgot to check: noticed an issue with the Managed IP Address in the vCenter settings, as the IP address of one of the hosts was entered there.
Changed it to the vCenter IP.
In the end it turned out to be an issue with Windows Firewall. Disabling the firewall reconnected all the hosts.
The hosts stayed connected after that. Also worked on the DNS pointers and fixed the DNS issues, as the IP address of one of the hosts had been changed.

Thursday, December 6, 2012

vSphere 5 LACP and Virtual Connect

Concern: how to configure the Virtual Connect downlinks in line with the vSphere LACP feature.
What I came to know:

VMware provides a number of different NIC teaming algorithms. Any of them can be used except IP Hash: IP Hash requires switch-assisted load balancing (802.3ad), and Virtual Connect does not support 802.3ad on server downlink ports.

HP and VMware recommend using Originating Virtual Port ID with a standard vSwitch, and Physical NIC Load when using vDS and NetIOC.

See: HP BladeSystem Networking Reference Architecture

Saturday, December 1, 2012

Deploy ESX host by cloning USB disk or SAN disk

To deploy ESX hosts quickly, you might opt to clone the USB disk or the SD card.
If you clone the install disk, you have to modify /etc/vmware/esx.conf to update the MAC addresses of your physical NICs.

Delete the following lines and then reboot your ESX host so that esx.conf is automatically updated with the new MAC addresses:

/net/pnic/child[0000]/mac = "00:0c:29:73:c2:ee"
/net/pnic/child[0000]/virtualMac = "00:50:56:53:c2:ee"
/net/pnic/child[0000]/name = "vmnic0"
/net/pnic/child[0002]/mac = "00:0c:29:73:c2:02"
/net/pnic/child[0002]/virtualMac = "00:50:56:53:c2:02"
/net/pnic/child[0002]/name = "vmnic2"
/net/pnic/child[0001]/mac = "00:0c:29:73:c2:f8"
/net/pnic/child[0001]/virtualMac = "00:50:56:53:c2:f8"
/net/pnic/child[0001]/name = "vmnic1"
/net/pnic/child[0003]/mac = "00:0c:29:73:c2:0c"
/net/pnic/child[0003]/virtualMac = "00:50:56:53:c2:0c"
/net/pnic/child[0003]/name = "vmnic3"
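
Deleting these lines by hand is error-prone when cloning many hosts. Here is a minimal Python sketch of the same clean-up, assuming the entries always have the `/net/pnic/child[NNNN]/...` form shown above. The function name is made up for illustration, and the `/some/other/key` line in the sample is a hypothetical unrelated entry, not a real esx.conf key.

```python
import re

# Matches the per-NIC keys listed in the post: mac, virtualMac and name.
PNIC_LINE = re.compile(r"^\s*/net/pnic/child\[\d+\]/(?:mac|virtualMac|name)\s*=")


def strip_pnic_entries(esx_conf_text: str) -> str:
    """Return the esx.conf text without the physical-NIC mac/virtualMac/name lines."""
    kept = [
        line
        for line in esx_conf_text.splitlines(keepends=True)
        if not PNIC_LINE.match(line)
    ]
    return "".join(kept)


sample = (
    '/net/pnic/child[0000]/mac = "00:0c:29:73:c2:ee"\n'
    '/net/pnic/child[0000]/virtualMac = "00:50:56:53:c2:ee"\n'
    '/net/pnic/child[0000]/name = "vmnic0"\n'
    '/some/other/key = "kept"\n'  # hypothetical unrelated entry
)
cleaned = strip_pnic_entries(sample)
print(cleaned, end="")
```

Run something like this against a copy of esx.conf from the cloned disk, and keep the original file as a backup before rebooting.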

BTW, I'm not sure whether deploying ESXi hosts by cloning the install disk is supported, but if it works for you, please leave a comment after you have tried it on your test machine.

Friday, November 30, 2012

vCenter 5.1 SSO few known installation issues & suggestions

  • Installing vCenter Single Sign On fails with the error: Error 20010: Failed to configure LookupService
  • vCenter Single Sign On installer reports the error: Error 29155.Identity source discovery error.
also try some known issues here
  • "Error 20010.Failed to configure LookupService" error.
Managed to overcome this by creating a single user for the RSA database (after executing the createtablespace script).

This user was given db_owner permissions and sysadmin (temporarily), as the installation doc suggests.
  • Don't use any special character other than dots in your admin password, and it will work.
  • If you have the word PORT in any part of your domain name or FQDN for vCenter, the installer will throw errors!
  • If you have a hyphen "-" in any part of your domain name or FQDN for vCenter, the installer will throw errors!
  • Try changing the name to an FQDN.
  • Pre-create the tablespace and the users (RSA_DBA and RSA_USER) and let the SSO installation create the tables and schemas during installation.

Host Disconnected from vcenter server

Issue: a host is disconnected from the vCenter Server, and you might have already referred to the following

What solved it: check whether any VM on that host has snapshots.
If removing that VM from the host's inventory lets the host stay connected to the vCenter Server, then the VM is the cause.
If VMs on that host have snapshots, connect directly to the host using the vSphere Client and consolidate them through the GUI or CLI.
If the snapshot consolidation takes a long time and you don't want the host to stay disconnected until it finishes, you can remove the VM from the inventory, bring the host back online, and consolidate the VM's snapshots from a different host through the GUI or CLI.

The above can also be used if the vCenter Server service is not starting and you have already tried the following.
Connect directly to all the hosts in that cluster, consolidate all the snapshots, and try restarting the vCenter Server. It has worked in 2 incidents at a customer site.

Thursday, November 29, 2012

VMware Snapshot removal stuck, snapshot helper file in GUI

Problem Description: a VMware Remove Snapshot operation on one of our Windows 2008 R2 MS Exchange servers had not completed after 8 hours. It was causing performance issues for our users.

Troubleshooting: no actions other than monitoring the progress, which was stuck at 95%.

Resolution:
A snapshot helper file was seen in the Snapshot Manager GUI, but no snapshots were visible.
The VM's Edit Settings dialog showed that the disks were pointing to snapshots.
Waited for the Helper -7 snapshot to clean up, but it hadn't after 10 hours.
Connected to the ESXi server, signed in, and restarted the management services, which fixed the issue.
The snapshots cleared and the VM was accessible.

Thursday, November 15, 2012

sVMotion VM won't boot after a reboot and becomes corrupt

Issue: Storage vMotion of a VM to a different array works fine, but after the first reboot the VM fails to boot or becomes corrupt.
Cause: the VAAI primitive "Block Zero".
Resolution: disable only the "Block Zero" (HardwareAcceleratedInit) VAAI primitive. You can leave ATS and XCOPY enabled.
public advisory
how to disable VAAI? 

as a solution you might try