Thursday, June 20, 2013

Your VMware vswitch redundancy may not be so failproof

Incident: A host with 4 vmnics recently lost its network connection because of a vmnic failure. You might be wondering how all the vmnics could fail at the same time.
Design flaw: vswitch0 had 4 vmnics in it (vmnicA0, A1, A3, A4), but it still failed because these vmnics were not separate entities; they were 4 ports of one physical NIC. It was a quad-port NIC, and the firmware on the PCI NIC crashed momentarily. Even though the outage lasted less than a minute, it was still an issue.
Design consideration:
Let us say the host has 2 quad-port NICs, NIC A and NIC B.
Make sure every vswitch is made up of vmnics from both NIC A and NIC B.
Example:
vswitch0=vmnicA0+vmnicB0
Pros: this avoids 3 single points of failure:
a) hardware failure of a single multi-port PCI NIC,
b) failure/crash of the firmware of the multi-port NIC,
c) failure/crash of the driver of the multi-port NIC.
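
The design rule above can be checked mechanically. Here is a minimal sketch in Python, assuming you have already written down which physical card each vmnic belongs to. The inventory below is hypothetical (only vmnicA0 and vmnicB0 come from the example above), and nothing here talks to a real host.

```python
def vswitches_on_single_card(vswitch_uplinks, uplink_to_card):
    """Return the vswitches whose uplinks all sit on one physical card."""
    at_risk = []
    for vswitch, uplinks in vswitch_uplinks.items():
        # Collect the distinct physical cards behind this vswitch's uplinks.
        cards = {uplink_to_card[uplink] for uplink in uplinks}
        if len(cards) < 2:
            at_risk.append(vswitch)
    return sorted(at_risk)


# Hypothetical inventory: which physical card each vmnic is a port of.
uplink_to_card = {
    "vmnicA0": "NIC A", "vmnicA1": "NIC A", "vmnicA2": "NIC A",
    "vmnicB0": "NIC B", "vmnicB1": "NIC B",
}

# vswitch0 follows the design rule; vswitch1 repeats the design flaw.
vswitch_uplinks = {
    "vswitch0": ["vmnicA0", "vmnicB0"],
    "vswitch1": ["vmnicA1", "vmnicA2"],
}

print(vswitches_on_single_card(vswitch_uplinks, uplink_to_card))
```

Any vswitch in the printed list loses all its uplinks if that one card, its firmware or its driver fails.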

Wednesday, June 19, 2013

Recreating the vmdk when you can't find the flat-vmdk size

This is a personal note that I wanted to write down before I forget it, because I learned it from a friend over the phone.
We all know how to recreate vmdk descriptor files, but what if the command
ls -l vmdisk0-flat.vmdk
doesn't return any output or doesn't show the size of the flat vmdk?
Well, it turns out you can go to the datastore browser, check the size of the flat vmdk there and multiply it by 1024. The result is the same number that
ls -l vmdisk0-flat.vmdk
would have shown.
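
The arithmetic is simple enough to sketch in Python. This assumes the datastore browser reports the size in KB (which is what the factor of 1024 implies), and the 40 GB example value is made up. The sector count goes one step beyond the note above: to my knowledge the extent line in a descriptor file is expressed in 512-byte sectors, so treat that part as an assumption and verify it against a known good descriptor.

```python
SECTOR_BYTES = 512  # assumption: descriptor extents are counted in 512-byte sectors


def flat_vmdk_size(size_kb):
    """Convert a size in KB into bytes (what ls -l shows) and 512-byte sectors."""
    size_bytes = size_kb * 1024
    sectors = size_bytes // SECTOR_BYTES
    return size_bytes, sectors


# Hypothetical example: the datastore browser shows 41,943,040 KB (a 40 GB disk).
size_bytes, sectors = flat_vmdk_size(41943040)
print(size_bytes)  # 42949672960
print(sectors)     # 83886080
```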

Intermittent network drop/disconnectivity in VMware environment

It is hard to identify what is going wrong when your VMs or hosts have an intermittent network issue. Here are the notes I made during a recent encounter with this kind of problem.
Assume vswitch0 has 2 NICs (vmnic0, vmnic1).
Start a continuous ping to a test VM on the problematic host.
Set vmnic0 as active and vmnic1 as unused, and check for network drops.
Set vmnic1 as active and vmnic0 as unused, and check for network drops.
Let us assume that packets were dropped while vmnic1 was active.
Assume that
vmnic1 is connected to switchport1
vmnic0 is connected to switchport0
Swap those connections:
vmnic1 will now connect to switchport0
vmnic0 will now connect to switchport1
and check whether you still have packet drops on the same vmnic1.
If yes, then either the cable or vmnic1 might be faulty.
[You can isolate this by replacing the cable with a known good one]
If no, and the packets are now dropping on vmnic0, then either switchport1 or the cable is faulty.
[You can isolate this by replacing the cable with a known good one]
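
The decision after the swap can be written down as a tiny Python helper. It is only a sketch of the reasoning above, not a tool that talks to a host: the function name and its two boolean inputs are made up for illustration, and the "no drops on either vmnic" branch is my own addition for completeness.

```python
def suspects_after_swap(drops_on_vmnic1, drops_on_vmnic0):
    """Map what you observe after swapping the switch ports to the likely culprits.

    Before the swap, drops were seen with vmnic1 active on switchport1.
    After the swap, vmnic1 is on switchport0 and vmnic0 is on switchport1.
    """
    if drops_on_vmnic1:
        # The problem followed vmnic1 to the other switch port.
        return ["cable", "vmnic1"]
    if drops_on_vmnic0:
        # The problem stayed with switchport1.
        return ["cable", "switchport1"]
    # Assumption: no drops at all after the swap means the test is inconclusive.
    return []


print(suspects_after_swap(drops_on_vmnic1=True, drops_on_vmnic0=False))
print(suspects_after_swap(drops_on_vmnic1=False, drops_on_vmnic0=True))
```

In both non-empty cases the cable is still a suspect, which is why swapping in a known good cable is the next step.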

Extended scenario:
If you have 4 vmnics, divide them into groups of 2 [GroupA=vmnic0,1 GroupB=vmnic2,3].
Once you identify the group on which you see the packet drops, repeat the process for the vmnics inside that group.
For example, if the packets drop when GroupB is active, set GroupA and vmnic2 of GroupB as unused and vmnic3 of GroupB as active, and check for network drops.
Then do the same again, this time with GroupA and vmnic3 of GroupB as unused and vmnic2 of GroupB as active.
If the network drop is on vmnic2, keep only that vmnic as active and the others as unused, then swap the cable and the switch port one at a time to eliminate the possibility of either being faulty.

Before you do all this, make sure that the drivers and firmware on your I/O devices are up to date for your VMware OS version.

Thursday, June 13, 2013

Why I don't enable HT on my VMware hosts (ESXi 4.x/5.x)



Please refer to page 20/54 of the above document, where you should see the following text.
An ESX system enabled for hyper-threading should behave almost exactly like system without it. Logical
processors on the same core have adjacent CPU numbers, so that CPUs 0 and 1 are on the first core, CPUs
2 and 3 are on the second core, and so on.
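
The numbering described in that passage can be illustrated with a short Python sketch. It assumes exactly two logical processors per core, as in the quoted example. The helper names are made up for illustration and are not part of any VMware tool.

```python
THREADS_PER_CORE = 2  # assumption: two logical processors per physical core


def core_of(logical_cpu):
    """Return the physical core index for a logical CPU number."""
    return logical_cpu // THREADS_PER_CORE


def siblings(logical_cpu):
    """Return all logical CPU numbers that share a core with logical_cpu."""
    first = core_of(logical_cpu) * THREADS_PER_CORE
    return list(range(first, first + THREADS_PER_CORE))


print(core_of(0), core_of(1))  # 0 0  (CPUs 0 and 1 are on the first core)
print(siblings(3))             # [2, 3]  (CPUs 2 and 3 are on the second core)
```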
You may also refer
and
which is applicable for
Product Version(s):
VMware ESX 4.0.x
VMware ESX 4.1.x
VMware ESXi 4.0.x Embedded
VMware ESXi 4.0.x Installable
VMware ESXi 4.1.x Embedded
VMware ESXi 4.1.x Installable
VMware ESXi 5.0.x
So an incorrect HT configuration may cause issues later; for a negligible or zero performance gain, enabling HT invites the possibility of the following problems. If HT is enabled on the physical host, each VM might have to be configured separately so that it can take advantage of HT. Imagine 20 VMs running on an HT-enabled host, with 5 such hosts in a cluster: we are looking at manually reconfiguring 100 VMs.
To configure one virtual machine to use hyper-threading with NUMA, add numa.vcpu.preferHT=TRUE to that virtual machine's advanced configuration:
Right-click the VM.
Select Edit Settings.
Click the Options tab.
Highlight General under Advanced options and click Configuration Parameters.
Or enable it on all VMs:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2003582
So even if HT is enabled on the host, it is actually not being used unless the VMs are configured manually. In my experience more than 90% of users are not aware that they have to enable this on the VMs too, and even if they are, nobody wants to do that manually on 100 VMs for a performance gain which is near 0 (as per the VMware documentation). So for a near-0 performance gain we are looking at the possibility of a PSOD. If these PSODs are exception 13/14, it is very hard to isolate which hardware caused them, and almost all the exception 13/14 PSODs that I have seen ended with one hardware replacement or another, mostly the CPU.
Page 18/60
If the hardware and BIOS support hyper-threading, ESX automatically makes use of it. For the best
performance we recommend that you enable hyper-threading, which can be accomplished as follows
However, this gain has only been seen on ESXi 5.x; many users I have dealt with who were running 4.1 did not see any considerable performance gain from enabling HT.
So, to reiterate, HT is supported but not advised unless users are (made) aware of all the cons they get in exchange for a negligible performance gain. Some customers with database VMs have seen lower performance on their VMs after enabling HT. When HT is enabled, processor resources such as the L2 and L3 caches are shared. This means that the two threads running on the same processor compete for the same resources if they both have high demand for them, which can in turn degrade performance.
Until you have more vCPUs requesting processing power than there are physical cores, HT cannot hurt and provides no value. In other words, VMware will try to use HT only when all the physical cores of the CPU are running at nearly 100%. But with DRS enabled on the cluster, the VMs will automatically be moved to other hosts, reducing the load on the host, so the host will practically never reach a state where the CPU is utilized at nearly 100%.
Almost 99% of the VMware hosts that I have seen so far run out of memory before the CPU usage of the host can reach the 90% mark. This again ensures that the host's physical cores never go above the 90% mark, because once memory reaches 90% before the CPU does, the VMs will be moved to other hosts either by the user or by DRS.
References:
http://communities.vmware.com/docs/DOC-5101
http://vmguy.com/wordpress/index.php/archives/362 
http://serverfault.com/questions/194377/will-disabling-hyperthreading-improve-performance-on-our-sql-server-install
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2008843
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2012404
http://communities.vmware.com/thread/422723?start=0&tstart=0

Thursday, June 6, 2013

(Solved) Microsoft Office installer and Error Code 1303: Setup cannot access the folder

Scenario: I recently had to uninstall Office 2013 because it used to start configuring Word or Outlook whenever I started Word or Outlook. After wasting hours on it (well, it's a Microsoft product, so ....) I uninstalled it and started installing Office 2010, which I had earlier, but the installer started giving me errors.
After again wasting hours on this in my office, the following worked for me.
Right-click the following folder and select Properties:
C:\Program Files\Microsoft Office
In the Properties window, select the Security tab.
Select SYSTEM.
Click Edit.
Give full permissions and click OK, then click OK again.
The installation should then proceed without errors.