Is the HP power setting impacting your performance?

In a great blog post, Andre Leibovici highlighted a default HP BIOS setting which could be impacting the performance of your VMs if your environment shows the following symptoms;

  • low physical CPU utilisation
  • higher than expected CPU %Ready times

Julian Wood has also blogged about this issue (Your HP blades may be underperforming) but neither post goes into much detail about the fix. Having investigated, I thought I’d record it here for others’ convenience.

To check for these symptoms you could use the VI client, ESXTOP in batch mode combined with the batch processing scripts in the vMA to capture pCPU statistics from a group of servers, or PowerCLI, whichever suits your skillset.

We run HP C-class blades and after checking the VMware knowledgebase article KB1018206 and a sample of our BIOS settings we found that it applied to us too – not surprising as we don’t modify the BIOS defaults during provisioning.

Using a mixture of ESXTOP and vCenter’s performance charts I was able to confirm that %CPU Ready was hovering around the 4% mark even when the physical host was using less than 15% pCPU. After changing the power setting the same VMs (under a similar load) dropped to under 1% CPU Ready (the change was made at 17:00 if you look at the graph).
Not necessarily a show stopper but definitely an improvement.
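If you're comparing the two tools, bear in mind that ESXTOP reports %RDY directly, whereas vCenter's realtime charts report CPU Ready as a summation in milliseconds per sample. A minimal sketch of the conversion follows, assuming the 20 second realtime sample interval (the function name ConvertTo-CpuReadyPercent is my own and is not part of PowerCLI);

```powershell
function ConvertTo-CpuReadyPercent {
    param(
        # CPU Ready summation value from the vCenter chart, in milliseconds
        [Parameter(Mandatory = $true)]
        [double]$ReadyMs,
        # Length of the chart's sample interval in seconds (20 for realtime charts)
        [int]$IntervalSeconds = 20
    )
    # ready % = ready time / interval length, both expressed in milliseconds
    [math]::Round(($ReadyMs * 100) / ($IntervalSeconds * 1000), 2)
}

ConvertTo-CpuReadyPercent -ReadyMs 800   # 800ms of a 20000ms sample = 4
```

As I understand it, the VM-level counter is summed across all of a VM's vCPUs, so divide by the vCPU count if you want a per-vCPU figure.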

For my infrastructure (with around 160 physical blades) changing them all was a time consuming process (and could potentially be disruptive depending on whether your ESX/i hosts are all clustered).

You can check the current power management setting in various ways;

  • in the BIOS settings (slow and potentially disruptive)
  • via the ILO (under Power Management, Power settings) or via the ILO CLI
  • in the VI client. If the underlying BIOS is set to Dynamic Power Savings it’ll show as ‘Not Supported’, i.e. the hardware is controlling power management. Where to check depends on your version of ESX (or ESXi);
    • For a 4.0 host go to Configuration -> Processors and look at the Power Management settings.
    • For a 4.1 host go to Configuration -> Power Management and look at the Active Policy. You can also configure it using the Properties button.
  • You can also use PowerCLI (ESX4 only) by querying the host’s Advanced setting ‘Power.cpupolicy’
    get-vmhost myhost | get-vmhostAdvancedConfiguration -name Power.cpupolicy
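To check many hosts rather than one, the same query can be wrapped in a small function. This is only a sketch. It assumes PowerCLI is loaded, that you're already connected via Connect-VIServer, and that Get-VMHostAdvancedConfiguration returns a hashtable keyed by setting name. The function name and the -Reader parameter are my own inventions (the latter simply lets you try the function without a live host);

```powershell
function Get-VMHostPowerPolicy {
    [CmdletBinding()]
    param(
        # Host object, typically piped in from Get-VMHost
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        $VMHost,
        # Hypothetical hook: how to read the setting for one host.
        # The default uses the same PowerCLI cmdlet as the one-liner above.
        [scriptblock]$Reader = {
            param($h)
            (Get-VMHostAdvancedConfiguration -VMHost $h -Name 'Power.CpuPolicy')['Power.CpuPolicy']
        }
    )
    process {
        # Emit one row per host so the results can be sorted, filtered or exported
        [pscustomobject]@{
            VMHost    = $VMHost.Name
            CpuPolicy = & $Reader $VMHost
        }
    }
}

# Example usage against every host in the connected vCenter:
# Get-VMHost | Get-VMHostPowerPolicy | Sort-Object VMHost
```

Piping the result to Export-Csv gives you a simple before and after record if you're changing a large number of blades.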
Changing power saving via the ILO


Error adding datastores to ESXi resolved using partedUtil

UPDATE Sept 2015 – there is new functionality in the vSphere Web Client (v6.0u1) that allows you to delete all partitions – good info via William Lam’s website. Similar functionality will be available in the ESXi Embedded Host Client when it’s available in a later update.

UPDATE March 2015 – some people are hitting a similar issue when trying to reuse disks previously used by VSAN. The process below may still work but there are a few other things to check, as detailed here by Cormac Hogan.

Over the Christmas break I finally got some time to upgrade my home lab. One of my tasks was to build a new shared storage server and it was while installing the base ESXi (v5, build 469512) that I ran into an issue. I was unable to add any of the local disks to my ESXi host as VMFS datastores as I got the error “HostDatastoreSystem.QueryVmfsDatastoreCreateOptions” for object ‘ha-datastoresystem’ on ESXi….” as shown below;

The VI client error when adding a new datastore

I’d used this host and the same disks previously as an ESX4 host so I knew hardware incompatibility wasn’t an issue. Just in case, I tried VMFS3 (instead of VMFS5) with the same result. I’ve run into a similar issue before with HP DL380 G5s where the workaround is to use the VI client connected directly to the host rather than vCenter. I connected directly to the host but got the same result. At this point I resorted to Google as I had a pretty specific error message. One of the first pages was this helpful blogpost at Eversity.nl (it’s always the Dutch isn’t it?) which confirmed it was an issue with pre-existing or incompatible information on the hard disks. There are various situations which might lead to pre-existing info on the disk;

  • Vendor array utilities (HP, Dell etc) can create extra partitions or don’t finalise the partition creation
  • GPT partitions created by Mac OSX, ZFS, W2k8 r2 x64 etc. Microsoft have a good explanation of GPT.

This made a lot of sense as I’d previously been trialling this host (with ZFS pools) as a NexentaStor CE storage server.


Netapp daily checks – available inodes/maxfiles

Prior to buying Netapp Operations Manager we used to run lots of daily checks to ensure the uptime and health of our Netapp controllers. Many of these checks were written using the Data ONTAP Powershell Toolkit so I thought I’d post them up in case they’re of use to anyone else.

First up is a function to check inode usage against the ‘maxfiles’ value (the maximum number of inodes, and therefore files, that a volume can hold). This is typically a large number (often in the millions) and is based on the volume size, but we had an Oracle process which dumped huge numbers of tiny files on a regular basis, consuming all the available inodes. This article only covers checking for these occurrences – if you need a fix I’d suggest checking out Netapp’s advice or this discussion for possible solutions.

Simply add the function (below) to your Powershell profile (or maybe build a module) and then a Powershell one-liner can be used to check;

connect-NaController yourcontroller | get-NaMaxfiles -Percent 30

This will give you output like this;

Controller : Netapp01
Name       : test_vol01
FilesUsed  : 268947
FilesTotal : 778230
%FilesUsed : 35

Controller : Netapp01
Name       : test_vol02
FilesUsed  : 678111
FilesTotal : 1369688
%FilesUsed : 50

And here’s the function;

function Get-NaMaxfiles {
<#
.SYNOPSIS
 Find volumes where the number of files used is greater than a specified percentage of the maxfiles value (default 50%).
.DESCRIPTION
 Find volumes where the number of files used is greater than a specified percentage of the maxfiles value (default 50%).
.PARAMETER Controller
 NetApp Controller to query (defaults to current controller if not specified).
.PARAMETER Percent
 Filters the results to volumes when the %used files is greater than the number specified. Defaults to 50% if not specified.
.EXAMPLE
 connect-NaController zcgprsan1n1 | get-NaMaxfiles -Percent 30

 Get all volumes on filer zcgprsan1n1 where the number of files used is greater than 30% of the max available
#>
 [cmdletBinding()]
 Param(
 [Parameter(Mandatory=$false,
 ValueFromPipeLine=$true
 )]
 [NetApp.Ontapi.Filer.NaController]
 $Controller=($CurrentNaController)
 ,
 [Parameter(Mandatory=$false)]
 [int]
 $Percent=50
 )
 Begin {
 #check that a controller has been specified
 }
 Process {
 $exception = $null
 try {
 # create a null valued instance of $vols within the local scope
 $vols = $null
 $vols = Get-NaVol -controller $Controller -ErrorAction "Stop" | where {$_.FilesTotal -gt 0 -and ($_.FilesUsed/$_.FilesTotal)*100 -gt $Percent}
 #check that at least one volume exists on this controller
 if ($vols -ne $null) {
 foreach ($vol in $vols) {
 #calculate the percentage of files used and add a field to the Volume object with the value
 $filesPercent = [int](($vol.FilesUsed/$vol.FilesTotal)*100)
 add-member -inputobject $vol -membertype noteproperty -name Controller -value $Controller.Name
 add-member -inputobject $vol -membertype noteproperty -name %FilesUsed -value $filesPercent
 }
 }
 }
 catch {
 $exception = $_
 }
 if ($exception -eq $null) {
 $returnValue = ($vols | Sort-Object -Property "Used" -Descending | Select-Object -Property "Controller","Name","FilesUsed","FilesTotal","%FilesUsed")
 }
 else {
 $returnValue = $exception
 }
 return $returnValue
 }
}
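One subtlety worth knowing: the function filters on the exact ratio but displays a rounded integer. The sketch below reproduces just that comparison against stand-in objects, so it runs without the toolkit or a controller. The Select-InodeUsage name and the ExactPercent column are mine, purely for illustration, and the figures are the two volumes from the sample output above;

```powershell
# Stand-in objects carrying only the properties the filter needs
$sample = @(
    [pscustomobject]@{ Name = 'test_vol01'; FilesUsed = 268947; FilesTotal = 778230 },
    [pscustomobject]@{ Name = 'test_vol02'; FilesUsed = 678111; FilesTotal = 1369688 }
)

function Select-InodeUsage {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Volume,
        [int]$Percent = 50
    )
    process {
        # Skip volumes reporting zero total files to avoid dividing by zero
        if ($Volume.FilesTotal -gt 0) {
            $ratio = ($Volume.FilesUsed / $Volume.FilesTotal) * 100
            # The comparison uses the unrounded ratio, as in Get-NaMaxfiles
            if ($ratio -gt $Percent) {
                [pscustomobject]@{
                    Name         = $Volume.Name
                    ExactPercent = [math]::Round($ratio, 2)
                    '%FilesUsed' = [int]$ratio
                }
            }
        }
    }
}

$sample | Select-InodeUsage -Percent 30   # both volumes are returned
$sample | Select-InodeUsage -Percent 50   # nothing is returned
```

test_vol02 is displayed as 50% but is really a little under that, so it isn't returned at the default threshold. If you're alerting on a specific figure, set -Percent slightly lower than the number you care about.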

NVRAM problems on Netapp 3200 series filers

———————————————–

UPDATE FEB 2012 – Netapp have just released a firmware update for the battery and confirmed that all 32xx series controllers shipped before Feb 2012 are susceptible to this fault. You can read more (including instructions for applying the update – it’s NOT click, click, next) via the official Netapp KB article. I’ll be applying this to my production controllers soon so I’ll let you know if I encounter any problems.

———————————————–

Recently (Dec 2011) I’ve been experiencing a few issues with the newer Netapp filers at my work, specifically the 3240 controllers. There is currently a known issue with NVRAM battery charging which, if you’re not aware of it, can result in unplanned failovers of your Netapp controllers. This applies to the 3200 series (including the v3200 and SA320).

We have six of these controllers and my first warning (back at the beginning of November) was an autosupport email notification;

Symptom: BATLOW:HA Group Notification from <myfilername> (BATTERY LOW) WARNING

This message indicates that the NVRAM or NVMEM battery is below the minimum voltage required to safeguard data in the event of an unexpected disruption.

If the system has been halted and powered off for some time, this message is expected. This message repeats HOURLY as long as NVRAM or NVMEM battery is below the minimum voltage, if you are using ONTAP version 7.5, 8.1, or greater with an appliance that uses an NVMEM battery, the error will repeat WEEKLY.

When the storage controller is up and running, the battery will be charged to its normal operating capacity and this message should stop. However, if this message persists, there may be a problem with the NVRAM or NVMEM battery.

This was unexpected but a faulty backup battery wasn’t an immediate priority – after all it’s only required to protect against power failures or controller crashes which are pretty rare. A few days later it became a high priority after the controller failed over unexpectedly. This failure was actually triggered by the low battery level and is expected behaviour as documented in Netapp KB2011413, though it’s not made overly clear that a controller shutdown is the default action if the battery issue persists for 24 hours.

I logged a call with Netapp but they were unaware of any systemic issues and despite pointing out that this was affecting all six of our controllers they simply sent replacement NVRAM batteries and suggested we swap them all out. I posted a question on the Netapp forums but at the time no-one else seemed to be having the same issue. The new batteries were duly fitted and the problem seemed to be resolved – I’ve since rechecked our battery charges and they’re stable at around 150 hours.

An update in an email we received from Netapp on the 22nd December now states that it’s a known firmware issue with a permanent fix currently expected in Feb 2012. Netapp advise that further downtime will be required to implement the fix when it’s made available.

Don’t ignore low battery alerts!


The London VMware usergroup (26th Jan 2012)

It’s that lovely time of year again (and I don’t mean Xmas!) when the next London VMware usergroup is open for registration! If you’re not familiar with the LonVMUG (where have you been?) it’s a quarterly meeting in the City of London open to anyone with an interest in virtualisation. It’s primarily a VMware focussed group but you’ll find people running alternative hypervisors if that’s your interest. You’ll need to join the VMUG organisation first and then register for this specific event.

If you haven’t attended before you may be wondering “What’s in it for me?”. Off the top of my head I’d say the following;

  • Everyone at the LonVMUG has something to say and useful experiences. Find people with the same challenges as you and get talking!
  • Hear about third party products (with demos)
  • Get hands on with Labs
  • Meet the experts and ask questions. There’s a lot of collective knowledge at the average VMUG with vExperts aplenty;
    • Fancy meeting one of a rare breed, a VCDX? Chris Kranz (@ckrantz) will be in attendance on the 26th Jan, I swear he knows everything about everything!
    • Into Powershell/PowerCLI? How about Jonathan Medd (@jonathanmedd) or Al Renouf (@alanrenouf) – Powershell gurus and book authors!
    • Using EMC at work or thinking of building a home or work lab? Seek out Simon Seagrave (@kiwi_si) – EMC and home labs guru
    • Maybe you’re an ISP and you want to know more about the VMware cloud offerings? Then seek out Simon Gallagher (@vinf_net) – vCloud specialist, vTardis inventor
    • Are you an SME with a broad interest in all things virtualisation? Barry Coombs (@virtualisedreal) is often along and specialises in this market.
    • Disaster recovery your thing? Mike Laverick’s written the book on SRM (several revisions in fact) and he can often be found dispensing his wisdom on a multitude of topics both during the day and in the pub afterwards.
    • …and too many others to mention!
  • Best of all this is all free!

The 26th Jan 2012 agenda (or download the PDF version);

  • 10:00 – 10:15: Welcome
  • 10:15 – 11:00: Intelligent Application Awareness in VMware Environments (Lorenzo Galelli, Symantec)
  • 11:00 – 11:45: Would you like fries with your VM? (Chris Kranz)
  • 11:45 – 12:15: Break in Thames Suite
  • 12:15 – 13:00
    • Building 1000 hosts in 10 mins with Auto Deploy (Alan Renouf, VMware)
    • End User Computing : Today & Tomorrow (Simon Richardson, VMware)
  • 13:00 – 14:00: Lunch
  • 14:00 – 14:50
    • Stop the Virtualization Blame Game (Ben Vaux, Xangati)
    • VMware Data Protection in a Box (Suresh Vasudevan, Nimble Storage)
  • 15:00 – 15:50
    • A little orchestration after lunch (Michael Poore)
    • Private vCloud Architecture Deep Dive (Dave Hill, VMware)
  • 16:00 – 16:50
    • Virtualisation on Cisco UCS (Colin Lynch)
    • VCP5 Tips and Tricks (Gregg Robertson)
  • 17:00 – 17:15: Close
  • 17:15 onwards: Drinks at Pavilion End

It’s a great agenda and I’ll be supporting a few friends who are presenting. Don’t miss out!

Where to go for the usergroup (make sure you register beforehand);

London Chamber of Commerce and Industry
33 Queen Street
London, EC4R 1AP (map)

Where to go for drinks afterwards;

The Pavilion End pub (official website)
23 Watling Street, Moorgate
London
EC4M 9BR (map)

Twitter: @lonvmug (or hashtag #lonvmug)