VCAP-DCA study notes–4.1 Complex HA solutions

HA basics

Requirements;

  • Shared storage
  • Common networks
  • Ideally similar (or identical) hardware for each host

A good way to check that all hosts have access to the same networks and datastores is to use the ‘Maps’ feature. Select your cluster, then deselect every option except ‘Host to Network’ or ‘Host to Datastore’.

Maps help determine cluster validity

As you can see in this diagram, the ’15 VLAN’ portgroup is not presented to every host (it’s slightly removed from the circle), and at least one VM in the cluster (top right) has a network assigned which isn’t available in this cluster at all.
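
If you’d rather script the check, a minimal PowerCLI sketch along these lines lists what each host can see (the vCentre address and ‘Prod01’ cluster name are hypothetical, and a reasonably recent PowerCLI build is assumed):

  # List the port groups and datastores visible to each host in the cluster,
  # so mismatches stand out without drawing the map
  Connect-VIServer -Server vcentre.example.com
  foreach ($vmhost in Get-Cluster 'Prod01' | Get-VMHost) {
      $portgroups = ($vmhost | Get-VirtualPortGroup | Sort-Object Name | ForEach-Object { $_.Name }) -join ', '
      $datastores = ($vmhost | Get-Datastore | Sort-Object Name | ForEach-Object { $_.Name }) -join ', '
      "{0}`n  Port groups: {1}`n  Datastores: {2}" -f $vmhost.Name, $portgroups, $datastores
  }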

Clusters consist of up to 32 hosts. The first five hosts in a cluster will be primaries, the rest secondaries. You can’t set a host to primary or secondary using the VI client, but you can using the AAM CLI (not supported, see how in this Yellow bricks article). One of the primaries will be the ‘active primary’ which collates resource information and places VMs after a failover event.
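
To see which hosts currently hold the primary role, the following one-liner works from vSphere 4.1 onwards (‘Prod01’ is a hypothetical cluster name):

  # List the current HA primary nodes for the cluster
  Get-Cluster 'Prod01' | Get-HAPrimaryVMHost | Select-Object Name, ConnectionState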

Heartbeat options and dependencies

Heartbeats are used to determine whether a host is still operational

Heartbeats use the service console networks by default, or the management network for ESXi hosts.

They’re sent every second by default. Can be amended using das.failuredetectioninterval

Primaries send heartbeats to both other primaries and secondaries, secondaries only send to primaries.

After no heartbeats have been received for 13 seconds the host will ping its isolation address.

HA operates even when vCentre is down (the AAM agent talks directly from host to host), although vCentre is required when first enabling HA on a cluster.

Diagnosing issues with heartbeats – see VMware KB1010991
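
The heartbeat interval and failure detection timeout are HA advanced options; a quick way to check whether any have been overridden on a cluster is sketched below (assuming a PowerCLI build that includes Get-AdvancedSetting; cluster name hypothetical):

  # Show any das.* advanced options explicitly set in the cluster's HA configuration
  Get-AdvancedSetting -Entity (Get-Cluster 'Prod01') -Name 'das.*' | Select-Object Name, Value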

Cluster design

  • Primary/secondary distribution
    • No more than four blades per chassis (so a single chassis failure can’t take out all five primaries)
    • At least one primary must be online to join new hosts to cluster
    • Can be configured with aamCLI (but these settings are not persistent across reboots and are not supported)
    • Use the Get-HAPrimaryVMHost PowerCLI cmdlet (vSphere 4.1 onwards) – see the example above
  • Interactions between HA and DRS
    • Resource defragmentation (v4.1 onwards)
    • HA restarts the VM on whichever host has capacity, then DRS kicks in and load balances. The priority is restarting the VM, not optimal placement.
  • Large vs small cluster size – a larger cluster reduces the overhead of N + 1 architecture, but consider other factors such as LUN paths per host (only 255 LUNs per host, 64 NFS datastores – see the sketch after this list). See this great post at Scott Drummonds’ Pivot Point blog and Duncan Epping’s follow-up post.
  • Enough capacity? Look at performance stats in vCentre for the running workloads.
  • DRS host affinity rules may be useful depending on storage implementation. You can pin VMs to specific hosts if the storage is not 100% shared (see VMworld session BC7803 for details)
  • Design networking to be resilient (dual pNICs for Service Console for example)
  • Avoid using ‘must’ host-affinity rules (introduced in vSphere 4.1) where possible, as they limit the ability of HA to recover VMs.
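
As a quick sanity check against the per-host storage limits mentioned above, something along these lines counts the datastores each host can see (cluster name hypothetical, recent PowerCLI assumed):

  # Count datastores per host - useful when large clusters push towards the LUN/NFS limits
  Get-Cluster 'Prod01' | Get-VMHost |
      Select-Object Name, @{Name='DatastoreCount'; Expression={ ($_ | Get-Datastore).Count }}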

Admission Control

Admission Control is a mechanism to ensure VMs get the resources they require, even when a host (or hosts) in a cluster fails. Admission control is ON by default.

Three admission policies;

  • No. of host failures to tolerate (default)
    • Generally conservative
    • Uses slots (can be customised). Reservations can cause sizing issues.
  • % of resources
    • More flexible when VMs have varying resource requirements
    • Resource fragmentation can be an issue
  • Dedicated failover host
    • Simple – what you see is what you get
    • Wastes capacity – specified host is not used during normal operations.
    • Sometimes dictated by organisational policies

    Both DRS and DPM respect the chosen admission control policy. This means, for example, that DPM will not put hosts into standby mode if doing so would violate the failover level. See VMware KB1007006 for details.
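
For the default policy the equivalent PowerCLI knobs are on Set-Cluster, as in this sketch (cluster name hypothetical; as far as I know the percentage and dedicated failover host policies have to be set via the vSphere Client or the API rather than Set-Cluster parameters):

  # Enable admission control with the 'host failures to tolerate' policy set to 1
  Get-Cluster 'Prod01' |
      Set-Cluster -HAAdmissionControlEnabled:$true -HAFailoverLevel 1 -Confirm:$false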

Analyse a cluster to determine appropriate admission control policy

Factors to consider;

  • Required failover capacity vs available failover capacity. The Dedicated failover host policy, for example, only allows a single failover host.
  • Similarity of hosts (percentage of resources policy better for disparate h/w)
  • Similarity of VMs (one oversized VM can skew slot sizing, and also affects the percentage of resources calculation)

Analyse slot sizing (inc. custom sizes)

Slot sizing;

  • Memory = largest memory reservation of any powered-on VM + its memory overhead (just the overhead if no reservations are set). Override using das.slotMemInMB; set a minimum using das.vmMemoryMinMB.
  • CPU = largest CPU reservation of any powered-on VM, or 256MHz if no reservations are set. Override using das.slotCpuInMHz; set a minimum using das.vmCpuMinMHz.
  • Current slot size is shown in ‘Advanced Runtime Info’ for the cluster
    NOTE: This only shows the total slots in the cluster rather than slots per host. If a particular host has more memory or CPU than the other hosts it will have a higher number of slots.
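
If one VM with a large reservation is skewing the slot size, the overrides above can be set as cluster advanced options; a sketch (assuming a PowerCLI build with New-AdvancedSetting; cluster name and value are hypothetical):

  # Cap the memory slot size at 1024MB regardless of the largest reservation
  New-AdvancedSetting -Entity (Get-Cluster 'Prod01') -Type ClusterHA -Name 'das.slotMemInMB' -Value 1024 -Confirm:$false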

NOTE:

· The ‘available slots’ figure shown in the ‘Advanced Runtime Info’ tab will be equal to (total slots – used slots) – slots reserved for failover (which isn’t shown in the dialog). This is why ‘used slots’ and ‘available slots’ don’t add up to ‘total slots’.

· The total number of slots takes virtualisation overhead into account. For example, in a cluster with 240GB RAM in total, only 210GB may be available to VMs (the rest being used by the VMkernel, the service console (on ESX), device drivers etc.). If the slot size is 2.2GB RAM there will be roughly 95 slots in total. See this VMware communities thread for more info on virtualisation overhead.

To calculate failover capacity
  1. Decide how many host failures you want to cope with
  2. Calculate the number of slots for each host in the cluster and therefore the total slots available (see Advanced Runtime Info). If all hosts are identical (CPU, memory), simply divide the total number of slots by the number of hosts.
  3. Subtract the slots of the largest x hosts from the total (where x is the number of failures to tolerate). This gives the number of slots that HA will keep reserved.
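
A plain PowerShell sketch of that arithmetic (memory slots only for simplicity, made-up numbers – the real calculation uses the more restrictive of the CPU and memory slot counts):

  $slotMemGB = 2.2                  # current memory slot size from Advanced Runtime Info
  $hostMemGB = @(64, 64, 48, 32)    # usable memory per host (hypothetical cluster)
  $failures  = 1                    # host failures to tolerate

  $slotsPerHost = $hostMemGB | ForEach-Object { [math]::Floor($_ / $slotMemGB) }
  $totalSlots   = ($slotsPerHost | Measure-Object -Sum).Sum
  $reserved     = ($slotsPerHost | Sort-Object -Descending | Select-Object -First $failures | Measure-Object -Sum).Sum

  "Slots per host: $($slotsPerHost -join ', ')"
  "Total slots   : $totalSlots"
  "Reserved by HA: $reserved"
  "Usable slots  : $($totalSlots - $reserved)"

With these numbers the hosts provide 29, 29, 21 and 14 slots respectively; HA reserves the 29 slots of the largest host, leaving 64 usable.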

NOTE: Using ‘No. of host failures’ often leads to a conservative consolidation ratio.

Percentage of resources gotcha – if you set it to 50% but have more than 10 hosts in your cluster you can run into problems. In theory you can still reserve enough capacity, but you can’t guarantee that a primary node will still be working – with only five primaries, losing half of a large cluster could take out every primary, and without a primary HA can’t restart VMs.

Isolation Response

Network isolation;

  • Heartbeat pings every second (das.failuredetectioninterval = 1000)
  • 15 second timeout (das.failuredetectiontime = 15000)
    • Increase to 20 seconds (20000) for a 2nd service console or a second isolation address
    • Increase to 60 seconds (60000) if portfast is not set (to allow time for spanning tree convergence)
  • Advanced settings
    • das.isolationaddressX – used to define additional isolation addresses
    • das.usevMotionNIC?? – used with ESXi (which has no service console)

    NOTE: There is a small chance that HA could shut down VMs and not restart them on another host. This only occurs when the isolated host returns to the network between the 14th and 15th second. In this case the isolation response is triggered but the restart isn’t, because by then the host is no longer considered failed (VMware KB2956923)
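
A sketch of setting the options above on a cluster (again assuming New-AdvancedSetting is available; the cluster name and address are hypothetical):

  $cluster = Get-Cluster 'Prod01'
  # Stretch the failure detection window to 20 seconds when a second isolation address is used
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name 'das.failuredetectiontime' -Value 20000 -Confirm:$false
  # Add a second isolation address, e.g. the gateway of a second service console network
  New-AdvancedSetting -Entity $cluster -Type ClusterHA -Name 'das.isolationaddress1' -Value '192.168.1.254' -Confirm:$false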

Default settings for isolation response;

  • ESX 3.5 (up to U2) – Power off
  • ESX 3.5 (U3 through U5) – Leave powered on
  • vSphere 4.x – Shut down

Restart interval after a failover

  • 2, 6, 14, 22, 30mins
  • Hosts may be in standby mode (when using DPM) so could take several mins for the host to power-up and be ready to host VMs
  • VM restart count (default 5) – das.maxvmrestartcount

Split brain

  • Occurs when both management network and storage fail (more likely with NFS, iSCSI or FCoE)
  • VM is restarted on another host but continues to run in memory on the isolated host. When that host rejoins the network the VM is running simultaneously on two hosts. Bad!
  • vSphere 4.0 U2 solves this. For prior versions, either avoid ‘Leave powered on’ as an isolation response or manually kill the VM processes on the isolated ESX host before it rejoins the network.

VM Monitoring

Not in the blueprint, but useful to know.

Application Monitoring

Not in the blueprint, but useful to know.

Operational considerations

You can monitor clusters using the following vCentre alarms;

  • the usual host alerts – host failed , thermal, memory usage over threshold etc
  • cluster high availability error – a specific error which you can set actions for

If you’re doing network maintenance, disable Host Monitoring in the cluster’s HA settings (rather than putting individual hosts into maintenance mode) to avoid the isolation response being triggered.

Tools & learning resources
