How to silence VMware vSAN Health Checks

A new feature in vSAN 6.6 is the ability to silence health checks. In previous versions, it was already possible to disable the alerts that health checks trigger. Silencing goes one step further and gives you a clean vSAN health view: silenced checks are displayed with a green checkmark and are marked as "Skipped".

This is especially useful in home labs, where unsupported hardware is common.

Currently, it is not possible to silence health checks with the vSphere Web Client, and the feature is not described in the documentation. It is available in RVC (Ruby vSphere Console) or through the vSAN Management API. This article focuses on RVC. If you are unfamiliar with RVC, see this article.

Add a check to the silent list:
vsan.health.silent_health_check_configure -a <Check ID> <CLUSTER>

Remove a check from the silent list:
vsan.health.silent_health_check_configure -r <Check ID> <CLUSTER>
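
When several checks have to be silenced at once, it can help to generate the command lines instead of typing each one. The following is only a minimal sketch: the helper name `silence_commands` is made up for illustration, and `~vsan66` is the cluster mark used in the example further down. It only builds the strings shown above and does not talk to RVC itself.

```python
def silence_commands(check_ids, cluster, remove=False):
    """Build one RVC command line per check ID.

    Uses -a to add a check to the silent list, or -r to remove it.
    """
    flag = "-r" if remove else "-a"
    return [
        f"vsan.health.silent_health_check_configure {flag} {check_id} {cluster}"
        for check_id in check_ids
    ]


# Hypothetical usage: print the commands so they can be pasted into RVC.
hcl_checks = ["controllerdriver", "controllerfirmware"]
for line in silence_commands(hcl_checks, "~vsan66"):
    print(line)
```

Passing `remove=True` produces the matching `-r` commands to undo the change.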

The following Check IDs are available:

Description | Check ID

Cloud Health
  Controller utility is installed on host | vendortoolpresence
  Controller with pass-through and RAID disks | mixedmode
  Customer experience improvement program (CEIP) | vsancloudhealthceipexception
  Disks usage on storage controller | diskusage
  Online health connectivity | vsancloudhealthconnectionexception
  vSAN and VMFS datastores on a Dell H730 controller with the lsi_mr3 driver | mixedmodeh730
  vSAN configuration for LSI-3108 based controller | h730
  vSAN max component size | smalldiskstest

Cluster
  Advanced vSAN configuration in sync | advcfgsync
  Deduplication and compression configuration consistency | physdiskdedupconfig
  Deduplication and compression usage health | physdiskdedupusage
  Disk format version | upgradelowerhosts
  ESXi vSAN Health service installation | healtheaminstall
  Resync operations throttling | resynclimit
  Software version compatibility | upgradesoftware
  Time is synchronized across hosts and VC | timedrift
  vSAN CLOMD liveness | clomdliveness
  vSAN Disk Balance | diskbalance
  vSAN Health Service up-to-date | healthversion
  vSAN cluster configuration consistency | consistentconfig
  vSphere cluster members match vSAN cluster members | clustermembership

Data
  vSAN VM health | vmhealth
  vSAN object health | objecthealth

Encryption
  CPU AES-NI is enabled on hosts | hostcpuaesni
  vCenter and all hosts are connected to Key Management Servers | kmsconnection

Hardware compatibility
  Controller disk group mode is VMware certified | controllerdiskmode
  Controller driver is VMware certified | controllerdriver
  Controller firmware is VMware certified | controllerfirmware
  Controller is VMware certified for ESXi release | controllerreleasesupport
  Host issues retrieving hardware info | hclhostbadstate
  SCSI controller is VMware certified | controlleronhcl
  vSAN HCL DB Auto Update | autohclupdate
  vSAN HCL DB up-to-date | hcldbuptodate

Limits
  After 1 additional host failure | limit1hf
  Current cluster situation | limit0hf
  Host component limit | nodecomponentlimit

Network
  Active multicast connectivity check | multicastdeepdive
  All hosts have a vSAN vmknic configured | vsanvmknic
  All hosts have matching multicast settings | multicastsettings
  All hosts have matching subnets | matchingsubnet
  Hosts disconnected from VC | hostdisconnected
  Hosts with connectivity issues | hostconnectivity
  Multicast assessment based on other checks | multicastsuspected
  Network latency check | hostlatencycheck
  vMotion: Basic (unicast) connectivity check | vmotionpingsmall
  vMotion: MTU check (ping with large packet size) | vmotionpinglarge
  vSAN cluster partition | clusterpartition
  vSAN: Basic (unicast) connectivity check | smallping
  vSAN: MTU check (ping with large packet size) | largeping

Performance service
  All hosts contributing stats | hostsmissing
  Performance data collection | collection
  Performance service status | perfsvcstatus
  Stats DB object | statsdb
  Stats DB object conflicts | renameddirs
  Stats master election | masterexist
  Verbose mode | verbosemode

Physical disk
  Component limit health | physdiskcomplimithealth
  Component metadata health | componentmetadata
  Congestion | physdiskcongestion
  Disk capacity | physdiskcapacity
  Memory pools (heaps) | lsomheap
  Memory pools (slabs) | lsomslab
  Metadata health | physdiskmetadata
  Overall disks health | physdiskoverall
  Physical disk health retrieval issues | physdiskhostissues
  Software state health | physdisksoftware

Stretched cluster
  Invalid preferred fault domain on witness host | witnesspreferredfaultdomaininvalid
  Invalid unicast agent | hostwithinvalidunicastagent
  No disk claimed on witness host | witnesswithnodiskmapping
  Preferred fault domain unset | witnesspreferredfaultdomainnotexist
  Site latency health | siteconnectivity
  Unexpected number of fault domains | clusterwithouttwodatafaultdomains
  Unicast agent configuration inconsistent | clusterwithmultipleunicastagents
  Unicast agent not configured | hostunicastagentunset
  Unsupported host version | hostwithnostretchedclustersupport
  Witness host fault domain misconfigured | witnessfaultdomaininvalid
  Witness host not found | clusterwithoutonewitnesshost
  Witness host within vCenter cluster | witnessinsidevccluster

vSAN iSCSI target service
  Home object | iscsihomeobjectstatustest
  Network configuration | iscsiservicenetworktest
  Service runtime status | iscsiservicerunningtest

 

Example
Silence all HCL-related checks in an unsupported home lab configuration. This typically includes the following Check IDs:

  • controllerdiskmode
  • controllerdriver
  • controllerfirmware
  • controllerreleasesupport
  • controlleronhcl

  1. Connect to the vCSA with SSH
  2. Open RVC
    # rvc administrator@vc.virten.lab
  3. Mark the vSAN cluster.
    This step is optional, but it lets you use ~vsan66 as the target in the following commands.

    > mark vsan66 vc.virten.lab/Datacenter/computers/vSAN66/
  4. Silence health checks
    > vsan.health.silent_health_check_configure -a controllerdiskmode ~vsan66
    Successfully add check "Controller disk group mode is VMware certified" to silent health check list for vSAN66
    > vsan.health.silent_health_check_configure -a controllerdriver ~vsan66
    Successfully add check "Controller driver is VMware certified" to silent health check list for vSAN66
    > vsan.health.silent_health_check_configure -a controllerfirmware ~vsan66
    Successfully add check "Controller firmware is VMware certified" to silent health check list for vSAN66
    > vsan.health.silent_health_check_configure -a controllerreleasesupport ~vsan66
    Successfully add check "Controller is VMware certified for ESXi release" to silent health check list for vSAN66
    > vsan.health.silent_health_check_configure -a controlleronhcl ~vsan66
    Successfully add check "SCSI controller is VMware certified" to silent health check list for vSAN66
  5. Verify the status with vsan.health.silent_health_check_status
    > vsan.health.silent_health_check_status ~vsan66
    Silent Status of Cluster vSAN66:
    +----------------------------------------------------+---------------------------+---------------+
    | Health Check                                       | Health Check Id           | Silent Status |
    +----------------------------------------------------+---------------------------+---------------+
    | Hardware compatibility                             |                           |               |
    |   Controller disk group mode is VMware certified   | controllerdiskmode        | Silent        |
    |   Controller driver is VMware certified            | controllerdriver          | Silent        |
    |   Controller firmware is VMware certified          | controllerfirmware        | Silent        |
    |   Controller is VMware certified for ESXi release  | controllerreleasesupport  | Silent        |
    |   Host issues retrieving hardware info             | hclhostbadstate           | Normal        |
    |   SCSI controller is VMware certified              | controlleronhcl           | Silent        |
    |   vSAN HCL DB Auto Update                          | autohclupdate             | Silent        |
    |   vSAN HCL DB up-to-date                           | hcldbuptodate             | Normal        |
    +----------------------------------------------------+---------------------------+---------------+
    

 

  6. To confirm that the changes worked, go to the cluster > Monitor > vSAN > Health and click Retest. The previously failed tests should turn green :-)
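
If you want to check the result from a script rather than by eye, the table printed by vsan.health.silent_health_check_status is regular enough to parse. The snippet below is a rough sketch and not part of RVC: the function name `parse_silent_status` is hypothetical, and it assumes the output keeps the three-column layout shown in the verification step above.

```python
def parse_silent_status(output):
    """Return {check_id: status} parsed from the status table text."""
    statuses = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the title line and the +---+ borders
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue  # not a three-column table row
        _name, check_id, status = cells
        if not check_id or check_id == "Health Check Id":
            continue  # skip the header row and category rows
        statuses[check_id] = status
    return statuses


# Sample rows copied from the output shown in this article.
sample = """\
| Health Check                                       | Health Check Id           | Silent Status |
| Hardware compatibility                             |                           |               |
|   Controller driver is VMware certified            | controllerdriver          | Silent        |
|   Host issues retrieving hardware info             | hclhostbadstate           | Normal        |
"""

silent = [cid for cid, state in parse_silent_status(sample).items() if state == "Silent"]
print(silent)
```

Comparing the resulting list against the Check IDs you intended to silence is a quick way to spot a mistyped ID.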
