ESXi Issues caused by hp-ams module

I recently ran into strange issues on Hewlett-Packard servers. The ESXi hosts randomly showed several different symptoms:

  • ESXi host unmanageable
  • ESXi host grayed out in vCenter
  • Starting host services fails with an error message:

    Call "HostServiceSystem.Restart" for object "serviceSystem-[*]" on vCenter Server * failed.

  • Cannot perform vMotion to or from the host
  • Starting virtual machine fails with an error message:

    Power On virtual machine *
    A general system error occurred: The virtual machine could not start
    VMK_NO_MEMORY

  • Restarting services in DCUI fails with an error message:

    A general system error occurred: Command /bin/sh failed

  • An SSH connection to the host can be established, but there is no response after the login request
  • Local console displays an error message:

    /bin/sh cannot fork

  • Error messages received at the syslog server:

    sfcb-HTTPS-Daemon[*]: handleHttpRequest fork failed: Cannot allocate memory
    crond[*]: can't vfork
    cpu*:*)WARNING: Heap: *: Heap_Align(globalCartel-1, 136/136 bytes, 8 align) failed.
    cpu*:*)WARNING: Heap: *: Heap globalCartel-1 already at its maximum size. Cannot expand)

  • DCUI message log (ALT+F12) displays an error message:

    WARNING: Heap: *: Heap globalCartel-1 already at its maximum size. Cannot expand.
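
The heap warning is the most specific of the symptoms listed above, so it is a reasonable thing to search for. The following is a minimal sketch, not a tested monitoring solution: the function name count_heap_warnings is made up for illustration, and the default path /var/log/vmkernel.log is an assumption about where the vmkernel log lives on your host. If you forward logs to a syslog server, point it at that file instead.

```shell
# Count vmkernel heap-exhaustion warnings in a log file.
# The default path is an assumption; pass your own log file as the first argument.
count_heap_warnings() {
    log="${1:-/var/log/vmkernel.log}"
    # grep -c prints the number of matching lines (0 if there are none)
    grep -c 'Heap globalCartel-1 already at its maximum size' "$log"
}
```

Calling count_heap_warnings without arguments checks the default log. Any result other than 0 suggests the host has already hit the problem.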

The problem was caused by the hp-ams module (HP Agentless Management Service) which has a known problem in these versions:

  • hp-ams 9.5
  • hp-ams 9.6
  • hp-ams 10.0

You can verify the version with the following command:

# esxcli software vib list | grep hp-ams
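
If you have many hosts to check, the comparison against the affected versions can be scripted. This is only a sketch under assumptions: the helper name hp_ams_status is hypothetical, and it assumes the VIB version string has the form <platform>.<major>.<minor>.<patch>-<build> (for example 500.9.6.0-12.434156, which is an illustrative value). Verify the format against the output on your own hosts before relying on it.

```shell
# Classify an hp-ams VIB version string against the affected versions
# named in this post (9.5, 9.6, 10.0). Assumed input format:
#   <platform>.<major>.<minor>.<patch>-<build>
hp_ams_status() {
    ver="${1#*.}"      # drop the leading platform prefix ("500.", "550.", ...)
    ver="${ver%%-*}"   # drop the build suffix after the first "-"
    case "$ver" in
        9.5|9.5.*|9.6|9.6.*|10.0|10.0.0) echo "affected" ;;
        *) echo "not-listed" ;;
    esac
}

# Possible usage on a host (assumes the version is the second column):
#   hp_ams_status "$(esxcli software vib list | awk '$1 == "hp-ams" {print $2}')"
```

The function prints "not-listed" rather than "ok" for everything else, because the post only names the affected versions and says nothing about releases older than 9.5.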

The issue has been resolved in hp-ams 10.0.1, which can be downloaded from the HP website.

If you cannot upgrade the server immediately due to change management processes, you can also mitigate the issue by stopping the hp-ams service and removing the package:

# /etc/init.d/hp-ams.sh stop 
# esxcli software vib remove -n hp-ams
10 thoughts on “ESXi Issues caused by hp-ams module”

  1. Interesting... last Friday I experienced exactly the same problem.
    I am still puzzled about what causes this bug. I had ESX hosts with hp-ams 9.6 running for half a year without issues.

    1. Me too... Some die after 4-5 weeks, some are still alive after months. Maybe something or someone triggers the hp-ams services, which causes the issue.

  2. It's caused by a memory leak that fills the RAM allocation on the ESXi host, meaning it cannot respond to requests. Everything else is just false alarms, to be honest.

    If you run esxcli on the host, you will find it gives the "cannot fork" message.

    I wrote a post on this a while ago.

    I would recommend you add in the HP repository to your update manager download sources to capture the latest HP customized drivers.

    http://www.educationalcentre.co.uk/automatically-download-hp-drivers-to-vmware-update-manager/

  3. Thanks for this post. I'm not sure the RAM usage has anything to do with it. For example, I've got a host with 213GB out of 255GB used (MemoryUsageGB from get-vmhost in powercli) and ssh start works, where other hosts at a lower memory usage level (180GB for example) are unable to have ssh started. Are you looking at another metric when you mention a RAM leak? Thanks

    1. The physical memory usage has nothing to do with this issue. The problem can happen with 180GB free physical memory.

      There is a separate limit for the vmkernel. I haven't figured out what that limit is or how to measure it.

      1. Many thanks. Until we refine it with something more specific, this seems to at least tell us when a host has hit the issue (splunk search - "esx5_syslog" "globalCartel-1 "). Update here if I find a way to measure.
