Nagios Check: VMware Virtual Machine Snapshot Age

As you might know, it is a VMware best practice not to keep snapshots for more than 24-72 hours. To monitor aged snapshots with Nagios, I created a Perl script that checks the whole vCenter for snapshots. The script raises a warning rather than a critical state, because I do not consider an old snapshot a critical event. You can easily change this behavior by changing the exit code to 2.

I set the allowed age to 3 days, based on VMware KB1025279. You can change the maximum allowed age to whatever you want by changing the threshold (in seconds) in the subroutine check_age.
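
As a rough illustration of that change, the limit could be passed in days instead of being hard-coded in seconds. This is only a minimal sketch and not part of the script below: the optional second argument `$max_age_days` and the helper `days_to_seconds` are hypothetical names introduced here.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a number of days to seconds (1 day = 86400 s).
sub days_to_seconds {
    my $days = shift;
    return $days * 86400;
}

# Variant of check_age with an optional maximum age in days.
# Returns 1 if the snapshot is older than the limit, 0 otherwise.
sub check_age {
    my ($date_created, $max_age_days) = @_;
    $max_age_days = 3 unless defined $max_age_days;   # 3 days, as in the post
    return 1 if (time() - $date_created) > days_to_seconds($max_age_days);
    return 0;
}

# A snapshot created 4 days ago exceeds the 3-day default,
# but is still acceptable with a 7-day limit.
my $created = time() - days_to_seconds(4);
print check_age($created), "\n";      # prints 1
print check_age($created, 7), "\n";   # prints 0
```

Keeping the default inside the subroutine means existing calls with a single argument behave as before.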

You can use this script as a check command in Nagios. It also produces multi-line output, which lets you see the virtual machines that caused the warning.
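
To make the output format concrete: a Nagios plugin prints a summary on the first line, any further lines are shown as long output, and the exit code carries the state (0 = OK, 1 = WARNING, 2 = CRITICAL). The sketch below builds such a result from mock data. The function name `format_result` and the VM names are made up for illustration, and the OK message is worded slightly differently from the script.

```perl
use strict;
use warnings;

# Build the plugin result from a list of { vm => ..., age => ... } records.
# Returns the Nagios exit code and the text to print.
sub format_result {
    my @old = @_;
    return (0, "No old Snapshots found.\n") unless @old;

    my $text = "Old Snapshots found.\n";    # first line: status summary
    # One additional line per affected VM (Nagios long output).
    $text .= sprintf("%s (%s Days)\n", $_->{vm}, $_->{age}) for @old;
    return (1, $text);                      # 1 = Nagios WARNING
}

# Mock data; in the real check these records come from the vCenter query.
my ($code, $text) = format_result(
    { vm => 'web01', age => 5 },
    { vm => 'db02',  age => 12 },
);
print $text;
print "exit code: $code\n";
```

Separating the formatting from the `exit` call makes it easy to switch the state to CRITICAL in one place.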

#!/usr/bin/perl
#
# Nagios check for dated VMware Snapshots. Works with vCenter 4&5
#
# This check searches the vCenter for snapshots and notifies when there is a
# snapshot that is older than 3 days. (Based on best practices KB1025279)
# This check produces a warning when an old snapshot is found. You can change
# the exit code if you want to receive a critical state.
#
# Author: Florian Grehl - www.virten.net
# Version: 1.0 - January 2013
#
# http://www.virten.net/2013/01/nagios-check-vmware-virtual-machine-snapshot-age/
#
# Usage:
# ./check_snapshot.pl --server <VCENTER> --username <USER> --password <PW>
#

use VMware::VIRuntime; 
use Date::Parse;
use POSIX;

my @old_snapshots = ();

Opts::parse();
Opts::validate();
Util::connect();

my $vm = Vim::find_entity_views(view_type => 'VirtualMachine');

# 1 Day = 86400s
# 3 Days = 259200s
# 1 Week = 604800s
sub check_age {
  my $date_created = shift;
  return(1) if ((time() - $date_created) > 259200);
  return(0);
}

sub check_snaplist {
  my $vm_name = shift;
  my $vm_snaptree = shift;
  foreach my $vm_snapshot (@{$vm_snaptree}) {
    my $date_snapshot = str2time($vm_snapshot->{createTime});
    next unless (check_age($date_snapshot));
    push @old_snapshots, {
      'vm'  => $vm_name,
      'age' => ceil((time() - $date_snapshot) / 86400),  # age in days, rounded up
    };
  }
}

foreach my $vm_view (@{$vm}) {
  my $vm_name     = $vm_view->{summary}->{config}->{name};
  my $vm_snaptree = $vm_view->{snapshot};
  next unless defined $vm_snaptree;
  check_snaplist($vm_name, $vm_snaptree->{rootSnapshotList});
}

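# All results are collected, so the vCenter session can be closed.
Util::disconnect();

if (scalar(@old_snapshots) > 0) {
  print "Old Snapshots found.\n";
  # One line per affected VM (Nagios multi-line output).
  foreach my $snap (@old_snapshots) {
    printf "%s (%s Days)\n", $snap->{'vm'}, $snap->{'age'};
  }
  exit 1; # Nagios: Warning
# exit 2; # Nagios: Critical
}
else {
  print "No old Snapshots found.\n";
  exit 0; # Nagios: OK
}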

This script has been tested with vSphere vCenter 4.x and 5.x using vMA 5.
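
Note that the script only looks at rootSnapshotList. Child snapshots are always newer than their parent, so the alert still fires, but nested snapshots are not inspected individually. A recursive walk could look roughly like the sketch below. It runs on mock hashes that mimic the `name`, `createTime` and `childSnapshotList` fields of the snapshot tree. As an assumption to keep it self-contained, `createTime` is given in epoch seconds instead of the date string that the script parses with str2time, and `collect_old` is a hypothetical name.

```perl
use strict;
use warnings;

# Recursively collect the names of all snapshots older than $max_age seconds.
# $tree is an array ref of snapshot nodes; each node may carry a
# childSnapshotList with nested snapshots.
sub collect_old {
    my ($tree, $max_age, $now) = @_;
    my @found;
    foreach my $snap (@{ $tree || [] }) {
        push @found, $snap->{name} if ($now - $snap->{createTime}) > $max_age;
        push @found, collect_old($snap->{childSnapshotList}, $max_age, $now);
    }
    return @found;
}

# Mock data: one root snapshot with two children.
my $now  = time();
my $tree = [
    {
        name              => 'before-patch',
        createTime        => $now - 10 * 86400,
        childSnapshotList => [
            { name => 'after-patch', createTime => $now - 5 * 86400 },
            { name => 'today',       createTime => $now - 3600 },
        ],
    },
];

# 259200 s = 3 days; prints "before-patch, after-patch"
print join(', ', collect_old($tree, 259200, $now)), "\n";
```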

  1. Question on your script. I like it but would like to modify it so that it can run against a specific VM. Is that possible? I see that you are making some calls to other routines in the script. I assume that is where the modifications would take place. Can you point me in the right direction? Thanks.

    Shark

  2. Nice work! We use Veeam to replicate servers, which leaves a number of snapshots on the powered-off replica, so I've done a very basic hack to ignore the replicas. Inserted at line 56:

    next if ($vm_name =~ /_replica/);

  3. I adapted the script a little: Warning triggers as soon as a snapshot exists, Critical triggers as soon as a snapshot is older than 12 hours. The age of the snapshots is reported in hours.


    #!/usr/bin/perl
    $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
    $ENV{SSL_verify_mode} = SSL_VERIFY_NONE;

    #
    # Nagios check for dated VMware Snapshots. Works with vCenter 4&5
    #
    # This check searches the vCenter for snapshots and notifies when there is a
    # snapshot that is older than 3 days. (Based on best practices KB1025279)
    # This check produces a warning when an old snapshot is found. You can change
    # the exit code if you want to receive a critical state.
    #
    # Author: Florian Grehl - www.virten.net
    # Version: 1.0 - January 2013
    #
    # http://www.virten.net/2013/01/nagios-check-vmware-virtual-machine-snapshot-age/
    #
    # Usage:
    # ./check_snapshot.pl --server <VCENTER> --username <USER> --password <PW>
    #

    use VMware::VIRuntime;
    use Date::Parse;
    use POSIX;

    my @old_snapshots = ();

    Opts::parse();
    Opts::validate();
    Util::connect();

    my $vm = Vim::find_entity_views(view_type => 'VirtualMachine');

    sub check_age {
      my $date_created = shift;
      return(1) if ((time() - $date_created) > 1); # trigger threshold in seconds
      return(0);
    }

    sub check_snaplist {
      my $vm_name = shift;
      my $vm_snaptree = shift;
      foreach my $vm_snapshot (@{$vm_snaptree}) {
        my $date_snapshot = str2time($vm_snapshot->{createTime});
        next unless (check_age($date_snapshot));
        $old_snapshots[scalar(@old_snapshots)] = {
          'vm' => $vm_name,
          'age' => ceil(((time() - $date_snapshot)/3600)), # snapshot age in hours
        };
      }
    }

    foreach my $vm_view (@{$vm}) {
      my $vm_name = $vm_view->{summary}->{config}->{name};
      my $vm_snaptree = $vm_view->{snapshot};
      next unless defined $vm_snaptree;
      check_snaplist($vm_name, $vm_snaptree->{rootSnapshotList});
    }

    if (scalar(@old_snapshots) > 0){
      print "Existing Snapshots found.\n";
      map {
        printf "%s (%s Hours) \n",
        $_->{'vm'}, $_->{'age'}
      } @old_snapshots;

      my $exit = 1;
      foreach (@old_snapshots) {
        my $old = $_->{'age'};
        if ($old >= 12) { # if a snapshot is older than 12 hours, raise the alert to Critical
          $exit = 2;
          last; # Perl leaves a loop with "last"; "break" is not a loop control keyword
        }
      }
      exit $exit;
      #exit 1; # Nagios: Warning
    } else {
      print "No Snapshots found.\n";
      exit 0; # Nagios: OK
    }
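
The ideas from the comments above (skipping VMs by name and using separate warning and critical thresholds) could be combined roughly as follows. This is a sketch on mock data, not a drop-in replacement for the script: `classify`, `%limits`, `$exclude` and the VM names are hypothetical, while the `_replica` pattern and the 12-hour limit are taken from the comments.

```perl
use strict;
use warnings;

# Hypothetical settings: warn on any snapshot, go critical from 12 hours on,
# and ignore VMs whose name matches the exclude pattern.
my %limits  = (warn_hours => 0, crit_hours => 12);
my $exclude = qr/_replica/;

# Map a list of { vm => ..., age => hours } records to a Nagios exit code:
# 0 = OK, 1 = WARNING, 2 = CRITICAL.
sub classify {
    my @snapshots = @_;
    my $state = 0;
    foreach my $snap (@snapshots) {
        next if $snap->{vm} =~ $exclude;          # skip excluded VMs
        return 2 if $snap->{age} >= $limits{crit_hours};   # worst state, stop early
        $state = 1 if $snap->{age} >= $limits{warn_hours};
    }
    return $state;
}

print classify({ vm => 'web01', age => 3 }), "\n";                              # prints 1
print classify({ vm => 'web01', age => 3 }, { vm => 'db02', age => 30 }), "\n"; # prints 2
print classify({ vm => 'db02_replica', age => 30 }), "\n";                      # prints 0
```

The returned value can be passed directly to `exit`, so the state logic stays separate from the output.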
