Determine TBW from SSDs with S.M.A.R.T Values in ESXi (smartctl)

Solid-state drives are becoming more and more common in ESXi hosts. They are used for caching (vFlash Read Cache, PernixData FVP), Virtual SAN, or plain datastores. A problem that comes with SSDs is the limited lifetime of each cell. Depending on the technology, each cell can be overwritten from about 1,000 times in consumer TLC SSDs up to 100,000 times in enterprise SLC-based SSDs.

The value to keep an eye on is the guaranteed TBW (Total Bytes Written or Terabytes Written), which is typically provided by the vendor in the drive's specifications. It describes how many terabytes can be written to the entire device before the warranty expires. The current value can be read out with S.M.A.R.T. from the Total_LBAs_Written field.

Unfortunately, VMware makes it hard to read out raw S.M.A.R.T. values on ESXi hosts. For that reason, I've ported a version of smartctl, which is part of smartmontools, to ESXi. I've made the package available as a VIB. The download link is at the bottom of this post.

First, let's look at what you can see on an ESXi host regarding endurance without smartctl. In this example, I'm using a Samsung SSD 850 EVO M.2 250GB, which is currently in use as a local datastore. The warranty for this device covers 75 TBW. Note that this is a consumer-grade SSD; the lowest endurance class for Virtual SAN, for example, starts at 365 TBW.

ESXCLI can display S.M.A.R.T stats with
esxcli storage core device smart get -d [device]

# esxcli storage core device smart get -d t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
Parameter                     Value  Threshold  Worst
----------------------------  -----  ---------  -----
Health Status                 OK     N/A        N/A
Media Wearout Indicator       N/A    N/A        N/A
Write Error Count             N/A    N/A        N/A
Read Error Count              N/A    N/A        N/A
Power-on Hours                99     0          99
Power Cycle Count             99     0          99
Reallocated Sector Count      100    10         100
Raw Read Error Rate           N/A    N/A        N/A
Drive Temperature             N/A    N/A        N/A
Driver Rated Max Temperature  49     0          34
Write Sectors TOT Count       100    0          100
Read Sectors TOT Count        N/A    N/A        N/A
Initial Bad Block Count       N/A    N/A        N/A

What do these values mean? Actually, only that the drive is "healthy". They do not provide the information we are looking for. ESXi also keeps track of the health status with smartd and writes it to /var/log/syslog.log, as in the following example:

2016-05-18T14:54:23Z smartd: [warn] t10.ATA_____ST9500325AS_________________________________________S2WB2XXB: above TEMPERATURE threshold (40 > 0)

ESXCLI can also display device stats, which are very close to what we are looking for:

# esxcli storage core device stats get -d t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Device: t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
   Successful Commands: 93483233
   Blocks Read: 205579211
   Blocks Written: 2123298938
   Read Operations: 3240880
   Write Operations: 90144369
   Reserve Operations: 39107
   Reservation Conflicts: 0
   Failed Commands: 22
   Failed Blocks Read: 0
   Failed Blocks Written: 0
   Failed Read Operations: 0
   Failed Write Operations: 0
   Failed Reserve Operations: 0

ESXi keeps track of all read and write operations to the disk. These counters are reset when ESXi is rebooted, so they do not help to determine wear either.

This is where smartctl comes into play:

# smartctl -d sat --all /dev/disks/t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org


=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 850 EVO M.2 250GB
Serial Number:    S24BNXAG805065D
LU WWN Device Id: 5 002538 d404b9f9f
Firmware Version: EMT21B6Q
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed May 16 15:25:26 2016 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
[...]
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       5039
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       35
177 Wear_Leveling_Count     0x0013   094   094   000    Pre-fail  Always       -       122
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   049   034   000    Old_age   Always       -       51
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       26
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       6343034492

In the SMART Attributes section, ID #241 holds our Total_LBAs_Written value. This value needs to be multiplied by the sector size, which is 512 bytes, and divided by 1099511627776 (1024^4) to get terabytes.

Total_LBAs_Written * Sector Size / 1024^4 = TBW

6343034492 * 512 / 1099511627776 = 2.95 TBW
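
The calculation above can be sketched as a small shell function. This is only an illustration: the name lbas_to_tb is made up for this example and is not part of smartmontools, and it assumes the drive counts Total_LBAs_Written in sectors of the given size (512 bytes here), which is not true for every vendor.

```shell
# Convert a Total_LBAs_Written raw value to terabytes (1024^4 bytes).
# Usage: lbas_to_tb <raw_value> [sector_size_in_bytes]
# lbas_to_tb is a hypothetical helper name used only in this sketch.
lbas_to_tb() {
    # awk calculates in floating point, so the result is not truncated
    awk -v lbas="$1" -v sector="${2:-512}" \
        'BEGIN { printf "%.2f\n", lbas * sector / (1024 ^ 4) }'
}

# Raw value of attribute 241 from the output above
lbas_to_tb 6343034492 512
```

For the raw value shown above, this prints 2.95. awk is used because plain shell arithmetic only handles integers and would round the result down to 2.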

I've used 3 TBW of my guaranteed 75 TBW. According to Power_On_Hours, which can be found in SMART ID #9, the device has been in use for about 210 days (24/7 online, of course). At this write rate, I guess I have another 13 years or so to go...
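
The back-of-the-envelope estimate above can be written down as a short sketch. It assumes the write rate stays constant over the drive's life, which is a simplification, and the function name years_left is hypothetical.

```shell
# Estimate the years left until the rated TBW is reached, assuming the
# drive keeps being written at the same average rate as so far.
# Usage: years_left <tb_written> <tb_rated> <power_on_hours>
# years_left is a hypothetical helper name used only in this sketch.
years_left() {
    awk -v written="$1" -v rated="$2" -v hours="$3" 'BEGIN {
        if (written <= 0 || hours <= 0) { print "n/a"; exit }
        rate = written / hours                  # TB written per power-on hour
        remaining_hours = (rated - written) / rate
        printf "%.1f\n", remaining_hours / (24 * 365)
    }'
}

# 2.95 TB written, 75 TBW rated, 5039 power-on hours
years_left 2.95 75 5039
```

With the unrounded figures from the smartctl output, this gives about 14 years, the same ballpark as the rough estimate in the text.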

This also confirms that the value reported by "esxcli storage core device stats get" is not a lifetime figure; it is only counted since the last reboot. Blocks Written according to this command is 2123298938, which at 512 bytes per block works out to about 1 TB.

How to get smartctl
!!! Please note that the use of this VIB is absolutely unsupported. Use at your own risk !!!
I've tested the package with ESXi 6.0 only

  1. Download smartctl-6.6-4321.x86_64.vib
  2. Copy the VIB to the /tmp/ directory of an ESXi host
  3. SSH to the ESXi host
  4. Set the VIB acceptance level to CommunitySupported
    # esxcli software acceptance set --level=CommunitySupported
  5. Install the package (Maintenance Mode or Reboot is not required)
    # esxcli software vib install -v /tmp/smartctl-6.6-4321.x86_64.vib

The tool is located at /opt/smartmontools/smartctl and works just like the Linux version.
Locate physical disks with ls -l /dev/disks/

/opt/smartmontools/smartctl -d [Device Type] --all /dev/disks/[DISK]

# smartctl -d sat --all /dev/disks/t10.ATA_____Samsung_SSD_850_EVO_M.2_250GB___________S24BNXAG805065D_____
smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 850 EVO M.2 250GB
Serial Number:    S24BNXAG805065D
LU WWN Device Id: 5 002538 d404b9f9f
Firmware Version: EMT21B6Q
User Capacity:    250,059,350,016 bytes [250 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART Status not supported: Incomplete response, ATA output registers missing
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x53) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 133) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       5040
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       35
177 Wear_Leveling_Count     0x0013   094   094   000    Pre-fail  Always       -       122
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   049   034   000    Old_age   Always       -       51
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   100   100   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       26
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       6345601655

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Warning! SMART Selective Self-Test Log Structure error: invalid SMART checksum.
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
  255        0    65535  Read_scanning was never started
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
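
To pick just the Total_LBAs_Written raw value out of this output, a small filter like the following could be used. It is a sketch that assumes the attribute table layout shown above (numeric ID in the first column, raw value in the tenth); the function name total_lbas_written is made up for this example.

```shell
# Print the raw value of SMART attribute 241 (Total_LBAs_Written)
# from "smartctl --all" output read on stdin.
# total_lbas_written is a hypothetical helper name used only in this sketch.
total_lbas_written() {
    # Attribute rows start with the numeric ID; the raw value is column 10.
    awk '$1 == "241" { print $10; exit }'
}

# Example with one attribute row taken from the output above
printf '%s\n' '241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       6345601655' | total_lbas_written
```

On an ESXi host, the output of /opt/smartmontools/smartctl -d sat --all /dev/disks/[DISK] would be piped into the function instead of the sample row.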

44 thoughts on “Determine TBW from SSDs with S.M.A.R.T Values in ESXi (smartctl)”

  1. Looks like it doesn't work properly for the SM951 NVMe device:

    [root@esxi:/tmp] /opt/smartmontools/smartctl -d nvme --all /dev/disks/t10.NVMe____SAMSUNG_MZVPV512HDGL2D00000______________xxxxxxxxxxxxxx______00000001
    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Function not implemented

    I tried a scan but that failed:

    [root@esxi:/tmp] /opt/smartmontools/smartctl --scan
    Segmentation fault

    1. smartctl 6.6 2016-05-10 r4321 [x86_64-linux-5.5.0] (daily-20160510)
      Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

      Smartctl open device: /dev/disks/naa.600605b00516c6971f219afb0f3cd956 [megaraid_disk_01] [SAT] failed: can't get bus number

  2. Just out of curiosity what did you use for a compile environment for the static smartctl?
    I'm currently using an aged CentOS 3.9 and was wondering if something newer was valid.
    I've experimented with a couple other options but always seem to go back to that one for one reason or another.
    (The latest addition to my custom local vib is a static version of whiptail for an experimental frontend to ghettoVCB)

  3. Thank you for smartctl. I installed it on all my ESXi hosts
    (and found 3 failed disks!)

    I would like to check disks behind a MegaRAID controller:

    /opt/lsi/storcli/storcli -CfgDsply -a0 | grep "Device Id\|DISK"
    Number of DISK GROUPS: 1
    DISK GROUP: 0
    Device Id: 5
    Device Id: 4

    /opt/smartmontools/smartctl -d sat+megaraid,5 -a /dev/disks/naa.600605b006eb32f01a806e721f93a9a4
    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Smartctl open device: /dev/disks/naa.600605b006eb32f01a806e721f93a9a4 [megaraid_disk_05] [SAT] failed: can't get bus number

    http://guides.ovh.com/LsiMegaraid (replace MegaCli with storcli)

    All the best

      1. On my regular non-ESXi hosts, I use smartctl to check drive health behind RAID controllers like so:

        smartctl -a -d sat+megaraid,$deviceid /dev/sd[a-d]

  4. Pingback: SSD Total Bytes Written Calculator | Virten.net

  5. esxi 6.5

    Installation Result
    Message: The update completed successfully, but the system needs to be rebooted for the changes to be effective.
    Reboot Required: true
    VIBs Installed: smartmontools_bootbank_smartctl_6.6-4321
    VIBs Removed:
    VIBs Skipped:

    can't find smartctl, there is no /opt/smartmontools

  6. smartctl is not working properly for NVMe drives. Can you please let us know whether there is an alternative esxcli command to get SMART details for NVMe drives?

    [root@esxi:/tmp] /opt/smartmontools/smartctl -d nvme --all /dev/disks/t10.NVMe___MZVPV512HDGL2D00000______________xxxxxxxxxxxxxx______00000001
    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.0.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Function not implemented

  7. Hi,
    First of all, congrats on the great work done on this, as well as your entire blog. I found it useful numerous times.

    On the topic - it's worth mentioning that most vendors nowadays report the Total LBAs Written in 32MB Blocks. It's still the same Attribute with ID# 241. However, the calculation will be:
    * 32 / 1024^2.
    I did some additional testing and got this proactively reported to Zabbix via simple triggers and zabbix trapper in the latest 3.4 release. If interested, drop me a line.

    Cheers.
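
The conversion described in this comment can be sketched as follows. It only applies if the drive really reports attribute 241 in 32 MiB units, which should be checked against the vendor's documentation; the function name units32_to_tb and the sample raw value are made up for illustration.

```shell
# Convert a raw value counted in 32 MiB units to terabytes (1024^4 bytes).
# Usage: units32_to_tb <raw_value>
# units32_to_tb is a hypothetical helper name used only in this sketch.
units32_to_tb() {
    # raw * 32 MiB / 1024^2 MiB per TiB
    awk -v units="$1" 'BEGIN { printf "%.2f\n", units * 32 / (1024 ^ 2) }'
}

# Made-up raw value, for illustration only
units32_to_tb 1000000
```

For the made-up raw value of 1000000, this prints 30.52.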

  8. It does not seem to work properly with ESXi 6.7; it reports a bunch of unknown attributes...

    /opt/smartmontools/smartctl -d sat --all /dev/disks/t10.ATA_____INTEL_SSDSC2BW180A3L__________________00_CVCV224003TC180EGN__
    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.7.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    === START OF INFORMATION SECTION ===
    Device Model: INTEL SSDSC2BW180A3L
    Serial Number: CVCV224003TC180EGN
    LU WWN Device Id: 5 001517 bb296e50b
    Firmware Version: LE1i
    User Capacity: 180,045,766,656 bytes [180 GB]
    Sector Size: 512 bytes logical/physical
    Rotation Rate: Solid State Device
    Device is: Not in smartctl database [for details use: -P showall]
    ATA Version is: ACS-2 (minor revision not indicated)
    SATA Version is: SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
    Local Time is: Thu Jul 26 08:43:24 2018 UTC
    SMART support is: Available - device has SMART capability.
    SMART support is: Enabled

    === START OF READ SMART DATA SECTION ===
    SMART Status not supported: Incomplete response, ATA output registers missing
    SMART overall-health self-assessment test result: PASSED
    Warning: This result is based on an Attribute check.

    General SMART Values:
    Offline data collection status: (0x00) Offline data collection activity
    was never started.
    Auto Offline Data Collection: Disabled.
    Self-test execution status: ( 0) The previous self-test routine completed
    without error or no self-test has ever
    been run.
    Total time to complete Offline
    data collection: ( 0) seconds.
    Offline data collection
    capabilities: (0x7f) SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Abort Offline collection upon new
    command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
    SMART capabilities: (0x0003) Saves SMART data before entering
    power-saving mode.
    Supports SMART auto save timer.
    Error logging capability: (0x01) Error logging supported.
    General Purpose Logging supported.
    Short self-test routine
    recommended polling time: ( 1) minutes.
    Extended self-test routine
    recommended polling time: ( 48) minutes.
    Conveyance self-test routine
    recommended polling time: ( 2) minutes.
    SCT capabilities: (0x0021) SCT Status supported.
    SCT Data Table supported.

    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      5 Reallocated_Sector_Ct   0x0032   100   100   000    Old_age   Always       -       0
      9 Power_On_Hours          0x0032   067   067   000    Old_age   Always       -       29774 (38 130 0)
     12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       698
    170 Unknown_Attribute       0x0033   100   100   010    Pre-fail  Always       -       0
    171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
    172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
    174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       673
    183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       2
    184 End-to-End_Error        0x0033   100   100   097    Pre-fail  Always       -       0
    187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
    192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       673
    199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       -       0
    225 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       937268
    226 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       65535
    227 Unknown_SSD_Attribute   0x0032   100   100   000    Old_age   Always       -       67
    228 Power-off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65535
    232 Available_Reservd_Space 0x0033   100   100   010    Pre-fail  Always       -       0
    233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       -       0
    241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       -       937268
    242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       -       1984387
    249 Unknown_Attribute       0x0013   100   100   000    Pre-fail  Always       -       22056

    SMART Error Log not supported

    SMART Self-test Log not supported

    SMART Selective self-test log data structure revision number 0
    Note: revision number not 1 implies that no selective self-test has ever been run
    SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
    1 0 0 Not_testing
    2 0 0 Not_testing
    3 0 0 Not_testing
    4 0 0 Not_testing
    5 0 0 Not_testing
    Selective self-test flags (0x0):
    After scanning selected spans, do NOT read-scan remainder of disk.
    If Selective self-test is pending on power-up, resume after 0 minute delay.

  9. My esxi is set to use efi/secure boot. Am I correct that if acceptance level is set to community, esxi host will not boot with secure mode enabled?

  10. Hi, works great!

    How can we "bribe" you to build a new version of the tools with NVMe support? The built-in commands, even in 6.7, still don't cut it.

  11. Looks like it does not work with Samsung SSD NVMe.
    [:/opt/smartmontools] ./smartctl -d nvme -H /dev/disks/t10.NVMe____SAMSUNG_MZPLL3T2HAJQ2D00005______________S4CCNA0M800098______00000001

    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.7.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Function not implemented
    [:/opt/smartmontools] ./smartctl -d nvme --all /dev/disks/t10.NVMe____SAMSUNG_MZPLL3T2HAJQ2D00005______________S4CCNA0M800098______00000001

    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.7.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Function not implemented
    [:/opt/smartmontools] ./smartctl -d nvme -x /dev/disks/t10.NVMe____SAMSUNG_MZPLL3T2HAJQ2D00005______________S4CCNA0M800098______00000001

    smartctl 6.6 2016-05-10 r4321 [x86_64-linux-6.7.0] (daily-20160510)
    Copyright (C) 2002-16, Bruce Allen, Christian Franke, http://www.smartmontools.org

    Read NVMe Identify Controller failed: NVME_IOCTL_ADMIN_CMD: Function not implemented

  12. I've found another way to get NVMe S.M.A.R.T.:
    # get adapter name of storage device:
    esxcfg-scsidevs --hba-device-list

    # getting S.M.A.R.T.:
    esxcli nvme device log smart get -A vmhba1

    # getting raw data:
    # Data Units Written: 0x35074e

    # Converting raw data from hex to dec and calculating:
    # "Data Units Written" * 512 / 1048576 = X.XX GBW

    1. Alternatively, instead of doing * 512 / 1048576, you can simply do / 2048 to reduce the amount of math. This comment really saved me when I needed NVMe SMART data on ESXi, though.

    2. Thank you! Here is my result in ESXi 6.7 with an ADATA NVMe drive:

      [root@esxi:~] esxcli nvme device log smart get -A vmhba2
      SMART And Health Info:
      Available Spare Space Below Threshold: false
      Temperature Warning: false
      NVM Subsystem Reliability Degradation: false
      Read Only Mode: false
      Volatile Memory Backup Device Failure: false
      Composite Temperature: 301 K
      Available Spare: 100 %
      Available Spare Threshold: 10 %
      Percentage Used: 18 %
      Data Units Read: 0x5cc257c
      Data Units Written: 0x4dd3bbb
      Host Read Commands: 0x4a312a2b
      Host Write Commands: 0x6d749ad1
      Controller Busy Time: 0x18036
      Power Cycles: 0xaf
      Power On Hours: 0x5e4f
      Unsafe Shutdowns: 0x4f
      Media Errors: 0x60
      Number of Error Info Log Entries: 0x0
      Warning Composite Temperature Time: 0 Mins
      Critical Composite Temperature Time: 0 Mins

      So, Data Units Written: 0x4dd3bbb in Hex is 81607611 in Dec
      Total written is 81607611*512/1048576/1024 = 38.91 TB
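
The hex conversion and the arithmetic can also be left to the shell. The sketch below follows the NVMe specification, where one "Data Unit" stands for 1000 blocks of 512 bytes, so its result comes out slightly lower than the figure above, which effectively treats each unit as 512 KiB. The function name nvme_units_to_tb is hypothetical.

```shell
# Convert the NVMe "Data Units Written" value (hex or decimal) to
# terabytes (1024^4 bytes). One data unit is 1000 * 512 bytes.
# Usage: nvme_units_to_tb <data_units>
# nvme_units_to_tb is a hypothetical helper name used only in this sketch.
nvme_units_to_tb() {
    units=$(( $1 ))    # shell arithmetic accepts 0x-prefixed hex values
    awk -v u="$units" 'BEGIN { printf "%.2f\n", u * 512 * 1000 / (1024 ^ 4) }'
}

# Value reported by "esxcli nvme device log smart get" above
nvme_units_to_tb 0x4dd3bbb
```

For 0x4dd3bbb (81607611 decimal), this prints 38.00.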

  13. Hi, nice work. I'd like an update to the latest version to support json output.
    I have compiled the latest version, but it doesn't read any values.

    Can you share how you compiled it? Currently I've only done ./configure LDFLAGS="-static"

    1. FYI, I am getting "Function not implemented"

      [root@esx01:/tmp] ./smartctl -d ata -x /dev/disks/t10.ATA__DISK
      smartctl 7.1 2019-12-30 r5022 [x86_64-linux-6.5.0] (local build)
      Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

      Smartctl open device: /dev/disks/t10.ATA__DISK failed: Function not implemented

  14. Hi @fgrehl.

    Smartmontools 7.1 was released 2019-12-30; any chance you could update the vib?

    Or/also, could you please give the steps how you ported to/compiled the Smartmontools source for ESXi as a vib?

    Many Thanks,
    Pin

    1. I would be very interested in that too. Since ESXi 6.7, my scheduled self-tests won't work anymore.
      Would be nice to know, how to port it from source files ...

  15. Hi

    Hope you can help. How do I get smartmontools working with an aacraid card, please?

    aacraid,H,L,ID, but not sure what command to run.

    Any help appreciated
    Many Thanks AP

  16. Hi!
    Small one-liner sh script to print TBW for all (SATA) disks; smartctl must be installed:

    disk=0; for i in `ls /dev/disks/ -1A | grep -v : | grep t10`; do let "disk++" ; echo -e "\e[4mDisk n°$disk\e[0m"; echo $i | sed -r "s/_{2,}/ /g" | awk {'printf "Model => " $2 "\nSerial => " $3 "\nTBW => "'};
    var=`/opt/smartmontools/smartctl -d sat --all /dev/disks/$i|grep Total_LBAs_Written | awk {'print $10'}`; [[ -z $var ]] && echo -e "\x1B[31mNaN\e[0m" || echo -e "\x1B[31m$(( $var * 512 / 1099511627776 )) TB\e[0m\n"; done

    Script output :
    Disk n°1
    Model => CT240BX500SSD1
    Serial => 2002E3E142FA
    TBW => NaN
    Disk n°2
    Model => Samsung_SSD_870_QVO_1TB
    Serial => S5SVNF0NC00481A
    TBW => 0 TB

    Disk n°3
    Model => Samsung_SSD_870_QVO_1TB
    Serial => S5SVNF0NC00494A
    TBW => 0 TB

    Disk n°4
    Model => Samsung_SSD_870_QVO_1TB
    Serial => S5SVNF0NC00511D
    TBW => 0 TB

    Disk n°5
    Model => Samsung_SSD_870_QVO_1TB
    Serial => S5SVNF0NC00514B
    TBW => 16 TB

  17. Thank you for this post. I have used the details in the past with great success. Unfortunately the link to smartctl-6.6-4321.x86_64.vib now appears to be broken and I do not have a copy. Would it be possible to fix the link please, when it is convenient?

    Thank you, J

  18. I used this solution for a few years with ESXi 7 on my home server. I read the values in PRTG Network Monitor, so I could track the health of my SSD. It actually worked: the SSD broke last week, and I got informed that the SMART values were decreasing.

    I bought a new server, which is running ESXi 8. But the VIB from this blog isn't working anymore:
    [ProfileValidationError]
    In ImageProfile (Updated) ESXi-8.0U1-21495797-standard, the payload(s) in VIB smartmontools_bootbank_smartctl_6.6-4321 does not have sha-256 gunzip checksum. This will prevent VIB security verification and secure boot from functioning properly. Please remove this VIB or please check with your vendor for a replacement of this VIB
    Please refer to the log file for more details.

    Any way to make this work in 2023?
