I was wondering if it is possible to speed up my Intel NUC based ESXi host with Virtual SAN. The idea is that, compared to vSphere Flash Read Cache, Virtual SAN can use the SSD not only as a read cache but also as a write buffer. This post explains how you can create a Virtual SAN datastore on a single ESXi host from the command line, without a vCenter Server.
It goes without saying that this is neither the idea behind Virtual SAN nor officially supported by VMware. It also violates VMware's EULA if you run Virtual SAN without a VSAN license. To assign a license you need a vCenter Server and have to wrap the single ESXi host into a cluster.
My configuration for this test:
- 5th Gen Intel NUC NUC5i5MYHE
- 32GB Memory
- SSD: 250GB Samsung 850 Evo M.2
- HDD: 2.5" 1TB 5400rpm
- vSphere ESXi 6.0
The guide starts with a fresh ESXi installation in evaluation mode; I will not cover licensing. According to this post, it also works with the free ESXi Hypervisor license, but to be compliant with VMware's EULA you have to add the host to a vCenter Server and assign a Virtual SAN license.
The first step is to enable VSAN traffic on the VMkernel interface:
~ # esxcli vsan network ipv4 add -i vmk0
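You can double-check the configuration with esxcli vsan network list, which should now show vmk0 as a VSAN traffic interface:
~ # esxcli vsan network list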
The next step is to identify the devices to be used for VSAN. Both the SSD and the HDD must not contain any data or partitions. The easiest way to identify devices is with the vdq -q command.
In this example, the HDD already contains a VMFS partition:
~ # vdq -q
[
   {
      "Name"     : "t10.ATA_Samsung_SSD_850_EVO_M.2_250GB",
      "VSANUUID" : "",
      "State"    : "Eligible for use by VSAN",
      "ChecksumSupport": "0",
      "Reason"   : "None",
      "IsSSD"    : "1",
      "IsCapacityFlash": "0",
      "IsPDL"    : "0",
   },
   {
      "Name"     : "t10.ATA_ST9100AS",
      "VSANUUID" : "",
      "State"    : "Ineligible for use by VSAN",
      "ChecksumSupport": "0",
      "Reason"   : "Has partitions",
      "IsSSD"    : "0",
      "IsCapacityFlash": "0",
      "IsPDL"    : "0",
   },
]
Use partedUtil to identify and delete all partitions on ineligible drives. The partedUtil get command displays one partition (the second line of the output), which can be deleted with the partedUtil delete command.
~ # partedUtil get /dev/disks/t10.ATA_ST9100AS
60801 255 63 976773168
1 2048 976768064 0 0

~ # partedUtil delete /dev/disks/t10.ATA_ST9100AS 1
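After deleting the partition, run vdq -q again; the HDD should now be reported as "Eligible for use by VSAN":
~ # vdq -q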
The next step is to add physical disks for VSAN usage. VSAN requires one SSD and one or more HDDs. Use the esxcli vsan storage add command with the options -d HDD1 -d HDD2 -s SSD:
~ # esxcli vsan storage add -d t10.ATA_ST9100AS -s t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
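To verify the disk group, vdq -i prints the VSAN disk mappings, i.e. which SSD acts as cache for which capacity disks:
~ # vdq -i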
To enable VSAN on the host, simply create a new cluster:
~ # esxcli vsan cluster new
Now you should see the vsanDatastore, but creating virtual machines fails with the following error message:
Create virtual machine
Cannot complete file creation operation.
This is caused by the default storage policies, which require a redundancy that a single host cannot provide. To fix this, change the default policies to hostFailuresToTolerate = 0 and forceProvisioning = 1.
~ # esxcli vsan policy setdefault -c cluster -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"
~ # esxcli vsan policy setdefault -c vdisk -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"
~ # esxcli vsan policy setdefault -c vmnamespace -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"
~ # esxcli vsan policy setdefault -c vmswap -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"
~ # esxcli vsan policy setdefault -c vmem -p "((\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"
You can verify the default policy with the following command:
~ # esxcli vsan policy getdefault
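After the change, the output should look similar to this (one line per policy class; the exact formatting may vary between builds):
Policy Class  Policy Value
cluster       (("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vdisk         (("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmnamespace   (("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmswap        (("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmem          (("hostFailuresToTolerate" i0) ("forceProvisioning" i1))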
That's it. You can now use the Virtual SAN datastore on your single ESXi host.
~ # esxcli vsan cluster get
Cluster Information
   Enabled: true
   Current Local Time: 2015-11-09T10:26:45Z
   Local Node UUID: 563fc008-ab73-2e34-0075-005056bdaaf9
   Local Node State: MASTER
   Local Node Health State: HEALTHY
   Sub-Cluster Master UUID: 563fc008-ab73-2e34-0075-005056bdaaf9
   Sub-Cluster Backup UUID:
   Sub-Cluster UUID: 520afa76-164d-bab1-14ac-3cddab7a8570
   Sub-Cluster Membership Entry Revision: 0
   Sub-Cluster Member UUIDs: 563fc008-ab73-2e34-0075-005056bdaaf9
   Sub-Cluster Membership UUID: db744056-5f49-77a1-7f1d-005056bdaaf9
~ # esxcli vsan storage list
t10.ATA_ST9100AS
   Device: t10.ATA_ST9100AS
   Display Name: t10.ATA_ST9100AS
   Is SSD: false
   VSAN UUID: 5272476c-b445-4623-73f5-e2cd2a6265fd
   VSAN Disk Group UUID: 52b7be05-b875-a93e-24bc-62c2210948d7
   VSAN Disk Group Name: t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
   Used by this host: true
   In CMMDS: false
   Checksum: 10071299334450500695
   Checksum OK: true
   Emulated DIX/DIF Enabled: false

t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
   Device: t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
   Display Name: t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
   Is SSD: true
   VSAN UUID: 52b7be05-b875-a93e-24bc-62c2210948d7
   VSAN Disk Group UUID: 52b7be05-b875-a93e-24bc-62c2210948d7
   VSAN Disk Group Name: t10.ATA_Samsung_SSD_850_EVO_M.2_250GB
   Used by this host: true
   In CMMDS: false
   Checksum: 3233630865690359925
   Checksum OK: true
   Emulated DIX/DIF Enabled: false
I've done some basic performance tests with a datastore directly on the HDD, and with the VSAN datastore. Here are the results:
The test was the Exchange 2007 workload from VMware I/O Analyzer (8k, 55% read, 80% random).
VMFS Datastore on local HDD
IOPS: 226
Read IOPS: 125
Write IOPS: 101
MBPS: 1.77
Read MBPS: 0.98
Write MBPS: 0.79
Single-Node VSAN Datastore
IOPS: 5294
Read IOPS: 2911
Write IOPS: 2382
MBPS: 41.36
Read MBPS: 22.75
Write MBPS: 18.61
You may want to put 'non-nested' in the post title, as I was expecting you'd just be nesting some ESX hosts and presenting the storage upwards. Nice post!
Not sure if you've seen my bootstrapping VSAN article which starts from a single VSAN Node?
http://www.virtuallyghetto.com/2013/09/how-to-bootstrap-vcenter-server-onto_9.html
If so, it would be great if you could give credit and source the original article :)
There is a link to your article where you are doing something similar with the free ESXi. I saw it while searching for an answer to the license question. I didn't know that bootstrapping article because we never had this problem, as we are not running the vCenter in the VSAN itself ;-)
Curious to see the results from VMware I/O Analyzer for a VMFS Datastore on local SSD M.2 250GB Evo 850, just so we can compare the overhead of VSAN.
IOPS: 30004
Read IOPS: 16497
Write IOPS: 13507
MBPS: 234
Read MBPS: 128
Write MBPS: 105
Nice post... I too was expecting that you'd be "nesting" the complete VSAN solution, for learning purposes. And actually, this makes me wonder (and perhaps gives you an idea for a follow-up post) whether All-Flash VSAN would work the same way... :-) Let's say you replace the 5400 rpm drive with an SSD...
My nested VSAN post is about 2 years old ;-) http://www.virten.net/2013/12/vsan-lab-with-vmware-workstation-10/
Doing the same with an SSD as MD produces this error:
Unable to add device: Disk: [DISK] is an SSD and can not be used as a VSAN HDD
Seems not to work with All-Flash VSAN ;-)
Hi Florian Grehl,
Thank you for the informative post. What is the performance like? I have heard that VSAN on AHCI / motherboard SATA ports doesn't perform well and gets slow.
I am interested in setting up a three-node VSAN utilizing AHCI / motherboard SATA ports.
Kind regards
RIhatum
There are some quick performance values in the article. I think it's hard to give a definitive statement about the performance. It's an unsupported configuration, of course, and you can't compare the performance against a supported enterprise-grade server.
The performance with VSAN (1 SSD + 1 HDD) is faster than the HDD alone. I think the performance is "OK".
There are some critical arguments against VSAN with AHCI. There was a bug in 5.5 that has been resolved in 5.5 U2.
You need to tag the disks as capacityFlash:
esxcli vsan storage tag add -t capacityFlash -d "DISK_HERE"
Don't tag the cache disk as capacityFlash.
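A minimal sketch of the all-flash variant, assuming a second SSD with the hypothetical device name t10.ATA_CAPACITY_SSD serves as capacity tier behind the existing cache SSD:
esxcli vsan storage tag add -t capacityFlash -d t10.ATA_CAPACITY_SSD
esxcli vsan storage add -d t10.ATA_CAPACITY_SSD -s t10.ATA_Samsung_SSD_850_EVO_M.2_250GB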
Very nice post, thanks for it!
Can you expect "some" redundancy with such a single-node setup? What if an HDD fails, for instance?
You can't build something similar to a RAID with a single-node VSAN. You can stripe across drives for more performance, but you can't have redundancy. If a disk fails, the disk group is down.
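If the disk group contains multiple capacity disks, you can experiment with striping by adding a stripeWidth attribute to the default policies. A sketch based on the policy syntax from the article, striping virtual disk objects across two capacity disks:
~ # esxcli vsan policy setdefault -c vdisk -p "((\"stripeWidth\" i2) (\"hostFailuresToTolerate\" i0) (\"forceProvisioning\" i1))"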
Is it just me? I can't get this to work with 6.0 U2. U1b works just fine.
Have not tried it with 6.0 U2. Do you have an error message? Which step does not work?
Everything goes fine until I actually try to access the datastore. As if I did not change the redundancy policy.
Just verified. Does not work for me either.
Call "FileManager.MakeDirectory" for object "ha-nfc-file-manager" on ESXi failed.
Operation failed, diagnostics report: Failed to create directory test (Cannot Create File)
Upgraded from U1b --> U2 and still working ok.
Same for me: clean install of 6.0 U2, and after completing all tasks I can't do a simple "mkdir temp" in the VSAN datastore.
Please, if someone can help to get this working on 6.0 U2!
I have created an updated guide for creating a single-node VSAN, and I am using vSphere 6.0 U2... works flawlessly. Hope this helps others. Kudos to virten! https://ithinkvirtual.com/2016/05/02/creating-a-single-node-vsan/
mkdir doesn't work against VSAN datastores. The VSAN datastore, as an object store, is not a filesystem. A lot of the filesystem-related things don't work because there isn't really any such thing as a "directory" or a "file" with VSAN.
All "directories" on VSAN are actually special-purpose objects that are subsequently formatted with a VMFS variant to provide a real filesystem for small files (like VMX config files, VMDK descriptors, etc.).
If you want to create a directory in the VSAN datastore, you actually have to create a VSAN namespace object, and it is a different command:
# /usr/lib/vmware/osfs/bin/osfs-mkdir /vmfs/volumes/vsan\:*/
Once that is created, you can then run mkdir for subdirectories, or touch files, or whatever else because the namespace object is formatted with a filesystem.
In the single-node case, the default policy for namespace objects will need to be set to hostFailuresToTolerate=0 or forceProvisioning=1 (as outlined in the article above).
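For example, assuming the datastore is mounted under the default name vsanDatastore and you want a directory called test:
# /usr/lib/vmware/osfs/bin/osfs-mkdir /vmfs/volumes/vsanDatastore/test
# mkdir /vmfs/volumes/vsanDatastore/test/subdir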
I hope this helps, as late to the party as it is!
@fgrehl - the creation of the vsandatastore works for me following your method, and I can browse the datastore, but I cannot create any machines or svmotion anything onto it. I am also running v6.0 Update 2
**update** I managed to resolve my issue and created an updated guide. See link in post/reply above.
Thank you for sharing. I'm trying to set up a single-node VSAN cluster for my home lab. Should I go with 1x240GB, 2x240GB, or 1x480GB disks for the SSD layer? I will also have 3x1TB HDDs.
It works! Thx!
I have tried this on 6.7, and it seems I just get errors trying to migrate or create a new VM on the vSAN system:
This storage policy requires at least 3 fault domains contributing storage but only 1 were found.
My policies are set though. Did VMware change something in vSAN 6.7?
cluster (("stripeWidth" i3) ("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vdisk (("stripeWidth" i3) ("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmnamespace (("stripeWidth" i3) ("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmswap (("stripeWidth" i1) ("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
vmem (("stripeWidth" i3) ("hostFailuresToTolerate" i0) ("forceProvisioning" i1))
I met the same problem in 7.0.
So does anyone know the solution? Thx a lot.
I just checked and edited the vSAN default storage policy, and it's OK now.
Please follow this document:
https://ithinkvirtual.com/2016/05/02/creating-a-single-node-vsan/
Clone the vSAN default storage policy & set the following items:
Number of failures to tolerate = 0 (Default is 1)
Force provisioning = Yes (Default is No)
Set the vSANdatastore default storage policy to the new cloned policy.
Thanks for this tutorial. This worked fine for me on a Dell R730XD with 24 disks, using ESXi 6.5.0 Update 3 (Build 13932383). I wanted to do a POC before purchasing a vSAN license for a 3-node cluster. Each R730 will have 20 1TB HDDs and 4 400GB SSDs. These were originally SimpliVity cubes that are now EOL, but the hardware is fine for a test cluster using vSAN. All the commands worked as you stated. I just set the PERC to HBA mode and ESXi can see all the drives.
Thanks!!!!
Is this working on ESXi 8.0 with OSA and/or ESA?
How can I increase deduplication and compression performance?
Is it possible to allocate more memory and cores to improve performance?