Hypervisor Install and Configuration

From here on out, I intend to split my posts up by the VM each issue relates to. However, a few issues aren’t related to the VMs so much as to the hypervisor itself. I will address those here.

License:

With the decision of which hypervisor to go with made (or at least researched; before purchase, is a decision really made?), I did a little more research on VMware licensing. I am very partial to the free version of VMware ESXi. In the past it was reasonably featured and handled my basic use cases. However, it was worth looking into what VMware’s current offerings look like.

I am going to say this: VMware’s licensing scheme is downright byzantine. I have spent hours looking into it, and found old licensing information [1], half comparisons [2][3] (each only includes certain license types and never compares between them), links to old licensing documents [4] (current vSphere is 7.0; the document references 6), videos titled as though they cover licensing that do not actually cover it at all [5], blogs that seem to contradict what VMware’s website says (and I trust the bloggers more, which says something) [6][7][8] vs. [9][10], and more.

What exactly does VMware consider ESXi to be if it doesn’t include certain APIs? What does it include? Where can I get that list? This is a mess of epic proportions. I can’t get answers to basic questions. Honestly, I want to go short VMware right now. The fact that literally nobody on any of VMware’s sales or marketing teams thinks it is important to have a clear, well-documented comparison of what their offerings even are is a sign of mismanagement.

My best guess as to the actual offerings is that there is a free version of ESXi. By ESXi I mean just the bare hypervisor and nothing else: no API access, no backup, nothing. The ability to spawn a VM, and that is it. Even then it might be hardware resource limited, although an enterprise sales director friend of mine (not a VMware sales director) contradicted that.

I genuinely have no idea how anyone is expected to manage this.  I strongly suspect this is intentional.  They want people to contact a sales rep to have it explained to them.  However, having been in several meetings with sales and marketing professionally, if there isn’t a clear concise explanation available to me, the potential customer, it is at best 50-50 as to whether any person I talk to even knows what their own offerings are.  They are just really good at speaking while sounding certain. 

I think I need the Essentials license. I hope that it includes access to the storage APIs, as one of the blogs and forum posts suggest [11], because that will make backup a lot easier. I really don’t like managing a backup service on each VM. This is a bit of a change from the last time I looked into this. I purchased the Essentials license to get things working. I hope I do not end up concluding that this was a mistake.

Install:

With that out of the way, I went forward with the install. This is a minor point, but what I got from VMware is a downloadable ISO. I used Rufus [12] to create an install USB drive. I only mention this in case someone isn’t familiar with Rufus, or a comparable application that makes bootable USB drives.

Enable SVM Mode and SMEE

The next thing that needs to be done is BIOS configuration. In my case there were three things to do. First, I turned on CPU SVM mode. This enables hardware-assisted virtualization on AMD CPUs (AMD-V). It is the equivalent of VT-x on Intel chips. It speeds up operations involving system calls on VMs.

There is a brief overview here [13]. It uses an old part of the x86 architecture, the ring security structure, to maintain control between the hypervisor (or host) and the guest operating system. Whenever a system call needs to be made, a ring switch needs to occur, switching from the untrusted user to the trusted kernel, or in this case usually from the untrusted guest OS to the trusted hypervisor.

This involves a context switch, which includes building a new stack, switching the virtual memory tables, reloading the segments, flushing the translation lookaside buffer, flushing some of the caches to prevent leakage (usually the L1s need flushing, not the L2s or greater; leakage across exactly this kind of boundary is what a lot of the Meltdown and Spectre issues were about), etc. Intel and AMD have both devised ways to speed this process up, from letting the guest handle most operations (thus avoiding the context switch entirely) to actually decreasing the call time. The context switch was on the order of 300 cycles last I checked, but it may be even better now, as they have continued to work on it.
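
As a side note on the SVM setting above: from a Linux system (a live USB, for example; this is not something I ran as part of this build), whether the CPU advertises SVM or VT-x can be read from the flags line of /proc/cpuinfo. The sketch below is a minimal illustration; the function name is mine, and whether a BIOS-disabled extension still shows up in the flags depends on the kernel version, so treat a missing flag as a hint rather than proof.

```python
def virtualization_flag(cpuinfo_text):
    """Return 'svm' (AMD-V), 'vmx' (Intel VT-x), or None.

    cpuinfo_text is the text of /proc/cpuinfo on an x86 Linux system.
    The function name is hypothetical, chosen for this illustration.
    """
    for line in cpuinfo_text.splitlines():
        # The per-CPU feature list is on the line that starts with "flags".
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            if "svm" in flags:
                return "svm"
            if "vmx" in flags:
                return "vmx"
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(virtualization_flag(f.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo is not available on this system")
```

On the AMD board described here I would expect this to report svm once the BIOS option is on, but I have not verified that on this hardware.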

Above 4G Decoding and SR-IOV Support in Bios

Second, I turned on SR-IOV and Above 4G Decoding. SR-IOV (single-root I/O virtualization) is what enables a single PCIe device to be shared among VMs [14]. Think of multiple VMs directly controlling the same network card, or multiple VMs sharing a USB controller. This is still relatively new and not all devices support it. Above 4G Decoding allows PCI devices that need large memory spaces to be mapped above the 4 GB boundary. A lot of hypervisors will simply “reserve” memory space for PCI devices in guest OSes. If the device needs access to the 64-bit memory space (a 32-bit address space is limited to 2^32 bytes, or 4 GB), which the NVIDIA GeForce GTX does for its 8 GB of memory, this needs to be enabled.
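
The 4 GB arithmetic above can be spelled out in a few lines. This is only a sketch of the address-space limit: it takes the 8 GB figure from the text at face value, the names are made up for illustration, and the size of the memory window a real card requests is a property of the device that is not derived here.

```python
GIB = 2**30            # bytes in one binary gigabyte
LIMIT_32BIT = 2**32    # bytes addressable with 32 address bits


def exceeds_32bit_space(region_bytes):
    """True if one region of this size cannot fit in a 32-bit address space.

    Hypothetical helper for illustration only.
    """
    return region_bytes > LIMIT_32BIT


if __name__ == "__main__":
    print(LIMIT_32BIT // GIB)             # size of the 32-bit space in GiB
    print(exceeds_32bit_space(8 * GIB))   # the 8 GB case from the text
```

In practice the space below 4 GB is also shared with everything else that is mapped there, so even regions smaller than 4 GB can fail to fit.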

Disable Compatibility Support

The last thing I needed to do was disable CSM. CSM stands for Compatibility Support Module. It allows the system to boot in legacy BIOS mode rather than UEFI mode. Essentially, UEFI is the preferred method because it enables drivers to be loaded more easily from modern storage, enables faster booting, and can be programmed in C (it used to be assembler! It’s been a while since I’ve done assembly programming) [15]. Mostly, it’s the drivers that are needed. More and more, legacy systems cannot load all features or make advanced drivers work correctly. I switched to UEFI on all my systems about 5 years ago. I’m still perplexed why that isn’t the default.

Internal USB Cable

I had intended to install ESXi on a USB drive, like I had done for my storage server. However, there were no internal USB ports. That left me with three choices: add one internally, which would be a rather bulky solution involving something like this [16]; let the drive hang outside the case; or use an old drive.

NexStar 2.5 in SATA Cage

I have two old Samsung 860 Pro SSDs lying around from previous HTPC builds, before the streaming setup. I’m kind of a data pack rat like that. These should be more than adequate for just an ESXi install. Plus, I still had two open SATA ports on the motherboard. I went and purchased a NexStar dual 2.5 in enclosure for a 5.25 in bay [17]. I didn’t do a lot of research on it because I have used it before, and I don’t really have high expectations of it; a USB stick would work for this, after all. I just wanted a bootable SATA interface, and this is better than letting the drives hang in the air like I was doing before.

After all of this, the install went smoothly. There really isn’t a lot more to show here. Just boot from the USB install stick and select the 860 Pro. The installer takes care of the rest.

PCI Passthrough Issues:

I ran into a general PCI issue: I would try to change the passthrough status of a device (either enabling or disabling it), and the interface would state whether or not it needed a reboot, but after the reboot the change had not taken effect. In fact, the system seemed not to recognize that I had ever made the attempt.

I have two examples of this. First, I can show the failure when disabling PCI passthrough.

PCI Passthrough toggle failure after a reboot

I have the Intel X550 10G controllers set to active.

Then I select them

Then I disable PCI passthrough on them

Then I go to the manage page

Select reboot

And after reboot, they are still active

PCI Passthrough toggle failure when needing reboot

The second path I can show is enabling: the interface states it needs a reboot, but the change does not take effect either, and it appears not even to have saved the fact that the enable was requested.

First, I select the SAS3008 PCI card, which is disabled, and attempt to enable it.

It now states it needs a reboot

Then I reboot and recheck. It is still marked as disabled.

Debugging PCI Passthrough toggle failure

This was very strange behavior. I looked for ways to manually toggle the passthrough status of PCI devices, which led me to a few articles that appear to be based on a VMware KB article [18].

grep for pciPassthru

So I logged on to one of my VMs and SSHed over into the ESXi box. I opened esx.conf and there was no “passthru”. I grepped for it (grep is a text-searching utility) and found nothing. Then I checked the KB article: it is for ESXi 6.7, not 7.0, and there is no corresponding article for 7.0. This is very frustrating.

ah-trees.conf

So I looked for “passthru” in any file there. I found something in ah-trees.conf. It looked like a JSON blob to me. I deleted everything in the children of pciPassthru. I know a fair bit about how JSON works, so I hoped that with an empty list the parser would just exit the loop. I rebooted the machine, and it still operated as if nothing had changed. I rechecked ah-trees.conf and the whole children list was back. Ultimately, I gave up on manually toggling PCI passthrough. I went ahead and reinstalled ESXi from scratch, and the problem went away.

I do have a theory as to what went wrong. When I was toggling the AHCI controllers for PCI passthrough, I toggled all of them, even the controller that controlled the ESXi installation. I think that maybe when that controller is under passthrough, the installation becomes effectively read-only, and thus nothing can be changed permanently. Everything appeared to be modified in memory rather than on disk. That would explain why nothing I changed ever survived a reboot.

With the fresh install I tried to toggle all of the AHCI controllers; one of them required a reboot, and the other two did not. I toggled PCI passthrough on the one that required a reboot back to disabled. Then I rebooted the system, and the problem did not reappear. Another point in favor of this theory is the way PCI passthrough worked for the ESXi network ports. After the reinstall, ESXi reset itself to use one of the 10G ports. I tried to toggle both ports.

I selected and toggled passthrough for both controllers, but this failed.

I was able to toggle only one of them

But not the other

I later determined this was because ESXi was using the second controller.  I switched it to one of the 1G controllers from the physical keyboard and monitor, then I was able to toggle the second controller.

In retrospect, this was obvious: if ESXi is using a particular resource, it cannot be toggled to give a VM complete control of it. If the AHCI controller can be toggled, but only actually passes through after a reboot, that implies something is using it at that point in time. In that case it is probably ESXi itself. After the reboot, ESXi will load itself into memory, then pass the controller through. Thus everything appears to be working, but permanent state changes are impossible. Ultimately I am not certain, but there is a lot of secondary evidence that my supposition is correct.

In any event, this issue affected all VMs and thus deserved to be discussed with the general issues. I only fixed it with a reinstall. This is not ideal, and if my supposition is right, I think VMware could give a warning when someone toggles PCI passthrough on the AHCI controller that hosts the ESXi install.

References

[1] https://blogs.vmware.com/vsphere/2018/10/vcenter-server-licensing-options.html

[2] https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/vsphere/vmw-edition-comparison.pdf

[3] https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vsphere/vmw-flyr-vspherecomparekits-uslet.pdf

[4] https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/whitepaper/vmware-vsphere_pricing-white-paper.pdf

[5] https://players.brightcove.net/1534342432001/Byh3doRJx_default/index.html?videoId=2011072193001

[6] https://www.vladan.fr/esxi-free-vs-paid/

[7] https://www.thomas-krenn.com/en/wiki/VMware_vSphere_6_Editions_Overview

[8] https://fastreroute.com/vsphere-6-7-editions-licensing-architecture-and-solutions/

[9] https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/vsphere/vmw-flyr-vspherecomparekits-uslet.pdf

[10] https://store-us.vmware.com/vmware-vsphere-essentials-kit-282883900.html

[11] https://www.reddit.com/r/vmware/comments/9j7frg/esxi_essentials_67_backups/

[12] https://rufus.ie/

[13] https://www.anandtech.com/show/2480/9

[14] https://en.wikipedia.org/wiki/Single-root_input/output_virtualization

[15] https://phoenixts.com/blog/uefi-vs-legacy-bios/

[16] https://www.amazon.com/StarTech-Motherboard-4-Pin-Header-USBMBADAPT/dp/B000IV6S9S/ref=sr_1_5?dchild=1&keywords=usb+header+to+usb+port&qid=1602351449&sr=8-5

[17] https://www.vantecusa.com/products_detail.php?p_id=114&p_name=NexStar+SE&pc_id=7&pc_name=Mobile+Racks&pt_id=2&pt_name=Hard+Drive+Accessories

[18] https://kb.vmware.com/s/article/1022011
