AHV | Invisible Infrastructure

With remote working now the new normal, it is challenging to send skilled IT professionals to data centers to install new equipment. Although Nutanix clusters have always been quick to install and configure, there was still a requirement to send a trained IT technician to site to run the Foundation software for deployment, usually connected to a local laptop. For large-scale or multiple sites, this can be a costly exercise.

How could we make this even easier in a ‘work from home’ world?

With the launch of Nutanix Foundation Central 1.0 with Prism Central 5.17, this specialist requirement is now removed.

Zero-touch deployments are now a reality for factory-shipped appliances from Nutanix, Lenovo, HPE, Dell, Fujitsu, and Inspur: all will be Foundation Central ready out of the box.

Nutanix Foundation Central is a service on Prism Central 5.17+

Foundation Central (FC) is an at-scale orchestrator of Nutanix deployment and imaging operations. After the initial network prerequisites are met, new nodes ordered from the factory can be connected to the network and receive Foundation Central’s IP address via DHCP. Because the nodes ship from the factory, they will have either a functional CVM (running the CVM Foundation service) or DiscoveryOS (for NX International shipments) built in.

The nodes (no matter the location) send their “I’m ready” heartbeat status to Foundation Central.

A “location” can be, for example, the same data center or a remote site.

Once Foundation Central detects the nodes, the administrator can create a deployment task for them. The nodes at each location pick up the task and carry out the configuration job themselves, then send their job status back to Foundation Central for the administrator to monitor.

Foundation Central home screen with 15 discovered nodes. Nodes can be running different AOS and/or hypervisor versions and can be re-deployed into clusters with a new AOS/hypervisor of choice.

Foundation Central never initiates a connection to the nodes. The nodes are the source of all communications and wait to receive their orders on which task to perform.

Unconfigured nodes send heartbeats to Foundation Central

Foundation Central receives these node heartbeats and then displays the nodes as available to be configured. By default, it can take up to 20 minutes for a node to appear in the Foundation Central UI. Heartbeats are sent until the nodes are configured and part of a formed cluster.
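To make the "nodes always call in, Foundation Central never calls out" idea concrete, here is a minimal toy sketch in Python. It is not the Foundation Central API: the class, method names, and task fields are all made up for illustration, and the real service talks over the network rather than through in-process calls.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class FoundationCentralSketch:
    """Hypothetical stand-in for Foundation Central: it only answers node requests."""
    discovered: Dict[str, str] = field(default_factory=dict)      # node serial -> state
    pending_tasks: Dict[str, dict] = field(default_factory=dict)  # node serial -> task

    def receive_heartbeat(self, serial: str) -> Optional[dict]:
        """Called BY a node. Records the node and hands back its task, if one is queued."""
        self.discovered.setdefault(serial, "available")
        task = self.pending_tasks.pop(serial, None)
        if task is not None:
            self.discovered[serial] = "deploying"
        return task

    def create_deployment(self, serials: Iterable[str], task: dict) -> None:
        """Administrator action: queue a task. Nothing is pushed to the nodes."""
        for serial in serials:
            if serial not in self.discovered:
                raise KeyError(f"node {serial} has not sent a heartbeat yet")
            self.pending_tasks[serial] = task


fc = FoundationCentralSketch()
first = fc.receive_heartbeat("NODE-A")                       # node announces itself
fc.create_deployment(["NODE-A"], {"cluster_name": "demo"})   # admin queues a task
second = fc.receive_heartbeat("NODE-A")                      # next heartbeat collects it
print(first, second, fc.discovered)
```

The point of the sketch is the direction of the calls: the only way a task reaches a node is as the reply to that node's own heartbeat.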

Unconfigured nodes send requests to FC and receive their configuration orders

From this point until job completion, Foundation Central only receives status updates. These updates come from the coordinating node.

Once configuration/re-imaging has completed successfully on one node, the original ‘coordinating node’ hands over to that fully completed node, which then takes over as the new (permanent) coordinating node for the remaining nodes.

If re-imaging to a different AOS or Hypervisor is required, Foundation Central will ask for the URL where these images can be found. These can be anywhere on the network, but given the file size it is recommended they be local to the nodes where possible.

Changing AOS and Hypervisor Type if required

Once the administrator has configured the Foundation Central jobs as desired, Foundation Central waits for the specified nodes to request their configuration task.

Imaging and Configuration tasks are always local to the nodes/location

Configuration tasks are then fully handed off to the local nodes, and Foundation Central becomes a ‘passive partner’ in the process from here. The nodes elect a ‘coordinating node’ for the entire operation, which is responsible for keeping Foundation Central updated on the status of the tasks.
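The coordinating-node behaviour (an elected coordinator that hands over to the first node to finish) can be sketched roughly as follows. This is a simplified simulation only: how the nodes actually elect a coordinator is not covered here, so the function simply takes the elected node as an input, and all names are hypothetical.

```python
from typing import List, Tuple


def simulate_reporting(elected: str, completion_order: List[str]) -> List[Tuple[str, str]]:
    """Return (finished_node, node_that_reports_to_FC) pairs.

    Assumption: the first node to finish becomes the permanent coordinator
    and reports every completion, including its own.
    """
    coordinator = elected      # temporary coordinator chosen by the nodes
    handed_over = False
    reports = []
    for finished in completion_order:
        if not handed_over:
            # First fully completed node takes over permanently.
            coordinator = finished
            handed_over = True
        reports.append((finished, coordinator))
    return reports


reports = simulate_reporting("node-1", ["node-2", "node-3", "node-1"])
for finished, reporter in reports:
    print(f"{finished} finished; status sent to Foundation Central by {reporter}")
```

Foundation Central appears nowhere in the loop, which is the ‘passive partner’ idea: it only receives whatever the current coordinator sends.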

Deployment Complete in parallel with different AOS/Hypervisors no matter the location

Foundation Central Requirements

Nodes must be running Foundation 4.5.1 or higher (bundled with a CVM or DiscoveryOS). It is advisable to run the latest Foundation on the nodes (upgrading via the APIs before imaging is very easy).
Networking requirements must be met (see below).
Prism Central must be version 5.17 or later.
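As a rough illustration of checking these minimums, the snippet below compares dotted version strings in Python. It assumes purely numeric versions such as "4.5.1" or "5.17"; the helper names are my own and how you obtain the version strings from your environment is left out.

```python
from typing import Tuple

MIN_FOUNDATION = (4, 5, 1)      # from the requirements above
MIN_PRISM_CENTRAL = (5, 17)     # from the requirements above


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.5.1' into (4, 5, 1). Assumes numeric, dot-separated parts only."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found: str, minimum: Tuple[int, ...]) -> bool:
    """Compare versions component by component, padding the shorter one with zeros."""
    version = parse_version(found)
    width = max(len(version), len(minimum))
    padded_version = version + (0,) * (width - len(version))
    padded_minimum = minimum + (0,) * (width - len(minimum))
    return padded_version >= padded_minimum


print(meets_minimum("4.5.1", MIN_FOUNDATION))
print(meets_minimum("5.16.1.2", MIN_PRISM_CENTRAL))
```

Comparing tuples of integers avoids the classic string-comparison trap where "4.10" sorts before "4.5".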

Networking and DHCP Requirements to use Foundation Central

The factory nodes need to be connected to a network that has a DHCP scope defined which allows for specific DHCP options. This is to ensure the nodes automatically receive the Foundation Central IP address and API keys specific to your environment.

DHCP server must be present and must be configured with vendor-specific-options:
Vendor class: NutanixFC
Vendor encapsulated options (DHCP option 43):
- Option code: 200, Option name: fc_ip
- Option code: 201, Option name: api_key
L3 connectivity from remote nodes to Foundation Central must be available
L3 connectivity between ‘starting’ and ‘destination’ subnets if IP addresses are to be changed as part of the node configuration process
Remote CVMs must be configured in DHCP mode (default from factory)
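For the curious, DHCP option 43 carries its vendor-encapsulated sub-options as a simple code/length/value sequence (RFC 2132). The sketch below builds such a payload for sub-options 200 and 201. Treat the value encodings as assumptions: I have encoded fc_ip as a 4-byte IPv4 address and api_key as ASCII text, and the IP and key shown are made-up examples. Check the Foundation Central Guide and your DHCP server documentation for the exact format expected.

```python
import ipaddress

FC_IP_CODE = 200     # sub-option code for fc_ip (from the list above)
API_KEY_CODE = 201   # sub-option code for api_key (from the list above)


def encode_suboption(code: int, value: bytes) -> bytes:
    """Encode one sub-option as: 1 byte code, 1 byte length, then the value."""
    if not 0 <= code <= 255:
        raise ValueError("sub-option code must fit in one byte")
    if len(value) > 255:
        raise ValueError("sub-option value must be at most 255 bytes")
    return bytes([code, len(value)]) + value


def build_option_43(fc_ip: str, api_key: str) -> bytes:
    """Concatenate the two sub-options into an option 43 payload.

    Assumption: IPv4 address packed to 4 bytes, API key as ASCII text.
    """
    ip_bytes = ipaddress.IPv4Address(fc_ip).packed
    key_bytes = api_key.encode("ascii")
    return encode_suboption(FC_IP_CODE, ip_bytes) + encode_suboption(API_KEY_CODE, key_bytes)


payload = build_option_43("10.0.0.50", "example-key")   # placeholder values
print(payload.hex())
```

Most DHCP servers let you define these sub-options by name and type rather than as raw hex, so a payload like this is mainly useful for understanding or debugging what ends up on the wire.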

Option 1: Nodes are discovered and configured to remain in the same subnet

Option 2: Nodes will be re-deployed to a different subnet. DHCP is not required for the 2nd subnet.

For more information contact your Nutanix SE or check the Foundation Central Guide on the Nutanix Support portal.

Special thanks to Foundation Engineering (Toms, Monica, Toshik, YJ and extended team) for the development of Foundation Central. They are already working on some improvements for v1.1!

Please reach out with any feedback or suggestions and I trust this helps in making your working-from-home life a little easier.

One of the reasons people stay with a particular type of hypervisor is that it is too hard (or too costly) to migrate to another type. All that drama of converting, testing and making sure all is right and then the risk of having to move back if something went wrong.

Sure, there are separate software tools you can buy to do the conversion for you . . . but what if the virtualisation infrastructure itself – the thing that is actually providing your servers and storage – could do it as an in-built function? What if that could be done just by clicking a few buttons?

So in the demo video below, I take a running Windows VM on Nutanix Cluster “A” running vSphere, take a snapshot of it, send it to a second Nutanix Cluster “B” running Nutanix’s own free hypervisor (AHV), and then start the VM. Job done. Easy.

Here’s the setup:

Basic lab setup using a flat L2 network. Production and DR deployments would use L3 networks – which is fine of course

..and here’s the demo:

For brevity, I cut out the initial one-off processes to set up the replication. The full process is below (check out the Nutanix Index for articles describing setting up replication):

1. Set up a Data Protection Remote Site ‘pair’ of clusters (so that they can replicate to each other) and test the connection.

Site A (ESXi cluster)
Site B (AHV cluster)

2. Set up a Protection Domain policy, add the VM you want to be a part of the replication policy and set a schedule.

3. On the Windows VM on ESXi at Site A that you want to snap to Site B running AHV, make sure you install the Nutanix VM Mobility drivers MSI from the my.nutanix.com support portal. (These will soon be included in Nutanix Guest Tools (NGT) after the Nutanix AOS 4.6 release, so installing NGT will automatically give you the VM Mobility drivers.) The Nutanix VM Mobility installer deploys the drivers that are required at the destination AHV cluster. After you prepare the source VMs, they can be exported (snapped) to the AHV cluster.

4. Run the snapshot and restore operation as per the video. That’s it!

Almost as easy as clicking this button

A few points to note:

In the video I am just taking a crash-consistent snapshot. If you want a clean snap, shut down the source VM first, then snap, then restore. Live app-consistent snapshots will be coming in 4.6 for ESXi and AHV.

If your VMs have static IPs, or if duplicate computer names could be an issue, take care of these before joining the newly created AHV VM to the network. When you restore the VM on AHV, by default there is no virtual NIC connected (so the risk is minimal if you just want to test). To connect it to the network, attach a NIC to the restored VM via Prism on the AHV cluster (go to the VM page).

Only 64-bit guest operating systems are supported at the time of writing (Nutanix AOS 4.5).

For Windows 7 and Windows 2008 R2 operating systems, you have to install the SHA-2 code signing support patch before running the Nutanix VM Mobility installer. For more information, see https://technet.microsoft.com/en-us/library/security/3033929.

More info can be found in the Nutanix Prism Web Console Guide under the “Nutanix VM Mobility for Windows” section – which can be found on the Nutanix Support Portal.
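The points above boil down to a short pre-flight checklist per source VM. Here is one way to sketch it in Python; this is purely an illustrative helper of my own (the class, field names, and messages are hypothetical, not a Nutanix tool), and you would have to fill in the VM details yourself.

```python
from dataclasses import dataclass
from typing import List

# Guest operating systems that need the SHA-2 patch first (per the note above).
NEEDS_SHA2_PATCH = {"Windows 7", "Windows 2008 R2"}


@dataclass
class SourceVM:
    name: str
    guest_os: str
    is_64_bit: bool
    mobility_drivers_installed: bool
    sha2_patch_installed: bool = False
    has_static_ip: bool = False


def preflight(vm: SourceVM) -> List[str]:
    """Return a list of things to fix before snapping this VM to AHV."""
    issues = []
    if not vm.is_64_bit:
        issues.append("only 64-bit guest operating systems are supported")
    if vm.guest_os in NEEDS_SHA2_PATCH and not vm.sha2_patch_installed:
        issues.append("install the SHA-2 code signing support patch first")
    if not vm.mobility_drivers_installed:
        issues.append("install the Nutanix VM Mobility drivers")
    if vm.has_static_ip:
        issues.append("static IP: sort out addressing before attaching a NIC on AHV")
    return issues


vm = SourceVM("web01", "Windows 2008 R2", is_64_bit=True, mobility_drivers_installed=False)
for issue in preflight(vm):
    print(f"{vm.name}: {issue}")
```

An empty list means the VM passes these particular checks; it says nothing about application-level readiness.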

Use cases:

A lot of people are trying AHV for the first time, and larger customers usually have a test/dev set of Nutanix nodes for testing. This method would be perfect for snapping production VMs to AHV for testing and verifying that all is OK.

Also, I can see a use case where DR clusters could now use the in-built AHV on Nutanix clusters and save some licensing dollars.

It would also be possible to use Nutanix Community Edition as the AHV target – in case you had some spare hardware and wanted to just try this out without the need for a full Nutanix set of nodes.

Future software plans:

In a few weeks (early 2016), Nutanix will release AOS 4.6. With it, two-way VM conversion (ESXi<->AHV in either direction) should be included. In a future release AOS is expected to add support for Hyper-V, delta disks, and volume groups.

Yes, Nutanix will enable the ability to leave AHV and migrate your VMs *back* to ESXi (for example) should you choose. Put simply, the onus is on Nutanix to keep innovating to maintain your loyalty, rather than any technical or license ‘lock-in’. At the end of the day your workloads are just virtual machines – you should be free to move them wherever you see fit (even away from Nutanix if you choose).

There will be lots of improvement and extra features coming in future releases of course, which you will get by simply doing a standard Nutanix non-disruptive upgrade.

Conclusion:

In essence, you can see why going hyper-converged makes doing things like this almost trivial compared to trying to do the same in a traditional 3-tier infrastructure (separate servers and storage layers). As the Nutanix software improves, your life gets easier each time. With each Nutanix release, more and more features like this will continue to be added and improved. Being 100% in-software is going to be a necessity in the next decade and beyond.

Thanks to @danmoz for letting me borrow his Dell XC cluster… and I treated it badly too (e.g. I hard powered it off multiple times with no care, and it self-recovered every time).

Hypervisor lock-in is sooo 2007 :)


Nutanix Deployment Delight with “Zero-Touch” Foundation Central

Nutanix 4.5 Cross-Hypervisor VM Conversion