
Nutanix Foundation 4.1

Foundation 4.1 brings a bunch of improvements to make Nutanix deployments even easier.

The Foundation Golden Rule

Always upgrade to and use the latest available version of Foundation prior to your deployments. Each new version brings new features (like those below), as well as support for additional platforms and software. If you are using an old version, you are missing out!

[Image: Foundation install methods – the Foundation Release Notes help determine which method to use]

Upgrading is easy. Download the tar.gz from the Support Portal. In Standalone VM Foundation, click the version number and follow the prompts. In CVM Foundation, use the Java applet to upgrade the built-in Foundation on the discovered nodes.

Let’s get into what is in 4.1.


Foundation UI Changes

CVM Foundation and Standalone VM Foundation in 4.1 have a more unified UI and workflow. However, Standalone VM Foundation is the only method that allows you to manually add nodes (instead of using discovery) or abort an in-progress imaging session.

The Range Auto-fill option has been moved to the “Tools” menu to keep the default UI clean.


Foundation now generates auto-filled host names without a hyphen by default.

CVM Foundation now supports imaging nodes without forming a cluster. This was already supported in Standalone VM Foundation.

Standalone VM Foundation (via discovery) can now image nodes or create a cluster without requiring IPMI, mirroring the CVM Foundation experience.

Faster Deployments via Skipping AOS Imaging

On discovered nodes, Foundation 4.1 can now skip imaging AOS (if all nodes are running the same AOS version) and let you change just the underlying hypervisor whilst keeping the AOS version as-is. This saves downloading and re-uploading the AOS image file when the AOS version isn’t changing – a real time saver.

Let’s say your nodes arrive from the factory with AHV and AOS 5.5.1. If you just want to change to ESXi, for example, you do not need to upload a new AOS image during the Foundation process.


As an example, I selected ESXi 6.5U1 to change the primary hypervisor for the cluster. By default Foundation will image all nodes to ESXi, but in my case I wanted nodes C and D to remain on AHV in ‘Storage Only’ (or ‘Storage Node’) mode. This way all nodes participate in the storage pool, but I only need to license ESXi on nodes A and B.


Once you click “Start” you can relax – it will take about an hour to complete the process. Note that if you aren’t changing the AOS or hypervisor images at all, an install takes just a few minutes in total (just applying your IP addresses).

[Screenshot: imaging nodes with different hypervisors (ESXi, plus AHV for the “Storage Nodes”) at the same time!]

Side note: of course the AHV nodes finish first! (Though no one is surprised, right?)

All done!

Pre-Configuration Portal JSON file creation and import

Foundation can import configuration files generated at https://install.nutanix.com. The portal produces a Foundation JSON file that can then be imported into Foundation 4.1+ on site. You can create a configuration manually, or, if you have placed an order for Nutanix NX nodes, you can view your order and auto-populate the nodes (while they are still being shipped, for example).

Below are some screenshots from the install.nutanix.com site. The idea is that the UI should look very similar to the normal Foundation process.
[Screenshots: the install.nutanix.com configuration UI]

Note that the “Import New Order” function only applies to Nutanix NX orders. We are investigating ways to expand this to other appliance types in a future release.

Once you’ve completed the fields, click the ‘Download’ link at the bottom of the page to generate your JSON file.
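To make the round trip concrete, here is a minimal Python sketch that assembles a Foundation-style configuration and writes it out as JSON. The field names (blocks, nodes, ipmi_ip and friends) are illustrative assumptions only – the authoritative schema is whatever the portal’s ‘Download’ link produces for your Foundation version, so treat a real downloaded file as the template.

```python
import json

# Illustrative sketch only: these field names are assumptions, not the exact
# schema emitted by install.nutanix.com. Use a file downloaded from the
# portal as the authoritative template for your Foundation version.
config = {
    "cvm_netmask": "255.255.255.0",
    "cvm_gateway": "10.10.10.1",
    "hypervisor_netmask": "255.255.255.0",
    "hypervisor_gateway": "10.10.10.1",
    "blocks": [
        {
            "block_id": "BLOCK-A",  # hypothetical block serial
            "nodes": [
                {
                    "node_position": "A",
                    "ipmi_ip": "10.10.20.11",
                    "hypervisor_ip": "10.10.10.11",
                    "cvm_ip": "10.10.10.111",
                    # No hyphen, matching the Foundation 4.1 autofill default
                    "hypervisor_hostname": "NTNX01",
                },
                {
                    "node_position": "B",
                    "ipmi_ip": "10.10.20.12",
                    "hypervisor_ip": "10.10.10.12",
                    "cvm_ip": "10.10.10.112",
                    "hypervisor_hostname": "NTNX02",
                },
            ],
        }
    ],
}

# Write the file you would then import into Foundation 4.1+ on site.
with open("foundation_config.json", "w") as f:
    json.dump(config, f, indent=2)
```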

Other Fixes in 4.1:

Multi-homing is supported when IPMI configuration is skipped.

With AOS 5.6+, Foundation no longer fails if an NTP server is supplied but unreachable.

Summary:

Nutanix Foundation continues to evolve. We’ve got more UI changes and features planned around better networking options and more central deployment methods. The eventual goal is that anyone can deploy their Nutanix cluster – with any hypervisor they choose, any model of node, in any configuration they want, and eventually in any cloud – with just a few clicks, in an automated fashion.

I’d love to hear your suggestions and feedback on Foundation to make it better.


“Remote” Bare Metal Foundation

One of the lesser-known options when using “Bare Metal” Foundation is doing so over a Layer 3 network, instead of the traditional “same Layer 2 network + MAC address” method.

This allows Foundation imaging of Nutanix nodes over a (good!) WAN link, or across different subnets in your DC, for example.

[Diagram: a Foundation VM at Site A imaging nodes at Site B]

This method can be used to remotely ‘Bare Metal’ any hardware vendor’s platform running Nutanix via IPv4 – Nutanix NX, Lenovo HX, Dell XC, software-only Cisco and HPE, and others.

Quick Summary of the “Remote Bare Metal Foundation” procedure:

  1. Rack and cable the nodes, and configure the IPMI ports on the network with an IPv4 address (e.g. via the BIOS; see below). Do this first.
  2. Deploy the Foundation VM on the network, ensuring it has IPv4 connectivity to the IPMI ports (a quick reachability check is sketched after this list). The VM does not need to be on the same subnet as the IPMI ports and could be in a different site over a WAN.
  3. Go through the Bare Metal install process via the Foundation VM, skipping discovery and instead manually adding blocks/nodes by selecting the “I have configured their IPMIs to my desired IP addresses” option.
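Since step 2 is where remote installs usually go wrong, here is a minimal pre-flight sketch (my own, not part of Foundation) to run from the Foundation VM, confirming each manually configured IPMI address is reachable before you start:

```python
import subprocess

# Hypothetical pre-flight check, run from the Foundation VM: confirm each
# manually configured IPMI address answers a ping before starting imaging.
# (Linux ping flags: -c 1 = one echo request, -W 2 = two-second reply wait.)
ipmi_ips = ["10.20.30.11", "10.20.30.12", "10.20.30.13", "10.20.30.14"]

for ip in ipmi_ips:
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    print(f"{ip}: {'reachable' if result.returncode == 0 else 'UNREACHABLE'}")
```

A successful ping doesn’t prove the IPMI service itself (UDP port 623) is open through every firewall in the path, but it catches the most common routing mistakes before you commit to an imaging run.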

Critical Note on WAN Bandwidth Requirements

With this method you will copy the AOS + hypervisor image files over the network, in parallel, to each and every node – so consider the available bandwidth and network utilisation, as well as the sizes of the AOS / hypervisor images that will be transferred from your Foundation VM to the nodes during imaging.

These files can be several GB in size, and Foundation will time out if pushing an image to a node takes longer than 15 minutes – so you will likely need a WAN link of at least 50Mbit/s to copy the 4GB AOS file to a SINGLE node…and a better link if you are changing to ESXi (an additional ~350MB) or Hyper-V (an additional ~4GB), or if you are imaging more than one node.

If you have 4 nodes, multiply that by 4, of course. Clearly, this method is not for your small branch ROBO link. Use a tool like https://techinternets.com/copy_calc to see if your WAN link can handle the workload within that timeframe.
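To put numbers on it, here is a back-of-the-envelope calculator using the rough sizes quoted above (my own arithmetic; real image sizes vary by version, and protocol overhead is ignored):

```python
# Rough check: what aggregate link speed is needed to push the images to
# every node in parallel within Foundation's 15-minute timeout?
TIMEOUT_SECONDS = 15 * 60

def required_mbit_per_s(gb_per_node: float, node_count: int) -> float:
    """Mbit/s needed to copy gb_per_node to node_count nodes in parallel."""
    total_megabits = gb_per_node * 8 * 1000 * node_count
    return total_megabits / TIMEOUT_SECONDS

AOS_GB = 4.0           # ~4GB AOS bundle
ESXI_EXTRA_GB = 0.35   # ~350MB ESXi ISO
HYPERV_EXTRA_GB = 4.0  # ~4GB Hyper-V ISO

for nodes in (1, 4):
    print(f"{nodes} node(s), AOS only:     {required_mbit_per_s(AOS_GB, nodes):6.0f} Mbit/s")
    print(f"{nodes} node(s), AOS + HyperV: {required_mbit_per_s(AOS_GB + HYPERV_EXTRA_GB, nodes):6.0f} Mbit/s")
```

A single AOS-only node works out to roughly 36 Mbit/s of sustained throughput, which is why a 50Mbit/s link is a sensible floor once real-world overhead is allowed for; four Hyper-V nodes already demand nearly 300 Mbit/s.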

At the time of writing, you cannot modify the timeout setting.

In summary, ensure your network link can meet the timeout, taking into account the number of nodes you are imaging. For example, if you were imaging 4 nodes over the WAN, you would be copying at least 16GB in total over that link within 15 minutes.


With a 1Gbit link (or a local 1Gbit switch), 20 nodes would take ~12 minutes just for the AOS images. If you are imaging Hyper-V nodes, you could only image 10 nodes on 1Gbit links (as you need to include the 4GB Hyper-V ISO as well). This is why old 100Mbit switches or USB adapters won’t suffice when you are imaging multiple nodes.

Site A and Site B can be different L3 subnets. Just make sure Site A’s Foundation VM subnet, Site B’s IPMI subnet, and Site B’s CVM/hypervisor subnet are all routable to each other – every subnet involved must be able to reach every other.

Setting the IPMI Ports Manually

If you are unsure how to set the IPMI IP addresses manually, see the “Setting IPMI Static IP Address” section of the Foundation Field Installation Guide (available on the Nutanix Support Portal) for instructions on configuring them via the BIOS on each node.

[Screenshot: IPMI network configuration in the BIOS of one node]

The above screenshot shows one node’s IPMI settings in the BIOS. You would repeat this for each and every node you want to deploy, then use Foundation to image the nodes.
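As an aside: if a node is already booted into an OS that has ipmitool available, the same static addressing can usually be set from the command line instead of the BIOS. A hedged sketch follows – the LAN channel number and addresses are assumptions, and the commands typically need root:

```python
import subprocess

# Alternative to the BIOS method, run locally (as root) on an already-booted
# node with ipmitool installed. LAN channel 1 is common but platform-specific.
CHANNEL = "1"

for cmd in (
    ["ipmitool", "lan", "set", CHANNEL, "ipsrc", "static"],
    ["ipmitool", "lan", "set", CHANNEL, "ipaddr", "10.20.30.11"],
    ["ipmitool", "lan", "set", CHANNEL, "netmask", "255.255.255.0"],
    ["ipmitool", "lan", "set", CHANNEL, "defgw", "ipaddr", "10.20.30.1"],
):
    subprocess.run(cmd, check=True)

# Print the resulting LAN configuration so you can verify before moving on.
subprocess.run(["ipmitool", "lan", "print", CHANNEL], check=True)
```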

Quick UI Walkthrough

Below is a walkthrough of the initial screens in Foundation v4.1 for the Bare Metal via IPv4 process. Note that the IPMI addresses you type should, of course, match the IP addresses you’ve manually assigned to the nodes:

[Screenshots: the Bare Metal via IPv4 screens in Foundation v4.1]

We are also developing a “Foundation Central” microservice within Prism Central, which will allow ‘zero touch’ deployments at scale, including using a file store local to the nodes to avoid pushing files over the WAN – but for now, this ‘bare metal’ method works if you have the luxury of bandwidth.

Intro to Nutanix Lifecycle Manager (LCM) v1.2

“Single Pane of Glass” gets thrown around a lot by vendors, as does “upgrades are easy”. If you are lucky enough to be a Nutanix customer, you already live this dream.

But how could the Nutanix Engineering team make the experience even better?

While Hugh was using the tried-and-true traditional Nutanix AOS upgrade method within Nutanix Prism, each individual component (such as AOS, hypervisor, BIOS, etc.) had to be upgraded independently. It was reliable of course, but what if you could upgrade many components of your cluster at once, just as easily and reliably, with the dependencies taken care of for you?

Plus, with security releases coming thick and fast these days, it is imperative that we make it dead simple for customers like Hugh to react quickly and patch their infrastructure, regardless of hypervisor or hardware component type.

Thus, Nutanix Life Cycle Manager (LCM) was born.

[Screenshot: the LCM menu in Prism]

The LCM feature is available with Nutanix AOS 5.0+.

LCM is a framework that can detect and upgrade hardware and software components in a rolling fashion completely in-band via Nutanix Prism, taking care of any dependencies and maintenance mode operations as needed to conduct the upgrades.

The idea is that you can go to one location to manage all your Nutanix-related software and firmware updates, click a button, and LCM will orchestrate the entire process with no effect on your running workloads – all while you go and do something else, or perhaps just enjoy a quiet glass of red.

LCM has the power to tell whatever brand of hypervisor you are using to evacuate VMs to other nodes and reboot the host, should the update require it. Only when a host is confirmed to have returned to service does the next host begin its upgrade.

LCM is intelligent enough to let you select a single node for some updates. For example, you might want to upgrade the BIOS or disk firmware in just one node. Some updates are cluster-wide and some are per-node, depending on the component – but in either case LCM will take care of the operation for you.
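Conceptually, the rolling behaviour looks something like the sketch below. This is illustrative pseudocode of my own, with stubbed-out host operations, to show the ordering LCM manages for you – not LCM’s actual implementation:

```python
# Conceptual sketch of the rolling, one-host-at-a-time flow that LCM
# orchestrates for you. Illustrative only, with stubbed-out host operations.

class Host:
    def __init__(self, name: str):
        self.name = name

    def enter_maintenance_mode(self):
        print(f"{self.name}: evacuating VMs to other nodes")

    def apply_update(self, component: str):
        print(f"{self.name}: updating {component}")

    def reboot_and_wait_until_in_service(self):
        print(f"{self.name}: rebooting and waiting to return to service")

    def exit_maintenance_mode(self):
        print(f"{self.name}: taking VMs again")


def rolling_update(hosts, component: str, needs_reboot: bool):
    for host in hosts:  # strictly one host at a time
        if needs_reboot:
            host.enter_maintenance_mode()
        host.apply_update(component)
        if needs_reboot:
            # Only when this host is confirmed back in service does the
            # loop move on to the next one.
            host.reboot_and_wait_until_in_service()
            host.exit_maintenance_mode()


rolling_update([Host("A"), Host("B"), Host("C")], "BIOS", needs_reboot=True)
```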

If you are not familiar with Nutanix LCM, take a look at the quick video demos:

[Embedded video demos]

LCM is the framework which will (eventually) become the way you manage all updates and upgrades to your Nutanix clusters. I say ‘eventually’ because it is still early days for LCM, but things are ramping up quickly. As such, check for new LCM updates every week to see which new features are unlocked by the latest LCM Framework updates.

You don’t have to wait for a new version of the Nutanix AOS software either – LCM is independent of AOS – so you can upgrade LCM any time a new update is available.

So what’s new in LCM v1.2?

With v1.2, LCM supports additional inventory and update components on Nutanix NX and Dell XC clusters; until now, only SATADOM updates were supported. Lenovo HX and Nutanix Software-Only support is under development.

The following has been added in LCM v1.2:

Nutanix NX and SX platforms: requires AHV or ESXi 5.5, 6.0, or 6.5, and supports updates to the following components: HDD, SSD and NVMe drives.

Dell XC platform: requires ESXi 5.5 or 6.0 and AOS 5.1.0.3 or newer, and supports updates to the following XC components: XC BIOS, XC iDRAC, XC HBA controller, XC NIC, and XC disks (SSD and HDD).

For more details, check the release notes on the Nutanix Support Portal.

Using LCM

If you’ve not had a look at LCM before, I suggest you update the LCM Framework to the latest version, run an ‘Inventory’ (discovery) job and take a look around.

It is a good idea to run a ‘Perform Inventory’ operation first. This will scan your cluster and check if there are any updates available for any components, including the LCM Framework itself.


Go to the LCM page and select Options -> Perform Inventory. The status will change to “Perform Inventory in Progress”, which takes a few minutes as your whole cluster is scanned.

You may see some available updates:

[Screenshot: available updates listed in LCM]

The above screenshot shows software and some hardware components with available updates. To bring LCM itself up to date, I’ll select the ‘Cluster Software Component’ and hit the ‘Update Selected’ button.

Run that update. You will see a message that “Services will be restarted”, meaning the LCM-related Nutanix internal services will restart – but this is non-disruptive to your workloads, so it is safe to run this update at any time. Once you hit the “Apply 1 Update” button, the update process starts.


Once the new LCM update is installed, run Perform Inventory again to see whether any new updates or components are supported in the new version (now that you’ve updated the LCM Framework, more features may be unlocked).

If there are any other updates available, you may choose to update them as well using the same logic.

Future Plans

In the coming months you will see more unlocked updates appear in LCM: broader hypervisor support, more hardware component support, and more Nutanix software support (e.g. NCC, Foundation), so that the current “Upgrade Software” menu can eventually be retired and LCM can take over all functions related to our “1-Click Upgrades”.

LCM in Prism Central will also launch in 2018, with the ability to expand LCM to handle upgrades across multiple clusters.

In the meantime, the LCM Engineering team would love to hear your suggestions and feedback. They also love Twitter mentions, so please keep them coming.