Introduction
Imagine being on a long road trip. After a while the gasoline or battery starts to run low, the driver and passengers get tired, and it’s time for a pit stop. The car pulls over, the passengers get out, and everyone and everything is refueled and refreshed for the next leg.
But what if, instead of stopping, you could just push a button and the car would be refueled or recharged while you drive? Even better: how about swapping out the entire car for a newer model while you’re at it?
Metaphorically speaking, the Nutanix Cloud Clusters (NC2) on AWS solution can change the entire car – without the need for a pit stop. The system just keeps running while the bare metal is replaced underneath it, as if nothing had happened, apart from gaining more power. All the benefits – none of the downtime. And it can be done with a single command through the NC2 management portal.
In this example we swap out i3.metal nodes for more powerful i4i.metal nodes while the cluster is running. The starting point is a cluster with three i3.metal nodes and the end state is the same cluster, now with three i4i.metal nodes. The change is seamless for the workloads running on top of NC2: apart from a brief dip in north-south traffic during the network change, they experience no disruption.
Starting point
We start out with a plain NC2 on AWS cluster with three i3.metal nodes. In addition to the basic cluster components, we have also opted to deploy the Prism Central control plane and Flow Virtual Networking (FVN) overlay networking.
Multiple Virtual Machines (VMs) are running on the NC2 cluster. To monitor their health we start a continuous ping, the statistics of which can be evaluated after the cluster nodes have been replaced.
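The continuous ping can be driven by a small script rather than a console window. Below is a minimal Python sketch of such a monitor; the flags shown (`-c`/`-W`) are the Linux `ping` options (Windows uses `-n`/`-w`), and the host address and interval are illustrative assumptions, not part of the NC2 tooling.

```python
import subprocess
import time
from typing import Optional

def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send a single ICMP echo request; True if a reply came back."""
    # '-c 1' (count) and '-W' (timeout) are the Linux ping flags;
    # on Windows the equivalents are '-n 1' and '-w <milliseconds>'.
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        capture_output=True,
    )
    return result.returncode == 0

def loss_percent(sent: int, lost: int) -> float:
    """Packet loss as a percentage of pings sent."""
    return 0.0 if sent == 0 else 100.0 * lost / sent

def monitor(host: str, interval_s: float = 1.0,
            max_pings: Optional[int] = None):
    """Ping continuously, tallying sent/lost until stopped or interrupted."""
    sent = lost = 0
    try:
        while max_pings is None or sent < max_pings:
            sent += 1
            if not ping_once(host):
                lost += 1
            time.sleep(interval_s)
    except KeyboardInterrupt:
        pass
    return sent, lost, loss_percent(sent, lost)
```

The returned tally can then be compared against the ping statistics shown after the node swap completes.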
On the networking side we have set up No-NAT networking with Flow Virtual Networking, so the subnet the test VM is attached to is also reachable from the native AWS Virtual Private Cloud (VPC). In this case we are pinging the Linux NC2 test VM from a Windows EC2 instance in a separate AWS VPC, as per the diagram above.
Updating the Cluster Capacity Settings
Updating the Cluster Capacity settings in the Nutanix MCM portal
The management portal for NC2 allows for easy updates to cluster capacity and configuration. We highlight our cluster and navigate to Cluster Capacity, where the node types and the number of nodes can be changed.
A few clicks later we have added three new i4i.metal (i4i) nodes (metal instances, in AWS parlance) to our original configuration of three i3.metal (i3) nodes, and we have set the number of i3 nodes to zero. This adds three nodes of a more powerful configuration; once all data has been transferred over, the old cluster nodes are removed and billing for them stops.
The task has now been accepted by the MCM portal and is being executed in the background. VMs running on NC2 continue working as usual, unaware of the big changes to the system which are under way.
EC2 Bare-metal Changes
EC2 bare-metal changes as seen from the AWS console
In the AWS console it is possible to watch the process unfold: the i4i.metal nodes being added, i3 and i4i nodes running side by side while the cluster shifts onto the new nodes, and finally the decommissioning of the i3.metal nodes.
From a networking perspective, the i3.metal Elastic Network Interface (ENI) that was the active point of north-south communication for the cluster, and therefore referenced in the AWS VPC route table, has shifted to an ENI on one of the new i4i.metal hosts post-migration.
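For readers who want to verify this shift themselves, the active ENI for a given prefix can be read out of the VPC route table. The sketch below assumes the response shape of boto3’s `describe_route_tables` call; the VPC ID and FVN prefix in the commented usage are hypothetical placeholders.

```python
def eni_for_prefix(route_tables: list, cidr: str):
    """Return the ENI ID that a VPC route table points at for a given prefix.

    `route_tables` follows the shape of boto3's
    ec2.describe_route_tables()["RouteTables"].
    """
    for table in route_tables:
        for route in table.get("Routes", []):
            if route.get("DestinationCidrBlock") == cidr:
                return route.get("NetworkInterfaceId")
    return None

# Live usage (requires boto3 and AWS credentials; VPC ID and prefix
# below are hypothetical placeholders):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   resp = ec2.describe_route_tables(
#       Filters=[{"Name": "vpc-id", "Values": ["vpc-0123456789abcdef0"]}]
#   )
#   print(eni_for_prefix(resp["RouteTables"], "10.70.0.0/16"))
```

Running this before and after the node swap would show the route’s `NetworkInterfaceId` change from an i3.metal ENI to an i4i.metal ENI.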
Result
The node swap completed without a hitch and without any input from the IT administrator managing the NC2 cluster – well, apart from initiating the change at the start. In this case our cluster was hosting only a handful of VMs and the entire process took just under an hour. Naturally, the time required will increase with the amount of storage used and the load the cluster is under during the change.
As the new nodes are added, VMs and data are automatically migrated between hosts without user intervention or a manual “re-balancing” effort. VMs remain available and data remains protected (at Redundancy Factor (RF) 2 or RF3) at all times. All nodes in the cluster take part in the movement of data, meaning it can happen relatively quickly.
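As a rough mental model (not Nutanix’s actual placement algorithm), data protection under an RF policy can be pictured as keeping each piece of data on RF distinct nodes, and re-homing copies when a node leaves the cluster. A toy sketch, with made-up node and extent names:

```python
from itertools import cycle

def place_replicas(extents, nodes, rf=2):
    """Toy placement: each extent gets `rf` replicas on distinct nodes."""
    if len(nodes) < rf:
        raise ValueError("need at least rf nodes")
    ring = cycle(range(len(nodes)))
    placement = {}
    for extent in extents:
        start = next(ring)
        placement[extent] = [nodes[(start + i) % len(nodes)]
                             for i in range(rf)]
    return placement

def evacuate(placement, leaving, nodes, rf=2):
    """Re-home replicas off a departing node, keeping rf copies at all times."""
    survivors = [n for n in nodes if n != leaving]
    for extent, replicas in placement.items():
        if leaving in replicas:
            keep = [n for n in replicas if n != leaving]
            spare = next(n for n in survivors if n not in keep)
            placement[extent] = keep + [spare]
    return placement
```

The point of the sketch is simply that every extent always has RF copies on distinct nodes, before, during, and after a node is removed, which is why the workloads never lose access to their data.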
More importantly, the workloads experienced just a blip in network connectivity – no downtime and no reboots.
The Linux VM, which we started pinging at the beginning of the blog post, is still up and the pings are still getting through. Throughout the hour-long change a total of 3,381 pings were sent; 26 of these were lost (near enough 0% loss, according to Windows).
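For the record, the arithmetic behind that statement: 26 lost out of 3,381 sent is well under one percent, which the whole-number loss figure in the Windows ping summary displays as 0%.

```python
sent, lost = 3381, 26
loss_pct = 100 * lost / sent   # just under 0.8 %
print(f"{loss_pct:.2f}% of the pings were lost")
```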
The uptime command on the Linux host also confirms that no VM reboot took place. Moreover, the SSH session from the Windows EC2 instance to the Linux VM on NC2 remained in place, uninterrupted, throughout the procedure.
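The uptime evidence could also be checked programmatically. A small sketch, assuming the timestamp format printed by the Linux `uptime -s` command (the change-start timestamp is a hypothetical input):

```python
from datetime import datetime

def parse_boot_time(uptime_s_output: str) -> datetime:
    """Parse the output of 'uptime -s', e.g. '2024-05-01 09:15:02'."""
    return datetime.strptime(uptime_s_output.strip(), "%Y-%m-%d %H:%M:%S")

def no_reboot_since(boot: datetime, change_started: datetime) -> bool:
    """True if the last boot predates the node swap, i.e. the VM never restarted."""
    return boot < change_started

# Usage on the VM itself (hypothetical change-start time):
#   import subprocess
#   boot = parse_boot_time(subprocess.check_output(["uptime", "-s"], text=True))
#   print(no_reboot_since(boot, datetime(2024, 5, 1, 12, 0, 0)))
```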
Conclusion
This example showed how quick and easy it is to migrate an NC2 cluster from one EC2 bare-metal instance type to another when using Nutanix Cloud Clusters on AWS – in sharp contrast to some other virtualization platforms in the public cloud. The same functionality can also be used to scale clusters up and down with similar ease.
These capabilities, coupled with the ability to hot-add resources (disk, vCPU and RAM, OS compatibility allowing) to VMs in virtually any configuration you choose, make NC2 one of the most flexible and scalable ways to run your workloads in a public cloud.
For more information, please visit the Nutanix Cloud Clusters page below: https://www.nutanix.com/products/nutanix-cloud-clusters