Manually re-balance VMs on sole-tenant nodes with "Migrate within node group" maintenance policy set

·574 words·3 mins
Christoph Petersen

Sole-tenant nodes are an important Google Cloud Platform service for workloads that require isolation or must comply with licensing requirements that demand dedicated infrastructure. A detailed description of what a sole-tenant node is and how it differs from general-fleet VMs can be found in the Compute Engine documentation.

Sole-tenant nodes support different maintenance policies, which can be selected based on workload requirements. A maintenance event occurs when Google needs to update the underlying infrastructure.

This article focuses on the Migrate within node group maintenance policy, which is primarily used where a fixed-size group of nodes is required (e.g. due to licensing constraints). The sole-tenant node documentation has more information about the available maintenance policies and their use cases.

Challenge
#

Sole-tenant nodes are a good fit for long-running workloads. Over time, shifts in resource requirements (such as demand for new VMs, or existing VMs being decommissioned) can leave the resource distribution within a node group less than ideal. While the Compute Engine scheduler aims for the highest possible bin-packing ratio, outside factors mean this is not always achievable.

In rare cases this can lead to situations where more sole-tenant nodes are deployed than strictly necessary.
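To build some intuition for why such gaps appear, here is a toy first-fit bin-packing sketch in plain shell. It is not Compute Engine's actual scheduling algorithm, only an illustration using the sizes from the proof of concept below (96-vCPU nodes, 16-vCPU VMs):

```shell
# Toy first-fit bin-packing sketch (NOT Compute Engine's real scheduler):
# place each VM's vCPU demand on the first node with enough free capacity,
# opening a new node when none fits, then report how many nodes are in use.
NODE_CAPACITY=96
demands="16 16 16 16 16 16 16 16 16 16 16 16"   # twelve n1-highmem-16 VMs
nodes=""                                         # free capacity per open node

for d in $demands; do
  placed=0
  new=""
  for free in $nodes; do
    if [ "$placed" -eq 0 ] && [ "$free" -ge "$d" ]; then
      new="$new $((free - d))"   # VM fits: shrink this node's free capacity
      placed=1
    else
      new="$new $free"           # keep the node's capacity unchanged
    fi
  done
  if [ "$placed" -eq 0 ]; then
    new="$new $((NODE_CAPACITY - d))"   # no node fits: open a new one
  fi
  nodes=$new
done

echo "nodes used: $(echo $nodes | wc -w)"   # → nodes used: 2
```

In practice the scheduler must place VMs as they arrive and cannot move running VMs on its own, which is why deletions can strand free capacity across several nodes.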

Unbalanced sole-tenant node group with Migrate within node group maintenance policy set

While node groups that do not use the Migrate within node group maintenance policy allow manually live-migrating VMs to other nodes to increase density, this is not possible for node groups that do use it: manual live migration relies on affinity labels to target individual nodes, and these cannot be used in conjunction with this policy.

Workaround
#

When a VM is either shut down and restarted, or merely suspended and resumed, the scheduler makes a new placement decision and aims to maximize bin-packing.

While both of these actions incur downtime, suspend and resume retains the VM's state, and the unavailability lasts only seconds.

Sole-tenant node group after manual re-balancing with one node that can be deleted

Proof of concept
#

The following script creates a sole-tenant node group with the Migrate within node group maintenance policy. It creates three nodes: two for running VMs and one to satisfy the maintenance policy's requirements.

# Create node template (the region is taken from the gcloud config
# default unless --region is passed)
gcloud compute sole-tenancy node-templates create prod \
    --node-type=n1-node-96-624

# Create node group (the zone is taken from the gcloud config
# default unless --zone is passed)
gcloud compute sole-tenancy node-groups create prod \
    --node-template=prod --target-size=3 \
    --maintenance-policy=migrate-within-node-group
Create node template and sole-tenant node group

Both VM-running nodes are then filled with VMs, after which artificial gaps are created on both sole-tenant nodes by deleting some of them.

# Create smaller VMs
for number in 01 02 03 04 05 06 07 08 09 10 11 12; do
    gcloud compute instances create small-vm-$number \
        --machine-type=n1-highmem-16 --node-affinity-file=affinity.json \
        --enable-display-device --image=debian-11-bullseye-v20210817 \
        --image-project=debian-cloud --boot-disk-size=10GB \
        --boot-disk-type=pd-balanced --boot-disk-device-name=small-vm-$number \
        --shielded-secure-boot --shielded-vtpm --shielded-integrity-monitoring
done

# Create some holes by deleting some VMs
for number in 01 02 03 10 11 12; do
    gcloud compute instances delete small-vm-$number --quiet
done
Schedule VMs on both sole-tenant nodes then create artificial holes
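The `affinity.json` file passed via `--node-affinity-file` is not shown above. A minimal reconstruction targeting the prod node group created earlier might look like the following sketch; `compute.googleapis.com/node-group-name` is the documented affinity key for pinning VMs to a node group (targeting the group as a whole, not an individual node, which is what the Migrate within node group policy permits):

```shell
# Write a minimal node-affinity file pinning VMs to the "prod" node group.
# This is a reconstruction of the affinity.json referenced above; adjust
# the value to match your own node group's name.
cat > affinity.json <<'EOF'
[
  {
    "key": "compute.googleapis.com/node-group-name",
    "operator": "IN",
    "values": ["prod"]
  }
]
EOF
```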

Finally, the VMs on the second node are suspended and then resumed, causing the Compute Engine scheduler to place them on the first node to maximize resource utilization.

# Suspend VMs
for number in 07 08 09; do
    gcloud beta compute instances suspend small-vm-$number
done

# Resume VMs
for number in 07 08 09; do
    gcloud beta compute instances resume small-vm-$number
done
Suspend and resume VMs to show consolidation by the Compute Engine scheduler

Code
#

The full script is available on GitHub.
