Bare Metal Does Not Mean Inflexible

The Hybrid Burst Model — Cloud as Overflow, Not Foundation

By Catalin Lichi · Sugau


The most common objection to bare-metal infrastructure is not cost. It is not complexity. It is this:

“What happens when we need to scale faster than we can buy hardware?”

It is a legitimate question and it deserves a direct answer rather than a deflection. The answer is that a correctly architected bare-metal Kubernetes cluster can burst to cloud on demand, automatically, at the node level — and pull back when the pressure passes. Cloud elasticity does not disappear when you repatriate. It becomes a tool you reach for deliberately, rather than the substrate you pay for permanently.

This is the hybrid burst model. Bare metal as the anchor. Cloud as overflow.


The Architecture

Kubernetes does not care where a node lives. The scheduler sees compute, memory, and labels. Whether that node is a physical server in your rack or a virtual machine in AWS is, from the workload's perspective, irrelevant.

This property is what makes hybrid burst tractable. Your control plane runs on bare metal. Your production node pool runs on bare metal. You add a second node pool — cloud-backed, auto-provisioned, set to scale to zero when idle — and you taint those nodes to signal that they are overflow capacity, not primary capacity.

Workloads that can tolerate cloud nodes carry a matching toleration. Everything else stays on bare metal. The scheduler does the rest.
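In manifest terms, this is a taint on the burst nodes and a matching toleration on burst-eligible workloads. A minimal sketch — the `node-pool: cloud-burst` key and values are an illustrative convention, not a standard:

```yaml
# Taint carried by every cloud burst node (applied by your provisioner
# or via kubectl): nothing schedules here without a matching toleration.
apiVersion: v1
kind: Node
metadata:
  name: burst-node-1
  labels:
    node-pool: cloud-burst
spec:
  taints:
    - key: node-pool
      value: cloud-burst
      effect: NoSchedule
---
# A workload that may overflow onto burst nodes carries the toleration.
# Everything without it is confined to bare metal by the taint alone.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  replicas: 4
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      tolerations:
        - key: node-pool
          operator: Equal
          value: cloud-burst
          effect: NoSchedule
      containers:
        - name: worker
          image: registry.example.com/batch-worker:latest
```

Note that a toleration only permits scheduling onto burst nodes; it does not force it. The scheduler still prefers whatever untainted bare-metal capacity is free.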

The tooling that makes this real:

Cluster API provisions and manages cloud nodes using the same Kubernetes control plane that manages your bare-metal nodes. One cluster, two infrastructure backends. AWS, GCP, Azure, and Hetzner all have Cluster API providers. Your operations team does not need to learn a new system — they extend the one they already run.
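A cloud-backed node pool in Cluster API is a MachineDeployment referencing a provider-specific machine template. A minimal sketch for the AWS provider — cluster name, template names, and the Kubernetes version are illustrative, and the exact API versions depend on the provider release you run:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: burst-pool-aws
spec:
  clusterName: prod
  replicas: 0            # idle at zero; scaled up only during a burst
  selector:
    matchLabels: {}
  template:
    spec:
      clusterName: prod
      version: v1.29.0
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: burst-pool-aws
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachineTemplate
        name: burst-pool-aws
```

Scaling the pool is then an ordinary `replicas` change on a Kubernetes object — the same declarative workflow your team already uses for everything else in the cluster.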

Karpenter (AWS-native, with growing support elsewhere) handles node lifecycle automatically. When pods are pending because bare-metal capacity is saturated, Karpenter provisions cloud nodes in minutes and schedules the pending workloads. When those workloads finish, it terminates the nodes. You pay for the burst window, not for idle standby.
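In Karpenter terms, the overflow pool is a NodePool that stamps the burst taint onto every node it creates, caps total burst capacity, and consolidates empty nodes away. A sketch against the Karpenter v1 API — the taint key, CPU limit, and timings are illustrative choices, not recommendations:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: cloud-burst
spec:
  template:
    spec:
      taints:
        - key: node-pool          # matches the toleration on burst-eligible workloads
          value: cloud-burst
          effect: NoSchedule
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      expireAfter: 24h            # burst nodes are disposable by design
  limits:
    cpu: "256"                    # hard ceiling on total burst capacity
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 5m          # drain and terminate once the pressure passes
```

The `limits` block is what keeps a burst from becoming an unbounded cloud bill, and the disruption policy is what makes scale-to-zero automatic rather than a manual cleanup task.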

Virtual Kubelet takes this further for serverless-tolerant workloads — attaching AWS Fargate or Azure Container Instances as synthetic nodes. No VMs to manage, billing by the second, and the workloads think they are running on a normal Kubernetes node.
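Targeting a Virtual Kubelet node from a workload looks like any other taint-and-selector pairing. A sketch — Virtual Kubelet conventionally registers its nodes with a `type: virtual-kubelet` label and a `virtual-kubelet.io/provider` taint, but the exact values depend on your provider configuration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burst-job
spec:
  nodeSelector:
    type: virtual-kubelet             # pin to the synthetic node
  tolerations:
    - key: virtual-kubelet.io/provider
      operator: Exists
      effect: NoSchedule
  containers:
    - name: job
      image: registry.example.com/batch-job:latest
  restartPolicy: Never
```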


What Goes Where

Not all workloads are equal candidates for cloud burst. Being deliberate about workload placement is what makes the economics work.

Always on bare metal:

- Databases and other stateful services, where data gravity and consistent storage latency matter
- Latency-sensitive production services with steady, predictable traffic
- The stable baseline load that runs around the clock

Eligible for cloud burst:

- Batch processing, report generation, and scheduled analytics
- CI/CD pipelines and test runs
- Stateless request handlers absorbing a traffic spike

The pattern is consistent: workloads that are stateless, time-bounded, or tolerant of variable latency are good burst candidates. Everything else belongs on bare metal.


This Neutralises the Main Objection

The hardware procurement cycle is real. If you need two new nodes and your procurement process takes six weeks, you have a six-week gap. Cloud burst fills that gap cleanly.

The operational pattern for a high-growth scenario looks like this: load increases beyond bare-metal capacity, cloud burst nodes provision automatically, the workload runs on cloud while the procurement order is placed, new hardware arrives and is added to the cluster, workloads migrate back to bare metal, cloud nodes scale to zero. You never get caught flat-footed, and you never pay for cloud capacity beyond the window you actually needed it.
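The "workloads migrate back" step needs no manual intervention if burst-eligible workloads merely tolerate cloud nodes while preferring bare metal. A sketch of the pod-template fragment, assuming the illustrative `node-pool: cloud-burst` taint convention from earlier:

```yaml
# Pod template fragment for a burst-eligible workload: tolerate cloud
# burst nodes, but prefer any node that is not one. As replacement pods
# are scheduled, they land on the new bare metal, and the node
# autoscaler consolidates the emptied cloud nodes away.
spec:
  tolerations:
    - key: node-pool
      operator: Equal
      value: cloud-burst
      effect: NoSchedule
  affinity:
    nodeAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          preference:
            matchExpressions:
              - key: node-pool
                operator: NotIn
                values: ["cloud-burst"]
  containers:
    - name: worker
      image: registry.example.com/batch-worker:latest
```

Because the affinity is preferred rather than required, the same manifest works in both states: it overflows to cloud under pressure and drifts home when bare-metal capacity returns.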

This is categorically different from running everything in cloud by default. In the default cloud model, you are paying cloud prices for 100% of your workload, 100% of the time, including all the stable predictable capacity that bare metal would serve more cheaply. In the hybrid burst model, you pay bare-metal costs for the stable base — which is the majority of your load — and cloud costs only for the overflow, only while it is running.


The Correct Mental Model

Cloud is not your infrastructure. Cloud is your buffer.

A buffer has a specific job: absorb transient demand that exceeds your standing capacity. Buffers are sized for peaks, not for average load. Running your average load through a buffer permanently is an architectural mistake, and it is an expensive one.

Bare metal handles your average load. Cloud handles your peaks. Kubernetes provides the abstraction layer that makes both look like the same cluster to your workloads.

This is not a compromise position between bare metal and cloud. It is a more sophisticated architecture than either pure-cloud or pure-bare-metal alone. You get the economics of bare metal for the workloads that benefit from it, and the elasticity of cloud for the moments that actually require it.


Getting There from a Pure-Cloud Starting Point

If you are currently running entirely in cloud and considering repatriation, the hybrid burst model is also the correct migration path — not just the end state.

You do not move everything at once. You identify the stable, high-cost, cloud-permanent workloads that are the clearest economic case for repatriation. You move those to bare metal first. Cloud continues to run everything else. Over time, as more workloads migrate, cloud shrinks from primary infrastructure to burst buffer. The transition is gradual, reversible at each step, and each migration pays for the next hardware purchase.

By the time you reach a mature hybrid architecture, you have validated every workload placement decision against real operational data rather than pre-migration assumptions.


The Answer to the Objection

When a prospect or internal stakeholder asks, “What if we need to scale faster than we can buy hardware?”, the answer is straightforward.

We burst to cloud. Automatically. For as long as we need it. Then we pull back.

That is not a workaround. It is the architecture.


Sugau designs and operates bare-metal Kubernetes infrastructure with hybrid burst capability for organisations moving away from full cloud dependency. If you are ready to architect infrastructure that gives you bare-metal economics without sacrificing elasticity, get in touch.