Complexity Belongs in Software, Not in Infrastructure
2026-01-18
There is a fundamental category error in modern infrastructure thinking: we treat infrastructure complexity as if it were software complexity. They are not the same. They scale differently, fail differently, and place radically different cognitive demands on the people operating them. Ignoring this distinction is one of the reasons running modern systems often feels like a nightmare.
Software complexity is manageable because software is built for abstraction. Infrastructure complexity is dangerous because it forces reasoning about interdependent, emergent behavior not at runtime but long before it: during provisioning, wiring, and configuration.
Software Complexity Is Compressible
Software complexity is expensive but manageable because it is built to be abstracted. We can:
- encapsulate behavior behind interfaces
- reason locally about modules or services
- test in isolation
- refactor without changing semantics
- inspect state deterministically
- step through execution
Good abstractions compress complexity. They reduce the number of things a human must hold in their head simultaneously. This is why large codebases, large distributed applications, and complex security logic are possible: each moving part is designed to be understandable, at least in isolation.
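To make "compression" concrete, here is a minimal Python sketch (all names are illustrative, not from any particular codebase): the caller depends only on an interface, so each piece can be understood and tested in isolation.

```python
from typing import Protocol

class RateLimiter(Protocol):
    """The interface: callers depend on this contract, not on an implementation."""
    def allow(self, key: str) -> bool: ...

class FixedWindowLimiter:
    """One concrete implementation; it can be swapped without touching callers."""
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.counts: dict[str, int] = {}

    def allow(self, key: str) -> bool:
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit

def handle_request(limiter: RateLimiter, user: str) -> str:
    # Local reasoning: this function's behavior is fully specified by the interface.
    return "ok" if limiter.allow(user) else "throttled"

# Deterministic, isolated test: no environment, no wiring, no network.
limiter = FixedWindowLimiter(limit=1)
assert handle_request(limiter, "alice") == "ok"
assert handle_request(limiter, "alice") == "throttled"
```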
Infrastructure Complexity Is About Wiring, Not Behavior
The hardest part of infrastructure is not how systems behave under load. It is how parts are connected. And connecting many parts can be really complex. Any attempt to abstract over this complexity is going to leak. The leaks I’m talking about are provisioning and structural leaks:
- IP addresses that should never be relevant to the operator
- internal DNS names that must be manually referenced
- certificates that must be created, mounted, and rotated correctly
- storage paths that reveal implementation details
- port numbers that encode topological assumptions
- service meshes that layer extra hidden networking
This is where operational mental models collapse. Few operators can fully reason about all of these moving parts at once; most rely on ritual, guesswork, or copying "known good" configurations. The system might run. "Good enough" often feels sufficient, until it isn’t.
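As a hypothetical illustration (every name, host, and path below is invented), this is what those leaks look like from inside a client. None of these values describe behavior, only wiring, and each one must be exactly right:

```python
import ssl
import urllib.request

# Hypothetical wiring a client must carry around. None of this describes what
# the billing service does; all of it describes how the environment is plumbed,
# and each value can silently break on redeploy, rotation, or migration.
BILLING_URL = "https://billing.internal.example:8443"  # internal DNS name + port encode topology
CA_BUNDLE = "/etc/pki/internal-ca/ca.pem"              # certificate that must be created, mounted, rotated
EXPORT_PATH = "/mnt/pvc-7f3a/billing/exports"          # storage path leaking implementation details

def fetch_invoice(invoice_id: str) -> bytes:
    # The actual behavior is one trivial GET; the fragility lives in the constants above.
    ctx = ssl.create_default_context(cafile=CA_BUNDLE)
    url = f"{BILLING_URL}/invoices/{invoice_id}"
    with urllib.request.urlopen(url, context=ctx) as resp:
        return resp.read()
```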
Infrastructure Abstractions: Leaky or Opinionated
We do stack abstractions on infrastructure: Helm, Terraform, operators, CRDs, cloud APIs. But here's the problem: infrastructure abstractions are inherently leaky and often poorly designed.
Kubernetes deliberately chooses leakiness to remain unopinionated. It cannot assume how you want networking, storage, or certificates wired. Leaky abstractions are not a problem in principle. The problem is when they are high-complexity abstractions that multiply cognitive load instead of reducing it. Tools like Helm often amplify this issue: templating configuration on top of configuration creates indirection without reducing the mental model required to operate the system.
Docker Swarm illustrates the opposite approach. It is more opinionated. Decisions such as overlay networking via VXLAN, automatic service discovery, flat internal networking, and built-in load balancing reduce the operator's cognitive load. You don't place services into subnets, manage pod-to-pod routing, or design traffic topology. Services just talk.
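Contrast the wiring-heavy sketch above with what a client looks like when discovery is built in. Under Swarm, a service deployed under the (hypothetical) name `billing` is reachable by that name alone:

```python
import urllib.request

def fetch_invoice(invoice_id: str) -> bytes:
    # "billing" is just the Swarm service name: the platform's internal DNS
    # resolves it to a virtual IP that load-balances across replicas.
    # No subnet placement, no routing design, no topology in the client.
    with urllib.request.urlopen(f"http://billing:8080/invoices/{invoice_id}") as resp:
        return resp.read()
```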
The key distinction: good infrastructure abstractions remove structural decisions from the user. Bad ones expose every joint and bolt and call it "flexibility".
Kubernetes Optimizes for Flexibility, Not Operability
Kubernetes prioritizes:
- composability
- extensibility
- vendor neutrality
- maximal configuration freedom
These goals are not aligned with human operability. They produce systems that can be shaped into almost anything, but must be manually shaped every time. The cost is paid in:
- YAML sprawl
- fragile configuration graphs
- accidental complexity
- operational superstition
The platform becomes powerful, but cognitively hostile.
When Complexity Exceeds Understanding
When infrastructure becomes too complex to model mentally, engineers stop reasoning. They start:
- copying example configs
- toggling flags until things work
- restarting components as a fix
- relying on tribal knowledge
The system becomes ritualized. Not operated. Not understood. Maintained by pattern matching.
This is not a people problem. It is a system design problem.
Why Kubernetes Feels Worse Than Other Leaky Abstractions
It's important to be precise: leaky infrastructure abstractions are not unique to Kubernetes. Most infrastructure tools leak, because they have to. Infrastructure exists to manage complexity that software cannot handle directly: networking, storage, identity, certificates, and resource management. The infrastructure is fundamentally the abstraction layer over hardware that software runs on. Without exposing some of these details, it cannot function.
The difference is in behavioral incentives and operational culture. Kubernetes does three things other tools generally don't:
- It advertises composability as a virtue. Kubernetes encourages you to break everything into separate services, deploy controllers, and compose them via APIs. This rewards fragmentation even when unnecessary.
- It incentivizes layered abstractions. Helm, Operators, and CRDs exist to "simplify" deployments, but the simplification is only apparent. They multiply cognitive load instead of reducing it, creating patterns where operators feel compelled to encode business logic into infrastructure.
- It fosters ritualized operational practices. Because the system is emergent and hard to reason about, many teams rely on copying charts, tweaking YAML, and restarting pods. A cargo-cult approach.
This is fundamentally an "enterprise" problem. Many enterprise platforms and frameworks contain anti-patterns, overengineered abstractions, redundant layers, and rituals that are rationalized simply because they are labeled "enterprise." Kubernetes' ecosystem mirrors this: the tools tell you "this is the enterprise way", so teams adopt it without questioning whether it's actually manageable. Complexity is rewarded for being enterprise-grade, not for being operable.
Other leaky infrastructure tools, such as Terraform, also expose internals, because they have to, but they generally don't reward stacking layers of abstraction or encoding business logic in infrastructure. Terraform's leaks are visible and deterministic. They do not create a culture of ritualized, cargo-cult operations.
In short: Kubernetes is not bad because it leaks. It is bad because it packages leakiness into a reward system that encourages complexity multiplication. A modern enterprise anti-pattern. That is what makes operational mental models collapse.
Security-Critical Complexity Should Stay in Software
Security teams often push tenant separation "down the stack", reasoning that the lower the isolation boundary, the stronger the guarantee.
Theoretically, kernel or VM isolation is stronger than application-level multi-tenancy. But moving tenant separation into infrastructure has a hidden cost: it moves complexity from a domain where it can be reasoned about into a domain where it cannot.
In software:
- each moving part is explicit, testable, and observable
- tenancy and authorization can be reasoned about in isolation
- failures and bugs are deterministic and debuggable
In infrastructure:
- tenants are isolated via namespaces, clusters, IAM policies, network segments, or replicated stacks
- configuration multiplies: multiple deployments, secrets, certificates, storage backends
- the system is emergent, opaque, and probabilistic
- operators rarely maintain a complete mental model
Effectively, what was multi-tenant logic in the application becomes manual instantiation at the infrastructure layer. You duplicate deployments instead of enforcing tenancy through code. You didn't reduce complexity, you multiplied it.
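For contrast, here is a minimal sketch of tenancy enforced in code (all names invented): one deployment, with the tenant boundary expressed as a single explicit rule that can be unit-tested and audited, instead of N parallel stacks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Principal:
    user_id: str
    tenant_id: str

@dataclass(frozen=True)
class Document:
    doc_id: str
    tenant_id: str
    body: str

class TenantIsolationError(Exception):
    """Raised on any attempted cross-tenant access."""

def read_document(principal: Principal, doc: Document) -> str:
    # The entire tenant boundary is this one deterministic check. It is
    # explicit, versioned with the code, and testable in isolation.
    if principal.tenant_id != doc.tenant_id:
        raise TenantIsolationError(f"{principal.user_id} may not read {doc.doc_id}")
    return doc.body

# The security property itself has a deterministic test:
doc = Document("d1", tenant_id="acme", body="secret")
assert read_document(Principal("alice", "acme"), doc) == "secret"
try:
    read_document(Principal("eve", "globex"), doc)
    raise AssertionError("cross-tenant read must fail")
except TenantIsolationError:
    pass
```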
Isolation feels safe because it is concrete: separate clusters, namespaces, and policies. But the operational reality is that more moving parts mean more misconfiguration, drift, and inconsistency. Security guarantees become probabilistic, not deterministic.
Not All Infrastructure Separation Is Bad
This is not to say infrastructure-level separation is inherently wrong. VMs, containers, and namespaces exist for a reason: they provide bounded, low-complexity isolation that allows multiple workloads to coexist without interfering with each other.
The key distinction is purpose and complexity. Infrastructure isolation should be:
- low-complexity
- bounded
- a foundation for software, not a substitute for it
High-complexity separation, such as replicating entire stacks to enforce business or tenant logic, moves reasoning out of software, where each component can be understood, and into infrastructure, where few humans can form a complete mental model. Infrastructure isolation is most effective when it reduces cognitive load, not when it multiplies it.
Infrastructure Should Be Boring and Opinionated
The combined effect of these dynamics is profound:
- Complexity that could be managed in software, in isolation, is moved into infrastructure
- Infrastructure is inherently emergent, leaky, and opaque
- Operators cannot reliably reason about it
- Systems may "run", and "good enough" often suffices, until failure or security incidents reveal hidden coupling
Strong abstractions are not inherently bad. Infrastructure isolation is not inherently bad. But security-critical and operationally significant complexity should remain where it can be understood: in software, in code, in tests, in explicit contracts, not hidden in provisioning, wiring, or duplicated environments.
The Real KISS Principle for Infrastructure
KISS in infrastructure is not fewer YAML lines. It is:
Can a human understand how this system is wired without drawing a graph on a whiteboard?
If the answer is no, the abstraction has failed. The system is already broken, even if it is currently running.
Infrastructure is the foundation of reliability. It should optimize for:
- predictability
- debuggability
- minimal topology decisions
- minimal required configuration
- strong defaults
Not maximal flexibility.
Not infinite composability.
Not theoretical generality.
Complexity belongs in software logic, where it can be controlled, tested, reasoned about, and versioned safely.
Infrastructure should remove choices, not expose them. Infrastructure should be a foundation for software, not a substitute for it.