The Tooling Is the Threat
Editorial Series | Sugau — Private AI & Cloud Repatriation
Wiz Research analysed cloud intrusions across 2025 and landed on a number that should make every CTO uncomfortable: 80% of breaches came from known risks — misconfigurations, exposed secrets, overprivileged identities. Not zero-days. Not nation-state sophistication. Basic operational failures.
The security industry read that stat as reassuring. I read it as a warning dressed up as good news.
Because the definition of “known risks” just quietly expanded — and most organisations haven’t noticed yet.
A new tooling layer appeared. Nobody audited it.
In the last 18 months, your engineering teams deployed an entire tooling layer that didn’t exist in any meaningful production context two years ago:
- AI coding assistants writing and committing code at scale
- Agentic frameworks — LangChain, AutoGen, CrewAI — pulling dependencies and executing tasks with broad permissions
- MCP servers: a brand-new protocol, zero mature security tooling, wide open by default
- LLM inference endpoints spun up fast, often unauthenticated
- Vector databases bolted onto RAG pipelines with minimal access control design
- Terraform state files storing AI API keys in plain text, sitting in shared S3 buckets
Every single one of these categories is young. Fast-moving. Under-audited. And none of them have the battle-hardened security track record of nginx, PostgreSQL, or the Linux kernel.
They were built to be capable first. Security came second — if it came at all.
The blast radius is bigger than anything that came before
A compromised web server is bad. A compromised AI agent is categorically worse.
These systems are designed with broad permissions by default. They have API keys to your codebase, your data stores, your communication platforms. They can read, write, and exfiltrate — at machine speed — before any human analyst sees an anomaly in a SIEM dashboard.
The Cloud Security Alliance put it plainly in their 2026 state of the industry report: machine identities now outnumber human identities 100-to-1 in mature cloud environments. Most of them are over-privileged. Many are unused. Almost none have automatic expiry.
You are no longer defending human logins. You are governing a non-human perimeter that your security team did not design, does not fully understand, and cannot see completely.
Vibe coding made it worse
There is now a name for something that has been happening quietly for over a year: vibe coding — developers using generative AI to source or write code snippets quickly, with minimal technical scrutiny of what they are actually integrating.
The CSA calls the output “slop code.” I call it a supply chain problem that scales at the speed of a tab-complete.
When a developer is less familiar with AI-written code than with code they wrote themselves, they cannot reason about its security properties. Hidden vulnerabilities enter production not through sophisticated attack, but through enthusiasm and deadline pressure.
Treat every AI-generated snippet and every third-party library as an untrusted component. Not because AI is malicious. Because it is young, and young software has bugs.
The IaC state file nobody is talking about
Here is a specific vector that deserves more attention than it gets.
Terraform — and infrastructure-as-code tooling broadly — stores a comprehensive map of your environment in state files. Resource definitions, attribute values, dependency graphs. Any credential that passes through a resource attribute is written to that state file in plain text. Marking it sensitive only hides it from CLI output; the value still lands in the state.
Where do those state files live? In shared cloud storage buckets, accessible to your entire infrastructure team. Possibly to anyone with the right IAM misconfiguration — which, per Wiz, covers a significant portion of cloud environments.
An attacker who finds your Terraform state file does not need to compromise your application. They have your database passwords, your AI API keys, your cloud provider credentials. All of it. In one file. Formatted as JSON.
This is not theoretical. It is the documented root cause of consequential breaches in 2025.
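Checking for it takes minutes. The sketch below is a first pass, assuming the version-4 state layout (top-level resources, instances, attributes) and a deliberately small, illustrative set of key-name and value patterns; swap in whatever matches your own providers and secret formats.

```python
"""Minimal sketch: flag plaintext credential candidates in a Terraform state file.

Assumes the version-4 state layout ("resources" -> "instances" -> "attributes").
The patterns below are illustrative, not exhaustive.
"""
import json
import re
import sys

# Attribute names that usually carry credentials.
SUSPECT_KEYS = re.compile(r"(password|secret|token|api_?key|private_key)", re.I)
# Value shapes that look like real keys: AWS access key IDs, PEM private keys.
SUSPECT_VALUES = re.compile(r"AKIA[0-9A-Z]{16}|-----BEGIN [A-Z ]*PRIVATE KEY-----")

def walk(obj, path=""):
    """Yield the path of every string attribute that looks like a credential."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(value, str) and (SUSPECT_KEYS.search(key) or SUSPECT_VALUES.search(value)):
                yield f"{path}.{key}"
            else:
                yield from walk(value, f"{path}.{key}")
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            yield from walk(value, f"{path}[{i}]")

def scan(state_path):
    """Return the addresses of suspect attributes found in the state file."""
    with open(state_path) as fh:
        state = json.load(fh)
    findings = []
    for resource in state.get("resources", []):
        address = f"{resource.get('type', '?')}.{resource.get('name', '?')}"
        for instance in resource.get("instances", []):
            findings.extend(walk(instance.get("attributes") or {}, address))
    return sorted(set(findings))

if __name__ == "__main__":
    hits = scan(sys.argv[1] if len(sys.argv) > 1 else "terraform.tfstate")
    for hit in hits:
        print(f"plaintext credential candidate: {hit}")
    sys.exit(1 if hits else 0)
```

Wire it into a pre-commit hook or a CI job on any repository that touches state. The scanner does not need to be perfect. It needs to exist.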
The private infrastructure answer
The honest argument for keeping AI workloads on private, bare-metal infrastructure is not paranoia. It is surface area reduction.
When you control the hardware, you control what runs. When you control what runs, you can audit it. When you can audit it, you know what has network access, what holds credentials, what can exfiltrate data, and what cannot.
Cloud environments are elastic by design — and that elasticity is precisely what makes them hard to secure. New services spin up. Integrations are added. Permissions drift. AI tooling gets bolted on because a team lead saw a demo and approved it by Slack message on a Friday afternoon.
On private infrastructure with a disciplined GitOps posture, that Friday afternoon deployment does not happen silently. Every change is reviewed. Every service has a declared reason to exist.
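“Every service has a declared reason to exist” can be enforced mechanically rather than by policy document. Here is a minimal sketch using the official Kubernetes Python client; the annotation keys are hypothetical placeholders for whatever your own GitOps convention attaches to every workload.

```python
"""Minimal sketch: flag workloads with no declared owner or change reference.

Uses the official Kubernetes Python client (pip install kubernetes). The
annotation keys are hypothetical placeholders for your own convention.
"""
from kubernetes import client, config

REQUIRED_ANNOTATIONS = ("sugau.io/owner", "sugau.io/change-ref")  # placeholders

def undeclared_deployments():
    """Yield (namespace/name, missing annotations) for unaccounted-for workloads."""
    config.load_kube_config()  # use config.load_incluster_config() inside the cluster
    apps = client.AppsV1Api()
    for dep in apps.list_deployment_for_all_namespaces().items:
        annotations = dep.metadata.annotations or {}
        missing = [key for key in REQUIRED_ANNOTATIONS if key not in annotations]
        if missing:
            yield f"{dep.metadata.namespace}/{dep.metadata.name}", missing

if __name__ == "__main__":
    for name, missing in undeclared_deployments():
        print(f"{name}: missing {', '.join(missing)}")
```

Anything the script flags either gets an owner or gets removed.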
The CSA data shows the “toxic cloud trilogy” — workloads that are simultaneously publicly accessible, critically vulnerable, and highly privileged — still affects 29% of cloud environments as of mid-2025. That is nearly one in three.
Your private AI deployment does not have to be one of them.
What CTOs should ask this week
- Do you know every AI tool your engineering team integrated in the last 12 months?
- Are your Terraform state files scanned for plaintext credentials before they are committed?
- Do your LLM inference endpoints require authentication? (A quick probe is sketched below.)
- Are your agentic framework service accounts scoped to least privilege?
- How quickly would you detect an AI agent exfiltrating data at machine speed?
If the answer to any of those is “I’m not sure” — the tooling layer is already ahead of your security posture.
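The third question on that list can be answered this afternoon. A minimal probe, with hypothetical internal URLs and assuming an OpenAI-compatible surface where /v1/models lists the available models, is enough to find endpoints that answer without credentials.

```python
"""Minimal sketch: check whether inference endpoints answer unauthenticated requests.

The URLs are hypothetical; the probe assumes an OpenAI-compatible surface
where GET /v1/models lists the available models.
"""
import urllib.error
import urllib.request

ENDPOINTS = [  # replace with your own inventory
    "http://llm-gateway.internal:8000/v1/models",
    "http://rag-embeddings.internal:8080/v1/models",
]

def probe(url, timeout=5):
    """Return the HTTP status for an unauthenticated GET, or None on network error."""
    try:
        with urllib.request.urlopen(urllib.request.Request(url), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # 401/403 means authentication is at least being enforced
    except urllib.error.URLError:
        return None

if __name__ == "__main__":
    for url in ENDPOINTS:
        status = probe(url)
        if status == 200:
            print(f"OPEN       {url} answered without credentials")
        elif status in (401, 403):
            print(f"PROTECTED  {url} rejected the unauthenticated request")
        else:
            print(f"UNREACHABLE  {url} (status={status})")
```

Anything that prints OPEN is usable by whatever can reach it on the network, human or not.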
The next piece in this series goes into the technical controls: how to lock down inference endpoints, scope non-human identities correctly, and implement secret scanning that catches what your developers miss.
Catalin Lichi is the founder of Sugau — a bare-metal Kubernetes and private AI consultancy helping organisations repatriate workloads and run sovereign AI infrastructure.