Running AI Agents Safely Inside Kubernetes

by Nimesha Jinarajadasa
Nimesha Jinarajadasa is a DevOps & Cloud Consultant, K8s expert, and instructional content strategist, crafting hands-on learning experiences in DevOps, Kubernetes, and platform engineering.
Last updated: June 25, 2026
12 min read

Kubernetes runs in production at a large majority of cloud native organizations, according to the CNCF 2024 Annual Survey, and the newest workload joining those clusters is the autonomous AI agent. An agent that calls tools, browses the web, and acts on user instructions creates a security problem the original Kubernetes threat model never anticipated. Compromising an agent is not equivalent to compromising a stateless web service, because the attacker inherits the agent's tool surface, its credentials, and its authority to act on behalf of the user. This piece is a working engineer's guide to running agents on Kubernetes without giving away the cluster.

Highlights

AI agents change the Kubernetes threat model in a fundamental way.
Network egress is the single highest impact control in the entire stack.
Pod Security Admission with the restricted profile is the new baseline.
A sandboxed runtime is required for any agent that executes generated code.
Each MCP server should run as its own separately privileged process.
Every prompt, model response, and tool call must be captured.
Short lived credentials beat static secrets every time.

Why AI Agents Need a Different Security Model

A typical Kubernetes workload runs deterministic code. An NGINX pod serves HTTP, a Postgres pod answers SQL, a Go service handles a finite set of routes. The blast radius of a compromise is bounded by what the application was designed to do, and the security work mostly involves limiting that radius further.

An AI agent inverts that assumption. It receives instructions in natural language, decides which tools to call, and then calls them. The control flow is generated at runtime, not at build time. A user typing "summarize my open Jira tickets" might cause the agent to call the Jira API, retrieve issues, reason over them with an LLM, and write a summary. The same agent given a malicious instruction, planted inside a webpage it browses or a document it reads, might be coerced into calling an internal banking API or sending the contents of memory to an attacker controlled endpoint.

That combination of broad capability and instruction driven control flow is what makes traditional pod security insufficient on its own. You have to treat every agent as a process that is partially under the attacker's control whenever it ingests external content, even if your own users are entirely benign.

The Threat Model for Agentic Workloads

Before designing controls, name the attacks you are defending against. The most common categories in production are:

Prompt injection from external content. The agent reads a webpage, email, or PDF that contains attacker authored instructions, and those instructions hijack its next actions. The OWASP GenAI Security Project lists this as risk LLM01.
Tool abuse. A legitimate tool is invoked for an unintended purpose. A file read tool intended for project files gets pointed at /etc/shadow. A shell tool intended for a sandbox runs curl to ship secrets out.
Credential exposure through prompt extraction. The system prompt or an environment variable containing API keys ends up in the model's response because the user, or a prompt injection, asks for it.
Resource exhaustion. The agent enters a loop, calls itself recursively, or hits a tool that returns large output. Without resource limits, the pod consumes the node.
Supply chain attacks. A compromised base image, a malicious MCP server, or tampered model weights ship into production unnoticed.
Lateral movement. A breached agent uses its service account or network position to reach other pods, the API server, or cloud metadata endpoints.

Each control discussed below maps back to one or more of these. Keep the list in mind as you read.

Cluster Level Isolation

One Namespace per Tenant or per Agent Class

Use namespaces as the primary policy boundary. Each agent class (or each customer tenant for a multi tenant SaaS agent) gets its own namespace. The namespace becomes the unit for RBAC, NetworkPolicy, ResourceQuota, LimitRange, and Pod Security Admission. If an agent in agents-finance is compromised, those namespace scoped policies give you a clear boundary that prevents trivial movement into agents-engineering.
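
As a sketch of what per namespace policy can look like, the ResourceQuota below caps the total footprint of one agent class. The object name and all of the numbers are placeholders chosen for illustration, not recommendations.

```yaml
# Illustrative only: the name and every value here are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: agents-finance-quota
  namespace: agents-finance
spec:
  hard:
    pods: "20"              # upper bound on concurrently running agent pods
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
```

Once a quota covers limits.cpu or limits.memory, the API server rejects pods in that namespace that do not declare those limits, so pair it with a LimitRange that supplies defaults.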

Default Deny NetworkPolicy

Kubernetes NetworkPolicy is opt in: until a policy selects a pod, that pod can reach, and be reached by, every other pod in the cluster. That is the wrong default for agents. Apply a default deny for both ingress and egress, then explicitly allow only the traffic each pod needs. Policies take effect only when the cluster's CNI plugin enforces them, so confirm that yours does.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: agents-prod
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress

Then add a narrow allow for DNS and for the egress proxy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-proxy
  namespace: agents-prod
spec:
  podSelector:
    matchLabels:
      app: agent
  policyTypes:
  - Egress
  egress:
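  - to:
    # kubernetes.io/metadata.name is set automatically on every namespace;
    # a bare "name" label only exists if someone added it by hand.
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      # Assumes the cluster DNS pods carry the common k8s-app: kube-dns label.
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    # DNS falls back to TCP for large responses.
    - protocol: TCP
      port: 53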
  - to:
    - podSelector:
        matchLabels:
          app: egress-proxy
    ports:
    - protocol: TCP
      port: 3128

Egress Through a Forward Proxy

NetworkPolicy operates on IP and port. For domains, you need a layer 7 proxy. Run a forward proxy such as Envoy or Squid with an explicit allowlist of approved domains, for example api.openai.com, api.anthropic.com, your internal tool services, and nothing else. All agent egress flows through it. This gives you domain level control, audit logs of every outbound request, a single point for rate limits, and a place to enforce request size caps.

This control alone eliminates a large fraction of prompt injection driven exfiltration, because even a fully coerced agent cannot send data to an attacker controlled host if that host is not in the allowlist.
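
A minimal sketch of such an allowlist, assuming Squid is the proxy and its configuration is delivered through a ConfigMap. The ConfigMap name is hypothetical, and the two domains are the examples from the text above; a real list would also include your internal tool services.

```yaml
# Sketch only: assumes a Squid based egress proxy listening on 3128.
apiVersion: v1
kind: ConfigMap
metadata:
  name: egress-proxy-config      # hypothetical name
  namespace: agents-prod
data:
  squid.conf: |
    http_port 3128
    # Domains the agents may reach; everything else is refused.
    acl allowed_domains dstdomain api.openai.com api.anthropic.com
    # Only HTTPS destinations are permitted.
    acl allowed_ports port 443
    http_access deny !allowed_ports
    http_access allow allowed_domains
    http_access deny all
```

The agent container still has to be told to use the proxy, typically through its HTTP client configuration or the conventional HTTPS_PROXY setting. The default deny NetworkPolicy is what makes the proxy mandatory rather than advisory.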

Pod Level Hardening

Enforce Pod Security Admission, Restricted Profile

Set the restricted PSA profile on every agent namespace. Apply it at the namespace level:

apiVersion: v1
kind: Namespace
metadata:
  name: agents-prod
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest

The restricted profile requires non root execution, a seccomp profile of RuntimeDefault or Localhost, no privilege escalation, no host namespaces, no hostPath volumes, and all capabilities dropped (only NET_BIND_SERVICE may be added back). It does not require a read only root filesystem, so set readOnlyRootFilesystem: true yourself in each container's securityContext. These are the table stakes that turn a typical container breakout into a much harder problem.
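
For reference, a pod that is admitted under the restricted profile might look like the sketch below. The pod name, image, and user ID are placeholders, and readOnlyRootFilesystem is set explicitly as additional hardening, with an emptyDir for scratch space.

```yaml
# Sketch only: name, image, and UID are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: agent
  namespace: agents-prod
  labels:
    app: agent
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: agent
    image: registry.example.com/agent:1.0.0   # placeholder image
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
    volumeMounts:
    - name: tmp
      mountPath: /tmp        # writable scratch space for the agent process
  volumes:
  - name: tmp
    emptyDir: {}
```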

Use a Sandboxed Container Runtime

Default runtimes like runc share the host kernel. A kernel exploit from inside an agent container reaches the node. For any agent that runs generated code, processes untrusted documents, or executes tools that take user input as arguments, add a second isolation boundary.

Pick the right runtime for the workload:

Runtime	Isolation Model	Performance Cost	Best Suited For
runc	Shared kernel, namespaces and cgroups	None	Trusted internal agents with no code execution
gVisor	Userspace kernel intercepting syscalls	10 to 30 percent on syscall heavy workloads	Multi tenant SaaS agents with API only tools
Kata Containers	Lightweight VM per pod	5 to 15 percent memory plus slower startup	Agents running shell, Python, or other code tools
Firecracker via Kata	MicroVM	Comparable to Kata, lower memory footprint	High density untrusted code execution
Confidential Containers (TDX, SEV SNP)	Hardware enclave	Higher and workload dependent	Regulated data with distrust of host

Apply them through RuntimeClass:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc

Then reference the class in the agent pod spec.
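
A minimal sketch of that reference, assuming the gvisor RuntimeClass above exists and the nodes have the runsc handler installed. The pod name and image are placeholders, and the securityContext settings are omitted here for brevity.

```yaml
# Sketch only: a real agent pod would also carry its hardening settings.
apiVersion: v1
kind: Pod
metadata:
  name: agent-sandboxed
  namespace: agents-prod
spec:
  runtimeClassName: gvisor     # must match the RuntimeClass metadata.name
  containers:
  - name: agent
    image: registry.example.com/agent:1.0.0   # placeholder image
```

If no node can satisfy the requested handler, the pod fails to start rather than silently falling back to runc, which is the behavior you want for untrusted workloads.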

Always Set Resource Limits

Agents can loop. A reasoning loop that fails to terminate, a tool that returns a multi gigabyte response, or a runaway batch job can take down a node within minutes if memory and CPU are not bounded. Set both requests and limits on every container, and use LimitRange at the namespace level so no one accidentally deploys without them.
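
One way to set those namespace defaults is sketched below. The object name and every value are illustrative assumptions; size them from what your agents actually consume.

```yaml
# Sketch only: values are placeholders, not sizing guidance.
apiVersion: v1
kind: LimitRange
metadata:
  name: agent-defaults
  namespace: agents-prod
spec:
  limits:
  - type: Container
    defaultRequest:          # applied when a container sets no requests
      cpu: 250m
      memory: 256Mi
    default:                 # applied when a container sets no limits
      cpu: "1"
      memory: 1Gi
    max:                     # hard ceiling for any single container
      cpu: "2"
      memory: 4Gi
```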

Secret Management for Agents

API keys for LLM providers and tool credentials are an agent's most sensitive assets. The rules are well known but consistently broken:

Never bake credentials into images. Use External Secrets Operator, Vault Agent Injector, or a CSI Secret Store provider to pull secrets from your central store (Vault, AWS Secrets Manager, GCP Secret Manager) at runtime.

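Project secrets as files rather than environment variables. Environment variables leak through env, through accidental logging, through process listings, and through crash dumps. A file mounted read only with tight permissions avoids those channels, although any tool that can read arbitrary paths can still reach it, so scope file access as well.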

Prefer short lived identities to long lived secrets. IRSA on EKS, Workload Identity on GKE, Microsoft Entra Workload ID (formerly Azure AD Workload Identity) on AKS, and SPIFFE through service mesh let the pod authenticate as itself without ever holding a permanent token. A leaked token from a compromised pod expires in minutes or hours rather than months.
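
The file projection rule can be sketched as follows, assuming a Secret named llm-api-key already exists in the namespace (for example, synced there by External Secrets Operator). The Secret name, mount path, image, and IDs are all hypothetical.

```yaml
# Sketch only: Secret name, path, and IDs are illustrative assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: agent-with-file-secret
  namespace: agents-prod
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001
    fsGroup: 10001           # lets the non root user read the mounted files
  containers:
  - name: agent
    image: registry.example.com/agent:1.0.0   # placeholder image
    volumeMounts:
    - name: llm-credentials
      mountPath: /var/run/secrets/llm
      readOnly: true
  volumes:
  - name: llm-credentials
    secret:
      secretName: llm-api-key
      defaultMode: 0440      # owner and group read only, no world access
```

The application then reads the key from a file under /var/run/secrets/llm at startup instead of from its environment.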

Boundaries on Tool Calls

This is where most teams under invest. You can lock down the pod perfectly and still get owned through tools.

Run Each MCP Server as Its Own Pod

The Model Context Protocol exposes tools to agents through a defined RPC interface. Each MCP server should run in its own pod with the minimum scoped RBAC needed for its specific tools. A filesystem MCP server runs with read only access to a specific PVC. A Jira MCP server holds the Jira credential and exposes only the verbs you want available. A Kubernetes MCP server, if you allow one at all, has a service account scoped to a particular namespace with a strict Role.

The agent pod itself holds no tool credentials. It talks to MCP servers over the cluster network, and each MCP server enforces its own authorization. This pattern is the AI equivalent of running each microservice as its own user, and it pays off the first time an agent is coerced into asking for something it should not have.
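
As an illustration of the scoping described above, the manifests below give a hypothetical Kubernetes MCP server read only access to pods and their logs in a single namespace, while the agent's own service account gets no API token at all. All names are assumptions made for the example.

```yaml
# Sketch only: every name below is hypothetical.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: agent
  namespace: agents-prod
automountServiceAccountToken: false   # the agent pod holds no API credential
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: k8s-mcp-server
  namespace: agents-prod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: k8s-mcp-read-only
  namespace: agents-prod
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list"]              # no create, delete, exec, or secrets
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: k8s-mcp-read-only
  namespace: agents-prod
subjects:
- kind: ServiceAccount
  name: k8s-mcp-server
  namespace: agents-prod
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: k8s-mcp-read-only
```

Because this is a Role rather than a ClusterRole, a coerced agent that asks the MCP server for resources in another namespace is refused by the API server, not by the model's good behavior.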

Require Approval for Destructive Tools

For any tool that deletes, sends, writes, or transacts, build a two step pattern. The agent proposes the action, and a policy engine, a human, or both approve it before execution. This is straightforward for chat style agents where a user is in the loop, and it scales surprisingly well for autonomous workflows when you tier approvals by blast radius.

Validate Every Generated Action

Treat any string produced by an LLM as untrusted user input. If the agent generates a kubectl command, do not exec it directly. Parse it, check the verb and resource against an allowlist, and reject anything outside the allowlist. The same applies to SQL, shell commands, file paths, and URLs. The pattern is identical to how you would handle input from an untrusted REST client.

Observability and Audit

You cannot defend what you cannot see. Capture, at minimum:

Every prompt the agent received, including system prompts and any retrieved context.
Every tool call with its arguments.
Every model response.
Resource usage per session.
Anomalies such as a sudden spike in tool calls per minute, calls to a destination not seen before, or unusually large responses.

Useful tools in the cloud native stack include OpenTelemetry for distributed tracing across the agent and its MCP servers, Falco for runtime detection of suspicious syscalls, Loki and Prometheus for logs and metrics, and the Kubernetes audit log for control plane access. Pipe everything to a SIEM or central observability platform so a single query can answer "what did this agent do in the last hour."
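
To make the runtime detection piece concrete, here is a sketch of a custom Falco rule that flags network utilities starting inside agent pods. It assumes Falco's default rules file is loaded first, since that is where the spawned_process and container macros come from, and the list of process names is only an example.

```yaml
# Sketch only: assumes the default Falco macros are available.
- rule: Network tool launched in agent pod
  desc: A network utility started inside a pod in the agents-prod namespace
  condition: >
    spawned_process and container
    and k8s.ns.name = "agents-prod"
    and proc.name in (curl, wget, nc, ncat)
  output: >
    Unexpected network tool in agent pod
    (pod=%k8s.pod.name container=%container.name command=%proc.cmdline)
  priority: WARNING
  tags: [agents, exfiltration]
```

An agent image that legitimately ships one of these binaries would need an exception, which is a reasonable argument for not shipping them.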

A Reference Architecture

Putting it together, a defensible setup looks like this:

An agents-prod namespace with PSA set to restricted and a default deny NetworkPolicy.
Cilium or Calico as the CNI to enforce that policy and provide identity aware controls.
A forward proxy pod (Envoy is a good choice) with an explicit domain allowlist and structured logging of every request.
Agent pods running on a gvisor RuntimeClass, with a separate kata class available for the subset of agents that run code execution tools.
One MCP server pod per tool category, each with its own service account and Role.
External Secrets Operator pulling LLM and tool credentials from Vault, mounted as files.
OPA Gatekeeper or Kyverno enforcing admission policy so a pod without resource limits or a sandboxed runtime never deploys.
Falco running as a DaemonSet for runtime detection.
An OpenTelemetry collector forwarding traces and logs to your APM and SIEM.
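
The admission policy piece could be sketched with Kyverno as shown below. This assumes Kyverno is installed and that the RuntimeClasses are named gvisor and kata; the policy name is hypothetical, and newer Kyverno releases prefer a per rule failure action over the policy level validationFailureAction field used here.

```yaml
# Sketch only: policy name and RuntimeClass names are assumptions.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: agents-require-sandbox-and-limits
spec:
  validationFailureAction: Enforce
  background: true
  rules:
  - name: require-sandboxed-runtime
    match:
      any:
      - resources:
          kinds: ["Pod"]
          namespaces: ["agents-prod"]
    validate:
      message: "Agent pods must set runtimeClassName to gvisor or kata."
      pattern:
        spec:
          runtimeClassName: "gvisor | kata"
  - name: require-resource-limits
    match:
      any:
      - resources:
          kinds: ["Pod"]
          namespaces: ["agents-prod"]
    validate:
      message: "Every container must declare CPU and memory limits."
      pattern:
        spec:
          containers:
          - resources:
              limits:
                cpu: "?*"
                memory: "?*"
```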

This is not exotic infrastructure. Every component listed is a mature open source project or a standard cloud feature. The work is in wiring it together, not in inventing anything new.

Common Pitfalls

Five mistakes show up over and over in incident reviews.

Leaving DNS wide open. A NetworkPolicy that allows DNS to any resolver effectively allows egress to anywhere through DNS tunneling. Restrict DNS to the cluster's CoreDNS service and to nothing else.

Storing LLM API keys as environment variables. Every error trace, every crash dump, and every env command an agent runs through a debug tool call becomes an exfiltration channel. File mounts are not optional.

Granting cluster admin "just to ship it." The agent service account ends up with permissions far beyond what its tools need. Audit RBAC on day one and treat broad permissions as a release blocker.

Skipping resource limits. One runaway agent, one node down. The math is that simple.

Trusting tool descriptions as immutable. The agent reads them. If an attacker can edit the description of a tool through some indirect path, they can hijack the agent's behavior even without touching the prompt directly. Treat tool descriptions as code and review changes to them with the same rigor.

What Comes Next

The trajectory through 2026 and beyond is clear. WebAssembly based runtimes (WasmEdge, wasmCloud, Spin on Kubernetes) are starting to offer near VM isolation with container like startup, which is attractive for short lived agent invocations. Confidential containers using Intel TDX and AMD SEV SNP are arriving in mainstream managed Kubernetes, letting you run agents on infrastructure you do not fully trust without exposing memory contents. Service mesh and Gateway API are converging on per call authorization, making it practical to apply policy at every hop the agent makes. Expect agent specific operators that bundle the patterns in this article into a single CRD within the next few release cycles.

Running AI agents on Kubernetes is not categorically different from running any other workload with a wide blast radius. The principles you already apply all hold: assume compromise, minimize what each process can do, and validate everything that crosses a trust boundary. What changes is the threat model: the agent's code path is generated at runtime from input that an attacker may control. Lock down the network at the egress, isolate at the pod and runtime layers, separate tools into their own least privileged processes, and capture everything for audit. Build the setup once, codify it as policy, and your next agent rollout becomes a configuration change instead of a security review.

FAQs

Q1: What do I need to know before running AI agents in Kubernetes?

You need solid working knowledge of core Kubernetes objects, including Pods, Deployments, Services, Namespaces, and RBAC, along with familiarity with NetworkPolicy and how secrets are mounted into pods. On the AI side, you should understand what an LLM call looks like over HTTP, what the Model Context Protocol does, and what tool calling means in agent frameworks such as LangChain, LlamaIndex, or the OpenAI Agents SDK. You do not need to be a machine learning researcher. The hard parts of running agents in production are operational, not algorithmic, and they map closely to skills you already have if you have shipped microservices. For a structured path through the cluster level skills you will rely on daily, the Certified Kubernetes Administrator (CKA) Course covers the operational foundations end to end.

Q2: How is securing an AI agent different from securing a regular microservice?

A regular microservice has a fixed code path. Its inputs are constrained by a schema, its outputs are deterministic, and its blast radius is well understood at build time. An AI agent generates its control flow at runtime based on a prompt, which means a single piece of attacker controlled text inside a webpage, email, or document the agent reads can change what tools it calls and what data it touches. You are not just protecting the code path. You are protecting the agent from inputs that can rewrite its code path entirely. That is why egress control, tool approval, and output validation sit on top of standard pod hardening rather than replacing it.

Q3: Should I use gVisor or Kata Containers for AI agent isolation?

The right choice depends on what the agent actually does. If the agent only calls external APIs and does not execute generated code, gVisor is usually enough, because it intercepts syscalls in a userspace kernel so that most kernel exploits never reach the host kernel directly, and the performance overhead on typical API workloads stays manageable. If the agent runs generated code through a Python sandbox, a shell tool, or any tool with general computational capability, Kata Containers or Firecracker through Kata provide VM isolation backed by hardware virtualization, which is the safer default. Many production teams run both side by side and route different workload classes to different runtimes through Kubernetes RuntimeClass.

Q4: How do I handle LLM API keys safely in Kubernetes?

Never bake API keys into container images. Store them in an external secret store such as HashiCorp Vault, AWS Secrets Manager, or GCP Secret Manager, and pull them into the cluster with External Secrets Operator or a CSI Secret Store provider. Mount them as files rather than environment variables, because environment variables leak through process listings, crash dumps, and debug tools in ways that files do not. Where possible, replace static keys with short lived credentials issued by your cloud identity provider or a service mesh that injects mTLS certificates, so the agent never holds a long lived token at all. For deeper, hands on practice with these patterns, the HashiCorp Vault Course walks through production grade secret workflows step by step.

Q5: Can I run AI agents on managed Kubernetes services like EKS, GKE, or AKS?

Yes, and most teams do exactly this. Managed services give you a hardened control plane, integrated cloud identity through IRSA on EKS, Workload Identity on GKE, and Microsoft Entra Workload ID (formerly Azure AD Workload Identity) on AKS, and easy access to GPU node pools if you self host models. The tradeoff is that some advanced features take more setup. For example, sandboxed runtimes like gVisor are available as a runtime class on GKE but require manual installation on EKS, and confidential nodes are available on GKE and AKS today and on EKS through specific Nitro instance families. The patterns in this article apply across all three providers with only minor configuration differences. The Certified Kubernetes Security Specialist (CKS) Course covers the cluster hardening skills that translate directly across EKS, GKE, and AKS.

Q6: What is the single biggest mistake teams make on their first production AI agent?

Skipping egress control. Teams tighten RBAC, set resource limits, and apply Pod Security Admission, then deploy an agent that can still reach any IP on the public internet because no NetworkPolicy or forward proxy stands in front of it. A single successful prompt injection that tricks the agent into hitting an attacker controlled endpoint can exfiltrate everything the agent has read in the conversation so far, including credentials it may have observed in tool responses. The fix is small in terms of configuration effort and very large in terms of risk reduction. Apply default deny egress and route every outbound call through a forward proxy with a tight domain allowlist before the first agent ever serves a user request.
