
BerriAI has published the LiteLLM Agent Platform, an open-source, self-hosted layer for operating multiple AI agents in production on Kubernetes. It addresses the scaling issues that appear when projects grow beyond single-process scripts. The platform's core value is provisioning isolated, persistent sandboxes per team and per context so that agents keep session history and tool outputs across restarts, a practical fix for the lost context and state typical of pod-based deployments.
The project bundles a Next.js dashboard for LiteLLM v2 managed agents, a worker process for asynchronous agent tasks, and Postgres as the backing store. The codebase is predominantly TypeScript (92.8%), with Shell provisioning scripts, a Dockerfile and CSS for the UI. On startup a schema migration runs as an init container; the web process serves the dashboard on port 3000 while the worker handles runtime tasks.
For sandboxing, LiteLLM Agent Platform relies on the kubernetes-sigs/agent-sandbox CustomResourceDefinition to provision isolated runtimes on Kubernetes, and it supports local development with kind (Kubernetes in Docker). The repository includes a harnesses/opencode collection for running coding agents (the examples cited are Claude Code and OpenAI Codex) inside per-session sandboxes, and a litellm-agent-runtime repository that contains runtime logic configurable via harness configuration or a hydrate payload.
Developer onboarding is designed for local testing without cloud credentials, but it requires Docker Desktop, kind, kubectl, helm and a LiteLLM gateway. The quickstart is two commands: bin/kind-up.sh (idempotent; it provisions a kind cluster named agent-sbx, installs the agent-sandbox controller and loads the harness image) and docker compose up (which boots Postgres, runs the migration and starts the web and worker processes).
The platform explicitly targets two operational problems. First, agents are stateful: session history, tool outputs and intermediate reasoning can be lost if pods crash or are replaced. Second, teams often need different runtimes, tools, secrets and access scopes, which rules out a single shared container. LiteLLM exposes per-team and per-context sandboxes and session-continuity primitives to address those needs.
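The session-continuity idea can be illustrated with a minimal TypeScript sketch. It is not the platform's code: SandboxKey, SessionStore and AgentProcess are hypothetical names, and an in-memory object stands in for a durable store such as Postgres. The point is only that state is keyed by team and context and lives outside the agent process, so a replacement process can reload it.

```typescript
// Hypothetical types for illustration; none of these names come from the project.
type SandboxKey = { team: string; context: string };

interface SessionEvent {
  role: "user" | "agent" | "tool";
  content: string;
}

// Stand-in for a durable store. In a real deployment this would be backed by
// a database so that it outlives any single pod.
class SessionStore {
  private sessions: Record<string, SessionEvent[]> = {};

  // Serialize the pair so ("a", "b/c") and ("a/b", "c") cannot collide.
  private static id(key: SandboxKey): string {
    return JSON.stringify([key.team, key.context]);
  }

  append(key: SandboxKey, entry: SessionEvent): void {
    const id = SessionStore.id(key);
    if (!this.sessions[id]) this.sessions[id] = [];
    this.sessions[id].push(entry);
  }

  // Return a copy so callers cannot mutate stored history by accident.
  load(key: SandboxKey): SessionEvent[] {
    return (this.sessions[SessionStore.id(key)] || []).slice();
  }
}

// An agent process that reloads its history on start and writes through on
// every new entry.
class AgentProcess {
  readonly history: SessionEvent[];

  constructor(private store: SessionStore, private key: SandboxKey) {
    this.history = store.load(key);
  }

  record(entry: SessionEvent): void {
    this.history.push(entry);
    this.store.append(this.key, entry);
  }
}

const store = new SessionStore();
const teamA: SandboxKey = { team: "team-a", context: "ticket-42" };

const first = new AgentProcess(store, teamA);
first.record({ role: "user", content: "Refactor the billing module" });
first.record({ role: "tool", content: "git diff: 3 files changed" });

// Simulate a pod being replaced: a new process with the same key starts
// from the stored history instead of from nothing.
const restarted = new AgentProcess(store, teamA);
console.log(restarted.history.length);
```

A process started for a different team or context loads an empty history, which is the isolation half of the same idea.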
For production deployments the project recommends AWS EKS for the sandbox cluster and Render for hosting the web and worker processes, and the repository includes a bin/eks-up.sh script to assist provisioning. Other operational details provided to builders include a vault proxy for credential management, an environment-variable convention where .env entries prefixed CONTAINER_ENV_ are injected into sandbox containers (for example, CONTAINER_ENV_GITHUB_TOKEN => GITHUB_TOKEN), and a design that keeps agent runtimes generic and configurable by harnesses.
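The CONTAINER_ENV_ convention amounts to a filter-and-rename over the environment. The TypeScript sketch below shows one plausible reading of it. Only the prefix and the GITHUB_TOKEN example come from the description above; the function name sandboxEnvFrom, the handling of unset values and the skipping of a bare prefix are assumptions, not the platform's actual implementation.

```typescript
// The prefix is taken from the article; everything else is illustrative.
const CONTAINER_ENV_PREFIX = "CONTAINER_ENV_";

// Hypothetical helper: pick out prefixed entries and strip the prefix,
// producing the variables a sandbox container would receive.
function sandboxEnvFrom(
  env: Record<string, string | undefined>
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    // Ignore entries without the prefix and entries with no value.
    if (key.indexOf(CONTAINER_ENV_PREFIX) !== 0 || value === undefined) continue;
    const stripped = key.slice(CONTAINER_ENV_PREFIX.length);
    // Assumption: a bare "CONTAINER_ENV_" with no name after it is skipped.
    if (stripped.length === 0) continue;
    result[stripped] = value;
  }
  return result;
}

const example = sandboxEnvFrom({
  CONTAINER_ENV_GITHUB_TOKEN: "ghp_example",
  DATABASE_URL: "postgres://localhost/app",
});
// Only GITHUB_TOKEN is forwarded; DATABASE_URL stays on the host side.
console.log(example);
```

Under this reading, host-side settings such as database URLs never reach the sandbox unless they are explicitly opted in with the prefix.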