AgentOS

Enterprise Multi-Agent Platform

AgentOS is a control plane for running AI agents inside a company: roles, teams, approvals, channels, memory, and runtime isolation managed from one dashboard. Operators can create and govern the fleet without touching the runtime underneath. I designed and built the product, backend, operator console, and AWS infrastructure solo.

Role: Solo Founder + Lead Engineer
Period: 2025 to present
Status: Production

FastAPI · Next.js · AWS CDK · OpenClaw · Qdrant · Fargate · Teams Bot · Secrets Manager

— Chapter 01

System shape

How the system fits together.

Agents run under a control plane with roles, approvals, memory, and isolated runtimes.

Fig. 01 — AgentOS architecture

— Chapter 02

Decisions and outcomes

The calls that shaped it.

01

The core decision was to build a control plane, not a wrapper. Operators create agents, hand them skills, watch what they do, and shape their environment from the dashboard — the runtime stays an implementation detail. That one boundary shaped everything else.
02

Nothing risky happens unsupervised. Every action that reaches a real outside system passes through a verification agent (GateKeeper) and, when it matters, a human reviewer. The gate fails safe: if it is down, the action is blocked, not waved through.
03

You can describe a team in plain English and the platform designs it for you — agents, skills, schedules, channels — as a blueprint you review and approve before anything is created. Standing up a new set of agents feels like a conversation, not a config project.
04

A marketplace of skills and plugins with trust built in: agents can pull the safe, everyday capabilities themselves, sensitive ones need an admin’s sign-off, and every item tracks who made it.
05

It’s a real operations product, not a demo: live fleet monitoring, cost tracking that separates infrastructure from model spend, full audit history, reachability over Microsoft Teams, Slack, and Telegram, and a monitoring agent that watches the other agents. All of it is deployed as infrastructure-as-code.
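
The fail-safe rule from point 02 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the platform's actual code: `Action`, `Verdict`, `gated_execute`, and the `verify` / `execute` / `human_approves` callbacks are hypothetical names, and the three-way verdict is one plausible way to model "a human, when it matters".

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    """Hypothetical outcomes a verification agent could return."""
    ALLOW = "allow"
    NEEDS_HUMAN = "needs_human"
    DENY = "deny"


@dataclass(frozen=True)
class Action:
    """An external write an agent wants to perform (illustrative shape)."""
    agent_id: str
    target: str   # e.g. "crm.update_contact" (made-up target name)
    payload: dict


def gated_execute(
    action: Action,
    verify: Callable[[Action], Verdict],
    execute: Callable[[Action], str],
    human_approves: Optional[Callable[[Action], bool]] = None,
) -> str:
    """Run `execute` only when the gate explicitly allows the action."""
    try:
        verdict = verify(action)
    except Exception:
        # Gate unreachable or crashed: fail closed, never fall through.
        return "blocked: gate unavailable"

    if verdict is Verdict.ALLOW:
        return execute(action)
    if verdict is Verdict.NEEDS_HUMAN:
        if human_approves is not None and human_approves(action):
            return execute(action)
        return "blocked: awaiting human approval"
    # DENY, or any unexpected value, is treated as a block.
    return "blocked: denied by gate"
```

The design point is that the only paths reaching `execute` are an explicit `ALLOW` or an explicit human approval; an exception, a missing approver, or an unrecognized verdict all end in a block.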

— Aside

The interesting work isn't the stack. It's the boundaries.

— Chapter 03

How it runs

What it runs on.

01
Python / FastAPI control plane: one service with separate entry points for the dashboard and the agents
02
Next.js operator console for creating, watching, and governing the fleet
03
Agents run in isolated AWS Fargate containers on the OpenClaw runtime, each with a gateway that keeps it in sync
04
A verification agent (GateKeeper) in front of every external write
05
AI blueprints that design whole fleets or single agents, with human approval before anything is built
06
A marketplace for skills and plugins with per-item trust tiers
07
Microsoft Teams bots provisioned automatically through Azure, plus Slack and Telegram
08
Qdrant vector memory scoped by agent, team, and org
09
AWS, infrastructure-as-code across four CDK stacks: network, data, platform, agents
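
The per-item trust tiers in the marketplace can be pictured as a small policy check. The sketch below is illustrative only: `TrustTier`, `MarketplaceItem`, and `can_install` are hypothetical names, the two-tier split mirrors the "safe" versus "sensitive" distinction described earlier, and rejecting items with no recorded author is an assumption layered on the statement that every item tracks who made it.

```python
from dataclasses import dataclass
from enum import Enum


class TrustTier(Enum):
    """Hypothetical tiers: the real platform may define more."""
    SAFE = "safe"            # agents may install on their own
    SENSITIVE = "sensitive"  # requires an admin's sign-off


@dataclass(frozen=True)
class MarketplaceItem:
    name: str
    tier: TrustTier
    author: str  # provenance: who made the item


def can_install(item: MarketplaceItem, admin_approved: bool = False) -> bool:
    """Decide whether an agent may install a skill or plugin."""
    if not item.author:
        # Assumption: an item with no recorded author is never installable.
        return False
    if item.tier is TrustTier.SAFE:
        return True
    # Sensitive items need an explicit admin sign-off.
    return admin_approved
```

For example, `can_install(MarketplaceItem("summarize", TrustTier.SAFE, "platform-team"))` would succeed on its own, while a `SENSITIVE` item stays blocked until `admin_approved=True` is passed.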
