Codebolt is a
software foundry.
A software foundry runs many agents as one system — scaling across machines, verifying its own work, and closing its own review loops. Use what’s built in, or build your own agents and plugins on top — to build, ship, and maintain software autonomously.
Run it in your terminal.
Install the CLI, open a repo, and run built-in or custom agents with full access to the Codebolt runtime.
npm i -g codebolt
Use the desktop editor.
Open the full coding workspace with editor, terminal, browser, agents, plugin panels, and local runtime controls.
Start in the browser.
Spawn managed workspaces, run agents in cloud sandboxes, and share persistent state with your team.
Embed the foundry.
Use the Agent SDK, Plugin SDK, and Client SDK to build custom agents, products, and internal tools on top.
Getting agents to do work is easy.
Trusting the work is hard.
Agents can produce useful output quickly. The hard part is letting them handle real work on a real codebase without babysitting: staying on track, recovering from failures, and giving you enough control and evidence to trust the result.
The harness is where the leverage is, yet every team ends up rebuilding the same tools, controls, and process primitives from scratch.
Reliability
Over a real task, an agent drifts, forgets, and loses the thread.
context plumbing · memory · checkpoints · retries
Coordination
Real work needs many agents, and they do not coordinate or scale on their own.
queues · handoffs · shared state · multi-agent glue
Control
Look away and the agent touches the wrong thing, or runs where it should not.
sandboxes · scoped permissions · blast-radius limits
Verification
A PR, report, or extract can look confident and still be wrong.
review gates · checks · second agents on results
Not a better agent.
A different kind of thing.
You’ve been hand-building all of that yourself. What if the controls around the agent were the product? A software foundry isn’t a smarter agent — it’s the system around it: tools, environments, coordination, verification, and monitoring, assembled into one operating loop.
Editor, terminal, files, git, browser, docs, and 500+ tools surfaced on demand — so agents work inside the same surface your developers do.
Work moves across local machines, cloud, sandboxes, and long-lived processes — without making the agent own that substrate.
Review, tests, peer agents, and correction loops run around the worker — so a result is checked before it’s accepted, instead of the agent grading itself.
Use built-in agents or your own lightweight agents, then let the foundry run the larger process — building, monitoring, and maintaining software on its own.
From bundled assistant
to runtime infrastructure.
Most coding agents are bundled assistants: agent logic, tools, model calls, permissions, and execution environment fused into one product. Codebolt pulls the toolchain, control plane, and execution substrate into a runtime layer that agents and plugins run on. That layer is the foundry — the system your agents run inside, instead of each one carrying its own.
How a foundry differs
from what you use today.
Every coding tool today hands you an agent — one you drive, prompt, or delegate to, in one place, with you checking the result. A foundry is a different unit: the system those agents run inside.
| Category | What it is | Runs across | Verifies the work | You’re in the loop |
|---|---|---|---|---|
| Coding editor | you, in the file | your machine | you | always |
| Coding assistant | a prompted agent | one session | you | every prompt |
| Autonomous engineer | one autonomous agent | one environment | you | per task |
| Personal assistant | one always-on agent | one place, via a gateway | you | every interaction |
| Software foundry | the system agents run in | many environments, scales out | itself — agents check agents | only when you choose |
Read down any column: every other category hands you an agent you run and verify. A foundry runs and verifies itself.
Compose the larger loop.
Keep agents lightweight.
The agent is not the product. The product is the process around it: when work starts, which agent handles it, what it deposits, who verifies it, and what happens next. Use existing agents, customize lightweight agents when needed, and wire the loop around them.
Building an agent
is the easy part.
Author at the altitude you want — remix an existing agent in prompts, build on framework primitives, or drop to raw code that calls the API directly. Easy by default, full control when you need it. Then extend any agent — at any level — with the same kit.
The loop is where the leverage is.
A useful agentic system is not one long prompt. It is a chain of work, deposits, independent checks, deliberation, and continuation. Codebolt gives that loop a runtime surface.
Trigger
A ticket, webhook, schedule, chat, or developer action starts the run.
Agent drafts
A lightweight agent plans and edits through runtime-owned tools.
Deposit
The work becomes a review request, test request, checkpoint, or automerge proposal.
Verify
A separate agent picks it up, checks it, and can deliberate before it closes.
Merge
The runtime records provenance, ownership, checks, and the accepted change.
Continue
Always-on plugins and persistent runtime state decide what should happen next.
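The six stages above can be read as a small state machine. The sketch below is illustrative only: the stage names mirror the headings in this section, but `Deposit`, `Drafter`, `Verifier`, and `runLoop` are hypothetical names invented for the example and are not taken from any Codebolt SDK. It shows the one rule the loop depends on: the agent that checks a deposit is never the agent that wrote it.

```typescript
// Illustrative sketch only: none of these types or functions are Codebolt APIs.
type Stage = "trigger" | "draft" | "deposit" | "verify" | "merge" | "continue";

interface Deposit {
  author: string; // agent that produced the work
  change: string; // the proposed change
}

interface LoopResult {
  stages: Stage[];    // stages visited, in order
  merged: boolean;    // whether the change was accepted
  verifiedBy: string; // agent that checked the deposit
}

type Drafter = { name: string; draft: (task: string) => string };
type Verifier = { name: string; verify: (deposit: Deposit) => boolean };

function runLoop(task: string, drafter: Drafter, verifier: Verifier): LoopResult {
  // Independent verification: the checker must not be the author.
  if (drafter.name === verifier.name) {
    throw new Error("verifier must differ from drafter");
  }
  const stages: Stage[] = ["trigger", "draft"];
  const deposit: Deposit = { author: drafter.name, change: drafter.draft(task) };
  stages.push("deposit", "verify");
  const accepted = verifier.verify(deposit);
  if (accepted) {
    stages.push("merge"); // only verified work is merged
  }
  stages.push("continue"); // the loop decides what happens next either way
  return { stages, merged: accepted, verifiedBy: verifier.name };
}

// Usage: a writer drafts, a separate reviewer accepts any non-empty change.
const loopResult = runLoop(
  "fix typo in README",
  { name: "writer", draft: (task) => `patch for: ${task}` },
  { name: "reviewer", verify: (deposit) => deposit.change.length > 0 },
);
```

In a real system the verifier would run tests or a review rather than a length check; the point of the sketch is the shape of the handoff.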
The hard parts are already in the runtime.
Codebolt is the OS/infrastructure layer around agents: it owns the systems that make agent work trustworthy, scalable, and operable. Agents stay small because the runtime carries the substrate.
Every agent-infrastructure problem, already solved.
One idea, applied to every layer: the runtime owns the substrate so agents stay small. Here's the pain you feel today — and the thing that ends it.
Context is pushed, not pulled
The runtime assembles the right context and injects it at the right moment. The agent never plays librarian. Ends context rot and the 80% problem.
500+ systems behind a small interface
File read/write, terminal, browser, sandboxes, external systems, and more live in the runtime. The agent retrieves only the few capabilities its current step needs.
Routing, not rationing
Cheap models for cheap subtasks, expensive models only where it counts. Loop detection and hard budget caps. Fewer turns beat cheaper tokens.
Verification at the handoff
Work passed between agents clears a runtime-native review-merge first. The next environment inherits verified work, not raw output.
Full provenance, every line
Every request mapped to every file and line change — which agent, on whose authority. The trust substrate for a world of many agents.
See exactly what was sent
Mission-control dashboards and a debug layer showing the precise context sent to each model, plus plugin traces. Operable in production.
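To make the "Routing, not rationing" item concrete, here is a minimal sketch of the three ideas it names: picking a model tier per subtask, detecting a loop, and enforcing a hard budget cap. Everything in it is an assumption made for illustration: the `Router` class, the tier names, the prices, and the limits are hypothetical and do not describe how Codebolt implements routing.

```typescript
// Illustrative sketch only. Tier names, prices, and limits are assumptions,
// not Codebolt configuration.
type Tier = "cheap" | "expensive";

interface Subtask {
  id: string;
  hard: boolean;     // does this step need the stronger model?
  estTokens: number; // rough token estimate for the step
}

// Assumed prices in cents per 1,000 tokens (integers avoid float drift).
const CENTS_PER_1K: Record<Tier, number> = { cheap: 10, expensive: 200 };

class Router {
  private spentCents = 0;
  private attempts = new Map<string, number>();
  private readonly budgetCents: number;
  private readonly maxAttempts: number;

  constructor(budgetCents: number, maxAttempts: number) {
    this.budgetCents = budgetCents;
    this.maxAttempts = maxAttempts;
  }

  route(task: Subtask): Tier {
    // Loop detection: refuse a subtask that keeps coming back.
    const count = (this.attempts.get(task.id) ?? 0) + 1;
    this.attempts.set(task.id, count);
    if (count > this.maxAttempts) {
      throw new Error(`loop detected on ${task.id}`);
    }
    // Routing: the expensive tier is used only for hard subtasks.
    const tier: Tier = task.hard ? "expensive" : "cheap";
    const cost = Math.ceil((task.estTokens / 1000) * CENTS_PER_1K[tier]);
    // Hard cap: stop before overspending, not after.
    if (this.spentCents + cost > this.budgetCents) {
      throw new Error("budget cap reached");
    }
    this.spentCents += cost;
    return tier;
  }

  get spent(): number {
    return this.spentCents;
  }
}

// Usage: a 500-cent budget, at most 2 attempts per subtask.
const router = new Router(500, 2);
const renameTier = router.route({ id: "rename-var", hard: false, estTokens: 1000 });
const designTier = router.route({ id: "design-api", hard: true, estTokens: 2000 });
```

The cap is checked before the spend is recorded, so a run that would exceed the budget fails fast instead of finishing over budget.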
The runtime nests n levels deep
— and stays connected.
An environment can spin up child environments, which spin up their own, as deep as the work needs. The runtime keeps the whole tree connected: state and messages flow up and down every level. You never wire the topology — the runtime spawns, scales, and links it. This is the substrate expanding; whether the agents inside those environments talk to each other is a separate choice.
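One way to picture the nesting is as a tree in which every node keeps a link to its parent and its children, so messages can travel in both directions. The sketch below is a toy model under that assumption; `Environment`, `spawn`, `report`, and `broadcast` are hypothetical names chosen for the example, not Codebolt APIs, and a real runtime would span processes and machines rather than objects in memory.

```typescript
// Illustrative sketch only: a toy in-memory tree, not the Codebolt runtime.
class Environment {
  readonly name: string;
  readonly parent: Environment | null;
  readonly children: Environment[] = [];
  readonly inbox: string[] = [];

  constructor(name: string, parent: Environment | null = null) {
    this.name = name;
    this.parent = parent;
  }

  // Create a child environment; the link to the parent is wired automatically.
  spawn(name: string): Environment {
    const child = new Environment(name, this);
    this.children.push(child);
    return child;
  }

  // Upward flow: the message reaches every ancestor up to the root.
  report(message: string): void {
    for (let env = this.parent; env !== null; env = env.parent) {
      env.inbox.push(`${this.name}: ${message}`);
    }
  }

  // Downward flow: the message reaches every descendant.
  broadcast(message: string): void {
    const pending: Environment[] = [...this.children];
    while (pending.length > 0) {
      const env = pending.pop()!;
      env.inbox.push(`${this.name}: ${message}`);
      pending.push(...env.children);
    }
  }
}

// Usage: three levels deep, with messages flowing both ways.
const rootEnv = new Environment("root");
const buildEnv = rootEnv.spawn("build");
const testEnv = buildEnv.spawn("test");
testEnv.report("done");     // reaches build and root
rootEnv.broadcast("pause"); // reaches build and test
```

The caller only ever calls `spawn`; the parent and child links that make both flows possible are set up as a side effect, which is the sense in which the topology is never wired by hand.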
Stop babysitting your agents.
You read the draft, run it, catch what's wrong, hand it back. The work only advances while you're watching — you are the verification loop. Codebolt breaks that loop a different way: an agent deposits its work, and a separate agent picks it up. Checked by something other than the thing that wrote it — with you out of the seat.
From a lone agent
to a society of them.
Agents don't have to talk. When they do, the runtime gives them the full range — from a letter in an inbox to an entire economy. Because you write the agent, you pick the altitude: as plain as a single Claude-Code-style worker, or a coordinating colony. Same runtime, either way.
The richest mode, up close: stigmergy, where agents coordinate through signals left in a shared environment, is what lets a colony scale without a central hub.
Hub-and-spoke
Every agent talks through one controller. As agents are added, the pairwise conversations the hub must relay grow on the order of N². The hub congests, then fails, and takes everything with it.
Signals, not a switchboard
Agents don't message each other — they leave signals in the environment, others respond, and signals bubble up to global scope. No N². No central bottleneck.
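The contrast between the two models can be shown with a little arithmetic and a toy signal board. With N agents there are N(N-1)/2 possible pairs, so 10 agents give 45 pairs and 100 agents give 4,950, while a shared board needs only one connection per agent. The sketch below is a simplified illustration: `SignalBoard`, `leave`, `sense`, and `escalated` are hypothetical names, and modelling "bubbling up to global scope" as a strength threshold is an assumption made for the example.

```typescript
// Illustrative sketch only: a toy signal board, not a Codebolt API.

// Number of distinct agent pairs that could need to talk to each other.
function pairwiseChannels(agents: number): number {
  return (agents * (agents - 1)) / 2;
}

interface Signal {
  topic: string;
  from: string;
  strength: number;
}

class SignalBoard {
  private signals: Signal[] = [];

  // An agent leaves a signal in the environment; it addresses no one.
  leave(signal: Signal): void {
    this.signals.push(signal);
  }

  // Any agent can sense the combined strength on a topic.
  sense(topic: string): number {
    return this.signals
      .filter((s) => s.topic === topic)
      .reduce((sum, s) => sum + s.strength, 0);
  }

  // Assumed escalation rule: topics at or above the threshold surface globally.
  escalated(threshold: number): string[] {
    const totals = new Map<string, number>();
    for (const s of this.signals) {
      totals.set(s.topic, (totals.get(s.topic) ?? 0) + s.strength);
    }
    return Array.from(totals.entries())
      .filter(([, total]) => total >= threshold)
      .map(([topic]) => topic)
      .sort();
  }
}

// Usage: two agents independently flag the same problem; neither messages the other.
const board = new SignalBoard();
board.leave({ topic: "flaky-test", from: "agent-a", strength: 1 });
board.leave({ topic: "flaky-test", from: "agent-b", strength: 2 });
board.leave({ topic: "slow-build", from: "agent-c", strength: 1 });
const surfaced = board.escalated(3);
```

Each agent writes to and reads from the board only, so adding an agent adds one connection instead of one per existing agent.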
Agents are ephemeral.
The runtime is persistent.
An agent starts, does its work, and ends. Its memory, history, and output don't live in the agent — they live in the runtime. So agents stay short-lived and disposable, the runtime keeps the continuity, and no single context window ever grows long enough to rot. That's what lets it run forever: persistence isn't one agent staying alive (it would drift and forget) — it's the runtime holding state while ephemeral agents come and go.
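The split between short-lived agents and long-lived state can be sketched in a few lines. This is a toy model under stated assumptions: `RuntimeStore` and `runEphemeralAgent` are hypothetical names, and an in-memory array stands in for whatever durable storage a real runtime would use. Each agent starts with nothing, reads what it needs from the store, writes its result back, and is then discarded.

```typescript
// Illustrative sketch only: an in-memory stand-in for persistent runtime state.
class RuntimeStore {
  private history: string[] = [];

  append(entry: string): void {
    this.history.push(entry);
  }

  // Return a copy so agents cannot mutate the runtime's record directly.
  snapshot(): string[] {
    return [...this.history];
  }
}

// One agent lifetime: load context, do one step, write back, end.
function runEphemeralAgent(id: string, step: string, store: RuntimeStore): number {
  const context = store.snapshot(); // fresh local context on every run
  store.append(`${id} completed ${step} after reading ${context.length} entries`);
  return context.length; // the local context is dropped when the function returns
}

// Usage: three separate agents run in sequence; continuity lives in the store.
const store = new RuntimeStore();
const seenByFirst = runEphemeralAgent("agent-1", "plan", store);
const seenBySecond = runEphemeralAgent("agent-2", "edit", store);
const seenByThird = runEphemeralAgent("agent-3", "test", store);
```

No agent in this sketch holds state between runs, so no single context grows with the length of the overall job; only the store does.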
A substrate that gets better the more it runs.
Every other harness is static — it improves only when its vendor ships a new version. Codebolt's substrate learns from real outcomes, and rewrites itself.
- RL: Context assembly tunes itself, learning which context actually led to good outcomes and pushing more of it.
- RL: Agent logic is optimized against observed results, not hand-tuned by you in a config file.
- RL: Guardrail tuning sharpens from production signal, with fewer false stops and tighter real ones.
Extend the runtime.
Build on top of it.
Codebolt is programmable top to bottom: plugins, SDKs, dynamic panels, APIs, and full agent-native applications can all run on or connect to the runtime.
Open at every layer.
Codebolt is programmable top to bottom — nothing is a black box. Plugins run with full access to the server. A client SDK lets you build your own mission control. Dynamic panels add UI and features at runtime. And an API connects it to anything you already run. Build as deep into the foundry as you want — but you don’t have to. What’s built in runs on its own.
Build agent-native applications on top of the runtime.
Ship entire applications as plugins running on Codebolt — with native, in-process access to every agent and the whole substrate. No separate app reaching across an API to agents living somewhere else. Codebolt isn't only where agents run. It's what you build agent-native products on.
You’ve been building the harness.
Build agents instead.
Every team rebuilds the same plumbing — tools, sandboxes, verification, coordination — by hand, on every project. The foundry is that plumbing, already built. Bring your agents and plugins; skip the harness.
Built for everyone the agent touches.
- The verification loop runs in the runtime: another agent checks the work, not you
- Agents are ephemeral; state lives in the runtime — no context rot
- Predictable cost, no rate-limit wall mid-task
- One model — from a lone agent to a coordinating colony
- Independent verification at every handoff, before a human looks
- Conventions enforced at generation time, not in review
- Per-team budgets and full provenance on every change
- Juniors ship safely — the runtime catches the misses
- Blast-radius limits and governed, revocable permissions
- Full local mode — proprietary code never leaves your infra
- Provider-agnostic — insulated from vendor regressions
- Provenance & observability for audit and compliance
Run coding agents
in a workspace built for them.
The agent OS for AI coding — process-specific agents and plugins on top; tools, models, environments, coordination, context, cost, and the rest of the agentic stack underneath.