Not a coding agent

Codebolt is a
software foundry.

A software foundry runs many agents as one system — scaling across machines, verifying its own work, and closing its own review loops. Use what’s built in, or build your own agents and plugins on top — to build, ship, and maintain software autonomously.

Many agents as one system · scales across environments · intent → verified result · programmable end to end

Choose your Codebolt surface

Run it in your terminal.

Install the CLI, open a repo, and run built-in or custom agents with full access to the Codebolt runtime.

npm i -g codebolt
~ codebolt

Use the desktop editor.

Open the full coding workspace with editor, terminal, browser, agents, plugin panels, and local runtime controls.

macOS · Windows · Linux with the foundry running locally.
Codebolt editor

Start in the browser.

Spawn managed workspaces, run agents in cloud sandboxes, and share persistent state with your team.

No install. Managed environments and always-on runs.
Cloud workspace

Embed the foundry.

Use the Agent SDK, Plugin SDK, and Client SDK to build custom agents, products, and internal tools on top.

Agent · plugin · client layers for building on Codebolt.
codeboltjs
§ 01 — The problem

Getting agents to do work is easy.
Trusting the work is hard.

Agents can produce useful output quickly. The hard part is letting them handle real work on a real codebase without babysitting: staying on track, recovering from failures, and giving you enough control and evidence to trust the result.

The harness is where the leverage is — but every team ends up reinventing and rebuilding the same tools, controls, and process primitives from scratch.

Major problems faced

Reliability

Over a real task, an agent drifts, forgets, and loses the thread.

context plumbing · memory · checkpoints · retries

Coordination

Real work needs many agents, and they do not coordinate or scale on their own.

queues · handoffs · shared state · multi-agent glue

Control

Look away and it touches the wrong thing, or runs where it should not.

sandboxes · scoped permissions · blast-radius limits

Verification

A PR, report, or extract can look confident and still be wrong.

review gates · checks · second agents on results

What a foundry is

Not a better agent.
A different kind of thing.

You’ve been hand-building all of that yourself. What if the controls around the agent were the product? A software foundry isn’t a smarter agent — it’s the system around it: tools, environments, coordination, verification, and monitoring, assembled into one operating loop.

01 · Platform
A full coding platform

Editor, terminal, files, git, browser, docs, and 500+ tools surfaced on demand — so agents work inside the same surface your developers do.

tools · context · codebase access · developer controls

02 · Scale
Coordination across environments

Work moves across local machines, cloud, sandboxes, and long-lived processes — without making the agent own that substrate.

local · cloud · parallel runs · persistent state

03 · Trust
Independent verification

Review, tests, peer agents, and correction loops run around the worker — so a result is checked before it’s accepted, instead of the agent grading itself.

review gates · second agents · checks · provenance

04 · Outcome
Less babysitting

Use built-in agents or your own lightweight agents, then let the foundry run the larger process — building, monitoring, and maintaining software on its own.

build · monitor · maintain · continue

§ 02 — Architecture

From bundled assistant
to runtime infrastructure.

Most coding agents are bundled assistants: agent logic, tools, model calls, permissions, and execution environment fused into one product. Codebolt pulls the toolchain, control plane, and execution substrate into a runtime layer that agents and plugins run on. That layer is the foundry — the system your agents run inside, instead of each one carrying its own.

[Diagram: Codebolt system architecture. Process-specific agents and plugins run on top. The runtime runs everything else.]

Access surfaces (connected to the runtime): Terminal / CLI · Web UI · Desktop App · Mobile · REST API / SDK

What you author (runs on the runtime):
  • Agents: run on your message, e.g. step planOAuth() …; process-specific logic (steps, needs, and policy). This is where other coding agents stop.
  • Plugins: run 24/7, always on (Telegram, Linear, GitHub, Slack, plus monitoring); code with full system access that connects external systems. This turns the system into a 24/7 factory.

The Codebolt runtime (handled by the runtime; you never wire any of this): the operating system for coding agents. Every hard, repeated piece of the loop is owned here so agents and plugins stay small.
  • Context Assembly: the right context, pushed in; the agent never hunts
  • Coordination & Orchestration: stigmergic mesh · n-level · no N² messaging
  • Model Routing: every LLM call brokered here · budget caps
  • Guardrails: immutable invariants the optimizer can't cross
  • Tool Search: read/write file → sandboxing · 500+ systems
  • Review-Merge: verified handoff between agents and environments
  • Narrative Engine: full provenance, every request → file → line
  • Observability: see the exact context sent to each model · debug
  • Learning Loop (RL): tunes context and policy from outcomes
  • Also built in: Deliberations · Snapshots & Versioning · Drift Detection · Budget Control · Sub-agent Spawning

Execution environments (coordinated by the runtime, which orchestrates work across them):
  • Local and self-hosted runners: your machine or your infra; direct, never routed to cloud (privacy)
  • Remote sandbox, self-started: cloud compute you spin up that reports back to your local app (privacy)
  • Remote sandbox, cloud-started: runs and reports in the cloud; close your laptop and it keeps going (convenience)

Any model (reached only through the runtime, which routes every LLM call): Opus · GPT · Gemini · Fable. Agents and plugins never call a model directly; every call is brokered, routed, and logged by the runtime.
swap models underneath · your workflow never breaks
Where it fits

How a foundry differs
from what you use today.

Every coding tool today hands you an agent — one you drive, prompt, or delegate to, in one place, with you checking the result. A foundry is a different unit: the system those agents run inside.

Category | What it is | Runs across | Verifies the work | You’re in the loop
Coding editor | you, in the file | your machine | you | always
Coding assistant | a prompted agent | one session | you | every prompt
Autonomous engineer | one autonomous agent | one environment | you | per task
Personal assistant | one always-on agent | one place, via a gateway | you | every interaction
Software foundry | the system agents run in | many environments, scales out | itself (agents check agents) | only when you choose

— read down any column: every other category hands you an agent you run and verify. A foundry runs and verifies itself.

§ 03 — What you focus on

Compose the larger loop.
Keep agents lightweight.

The agent is not the product. The product is the process around it: when work starts, which agent handles it, what it deposits, who verifies it, and what happens next. Use existing agents, customize lightweight agents when needed, and wire the loop around them.

Building agents

Building an agent
is the easy part.

Author at the altitude you want — remix an existing agent in prompts, build on framework primitives, or drop to raw code that calls the API directly. Easy by default, full control when you need it. Then extend any agent — at any level — with the same kit.

[Diagram: authoring and extending an agent]

1 · Author at any level, from high-level (fastest) to low-level (full control):
  • Remix: start from an agent; change prompts and config
  • Framework: build on agent primitives; assemble, don't wire
  • Code · core: write code that calls the API directly
Your agent holds your process logic.

2 · Extend with the kit: Skills · Actions · Capabilities · Action blocks · Dynamic execution · MCP

Pick your altitude to write it, then plug in skills, actions, capabilities, and more to extend it.

The loop is where the leverage is.

A useful agentic system is not one long prompt. It is a chain of work, deposits, independent checks, deliberation, and continuation. Codebolt gives that loop a runtime surface.

01

Trigger

A ticket, webhook, schedule, chat, or developer action starts the run.

02

Agent drafts

A lightweight agent plans and edits through runtime-owned tools.

03

Deposit

The work becomes a review request, test request, checkpoint, or automerge proposal.

04

Verify

A separate agent picks it up, checks it, and can deliberate before it closes.

05

Merge

The runtime records provenance, ownership, checks, and the accepted change.

06

Continue

Always-on plugins and persistent runtime state decide what should happen next.
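
The six stages above can be read as a small state machine. The following is a minimal sketch in TypeScript, not the Codebolt API: the stage names mirror the headings above, while `Run`, `advance`, and the `verified` flag are hypothetical names introduced for illustration. Sending rejected work back to drafting is also an assumption about how a correction loop might behave.

```typescript
// Hypothetical sketch of the six-stage loop. None of these names come from the Codebolt SDK.
type Stage = "trigger" | "draft" | "deposit" | "verify" | "merge" | "continue";

interface Run {
  stage: Stage;
  history: Stage[]; // stages already completed, oldest first
}

// Stage order as listed in the text.
const ORDER: Stage[] = ["trigger", "draft", "deposit", "verify", "merge", "continue"];

// Move a run forward one stage. A failed verification sends the work back
// to drafting instead of letting it reach merge (assumed behaviour).
function advance(run: Run, verified: boolean = true): Run {
  const history = [...run.history, run.stage];
  if (run.stage === "verify" && !verified) {
    return { stage: "draft", history };
  }
  const nextIndex = Math.min(ORDER.indexOf(run.stage) + 1, ORDER.length - 1);
  return { stage: ORDER[nextIndex], history };
}

let run: Run = { stage: "trigger", history: [] };
run = advance(run);        // draft
run = advance(run);        // deposit
run = advance(run);        // verify
run = advance(run, false); // rejected by the verifier: back to draft
```

The point of the sketch is that merge is only reachable through a passing verify step, which is the property the loop is meant to guarantee.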

§ 04 — Runtime handles

The hard parts are already in the runtime.

Codebolt is the OS/infrastructure layer around agents: it owns the systems that make agent work trustworthy, scalable, and operable. Agents stay small because the runtime carries the substrate.

The substrate

Every agent-infrastructure problem, already solved.

One idea, applied to every layer: the runtime owns the substrate so agents stay small. Here's the pain you feel today — and the thing that ends it.

"My agent wastes turns hunting for context — and still misses things."
Context Assembly Engine

Context is pushed, not pulled

The runtime assembles the right context and injects it at the right moment. The agent never plays librarian. Ends context rot and the 80% problem.

"If the agent sees 500 systems, planning turns into noise."
Tool Search

500+ systems behind a small interface

File read/write, terminal, browser, sandboxes, external systems, and more live in the runtime. The agent retrieves only the few capabilities its current step needs.
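
As a rough illustration of retrieving only the few capabilities a step needs, the sketch below ranks a tiny tool registry by keyword overlap with the step description. Everything in it is assumed for illustration: the `Tool` shape, the registry entries, and `searchTools` are hypothetical, and naive keyword matching stands in for whatever retrieval the runtime actually uses.

```typescript
// Hypothetical tool registry and search. This is not the Codebolt Tool Search implementation.
interface Tool {
  name: string;
  description: string;
}

const registry: Tool[] = [
  { name: "read_file", description: "read a file from the workspace" },
  { name: "write_file", description: "write a file to the workspace" },
  { name: "run_terminal", description: "run a shell command in the terminal" },
  { name: "open_browser", description: "open a url in the browser" },
];

const STOP_WORDS = new Set(["a", "the", "in", "to", "from"]);

// Lowercase, split on non-alphanumerics, drop stop words, de-duplicate.
function tokenize(text: string): Set<string> {
  const words = text.toLowerCase().split(/[^a-z0-9]+/);
  return new Set(words.filter((w) => w.length > 0 && !STOP_WORDS.has(w)));
}

// Return at most k tools whose name or description shares words with the query,
// best match first. Tools with no overlap are never shown to the agent.
function searchTools(query: string, k: number, tools: Tool[] = registry): Tool[] {
  const wanted = tokenize(query);
  return tools
    .map((tool) => {
      const words = Array.from(tokenize(tool.name + " " + tool.description));
      return { tool, score: words.filter((w) => wanted.has(w)).length };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.tool);
}

const forThisStep = searchTools("run a command in the terminal", 2);
// forThisStep contains only run_terminal; the other tools stay out of the agent's view.
```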

"Costs are unpredictable and rising even when my code doesn't change."
Cost Runtime

Routing, not rationing

Cheap models for cheap subtasks, expensive only where it counts. Loop detection and hard budget caps. Fewer turns beats cheaper tokens.
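
To make routing plus a hard budget cap concrete, here is a minimal sketch under stated assumptions. `BudgetedRouter`, the two tiers, the model names, and the cost units are all hypothetical, and loop detection is left out for brevity.

```typescript
// Hypothetical names throughout. This is not the Codebolt cost runtime.
type Difficulty = "cheap" | "hard";

interface ModelTier {
  name: string;
  costPerCall: number; // arbitrary units, for illustration only
}

const TIERS: Record<Difficulty, ModelTier> = {
  cheap: { name: "small-model", costPerCall: 1 },
  hard: { name: "large-model", costPerCall: 10 },
};

class BudgetedRouter {
  private spent = 0;
  private readonly cap: number;

  constructor(cap: number) {
    this.cap = cap;
  }

  // Pick the tier for a subtask and refuse the call if it would break the cap.
  route(difficulty: Difficulty): string {
    const tier = TIERS[difficulty];
    if (this.spent + tier.costPerCall > this.cap) {
      throw new Error("budget cap of " + this.cap + " would be exceeded");
    }
    this.spent += tier.costPerCall;
    return tier.name;
  }

  get remaining(): number {
    return this.cap - this.spent;
  }
}

const router = new BudgetedRouter(12);
router.route("cheap"); // "small-model", 11 units left
router.route("hard");  // "large-model", 1 unit left
// A further router.route("hard") would throw: the cap is hard, not advisory.
```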

"Reviewing agent output has quietly become my entire job."
Local Review-Merge

Verification at the handoff

Work passed between agents clears a runtime-native review-merge first. The next environment inherits verified work, not raw output.

"The model regresses and my agents get worse and I can't tell why."
Narrative Engine

Full provenance, every line

Every request mapped to every file and line change — which agent, on whose authority. The trust substrate for a world of many agents.

"When something breaks, it's a black box I can only pray to."
AI Observability

See exactly what was sent

Mission-control dashboards and a debug layer showing the precise context sent to each model, plus plugin traces. Operable in production.

Environment scaling

The runtime nests n-levels deep
— and stays connected.

An environment can spin up child environments, which spin up their own, as deep as the work needs. The runtime keeps the whole tree connected: state and messages flow up and down every level. You never wire the topology — the runtime spawns, scales, and links it. This is the substrate expanding; whether the agents inside those environments talk to each other is a separate choice.

[Diagram: a root runtime at level n spawns child runtimes (local, cloud, self-hosted) at level n + 1, which spawn their own children at level n + 2, and so on, n levels deep. Parent ↔ child communication runs across every level: the runtime spawns it, scales it, and keeps it connected.]
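
The nesting described above can be pictured as a tree in which messages travel along parent and child links. The sketch below is illustrative only: `Environment`, `spawn`, `reportUp`, and `broadcastDown` are hypothetical names, and an in-memory tree stands in for real local, cloud, or self-hosted runtimes.

```typescript
// Hypothetical in-memory model of nested environments. Not the Codebolt runtime API.
class Environment {
  readonly name: string;
  readonly parent: Environment | null;
  readonly children: Environment[] = [];
  readonly inbox: string[] = [];

  constructor(name: string, parent: Environment | null = null) {
    this.name = name;
    this.parent = parent;
    if (parent !== null) parent.children.push(this);
  }

  // A child environment is created by, and linked to, its parent.
  spawn(name: string): Environment {
    return new Environment(name, this);
  }

  // Upward flow: every ancestor, up to the root, receives the message.
  reportUp(message: string): void {
    for (let env = this.parent; env !== null; env = env.parent) {
      env.inbox.push(this.name + ": " + message);
    }
  }

  // Downward flow: every descendant, at any depth, receives the message.
  broadcastDown(message: string): void {
    for (const child of this.children) {
      child.inbox.push(message);
      child.broadcastDown(message);
    }
  }

  // Number of levels between this environment and the root.
  get depth(): number {
    return this.parent === null ? 0 : this.parent.depth + 1;
  }
}

const root = new Environment("root");
const cloud = root.spawn("cloud");
const sandbox = cloud.spawn("sandbox"); // depth 2
sandbox.reportUp("tests passed");       // reaches cloud and root
root.broadcastDown("pause");            // reaches cloud and sandbox
```
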
The deposition framework

Stop babysitting your agents.

You read the draft, run it, catch what's wrong, hand it back. The work only advances while you're watching — you are the verification loop. Codebolt breaks that loop a different way: an agent deposits its work, and a separate agent picks it up. Checked by something other than the thing that wrote it — with you out of the seat.

[Diagram: the deposition surface. A coding agent plans, edits, then deposits tangible artifacts that are held until claimed: a review request (PR, awaiting pickup), a test request (suite, awaiting), a checkpoint (resumable state), or an automerge (on checks passing). Separate, independent testing and review agents pick deposits up, can deliberate on a deposit before it closes (discuss → resolve), and return a result, with you out of the seat. Deposit → held → picked up by a different agent → deliberated → resolved. No one babysitting.]
Deposit / pick up
Agents hand off to agents
An agent deposits a tangible artifact — a review request, a test request, an automerge, a checkpoint — into the runtime. A separate agent, in its own context and its own moment, claims it. Work changes hands without changing heads.
Independent check
Not an agent grading its own homework
The thing that writes the code isn't the thing that tests or merges it. Agents can also deliberate over a deposit before it closes. Verification by a different party — the same reason humans review each other — is more trustworthy than self-checking.
Open surfaces
Coding is just the sharpest case
Plugins and native apps can open their own deposit points, so the pattern isn't a fixed pipeline: a plugin deposits a sales draft, a compliance agent picks it up. The handoff substrate doesn't care what the agents do.
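
The deposit and pick-up pattern can be sketched as a shared surface with one rule: the agent that claims a deposit must not be the agent that wrote it. The code below is a minimal illustration under that assumption. `DepositSurface`, `deposit`, and `claim` are hypothetical names, and the four kinds are taken from the artifacts named above.

```typescript
// Hypothetical deposit surface. Not the Codebolt deposition API.
type DepositKind = "review_request" | "test_request" | "checkpoint" | "automerge";

interface Deposit {
  id: number;
  kind: DepositKind;
  author: string;     // agent that produced the work
  claimedBy?: string; // agent that picked it up, once claimed
}

class DepositSurface {
  private deposits: Deposit[] = [];
  private nextId = 1;

  // Leave an artifact on the surface, where it is held until claimed.
  deposit(kind: DepositKind, author: string): Deposit {
    const entry: Deposit = { id: this.nextId++, kind, author };
    this.deposits.push(entry);
    return entry;
  }

  // Claim the oldest unclaimed deposit of a kind, but never one's own work.
  claim(kind: DepositKind, agent: string): Deposit | undefined {
    const found = this.deposits.find(
      (d) => d.kind === kind && d.claimedBy === undefined && d.author !== agent
    );
    if (found !== undefined) found.claimedBy = agent;
    return found;
  }
}

const surface = new DepositSurface();
surface.deposit("review_request", "coding-agent");
surface.claim("review_request", "coding-agent"); // undefined: no self-review
surface.claim("review_request", "review-agent"); // the deposit, now claimed
```

Because the independence rule lives in the surface rather than in either agent, neither side has to be trusted to enforce it.
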
Agent coordination

From a lone agent
to a society of them.

Agents don't have to talk. When they do, the runtime gives them the full range — from a letter in an inbox to an entire economy. Because you write the agent, you pick the altitude: as plain as a single Claude-Code-style worker, or a coordinating colony. Same runtime, either way.

Coordination modes, from direct and explicit to emergent and collective:
  • Inbox: agent-to-agent mail; addressed, async
  • Direct message: synchronous, point-to-point
  • Stigmergic signals: leave and sense pheromones; leaderless herding
  • Reputation: agents rate each other; trust accrues
  • Economy: agents price, transact, allocate

— the richest mode, up close: stigmergy is what lets a colony scale without a central hub.

◇ The naive way

Hub-and-spoke

Every agent talks through one controller. Add agents, connections explode N². The hub congests, then fails — and takes everything with it.

topology: star, depth 1 · messaging: direct, N² · failure: single point

◆ Stigmergic mode

Signals, not a switchboard

Agents don't message each other — they leave signals in the environment, others respond, and signals bubble up to global scope. No N². No central bottleneck.

scope: n-level, governed · messaging: stigmergic · failure: contained

Why it scales
Stigmergy is what makes the colony possible: full-mesh coordination with no per-pair connections and no direct-addressing attack surface. Ant colonies coordinate at enormous scale largely by reading and writing the environment rather than by addressing one another. Addressing is granted and governed, and it runs local or cloud, your choice.
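
A minimal way to picture stigmergic signalling is a shared board that agents write to and read from without ever addressing each other. The sketch below assumes that model. `SignalBoard`, `leave`, `sense`, and `evaporate` are hypothetical names, and the evaporation step is borrowed from classic pheromone models rather than from anything stated in this document.

```typescript
// Hypothetical shared signal board. Not the Codebolt coordination API.
class SignalBoard {
  private strengths = new Map<string, number>();

  // Leave, or reinforce, a signal on a topic. No recipient is named.
  leave(topic: string, amount: number): void {
    this.strengths.set(topic, (this.strengths.get(topic) ?? 0) + amount);
  }

  // Read the current strength of a topic without knowing who left it.
  sense(topic: string): number {
    return this.strengths.get(topic) ?? 0;
  }

  // Evaporation: every signal fades by a factor, and faint ones disappear,
  // so stale work stops attracting attention (assumed behaviour).
  evaporate(factor: number, floor: number = 0.01): void {
    for (const [topic, strength] of Array.from(this.strengths.entries())) {
      const next = strength * factor;
      if (next < floor) this.strengths.delete(topic);
      else this.strengths.set(topic, next);
    }
  }
}

const board = new SignalBoard();
board.leave("failing-test:auth", 1); // one agent notices a failure
board.leave("failing-test:auth", 1); // a second agent reinforces it
board.sense("failing-test:auth");    // 2: any agent can react to the stronger signal
```

Each agent makes one write or one read against the board, so the number of interactions grows with the number of agents rather than with the number of agent pairs.
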
The runtime model

Agents are ephemeral.
The runtime is persistent.

An agent starts, does its work, and ends. Its memory, history, and output don't live in the agent — they live in the runtime. So agents stay short-lived and disposable, the runtime keeps the continuity, and no single context window ever grows long enough to rot. That's what lets it run forever: persistence isn't one agent staying alive (it would drift and forget) — it's the runtime holding state while ephemeral agents come and go.

[Diagram: one long-lived agent versus Codebolt. Left: a single growing context window goes from fresh, to lost in the middle, to drift; it forgets, drifts, and degrades the longer it lives. Right: ephemeral agents started by Telegram, a schedule, an opened issue, or a webhook each run and end, reading and writing state in the persistent runtime, which holds memory, history, output, and context while you are away. The agent is disposable: its memory, history, and output live in the runtime, so it starts fresh, ends clean, and never rots.]
The 24/7 part
Because state lives in the runtime, not in a single context window that fills and rots, the system can run indefinitely. Plugins listen and react around the clock: a Telegram plugin holding a dozen threads, a 3am schedule, an issue that opens itself into a fix. The runtime spawns an agent, runs it, ends it, and reports back while you sleep. Sessions stop being sessions and become a system: a 24/7 autonomous operation, not a tool you keep poking.
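
The split between disposable agents and a persistent runtime can be sketched in a few lines. This is an illustration under assumptions, not the Codebolt runtime: `Runtime`, `EphemeralAgent`, `handle`, and the per-thread memory map are hypothetical names, and an in-memory map stands in for durable state.

```typescript
// Hypothetical sketch: state lives in the runtime, agents live for one event.
class EphemeralAgent {
  private readonly history: readonly string[];

  constructor(history: readonly string[]) {
    this.history = history; // context handed in by the runtime, not accumulated by the agent
  }

  run(event: string): string {
    return 'handled "' + event + '" with ' + this.history.length + " prior events";
  }
}

class Runtime {
  private memory = new Map<string, string[]>(); // thread id -> events seen so far

  // Spawn a fresh agent for one event, run it, record the event, discard the agent.
  handle(thread: string, event: string): string {
    const history = this.memory.get(thread) ?? [];
    const agent = new EphemeralAgent(history);
    const output = agent.run(event);
    this.memory.set(thread, [...history, event]);
    return output; // the agent goes out of scope here; the memory stays
  }
}

const runtime = new Runtime();
runtime.handle("issue-42", "issue opened");  // 'handled "issue opened" with 0 prior events'
runtime.handle("issue-42", "webhook fired"); // 'handled "webhook fired" with 1 prior events'
```

No agent instance survives between the two calls, yet the second one still sees the first event, because continuity is held by the runtime.
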
The differentiator

A substrate that gets better the more it runs.

Every other harness is static — it improves only when its vendor ships a new version. Codebolt's substrate learns from real outcomes, and rewrites itself.

  • RL · Context assembly tunes itself, learning which context actually led to good outcomes and pushing more of it.
  • RL · Agent logic is optimized against observed results, not hand-tuned by you in a config file.
  • RL · Guardrail tuning sharpens from production signal: fewer false stops, tighter real ones.

The counterweight, built in.
Only tuning is learnable. Hard, human-set invariants (the blast-radius limits) are immutable. The runtime learns everything except the lines you draw, and those it can never cross.
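
One way to picture learnable tuning bounded by immutable invariants is an optimizer that proposes settings and a gate that clamps them to frozen, human-set limits. The sketch below assumes that design. `INVARIANTS`, `Tuning`, `acceptProposal`, and both parameters are hypothetical and chosen only for illustration.

```typescript
// Hypothetical sketch: tuning may change, invariants may not.
// Human-set limit, frozen so no code path can modify it at run time.
const INVARIANTS = Object.freeze({ maxFilesPerChange: 20 });

interface Tuning {
  contextItems: number;   // learnable: how much context to push
  filesPerChange: number; // learnable, but bounded by the invariant above
}

// Accept a proposal from the optimizer, clamping anything that would cross a limit.
function acceptProposal(proposed: Tuning): Tuning {
  return {
    contextItems: Math.max(1, Math.round(proposed.contextItems)),
    filesPerChange: Math.min(proposed.filesPerChange, INVARIANTS.maxFilesPerChange),
  };
}

const accepted = acceptProposal({ contextItems: 12, filesPerChange: 50 });
// accepted.contextItems is 12 (free to move); accepted.filesPerChange is 20 (held at the limit)
```
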
§ 05 — Extensibility

Extend the runtime.
Build on top of it.

Codebolt is programmable top to bottom: plugins, SDKs, dynamic panels, APIs, and full agent-native applications can all run on or connect to the runtime.

Extensibility

Open at every layer.

Codebolt is programmable top to bottom — nothing is a black box. Plugins run with full access to the server. A client SDK lets you build your own mission control. Dynamic panels add UI and features at runtime. And an API connects it to anything you already run. Build as deep into the foundry as you want — but you don’t have to. What’s built in runs on its own.

[Diagram: the Codebolt runtime, open at every layer. Client: Client SDK (build your own mission control) and Dynamic panels (add UI and features at runtime). Server: Plugins (full access to the server, run 24/7). Integrate: API access (connect with anything you run). Extend the client, the server, and the wire.]

The expansion
Not just where agents run

Build agent-native applications on top of the runtime.

Ship entire applications as plugins running on Codebolt — with native, in-process access to every agent and the whole substrate. No separate app reaching across an API to agents living somewhere else. Codebolt isn't only where agents run. It's what you build agent-native products on.

Built on Codebolt · SalesForge AI
Multi-agent sales workflows with five levels of autonomy. Agents are commanded natively, not over a wire.

Built on Codebolt · SEO Agent Platform
Per-action autonomy, transparent decisions, and Git-first change management, with agents as a first-class capability.

Built on Codebolt · Your product
Inherit coordination, context, cost, guardrails, and provenance. Write the application, not the infrastructure.

You’ve been building the harness.
Build agents instead.

Every team rebuilds the same plumbing — tools, sandboxes, verification, coordination — by hand, on every project. The foundry is that plumbing, already built. Bring your agents and plugins; skip the harness.

§ 06 — Who it's for

Built for everyone the agent touches.

Developers
Stop babysitting your agents.
  • The verification loop runs in the runtime: another agent checks the work, not you
  • Agents are ephemeral; state lives in the runtime — no context rot
  • Predictable cost, no rate-limit wall mid-task
  • One model — from a lone agent to a coordinating colony
Free local tier — run on a real repo today
Eng managers
The bottleneck is review. Move it off your team.
  • Independent verification at every handoff, before a human looks
  • Conventions enforced at generation time, not in review
  • Per-team budgets and full provenance on every change
  • Juniors ship safely — the runtime catches the misses
Pilot — measure review overhead, before/after
VP Eng / CTO
Agents are in your stack. Govern them.
  • Blast-radius limits and governed, revocable permissions
  • Full local mode — proprietary code never leaves your infra
  • Provider-agnostic — insulated from vendor regressions
  • Provenance & observability for audit and compliance
Architecture & security briefing
Drafting is cheap. Trust is the product.

Run coding agents
in a workspace built for them.

The agent OS for AI coding — process-specific agents and plugins on top; tools, models, environments, coordination, context, cost, and the rest of the agentic stack underneath.