Paperclip Review 2026: Running an AI Agent Company, Tested
Paperclip (paperclip.ing) is an open-source, self-hosted control plane that turns a pile of AI agents into something resembling a company — with org charts, budgets, ticketing, and human approvals.
For two years the AI agent conversation has been stuck on the same problem. A single agent is impressive in a demo and exhausting in practice — it forgets, it drifts, it burns tokens on the wrong thing, and the moment you want a second agent you are back to gluing scripts together. Paperclip is the first tool I have used that treats this as an organizational problem rather than a prompting problem.
Paperclip — the project lives at paperclip.ing — calls itself "the human control plane for AI labor." Less grandly: it is a self-hosted dashboard that lets you hire agents, give them roles and budgets, assign them goals, approve their work, and watch the bill. It looks like a task manager. Under the hood it is closer to a lightweight HR-and-finance department for software that does not have a body.
I ran Paperclip for about two weeks across two setups — a content pipeline and a small internal-tooling "team" — to find out whether the metaphor holds up or whether it is just a nicer way to lose money. Here is what actually happened.
TL;DR
| What it is | Open-source, self-hosted orchestration layer for teams of AI agents |
| What it is not | An agent. It manages agents; it does not write code or do tasks itself. |
| Best at | Coordinating multiple persistent agents with budgets, roles, and approvals |
| Weakest at | One-off tasks, non-technical users, anything needing zero setup |
| Pricing | Free, MIT-licensed, self-hosted only — no paid cloud tier |
| Verdict | The best answer yet to multi-agent chaos. Adopt it if you genuinely run 3+ agents; skip it if you do not. |
What Paperclip Actually Is
Paperclip is a Node.js server with a React UI. You deploy it yourself — locally, on a VPS, or in your own cloud — and it gives you one screen to manage everything. There is no Paperclip account, no hosted SaaS, no telemetry phoning home. The whole thing is MIT-licensed and free.
The core idea is that Paperclip sits above individual agent runtimes. It does not replace Claude Code or OpenClaw or Codex — it orchestrates them. You bring your own agents; Paperclip provides the layer multi-agent systems have been missing: an org chart, goals, budgets, ticketing, reporting lines, and human-in-the-loop approval gates.
The mental model the project pushes hard is "company." You create a company, define roles (a research agent, an engineering agent, a QA agent), give each one a goal and a monthly budget, and let them coordinate through tickets. A single Paperclip deployment can host an unlimited number of these companies, each fully isolated from the others — useful if you are running separate client projects or want a sandbox alongside production.
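To make the company metaphor concrete, here is a minimal Python sketch of how that hierarchy might be modeled. It is an illustration only, not Paperclip's actual schema: the class names, the fields, and the dollar-denominated budget are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    role: str                  # e.g. "research", "engineering", "qa"
    goal: str
    monthly_budget_usd: float  # assumed unit; the real denomination may differ


@dataclass
class Company:
    name: str
    agents: dict[str, Agent] = field(default_factory=dict)

    def hire(self, agent: Agent) -> None:
        # Agent names are unique within a company, not across companies.
        if agent.name in self.agents:
            raise ValueError(f"{agent.name} already works at {self.name}")
        self.agents[agent.name] = agent


@dataclass
class Deployment:
    # Each company keeps its own agent registry, so nothing is shared
    # between, say, a sandbox and production.
    companies: dict[str, Company] = field(default_factory=dict)

    def create_company(self, name: str) -> Company:
        if name in self.companies:
            raise ValueError(f"company {name} already exists")
        company = Company(name)
        self.companies[name] = company
        return company


deployment = Deployment()
prod = deployment.create_company("production")
sandbox = deployment.create_company("sandbox")
prod.hire(Agent("scout", "research", "Collect source material", 50.0))
sandbox.hire(Agent("scout", "research", "Try new prompts", 5.0))
```

The same agent name can exist in both companies with a different goal and budget, which is the isolation property the paragraph above describes.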
It is worth being precise here, because the marketing around "zero-human companies" invites confusion: Paperclip does not make your agents smarter. It makes a group of agents governable. The intelligence still comes entirely from whatever runtime and model you plug in.
Bring Your Own Agent: "If It Can Receive a Heartbeat, It's Hired"
This is the line the Paperclip team uses, and it is the most important thing to understand before you install anything.
Paperclip is deliberately runtime-agnostic. It does not ship an agent. Instead, it integrates with anything that can be woken up over HTTP and handed a task:
- OpenClaw — the open-source local agent runtime with persistent Markdown-file memory. This is the most documented pairing.
- Claude Code — Anthropic's CLI agent.
- Codex — OpenAI's coding agent.
- Cursor and other editor-based agents.
- Anything on OpenRouter — so effectively any frontier model behind a generic adapter.
The integration is webhook-based. Paperclip handles the organization — task assignment, roles, reporting lines, scheduling, budget limits, approvals. The runtime handles execution — receiving the task, routing it to a model, doing the work, and reporting state back.
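Seen from the runtime's side, that split might look like the sketch below. It is a hedged illustration, not Paperclip's real webhook contract: the function name `handle_wake` and the payload fields (`agent_id`, `ticket`, `status`, `result`) are hypothetical, and the HTTP layer is left out so the example stays self-contained.

```python
import json
from typing import Callable


def handle_wake(body: str, execute: Callable[[dict], str]) -> str:
    """Runtime-side handler: parse the wake-up, do the work, report state.

    `body` is the JSON the control plane would POST; `execute` stands in
    for whatever routes the task to a model and performs it.
    """
    request = json.loads(body)
    agent_id = request["agent_id"]
    ticket = request.get("ticket")

    # No ticket attached: nothing to execute, so report that and stop.
    if ticket is None:
        return json.dumps({"agent_id": agent_id, "status": "idle"})

    try:
        result = execute(ticket)
        status = "done"
    except Exception as exc:
        # Report the failure back instead of crashing the handler.
        result = str(exc)
        status = "failed"

    return json.dumps(
        {
            "agent_id": agent_id,
            "ticket_id": ticket["id"],
            "status": status,
            "result": result,
        }
    )
```

Everything organizational (who gets which ticket, when, and within what budget) stays outside this function, which is the point of the design.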
In practice this separation is the smartest design decision in the product. It means Paperclip does not have to compete on agent quality, and it does not rot when a new model drops. You upgrade the runtime; the control plane stays put. It also means your evaluation of Paperclip is really two evaluations: is the orchestration good (yes, mostly), and are your agents good (entirely up to you)?
Heartbeats: How Agents Actually Get Work
Paperclip does not stream a chat at an agent. It uses heartbeats, and once you understand them the whole system clicks.
On a configurable interval — the default is roughly every 30 minutes — Paperclip wakes each agent. The agent reads a checklist (in OpenClaw's case, a HEARTBEAT.md file) and decides whether anything needs doing. If nothing does, it replies with a simple "OK" signal and Paperclip silently drops it — no tokens wasted on a non-event. If something does need attention — a new ticket, a blocked dependency, a scheduled report — the agent acts on it autonomously and reports back.
Under the hood, a heartbeat is more than a cron tick. Each wake-up runs through a queue with coalescing (so an agent does not get woken five times at once), a budget check, workspace resolution, secret injection, and skill loading before the agent ever sees the task. That plumbing is invisible when it works, and it mostly works.
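Coalescing is the least obvious item in that list, so here is a minimal sketch of the idea, assuming a simple in-memory queue. `WakeQueue`, `run_heartbeats`, and the `has_budget` and `wake` callbacks are hypothetical names, not Paperclip internals; workspace resolution, secret injection, and skill loading are left out.

```python
from collections import OrderedDict
from typing import Callable, Iterator


class WakeQueue:
    """Holds pending wake-up requests, at most one entry per agent."""

    def __init__(self) -> None:
        self._pending: "OrderedDict[str, list[str]]" = OrderedDict()

    def request_wake(self, agent_id: str, reason: str) -> None:
        # A repeat request for the same agent is merged into the existing
        # entry instead of scheduling another wake-up.
        self._pending.setdefault(agent_id, []).append(reason)

    def drain(self) -> Iterator[tuple[str, list[str]]]:
        # Pop agents in first-requested order, one wake-up each.
        while self._pending:
            yield self._pending.popitem(last=False)


def run_heartbeats(
    queue: WakeQueue,
    has_budget: Callable[[str], bool],
    wake: Callable[[str, list[str]], None],
) -> list[str]:
    """Wake each queued agent once, skipping any that fail the budget check."""
    woken = []
    for agent_id, reasons in queue.drain():
        if not has_budget(agent_id):
            continue  # over budget: the agent never sees the task
        wake(agent_id, reasons)
        woken.append(agent_id)
    return woken
```

Five triggers for the same agent produce one wake-up carrying five reasons, and the budget gate runs before the agent is contacted at all.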
The heartbeat model has two real consequences:
- Paperclip is built for persistent, long-running work, not snappy interaction. If you want an answer in ten seconds, this is the wrong tool. If you want an agent that quietly keeps a backlog moving over days, it is the right one.
- Cadence is a cost lever. A 5-minute heartbeat on six agents is a very different bill from a 60-minute one. You tune this per company.
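The cadence point is easy to check with arithmetic. The sketch below counts wake-ups for a 30-day month and then applies a rough spend estimate; `active_fraction` (the share of wake-ups that find real work) and `cost_per_active_wakeup` are made-up inputs for illustration, not figures from Paperclip or from my test runs.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # 43,200, assuming a 30-day month


def wakeups_per_month(agents: int, interval_minutes: int) -> int:
    """Total heartbeats across all agents in one month."""
    return agents * (MINUTES_PER_MONTH // interval_minutes)


def estimated_spend(
    agents: int,
    interval_minutes: int,
    active_fraction: float,
    cost_per_active_wakeup: float,
) -> float:
    """Rough model spend: only wake-ups that find work are assumed to cost."""
    active = wakeups_per_month(agents, interval_minutes) * active_fraction
    return active * cost_per_active_wakeup


print(wakeups_per_month(6, 5))   # 51840
print(wakeups_per_month(6, 60))  # 4320
```

Six agents on a 5-minute heartbeat wake twelve times as often as the same six on a 60-minute one. Idle heartbeats cost little, so the real bill does not scale by a clean factor of twelve, but every extra wake-up is another chance for an agent to find something to spend tokens on.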
This is genuinely different from the "fire a prompt, watch it think" loop most people associate with agents, and it takes a day to adjust to. Once you do, the appeal is obvious: you stop operating agents and start managing them.
Budgets: The Feature That Justifies the Install
If heartbeats are the clever part, budgets are the part that earns its keep.
Every agent in Paperclip gets a monthly budget. The enforcement is blunt and effective:
- At 80% utilization, you get a warning.
- At 100%, the agent auto-pauses and stops accepting new tasks.
Crucially, task checkout and budget enforcement are atomic. An agent cannot pick up a ticket it cannot afford, and two agents cannot grab the same ticket. In a multi-agent system this matters more than it sounds — runaway spend and duplicated work are the two failure modes that turn an interesting experiment into a scary credit-card statement, and Paperclip closes both by design.
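A minimal sketch of what "atomic" buys you, under stated assumptions: the class names are invented, each ticket carries a hypothetical estimated cost that is reserved at checkout, and a single in-process lock stands in for whatever transaction mechanism Paperclip actually uses.

```python
import threading
from dataclasses import dataclass


@dataclass
class AgentBudget:
    monthly_limit: float
    spent: float = 0.0

    def status(self) -> str:
        utilization = self.spent / self.monthly_limit
        if utilization >= 1.0:
            return "paused"   # at 100% the agent stops accepting tasks
        if utilization >= 0.8:
            return "warning"  # at 80% a warning is raised
        return "ok"


class TicketBoard:
    def __init__(self, budgets: dict[str, AgentBudget], ticket_costs: dict[str, float]):
        self._lock = threading.Lock()
        self._budgets = budgets
        self._open = dict(ticket_costs)  # ticket_id -> estimated cost (assumed)
        self.owner: dict[str, str] = {}

    def checkout(self, agent_id: str, ticket_id: str) -> bool:
        # The availability check, the budget check, and the assignment all
        # happen under one lock, so no other checkout can interleave.
        with self._lock:
            cost = self._open.get(ticket_id)
            if cost is None:
                return False  # unknown ticket, or another agent has it
            budget = self._budgets[agent_id]
            if budget.status() == "paused":
                return False
            if budget.spent + cost > budget.monthly_limit:
                return False  # the agent cannot afford this ticket
            del self._open[ticket_id]
            self.owner[ticket_id] = agent_id
            budget.spent += cost  # reserve the estimated cost up front
            return True
```

If the budget check and the assignment were separate steps, two agents could both pass the check and both claim the ticket. Doing them as one step is what rules out duplicated work and overspend together.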
Anyone who has left an autonomous agent running overnight and woken up to a surprise will appreciate this. It is the single feature I would point to if someone asked why you would bother with an orchestration layer at all instead of just running agents directly.
Setup and First Impressions
Standing up Paperclip is a developer task, full stop. You clone the repo, run the Node.js server, open the React UI, and then do the actual work: connecting at least one agent runtime via webhook, configuring secrets, and defining your first company.
If you already run OpenClaw or Claude Code, this is an afternoon. If you do not, budget more — you are really setting up two systems, and the agent runtime is the harder half. The Paperclip documentation is decent and the OpenClaw pairing is the best-trodden path, but there is no "click to deploy and it just works" experience. There is no hosted option that would provide one.
The UI itself is a pleasant surprise. It is clean, fast, and clearly built by people who have used Linear. Org charts are legible, the ticket view is sensible, and budget dashboards are front-and-center rather than buried. For an open-source project this young, the design taste is well above average — and it matters, because the whole pitch is that you stay in control, which means you need to be able to see what is going on at a glance.
Living With It: Two Weeks of Real Use
Setup 1 was a content pipeline with three agents: a researcher, a drafter, and a reviewer. The researcher pulls material on each heartbeat and files tickets; the drafter picks them up; the reviewer checks the output before it reaches an approval gate that I had to clear manually.
This worked well, with one honest caveat: the coordination was reliable and the output quality was exactly what those three underlying agents would produce on their own. Paperclip made sure the reviewer never started before the drafter finished and never blew its budget. It did not make the drafter write better. That is the correct division of labor — but do not expect orchestration to paper over weak agents.
Setup 2 was an internal-tooling team: two engineering agents and a QA agent working against a small repo. Here the budget enforcement paid for itself within three days. One agent got into a loop on an ambiguous ticket and burned toward its cap. Paperclip warned me when the agent hit 80%, and it would have auto-paused the agent at 100% before any real damage was done. I rewrote the ticket and moved on. Without the cap, that is a bad evening.
The friction points were real but mundane: heartbeat tuning took trial and error, a misconfigured secret cost me a half-day of silent failures, and "approval gates" are only as good as your discipline in actually checking them. The system invites you to trust it, and it is equally clear that you should not trust it fully yet.
What's Good
- The orchestration model is genuinely right. Companies, roles, tickets, budgets — it maps cleanly onto how multi-agent work actually needs to be governed.
- Budgets with atomic enforcement. No double-work, no runaway spend. This alone can justify adoption.
- Runtime-agnostic by design. You are not betting on Paperclip's agent quality, because it does not have any. Upgrade models freely.
- Open source, self-hosted, no account. Your data, your infrastructure, your model keys. MIT license, no lock-in.
- Design quality. Rare for a project this young; it makes "staying in control" practical rather than aspirational.
What's Not
- It is infrastructure. Setup is a developer task and the agent-runtime half is the hard half. Non-technical users should not even start.
- No hosted option. If you do not want to run a server, there is no Paperclip for you. This is a deliberate choice, but it narrows the audience.
- It does not improve agents. Weak runtimes plus Paperclip equals well-organized weak output.
- Heartbeat latency. Built for persistent work, not interactive speed. Wrong tool for "answer me now."
- Young project. Expect rough edges, fast-moving APIs, and the occasional silent misconfiguration. Read release notes before upgrading.
- Skill/permission model is immature. Agents and their skills can run with broad filesystem and network access. Treat every agent as something that can touch your machine, and isolate accordingly.
Pricing: Free, With an Asterisk
Paperclip itself costs nothing. It is MIT-licensed and self-hosted, there is no paid tier, and there is no hosted cloud version to upsell you. That is the honest, unusual headline.
The asterisk is that "free software" is not "free to run." Your actual costs are:
- Infrastructure — a VPS or cloud instance to host the Node.js server. Modest, but not zero.
- Model usage — every heartbeat that does real work spends tokens on whatever model your agents use. This is the line item that matters, and it scales with the number of agents, heartbeat frequency, and task complexity.
The irony is worth stating plainly: Paperclip is free, and its main job is to stop the non-free part — model spend — from getting out of hand. Budget enforcement is not a side feature; it is the economic point of the product.
Who Should Use Paperclip
Use it if you:
- Already run, or seriously intend to run, three or more persistent agents and feel the coordination pain.
- Are comfortable self-hosting a Node.js service and connecting an agent runtime over webhooks.
- Want hard budget caps and human approval gates before you let agents act unsupervised.
- Are running separate projects or clients and want isolated "companies" in one deployment.
Skip it if you:
- Just need one agent for one task — Paperclip is pure overhead at that scale.
- Want a zero-setup, hosted, click-and-go experience.
- Are non-technical. This is not the on-ramp; it is the control room.
- Expect orchestration to make mediocre agents good. It will not.
Alternatives Worth Knowing
Paperclip occupies a specific slot — the governance and orchestration layer — so the relevant comparisons are not other agents but other ways to manage them.
- Roll your own scripts. The status quo for most teams. Fine for two agents, miserable at five. Paperclip exists precisely to replace this.
- OpenClaw on its own. OpenClaw is an agent runtime, not an orchestrator — it is what you would plug into Paperclip, not an alternative to it. Run OpenClaw alone if you only need one persistent agent.
- Agent frameworks (LangGraph, CrewAI, AutoGen, and similar). These are developer libraries for building multi-agent logic in code. Paperclip is an operational product for running and governing agents you already have. Different layer; some teams will use both.
- Hosted multi-agent platforms. Several commercial products promise managed multi-agent orchestration. They trade Paperclip's self-hosted control for convenience and a subscription. If you do not want to run a server, that is the tradeoff on the table.
For a wider survey of the agents you would actually plug into a system like this, see our roundup of the best AI agents in 2026.
Verdict
Paperclip is not trying to win the agent race. It is betting that the race already has enough runners and what is missing is a stadium — a place to organize them, fund them, watch them, and pull them off the field when they misbehave. On that bet, it largely delivers.
After two weeks I would describe Paperclip as the first multi-agent tool that made me feel like a manager instead of a babysitter. The org-chart metaphor is not marketing fluff; it is load-bearing. Budgets with atomic enforcement are the kind of unglamorous feature that quietly prevents disasters. And the decision to be runtime-agnostic means the product should age well as models keep changing underneath it.
It is also, unambiguously, infrastructure. It demands a developer, a server, and the patience to wire up an agent runtime — and it does nothing to make those agents smarter. If you are running one agent, Paperclip is overhead. If you are running several and the coordination has started to hurt, it is the most coherent fix available, and it costs nothing but the servers and the tokens it is designed to help you control.
Install it when you have a real multi-agent problem. Not before.
Last updated: May 2026. Features and licensing verified at time of publication; Paperclip is a fast-moving open-source project — check the repository for the current release.