Autonomous agents, production-ready

The framework for running
Claude agents at work

007 is a self-contained Docker environment that runs Claude Code as an autonomous agent: it clones a repo, receives a role, plans and ships changes, then hands off via the medium native to the domain.


$ just run engineer/software-developer

How it works

One container.
One role. One outcome.

Each run is stateless. The agent reads from the world, does its work, and writes back through the natural medium — no coordination layer needed.

📦

Container starts

entrypoint.sh sets up git and credentials, then installs the latest agent resources from GitHub. Clean slate every time.

📝

Role is loaded

header.md becomes CLAUDE.md (shared guardrails). Role-specific agents and skills are merged into ~/.claude/.

🔗

Prompt pulled from LangSmith Hub

The role prompt is fetched at runtime from LangSmith Hub. No local fallback — edit prompts in the Hub UI and promote through staging → production.

🤖

Claude works autonomously

Up to 100 turns. Claude reads its inputs from the world — a Linear issue, a Fiscal record, a codebase — and executes the role using all available skills and tools.

📋

Progress logged in real time

Every step is written to a structured log file, mounted back to the host. Tail it live with tail -f logs/agent.log.

🔄

Hands off via the right medium

A PR is opened, a Linear comment is posted, or a Fiscal record is updated. The next agent picks up exactly where this one left off.

Progress log

Every run is observable

Structured timestamped log lines. See exactly what the agent did, when, and whether it succeeded — while it's still running.

logs/engineer-software-developer-a3f2.log

2026-06-18T09:00:01Z done start Role loaded, starting work on INT-3847
2026-06-18T09:00:18Z done clone Cloned fiscal repository (main, 3847 commits)
2026-06-18T09:01:04Z done read-issue INT-3847: BookingService ignores tax override on credit notes
2026-06-18T09:02:31Z done find-target BookingService.kt:218 — applyTaxOverride skips CREDIT_NOTE type
2026-06-18T09:04:55Z done write-fix Added CREDIT_NOTE branch + 2 unit tests
2026-06-18T09:05:10Z done verify All 214 tests pass (0 failures)
2026-06-18T09:05:33Z done finish PR #287 opened — ready for review
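
The sample lines above appear to follow a fixed layout: a UTC timestamp, a status, a step name, and a free-text message. As a minimal sketch of how such a file could be consumed by a script, the Python below parses that layout. It assumes the four fields are separated by whitespace; the real log format may differ, and `LogEntry` and `parse_line` are hypothetical names, not part of 007.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    timestamp: datetime
    status: str
    step: str
    message: str


def parse_line(line: str) -> LogEntry:
    """Split one log line into timestamp, status, step, and message.

    Assumption: fields are whitespace-separated and only the message
    may itself contain spaces.
    """
    raw_ts, status, step, message = line.strip().split(maxsplit=3)
    # Parse the trailing-"Z" UTC timestamp explicitly and attach the UTC zone.
    timestamp = datetime.strptime(raw_ts, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    return LogEntry(timestamp, status, step, message)


# Two lines copied from the sample log above.
sample = [
    "2026-06-18T09:00:01Z done start Role loaded, starting work on INT-3847",
    "2026-06-18T09:05:10Z done verify All 214 tests pass (0 failures)",
]
entries = [parse_line(line) for line in sample]
elapsed = entries[-1].timestamp - entries[0].timestamp
print(f"{entries[-1].step}: {entries[-1].message} (after {elapsed})")
```

Because each line parses on its own, the same function works on a file that is still being written, which fits the "while it's still running" use case.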

Roles

One agent per job

Roles cluster by domain. Each gets its own prompt, guardrails, and skills. Trigger a run from the command line, with a Slack message, or through a Redis event in production on ECS.

# Trigger from the command line
just engineer INT-3847
just investigate INT-99
just run tax/monthly-closing

# Or via Slack
!engineer/software-developer LINEAR_ISSUE_ID=INT-3847
          

# task.json — single source of truth
{
  "role": "engineer/software-developer",
  "context": {
    "source": "slack",
    "triggered_at": "2026-06-18T09:00:00Z",
    "trace_id": "4bf92f35-77b3"
  },
  "task_payload": {
    "LINEAR_ISSUE_ID": "INT-3847"
  }
}
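
To make the relationship between the Slack trigger and task.json concrete, here is a minimal Python sketch that turns a command of the form `!<role> KEY=VALUE ...` into a dict with the same shape as the example above. The parsing rules, the `build_task` function, and the way `trace_id` and the timestamp are supplied are all assumptions for illustration; the source does not describe how 007 builds this file.

```python
import json
from datetime import datetime, timezone


def build_task(command: str, source: str, trace_id: str, now: datetime) -> dict:
    """Turn '!role KEY=VALUE ...' into a task.json-shaped dict (hypothetical helper)."""
    if not command.startswith("!"):
        raise ValueError("expected a command starting with '!'")
    # First token is the role, the rest are KEY=VALUE pairs.
    role, *pairs = command[1:].split()
    payload = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        payload[key] = value
    return {
        "role": role,
        "context": {
            "source": source,
            "triggered_at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "trace_id": trace_id,
        },
        "task_payload": payload,
    }


task = build_task(
    "!engineer/software-developer LINEAR_ISSUE_ID=INT-3847",
    source="slack",
    trace_id="4bf92f35-77b3",
    now=datetime(2026, 6, 18, 9, 0, 0, tzinfo=timezone.utc),
)
print(json.dumps(task, indent=2))
```

Passing the timestamp and trace ID in as arguments keeps the function deterministic, which makes it easy to test against the example file.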
          

engineer

software-developer

Implements code changes, writes tests, and opens pull requests against a Linear issue.

PR → GitHub

engineer

investigate-issue

Reads code, logs, and the production DB replica to find the root cause of a Linear issue.

Comment → Linear

engineer

data-engineer

Answers data questions and runs reports against the production database. Saves reusable queries via PR.

Query → PR

finance

revops-assistant

Revenue and sales reporting across HubSpot CRM and Stripe billing for RevOps and finance teams.

Report → Slack

finance

billing-investigator

Previews what will be invoiced for all orgs, or traces sold/signed/delivered/billed for root-cause analysis.

Analysis → Slack

tax

monthly-closing

Runs the monthly tax closing workflow for an organization, writing draft bookings to the Fiscal API.

Record → Fiscal

ops

hello-world

Smoke-test role that verifies infrastructure, credentials, and tool access are all working.

Log → stdout

ops

evaluator

Evaluates KAI conversation quality and posts structured feedback scores to the ops pipeline.

Score → Slack

Handoff model

The medium is the
coordination layer

No extra orchestration needed. Agents hand off through whatever is native to the domain — a second agent reads the current state and picks up from there.

Domain: Code

GitHub PR branch + comments

Reviewer requests changes on the PR; writer amends and pushes a new commit.

Domain: Product

Linear issue description + comments

Clarifier adds acceptance criteria to the issue; engineer implements from there.

Domain: Tax / Accounting

Fiscal API records

Booking agent writes a draft record; reviewer agent corrects and re-submits.

Architecture

Event-driven on ECS

Production runs are triggered by events on Redis Pub/Sub. The listener dispatches to a Lambda, which launches an ECS Fargate task — each agent gets a fresh container.

Redis Pub/Sub ──▶ agent-listener (ECS service) ──▶ agent-task-runner (Lambda) ──▶ Agent container (ECS Fargate)

permissions.yaml controls which roles the listener will dispatch · each task gets its own ephemeral container
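
The dispatch gate described above can be sketched in a few lines of Python. The schema of permissions.yaml is not shown in the source, so the structure below (trigger source mapped to allowed role patterns) is purely an assumption, as are the names `may_dispatch` and `handle_event`. A plain dict stands in for the parsed YAML, and a callback stands in for the Lambda invocation.

```python
from fnmatch import fnmatchcase

# Hypothetical shape of permissions.yaml once loaded: source -> allowed role patterns.
PERMISSIONS = {
    "slack": ["engineer/*", "ops/hello-world"],
    "redis": ["tax/monthly-closing"],
}


def may_dispatch(task: dict, permissions: dict) -> bool:
    """Return True if the task's role is allowed for its trigger source."""
    source = task.get("context", {}).get("source", "")
    role = task.get("role", "")
    return any(fnmatchcase(role, pattern) for pattern in permissions.get(source, []))


def handle_event(task: dict, permissions: dict, launch) -> str:
    """Check permissions, then hand the task to `launch` (a stand-in for the task runner)."""
    if not may_dispatch(task, permissions):
        return "rejected"
    launch(task)
    return "dispatched"


launched = []
allowed = {"role": "engineer/software-developer", "context": {"source": "slack"}}
denied = {"role": "tax/monthly-closing", "context": {"source": "slack"}}

print(handle_event(allowed, PERMISSIONS, launched.append))
print(handle_event(denied, PERMISSIONS, launched.append))
```

Unknown sources and unknown roles fall through to "rejected", so the sketch is deny-by-default.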

🔒

Non-root container

Runs as the agent user. --dangerously-skip-permissions is safe here because containers are fully ephemeral and isolated.

🧱

Hard guardrails

header.md enforces: no direct pushes to main, no force-push, no destructive SQL, no secret logging. Role prompts cannot override these rules.

📡

LangSmith tracing

Every run creates a trace posted at start and updated on finish. Only operational metadata — role, tag, turn count, status. No customer data.

🛠

Shared + role-specific skills

Shared skills (Slack, Linear, log) are baked into the Docker image. Role-specific skills are mounted at runtime, so no rebuild is needed to ship new capabilities.

☁️

AWS Bedrock

Claude runs via AWS Bedrock in eu-central-1 — data stays in the EU. Bedrock credentials are injected from Secrets Manager at task start.

🔁

Multi-agent loops

Writer → reviewer → writer. The outer loop stops when the reviewer approves. Each agent reads the current state of the medium — no extra coordination code.
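
The writer and reviewer loop can be pictured as a small outer loop in which both agents only read and write shared state. The Python sketch below uses a dict as a stand-in for the medium (for example a PR) and plain functions as stand-ins for agent runs. The `run_loop` name and the `max_rounds` cap are assumptions added for illustration; the source only says the loop stops when the reviewer approves.

```python
from typing import Callable


def run_loop(
    writer: Callable[[dict], None],
    reviewer: Callable[[dict], bool],
    medium: dict,
    max_rounds: int = 5,
) -> int:
    """Alternate writer and reviewer until the reviewer approves.

    Returns the number of rounds used. `max_rounds` is an assumed safety cap.
    """
    for round_number in range(1, max_rounds + 1):
        writer(medium)        # writer reads the current state and amends it
        if reviewer(medium):  # reviewer reads the same state and decides
            return round_number
    raise RuntimeError(f"no approval after {max_rounds} rounds")


# Stand-in agents: the writer addresses open comments by pushing a new revision,
# and the reviewer approves from the second revision onward.
def writer(pr: dict) -> None:
    pr["revision"] += 1
    pr["comments"].clear()


def reviewer(pr: dict) -> bool:
    if pr["revision"] < 2:
        pr["comments"].append("please add a unit test")
        return False
    return True


pr = {"revision": 0, "comments": []}
rounds = run_loop(writer, reviewer, pr)
print(f"approved after {rounds} rounds at revision {pr['revision']}")
```

Neither stand-in agent calls the other; all communication goes through `pr`, which mirrors the idea that the medium itself carries the coordination.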


Ship your first agent today
