The Architect: Autonomous Development Lifecycle for Agentic AI Coding Tools
The Story
I spent three months watching AI coding agents fail quietly. An agent writes two files, declares success, then leaves you with broken tests. Another rewrites the same function over three retries because it lacks memory. An agent stalls at 11pm, burning tokens while you're asleep.
I was building agentic systems for production where babysitting every run isn’t an option. The tools were good at writing code but bad at finishing the job.
So I built The Architect, an open-source autonomous development lifecycle layer that wraps your AI coding CLI. It adds what’s missing: planning, completion verification, retry intelligence, quality review, and persistent memory. It is provider-agnostic and works with Claude Code CLI, Codex CLI, and OpenCode CLI. It's on PyPI. The build number, 10042, is how many autonomous runs it took to stabilise the tool.
The Pain: You Are the Orchestration Layer
Using an AI coding agent directly on a multi-task goal means babysitting every run. You watch endless retries, catch hallucinated completions, and manually carry context across attempts. After several tasks, fatigue sets in and edge cases ship because you can’t watch all night.
Active supervision: 3-4 hours for a 10-task goal. Mostly watching, not thinking. You’re catching stuck loops, not architecting solutions.
The Four Gaps
- Completion isn’t verified. Exit code 0 means nothing. Agents routinely hallucinate task completion.
- Retries have no memory. Each retry starts cold without knowledge of past failures or outputs.
- No stuck detection. Blocked agents burn tokens indefinitely unless manually stopped.
- Context resets every session. Project decisions and past lessons get erased between runs.
The Fake Solutions
- Better prompts? They help temporarily but don’t scale or adapt as the code evolves.
- More expensive models? They reduce failure frequency but don’t eliminate hallucinations, stuck runs, or the need for supervision.
The core issue isn’t model capability. You can’t hand off control without losing control. Without a fail-safe layer, you watch or pay for chaos.
The Solution
The Architect is the handoff mechanism. You stay in control—defining architecture, goals, and scope. It handles every failure mode that normally demands your intervention.
Provider-Agnostic Architecture
The Architect works seamlessly with your existing AI coding tools:
- Claude Code CLI – Anthropic’s official agentic coding tool
- Codex CLI – OpenAI’s terminal-based coding agent
- OpenCode CLI – Open-source multi-provider alternative
No vendor lock-in. Switch providers mid-run. Use different providers for planning vs execution. The orchestration layer remains constant.
Mechanism 1: Autonomous Planning
The architect agent decomposes your goal into numbered task files automatically. You don’t write tasks manually.
It reads your goal, the auto-detected project structure, ARCHITECT.md, and any context files you provide. It writes task files to tasks/. Each contains specific, executable instructions so build agents don't waste time guessing.
Scope controls task sizing. Simple scope means many small tasks; complex means fewer broad tasks tuned to your model's context window.
Mechanism 2: Multi-Signal Completion Detection
No single signal is trusted. Multiple corroborating signals decide task completion:
| Signal | How it works | Strength |
|---|---|---|
| Promise tag | Agent outputs <promise>TXX_COMPLETE</promise> | Strong |
| PROGRESS.md | Task marked "Done" in progress file | Moderate |
| Clean exit | Provider CLI exited with code 0 | Weak |
| Progress signal | Text contains "all tests pass", "task is done" | Weak |
Decision rules: a task is Done when two or more signals are positive. The promise tag alone is sufficient; a clean exit code alone is not. Any stuck signal (for example, the agent writing "I’m stuck") overrides completion claims, which catches hallucinated finishes.
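The decision rules above can be sketched in a few lines of Python. This is a minimal illustration, not The Architect's actual code: the `Signals` class, the helper names, the phrase lists, and the assumed PROGRESS.md layout (one line per task naming the task and its status) are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative phrase lists; the tool's real lists are not given in this post.
STUCK_PHRASES = ("i'm stuck", "i am stuck")
PROGRESS_PHRASES = ("all tests pass", "task is done")


@dataclass
class Signals:
    promise_tag: bool    # strong
    progress_done: bool  # moderate
    clean_exit: bool     # weak
    progress_text: bool  # weak
    stuck: bool          # overrides everything else


def collect_signals(task_id: str, output: str, exit_code: int, progress_md: str) -> Signals:
    """Gather every signal for one attempt (hypothetical helper)."""
    text = output.lower().replace("\u2019", "'")  # treat curly apostrophes as straight
    return Signals(
        promise_tag=f"<promise>{task_id}_COMPLETE</promise>" in output,
        # Assumed layout: a PROGRESS.md line that names the task and says "Done".
        progress_done=any(
            task_id in line and "Done" in line for line in progress_md.splitlines()
        ),
        clean_exit=exit_code == 0,
        progress_text=any(p in text for p in PROGRESS_PHRASES),
        stuck=any(p in text for p in STUCK_PHRASES),
    )


def is_complete(s: Signals) -> bool:
    if s.stuck:
        return False  # a stuck signal overrides any completion claim
    if s.promise_tag:
        return True   # the promise tag alone suffices
    # Otherwise require at least two corroborating signals.
    return sum([s.progress_done, s.clean_exit, s.progress_text]) >= 2
```

Checking the stuck signal first is what lets a run that prints both a promise tag and "I'm stuck" be treated as not done.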
Mechanism 3: Circuit Breaker
Retries fix transient issues. The circuit breaker detects persistent failure patterns.
Three checkpoint counters are persisted to disk:
- No-progress: three attempts with zero files written trip the circuit.
- Same-error: three attempts with the same logical shell error trip the circuit.
- Token decline: a third attempt that uses under 40% of the first attempt's tokens, combined with elevated counters, trips the circuit.
When the circuit opens, it chooses a recovery action: WAIT, REPLAN, or COOLDOWN_WAIT. State persists across restarts, so an interrupted run resumes with its counters intact.
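Here is a rough sketch of how the three counters and their on-disk state could fit together. It rests on several assumptions: the `BreakerState` class, the JSON file format, and the threshold of 2 for an "elevated" counter are mine, not taken from the tool. How the real circuit picks between WAIT, REPLAN, and COOLDOWN_WAIT is not covered here.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from pathlib import Path

TRIP_THRESHOLD = 3          # attempts before a counter trips the circuit
TOKEN_DECLINE_RATIO = 0.40  # third attempt compared with the first
ELEVATED = 2                # assumption: what counts as an "elevated" counter


@dataclass
class BreakerState:
    no_progress: int = 0
    same_error: int = 0
    last_error: str | None = None
    tokens: list[int] = field(default_factory=list)

    def record(self, files_written: int, error: str | None, tokens_used: int) -> None:
        """Update the counters after one attempt."""
        self.no_progress = self.no_progress + 1 if files_written == 0 else 0
        if error is None:
            self.same_error = 0
        elif error == self.last_error:
            self.same_error += 1
        else:
            self.same_error = 1
        self.last_error = error
        self.tokens.append(tokens_used)

    def is_open(self) -> bool:
        if self.no_progress >= TRIP_THRESHOLD or self.same_error >= TRIP_THRESHOLD:
            return True
        declined = (
            len(self.tokens) >= 3
            and self.tokens[0] > 0
            and self.tokens[2] < TOKEN_DECLINE_RATIO * self.tokens[0]
        )
        elevated = max(self.no_progress, self.same_error) >= ELEVATED
        return declined and elevated

    def save(self, path: Path) -> None:
        # Persisting after every attempt lets a restarted run pick up where it left off.
        path.write_text(json.dumps(asdict(self)), encoding="utf-8")

    @classmethod
    def load(cls, path: Path) -> "BreakerState":
        if not path.exists():
            return cls()
        return cls(**json.loads(path.read_text(encoding="utf-8")))
```

A caller would `load` the state before each attempt, `record` the outcome, `save`, and stop retrying once `is_open()` returns true.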
Mechanism 4: Retry with Context Carry
Failed tasks retry up to three times by default, with prior attempt context summarised and injected.
Retries use knowledge of files created/edited, commands run, and test failures to avoid repeating mistakes.
Different providers can be used per attempt, enabling fallback options when one model stalls.
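To make context carry concrete, here is a small sketch of summarising prior attempts and prepending them to the task prompt. The `AttemptRecord` fields mirror the items listed above (files, commands, test failures); the class, the function names, and the prompt wording are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AttemptRecord:
    number: int
    files_touched: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    test_failures: list[str] = field(default_factory=list)


def summarise(attempt: AttemptRecord) -> str:
    """Compress one failed attempt into a single line."""
    def fmt(items: list[str]) -> str:
        return ", ".join(items) if items else "none"

    return (
        f"Attempt {attempt.number}: "
        f"files touched: {fmt(attempt.files_touched)}; "
        f"commands run: {fmt(attempt.commands_run)}; "
        f"test failures: {fmt(attempt.test_failures)}"
    )


def build_retry_prompt(task_text: str, history: list[AttemptRecord]) -> str:
    """Prepend a summary of earlier attempts so the retry does not start cold."""
    if not history:
        return task_text  # first attempt: nothing to carry
    lines = ["Previous attempts failed. Do not repeat these mistakes:"]
    lines += [summarise(a) for a in history]
    return "\n".join(lines) + "\n\n" + task_text
```

Keeping each attempt to one line is a deliberate choice in this sketch: the summary has to stay small enough that it does not crowd the task itself out of the context window.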
Mechanism 5: Retrospective Reviewer
After all tasks finish, a separate reviewer agent audits work:
- Reads PROGRESS.md, task files, and actual code
- Runs your test suite
- Detects failed tests, missing requirements, and edge cases
- Creates R-prefixed fix-up tasks for unresolved issues
Persistent mode runs two retrospective passes for thorough validation. The reviewer never modifies existing files or progress state; it only creates fix-up tasks.
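The "create only, never overwrite" constraint is easy to enforce at the file level. The sketch below is one way to do it, under assumptions: the `R01.md` naming scheme and the file contents are invented for illustration, and only the R prefix and the tasks/ directory come from the description above.

```python
from __future__ import annotations

from pathlib import Path


def write_fixup_tasks(tasks_dir: Path, findings: list[str]) -> list[Path]:
    """Create one R-prefixed task file per finding without touching existing files."""
    tasks_dir.mkdir(parents=True, exist_ok=True)
    next_index = len(list(tasks_dir.glob("R*.md"))) + 1
    created = []
    for offset, finding in enumerate(findings):
        path = tasks_dir / f"R{next_index + offset:02d}.md"  # hypothetical naming scheme
        # Mode "x" raises FileExistsError instead of overwriting an existing file.
        with path.open("x", encoding="utf-8") as fh:
            fh.write(f"# Fix-up task {path.stem}\n\n{finding}\n")
        created.append(path)
    return created
```

Opening with exclusive-create mode means a naming collision fails loudly rather than silently replacing an earlier task.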
Mechanism 6: ARCHITECT.md — Persistent Project Intelligence
ARCHITECT.md accumulates structured knowledge across sessions, read by every agent before acting.
Contains:
- Project structure and dependencies, auto-generated every planning session
- Permanent decisions, constraints, lessons learned, and best practices—append-only records accrued automatically
- Planning history for traceability
This persistent intelligence transforms autonomous development from one-off tasks into continuous, context-aware cycles.
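An append-only record is simple to sketch. The function below assumes a dated one-line entry format, which is my own invention; the real ARCHITECT.md layout is not shown in this post.

```python
from __future__ import annotations

from datetime import date
from pathlib import Path


def append_record(path: Path, kind: str, text: str, today: date | None = None) -> None:
    """Append one dated entry; existing content is never rewritten."""
    stamp = (today or date.today()).isoformat()
    # Mode "a" only ever adds to the end of the file.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"- [{stamp}] {kind}: {text}\n")
```

For example, `append_record(Path("ARCHITECT.md"), "decision", "Use SQLite for local state")` would add a single line and leave earlier entries untouched.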
Production Codebases
Greenfield projects are easier; production codebases have accumulated decisions and constraints invisible to agents.
The Architect mitigates this by capturing decisions in ARCHITECT.md, using frontier models for planning with full context, keeping tasks small and isolated, and letting you define architecture explicitly. Agents implement your vision—they don’t invent it.
Local GPU Models
Context window limits are real.
Even large context models fill quickly with instructions, context files, code, and tests.
The Architect inverts the approach:
- Frontier models handle planning and retrospective review, where their much larger context windows matter most
- Local smaller models execute focused tasks with manageable context sizes
This mixed-model pattern enables real production workflows on local hardware with 20k-40k token windows.
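A quick way to sanity-check whether a task will fit a small local window is to estimate its token count before dispatching it. The sketch below uses a crude four-characters-per-token heuristic rather than a real tokenizer, and the function names and the 4,000-token output reserve are assumptions for illustration.

```python
from __future__ import annotations

CHARS_PER_TOKEN = 4  # rough heuristic for English text and code, not a real tokenizer


def estimate_tokens(text: str) -> int:
    return len(text) // CHARS_PER_TOKEN + 1


def fits_window(parts: list[str], window_tokens: int, reserve_for_output: int = 4000) -> bool:
    """True if instructions, context files, and code fit with room left for the reply."""
    needed = sum(estimate_tokens(p) for p in parts)
    return needed + reserve_for_output <= window_tokens
```

A task that fails this check for a 20k window is a candidate for splitting into smaller tasks at planning time, or for routing to a model with a larger window.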
Overnight Safety
Configure for unattended overnight runs with:
[architect]
persistent = true
token_budget_per_hour = 500000
The system handles retries, cooldowns, circuit trips, replanning, and reviews autonomously. It persists state to resume exactly after interruptions.
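To show what an hourly token budget implies in practice, here is a minimal sketch of a fixed-window limiter. It is an assumption about how such a budget could be enforced, not the tool's implementation; the `HourlyTokenBudget` class and its injectable clock are hypothetical.

```python
import time
from typing import Callable


class HourlyTokenBudget:
    """Fixed one-hour window: spending is refused once the limit is reached."""

    def __init__(self, limit: int, clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.clock = clock
        self.window_start = clock()
        self.spent = 0

    def try_spend(self, tokens: int) -> bool:
        now = self.clock()
        if now - self.window_start >= 3600:
            # A new hour has started, so reset the window.
            self.window_start = now
            self.spent = 0
        if self.spent + tokens > self.limit:
            return False  # caller should wait for the next window
        self.spent += tokens
        return True


budget = HourlyTokenBudget(limit=500_000)  # matches token_budget_per_hour above
```

When `try_spend` returns false, an unattended run would pause until the window rolls over instead of pushing past the budget.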
Dog-Food
The Architect tests itself.
At task T47, the circuit breaker detected repeated FileExistsError failures, triggering REPLAN and targeted fixes automatically. This caught a bug in its own lock file implementation overnight.
Honest Limits
- Does not improve code quality beyond the underlying AI coding tool—quality ceiling is the model itself.
- Bad goals produce vague tasks—clear goals and context files are essential.
- Retrospective review catches test failures but not architectural drift or global design mistakes—it’s a quality gate, not a full code review.
- Token accounting unavailable with Claude Code due to output format—use OpenCode or Codex if this matters.
- Free open-source models are slower—expect about triple run times versus Claude Sonnet for 10-task goals.
Getting Started
Install from PyPI:
- pip install the-architect
Requires Python 3.11+ and one or more supported AI coding CLI providers: Claude Code, Codex, or OpenCode.
Basic commands:
- architect init
- architect --plan --goal "add Stripe payment integration"
- architect (to execute)
The Architect plans, executes, retries, reviews, and reports—fully unattended.
Full documentation at github.com/iNetanel/the-architect.