The Architect: Autonomous Development Lifecycle Layer for Agentic AI Coding Tools
The Story
I spent three months watching AI coding agents fail repeatedly, not catastrophically but quietly. Agents that write a couple of files, declare success, and exit cleanly, yet leave broken tests behind. Agents that rewrite the same function across retries because they forget previous context. Agents that burn tokens in endless loops when stuck, especially while you’re asleep. I was building agentic systems for production, the kind where babysitting every run isn’t an option. The tools wrote code well but failed to finish the job properly.
I built The Architect to fix this. It’s an open-source autonomous development lifecycle layer that wraps your AI coding CLI. It adds what these agents lack: comprehensive planning, rigorous completion verification, intelligent retries, quality review, and persistent memory. Provider-agnostic, it supports Claude Code CLI, Codex CLI, and OpenCode CLI. Available on PyPI, it ships itself. Build 10042 marks the effort to stabilise the tool through autonomous operations.
The Pain: You Are the Orchestration Layer
Using AI coding agents directly is exhausting in multi-task projects. You start the agent, watch it rewrite the same function repeatedly, kill it, retry with context manually re-injected, check broken tests, repeat. By task seven you’re fatigued. By task nine, bugs ship because it’s 1am and you must sleep.
Active supervision runs 3-4 hours for ten tasks. Most of that is babysitting, not building architecture. You’re the orchestration layer, not the model. The AI solves coding but not orchestration.
The Four Gaps
- Completion isn’t verified: Exit code 0 or "task complete" claims mean little. Agents routinely hallucinate or only partially deliver.
- Retries have no memory: Each retry starts cold, blindly repeating past mistakes.
- No stuck detection: Blocked agents run endlessly until manually stopped.
- Context resets each session: Every new session needs project re-explanation, losing prior lessons and decisions.
The Fake Solutions
Better prompts help briefly, but become obsolete as codebases evolve or models update. More expensive models hallucinate less but don’t eliminate failures or stuck states. They shift cost from supervision to pricier QA. The real obstacle is you can’t hand off control without losing control. Either you watch or chaos unfolds.
The Solution
The Architect is the handoff mechanism that keeps you in control. You define the architecture, goal, and scope. The Architect handles execution and every failure mode that otherwise demands your intervention.
Provider-Agnostic Architecture
Works with existing AI coding tools:
- Claude Code CLI — Anthropic’s agentic coding tool
- Codex CLI — OpenAI’s terminal coding agent
- OpenCode CLI — Open-source multi-provider alternative
No vendor lock-in. Switch providers mid-run. Use different providers for planning and execution. Orchestration remains consistent.
Mechanism 1: Autonomous Planning
The architect agent reads your goal, project structure (auto-detected), your ARCHITECT.md, plus any extra context. It decomposes the goal into numbered task files in tasks/. Tasks are self-contained, detailed instructions, not vague directives.
Scope controls task size: the simple scope yields many small, focused tasks, ideal for weaker models, while the complex scope creates fewer, larger tasks for frontier models.
Mechanism 2: Multi-Signal Completion Detection
The Architect weighs four signals, of differing strength, before declaring a task complete:
| Signal | How it works | Strength |
|---|---|---|
| Promise tag | Agent outputs <promise>TXX_COMPLETE</promise> | Strong |
| PROGRESS.md | Task marked "Done" in progress file | Moderate |
| Clean exit | Provider CLI exited with code 0 | Weak |
| Progress signal | Text contains phrases like "all tests pass", "task is done" | Weak |
Decision rules: completion requires two or more positive signals or the promise tag alone. Exit code zero alone does not confirm completion. Stuck signals override all indications of completion.
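The decision rules above can be sketched as a small function. This is a minimal illustration, not The Architect's actual code: the `Signals` class, its field names, and `is_complete` are hypothetical, and it assumes "two or more positive signals" counts the three non-promise signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """Hypothetical container for the four completion signals plus a stuck flag."""
    promise_tag: bool = False      # agent printed <promise>TXX_COMPLETE</promise>
    progress_done: bool = False    # task marked "Done" in PROGRESS.md
    clean_exit: bool = False       # provider CLI exited with code 0
    progress_phrase: bool = False  # output contains e.g. "all tests pass"
    stuck: bool = False            # a stuck signal was detected


def is_complete(s: Signals) -> bool:
    # Stuck signals override every indication of completion.
    if s.stuck:
        return False
    # The promise tag is strong enough to stand alone.
    if s.promise_tag:
        return True
    # Otherwise at least two of the remaining signals must agree.
    positives = sum([s.progress_done, s.clean_exit, s.progress_phrase])
    return positives >= 2
```

Under these assumptions, a clean exit on its own is rejected, while a clean exit plus a "Done" entry in PROGRESS.md is accepted.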
Mechanism 3: Circuit Breaker
Detects failure patterns beyond retries using three persisted counters:
- No-progress: Three consecutive attempts write no files → trip
- Same-error: Three attempts with identical error fingerprints → trip
- Token decline: Attempt 3 uses less than 40% of attempt 1’s tokens, in combination with the other counters → trip
Upon circuit open, it triggers recovery actions: WAIT, REPLAN (rewrite the failing task), or COOLDOWN_WAIT. Circuit state persistence enables restart resilience.
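A rough sketch of how the three counters could be derived and persisted follows. Everything here is an assumption made for illustration: the `Attempt` record, the function names, the JSON state file, counting only consecutive trailing attempts, and reading "combined with other counters" as "at least one other counter is non-zero". The real implementation may differ on each point.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional

TRIP_AT = 3                 # attempts before a counter trips
TOKEN_DECLINE_RATIO = 0.40  # attempt 3 below 40% of attempt 1


@dataclass
class Attempt:
    files_written: int
    error_fingerprint: Optional[str]  # None when no error was recorded
    tokens_used: Optional[int]        # None when the provider reports no count


def counters(attempts: List[Attempt]) -> dict:
    """Derive the three counters from one task's attempt history."""
    # Trailing attempts that wrote no files.
    no_progress = 0
    for a in reversed(attempts):
        if a.files_written > 0:
            break
        no_progress += 1

    # Trailing attempts that share the latest error fingerprint.
    same_error = 0
    last_fp = attempts[-1].error_fingerprint if attempts else None
    if last_fp is not None:
        for a in reversed(attempts):
            if a.error_fingerprint != last_fp:
                break
            same_error += 1

    # Attempt 3 compared with attempt 1; skipped when counts are missing.
    token_decline = False
    if len(attempts) >= 3:
        first, third = attempts[0].tokens_used, attempts[2].tokens_used
        if first and third is not None:
            token_decline = third < TOKEN_DECLINE_RATIO * first

    return {"no_progress": no_progress, "same_error": same_error,
            "token_decline": token_decline}


def should_trip(c: dict) -> bool:
    if c["no_progress"] >= TRIP_AT or c["same_error"] >= TRIP_AT:
        return True
    # Assumed reading: token decline only trips alongside another counter.
    return c["token_decline"] and (c["no_progress"] > 0 or c["same_error"] > 0)


def save(c: dict, path: Path) -> None:
    """Persist counters so a restarted run resumes with the same circuit state."""
    path.write_text(json.dumps(c))


def load(path: Path) -> dict:
    return json.loads(path.read_text())
```

Writing the counters to disk after every attempt is what lets a restarted process pick up with the circuit in the same state rather than starting from zero.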
Mechanism 4: Retry with Context Carry
Each task gets up to three attempts (30 in persistent mode), and each retry includes summarised context from prior attempts. The agent knows what failed and what was written, so it can avoid repeating mistakes. A different model or provider can be selected per attempt, which improves the odds of recovery.
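To make the idea concrete, here is a minimal sketch of a retry loop that carries context forward. The `AttemptResult` type, the prompt wording, and the `run_agent` callable (a stand-in for invoking a provider CLI) are all hypothetical; the sketch only shows the shape of the technique.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AttemptResult:
    succeeded: bool
    files_written: List[str] = field(default_factory=list)
    error: str = ""


def build_prompt(task: str, history: List[AttemptResult]) -> str:
    """Prefix the task with a summary of what earlier attempts did."""
    if not history:
        return task
    lines = [task, "", "Previous attempts:"]
    for i, r in enumerate(history, start=1):
        files = ", ".join(r.files_written) or "none"
        lines.append(f"- Attempt {i}: files written: {files}; error: {r.error or 'none'}")
    lines.append("Do not repeat the approaches that failed above.")
    return "\n".join(lines)


def run_with_retries(task: str,
                     run_agent: Callable[[str], AttemptResult],
                     max_attempts: int = 3) -> List[AttemptResult]:
    history: List[AttemptResult] = []
    for _ in range(max_attempts):
        result = run_agent(build_prompt(task, history))
        history.append(result)
        if result.succeeded:
            break
    return history
```

Because `run_agent` is just a callable, choosing a different model or provider per attempt would amount to passing a different function on each iteration.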
Mechanism 5: Retrospective Reviewer
After all tasks complete, a separate reviewer agent audits what was done versus planned, runs your test suite, and if it detects failures or gaps, creates fix-up tasks (prefixed with R) to re-run through the pipeline. It cannot modify prior tasks but supplements with corrective work.
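The "supplement, never modify" rule can be illustrated with a short sketch that writes new R-prefixed task files next to the existing ones. The file naming (`R01.md`, `R02.md`), the Markdown body, and the `write_fixup_tasks` function are assumptions made for this example; only the R prefix and the tasks/ directory come from the description above.

```python
from pathlib import Path
from typing import List


def write_fixup_tasks(gaps: List[str], tasks_dir: Path) -> List[Path]:
    """Create one new R-prefixed task file per gap; existing files are never touched."""
    tasks_dir.mkdir(parents=True, exist_ok=True)
    # Continue numbering after any fix-up tasks from earlier reviews.
    next_num = len(list(tasks_dir.glob("R*.md"))) + 1
    created = []
    for offset, gap in enumerate(gaps):
        path = tasks_dir / f"R{next_num + offset:02d}.md"
        path.write_text(f"# Fix-up task {path.stem}\n\n{gap}\n")
        created.append(path)
    return created
```

Since the function only ever creates new files, the original plan stays intact as a record of what was intended.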
Mechanism 6: ARCHITECT.md — Persistent Project Intelligence
A structured file accumulating project wisdom across sessions:
- Project Structure: repo type, languages, frameworks, dependencies
- Permanent Decisions: append-only architectural choices
- Known Constraints: technical limits discovered during builds
- Lessons Learned: append-only operational notes
- Best Practices: procedural rules collected over runs
- Planning History: auto-appended after each plan
Every planning and build session reads and updates ARCHITECT.md, creating a living memory that elevates long-term autonomous development.
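As an illustration of the append-only sections, the sketch below adds a bullet to a named section without rewriting anything already there. It assumes, purely for the example, that sections are introduced by `## ` headings and that entries are `- ` bullets; the actual layout of ARCHITECT.md may differ, and `append_to_section` is a hypothetical helper.

```python
def append_to_section(doc: str, section: str, entry: str) -> str:
    """Append a bullet to a section, leaving every existing line unchanged."""
    lines = doc.splitlines()
    heading = f"## {section}"
    try:
        start = lines.index(heading)
    except ValueError:
        # Section does not exist yet: create it at the end of the file.
        return doc.rstrip("\n") + f"\n\n{heading}\n- {entry}\n"

    # Find where the section ends (next heading or end of file).
    end = start + 1
    while end < len(lines) and not lines[end].startswith("## "):
        end += 1

    # Insert after the section's last non-blank line.
    insert_at = end
    while insert_at > start + 1 and lines[insert_at - 1].strip() == "":
        insert_at -= 1
    lines.insert(insert_at, f"- {entry}")
    return "\n".join(lines) + "\n"
```

A call such as `append_to_section(text, "Lessons Learned", "Run migrations before tests")` returns the updated text and leaves earlier entries in place, which is the property that makes the file usable as long-term memory.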
Production Codebases
Production codebases compound the challenges of agentic AI: the decisions behind the code are not visible in the source files alone. The Architect mitigates this with ARCHITECT.md, frontier planning models given full context, scoped tasks that reduce risk, and an architecture fixed by you, not the AI.
Local GPU Models
Local models may advertise large token windows, but those fill fast under real-world conditions. The Architect’s mixed-model approach has a frontier model decompose the goal, then executes the scoped tasks on local models with manageable context. Frontier models plan and review; local models execute.
Overnight Safety
Unattended runs handle failures, retries, cooldowns, circuit trips, replanning, and retrospective fixes. State persistence survives restarts. Token budget caps limit spend and lock files prevent concurrent runs. Build 10042 reflects real operational stabilisation.
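Two of these safeguards, the lock file and the token budget cap, are simple enough to sketch. The names below (`run_lock`, `TokenBudget`, the exception classes) are hypothetical and the code is one common way to get these guarantees, not a description of The Architect's internals. The lock relies on `os.O_EXCL`, which makes file creation fail if the file already exists.

```python
import os
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    pass


class BudgetExceeded(RuntimeError):
    pass


@contextmanager
def run_lock(path: str):
    """Hold an exclusive lock file for the duration of a run."""
    try:
        # O_CREAT | O_EXCL: atomic "create only if absent".
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(f"lock file exists: {path}") from None
    try:
        with os.fdopen(fd, "w") as f:
            f.write(str(os.getpid()))  # record the owner for debugging
        yield
    finally:
        os.unlink(path)


class TokenBudget:
    """Running total of tokens spent, with a hard cap."""

    def __init__(self, cap: int) -> None:
        self.cap = cap
        self.used = 0

    def charge(self, tokens: int) -> None:
        self.used += tokens
        if self.used > self.cap:
            raise BudgetExceeded(f"used {self.used} of {self.cap} tokens")
```

One limitation of this simple form: if the process is killed hard, the lock file is left behind, so a production version would also need a way to detect and clear stale locks.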
Dog-Food
The Architect uses itself. At task T47, its circuit breaker caught a bug in the lock file implementation by detecting recurring errors and triggering a targeted replan. I found this in logs the next morning, proof of the system's real-world robustness.
Honest Limits
- Does not improve code quality beyond the underlying model’s capability.
- Bad or vague goals produce weak tasks; clear specifications are essential.
- Retrospective reviewers catch test failures but not architectural drift or technical debt.
- Token counts unavailable for Claude Code due to output format.
- Free mode with OpenRouter models is slower, suitable for non-urgent workloads.
Getting Started
Install the package with:
pip install the-architect
Requires Python 3.11+ and at least one supported AI coding CLI: Claude Code, Codex, or OpenCode.
Basic workflow:
architect init
architect --plan --goal "add Stripe payment integration"
architect (runs the full cycle unattended)
The Architect plans, executes, retries, reviews, and reports automatically.
Full documentation: github.com/iNetanel/the-architect