Back to feed

evanklem/evanflow

evanklem/evanflow
325
+69/day
8
Shell

A TDD-driven iterative feedback loop for software development. 16 cohesive Claude Code skills walk an idea from brainstorm → plan → execute → iterate, with checkpoints throughout.

From the README

EvanFlow

A TDD-driven iterative feedback loop for software development with Claude Code.

16 cohesive skills + 2 custom subagents walk an idea from brainstorm through implementation, with checkpoints throughout where you stay in control. One entry point: say "let's evanflow this" and the orchestrator runs the loop.

brainstorm → plan → execute (sequential or parallel) → tdd → iterate → STOP

The loop is conductor, not autopilot: real checkpoints at design approval, plan approval, and after iteration. The agent stops short of every git operation and waits for your direction. No auto-commits. No forced ceremony. No "must invoke a skill" tax.

Quick Install

The recommended path — Claude Code's plugin marketplace:

/plugin marketplace add evanklem/evanflow
/plugin install evanflow@evanflow

Restart, then try:

"Let's evanflow this — I want to add a small feature that does X."

evanflow-go fires and walks the loop. The git-guardrails hook auto-activates with the plugin (no settings.json edit needed). Skills appear under the evanflow: namespace (e.g., /evanflow:evanflow-go).

See Installation below for two alternative paths.

What Makes It a Feedback Loop

The loop is built around discipline that compounds across iterations, not single-shot generation. Every step has a checkpoint that gates the next:

  • Brainstorm clarifies intent, proposes 2–3 approaches with embedded grill (stress-test) → you approve the design
  • Plan maps file structure first (deep modules, deletion test) → you approve the plan
  • Execute runs task-by-task with inline verification → blockers stop the loop and surface to you
  • TDD is vertical-slice only: one failing test → minimal impl → repeat. Tests verify behavior through public interfaces, so they survive refactors
  • Iterate re-reads the diff with fresh eyes, runs quality checks, screenshots UI changes, and runs against a Five Failure Modes checklist (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). Hard cap of 5 iterations
  • STOP. Report. Await your direction. The agent never auto-commits, never auto-stages, never proposes a PR

For plans with 3+ truly independent units, the loop forks into a parallel coder/overseer orchestration: one coder per unit (using vertical-slice TDD with a RED checkpoint), one overseer per coder (read-only review subagent that can't modify code), plus an integration overseer that runs named integration tests at every touchpoint. The integration tests are the executable contract — interfaces can't drift if both sides have to satisfy the same passing test.

Hard Rules Baked Into the Loop

Several rules come from 2025-2026 industry research on agentic coding failure modes and are baked into every skill:

  • Never invent values — file paths, env vars, IDs, function names, library APIs. If unsure, the agent stops and asks. (Action-hallucination is the most dangerous agen