Discussion (17 Comments)

Deeds6714 minutes ago
To be honest, the official superpowers/brainstorming skill already does TDD so well, I don't see that much of a need for this. TDD is definitely the way to go with agentic development.
s20n about 3 hours ago
EvanFlow - thoughts arrive like butterflies?
ge969 minutes ago
Seeeethinnggg tests failing not complete... again
__mharrison__ 34 minutes ago
Someday soon he'll begin his life again
sbseitz about 3 hours ago
Oh, he don't know, so he chases them away
jamesbfb about 3 hours ago
Oooohhhh
here2learnstuff43 minutes ago
Not bad, but also, forgive how mean this is going to come across: not using a product from someone who just started their undergrad.
evanklem2004 about 4 hours ago
Built this as an opinionated Claude Code development flow based on evidence-based practices and what has been working for me while developing professional code.

EvanFlow is a single TDD-driven loop. Say "let's evanflow this" and it walks brainstorm → plan → execute → tdd → iterate → STOP. Real checkpoints at design and plan approval. Never auto-commits, never auto-stages, never proposes integration - every git op is your call.
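
As a rough illustration of the loop described above (not EvanFlow's actual code), the stages and the two approval checkpoints could be modeled as a small state machine. The names `Stage`, `CHECKPOINTS`, `run_flow`, and the `approve` callback are hypothetical, chosen only for this sketch.

```python
from enum import Enum


class Stage(Enum):
    BRAINSTORM = "brainstorm"
    PLAN = "plan"
    EXECUTE = "execute"
    TDD = "tdd"
    ITERATE = "iterate"
    STOP = "stop"


# Order taken from the post: brainstorm -> plan -> execute -> tdd -> iterate -> STOP.
ORDER = [Stage.BRAINSTORM, Stage.PLAN, Stage.EXECUTE,
         Stage.TDD, Stage.ITERATE, Stage.STOP]

# Assumption: design approval follows brainstorm, plan approval follows plan.
CHECKPOINTS = {Stage.BRAINSTORM, Stage.PLAN}


def run_flow(approve):
    """Walk the stages in order and return the stages that were visited.

    `approve(stage)` is a hypothetical callback standing in for the human;
    the walk halts at a checkpoint as soon as it returns False.
    """
    visited = []
    for stage in ORDER:
        visited.append(stage)
        if stage in CHECKPOINTS and not approve(stage):
            break  # wait for the human instead of pressing on
    return visited
```

Calling `run_flow(lambda stage: True)` visits every stage, while rejecting the plan stops the walk at `Stage.PLAN`. No git operation appears anywhere in the sketch, matching the post's point that commits and staging stay with the user.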

The three things that actually changed how I work:

1. Vertical-slice TDD. One failing test → minimal impl → next test. Watch each test fail before writing the impl that passes it. (Sounds obvious. Almost no agent does it by default. ~62% of LLM-generated test assertions are wrong per HumanEval research, so discipline around the tests matters more than discipline around the impl.)

2. Embedded grilling at decision points. Before locking a plan: what breaks if a user does X? What's the rollback? What's explicitly out of scope? Catches design flaws while they're still cheap.

3. Iterate-until-clean (hard cap of 5 rounds). Re-read the diff against dead code, naming, the deletion test, assertion correctness, and a Five Failure Modes pass (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). For UI: screenshot via headless Chromium.
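
The capped review loop in item 3 might look something like the sketch below. This is an assumption about the control flow only; `review` and `fix` are hypothetical callbacks, and the real skill presumably drives these steps through prompts rather than Python functions.

```python
MAX_ROUNDS = 5  # hard cap stated in the post


def iterate_until_clean(review, fix, max_rounds=MAX_ROUNDS):
    """Run review/fix rounds until a review comes back clean or the cap is hit.

    `review()` returns a list of findings (empty means clean) and
    `fix(findings)` addresses them; both are hypothetical callbacks.
    Returns (clean, rounds_used).
    """
    for round_no in range(1, max_rounds + 1):
        findings = review()
        if not findings:
            return True, round_no
        fix(findings)
    # Cap reached: the last fixes were never re-reviewed, so report "not clean"
    # and hand control back to the human rather than looping forever.
    return False, max_rounds
```

The cap matters because a reviewer that always finds something would otherwise never terminate; returning `False` makes the unfinished state explicit.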

For bigger plans with 3+ independent units sharing types, it forks into a parallel coder/overseer orchestration. Integration tests at touchpoints ARE the cohesion contract.

Three install paths: Claude Code plugin marketplace, npx skills add, manual copy. MIT.

dpark 27 minutes ago
I’ve thought of going down the TDD model for LLMs as a way of providing constraints on their behavior. I would think that “vertical slice” TDD would encourage the LLM to start tailoring the tests to the implementation rather than establishing the invariants up front, though. I was considering “horizontal” TDD to force the agent to implement constraints before coding to them.
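
To make the vertical/horizontal distinction concrete, here is a small sketch using a made-up `clamp(x, lo, hi)` function. Everything in it (the invariants, `failing`, `identity`, `clamp`) is hypothetical and not taken from EvanFlow; it only shows how a minimal implementation can look finished when one slice is visible and still violate invariants that were written up front.

```python
# Hypothetical invariants for clamp(x, lo, hi), written before any implementation.
INVARIANTS = [
    ("in-range value is unchanged", lambda f: f(5, 0, 10) == 5),
    ("value below range maps to lo", lambda f: f(-3, 0, 10) == 0),
    ("value above range maps to hi", lambda f: f(42, 0, 10) == 10),
]


def failing(invariants, impl):
    """Return the names of the invariants that `impl` violates."""
    return [name for name, check in invariants if not check(impl)]


def identity(x, lo, hi):
    # Minimal implementation that satisfies only the first slice.
    return x


def clamp(x, lo, hi):
    # Implementation that satisfies all three invariants.
    return max(lo, min(x, hi))


# Vertical slice: only the first test exists yet, so `identity` looks done.
vertical_view = failing(INVARIANTS[:1], identity)

# Horizontal: every invariant exists up front, so the gaps are visible at once.
horizontal_view = failing(INVARIANTS, identity)
```

Here `vertical_view` is empty while `horizontal_view` lists the two range invariants, and `failing(INVARIANTS, clamp)` is empty.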
lukewrites 18 minutes ago
Curious: in the repo you mention

> Several rules come from 2025-2026 industry research on agentic coding failure modes

What are some of the papers you read?

girvo 22 minutes ago
Please don’t post AI generated comments :(

Just write it yourself. I promise it’s worth it

shruubi about 2 hours ago
Two questions

1) Do you not feel self-conscious or weird about calling this "EvanFlow"? Seems like a lot of people these days are naming their AI tools/skills/whatever after themselves which seems self-absorbed. Either that or they hope that if their thing takes off like OpenClaw did then they'll grab the fame that comes along with it.

2) Why does your TDD flow miss the refactor step of TDD?

wenc about 2 hours ago
I feel like 1 is a self correcting problem. If this goes nowhere it will soon be forgotten.

I can think of one example that did go somewhere: Linux.

normie3000 about 2 hours ago
Ref 1, he should have called it Daughter.
reitzensteinm about 1 hour ago
No Code, surely?
jtfrench about 3 hours ago
How does this handle “dumb zone” evasion while looping?
cratermoon about 1 hour ago