
The Validation Tool That Couldn't Validate Itself

I've been building Haytham for months. It's a Claude Code plugin that orchestrates AI agents to validate startup ideas, scope MVPs, design architecture, and generate specs. Eight agents, four phases, structured handoffs, the whole thing.

A few weeks ago I had a long conversation with someone who poked holes in the project until I couldn't patch them anymore. The conclusion: Haytham has an identity crisis. It presents as a validation tool, but the actual vision is a lifecycle control plane for AI-built products. The validation pipeline is phase one of a three-phase vision that only I can see. Everyone else sees a validation tool in a crowded market.

That conversation surfaced a specific finding I couldn't shake: the product I'm building (Genesis, the validation pipeline) isn't the real product. The real product is Evolution, the phase where the reasoning graph I'm building gets used to handle change requests without full rewrites. Without Evolution, Genesis is a well-structured validation tool competing with ChatPRD and ValidatorAI. With Evolution, it's something genuinely new.

So I did the obvious thing. I ran /haytham:validate on Haytham's own idea and compared what the tool found to what that conversation found.

I Fed My AI Pipeline Its Own Idea

Haytham is a Claude Code plugin that takes a startup idea through four phases: market research, MVP scoping, architecture decisions, and spec generation. Eight agents, structured handoffs, validation hooks. I've been building it for months.
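For context, a Claude Code plugin is mostly a directory of markdown and JSON that Claude Code loads at runtime: a manifest, slash commands, subagent definitions, and optional hooks. The layout below is an illustrative sketch of how a multi-agent pipeline like this can be organised, not Haytham's actual tree; the file names are assumptions.

```
haytham/
├── .claude-plugin/
│   └── plugin.json          # manifest: plugin name, version, description
├── commands/
│   └── validate.md          # exposed to the user as /haytham:validate
├── agents/
│   ├── market-researcher.md # one markdown file per subagent, with
│   ├── mvp-scoper.md        # frontmatter describing its role and tools
│   └── ...
└── hooks/
    └── hooks.json           # validation hooks between phases
```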

At some point I had to try the obvious thing: feed Haytham its own idea and see what comes out.

So I typed this into a fresh Claude Code session: "An open source Claude Code plugin that takes an Idea, researches it, provides recommendations on the research and competitors, proposes a MVP, suggests architecture, produce standardised specs and finally builds the system." Then I sat back and let it run.

Why You Should Ship Your Agentic Workflow as a Claude Code Plugin

I'm building an agentic workflow that takes a startup idea, does market research, drafts an MVP, and generates a spec detailed enough to hand straight to a coding agent.

I started out with a full-fledged agentic framework (Strands SDK). That gives you a lot of flexibility and control over the agents, but it also comes with a lot of overhead. If the overhead had fallen only on the system, that would have been fine, but it fell on the user too.
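To make that overhead concrete, here's a minimal sketch of what one pipeline stage looks like in the framework world. It's illustrative, not Haytham's actual code; it assumes the Strands Agents SDK's basic Agent API and its default AWS Bedrock model provider, which is exactly where the user-side friction comes from: a Python environment, a pip install, and cloud credentials before anything runs.

```python
# pip install strands-agents
# Before this runs, the user needs a configured model provider --
# by default that means AWS credentials with Bedrock access.
from strands import Agent

# One stage of the pipeline: a market-research agent.
# (Prompt and wiring are illustrative, not Haytham's actual code.)
researcher = Agent(
    system_prompt=(
        "You are a market research analyst. Given a startup idea, "
        "identify the main competitors and summarise the market."
    ),
)

result = researcher(
    "An open source Claude Code plugin that validates startup ideas"
)
print(result)
```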

My goal was to validate the idea with real users, which, ironically, is exactly what the product itself does. In my test runs it worked great, but I didn't realise the setup was a deal breaker for most people. That made me rethink my approach, and I decided to sacrifice the bells and whistles for something that can be tested with a single prompt (if you already have Claude Code).
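Compare that to the plugin route. Assuming the standard Claude Code plugin flow (the marketplace reference here is hypothetical), a user goes from zero to a full run inside a session they already have open:

```
# inside an existing Claude Code session
/plugin install haytham@<marketplace>   # hypothetical marketplace reference
/haytham:validate                       # then paste your startup idea
```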

If you're building AI-powered developer tools, the trade-offs below might save you some time.