TL;DR: After three decades of writing code, running infrastructure, designing hardware, and breaking production at 2am, I picked up an AI coding tool in late 2024 expecting to be mildly impressed. Instead, it kicked off a two-year journey that rearranged how I think about software development entirely — and ended with me building an open-source framework to solve a problem I didn't even know I had when I started. This is that story.

The Setup: Thirty-Something Years and a New Toy

Let me paint you a picture. It's late 2024. I've been writing software since the era when you literally had to care about how many bytes a variable took. I've built embedded systems, network infrastructure, automation platforms, cloud architectures. I've survived the dot-com bust, the DevOps revolution, microservices mania, and the "everything is Kubernetes now" phase. I've watched more paradigm shifts than I care to count.

And now everybody is talking about AI coding assistants.

My gut reaction? Mildly skeptical. Not dismissive — I've been around long enough to know that dismissing things because "that's not how we did it before" is a fast track to irrelevance. But also not wide-eyed. I'd seen too many "game changers" come and go. I gave myself the same speech I always give: stay curious, stay skeptical, measure what actually changes.

So I picked up Cursor.


Act One: The Honeymoon Phase (Late 2024)

The first week with Cursor was, I'll admit, genuinely surprising.

Not because the AI was magic — it wasn't. It hallucinated. It misread context. It confidently wrote code that looked right and was subtly wrong in ways that would have cost a junior dev an afternoon to debug. I spotted those immediately because, you know, thirty years.

But the productivity texture was different. Tasks that used to require a context switch — writing boilerplate, scaffolding a module structure, translating a mental model into a function signature — suddenly had less friction. I found myself staying "in flow" longer. Less Alt-Tabbing, fewer Stack Overflow rabbit holes, more actual thinking about the actual problem.

The killer insight in those early weeks wasn't "AI writes code for me." It was: AI reduces the cost of the boring part, which frees up more cognitive budget for the interesting part. For someone with decades of pattern recognition about what the boring parts are, that's a significant multiplier.

I started using it for everything. Infrastructure automation. Python scripts. Config boilerplate. Quick prototypes. The tool paid for itself in hours.

Then, around the time I hit my first serious project with it, I ran into The Wall.


Act Two: The Wall (Early 2025)

The Wall has a specific shape. It looks like this:

You've been building with AI assistance for a few weeks. The codebase is growing. Features are landing. The AI is humming along. And then you realize — gradually, then all at once — that the AI has no idea why it's doing what it's doing.

It knows the code. It doesn't know the spec. It doesn't know what the original GitHub issue said. It doesn't know that you changed direction two weeks ago. It doesn't know that the function it just "helpfully" extended contradicts a decision you made last sprint.

It's a very competent amnesiac.

I found myself spending more time reviewing AI output against my mental model of the system than I saved by having the AI generate it. The more complex the project, the worse this got. And being someone who thinks in systems — hardware, software, automation, all of it — I was building complex things.

My instinct as an engineer kicked in: this is a tooling problem, not a capability problem. The AI is fine. The workflow has gaps.

So I started filling them.


The Hidden Layer: Research Before Code

Here's something I didn't mention yet, and it matters a lot: before any of the tooling existed, there was a research layer. A serious one.

From the very beginning I wasn't just prompting an AI to write code. I was using thinking models to design what to build in the first place. Early on that meant ChatGPT o1 — one of the first mainstream models that actually paused and reasoned before answering. Later, o3. For architecture questions, competitive analysis, "should this feature even exist" — these models were my sounding board before a single line of implementation was written.

The output of those sessions wasn't code. It was large, structured markdown research artifacts: feature analyses, SWOT analyses, architectural decision write-ups, market positioning notes. I committed these directly into the internal codebase, cross-referenced them with each other, and treated them like a living product roadmap. Every significant feature in SpecFact CLI has a corresponding research document that predates the implementation by days or weeks.

The workflow looked like this: research question → thinking model session → markdown artifact → slice and dice with Cursor using reasoning models → implementation plan → code. The plans weren't throwaway notes — they had explicit relationships to each other, essentially forming a dependency graph for product decisions.
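As a toy illustration of that dependency-graph idea — the artifact names, the `Depends-On:` link convention, and the parsing below are all invented for this sketch, not SpecFact's actual format:

```python
import re

# Hypothetical convention: each research artifact declares what it builds on
# via a markdown link like "Depends-On: [auth-swot](research/auth-swot.md)".
DEP_PATTERN = re.compile(r"Depends-On:\s*\[([\w-]+)\]")

def dependency_graph(artifacts: dict[str, str]) -> dict[str, list[str]]:
    """Map each artifact name to the artifacts it depends on."""
    return {name: DEP_PATTERN.findall(text) for name, text in artifacts.items()}

def build_order(graph: dict[str, list[str]]) -> list[str]:
    """Depth-first topological sort: dependencies come before dependents."""
    order, seen = [], set()
    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for dep in graph.get(node, []):
            visit(dep)
        order.append(node)
    for node in graph:
        visit(node)
    return order

artifacts = {
    "market-notes": "## Market positioning\n",
    "auth-swot": "Depends-On: [market-notes](research/market-notes.md)\n",
    "auth-plan": "Depends-On: [auth-swot](research/auth-swot.md)\n",
}
print(build_order(dependency_graph(artifacts)))
# → ['market-notes', 'auth-swot', 'auth-plan']
```

The point of the exercise: once product decisions live in cross-referenced files, "what do I need to re-read before changing this feature" becomes a query, not an act of memory.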

Then in 2026 that research layer got a significant upgrade. The pipeline became: Perplexity deep research first (broad, source-grounded exploration), refined with ChatGPT deep research (sharper synthesis and gap analysis), complemented by Claude deep research (architectural reasoning and edge-case thinking), and finally the compiled report fed directly into an OpenSpec change proposal as the basis for the next SpecFact feature. Research became a first-class step in the spec-driven workflow, not a background activity.

Why does this matter? Because the instinct when people talk about "AI-assisted development" is to picture the coding part. The coding part is actually the end of the chain. The part that determines whether what you build is worth building — that happens in the research and planning layer. Getting that layer right, with the right tools, is where most of the leverage is.


Act Three: Building the Infrastructure (Early–Mid 2025)

Between February and June 2025, I built a small stack of tools to address the gaps I was experiencing.

The first was a multi-agent automation system — a Supervisor Agent that read structured project plans, a Coding Agent that did the actual generation, and a QA Agent that validated the output. This was my attempt to give the AI system some memory about why it was doing things, not just what to do next. It used LangChain, Redis for state, and GitHub integration for issue tracking. It worked — sort of. The architectural concept was right. The execution was messy.
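The control flow, stripped of LangChain, Redis, and GitHub, looked roughly like this — agents stubbed out, names and the toy QA check invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Task:
    goal: str          # the "what"
    rationale: str     # the "why" — the context the AI otherwise forgets
    attempts: int = 0

def coding_agent(task: Task) -> str:
    # Stub: in the real system this was an LLM call via LangChain.
    return f"# implements: {task.goal}\n# because: {task.rationale}\n"

def qa_agent(output: str) -> bool:
    # Stub: the real QA agent ran tests and lint; here we just demand
    # that the rationale survived into the output.
    return "because:" in output

def supervisor(plan: list[Task], max_attempts: int = 3) -> list[str]:
    """Read the structured plan, dispatch tasks, retry on QA failure."""
    accepted = []
    for task in plan:
        while task.attempts < max_attempts:
            task.attempts += 1
            output = coding_agent(task)
            if qa_agent(output):
                accepted.append(output)
                break
    return accepted

plan = [Task("parse config file", "ops team needs hot-reload"),
        Task("emit metrics", "SLO dashboard depends on it")]
print(len(supervisor(plan)))  # → 2
```

The design point is the `rationale` field: the supervisor carries the "why" forward so the downstream agents never operate on a bare instruction.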

The second was an MCP server for log analysis. Model Context Protocol had just become a thing, and I immediately saw the value: instead of copy-pasting error logs into a chat window, give the AI a tool to read them directly. I built log_analyzer_mcp in May 2025, partly because the debugging loop for the multi-agent system was grinding me down. Logs are how 30-year engineers debug. Logs with AI-readable structure are how AI systems debug.
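The core idea is small enough to sketch. The real log_analyzer_mcp does far more, but the essence is turning raw log text into structure an AI can reason about — the log format, field names, and summary shape here are invented for illustration:

```python
import re
from collections import Counter

# Assumed line format: "YYYY-MM-DD HH:MM:SS LEVEL message"
LOG_LINE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>DEBUG|INFO|WARNING|ERROR) "
    r"(?P<msg>.*)"
)

def summarize(log_text: str) -> dict:
    """Collapse a raw log into the structure an AI can actually reason about."""
    records = [m.groupdict() for m in map(LOG_LINE.match, log_text.splitlines()) if m]
    errors = [r for r in records if r["level"] == "ERROR"]
    return {
        "total": len(records),
        "by_level": dict(Counter(r["level"] for r in records)),
        "first_error": errors[0]["msg"] if errors else None,
    }

sample = (
    "2025-05-12 09:14:01 INFO worker started\n"
    "2025-05-12 09:14:02 ERROR redis connection refused\n"
    "2025-05-12 09:14:03 ERROR redis connection refused\n"
)
print(summarize(sample))
```

Exposed as an MCP tool, a function like this replaces "paste 400 lines of log into the chat" with "ask the AI to call `summarize` and reason about the result."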

The third was a ChromaDB MCP server (chroma_mcp_server), which gave the AI a persistent, searchable memory across sessions. Think of it as teaching the AI to keep notes about the codebase rather than starting every session cold. Vector search, semantic chunking, automated context recall. This one was genuinely satisfying to build.
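To show the shape of the idea without pulling in ChromaDB: a stdlib toy that mimics the add/recall interface, substituting crude word overlap for real embeddings — everything below is illustrative, not the server's actual code:

```python
class SessionMemory:
    """Toy stand-in for a vector store: notes survive across 'sessions'
    and are recalled by word overlap instead of semantic similarity."""

    def __init__(self):
        self.notes: list[str] = []

    def add(self, note: str) -> None:
        self.notes.append(note)

    def recall(self, query: str, n_results: int = 2) -> list[str]:
        q = set(query.lower().split())
        scored = sorted(self.notes,
                        key=lambda note: len(q & set(note.lower().split())),
                        reverse=True)
        return scored[:n_results]

memory = SessionMemory()
memory.add("auth module uses JWT tokens")
memory.add("backlog sync pushes OpenSpec proposals to GitHub Issues")
memory.add("logging format is structured JSON for the MCP analyzer")

print(memory.recall("why does auth use jwt"))
```

Swap the overlap scoring for embeddings and a vector index and you have the actual mechanism: the AI starts each session by querying its own notes instead of starting cold.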

By mid-2025 I had an interesting meta-situation: I was using AI to build tools that made AI more useful for building tools. If that sounds recursive, it is. I spent a moment appreciating the absurdity before moving on.

The pattern I was converging on had a name, though I hadn't named it yet: spec-driven development. The missing layer wasn't better AI. It was better context — and context, in software, is specs.


Act Four: The Framework Emerges (Mid–Late 2025)

In August 2025 I started using Spec-Kit, a lightweight OSS toolkit for what its authors call Spec-Driven Development. The idea is simple: write the spec first, in a structured way, and let the AI work from that as its source of truth rather than from freeform instructions.

This sounds obvious in retrospect. It was not obvious at the time.

The shift it required was more than technical. It meant accepting that AI-assisted development isn't a replacement for design work — it's an accelerator for design work done first. You can't vibe-code your way to a production system. You can vibe-code a demo, a prototype, a proof of concept. But production requires constraints, and constraints need to be written down before the AI touches the keyboard.

Spec-Kit was the proof of concept. SpecFact CLI was the real thing.

The first commit landed October 29, 2025. Version 0.4.0 went out on November 5. I will be honest with you: I was not expecting to build an open-source CLI tool when I picked up Cursor a year earlier. But here we are.


Act Five: The Snowball (Late 2025–Early 2026)

From v0.4.0 to v0.40.0 in roughly five months. That's not a typo.

The pace was possible because I was by now using the exact workflow I was building. SpecFact CLI was developed using SpecFact CLI. Every feature started with an OpenSpec change proposal. Every function had @icontract decorators and @beartype type checking. Every module was tested before it was released. The framework enforced its own discipline on its own development.
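For readers unfamiliar with the pattern, here is a stdlib-only sketch of what contract enforcement looks like. The decorator names mirror the icontract style (`@require` / `@ensure`), but this is illustrative code, not the icontract or beartype implementation:

```python
import functools

def require(predicate, message):
    """Precondition: check the arguments before the function runs."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not predicate(*args, **kwargs):
                raise ValueError(f"precondition failed: {message}")
            return fn(*args, **kwargs)
        return wrapper
    return deco

def ensure(predicate, message):
    """Postcondition: check the result after the function runs."""
    def deco(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            if not predicate(result):
                raise ValueError(f"postcondition failed: {message}")
            return result
        return wrapper
    return deco

@require(lambda amount, rate: amount > 0 and 0 <= rate <= 1,
         "amount positive, rate in [0, 1]")
@ensure(lambda result: result >= 0, "discounted amount is non-negative")
def apply_discount(amount: float, rate: float) -> float:
    return amount * (1 - rate)

print(apply_discount(100.0, 0.2))  # → 80.0
```

The payoff in an AI workflow: contracts fail loudly at runtime, so when generated code violates a design decision, you find out on the first test run rather than in review.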

If you've ever had the experience of a tool "clicking" in a way that makes you want to use it more, which makes you better at building it, which makes the tool better — that was this.

The capabilities landed quickly:

  • Backlog sync (v0.25.1) — bidirectional sync between OpenSpec change proposals and GitHub Issues / Azure DevOps. Because specs and backlogs kept drifting apart, and that drift was killing context.
  • Module marketplace (v0.38+) — a cryptographically signed, versioned registry for extending SpecFact. Because the community needed a safe way to contribute without breaking the core.
  • VR-SDD methodology — Value- & Requirements-Driven Spec-Driven Development. Because passing all the specs doesn't mean you built the right thing.
  • Module migration architecture (v0.39–0.40) — restructuring the entire command surface into bundle namespaces. Because a tool that's meant to enforce architecture should itself have clean architecture.

I also moved, somewhere in here, from primarily using Cursor to using Claude Code as my main AI coding environment. Each tool has its strengths — Cursor is excellent for in-file AI editing and contextual autocomplete; Claude Code is strong on multi-step reasoning and architectural work. The mental model I'd been building about "specs as context" transferred cleanly to both. The workflow doesn't care which AI is underneath; it cares whether the AI has something to work from.


The Bit I Didn't Expect

Here's the thing nobody tells you when you're a senior engineer experimenting with AI tools: it makes you a beginner again, and that's actually great.

I spent thirty years developing very reliable intuition about software. I know roughly how long things take. I know which abstractions are load-bearing. I know where the gotchas live. That intuition is genuinely valuable — but it's also a filter that can make you stop examining your assumptions.

AI-assisted development forced me to examine assumptions I'd stopped noticing I had. Why do we treat specs as afterthoughts? Why does every team I've been on treat the backlog as a source of truth when it's never actually current? Why do we generate docs manually when the information to generate them automatically is already in the codebase?

These aren't new questions. Smart people have been asking them for decades. But working with AI tools — with their specific failure modes, their specific strengths, their specific need for structured context — made the answers urgent in a way they hadn't been before.

The AI didn't know why it was generating what it generated. So I had to build a system that made "why" explicit. That's Spec-Driven Development. That's SpecFact.

Thirty years of experience told me what was broken. The AI made it impossible to ignore that it was broken.


What Two Years Actually Looks Like

| Period | What I was doing | What I was learning |
|---|---|---|
| Late 2024 | Cursor for coding; ChatGPT o1 for architecture research | AI reduces friction; thinking models are a different beast |
| Early 2025 | ChatGPT o3 + Claude reasoning for feature design; markdown artifacts into codebase | Research before code is where the real leverage is |
| Feb 2025 | Multi-agent automation system | Context management is the unsolved problem |
| Mar 2025 | MCP servers (log analysis, ChromaDB) | Give AI better tools, not just better prompts |
| May–Jun 2025 | Coding Factory with contract enforcement | 90% test coverage changes how you think |
| Jul 2025 | Self-hosted infra stack (N8N, Supabase) | AI-era infrastructure needs AI-era automation |
| Aug 2025 | Adopting Spec-Kit, crystallising SDD methodology | Specs first. Always. No exceptions. |
| Oct 2025 | SpecFact CLI v0.4.0 launch | The tool becomes the product |
| Nov–Dec 2025 | v0.4.0 → v0.20.0 LTS; Perplexity deep research enters the pipeline | Research layer formalised: Perplexity → plan → spec → code |
| Jan–Mar 2026 | v0.20.0 → v0.40.0; Perplexity + ChatGPT + Claude research → OpenSpec proposals | Multi-AI research pipeline feeds the spec-driven workflow directly |

The through-line is not "AI got better so I could do more." The through-line is "I understood the AI's limitations better, built for them, and the result was better than what I'd built without AI at all."

That's a subtle distinction but an important one. The developers I see struggling with AI tools are often waiting for the AI to get smarter. The ones succeeding are building systems that make the AI's current capabilities enough.


The Lesson (Predictable, Still True)

After thirty-plus years of watching technology change, I have noticed that the people who adapt well to new paradigms share a specific trait: they don't ask "is this better than what I already do?" They ask "what is this actually good at, and how do I build around that?"

AI coding tools are genuinely good at: reducing boilerplate cost, surfacing alternatives, maintaining consistency within a defined context, and moving fast on well-specified problems.

AI coding tools are genuinely bad at: understanding why, maintaining context across sessions, knowing when something is wrong at the architectural level, and — left unchecked — caring about whether the thing they build is the right thing.

If you approach AI tools expecting them to replace judgment, you will be disappointed. If you approach them as a way to make judgment cheaper to apply, you will be surprised how much more you can build.

The bottom line: Never stop learning. And yes — things might actually be quite different from what you expect when you start. I expected to be mildly impressed by an AI coding assistant. I ended up building an open-source specification framework for the era of AI-driven development. The AI helped with every line of it. That, I did not predict. And I love that I didn't.


Where Things Stand (March 2026)

SpecFact CLI is live at specfact.dev, open-source on GitHub under Apache 2.0, and actively developed. The module marketplace is live with Wave 0–2 bundles. VR-SDD is the methodology underpinning the whole thing.

If you're an engineer navigating the same questions I was navigating in late 2024 — "is this AI thing real, and if so, how do I actually use it in production?" — I hope this story is useful.

And if you build something that helps solve the problem better than I did, I genuinely want to know about it.

# See what specfact thinks of your codebase
uvx specfact-cli@latest init --profile solo-developer
specfact project import from-code my-project --repo .
specfact plan review my-project

No API keys. No vendor lock-in. Just specs.


Further Reading