TL;DR: We ran SpecFact CLI on its own codebase in two ways: (1) Brownfield analysis discovered 19 features and 49 stories in seconds, found 24 deviations, and blocked the merge. (2) Contract enhancement added contracts to our core telemetry module with 7-step validation (all tests passed). Total time: seconds for analysis (varies by codebase size), ~3 minutes for contract enhancement.
The Challenge
We built SpecFact CLI and wanted to validate that it actually works in the real world. So we did what every good developer does: we dogfooded it.
"Dogfooding" is a well-known tech term meaning "eating your own dog food" โ using your own product. It's a common practice to validate that tools work in real-world scenarios.
Our Goal: Analyze the SpecFact CLI codebase itself and demonstrate:
- How fast brownfield analysis is
- How enforcement actually blocks bad code
- How the complete workflow works end-to-end
- How contract enhancement works on real production code
Part 1: Brownfield Analysis
First, we analyzed the existing codebase to see what features it discovered:
specfact import from-code specfact-cli --repo . --confidence 0.5
Note: Analysis time varies by codebase size. For ~19 Python files, this typically completes in a few seconds. Larger codebases may take longer as the CLI performs AST analysis, Semgrep pattern detection, and dependency graph building.
Output:
Analyzing Python files...
✓ Found 19 features
✓ Detected themes: CLI, Validation
✓ Total stories: 49
✓ Analysis complete!
Project bundle written to: .specfact/projects/specfact-cli/
What It Discovered
The brownfield analysis extracted 19 features from our codebase. Here is a sample:
| Feature | Stories | Confidence | What It Does |
|---|---|---|---|
| Enforcement Config | 3 | 0.9 | Configuration for contract enforcement and quality gates |
| Code Analyzer | 2 | 0.7 | Analyzes Python code to auto-derive plan bundles |
| Plan Comparator | 1 | 0.7 | Compares two plan bundles to detect deviations |
| Report Generator | 3 | 0.9 | Generator for validation and deviation reports |
| Protocol Generator | 3 | 0.9 | Generator for protocol YAML files |
| Plan Generator | 3 | 0.9 | Generator for plan bundle YAML files |
| FSM Validator | 3 | 1.0 | FSM validator for protocol validation |
| Schema Validator | 2 | 0.7 | Schema validator for plan bundles and protocols |
| Git Operations | 5 | 1.0 | Helper class for Git operations |
| Logger Setup | 3 | 1.0 | Utility class for standardized logging setup |
Total: 49 user stories auto-generated with Fibonacci story points.
Time taken: a few seconds for 19 Python files (varies by machine and codebase complexity)
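For intuition, the sketch below shows the kind of AST pass such an analysis performs when scanning Python files for candidate features. It is a toy illustration under our own assumptions, not SpecFact's actual implementation.

```python
# Toy illustration of AST-based feature discovery (NOT SpecFact's implementation).
# It simply walks a repo and collects class/function names as candidate features.
import ast
from pathlib import Path


def candidate_features(repo_root: str) -> dict[str, list[str]]:
    """Map each Python file to the class and function names defined in it."""
    features: dict[str, list[str]] = {}
    for path in Path(repo_root).rglob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        names = [
            node.name
            for node in ast.walk(tree)
            if isinstance(node, (ast.ClassDef, ast.FunctionDef))
        ]
        if names:
            features[str(path)] = names
    return features


if __name__ == "__main__":
    for file, names in candidate_features(".").items():
        print(file, "->", names[:5])
```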
Setting Enforcement Rules
Next, we configured quality gates to block HIGH severity violations:
specfact enforce stage --preset balanced
Output:
Setting enforcement mode: balanced
Enforcement Mode:
BALANCED
┌──────────┬────────┐
│ Severity │ Action │
├──────────┼────────┤
│ HIGH     │ BLOCK  │
│ MEDIUM   │ WARN   │
│ LOW      │ LOG    │
└──────────┴────────┘
✓ Enforcement mode set to balanced
What this means:
- HIGH severity deviations → BLOCK the merge (exit code 1)
- MEDIUM severity deviations → WARN but allow (exit code 0)
- LOW severity deviations → LOG silently (exit code 0)
Comparing Plans: The Magic Moment
We already had a manual plan in our repo (created during initial development with specfact plan init). Now we compare it against the auto-derived plan from the brownfield analysis:
Best Practice: Use /specfact.compare --code-vs-plan in your AI IDE for LLM-enriched deviation analysis with suggested fixes.
# In AI IDE (recommended): /specfact.compare --code-vs-plan
# Or via CLI (basic):
specfact plan compare --code-vs-plan
Plan loading time depends on the number of features and stories detected.
Results
Deviations Found: 24 total
- HIGH: 2 (Missing features from manual plan)
- MEDIUM: 19 (Extra implementations found in code)
- LOW: 3 (Metadata mismatches)
HIGH Severity → BLOCKED!
The comparison found that our manual plan calls it FEATURE-ENFORCEMENT, but the code has FEATURE-ENFORCEMENTCONFIG. This is a real deviation: our naming doesn't match!
MEDIUM Severity → Warned
We have 19 utility features (YAML utils, Git operations, validators, etc.) that exist in code but aren't documented in our minimal manual plan.
This is exactly what we want! It shows us undocumented features that should either be added to the plan or removed if not needed.
Enforcement In Action
With balanced enforcement enabled:
Enforcement BLOCKED: 2 deviation(s) violate quality gates
Fix the blocking deviations or adjust enforcement config
Exit Code: 1 (BLOCKED)
In CI/CD: This would fail the PR and prevent the merge until we fix the deviations.
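A rough sketch of how that gate can be wired into a pipeline step, assuming specfact is on the PATH (the wrapper script itself is hypothetical; only the command and its 0/1 exit-code semantics come from the run above):

```python
# Hypothetical CI gate script: fail the job when SpecFact enforcement blocks.
# Only `specfact plan compare --code-vs-plan` and its 0/1 exit codes are from the post.
import subprocess
import sys

result = subprocess.run(["specfact", "plan", "compare", "--code-vs-plan"])
if result.returncode != 0:
    print("SpecFact enforcement blocked this change; fix the HIGH deviations or adjust the preset.")
sys.exit(result.returncode)  # propagate PASS (0) / BLOCKED (1) to the CI job
```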
Switching to Minimal Enforcement
Let's try again with minimal enforcement (never blocks):
specfact enforce stage --preset minimal
specfact plan compare
Result:
✓ Enforcement PASSED: No blocking deviations
Exit Code: 0 (PASSED)
Same deviations, different outcome: with minimal enforcement, even HIGH severity issues are downgraded to warnings. Perfect for the exploration phase!
Part 2: Contract Enhancement (Production Use Case)
After validating the brownfield analysis, we took it a step further: we used SpecFact CLI to enhance one of our own core modules with contracts.
Goal: Add beartype, icontract, and CrossHair contracts to src/specfact_cli/telemetry.py, a core module that handles privacy-first telemetry.
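To make the three contract types concrete, here is a minimal, hypothetical sketch of what beartype and icontract decorators look like; the function and its payload are invented for illustration and are not the real telemetry.py code.

```python
# Hypothetical example of contract decorators (not the real telemetry.py code).
from beartype import beartype
import icontract


@beartype  # runtime type checking of arguments and return value
@icontract.require(lambda event_name: len(event_name) > 0, "event name must be non-empty")
@icontract.ensure(lambda result: "event" in result, "payload must carry the event name")
def build_event(event_name: str, enabled: bool = True) -> dict:
    """Build a minimal, privacy-safe telemetry payload (illustrative only)."""
    return {"event": event_name, "sent": enabled}


# CrossHair can then search for inputs that violate these contracts, e.g.:
#   crosshair check path/to/this_module.py
```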
Best Practice: Use the /specfact.07-contracts slash command in your AI IDE for the full contract enhancement workflow. It orchestrates: analyze coverage → generate prompts → apply contracts with a validation loop.
Generate Contract Enhancement Prompt
# In AI IDE (recommended): /specfact.07-contracts --apply all-contracts
# Or via CLI (basic):
specfact generate contracts-prompt src/specfact_cli/telemetry.py --bundle specfact-cli-test --apply all-contracts
What happened:
- CLI analyzed the telemetry module (543 lines)
- Generated a structured prompt for AI IDEs
- Included instructions for beartype, icontract, and CrossHair
AI IDE Enhancement + 7-Step Validation
The AI IDE (Cursor) enhanced the code, then ran comprehensive validation:
| Step | Check | Result |
|---|---|---|
| 1/7 | File Size Check | ✓ 678 lines (was 543) |
| 2/7 | Syntax Validation | ✓ Python compilation passed |
| 3/7 | AST Structure Comparison | ✓ All 23 definitions preserved |
| 4/7 | Contract Imports Verification | ✓ beartype, icontract imports verified |
| 5/7 | Code Quality Checks | ✓ Ruff linting passed |
| 6/7 | Test Execution | ✓ 10/10 tests passed |
| 7/7 | Diff Preview | ✓ Changes reviewed |
✓ All validations passed!
Enhanced code applied to: src/specfact_cli/telemetry.py
Total validation time: < 10 seconds
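For intuition, step 3 (AST Structure Comparison) boils down to verifying that no definitions were dropped during enhancement. The toy sketch below illustrates that idea under our own assumptions about file names; it is not SpecFact's validation code.

```python
# Toy sketch of an AST structure check (illustrative, not SpecFact's actual code).
# It compares the set of class/function names before and after enhancement.
import ast
from pathlib import Path


def definitions(source: str) -> set[str]:
    """Collect every class and function name defined anywhere in the module."""
    return {
        node.name
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef))
    }


# File names below are hypothetical snapshots, not paths SpecFact produces.
before = definitions(Path("telemetry_before.py").read_text())
after = definitions(Path("telemetry_after.py").read_text())
missing = before - after
assert not missing, f"Enhancement dropped definitions: {sorted(missing)}"
```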
What We Achieved
Contracts Applied
- beartype decorators: Runtime type checking for all public APIs
- icontract decorators: Preconditions and postconditions where appropriate
- CrossHair tests: Property-based test functions for edge case discovery
Production Value
This demonstrates real production use:
- Enhanced a core module (telemetry) used throughout the CLI
- Applied all three contract types
- All tests passed (10/10); no regressions
- Fast validation (< 10 seconds for comprehensive 7-step process)
Key Takeaways
1. Speed
| Task | Typical Time* |
|---|---|
| Analyze 19 Python files | ~3-5 seconds |
| Set enforcement | < 1 second |
| Compare plans | ~2-5 seconds |
| Total Analysis | ~10-15 seconds |
*Times vary by machine, codebase size, and number of features/stories detected.
2. Accuracy
- Discovered 19 features we actually built
- Generated 49 user stories with meaningful titles
- Detected real naming inconsistencies
3. Enforcement Works
- Balanced mode: Blocked execution due to 2 HIGH deviations (exit 1)
- Minimal mode: Passed with warnings (exit 0)
- CI/CD ready: Exit codes work with any CI system
4. Real Value
The tool found real issues:
- Naming inconsistency between manual plan and code
- Undocumented features that should be documented
- Documentation gaps that need addressing
These are actual questions that need answers, not false positives!
Try It Yourself
Option A: AI IDE Mode (Recommended)
For best results, use the slash commands in your AI IDE (Cursor, VS Code + Copilot, etc.). The LLM enriches the CLI output with semantic understanding, business context, and "why" reasoning.
# Step 1: Install slash commands in your AI IDE
specfact init --ide cursor # or --ide vscode
# Step 2: In your AI IDE, use the slash command:
/specfact.01-import --repo .
# The slash command runs CLI + LLM enrichment for better feature detection
Option B: CLI-Only Mode (Quick Start)
For quick validation or CI/CD, you can run the CLI directly (no LLM enrichment):
# Clone SpecFact CLI
git clone https://github.com/nold-ai/specfact-cli.git
cd specfact-cli
# Step 1: Analyze the codebase (CLI-only, no LLM enrichment)
specfact import from-code specfact-cli --repo .
# Step 2: Set enforcement rules
specfact enforce stage --preset balanced
# Step 3: Compare plans (requires existing manual plan)
specfact plan compare --code-vs-plan
# Step 4: Run full validation suite
specfact repro --verbose
Note: CLI-only mode is fast and deterministic but provides basic feature detection. AI IDE mode (slash commands) provides significantly better results through LLM-enriched analysis.
Conclusion
SpecFact CLI works. We proved it by running it on itself.
- Fast: Analyzed 19 files in seconds
- Accurate: Found 24 real deviations
- Blocks bad code: Enforcement prevented a merge with HIGH violations
- Comprehensive: 7-step validation for contract enhancement
- Production-ready: All tests passed, no regressions
Built by dogfooding: this example is real, not fabricated. We ran SpecFact CLI on itself and documented the actual outcomes.