Article 008: Beyond vibe coding: why chat needs verification at every stage
Key Message: AI amplifies process quality. Weak process produces slop faster. Strong process produces quality faster.
Series: B — Reframing the work (Part 3 of 6)
Publish Date: Monday 2026-03-09
Story Outline
Opening: AI amplifies process quality
Core thesis (establish immediately):
- AI agents amplify whatever process you give them
- Weak process → slop at speed
- Strong process → quality at speed
- Same tool, different outcomes: the difference is verification
Two patterns emerging:
Vibe coding: Chat → “build authentication” → agent generates → “looks good” → ship → breaks in production
Verified progression: Requirements → verify → Design → verify → Tests → verify → Code → verify → ship working code
The contrast:
- Both teams use chat constantly
- One skips verification, one builds it at every stage
- One produces slop faster than manual coding
- One produces quality faster than manual coding
- AI didn’t change whether verification matters
- It amplified how much it matters
The insight: Catch problems at every stage
The surface-level assumption:
- Problem is in generated code
- AI produces buggy, insecure, low-quality implementations
- Solution is better code review, more testing, stricter linters
- Focus verification at the code stage
What verified progression reveals:
- Hallucinations and mistakes happen at every stage (requirements, design, tests, code)
- The staged process catches them wherever they occur
- Requirements mistakes caught before design
- Design mistakes caught before tests
- Test mistakes caught before implementation
- Implementation mistakes caught before deployment
- Each stage costs less to fix than the next
Why this matters:
- Vibe coding jumps straight to code without verified requirements or design
- All problems (requirements, design, tests, implementation) discovered simultaneously in production
- Expensive, wasteful, produces slop
The pattern verified progression enables:
- Problems caught at the stage where they occur
- With verification at each stage: caught early, cheap to fix (hours not days)
- Without staged verification: discovered in production (days to fix, or incidents)
Evidence: The staged workflow in practice
The model: Requirements → Design → Tests → Code
At each stage: chat coordinates, tools verify, humans approve, PR gates progression.
Real example: Building domain API to orchestrate three backend services
Context: Gateway routing to Apache Camel orchestrator, coordinating three existing backend systems (validation, enrichment, storage). Running on Kubernetes, needs to handle production traffic patterns.
Stage 1: Requirements (spec.md)
Chat coordinates: “What are we solving? Need unified API orchestrating three backend systems. Who are the users? External clients. What constraints matter? Latency targets, idempotency, failure handling.”
Agent explores edge cases (idempotency, sparse fieldsets, retrieval patterns), documents requirements, surfaces assumptions about backend responsibilities.
Tools verify: Requirements completeness checks, stakeholder review, conflict detection with existing platform APIs.
Human approves: Architect reads spec, verifies business need, confirms feasibility, spots gaps in error handling requirements.
PR gate: spec.md approved and locked before design starts.
What this caught: Missing requirement for handling cases where identifier from one backend isn’t immediately available. Caught at requirements stage, not after designing orchestration flow.
Cost if missed: Would have been discovered during implementation. Days to redesign orchestration vs hours to update spec.
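The "tools verify" step at this stage can be as simple as a CI check that fails the spec.md PR while required sections are missing. A minimal sketch, assuming hypothetical section names (these headings are illustrative, not a standard):

```python
# Illustrative completeness check a CI job could run against spec.md
# before the PR gate allows progression. Section names are invented.
REQUIRED_SECTIONS = [
    "## Users",
    "## Constraints",
    "## Error handling",
    "## Idempotency",
]

def missing_sections(spec_text: str) -> list[str]:
    """Return the required headings absent from the spec draft."""
    return [s for s in REQUIRED_SECTIONS if s not in spec_text]

draft = "# spec.md\n## Users\nExternal clients\n## Constraints\nLatency targets\n"
gaps = missing_sections(draft)
print(gaps)  # non-empty list -> the gate fails until the spec is complete
```

A check this crude obviously can't judge whether requirements are *right* (that's the human approval step); it only guarantees the conversation happened for each required topic.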
Stage 2: Design (plan.md)
Chat coordinates: “Given requirements, what’s the orchestration approach? Sequential vs parallel? Who owns what data? What are failure modes when backends timeout or return errors?”
Agent drafts plan, explores alternatives, documents system of record responsibilities (validation rules, reference data, persistence/idempotency), maps failure modes (timeout = 503 with retry hints, validation failure = 400 with detail, enrichment failure = degrade gracefully).
Tools verify: AI review spots architectural issues (sequential calls would add latency), pattern consistency checks flag deviation from existing patterns, dependency analysis shows orchestration dependencies.
Human approves: Lead architect reviews plan, verifies against locked spec, checks fit with broader strategy, pushes back on agent’s sequential approach, chooses parallel enrichment where possible.
PR gate: plan.md and architecture approved and locked before tests.
What this caught: Agent suggested simpler sequential orchestration that would’ve added 500ms+ latency. Architectural review caught parallel opportunity before implementation.
Cost if missed: Would have been discovered in performance testing. A week to refactor vs hours to revise the plan.
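The latency argument the review made is just sum-versus-max: sequential orchestration pays the total of all backend latencies, parallel pays roughly the slowest one. A toy sketch (backend names and timings are invented, and the calls are treated as fully independent for illustration):

```python
# Toy demonstration of why the review preferred parallel enrichment.
# asyncio.sleep stands in for HTTP calls to the three backends.
import asyncio
import time

async def call_backend(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulated backend latency
    return f"{name}: ok"

async def sequential() -> list[str]:
    results = []
    for name, d in [("validation", 0.05), ("enrichment", 0.05), ("storage", 0.05)]:
        results.append(await call_backend(name, d))  # waits for each in turn
    return results

async def parallel() -> list[str]:
    # All three calls in flight at once; total time ~ the slowest call.
    return list(await asyncio.gather(
        call_backend("validation", 0.05),
        call_backend("enrichment", 0.05),
        call_backend("storage", 0.05),
    ))

start = time.perf_counter()
asyncio.run(sequential())
seq_elapsed = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(parallel())
par_elapsed = time.perf_counter() - start

print(f"sequential ~{seq_elapsed:.2f}s, parallel ~{par_elapsed:.2f}s")
```

In the real design the calls aren't all independent, which is exactly why the plan says "parallel where possible, sequential where dependencies require it."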
Stage 3: Tests
Chat coordinates: “Given this design, generate acceptance test cases. Cover submission flow, retrieval patterns, idempotency, error handling, sparse fieldsets.”
Agent generates comprehensive test suite: happy path, error cases, edge cases, orchestration boundaries.
Tools verify: Requirements coverage (all requirements have tests), spec coverage (critical design elements covered), test quality analysis flags brittle tests.
Human approves: Senior developer reviews tests, verifies they match locked plan, spots mechanical tests (checking implementation details, not behavior), rewrites three tests to verify from client perspective.
PR gate: Tests approved and locked before implementation.
What this caught: Tests that would have passed but didn't verify parallel enrichment actually worked (they mocked the orchestrator instead of the backends). Human spotted a gap the tools missed.
Cost if missed: False confidence. Would have passed all tests but failed in production. Days to debug vs hours to fix tests.
Stage 4: Code
Chat coordinates: “Implement orchestrator to pass these tests. Use locked design: parallel enrichment where possible, sequential where dependencies require it.”
Agent has everything: verified requirements (spec.md), approved approach (plan.md), concrete tests. Implementation happens overnight.
Tools verify: All tests pass, linters clean, security scan clean, OpenAPI validation confirms implementation matches spec.
Human approves: Developer examines the implementation, verifies it against the locked plan, confirms tests pass, spot-checks error handling (finds an edge case where a backend timeout returned a generic 500 instead of 503 with retry hints).
PR gate: Code approved and merged after error handling fix.
What this caught: Error handling gap where backend failures weren’t giving clients actionable information (tools missed, human spotted reviewing error paths).
Cost if missed: Production incident. Customers seeing unhelpful errors, operations scrambling to diagnose.
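The fix the developer made amounts to an explicit error-mapping layer: transient backend failures become a 503 with a retry hint, and only genuinely unexpected failures fall through to 500. A minimal sketch, with hypothetical exception and field names:

```python
# Hedged sketch of the error-mapping fix: a backend timeout surfaces to
# clients as 503 with a retry hint rather than a generic 500.
# Exception and header names are illustrative, not from the real system.
class BackendTimeout(Exception):
    """Raised when a downstream backend exceeds its deadline."""

def to_client_response(err: Exception) -> dict:
    """Map internal failures to actionable client responses."""
    if isinstance(err, BackendTimeout):
        # Transient and safe to retry: tell the client when to come back.
        return {
            "status": 503,
            "headers": {"Retry-After": "2"},
            "body": {"error": "upstream timeout, retry shortly"},
        }
    # Anything unexpected stays a 500; actionable detail goes to logs,
    # not to the client.
    return {"status": 500, "body": {"error": "internal error"}}

print(to_client_response(BackendTimeout())["status"])  # 503
print(to_client_response(ValueError("oops"))["status"])  # 500
```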
Implications: The economics of staged verification
Cost structure:
- Early stages (requirements, design): Cheap to fix. Hours to rewrite docs, not code.
- Late stages (code, production): Expensive to fix. Days to refactor, or production incidents.
- Skipping early verification: All problems hit simultaneously at most expensive stage.
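The cost structure above can be made tangible with a back-of-envelope model. The multipliers below are invented purely for illustration (not measured data); the point is only the shape: if fix cost roughly doubles per surviving stage and production adds incident overhead, finding everything in production dominates total cost.

```python
# Toy cost model: hours to fix a defect depending on where it is caught.
# All numbers are invented for illustration, not empirical.
FIX_HOURS = {"requirements": 2, "design": 4, "tests": 8, "code": 16, "production": 80}

# Hypothetical defects, keyed by the stage where each was introduced.
defects = {"requirements": 2, "design": 1, "tests": 1, "code": 1}

# Staged verification: each defect is caught at the stage it occurs.
staged = sum(FIX_HOURS[stage] * n for stage, n in defects.items())

# Vibe coding: every defect survives until production.
vibe = sum(FIX_HOURS["production"] * n for n in defects.values())

print(f"staged: {staged}h, vibe-coded: {vibe}h")
```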
What verified progression enables:
- Each stage builds verified artifacts for the next
- By code stage, you have locked requirements, approved design, concrete tests
- More good conversations early (requirements/design with domain experts)
- Learnings feed back (patterns, failure modes, architectural decisions documented)
- Requires strong context at each stage (system knowledge, domain understanding, integration patterns)
Why vibe coding fails:
- Skips straight to code with unclear requirements, uncertain design, weak tests
- Fast output becomes slow once rework is counted
- Amplifies process weakness: bad requirements → bad code, faster
Why verified progression works:
- Amplifies process strength: verified requirements → verified design → verified tests → good code, faster
Close: The supervision paradox
What I’m seeing:
- Everyone uses chat
- Some verify at every stage, others don’t
- Same tool, dramatically different outcomes
The supervision paradox: Verification at each stage requires judgment:
- Is this requirement right?
- Is this design sound?
- Are these tests meaningful?
Juniors and new hires are supervising agents before they’ve built that judgment and context.
- Can they verify requirements they haven’t learned to write?
- Can they spot flawed designs they haven’t learned to create?
- Can they identify mechanical tests when they haven’t learned what good tests look like?
“The more we learn, the more we realize how little we know.”
People deeply using AI realize how complex verification is. Module boundaries, integration assumptions, business context that doesn’t live in code. Meanwhile juniors are supervising before they’ve learned enough to realize how little they know.
Brief note on experienced resistance: Some experienced developers dismiss AI entirely after encountering vibe-coding horror stories and "Beyond the Vibes" discourse on social media. They should know better than to discard something potentially existential without deep evaluation. The sheer volume of dismissal may itself reveal that they sense the threat (this isn't mobile; there's nowhere to hide). But dismissing out of anxiety instead of engaging deeply is exactly the wrong response. Developers using verified progression can supervise more work than developers rejecting AI entirely, and the gap compounds.
The shift:
- Chat didn’t make verification optional
- It made verification essential at every stage
- Vibe coding is what happens when speed feels more important than rigor
- Until the rework costs more than the speed gained
- AI amplifies process quality: weak process produces slop faster, strong process produces quality faster
Notes
References
- “Beyond the Vibes” article (https://blog.tedivm.com/guides/2026/03/beyond-the-vibes-coding-assistants-and-agents/)
- Reddit discussions on vibe coding and code quality
Threads from earlier articles
| From | Theme | Pick up here |
|---|---|---|
| 004 | IDE as intervention surface | Human verification happens in IDE at code stage |
| 006 | Shift to supervision | Supervision means approving at each stage, not just at end |
| 007 | Attention allocation | Tools help allocate attention to what needs human review |
| 002 | Work continues without you | Agents work between stages, but gates prevent skipping verification |