Article 008: Beyond vibe coding: why chat needs verification at every stage
Key Message: AI amplifies process quality. Weak process produces slop faster. Strong process produces quality faster.
Series: B — Reframing the work (Part 3 of 6)
Publish Date: Monday 2026-03-09
Story Outline
Opening: AI amplifies process quality
Core thesis (establish immediately):
- AI agents amplify whatever process you give them
- Weak process → slop at speed
- Strong process → quality at speed
- Same tool, different outcomes: the difference is verification
Two patterns emerging:
Vibe coding: Chat → “build authentication” → agent generates → “looks good” → ship → breaks in production
Verified progression: Requirements → verify → Design → verify → Tests → verify → Code → verify → ship working code
The contrast:
- Both teams use chat constantly
- One skips verification, one builds it at every stage
- One produces slop faster than manual coding
- One produces quality faster than manual coding
- AI didn’t change whether verification matters
- It amplified how much it matters
The insight: Catch problems at every stage
The surface-level assumption:
- Problem is in generated code
- AI produces buggy, insecure, low-quality implementations
- Solution is better code review, more testing, stricter linters
- Focus verification at the code stage
What verified progression reveals:
- Hallucinations and mistakes happen at every stage (requirements, design, tests, code)
- The staged process catches them wherever they occur
- Requirements mistakes caught before design
- Design mistakes caught before tests
- Test mistakes caught before implementation
- Implementation mistakes caught before deployment
- Each stage costs less to fix than the next
Why this matters:
- Vibe coding jumps straight to code without verified requirements or design
- All problems (requirements, design, tests, implementation) discovered simultaneously in production
- Expensive, wasteful, produces slop
The pattern verified progression enables:
- Problems caught at the stage where they occur
- With verification at each stage: caught early, cheap to fix (hours not days)
- Without staged verification: discovered in production (days to fix, or incidents)
Evidence: The staged workflow in practice
The model: Requirements → Design → Tests → Code
At each stage: chat coordinates, tools verify, humans approve, PR gates progression.
Real example: Building domain API to orchestrate three backend services
Context: Gateway routing to Apache Camel orchestrator, coordinating three existing backend systems (validation, enrichment, storage). Running on Kubernetes, needs to handle production traffic patterns.
Stage 1: Requirements (spec.md)
Chat coordinates: “What are we solving? Need unified API orchestrating three backend systems. Who are the users? External clients. What constraints matter? Latency targets, idempotency, failure handling.”
Agent explores edge cases (idempotency, sparse fieldsets, retrieval patterns), documents requirements, surfaces assumptions about backend responsibilities.
Tools verify: Requirements completeness checks, stakeholder review, conflict detection with existing platform APIs.
Human approves: Architect reads spec, verifies business need, confirms feasibility, spots gaps in error handling requirements.
PR gate: spec.md approved and locked before design starts.
What this caught: Missing requirement for handling cases where identifier from one backend isn’t immediately available. Caught at requirements stage, not after designing orchestration flow.
Cost if missed: Would have been discovered during implementation. Days to redesign orchestration vs hours to update spec.
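The "tools verify" step at this stage can be as simple as a CI check that fails the spec.md PR while required sections are missing. A minimal sketch, assuming hypothetical section names (these headings are illustrative, not a standard):

```python
# Illustrative completeness check a CI job could run against spec.md
# before the PR gate allows progression. Section names are invented.
REQUIRED_SECTIONS = [
    "## Users",
    "## Constraints",
    "## Error handling",
    "## Idempotency",
]

def missing_sections(spec_text: str) -> list[str]:
    """Return the required headings absent from the spec draft."""
    return [s for s in REQUIRED_SECTIONS if s not in spec_text]

draft = "# spec.md\n## Users\nExternal clients\n## Constraints\nLatency targets\n"
gaps = missing_sections(draft)
print(gaps)  # non-empty list -> the gate fails until the spec is complete
```

A check this crude obviously can't judge whether requirements are *right* (that's the human approval step); it only guarantees the conversation happened for each required topic.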
Stage 2: Design (plan.md)
Chat coordinates: “Given requirements, what’s the orchestration approach? Sequential vs parallel? Who owns what data? What are failure modes when backends timeout or return errors?”
Agent drafts plan, explores alternatives, documents system of record responsibilities (validation rules, reference data, persistence/idempotency), maps failure modes (timeout = 503 with retry hints, validation failure = 400 with detail, enrichment failure = degrade gracefully).
Tools verify: AI review spots architectural issues (sequential calls would add latency), pattern consistency checks flag deviation from existing patterns, dependency analysis shows orchestration dependencies.
Human approves: Lead architect reviews plan, verifies against locked spec, checks fit with broader strategy, pushes back on agent’s sequential approach, chooses parallel enrichment where possible.
PR gate: plan.md and architecture approved and locked before tests.
What this caught: Agent suggested simpler sequential orchestration that would’ve added 500ms+ latency. Architectural review caught parallel opportunity before implementation.
Cost if missed: Would have been discovered in performance testing. A week to refactor vs hours to revise the plan.
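The latency argument the review made is just sum-versus-max: sequential orchestration pays the total of all backend latencies, parallel pays roughly the slowest one. A toy sketch (backend names and timings are invented, and the calls are treated as fully independent for illustration):

```python
# Toy demonstration of why the review preferred parallel enrichment.
# asyncio.sleep stands in for HTTP calls to the three backends.
import asyncio
import time

async def call_backend(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # simulated backend latency
    return f"{name}: ok"

async def sequential() -> list[str]:
    results = []
    for name, d in [("validation", 0.05), ("enrichment", 0.05), ("storage", 0.05)]:
        results.append(await call_backend(name, d))  # waits for each in turn
    return results

async def parallel() -> list[str]:
    # All three calls in flight at once; total time ~ the slowest call.
    return list(await asyncio.gather(
        call_backend("validation", 0.05),
        call_backend("enrichment", 0.05),
        call_backend("storage", 0.05),
    ))

start = time.perf_counter()
asyncio.run(sequential())
seq_elapsed = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(parallel())
par_elapsed = time.perf_counter() - start

print(f"sequential ~{seq_elapsed:.2f}s, parallel ~{par_elapsed:.2f}s")
```

In the real design the calls aren't all independent, which is exactly why the plan says "parallel where possible, sequential where dependencies require it."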
Stage 3: Tests
Chat coordinates: “Given this design, generate acceptance test cases. Cover submission flow, retrieval patterns, idempotency, error handling, sparse fieldsets.”
Agent generates comprehensive test suite: happy path, error cases, edge cases, orchestration boundaries.
Tools verify: Requirements coverage (all requirements have tests), spec coverage (critical design elements covered), test quality analysis flags brittle tests.
Human approves: Senior developer reviews tests, verifies they match locked plan, spots mechanical tests (checking implementation details, not behavior), rewrites three tests to verify from client perspective.
PR gate: Tests approved and locked before implementation.
What this caught: Tests that would have passed but didn't verify parallel enrichment actually worked (they mocked the orchestrator instead of the backends). Human spotted a gap the tools missed.
Cost if missed: False confidence. Would have passed all tests but failed in production. Days to debug vs hours to fix tests.
Stage 4: Code
Chat coordinates: “Implement orchestrator to pass these tests. Use locked design: parallel enrichment where possible, sequential where dependencies require it.”
Agent has everything: verified requirements (spec.md), approved approach (plan.md), concrete tests. Implementation happens overnight.
Tools verify: All tests pass, linters clean, security scan clean, OpenAPI validation confirms implementation matches spec.
Human approves: Developer examines the implementation, verifies it against the locked plan, confirms tests pass, spot-checks error handling (finds an edge case where a backend timeout returned a generic 500 instead of 503 with retry hints).
PR gate: Code approved and merged after error handling fix.
What this caught: Error handling gap where backend failures weren’t giving clients actionable information (tools missed, human spotted reviewing error paths).
Cost if missed: Production incident. Customers seeing unhelpful errors, operations scrambling to diagnose.
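The fix the developer made amounts to an explicit error-mapping layer: transient backend failures become a 503 with a retry hint, and only genuinely unexpected failures fall through to 500. A minimal sketch, with hypothetical exception and field names:

```python
# Hedged sketch of the error-mapping fix: a backend timeout surfaces to
# clients as 503 with a retry hint rather than a generic 500.
# Exception and header names are illustrative, not from the real system.
class BackendTimeout(Exception):
    """Raised when a downstream backend exceeds its deadline."""

def to_client_response(err: Exception) -> dict:
    """Map internal failures to actionable client responses."""
    if isinstance(err, BackendTimeout):
        # Transient and safe to retry: tell the client when to come back.
        return {
            "status": 503,
            "headers": {"Retry-After": "2"},
            "body": {"error": "upstream timeout, retry shortly"},
        }
    # Anything unexpected stays a 500; actionable detail goes to logs,
    # not to the client.
    return {"status": 500, "body": {"error": "internal error"}}

print(to_client_response(BackendTimeout())["status"])  # 503
print(to_client_response(ValueError("oops"))["status"])  # 500
```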
Implications: The economics of staged verification
Cost structure:
- Early stages (requirements, design): Cheap to fix. Hours to rewrite docs, not code.
- Late stages (code, production): Expensive to fix. Days to refactor, or production incidents.
- Skipping early verification: All problems hit simultaneously at most expensive stage.
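The cost structure above can be made tangible with a back-of-envelope model. The multipliers below are invented purely for illustration (not measured data); the point is only the shape: if fix cost roughly doubles per surviving stage and production adds incident overhead, finding everything in production dominates total cost.

```python
# Toy cost model: hours to fix a defect depending on where it is caught.
# All numbers are invented for illustration, not empirical.
FIX_HOURS = {"requirements": 2, "design": 4, "tests": 8, "code": 16, "production": 80}

# Hypothetical defects, keyed by the stage where each was introduced.
defects = {"requirements": 2, "design": 1, "tests": 1, "code": 1}

# Staged verification: each defect is caught at the stage it occurs.
staged = sum(FIX_HOURS[stage] * n for stage, n in defects.items())

# Vibe coding: every defect survives until production.
vibe = sum(FIX_HOURS["production"] * n for n in defects.values())

print(f"staged: {staged}h, vibe-coded: {vibe}h")
```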
What verified progression enables:
- Each stage builds verified artifacts for the next
- By code stage, you have locked requirements, approved design, concrete tests
- More good conversations early (requirements/design with domain experts)
- Learnings feed back (patterns, failure modes, architectural decisions documented)
- Requires strong context at each stage (system knowledge, domain understanding, integration patterns)
Why vibe coding fails:
- Skips straight to code with unclear requirements, uncertain design, weak tests
- Fast output becomes slow once rework is counted
- Amplifies process weakness: bad requirements → bad code, faster
Why verified progression works:
- Amplifies process strength: verified requirements → verified design → verified tests → good code, faster
Close: The supervision paradox
What I’m seeing:
- Everyone uses chat
- Some verify at every stage, others don’t
- Same tool, dramatically different outcomes
The supervision paradox: Verification at each stage requires judgment:
- Is this requirement right?
- Is this design sound?
- Are these tests meaningful?
Juniors and new hires are supervising agents before they’ve built that judgment and context.
- Can they verify requirements they haven’t learned to write?
- Can they spot flawed designs they haven’t learned to create?
- Can they identify mechanical tests when they haven’t learned what good tests look like?
“The more we learn, the more we realize how little we know.”
People deeply using AI realize how complex verification is. Module boundaries, integration assumptions, business context that doesn’t live in code. Meanwhile juniors are supervising before they’ve learned enough to realize how little they know.
Brief note on experienced resistance: Some experienced developers dismiss AI entirely after encountering vibe-coding horror stories and "Beyond the Vibes" discourse on social media. They should know better than to discard something potentially existential without deep evaluation. The sheer volume of dismissal may itself reveal that they sense the threat (this isn't mobile; there's nowhere to hide). But dismissing out of anxiety instead of engaging deeply is exactly the wrong response. Developers using verified progression can supervise more work than developers rejecting AI entirely, and the gap compounds.
The shift:
- Chat didn’t make verification optional
- It made verification essential at every stage
- Vibe coding is what happens when speed feels more important than rigor
- Until the rework costs more than the speed gained
- AI amplifies process quality: weak process produces slop faster, strong process produces quality faster
Notes
References
- “Beyond the Vibes” article (https://blog.tedivm.com/guides/2026/03/beyond-the-vibes-coding-assistants-and-agents/)
- Reddit discussions on vibe coding and code quality
Threads from earlier articles
| From | Theme | Pick up here |
|---|---|---|
| 004 | IDE as intervention surface | Human verification happens in IDE at code stage |
| 006 | Shift to supervision | Supervision means approving at each stage, not just at end |
| 007 | Attention allocation | Tools help allocate attention to what needs human review |
| 002 | Work continues without you | Agents work between stages, but gates prevent skipping verification |