Article 010: Outline (Distilled from Draft)

Title: What long-running agents change, and what they don’t
Key Message: Persistence matters more than intelligence. Agents extend work across time and expand what’s economically viable, not just improve capability.
Series: B, Reframing the work (Part 5 of 6)
Status: Drafted, publishes Monday 2026-03-23

Core thesis: Long-running agents change WHEN work happens (temporal) and HOW MUCH work becomes feasible (economic). The shift is not in what agents can do, but in when work happens and at what cost.


Structure

Opening: The wrong question

What most people ask:

  • What can the agent do? (capability focus)
  • Can it reason? Write production code? Handle full features?
  • These are reasonable questions, but not the most important ones

The real shift:

  • Temporal and economic, not capability
  • Long-running agents change when work happens AND how much work becomes feasible
  • Not how well work is done, but when it happens and at what cost
  • Distribution across time + volume at reasonable cost

The capability framing (why it misses the point)

The natural tendency:

  • Evaluate agents by what they can replace
  • Capability as threshold (above = no humans needed)
  • Substitution model: capability rises, human involvement falls

What this misses:

  • Agents don’t replace work, they change when work is done
  • Not a substitute for a person
  • A way to extend judgment across time

The persistence shift (core insight)

Example: Eight hours overnight

  • You approve plan, lock tests, document constraints
  • Agent works through night, commits at stages, surfaces blockers
  • You wake to Git log, completed tasks, open questions

What changed:

  • Not quality of your judgment
  • Decisions still required your expertise
  • But decisions made in afternoon, not 2am
  • Agent moved exercise of judgment to different point in day

The real change:

  • Work happens in hours when it previously couldn’t (you were asleep)
  • Agent = way of front-loading human judgment for async application
  • Not capability improvement, but temporal extension

The economic dimension (how much becomes viable)

The volume question:

  • Not just “work happens overnight” (temporal)
  • But “how much work becomes economically feasible?” (volume)

Without agents:

  • 8 hours overnight = expensive night shift OR unsustainable personal hours OR work doesn’t happen
  • Continuous operation = prohibitive human cost
  • Parallel streams = need multiple people

With agents:

  • 8 hours overnight = marginal compute cost
  • Continuous operation = affordable
  • Parallel streams = economically viable

What this enables:

  • Not just redistributed work (same total, different times)
  • MORE total work becomes feasible
  • Work that wouldn’t happen at all (too expensive with human cost)
  • Volume of work expands, not just timing

The shift:

  • Human cost: linear with time (pay for hours)
  • Compute cost: low marginal cost per added hour or stream (doesn’t scale the way paid human hours do)
  • Work that was economically unviable becomes viable
  • Not “move work to night,” but “do work that wouldn’t happen”
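A rough arithmetic sketch of the shift above. The hourly rates and the task value are hypothetical placeholders, not figures from the draft, and for simplicity the sketch assumes both costs grow linearly with hours and streams, only at very different rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical hourly rates; placeholders, not measured figures."""
    human_per_hour: float
    compute_per_hour: float


def human_cost(hours: float, streams: int, rates: Rates) -> float:
    # Human cost is linear with time: every hour of every stream is paid for.
    return hours * streams * rates.human_per_hour


def compute_cost(hours: float, streams: int, rates: Rates) -> float:
    # Assumed linear as well, but at a much lower rate per hour.
    return hours * streams * rates.compute_per_hour


def is_viable(expected_value: float, cost: float) -> bool:
    # Work is worth attempting when its expected value exceeds its cost.
    return expected_value > cost


rates = Rates(human_per_hour=80.0, compute_per_hour=3.0)  # assumed values
overnight_human = human_cost(8, 1, rates)    # 640.0
overnight_agent = compute_cost(8, 1, rates)  # 24.0
```

Under these assumed numbers, a task valued at 200 fails `is_viable` at the human cost and passes it at the compute cost. That is the sense in which work that would not have happened becomes worth attempting.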

Examples:

  • Running comprehensive test suites continuously (not just on commit)
  • Exploring multiple architectural approaches in parallel
  • Implementing fallback options “just in case”
  • Maintaining documentation that stays in sync with code
  • Work with uncertain ROI becomes affordable to attempt

What extends across time

Key question shifts:

  • From “what can the agent do?”
  • To “what decisions need to be made before the agent starts?”

Async work depends on clarity at handoff:

  • Spec precise enough (ambiguities don’t block)
  • Tests capture what “done” means
  • Constraints explicit, not tacit

Async surfaces cost of vagueness:

  • Synchronous: hit ambiguity, ask in 30 seconds (low cost)
  • Async: agent makes wrong assumption OR surfaces blocker (high cost)
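One way to make the handoff criteria above checkable before an unattended run is sketched here. The `Handoff` record and its field names are hypothetical, chosen for this illustration only; the draft does not prescribe a format.

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    """Hypothetical record of what the agent receives before an async run."""
    spec: str = ""
    acceptance_tests: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)


def readiness_gaps(handoff: Handoff) -> list[str]:
    """Return the reasons this handoff is not ready for async execution."""
    gaps = []
    if not handoff.spec.strip():
        gaps.append("spec is empty")
    if not handoff.acceptance_tests:
        gaps.append("no tests define what 'done' means")
    if not handoff.constraints:
        gaps.append("constraints are tacit, not written down")
    # Each unresolved question is an ambiguity the agent would face alone.
    gaps.extend(f"unresolved: {q}" for q in handoff.open_questions)
    return gaps
```

An empty list means nothing obvious is missing. It does not mean the spec is right, which stays a human call.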

What extends:

  • Not just execution capacity
  • Quality of artifacts (spec, plan, tests, constraints)
  • Decisions at 4pm shape what happens at 2am

The skill shift:

  • Less about moment of execution
  • More about preparation that enables execution without you
  • Less doing, more enabling

What doesn’t change

The temptation:

  • If agent does more work, is human judgment less important?

The reality:

  • Importance of judgment unchanged
  • What changes: points at which it applies
  • Not present for implementation, but present for decisions that shaped it

When things go wrong:

  • Questions are still human questions
  • Did spec capture right requirements?
  • Did tests test right things?
  • Was architecture sound?

Understanding still matters:

  • Perhaps more so
  • Need to evaluate overnight results
  • Understand agent’s choices
  • Recognize when something looks right but isn’t
  • Can’t outsource understanding, only change when you apply it

The time structure of work

Synchronous development:

  • Continuous: sit down, code, stop
  • Bounded by presence
  • Work starts/stops with you

With long-running agents:

  • Handoff pattern, not continuous flow
  • Setup phase: thinking, decisions, artifacts
  • Execution phase: agent works, you do other things
  • Review phase: return, assess, decide next
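The handoff pattern above can be read as a small cycle of phases. The sketch below is illustrative only: the `Phase`, `ACTOR`, and `advance` names are assumptions made for this example, not part of any agent tooling.

```python
from enum import Enum


class Phase(Enum):
    SETUP = "setup"          # human: thinking, decisions, artifacts
    EXECUTION = "execution"  # agent: works while the human does other things
    REVIEW = "review"        # human: return, assess, decide next


# Who is active in each phase (labels are this sketch's own).
ACTOR = {Phase.SETUP: "human", Phase.EXECUTION: "agent", Phase.REVIEW: "human"}

# Allowed transitions: after review, the next cycle begins with a new setup.
NEXT = {
    Phase.SETUP: {Phase.EXECUTION},
    Phase.EXECUTION: {Phase.REVIEW},
    Phase.REVIEW: {Phase.SETUP},
}


def advance(current: Phase, target: Phase) -> Phase:
    """Move to the target phase, refusing to skip a handoff step."""
    if target not in NEXT[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

The point of the restriction is that execution never follows execution: each agent run sits between a human setup and a human review.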

This doesn’t mean less work:

  • Work differently distributed
  • Setup requires careful thinking (vagueness propagates)
  • Review requires genuine engagement (not rubber-stamping)
  • Execution consumes clock-hours, not your hours; human effort is relocated to setup and review

Natural fit for:

  • Batch processing
  • Multi-step implementation
  • Long test runs
  • Research compilation

Poor fit for:

  • Fast iteration
  • Exploratory coding
  • Debugging (needs immediacy)

The judgment:

  • Which kind of task am I doing?
  • Which mode fits it?
  • Judgment remains human

Close: The shape of change

What changed (not what you thought):

  • Early framing: capable agents replace human tasks
  • Actual shift: agents change when work happens and how much work becomes economically viable
  • Temporal: extend reach of human decisions into periods when human isn’t present
  • Economic: work at compute cost vs human cost dramatically changes what’s feasible

The dual shift:

  • Calendar of work changes (temporal)
  • Volume of work expands (economic)
  • Not replacement, but extension across time + expansion of what’s affordable

What didn’t change:

  • Need for human understanding
  • Sound judgment at decision points
  • Careful preparation of guiding artifacts
  • Judgment isn’t automated, it’s relocated and amplified

That relocation and amplification matters:

  • Full-day work → afternoon prep + overnight execution + one-hour review (temporal)
  • Work that wouldn’t happen → becomes viable at compute cost (economic)
  • Not less work, differently distributed AND more total work becomes affordable
  • Not less judgment, judgment at different moments + judgment about what work to attempt

The real questions:

  • Not “what can the agent do?”
  • “What decisions need to be made before agent starts, and how do I make them well?”
  • “What work becomes worth attempting at compute cost that wasn’t at human cost?”
  • Agents extend time over which work can happen + expand volume of work that’s economically viable
  • Thinking that makes both extensions valuable remains ours

Threads from earlier articles

From  Theme                       Connection
002   Work continues without you  Extends to “work continues because agent persists”
005   Continuity over speed       Persistence enables continuity across time
008   Verified progression        Setup phase = creating artifacts that guide execution
009   Git as memory               Artifacts in Git enable agent to work without you

Key contrasts emphasized

  • Capability vs temporal + economic shift
  • Replace vs extend (time) + expand (volume)
  • Substitution vs relocation + amplification
  • Continuous vs handoff pattern
  • Doing vs enabling
  • Human cost vs compute cost
  • Redistributed work vs more total work
  • What can agent do vs what work becomes worth attempting