The Code Is the What; The Transcript Is the Why

Scott Farrell · July 5, 2026 · scott@leverageai.com.au

Your coding agent writes a dated, first-person record of how you think — your intent, the alternatives you rejected, the plans you never shipped. Then it deletes it on a thirty-day timer. The repository can reconstruct none of it. Here is how to distil that record into a knowledge asset before it’s gone — and why the most valuable thing in the file is the stuff with no code.

Scott Farrell · LeverageAI · A field note for people who live in an agent CLI

The short version

  • Do this first: Claude Code deletes old sessions on a default 30-day timer, with a hard delete and no undo [1][2]. Snapshot your ~/.claude/projects/ and Codex session dirs somewhere durable now. The raw JSONL is the source asset; everything downstream is regenerable, but a deleted session is gone.
  • The split: the repo is the what — the final state. The transcript is the why — your dated intent in your own words, the alternatives you considered and dropped, and the things you planned but never built. None of those three are recoverable from the code.
  • The fix: a two-stage distiller. Stage one strips the noise with zero AI. Stage two hands a cheap model a North Star and gets back a session brief. In one real session: ~240k tokens → ~14k after the deterministic strip → a ~900-token brief.

Stop reading this for a second and go save your sessions. I mean it: open a terminal and copy ~/.claude/projects/ and your Codex session directories somewhere you trust, because while you’ve been working, your coding agent has been keeping a remarkably complete diary of your reasoning, and it is also quietly deleting the old pages. Claude Code ships with a retention setting, cleanupPeriodDays, that defaults to thirty days, and “session files older than this period are deleted at startup.” [1] There is no trash folder and no grace period; the cleanup calls unlink() and the file is gone [2]. People have built whole tools for the sole purpose of rescuing these files before they vanish. One developer archives Claude Code conversations into SQLite specifically because, in his words, he “got tired of losing past debugging sessions.” [3] So before we talk about what’s in the transcript and why it’s worth so much, secure the raw material. Everything I describe after this (the distillate, the brief, the eventual wiki) can be regenerated from the JSONL. The JSONL cannot be regenerated from anything.

Step zero — before anything else

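mkdir -p ~/session-archive/"$(date +%F)"    # cp will not create the missing parent directory for you
cp -R ~/.claude/projects ~/session-archive/"$(date +%F)"/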

Do the same for your Codex session dirs. Push it to durable storage. The raw transcript is the one artefact in this entire pipeline that is not regenerable — treat it like source, not like a log.
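If you want to know how urgent that copy is, a few lines of Python can list the sessions closest to the cutoff. This is a minimal sketch, not Claude Code’s own logic: it assumes file modification time is a fair proxy for the age the cleanup checks, and the function name sessions_at_risk and the seven-day warning window are my own inventions.

```python
import time
from pathlib import Path
from typing import List, Optional, Tuple

SECONDS_PER_DAY = 86_400


def sessions_at_risk(
    root: Path,
    retention_days: int = 30,   # the documented default for cleanupPeriodDays
    warn_days: int = 7,         # hypothetical warning window
    now: Optional[float] = None,
) -> List[Tuple[Path, float]]:
    """Return (path, days_left) for .jsonl files near the cutoff, most urgent first."""
    now = time.time() if now is None else now
    at_risk = []
    for path in root.rglob("*.jsonl"):
        # Assumption: mtime approximates the age the cleanup looks at.
        age_days = (now - path.stat().st_mtime) / SECONDS_PER_DAY
        days_left = retention_days - age_days
        if days_left <= warn_days:
            at_risk.append((path, days_left))
    return sorted(at_risk, key=lambda item: item[1])


if __name__ == "__main__":
    projects = Path.home() / ".claude" / "projects"
    if projects.is_dir():
        for path, days_left in sessions_at_risk(projects):
            print(f"{days_left:6.1f} days left  {path}")
```

A negative days_left means the file is already past the window and is only still there because the cleanup has not run since.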

The code is the what. The transcript is the why.

Here is the thing I keep coming back to. When a session ends, you keep the code and you throw away the conversation — because the code is the deliverable and the conversation is “just chat.” But the code and the conversation record two completely different things, and only one of them can be reconstructed from the other. The repository is a snapshot of the final state: this is what got built, this is how it ended up. The transcript is the record of the why — and it holds three things that are structurally invisible in the code.

First, your intent, in your own words, dated. Not a commit message written after the fact to satisfy a hook, but the actual sentence you typed at 11pm — “I want the agent to handle a caller interrupting mid-sentence so it feels like a real conversation, not a walkie-talkie.” That is the strongest provenance you will ever have for why a system exists, and it is timestamped to the minute.

Second, the alternatives you rejected. Every real build is a graveyard of considered-and-dropped approaches: the library you evaluated and abandoned, the architecture you argued yourself out of, the shortcut you decided wasn’t worth it. The finished code shows the one path you took and erases every path you didn’t. This is the same point our Discovery Accelerators work makes about visible reasoning — that the rejected branches carry real value — turned back on yourself: your own dead ends are information, and the transcript is the only place they survive.

Third, the things you planned but never built. The feature you described in detail in turn 44 and never got to. The fallback you sketched and deferred. There is no code for these — that’s the whole point — so the repository literally cannot contain them. They exist in exactly one place.

The most valuable things in a session are the ones the repository can’t hold: the thing you considered and rejected, and the thing you planned and never built. There’s no code for either — only the transcript remembers.

Put those three together and the transcript stops looking like exhaust and starts looking like an asset class in its own right — a dated, first-person record of how you actually think, sitting right next to the code that only ever shows how things turned out. The trouble is that nobody reads it, because in its raw form it is genuinely unreadable.

Everyone archives the transcript. Nobody reads it back.

Before designing anything, I checked what already exists, because agent-session logs are hardly a secret. The surprise was how mature the tooling is on the plumbing and how completely empty it is on the layer that matters. The ecosystem has solved parsing, rendering, archiving and search. It has not solved distillation.

Look at the landscape. SpecStory auto-saves every Claude Code, Cursor and Codex conversation to a local .specstory/history/ folder and recommends committing those transcripts alongside your code to preserve design intent; its pitch is essentially “intent is the new source code.” [4] claude-code-log converts the JSONL into readable HTML and Markdown, with detail levels and a compact mode [5]. Simon Willison’s claude-code-transcripts turns sessions into clean, paginated HTML you can publish to a Gist [6]. claude-devtools renders a chronological conversation view with per-tool renderers and cross-session search [7]. claude-conversation-extractor pulls clean logs out of the local store, explicitly framed as getting your conversations out “before they’re deleted.” [8]

Every one of these is good, and every one of them stops at the same place: it preserves the transcript verbatim. It makes the raw thing browsable, searchable, archivable — but it never reads it back for you. None of them will tell you what a session revealed about your thinking. None of them produces the one-paragraph answer to “what did I decide here, and what did I reject?” They archive the diary; nobody distils it. That is the gap, and it’s worth being precise about why the gap exists: distillation is harder than rendering, and it looks like it needs an expensive model over an enormous file. It mostly doesn’t — because most of that file is noise the code already accounts for.

Stage one: strip the noise with zero AI

A raw session is verbose in an extremely structured way. The JSONL format is, as one write-up puts it, “verbose, with tool calls containing full file contents and bash outputs,” and a single conversation “can easily be tens of thousands of lines.” [9] But that structure is exactly what lets you throw most of it away without a model touching it. Every line is a typed event: a user message, an assistant turn, a tool call, a tool result. Route by type and the noise falls out deterministically.

  • Strip every tool-result payload. The file contents, the diffs, the bash output — this is the bulk of the tokens, and it is already represented, perfectly, by the finished code sitting in your repo. Re-reading it into a distillate is paying twice for the same information.
  • Keep every user turn verbatim. Your turns are the thought stream — the single highest-signal channel in the whole file, and cheap to keep because there simply aren’t many of them relative to the tool traffic. This is the one place you do not compress.
  • Route the tool calls by name. Edit, Write and Bash collapse to a one-line action log (“edited server.py ×14”). WebSearch and WebFetch keep their query and their result URLs — that’s your research trail, and it’s genuinely interesting. Plan-mode output and to-do writes are kept in full — that’s the planning process, which is precisely what you want.
  • Keep assistant prose that has no tool calls. The trade-off explanations, the architecture proposals, the “here’s why I’d do it this way” paragraphs. Pure reasoning, no payload.
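The four routing rules above can be sketched in a few dozen lines of Python. Treat this as an illustration under stated assumptions rather than a parser for the real format. It assumes each JSONL line is an object with a "type" of "user" or "assistant" and a "message" whose "content" is either a string or a list of blocks typed "text", "tool_use" or "tool_result". The planning tool names (TodoWrite, ExitPlanMode) and the input keys (file_path, query, url) are also assumptions. Two simplifications: the action log is emitted at the end rather than in position, and only the query or URL from the call is kept, so pulling result URLs out of the tool result before dropping it is left out.

```python
import json
from collections import Counter

ACTION_TOOLS = {"Edit", "Write", "Bash"}        # collapse to a counted action log
RESEARCH_TOOLS = {"WebSearch", "WebFetch"}      # keep the query or URL
PLANNING_TOOLS = {"TodoWrite", "ExitPlanMode"}  # assumed names; kept in full


def blocks_of(event):
    """Normalise message content to a list of typed blocks (assumed schema)."""
    content = (event.get("message") or {}).get("content") or ""
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    return content


def distil(lines):
    """Route each event by type; no model is involved at any point."""
    kept, actions = [], Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        role = event.get("type")
        if role not in ("user", "assistant"):
            continue  # summaries and other bookkeeping events are skipped
        blocks = blocks_of(event)
        has_tool_call = any(b.get("type") == "tool_use" for b in blocks)
        for block in blocks:
            kind = block.get("type")
            if kind == "tool_result":
                continue  # the payload is already represented by the repo
            if kind == "text":
                if role == "user":
                    kept.append(f"USER: {block.get('text', '')}")  # verbatim
                elif not has_tool_call:
                    kept.append(f"ASSISTANT: {block.get('text', '')}")
            elif kind == "tool_use":
                name, args = block.get("name", ""), block.get("input") or {}
                if name in ACTION_TOOLS:
                    actions[(name, args.get("file_path") or "shell")] += 1
                elif name in RESEARCH_TOOLS:
                    kept.append(f"RESEARCH {name}: {args.get('query') or args.get('url')}")
                elif name in PLANNING_TOOLS:
                    kept.append(f"PLAN {name}: {json.dumps(args)}")
    for (name, target), count in actions.items():
        kept.append(f"ACTION: {name} {target} ×{count}")
    return kept
```

Feeding it the lines of one session file, for example distil(open(path, encoding="utf-8")), returns a list of short strings that is almost entirely your turns, the assistant’s reasoning, and a counted action log.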

Then there’s the one heuristic that earns its keep more than any other: grind-block detection. A long, repetitive bug-fixing loop is the most token-expensive and least interesting thing in most sessions — and it is detectable purely by shape, with no understanding of content at all. The signature is a long alternating run of tool calls interleaved with short user turns (“still broken,” “try again,” “nope”), the same file and the same tool cycling in an error→edit→error rhythm. When you see that shape, collapse the whole run to a single marker — “debugging block, 40 turns, server.py” — with a pointer back into the archive, and never send it to a model at all. The debugging is the least interesting part of the session, and the finished code already represents where it ended up.
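Detecting a grind block by shape alone might look like the sketch below. It works on a simplified event list, not the raw JSONL: each event is a dict with a "kind" of "user", "assistant" or "tool", and tool events carry a "target" file. That event shape, the function names and every threshold (four words for a short nudge, eight events, three nudges, a 50% share for the dominant file) are hypothetical values chosen for illustration.

```python
from collections import Counter


def is_grindy(event, max_words=4):
    """Tool calls and very short user nudges are the raw material of a grind loop."""
    if event["kind"] == "tool":
        return True
    return event["kind"] == "user" and len(event["text"].split()) <= max_words


def collapse_grind_blocks(events, min_len=8, min_nudges=3, min_share=0.5):
    """Replace long tool-call/short-nudge runs with a single marker event."""
    out, run = [], []

    def flush():
        tools = [e for e in run if e["kind"] == "tool"]
        nudges = [e for e in run if e["kind"] == "user"]
        target, hits = (
            Counter(e["target"] for e in tools).most_common(1)[0] if tools else (None, 0)
        )
        is_block = (
            len(run) >= min_len
            and len(nudges) >= min_nudges
            and tools
            and hits / len(tools) >= min_share  # one file dominates the loop
        )
        if is_block:
            out.append({
                "kind": "marker",
                # len(run) counts every event in the run, tool calls included
                "text": f"debugging block, {len(run)} turns, {target}",
                "span": (run[0]["index"], run[-1]["index"]),  # pointer into the archive
            })
        else:
            out.extend(run)
        run.clear()

    for index, event in enumerate(events):
        event = {**event, "index": index}
        if is_grindy(event):
            run.append(event)
        else:
            flush()  # any long user turn or assistant prose ends the run
            out.append(event)
    flush()
    return out
```

Note that nothing here reads the content of a turn beyond counting its words, which is the point: the run is recognised by its rhythm. The span field is what lets you go back to the raw archive if the debugging ever turns out to matter.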

One refinement, because a hard rule here would be wrong. The instinct that “the early bits of a session are the interesting bits” is right often enough to encode — but as a soft weighting, not a law. Keep the opening segment intact, yes; but the genuinely interesting moments happen anywhere, and they announce themselves. Pivot markers in your own turns — “no, actually…,” “let’s step back,” “wait” — flag the exact points where your thinking turned, wherever they fall in the session. Weight those up too.
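As a sketch of what “soft weighting” could mean in code: score each user turn, add a bonus for falling in the opening segment, and add a larger bonus for a pivot marker. The marker list comes from the examples above; the weights, the 15% opening share and the name turn_weight are assumptions of mine, and a bare word like “wait” will produce false positives, so treat the score as a hint rather than a filter.

```python
import re

# Pivot phrases taken from the examples in the text; both apostrophe styles accepted.
PIVOT_PATTERN = re.compile(r"\b(no, actually|let['’]?s step back|wait)\b", re.IGNORECASE)


def turn_weight(text, position, total, opening_share=0.15):
    """Soft score for a user turn; higher means more worth keeping in full."""
    weight = 1.0
    if total > 0 and position / total < opening_share:
        weight += 1.0  # the opening segment usually carries the intent
    if PIVOT_PATTERN.search(text):
        weight += 2.0  # a pivot counts wherever it falls in the session
    return weight


print(turn_weight("No, actually let's use Realtime", position=50, total=100))  # 3.0
print(turn_weight("still broken", position=50, total=100))                     # 1.0
```

Because the two bonuses add rather than gate, a pivot in turn 80 still outranks an unremarkable turn in the opening segment.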

Notice what this stage is: pure deterministic code, making the expensive stage cheap. That’s the pendulum from Text Is the Model’s Home Turf applied cleanly — deterministic code strips the file down to signal; model judgment is reserved for the part that actually needs judgment. After stage one you have a distillate that is a fraction of the original size and is almost entirely your words and your decisions.

Stage two: one cheap model, one North Star

Now, and only now, you bring in a model — a cheap one — and hand it the deterministic distillate. Its job is to produce a session brief: your intent (with pull quotes of your own words), the decisions you made and why, the alternatives you considered, what your research turned up, where you pivoted, and what you planned but never built.

The temptation is to hand the model a rigid schema — fill in these eight fields — and that’s exactly the wrong move. This is what our North Star Prompt doctrine is for: don’t give a thinking model a category checklist, give it a purpose and let it judge what matters. The North Star for this task is one line:

Capture what this session reveals about my thinking — not what it did to the code.

That single sentence does more work than a field list ever could, because it tells the model what to ignore. What the session did to the code is already in the repo; the model is explicitly relieved of re-summarising it. What the session reveals about your thinking — the intent, the hesitation, the abandoned branch — is the target. Point a cheap model at a clean distillate with that North Star and it produces something you would actually open six months later. And because the deterministic stage did the heavy lifting, you can run it over hundreds of sessions for the price of a rounding error.
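In code, the purpose-over-checklist idea is mostly a matter of what the prompt leaves out. The sketch below only assembles the prompt text; it deliberately makes no model call, since which model and client you use is your choice. The function name build_brief_prompt, the supporting instructions and the 600-word cap are assumptions for illustration, not a tested prompt.

```python
from typing import Sequence


def build_brief_prompt(north_star: str, distillate: Sequence[str], max_words: int = 600) -> str:
    """Put the purpose first, state what to ignore, then append the stage-one output."""
    body = "\n".join(distillate)
    return (
        f"{north_star}\n\n"
        "The repository already records what changed in the code, so do not summarise edits.\n"
        "Quote my own words where they state intent.\n"
        f"Write a brief of at most {max_words} words.\n\n"
        "--- SESSION DISTILLATE ---\n"
        f"{body}\n"
    )
```

Call it with the one-line North Star quoted above and the list of strings that stage one produced. Note what is absent: there is no list of fields to fill in, so the model decides which decisions, pivots and unbuilt plans deserve space.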

Show me the collapse

Numbers make this concrete. Take one real session from my own machine — a multi-hour build on a Twilio voice agent, the kind of session that ends with a working feature and a long tail of debugging. The figures below are from that one session; the exact digits vary session to session, but the shape of the collapse is the invariant.

~240k tokens: raw JSONL (tool-result payloads dominate)
~14k tokens: deterministic distillate (~17× smaller, zero AI)
~900 tokens: session brief (cheap model + North Star)
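The ratios implied by those three figures are easy to check. The snippet below uses the rounded token counts quoted for this one session; they are approximations, so the ratios are too.

```python
raw_tokens, distillate_tokens, brief_tokens = 240_000, 14_000, 900

strip_ratio = raw_tokens / distillate_tokens    # stage one: deterministic strip
brief_ratio = distillate_tokens / brief_tokens  # stage two: cheap model
end_to_end = raw_tokens / brief_tokens          # product of the two stages

print(f"deterministic strip: {strip_ratio:.1f}x")  # 17.1x
print(f"model brief:         {brief_ratio:.1f}x")  # 15.6x
print(f"end to end:          {end_to_end:.1f}x")   # 266.7x
```

The two stages contribute reductions of similar size, but only the second one costs anything to run.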

Two collapses, and they’re doing different work. The first, 240k to 14k, is pure deterministic stripping: the tool-result payloads leave, the grind blocks collapse to markers, your turns stay whole. No model, no cost, no judgment. The second, 14k to 900, is the cheap model reading what’s left and writing the brief. End to end that’s roughly a 267× reduction (240k ÷ 900), and the thing you keep and read is the 900 tokens. Here is what one such brief looks like, from that session:

Session brief — twilio-voice-agent

Session: 2026-03-14 · 4h 20m · 38 files touched · 1 grind block collapsed (auth loop, 40 turns)
Intent (your words): “I want the agent to handle a caller interrupting mid-sentence — barge-in — so it feels like a real conversation, not a walkie-talkie.”
Decision & why: Chose OpenAI Realtime over chaining Whisper + GPT + TTS. “The round-trip latency on the chain was killing the illusion.”
Considered, deferred: Barge-in via server-side voice-activity detection. “Considered barge-in handling for the Twilio agent, deferred — the media stream made it fiddly, parked for v2.”
Planned, never built: SMS fallback when a call drops mid-conversation. Described in turn 44; no code exists for it.
Research trail: Twilio Media Streams docs; OpenAI Realtime latency benchmarks (URLs preserved).

Look at the two rows in the middle of that brief — “considered, deferred” and “planned, never built.” Those are the rows that make the whole exercise worth it, and they are precisely the rows the repository cannot produce.

The gap between what you meant and what you shipped

This is the part that turns transcript distillation from tidy housekeeping into something genuinely new. The brief above claims: “considered barge-in handling for the Twilio agent, deferred.” Now go looking for that claim in the code. It isn’t there — it can’t be there — because the barge-in handling was never written. There is no function, no file, no comment, no commit. The repository has no way to emit the sentence “I considered this and chose not to build it,” because a repository can only describe what exists.

So imagine, months later, you’re assembling what you know about voice AI — a capability audit, a cover letter, a pitch. You ask your knowledge system: “Where have I worked on barge-in and call interruption?” The code-derived answer is silence — there’s nothing to find. The transcript-derived answer is: “Considered barge-in handling for the Twilio agent in March, deferred to v2 because the media stream made it fiddly.” That’s a real, dated, defensible claim about your own capability, and it came from a session brief, not from source code, because only the transcript ever knew it.

The diff between transcript-intent and code-reality is itself signal. “Considered X, deferred” is a claim the repository cannot generate — and the gap between what you meant and what you shipped is information with exactly one source.

That’s the reframe. We usually treat the transcript and the code as two views of the same work, one of them redundant. They’re not. The code is the intersection of intent and reality — the part where what you meant and what you built line up. The transcript is the whole set. Everything in the difference — every rejected alternative, every unbuilt plan, every reason-for-a-choice — lives only on the transcript side, and that difference is often the most useful thing you produced all day.
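The set framing translates directly into a set difference. In this toy sketch the topic labels are hand-written from the sample brief, and the assumption that the Realtime integration is what the repository contains is mine; a real pipeline would have to extract both sides.

```python
# What the session brief records, keyed by topic (labels are illustrative).
transcript_topics = {
    "openai realtime": "built",
    "whisper + gpt + tts chain": "considered, rejected",
    "barge-in": "considered, deferred",
    "sms fallback": "planned, never built",
}

# What a scan of the repository could find evidence of (assumed for illustration).
code_topics = {"openai realtime"}


def transcript_only(transcript: dict, code: set) -> dict:
    """Everything the transcript knows about that the code cannot show."""
    return {topic: status for topic, status in transcript.items() if topic not in code}


for topic, status in transcript_only(transcript_topics, code_topics).items():
    print(f"{topic}: {status}")
```

Three of the four topics survive the subtraction, and none of them would turn up in a search of the source tree.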

Where this goes next

A session brief is useful on its own — a searchable, dated record of your decisions is worth having by itself. But its real destiny is to become a feeder. Once you have briefs across hundreds of sessions, they slot in beside your code and your notes as a third input to a project-level knowledge asset — the kind of self-cleaning, cross-referenced graph I’ve described in The Index Is the Data — where “where have I explored voice AI” becomes a single lookup instead of an archaeology project. (Building that ingestion pipeline is its own article; so is the related idea of harvesting the agent’s own exploration paths as telemetry, which I’ll take up separately in “File Back the Walk.”) The economics are the same ones that make the whole approach work: pay a cheap model once to compile a durable asset, and every future query runs against the compiled brief instead of re-reading raw logs — the context arbitrage move, applied to your own history.

But none of that downstream value exists if the source is gone. So the argument closes where it opened. Your coding agent is writing the single best record you will ever have of how you think — dated, first-person, complete with the roads you didn’t take — and it is deleting that record on a timer measured in weeks. The tooling around it will happily archive the raw file verbatim, and none of it will read the file back to you. The move is neither to delete nor to hoard. It’s to distil: keep the raw JSONL cold and safe as source, strip it deterministically, and let a cheap model with a North Star turn it into a brief you’d actually open. The code already tells you what you built. Only the transcript can tell you why — and only if you save it first.

Sitting on hundreds of agent sessions you’ve never read back?

The raw logs are the highest-density record of how your team actually reasons — and most of them are on a deletion timer. At LeverageAI we build the two-stage distiller and the knowledge graph it feeds: deterministic strip, cheap-model briefs, and a navigable asset where every capability traces back to a dated decision. Talk to us about turning your session exhaust into an asset.

References

  1. Anthropic, Claude Code settings documentation. The cleanupPeriodDays setting governs session retention: “Default: 30 days, minimum 1. Session files older than this period are deleted at startup.” code.claude.com/docs/en/settings
  2. Claude Code issue #59248, “Silent retention cleanup deletes session transcripts.” Confirms transcripts live at ~/.claude/projects/<sessionId>.jsonl and are hard-deleted: “Deleted transcripts go straight to unlink(). No soft-delete folder, no grace period, no --restore command.” github.com/anthropics/claude-code/issues/59248
  3. claude-vault (kuroko1t), a CLI that archives Claude Code conversations into SQLite before they vanish. “Claude Code deletes old session files over time. I got tired of losing past debugging sessions, so I built a CLI that archives them into SQLite before they disappear.” A single reported import: “Imported 94562 messages… from 203 files.” github.com/kuroko1t/claude-vault · dev.to/kuroko1t/i-built-a-tool-to-stop-losing-my-claude-code-conversation-history-5500
  4. SpecStory. Auto-saves Claude Code / Cursor / Codex conversations locally to .specstory/history/ and recommends committing transcripts alongside code to preserve design intent. “Turn your AI development conversations into searchable, shareable knowledge. Never lose a brilliant solution, code snippet, or architectural decision again.” Archives verbatim; does not distil. github.com/specstoryai/getspecstory
  5. claude-code-log (daaain). “A Python CLI tool that converts Claude Code transcript JSONL files into readable HTML / Markdown format,” with detail levels and a compact mode. Rendering layer. github.com/daaain/claude-code-log
  6. claude-code-transcripts (Simon Willison). “Convert Claude Code session files (JSON or JSONL) to clean, mobile-friendly HTML pages with pagination,” optionally published to a GitHub Gist. Rendering / archiving layer. github.com/simonw/claude-code-transcripts
  7. claude-devtools. Parses ~/.claude/ session transcripts into a chronological conversation view with per-tool renderers, expandable sections and cross-session search; exports Markdown/JSON/text. “Raw transcripts are unreadable — thousands of lines of escaped JSON…” Rendering / search layer. claude-dev.tools/docs/transcripts
  8. claude-conversation-extractor. Extracts clean conversation logs from Claude Code’s local JSONL store. “Claude Code stores chats in ~/.claude/projects as JSONL files with no export button – this tool solves that,” framed around exporting “before they’re deleted.” Extraction layer. github.com/ZeroSumQuant/claude-conversation-extractor
  9. Fazm, “Parsing Claude Code JSONL Format for macOS Dev Tools.” On the raw format: “parsing raw transcripts is painful. The JSONL format is verbose, with tool calls containing full file contents and bash outputs. A single conversation can easily be tens of thousands of lines.” The direct justification for stripping tool-result payloads first. fazm.ai/blog/claude-code-previous-sessions-jsonl-transcripts
  10. LeverageAI, related canon (named for framing, not statistics): Discovery Accelerators (rejected branches carry value); The North Star Prompt (purpose over checklist); Text Is the Model’s Home Turf (the deterministic/judgment pendulum); The Index Is the Data (the graph a brief feeds); Context Arbitrage (compile once, query cheap); Why LLMs Can Walk a Wiki but Can’t Drive a RAG (the sibling map argument). leverageai.com.au

© 2026 Leverage AI, Scott Farrell. All rights reserved. This content is made available on a limited, revocable, read-only basis only. No licence or right is granted to copy, reproduce, republish, scrape, store, adapt, summarise, index, embed, or use this content to create derivative works, work product, deliverables, methodologies, training materials, prompts, templates, software, services, research, or commercial outputs, whether by humans or machines, without prior written permission. This restriction includes internal business use, client work, consulting, advisory, implementation, and any use in or for artificial intelligence, machine learning, data extraction, retrieval, evaluation, fine-tuning, or knowledge-base construction.