Reusing the Pull Request: From Code Contribution to Research Artifact

A few weeks ago I opened a pull request I never intended to merge.

I knew that before I opened it. The branch was an agent’s first attempt at a feature I was still arguing with myself about. The code compiled. The preview deployed. The diff touched the database, two services, a chunk of the frontend, and an architecture diagram I had been avoiding updating for a month. And then I closed it, on purpose, after I had learned what I needed from it.

If you had described that workflow to me three years ago, I would have called it a waste. You wrote code you knew you would throw away. You opened a PR you knew you would not merge. You asked another human — me — to review something that was never a candidate for production.

But none of those objections survive contact with how I actually work now. The PR was not a waste. It was one of the most efficient research instruments I had that week. And the reason it worked is that the pull request, the most boring and load-bearing object in our entire workflow, has quietly taken on a second job it was never designed for — without giving up its first.

We did not build a new tool for this. We reused an old one.

What a pull request used to be

For most of its life, the pull request had a single, well-understood job. It was a unit of finished work. You took a definition of done, you implemented it, you packaged the implementation into a branch, and you submitted it to be checked and merged. The PR was a request in the most literal sense: please pull my code into yours. Its entire future was binary. It merged, or it didn’t.

Everything we built around the PR assumed that frame. Review existed to gate the merge. CI existed to protect the merge. The conversation in the comments existed to unblock the merge. A PR that sat open for weeks without merging was a smell, a sign of indecision or abandonment. A PR that got closed unmerged was, softly, a small failure — somebody’s effort that didn’t make it.

I have written before, more than once, that you should think before you implement. Design first. Understand the problem before you reach for the keyboard. I still believe most of that. But it was built on an assumption I want to make explicit, because the assumption is the part that broke: writing the code was the expensive step. Implementation was where your hours went, so implementing the wrong thing was where your hours got wasted. “Think first” was, underneath, a budgeting rule about human time.

That budget no longer looks the way it did.

We stopped spending our time on the keys

The most concrete change in how I work is also the most underrated: I barely type the implementation anymore.

I am not saying this as a slogan. I mean it as a description of where the hours physically go. My day is mostly reviewing, researching, deciding, and steering. Reading a diff and judging it. Comparing two approaches an agent took to the same problem. Working out what we actually want to build before something gets blessed as the next step. The keyboard time that used to dominate — the part where you sit and translate a known design into syntax — has collapsed into a background process that runs while I think about something else.

When the cost of producing an attempt drops to roughly the price of some tokens, the math under “think first” changes. The point of thinking first was never that thinking is holy. It was that building was costly, so you wanted to be sure before you paid. If building a serious first attempt now costs me a little compute and almost none of my own time, then building is part of how I think. The attempt is research. I can ask an agent to go implement a direction I’m unsure about, not because I expect to ship it, but because seeing it built tells me things no amount of upfront design would have.

This is the same shift I keep circling back to in “AI Slop, Human Bandwidth, and Where I Draw the Line”: the scarce resource is not code anymore, and it is not tokens. It is responsible human attention. Generation got cheap. Comprehension and judgment did not. Once you internalize that, you stop spending your scarce attention on producing first drafts and start spending it on evaluating them. And the object you evaluate is the pull request.

A PR carries things an issue never could

Here is the question I keep getting, usually with a slightly impatient tone: if not every PR is headed for a merge anymore, why use a PR at all? Why not write an issue? Why not put it in the project tracker, the spec doc, the task board — the tools literally built for describing work?

Because none of those tools can hold what a PR holds.

A project tracker can hold text and images. It can hold a description, an argument, a mockup, a checklist. That is genuinely useful, and for a lot of work it is enough. But a tracker card does not hold code. It does not have a CI status. It does not carry the perspective you only get from the system actually trying to build the thing. It has no attempt inside it.

A pull request has all of that, and the richness compounds:

It holds real code, often compiled, sometimes running. Not a description of a solution — an instance of one.
It carries a CI pipeline that can build artifacts off that code: a preview image, a rendered diff, a video, a generated report. The same evidence you’d manually paste into a tracker, except it’s produced by the change itself and stays attached to it.
It can expose a live link — a sandbox deploy, an ephemeral preview environment — where the thing is compiled and clickable right now, curated to exactly this branch.
And, if you’ve structured your repository well, it shows cross-cutting impact visually. One feature touches infrastructure, the docs, an architecture diagram, the frontend, the backend, the database — and the PR lays out, in one place, every surface it disturbs and how.

That last point is the one people underrate. A tracker card describing a feature gives you the idea of its blast radius. A PR shows you the blast radius. You can see, concretely, that this “small” change also rewrites a migration and quietly alters an API contract three layers away. That visibility is not a side effect of merging. It is independently valuable, and you get it whether or not the branch ever lands.
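
One way to make that blast radius concrete is to group a PR's changed paths by the surface they belong to. The sketch below is illustrative only: the directory prefixes and surface names in SURFACE_PREFIXES are hypothetical and assume a repository organized by top-level folders. They are not taken from any real project, and the list of changed paths would come from whatever your hosting platform reports for the PR.

```python
from collections import defaultdict

# Hypothetical mapping from path prefix to surface; adjust to your own layout.
SURFACE_PREFIXES = {
    "infra/": "infrastructure",
    "docs/": "docs",
    "web/": "frontend",
    "services/": "backend",
    "db/migrations/": "database",
}


def blast_radius(changed_paths):
    """Group a PR's changed file paths by the surface they disturb."""
    surfaces = defaultdict(list)
    for path in changed_paths:
        # Prefer the longest matching prefix so nested folders win over parents.
        matches = [prefix for prefix in SURFACE_PREFIXES if path.startswith(prefix)]
        surface = SURFACE_PREFIXES[max(matches, key=len)] if matches else "other"
        surfaces[surface].append(path)
    return dict(surfaces)
```

Called on a "small" change, a result with keys for both backend and database is the kind of signal the paragraph above describes: the diff reaches further than the card implied.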

This is the reuse. We took a mechanism built to gate merges and discovered it is also the best container we have for showing what a change means across an entire system. Same object, new job.

A pull request doesn’t have to be mergeable

So let me say the uncomfortable part directly, because it’s the hinge of the whole argument.

A pull request does not have to be mergeable. It does not have to reach production. It does not even have to be used.

In Caro, the project where this way of working got most extreme for me, most PRs behave exactly like this. They are not finished contributions. They are feature requests rendered as attempts — an agent’s best one-shot pass at whatever the original session asked for. Someone described what they wanted, an agent took its single best swing at it, and the result is a branch plus a diff plus a pile of context: the research it did, the decisions it made, the next steps it would take. The PR is, at its core, sometimes just a prompt that grew a body. A prompt, a code artifact, and a description of where this could go.

That sounds like a degradation of the PR. I think it’s the opposite. It’s a PR doing more than it used to, not less. An issue can tell you “here is what we should build.” A PR of this kind tells you “here is what it looks like when we actually try” — with the friction, the surprises, the parts that turned out harder than the card implied, all surfaced by the attempt instead of imagined in advance.

And critically, the agent’s attempt is almost always incomplete. That is not a defect; it is the normal condition. Any non-trivial feature takes several iterations to land. Complex work gets split across days, across multiple passes, across more than one mind. Expecting an agent’s first PR to be a complete, mergeable feature is the same mistake as expecting a human’s first PR on a hard problem to be one. It rarely is. The difference is that now we can afford to make the incomplete attempt anyway, look at it honestly, and learn from it — because it cost us tokens, not a week.

A PR like that doesn’t end at “merge or close.” It can feed the next PR. It can be handed to another agent that consumes the attempt and finishes it. Several of them can be folded together into a real understanding of what we should actually build.

A PR that’s generated, learned from, and thrown away on purpose is a small instance of a larger pattern: software itself becoming disposable, built to exist just long enough to do its job. I follow that thread up a scale in “Disposable Software.”

When “merge” stops meaning “take the code”

This is where the vocabulary starts to strain, in a way I find genuinely interesting.

If I’m staring down something hard, I don’t have to bet on one approach. I can send three agents at it from three different angles and let them run in parallel. Then I “merge” the results — except merge no longer means what it used to. I am not concatenating three diffs into one branch. I am reading three attempts, seeing where they agree and where they diverge, working out which direction actually understood the underlying problem, and using all three to decide how the real thing should be built.

The output of that “merge” might be a fourth PR that resembles none of the three. The three inputs were never products. They were probes. Their value was in what they revealed about the problem, not in the lines they contained.

I touched the edge of this in the AI Slop piece — running multiple coders against one task and using their outputs as competing inputs to a final synthesis. I want to push it one step further here. The convergence isn’t a code-merge at all. It’s a decision. The PRs are how the options make themselves legible enough to choose between. You couldn’t do this with three issue cards, because three issue cards are three guesses. Three PRs are three demonstrations, each with its own CI status, its own preview, its own honest record of where it got stuck.
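
As a rough illustration of "seeing where they agree and where they diverge", the sketch below compares which files each attempt touched. The Attempt shape and the function name are hypothetical, invented for this example. File overlap is only a crude proxy: it shows where to look first, and the decision itself remains a matter of reading the attempts.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    """One agent's PR, reduced to the set of files it touched (hypothetical shape)."""
    name: str
    touched: set = field(default_factory=set)


def compare_attempts(attempts):
    """Return (files every attempt touched, files unique to each attempt)."""
    all_sets = [a.touched for a in attempts]
    agreed = set.intersection(*all_sets) if all_sets else set()
    unique = {}
    for attempt in attempts:
        # Union of everything the other attempts touched.
        others = set().union(*(b.touched for b in attempts if b is not attempt))
        unique[attempt.name] = attempt.touched - others
    return agreed, unique
```

Files in the agreed set are places all probes found unavoidable. Files unique to one attempt are where that probe read the problem differently, which is usually where the interesting review questions are.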

And the resource cost of running them is the thing that makes it reasonable. Three parallel attempts would have been an absurd luxury when each one cost a human a day. Paid for in tokens on a reasonable subscription, they are close to a rounding error, and the converged result is often better than what any single upfront design would have produced, because it was chosen against real evidence instead of imagined ahead of time.

The review and the checks don’t go away — they get more important

I want to be careful here, because it would be easy to read everything above as “merge discipline is over.” It is not. None of this relaxes the bar for a PR that is meant to merge.

When a PR is a real contribution, it goes through the full gate, exactly as it always has. Human code review. Required status checks. Lint, type-check, the test suite. Security and license scanning. Branch protection. Whatever the project’s definition of done and compliance rules demand before anything touches a protected branch. That machinery is not a relic of the old model that the new one outgrows. It is the part that earns a change the right to ship, and I would not merge agent-written code without it any more than I would merge my own.

If anything, agent-authored code raises that bar rather than lowering it. When generation is cheap, the scarce and expensive thing is the reviewer’s judgment and accountability — the same point I made in the AI Slop piece. A PR an agent produced in four minutes can still demand a careful hour to review responsibly, and it should. Cheap to write is not the same as cheap to trust.

And here is the part people miss: those same checks are what make the research mode work at all. CI is the honest signal that an attempt actually compiles and passes, not merely that it reads plausibly. Review comments are how a probe gets steered toward the real problem. The compliance gates — no forbidden license pulled in, no secret committed to the diff, no convention quietly broken — are exactly what let a PR be trusted enough for anyone to act on it. Strip those away and you do not get a richer artifact. You get a confident-looking guess wearing a green checkmark it never earned.

So when I say a PR does not have to be mergeable, I do not mean review and checks are optional. I mean a PR now has two modes, and both run on the same rails. The merge-bound PR earns its merge by passing them. The research PR earns its credibility the same way.

One artifact, many roles

There’s a second kind of reuse hiding in here, and it’s about people rather than process.

Because the PR now carries the whole vertical slice (infra to docs to UI to data), it stops being the exclusive territory of the engineer who would have written the code. A product manager can open one. A product engineer can. A designer can. They’re all working against the same constraints, on the same surfaces, with the same CI and the same preview environment standing behind their change. The PR becomes a shared workspace, and what it can do next depends only on the stage of the work: ship to a sandbox to get a feel for it, or, when it’s ready and owned, ship further.

An issue body can’t be a shared workspace in that way. It can hold everyone’s words, but only one discipline’s artifact — the prose. The PR holds the running thing, which is exactly what lets a designer and a backend engineer and a PM all point at the same object and mean the same object. They’re not describing a feature to each other. They’re standing inside one.

The gap I keep running into

Which brings me to the friction that made me want to write this down in the first place.

There is a persistent, recurring mismatch between how a pull request gets interpreted and how, increasingly, it’s meant. Someone opens a PR as a research artifact — an attempt, a probe, a way of communicating “here’s what this feature could be, here’s the research, here’s a partial working prototype.” And it gets received as what PRs have always been: a finished contribution presented for merge. The reviewer reaches for the ritual — is this correct, is this complete, is this ready to land — when the author was asking a different question entirely: is this the right direction, and what did building it teach us? The reviewer is not wrong to ask those questions. For a merge-bound PR they are exactly the right ones. The mismatch is about mode, not about rigor.

That gap is not a small communication hiccup. I think it’s the central source of confusion in agentic teams right now. The artifact evolved faster than our shared understanding of it. Half the people in the conversation are using the PR as a merge candidate and half are using it as a thinking tool, and they’re staring at the same screen wondering why the other side seems to be missing the point.

Naming the gap is most of the fix. A PR can be a contribution to be merged. It can also be a feature request with working code attached, a product definition that happens to compile, a partial prototype, a decision document with a CI status, a first attempt meant to be consumed by the next one. These are different objects wearing the same interface. The interface stopped telling you which one you’re looking at. So we have to say it — in the description, in the conventions, in the team’s shared vocabulary: this PR is here to be merged, or this PR is here to be learned from.
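
Saying it in the description can be as light as a single line that tooling and reviewers both read. The sketch below assumes a purely hypothetical team convention, a line such as "Mode: research" or "Mode: merge" somewhere in the PR description. Neither the convention nor the function name comes from any platform or existing tool.

```python
import re

# Hypothetical convention: the description contains a line like "Mode: research".
MODE_PATTERN = re.compile(
    r"^mode:\s*(merge|research)\s*$", re.IGNORECASE | re.MULTILINE
)


def declared_mode(description):
    """Return 'merge', 'research', or None when the author did not say."""
    match = MODE_PATTERN.search(description)
    return match.group(1).lower() if match else None
```

Returning None rather than guessing a default is deliberate in this sketch: a PR with no declared mode is exactly the ambiguous case, and the useful response is for the reviewer to ask before reaching for either ritual.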

The attempt is the thinking

I’ll close where I started, with the thing that still feels slightly heretical against my own old advice.

“Think before you implement” was good guidance in a world where implementing was the expensive part. But the implementation is now, often, part of the thinking — because asking an agent to build a direction costs almost nothing I care about, and a built attempt answers questions that staring at a design doc never will. The pull request is where that attempt becomes visible, reviewable, and shareable: code, CI, preview, and cross-cutting impact, all in one object that the rest of the system already knows how to display.

We didn’t need a new tool for the agentic era. We had one all along. We just used only half of what it could do. The merge — gated by review and checks, exactly as it should be — was always one essential job, and it still is. The other job, being the richest way we have to show in running detail what a change would mean, is the one we left mostly on the table, right up until building got cheap and deciding got expensive.

The PR is still a contribution waiting to be merged: gated, reviewed, checked. It’s also a research artifact now. Holding both at once, and being honest about which one you’re handing someone, is the whole shift. Once you start treating it that way, a lot of the friction dissolves — and a lot of the leverage shows up.

Written and developed with Claude. The arguments are mine; the drafting was collaborative. The PR I opened and closed on purpose was, naturally, also mine.