Ideals for Developers — Not the Ideal Developer
Engineering culture is full of signals about what kind of developer to be. Some of those signals are worth following. Many of them optimize for the wrong thing — and the costs show up in the code.
The Question in the Pause
A developer is in a job interview. The hiring manager leans forward: "We move fast here. Are you comfortable working nights and weekends when the project demands it?"
The question sounds reasonable. The developer pauses. In the pause, something is known that hasn't been said.
The question is not assessing competence. It is asking: are you willing to normalize sacrificing yourself for the company's inability to plan? Are you willing to accept manufactured urgency as the default operating mode and call it passion? The question is a filter — and what it filters for is compliance with a system that externalizes its costs onto developers, and onto future codebases that no one in this room will maintain.
What the developer wants to say, and almost never does: "I'll do what it takes in a true emergency. I won't normalize treating every week as an emergency. Those are different things, and the distinction matters."
The role being offered is not for the developer who does great work. It is for a person-shaped tool.
This essay is about the distinction that surfaces in that pause. The claim is simple: being happy, relaxed, and authentic produces better code and better teams. This is not a claim about working less. It is a claim about which internal conditions produce the judgment, taste, and verification that software now requires most. The AI era makes this argument more urgent, not less.
The State Being Defended
Before the evidence, a definition. The word "relaxed" is easily misread as passive or unambitious. The state being defended here is something more specific.
There is a quality of mind that produces the best software, and it is not the quality produced by fear. It is not relaxed in the sense of low stakes. The developer in this state may care intensely. They may stay late because the problem still has them. But there is a difference — felt in the body, visible in the code — between staying because the problem has you and staying because someone will think less of you if you leave.
What this state actually consists of: the freedom to think out loud without editing the thought before it forms. The ability to say "I don't know" without that admission becoming a permanent mark in someone's mental ledger. The willingness to pursue a direction that might be wrong, because being willing to be wrong is the only way to find what's right. The capacity to sit with an unresolved problem long enough that the problem itself tells you what it needs.
None of this looks productive from outside. It doesn't show up in commit graphs. It is what happens in the hour before the commit, in the quiet between the PRs, in the walk that looks like avoidance but is actually where the work gets done. Environments that optimize for visible productivity destroy this state because they make the invisible work dangerous.
The ideal being defended is also not a call for comfort over craft. A developer who is happy is not one who is never challenged — they are one who finds the work meaningful. A developer who is authentic is not one who avoids conflict — they are one whose work reflects genuine values rather than a performed role. These ideals are in direct tension with what might be called struggle-porn culture: companies that brag about intense hours, treat exhaustion as a signal of commitment, and measure work by presence rather than output.
I. What We Know
Psychological safety — the strongest single finding
Google's Project Aristotle studied over 180 teams and found that psychological safety — the shared belief that the team is safe for interpersonal risk-taking — was the top predictor of team effectiveness, ranking above individual skill, seniority, or team composition. Amy Edmondson (Harvard Business School) first operationalized the concept in 1999 in hospital teams. Her mechanism: psychological safety enables learning behaviors — questioning, admitting uncertainty, reporting failures, proposing unproven ideas — that are necessary for complex, knowledge-intensive work. Under threat, people protect themselves. Under safety, they improve the system.
The application to software is direct. A team where developers fear judgment for imperfect PRs, wrong estimates, or "I don't understand this" conversations will conceal uncertainty, ship code they're not confident in, and miss the early architectural problems that are cheap to fix and expensive to defer. This is not a culture problem with soft consequences — it is an engineering problem with hard ones.
The DORA finding: elite teams don't trade off
The DORA research (DevOps Research and Assessment, annual since 2014) is the largest ongoing empirical study of software delivery performance. Its central finding, replicated across nine years: elite-performing teams are also the ones with generative cultures, psychological safety, and lower developer burnout. Speed and stability are not in tension at the elite level — they are correlated. The teams deploying most frequently also have the lowest change failure rates.
This result directly defeats the "you can have high performance or you can have wellbeing, not both" argument that underlies struggle-porn culture. The DORA data shows that culture predicts delivery performance more strongly than any single technical practice.
The cultural underpinning is Ron Westrum's typology (2004): organizations can be pathological (power-oriented, messengers shot, failure punished), bureaucratic (rule-oriented, messengers tolerated), or generative (performance-oriented, messengers rewarded, failure treated as learning). DORA's "generative culture" is Westrum's third type. Software teams in this typology outperform on every delivery metric.
SPACE and the developer satisfaction argument
The SPACE framework (Forsgren, Storey et al., Microsoft Research / GitHub, 2021) measures developer productivity across five dimensions: Satisfaction and wellbeing, Performance, Activity, Communication and collaboration, Efficiency and flow. The first dimension is not an afterthought — the authors explicitly argue that developer satisfaction is a productivity input, not just a soft outcome.
The most important structural insight from SPACE: over-optimizing for Activity (lines of code, commits, tickets closed) can actively degrade Satisfaction and Communication. The framework reveals the metric-corruption dynamic in measurable form: the proxy destroys the thing it was supposed to track.
The practical argument for satisfaction as engineering infrastructure is cumulative. Satisfaction predicts retention; retention preserves institutional knowledge; institutional knowledge is load-bearing in any non-trivial codebase. The team that burns through engineers every eighteen months is rebuilding that infrastructure from scratch on a constant cycle.
Mood and code quality
Graziotin et al. (IEEE Transactions on Software Engineering, ~2017) is the most direct individual-level evidence: positive developer mood correlates with better problem-solving performance and fewer bugs, negative affect with disengagement and higher error rates. This is not team-level data, which strengthens the mechanistic argument — it runs through individuals, not just cultural aggregates.
The cost of interruption
Gloria Mark's research at UC Irvine finds that after an interruption, average recovery time to return to the prior cognitive state is approximately 23 minutes. Interrupted workers compensate by working faster but make more errors and report higher stress. Parnin and DeLine (ICSE, 2010) extend this to software specifically: developers spend significant time after an interruption reconstructing the mental model of the code — reloading not just the task but the state, the direction, the next step.
A day with six 30-minute meetings does not cost three hours. It prevents deep work entirely. A developer who cannot achieve flow cannot produce work that requires it.
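The arithmetic behind that claim is worth making explicit. What follows is a minimal back-of-envelope sketch: the schedule is hypothetical, and it treats Mark's roughly 23-minute figure as the refocus cost of each interruption.

```python
# Back-of-envelope sketch (hypothetical schedule): an 8-hour day with six
# 30-minute meetings spread evenly, charging ~23 minutes of refocus time
# per interruption (Mark's figure) against the remaining work time.

WORKDAY_MIN = 8 * 60   # 480 minutes
MEETINGS = 6
MEETING_MIN = 30
REFOCUS_MIN = 23       # approximate recovery time after each interruption

meeting_time = MEETINGS * MEETING_MIN               # 180 min in meetings
refocus_time = MEETINGS * REFOCUS_MIN               # 138 min reloading context
usable = WORKDAY_MIN - meeting_time - refocus_time  # 162 min actually usable

# Six interruptions cut the day into seven fragments, so the average
# uninterrupted stretch is about 23 minutes.
fragments = MEETINGS + 1
avg_block = usable / fragments

print(f"nominal non-meeting time: {WORKDAY_MIN - meeting_time} min")  # 300
print(f"usable after refocusing:  {usable} min")                      # 162
print(f"average focused block:    {avg_block:.0f} min")               # ~23
```

Under those assumptions, the nominal three free hours collapse into fragments averaging about twenty-three minutes each, roughly the length of the recovery period itself.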
The crunch counter-experiment
The game industry is the most documented natural experiment on sustained overwork in software development. The evidence is consistent across post-mortems, IGDA developer satisfaction surveys, and detailed accounts (Jason Schreier's Blood, Sweat, and Pixels and Press Reset): crunch increases the rate of errors while simultaneously reducing the team's capacity to catch them. Both the error rate and the review quality are affected by exhaustion. The practice persists not because the evidence supports it but because the culture that produces it also produces the framing that normalizes it.
II. The Startup Case
The counter-argument must be stated fairly before it can be answered. The strongest version: most startups are default dead, meaning that at current growth the money runs out before revenue can sustain the company. The only escape is growth fast enough to change the fundraising timeline. In winner-take-most markets, second place may be worth nothing. This creates genuine structural pressure for speed that looks like superhuman effort from outside. Paul Graham's essays frame this as physics: increase the number of trials per unit time to raise the probability of finding product-market fit before the money runs out.
This argument applies most cleanly to the early-startup survival phase. Concede the domain.
What the evidence does not support is the extension of this frame beyond the survival phase — and the evidence on what the survival phase actually requires is less favorable to intensity than is typically claimed. CB Insights, Startup Genome, and Failory analyses of startup failures consistently show the primary causes as: no market need, ran out of cash, wrong team composition, getting outcompeted, and product problems. Culture factors appear in roughly 5–10% of cases, typically as toxic team dynamics (fear, conflict, blame), not as insufficient intensity. If chronic intensity were causally necessary for success, "insufficient drive" would appear regularly as a failure cause. It does not.
The correlation between intense startup cultures and success is largely selection. The Attraction-Selection-Attrition framework explains it: intense cultures attract people already predisposed to high-intensity work. Those who aren't a fit leave. The observed high output may be mostly a function of who passed the filter rather than what the culture produced from a general population. Repeat founder testimony is consistent: first-venture intensity is typically emergent from inexperience and genuine mission alignment, not something that can be manufactured by management and applied to others. Subsequent ventures tend to be calmer and equally or more productive.
The humane-growth-at-scale counter-examples also matter. GitLab reached a $6.5B IPO with a fully remote, async culture and explicit "no hero culture" values documented in a public handbook. Atlassian's co-CEOs stated publicly their opposition to all-nighter culture before the company completed its IPO — not after success was secured. Mailchimp bootstrapped its way to a roughly $12B acquisition by Intuit, with a documented anti-hustle stance.
The honest synthesis: intense effort from people who are intrinsically motivated about a specific problem can produce extraordinary output. Paul Graham's model requires the John Carmack distinction to already be in place: founders who cannot help working this hard because they love the problem. When companies apply this as managed policy — extracting intensity from people who don't share the intrinsic motivation through social pressure, surveillance, and fear — they get the cost without the output. The struggle-porn culture extracts from the Carmack example exactly the wrong lesson.
There is also a distinction that rarely gets named in startup culture discourse, between genuine and manufactured urgency. Genuine urgency is external in origin, time-bounded, shared, and rare — a competitor move, a production outage, a regulatory deadline. Manufactured urgency is internal, has no resolution state, and is the default operating mode. Real urgency burns hot and clean. Manufactured urgency leaves ash. The distinction is not detectable from a commit graph.
III. The AI Era
The argument that began in empirical research gets structurally stronger as AI tools become standard. Here is why.
A 2022 GitHub/Microsoft study found developers using Copilot completed a well-defined task 55% faster than controls. The figure is cited everywhere. What it actually measures — a single, narrow JavaScript task, with no follow-up on cognitive load, code quality at scale, or wellbeing over time — is cited nowhere near as often. The complementary finding from Perry et al. (2022): developers working with an AI assistant produced code with significantly more security vulnerabilities, and were often unaware they had done so. The "faster" and "less secure" findings are both real. Together they tell a more honest story: AI tools accelerate generation, and the verification burden they create is real and consequential.
As generation becomes cheap, the bottleneck shifts. The expensive operations are now: architectural judgment (what to build, how it fits, what the second-order consequences are); taste (what is the right solution, not just a working solution); verification (is this AI output correct, secure, and idiomatic in ways that matter); and problem decomposition (framing the query that gets useful output). These are not rote tasks. They are high-cognitive-load, experience-dependent, context-saturated operations that require exactly the mental state that sustained stress degrades.
This means the "wellbeing is nice but the company needs output" framing misunderstands which output now matters. A developer who produces twice as many AI-generated lines under pressure but whose architectural judgment is impaired produces compounding technical debt, security vulnerabilities, and wrong-direction work at a cost that exceeds any line-count gain. The Perry security-vulnerability finding is a practical instantiation: developers accepted code that looked roughly right without the scrutiny that would have caught the subtle semantic error. The verification burden is not optional — it is where the risk now lives. Under stress, this is precisely the work that gets compressed.
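A hypothetical illustration of the failure mode, not drawn from the study itself: the function below is the kind of completion that reads cleanly and passes a happy-path test, and the flaw in it is exactly the sort that compressed review approves. The schema and names are invented.

```python
import sqlite3

# Plausible assistant output: clean, idiomatic-looking, passes a quick test.
def find_user(conn: sqlite3.Connection, username: str):
    # The subtle flaw: string interpolation makes the query injectable.
    # find_user(conn, "x' OR '1'='1") returns every row in the table.
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()

# What unhurried review produces: a parameterized query. The diff is one
# line; the difference is whether anyone had the attention to see it.
def find_user_reviewed(conn: sqlite3.Connection, username: str):
    query = "SELECT id, email FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```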
There is also a risk in the opposite direction: AI-powered productivity surveillance. The tooling exists — commit frequency analysis, PR cycle time tracking, code review rate monitoring — and the incentive to apply it to performance management exists too. Organizations using AI-derived productivity metrics to raise throughput expectations are extrapolating far beyond what the evidence supports. In adjacent knowledge-work fields (customer service, content moderation) the pattern is documented: AI monitors throughput, management uses the data to set new baselines, workers face escalating targets with no natural ceiling. The results: measurable increases in stress, monitoring anxiety, and reduced job satisfaction. The same infrastructure now exists in software engineering.
Monitoring creates a specific rational incentive: approve AI-generated code quickly (visible throughput) rather than perform the slow scrutiny that catches subtle errors (invisible, penalized by cycle time metrics). The monitoring system is precisely inverted from what AI-era software work requires.
A third dimension of the AI era argument is worth naming honestly: the emotional experience of watching a significant fraction of your technical fluency become cheap. Not obsolete — that is the wrong word. The skills still work. Their rarity has changed. There is a difference between a thing having no value and a thing having stopped being a differentiator, and the emotional register of the two is completely different. The second is a kind of grief that doesn't have a name yet.
What remains when the generation is cheap is harder to hold and more important: the ability to look at the AI's output and know what it doesn't know it doesn't know. The unstated case, the production environment assumption, the edge at the intersection of two behaviors the test suite doesn't cover. This was always the most valuable part of the work. AI makes it visible by removing everything around it. The honest emotional experience is grief mixed with liberation, and both are real simultaneously.
IV. The Mechanism
For a technical audience, the case should not rest on correlation alone. There is a mechanistic chain that connects the neuroscience of stress to the operations that now constitute the bottleneck in software work.
Link one: stress impairs prefrontal cortex function. Arnsten (2009, Nature Reviews Neuroscience) established the mechanism: cortisol and norepinephrine, released under stress, activate the amygdala and simultaneously suppress prefrontal cortex function. The PFC governs working memory (holding multiple considerations active simultaneously), cognitive flexibility (switching between framings, updating understanding), and inhibitory control (suppressing the first plausible answer in order to reach a better one).
The Yerkes-Dodson inverted-U describes the relationship between arousal and performance: performance rises with moderate arousal and falls with high or sustained arousal. Crucially, the peak depends on task complexity. Simple, well-defined tasks have a higher peak — they can tolerate more arousal before performance degrades. Complex, ambiguous tasks requiring integration have a lower peak and a steeper falloff. The tasks being displaced by AI are exactly the ones where the Yerkes-Dodson argument for pressure is strongest. The tasks that remain are exactly the ones where it is weakest.
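Yerkes-Dodson has no canonical equation, but the shape of the claim can be written schematically. In the sketch below every symbol is illustrative rather than empirical: performance P as a function of arousal a, with a peak location and width that both shrink as task complexity c grows.

```latex
% Schematic only: an inverted-U whose peak moves and narrows with complexity.
% P = performance, a = arousal, c = task complexity.
P(a; c) = \exp\!\left(-\,\frac{\bigl(a - a^{*}(c)\bigr)^{2}}{2\,\sigma(c)^{2}}\right),
\qquad \frac{da^{*}}{dc} < 0, \qquad \frac{d\sigma}{dc} < 0
```

The two derivatives encode the paragraph's two claims: complex tasks peak at lower arousal, and they fall off more steeply around that peak.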
McEwen (1998, New England Journal of Medicine) established the duration dimension: allostatic load is the cumulative physiological cost of chronic or repeated stress activation. Acute stress followed by genuine recovery accumulates little allostatic load. Chronic stress without recovery accumulates structural cost — not just fatigue, but degradation of the PFC itself.
Link two: interruptions fragment the cognitive work that matters. Mark, Gudith & Klocke (CHI, 2008) found that knowledge workers take an average of 23 minutes to return to a task after an interruption. Parnin and DeLine (ICSE, 2010) extend this to software specifically: developers must reload not just the task but the state — where they were, what they were thinking, what the next step was. In AI-assisted development, the verification task that now constitutes the core of the work is precisely the task most disrupted by interrupted attention.
Link three: repetitive high-stakes judgment fatigues. Danziger, Levav and Avnaim-Pesso (PNAS, 2011) documented Israeli parole judges' favorable ruling rates falling from approximately 65% at the start of a session to nearly zero just before a break, then recovering fully after the break. The analogy to AI code verification is structural: both involve sequential scrutiny of discrete items, high stakes if errors are missed, no external signal reliably identifying which items require more scrutiny, and no natural endpoint. The tendency under fatigue is to default to the less effortful response. For judges, that meant denial of parole. For developers reviewing AI-generated code, the fatigued default is approval — accepting code that looks roughly right without the scrutiny that would catch the subtle error. The Perry finding is a practical instantiation.
Link four: recovery is structurally required. Harrison and Horne (2000) documented a consistent pattern in sleep deprivation studies: routine, well-practiced task performance degrades less under sleep loss than innovative, flexible, plan-updating decision-making. The capacities most sensitive to sleep restriction are exactly the ones the argument identifies as the AI-era bottleneck. Recovery is not comfort — it is the maintenance schedule for the cognitive machinery that now constitutes the core of the work.
The mechanism also runs through surveillance. Amabile (1996) showed that expected evaluation degrades creative performance by approximately 20% — specifically impairing open-ended, integrative, exploratory thinking while leaving rote task performance relatively intact. Deci and Ryan's self-determination theory research established that perceived surveillance shifts motivational orientation from intrinsic to extrinsic. Monitoring on commit frequency and PR cycle time rewards high-velocity visible activity. It penalizes the deep, slow work that architectural judgment and careful code review require — which looks like low activity from outside. The monitoring system creates an incentive structure precisely inverted from what AI-era software work requires.
The complete chain: AI displaces routine code generation. The bottleneck shifts to high-complexity, high-ambiguity, integrative cognitive tasks. These tasks perform at their peak at lower arousal levels than routine tasks. Sustained manufactured urgency accumulates allostatic load and suppresses PFC function through well-understood neuroendocrine mechanisms. This impairment is subtle on routine tasks — which is why it goes undetected under output metrics — and significant on complex integrative tasks — which is why architectural debt, verification failures, and security vulnerabilities accumulate. Decision fatigue compounds this for verification specifically. Surveillance infrastructure adds motivational suppression. Recovery is the mechanism by which the cognitive capacity at the bottleneck is restored.
V. The Code as Evidence
The argument so far is empirical and mechanistic. There is also a kind of evidence that doesn't appear in any study — but that engineers with enough experience recognize immediately.
A codebase reflects its culture.
Code written under chronic pressure has a characteristic signature: it handles the case the developer was thinking about when they wrote it, and not the others. The edge cases are missing — not because the developer didn't know about them, but because the cognitive state that anticipates what you don't know requires a kind of spaciousness that pressure forecloses.
Read the diffs. You can often tell which code was written in flow and which was written under duress. One is tight and confident, occasionally a minimal abstraction that does more than it appears to. The other is verbose and defensive, its escape hatches visible everywhere: the proliferation of TODO comments, the special-cased branches, the function that grew past its original purpose because there wasn't time to refactor and the developer knew it but kept going anyway. Code written by someone who had time to think looks different. It has edges, guards, structure that anticipates what it doesn't know.
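A hypothetical sketch of the two signatures, with the function and its cases invented for illustration: the same parsing task, written under the two conditions.

```python
# Written under pressure: handles the case in the ticket and nothing else.
def parse_price(value):
    return float(value.replace("$", ""))  # TODO: commas? negatives? blanks?

# Written with room to think: edges anticipated, not patched in later.
def parse_price_considered(value: str) -> float:
    """Parse a price like '$1,234.56' into a float.

    Fails loudly at the boundary -- on empty input, negatives, or
    non-numeric text -- instead of letting bad data drift downstream.
    """
    cleaned = value.strip().lstrip("$").replace(",", "")
    if not cleaned:
        raise ValueError("empty price string")
    price = float(cleaned)  # raises ValueError on non-numeric text
    if price < 0:
        raise ValueError(f"negative price: {value!r}")
    return price
```

Neither version is wrong on the happy path. The difference is everything around the happy path, which is where production lives.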
Two kinds of fatigue
There are two kinds of tired after a hard day of writing code.
One is clean, like the ache from a long hike. The problem was genuinely difficult and you were genuinely inside it. You followed a trail of logic until it opened onto something. You forgot to eat. Time collapsed. The code you produced in those hours has a certain quality — not because you were trying to be elegant, but because you had the uninterrupted time to find the core shape of the thing. You are tired in the way that means something was used well.
The other fatigue is different. It's dirty, sticky. You worked as many hours and produced as many lines, but the work was performed rather than done. You were thinking about being seen working rather than thinking about the problem. The function you wrote handles the case that was specified. You knew while you wrote it that there were other cases, but checking would mean slowing down, and slowing down would look like not trying. The fatigue here is the fatigue of alienation — of having been somewhere else while your hands moved.
Developers know these two states in their bodies. The distinction almost never comes up in any review, retrospective, or performance conversation. The cultures built around the second state call it productivity.
The Carmack distinction from the inside
John Carmack working through the night because the renderer glitch had teeth — because solving it was the puzzle pulling him forward — is a different thing from a developer grinding past midnight under the ambient pressure of a culture that calls exhaustion dedication.
From outside, they look the same. Both are still at the keyboard at 2am. But the phenomenology is entirely different.
In the first state, the pull feels like flying. Time dissolves. Coffee goes cold. The breakthrough arrives with a quality of surprise even when you were almost there. The code that comes out has contact with the material — it isn't just correct, it shows that someone was actually thinking.
In the second state, the push feels like wading through mud. You can see the next step but not two steps ahead, and the architecture of what you're building is increasingly opaque to you. The code that comes out is done. Rarely delightful. It handles the stated cases. The developer knew while writing it that it was B-work. There was no other available gear.
What Carmack's example actually demonstrates is that the conditions for extreme productive effort are internal. They cannot be manufactured by someone else. Trying to replicate the output without the internal state produces something that looks similar on a time-tracking spreadsheet and is categorically different in quality. The companies that invoke Carmack as justification for demanding intensity have misread the example in a way that is convenient for them.
The ideals this thesis defends do not produce passivity. They produce a specific kind of attention: to what matters and what doesn't, to genuine urgency when it arrives, to the difference between a problem that requires slowing down and a problem that requires moving. The relaxed state is not low-arousal disengagement. It is the absence of threat, which is the precondition for the kind of presence that complex work requires.
Appendix: What Organizations Have Built
The following cases document what specific organizations have built and what they say about why. This appendix does not tell anyone what to do. The reader draws the inferences.
None of these cases constitute proof that these practices caused their success. The evidence for causation is not present. What they document is that organizations at significant scale, in competitive markets, have explicitly rejected intensity-as-virtue as a permanent operating mode — and described their reasoning.
GitLab
Primary source: handbook.gitlab.com (~2,000 pages, publicly versioned)
GitLab's handbook is the most extensively documented engineering culture in public existence. It is operational guidance, not a mission statement — updated by employees, versioned publicly.
"We don't want heroics. We want sustainable processes. If you find yourself in a situation that seems to require heroics, that's a signal of a process problem, not an individual problem."
"Working more than you should is a sign of a process problem, not a virtue."
GitLab's async-first communication norm emerged from a distributed team across timezones. The handbook's stated reasoning is operational: written communication produces a permanent record and preserves meeting time for decisions that require it. The "no hero culture" value is explicitly framed as a systems argument: heroics create single points of failure and mask underlying fragility. The appropriate response to a situation requiring heroics is to fix the process.
What this documents: Heroic effort was classified as a process-failure signal — not a cultural value — in a publicly versioned handbook at a company that completed a $6.5B IPO.
Basecamp / 37signals
Primary sources: It Doesn't Have to Be Crazy at Work (Fried & Heinemeier Hansson, 2018); Shape Up (public)
"Calm is what you get when you stop chasing 'crazy.' Calm is collected, considered, and in control... The most brilliant things are done when you're at your best, not your worst."
"If you can't complete your work in a normal week, we want to understand why, and fix the problem, not make the person work more."
Fried and Heinemeier Hansson state explicitly that their model works for them and that they are not writing a universal prescription. Their independence and size give them options that larger, investor-driven companies may not have. Career progression is slower. The model is not designed for VC-scale hypergrowth. Shape Up's six-week cycles with built-in cool-down periods between cycles are the most operational version of the recovery argument: recovery is structured into the work rhythm, not left to individual discretion.
What this documents: Sustained calm and protected maker time were deliberate operational choices articulated at length over 20+ years, with trade-offs acknowledged in the same text.
Shopify
Primary sources: Tobi Lütke, NYT 2016; Invest Like the Best podcast 2019; shopify.engineering
"Psychological safety is the most important thing in a team. If people can't say 'I think this is wrong' or 'I don't understand this' without fear, you're going to make worse decisions." — Tobi Lütke
"The person who's going to spot the problem is often the most junior person in the room, because they're not carrying the assumption that the thing is going to work. If they can't speak, you lose that signal." — Tobi Lütke
Shopify's "trust battery" is an operational metaphor used internally: every relationship starts at 50% charge, and actions either charge or drain it. The stated reasoning is practical — trust reduces friction, and making it explicit gives teams vocabulary for something that is usually invisible. Psychological safety here is framed not as a wellness initiative but as a decision quality mechanism. Suppressing uncertainty suppresses the signal that catches errors.
What this documents: Psychological safety was framed as an operational decision-quality tool — not an HR initiative — and documented during the company's growth phase, not retrofitted after scale.
Atlassian
Primary sources: Scott Farquhar, AFR 2019; Mike Cannon-Brookes interviews; TEAM Anywhere documentation
"I don't believe in the all-nighter culture. I've never pulled an all-nighter at Atlassian and I'm proud of that." — Scott Farquhar
"We've always tried to build a company where people don't have to work ridiculous hours. We think you do better work when you're rested." — Mike Cannon-Brookes
Both co-founders stated this position publicly before Atlassian completed its IPO — not after success was secured. This removes the hypothesis that sustainable culture is a post-success luxury adopted only once the hard part is over. TEAM Anywhere's async-first norm explicitly acknowledges the trade-off: async requires structured investment in in-person time for trust-building that would otherwise happen naturally in a co-located environment.
What this documents: The anti-crunch position was on record from both founders before scale, under competitive pressure, in a market where intensity-culture competitors existed.
Valve
Primary source: Valve Employee Handbook (confirmed authentic by Valve, leaked 2012)
"We want innovators, and that means maintaining an environment where they'll flourish... The reason is that the best people want to be self-directed."
"Figure out what it means to work at Valve, figure out what's important, figure out what's right. That's all anyone can tell you."
The Valve handbook is unusual for the candor of its reasoning. It explains not just what Valve does but why — including acknowledged costs. The handbook explicitly notes that radical self-direction "can feel terrifying when you first start. We know." Gabe Newell has acknowledged in interviews that the model produces costly hiring mistakes and can create coordination failures. The handbook's theory of motivation — that telling creative, motivated people what to do signals distrust of their autonomy and degrades the quality of their output — does not require the specific flat structure to be relevant.
What this documents: Radical self-direction was a deliberate design choice based on an explicit theory of human motivation and creative output, with the founders acknowledging both benefits and costs in the same document.
References
| Source | Relevance |
|---|---|
| Forsgren, Humble, Kim — Accelerate (2018) | The DORA research in book form. The culture chapters are the most direct empirical support for the central thesis. |
| Edmondson — The Fearless Organization (2018) | Foundational mechanism for psychological safety. Chapters 3–5 are the key empirical argument. |
| Graziotin et al. — IEEE TSE (~2017) | Individual-level evidence linking developer mood to problem-solving quality and bug rates. |
| Arnsten — Nature Reviews Neuroscience (2009) | Mechanism by which stress suppresses prefrontal cortex function. Link one in the cognitive chain. |
| McEwen — New England Journal of Medicine (1998) | Allostatic load: the cumulative physiological cost of chronic stress. Duration dimension of the mechanism. |
| Mark, Gudith & Klocke — CHI (2008) | 23-minute recovery time from interruptions; interrupted workers work faster but make more errors. |
| Parnin & DeLine — ICSE (2010) | Mental model reconstruction after interruptions in software development specifically. |
| Danziger, Levav & Avnaim-Pesso — PNAS (2011) | Decision fatigue in sequential high-stakes judgment. Analogue for AI code verification. |
| Amabile — Academy of Management Journal (1996) | Expected evaluation degrades creative performance ~20%. Mechanism for surveillance's cognitive cost. |
| Westrum — Quality & Safety in Health Care (2004) | Generative/bureaucratic/pathological organizational typology underlying the DORA culture findings. |
| Forsgren, Storey et al. — Queue (2021) | SPACE framework. Satisfaction and wellbeing as productivity input, not just soft outcome. |
| Perry et al. — CCS (2022/2023) | AI-assisted developers produced code with more security vulnerabilities, often unaware they had done so. |
| Sennett — The Craftsman (2008) | Philosophical grounding for intrinsic motivation and what external metrics do to the internal standard. |
| Newport — Deep Work (2016); Slow Productivity (2024) | The case for structured, quality-focused effort over pseudo-productive busyness. |
| Schreier — Blood, Sweat, and Pixels (2017); Press Reset (2021) | Detailed documented evidence of what crunch produces at scale in the games industry. |
| Kellogg, Valentine & Christin — Academy of Management Annals (2020) | Algorithmic management in knowledge work. Documented mechanism for Taylorism 2.0 in adjacent fields. |
| Walker — Why We Sleep (2017) | REM sleep and integrative understanding. Recovery as maintenance for complex-task cognitive capacity. |
| Weick & Sutcliffe — Managing the Unexpected (2001/2007) | High-reliability organizations and cognitive reserve. The strongest operational argument for sustainable baseline. |