Anthropic Economic Research · ~400,000 Claude Code Sessions · 2026-06-16

AI Is a Personal Amplifier
400K sessions prove domain expertise beats coding ability

A privacy-preserving analysis of ~400,000 Claude Code sessions from ~235,000 people (October 2025 to April 2026). People decide what to build; the agent decides how to build it. Whether a session succeeds depends more on how well the user understands the problem than on whether they can code. Two numbers frame the study.

By occupation
within 7 pts
In code-producing sessions, every one of the ten largest occupations lands within seven percentage points of software engineers. A coding background barely decides success.
By expertise
more than 2×
Sessions rated expert reach verified success more than twice as often as those rated novice. Domain expertise is what decides success.
01 · Nine work modes

What people use Claude Code for

Each session is classified into the single work mode that best describes what it is trying to accomplish. About 56% of sessions involve writing, fixing, or testing code directly; 17% operate software; 14% plan or explore; and 13% produce analysis or prose.

Figure 1 · Nine work modes (static cross-section)
Building new: 25%
Fixing: 26%
Testing & orchestrating: 5%
Operating software: 17%
Understanding: 8%
Planning: 6%
Data analysis: 7%
Writing/prose: 6%
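The grouped percentages in the summary sentence can be checked against the eight shares listed in Figure 1. The sketch below is illustrative only: the mapping from individual modes to the four groups is an assumption inferred from the wording of the summary, and the names `mode_share`, `groups`, and `group_shares` are hypothetical.

```python
# Work-mode shares as listed in Figure 1 (percent of sessions).
mode_share = {
    "Building new": 25,
    "Fixing": 26,
    "Testing & orchestrating": 5,
    "Operating software": 17,
    "Understanding": 8,
    "Planning": 6,
    "Data analysis": 7,
    "Writing/prose": 6,
}

# ASSUMPTION: this mode-to-group mapping is inferred from the summary
# sentence; the source does not spell it out.
groups = {
    "code": ["Building new", "Fixing", "Testing & orchestrating"],
    "operate": ["Operating software"],
    "plan_or_explore": ["Understanding", "Planning"],
    "analysis_or_prose": ["Data analysis", "Writing/prose"],
}


def group_shares(shares, grouping):
    """Sum the per-mode shares inside each group."""
    return {name: sum(shares[m] for m in modes) for name, modes in grouping.items()}


totals = group_shares(mode_share, groups)
print(totals)
```

Under this mapping the four groups come to 56, 17, 14, and 13, matching the summary and adding up to 100.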
02 · Division of labor

People decide what to build, the agent decides how

A decision-attribution classifier separates every meaningful decision in a session into planning (what to do, which approach, what counts as done) and execution (which files to change, what code to write, which commands to run), then attributes each to the user or to Claude.

Figure 2 · Share of decisions made by the user (distribution with median & IQR)
User's share of planning decisions: ~70%
User's share of execution decisions: ~20%

Median: users make about 70% of planning decisions but only about 20% of execution decisions. The "how" is largely delegated to the agent.
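As a rough illustration of how such a median could be computed once every decision has been labelled, the sketch below takes the user's share per session and then the median across sessions. It is not the study's pipeline: the `Decision` record, the helper names, and the toy sessions are all hypothetical.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Decision:
    kind: str   # "planning" or "execution"
    actor: str  # "user" or "claude"


def user_share(decisions, kind):
    """Fraction of decisions of `kind` made by the user; None if there are none."""
    relevant = [d for d in decisions if d.kind == kind]
    if not relevant:
        return None
    return sum(d.actor == "user" for d in relevant) / len(relevant)


def median_user_share(sessions, kind):
    """Median per-session user share, skipping sessions with no such decisions."""
    shares = [user_share(s, kind) for s in sessions]
    return median(s for s in shares if s is not None)


# Toy data, invented for illustration only.
sessions = [
    [Decision("planning", "user"), Decision("planning", "user"),
     Decision("execution", "claude"), Decision("execution", "claude")],
    [Decision("planning", "user"), Decision("planning", "claude"),
     Decision("execution", "user"), Decision("execution", "claude"),
     Decision("execution", "claude"), Decision("execution", "claude")],
    [Decision("planning", "user"), Decision("execution", "claude")],
]

print(median_user_share(sessions, "planning"))
print(median_user_share(sessions, "execution"))
```

Taking the share per session first, rather than pooling all decisions, keeps long sessions from dominating the result. This matches the figure's framing as a distribution with a median and IQR.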

A typical session runs about four turns. Each prompt sets off a chain of around 10 Claude actions on average (sometimes over 100), producing about 2,400 words per turn. When the user keeps execution control (>80% of execution decisions), Claude takes about 8 actions per turn; when Claude takes over planning (>80% of planning decisions), it takes about 16.

"People decide what to build, and the agent decides how to build it."
Source: Anthropic, "Agentic coding and persistent returns to expertise"
03 · What expertise means

Task-specific, unrelated to job title, and it directly buys more output

A classifier rates each user's apparent expertise at the task on a five-point scale, from novice to expert. It looks at three signals: how precisely the user frames their directions, what they ask Claude to verify, and whether the user tends to correct Claude or Claude tends to correct the user. This expertise is task-specific, distinct from job title or general ability.

The study's two examples

A senior engineer asking their first Rust question is a beginner at Rust.

An accountant who has never used Python, but tells Claude exactly which reconciliation rules a script must enforce and catches the edge case it mishandles at month-end close, is an expert at that task.

Figure 3 · Claude actions & output words per prompt, by expertise level
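Novice: ~5 actions, ~600 words
Beginner: ~7 actions, ~1,100 words
Intermediate: ~8 actions, ~1,700 words
Advanced: ~10 actions, ~2,400 words
Expert: ~12 actions, ~3,200 words
Annotations: +9% actions per level; +13% output per level; about 5× output from novice to expert.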

Controlling for work mode, task value, month, occupation, and model family, the trend remains significant: about +9% actions and +13% output per expertise level (p < 0.001). The gap appears within every kind of work and every band of task value.
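To see how the controlled per-level effects relate to the raw gaps in Figure 3, the arithmetic can be worked through directly. The sketch assumes the per-level percentages compound multiplicatively across the four steps from novice to expert. That functional form is an assumption, and the helper name is hypothetical.

```python
LEVELS = ["novice", "beginner", "intermediate", "advanced", "expert"]


def compounded_multiplier(per_level_gain, steps):
    """Total multiplier if a fixed percentage gain compounds at each step."""
    return (1 + per_level_gain) ** steps


steps = len(LEVELS) - 1  # four steps from novice to expert

# Controlled effects reported in the text (ASSUMED to compound per level).
actions_mult = compounded_multiplier(0.09, steps)
output_mult = compounded_multiplier(0.13, steps)

# Raw novice-to-expert ratios read off Figure 3 (approximate values).
raw_actions_ratio = 12 / 5
raw_output_ratio = 3200 / 600

print(f"controlled: actions x{actions_mult:.2f}, output x{output_mult:.2f}")
print(f"raw:        actions x{raw_actions_ratio:.2f}, output x{raw_output_ratio:.2f}")
```

Under this assumption the controlled novice-to-expert multipliers are about 1.41× for actions and 1.63× for output, against raw ratios of about 2.4× and 5.3×. The summary does not break down how much of the difference each control accounts for.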

04 · Returns to expertise

More expertise means more success, and a better chance of recovering

Success is measured two ways: judged success (a classifier reads the full transcript) and the stricter verified success (judged successful plus at least one hard signal: matching git commits, passing tests, or explicit user affirmation). Across all measures, more expertise means more success. Most of the gain comes from novice to intermediate; the slope flattens between intermediate and expert.
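The relationship between the two measures can be expressed as a small predicate. This is a minimal sketch of the definition as stated here, not the study's code; the `SessionOutcome` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionOutcome:
    judged_success: bool               # classifier verdict from the transcript
    has_matching_commit: bool = False  # hard signal: matching git commit
    tests_passed: bool = False         # hard signal: passing tests
    user_affirmed: bool = False        # hard signal: explicit user affirmation


def is_verified_success(outcome):
    """Verified success = judged success AND at least one hard signal."""
    hard_signal = (
        outcome.has_matching_commit
        or outcome.tests_passed
        or outcome.user_affirmed
    )
    return outcome.judged_success and hard_signal


# Judged successful but with no hard signal: not verified.
print(is_verified_success(SessionOutcome(judged_success=True)))
# Judged successful and tests passed: verified.
print(is_verified_success(SessionOutcome(judged_success=True, tests_passed=True)))
```

Because verified success needs both conditions, it can never exceed judged success, which is why it is the stricter measure.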

Figure 5a · Session outcomes by expertise level
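Novice: 15% verified success, 62% partial success, 23% failure
Beginner: 24% verified success, 64% partial success, 12% failure
Intermediate: 28% verified success, 63% partial success, 9% failure
Advanced: 31% verified success, 61% partial success, 8% failure
Expert: 33% verified success, 59% partial success, 8% failure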
Figure 5b · Among sessions that hit trouble
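Novice: 4% verified, 56% partial, 21% failed, 19% abandoned (0 lines)
Intermediate: 10% verified, 71% partial, 13% failed, 6% abandoned (0 lines)
Expert: 15% verified, 66% partial, 14% failed, 5% abandoned (0 lines)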

Abandoned rate (judged failed, zero lines written): novice 19% vs everyone else 5-7%. The least experienced users give up at several times the rate of everyone else when struggling.

05 · Occupation vs expertise

A coding background is becoming less relevant: all 10 occupations within 7 points

The study infers occupation from transcripts using the SOC taxonomy and explicitly instructs the classifier not to treat coding as evidence of a coding profession. A lawyer who builds a script to flag missing clauses is mapped to Legal Occupations.

Figure 6 · Verified success rate by occupation (code-producing sessions)
Management: ~35%
Computer & Mathematical (incl. SWE): 34%
Business & Financial Operations: ~32%
Life, Physical & Social Sciences: ~31%
Arts, Design & Media: ~31%
Education & Training: ~30%
Legal: ~30%
Architecture & Engineering: ~29%
Sales & Related: ~29%
Healthcare: ~28%

Management is slightly above software engineering. This may reflect management skills transferring to directing an agent; it may also partly reflect measurement (verified success rests partly on explicit confirmation, and managers may be more likely to state that a task succeeded). Under the looser "at least partial success" measure: 89% vs 88%.
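The "within seven points" claim can be checked against the Figure 6 values. The sketch treats the approximate (~) figures as exact integers, which is an assumption, and the variable names are hypothetical.

```python
# Verified success rates from Figure 6 (percent); "~" values treated as exact.
verified_success = {
    "Management": 35,
    "Computer & Mathematical (incl. SWE)": 34,
    "Business & Financial Operations": 32,
    "Life, Physical & Social Sciences": 31,
    "Arts, Design & Media": 31,
    "Education & Training": 30,
    "Legal": 30,
    "Architecture & Engineering": 29,
    "Sales & Related": 29,
    "Healthcare": 28,
}

baseline = verified_success["Computer & Mathematical (incl. SWE)"]

# Signed gap, in percentage points, between each occupation and the baseline.
gaps = {occ: rate - baseline for occ, rate in verified_success.items()}
max_gap = max(abs(g) for g in gaps.values())
spread = max(verified_success.values()) - min(verified_success.values())

print(f"largest gap from baseline: {max_gap} points")
print(f"top-to-bottom spread: {spread} points")
```

With these rounded values, the largest gap from the software-engineering baseline is 6 points (Healthcare) and the top-to-bottom spread is 7 points. Both are consistent with the headline claim.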

06 · The work is shifting

Debugging share nearly halved; task value rose ~27%

The composition of work changed substantially over the seven months. The share of sessions spent fixing broken code fell by nearly half; the freed-up share went to operating software, writing, and data analysis.

Figure 4 · Work composition, Oct 2025 → Apr 2026 (time series)
Fixing: 33% → 19%
Operating: 14% → 21%
Writing + analysis: ~10% → ~20%
Axes: month (Oct 2025 to Apr 2026) vs. share of sessions

Tasks also grew more valuable. Approximated by what the work would cost on a freelance marketplace, the average session value rose about 27% (the Key Findings summarize this as "about 25% on average"). By work mode, the increases were building +43%, operating +34%, and fixing +32%. The study notes these are coarse estimates meant for relative comparison over time.

07 · What to watch

Two dynamic signals, and a few honest limits

The study frames the overall picture as agentic coding amplifying some kinds of knowledge while substituting for others. The gains come mostly from competence, not mastery: "proficiency in a domain is enough to use the tool almost as effectively as those with deep mastery."

Signal 01
Whether returns to expertise start to fall
If the returns decrease over time, models are starting to supply the judgment users currently bring, and gains are broadening beyond domain experts.
Signal 02
Whether non-software occupations keep rising
If the share of coding sessions completed successfully by users outside software occupations continues to grow, software production may be becoming part of ordinary work in every field.

Stated limits: the study cannot measure real-world outcomes (whether code is actually used); it excludes non-interactive usage (a substantial share); and all classifications rely on a model reading the transcript (an appendix shows alignment with independent telemetry).

Source

A single official source, faithfully restated

Official · Agentic coding and persistent returns to expertise

anthropic.com · 2026-06-16 · Economic Research (Zoe Hitzig, Maxim Massenkoff, Eva Lyubich, Ryan Heller, Peter McCrory). All claims, numbers, definitions, and quotes here come from this source. All figures are Anthropic's own and cannot be independently verified. Classifiers use Claude Sonnet 4.6; data excludes third-party IDEs, SDKs, and non-interactive claude -p usage. Figure 5's "Beginner" and "Advanced" rows are interpolated from the reported +9%/+13% per-level regression coefficients (the study reports specific numbers for novice, intermediate+, and expert only).