The Pragmatic Summit 2026
The Pragmatic Summit — San Francisco, February 11, 2026
Gergely Orosz organized a one-day summit on AI and software engineering in San Francisco, hosted by Statsig. Attendees, mostly builders and practitioners, were selected through an application process. Sessions were recorded.
I’ve been a Pragmatic Engineer subscriber since ~December 2021. This was the first Pragmatic Summit.
Sessions I attended
Welcome — Gergely Orosz
Gergely opened by noting he started The Pragmatic Engineer 8 years ago while at Uber, wanting to share engineering’s open secrets without hype.
How AI is reshaping the craft of building software — Vijaye Raji (CTO Applications, OpenAI), Tibo Sottiaux (OpenAI)
OpenAI’s engineering leadership described how bottlenecks keep shifting inside their own org: code generation got fast, so the bottleneck moved to code review, then to understanding customer needs.
- OpenAI reported a 5x productivity increase internally
- Tibo leads Codex with 33 direct reports in a flat structure
- Hiring ~100 summer interns and new grads, the first such batch; Codex handles onboarding
- Alexander, a PM, “hyper-leveraged himself with Codex” — building prototypes directly
- Designers ship code. Roles blur across product, design, and engineering
- Asked for a prediction for the next couple of years, he replied that “2 years is way too long” for predictions and instead described what he expects in ~6 months
- Models are more capable when given larger tasks with built-in QA loops for feedback (a minimal sketch of this loop follows the list)
- Demo day depth has increased consistently — more corner cases solved, not just happy paths
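The “larger tasks with built-in QA loops” point describes a now-common agent pattern: hand the model the whole task, verify with real checks, and feed failures back. Below is a minimal sketch of that loop, assuming a hypothetical `generate_patch` model call and a pytest suite as the QA gate; it is an illustration of the pattern, not OpenAI’s Codex internals.

```python
import subprocess

def generate_patch(task: str, feedback: str | None = None) -> str:
    """Hypothetical model call: returns a candidate diff for the task."""
    raise NotImplementedError  # stand-in for whatever coding model you use

def apply_patch(diff: str) -> None:
    """Apply a candidate diff to the working tree via `git apply`."""
    subprocess.run(["git", "apply", "-"], input=diff, text=True, check=True)

def run_tests() -> tuple[bool, str]:
    """The built-in QA loop: run the test suite and capture its output."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def solve(task: str, max_rounds: int = 5) -> bool:
    """Generate, test, and feed failures back until the QA gate passes."""
    feedback = None
    for _ in range(max_rounds):
        apply_patch(generate_patch(task, feedback))
        ok, output = run_tests()
        if ok:
            return True    # QA gate passed
        feedback = output  # failures become the next round's context
    return False
```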
Data vs. hype — Laura Tacho (CTO, DX)
Laura Tacho created the Core 4 metrics framework and has 10+ years in developer tools and productivity. Her talk was framed around the space program: wonder balanced with pragmatism.
Industry numbers:
- 92.6% of developers use AI coding assistants; 44.1% use them daily
- 4.08 hours saved per week per developer on average (Google reports ~10% savings — hasn’t changed dramatically)
- 26.9% of code is AI-authored, up from 22% last quarter
- AI cuts onboarding time in half: ~90 days to reach 10 merged PRs (Q1 2024) down to ~40 days (Q4 2025). Effect persists 2+ years
Agentic adoption:
- 50.5% daily agentic tool usage
- OpenAI reported 1M Codex downloads in the prior week
- Trillions of tokens processed per week (per OpenAI)
- 95% of OpenAI developers use Codex (internal figure)
What the data actually shows:
- Some organizations saw 2x more customer-facing incidents after AI adoption. Others saw 50% fewer. Same tools, different outcomes. Tooling isn’t causal — engineering discipline and guardrails dominate outcomes
- “Orgs that were dysfunctional are dysfunctional faster”
- “Adoption doesn’t mean impact” — tools enhance individual productivity, not P&L performance
- Spray and pray doesn’t work. Successful orgs point AI at a concrete goal and measure progress
- AI measurement framework: utilization, impact, cost
Her recommendations:
- Invest in developer experience: feedback loops, documentation, fast CI, solid testing. “Just call it agent experience when chatting with the C-suite” 😉
- Treat AI as an organizational problem, not a technical one. Barriers are change management and lack of executive backing, not models or tools
- Reference: DORA AI capabilities model, ThoughtWorks AI readiness framework
Case studies cited:
- Haven (healthcare) — fine-tuned HIPAA-compliant model on 100K historical messages for migraine patient messaging. 3x industry average CSAT
- 18K Cisco developers using Codex
- JPMorgan’s multi-agent annotation framework (MAFA) for labeling customer interaction data
Reinventing software — Martin Fowler, Kent Beck, Gergely Orosz
Martin Fowler and Kent Beck in conversation. One of the strongest sessions of the day.
I hope to write more about this session in a follow-up post. For now, the key points:
- Fowler: “Programming is NOT the bottleneck”
- Fowler: “AI is an amplifier” — amplifies good practices and bad practices equally
- Fowler: “To go faster you need higher quality”
- Beck: “Value of software = features + what we can do in the future”
- Beck: “At this moment, no one has the answers”
- Fowler said he’s seen nothing “with the magnitude of AI” in his career — not object oriented, not the internet, not agile
- Beck: “What’s the smallest experiment I can run?” — the useful skill right now is validating cheaply
- Fowler described a Venn diagram of agent experience and developer experience; the overlap was nearly complete: “Our craft done well, works for agents”
- Beck noted the temptation to go solo, one person managing agents instead of leading people. He pushed back on it: we are still building for humans.
Product engineering in an AI-native world — Michelle Lim (Flint), Tuomas Artman (Linear), Drew Hoskins (Temporal), Margaret-Ann Seger (OpenAI)
Panel on how product engineering teams are changing.
- Artman (Linear): “It comes down to testing — as a code reviewer I concentrate on tests”
- “Product becomes engineer, or engineer becomes product”; the lines blur
- PMs prototyping with Replit. Product teams using agents as standard workflow
What’s next: building world-class engineering orgs — Rajeev Rajan (CTO, Atlassian), Thomas Dohmke (ex-CEO GitHub), Gergely Orosz
Closing panel.
- Rajan: “Agents — we just shift to the right. New bottleneck.”
- Atlassian experimenting with smaller, more creative teams (managing down to sqrt(n) direct reports via agents)
- Dohmke: “Lines blur between product, designer, engineer”
- Dohmke’s new company is Entire. Its first product is Checkpoints, an open-source CLI that automatically collects agentic session information, pushes it to Git, and enables rollback and rewind (a rough sketch of the idea follows this list)
- Gergely noted Entire’s funding as “the biggest seed round in dev tool history” (speaker claim); Dohmke noted inflation 😏
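The core idea behind Checkpoints, as described, is capturing agent-session state in Git so you can rewind. Here is a minimal sketch of that idea using plain git; it is my reconstruction, not Entire’s actual CLI, and the tag naming scheme is hypothetical.

```python
import subprocess
import time

def checkpoint(message: str) -> str:
    """Record an agent-session checkpoint: commit the current working tree,
    then tag the commit so it is easy to rewind to later."""
    subprocess.run(["git", "add", "-A"], check=True)
    subprocess.run(
        ["git", "commit", "--allow-empty", "-m", f"checkpoint: {message}"],
        check=True,
    )
    tag = f"agent-checkpoint-{int(time.time())}"  # hypothetical naming scheme
    subprocess.run(["git", "tag", tag], check=True)
    return tag

def rewind(tag: str) -> None:
    """Restore the working tree to a prior checkpoint. Destructive:
    `git reset --hard` discards anything committed or staged after the tag."""
    subprocess.run(["git", "reset", "--hard", tag], check=True)
```

Note that `git reset --hard` throws away uncommitted work; a production tool would presumably rewind more carefully.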
Roundtable: AI product development — evals, annotations, tools
Small group session led by [RC Johnson (Bumble)](https://www.linkedin.com/in/rcjohnson/). A minimal code sketch of the two eval types follows the list.
- Two eval types: deterministic evals and LLM-as-judge
- Guardrails for off-topic detection using classifiers
- Scale challenges with live evals: latency and cost
- Product teams should own model selection and testing
- Reference: hamel.dev for evals guidance
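To make the roundtable’s two eval types concrete, here is a minimal sketch of a deterministic check, an LLM-as-judge grader, and an off-topic guardrail stub. `call_llm`, the judge rubric, the 1–5 scale, and the `on_topic` label are my assumptions, not the session’s actual setup.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical LLM client; swap in your provider's SDK."""
    raise NotImplementedError

# Deterministic eval: exact, code-checkable assertions. Cheap enough to run live.
def eval_deterministic(output: str, required_substrings: list[str]) -> bool:
    return all(s in output for s in required_substrings)

# LLM-as-judge: a second model grades the output against a rubric.
# Slower and costlier, which is the latency/cost problem with live evals.
JUDGE_PROMPT = """Rate the RESPONSE to the QUESTION from 1 to 5 for accuracy
and helpfulness. Reply with only the number.

QUESTION: {question}
RESPONSE: {response}"""

def eval_llm_judge(question: str, response: str, threshold: int = 4) -> bool:
    raw = call_llm(JUDGE_PROMPT.format(question=question, response=response))
    return int(raw.strip()) >= threshold

# Guardrail: a cheap classifier screens off-topic input before the main model runs.
def on_topic(message: str, classify) -> bool:
    """`classify` is any text classifier returning a label string."""
    return classify(message) == "on_topic"
```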
Hallway conversations
- OpenAI Codex team: discussed agent repo strategies, skill composition patterns, and MCP vs. direct API tradeoffs
- Statsig: experiments and feature flags platform, dynamic config, positioning for AI-native teams
Sessions I missed (recordings to watch)
- Lessons from building Cursor — Sualeh Asif (CPO, Cursor), Alex Xu, Sahn Lam
- Vercel v0 and d0 agent — Malte Ubl (CTO, Vercel)
- New AI product at Ramp — Ian Tracey, Veeral Patel, Will Koh
- Coding agents for ICs — Simon Willison, Marcos Arribas (VP Eng, Statsig)
- Building AI applications — Chip Huyen
- High-performing teams — Nicole Forsgren, Gergely Orosz
- Uber’s agentic shift — Ty Smith, Anshu Chadha
Tools and products mentioned
- Codex (OpenAI) — agentic coding platform
- Cursor — AI code editor ($10B+ valuation)
- Statsig — product-led experiments and feature flags; think of it as operating a level above LaunchDarkly
- Linear — issue tracking
- Entire / Checkpoints — open-source CLI for agentic session capture
- DX / Core 4 — developer productivity metrics
- DORA — AI capabilities model
- Replit — used by PMs for prototyping