Summary of "Scaling Uber with Thuan Pham (Uber’s first CTO)"
Thuan Pham: Scaling Uber During “Hypergrowth”
Thuan Pham (Uber’s first CTO) explains how Uber scaled its engineering systems during “hypergrowth,” and how the pressure shaped both the company’s architecture and its culture.
Early Uber Scaling Crisis & the Need for Internal Tooling
- In 2013, when Pham joined Uber, it had:
  - ~40 engineers
  - ~30,000 rides/day
  - systems that crashed multiple times per week
- The key failure mode: architectures that worked functionally but could not scale, especially dispatch/matching.
- Dispatch became an urgent bottleneck (described as a “brick wall”), leaving only months to fix it.
- This drove:
  - continuous rewrites
  - strong internal engineering practices
Dispatch Rewrites: Surviving the “No Runway” Problem
Pham explains that dispatch could not scale because it effectively depended on single-threaded execution, and that attempts to scale it with faster hardware ran into partitioning issues.
To make progress under extreme time pressure, he emphasizes:
- Being explicit about “where the wall is,” for example capacity timelines for places like New York
- Setting minimal scaling constraints, such as:
  - multiple boxes powering one city
  - one box powering multiple cities
- A survival-first principle:
Don’t aim for perfect long-term architecture when time is short. Rewrite to buy survival time—“live to fight another day”—then fix the next choke point.
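The two ideas above, knowing where the wall is and meeting the minimal partitioning constraints, can be sketched in a few lines of Python. This is an illustrative sketch only, not Uber's dispatch design: the capacity figure, the growth rate, the placement table, the box names, and the use of a rider ID as the shard key are all assumptions made for the example. Only the ~30,000 rides/day figure comes from the summary.

```python
import math
import zlib


def weeks_until_wall(current_load, capacity, weekly_growth):
    """Weeks until load reaches capacity under compound weekly growth.

    Solves current_load * (1 + weekly_growth) ** t = capacity for t.
    """
    if current_load >= capacity:
        return 0.0
    if weekly_growth <= 0:
        return math.inf
    return math.log(capacity / current_load) / math.log1p(weekly_growth)


class DispatchRouter:
    """Routes a request to a box using a static city-to-boxes table.

    The table is hypothetical. It only shows that both constraints can
    hold at once: a big city is spread over several boxes, and several
    small cities share one box.
    """

    def __init__(self, placement):
        self.placement = placement  # city -> list of boxes serving it

    def box_for(self, city, rider_id):
        boxes = self.placement[city]
        # crc32 is stable across processes, unlike Python's built-in hash().
        return boxes[zlib.crc32(rider_id.encode()) % len(boxes)]


# Assumed numbers: 30,000 rides/day today, a 60,000 rides/day ceiling,
# and 10% growth per week.
wall = weeks_until_wall(30_000, 60_000, 0.10)

router = DispatchRouter({
    "new_york": ["box-1", "box-2", "box-3"],  # multiple boxes, one city
    "boise": ["box-4"],                       # one box, multiple cities
    "tucson": ["box-4"],
})
```

Under these assumed numbers the wall is a little over seven weeks away, which is the kind of explicit deadline the section describes. The static table is the simplest thing that satisfies both constraints; a real system would also need rebalancing, which this sketch leaves out.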
China Launch in 4–5 Months: Feasibility + Security Engineering
Uber’s CEO Travis Kalanick demanded a China launch in ~2 months, which Pham says was unrealistic for a straightforward migration because:
- Uber couldn’t run two divergent systems concurrently
- Uber lacked staffing to maintain both
Instead, the solution required building a partitioned system that could operate on Chinese soil without security/data-control “bleed-through,” which added deployment/release complexity.
- The timeline ultimately stretched to ~5 months
- A key execution tactic: incremental city rollout
  - start with the hardest/biggest city first (e.g., Chengdu)
  - later launches became “downhill”
Why Uber Ended Up with Thousands of Microservices (and Why It Later Reduced)
Pham describes a two-stage path to microservices:
1) Organizational split into “program/platform”
- This came first as an organizational response to friction.
- Functional teams created bottlenecks because feature launches depended on too many cross-team negotiations.
2) Operational need to avoid an API monolith
- Microservices arrived later because new backend work couldn’t keep being added to the API monolith without slowing velocity.
Decomposition lagged behind business demand
- A dedicated decomposition effort (“Darwin”) tried to break apart the monolith.
- But as demand continued accelerating, decomposition couldn’t keep up.
- Result: a large microservice footprint (described as thousands).
Cleanup as growth stabilized
- As growth stabilized, Uber improved structure and reduced complexity using:
  - higher-level domain organization
  - better observability/tooling
  - cleanup efforts (e.g., an “ARC” cleanup initiative)
- The microservice count later came down somewhat (e.g., fewer in 2026 than in 2016).
Internal Tools and Open-Source “Breaking Points”
Pham notes that Uber initially relied heavily on open source, but hit limits in reliability and observability.
A painful example:
- PostgreSQL failures at scale led to:
  - outages
  - debugging uncertainty
- Uber lacked:
  - deep vendor-level support for specific failure modes
  - internal experts with the right depth of expertise
This pushed Uber to build:
- its own infrastructure / data / monitoring layers
- plus some externally published tooling, such as:
  - tracing/observability
  - custom data/control layers
Helix: Rewriting the Uber Consumer App for Scalability + Extensibility
Pham recounts Helix, Uber’s large-scale app rewrite (where the interviewer met Pham early on).
- The motivation wasn’t only aesthetics.
- Uber’s existing app limited the ability to add new services/features.
- The rewrite required changes across the system, including shifting from polling to push/real-time patterns.
- It took 6–8+ months across large mobile and backend teams.
Long-term value is framed as future-proofing the architecture.
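To make the polling-versus-push distinction concrete, here is a minimal in-memory Python sketch. It is an illustration of the general pattern, not Helix's actual design: the `TripUpdates` class, its method names, and the status strings are hypothetical, and a real mobile app would push over a network connection rather than call a local callback.

```python
from collections import defaultdict


class TripUpdates:
    """In-memory stand-in for a backend that tracks trip status."""

    def __init__(self):
        self._status = {}
        self._subscribers = defaultdict(list)
        self.reads = 0  # counts polling reads, to make their cost visible

    # Polling style: the client asks repeatedly, changed or not.
    def get_status(self, trip_id):
        self.reads += 1
        return self._status.get(trip_id)

    # Push style: the client registers once and is called on each change.
    def subscribe(self, trip_id, callback):
        self._subscribers[trip_id].append(callback)

    def set_status(self, trip_id, status):
        if self._status.get(trip_id) == status:
            return  # nothing changed, so nothing is pushed
        self._status[trip_id] = status
        for callback in self._subscribers[trip_id]:
            callback(status)


server = TripUpdates()

# Push: two real changes produce exactly two notifications.
received = []
server.subscribe("trip-1", received.append)
server.set_status("trip-1", "driver_assigned")
server.set_status("trip-1", "driver_assigned")  # duplicate, not pushed
server.set_status("trip-1", "arriving")

# Polling: five requests, all returning the same unchanged status.
polled = [server.get_status("trip-1") for _ in range(5)]
```

With push, work is proportional to the number of changes; with polling, it is proportional to how often clients ask, regardless of whether anything changed.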
Leadership & Culture: Engineering Rigor and “No Mickey Mouse Shop”
Pham stresses that serious engineering practices become essential as systems grow, including:
- code quality
- naming conventions
He also discusses leadership structuring:
- splitting leadership levels into L5A/L5B
- creating clearer growth milestones
- reducing “stagnation” in long promotion timelines
People Systems: Easier Internal Transfers to Retain Talent
Pham pushed for a more permissive internal transfer process because requiring internal permission made transferring harder for engineers than leaving and re-interviewing externally.
His philosophy:
- engineers should have free will
- managers should be incentivized to develop and place talent
- managers shouldn’t act as gatekeepers
Career Philosophy & CTO Mission
Pham describes a “purpose” framework that structured his “three tours” at Uber:
- Fixing broken systems and improving reliability
- Global scaling (including China and broader scaling)
- Steering through the turbulence/transition after leadership changes, until the company stabilized under a new CEO’s direction
He argues the most important CTO jobs are:
- Build a high-performance team:
  - high talent density
  - alignment
  - trust
- Look 2 years out (“see around the corner”) while execution happens in the nearer term
AI Outlook at Faire (Current Role)
Pham says AI is already improving productivity and output, including:
- coding support via agentic workflows
- “swarm coding” and orchestrators
The next challenge is using AI to help build features on existing entangled legacy codebases, not just greenfield work.
He concludes that while tools change, top-performer traits remain:
- curiosity
- fearlessness
- innovation
He warns complacency is still “death.”
Presenters or Contributors
- Thuan Pham (interviewee; Uber’s first CTO)
- Travis Kalanick (referenced; Uber CEO during early scaling, who recruited Pham and ran a long interview loop for the CTO role)
- Charles (referenced; sharing a memory of Pham’s talk)
- Bill Gurley (referenced; Benchmark Capital)
- Jeff Holden / Yuki (referenced; leaders involved in Uber initiatives)
- Statsig (presenting sponsor; mentioned by the show)
- Sonar (sponsor; mentioned by the show)
- WorkOS (sponsor; mentioned by the show)
- Max (mentioned; CEO of Faire)
- Seyo(a) (mentioned; partner who arranged the Faire meeting)
Category
News and Commentary