Summary of "The Microservices Scam Nobody Talks About"
Tech/product/architecture analysis: the “microservices scam”
- Microservices are framed as a “tax.” Splitting a monolith into microservices often makes systems slower, more expensive, and harder to decommission (organizationally and operationally), especially when the team isn’t large enough to justify the overhead.
- Key claim: many organizations don’t truly gain autonomy because business logic remains coupled across services.
- Example: changing one rule required touching ~5 services, coordinating multiple releases, and running multiple integration test sets → described as “coordination hell.”
- Debugging can be extremely slow when failures span many services (e.g., an “order failure” traced through 8 services for weeks).
“Distributed monolith” problem
The video describes microservices that are physically separated but logically chained as the “worst of both worlds.”
Recommendation: do a coupling audit before splitting services.
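The summary does not say how the video performs a coupling audit. One possible approach, sketched here as an assumption, is to count how often pairs of candidate services change together in the same commit or release. The `co_change_counts` helper and the sample history are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(changes):
    """Count how often each pair of modules appears in the same change set."""
    pairs = Counter()
    for touched in changes:
        # Sort and de-duplicate so ("a", "b") and ("b", "a") share one key.
        for a, b in combinations(sorted(set(touched)), 2):
            pairs[(a, b)] += 1
    return pairs


# Hypothetical history: each entry lists the modules one change touched.
history = [
    ["orders", "pricing", "inventory"],
    ["orders", "pricing"],
    ["catalog"],
    ["orders", "pricing", "billing"],
]

counts = co_change_counts(history)
for pair, n in counts.most_common(3):
    print(pair, n)
```

Pairs that change together in most commits are candidates to stay in one deployable unit; splitting them would likely reproduce the “coordination hell” described above.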
Performance/latency cost (network physics)
Network calls are expensive compared to in-process calls:
- In-process function calls: nanoseconds
- HTTP over TLS: ~1–10 ms
- Chaining 5 services: ~50–100 ms overhead before business logic runs
Advice: measure inter-service latency directly; a single end-to-end number can be misleading about what the system is actually capable of.
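The latency figures in this section can be checked with simple arithmetic. This is a minimal sketch, assuming strictly sequential calls and ignoring retries, queuing, and serialization; `chain_overhead_ms` is a hypothetical helper.

```python
def chain_overhead_ms(hops, per_hop_ms):
    """Network overhead of strictly sequential calls, in milliseconds."""
    return hops * per_hop_ms


# Five hops at the quoted 1-10 ms per HTTP-over-TLS call.
low = chain_overhead_ms(5, 1)     # 5 ms
high = chain_overhead_ms(5, 10)   # 50 ms

# Per-hop cost implied by the quoted 50-100 ms total for five services.
implied_per_hop = (50 / 5, 100 / 5)   # (10.0, 20.0) ms
```

Under these assumptions, five hops at 1 to 10 ms each add up to 5 to 50 ms, so the quoted 50 to 100 ms total implies roughly 10 to 20 ms per hop, at or above the top of the quoted per-call range.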
Cost breakdown: orchestration, service mesh, observability
Microservices are said to require ~25% more compute than an equivalent monolith due to overhead such as:
- Container orchestration
- Sidecar proxies / service mesh
  - Istio-style sidecars may consume up to ~90% of a pod’s CPU/memory, leaving only leftovers for the workload
Observability at scale is also expensive:
- distributed tracing
- centralized logging
- APM across many services
An estimate given: $50,000–$500,000/year for tooling “to see what’s happening.”
Reported real-world example:
- Prime Video’s AWS serverless microservice stack hit a ceiling at ~5% of expected load
- They “consolidated” into a monolith on EC2/ECS, cutting infrastructure costs by 90%+
Operational staffing model (SRE/DevOps headcount)
A point framed as one that “gets teams fired”: staffing requirements scale with service count.
- Mature microservices: ~1 SRE per 10–15 services
- Less mature: ~1 SRE per 5–10 services
Example: 40 services with only 2 DevOps engineers → described as “already underwater.”
Alternative: a well-architected monolith can be supported with 1–2 DevOps engineers, regardless of internal modules.
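The staffing ratios in this section can be turned into a quick headcount check. This is a minimal sketch, assuming the ratios apply linearly and rounding up to whole engineers; `sre_range` is a hypothetical helper, not something from the video.

```python
import math


def sre_range(services, per_sre_low, per_sre_high):
    """Return (min, max) SREs when one SRE covers per_sre_low..per_sre_high services."""
    fewest = math.ceil(services / per_sre_high)
    most = math.ceil(services / per_sre_low)
    return (fewest, most)


mature = sre_range(40, 10, 15)      # (3, 4)
less_mature = sre_range(40, 5, 10)  # (4, 8)
```

By either ratio, 40 services call for at least 3 engineers, which is consistent with describing a 2-person team as “already underwater.”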
Evidence/metrics cited (DORA)
The argument uses DORA 2024 claims to suggest that elite teams achieve:
- higher deployment frequency
- lower change failure rate
- faster mean time to recovery
Crucial interpretation: the improvement is attributed to modularity (modular monoliths can reach similar results), not microservices specifically.
Bottom line: “Architecture style isn’t a variable—coupling is.”
Preferred pattern: modular monolith
Proposed “best default” architecture:
- Modular monolith with clean internal boundaries (domain-owned modules)
- Often includes a shared database, but with logical schema separation
Strategy:
- Build modules with hard internal API boundaries first
- Distribute only with a specific measurable reason
- Use tools to enforce/verify boundaries:
  - ArchUnit
  - Spring Modulith (or similar runtime/package boundary verification)
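The idea behind boundary verification can be illustrated without any particular tool. The sketch below is not the ArchUnit or Spring Modulith API; it is a hypothetical, language-neutral check that compares observed module dependencies against an allowed list. The module names and the `ALLOWED` rule set are assumptions.

```python
# Hypothetical rule set: each module lists the modules it may depend on.
ALLOWED = {
    "orders": {"catalog", "billing"},
    "catalog": set(),
    "billing": set(),
}


def boundary_violations(dependencies, allowed=ALLOWED):
    """Return sorted (module, dependency) pairs that break the allowed rules."""
    violations = []
    for module, deps in dependencies.items():
        for dep in deps:
            # A module may always reference itself; anything else must be listed.
            if dep != module and dep not in allowed.get(module, set()):
                violations.append((module, dep))
    return sorted(violations)


# Hypothetical observed dependencies, e.g. collected from import statements.
observed = {
    "orders": {"catalog", "billing"},
    "catalog": {"orders"},  # catalog reaching into orders is not allowed
    "billing": set(),
}
violations = boundary_violations(observed)
print(violations)
```

Running a check like this in CI and failing the build on any violation is one way to keep internal boundaries “hard” while everything still ships as a single deployable.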
Migration & evolution guidance
Legacy migration (Strangler Fig pattern):
- route traffic incrementally via API gateway
- retire old code gradually
- avoid big-bang rewrites / long freezes
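Incremental routing in a Strangler Fig migration can be sketched as a prefix table at the gateway. This is an illustrative assumption, not the video's implementation; the `route` function and the path prefixes are hypothetical.

```python
# Hypothetical prefixes already migrated to the new implementation.
MIGRATED_PREFIXES = ["/catalog", "/orders/history"]


def route(path, migrated=MIGRATED_PREFIXES):
    """Send migrated paths to the new code; everything else stays on legacy."""
    for prefix in migrated:
        # Match the prefix itself or anything beneath it, but not "/catalogue".
        if path == prefix or path.startswith(prefix + "/"):
            return "new"
    return "legacy"


print(route("/catalog/items/42"))
print(route("/orders/123"))
```

Migration then proceeds by adding one prefix at a time to the table; once every prefix is listed, the legacy code path receives no traffic and can be retired.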
“Launchpad” framing:
- start with a modular monolith
- extract modules later when you can name the specific constraint (scaling spike, team size, polyglot needs, etc.)
AI coding tools + architecture warning (2025 DORA research claim)
Claim:
- AI tools increased task completion (+21%) and pull request volume (+98%), but delivery performance stayed flat.
Explanation:
- AI amplifies change volume
- in tightly coupled architectures, that generates more bugs faster due to higher churn
Recommendation:
- fix coupling before adopting AI coding assistance
- modular architectures allow more isolated review/verification
Organizational alignment: “org chart writes the code”
The video claims architecture and communication structure mirror each other.
Solution: align team ownership to business capabilities:
- “Stream-aligned” teams own a capability end-to-end (e.g., the catalog team owns the catalog, the order team owns orders)
- A threshold is mentioned: “two-pizza team” size (~8–10 people); if a team needs to be bigger, the domain or architecture is too tangled
It also argues that repeated reorganizations without architectural/ownership fixes cause codebases to “snap back” to the same tangled structure.
Industry trend statistics (microservices retreat)
- CNCF survey (2025): ~42% of organizations are actively consolidating microservices back into larger deployable units.
- Service mesh adoption dropped from 18% (2023) to 8% (2025).
Conclusion: for ~90% of applications, modular monolith is positioned as the right choice due to reduced “network tax,” unless independent scaling or organizational constraints truly require microservices.
When microservices are justified (and what they really “buy”)
Microservices are presented as valuable when you need:
- independent scaling for components with extreme load variance
- organizational independence (bounded blast radius for decisions like runtime/language/data store changes)
Caution: for small teams (“not 12 engineers and a dream”), the overhead likely outweighs the benefits.
The speaker disputes “CPU efficiency” as the real reason; the true purchase is independence.
FinOps / cost governance (architecture as a financial decision)
“In 2026, FinOps is board-level” claim:
- the CTO must justify infrastructure spend in quarterly reviews
Practical guidance:
- use cost calculators to estimate costs before spinning up services
- document tradeoffs in ADRs (Architecture Decision Records) for any decision above a cost threshold
- make dollar amounts visible and decisions reversible
Example:
- microservices infrastructure: $80,000/month
- reduced to $4,000/month after moving to a monolith with the same feature set
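The size of the saving in this example follows directly from the two quoted figures. A short worked calculation, assuming both numbers are steady monthly costs in USD:

```python
before, after = 80_000, 4_000  # monthly figures quoted above, in USD

monthly_saving = before - after                 # 76000
reduction_pct = 100 * monthly_saving / before   # 95.0
annual_saving = 12 * monthly_saving             # 912000
```

That is a 95% reduction, or $912,000 per year if the monthly figures hold.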
Mindset takeaway
- Engineers aren’t choosing tech options; they’re allocating complexity budgets:
  - distributed calls = complexity investment
  - more services = cognitive load investment
  - more coordination = velocity investment
- “Monolith is not a failure; microservices are not maturity.”
- Key principle: tight coupling is the failure; aim for modular logic, while deployment location should remain changeable.
Architecture decisions are ultimately decisions about coupling, independence, and the complexity/financial budget you’re willing to pay.
Main speakers / sources
- Main speaker: the video narrator/author (referred to as “I” throughout), who mentions a blog, a link tree, and “link in the description” for further material.
- Sources/statistics cited:
  - DORA data (2024) (deployment frequency, change failure rate, MTTR)
  - DORA research (2025) on AI tools (task completion, PR volume, delivery performance)
  - CNCF survey (2025) (microservices consolidation and service mesh adoption)
- Mentions Amazon Prime Video and the AWS Step Functions / Lambda ecosystem as an example case study.