Summary of "My chaotic journey to find the right database"
The video is an in-depth exploration of the challenges and lessons learned while building a complex, rapidly evolving AI chat application (T3 Chat) with a focus on selecting and managing the right database technologies and data models. The creator shares a candid, technical journey involving multiple database switches, synchronization strategies, and architectural decisions.
Key Technological Concepts & Product Features
- Local-first Data Model: The app uses a local-first approach where the entire chat data (threads, messages, projects, tokens) is stored on the client side using IndexedDB, enabling offline navigation and instant UI updates without network round trips. This approach contrasts with many AI chat apps that rely heavily on server-side data fetching, causing delays and poor UX.
- IndexedDB and Dexie: Dexie, a minimalist wrapper for IndexedDB, is used extensively on the client side to manage local storage. Dexie simplifies the notoriously complex IndexedDB API by allowing schema definitions, indexes, and reactive queries (via liveQuery), enabling efficient UI updates as data changes.
- Sync Layer Challenges: Synchronizing local data with the server is complicated, especially with operations like deletions, which require "soft deletes" (marking data as deleted rather than removing it) to avoid data resurrection during sync. The sync mechanism evolved from sending gzipped JSON blobs of the entire DB to a more granular key-value sync per message and thread.
- Server-side Data Storage and Sync: Initially, Redis was used as the server-side key-value store for syncing user data. However, Redis struggled with scale as the number of keys exploded from tens of thousands to nearly a million due to storing individual messages and threads separately. This caused performance degradation and high bandwidth usage.
- Migration to PlanetScale (Vitess-based MySQL): The final stable solution moved server-side data storage to PlanetScale, a horizontally scalable MySQL-compatible database built on Vitess. This handled the large read-heavy workload efficiently, scaling to hundreds of thousands of rows with low latency and stable performance on a modest $30/month plan.
- Drizzle ORM: Drizzle was used as the SQL ORM for PlanetScale, simplifying schema management and queries. The migration from Redis to SQL was smooth and required no client-side changes due to a well-designed API layer.
- Explored Alternatives:
  - Zero (by Rocicorp, the team behind Replicache): A sync engine promising local-first with real-time and collaborative features, but immature, not fully open source, and requiring complex schema and permissions definitions.
  - Jazz: A reactive database with a global state model, but unsuitable due to lack of signed-out state support.
  - TinyBase: Open-source reactive local-first DB with sync via WebSockets or custom solutions, but the WebSocket dependency was a dealbreaker.
  - Legend State: React-focused state management with sync plugins, but sync was either too hand-holding or too raw, leading to the decision to build a custom sync solution.
  - Electric SQL: Postgres sync engine and in-browser Postgres (PGlite), interesting but not production-ready for this use case.
  - Other DBs: Neon, Turso, Supabase, SingleStore, and SQLite-as-a-service offerings were tested or considered but found lacking in scale, reliability, or pricing.
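The soft-delete and per-record sync ideas described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the record shape, the field names (updatedAt, deleted), the helper functions, and the last-write-wins rule are all assumptions made for the example, and plain in-memory objects stand in for IndexedDB and the server store.

```typescript
// Hypothetical record shape; field names are illustrative, not T3 Chat's schema.
interface SyncRecord {
  id: string;
  body: string;
  updatedAt: number; // timestamp of the last write
  deleted: boolean;  // soft-delete flag (a "tombstone")
}

// In-memory stand-in for a key-value store keyed by record id.
type Store = Record<string, SyncRecord>;

function allRecords(store: Store): SyncRecord[] {
  const out: SyncRecord[] = [];
  for (const id of Object.keys(store)) {
    const record = store[id];
    if (record) out.push(record);
  }
  return out;
}

// Soft delete: keep the row, mark it deleted, and bump the timestamp so the
// deletion travels through sync like any other write.
function softDelete(store: Store, id: string, now: number): void {
  const existing = store[id];
  if (!existing) return;
  store[id] = { ...existing, body: "", deleted: true, updatedAt: now };
}

// Granular sync: send only records written after the last sync point,
// instead of shipping the whole database as one blob.
function changedSince(store: Store, since: number): SyncRecord[] {
  return allRecords(store).filter((r) => r.updatedAt > since);
}

// Assumed conflict rule: last write wins, compared per record.
function applyChanges(store: Store, incoming: SyncRecord[]): void {
  for (const record of incoming) {
    const current = store[record.id];
    if (!current || record.updatedAt > current.updatedAt) {
      store[record.id] = record;
    }
  }
}

// What the UI would read: tombstones are filtered out.
function visible(store: Store): SyncRecord[] {
  return allRecords(store).filter((r) => !r.deleted);
}

// Usage: delete on the client, push the tombstone, then pull from the server.
const msg: SyncRecord = { id: "m1", body: "hello", updatedAt: 1, deleted: false };
const client: Store = { m1: { ...msg } };
const server: Store = { m1: { ...msg } };

softDelete(client, "m1", 2);
applyChanges(server, changedSince(client, 1)); // push only what changed
applyChanges(client, changedSince(server, 0)); // pull everything back
console.log(visible(client).length); // 0: the message stays deleted

// Contrast: a hard delete leaves nothing to compare against, so pulling from
// a server that still holds the old row brings it back.
const hardDeleteClient: Store = {};
const staleServer: Store = { m1: { ...msg } };
applyChanges(hardDeleteClient, changedSince(staleServer, 0));
console.log(visible(hardDeleteClient).length); // 1: the message is resurrected
```

The tombstone is what lets the merge step recognize the deletion as the newer write. The trade-off is the one the summary notes: deleted rows stay in the store and every read path has to filter them out.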
Key Lessons and Conclusions
- Local-first is Hard and Often Not Worth It: Building a local-first database solution is complex, error-prone, and often unnecessary unless it is a core differentiator for your product. Most apps can achieve 95% of the desired performance with simpler caching and server-side optimizations.
- Local-first Tools Are Overhyped and Misunderstood: Many "local-first" tools bundle real-time sync and collaboration, which are distinct problems. No one-size-fits-all solution exists, and many tools do not scale or integrate well for large, complex apps.
- Soft Deletes and Sync Workarounds Are Inevitable: Handling deletes and sync conflicts requires workarounds like soft deletes, which complicate the data model and syncing logic.
- Server-driven UI is Best: The video emphasizes that UI should be primarily driven by server data with caching rather than complex local-first models, except in very specialized scenarios.
- Open Source and Control Matter: The creator prefers open-source or at least partially open solutions that can be debugged and fixed independently, which influenced the choice to avoid Zero and similar proprietary sync engines.
- Redis is Not Suitable for Large-scale KV Sync: Redis struggles with millions of keys and large-scale sync workloads, making it unsuitable as a primary sync store in this context.
- PlanetScale is a Great Scalable SQL Solution: PlanetScale’s Vitess-based architecture offers excellent scale, reliability, and performance for server-side SQL storage in a serverless environment, outperforming Redis for