Version: 1.0.1

Replication Cache: A Low-Latency Data Solution

Replication Cache is a caching system that significantly reduces data retrieval latency. It uses Change Data Capture (CDC) to monitor a database's binary log, keeping a local, in-memory cache in sync with the source database. This architecture is well suited to high-throughput, latency-sensitive applications using MySQL-compatible databases, because reads served from the local cache require no network call to a central database or a distributed cache.

The Problem

A traditional database query might have a latency of 15 milliseconds, and a distributed cache can bring that down to around 1 millisecond. That's a huge improvement, but it overlooks a critical question: is the real problem simply a slow lookup?

For an application like Brightspot, which can perform hundreds or even thousands of lookups on a single page, those small delays accumulate. Even a fast 1-millisecond query to a distributed cache becomes a major source of latency and network usage when you multiply it by a thousand: issued one after another, a thousand such lookups add roughly a full second to the page. The true bottleneck isn't the speed of a single lookup; it's the fact that so many lookups require a network trip in the first place.

The Solution

Replication Cache tackles these issues by having the database proactively push changes directly to a local, in-memory cache within the web application.

By eliminating network calls for read operations entirely, it delivers data at near-zero latency, freeing up network bandwidth and shifting the focus from speeding up lookups to eliminating them altogether.

It achieves this with CDC by subscribing to the database's transaction log (the binlog in MySQL). The system reads binlog events and applies every INSERT, UPDATE, and DELETE operation directly to the local cache in near real-time.
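The apply step described above can be sketched as follows. This is a minimal illustration, not the product's implementation: the RowEvent shape, the LocalCache class, and the table and key names are all hypothetical, and a real binlog event carries considerably more detail than shown here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


# Hypothetical, simplified stand-in for a parsed binlog row event.
@dataclass(frozen=True)
class RowEvent:
    op: str                               # "INSERT", "UPDATE", or "DELETE"
    table: str
    key: Any                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new row image; None for DELETE


class LocalCache:
    """In-memory cache keyed by (table, primary key)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, Any], Dict[str, Any]] = {}

    def apply(self, event: RowEvent) -> None:
        """Apply one change event so the cache mirrors the source row."""
        cache_key = (event.table, event.key)
        if event.op in ("INSERT", "UPDATE"):
            if event.row is None:
                raise ValueError(f"{event.op} event needs a row image")
            self._rows[cache_key] = dict(event.row)
        elif event.op == "DELETE":
            # Deleting a row that was never cached is not an error.
            self._rows.pop(cache_key, None)
        else:
            raise ValueError(f"unknown operation: {event.op}")

    def get(self, table: str, key: Any) -> Optional[Dict[str, Any]]:
        return self._rows.get((table, key))


# Replay a short, made-up stream of events in commit order.
cache = LocalCache()
for event in [
    RowEvent("INSERT", "article", 1, {"title": "Draft"}),
    RowEvent("UPDATE", "article", 1, {"title": "Published"}),
    RowEvent("INSERT", "article", 2, {"title": "Other"}),
    RowEvent("DELETE", "article", 2),
]:
    cache.apply(event)
```

After the replay, the cache holds the updated row for article 1 and nothing for article 2. Because every change is applied in the order it was committed, no separate invalidation step is needed in this sketch.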

It's important to understand that Replication Cache is a non-blocking, fail-safe layer. It operates asynchronously in the background and doesn't change your application's primary data retrieval logic. If the cache is unavailable for any reason, or a specific piece of data isn't in the cache yet, the application falls back to querying the database directly. The application therefore remains fully functional and resilient: the cache is an optimization layer, not a required component of the data flow.
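The fallback behavior can be illustrated with a short sketch. The function name read_with_fallback, the dictionary-based cache, and the query_database stand-in are assumptions made for illustration only; they are not part of the product's API.

```python
from typing import Callable, Dict, List, Optional


def read_with_fallback(
    key: str,
    cache: Optional[Dict[str, str]],
    query_database: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Serve from the local cache when possible, otherwise query the database."""
    if cache is not None:
        try:
            value = cache.get(key)
        except Exception:
            value = None  # treat any cache failure as a miss
        if value is not None:
            return value
    # Cache unavailable or key not replicated yet: use the normal read path.
    return query_database(key)


# Stand-in for the real database, recording which keys were queried.
database = {"a": "alpha", "b": "beta"}
db_calls: List[str] = []


def query_database(key: str) -> Optional[str]:
    db_calls.append(key)
    return database.get(key)


local_cache = {"a": "alpha"}  # "b" has not been replicated yet

hit = read_with_fallback("a", local_cache, query_database)   # served locally
miss = read_with_fallback("b", local_cache, query_database)  # falls back
down = read_with_fallback("a", None, query_database)         # cache unavailable
```

In this sketch the fallback path does not write into the cache, on the assumption that the cache is populated only by the replication stream.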

This approach is available in two main configurations:

Direct Connect

In this simple model, your web application acts as a database replica: it connects directly to the database, reads the binlog stream, and updates its own local cache.

  • Benefits: This is a simple, lightweight approach that requires no additional services. It's a great fit for a single-tenant application.
  • Considerations: Each application instance maintains a persistent database connection, so the number of replica connections on the database grows with the number of instances.

Intermediary Replicator

For a multi-tenant environment, an intermediary replicator service is the recommended model. This service centralizes the process and provides granular access control. In a setup where a single database server hosts multiple databases, the replicator ensures that each web application only receives the data specific to its tenant. The replicator then broadcasts these data changes to the web applications via gRPC, a high-performance framework for remote procedure calls.

  • Benefits: This model provides granular access control, decouples applications from the database, and centralizes binlog parsing.
  • Considerations: It adds one more service to manage and requires some familiarity with configuring the Java direct memory that gRPC uses.
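The per-tenant routing performed by the replicator can be sketched in-process as follows. This is only an illustration of the filtering idea: the Replicator class, the Change tuple, and the tenant database names are hypothetical, and plain callbacks stand in for the gRPC streams a real deployment would use.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# Hypothetical change record: (database name, operation, row key).
Change = Tuple[str, str, str]


class Replicator:
    """Fans out changes so each subscriber sees only its own database."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[Change], None]]] = defaultdict(list)

    def subscribe(self, database: str, deliver: Callable[[Change], None]) -> None:
        self._subscribers[database].append(deliver)

    def publish(self, change: Change) -> None:
        database = change[0]
        # Only subscribers registered for this database receive the change.
        for deliver in self._subscribers.get(database, []):
            deliver(change)


replicator = Replicator()
tenant_a_inbox: List[Change] = []
tenant_b_inbox: List[Change] = []
replicator.subscribe("tenant_a", tenant_a_inbox.append)
replicator.subscribe("tenant_b", tenant_b_inbox.append)

# One binlog stream from a server hosting both tenant databases.
for change in [
    ("tenant_a", "INSERT", "article:1"),
    ("tenant_b", "UPDATE", "article:7"),
    ("tenant_a", "DELETE", "article:1"),
]:
    replicator.publish(change)
```

Each inbox ends up with only the changes for its own database, in the order they were published. Parsing the binlog once and routing by database name is what lets the web applications stay decoupled from the database server in this model.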

Key Advantages

Replication Cache offers several significant benefits:

  • Near-Zero Latency: Read operations are served from the local, in-memory cache, delivering data at memory-access speeds—orders of magnitude faster than a network-bound distributed cache.
  • Near-Real-Time Consistency: By applying binlog events as they arrive, the cache stays closely in sync with the database and provides current data without complex invalidation logic.
  • Reduced Database Load: With most read queries handled locally, the primary database is free to focus on write transactions.
  • Simplified Scalability: You can scale horizontally by adding more web application instances, each with its own local cache, without adding read-query load on the database. In the Direct Connect model, each instance still holds one replica connection.
  • Access Control: The intermediary replicator model provides robust access control for multi-tenant environments.