Caches

  1. A cache is temporary storage that keeps a copy of part of our database in memory.
  2. Reading from in-memory storage is much faster than querying the database, so a cache reduces response time. It also reduces the load on the database, because when the data is present in the cache we do not send a request to the database at all.
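
To make this concrete, here is a minimal cache-aside read path in Python. The `cache` dict, the `query_database` function, and the 300-second TTL are hypothetical placeholders for this sketch, not part of any particular system.

```python
import time

CACHE_TTL_SECONDS = 300          # hypothetical TTL for cached entries
cache = {}                       # in-memory cache: key -> (value, expires_at)

def query_database(key):
    # Placeholder for a real (slow) database query.
    time.sleep(0.05)             # simulate network + disk latency
    return f"row-for-{key}"

def get(key):
    """Return the value for key, hitting the database only on a cache miss."""
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]          # cache hit: no database request at all
    value = query_database(key)  # cache miss: fall back to the database
    cache[key] = (value, time.time() + CACHE_TTL_SECONDS)
    return value

if __name__ == "__main__":
    get("user:42")               # miss -> database
    get("user:42")               # hit  -> served from memory
```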

Types of Cache

  1. Local cache

    • Since the data is present in the memory, it is much faster to get the data (because we don't need to make any network calls).
    • The issue with a local cache is that it can cause a fan-out. For example, suppose we have three boxes and each one uses a local cache to store data. If we want to update a user's profile data, we need to send the update to all three boxes. This is a fan-out.
    • To avoid fan-outs we can shard the data and distribute load across the boxes using a load-balancing algorithm (like consistent hashing; see the sketch after this list). But what if one of the nodes fails? To avoid this we can replicate the data to multiple nodes. However, this can lead to data inconsistency.
    • So a local cache improves response time at the cost of data consistency.
  2. Global cache

    • Instead of each node keeping its own copy of the data, we have a single central copy: one node storing all the key-value pairs in its memory.
    • This improves data consistency but increases response time (because every lookup now requires a network call).
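
The sharding bullet above mentions consistent hashing. Below is a minimal, illustrative hash ring in Python; the three node names, the virtual-node count, and the use of MD5 as the hash function are assumptions for this sketch, not a prescription.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy consistent-hash ring: a key maps to the first node clockwise on the ring."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []                      # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):          # virtual nodes smooth the distribution
                h = self._hash(f"{node}#{i}")
                bisect.insort(self._ring, (h, node))

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["box-1", "box-2", "box-3"])   # hypothetical cache boxes
print(ring.node_for("user:42"))    # each key's cache updates go to exactly one box
```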

Write Policy

  1. A write policy is triggered when there is a write operation on the cache. A write request means some entry is added, updated or deleted in the cache. Because the cache is what the rest of the system reads from while the database remains the long-term store, each write to the cache impacts the entire system and must eventually be reflected in the database.

Write Back Policy

  1. If the key-value pair to be updated is present in the cache, it is updated there. However, it is not immediately updated in the database. So as long as the cache is alive, users get consistent data. But if the cache goes down before the update is persisted, the update is lost and the database is left with stale data.

  2. To avoid this problem we can use Timeout-Based Persistence. In this mechanism, every entry in the cache has a TTL (Time To Live). When an entry's TTL expires, it is evicted from the cache and its value is persisted to the database. This makes our database eventually consistent.

  3. Another approach is Event-Based Write Back. In this mechanism, instead of keeping a TTL, we keep an event-based trigger. For example, we can count the number of updates to an entry; once the count crosses a threshold (say 5), we persist the entry to the database.

  4. We can also use Replacement Write Back. Each entry in the cache has a flag (a dirty bit) that tells us whether the entry has been updated. When the entry is evicted, we write it to the database only if it was updated in the cache.

  5. This policy is efficient. It is especially useful when we want fast reads and writes from the cache and we are fine with persistence not being 100% guaranteed (see the sketch below).
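
Here is a minimal write-back cache in Python that combines the dirty-flag (replacement write back) and event-based ideas above. The `db` dict stands in for a real database, and the flush-after-5-updates threshold is simply the example number used above.

```python
db = {}            # stands in for the real database
FLUSH_AFTER = 5    # event-based trigger: persist after this many updates

class WriteBackCache:
    def __init__(self):
        self.data = {}     # key -> value
        self.dirty = {}    # key -> number of un-persisted updates

    def write(self, key, value):
        # Update only the cache; the database is written later.
        self.data[key] = value
        self.dirty[key] = self.dirty.get(key, 0) + 1
        if self.dirty[key] >= FLUSH_AFTER:        # event-based write back
            self._persist(key)

    def evict(self, key):
        # Replacement write back: persist only if the entry was updated.
        if self.dirty.get(key, 0) > 0:
            self._persist(key)
        self.data.pop(key, None)

    def _persist(self, key):
        db[key] = self.data[key]
        self.dirty[key] = 0

cache = WriteBackCache()
for i in range(5):
    cache.write("profile:1", f"v{i}")   # the 5th write triggers persistence to db
```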

Write through Policy

  1. In this policy, when there is a write request, we evict the key being updated from the cache and simultaneously update the database. On the next read request for that key, the cache pulls the entry from the database, stores it, and sends the response to the user.

  2. However, we can run into problems when using this policy. For example, initially we have X = 10. A write request comes in to set X = 20.

  3. Then a read request for X arrives before the write has finished, so the read returns the stale value X = 10. This can cause inconsistency.

  4. To avoid such problems, we can lock the key that is being written and only unlock it after the update operation is completed.

  5. This policy is useful when we need a high level of consistency and a high level of persistence. However, it is less efficient than the write-back policy (a locking sketch follows).
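
Below is a minimal sketch of the write-through variant described above, using a per-key lock so a read cannot slip in between the database update and the cache eviction. The `db` dict and the use of `threading.Lock` are illustrative assumptions.

```python
import threading

db = {"X": 10}     # stands in for the real database

class WriteThroughCache:
    def __init__(self):
        self.data = {}
        self.locks = {}                       # key -> lock guarding write + evict

    def _lock(self, key):
        return self.locks.setdefault(key, threading.Lock())

    def write(self, key, value):
        with self._lock(key):                 # readers of this key wait here
            db[key] = value                   # update the database...
            self.data.pop(key, None)          # ...and evict the key atomically

    def read(self, key):
        with self._lock(key):
            if key not in self.data:          # miss: pull the fresh value from the DB
                self.data[key] = db[key]
            return self.data[key]

cache = WriteThroughCache()
cache.write("X", 20)
print(cache.read("X"))    # 20: the stale X = 10 can no longer be observed
```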

Write around policy

  1. In this policy, when we get a write request, we update the entry in the database instead of in the cache. When we later get a read request, we serve the stale value from the cache, and we keep serving stale values until the entry is evicted from the cache.

  2. This policy is useful when we need a high level of efficiency and a high level of persistence. However, it makes our system eventually consistent (see the sketch below).
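
For contrast, here is a minimal write-around write path: the write goes only to the database, and reads keep returning whatever is already cached until the entry is evicted. The `db` and `cache` dicts are again placeholders for this sketch.

```python
db = {"X": 10}
cache = {"X": 10}

def write_around(key, value):
    db[key] = value            # write goes straight to the database; cache untouched

def read(key):
    if key in cache:
        return cache[key]      # may be stale until the entry is evicted
    cache[key] = db[key]       # miss: repopulate from the database
    return cache[key]

write_around("X", 20)
print(read("X"))               # 10 (stale) -> eventually consistent
del cache["X"]                 # once the entry is evicted...
print(read("X"))               # 20 (fresh from the database)
```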

Replacement Policy

  1. A replacement policy is triggered when there is no space for a new key and an existing key must be evicted from the cache.
  2. LRU Policy (Least Recently Used): In this policy, we evict the entry that has not been used for the longest time (see the sketch after this list).
  3. LFU Policy (Least Frequently Used): In this policy, we evict the entry that is used least frequently.
  4. When the miss ratio is high and there are a lot of loads and evictions, we call it thrashing. It happens because the cache does not have enough space (the number of entries the cache can store is much smaller than the number of entries in the database).
  5. Replacement policy in Memcache: In Memcache we have two different data stores
    • One data store holds entries that are requested less often. It is also known as the cold region.
    • The other data store holds entries that are requested more often. It is also known as the hot region.
    • This combines LRU and LFU: the frequency of requests decides which region an entry is stored in, and based on LRU an entry is evicted from one region and moved to the other.
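
Here is a minimal LRU cache in Python using `collections.OrderedDict`, matching the "evict the entry not used for the longest time" rule above; the capacity of 3 is arbitrary for the example.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.data = OrderedDict()          # iteration order == recency order

    def get(self, key):
        if key not in self.data:
            return None                    # miss
        self.data.move_to_end(key)         # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache()
for k in ["a", "b", "c"]:
    cache.put(k, k.upper())
cache.get("a")            # touch "a" so it is no longer least recently used
cache.put("d", "D")       # evicts "b", the entry unused for the longest time
```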

Redis

Redis (Remote Dictionary Server) is an open-source, in-memory key-value data structure store renowned for its sub-millisecond latency, high throughput, and versatility as a database, cache, message broker, and real-time analytics engine.

It supports diverse data types (strings, hashes, lists, sets, sorted sets, bitmaps, HyperLogLogs, geospatial indexes, streams) and advanced features such as Lua scripting, LRU eviction, transactions, and pub/sub messaging.
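
A few of these data types in action with the redis-py client; the host/port and the key names are assumptions for the example.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("page:views", 0)                       # string
r.incr("page:views")                         # atomic counter
r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})   # hash
r.lpush("recent:logins", "user:42")          # list
r.sadd("online", "user:42", "user:7")        # set
r.zadd("leaderboard", {"user:42": 1500})     # sorted set (member -> score)

print(r.hgetall("user:42"))                  # {'name': 'Ada', 'plan': 'pro'}
print(r.zrange("leaderboard", 0, -1, withscores=True))
```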

Redis's design emphasizes simplicity and speed through its in-memory model, but it provides configurable persistence mechanisms for durability.

Internal Architecture

Redis's architecture is optimized for performance and simplicity, revolving around a single-threaded event-driven model for command processing. This avoids multi-threading complexities (e.g., locks, race conditions) while achieving high concurrency through non-blocking I/O. The system is divided into a core event loop for handling client interactions and background threads/processes for non-critical tasks like persistence.

Key Architectural Components

Single-Threaded Architecture

Redis's single-threaded command execution is its defining feature, processing all client commands sequentially on one core. This design prioritizes simplicity, atomicity, and cache efficiency over parallelism for CPU-bound tasks.

How Single-Threaded Processing Works

  1. Client Connection Handling: The main thread accepts connections on port 6379 and registers sockets with the event loop for monitoring.
  2. Event Detection: Using I/O multiplexing (e.g., epoll), the thread polls for ready events across all sockets (e.g., data to read). It detects ready sockets without blocking, handling thousands of connections concurrently.
  3. Command Parsing & Execution: For a ready socket:
    • Read the full command (RESP format).
    • Parse and execute it atomically in memory (e.g., SET key value updates the hash table instantly).
    • Write the response back to the socket.
    • No locks are needed; sequential execution ensures consistency.
  4. Loop Continuation: Return to polling; background tasks (e.g., persistence) are offloaded via fork or threads (a runnable sketch of this loop follows the list).
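
To make these steps concrete, here is a minimal single-threaded echo server built on Python's `selectors` module (epoll/kqueue under the hood). It mirrors the event-loop pattern only, not Redis's actual C implementation; the port 6380 and the echo-style "command handling" are assumptions for the sketch.

```python
import selectors
import socket

sel = selectors.DefaultSelector()            # epoll on Linux, kqueue on macOS

def accept(server_sock):
    conn, _ = server_sock.accept()           # step 1: register the new client
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)                   # step 3: read the "command"
    if not data:
        sel.unregister(conn)
        conn.close()
        return
    conn.sendall(b"+OK " + data)             # "execute" and write the response

server = socket.socket()
server.bind(("127.0.0.1", 6380))
server.listen()
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:                                  # steps 2 and 4: poll, dispatch, repeat
    for key, _ in sel.select():              # blocks until some socket is ready
        key.data(key.fileobj)                # one callback at a time, no locks
```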

Advantages of Single-Threaded Design

  • No locks or race conditions: each command runs to completion before the next starts, so every command is atomic by construction.
  • Predictable latency and good CPU-cache locality, since there is no context switching between worker threads for command execution.
  • Much simpler code, which is easier to reason about and debug.

Limitations & Mitigations

  • One slow command (e.g., KEYS * on a large keyspace) blocks every other client; the mitigation is to avoid long-running commands and prefer incremental ones such as SCAN.
  • Command execution uses only one core; common mitigations are running several Redis instances per machine, sharding with Redis Cluster, and (in recent versions) offloading network I/O to extra threads.

This model suits Redis's I/O-bound workloads (caching, messaging), achieving 1M+ ops/sec on modest hardware.

Replication Process

Redis replication provides asynchronous master-replica synchronization for read scaling, high availability, and failover. It's lightweight and configurable for partial or full resyncs.

Core Mechanism

A replica connects to the master and sends PSYNC with the replication ID and the offset it has processed so far. The master decides between a partial and a full resynchronization, and afterwards streams every subsequent write command to the replica asynchronously.

Synchronization Types

  1. Full Sync (Initial/Desync):
    • Triggered on first connection, replication ID mismatch, or an offset gap larger than the replication backlog buffer.
    • Master:
      1. Forks a child process (copy-on-write: shares unchanged pages).
      2. Child generates an RDB snapshot (compressed binary dump).
      3. Sends the RDB to the replica over the network.
      4. Buffers new writes during the transfer.
    • Replica: loads the RDB into memory, then replays the buffered commands.
    • Overhead: CPU/memory for the fork and snapshot; network for the transfer (mitigated by compression).

  2. Partial Sync (Incremental):
    • For small gaps: the master sends only the missing commands from the replication backlog buffer.
    • The replica replays them in order.
    • Efficiency: low overhead for minor desyncs (e.g., a brief network hiccup); a toy sketch of this decision follows the list.
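
Here is a highly simplified sketch of the master's choice between partial and full resync, based on the replication ID and offset described above. The `BACKLOG_SIZE` constant, the in-memory `backlog` bytes, and the placeholder replication ID are stand-ins for the real replication backlog, not Redis internals.

```python
BACKLOG_SIZE = 1_000_000      # stand-in for the replication backlog size

class Master:
    def __init__(self):
        self.replication_id = "8371b4fb-example"  # changes on restart/failover
        self.offset = 0                            # bytes of write stream produced
        self.backlog = b""                         # last BACKLOG_SIZE bytes of writes

    def feed(self, command_bytes):
        self.offset += len(command_bytes)
        self.backlog = (self.backlog + command_bytes)[-BACKLOG_SIZE:]

    def psync(self, replica_id, replica_offset):
        gap = self.offset - replica_offset
        if replica_id == self.replication_id and gap <= len(self.backlog):
            # Partial sync: send only the commands the replica is missing.
            return ("CONTINUE", self.backlog[len(self.backlog) - gap:])
        # Full sync: fork, snapshot to RDB, send it, then stream buffered writes.
        return ("FULLRESYNC", "<rdb snapshot>")

m = Master()
m.feed(b"SET x 20\r\n")
print(m.psync(m.replication_id, 0)[0])     # CONTINUE (small gap, same ID)
print(m.psync("other-id", 0)[0])           # FULLRESYNC (ID mismatch)
```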

Failover & Sentinel

Redis Sentinel processes monitor the master and its replicas. When a quorum of Sentinels agrees that the master is unreachable, they elect a leader Sentinel, which promotes one replica to master and reconfigures the remaining replicas and clients.

Pros: Simple HA; read offload. Cons: Because replication is asynchronous, the most recent writes can be lost on failover (pair with persistence).

Data Persistence

Redis is in-memory but offers configurable persistence to survive restarts/crashes, balancing speed vs. durability.

RDB (Redis Database Snapshots)

Point-in-time snapshots of the whole dataset, written by a forked child process at configured intervals (e.g., save 900 1). RDB files are compact and fast to load on restart, but writes made after the last snapshot are lost on a crash.

AOF (Append-Only File)

Every write command is appended to a log and replayed on restart. The appendfsync policy (always, everysec, or no) trades durability against throughput, and the log is periodically rewritten to keep it compact.

Hybrid Approach (RDB + AOF)

Since Redis 4.0, an AOF rewrite can store an RDB-format preamble followed by the recent AOF tail, combining fast restarts with low data loss.

Persistence Nuances

Snapshots rely on fork and copy-on-write, so memory usage can spike under heavy write load during a save. Persistence is per node; it complements replication rather than replacing it. A minimal configuration example follows.
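
As an illustration, the persistence mechanisms above can be configured at runtime with redis-py (the same settings normally live in redis.conf); the host/port are assumptions for the example.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# RDB: snapshot if at least 1 key changed in 900s, or 10 keys in 300s.
r.config_set("save", "900 1 300 10")

# AOF: log every write command, fsync once per second (speed/durability balance).
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")

# Hybrid: AOF rewrites start from an RDB preamble (Redis 4.0+).
r.config_set("aof-use-rdb-preamble", "yes")

r.bgsave()                      # trigger a background RDB snapshot now
print(r.config_get("appendonly"))
```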

Redis's architecture excels in simplicity and speed, making it ideal for caching and real-time use cases, while replication and persistence add the robustness needed for production.

Redis Cluster

Redis Cluster is a built-in feature of Redis (since version 3.0) that lets multiple Redis nodes act as one distributed, sharded, highly available system.

Core Characteristics

  1. Automatic Sharding: The key space is divided into 16,384 hash slots (slot = CRC16(key) mod 16384; see the sketch after this list). Each slot is owned by exactly one master node.
  2. High Availability: Every master has at least one replica (slave). If a master fails, a replica is automatically promoted.
  3. Horizontal Scaling: Add/remove nodes → slots are rebalanced automatically.
  4. Partial Availability: The cluster continues working as long as the majority of masters are reachable.
  5. Client-Side Routing: Smart clients (like JedisCluster, redis-py-cluster) understand MOVED/ASK redirects and route commands to the correct node.
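
To make the slot mapping concrete, here is the slot computation in Python: CRC16 (the XMODEM variant used by Redis Cluster) of the key, modulo 16384, with hash-tag handling so keys such as {user:42}:profile and {user:42}:cart land in the same slot. The key names are just examples.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0x0000."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def hash_slot(key: str) -> int:
    # If the key contains a non-empty {...} section, only that part is hashed,
    # so related keys can be forced onto the same node (hash tags).
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    return crc16_xmodem(key.encode()) % 16384

print(hash_slot("user:42"))                                           # slot in 0..16383
print(hash_slot("{user:42}:profile") == hash_slot("{user:42}:cart"))  # True
```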