27 Mar 2021

读书笔记 - DDIA(2)

Distributed Data

Replication

Keeping a copy of the same data on multiple machines that are connected via a network.

This helps:

  • Keep data closer to your user
  • Prevent system down
  • Increase read output (can read from multiple places simultaneously)
  1. Leader and Follower
    • Leader based replication: to ensure all replica has the data
      • The write request can only be sent to leader, who will then send write request to its followers
      • The read request can go to either leader or followers
    • Async vs Synchronous: whether the leader wait for follower to confirm before report back to client that the data is ready.
      • If all sync: then if one follower fail, it will block any write requests to the leader
      • Semi-async: one is sync, so there are at least two copies of the data
      • All-async: can potentially cause data lost and the write is not durable
    • Create New Followe: standard file copy can not cover the data written to different part at different time
      • Setup regularly snapshot dump form the leader; when a new follower is added, it first copies the latest snapshot, and request the leader for all data changed since snapshot
    • Handle Node Outages:
      • Follower: it keeps a record of the last transaction it received from the leader, when it is back, it can request the leader for all data changed after the last logged transaction
      • Leader: promote one of the follower to a leader - failover
        • Determine the leader is down - use a timeout
        • Elect new leader - the one with the most up-to-date received from old leader
        • Reconfiguraiton - the system needs to treat the newly elected leader as the leader and the old one as not leader
        • Async replicate might be an issue - what if the old leader came back and the write keeps going, the new leader may recieve conflicting write request
        • Write data loss - consider if the leader increment some primary key that an external system use, the key will be out of sync when a new leader is promoted
    • Replication Log - WAL: store the data to be written in a log ahead, these are very low level data, will be hard to migrate to other system - Logical Log: store a sequence of statement, and these can be migrated to external application - Trigger-based: use an external application to sotre this - Problem with the log is - eventual consistency, especially async. - Anomaly: - Read-after-write Consistency:
    • When read somethign the user posts, only read form the leader, if only the user can post this information
    • If others can post information, use some metrics to determine whther to read from leader or follower (like if the follower is n minute behind the leader when you are trying to read at the nth minute after write) - Monotonic Reads: Read from same user should return consistent result, this can be achieved by user always reading from the same replica - Consisten Pre-fix Reads: if a sequence of writes happen in a certain order, then anyone reading those writes will see them appear in the same order - Multi-Leader Pelication: - Multi-datacenter Operation
    • Every write processed in the local datacenter and is replicated async to other datacenter
    • Each data center operates indipendently - Client with Offline
    • The local database acts as leader to access write and read - Handling Write Conflicts
    • sync write vs async: sync will basiclaly just block the second one until first one succeed; however, the async won’t detect such discrepency until a later time

Tags: