27 Mar 2021
读书笔记 - DDIA(2)
Distributed Data
Replication
Keeping a copy of the same data on multiple machines that are connected via a network.
This helps:
- Keep data closer to your user
- Prevent system down
- Increase read output (can read from multiple places simultaneously)
- Leader and Follower
- Leader based replication: to ensure all replica has the data
- The write request can only be sent to leader, who will then send write request to its followers
- The read request can go to either leader or followers
- Async vs Synchronous: whether the leader wait for follower to confirm before report back to client that the data is ready.
- If all sync: then if one follower fail, it will block any write requests to the leader
- Semi-async: one is sync, so there are at least two copies of the data
- All-async: can potentially cause data lost and the write is not durable
- Create New Followe: standard file copy can not cover the data written to different part at different time
- Setup regularly snapshot dump form the leader; when a new follower is added, it first copies the latest snapshot, and request the leader for all data changed since snapshot
- Handle Node Outages:
- Follower: it keeps a record of the last transaction it received from the leader, when it is back, it can request the leader for all data changed after the last logged transaction
- Leader: promote one of the follower to a leader - failover
- Determine the leader is down - use a timeout
- Elect new leader - the one with the most up-to-date received from old leader
- Reconfiguraiton - the system needs to treat the newly elected leader as the leader and the old one as not leader
- Async replicate might be an issue - what if the old leader came back and the write keeps going, the new leader may recieve conflicting write request
- Write data loss - consider if the leader increment some primary key that an external system use, the key will be out of sync when a new leader is promoted
- Replication Log - WAL: store the data to be written in a log ahead, these are very low level data, will be hard to migrate to other system - Logical Log: store a sequence of statement, and these can be migrated to external application - Trigger-based: use an external application to sotre this - Problem with the log is - eventual consistency, especially async. - Anomaly: - Read-after-write Consistency:
- When read somethign the user posts, only read form the leader, if only the user can post this information
- If others can post information, use some metrics to determine whther to read from leader or follower (like if the follower is n minute behind the leader when you are trying to read at the nth minute after write) - Monotonic Reads: Read from same user should return consistent result, this can be achieved by user always reading from the same replica - Consisten Pre-fix Reads: if a sequence of writes happen in a certain order, then anyone reading those writes will see them appear in the same order - Multi-Leader Pelication: - Multi-datacenter Operation
- Every write processed in the local datacenter and is replicated async to other datacenter
- Each data center operates indipendently - Client with Offline
- The local database acts as leader to access write and read - Handling Write Conflicts
- sync write vs async: sync will basiclaly just block the second one until first one succeed; however, the async won’t detect such discrepency until a later time