About System Design

System design is not something taught in our class, but it is often discussed in the computer science world. The book Designing Data-Intensive Applications by Martin Kleppmann discusses three core concerns for software systems: reliability, scalability, and maintainability. Reliability: the system should continue to work correctly even when hardware or software faults occur. Scalability: the system should be able to cope with growing data volume or traffic. Maintainability: the system should be easy to operate and to evolve with new features.

I think the most important part of the book is its discussion of distributed data. A modern application's data is not stored on a single node in a single location. So that users around the globe can access the data, engineers use data replication to build scalable, fault-tolerant systems. This approach raises new problems: which node should a request go to first, and how do the nodes communicate data changes to each other? In this blog I will just discuss what I learned about data replication.

Single-Leader Data Replication: When a user writes to the database, the request is sent to the leader node, and the leader then sends the data change to its followers. Although this approach looks simple, replication lag is something we cannot ignore. Updates from the leader to the followers are normally asynchronous, so users may see old data when they read from different nodes, and sometimes they cannot see their own write even after it succeeded on the leader. Some solutions are: when reading something the user may have modified, read from the leader; or have the client remember the timestamp of its last write and make sure to read only from a node that has been updated at least to that timestamp.
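
To make the second solution concrete, here is a minimal in-memory sketch of it. It is only an illustration under my own assumptions, not how any real database implements replication: the class names (Node, Leader, Client) and the replicate_to method are hypothetical, replication is triggered by hand to simulate lag, and the "timestamp" is a logical write counter rather than a wall-clock time.

```python
class Node:
    """A replica: a key-value store plus the timestamp of the last applied write."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied_ts = 0  # logical timestamp of the newest write applied here

    def apply(self, ts, key, value):
        self.data[key] = value
        self.applied_ts = ts


class Leader(Node):
    """Accepts all writes and ships them to followers later (asynchronously)."""

    def __init__(self, name, followers):
        super().__init__(name)
        self.followers = followers
        self.log = []  # ordered list of (ts, key, value) changes

    def write(self, key, value):
        ts = self.applied_ts + 1
        self.apply(ts, key, value)
        self.log.append((ts, key, value))
        return ts  # the client keeps this as its "last write" timestamp

    def replicate_to(self, follower):
        # Called by hand in this sketch to simulate replication lag.
        for ts, key, value in self.log:
            if ts > follower.applied_ts:
                follower.apply(ts, key, value)


class Client:
    """Remembers its last write and only reads from replicas that have caught up."""

    def __init__(self, leader):
        self.leader = leader
        self.last_write_ts = 0

    def write(self, key, value):
        self.last_write_ts = self.leader.write(key, value)

    def read(self, key):
        # Prefer a follower that has applied at least our last write.
        for follower in self.leader.followers:
            if follower.applied_ts >= self.last_write_ts:
                return follower.data.get(key), follower.name
        # No follower is fresh enough, so fall back to the leader.
        return self.leader.data.get(key), self.leader.name


f1, f2 = Node("f1"), Node("f2")
leader = Leader("leader", [f1, f2])
client = Client(leader)

client.write("name", "Ada")
stale = f1.data.get("name")          # None: f1 has not received the write yet
before = client.read("name")         # ("Ada", "leader"): falls back to the leader
leader.replicate_to(f2)
after = client.read("name")          # ("Ada", "f2"): f2 has now caught up
```

A naive read from f1 right after the write returns nothing, which is exactly the "cannot see my own write" problem. Comparing the follower's position against the client's last write timestamp avoids it, at the cost of sending more reads to the leader while followers are behind.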

There is much more to discuss about system design. I will continue to learn and share. Happy coding!
