Scalability

Understanding Scalability for System Design


Scalability Basics

Important interview concepts:

  • How does DNS work?
  • What is Vertical Partitioning?
  • What is Horizontal Partitioning?
  • What is a Single Point of Failure (SPOF)?
  • What is Resiliency? How do you make systems resilient?
  • What is Redundancy? How do you introduce redundancy in systems?

Mark is a 19-year old Harvard student. He has created a simple website ‘Thefacebook’. Thefacebook offers the creation of a new profile where you can upload a photo, share your interests, and connect with other people.

Currently, he has this website running on his localhost and database on his own computer.

Scaling for 10,000 Req/s users:

Now he decides to launch this website for Harvard college students. He buys a domain thefacebook.com and creates an API. Hence when a Harvard student types thefacebook.com, a request is sent to the college server, and they get a response on their own device. Currently, he is paying for one server that has 8GB RAM and 100GB memory.

Learn: How does DNS work? Watch this 6 min video.

Scaling from 10,000 Req/s users to 1L Req/s user:

Thefacebook is getting really popular in college. The user base has increased such that the server is receiving about 1L requests/sec. Users start complaining that the system takes more time to generate a response. Why so?

Learn: How many simultaneous connections can a single server handle? What does it depend on?

According to this discussion, Each connection has some required memory associated with it to maintain the connection. That memory may only be a KB or even less per person, or it could be many MB, depending on the type of request.

If you are talking Internet Protocol version 4 (IPv4) connections (the normal Internet connection in the world today between clients and servers) the protocol itself has only 16 bits for the port number, so only 65,535 connections of any TCP/IP or UDP/IP type can exist simultaneously on a single computer.

Mark doesn’t want to lose his user base because of the degrading performance. He buys more RAM and Memory. He has now 128GB RAM and 500GB of hard disk and now even he has brought a new processor.

Learn: What Mark just did is he got a bigger machine. We can handle more requests by throwing more money at the problem. When you are buying a bigger machine it means your computer is going to be larger and therefore it can process the request faster. This is called ‘Vertical Scaling’. That is you are scaling your computer vertically and making it bigger and bigger.

Scaling from 1L Req/sec to 1M Req/sec:

Mark's users are rapidly increasing and now mark knows that his users are about to reach 1 million. His server will again soon become insufficient in handling a million requests.
Mark says, ‘Ok, I will buy more RAM and more Memory!, but how long can I keep doing the same thing?’ Although he has got money to get a bigger server, there is still a limit up to which he can scale up his server.

One of his friends just walks in his room and was just about to trip over the server power cable. Thankfully, he managed to balance himself.

Hush!!! Mark is in a big problem now. If anyone trips over the power cable, all the system shuts down in a fraction of second. He recognizes it is a single point of failure.

Learn: Single Point of Failure

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working.

‘Rather than getting a few or really good one big machine, why don't I get a bunch of cheaper machines? Each of maybe 8GB RAM and 100GB Memory. This way there would be no single point of failure as well as I can get as many servers as I want!!’

Soon Mark gets 20 such servers to handle a million requests.

( 1 server can handle 65,535 simultaneous connections~ min 60k connections, to handle. Therefore to handle 1M connection he would need almost 16 such servers. Mark is very confident about the popularity of his website. So he got 20 such servers :D)

Learn: Horizontal Scaling

When you scale by adding more machines into your pool of resources, it is called Horizontal Scaling.

By scaling this way, he ensured that if after a few years even if his user base starts to decrease, he can just sell the extra servers. But Mark’s future is different! His website is amazing and it is going to attract more and more users. So he can also keep on adding more servers whenever he feels the need of adding more servers! Even if any server fails, another server can take the charge. Now there is no single point of failure. This way he has introduced Resiliency.

Learn: Resiliency

System resilience is an ability of the system to withstand a major disruption and to recover within an acceptable time. If one server crashes, another server in the pool can start to serve traffic. There is no single point of failure.