When a system is first built, it’s designed to handle a certain amount of traffic.
But as more users come in, that initial capacity starts to run out. Pages load more slowly, requests pile up, and at some point the system can’t keep up, especially if traffic spikes.
So to avoid performance issues or downtime, businesses need a way to expand their capacity. This is called ‘scaling’. However, the real challenge isn’t deciding whether to scale; it’s choosing how to scale.
There are two main approaches: vertical scaling (making one machine more powerful) and horizontal scaling (adding more machines).
Each method has advantages and trade-offs, so how do you know which one is right for you?
The good news is that in this guide, we’ll break down what horizontal and vertical scaling are, how they work, and when to use each one. We’ll also look at how this connects with cloud scaling and why it might be the best option for your projects.
So let’s dive in…
Imagine you run a coffee shop with one espresso machine. At first, it works fine, but as more customers walk in, the line gets longer, and you need to serve drinks faster.
So what can you do?
Well, one option is to upgrade your machine. A high-end model can steam milk faster, pull multiple shots at once, and handle a bigger workload.
That’s vertical scaling. Instead of adding more servers to handle your app or website traffic, you upgrade the one you already have by increasing its CPU, RAM, or storage (kind of like how you might upgrade your PC to perform better).
Vertical scaling is a popular choice for businesses running databases, monolithic applications, or workloads that need a single high-performance machine.
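To make the idea concrete, here is a minimal Python sketch of vertical scaling as "pick a bigger machine". The tier names and the requests-per-second figures are made up purely for illustration; they are assumptions, not real benchmarks or any provider's actual instance sizes.

```python
from typing import Optional

# Hypothetical machine sizes: (name, requests per second it can handle).
# These numbers are invented for illustration only.
MACHINE_TIERS = [
    ("small", 100),
    ("medium", 400),
    ("large", 1600),
]


def pick_tier(demand_rps: int) -> Optional[str]:
    """Return the smallest tier that can handle the demand, or None."""
    for name, capacity_rps in MACHINE_TIERS:
        if capacity_rps >= demand_rps:
            return name
    # Demand is higher than the biggest machine we can buy.
    return None


print(pick_tier(80))    # small
print(pick_tier(350))   # medium
print(pick_tier(5000))  # None
```

The last call is the interesting one: once demand passes the largest tier in the list, upgrading is no longer an option.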
That being said, there are downsides to each scaling option.
With vertical scaling the issues are:
Vertical scaling is great for short-term boosts, but when demand keeps growing, it may not be enough. That’s where horizontal scaling comes in.
Let’s go back to the coffee shop example.
Imagine that we upgraded our espresso machine, and that helped for a while, but now the line is out the door again. (Perhaps everyone loved how fast the service was, and that created even more demand!)
The problem, of course, is that no matter how powerful or fast the new machine is, it can only serve so many customers at once.
This leads us to the next logical solution: add more machines.
That’s horizontal scaling. You add more servers to share the load. This way, rather than relying on a single powerful machine, you build a network of (often smaller) machines that work together.
Tech giants like Google, Amazon, and Netflix rely on horizontal scaling because a single machine, however powerful, can’t serve millions of users on its own. Instead of one giant server doing all the work, they distribute traffic across a fleet of machines.
As I said before, though, no method is perfect and each has its challenges.
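As a rough illustration of what "distributing traffic" means, here is a minimal Python sketch of a round-robin load balancer, one of the simplest ways to share requests across machines. The class name and server names are hypothetical, and real load balancers also handle things this sketch ignores, such as health checks and uneven request sizes.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next server in the list."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("need at least one server")
        # cycle() loops over the servers forever: a, b, c, a, b, c, ...
        self._servers = cycle(servers)

    def route(self, request_id):
        # request_id is unused here; the choice depends only on arrival order.
        return next(self._servers)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])

# Send nine requests and count how many each server received.
assignments = Counter(balancer.route(i) for i in range(9))
print(assignments)
```

With nine requests and three servers, each server ends up with three requests. Adding a fourth name to the list is all it takes to spread the load further in this toy model.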
With horizontal scaling, the issues are:
Most businesses start with vertical scaling because it’s simple, but once they hit a limit, they move to horizontal scaling to keep growing beyond what a single machine can offer.
That being said, managing multiple machines in horizontal scaling takes effort. But what if you could avoid that complexity? That’s where cloud scaling comes in.
So far, we’ve covered vertical scaling (making one machine more powerful) and horizontal scaling (adding more machines). But what if you didn’t have to manage servers at all?
That’s where cloud platforms like AWS, Azure, and Google Cloud come in. In simple terms, instead of buying your own machines, you rent them as needed.
Why do this? Well, the benefits stack up pretty fast…
For example
If we go back to our coffee shop example from before:
Handy, right?
Now imagine if we took that a step further, and your coffee shop could automatically bring in the perfect number of baristas at any moment without you having to call them in. They just appeared as traffic increased. No phone calls needed, and no asking people to come in on their days off.
Better still, what if they could also bring in enough coffee machines as needed and then remove them when not needed?
That’s what happens with cloud scaling, often referred to as auto-scaling. When traffic spikes, the cloud automatically spins up more servers. When demand drops, it scales back down, so you don’t pay for resources you aren’t using.
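The decision an auto-scaler makes can be sketched in a few lines of Python. This is a simplified model, not any cloud provider's actual algorithm: the function name, the requests-per-second metric, and the minimum and maximum limits are all assumptions chosen for illustration. Real services typically track metrics such as CPU usage and add waiting periods between changes.

```python
import math


def desired_servers(current_rps, target_rps_per_server,
                    min_servers=1, max_servers=10):
    """Return how many servers we want for the current traffic level."""
    # Enough servers so that each one stays at or below its target load.
    needed = math.ceil(current_rps / target_rps_per_server)
    # Never drop below the minimum or exceed the maximum we allow.
    return max(min_servers, min(max_servers, needed))


# Traffic rises, spikes, then falls again (hypothetical numbers).
for rps in [50, 450, 2000, 120]:
    print(rps, "->", desired_servers(rps, target_rps_per_server=100))
```

In this toy run the server count goes 1, 5, 10, 2. The spike to 2000 requests per second would need 20 servers, but the cap holds it at 10, which is the kind of limit you might set to keep costs under control.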
With the cloud, businesses can:
Pretty good, right?
Well, it depends.
Not every company started in the cloud. Some still scale vertically and horizontally on their own hardware, but more and more of them are moving to the cloud.
Hence the growing demand for Cloud Engineers and Architects, and the boom in the cloud industry as a whole.
There’s no one-size-fits-all approach. Vertical scaling is simple but limited. Horizontal scaling offers more flexibility but adds complexity. Cloud scaling makes both approaches easier but comes with its own challenges if you’re adopting it after you’ve already built your system.
The key is understanding what works for your specific needs so you can scale efficiently, avoid bottlenecks, and ensure your system keeps up with demand.
Want to see how easy it is to build a new project on the cloud?
Check out my mini course on how to build an end-to-end web app from scratch with AWS:
When you’re first learning AWS it’s sometimes hard to know how to put all the pieces together, and how to use the various services to create an actual application that you could use in the real world.
This project helps solve that!
Inside the course you’ll be designing and building a simple web application from scratch, using five different services — Amplify, Lambda, IAM, API Gateway and DynamoDB — so you can get to grips with each of these tools.
Not only that, but I also break down why you should use them, where to use them, and how to get them to work with each other. As we go, we’ll build out each of the services, resulting in a fully functional math web application.
Not only will you learn how to get to grips with building on the cloud, but when you’re done you’ll have something you can share with friends, family, and potential employers!