Have you ever tried to speed up your Python code with threads or processes, only to end up more confused than when you started?
You’re not alone. A lot of developers hit a wall when they realize Python doesn’t behave like other languages when it comes to concurrency.
Well, don’t worry! In this guide, we’ll walk through how both approaches work in Python, along with what the Global Interpreter Lock (GIL) really does, and when each tool shines.
By the end, you’ll know exactly when to use threads, when to go with processes, and how to avoid the common mistakes that trip up most beginners.
Let’s get started…
Sidenote: If you find any of this confusing, or simply want a deep dive into Python, check out Andrei's Python Coding course taken by 200,000+ people:
It’ll take you from an absolute beginner to understanding everything you need to be hired ASAP.
Alternatively, if you're already pretty good at Python and want to build some interesting and useful projects, why not check out my course on Python Automation:
It'll show you how to automate all of the boring or repetitive tasks in your life - and makes for some pretty standout portfolio projects as well!
With all that out of the way, let's get into this 5-minute tutorial!
At some point, most Python developers hit a wall where their code feels slow. Maybe you're downloading hundreds of files, processing a big batch of data, or just running something that takes longer than you'd like.
And naturally, you start wondering: “Can’t I just make Python do multiple things at once?”
The short answer is, well yes... but it depends.
Python gives you several tools for running things “in parallel” (i.e. at the same time), but they don’t all work the same way. Nor do they always behave how you might expect, especially if you’re new to programming or have a background in a different language.
This is because the solution to doing multiple things at once in Python depends entirely on what kind of work your program is doing.
Let’s say you have two tasks:

- Downloading 100 files from the internet
- Resizing 100 large images

Both feel slow, but for totally different reasons:

- The downloads spend almost all their time waiting on the network (I/O-bound)
- The resizing keeps your CPU busy the whole time (CPU-bound)
From the outside, they both just look “slow.” But under the hood, they’re slow for different reasons, and that matters a lot when you try to speed them up.
That’s why Python gives you two different approaches:

- Multithreading, for code that’s mostly waiting
- Multiprocessing, for code that’s mostly working
But before we dig into when to use each one, let’s look at how threads actually work in Python and why they don’t always behave the way you’d expect.
Let’s say you’re running a coffee shop, and you’ve got one barista and one espresso machine.
You run for a few weeks and business is growing, so you hire a few more baristas to help with the morning rush. The goal? Get more drinks out faster.
That’s the same idea behind threads in Python.
Each thread is like a separate worker, handling its own task, but they all share the same workspace. They use the same memory, access the same tools, and can easily pass things back and forth. Because of that, threads are lightweight and quick to create, and they’re great for handling lots of small tasks that work with shared data.
But there’s a catch...
In Python, even if you create multiple threads, only one of them can actually run Python code at a time. The others have to wait their turn.
That’s because of something called the GIL, or Global Interpreter Lock.
It works like this:
Even though you’ve got multiple baristas (threads), there’s still only one espresso machine. And Python insists that only one barista can use it at a time. No matter how many threads you create, they have to take turns using the machine.
That’s manageable in some situations. Let’s say one barista is halfway through making a drink and steps aside to warm some milk. Another barista might jump in, use the machine to pull a shot, then step aside too. If everyone is doing small things or waiting between steps, they can rotate quickly and stay productive.
But it doesn’t always work like that.
Sometimes one barista gets a huge order — five complicated drinks in a row — and ends up hogging the machine. The others just have to wait.
That’s exactly what happens in Python when you use threads to run CPU-bound tasks. Even if you split the work across multiple threads, the GIL makes them take turns, one after another.
For example
Imagine we’ve written some code that squares four numbers.
Each call does about a second of real computation, and we’re using threads to try to do them all at once.
```python
import threading
import time

def square(n):
    # Roughly a second of real CPU work (exact time varies by machine)
    total = 0
    for i in range(10_000_000):
        total += i * i
    return n * n

numbers = [1, 2, 3, 4]
threads = []
start = time.time()

for n in numbers:
    thread = threading.Thread(target=square, args=(n,))
    threads.append(thread)
    thread.start()

for thread in threads:
    thread.join()

end = time.time()
print("Time taken:", round(end - start, 2), "seconds")
```
You might expect this to finish faster than 4 seconds, right? After all, the threads were started at the same time. But what actually happens is that it still takes about 4 seconds.
This is because each thread is doing real computation, and the GIL only lets one thread run Python code at a time. So even though the work is split across threads, they’re still taking turns, not working in parallel.
In other words, we’re not doing four 1-second tasks all at once; we’re doing them sequentially, one after another.
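To see the contrast, here’s a quick sketch of the opposite case. If each task just waits instead of computing (using `time.sleep`, which releases the GIL while it waits), the same threading approach finishes in about 1 second, because the threads can overlap their waiting:

```python
import threading
import time

def wait_one_second(n):
    # time.sleep releases the GIL, so other threads can run while this one waits
    time.sleep(1)

start = time.time()

threads = [threading.Thread(target=wait_one_second, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("Time taken:", round(time.time() - start, 2), "seconds")  # ~1 second
```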
So why does the GIL exist at all? Safety!
You see, Python’s memory management isn’t built for multiple threads changing objects at the same time. They can take turns, sure, but they can’t all run at once.
Without the GIL, two threads could try to update the same object at the same time and corrupt your program in the process. That, of course, is why Python uses the Global Interpreter Lock in the first place.
In our cafe analogy, it’s like a strict floor manager who says, “Only one person at the machine at a time! No exceptions.”
That’s why multithreading in Python doesn’t give you true parallelism; it’s not *really* doing two things at the same time. Instead, it alternates back and forth during downtimes.
Multithreading works best when threads are doing things like waiting on downloads or reading from disk. So if your code is mostly waiting, like a web scraper or API client, then threads can help you get more done faster.
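For instance, here’s a minimal sketch of that kind of I/O-bound work using `ThreadPoolExecutor` from the standard library (the URLs are placeholders for illustration, so swap in real ones to try it):

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Placeholder URLs, just for illustration
URLS = [
    "https://example.com/",
    "https://example.org/",
    "https://example.net/",
]

def fetch(url):
    # Most of this function's time is spent waiting on the network,
    # which releases the GIL so other threads can make progress
    with urlopen(url) as response:
        return len(response.read())

start = time.time()
with ThreadPoolExecutor(max_workers=3) as executor:
    sizes = list(executor.map(fetch, URLS))

print("Bytes per page:", sizes)
print("Time taken:", round(time.time() - start, 2), "seconds")
```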
But if your code is mostly working, doing things like crunching numbers or transforming data, then threads won’t give you the speed boost you’re looking for.
That’s where multiprocessing comes in.
Let’s go back to the cafe analogy again.
So, we’ve got a team of baristas and only one espresso machine. That setup works to a point, but when orders keep piling up and each one takes real effort, taking turns just doesn’t cut it.
But what if you could give each barista their own machine?
That’s what multiprocessing does. Instead of asking multiple threads to share one workspace, Python’s `multiprocessing` module creates entirely separate workspaces.
Each process runs in its own memory space, with its own Python interpreter, and crucially, its own GIL. That means tasks don’t need to wait in line. They can all run truly in parallel, across different CPU cores.
Back in the coffee shop, this is like expanding the counter and bringing in more espresso machines. Now four baristas can each make drinks at the same time. This means no more queuing, no more bumping into each other, and no need for a strict manager telling people to wait.
This completely avoids the GIL bottleneck. Since each process runs separately, your code can finally use all your CPU cores to do heavy work in parallel.
Let’s see this in code.
For example
Let’s use the same task as before of squaring numbers, but this time we’ll use multiprocessing instead of threads.
You probably already have a good idea of how fast it’ll be, but let’s break it down:
```python
import multiprocessing
import time

def square(n):
    # Same CPU-bound work as before: roughly a second per call
    total = 0
    for i in range(10_000_000):
        total += i * i
    return n * n

if __name__ == '__main__':
    numbers = [1, 2, 3, 4]
    start = time.time()

    with multiprocessing.Pool() as pool:
        results = pool.map(square, numbers)

    end = time.time()
    print("Results:", results)
    print("Time taken:", round(end - start, 2), "seconds")
```
Here’s what’s happening:

- `multiprocessing.Pool()` spreads the work across multiple processes, usually one per CPU core
- Since the tasks are independent and CPU-bound, they can all run at the same time

This means we have four 1-second tasks running in parallel, so the entire thing finishes in just over 1 second, as opposed to 4 seconds with threads.
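If you want to verify that the work really lands in separate processes, a quick sketch like this prints the process ID each task ran in (you should see several different PIDs):

```python
import multiprocessing
import os

def report(n):
    # Each worker runs in its own OS process, with its own interpreter and GIL
    return n, os.getpid()

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        for n, pid in pool.map(report, range(4)):
            print(f"Task {n} ran in process {pid}")
```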
That’s the power of multiprocessing: real speedups on real machines.
You’d reach for this approach when you’re:

- Crunching numbers or running simulations
- Processing images, video, or other large media files
- Transforming big, independent chunks of data
Just keep in mind: processes don’t share memory the way threads do. If you need to exchange data between them, you’ll use tools like `Queue`, `Pipe`, or `Manager`.
It’s a bit more setup, but you get true parallel execution without the GIL standing in your way.
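As a taste of what that setup looks like, here’s a minimal sketch that passes a result from a child process back to the parent through a `multiprocessing.Queue`:

```python
import multiprocessing

def worker(q):
    # The child can't touch the parent's variables directly,
    # so it sends its result back through the shared queue
    q.put(21 * 2)

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(q,))
    p.start()
    print("Got from child:", q.get())  # 42
    p.join()
```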
At this point, you might be wondering: “If multiprocessing avoids the GIL and runs in parallel, why not just always use it?”
A totally fair question.
The answer is that it depends on what your code is doing, and how much overhead you’re willing to take on.
If your code is mostly waiting, doing things like downloading files, reading from disk, or making API requests, then threads are the better choice.
Threads are lightweight. They start quickly, use less memory, and can easily share data since they live in the same memory space. When one thread is waiting, another can jump in and get work done.
You don’t need to worry about parallelism. You can just be happy in the knowledge that something’s always moving forward.
If your code is mostly working, doing things like number crunching, image processing, or data transformations, then use multiprocessing.
Processes run in true parallel across CPU cores. Each one has its own memory and its own interpreter, so the GIL doesn’t get in the way. That makes them ideal for CPU-heavy workloads.
They’re heavier to start, and sharing data between them takes more work, but if your program is doing real computation, the speedup is worth it.
| Task | Use | Why it fits |
|---|---|---|
| Downloading files | Threads | Waiting on I/O (can switch while waiting) |
| Reading from disk | Threads | Low CPU, lots of waiting |
| Web scraping | Threads | Waiting on external responses |
| Image processing | Multiprocessing | Heavy CPU work benefits from parallel cores |
| Math simulations | Multiprocessing | Needs real parallelism |
| Data transformations | Multiprocessing | CPU-intensive and independent |
Once you start writing real code with threads or processes, a few surprises tend to pop up.
These aren’t big design flaws. They’re just the kinds of mistakes that quietly break things or make your code slower than expected.
Let’s look at the ones most beginners run into...
Expecting threads to speed up CPU-bound code
You write a function that does heavy computation, run it in threads, and expect a big boost in performance. But when you run it… the code takes just as long as before.
This is usually your first encounter with the GIL, and it comes down to what we said earlier. Python only lets one thread execute Python code at a time, so no matter how many threads you use, they still take turns.
Fix:
If your code is CPU-bound, use `multiprocessing` instead of threads.
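One easy way to make that swap, sketched here with `concurrent.futures`: `ProcessPoolExecutor` has the same interface as `ThreadPoolExecutor`, so moving CPU-bound work into processes is often a one-line change (the `crunch` function below is just a stand-in for real computation):

```python
from concurrent.futures import ProcessPoolExecutor  # swap in for ThreadPoolExecutor

def crunch(n):
    # Stand-in for real CPU-bound work
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    # Each worker is a separate process with its own GIL,
    # so these tasks actually run in parallel
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(crunch, [10**6] * 4)))
```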
Forgetting the `__main__` check in multiprocessing
On Windows (and sometimes macOS), forgetting this guard can cause your script to crash, hang, or recursively spawn child processes.
Fix:
Always wrap your multiprocessing code like this:
```python
if __name__ == '__main__':
    # All multiprocessing setup and calls go here
    ...
```
Even if it “works” without it, skipping this is asking for bugs later.
Assuming processes share memory
Threads can share memory. Processes can’t. If you update a list in one process, that change won’t be visible in another.
Fix:
If you need to share data, use `multiprocessing.Queue`, `Pipe`, or a `Manager` object to send it between processes. Or return results and combine them at the end.
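For example, here’s a rough sketch using a `Manager` list, where changes made in child processes are visible back in the parent:

```python
import multiprocessing

def append_square(shared, n):
    # Appends go through the Manager, so the parent sees them too
    shared.append(n * n)

if __name__ == '__main__':
    with multiprocessing.Manager() as manager:
        results = manager.list()
        procs = [
            multiprocessing.Process(target=append_square, args=(results, n))
            for n in [1, 2, 3, 4]
        ]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(list(results))  # e.g. [1, 4, 9, 16] (order may vary)
```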
Spawning too many threads or processes
More isn’t always better. Spawning 100 threads or 50 processes often leads to crashes, memory issues, or worse performance.
Fix:
Keep things simple. Start with a handful of workers, or use a thread/process pool to manage how many run at once.
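A pool takes care of that cap for you. In this sketch, no matter how many tasks you submit, only four workers ever run at once (`handle` is a hypothetical task function):

```python
from concurrent.futures import ThreadPoolExecutor

def handle(item):
    # Stand-in for a real task
    return item * 2

# 100 tasks, but at most 4 threads running at any moment
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, range(100)))

print(results[:5])  # [0, 2, 4, 6, 8]
```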
So as you can see, both methods are incredibly helpful, but like anything, they each have their strengths and weaknesses.
Just remember:

- Threads are for code that’s mostly waiting (I/O-bound)
- Processes are for code that’s mostly working (CPU-bound)
As you’ve probably already realized, understanding this concept is the key to leveraging these powerful techniques effectively.
Threads help overlap I/O wait time. Processes run CPU-heavy work in parallel, avoiding Python’s GIL. You don’t need to memorize the internals; just understand the tradeoff.
But most importantly, you *do* need to try this out for yourself!
Pick something small. Write a script that downloads a few files using threads. Then try a task that crunches numbers and swap in multiprocessing. Watch what changes; you’ll learn more from running those two scripts than reading another tutorial.
And once you have, you’ll know exactly which tool to reach for next time.
Remember - If you want to dive deep into Python then be sure to check out Andrei's Complete Python Developer course:
It’ll take you from an absolute beginner and teach you everything you need to get hired ASAP and ace the tech interview.
This is the only Python course you need if you want to go from complete Python beginner to getting hired as a Python Developer this year!
Alternatively, if you're already pretty good at Python and want to build some interesting and useful projects, why not check out my course on Python Automation:
It'll show you how to automate all of the boring or repetitive tasks in your life - and makes for some pretty standout portfolio projects!
Plus, as part of your membership, you'll get access to both of these courses and others, and be able to join me and 1,000s of other people (some who are alumni mentors and others who are taking the same courses that you will be) in the ZTM Discord.
If you enjoyed this post, check out my other Python tutorials: