Parallelism vs Concurrency
4 min read
Today's topic is something I've been wanting to delve into for a while. It's one of those things where you kinda sorta think you understand what's happening, but are constantly questioning whether you'd truly know how to explain the difference if someone asked you the question.
Programs, processes and threads
Understanding these terms will be really helpful when we get down to the main concepts below, so here they are:
- A process is an instance of a program that's loaded on a computer, ready to be or being executed by the processor (Central Processing Unit or CPU).
- A process can be divided into smaller units of work, called threads.
Taking a step back:
- A system with multiple threads running concurrently is a multi-threaded system.
- A system with multiple processes running simultaneously (through multiple processors or a multicore processor) is a multi-process system.
Note that processes run independently of each other and don't share memory, which can make communication between them difficult. Threads, on the other hand, run within a single process and share its resources, such as memory.
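To make the shared-resources point concrete, here's a minimal sketch in Python. The names (`counter`, `worker`) are illustrative, not from any library: two threads update the same variable, which they could not do if they were separate processes with separate memory.

```python
import threading

counter = 0
lock = threading.Lock()

def worker(increments):
    global counter
    for _ in range(increments):
        with lock:       # guard the shared resource against races
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # → 20000: both threads saw and updated the same memory
```

Without the lock, the two threads could interleave their read-modify-write steps and lose updates, which is exactly the kind of hazard shared resources introduce.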
Parallelism occurs when tasks literally run at the same time. For this to take place, we need to have multiple processors (CPUs) working at the same time (or a multicore processor). This allows us to increase the throughput and computational speed of the system.
To take advantage of parallelism, we'd need to take a process and split it into independent sub-processes. We can then execute each sub-process independently through a dedicated processor.
Concurrency relates to a program that can start, run and complete more than one task in an overlapping time period. Note that this does not necessarily mean the tasks are running at the same instant in time (since we might only have a single processor). What happens instead is context switching: a single processor kicks off a task and, instead of waiting for that task to complete, switches to another task, going back and forth until both (or all) tasks are completed.
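Context switching can be observed with Python threads, which (because of the interpreter's global lock) take turns on the CPU rather than running truly in parallel. The task names below are made up for illustration; each `sleep` stands in for a wait (e.g. I/O) during which the other task gets a turn.

```python
import threading
import time

events = []

def task(name):
    for step in range(3):
        events.append((name, step))
        time.sleep(0.01)   # simulate waiting; the other task can run now

t1 = threading.Thread(target=task, args=("download",))
t2 = threading.Thread(target=task, args=("parse",))
t1.start(); t2.start()
t1.join(); t2.join()

print(events)  # steps from both tasks interleave rather than run back to back
```

The exact interleaving varies from run to run, which is characteristic of concurrent execution: the scheduler, not the program, decides when each switch happens.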
If we were referring to a process designed to operate within a multi-threaded system, each thread would execute concurrently.
In order for this to work, the program needs to be able to execute out of order, or in partial order, without the final output being affected. Concurrency should therefore be thought of as the structure of a program, whereas parallelism pertains to the execution of the program. A program which implements concurrency, and which is executed in a multi-processor environment, could achieve parallelism - the two concepts are therefore not mutually exclusive.
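The "structure, not execution" point can be sketched with Python's `asyncio`: one thread, one event loop, no parallelism at all, yet the tasks overlap in time because each yields control at its `await` points. The task names and sleep durations below are made up for illustration.

```python
import asyncio

order = []

async def task(name, pause):
    order.append(f"{name} start")
    await asyncio.sleep(pause)   # yield control instead of blocking
    order.append(f"{name} end")

async def main():
    # Both tasks start before either finishes: overlapping time periods
    # on a single processor core.
    await asyncio.gather(task("A", 0.02), task("B", 0.01))

asyncio.run(main())
print(order)  # → ['A start', 'B start', 'B end', 'A end']
```

The program is structured so the tasks can complete in partial order (B finishes before A) without affecting the final result - that structure is concurrency; whether the tasks also run on separate processors is a question of execution.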
I came across this analogy during my reading, that I think really helps illustrate the point.
- Imagine a shop with a single cashier. Concurrency would be having two lines of customers that both have to pay at that single cashier. The lines take turns paying, and while the first customer might have finished paying and is packing away their items, the cashier might start with the second customer.
- Now imagine the shop with two cashiers and two lines. Both lines are independent and get their own cashier. This illustrates parallelism.
Modern browsers typically run each browser tab in its own process (this is why we can refresh one tab, go to a second one to refresh that, and return to the first to find it updated).