An Introduction to Concurrency: Understanding Threads and Shared Data
Understanding concurrency through the lens of threads and shared data.
When you write code, you typically imagine a single line of instructions executed one after another. But today’s systems often run multiple tasks simultaneously. Understanding concurrency, therefore, becomes essential to building robust and efficient software. Let’s delve into concurrency through the lens of the widely respected resource, Operating Systems: Three Easy Pieces (OSTEP).
The Essence of Concurrency and Threads
At its core, concurrency allows your program to do multiple things seemingly at once. This is achieved through a concept known as threads. A thread is like a lightweight process—it has its own execution context, including a program counter, registers for computation, and its own stack for local variables and function calls.
But threads differ significantly from processes. While separate processes have independent memory spaces, threads within a single process share a common address space. This makes communication between threads incredibly straightforward but introduces challenges around safely accessing shared data.
How Threads Work Under the Hood
When multiple threads run on a single processor, the CPU rapidly switches between them in what’s called a context switch. During this switch, the CPU saves the state (the registers and program counter) of the current thread and restores the state of another. Unlike context switching between processes, threads don’t require changing the address space or page tables since all threads in a process share the same memory.
Moreover, each thread maintains its own stack in the shared memory space. Having separate stacks allows threads to independently call functions and maintain local data without interference from other threads. While this can complicate the neat memory layout, stacks generally remain small, making it a manageable trade-off.
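To make that concrete, here is a minimal sketch (the names shared_value and show_addresses are illustrative, not from OSTEP): both threads print the same address for a global variable, but different addresses for their local variables, because each local lives on its own thread's stack.
#include <stdio.h>
#include <pthread.h>

int shared_value = 42;                 /* lives in the shared data segment */

void *show_addresses(void *arg) {
    int local = 0;                     /* lives on this thread's private stack */
    printf("%s: &shared_value=%p, &local=%p\n",
           (char *)arg, (void *)&shared_value, (void *)&local);
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, show_addresses, "Thread A");
    pthread_create(&t2, NULL, show_addresses, "Thread B");
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
Compiled with gcc -pthread and run, both lines should show the same address for shared_value but different addresses for local.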
Let’s visualize the memory layout differences:
graph TD;
    subgraph Single[Single-threaded process]
        SC[Code] --- SD[Data] --- SH[Heap] --- SS[Stack]
    end
    subgraph Multi[Multi-threaded process]
        MC[Code] --- MD[Data] --- MH[Heap]
        MH --- T1[Stack - Thread 1]
        MH --- T2[Stack - Thread 2]
    end
In the single-threaded process, we have a simple memory layout with one stack. In contrast, the multi-threaded process shares the code, data, and heap segments while maintaining separate stacks for each thread. This shared memory model is what makes thread communication efficient but also introduces the challenges of synchronization.
Why Use Threads?
Threads primarily serve two purposes:
First, parallelism. By splitting computationally intensive tasks across multiple threads, we leverage modern multi-core processors to significantly speed up execution. For example, processing large data sets or performing heavy mathematical computations can be dramatically accelerated using threads.
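As a rough sketch of this idea (the work_t struct and the half-and-half split are just one illustrative decomposition), the program below sums the integers from 0 to N-1 by giving each thread half of the range. Each thread writes only to its own slot, so no locking is needed for this particular split.
#include <stdio.h>
#include <pthread.h>

#define N 10000000

/* Each thread sums half of the range and records its result in its own
   slot, so the two threads never write to the same memory location. */
typedef struct { long long start, end, sum; } work_t;

void *partial_sum(void *arg) {
    work_t *w = (work_t *)arg;
    w->sum = 0;
    for (long long i = w->start; i < w->end; i++)
        w->sum += i;
    return NULL;
}

int main() {
    pthread_t t1, t2;
    work_t w1 = {0, N / 2, 0}, w2 = {N / 2, N, 0};
    pthread_create(&t1, NULL, partial_sum, &w1);
    pthread_create(&t2, NULL, partial_sum, &w2);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("total = %lld\n", w1.sum + w2.sum);
    return 0;
}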
Second, overlapping I/O with computation. Threads help ensure that when your program waits for slow input/output operations—such as reading data from disk or the network—it doesn’t just idle. Instead, another thread can perform computations or initiate other operations. This capability is especially important in server applications like web servers and databases, enabling greater responsiveness and efficiency.
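Here is a small sketch of the second case (sleep merely stands in for a blocking read from disk or the network; the names slow_io and compute are illustrative): while the I/O thread is asleep, the compute thread keeps the CPU busy, so the total wall-clock time is roughly the longer of the two activities rather than their sum.
#include <stdio.h>
#include <unistd.h>
#include <pthread.h>

/* Stand-in for a slow read from disk or the network. */
void *slow_io(void *arg) {
    sleep(2);                          /* thread sleeps; the CPU stays free */
    printf("I/O finished\n");
    return NULL;
}

void *compute(void *arg) {
    long long sum = 0;
    for (long long i = 0; i < 100000000; i++)
        sum += i;
    printf("computation finished: %lld\n", sum);
    return NULL;
}

int main() {
    pthread_t io, work;
    pthread_create(&io, NULL, slow_io, NULL);
    pthread_create(&work, NULL, compute, NULL);
    pthread_join(io, NULL);
    pthread_join(work, NULL);
    return 0;
}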
You might wonder if separate processes could achieve similar goals. Indeed, they could, but threads simplify communication due to their shared memory, making threads a natural fit for closely related tasks needing frequent data sharing.
A Simple Example of Threads in Action
Consider a straightforward C program that creates two threads:
#include <stdio.h>
#include <pthread.h>

void *mythread(void *arg) {
    printf("%s\n", (char *)arg);
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, mythread, "Thread A");
    pthread_create(&t2, NULL, mythread, "Thread B");
    pthread_join(t1, NULL);            /* wait for each thread to finish */
    pthread_join(t2, NULL);
    printf("Both threads finished\n");
    return 0;
}
When this code runs, you’ll notice the execution order isn’t predictable. Sometimes “Thread B” might print before “Thread A,” even though “Thread A” was created first. This unpredictability stems from the OS scheduler deciding which thread to run next based on internal algorithms and available CPU resources.
The Pitfalls of Shared Data: Race Conditions
Threads introduce complexity when they access shared data. Consider a seemingly simple increment operation on a shared counter:
int counter = 0;                       /* shared by every thread in the process */

void *increment(void *arg) {
    for (int i = 0; i < 10000000; i++) {
        counter++;                     /* not atomic: load, add, store */
    }
    return NULL;
}
Running two threads that each execute this function, we might expect a final counter value of exactly 20,000,000. In practice, however, the result varies unpredictably due to what is known as a race condition.
The problem is that the increment operation (counter++) is not atomic. Under the hood, it breaks down into three separate CPU instructions. On ARM64, it looks like this:
ldr x0, [x8] ; Load the counter value into register x0
add x0, x0, #1 ; Increment the value in x0
str x0, [x8] ; Store the updated value back into the counter variable
To see this for yourself, you can compile and disassemble the C program using the following commands:
# Compile the program with debug symbols (and link with the pthread library)
gcc -g -pthread counter.c -o counter
# Disassemble the program using objdump
objdump -d -S counter | less
# Or using llvm-objdump on macOS
llvm-objdump -d counter | less
If one thread is interrupted after incrementing the value in its register but before storing it back to memory, another thread can slip in and update the counter, and the first thread's store then overwrites that update. For example, if both threads load the value 5, each computes 6, and both store 6 back, one of the two increments is simply lost. The result is incorrect and unpredictable.
Understanding Critical Sections and Mutual Exclusion
This problematic part of the code—where shared data is accessed and modified—is called a critical section. To fix race conditions, we need to ensure that critical sections are executed by only one thread at a time. This solution is known as mutual exclusion, often implemented using synchronization primitives like mutexes.
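Using the POSIX mutex API, one way to protect the counter's critical section looks like the sketch below: with the lock held around counter++, the final value is reliably 20,000,000.
#include <stdio.h>
#include <pthread.h>

int counter = 0;
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *increment(void *arg) {
    for (int i = 0; i < 10000000; i++) {
        pthread_mutex_lock(&lock);     /* enter the critical section */
        counter++;
        pthread_mutex_unlock(&lock);   /* leave the critical section */
    }
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", counter); /* now reliably 20000000 */
    return 0;
}
The trade-off is that only one thread at a time can be inside the locked region, so the loop now serializes on the lock; correctness comes at some cost in parallelism.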
Can We Rely Solely on Atomic Operations?
Modern CPUs indeed provide atomic instructions for simple operations, like incrementing an integer. However, expecting hardware to atomically handle complex data structures—like trees or lists—is impractical. Thus, developers rely on synchronization mechanisms provided by operating systems and libraries, such as:
- Mutexes to control access to critical sections.
- Semaphores and Condition Variables to manage waiting and signaling between threads.
Think of atomic operations as basic building blocks provided by hardware. It’s up to the programmer to use these building blocks to safely construct concurrent programs.
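For instance, C11 exposes these hardware building blocks through <stdatomic.h>. A minimal sketch of the counter rewritten with an atomic increment (one possible approach among several):
#include <stdio.h>
#include <pthread.h>
#include <stdatomic.h>

atomic_int counter = 0;

void *increment(void *arg) {
    for (int i = 0; i < 10000000; i++) {
        atomic_fetch_add(&counter, 1); /* one indivisible read-modify-write */
    }
    return NULL;
}

int main() {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, increment, NULL);
    pthread_create(&t2, NULL, increment, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", atomic_load(&counter)); /* 20000000, no lock needed */
    return 0;
}
This works because the hardware guarantees the fetch-and-add completes as a single indivisible operation; for anything more complex than a single variable, you are back to locks or other synchronization.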
Waiting and Coordination Among Threads
Besides mutual exclusion, threads often need mechanisms to wait for events or each other. Consider a thread waiting for data to arrive from a network socket or a disk operation. During such waits, the thread sleeps, freeing CPU resources for other threads. Once the awaited operation finishes, the waiting thread is notified or woken up.
Such coordination often involves condition variables and other signaling mechanisms, integral parts of concurrent programming.
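As a small sketch of that pattern with POSIX condition variables (the ready flag and thread names are illustrative): the waiting thread sleeps inside pthread_cond_wait until the other thread sets the flag and signals it.
#include <stdio.h>
#include <pthread.h>

int ready = 0;                         /* the condition the waiter cares about */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

void *waiter(void *arg) {
    pthread_mutex_lock(&lock);
    while (!ready)                     /* re-check: wakeups can be spurious */
        pthread_cond_wait(&cond, &lock);   /* sleeps, releasing the lock */
    pthread_mutex_unlock(&lock);
    printf("data is ready, continuing\n");
    return NULL;
}

void *producer(void *arg) {
    /* ... produce the data, e.g., finish an I/O operation ... */
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&cond);        /* wake the waiting thread */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main() {
    pthread_t w, p;
    pthread_create(&w, NULL, waiter, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(w, NULL);
    pthread_join(p, NULL);
    return 0;
}
The while loop around the wait matters: the waiter always re-checks the condition while holding the lock, since wakeups are only a hint that the state may have changed.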
Summary: Threads, Concurrency, and You
Concurrency, introduced historically by operating systems, remains fundamental in modern software development. It allows efficient use of resources and responsiveness. But it also complicates programming due to issues like race conditions and indeterminate results.
To successfully navigate concurrency, always remember:
- Threads share memory but maintain separate execution contexts.
- Race conditions result from unsynchronized access to shared data.
- Critical sections must be protected with mutual exclusion techniques.
- Atomic instructions help, but real-world concurrency demands more robust synchronization constructs.
- Condition variables and signaling ensure threads can efficiently wait and communicate.
Concurrency challenges us to think carefully about timing, shared resources, and synchronization—but mastering it unlocks tremendous performance and responsiveness for your applications.
As OSTEP succinctly puts it:
“If your head doesn’t hurt at least a little, you’re not doing it right.”