Concurrency

Processes and threads

Modern processors have multiple cores. To allow a single program to run on these cores at the same time, we need concurrent programming. Concurrent programming is based on processes and threads.

A process is a program in execution, where a program refers to an executable stored in the file system. There may be multiple processes of the same program. For example, if a user runs an editor program N times, there will be N processes of the editor.

The address space of a process consists of the following sections:

  • code: contains instructions.
  • static: contains static data, e.g., static x: i32 = 1.
  • heap: contains boxes, e.g., Box::new(1).
  • stack: contains variables, e.g., let x = 1.

If a program wishes to run multiple tasks concurrently, it could run each task in a separate process, but doing so would be expensive because creating processes is expensive. This is because each process has its own static, heap, and stack sections even if they share the same code section (i.e., run the same program). To reduce the overhead of creating multiple processes, the program can create threads instead.

Similar to processes, each thread runs a task independently, i.e., each thread is independently scheduled. However, unlike processes, all the threads of the same process share the same static and heap sections, while each thread has its own stack section. Because of the shared static and heap sections, when a process creates a new thread, it needs to create only a new stack section, so creating a thread is much cheaper than creating a new process. Concurrent programs use threads extensively.

Creating threads

std::thread::spawn() creates a new thread. When a Rust program starts, its main function runs in the main thread. When the main thread finishes, the entire process exits even if other threads may still be running. To wait for another thread, the waiting thread can call join() on the other thread's handle, which was returned by std::thread::spawn().


# #![allow(unused_variables)]
#fn main() {
let t = std::thread::spawn(|| println!("child thread"));
println!("parent thread");
t.join().unwrap();
#}

Since the parent and child threads are scheduled independently, we do not know which of them executes the println macro first.

Move closure

std::thread::spawn() takes a closure as the parameter and runs the closure in the new thread. However, there are restrictions on the closure.


# #![allow(unused_variables)]
#fn main() {
fn f() {
    struct Foo(i32);
    let foo = Foo(1);
    // Error: `foo` does not live long enough.
    std::thread::spawn(|| println!("{}", foo.0));
}
#}

The program above generates a compiler error, for a good reason. The closure passed to spawn captures the variable foo in its environment by reference. However, since the parent and child threads are scheduled independently, by the time println!() in the child thread accesses foo by reference, f() in the parent thread may already have returned, so foo no longer exists.

To detect this error, we could make the Rust compiler aware of the thread API (as the Go language does), but that would prevent programs from using alternative threading models. Instead, the Rust compiler knows nothing about threads; the thread API is provided by the std::thread library. In other words, the compiler does not know that std::thread::spawn() creates a thread. How, then, does the compiler detect the above use-after-free error?

The answer is in the declaration of std::thread::spawn():

fn spawn<F, T>(f: F) -> JoinHandle<T> 
    where F: FnOnce() -> T, F: Send + 'static, T: Send + 'static

This declaration is rather complex, but the key here is the 'static lifetime required of the closure f. Since f has the static lifetime, it cannot reference any variable whose lifetime is not static, because a reference cannot outlive the value that it points to. Note that the static lifetime outlives any non-static lifetime.

To overcome this lifetime problem, the program can add the move keyword in front of the closure, which instructs the compiler to move the captured variables from the environment into the closure instead of borrowing them.


# #![allow(unused_variables)]
#fn main() {
fn f() {
    struct Foo(i32);
    let foo = Foo(1);
    // OK: the compiler moves `foo` into the closure.
    std::thread::spawn(move || println!("{}", foo.0));
}
#}

Share data between threads

Threads often share data between them. However, once a parent thread moves a value into a child thread, the parent thread can no longer access the value.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = Foo(1);
std::thread::spawn(move || println!("{}", foo.0));
// Error: `foo` is moved.
println!("{}", foo.0);
#}

To allow multiple threads to share the same value, Rust provides smart pointers. Reference-counting smart pointers relax Rust's single-ownership type system. Without smart pointers, a value is owned by exactly one owner. When the owner goes out of scope, Rust destroys the value. Since at compile time the compiler knows when each variable goes out of scope, the compiler knows when to destroy each value.

By contrast, a value may be owned by multiple smart pointers, and Rust won't destroy the value until all the owners go out of scope. Since at compile time the compiler does not know which owner is the last one to go out of scope, the compiler must generate code that, whenever an owner goes out of scope, checks if that is the last owner. If so, the code destroys the shared value. Therefore, smart pointers incur runtime overhead.

Rust provides different types of smart pointers. Some allow only reading the shared values, and some allow writing the shared values.

Immutable shared data

std::sync::Arc is a smart pointer providing read-only access to shared data.

struct Arc<T> { ... }

impl<T> Arc<T> {
    fn new(x: T) -> Arc<T> { ... }
    ...
}

impl<T> Deref for Arc<T> {
    type Target = T;
    ...
}

A program may clone Arc<T> to create many owners of the same shared data.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = std::sync::Arc::new(Foo(1));
{
    // Rebinds the name `foo`. After this statement, `foo` in this scope and 
    // `foo` in the parent scope are different smart pointers, but they point
    // to the same shared value `Foo(1)`.
    let foo = foo.clone();
    std::thread::spawn(move || println!("{}", foo.0));
}
// OK
println!("{}", foo.0);
#}

The above program shows an idiomatic way to clone a smart pointer and to move it into a closure:

  • Create a new scope.
  • In the new scope, clone the smart pointer.
  • Move the cloned value into the closure.

You might wonder how the program could access foo.0, because the type of foo is Arc<Foo> rather than Foo. The answer is that Arc implements the Deref trait:

impl<T> Deref for Arc<T> {
    type Target = T;
  ...
}

Therefore, the compiler automatically coerces a reference to Arc<Foo> into a reference to Foo.

Arc<T> is read-only. Since multiple threads can read the same Arc<T> concurrently, allowing any thread to mutate the shared value would cause a data race. To prohibit write access, Arc<T> does not implement the DerefMut trait.

A data race happens when

  • multiple threads access the same shared data, and
  • at least one access is writing, and
  • the threads lack proper synchronization.

As a result, the behavior of the program is non-deterministic, depending on the relative order in which the instructions in different threads run. For example, the following C program has a data race and may print either 0 or 1.

#include <pthread.h>
#include <stdio.h>

void *f(void *p) {
    *(int *)p = 1;
    return NULL;
}

int main(void) {
    int i = 0;
    pthread_t t;
    pthread_create(&t, NULL, f, &i);
    printf("%d\n", i); // races with the write in f()
    pthread_join(t, NULL);
    return 0;
}

Mutable shared data

To allow threads to mutate shared data, we need a type that can enforce mutually exclusive access to the shared data at runtime. Rust's standard library provides std::sync::Mutex<T> for this purpose.

struct Mutex<T> { ... }

impl<T> Mutex<T> {
    fn new(t: T) -> Mutex<T> { ... }
    fn lock(&self) -> LockResult<MutexGuard<T>> { ... }
    ...
}

impl<T> Deref for MutexGuard<T> {
    type Target = T;
    ...
}

impl<T> DerefMut for MutexGuard<T> {
    ...
}

type LockResult<Guard> = Result<Guard, PoisonError<Guard>>;

A Mutex<T> value m contains a value v of type T. However, a thread cannot access v simply by dereferencing m, because Mutex<T> implements neither Deref nor DerefMut. Instead, to access v, the thread must call the lock method of Mutex<T>, which returns a value g of type MutexGuard<T>. The thread then dereferences g to access v, because MutexGuard<T> implements both Deref and DerefMut.

While a thread t holds a MutexGuard<T>, any other thread s that calls lock() on the same Mutex<T> blocks until t's MutexGuard<T> goes out of scope. This ensures mutually exclusive access to the value in Mutex<T>.

Mutex<T>::lock() returns the type LockResult<MutexGuard<T>>, which is an alias of Result<MutexGuard<T>, PoisonError<MutexGuard<T>>>. The value Err(PoisonError<MutexGuard<T>>) indicates that a thread panicked while holding the MutexGuard<T>, which gives the calling thread an opportunity to handle this when necessary.

A common use of Mutex<T> is to enforce an invariant on shared mutable data. Before a thread mutates the data, it acquires mutually exclusive access by calling Mutex<T>::lock() to get a MutexGuard<T>. While holding the MutexGuard<T>, the thread may modify the data, during which the invariant may temporarily not hold. When the thread finishes modifying the data, it restores the invariant before releasing the MutexGuard<T>. This programming discipline ensures that whenever a thread acquires a MutexGuard<T>, it can be confident that the invariant protected by the Mutex<T> holds.

However, when the thread holding the MutexGuard<T> panics, it may not have finished modifying the shared data and restoring the invariant, so the Rust library poisons the Mutex<T>: from then on, Mutex<T>::lock() always returns Err(PoisonError<MutexGuard<T>>). This notifies the calling thread to handle the potentially broken invariant.

Mutex<T> is a type with interior mutability. To mutate a value of most types, the program must hold a mutable reference to the value.1 This is called exterior mutability. By contrast, to mutate the value in a Mutex<T>, the program needs no mutable reference to Mutex<T>. Instead, the program acquires a MutexGuard<T> and mutates the value via MutexGuard<T>.

1. We can think of owning a value as having a special mutable reference to the value.

To allow multiple threads to access the same Mutex<T>, the program must wrap it in Arc<_>, and send a clone of the Arc<_> to each thread.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = Foo(0);
let mutex = std::sync::Mutex::new(foo);
let arc = std::sync::Arc::new(mutex);
let child;
{ 
    let arc = arc.clone();
    child = std::thread::spawn(move || { 
        let mut guard = arc.lock().unwrap();
        guard.0 += 1;
    });
}
{
    let mut guard = arc.lock().unwrap();
    guard.0 += 1;
}
child.join().unwrap();
let mut guard = arc.lock().unwrap();
assert_eq!(guard.0, 2);
#}

How Rust prevents common synchronization errors

This section assumes that the reader knows synchronization in operating systems.

It is difficult to write correct multi-threaded programs, because it is difficult to implement synchronization correctly. Common synchronization errors include the following:

  • Forget to acquire the lock before accessing a mutably shared object.
  • Forget to release the lock after accessing a shared object.
  • Place shared objects on the stack.

Detecting these errors in C is challenging, but Rust eliminates these errors at compile time, thanks to its powerful type system.

Error: forget to acquire the lock before accessing a mutably shared object

Using only safe Rust, multiple threads cannot share mutable values without using synchronization types such as Arc and Mutex. The reason is as follows. A Rust program stores values in the static data section, on the stack, or on the heap. Variables name memory locations in the static data section or on the stack; values on the heap are accessible only via their boxes on the stack.

Immutable static variables can be read in different threads, but this is safe. Mutable static variables cannot be accessed in safe Rust.

Multiple threads may not share values on the stack (see the explanation). Therefore, threads must place shared values into Arc<_>. After that, the program can no longer access the original variables directly, because they have been moved. The only way to access these variables is via Arc.

Since Arc<_> implements only Deref but not DerefMut, the program can only read the value in Arc<_>. If the program needs to modify the value, it must place the value in Arc<Mutex<_>>. In this case, the program must call Mutex::lock() to get a MutexGuard. MutexGuard implements both the Deref and DerefMut traits, which dereference to the type of the shared object, allowing the program to read and write the object.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = Foo(1);
let mutex = std::sync::Mutex::new(foo);
{ 
    let mut guard = mutex.lock().unwrap();
    assert_eq!(guard.0, 1);
    guard.0 = 2;
}
// Error: value is moved.
foo.0 = 2;
#}

Error: forget to release the lock after accessing a shared object

After the program calls Mutex::lock(), it gets a MutexGuard, which holds the lock. MutexGuard implements the Drop trait: when it goes out of scope, it releases the lock automatically, so a thread cannot forget to release the lock (unless it enters an infinite loop or crashes).

Error: place shared objects on the stack

Since each thread has its own stack, a thread t1 should not access an object stored on the stack of another thread t2 unless the program can guarantee that t2 outlives t1; otherwise, the reference to the object that t1 holds will become dangling after t2 exits.

Rust's type system statically prevents the use of values after they go out of scope. Consider the following incorrect program.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = Foo(1);
// Error: the reference to `foo` in the closure outlives `foo`.
std::thread::spawn(|| println!("{}", foo.0));
#}

This program does not compile because spawn() requires its argument to be a closure of static lifetime. Rust declares spawn() this way because the new thread may live until the end of the process.

fn spawn<F, T>(f: F) -> JoinHandle<T> 
    where F: FnOnce() -> T, F: Send + 'static, T: Send + 'static

To avoid this error, the program has two choices:

  • Move the value into the closure. But in this case, this value isn't shared any more, because the parent thread loses this value.

  • Wrap the value in an Arc, and then clone it in every thread. Arc keeps only one copy of the value and releases it when all the owners of Arc go out of scope at runtime.


# #![allow(unused_variables)]
#fn main() {
struct Foo(i32);
let foo = Foo(1);
let shared = std::sync::Arc::new(foo);
let shared_ = shared.clone();
std::thread::spawn(move || println!("{}", shared_.0));
#}