Closures
Function types
Functions are first-class objects in Rust, meaning that a program may use functions in the same way as other values. For example, the program may assign a function to a variable, and then invoke the function via the variable.
fn max(x: i32, y: i32) -> i32 {
    if x > y { x } else { y }
}

let f: fn(i32, i32) -> i32 = max;
assert_eq!(f(1, 2), 2);
Rust uses the keyword fn to represent function types. A function type is parameterized by the types of its parameters and return value. For example,
// A function type with no parameter or return value
let x: fn();
// A function type with two parameters but no return value
let y: fn(i32, bool);
// A function type with one parameter and one return value
let z: fn(i32) -> i32;
If two functions take the same types of parameters and return value, if any, in the same order, they have the same type.
fn max(x: i32, y: i32) -> i32 {
    if x > y { x } else { y }
}

fn min(x: i32, y: i32) -> i32 {
    if x < y { x } else { y }
}

let mut f: fn(i32, i32) -> i32 = max;
f = min;
assert_eq!(f(1, 2), 1);
Internally, a variable of a function type contains a pointer to the function's entry point.
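Because a variable of a function type is just a pointer, functions with the same signature can be stored together in a table and called through it. The sketch below illustrates this; `apply_all` and `fn_pointer_size` are hypothetical helper names, and the size comparison assumes a platform where function pointers and data pointers have the same size, which holds for the platforms Rust currently supports.

```rust
use std::mem::size_of;

fn max(x: i32, y: i32) -> i32 {
    if x > y { x } else { y }
}

fn min(x: i32, y: i32) -> i32 {
    if x < y { x } else { y }
}

/// Hypothetical helper: calls every function in `table` with the same arguments.
fn apply_all(table: &[fn(i32, i32) -> i32], x: i32, y: i32) -> Vec<i32> {
    table.iter().map(|f| f(x, y)).collect()
}

/// Hypothetical helper: size in bytes of a variable of type `fn(i32, i32) -> i32`.
fn fn_pointer_size() -> usize {
    size_of::<fn(i32, i32) -> i32>()
}
```

Calling `apply_all(&[max, min], 1, 2)` invokes both functions through their pointers and collects the results.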
Anonymous functions
When a program assigns a function to a variable and later invokes the function only via the variable, the function name becomes unimportant. Rust allows a program to directly assign a function to a variable without naming the function. This is called an anonymous function.
let a = |x: i32, y| if x > y { x } else { y };
assert_eq!(a(1, 2), 2);

let b = |x: i32, y| if x < y { x } else { y };
assert_eq!(b(1, 2), 1);
A program defines an anonymous function slightly differently than a named function. The syntax is |parameter_list| body. The parameter list contains zero or more comma-separated parameter names. Unlike in named functions, the program may omit the types of parameters and return value in anonymous functions. The body contains one expression. If it needs to contain multiple statements, enclose them in curly braces (i.e., construct a block expression).
The caller calls anonymous functions in the same way as it calls named functions.
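The syntax variants described above can be sketched as follows. The function name `closure_syntax_demo` and the closures inside it are hypothetical, chosen only for illustration.

```rust
/// Hypothetical demo: builds three anonymous functions and calls each one.
fn closure_syntax_demo() -> (i32, i32, i32) {
    // Empty parameter list; the body is a single expression.
    let answer = || 42;

    // Parameter and return types written out explicitly.
    // When a return type is given, the body must be a block.
    let double = |x: i32| -> i32 { x * 2 };

    // A body with several statements is enclosed in curly braces.
    let sum_of_squares = |x: i32, y: i32| {
        let xx = x * x;
        let yy = y * y;
        xx + yy
    };

    // Anonymous functions are called exactly like named functions.
    (answer(), double(4), sum_of_squares(3, 4))
}
```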
A notable difference between anonymous and named functions is type equality. Two named functions are of the same type if they contain the same types of parameters and return value in the same order. By contrast, each anonymous function has a unique type. In other words, no two anonymous functions have the same type, even if they have the same types of parameters and return value.
let mut f = |x: i32, y: i32| if x > y { x } else { y };
// Doesn't compile: the anonymous functions above and below have different types.
f = |x: i32, y: i32| if x < y { x } else { y };
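One way around the unique-type restriction, for anonymous functions that use no variables defined outside them, is to declare the variable with a function type: such anonymous functions can be converted to plain function pointers. A minimal sketch, where `pick` is a hypothetical name:

```rust
/// Hypothetical helper: returns a "max" function when `larger` is true,
/// otherwise a "min" function.
fn pick(larger: bool) -> fn(i32, i32) -> i32 {
    // The explicit `fn` type converts the anonymous function to a function pointer.
    let mut f: fn(i32, i32) -> i32 = |x: i32, y: i32| if x > y { x } else { y };
    if !larger {
        // Allowed: both anonymous functions are converted to the same `fn` type.
        f = |x: i32, y: i32| if x < y { x } else { y };
    }
    f
}
```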
Closures
Anonymous functions are not only simpler to write than named functions but also more powerful. They can access the variables defined outside them.
let a = 1;
let f = |x| x + a;
assert_eq!(f(2), 3);
In the above example, the anonymous function f uses the variable a defined outside it. We call the variables defined outside f and accessible to f the environment of f. We say f captures its environment, or closes over its environment. To emphasize this property, Rust calls anonymous functions closures.
You may wonder why we need closures, because the above example seems contrived. Closures are part of the foundation for Rust's multi-threaded programming paradigm. In this paradigm, the program creates a closure and executes it in a new thread. Since a closure captures its environment, the child thread can use variables defined in the parent thread. For more information, see concurrency.
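As a minimal sketch of this paradigm, the function below (the name `sum_in_child_thread` is hypothetical) hands a closure to `std::thread::spawn`. The closure captures `numbers` from the parent's code, and the `move` keyword transfers ownership of it to the child thread.

```rust
use std::thread;

/// Hypothetical helper: sums `numbers` in a child thread and returns the result.
fn sum_in_child_thread(numbers: Vec<i32>) -> i32 {
    // The closure captures `numbers`, which was defined outside it.
    let handle = thread::spawn(move || numbers.iter().sum::<i32>());
    // Wait for the child thread and retrieve the closure's return value.
    handle.join().expect("child thread panicked")
}
```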
Capture environment
A closure captures its environment, but there are three different ways by which a closure may use a variable in its environment:
- Borrow by shared reference
- Borrow by mutable reference
- Move the value into the closure
The first option gives the code outside the closure, the enclosing code, the maximum flexibility, because the enclosing code can still read the variable. The second option gives less flexibility, because while the closure is still in use, the enclosing code cannot access the variable (the closure holds a mutable reference to it); once the closure is no longer used, the enclosing code can use the variable again. The last option gives the enclosing code no flexibility, because the code can no longer access the variable after the value is moved into the closure.
The Rust compiler automatically selects the most flexible way of capture, in the order of the three options above, that satisfies the requirement of the closure.
let mut a = vec![1, 2];
let mut b = vec![3, 4];
let mut c = vec![5, 6];

let x = || {
    // Takes `a` by shared reference
    assert_eq!(a[0], 1);
    // Takes `b` by mutable reference, because a shared reference wouldn't work
    b[0] = 1;
    // Moves `c` into the closure, because neither shared nor mutable reference
    // would work
    let d = c;
};

// OK: `a` is only borrowed by shared reference
assert_eq!(a[0], 1);
// Compiler error: `b` is mutably borrowed by `x`, which is still used below.
assert_eq!(b[0], 3);
// Compiler error: `c` is moved.
assert_eq!(c[0], 5);

x();
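The effect of a mutable capture on the enclosing code can also be seen in a version that compiles. In the sketch below, `count_calls`, `counter`, and `bump` are hypothetical names, and the code assumes a current compiler, where the mutable borrow lasts only until the last use of the closure.

```rust
/// Hypothetical demo: a closure that mutates a captured variable.
fn count_calls() -> (i32, i32) {
    let mut counter = 0;

    // `bump` captures `counter` by mutable reference because it modifies it.
    // The closure itself must be declared `mut` to be called.
    let mut bump = || {
        counter += 1;
        counter
    };

    bump();
    let last = bump();

    // `bump` is not used past this point, so the enclosing code
    // may access `counter` again.
    (last, counter)
}
```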
When a program sends a closure to a thread, it needs to capture the environment by moving all the values used in the closure into the closure, no matter how the closure uses them. See concurrency for a more detailed explanation. To force the compiler to capture the environment by moving values only (not by taking shared or mutable references), use the move keyword in front of the closure definition.
let mut a = vec![1, 2];
let mut b = vec![3, 4];
let mut c = vec![5, 6];

let x = move || {
    // Moves `a`, `b`, and `c` into the closure because of the `move` keyword
    assert_eq!(a[0], 1);
    b[0] = 1;
    let d = c;
};

x();

// Compiler error: `a`, `b`, and `c` are moved.
assert_eq!(a[0], 1);
assert_eq!(b[0], 3);
assert_eq!(c[0], 5);
Internal implementation
Internally, the Rust compiler transforms each closure into a struct that implements one of the three traits. Each trait declares a method that is parameterized by the types of its parameters and return value.
trait Fn<Args> : FnMut<Args> {
    extern "rust-call" fn call(&self, args: Args) -> Self::Output;
}

trait FnMut<Args> : FnOnce<Args> {
    extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}

trait FnOnce<Args> {
    type Output;
    extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}
For each closure, the compiler
- defines an anonymous struct to represent the type of the closure. The struct contains a field for each captured variable;
- creates a value of the struct that captures the variables in the environment;
- implements one of the three above traits on the struct.
For example, Rust transforms this program
let mut a = vec![1, 2];
let mut b = vec![3, 4];
let mut c = vec![5, 6];

let x = || {
    // Takes `a` by shared reference
    assert_eq!(a[0], 1);
    // Takes `b` by mutable reference, because a shared reference wouldn't work
    b[0] = 1;
    // Moves `c` into the closure, because neither shared nor mutable reference
    // would work
    let d = c;
};

x();
to
let mut a = vec![1, 2];
let mut b = vec![3, 4];
let mut c = vec![5, 6];

// Defines a struct to represent the type of the closure
struct Closure1 {
    a: &Vec<i32>,
    b: &mut Vec<i32>,
    c: Vec<i32>,
}

// Creates a value to capture the variables in the environment
let x = Closure1 { a: &a, b: &mut b, c: c };

// Implements a trait.
// This is for illustration purposes only; it is simplified and does not compile as written.
impl FnOnce for Closure1 {
    fn call_once(self) {
        assert_eq!(self.a[0], 1);
        self.b[0] = 1;
        let d = self.c;
    }
}

// Calls the closure
x.call_once();
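The transformation above is schematic. A version that compiles on stable Rust might look like the sketch below, under two assumptions that are not part of the original illustration: a hypothetical trait `CallOnce` stands in for the standard `FnOnce` (which stable code cannot implement by hand), and the struct carries an explicit lifetime parameter for its references. `run_desugared` is likewise a hypothetical name.

```rust
/// Hypothetical stand-in for the standard `FnOnce` trait.
trait CallOnce {
    fn call_once(self);
}

/// Plays the role of the compiler-generated closure type.
struct Closure1<'a> {
    a: &'a Vec<i32>,     // captured by shared reference
    b: &'a mut Vec<i32>, // captured by mutable reference
    c: Vec<i32>,         // captured by move
}

impl<'a> CallOnce for Closure1<'a> {
    // Takes `self` by value, so the "closure" can be called only once.
    fn call_once(self) {
        assert_eq!(self.a[0], 1);
        self.b[0] = 1;
        let _d = self.c;
    }
}

/// Builds the struct, calls it, and returns `a` and `b` for inspection.
fn run_desugared() -> (Vec<i32>, Vec<i32>) {
    let a = vec![1, 2];
    let mut b = vec![3, 4];
    let c = vec![5, 6];

    // Capture the environment, then call the "closure".
    let x = Closure1 { a: &a, b: &mut b, c };
    x.call_once();

    // `x` has been consumed, so its borrows of `a` and `b` have ended.
    (a, b)
}
```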
Which trait to implement?
For each closure, the compiler selects one trait from Fn, FnMut, and FnOnce to implement. The compiler selects a trait that provides the most flexibility to the code outside the closure.
let a = vec![1, 2];
// The compiler implements the `Fn` trait, because the closure needs only a
// shared reference to `a`.
let x = || assert_eq!(a[0], 1);

let mut b = vec![1, 2];
// The compiler implements the `FnMut` trait, because the closure needs
// a mutable reference to `b`.
let y = || b[0] = 2;

let c = vec![1, 2];
// The compiler implements the `FnOnce` trait, because the closure needs
// to move `c`.
let z = || {
    let d = c;
};
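The trait a closure implements determines which functions accept it. The sketch below uses three hypothetical helper functions, one per trait bound, and passes each a closure of the matching kind.

```rust
/// Hypothetical helper: accepts closures that only read their environment.
fn call_fn<F: Fn() -> i32>(f: F) -> i32 {
    f() + f()
}

/// Hypothetical helper: accepts closures that may mutate their environment.
fn call_fn_mut<F: FnMut() -> i32>(mut f: F) -> i32 {
    f() + f()
}

/// Hypothetical helper: accepts closures that may be callable only once.
fn call_fn_once<F: FnOnce() -> Vec<i32>>(f: F) -> Vec<i32> {
    f()
}

fn trait_selection_demo() -> (i32, i32, Vec<i32>) {
    let a = vec![1, 2];
    // Reads `a` through a shared reference: the closure implements `Fn`.
    let reads = call_fn(|| a[0]);

    let mut b = vec![1, 2];
    // Mutates `b`: the closure implements `FnMut` but not `Fn`.
    let writes = call_fn_mut(|| {
        b[0] += 1;
        b[0]
    });

    let c = vec![1, 2];
    // Moves `c` out of the closure: the closure implements only `FnOnce`.
    let moved = call_fn_once(|| c);

    (reads, writes, moved)
}
```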
Inheritance between traits
Note the inheritance between the three traits.
trait Fn<Args> : FnMut<Args> {
    extern "rust-call" fn call(&self, args: Args) -> Self::Output;
}

trait FnMut<Args> : FnOnce<Args> {
    extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output;
}

trait FnOnce<Args> {
    type Output;
    extern "rust-call" fn call_once(self, args: Args) -> Self::Output;
}
This makes sense. If a closure implements Fn, it can certainly implement FnMut as well, as follows:
impl FnMut<Args> for _ {
    fn call_mut(&mut self, args: Args) -> Self::Output {
        // A mutable reference can always be used where a shared one suffices.
        self.call(args)
    }
}
Similarly, if a closure implements FnMut, it can certainly implement FnOnce as follows:
impl FnOnce<Args> for _ {
    fn call_once(mut self, args: Args) -> Self::Output {
        // An owned value can always be borrowed mutably for one call.
        self.call_mut(args)
    }
}
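A practical consequence of this inheritance is that a closure implementing Fn is accepted wherever FnMut or FnOnce is required. A short sketch, in which `once`, `twice_mut`, and `supertrait_demo` are hypothetical names:

```rust
/// Hypothetical helper: requires only `FnOnce`.
fn once<F: FnOnce(i32) -> i32>(f: F, i: i32) -> i32 {
    f(i)
}

/// Hypothetical helper: requires `FnMut` and calls the closure twice.
fn twice_mut<F: FnMut(i32) -> i32>(mut f: F, i: i32) -> i32 {
    f(i) + f(i)
}

fn supertrait_demo() -> (i32, i32) {
    let a = 1;
    // Only reads `a`, so the closure implements `Fn`.
    let add_a = |x: i32| x + a;

    // A reference to an `Fn` closure satisfies the `FnMut` bound.
    let doubled = twice_mut(&add_a, 1);
    // The closure itself satisfies the `FnOnce` bound.
    let single = once(add_a, 2);

    (doubled, single)
}
```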
Closure type
Each closure has a unique anonymous type, which means that a program cannot write out the name of a closure's type in a declaration.
On the other hand, each closure implements a trait, so it can be used anywhere a value implementing the trait is accepted.
Pass closures into functions
In the example below, the compiler monomorphizes the function foo based on the type parameter F. Since each closure has a unique type, the compiler monomorphizes (makes a specialized copy of) foo once for each distinct closure type passed to it. In this example, foo is monomorphized twice.
fn foo<F: Fn(i32) -> i32>(f: F, i: i32) -> i32 {
    f(i)
}

let a = 1;
let b = |x| a + x;
let c = |x| a + x;
// Monomorphizes `foo`
foo(b, 1);
// Monomorphizes `foo` again
foo(c, 2);
Alternatively, the function can take a closure as a trait object. In the example below, the compiler keeps only one copy of the function foo.
// Trait object types are written with the `dyn` keyword.
fn foo(f: &dyn Fn(i32) -> i32, i: i32) -> i32 {
    f(i)
}

let a = 1;
let b = |x| a + x;
let c = |x| a + x;
// Doesn't monomorphize
foo(&b, 1);
foo(&c, 2);
We can view named functions as special closures in that they capture no variable in the environment. Therefore, a program may provide a named function as an argument to a function parameter expecting a closure, as long as the named function and the closure have the same types of parameters and return value in the same order.
fn foo<F: Fn(i32) -> i32>(f: F, i: i32) -> i32 {
    f(i)
}

fn f(x: i32) -> i32 {
    x + 1
}

assert_eq!(foo(f, 1), 2);

// Trait object types are written with the `dyn` keyword.
fn bar(f: &dyn Fn(i32) -> i32, i: i32) -> i32 {
    f(i)
}

fn g(x: i32) -> i32 {
    x + 1
}

assert_eq!(bar(&g, 1), 2);
Returning closures
Returning a closure presents a challenge to functions. The first problem is that the size of the closure is unknown to the caller at compile time, but the caller needs to know the size of the returned closure.
A common way to circumvent the problem of statically unknown sizes is to use pointers, since all pointers have the same size. In the case of traits, we could use trait objects, which are pointers. But trait objects present another problem. In the program below, the closure is created on the stack of the function f, so its lifetime is that of f, but f returns a pointer to the closure. Since the pointer outlives the closure, this violates Rust's borrow checker.
// Doesn't compile: the returned reference would outlive the closure.
fn f() -> &dyn Fn(i32) -> i32 {
    let a = 1;
    &|x| x + a
}
The solution is to place the closure on the heap instead. In the example below, f creates the closure in a Box, whose value is placed on the heap.
// Trait object types are written with the `dyn` keyword.
fn f() -> Box<dyn Fn(i32) -> i32> {
    let a = 1;
    Box::new(move |x| x + a)
}

let a = f();
assert_eq!(a(2), 3);
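The boxed approach also works when the closure captures a function parameter. In addition, current Rust versions accept `impl Fn(...)` as a return type, which lets the compiler determine the concrete closure type and size without a heap allocation. The sketch below shows both; `make_boxed_adder` and `make_adder` are hypothetical names, and the `impl Trait` variant is an alternative that the text above does not cover.

```rust
/// Hypothetical helper: returns a heap-allocated closure as a trait object.
fn make_boxed_adder(a: i32) -> Box<dyn Fn(i32) -> i32> {
    // `move` copies `a` into the closure so that it outlives this function.
    Box::new(move |x: i32| x + a)
}

/// Hypothetical helper: returns the closure by value. The caller only knows
/// that the result implements `Fn(i32) -> i32`.
fn make_adder(a: i32) -> impl Fn(i32) -> i32 {
    move |x: i32| x + a
}
```

The boxed form is still needed when a function may return one of several different closures, because each closure has its own type and `impl Fn` stands for a single concrete type.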