Enumerations and Pattern Matching
Motivation: error reporting
A robust program should handle errors gracefully. For example, a function for computing the square root should indicate an error when the input is negative. But how to notify the caller of this error?
The simplest approach is to reserve a special return value for error. For example:
    fn sqrt(x: f64) -> f64 {
        if x < 0.0 {
            -1.0
        } else {
            // Ignore the actual computation of the square root of `x`
            x
        }
    }
But this approach has two shortcomings. First, what if there is no such special value, i.e., all values of the return type are potentially valid? For example, consider a function that reads a file and returns its content as a byte array. Since any byte array could be the content of some file, no byte array value is left over to indicate an error. Second, the compiler cannot force the caller to check for errors, and an ignored error may lead to disaster.
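The second shortcoming can be sketched as follows. The caller `doubled_sqrt` is a hypothetical name introduced only for this illustration, and the sketch uses the standard `f64::sqrt` method for the actual computation. The caller never checks for the special value, yet the program compiles and the error value silently flows into the result.

```rust
// Sentinel-based square root: -1.0 stands for "invalid input".
fn sqrt(x: f64) -> f64 {
    if x < 0.0 {
        -1.0
    } else {
        // Standard library method on `f64`.
        x.sqrt()
    }
}

// Hypothetical caller that forgets to check for the sentinel.
fn doubled_sqrt(x: f64) -> f64 {
    // Nothing forces a check here, so -1.0 is treated as a normal result.
    2.0 * sqrt(x)
}

fn main() {
    println!("{}", doubled_sqrt(4.0));
    // The error value is silently used in the computation.
    println!("{}", doubled_sqrt(-4.0));
}
```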
Some languages introduce exceptions to report errors. For example:
    def sqrt(x):
        if x < 0:
            raise Exception
        ...
This approach also has shortcomings. First, the compiler cannot force the caller to check for errors. Second, even if the caller wishes to check for errors, it would be difficult to find all the potential exceptions that may be raised in the function.
Rust doesn't provide exceptions. Instead, Rust provides the enumeration type, which can handle errors and much more.
Enums
An enumeration type, `enum`, contains several variants. Each variant has one of three forms:
- a name only.
- a name and a sequence of values.
- a name and a set of (name: value) pairs.
    enum Object {
        // This variant has only a name.
        Foo,
        // This variant has a name and a sequence of values.
        Bar(i32, bool),
        // This variant has a name and a set of (name: value) pairs.
        Baz { x: i32, y: bool },
    }

    fn main() {
        let x = Object::Foo;
        let y = Object::Bar(1, true);
        let z = Object::Baz { x: 1, y: true };
    }
When a variant has a name and a sequence of values (the second form above), it may be used as a function whose parameters are the values in the variant and whose return type is this enum.
    #[derive(Debug)]
    enum Object {
        // This variant has only a name.
        Foo,
        // This variant has a name and a sequence of types.
        Bar(i32, bool),
        // This variant has a name and a set of (name: types) pairs.
        Baz { x: i32, y: bool },
    }

    fn main() {
        let f: fn(i32, bool) -> Object = Object::Bar;
        println!("{:?}", f(1, true));
    }
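Because such a variant can be used as a function, it can also be passed wherever a function is expected. The following is a minimal sketch under that assumption. The enum `Token` and the function `tokens` are hypothetical names made up for this illustration.

```rust
// Hypothetical enum for illustration only.
#[derive(Debug, PartialEq)]
enum Token {
    Number(i32),
    End,
}

fn tokens(values: Vec<i32>) -> Vec<Token> {
    // `Token::Number` is used as a function from `i32` to `Token`,
    // so it can be handed to `map` directly.
    let mut out: Vec<Token> = values.into_iter().map(Token::Number).collect();
    out.push(Token::End);
    out
}

fn main() {
    println!("{:?}", tokens(vec![1, 2]));
}
```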
match expression
An enum type has one or more variants, but an enum variable contains one and only one of those variants. To find which variant an enum variable contains and to extract the values in the variant, if any, use the `match` expression.
A `match` expression contains a head expression (the expression after the `match` keyword) and a sequence of branches. A branch has the form `pattern => expression`. `match` examines the patterns from top to bottom. As soon as a pattern matches the head expression, the expression on that branch becomes the value of this `match` expression.
    #[derive(Debug)]
    enum Color {
        Red,
        Yellow,
        Green,
    }

    enum Fruit {
        Orange,
        // The `bool` value represents if the banana is ripe.
        Banana(bool),
        Apple { color: Color },
    }

    fn print(x: Fruit) {
        match x {
            Fruit::Orange => println!("Orange"),
            // The variable `ripe` is bound to the value in this variant.
            Fruit::Banana(ripe) => println!("Banana is {}", if ripe { "ripe" } else { "raw" }),
            // The variable `color` is bound to the value named `color` in this variant.
            Fruit::Apple { color } => println!("Apple is {:?}", color),
        }
    }
The above example illustrates several features of the `match` expression.
- If a pattern is a variant of an enum, it must be qualified by the name of the enum (unless the variant has been imported with `use`), e.g., use `Fruit::Orange` instead of `Orange`.
- If a variant has data, the program may capture the data by pattern matching. E.g., if `x` is `Fruit::Apple`, the program may capture the value of `color` in the variant by the pattern `Fruit::Apple{ color }`, which binds a new name `color` to the value. Note that, as described in the discussion of scope, rebinding a name in a nested scope shadows the same name in the parent scope.
- The patterns must be exhaustive, i.e., they must cover all the variants in the enum.
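The exhaustiveness requirement can be sketched as follows. The function `describe` is a hypothetical helper for this illustration. It returns a string instead of printing, which also shows that every branch must produce a value of the same type.

```rust
#[allow(dead_code)]
#[derive(Debug)]
enum Color {
    Red,
    Yellow,
    Green,
}

enum Fruit {
    Orange,
    Banana(bool),
    Apple { color: Color },
}

// Hypothetical helper: one branch per variant of `Fruit`.
fn describe(x: Fruit) -> String {
    match x {
        Fruit::Orange => String::from("Orange"),
        Fruit::Banana(ripe) => format!("Banana (ripe: {})", ripe),
        // Removing this branch makes the compiler reject the `match`
        // because `Fruit::Apple` would no longer be covered.
        Fruit::Apple { color } => format!("Apple ({:?})", color),
    }
}

fn main() {
    println!("{}", describe(Fruit::Orange));
    println!("{}", describe(Fruit::Banana(true)));
    println!("{}", describe(Fruit::Apple { color: Color::Green }));
}
```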
Matching other types
The `match` expression can be used on types other than enum.
    fn f(x: i32) {
        println!("{}", match x {
            0 => "zero",
            1 => "one",
            2 | 3 | 4 => "two, three, four",
            5..=9 => "five -- nine",
            _ => "other",
        });
    }
This example illustrates that a pattern may use
- `|` to specify multiple values.
- `..=` to specify an inclusive range of values. (The older `...` range syntax is no longer accepted in current editions of Rust.)
- `_` to specify the rest of the values. Since `match` examines the patterns from top to bottom, `_` must appear as the last pattern.
Because the expression of the matched branch becomes the value of the `match` expression, the above `match` expression evaluates to a string value.
Advanced bindings
When matching an enum variant with data, a program may use `..` to ignore some data.
    enum Shape {
        Circle { x: i32, y: i32, r: i32 },
        Rectangle { x: i32, y: i32, w: i32, h: i32 },
    }

    fn main() {
        let s = Shape::Circle { x: 0, y: 0, r: 10 };
        println!("{}", match s {
            // Bind new names `a`, `b`, `c`
            Shape::Circle { x: a, y: b, r: c } => format!("Circle {} {} {}", a, b, c),
            // Ignore all the fields except `x` and `y`
            Shape::Rectangle { x, y, .. } => format!("Rectangle {} {}", x, y),
        });
    }
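Data can be ignored in variants with positional values as well. The sketch below assumes a hypothetical variation of `Shape` whose variants hold a sequence of values rather than named fields. `_` ignores a single value and `..` ignores all the remaining ones.

```rust
// Hypothetical positional layout: Circle(x, y, r), Rectangle(x, y, w, h).
enum Shape {
    Circle(i32, i32, i32),
    Rectangle(i32, i32, i32, i32),
}

fn summary(s: Shape) -> String {
    match s {
        // `_` ignores one positional value at a time.
        Shape::Circle(_, _, r) => format!("radius {}", r),
        // `..` ignores every value after `x` and `y`.
        Shape::Rectangle(x, y, ..) => format!("corner {} {}", x, y),
    }
}

fn main() {
    println!("{}", summary(Shape::Circle(0, 0, 10)));
    println!("{}", summary(Shape::Rectangle(1, 2, 3, 4)));
}
```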
When binding a name, the program moves the value, which may sometimes be undesirable.
    #[derive(Debug)]
    enum Color {
        Red,
        Yellow,
        Green,
    }

    #[derive(Debug)]
    enum Fruit {
        Orange,
        // The `bool` value represents if the banana is ripe.
        Banana(bool),
        Apple { color: Color },
    }

    fn print(x: Fruit) {
        match x {
            // `color` in `x` is moved into the new variable `color`.
            Fruit::Apple { color } => println!("Apple is {:?}", color),
            _ => (),
        }
        // Error: x may have been partially moved because `color` in `Fruit::Apple` may have been moved.
        println!("{:?}", x);
    }
To avoid moving values in variants, use `ref` or `ref mut` in the pattern.
    #[derive(Debug)]
    enum Color {
        Red,
        Yellow,
        Green,
    }

    #[derive(Debug)]
    enum Fruit {
        Orange,
        // The `bool` value represents if the banana is ripe.
        Banana(bool),
        Apple { color: Color },
    }

    fn print(x: Fruit) {
        match x {
            // The name `color` is a reference to `color` in `Fruit::Apple{ color }`
            // so no move of values here.
            Fruit::Apple { ref color } =>
                // No need to dereference `color`: a reference to a `Debug` value
                // can itself be printed with `{:?}`.
                println!("Apple is {:?}", color),
            _ => (),
        }
        // OK
        println!("{:?}", x);
    }
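A `ref mut` binding works the same way but allows the program to modify the data in place. The following is a minimal sketch. The function `ripen` is a hypothetical name, and the enum is trimmed to two variants to keep the example short.

```rust
#[derive(Debug, PartialEq)]
enum Fruit {
    Orange,
    // The `bool` value represents if the banana is ripe.
    Banana(bool),
}

// Hypothetical helper: mark a banana as ripe, leave other fruit unchanged.
fn ripen(mut x: Fruit) -> Fruit {
    match x {
        // `ripe` is a mutable reference to the `bool` stored inside `x`.
        Fruit::Banana(ref mut ripe) => *ripe = true,
        _ => (),
    }
    // `x` was only borrowed by the pattern, so it can still be returned.
    x
}

fn main() {
    println!("{:?}", ripen(Fruit::Banana(false)));
    println!("{:?}", ripen(Fruit::Orange));
}
```

In current Rust, matching on `&mut x` with the plain pattern `Fruit::Banana(ripe)` produces the same kind of mutable reference binding, so explicit `ref mut` is needed less often than it once was.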
You may use a match guard (an `if` condition after the pattern) to conditionally match patterns.
    #[derive(Debug)]
    enum Color {
        Red,
        Yellow,
        Green,
    }

    #[derive(Debug)]
    enum Fruit {
        Orange,
        // The `bool` value represents if the banana is ripe.
        Banana(bool),
        Apple { color: Color },
    }

    fn print(x: Fruit) {
        match x {
            Fruit::Banana(ripe) if ripe => println!("Ripe banana"),
            // Alternatively, `Fruit::Banana(ripe) if !ripe => println!("Raw Banana"),`
            Fruit::Banana(_) => println!("Raw Banana"),
            _ => (),
        }
        // OK: `bool` is `Copy`, so binding `ripe` does not move anything out of `x`.
        println!("{:?}", x);
    }
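A guard may test any condition on the bound values, not just a `bool` field. The sketch below uses a hypothetical enum `Reading` and function `describe`, and the thresholds are arbitrary. Note that the compiler does not take guards into account when checking exhaustiveness, so an unguarded branch for the variant is still required.

```rust
// Hypothetical enum for illustration only.
enum Reading {
    Celsius(i32),
    Missing,
}

fn describe(r: Reading) -> &'static str {
    match r {
        Reading::Celsius(t) if t < 0 => "freezing",
        Reading::Celsius(t) if t > 30 => "hot",
        // Without this unguarded branch the `match` would be rejected
        // as non-exhaustive, even though the guards look complete to a reader.
        Reading::Celsius(_) => "mild",
        Reading::Missing => "no data",
    }
}

fn main() {
    println!("{}", describe(Reading::Celsius(-5)));
    println!("{}", describe(Reading::Celsius(20)));
    println!("{}", describe(Reading::Missing));
}
```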
Error handling --- a first step
Let's revisit our initial motivation for the `enum` type: error handling. To differentiate normal return values from errors, let's define the following enum:
    enum Option {
        Some(f64),
        None,
    }

    fn sqrt(x: f64) -> Option {
        if x < 0.0 {
            Option::None
        } else {
            // Ignore the actual computation of the square root of `x`
            Option::Some(x)
        }
    }

    fn main() {
        match sqrt(-1.0) {
            Option::Some(v) => println!("{}", v),
            Option::None => println!("Invalid argument"),
        }
    }
Using `Option` as the return type has these advantages:
- It unambiguously differentiates normal return values from errors.
- The caller cannot forget to check for errors, because the return type is `Option` instead of `f64`. The caller extracts the value in `Option` by `match`.
`Option` is so useful that it is defined in Rust's standard library. During testing, the program is sometimes just interested in the value in `Some` and doesn't care about `None`. In this case, the program can extract the value in `Some` by `unwrap()`:
    // `Option<f64>` is a generic enum, to be discussed in later chapters.
    fn sqrt(x: f64) -> Option<f64> {
        if x < 0.0 {
            None
        } else {
            // Ignore the actual computation of the square root of `x`
            Some(x)
        }
    }

    fn main() {
        println!("{}", sqrt(1.0).unwrap());
    }
When `sqrt()` returns `None`, `unwrap()` panics (the thread crashes). The following shows a simplified implementation of `unwrap()`:
    // Simplified: the standard library version is a generic method on `Option<T>`.
    fn unwrap(x: Option<f64>) -> f64 {
        match x {
            Some(v) => v,
            None => panic!("..."),
        }
    }
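When a panic is not acceptable, the standard library's `Option` offers other ways to get at the value. The following sketch uses two of them, `unwrap_or` and `expect`. The wrapper `sqrt_or_zero` is a hypothetical name, and falling back to `0.0` is an arbitrary choice made for the illustration.

```rust
fn sqrt(x: f64) -> Option<f64> {
    if x < 0.0 {
        None
    } else {
        // Standard library method on `f64`.
        Some(x.sqrt())
    }
}

// Hypothetical wrapper: substitute a default instead of panicking.
fn sqrt_or_zero(x: f64) -> f64 {
    // `unwrap_or` returns the value in `Some`, or the given default for `None`.
    sqrt(x).unwrap_or(0.0)
}

fn main() {
    println!("{}", sqrt_or_zero(9.0));
    println!("{}", sqrt_or_zero(-9.0));
    // `expect` behaves like `unwrap` but panics with the given message.
    println!("{}", sqrt(16.0).expect("input should be non-negative"));
}
```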
if let
If a program is interested in only one variant of an `enum` and wishes to test if a variable has this variant, it can use `if let`:
    #[derive(Debug)]
    enum Color {
        Red,
        Yellow,
        Green,
    }

    #[derive(Debug)]
    enum Fruit {
        Orange,
        // The `bool` value represents if the banana is ripe.
        Banana(bool),
        Apple { color: Color },
    }

    fn print(x: Fruit) {
        // Matches only a banana whose `bool` value is `true`.
        if let Fruit::Banana(true) = x {
            println!("ripe banana");
        }
    }
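An `if let` may also bind the data in the variant and may carry an `else` branch for every other case. The following is a small sketch. The function `banana_status` is a hypothetical name, and the enum is trimmed to two variants for brevity.

```rust
enum Fruit {
    Orange,
    // The `bool` value represents if the banana is ripe.
    Banana(bool),
}

// Hypothetical helper for illustration only.
fn banana_status(x: Fruit) -> &'static str {
    // `ripe` is bound only inside the first block.
    if let Fruit::Banana(ripe) = x {
        if ripe { "ripe banana" } else { "raw banana" }
    } else {
        // Reached for every variant other than `Fruit::Banana`.
        "not a banana"
    }
}

fn main() {
    println!("{}", banana_status(Fruit::Banana(true)));
    println!("{}", banana_status(Fruit::Orange));
}
```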
You should use `if let` sparingly. Use it only when you are interested in only one variant and don't care about the others. Otherwise, use `match` instead, because `match` enforces exhaustive matching to prevent errors, whereas `if let` doesn't. The following is an anti-pattern for Rust:
    enum Color {
        Red,
        Yellow,
        Green,
    }

    fn f(x: Color) {
        // Anti-pattern
        if let Color::Red = x {
            println!("red");
        } else if let Color::Yellow = x {
            println!("yellow");
        }
        // Oops: forgot to match `Color::Green`
    }
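For comparison, here is a sketch of the same logic written with `match`. The function `name` is a hypothetical helper that returns the color name instead of printing it. With `match`, forgetting a variant is a compile-time error rather than a silent omission.

```rust
enum Color {
    Red,
    Yellow,
    Green,
}

// Hypothetical helper: one branch per variant.
fn name(x: Color) -> &'static str {
    match x {
        Color::Red => "red",
        Color::Yellow => "yellow",
        // Deleting this branch makes the compiler report that
        // `Color::Green` is not covered.
        Color::Green => "green",
    }
}

fn main() {
    println!("{}", name(Color::Red));
    println!("{}", name(Color::Yellow));
    println!("{}", name(Color::Green));
}
```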