Iterators
The power of for
Rust's for
loop is safer, more powerful, and more efficient than C's. For example, let's iterate over an array. To write the program in C style:
# #![allow(unused_variables)] #fn main() { let v = vec![1, 2, 3]; for i in 0..v.len() { println!("{}", v[i]); } #}
The above is an anti-pattern in Rust. In idiomatic Rust, it becomes:
# #![allow(unused_variables)] #fn main() { let v = vec![1, 2, 3]; for x in v { println!("{:?}", x); } #}
The advantages of the second, idiomatic version are:
-
correctness: since the code inside the
for
loop does not access the indexi
directly, it cannot access an element not inv
, e.g.,v[3]
. -
efficiency: since the code inside the
for
loop does not access the indexi
directly, the compiler need not generate code for verifying thati
is in the bound of the vector. -
flexibility: the order in which
for
iterates over the elements inv
is transparent to the code inside the loop, so the inside code works with any types of collections, be it a vector, tree, hash table, etc.
Create iterators
Iterator traits
Rust allows a program to iterate over any type T
in a for
loop, as long as T
implements the Iterator
trait.
# #![allow(unused_variables)] #fn main() { trait Iterator { type Item; fn next(&mut self) -> Option<Self::Item>; } trait IntoIterator where Self::IntoIter::Item == Self::Item { type Item; type IntoIter: Iterator; fn into_iter(self) -> Self::IntoIter; } impl<I> IntoIterator for I where I: Iterator #}
The above looks complicated, so let's dissect them one by one.
-
The first trait,
Iterator
, defines the interface for an iterator. When a caller invokesnext()
, the iterator returnsSome(_)
if it has more items, orNone
if it has exhausted all its items. -
The second trait,
IntoIterator
, defines the interface for creating anIterator
. -
The last line is interesting. It instructs Rust to automatically implement
IntoIterator
for any type that implementsIterator
.
Desugar for
loop
Rust's for
loop syntax is sugar for iterators. A program may use a for
loop on any type that implements IntoIterator
. The statement for i in v { statement }
desugars approximately to
let mut iter = IntoIterator::into_iter(v);
loop {
match iter.next() {
Some(x) => { statement },
None => break,
}
}
Create iterators
If you wish to allow your type T
to be usable in a for
loop, you have two options:
impl Iterator for T
, orimpl IntoIterator for T
Which one should you choose?
In the first option, you must implement the method next()
for T
, which requires T
to contain some state that tracks which elements have been returned and which element to return next.
In the second option, T
creates an iterator and returns it. It is the responsibility of this iterator to maintain some state to track which element to return next.
The following two programs contrast these two options.
# #![allow(unused_variables)] #fn main() { struct Counter { max: i32, // `count` tracks the state of this iterator. count: i32, } impl Counter { fn new(max: i32) -> Counter { Counter { count: -1, max: max } } } impl Iterator for Counter { type Item = i32; fn next(&mut self) -> Option<Self::Item> { self.count += 1; if self.count < self.max { Some(self.count) } else { None } } } for i in Counter::new(10) { println!("{}", i); } #}
# #![allow(unused_variables)] #fn main() { struct Counter { max: i32, // No need to track the state, because this isn't an iterator. } impl Counter { fn new(max: i32) -> Counter { Counter { max: max } } } impl IntoIterator for Counter { type Item = i32; type IntoIter = std::ops::Range<Self::Item>; fn into_iter(self) -> Self::IntoIter { std::ops::Range{ start: 0, end: self.max } } } for i in Counter::new(10) { println!("{}", i); } #}
Range operators
Rust's standard library defines several range operators in std::ops
. For example,
# #![allow(unused_variables)] #fn main() { struct Range<Idx> { pub start: Idx, pub end: Idx, } #}
The compiler treats a..b
as a shorthand for Range{ start: a, end: b }
. For example, the following two lines are equivalent:
for i in 0..n { ... }
for i in std::ops::Range{ start: 0, end: n } { ... }
There are two common uses of iterators.
- Adapters create a new iterator based on a current iterator.
- Consumers consume an iterator to produce another value.
Adapters
An iterator adapter is a function that takes an iterator and returns another iterator. Common adapters are map()
, take()
, and filter()
.
trait Iterator {
type Item;
fn filter<P>(self, predicate: P) -> Filter<Self, P>
where P: FnMut(&Self::Item) -> bool;
fn map<B, F>(self, f: F) -> Map<Self, F>
where F: FnMut(Self::Item) -> B;
fn take(self, n: usize) -> Take<Self>;
}
map()
map()
maps each element in the current iterator to an element in the new iterator. It takes a closure as the parameter. The closure takes one element in the current iterator and returns a value that becomes an element in the new iterator.
# #![allow(unused_variables)] #fn main() { for i in (0..10).map(|x| x * 2) { println!("{}", i); } #}
Iterators are lazy, in the sense that the next()
method is called only when the program iterates over the iterator. For example, the following program prints nothing.
# #![allow(unused_variables)] #fn main() { (0..10).map(|x| println!("{}", x)); #}
take()
take()
creates an iterator that contains the first n
elements of the current iterator.
# #![allow(unused_variables)] #fn main() { for i in (0..10).take(5) { println!("{}", i); } #}
filter()
filter()
creates a new iterator that contains some elements in the current iterator. It takes one parameter, which is a closure that determines whether each element in the current iterator should be included in the new iterator.
# #![allow(unused_variables)] #fn main() { for i in (0..10).filter(|x| x % 2 == 0) { println!("{}", i); } #}
Chaining
Since an adapter returns an iterator, a program can chain adapters.
# #![allow(unused_variables)] #fn main() { for i in (0..10).map(|x| x * 2).take(5).filter(|x| x % 3 == 0) { println!("{}", i); } #}
Consumers
A consumer also takes an iterator, but instead of returning another iterator, it returns a value that is not an iterator.
Common consumers are find()
, collect()
, and fold()
.
trait Iterator {
fn collect<B>(self) -> B where B: FromIterator<Self::Item>;
fn find<P>(&mut self, predicate: P) -> Option<Self::Item> where P: FnMut(&Self::Item) -> bool;
fn fold<B, F>(self, init: B, f: F) -> B where F: FnMut(B, Self::Item) -> B;
}
trait FromIterator<A> {
fn from_iter<T>(iter: T) -> Self where T: IntoIterator<Item=A>;
}
find()
find()
searches for the first element that satisfies a predicate. It returns Some(_)
if it finds it, or None
otherwise. find()
takes a parameter, which is a closure that takes a reference to each element in the iterator and returns true
or false
.
# #![allow(unused_variables)] #fn main() { assert_eq!((0..10).find(|&x| x > 5).unwrap(), 6); assert!((0..10).find(|&x| x > 10).is_none()); #}
collect()
collect()
returns a collection that contains all the elements in the iterator. The receiving type (the type of the returned collection) must implement the FromIterator
trait.
# #![allow(unused_variables)] #fn main() { struct Foo<T> { v: Vec<T>, } impl<A> std::iter::FromIterator<A> for Foo<A> { fn from_iter<T>(iter: T) -> Self where T: IntoIterator<Item=A> { let mut foo = Foo{ v: vec![] }; for i in iter { foo.v.push(i); } foo } } let foo: Foo<_> = (0..10).collect(); // or // let foo = (0..10).collect::<Foo<_>>(); println!("{:?}", foo.v); #}
fold()
fold()
applies a function to each element in the iterator to produce a single, final value. For example, the following program computes the sum of all the values in the array.
# #![allow(unused_variables)] #fn main() { assert_eq!((0..10).fold(0, |sum, x| sum + x), 45); #}
fold()
takes two arguments. The first argument is the initial value of the return value. The second argument is a closure that takes the return value computed so far and the next element to update the return value.
You can implement MapReduce by combining map()
with fold()
. For example, the following computes the squared sum of an array.
# #![allow(unused_variables)] #fn main() { let input = vec![1, 2, 3, 4, 5]; let output = input.iter().map(|x| x * x).fold(0, |x, y| x + y); assert_eq!(output, 55); #}