Rust自定义迭代器与实现Iterator特型

Rust 自定义迭代器与实现 Iterator 特型

在 Rust 编程中，迭代器是一种强大且常用的工具。它提供了一种统一的方式来遍历集合或序列中的元素。Rust 标准库已经为许多内置数据结构提供了迭代器实现，比如 Vec、HashMap 等。然而，在某些情况下，我们需要为自己定义的数据结构创建自定义迭代器，这就涉及到实现 Iterator 特型。

迭代器的基本概念

迭代器是一种能够按顺序逐个产生一系列值的对象。在 Rust 中，迭代器遵循一种拉取（pull - based）模型。这意味着迭代器不会一次性生成所有的值，而是在调用者需要时生成下一个值。这种模型的优点在于它的高效性和灵活性，尤其是对于处理大型数据集或动态生成的数据。

Rust 中的迭代器是通过 Iterator 特型来定义的。这个特型定义了一系列方法，其中最主要的是 next 方法。next 方法每次调用时返回迭代器中的下一个元素，当没有更多元素时返回 None。

实现 Iterator 特型的基础

让我们从一个简单的例子开始，假设我们有一个自定义结构体 Counter，它有一个内部状态 count，我们希望通过迭代器从 0 开始逐个返回数字，直到达到某个上限。

struct Counter {
    count: u32,
    upper_bound: u32,
}

impl Counter {
    fn new(upper_bound: u32) -> Counter {
        Counter {
            count: 0,
            upper_bound,
        }
    }
}

impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < self.upper_bound {
            let result = Some(self.count);
            self.count += 1;
            result
        } else {
            None
        }
    }
}

在上面的代码中：

我们首先定义了 Counter 结构体，它有两个字段：count 用于记录当前计数值，upper_bound 用于指定计数的上限。
接着，我们为 Counter 结构体实现了一个 new 方法，用于创建 Counter 的实例。
最重要的部分是实现 Iterator 特型。我们指定了 Item 类型为 u32，这是迭代器每次返回的元素类型。next 方法检查当前 count 是否小于 upper_bound。如果是，则返回 Some(self.count) 并将 count 加 1；否则返回 None，表示迭代结束。

我们可以这样使用这个自定义迭代器：

fn main() {
    let mut counter = Counter::new(5);
    while let Some(num) = counter.next() {
        println!("{}", num);
    }
}

上述代码创建了一个上限为 5 的 Counter 实例，并使用 while let 循环逐个打印出迭代器返回的值，最终输出 0 到 4。

迭代器方法链

Rust 的迭代器支持方法链，这使得我们可以对迭代器进行一系列操作。一旦我们为自定义类型实现了 Iterator 特型，我们就可以使用标准库中提供的各种迭代器适配器方法。

例如，我们可以对 Counter 迭代器使用 map 方法来对每个返回的值进行转换。

fn main() {
    let mut counter = Counter::new(5);
    let squared_counter = counter.map(|num| num * num);
    while let Some(squared_num) = squared_counter.next() {
        println!("{}", squared_num);
    }
}

在上述代码中，map 方法接受一个闭包，该闭包将迭代器中的每个元素作为参数，并返回转换后的结果。这里，我们将 Counter 迭代器中的每个数字平方，然后遍历打印出平方后的结果。

更复杂的自定义迭代器示例：树形结构迭代

让我们考虑一个更复杂的例子，实现一个二叉树的迭代器。

struct TreeNode<T> {
    value: T,
    left: Option<Box<TreeNode<T>>>,
    right: Option<Box<TreeNode<T>>>,
}

impl<T> TreeNode<T> {
    fn new(value: T) -> TreeNode<T> {
        TreeNode {
            value,
            left: None,
            right: None,
        }
    }
}

struct TreeIterator<T> {
    stack: Vec<Box<TreeNode<T>>>,
}

impl<T> TreeIterator<T> {
    fn new(root: Option<Box<TreeNode<T>>>) -> TreeIterator<T> {
        let mut stack = Vec::new();
        if let Some(mut node) = root {
            stack.push(node);
            while let Some(mut current) = stack.last_mut() {
                if let Some(ref mut left) = current.left {
                    stack.push(left.take());
                } else {
                    break;
                }
            }
        }
        TreeIterator { stack }
    }
}

impl<T> Iterator for TreeIterator<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            let node = self.stack.pop()?;
            let value = node.value;
            if let Some(mut right) = node.right {
                self.stack.push(right);
                while let Some(mut current) = self.stack.last_mut() {
                    if let Some(ref mut left) = current.left {
                        self.stack.push(left.take());
                    } else {
                        break;
                    }
                }
            }
            return Some(value);
        }
    }
}

在上述代码中：

我们定义了 TreeNode 结构体来表示二叉树的节点，每个节点包含一个值 value 以及左右子节点的 Option<Box<TreeNode<T>>>。
接着定义了 TreeIterator 结构体，它使用一个 Vec<Box<TreeNode<T>>> 来模拟栈，用于实现深度优先遍历。
TreeIterator 的 new 方法初始化栈，并将根节点及其左子树节点依次压入栈中。
在 Iterator 特型的实现中，next 方法从栈中弹出节点，返回节点的值，并将其右子树及其左子树节点压入栈中，这样就实现了深度优先的迭代。

我们可以这样使用这个二叉树迭代器：

fn main() {
    let mut root = TreeNode::new(1);
    root.left = Some(Box::new(TreeNode::new(2)));
    root.right = Some(Box::new(TreeNode::new(3)));
    root.left.as_mut().unwrap().left = Some(Box::new(TreeNode::new(4)));
    root.left.as_mut().unwrap().right = Some(Box::new(TreeNode::new(5)));

    let mut tree_iter = TreeIterator::new(Some(Box::new(root)));
    while let Some(value) = tree_iter.next() {
        println!("{}", value);
    }
}

上述代码构建了一个简单的二叉树，并使用 TreeIterator 进行深度优先遍历，依次打印出节点的值。

迭代器与所有权

在 Rust 中，所有权规则在迭代器的使用和实现中起着重要作用。当我们实现 Iterator 特型时，需要注意迭代器方法对数据所有权的影响。

例如，在前面的 Counter 示例中，next 方法返回的是 Option<Self::Item>，这里 Item 类型的所有权被转移出迭代器。如果我们希望迭代器保留对元素的所有权，同时返回借用的元素，可以修改 next 方法的返回类型为 Option<&Self::Item>。

struct Counter {
    count: u32,
    upper_bound: u32,
}

impl Counter {
    fn new(upper_bound: u32) -> Counter {
        Counter {
            count: 0,
            upper_bound,
        }
    }
}

impl Iterator for Counter {
    type Item = &u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < self.upper_bound {
            let result = Some(&self.count);
            self.count += 1;
            result
        } else {
            None
        }
    }
}

在这个修改后的版本中，next 方法返回的是 &u32，这样迭代器仍然拥有 count 的所有权，而调用者只是借用了值。

双端迭代器（Double - Ended Iterator）

除了基本的 Iterator 特型，Rust 还提供了 DoubleEndedIterator 特型，它允许我们从迭代器的两端进行迭代。要实现 DoubleEndedIterator，我们需要为自定义类型同时实现 next 和 next_back 方法。

例如，对于一个简单的双向链表结构，我们可以实现 DoubleEndedIterator。

struct Node<T> {
    value: T,
    next: Option<Box<Node<T>>>,
    prev: Option<Box<Node<T>>>,
}

struct DoubleLinkedList<T> {
    head: Option<Box<Node<T>>>,
    tail: Option<Box<Node<T>>>,
}

impl<T> DoubleLinkedList<T> {
    fn new() -> DoubleLinkedList<T> {
        DoubleLinkedList {
            head: None,
            tail: None,
        }
    }

    fn push_front(&mut self, value: T) {
        let new_node = Box::new(Node {
            value,
            next: self.head.take(),
            prev: None,
        });
        if let Some(mut old_head) = new_node.next.take() {
            old_head.prev = Some(new_node);
            self.head = Some(old_head);
        } else {
            self.head = Some(new_node);
            self.tail = self.head.clone();
        }
    }

    fn push_back(&mut self, value: T) {
        let new_node = Box::new(Node {
            value,
            next: None,
            prev: self.tail.take(),
        });
        if let Some(mut old_tail) = new_node.prev.take() {
            old_tail.next = Some(new_node);
            self.tail = Some(old_tail);
        } else {
            self.tail = Some(new_node);
            self.head = self.tail.clone();
        }
    }
}

struct DoubleListIterator<T> {
    current: Option<Box<Node<T>>>,
    forward: bool,
}

impl<T> DoubleListIterator<T> {
    fn new(head: Option<Box<Node<T>>>, forward: bool) -> DoubleListIterator<T> {
        DoubleListIterator {
            current: head,
            forward,
        }
    }
}

impl<T> Iterator for DoubleListIterator<T> {
    type Item = T;

    fn next(&mut self) -> Option<Self::Item> {
        if self.forward {
            let node = self.current.take()?;
            self.current = node.next;
            Some(node.value)
        } else {
            let node = self.current.take()?;
            self.current = node.prev;
            Some(node.value)
        }
    }
}

impl<T> DoubleEndedIterator for DoubleListIterator<T> {
    fn next_back(&mut self) -> Option<Self::Item> {
        if self.forward {
            let node = self.current.take()?;
            self.current = node.prev;
            Some(node.value)
        } else {
            let node = self.current.take()?;
            self.current = node.next;
            Some(node.value)
        }
    }
}

impl<T> DoubleLinkedList<T> {
    fn iter(&self) -> DoubleListIterator<&T> {
        let head = self.head.as_ref().map(|node| {
            Box::new(Node {
                value: &node.value,
                next: node.next.as_ref().map(|next| Box::new(Node {
                    value: &next.value,
                    next: None,
                    prev: None,
                })),
                prev: None,
            })
        });
        DoubleListIterator::new(head, true)
    }

    fn iter_back(&self) -> DoubleListIterator<&T> {
        let tail = self.tail.as_ref().map(|node| {
            Box::new(Node {
                value: &node.value,
                next: None,
                prev: node.prev.as_ref().map(|prev| Box::new(Node {
                    value: &prev.value,
                    next: None,
                    prev: None,
                })),
            })
        });
        DoubleListIterator::new(tail, false)
    }
}

在上述代码中：

我们定义了 Node 和 DoubleLinkedList 结构体来表示双向链表。
DoubleListIterator 结构体用于迭代双向链表，它通过 forward 标志来决定是正向还是反向迭代。
我们为 DoubleListIterator 实现了 Iterator 和 DoubleEndedIterator 特型，分别实现了 next 和 next_back 方法。
DoubleLinkedList 提供了 iter 和 iter_back 方法来获取正向和反向的迭代器。

我们可以这样使用双向链表的迭代器：

fn main() {
    let mut list = DoubleLinkedList::new();
    list.push_front(1);
    list.push_back(2);
    list.push_front(0);

    println!("Forward iteration:");
    for value in list.iter() {
        println!("{}", value);
    }

    println!("Backward iteration:");
    for value in list.iter_back() {
        println!("{}", value);
    }
}

上述代码构建了一个双向链表，并分别使用正向和反向迭代器进行遍历，打印出链表中的元素。

无限迭代器

在某些情况下，我们可能需要创建无限迭代器。实现无限迭代器时，next 方法永远不会返回 None。

例如，我们可以创建一个无限生成斐波那契数列的迭代器。

struct Fibonacci {
    a: u64,
    b: u64,
}

impl Fibonacci {
    fn new() -> Fibonacci {
        Fibonacci { a: 0, b: 1 }
    }
}

impl Iterator for Fibonacci {
    type Item = u64;

    fn next(&mut self) -> Option<Self::Item> {
        let result = self.a;
        self.a = self.b;
        self.b = result + self.b;
        Some(result)
    }
}

在上述代码中，Fibonacci 结构体保存当前斐波那契数列中的两个数 a 和 b。next 方法返回当前的 a 值，并更新 a 和 b 为下两个斐波那契数。由于这个迭代器永远不会结束，我们在使用时通常需要结合 take 等方法来限制迭代的次数。

fn main() {
    let fibonacci_iter = Fibonacci::new();
    let first_10_fibonacci = fibonacci_iter.take(10);
    for num in first_10_fibonacci {
        println!("{}", num);
    }
}

上述代码使用 take 方法从无限的斐波那契数列迭代器中获取前 10 个斐波那契数并打印出来。

通过以上各种示例，我们深入了解了如何在 Rust 中自定义迭代器并实现 Iterator 特型，以及相关的所有权、双端迭代和无限迭代等概念。这使得我们能够根据具体需求为自定义数据结构提供高效、灵活的迭代方式，进一步提升 Rust 程序的功能和性能。