C++ STL 算法 sort 的数据类型适配

1. C++ STL 中 sort 算法概述

在 C++ 标准模板库（STL）中，sort 算法是用于对序列进行排序的强大工具。它定义在 <algorithm> 头文件中，提供了对各种容器中的元素进行排序的能力。sort 算法采用的是一种高效的排序策略，通常是快速排序（QuickSort）、插入排序（Insertion Sort）和堆排序（HeapSort）的混合使用，以达到较好的平均性能和最坏情况性能。

sort 算法有两种常见的重载形式：

// 形式一：使用默认比较函数（升序）
template<class RandomIt>
void sort(RandomIt first, RandomIt last);

// 形式二：使用自定义比较函数
template<class RandomIt, class Compare>
void sort(RandomIt first, RandomIt last, Compare comp);

在第一种形式中，sort 会使用元素类型的默认比较运算符 < 来进行升序排序。而在第二种形式中，用户可以提供一个自定义的比较函数 comp，从而实现按照特定的规则进行排序。

2. 基本数据类型的适配

2.1 整数类型

对于基本的整数类型，如 int、long、short 等，sort 算法可以直接使用默认的比较函数进行排序。这是因为这些类型已经定义了 < 运算符。

#include <iostream>
#include <algorithm>
#include <vector>

int main() {
    std::vector<int> numbers = {5, 2, 8, 1, 9};
    std::sort(numbers.begin(), numbers.end());

    for (int num : numbers) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}

在上述代码中，std::vector<int> 中的元素通过 std::sort 使用默认比较函数进行升序排序，输出结果为 1 2 5 8 9。

2.2 浮点数类型

同样，对于浮点数类型，如 float 和 double，sort 算法也能直接使用默认比较函数。

#include <iostream>
#include <algorithm>
#include <vector>

int main() {
    std::vector<double> floats = {3.14, 1.618, 2.718, 0.577};
    std::sort(floats.begin(), floats.end());

    for (double f : floats) {
        std::cout << f << " ";
    }
    std::cout << std::endl;

    return 0;
}

这段代码对 std::vector<double> 中的浮点数进行升序排序并输出。

2.3 字符类型

字符类型 char 也遵循同样的规则。sort 会根据字符的 ASCII 码值进行排序。

#include <iostream>
#include <algorithm>
#include <vector>

int main() {
    std::vector<char> chars = {'d', 'a', 'c', 'b'};
    std::sort(chars.begin(), chars.end());

    for (char c : chars) {
        std::cout << c << " ";
    }
    std::cout << std::endl;

    return 0;
}

输出结果为 a b c d，因为按照 ASCII 码值，a 小于 b，以此类推。

3. 自定义数据类型的适配

3.1 结构体类型

当处理自定义的结构体类型时，默认情况下 sort 算法不知道如何比较结构体的实例。因此，我们需要为结构体定义比较运算符 < 或者提供一个自定义的比较函数。

假设我们有一个表示二维点的结构体 Point：

struct Point {
    int x;
    int y;
};

如果我们想按照 x 坐标升序排序，当 x 坐标相同时按照 y 坐标升序排序，可以定义 < 运算符如下：

struct Point {
    int x;
    int y;

    bool operator<(const Point& other) const {
        if (x != other.x) {
            return x < other.x;
        }
        return y < other.y;
    }
};

然后就可以使用 sort 对 std::vector<Point> 进行排序：

#include <iostream>
#include <algorithm>
#include <vector>

struct Point {
    int x;
    int y;

    bool operator<(const Point& other) const {
        if (x != other.x) {
            return x < other.x;
        }
        return y < other.y;
    }
};

int main() {
    std::vector<Point> points = { {2, 3}, {1, 4}, {2, 1} };
    std::sort(points.begin(), points.end());

    for (const Point& p : points) {
        std::cout << "(" << p.x << ", " << p.y << ") ";
    }
    std::cout << std::endl;

    return 0;
}

输出结果为 (1, 4) (2, 1) (2, 3)，符合我们定义的排序规则。

3.2 类类型

对于类类型，情况与结构体类似。假设我们有一个 Person 类：

class Person {
private:
    std::string name;
    int age;
public:
    Person(const std::string& n, int a) : name(n), age(a) {}

    // 定义比较运算符，按照年龄升序排序，年龄相同时按照名字字典序排序
    bool operator<(const Person& other) const {
        if (age != other.age) {
            return age < other.age;
        }
        return name < other.name;
    }

    friend std::ostream& operator<<(std::ostream& os, const Person& p) {
        os << p.name << "(" << p.age << ")";
        return os;
    }
};

然后可以对 std::vector<Person> 进行排序：

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>

class Person {
private:
    std::string name;
    int age;
public:
    Person(const std::string& n, int a) : name(n), age(a) {}

    bool operator<(const Person& other) const {
        if (age != other.age) {
            return age < other.age;
        }
        return name < other.name;
    }

    friend std::ostream& operator<<(std::ostream& os, const Person& p) {
        os << p.name << "(" << p.age << ")";
        return os;
    }
};

int main() {
    std::vector<Person> people = { {"Alice", 25}, {"Bob", 20}, {"Charlie", 25} };
    std::sort(people.begin(), people.end());

    for (const Person& p : people) {
        std::cout << p << " ";
    }
    std::cout << std::endl;

    return 0;
}

输出结果为 Bob(20) Alice(25) Charlie(25)，满足我们定义的排序逻辑。

4. 使用自定义比较函数进行适配

4.1 函数指针作为比较函数

除了为自定义类型定义 < 运算符，我们还可以通过提供一个自定义的比较函数来适配 sort 算法。以 Point 结构体为例，假设我们想按照 y 坐标降序排序，当 y 坐标相同时按照 x 坐标降序排序，可以定义一个比较函数：

bool comparePointsByYDescending(const Point& a, const Point& b) {
    if (a.y != b.y) {
        return a.y > b.y;
    }
    return a.x > b.x;
}

然后在调用 sort 时使用这个函数：

#include <iostream>
#include <algorithm>
#include <vector>

struct Point {
    int x;
    int y;
};

bool comparePointsByYDescending(const Point& a, const Point& b) {
    if (a.y != b.y) {
        return a.y > b.y;
    }
    return a.x > b.x;
}

int main() {
    std::vector<Point> points = { {2, 3}, {1, 4}, {2, 1} };
    std::sort(points.begin(), points.end(), comparePointsByYDescending);

    for (const Point& p : points) {
        std::cout << "(" << p.x << ", " << p.y << ") ";
    }
    std::cout << std::endl;

    return 0;
}

输出结果为 (1, 4) (2, 3) (2, 1)，符合我们定义的降序排序规则。

4.2 函数对象作为比较函数

函数对象（functor）是一个重载了 () 运算符的类的实例。它比函数指针更灵活，因为它可以有内部状态。

以 Person 类为例，假设我们想按照名字长度降序排序，当名字长度相同时按照年龄升序排序，可以定义一个函数对象：

class ComparePersonByLengthDescending {
public:
    bool operator()(const Person& a, const Person& b) const {
        if (a.name.length() != b.name.length()) {
            return a.name.length() > b.name.length();
        }
        return a.age < b.age;
    }
};

然后在 sort 中使用这个函数对象：

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>

class Person {
private:
    std::string name;
    int age;
public:
    Person(const std::string& n, int a) : name(n), age(a) {}

    friend std::ostream& operator<<(std::ostream& os, const Person& p) {
        os << p.name << "(" << p.age << ")";
        return os;
    }
};

class ComparePersonByLengthDescending {
public:
    bool operator()(const Person& a, const Person& b) const {
        if (a.name.length() != b.name.length()) {
            return a.name.length() > b.name.length();
        }
        return a.age < b.age;
    }
};

int main() {
    std::vector<Person> people = { {"Alice", 25}, {"Bob", 20}, {"Charlie", 25} };
    std::sort(people.begin(), people.end(), ComparePersonByLengthDescending());

    for (const Person& p : people) {
        std::cout << p << " ";
    }
    std::cout << std::endl;

    return 0;
}

输出结果将按照名字长度降序，长度相同时按年龄升序排列。

4.3 Lambda 表达式作为比较函数

C++11 引入的 Lambda 表达式为定义简单的比较函数提供了一种简洁的方式。对于 Point 结构体，假设我们想按照到原点的距离升序排序，可以使用 Lambda 表达式：

#include <iostream>
#include <algorithm>
#include <vector>
#include <cmath>

struct Point {
    int x;
    int y;
};

int main() {
    std::vector<Point> points = { {2, 3}, {1, 4}, {2, 1} };
    std::sort(points.begin(), points.end(), [](const Point& a, const Point& b) {
        double distA = std::sqrt(a.x * a.x + a.y * a.y);
        double distB = std::sqrt(b.x * b.x + b.y * b.y);
        return distA < distB;
    });

    for (const Point& p : points) {
        std::cout << "(" << p.x << ", " << p.y << ") ";
    }
    std::cout << std::endl;

    return 0;
}

在这个例子中，Lambda 表达式计算每个点到原点的距离，并按照距离升序对 points 向量进行排序。

5. 容器类型的适配

5.1 std::vector

std::vector 是最常用的支持随机访问的容器，与 sort 算法天然适配。因为 sort 要求迭代器是随机访问迭代器，而 std::vector 的迭代器恰好满足这一条件。前面的代码示例中已经多次展示了对 std::vector 使用 sort 算法进行排序。

5.2 std::deque

std::deque（双端队列）同样支持随机访问迭代器，所以也能直接使用 sort 算法。

#include <iostream>
#include <algorithm>
#include <deque>

int main() {
    std::deque<int> numbers = {5, 2, 8, 1, 9};
    std::sort(numbers.begin(), numbers.end());

    for (int num : numbers) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}

这段代码对 std::deque<int> 中的元素进行升序排序并输出。

5.3 std::list

std::list 是一个双向链表，它的迭代器是双向迭代器，不满足 sort 算法对随机访问迭代器的要求。但是 std::list 自身提供了一个 sort 成员函数，其实现通常是基于归并排序。

#include <iostream>
#include <list>

int main() {
    std::list<int> numbers = {5, 2, 8, 1, 9};
    numbers.sort();

    for (int num : numbers) {
        std::cout << num << " ";
    }
    std::cout << std::endl;

    return 0;
}

如果非要对 std::list 使用 <algorithm> 中的 sort，可以先将 std::list 中的元素复制到一个支持随机访问迭代器的容器（如 std::vector）中，排序后再复制回 std::list。

5.4 关联容器（std::map 和 std::set）

关联容器 std::map 和 std::set 本身就是有序的，它们内部使用红黑树等数据结构来维护元素的有序性。因此，对它们使用 sort 算法没有意义。std::map 根据键（key）自动排序，std::set 中的元素也是自动排序的。

例如，std::map 按照键的顺序存储元素：

#include <iostream>
#include <map>

int main() {
    std::map<int, std::string> myMap = { {3, "three"}, {1, "one"}, {2, "two"} };

    for (const auto& pair : myMap) {
        std::cout << pair.first << ": " << pair.second << std::endl;
    }

    return 0;
}

输出结果将按照键 1, 2, 3 的顺序打印，因为 std::map 自动按键排序。

6. 注意事项

6.1 比较函数的严格弱序

无论是定义 < 运算符还是提供自定义比较函数，都必须满足严格弱序（strict weak ordering）的要求。严格弱序定义了一种类似于 “小于” 的关系，它必须满足以下性质：

非自反性：对于任意 x，x < x 为 false。
传递性：如果 x < y 且 y < z，那么 x < z。
反对称性：如果 x < y，那么 y < x 为 false。
可传递的等价性：如果 !(x < y) 且 !(y < x)，那么对于任意 z，(x < z) 等价于 (y < z) 且 (z < x) 等价于 (z < y)。

如果不满足严格弱序，sort 算法的行为是未定义的。

6.2 性能考虑

虽然 sort 算法通常具有较好的平均性能，但在某些特定情况下，选择合适的排序策略可能会获得更好的性能。例如，如果数据量较小，插入排序可能比 sort 算法使用的混合排序策略更快。另外，如果数据已经部分有序，一些自适应排序算法可能会有更好的表现。

在对自定义类型进行排序时，比较函数的复杂度也会影响排序的整体性能。如果比较函数的时间复杂度较高，可能会导致排序性能下降。

6.3 稳定性

sort 算法并不保证稳定性，即相等元素的相对顺序在排序后可能会改变。如果需要稳定性，可以使用 stable_sort 算法，它具有与 sort 类似的接口，但保证相等元素的相对顺序不变。不过，stable_sort 的平均时间复杂度通常比 sort 略高。

#include <iostream>
#include <algorithm>
#include <vector>

struct Point {
    int x;
    int y;
    int id;

    bool operator<(const Point& other) const {
        return x < other.x;
    }
};

int main() {
    std::vector<Point> points = { {2, 3, 1}, {2, 1, 2}, {1, 4, 3} };
    std::stable_sort(points.begin(), points.end());

    for (const Point& p : points) {
        std::cout << "(" << p.x << ", " << p.y << ") id: " << p.id << std::endl;
    }

    return 0;
}

在这个例子中，stable_sort 保证了 x 坐标相等时，元素的相对顺序（根据 id）不变。

通过以上对 C++ STL 算法 sort 数据类型适配的详细阐述，我们可以看到它在处理各种数据类型和容器时的强大能力和灵活性。在实际应用中，我们需要根据具体需求，合理地选择和定义比较函数，以确保 sort 算法能够正确、高效地工作。