Redis 渐进式 rehash 的资源分配策略

Redis 中的 Rehash 概念

在深入探讨渐进式 rehash 的资源分配策略之前，我们先来理解一下什么是 rehash。Redis 是一个基于字典结构实现的 key - value 存储系统，而字典在底层通常使用哈希表来实现。哈希表的大小会随着数据的不断插入和删除而发生变化。

当哈希表中的元素数量过多或者过少时，为了保持较好的性能，需要调整哈希表的大小。这个调整哈希表大小并重新分布元素的过程就叫做 rehash。

在简单的 rehash 过程中，通常会创建一个新的、更大或更小的哈希表，然后将旧哈希表中的所有元素重新计算哈希值并插入到新哈希表中。然而，对于 Redis 这样的高性能数据库，简单的一次性 rehash 操作可能会带来性能问题，因为在数据量较大时，这个操作会消耗大量的 CPU 时间，导致 Redis 在这段时间内无法正常处理其他客户端请求。

渐进式 Rehash 的引入

为了解决一次性 rehash 带来的性能问题，Redis 采用了渐进式 rehash 策略。渐进式 rehash 不会一次性将所有数据从旧哈希表迁移到新哈希表，而是分多次、逐步完成这个过程。

在渐进式 rehash 过程中，Redis 会同时保留旧哈希表和新哈希表。在处理客户端请求时，会逐步将旧哈希表中的元素迁移到新哈希表。这样可以避免在短时间内消耗过多的 CPU 资源，保证 Redis 在 rehash 过程中仍能正常响应客户端请求。

Redis 渐进式 Rehash 的资源分配策略详解

哈希表结构及相关数据结构

Redis 中哈希表的实现主要涉及 dict 结构和 dictht 结构。dict 结构定义如下：

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int iterators; /* number of iterators currently running */
} dict;

其中，ht 数组包含两个 dictht 结构，分别用于保存旧哈希表和新哈希表。rehashidx 字段用于记录 rehash 进行到的位置，如果 rehashidx 为 -1，表示当前没有进行 rehash。

dictht 结构定义如下：

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

table 是一个指向 dictEntry 指针数组的指针，size 表示哈希表的大小，sizemask 用于计算哈希值在数组中的索引位置，used 表示哈希表中已使用的节点数量。

触发渐进式 Rehash 的条件

Redis 会在以下两种情况下触发渐进式 rehash：

哈希表元素过多：当哈希表中已使用的节点数量（used）达到哈希表大小（size）的一定比例时，会触发 rehash 并扩大哈希表。这个比例在 Redis 中默认是 1:1，即当 used >= size 时，会触发 rehash 并将新哈希表的大小设置为当前大小的两倍。
哈希表元素过少：当哈希表中已使用的节点数量（used）与哈希表大小（size）的比例小于一定阈值时，会触发 rehash 并缩小哈希表。这个阈值在 Redis 中默认是 1:10，即当 used <= size / 10 时，会触发 rehash 并将新哈希表的大小设置为当前大小的一半。

渐进式 Rehash 的具体过程

初始化新哈希表：当触发 rehash 条件时，Redis 会根据需要创建一个新的哈希表。如果是扩大哈希表，新哈希表的大小是当前哈希表大小的两倍；如果是缩小哈希表，新哈希表的大小是当前哈希表大小的一半。
逐步迁移：在 rehashidx 字段不为 -1 时，表示正在进行渐进式 rehash。每次执行哈希表相关的操作（如插入、查找、删除等）时，Redis 除了执行正常的操作外，还会将旧哈希表中 rehashidx 位置的一个桶（bucket）中的所有元素迁移到新哈希表中。迁移完成后，rehashidx 会递增，指向下一个要迁移的桶。
完成迁移：当 rehashidx 递增到旧哈希表的大小，表示所有元素都已迁移完成。此时，Redis 会释放旧哈希表的内存，将新哈希表设置为当前使用的哈希表，并将 rehashidx 设为 -1，标志着 rehash 过程结束。

资源分配策略分析

内存资源分配：在渐进式 rehash 开始时，会分配新哈希表所需的内存。如果是扩大哈希表，新哈希表的大小翻倍，因此需要分配更多的内存；如果是缩小哈希表，新哈希表的大小减半，会释放部分内存。在迁移过程中，由于同时存在旧哈希表和新哈希表，内存使用会暂时增加。但随着迁移的进行，旧哈希表中的元素逐渐减少，最终当迁移完成后，旧哈希表的内存被释放，内存使用恢复到正常水平。
CPU 资源分配：渐进式 rehash 的核心是将一次性的大量计算任务拆分成多个小任务，分散在每次哈希表操作中执行。这样可以避免在短时间内消耗过多的 CPU 资源，保证 Redis 在 rehash 过程中仍能正常处理客户端请求。每次操作迁移一个桶的元素，这个过程中 CPU 资源的消耗相对较小，不会对 Redis 的整体性能造成太大影响。

代码示例分析

下面我们通过分析 Redis 源码中的部分关键代码来深入理解渐进式 rehash 的实现。

触发 Rehash 的代码

在 dict.c 文件中，dictExpand 函数用于触发 rehash 并扩大哈希表：

int dictExpand(dict *d, unsigned long size) {
    dictht n;
    unsigned long realsize = _dictNextPower(size);

    /* the size is invalid if it is smaller than the number of
     * elements already inside the hash table */
    if (dictIsRehashing(d) || d->ht[0].used > size)
        return DICT_ERR;

    n.size = realsize;
    n.sizemask = realsize - 1;
    n.table = zcalloc(realsize * sizeof(dictEntry *));
    n.used = 0;

    /* Is this the first initialization? If so it's not really a rehashing
     * we just set the first hash table so that it can accept keys. */
    if (d->ht[0].table == NULL) {
        d->ht[0] = n;
        return DICT_OK;
    }

    d->ht[1] = n;
    d->rehashidx = 0;
    return DICT_OK;
}

该函数首先计算新哈希表的大小 realsize，然后分配新哈希表的内存 zcalloc(realsize * sizeof(dictEntry *))。如果当前不是首次初始化且正在进行 rehash 或者当前哈希表中的元素数量大于要设置的大小，则返回错误。否则，将新哈希表设置为 ht[1]，并将 rehashidx 设置为 0，开始渐进式 rehash。

渐进式 Rehash 迁移代码

在 dict.c 文件的 dictRehash 函数中，实现了渐进式 rehash 的具体迁移过程：

int dictRehash(dict *d, int n) {
    int empty_visits = n * 10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while (n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while (d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        while (de) {
            unsigned int h;
            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

该函数每次迁移 n 个桶中的元素。在迁移过程中，首先找到旧哈希表中 rehashidx 位置的非空桶，然后将桶中的所有元素重新计算哈希值并插入到新哈希表中。在插入过程中，更新新旧哈希表的 used 计数。当所有元素迁移完成后，释放旧哈希表的内存，将新哈希表设置为当前使用的哈希表，并将 rehashidx 设为 -1。

哈希表操作中的渐进式 Rehash 调用

在 dict.c 文件的 dictAddRaw 函数（用于向哈希表中添加新元素）中，会调用 dictRehashIfNeeded 函数来在合适的时候进行渐进式 rehash：

dictEntry *dictAddRaw(dict *d, void *key, dictEntry **existing) {
    long index;
    dictEntry *entry;
    dictht *ht;

    if (dictIsRehashing(d)) _dictRehashStep(d);

    /* Get the index of the new element, or -1 if
     * the element already exists. */
    if ((index = _dictKeyIndex(d, key, existing)) == -1)
        return NULL;

    /* Allocate the memory and store the new entry.
     * Insert the element in top, with the assumption that in a database
     * system it is more likely that recently added entries are accessed
     * more frequently. */
    ht = dictIsRehashing(d)? &d->ht[1] : &d->ht[0];
    entry = zmalloc(sizeof(*entry));
    entry->next = ht->table[index];
    ht->table[index] = entry;
    ht->used++;

    /* Set the hash entry fields. */
    dictSetKey(d, entry, key);
    return entry;
}

在 dictAddRaw 函数中，首先调用 _dictRehashStep 函数进行一次渐进式 rehash 操作（如果正在进行 rehash）。然后根据是否正在进行 rehash 来决定将新元素插入到旧哈希表还是新哈希表中。

渐进式 Rehash 对 Redis 性能的影响及优势

性能影响：渐进式 rehash 对 Redis 性能的影响主要体现在 CPU 资源的使用上。由于每次操作只迁移少量元素，因此在 rehash 过程中，CPU 使用率不会出现剧烈波动，而是相对平稳地上升。这种平稳的 CPU 使用率变化使得 Redis 在 rehash 期间仍能正常处理客户端请求，保证了系统的可用性。
优势：与一次性 rehash 相比，渐进式 rehash 最大的优势在于它能够在不显著影响系统性能的前提下，完成哈希表大小的调整。这使得 Redis 可以在运行过程中动态适应数据量的变化，始终保持较好的性能。无论是数据量快速增长还是减少，渐进式 rehash 都能有效地进行处理，避免了因一次性 rehash 导致的长时间性能下降问题。

总结 Redis 渐进式 Rehash 的资源分配策略

内存资源：在 rehash 开始时，根据哈希表的扩大或缩小需求分配或释放相应的内存。在迁移过程中，虽然会暂时增加内存使用，但随着迁移完成，内存使用会恢复正常。
CPU 资源：通过将 rehash 任务拆分成多个小任务，分散在每次哈希表操作中执行，避免了一次性消耗过多 CPU 资源，保证了 Redis 在 rehash 过程中的正常运行。

渐进式 rehash 的资源分配策略是 Redis 能够高效处理大量数据并保持高性能的关键因素之一。深入理解这一策略对于优化 Redis 应用、解决性能问题以及进一步开发基于 Redis 的系统都具有重要意义。