Techniques for Getting Replica Information from Redis Sentinel

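Redis Sentinel Basics

Redis Sentinel is the key component of Redis's high-availability solution. It monitors a master-replica deployment and automatically performs a failover when the master fails. Sentinel itself runs as a distributed system: several Sentinel instances cooperate to detect the state of the Redis nodes, so no single Sentinel is a point of failure.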

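How Monitoring Works

Sentinel monitors Redis nodes by sending them PING commands periodically. If a Sentinel receives no valid reply from the master within the configured down-after-milliseconds period, it marks the master as subjectively down (SDOWN). It then asks the other Sentinels for their view of the master. Once at least quorum Sentinels agree that the master is unreachable, the master is marked objectively down (ODOWN) and the failover process begins.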
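The SDOWN and ODOWN decisions can be sketched as a small simulation. This is only an illustration of the rule described above, not Sentinel's actual implementation; the SentinelView structure and its field names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SentinelView:
    """One Sentinel's local opinion of the master (hypothetical structure)."""
    name: str
    last_pong_age_ms: int  # milliseconds since the last valid PING reply


def is_sdown(view: SentinelView, down_after_ms: int) -> bool:
    # Subjectively down: this Sentinel has waited longer than the threshold.
    return view.last_pong_age_ms > down_after_ms


def is_odown(views: List[SentinelView], down_after_ms: int, quorum: int) -> bool:
    # Objectively down: at least `quorum` Sentinels consider the master SDOWN.
    agreeing = sum(1 for v in views if is_sdown(v, down_after_ms))
    return agreeing >= quorum


views = [
    SentinelView("s1", 7000),
    SentinelView("s2", 6500),
    SentinelView("s3", 200),
]
# s1 and s2 exceed the 5000 ms threshold, which satisfies a quorum of 2.
print(is_odown(views, down_after_ms=5000, quorum=2))
```

With a quorum of 3 the same three views would not produce ODOWN, because s3 still receives replies.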

Example Configuration File

# Example Sentinel configuration file
port 26379
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000

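In the configuration above, the sentinel monitor directive defines the name of the master to monitor (mymaster), its IP address (127.0.0.1), its port (6379), and the quorum (2), which is the number of Sentinels that must agree before the master is considered objectively down. down-after-milliseconds is the time without a valid reply after which a Sentinel marks a node as SDOWN, and failover-timeout is the failover timeout in milliseconds.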
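To make the field order of the sentinel monitor directive concrete, here is a minimal parsing sketch. The MonitorDirective type and parse_monitor_line function are hypothetical helpers written for this example; they are not part of Redis or any client library.

```python
from typing import NamedTuple


class MonitorDirective(NamedTuple):
    name: str    # logical master name, e.g. "mymaster"
    host: str    # master IP address or hostname
    port: int    # master port
    quorum: int  # Sentinels that must agree for ODOWN


def parse_monitor_line(line: str) -> MonitorDirective:
    parts = line.split()
    if len(parts) != 6 or parts[:2] != ["sentinel", "monitor"]:
        raise ValueError(f"not a sentinel monitor directive: {line!r}")
    _, _, name, host, port, quorum = parts
    return MonitorDirective(name, host, int(port), int(quorum))


directive = parse_monitor_line("sentinel monitor mymaster 127.0.0.1 6379 2")
print(directive)
```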

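Redis Master-Replica Replication Architecture

Before looking at how to get replica information, it helps to review how Redis replication works.

How Replication Works

The master handles writes and sends the stream of write commands to its replicas. Each replica receives the stream and replays it locally, which keeps its data in sync with the master. Because replication is asynchronous, a replica can lag slightly behind the master. Synchronization takes one of two forms, full resynchronization and partial resynchronization:

Full resynchronization: usually happens when a replica connects to the master for the first time. The master generates an RDB snapshot and sends it to the replica, buffering the write commands that arrive in the meantime. The replica loads the RDB file and then receives and applies the buffered writes.
Partial resynchronization: when the link drops briefly (for example, because of a network problem) and the replica reconnects, the master checks whether the replica's replication ID matches and whether the replica's offset still lies within the master's replication backlog. If so, the master sends only the commands the replica missed, which avoids the cost of a full resynchronization.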
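The choice between the two forms can be sketched as a single check. This is a simplified model under stated assumptions: it ignores the secondary replication ID (master_replid2) that Redis also accepts after a failover, and the function and parameter names are invented for the example.

```python
def can_partial_resync(
    replica_replid: str,
    replica_offset: int,
    master_replid: str,
    backlog_first_offset: int,
    backlog_histlen: int,
) -> bool:
    """Return True if the master could serve a partial resynchronization."""
    # A different replication ID means the replica followed another history.
    if replica_replid != master_replid:
        return False
    # The replica asks for the byte right after the last one it applied.
    next_needed = replica_offset + 1
    backlog_end = backlog_first_offset + backlog_histlen
    # The requested position must still be inside the backlog window.
    return backlog_first_offset <= next_needed <= backlog_end


# The backlog starts at offset 1000 and holds 500 bytes of history.
print(can_partial_resync("abc", 1200, "abc", 1000, 500))  # inside the window
print(can_partial_resync("abc", 500, "abc", 1000, 500))   # too far behind
print(can_partial_resync("xyz", 1200, "abc", 1000, 500))  # different history
```

When the check fails, the replica falls back to a full resynchronization.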

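The Role of Replicas

Replicas serve several important purposes in a Redis master-replica deployment:

Offloading reads: applications can send read requests to replicas, which reduces the load on the master.
Data redundancy: each replica holds a copy of the master's data and is a candidate for promotion when the master fails.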
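Read offloading can be sketched as a small router that sends writes to the master and spreads reads across replicas in round-robin order. ReadRouter is a hypothetical helper, not part of any Redis client, and plain address strings stand in for real connections so the example runs without a server.

```python
import itertools
from typing import List


class ReadRouter:
    """Route writes to the master and reads to replicas (illustrative only)."""

    def __init__(self, master: str, replicas: List[str]):
        self.master = master
        # cycle() yields the replicas endlessly in round-robin order.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self) -> str:
        return self.master

    def for_read(self) -> str:
        # Fall back to the master when no replica is available.
        return next(self._replicas) if self._replicas else self.master


router = ReadRouter("127.0.0.1:6379", ["127.0.0.1:6380", "127.0.0.1:6381"])
for _ in range(3):
    print("read ->", router.for_read())
print("write ->", router.for_write())
```

Because replication is asynchronous, a read served by a replica may return slightly stale data, so this pattern suits workloads that tolerate that.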

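Standard Ways to Get Replica Information

Using the Redis Command-Line Tool

In redis-cli, the INFO replication command returns the replication information of the server you are connected to. On a replica, the output looks similar to this: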

# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:123456
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:abcdef1234567890
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:123456
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:123456

Parsing this output tells you the state of the replica's link to the master, its replication offset, and other key details.
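A minimal sketch of that parsing step, assuming the raw text comes from redis-cli or a similar source. The parse_info helper is written for this example; client libraries such as redis-py usually return INFO as a parsed dictionary already.

```python
from typing import Dict


def parse_info(raw: str) -> Dict[str, str]:
    """Turn raw INFO text into a dict, skipping blank lines and section headers."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        info[key] = value
    return info


raw = (
    "# Replication\r\n"
    "role:slave\r\n"
    "master_host:127.0.0.1\r\n"
    "master_port:6379\r\n"
    "master_link_status:up\r\n"
    "slave_repl_offset:123456\r\n"
)
info = parse_info(raw)
print(info["role"], info["master_link_status"], int(info["slave_repl_offset"]))
```

Values are kept as strings on purpose, because fields such as master_replid2 consist only of digits but are not numbers.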

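Using Sentinel Commands

Sentinel provides the SENTINEL slaves <master-name> command (also available as SENTINEL replicas <master-name> since Redis 5.0) to list the replicas of a given master. For example: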

redis-cli -p 26379 SENTINEL slaves mymaster
1) 1) "name"
   2) "127.0.0.1:6380"
   3) "ip"
   4) "127.0.0.1"
   5) "port"
   6) "6380"
   7) "runid"
   8) "abcdef1234567890"
   9) "flags"
  10) "slave"
  11) "link-pending-commands"
  12) "0"
  13) "link-refcount"
  14) "1"
  15) "last-ping-sent"
  16) "0"
  17) "last-ok-ping-reply"
  18) "1"
  19) "last-ping-reply"
  20) "1"
  21) "down-after-milliseconds"
  22) "5000"
  23) "info-refresh"
  24) "123456"
  25) "role-reported"
  26) "slave"
  27) "role-reported-time"
  28) "123456"
  29) "master-link-down-time"
  30) "0"
  31) "master-link-status"
  32) "ok"
  33) "master-host"
  34) "127.0.0.1"
  35) "master-port"
  36) "6379"
  37) "slave-priority"
  38) "100"
  39) "slave-repl-offset"
  40) "123456"

The reply includes each replica's IP address, port, run ID, the state of its link to the master, and other details.
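The reply is a list of replicas, each given as a flat sequence of alternating field names and values. A minimal sketch of turning that shape into dictionaries follows; pairs_to_dict is a helper written for this example, and many client libraries perform this conversion for you.

```python
from typing import Dict, List


def pairs_to_dict(flat: List[str]) -> Dict[str, str]:
    """Convert [k1, v1, k2, v2, ...] into {k1: v1, k2: v2, ...}."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")
    # Even positions are field names, odd positions are their values.
    return dict(zip(flat[0::2], flat[1::2]))


# A shortened reply with one replica, in the flat form shown above.
reply = [
    ["name", "127.0.0.1:6380", "ip", "127.0.0.1", "port", "6380", "flags", "slave"],
]
replicas = [pairs_to_dict(entry) for entry in reply]
for replica in replicas:
    print(replica["ip"], int(replica["port"]), replica["flags"])
```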

Getting Replica Information Programmatically

Using Python and the redis-py Library

Install redis-py:

pip install redis

Example code for getting replica information:

import redis
from redis.sentinel import Sentinel

sentinel = Sentinel([('127.0.0.1', 26379)], socket_timeout=0.1)

# discover_slaves() returns a list of (host, port) tuples for the replicas
for host, port in sentinel.discover_slaves('mymaster'):
    print(f"Slave IP: {host}, Port: {port}")
    # Connect to each replica directly to read its own replication info
    slave = redis.Redis(host=host, port=port, socket_timeout=0.1)
    info = slave.info('replication')
    print(f"Role: {info['role']}, Master Link Status: {info['master_link_status']}")

The code above uses redis-py to connect to Sentinel and calls discover_slaves to get the addresses of the replicas of the given master. It then connects to each replica and prints its IP address, port, and replication information.

Using Java and the Jedis Library

Add the Jedis dependency to pom.xml:

<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>3.6.0</version>
</dependency>

Example code for getting replica information:

import redis.clients.jedis.Jedis;
import java.util.*;
import java.util.stream.Collectors;

public class RedisSentinelSlaveInfo {
    public static void main(String[] args) {
        // Query Sentinel directly; JedisSentinelPool only hands out master connections
        try (Jedis sentinel = new Jedis("127.0.0.1", 26379)) {
            List<Map<String, String>> slaves = sentinel.sentinelSlaves("mymaster");
            for (Map<String, String> slave : slaves) {
                String host = slave.get("ip");
                int port = Integer.parseInt(slave.get("port"));
                System.out.println("Slave IP: " + host + ", Port: " + port);
                try (Jedis slaveJedis = new Jedis(host, port)) {
                    // Keep only "key:value" lines; this skips the "# Replication" header
                    Map<String, String> info = Arrays.stream(slaveJedis.info("replication").split("\r\n"))
                            .filter(line -> line.contains(":"))
                            .map(line -> line.split(":", 2))
                            .collect(Collectors.toMap(a -> a[0], a -> a[1]));
                    System.out.println("Role: " + info.get("role") + ", Master Link Status: " + info.get("master_link_status"));
                }
            }
        }
    }
}

This Java code uses Jedis to ask Sentinel for the replica list, connects to each replica, and parses its replication information.

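Analyzing Replica Information in Depth

Replication Offset and Consistency

A replica's slave_repl_offset is the position in the master's replication stream up to which the replica has applied data; the master reports its own position as master_repl_offset. Comparing the offsets of several replicas shows how closely they track each other. Because replication is asynchronous, small momentary differences are normal. A replica whose offset stays clearly behind the others may be suffering from network latency or limited hardware resources. For example, the offsets of several replicas can be compared in Python like this:

import redis
from redis.sentinel import Sentinel

sentinel = Sentinel([('127.0.0.1', 26379)], socket_timeout=0.1)

offsets = []
for host, port in sentinel.discover_slaves('mymaster'):
    info = redis.Redis(host=host, port=port, socket_timeout=0.1).info('replication')
    # slave_repl_offset is the replica's own position in the replication stream
    offsets.append(int(info['slave_repl_offset']))

if len(set(offsets)) > 1:
    print("Replica replication offsets differ")
else:
    print("Replica replication offsets match")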
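Since replication is asynchronous, offsets seldom match exactly at any given instant, so a tolerance is often more useful than a strict equality check. The sketch below computes each replica's lag in bytes relative to the master's offset and flags those beyond a threshold. The function names and the 1024-byte threshold are assumptions chosen for illustration, and the offsets are hard-coded so the example runs without a server.

```python
from typing import Dict, List


def replication_lag(master_offset: int, replica_offsets: Dict[str, int]) -> Dict[str, int]:
    """Bytes each replica is behind the master's replication offset."""
    return {addr: master_offset - offset for addr, offset in replica_offsets.items()}


def lagging_replicas(lags: Dict[str, int], max_lag_bytes: int) -> List[str]:
    """Addresses of replicas whose lag exceeds the allowed threshold."""
    return sorted(addr for addr, lag in lags.items() if lag > max_lag_bytes)


# In practice these values would come from INFO replication on each node.
master_offset = 123456
replica_offsets = {
    "127.0.0.1:6380": 123456,
    "127.0.0.1:6381": 100000,
}
lags = replication_lag(master_offset, replica_offsets)
print(lags)
print(lagging_replicas(lags, max_lag_bytes=1024))
```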

Monitoring the Master Link State

A replica's master_link_status field reports the state of its connection to the master. The value up means the link is healthy; down means it is broken, possibly because of a network fault or a master failure. Monitoring this field continuously lets you detect and handle link problems promptly. An example of checking the link state in Java:

import redis.clients.jedis.Jedis;
import java.util.*;
import java.util.stream.Collectors;

public class MasterLinkStatusMonitor {
    public static void main(String[] args) {
        // Ask Sentinel for the replica list, then inspect each replica directly
        try (Jedis sentinel = new Jedis("127.0.0.1", 26379)) {
            for (Map<String, String> slave : sentinel.sentinelSlaves("mymaster")) {
                String host = slave.get("ip");
                int port = Integer.parseInt(slave.get("port"));
                try (Jedis slaveJedis = new Jedis(host, port)) {
                    // Keep only "key:value" lines; this skips the "# Replication" header
                    Map<String, String> info = Arrays.stream(slaveJedis.info("replication").split("\r\n"))
                            .filter(line -> line.contains(":"))
                            .map(line -> line.split(":", 2))
                            .collect(Collectors.toMap(a -> a[0], a -> a[1]));
                    if (!"up".equals(info.get("master_link_status"))) {
                        System.out.println("Replica " + host + ":" + port + " has lost its link to the master");
                    }
                }
            }
        }
    }
}

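Advanced Techniques: Selecting and Tuning Replicas

Selecting Replicas by Performance Metrics

You can pick the replica best suited to serve reads based on metrics such as response time, CPU usage, and memory usage. For example, INFO stats reports instantaneous_ops_per_sec, the number of commands the server is currently processing per second. This is a measure of current load rather than response time: a replica with a lower value generally has more headroom. An example in Python that selects the least-loaded replica:

import redis
from redis.sentinel import Sentinel

sentinel = Sentinel([('127.0.0.1', 26379)], socket_timeout=0.1)

best_slave = None
min_ops = None
for host, port in sentinel.discover_slaves('mymaster'):
    stats = redis.Redis(host=host, port=port, socket_timeout=0.1).info('stats')
    ops_per_sec = int(stats['instantaneous_ops_per_sec'])
    # A lower ops/sec value means the replica is currently less busy
    if min_ops is None or ops_per_sec < min_ops:
        min_ops = ops_per_sec
        best_slave = (host, port)

if best_slave:
    print(f"Least-loaded replica: {best_slave[0]}:{best_slave[1]}")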

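Adjusting the Number of Replicas Dynamically

The number of replicas can be adjusted as load changes: add replicas when read traffic grows, and remove some when it shrinks to save resources. Replication is configured on the replica, not on the master, so the master does not need to be reconfigured or restarted. To add a replica, put the following directive in the new replica's configuration file (slaveof in versions before Redis 5.0), or run it as a command against a running instance:

replicaof <master_ip> <master_port>

Once the replica has connected to the master, Sentinel discovers it automatically from the master's INFO output.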
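Attaching a replica at runtime can be sketched as follows. The example assumes a client object that exposes execute_command, as redis-py's redis.Redis does; RecordingClient is a stand-in written for this example so the code runs without a server, and attach_replica and detach_replica are hypothetical helper names. REPLICAOF requires Redis 5.0 or later; older servers use SLAVEOF.

```python
class RecordingClient:
    """Stand-in for a real client such as redis.Redis; it only records commands."""

    def __init__(self):
        self.sent = []

    def execute_command(self, *args):
        self.sent.append(args)
        return "OK"


def attach_replica(replica_client, master_host: str, master_port: int):
    # Sent to the NEW replica: start replicating from the given master.
    return replica_client.execute_command("REPLICAOF", master_host, str(master_port))


def detach_replica(replica_client):
    # Sent to a replica: stop replicating and act as a standalone server.
    return replica_client.execute_command("REPLICAOF", "NO", "ONE")


client = RecordingClient()
attach_replica(client, "127.0.0.1", 6379)
detach_replica(client)
print(client.sent)
```

With a real deployment, replica_client would be a connection to the instance being added or removed, not to the master.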

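Handling Replica Information in Failure Scenarios

When the Master Fails

When the master fails, Sentinel performs a failover and promotes one of the replicas to master. During this process, watch how replica states change: the promoted replica switches roles, and the remaining replicas reconnect to the new master. Sentinel publishes these events over Pub/Sub on channels named after the event, such as +failover-end and +switch-master. An example of listening for them in Python with redis-py:

import redis

# Sentinel speaks the Redis protocol, so a regular client can subscribe to its events
sentinel_conn = redis.Redis(host='127.0.0.1', port=26379, decode_responses=True)
pubsub = sentinel_conn.pubsub()
pubsub.subscribe('+failover-end', '+switch-master')

for message in pubsub.listen():
    if message['type'] != 'message':
        continue  # skip subscription confirmations
    print(f"Received Sentinel notification: {message['channel']} {message['data']}")
    if message['channel'] == '+failover-end':
        print("Failover finished; the new master has been chosen")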
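The payload of a +switch-master event has the form "<master name> <old ip> <old port> <new ip> <new port>". A minimal sketch of parsing it follows; the SwitchMaster type and parse_switch_master function are helpers written for this example, and the sample payload is made up.

```python
from typing import NamedTuple


class SwitchMaster(NamedTuple):
    master_name: str
    old_host: str
    old_port: int
    new_host: str
    new_port: int


def parse_switch_master(payload: str) -> SwitchMaster:
    parts = payload.split()
    if len(parts) != 5:
        raise ValueError(f"unexpected +switch-master payload: {payload!r}")
    name, old_host, old_port, new_host, new_port = parts
    return SwitchMaster(name, old_host, int(old_port), new_host, int(new_port))


event = parse_switch_master("mymaster 127.0.0.1 6379 127.0.0.1 6380")
print(f"{event.master_name}: {event.old_host}:{event.old_port} -> {event.new_host}:{event.new_port}")
```

An application can use the new address to redirect writes, and then refresh its replica list, since the promoted node is no longer a replica.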

When a Replica Fails

When a replica itself fails, Sentinel marks it as down. By checking the replica list periodically, you can find replicas that are down and react, for example by logging the event or attempting a restart. An example of a periodic replica check in Java:

import redis.clients.jedis.Jedis;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SlaveHealthCheck {
    public static void main(String[] args) {
        ScheduledExecutorService executorService = Executors.newScheduledThreadPool(1);
        executorService.scheduleAtFixedRate(() -> {
            // Ask Sentinel for the current replica list on every run
            try (Jedis sentinel = new Jedis("127.0.0.1", 26379)) {
                List<Map<String, String>> slaves = sentinel.sentinelSlaves("mymaster");
                for (Map<String, String> slave : slaves) {
                    String host = slave.get("ip");
                    int port = Integer.parseInt(slave.get("port"));
                    try (Jedis slaveJedis = new Jedis(host, port)) {
                        String pingResponse = slaveJedis.ping();
                        if (!"PONG".equals(pingResponse)) {
                            System.out.println("Replica " + host + ":" + port + " did not respond; it may have failed");
                        }
                    } catch (Exception e) {
                        System.out.println("Replica " + host + ":" + port + " connection error; it may have failed");
                    }
                }
            } catch (Exception e) {
                // An uncaught exception would cancel all future scheduled runs
                System.out.println("Could not query Sentinel: " + e.getMessage());
            }
        }, 0, 10, TimeUnit.SECONDS);
    }
}
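Sentinel's own view of each replica is available in the flags field of the SENTINEL slaves reply, a comma-separated string that can contain markers such as s_down and disconnected. The sketch below splits replicas into healthy and unhealthy groups based on those flags. The split_by_health helper and the sample entries are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Flags that indicate Sentinel cannot currently rely on the replica.
DOWN_FLAGS = {"s_down", "o_down", "disconnected"}


def split_by_health(replicas: List[Dict[str, str]]) -> Tuple[List[str], List[str]]:
    """Return (healthy, unhealthy) lists of "ip:port" addresses."""
    healthy, unhealthy = [], []
    for replica in replicas:
        flags = set(replica.get("flags", "").split(","))
        addr = f"{replica['ip']}:{replica['port']}"
        # Any overlap with DOWN_FLAGS marks the replica as unhealthy.
        (unhealthy if flags & DOWN_FLAGS else healthy).append(addr)
    return healthy, unhealthy


# Sample entries shaped like parsed SENTINEL slaves output.
sample = [
    {"ip": "127.0.0.1", "port": "6380", "flags": "slave"},
    {"ip": "127.0.0.1", "port": "6381", "flags": "s_down,slave,disconnected"},
]
healthy, unhealthy = split_by_health(sample)
print("healthy:", healthy)
print("unhealthy:", unhealthy)
```

Reading the flags avoids opening a connection to every replica, at the cost of relying on a single Sentinel's opinion.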

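This article has covered getting replica information with Redis Sentinel, from basic concepts through advanced techniques and failure handling. In practice, apply these techniques according to your specific workload and requirements to keep your Redis deployment running efficiently and reliably.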