Exception Handling in Multithreaded C Programs on Linux

Introduction to Multithreaded Programming

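On Linux, multithreading is an important way to improve the throughput and responsiveness of C programs. It lets one program run several tasks concurrently, all sharing the same address space and other process resources. In a network server, for example, one thread can listen for new connections while another handles data transfer on established connections, which significantly increases the number of clients the server can serve at once.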

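On Linux, multithreaded programs are written mainly with the POSIX threads library (pthread), which provides functions to create, manage, and synchronize threads. Here is a simple example that creates a thread: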

#include <pthread.h>
#include <stdio.h>

void* thread_function(void* arg) {
    printf("This is a new thread.\n");
    return NULL;
}

int main() {
    pthread_t my_thread;
    if (pthread_create(&my_thread, NULL, thread_function, NULL) != 0) {
        printf("\n ERROR creating thread");
        return 1;
    }
    if (pthread_join(my_thread, NULL) != 0) {
        printf("\n ERROR joining thread");
        return 2;
    }
    printf("Thread joined successfully.\n");
    return 0;
}

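In the code above, pthread_create creates a new thread. Its first argument points to a pthread_t variable that receives the new thread's identifier; the second specifies thread attributes, and NULL selects the defaults; the third is a pointer to the function where the new thread starts executing; the fourth is the argument passed to that function. pthread_join waits for the specified thread to terminate. Both functions return 0 on success and an error number on failure, which is why the example compares their results with 0.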

Why Exception Handling Matters

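Exception handling deserves particular attention in multithreaded programs. C has no built-in exception mechanism, so "exception" here means any abnormal condition: a failed call, a race, a deadlock, or an unexpected signal. Because threads share resources, a fault in one thread can destabilize other threads or the whole program. For example, if one thread writes out of bounds in shared memory, other threads may read corrupted data and behave unpredictably. Multithreaded programs also tend to run in complex environments such as network servers and large data-processing systems, where abnormal conditions are more varied, so an effective exception-handling mechanism is key to robustness.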

Common Exception Types and How to Handle Them

Resource Contention

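Nature
Resource contention is one of the most common problems in multithreaded programming. When several threads access and modify a shared resource at the same time without proper synchronization, the result can be inconsistent data or a crash. For example, if several threads each add 1 to a global variable without synchronization, some of the increments may be lost.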
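Handling
- Mutex: A mutex is a basic synchronization tool that ensures only one thread at a time can access a shared resource. The following example uses a mutex to handle resource contention: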

#include <pthread.h>
#include <stdio.h>

// Global variable
int shared_variable = 0;
// Mutex
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void* increment(void* arg) {
    // Lock
    pthread_mutex_lock(&mutex);
    shared_variable++;
    // Unlock
    pthread_mutex_unlock(&mutex);
    return NULL;
}

int main() {
    pthread_t threads[10];
    for (int i = 0; i < 10; i++) {
        if (pthread_create(&threads[i], NULL, increment, NULL) != 0) {
            printf("\n ERROR creating thread");
            return 1;
        }
    }
    for (int i = 0; i < 10; i++) {
        if (pthread_join(threads[i], NULL) != 0) {
            printf("\n ERROR joining thread");
            return 2;
        }
    }
    printf("Final value of shared variable: %d\n", shared_variable);
    // Destroy the mutex
    pthread_mutex_destroy(&mutex);
    return 0;
}

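In this example, each thread acquires the mutex with pthread_mutex_lock before touching shared_variable and releases it with pthread_mutex_unlock afterwards. Only one thread at a time can therefore modify shared_variable, which prevents the race.
- Read-write lock: A read-write lock lets any number of threads hold it for reading at the same time but only one thread hold it for writing. While a thread is writing, every other reader and writer has to wait. Read-write locks suit read-mostly workloads such as database queries, where many threads read the data concurrently but only a few write. The following example uses a read-write lock: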

#include <pthread.h>
#include <stdio.h>

// Shared data
int shared_data = 0;
// Read-write lock
pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;

void* read_data(void* arg) {
    // Acquire the read lock
    pthread_rwlock_rdlock(&rwlock);
    printf("Read data: %d\n", shared_data);
    // Release the lock
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

void* write_data(void* arg) {
    // Acquire the write lock
    pthread_rwlock_wrlock(&rwlock);
    shared_data++;
    // Release the lock
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

int main() {
    pthread_t read_threads[5], write_thread;
    for (int i = 0; i < 5; i++) {
        if (pthread_create(&read_threads[i], NULL, read_data, NULL) != 0) {
            printf("\n ERROR creating read thread");
            return 1;
        }
    }
    if (pthread_create(&write_thread, NULL, write_data, NULL) != 0) {
        printf("\n ERROR creating write thread");
        return 1;
    }
    for (int i = 0; i < 5; i++) {
        if (pthread_join(read_threads[i], NULL) != 0) {
            printf("\n ERROR joining read thread");
            return 2;
        }
    }
    if (pthread_join(write_thread, NULL) != 0) {
        printf("\n ERROR joining write thread");
        return 2;
    }
    // Destroy the read-write lock
    pthread_rwlock_destroy(&rwlock);
    return 0;
}

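In this example, the reader threads acquire the read lock with pthread_rwlock_rdlock and the writer thread acquires the write lock with pthread_rwlock_wrlock, which keeps the data consistent.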

Deadlock

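Nature
Deadlock is a serious failure that occurs when two or more threads each wait for a resource that another one holds. For example, thread A holds resource 1 and waits for resource 2, while thread B holds resource 2 and waits for resource 1. Both threads then wait forever and the program cannot make progress.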
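Handling
- Algorithms for avoiding deadlock:
  - Break one of the four necessary conditions: deadlock requires mutual exclusion, hold and wait, no preemption, and circular wait to hold at the same time, so removing any one of them prevents it. A related approach is deadlock avoidance, of which the banker's algorithm is the classic example: knowing each thread's maximum resource demand in advance and the current allocation, it grants a request only if the resulting state is still safe, meaning that every thread can still run to completion in some order.
  - Acquire locks in a fixed order: a simple and effective method is to give every resource a unique number and have threads acquire locks in ascending order of that number. For example, with two locks mutex1 and mutex2 numbered 1 and 2, every thread should acquire mutex1 first and then mutex2. This rules out circular wait and therefore deadlock. Here is an example: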

#include <pthread.h>
#include <stdio.h>

// Define two mutexes
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;

void* thread1_function(void* arg) {
    // Acquire the locks in order
    pthread_mutex_lock(&mutex1);
    printf("Thread 1 acquired mutex1\n");
    pthread_mutex_lock(&mutex2);
    printf("Thread 1 acquired mutex2\n");
    // Release the locks
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);
    return NULL;
}

void* thread2_function(void* arg) {
    // Acquire the locks in order
    pthread_mutex_lock(&mutex1);
    printf("Thread 2 acquired mutex1\n");
    pthread_mutex_lock(&mutex2);
    printf("Thread 2 acquired mutex2\n");
    // Release the locks
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);
    return NULL;
}

int main() {
    pthread_t thread1, thread2;
    if (pthread_create(&thread1, NULL, thread1_function, NULL) != 0) {
        printf("\n ERROR creating thread 1");
        return 1;
    }
    if (pthread_create(&thread2, NULL, thread2_function, NULL) != 0) {
        printf("\n ERROR creating thread 2");
        return 1;
    }
    if (pthread_join(thread1, NULL) != 0) {
        printf("\n ERROR joining thread 1");
        return 2;
    }
    if (pthread_join(thread2, NULL) != 0) {
        printf("\n ERROR joining thread 2");
        return 2;
    }
    // Destroy the mutexes
    pthread_mutex_destroy(&mutex1);
    pthread_mutex_destroy(&mutex2);
    return 0;
}

In this example, both threads acquire the locks in the order mutex1, then mutex2, so deadlock cannot occur.

Abnormal Thread Exit

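Nature
A thread can terminate in several ways: its thread function returns normally, it calls pthread_exit to terminate itself, or it is forced to terminate abnormally, for example when an error path bails out early or another thread cancels it. If a thread terminates unexpectedly, the resources it holds may never be released, which undermines the stability of the program.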
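Handling
- pthread_cleanup_push and pthread_cleanup_pop: these two functions register and remove cleanup handlers. The registered handlers run automatically when the thread calls pthread_exit or is cancelled, and also when pthread_cleanup_pop is called with a nonzero argument. They do not run when the thread function simply returns, so every push must be matched by a pop before the return. Cleanup handlers are typically used to release resources held by the thread, such as file descriptors and memory. Here is an example: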

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

void cleanup_handler(void* arg) {
    printf("Cleaning up: %s\n", (char*)arg);
    free(arg);
}

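void* thread_function(void* arg) {
    char* data = (char*)malloc(100);
    if (data == NULL) {
        printf("Memory allocation failed\n");
        pthread_exit(NULL);
    }
    // Give the buffer defined contents before the handler prints it
    snprintf(data, 100, "thread buffer");
    // Register the cleanup handler
    pthread_cleanup_push(cleanup_handler, data);
    // The thread's work
    printf("Thread is running\n");
    // Simulate an abnormal exit; this runs cleanup_handler
    pthread_exit(NULL);
    // Never reached at run time, but required so that the push above
    // has a matching pop in the same scope
    pthread_cleanup_pop(1);
    return NULL;
}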

int main() {
    pthread_t my_thread;
    if (pthread_create(&my_thread, NULL, thread_function, NULL) != 0) {
        printf("\n ERROR creating thread");
        return 1;
    }
    if (pthread_join(my_thread, NULL) != 0) {
        printf("\n ERROR joining thread");
        return 2;
    }
    return 0;
}

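In this example, pthread_cleanup_push registers cleanup_handler. When the thread terminates through pthread_exit, cleanup_handler runs and frees the memory, so the buffer is not leaked even though the thread never reaches the end of its function.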

Signal Handling

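Nature
Signal handling is another source of trouble in multithreaded programs. Signal dispositions (the installed handlers) are shared by all threads of a process, and a signal sent to the process may be delivered to any thread that does not block it, so a signal handled in one thread can affect what the others are doing. For example, one thread may receive and handle SIGTERM while other threads are in the middle of an operation, leaving their state changed unexpectedly.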
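Handling
- Per-thread signal mask: pthread_sigmask sets the signal mask of the calling thread and so controls which signals can be delivered to it. For example, a thread performing a critical operation that must not be interrupted by certain signals can add them to its mask to block their delivery. Here is an example: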

#include <pthread.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

void* thread_function(void* arg) {
    sigset_t new_mask, old_mask;
    // Initialize the signal set
    sigemptyset(&new_mask);
    // Add SIGINT to the signal set
    sigaddset(&new_mask, SIGINT);
    // Block SIGINT in this thread, saving the previous mask
    pthread_sigmask(SIG_BLOCK, &new_mask, &old_mask);
    printf("Thread is running, SIGINT is blocked.\n");
    // Simulate the thread's work
    for (int i = 0; i < 10; i++) {
        sleep(1);
        printf("Thread is working...\n");
    }
    // Restore the previous signal mask
    pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
    printf("Thread finished, SIGINT is unblocked.\n");
    return NULL;
}

int main() {
    pthread_t my_thread;
    if (pthread_create(&my_thread, NULL, thread_function, NULL) != 0) {
        printf("\n ERROR creating thread");
        return 1;
    }
    if (pthread_join(my_thread, NULL) != 0) {
        printf("\n ERROR joining thread");
        return 2;
    }
    return 0;
}

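In this example, the thread blocks SIGINT with pthread_sigmask, so SIGINT is not delivered to this thread while it works. The mask is per-thread, however: the main thread here leaves SIGINT unblocked with its default action, so pressing Ctrl+C still terminates the whole process. To defer the signal for the entire program, block it in main before creating any threads, because a new thread inherits the mask of its creator.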

A Combined Exception-Handling Strategy

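Real multithreaded programs usually need several of these techniques at once. A network server, for example, may face resource contention, deadlock, abnormal thread exit, and signal-handling problems all at the same time.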

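First, protect shared resources such as the client connection pool and data caches with mutexes or read-write locks to keep data consistent and avoid races. Second, acquire multiple locks in a fixed order to prevent deadlock. For thread exit, use cleanup handlers to release resources. Finally, give each thread a signal mask that suits its role, so that signals are handled only where intended.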

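Logging also helps with debugging and locating failures. When something goes wrong, record details such as the thread ID, the type of failure, and the code location so that the problem can be tracked down quickly. For example: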

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <syslog.h>

void* thread_function(void* arg) {
    openlog("my_thread_program", LOG_PID, LOG_USER);
    // Placeholder for a consistency check on shared data. The variable is
    // local to this thread, so no race can actually occur here.
    int shared_variable = 0;
    shared_variable++;
    if (shared_variable != 1) {
        syslog(LOG_ERR, "Resource competition detected in thread %lu", (unsigned long)pthread_self());
    }
    // Placeholder for nested locking. The mutexes are local and taken in a
    // fixed order by a single thread, so this cannot deadlock.
    pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_t mutex2 = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_lock(&mutex1);
    printf("Thread %lu acquired mutex1\n", (unsigned long)pthread_self());
    pthread_mutex_lock(&mutex2);
    printf("Thread %lu acquired mutex2\n", (unsigned long)pthread_self());
    pthread_mutex_unlock(&mutex2);
    pthread_mutex_unlock(&mutex1);
    pthread_mutex_destroy(&mutex1);
    pthread_mutex_destroy(&mutex2);
    // Log an allocation failure before the thread exits
    char* data = (char*)malloc(100);
    if (data == NULL) {
        syslog(LOG_ERR, "Memory allocation failed in thread %lu", (unsigned long)pthread_self());
        closelog();
        pthread_exit(NULL);
    }
    free(data);
    closelog();
    return NULL;
}

int main() {
    pthread_t my_thread;
    if (pthread_create(&my_thread, NULL, thread_function, NULL) != 0) {
        printf("\n ERROR creating thread");
        return 1;
    }
    if (pthread_join(my_thread, NULL) != 0) {
        printf("\n ERROR joining thread");
        return 2;
    }
    return 0;
}

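In this example, syslog records failures as they are detected, here an allocation failure and a failed consistency check. The contention and nested-locking parts are placeholders: the variables involved are local to one thread, so they only mark where such checks and log calls would go in a real program.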

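Combining these strategies improves the stability and reliability of multithreaded C programs on Linux and helps them run robustly in complex, changing environments. When writing multithreaded code, stay alert to potential failures and follow sound coding conventions and design patterns to make them less likely in the first place.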