Steps for Calling C Code from Rust - 摩柯技术社区

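Background: Rust and C interoperability

In modern software development, different programming languages each have their strengths. C, with its efficiency, low-level control, and broad library support, is deeply rooted in systems programming and embedded development. Rust, a newer systems language, has been gaining ground in high-performance programming thanks to its memory safety and concurrency-friendly design. Real projects sometimes need to combine the two, for example to reuse an existing C library or to wrap C code in a safe Rust interface. Knowing how to call C code from Rust is therefore of practical value.

Overview of the workflow

Calling C code from Rust involves four key steps: writing the C code, compiling it into a library, configuring the Rust project to link that library, and writing the Rust code that makes the call. The sections below walk through each step.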

Writing the C code

A simple C function. First, we write a simple C function that computes the sum of two integers.
```
// add.c
int add(int a, int b) {
    return a + b;
}
```
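This function is very basic: it takes two integer parameters a and b and returns their sum. In real projects the C code may be far more complex and involve structs, pointer arithmetic, and other advanced features.
A more complex C function (with a struct). To show a more involved case, we write a C function that works on a rectangle struct and computes the rectangle's area.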
```
// rectangle.c
#include <stdio.h>

// Define the rectangle struct
typedef struct {
    int width;
    int height;
} Rectangle;

// Compute the area of a rectangle
int calculate_area(Rectangle rect) {
    return rect.width * rect.height;
}
```
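Here we define a Rectangle struct with two members, width and height. The calculate_area function takes a Rectangle by value and returns its area.

Compiling the C code into a library

Compiling a static library. The add.c code above can be compiled into a static library. On Unix-like systems (such as Linux and macOS), using the gcc compiler, run: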
```
gcc -c add.c -o add.o
ar rcs libadd.a add.o
```
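In the first command, the -c option compiles the source file into the object file add.o. In the second, the ar tool creates the static library libadd.a: r inserts the files into the archive, c creates the archive if it does not exist, and s writes an index for it. rectangle.c can be compiled into a static library the same way: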
```
gcc -c rectangle.c -o rectangle.o
ar rcs librectangle.a rectangle.o
```
Compiling a dynamic library. On Unix-like systems, a dynamic library can be built with the following commands. For add.c:
```
gcc -shared -fPIC add.c -o libadd.so
```
The -shared option tells the compiler to produce a shared library, and -fPIC generates position-independent code (PIC), which shared libraries generally require. For rectangle.c:
```
gcc -shared -fPIC rectangle.c -o librectangle.so
```

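Configuring the Rust project to link the C library

Cargo.toml and build script configuration. Assume we have a Rust project. The [dependencies] table in Cargo.toml only describes Rust crates: it has no kind key and cannot name a native library such as libadd.a. Native libraries are linked from a build script (build.rs in the project root) or with a #[link] attribute on the extern block. Cargo.toml itself needs nothing beyond the usual package metadata:
```toml
[package]
name = "rust_call_c"
version = "0.1.0"
edition = "2021"

[dependencies]
```
To link the static library, assuming libadd.a is in the lib folder under the project root, build.rs looks like this:
```rust
// build.rs
fn main() {
    // Search for native libraries in <project root>/lib.
    let dir = std::env::var("CARGO_MANIFEST_DIR").unwrap();
    println!("cargo:rustc-link-search=native={dir}/lib");
    // Link libadd.a statically.
    println!("cargo:rustc-link-lib=static=add");
}
```
To link the dynamic library instead, assuming libadd.so is in the same lib folder, change the link directive to:
```rust
    println!("cargo:rustc-link-lib=dylib=add");
```
The rectangle library is handled the same way. For the static library:
```rust
    println!("cargo:rustc-link-lib=static=rectangle");
```
For the dynamic library:
```rust
    println!("cargo:rustc-link-lib=dylib=rectangle");
```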

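Specifying library search paths. Sometimes the library files are not in the lib folder under the project root. For dynamic libraries on Unix-like systems, the LD_LIBRARY_PATH environment variable tells the dynamic loader where to look when the program runs (macOS uses DYLD_LIBRARY_PATH for the same purpose). For example, if libadd.so is in /usr/local/lib/custom, run: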
```
export LD_LIBRARY_PATH=/usr/local/lib/custom:$LD_LIBRARY_PATH
```
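For static libraries, and for locating dynamic libraries at link time, rustflags can pass a library search path to the compiler. The setting belongs in .cargo/config.toml rather than Cargo.toml. For example:
```toml
# .cargo/config.toml
[build]
rustflags = ["-L", "/usr/local/lib/custom"]
```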


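Writing the Rust code that makes the call

Calling a simple C function (add). In Rust, external C functions are declared in an extern "C" block.
```rust
use std::ffi::c_int;

extern "C" {
    // Must match `int add(int a, int b)` in add.c.
    fn add(a: c_int, b: c_int) -> c_int;
}

fn main() {
    // Calls into C are unsafe: the compiler cannot check the C side.
    let result = unsafe { add(3, 5) };
    println!("The result of 3 + 5 is: {}", result);
}
```
Inside the extern "C" block we declare the C function add, which takes two C int parameters and returns a C int (c_int, which is i32 on mainstream platforms). main calls it inside an unsafe block, because Rust cannot verify that foreign code upholds its safety guarantees.

Calling a more complex C function (calculate_area). For calculate_area, which works on a struct, we must define a matching struct in Rust and make sure its memory layout is identical to that of the C struct.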

```rust
use std::ffi::c_int;

// #[repr(C)] gives the struct the same field order and padding as in C.
#[repr(C)]
#[derive(Copy, Clone)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

extern "C" {
    fn calculate_area(rect: Rectangle) -> c_int;
}

fn main() {
    let rect = Rectangle { width: 4, height: 6 };
    let result = unsafe { calculate_area(rect) };
    println!("The area of the rectangle is: {}", result);
}
```
The #[repr(C)] attribute makes the Rust struct use the same memory layout as the C struct, and #[derive(Copy, Clone)] lets the struct be copied, so passing it by value does not move it. In main we create a Rectangle instance and call calculate_area inside an unsafe block.
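The layout claim can be checked from the Rust side alone. The following is a minimal sketch, assuming the same two-field Rectangle as above; the helper height_offset is a name introduced here for illustration and is not part of the original example.

```rust
use std::ffi::c_int;
use std::mem::{align_of, size_of};

#[repr(C)]
#[derive(Copy, Clone)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

/// Byte offset of `height` inside `Rectangle`, computed from field addresses.
/// (Hypothetical helper, used only for this check.)
fn height_offset() -> usize {
    let rect = Rectangle { width: 0, height: 0 };
    let base = &rect as *const Rectangle as usize;
    let field = &rect.height as *const c_int as usize;
    field - base
}

fn main() {
    // With #[repr(C)], fields are laid out in declaration order, as in C.
    println!("size = {}", size_of::<Rectangle>());
    println!("align = {}", align_of::<Rectangle>());
    println!("height offset = {}", height_offset());
}
```

Comparing these numbers with sizeof and offsetof on the C side is a cheap way to catch a mismatched struct definition before it corrupts data.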

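Handling C pointers

C functions that take pointer parameters. Suppose we have a C function that doubles every element of an integer array.
```c
// double_array.c
void double_array(int *arr, int len) {
    for (int i = 0; i < len; i++) {
        arr[i] *= 2;
    }
}
```
Calling this function from Rust means working with raw pointers.
```rust
use std::ffi::c_int;

extern "C" {
    fn double_array(arr: *mut c_int, len: c_int);
}

fn main() {
    let mut arr: [c_int; 5] = [1, 2, 3, 4, 5];
    let len = arr.len() as c_int;
    // Raw mutable pointer to the first element of the array.
    let arr_ptr = arr.as_mut_ptr();

    unsafe {
        double_array(arr_ptr, len);
    }

    for num in arr.iter() {
        println!("{}", num);
    }
}
```
First we obtain a mutable raw pointer to the array with as_mut_ptr, then call the C function inside an unsafe block. Rust cannot guarantee that the C function uses the pointer safely (for example, that it stays within len elements), so such calls need care.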
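One common way to contain that risk is a safe wrapper that derives the pointer and length from a slice. The sketch below is an illustration under assumptions: double_array is replaced by a Rust stand-in with the same signature so the code runs without the C library, and double_in_place is a hypothetical wrapper name.

```rust
use std::convert::TryFrom;
use std::ffi::c_int;
use std::num::TryFromIntError;

// Stand-in with the same signature as the C `double_array`, written in Rust
// only so this sketch runs without linking the C library.
unsafe extern "C" fn double_array(arr: *mut c_int, len: c_int) {
    for i in 0..len {
        unsafe { *arr.add(i as usize) *= 2 };
    }
}

/// Hypothetical safe wrapper: the slice supplies a valid pointer and length.
fn double_in_place(values: &mut [c_int]) -> Result<(), TryFromIntError> {
    // Reject slices whose length does not fit in a C `int`.
    let len = c_int::try_from(values.len())?;
    // SAFETY: `values` is a live, exclusive slice of exactly `len` elements.
    unsafe { double_array(values.as_mut_ptr(), len) };
    Ok(())
}

fn main() {
    let mut arr = [1, 2, 3, 4, 5];
    double_in_place(&mut arr).expect("length fits in c_int");
    println!("{:?}", arr);
}
```

Callers of the wrapper never write unsafe themselves, and the length can no longer disagree with the buffer.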

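C functions that return pointers. Suppose we have a C function that creates an array holding the sum and the difference of two integers and returns a pointer to it.
```c
// sum_diff_array.c
#include <stdlib.h>

int* sum_diff_array(int a, int b) {
    int *result = (int *)malloc(2 * sizeof(int));
    if (result == NULL) {
        return NULL; // allocation failed; the caller checks for NULL
    }
    result[0] = a + b;
    result[1] = a - b;
    return result;
}
```
When calling this function from Rust, we have to handle the returned pointer and manage its memory.
```rust
use std::ffi::{c_int, c_void};

extern "C" {
    fn sum_diff_array(a: c_int, b: c_int) -> *mut c_int;
    // free() from the C standard library, which Rust programs normally link.
    fn free(ptr: *mut c_void);
}

fn main() {
    let a: c_int = 5;
    let b: c_int = 3;
    unsafe {
        let result_ptr = sum_diff_array(a, b);
        if !result_ptr.is_null() {
            let sum = *result_ptr;
            let diff = *result_ptr.add(1);
            println!("Sum: {}, Diff: {}", sum, diff);
            // Release the memory that the C side allocated with malloc.
            free(result_ptr as *mut c_void);
        }
    }
}
```
Here we take the pointer returned by sum_diff_array, check that it is not null, and read the two values from the array. Finally we call free to release the memory allocated inside the C function and avoid a leak; memory must be released by the same allocator that created it. free is declared in the extern "C" block, and c_void comes from std::ffi, so the libc crate is not required for this example.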
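Manual calls to free are easy to forget on early returns, so C allocations are often wrapped in a type whose Drop implementation releases them. The sketch below shows that pattern under assumptions: sum_diff_array and the release function are Rust stand-ins (backed by Box) so the code runs without the C library, and SumDiff and release are hypothetical names. In a real project the two stand-ins would be the C function and free.

```rust
use std::ffi::c_int;

// Stand-ins for the C side, written in Rust so the sketch runs on its own.
unsafe extern "C" fn sum_diff_array(a: c_int, b: c_int) -> *mut c_int {
    Box::into_raw(Box::new([a + b, a - b])) as *mut c_int
}

unsafe extern "C" fn release(ptr: *mut c_int) {
    // Only valid for pointers produced by the stand-in above.
    unsafe { drop(Box::from_raw(ptr as *mut [c_int; 2])) };
}

/// Hypothetical owner type: releases the allocation when it goes out of scope.
struct SumDiff(*mut c_int);

impl SumDiff {
    fn new(a: c_int, b: c_int) -> Option<Self> {
        let ptr = unsafe { sum_diff_array(a, b) };
        if ptr.is_null() {
            None
        } else {
            Some(SumDiff(ptr))
        }
    }

    fn sum(&self) -> c_int {
        // SAFETY: the pointer is non-null and points at two ints.
        unsafe { *self.0 }
    }

    fn diff(&self) -> c_int {
        unsafe { *self.0.add(1) }
    }
}

impl Drop for SumDiff {
    fn drop(&mut self) {
        // SAFETY: the pointer is released exactly once, here.
        unsafe { release(self.0) };
    }
}

fn main() {
    let pair = SumDiff::new(5, 3).expect("allocation failed");
    println!("Sum: {}, Diff: {}", pair.sum(), pair.diff());
    // `pair` is dropped at the end of main, which releases the memory.
}
```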

Handling C struct pointers

C functions that take a struct pointer. Suppose we have a C function that changes the width and height of a rectangle struct.
```c
// modify_rectangle.c
#include <stdio.h>

typedef struct {
    int width;
    int height;
} Rectangle;

void modify_rectangle(Rectangle *rect, int new_width, int new_height) {
    rect->width = new_width;
    rect->height = new_height;
}
```
Calling this function from Rust means passing a pointer to the struct.
```rust
use std::ffi::c_int;

#[repr(C)]
#[derive(Copy, Clone)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

extern "C" {
    fn modify_rectangle(rect: *mut Rectangle, new_width: c_int, new_height: c_int);
}

fn main() {
    let mut rect = Rectangle { width: 2, height: 3 };
    let new_width: c_int = 4;
    let new_height: c_int = 5;
    // Raw mutable pointer to the Rust-owned struct.
    let rect_ptr = &mut rect as *mut Rectangle;

    unsafe {
        modify_rectangle(rect_ptr, new_width, new_height);
    }

    println!("New width: {}, New height: {}", rect.width, rect.height);
}
```
Here we create a Rectangle instance and take a mutable raw pointer to it. The C function, called inside an unsafe block, then modifies the struct's members in place.

C functions that return a struct pointer. Suppose we have a C function that creates a new rectangle struct and returns a pointer to it.
```c
// create_rectangle.c
#include <stdlib.h>

typedef struct {
    int width;
    int height;
} Rectangle;

Rectangle* create_rectangle(int width, int height) {
    Rectangle *rect = (Rectangle *)malloc(sizeof(Rectangle));
    if (rect == NULL) {
        return NULL; // allocation failed; the caller checks for NULL
    }
    rect->width = width;
    rect->height = height;
    return rect;
}
```
When calling this function from Rust, we have to handle the returned struct pointer and manage its memory.
```rust
use std::ffi::{c_int, c_void};

#[repr(C)]
#[derive(Copy, Clone)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

extern "C" {
    fn create_rectangle(width: c_int, height: c_int) -> *mut Rectangle;
    fn free(ptr: *mut c_void);
}

fn main() {
    let width: c_int = 4;
    let height: c_int = 6;
    unsafe {
        let rect_ptr = create_rectangle(width, height);
        if !rect_ptr.is_null() {
            // Copy the struct out of C-owned memory before freeing it.
            let rect = *rect_ptr;
            println!("Width: {}, Height: {}", rect.width, rect.height);
            free(rect_ptr as *mut c_void);
        }
    }
}
```
Here we take the struct pointer returned by create_rectangle, check that it is not null, and read the struct's values. Finally we call free to release the memory allocated inside the C function.

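Handling C callbacks

Using a callback in C. Suppose we have a C function that takes a callback function pointer and applies the callback to every element of an array.
```c
// apply_callback.c
#include <stdio.h>

typedef int (*Callback)(int);

void apply_callback(int *arr, int len, Callback callback) {
    for (int i = 0; i < len; i++) {
        arr[i] = callback(arr[i]);
    }
}

int square(int num) {
    return num * num;
}
```
To call this function from Rust, we define a matching callback and pass it to the C function.
```rust
use std::ffi::c_int;

extern "C" {
    fn apply_callback(
        arr: *mut c_int,
        len: c_int,
        callback: Option<unsafe extern "C" fn(c_int) -> c_int>,
    );
}

// Rust-side callback. Without #[no_mangle] its symbol does not clash
// with the `square` defined in the C file.
unsafe extern "C" fn square(num: c_int) -> c_int {
    num * num
}

fn main() {
    let mut arr: [c_int; 5] = [1, 2, 3, 4, 5];
    let len = arr.len() as c_int;
    let arr_ptr = arr.as_mut_ptr();

    unsafe {
        apply_callback(arr_ptr, len, Some(square));
    }

    for num in arr.iter() {
        println!("{}", num);
    }
}
```
We first define a Rust function square that mirrors the C square function and mark it unsafe extern "C" so that it uses the C calling convention. In main we pass a pointer to this callback to apply_callback. Rust function pointers can never be null, so a C function pointer that may be NULL is expressed as an Option wrapped around the function pointer type.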
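The Option wrapper costs nothing at the ABI level: None is represented as the null pointer, so the type has the same size as a bare function pointer. The sketch below checks this and shows both cases. It rests on an assumption: apply_callback is a Rust stand-in with the same signature as the C function, so the example runs without the C library, and the Callback alias is introduced here for readability.

```rust
use std::ffi::c_int;
use std::mem::size_of;

// Same shape as the C typedef `int (*Callback)(int)`.
type Callback = unsafe extern "C" fn(c_int) -> c_int;

unsafe extern "C" fn square(num: c_int) -> c_int {
    num * num
}

// Stand-in for the C `apply_callback`, written in Rust so the sketch runs
// on its own. A NULL function pointer from C would arrive here as `None`.
unsafe extern "C" fn apply_callback(arr: *mut c_int, len: c_int, callback: Option<Callback>) {
    if let Some(cb) = callback {
        for i in 0..len {
            unsafe {
                let slot = arr.add(i as usize);
                *slot = cb(*slot);
            }
        }
    }
}

fn main() {
    // Option<fn pointer> is pointer-sized: `None` occupies the NULL value.
    assert_eq!(size_of::<Option<Callback>>(), size_of::<Callback>());

    let mut arr: [c_int; 3] = [1, 2, 3];
    unsafe { apply_callback(arr.as_mut_ptr(), 3, Some(square)) };
    println!("{:?}", arr);
}
```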

A more complex callback (with a struct). Suppose we have a C function that takes a struct pointer and a callback function pointer, where the callback processes the data in the struct.
```c
// process_rectangle.c
#include <stdio.h>

typedef struct {
    int width;
    int height;
} Rectangle;

typedef int (*RectangleCallback)(Rectangle *);

void process_rectangle(Rectangle *rect, RectangleCallback callback) {
    int result = callback(rect);
    printf("Process result: %d\n", result);
}

int calculate_area(Rectangle *rect) {
    return rect->width * rect->height;
}
```
To call this function from Rust, we define the matching struct and callback.
```rust
use std::ffi::c_int;

#[repr(C)]
#[derive(Copy, Clone)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

extern "C" {
    fn process_rectangle(
        rect: *mut Rectangle,
        callback: Option<unsafe extern "C" fn(*mut Rectangle) -> c_int>,
    );
}

// Rust-side callback: reads the struct through the raw pointer.
unsafe extern "C" fn calculate_area(rect: *mut Rectangle) -> c_int {
    (*rect).width * (*rect).height
}

fn main() {
    let mut rect = Rectangle { width: 4, height: 6 };
    let rect_ptr = &mut rect as *mut Rectangle;

    unsafe {
        process_rectangle(rect_ptr, Some(calculate_area));
    }
}
```
Here we define the Rectangle struct and the calculate_area callback in Rust. In main we pass the struct pointer and the callback pointer to process_rectangle.
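The callback above dereferences whatever pointer C hands it. If the C side might pass NULL, the callback can check first. This is a minimal sketch of that check; returning -1 for a NULL pointer is a convention assumed here for illustration, not something the C code above defines.

```rust
use std::ffi::c_int;
use std::ptr;

#[repr(C)]
struct Rectangle {
    width: c_int,
    height: c_int,
}

// Callback variant that tolerates NULL. The -1 result for NULL is a
// hypothetical convention chosen for this sketch.
unsafe extern "C" fn calculate_area(rect: *mut Rectangle) -> c_int {
    // as_ref() turns a raw pointer into Option<&T>: None for NULL.
    let rect_ref = unsafe { rect.as_ref() };
    match rect_ref {
        Some(r) => r.width * r.height,
        None => -1,
    }
}

fn main() {
    let mut rect = Rectangle { width: 4, height: 6 };
    let area = unsafe { calculate_area(&mut rect) };
    let missing = unsafe { calculate_area(ptr::null_mut()) };
    println!("area = {}, null = {}", area, missing);
}
```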

Handling C global variables

Accessing a C global variable. Suppose we have a C file that defines a global variable.
```
// global_variable.c
int global_value = 10;
```
To access this global variable from Rust, declare it in an extern "C" block.
```
extern "C" {
    static global_value: i32;
}

fn main() {
    unsafe {
        println!("Global value from C: {}", global_value);
    }
}
```
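Here the static keyword inside the extern "C" block declares the C global variable global_value, which is then read inside an unsafe block.
Modifying a C global variable. To modify a C global variable, declare it as static mut in the extern "C" block and assign to it inside an unsafe block.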
```
extern "C" {
    static mut global_value: i32;
}

fn main() {
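    unsafe {
        global_value = 20;
        // Copy the value out first: formatting a `static mut` directly
        // would take a reference to it.
        let value = global_value;
        println!("Modified global value from C: {}", value);
    }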
}
```
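Note that the mut keyword declares the global as mutable, and that the modification happens inside an unsafe block. Because several threads, on the Rust side or the C side, may access a global variable, such writes need particular care to avoid data races.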
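One way to keep Rust-side accesses race-free is to route them all through a lock. The sketch below makes several assumptions: GLOBAL_VALUE is a Rust stand-in for the C global so the code runs without the C library, and GLOBAL_LOCK, add_to_global, and read_global are hypothetical names. The lock only serializes Rust code; C code that touches the same variable must cooperate by its own means.

```rust
use std::ffi::c_int;
use std::ptr::{addr_of, addr_of_mut};
use std::sync::Mutex;
use std::thread;

// Stand-in for the C global; in a real project this would be the
// `static mut global_value` declared in an extern "C" block.
static mut GLOBAL_VALUE: c_int = 10;

// Hypothetical lock that every Rust-side access goes through.
static GLOBAL_LOCK: Mutex<()> = Mutex::new(());

fn add_to_global(delta: c_int) {
    let _guard = GLOBAL_LOCK.lock().unwrap();
    // SAFETY: the lock serializes all Rust accesses to the global.
    unsafe {
        let p = addr_of_mut!(GLOBAL_VALUE);
        p.write(p.read() + delta);
    }
}

fn read_global() -> c_int {
    let _guard = GLOBAL_LOCK.lock().unwrap();
    // SAFETY: same lock as above; raw pointers avoid references to static mut.
    unsafe { addr_of!(GLOBAL_VALUE).read() }
}

fn main() {
    let handles: Vec<_> = (0..4)
        .map(|_| thread::spawn(|| add_to_global(1)))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("global = {}", read_global());
}
```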

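Cross-platform considerations

Library file naming on different operating systems. On Windows, static libraries use the .lib extension and dynamic libraries use .dll. On Unix-like systems (such as Linux and macOS), static libraries are usually named libxxx.a and dynamic libraries libxxx.so (libxxx.dylib on macOS). Link settings refer to a library by its bare name and the toolchain maps that name to the platform's file name, but the link kind and search paths may still need adjusting per operating system. For example, to link the static library add.lib on Windows, the directive in a build script (build.rs) might be:
```rust
println!("cargo:rustc-link-lib=static=add");
```
To link libadd.so on Linux, it becomes:
```rust
println!("cargo:rustc-link-lib=dylib=add");
```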

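Compiler compatibility. C code can behave slightly differently under different compilers. For example, gcc and clang may use different defaults for some C standard features. When compiling C code into a library, make sure the compiler options you use work correctly on every target platform. The compiler also affects how the library links into Rust: on Windows, C code built with MinGW and C code built with the Visual Studio C compiler may need different configuration when linked into a Rust project. Real projects need thorough testing to confirm cross-platform compatibility.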
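The naming conventions described above can be written down as a small lookup. This is an illustrative sketch only: lib_file_names is a hypothetical helper, it covers just the three conventions mentioned in the text, and it assumes the MSVC toolchain on Windows.

```rust
/// File names (static, dynamic) that correspond to a library name on a
/// given target OS. Hypothetical helper covering only the conventions
/// described in the text; MSVC naming is assumed for Windows.
fn lib_file_names(target_os: &str, name: &str) -> (String, String) {
    match target_os {
        "windows" => (format!("{name}.lib"), format!("{name}.dll")),
        "macos" => (format!("lib{name}.a"), format!("lib{name}.dylib")),
        _ => (format!("lib{name}.a"), format!("lib{name}.so")),
    }
}

fn main() {
    // std::env::consts::OS names the OS this program was compiled for.
    let (static_lib, dynamic_lib) = lib_file_names(std::env::consts::OS, "add");
    println!("static: {static_lib}, dynamic: {dynamic_lib}");
}
```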

Error handling

C functions that return an error code. Suppose we have a C function that opens a file and returns its file descriptor, or -1 if the open fails.
```c
// open_file.c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int open_file(const char *filename) {
    int fd = open(filename, O_RDONLY);
    if (fd == -1) {
        return -1;
    }
    return fd;
}
```
When calling this function from Rust, we need to handle the error code.
```rust
use std::ffi::{c_char, c_int, CStr};

extern "C" {
    fn open_file(filename: *const c_char) -> c_int;
}

fn main() {
    // The byte string must end with a NUL terminator, as C expects.
    let filename = CStr::from_bytes_with_nul(b"test.txt\0").unwrap();
    let fd: c_int = unsafe { open_file(filename.as_ptr()) };
    if fd == -1 {
        eprintln!("Failed to open file");
    } else {
        println!("File opened with fd: {}", fd);
    }
}
```
Here we check whether the C function returned the error code -1 to decide whether the file was opened successfully, and handle the failure accordingly.
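In Rust code, a sentinel value such as -1 is usually converted to a Result at the FFI boundary so that callers can use the ? operator. The sketch below illustrates this under assumptions: open_file is a Rust stand-in that pretends only "test.txt" exists and returns a made-up descriptor 3, and open_checked is a hypothetical wrapper name.

```rust
use std::ffi::{c_char, c_int, CStr, CString};
use std::io;

// Stand-in for the C `open_file`, written in Rust so the sketch runs on
// its own: it "succeeds" with descriptor 3 only for "test.txt".
unsafe extern "C" fn open_file(filename: *const c_char) -> c_int {
    let name = unsafe { CStr::from_ptr(filename) };
    if name.to_bytes() == b"test.txt" {
        3
    } else {
        -1
    }
}

/// Hypothetical safe wrapper that maps the -1 sentinel to an error.
fn open_checked(path: &str) -> io::Result<c_int> {
    // CString::new fails if `path` contains an interior NUL byte.
    let c_path = CString::new(path)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    // SAFETY: `c_path` is a valid NUL-terminated string for the whole call.
    let fd = unsafe { open_file(c_path.as_ptr()) };
    if fd == -1 {
        Err(io::Error::new(io::ErrorKind::Other, "open_file returned -1"))
    } else {
        Ok(fd)
    }
}

fn main() {
    match open_checked("test.txt") {
        Ok(fd) => println!("File opened with fd: {}", fd),
        Err(err) => eprintln!("Failed to open file: {}", err),
    }
}
```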

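Error handling with errno. In C, errno holds the error code of the most recent failed system call or library call; on modern systems it is a macro that expands to a per-thread value rather than a plain global variable. Suppose we have a C function that reads the contents of a file and returns the errno value on failure.
```c
// read_file.c
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>

int read_file(int fd, char *buffer, size_t len) {
    ssize_t bytes_read = read(fd, buffer, len);
    if (bytes_read == -1) {
        return errno;
    }
    return 0;
}
```
When calling this function from Rust, we need to handle the returned error code.
```rust
use std::ffi::{c_char, c_int, CStr};
use std::io::Error;

extern "C" {
    // C's size_t corresponds to usize in Rust.
    fn read_file(fd: c_int, buffer: *mut c_char, len: usize) -> c_int;
}

fn main() {
    let mut buffer = [0 as c_char; 1024];
    let fd = 3; // assumes this file descriptor is already open
    // Leave the last byte untouched so the buffer stays NUL-terminated.
    let len = buffer.len() - 1;
    unsafe {
        let result = read_file(fd, buffer.as_mut_ptr(), len);
        if result != 0 {
            // read_file returns the errno value itself on failure.
            let err = Error::from_raw_os_error(result);
            eprintln!("Read file error: {}", err);
        } else {
            let c_str = CStr::from_ptr(buffer.as_ptr());
            println!("Read content: {}", c_str.to_string_lossy());
        }
    }
}
```
Because read_file returns the errno value directly, we pass that return value to Error::from_raw_os_error to obtain a Rust Error. errno itself is not a function and cannot be declared as one in an extern "C" block; to read it from Rust right after a failing call, std::io::Error::last_os_error() is available.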
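The conversion from an errno-style return value to a Rust error can be isolated in a small helper. This is a minimal sketch; check_errno is a hypothetical name, and the code value 2 used in main is only an example input (it is ENOENT on typical Unix systems).

```rust
use std::ffi::c_int;
use std::io;

/// Turn an errno-style return value (0 means success) into an io::Result.
/// Hypothetical helper for functions that return the errno value directly.
fn check_errno(code: c_int) -> io::Result<()> {
    if code == 0 {
        Ok(())
    } else {
        Err(io::Error::from_raw_os_error(code))
    }
}

fn main() {
    // Example input only: pretend a C call returned error code 2.
    match check_errno(2) {
        Ok(()) => println!("success"),
        Err(err) => eprintln!("C call failed: {}", err),
    }
}
```

The resulting io::Error keeps the numeric code (available through raw_os_error) and prints the operating system's message for it.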

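Performance optimization

Reducing memory copies. When Rust and C exchange data, avoid unnecessary copies. For arrays and structs, pass a pointer rather than a copy of the whole value where possible. For structs, #[repr(C)] keeps the memory layout identical on both sides, so no conversion is needed when passing them. For strings, use the CStr and CString types to exchange data with C efficiently and keep string copies to a minimum.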
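The following sketch shows where copies do and do not happen with CString and CStr. It assumes a Rust stand-in, count_bytes, for a C function that only reads a string (similar in spirit to strlen); the name is hypothetical.

```rust
use std::ffi::{c_char, CStr, CString};

// Stand-in for a C function that only reads a NUL-terminated string,
// written in Rust so the sketch runs without a C library.
unsafe extern "C" fn count_bytes(s: *const c_char) -> usize {
    let text = unsafe { CStr::from_ptr(s) };
    text.to_bytes().len()
}

fn main() {
    // One allocation: CString copies the text and appends the NUL that C expects.
    let owned = CString::new("hello from Rust").expect("no interior NUL");

    // as_ptr() lends the existing buffer to C; nothing is copied here.
    let n = unsafe { count_bytes(owned.as_ptr()) };

    // CStr::from_ptr borrows the same bytes back, again without copying.
    let borrowed: &CStr = unsafe { CStr::from_ptr(owned.as_ptr()) };
    assert_eq!(borrowed.as_ptr(), owned.as_ptr());

    println!("{} bytes: {:?}", n, borrowed);
}
```

The pointer from as_ptr() is only valid while owned is alive, so the CString must outlive every C call that uses it.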
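Inline function optimization. If a C function is very short and called frequently, it can be declared inline with the inline keyword (some compilers need particular options for this). The C compiler can then embed the body at call sites in C code and save the call overhead. This does not carry across the FFI boundary by itself: a call from Rust into a compiled C library remains an ordinary function call unless cross-language link-time optimization is set up. For example:
```
// inline_function.c
// Under C99 rules, `inline` alone may not emit an external symbol.
// The extra `extern` declaration forces one that Rust can link against.
inline int add(int a, int b) {
    return a + b;
}
extern inline int add(int a, int b);
```
Rust calls this function the same way it calls any other C function. Any speedup comes from C-side callers, or from cross-language link-time optimization when it is enabled.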
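Compiler optimization options. Using suitable optimization options when compiling the C code into a library can improve performance. With gcc, for example, the -O2 or -O3 options enable optimization. The Rust project can likewise be built in --release mode, which turns optimizations on. For example, when compiling the C code into a static library: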
```
gcc -O2 -c add.c -o add.o
ar rcs libadd.a add.o
```
When compiling the Rust project:
```
cargo build --release
```
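These options apply a range of optimizations to the code, such as reducing the number of instructions and improving cache behavior, which raises overall performance.

With the steps and examples above, we have covered how to call C code from Rust, the points that need attention, and several optimization techniques. Applying them according to a project's needs makes it possible to draw on the strengths of both Rust and C and build efficient, reliable software.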