I've been reading the RocksDB source code lately, and a great many of its atomic operations involve memory orders. In the past I mostly used atomics for simple counting, and in that simple scenario the different memory orders make no practical difference. So why do all these memory orders exist?

Let's first look at how the memory orders are defined:

typedef enum memory_order {
    memory_order_relaxed,   // relaxed
    memory_order_consume,   // consume
    memory_order_acquire,   // acquire
    memory_order_release,   // release
    memory_order_acq_rel,   // acquire/release
    memory_order_seq_cst    // sequentially consistent
} memory_order;

These memory orders are passed as parameters to the various member functions of atomic variables. First, one thing must be clear: an atomic operation is exactly that, atomic. For a given atomic variable, no matter which memory order is used, once an operation on it completes in one thread, the result is immediately visible to any other thread; you will never read a stale value of that variable because some cache has not been flushed. That is the core guarantee of an atomic operation.
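
That guarantee alone already covers the simple counting scenario mentioned at the beginning. A minimal sketch (the names here are mine, not from RocksDB): several threads bump a shared counter with memory_order_relaxed, and the final count is still exact, because only the atomicity of the increment matters, not its ordering relative to other memory accesses.

#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

std::atomic<long> counter{0};

int main() {
    std::vector<std::thread> workers;
    for (int i = 0; i < 4; ++i) {
        // Each thread bumps the counter 100000 times.
        // relaxed is enough here: we only need the increment itself to be
        // atomic, not any ordering with the surrounding code.
        workers.emplace_back([] {
            for (int j = 0; j < 100000; ++j)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    }
    for (auto &t : workers) t.join();
    std::printf("%ld\n", counter.load());  // always prints 400000
    return 0;
}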

So why define so many memory orders? With an atomic variable we can indeed guarantee that the variable itself is synchronized between threads, but what about the code around it? For various reasons, such as CPU caches and out-of-order execution, the statements before and after the line containing the atomic operation do not necessarily execute in program order. Through the memory order parameter, the atomic operation becomes a synchronization point whose behavior we control: we specify the visibility of its surrounding context. That is what the memory orders mean.

Here are the corresponding explanations from cplusplus.com:

memory_order_relaxed
The operation is ordered to happen atomically at some point.
This is the loosest memory order, providing no guarantees on how memory accesses in different threads are ordered with respect to the atomic operation.

memory_order_consume
[Applies to loading operations]
The operation is ordered to happen once all accesses to memory in the releasing thread that carry a dependency on the releasing operation (and that have visible side effects on the loading thread) have happened.

memory_order_acquire
[Applies to loading operations]
The operation is ordered to happen once all accesses to memory in the releasing thread (that have visible side effects on the loading thread) have happened.

memory_order_release
[Applies to storing operations]
The operation is ordered to happen before a consume or acquire operation, serving as a synchronization point for other accesses to memory that may have visible side effects on the loading thread.

memory_order_acq_rel
[Applies to loading/storing operations]
The operation loads acquiring and stores releasing (as defined above for memory_order_acquire and memory_order_release).

memory_order_seq_cst
The operation is ordered in a sequentially consistent manner: All operations using this memory order are ordered to happen once all accesses to memory that may have visible side effects on the other threads involved have already happened.
This is the strictest memory order, guaranteeing the least unexpected side effects between thread interactions through the non-atomic memory accesses.
For consume and acquire loads, sequentially consistent store operations are considered releasing operations.

Here is a simple example. We define an exit variable that is atomic; when one thread sets the exit flag, it triggers some operations in another thread.

Pseudocode

// Shared state: exit is the atomic flag, xxx is a plain variable
std::atomic<bool> exit{false};
int xxx = 0;

// Thread 1
xxx = 1;
exit = true;

// Thread 2
while (!exit) {
	sleep(xx);
}

// Do things that depend on variable xxx
if (xxx == 1) xxxx;

It's a quickly written example, but it shows the problem described above. Although in the code xxx is assigned before exit, for the reasons mentioned earlier the assignment to exit may become visible to thread two before the assignment to xxx. At that point things go wrong: thread two sees that exit has been set, yet the xxx == 1 branch is not taken because the write to xxx is not yet visible, and chaos ensues.

This is where memory orders come in. Before C++11 introduced memory_order, you typically had to write the various memory barriers by hand.
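
Just to illustrate what "manual barriers" used to look like, here is a pre-C++11 style sketch of the same writer/reader pair using GCC's __sync_synchronize() builtin (a full barrier) together with a volatile flag. This relies on compiler- and platform-specific behavior and is shown only for contrast, not as a recommendation; exit_flag mirrors the exit variable from the pseudocode above.

// Pre-C++11 style sketch, GCC/pthreads, for illustration only.
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

volatile int exit_flag = 0;  // volatile keeps the compiler re-reading the flag
int xxx = 0;

void *writer(void *) {
    xxx = 1;
    __sync_synchronize();    // full barrier: the write to xxx becomes
                             // visible before the write to exit_flag
    exit_flag = 1;
    return NULL;
}

void *reader(void *) {
    while (!exit_flag)
        usleep(1000);
    __sync_synchronize();    // barrier on the reader side before using xxx
    if (xxx == 1)
        printf("saw xxx == 1\n");
    return NULL;
}

int main() {
    pthread_t r, w;
    pthread_create(&r, NULL, reader, NULL);
    pthread_create(&w, NULL, writer, NULL);
    pthread_join(r, NULL);
    pthread_join(w, NULL);
    return 0;
}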

With C++11 we can handle this scenario conveniently through memory orders. A memory order takes the corresponding atomic operation as its reference point and constrains how the statements before or after that point may be ordered and become visible.
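
For instance, the exit example above can be fixed with a release store paired with an acquire load (a minimal sketch; the acquire/release semantics are explained in the list below, and the flag is renamed exit_flag here to avoid clashing with std::exit):

#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> exit_flag{false};
int xxx = 0;

void thread_one() {
    xxx = 1;
    // release: the write to xxx cannot be reordered after this store,
    // and it becomes visible to whoever acquires exit_flag
    exit_flag.store(true, std::memory_order_release);
}

void thread_two() {
    // acquire: once we observe exit_flag == true, every write made before
    // the release store (including xxx = 1) is visible here
    while (!exit_flag.load(std::memory_order_acquire)) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    if (xxx == 1) {
        // guaranteed to see xxx == 1
    }
}

int main() {
    std::thread t2(thread_two);
    std::thread t1(thread_one);
    t1.join();
    t2.join();
    return 0;
}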

Let's look at what each option means:

  • memory_order_relaxed. The loosest memory order: only the atomicity of the operation itself is guaranteed, with no constraints on how the context around the atomic operation is ordered.
  • memory_order_consume. Applies to load operations: loads that carry a dependency on the value read will not be reordered before this load. This one is a bit hard to grasp, so here is a simple example:

    The example below is quoted from Zhihu user program.jerry:

    std::atomic<int> net_con{0};
    std::atomic<int> has_alloc{0};
    char buffer[1024];
    char file_content[1024];

    void release_thread(void) {
        sprintf(buffer, "%s", "something_to_read_tobuffer");

        // The two stores below have nothing to do with buffer.
        // net_con marks that a connection has been received.
        net_con.store(1, std::memory_order_release);
        // Mark that memory has been allocated for the connection.
        has_alloc.store(1, std::memory_order_release);
    }

    // consume example
    std::atomic<int*> global_addr{nullptr};

    void func(int *data) {
        int *addr = global_addr.load(std::memory_order_consume);
        int d = *data;
        int f = *(data + 1);
        if (addr) {
            int x = *addr;
        }
    }

    Because the load here uses the consume memory order, every statement that depends on addr is ordered after the consume load, while the remaining statements (the reads through data carry no dependency on addr) may be reordered freely around it.

  • memory_order_acquire. Applies to load operations: no read or write that appears after the acquire load can be reordered before it. Writes made by another thread before its release store on the same atomic become visible to the operations that follow the acquire.

  • memory_order_release. Applies to store operations: no read or write that appears before the release store can be reordered after it. Everything written before it becomes visible to a thread that performs an acquire or consume load on the same atomic.

  • memory_order_acq_rel. Applies to read-modify-write operations and carries both acquire and release semantics: no memory access before the operation is reordered after it, and no memory access after it is reordered before it.

  • memory_order_seq_cst. The strongest constraint: nothing in the surrounding context is reordered across the operation, and in addition all seq_cst operations across all threads appear in a single total order that every thread observes (see the sketch after this list).
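
The classic case where acquire/release is not enough and seq_cst is required is the store/load pattern below (a minimal sketch with hypothetical names): each thread stores its own flag and then loads the other thread's flag. Because seq_cst puts all four operations into one global order, at least one thread is guaranteed to see the other's store.

#include <atomic>
#include <cassert>
#include <thread>

std::atomic<bool> x{false}, y{false};
bool seen_by_a = false, seen_by_b = false;

void thread_a() {
    x.store(true, std::memory_order_seq_cst);
    seen_by_a = y.load(std::memory_order_seq_cst);  // did we see b's store?
}

void thread_b() {
    y.store(true, std::memory_order_seq_cst);
    seen_by_b = x.load(std::memory_order_seq_cst);  // did we see a's store?
}

int main() {
    std::thread ta(thread_a), tb(thread_b);
    ta.join();
    tb.join();
    // Under seq_cst there is a single total order over the four operations,
    // so at least one of the two loads must observe the other's store.
    assert(seen_by_a || seen_by_b);
    return 0;
}

With acquire on the loads and release on the stores instead, both loads could return false, because release/acquire only orders operations around a store and the load that reads from it; it does not impose one global order across independent atomics.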

Everything above is the language-level memory model. A concrete implementation maps it onto memory barriers, cache flushes and the like; without those, different CPUs could indeed end up seeing inconsistent data.
