Implementing atomic read-modify-write operations with LL/SC instructions
Like many RISC¹ architectures, Arm does not have dedicated RMW instructions. Because the processor can be interrupted and switched to another thread at any moment, an RMW operation cannot be built from ordinary loads and stores: another thread could modify the value between the load and the store. Special instructions are required instead: load-link and store-conditional (LL/SC). The two work as a pair. A load-link reads from an address like any other load, but also tells the processor to monitor that address. A store-conditional performs its write only if no other writes have occurred at that address since the paired load-link. To see them in action, consider an atomic fetch-and-add.
On Arm, assuming foo is an atomic integer (for example, std::atomic<int>),
void incFoo() { ++foo; }
compiles to
incFoo:
ldr r3, <&foo> // Load the address of foo.
dmb // Memory barrier
loop:
ldrex r2, [r3] // LL foo
adds r2, r2, #1 // Increment
strex r1, r2, [r3] // SC
cmp r1, #0 // Check the SC result.
bne loop // Loop if the SC failed.
dmb // Memory barrier
bx lr // Return
We LL the current value, add one, and immediately try to store it back with an SC. If that fails, another thread may have written to foo since our LL, so we try again. In this way, at least one thread is always making forward progress in atomically modifying foo, even if several are attempting to do so at once.
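The same load, modify, conditionally store, retry structure can be sketched in portable C++ with a compare-exchange loop. This is only an illustration of the pattern, not what the compiler emits for ++foo. The declaration of foo and the names incFooCas and hammer are assumptions made for the example. On LL/SC hardware, compare_exchange_weak is allowed to fail even when the value has not changed, which is why it sits inside a loop.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Assumed declaration: the text above does not show how foo is defined.
std::atomic<int> foo{0};

// Hypothetical name. Same effect as incFoo(), written as an explicit retry loop.
void incFooCas() {
    // Read the current value (plays the role of the LL).
    int observed = foo.load(std::memory_order_relaxed);
    // Try to publish observed + 1 (plays the role of the SC).
    // On failure, compare_exchange_weak reloads observed with the
    // current value of foo, so the next attempt starts fresh.
    while (!foo.compare_exchange_weak(observed, observed + 1)) {
        // Another thread wrote to foo, or the attempt failed spuriously: retry.
    }
}

// Hypothetical helper: increment foo from several threads at once
// and return the final value.
int hammer(int threads, int perThread) {
    foo.store(0);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([perThread] {
            for (int i = 0; i < perThread; ++i) incFooCas();
        });
    }
    for (auto& th : pool) th.join();
    return foo.load();
}
```

Calling hammer(4, 10000) leaves foo at 40000: every failed attempt is retried, so no increment is lost even when the threads collide.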
¹ Reduced instruction set computer, in contrast to a complex instruction set computer (CISC) architecture like x64.