486 31.Producer‐ConsumerQueues
Microsoft Visual C++ 2008 has a compiler-specific feature to treat access to
volatile variables with acquire and release semantics [Dawson 2008]. In order to
enforce acquire and release semantics completely on modern x86/64 processors,
the compiler would need to enforce compiler memory barriers up the call stack in
addition to using atomic or lock instructions. Unfortunately, when disassembling
Dekker’s algorithm on a Core2 Duo platform, locking instructions were not used,
thus leaving the possibility of a race condition. Therefore, to be safe, aligned in-
tegers that are shared across threads should be accessed using interlocked instruc-
tions, which provide a full barrier when compiled under Microsoft Visual C++. A
cross-platform implementation may require the use of preprocessor defines or
different code paths depending on the target compiler and target architecture.
31.6Lock‐FreeAlgorithmDesign
On x86 and x64 architectures, lock-free algorithms center around the compare-
and-swap (CAS) operation, which is implemented atomically when supported as
a CPU instruction. Pseudocode for a 32-bit CAS operation is shown in Listing
31.4. If implemented in C++, this operation would not be atomic. However, on
x86 and x64 processors, the 32-bit
CMPXCHG instruction is atomic and introduces
a processor memory barrier (the instruction is prefixed with a
LOCK). There are
also 64-bit (CAS2) and 128-bit (CAS4) versions on supported platforms
(
CMPXCHG8B and CMPXCHG16B). In Win32, these are available as the Inter-
lockedCompareExchange()
and InterlockedCompareExchange64() functions.
At the time of this writing, there is no
InterlockedExchange128() function, but
there is a
_InterlockedCompareExchange128() intrinsic in Visual C++ 2010.
Support for the 64-bit
CMPXCHG4B instruction has been available since the
Pentium, so availability is widespread. The 128-bit
CMPXCHG8B instruction is
available on newer AMD64 architectures and Intel 64 architectures. These fea-
tures can be checked using the
__cpuid() intrinsic function, as shown in
Listing 31.5.
UINT32 CAS(UINT32 *dest, UINT32 old, UINT32 new)
{
UINT32 cur = *dest;
if (cur == old) *dest = new;
return (old);
}
Listing 31.4. CAS pseudocode.