Operating Systems: Three Easy Pieces --- Evaluating Spin Locks (Note)

Given our basic spin lock, we can now evaluate how effective it is along our previously described

axes. The most important aspect of a lock is correctness: does it provide mutual exclusion? The

answer here is yes: the spin lock only allows a single thread to enter the critical section at a time.

Thus, we have a correct lock.

The next axis is fairness. How fair is a spin lock to a waiting thread? Can you guarantee that a

waiting thread will ever enter a critical section? The answer here, unfortunately, is bad news:

spin locks don't provide any fairness guarantees. Indeed, a thread spinning may spin forever

under contention. Spin locks are not fair and may lead to starvation.

The final axis is performance. What are the costs of using a spin lock? To analyze this more

carefully, we suggest thinking about a few different cases. In the first, imagine threads competing

for the lock on a single processor; in the second, consider the threads as spread out across many

processors.

For spin locks, in the single-CPU case, performance overheads can be quite painful; imagine the case

where the thread holding the lock is preempted within a critical section. The scheduler might then

run every other thread (imagine there are N - 1 threads), each of which tries to acquire the lock.

In this case, each of those threads will spin for the duration of a time slice before giving up the CPU,

a waste of CPU cycles.
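
To put a rough number on that waste, here is a minimal sketch of the worst case just described. It assumes a simple round-robin scheduler that runs each of the other N - 1 threads exactly once before returning to the preempted lock holder; the function name wasted_spin_ms and the millisecond unit are illustrative choices, not something from the text.

```c
#include <assert.h>

/* Worst-case time spent spinning on one CPU while the lock holder is
 * preempted: each of the other (nthreads - 1) threads spins for one
 * full time slice. Assumes round-robin scheduling (an assumption made
 * for this sketch); the result is in the same unit as slice_ms. */
long wasted_spin_ms(long nthreads, long slice_ms) {
    assert(nthreads >= 1 && slice_ms >= 0);
    return (nthreads - 1) * slice_ms;
}
```

Under these assumptions, 8 threads with a 10 ms time slice would burn 70 ms of CPU time doing nothing useful before the holder gets to run again.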

However, on multiple CPUs, spin locks work reasonably well (if the number of threads roughly

equals the number of CPUs). The thinking goes as follows: imagine Thread A on CPU 1 and Thread

B on CPU 2, both contending for a lock. If Thread A (CPU 1) grabs the lock, and then Thread B

tries to, B will spin (on CPU 2). However, presumably the critical section is short, and thus soon

the lock becomes available, and is acquired by Thread B. Spinning to wait for a lock held on

another processor doesn't waste many cycles in this case, and thus can be effective.

                  Compare-And-Swap

Another hardware primitive that some systems provide is known as the compare-and-swap

instruction (as it is called on SPARC, for example), or compare-and-exchange (as it is called on x86).

The C pseudocode for this single instruction is found in Figure 28.4.

int CompareAndSwap(int* ptr, int expected, int new) {
    int actual = *ptr;
    if (actual == expected) {
        *ptr = new;
    }
    return actual;
}

The basic idea is for compare-and-swap to test whether the value at the address specified by ptr

is equal to expected; if so, update the memory location pointed to by ptr with the new value. If 

not, do nothing. In either case, return the actual value at that memory location, thus allowing the

code calling compare-and-swap to know whether it succeeded or not.
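
The pseudocode only describes the semantics; written as plain C it would not be atomic. As a hedged sketch, the same return-the-actual-value behavior can be obtained from the C11 <stdatomic.h> function atomic_compare_exchange_strong, which is a substitute for the raw instruction rather than what the text itself uses. The names cas_value and cas_demo are made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Returns the value that was in *ptr when the operation ran, like the
 * pseudocode. new_val is stored only if that value equals expected. */
int cas_value(atomic_int *ptr, int expected, int new_val) {
    /* On failure, C11 writes the observed value into `expected`;
     * on success, `expected` already equals the observed value. */
    atomic_compare_exchange_strong(ptr, &expected, new_val);
    return expected;
}

/* Self-check of the two outcomes described in the text. */
void cas_demo(void) {
    atomic_int flag = 0;
    assert(cas_value(&flag, 0, 1) == 0);  /* match: flag becomes 1 */
    assert(atomic_load(&flag) == 1);
    assert(cas_value(&flag, 0, 7) == 1);  /* mismatch: flag unchanged */
    assert(atomic_load(&flag) == 1);
}
```

The caller learns whether the swap happened by comparing the return value with the expected value it passed in.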

With the compare-and-swap instruction, we can build a lock in a manner quite similar to that

with test-and-set. For example, we would just replace the lock() routine above with the following:

void lock(lock_t* lock) {
    while (CompareAndSwap(&lock->flag, 0, 1) == 1)
        ; // spin
}

The rest of the code is the same as the test-and-set example above. This code works quite similarly;

it simply checks if the flag is 0 and if so, atomically swaps in a 1, thus acquiring the lock. Threads

that try to acquire the lock while it is held will get stuck spinning until the lock is finally released.
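
For readers who want to try this, the following is a minimal, self-contained sketch of such a lock exercised by a few threads. It makes several assumptions: it uses C11 atomics in place of the CompareAndSwap routine, POSIX threads for the test, and a lock_t with a single flag field as in the code above. The helpers lock_init, unlock, worker, and run_counter_test are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct { atomic_int flag; } lock_t;  /* 0 = free, 1 = held */

void lock_init(lock_t *l) { atomic_init(&l->flag, 0); }

void lock(lock_t *l) {
    int expected = 0;
    /* Spin until this thread is the one that changes flag from 0 to 1. */
    while (!atomic_compare_exchange_weak(&l->flag, &expected, 1))
        expected = 0;  /* a failed CAS overwrote expected; reset it */
}

void unlock(lock_t *l) { atomic_store(&l->flag, 0); }

static lock_t counter_lock;
static long counter;

static void *worker(void *arg) {
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        lock(&counter_lock);
        counter++;               /* critical section */
        unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter. */
long run_counter_test(int nthreads, long iters) {
    pthread_t tids[16];
    assert(nthreads >= 1 && nthreads <= 16);
    lock_init(&counter_lock);
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tids[i], NULL, worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

If the lock provides mutual exclusion, run_counter_test(4, 10000) should return exactly 40000; without the lock, increments could be lost.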

If you want to see how to really make a C-callable x86-version of compare-and-swap, this code

sequence might be useful:

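char CompareAndSwap(int* ptr, int old, int new) {
    unsigned char ret;

    // cmpxchgl compares %eax (old) with *ptr; if they are equal it
    // stores new into *ptr and sets ZF. sete copies ZF into ret, so
    // ret is 1 when the swap happened and 0 otherwise.
    __asm__ __volatile__ (
        "  lock\n"
        "  cmpxchgl %2, %1\n"
        "  sete %0\n"
        : "=q" (ret), "=m" (*ptr)
        : "r" (new), "m" (*ptr), "a" (old)
        : "memory");

    return ret;
}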
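
If inline assembly is not an option, one possible alternative (an assumption about your toolchain, not something the text prescribes) is the GCC/Clang builtin __sync_bool_compare_and_swap, which reports success or failure in the same way as the sete-based return value. The names cas_builtin and cas_builtin_demo are illustrative.

```c
#include <assert.h>

/* Returns 1 if *ptr held old and was replaced by new_val, else 0. */
char cas_builtin(int *ptr, int old, int new_val) {
    return (char)__sync_bool_compare_and_swap(ptr, old, new_val);
}

/* Self-check: the first swap succeeds, the second sees a stale old value. */
void cas_builtin_demo(void) {
    int flag = 0;
    assert(cas_builtin(&flag, 0, 1) == 1);
    assert(flag == 1);
    assert(cas_builtin(&flag, 0, 2) == 0);
    assert(flag == 1);
}
```

Note that this version, like the assembly one, returns a success flag rather than the actual old value, so a lock built on it would spin while the result is 0.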

Finally, as you may have sensed, compare-and-swap is a more powerful instruction than test-and-

set. We will make some use of this power in the future when we briefly delve into wait-free

synchronization. However, if we just build a simple spin lock with it, its behavior is identical to the

spin lock we analyzed above.
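
As a small taste of that extra power, here is a hedged sketch of an atomic add built from a compare-and-swap retry loop, with no lock at all. It uses C11 atomics as a stand-in for the instruction, and the names atomic_add_cas and atomic_add_demo are made up for this example. A retry loop like this is lock-free rather than wait-free, since an unlucky thread may have to retry repeatedly.

```c
#include <assert.h>
#include <stdatomic.h>

/* Atomically adds amount to *value without a lock; returns the old value. */
int atomic_add_cas(atomic_int *value, int amount) {
    int old = atomic_load(value);
    /* If another thread changed *value after we read it, the CAS fails,
     * refreshes old with the current value, and we try again. */
    while (!atomic_compare_exchange_weak(value, &old, old + amount))
        ;
    return old;
}

/* Single-threaded self-check of the return value and the update. */
void atomic_add_demo(void) {
    atomic_int n = 40;
    assert(atomic_add_cas(&n, 2) == 40);
    assert(atomic_load(&n) == 42);
}
```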
