Required reading: Fast mutual exclusion for uniprocessors
We revisit the topic of mutual-exclusion coordination: the techniques to protect variables that shared among multiple threads. These techniques allows programmers to implement atomic sections so that one thread can safely update the shared variables without having to worry that another thread intervening.
We focus today on uniprocessors. Even on uniprocessor one has to provide mutual-exclusion coordination, because the scheduler might schedule another thread in response to a hardware interrrupt (e.g., end of a time slice).
The technique for mutual-exclusion coordination used in v6 is disabling/enabling interrupts in the kernel. In v6, the only program with multiple threads is the kernel, thus that is the only program that needs need mutual-exclusion mechanism. User-level processes don't share data directly.
As we discussed last lecture, microkernels force server programs to deal with many similar issues as monolithic kernels, and they therefore require concurrent handling of events. This organizational requirement raises the question "How to handle multiple events concurrently at a user-level?"
To implement any three of these options we need to avoid blocking system calls. The kernel must support asynchronous system calls or scheduler activations.
Can we use a thread package that use the v6 kernel approach? That is, can we allow a user-level thread to disable and reenable interrupts? (Answer: not, if we care about fault isolation.)
List and insert example:
struct List { int data; struct List *next; }; List *list = 0; insert(int data) { List *l = new List; l->data = data; l->next = list; list = l; } fn() { insert(100); } main() { thread_create(..., fn); thread_create(..., fn); thread_schedule(); }
The paper addresses how to provide mutual-exclusion primitives that are completed implemented in software. The motivation is that some uniprocessors don't provide a hardware-level atomic instructions. It turns out, however, that the approach advocated in the paper is in general a good one, because software implementations are often more efficient than hardware implementations.
The paper describes several ways to implement test-and-set in software:
bool flag[N]; void TSL (int L, int *R) { while (true) { flag[me] = true; if (!is_flagged (me)) { *R = L; L = 0; flag[me] = false; return; } else { flag[me] = false; } } } boolean is_flagged(me) { for (i = 0; i < N; i++) { if ((i != me) && flag[i]) return true; } return false; }
int insert_lock = 0; insert(int data) { /* acquire the lock: */ while(TSL(&insert_lock) != 0) ; /* critical section: */ List *l = new List; l->data = data; l->next = list; list = l; /* release the lock: */ insert_lock = 0; }
insert(int data) { List *l = new List; l->data = data; BEGIN_RAS l->next = list; list = l; END_RAS }This implementation of insert is strictly better than the one using TSL. The TSL insert is blocking: if T1 is executing the insert is pre-empted and control is passed to T2, T2 cannot execute insert without waiting for T1. This RAS version doesn't block T2. Such versions are called wait free.
void xchg_RAS (int *p1, int *p2) { BEGIN_RAS int tmp = *p1; *p1 = *p2; *p2 = tmp; END_RASFind a sequence of events such that xchg_RAS doesn't work. Assume:
a = 1; b = 2; xchg_RAS (&a, &b);
int cmpxchg(addr, v1, v2) { int ret = 0; // stop all memory activity and ignore interrupts if (*addr == v1) { *addr = v2; ret = 1; } // resume other memory activity and take interrupts return ret; } insert (int data) { element *n = new Element; n->x = x; do { n->next = list; } while (cmpxchg (&list, n->next, n) == 0); }