Lecture #5: Mutual Exclusion

These topics are from Chapter 2 (Synchronization) and Chapter 6 (Distributed Mutual Exclusion) in Advanced Concepts in OS.

Topics for Today

Review of Mutual Exclusion

(Students should all be able to answer these, from the prerequisite course in operating systems.)

The need for mutual exclusion comes with concurrency.

There are several kinds of concurrent execution:

  1. Interrupt handlers
  2. Interleaved preemptively scheduled processes/threads
  3. Multiprocessor clusters, with shared memory
  4. Distributed systems

Concurrent Execution - Multiprocessor

Concurrent Execution - Single Processor

The above executions are all possible, depending on the scheduling policy of the operating system. (What are the three scheduling policies shown?)

Preemption

Need for Mutual Exclusion - Example of Race

procedure A is
begin ...
   M := M + 1; ...
end A;
procedure B is
begin ...
   M := M - 1; ...
end B;

No Interleaving: Task A First


Everything works fine.

No Interleaving: Task B First


Which task executes first does not matter.

Parallel Execution

Races

With parallel or interleaved execution, what happens?

We have a race between tasks A and B.

The effect of the execution depends on who "wins" the race.

Even supposing we have interleaved execution and the primitive memory fetch and store operations are atomic, the outcome depends on the particular interleaving of the operations.
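The dependence on the interleaving can be checked by brute force. The following is a minimal sketch, not part of the original lecture: it assumes each task's update of M is split into one atomic fetch and one atomic store, starts with M = 0, and enumerates every legal ordering of the four steps. The names run and all_outcomes are hypothetical helpers introduced only for this illustration.

```python
from itertools import permutations

# Assumption for illustration: "M := M + 1" is a fetch into a private
# register followed by a store of register + 1 (likewise for M - 1).
DELTA = {"A": +1, "B": -1}

def run(schedule, m=0):
    """Execute one interleaving of fetch/store steps and return the final M."""
    reg = {}
    for task, step in schedule:
        if step == "fetch":
            reg[task] = m                   # copy shared M into the task's register
        else:                               # "store"
            m = reg[task] + DELTA[task]     # write the updated value back to M
    return m

def all_outcomes():
    """Map each possible final value of M to the interleavings producing it."""
    steps = [("A", "fetch"), ("A", "store"), ("B", "fetch"), ("B", "store")]
    outcomes = {}
    for order in permutations(steps):
        # Keep only schedules in which each task fetches before it stores.
        a_ok = order.index(("A", "fetch")) < order.index(("A", "store"))
        b_ok = order.index(("B", "fetch")) < order.index(("B", "store"))
        if a_ok and b_ok:
            outcomes.setdefault(run(order), []).append(order)
    return outcomes

if __name__ == "__main__":
    for final, schedules in sorted(all_outcomes().items()):
        print(final, len(schedules))
```

Under these assumptions there are six legal interleavings. Only the two serial ones (A entirely before B, or B entirely before A) leave M at 0; the other four leave M at +1 or -1, depending on which task stores last.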

If B Preempts A

If A Preempts B

Resource Example: Semaphores/Mutexes

This is the general idea. There are LOTS of variations on the details, which I hope you have seen in a prior course.

Critical Section Protected by Lock/Unlock

procedure A is
begin ...
   Lock;
   M := M + 1;
   Unlock; ...
end A;
procedure B is
begin ...
   Lock;
   M := M - 1;
   Unlock; ...
end B;
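
As a runnable counterpart to the Lock/Unlock pseudocode above, here is a minimal sketch using Python's standard threading.Lock. The loop count n and the helper name run are assumptions added for illustration; the lecture's procedures perform a single update each.

```python
import threading

M = 0
lock = threading.Lock()   # plays the role of Lock/Unlock in the pseudocode

def task_a(n):
    """Procedure A, repeated n times: increment M inside the critical section."""
    global M
    for _ in range(n):
        with lock:        # Lock on entry, Unlock on exit (even on exceptions)
            M = M + 1

def task_b(n):
    """Procedure B, repeated n times: decrement M inside the critical section."""
    global M
    for _ in range(n):
        with lock:
            M = M - 1

def run(n=10000):
    """Run A and B concurrently and return the final value of M."""
    global M
    M = 0
    a = threading.Thread(target=task_a, args=(n,))
    b = threading.Thread(target=task_b, args=(n,))
    a.start()
    b.start()
    a.join()
    b.join()
    return M

if __name__ == "__main__":
    print(run())
```

Because every fetch-update-store sequence on M runs while the lock is held, the increments and decrements cannot interleave, so the final value is 0 regardless of how the threads are scheduled.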

With Locking/Unlocking

Other Examples of Non-Preemptable Resources

Counting Semaphores

Non-Distributed Mutual Exclusion Mechanisms

Why does each of these work?

Why doesn't it work in more general environments?

One of the topics that is often skipped in the first course on operating systems is the details of how mutual exclusion is implemented. It is useful to understand these details. One benefit is better perspective on the comparative costs of local versus distributed mutual exclusion.

The code that implements spinlocks in version 2.2 of the Linux kernel is mostly contained in the file spinlock.h, from /usr/src/linux-2.2.14/include/asm-i386.

This code is not easy to read without some more explanation. A detailed commentary on the more subtle parts of the Linux spinlock implementation code is given in a separate file.

Look through the explanation in detail. Observe the following:

Observe the comparative cost of disabling/enabling interrupts (2 instructions, no idle CPU time) versus spinlocks (a few instructions, but possibly idle CPUs) versus distributed mutual exclusion. For example, consider the following distributed algorithm:

We have three messages being sent, times the number of systems, and we have to wait for replies from all systems. Given that a spinlock takes a few microseconds to execute and local area network delays are a few milliseconds, the overhead of the two forms of mutual exclusion differs by a factor of at least 1000. This is a good reason to avoid designing systems that require distributed mutual exclusion.
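
The arithmetic behind this estimate can be sketched as follows. The concrete figures (2 microseconds for a spinlock, 2 milliseconds for one network delay) are assumptions chosen to stand in for "a few", and the function names are hypothetical.

```python
def distributed_messages(n_systems, msgs_per_system=3):
    """Total messages for one lock acquisition: three per system, as in the text."""
    return msgs_per_system * n_systems

def overhead_ratio(spinlock_us=2.0, lan_delay_ms=2.0):
    """Ratio of a single LAN delay to a single spinlock acquisition.

    Both defaults are assumed values standing in for "a few microseconds"
    and "a few milliseconds".
    """
    lan_delay_us = lan_delay_ms * 1000.0   # convert milliseconds to microseconds
    return lan_delay_us / spinlock_us

if __name__ == "__main__":
    print(distributed_messages(5))   # messages needed with 5 systems
    print(overhead_ratio())          # one network delay vs. one spinlock
```

With these assumed figures a single network delay already costs 1000 times as much as a spinlock. The real gap is larger, since the distributed algorithm needs several message delays and must wait for every system to reply.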


Applications of Distributed Mutual Exclusion

In general, one should "think outside the box". In particular, do not assume that a good approach to solving a small/local problem will scale up, or vice versa.

Classification of Mutual Exclusion Algorithms

Requirements for Mutual Exclusion

What does each of these mean?

Requirements for Mutual Exclusion

Why might fairness not be appropriate in some systems?

*Livelock will come up later, under transaction systems that support cancellation and rollback. It is an infinite cycle of cancellations and rollbacks.

Performance Metrics

Performance may depend on whether load is low or high.

Best case, worst case, and average cases are all of interest.

Centralized Control

What is the throughput?

Throughput and Synchronization Delay

Why is the execution time not sd + E + sd?
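
One way to explore this question is to lay out a timeline. The sketch below is an illustration under assumptions, not the lecture's own derivation. It takes sd to be the delay between one site leaving the critical section and the next site entering it, takes E to be the time spent inside the critical section, and assumes heavy load, so that some site is always waiting. The function names are hypothetical.

```python
def timeline(sd, E, n):
    """Return (enter, exit) times for n back-to-back critical sections."""
    t = 0.0
    spans = []
    for _ in range(n):
        t += sd              # hand-off from the previous holder to the next
        enter = t
        t += E               # time spent inside the critical section
        spans.append((enter, t))
    return spans

def throughput(sd, E):
    """Critical-section executions per unit time under heavy load."""
    return 1.0 / (sd + E)

if __name__ == "__main__":
    for enter, leave in timeline(sd=2.0, E=3.0, n=3):
        print(enter, leave)
    print(throughput(2.0, 3.0))
```

In this model successive entries are sd + E apart, because the hand-off after one critical section is the same interval as the hand-off before the next. Counting sd on both sides would count it twice.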

Next class: continue with Advanced Concepts in OS, Chapter 6, Distributed Mutual Exclusion.