These topics are from Chapter 13 (Fault Tolerance) in Advanced Concepts in OS.
appear to other processes as if they were
Process A | Process B
--------- | ---------
... | ...
Lock (X); Lock (Y); | Lock (X); Lock (Y);
Tmp := X; | Tmp := X;
X := Y; | X := Y;
Y := Tmp; | Y := Tmp;
Unlock (Y); Unlock (X); | Unlock (Y); Unlock (X);
... | ...
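The two columns of the table can be tried out as a runnable sketch. This one assumes Python threads stand in for the two processes; the names `run_both`, `swap`, and `shared` are hypothetical and not part of the notes. Both threads take the locks in the same order (X, then Y) and release them in the reverse order, as in the table.

```python
import threading

def run_both(x, y):
    """Run the swap from the table in two threads and return the final values."""
    shared = {"X": x, "Y": y}
    lock_x = threading.Lock()
    lock_y = threading.Lock()

    def swap():
        # Lock (X); Lock (Y); ... Unlock (Y); Unlock (X);
        with lock_x:
            with lock_y:
                tmp = shared["X"]
                shared["X"] = shared["Y"]
                shared["Y"] = tmp

    threads = [threading.Thread(target=swap) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared
```

Because each swap runs entirely inside both locks, the two swaps cannot interleave, so two swaps always restore the original values.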
This is an abstraction of a core problem in the design of commit protocols.
There is no adequate protocol which sends the messengers a fixed number of times.
Why?
An adequate protocol may require an unbounded number of messages,
in the presence of an unbounded number of lost messages.
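A minimal sketch of the counting argument, assuming a sender that keeps resending one message until a copy gets through. The functions `sends_until_delivered` and `fixed_budget_succeeds` and the loss pattern are hypothetical illustrations. The sketch covers only the "fixed number of times" claim, not the full impossibility argument.

```python
from itertools import chain, repeat

def sends_until_delivered(lost):
    """Count the sends needed before one copy of the message arrives.

    `lost` is a sequence of booleans: lost[k] is True if the k-th messenger
    is captured. Once the pattern runs out, messengers get through.
    """
    count = 0
    for is_lost in chain(lost, repeat(False)):
        count += 1
        if not is_lost:
            return count

def fixed_budget_succeeds(budget, lost):
    """True if a protocol limited to `budget` sends would deliver the message."""
    return sends_until_delivered(lost) <= budget
```

For any fixed budget n, a run in which the first n messengers are lost defeats the protocol, while the retransmitting sender needs n + 1 sends. The number of sends is bounded only if the number of losses is.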
The "state" of the protocol is an abstraction of the location in the code where it is executing and the values of local variables. The following coding of the algorithm assigns a value to a variable State to make the notion of state explicit. Transitions are triggered by receipt of (sets of) messages, and result in the transmission of messages.
    State := q_1;
    send COMMIT_REQUEST message to every cohort;
    State := w_1;
    wait for replies from all cohorts;
    if some cohort replied ABORT then
       send ABORT to all cohorts;
       State := a_1;
    else
       write COMMIT record to log;
       send COMMIT message to all cohorts;
       State := c_1;
    end if;
    loop
       wait for replies, with timeout;
       exit when all cohorts have replied;
       if State = a_1 then
          resend ABORT message to cohorts that have not yet replied;
       else
          resend COMMIT message to cohorts that have not yet replied;
       end if;
    end loop;
    write COMPLETE record to log;
    State := f_1;
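The coordinator pseudocode above can be sketched in runnable form. This is a simplification under stated assumptions: the cohorts' votes are supplied up front as a dictionary, and each timeout round is modeled as the set of cohorts whose reply arrives in that round. The function `run_coordinator` and its parameters are hypothetical names, not part of the protocol description.

```python
def run_coordinator(votes, ack_rounds):
    """Trace the coordinator through one run.

    votes:      dict mapping cohort name -> "AGREED" or "ABORT" (assumed input)
    ack_rounds: list of sets; ack_rounds[k] holds the cohorts whose reply
                arrives during the k-th wait (assumed input)
    Returns (states, log, sent).
    """
    cohorts = sorted(votes)
    states, log, sent = ["q_1"], [], []

    sent += [(c, "COMMIT_REQUEST") for c in cohorts]
    states.append("w_1")

    # "wait for replies from all cohorts" is modeled by reading `votes`.
    if any(v == "ABORT" for v in votes.values()):
        decision = "ABORT"
        states.append("a_1")
    else:
        log.append("COMMIT")
        decision = "COMMIT"
        states.append("c_1")
    sent += [(c, decision) for c in cohorts]

    pending = set(cohorts)
    for acks in ack_rounds:
        pending -= acks
        if not pending:
            break
        # Timeout: resend the decision only to cohorts that have not replied.
        sent += [(c, decision) for c in sorted(pending)]

    if not pending:
        log.append("COMPLETE")
        states.append("f_1")
    return states, log, sent
```

If the supplied rounds run out while some cohort is still silent, the sketch returns with the coordinator still in a_1 or c_1, which corresponds to the pseudocode's loop not yet having exited.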
    State := q_i;
    await COMMIT_REQUEST message;
    if transaction successful then
       write UNDO and REDO records to log;
       send AGREED message to coordinator;
       State := w_i;
    else
       send ABORT message to coordinator;
       State := a_i;
    end if;
    await COMMIT/ABORT message;
    if message = ABORT then
       undo transaction, using UNDO log record;
       release all resources and locks for this transaction;
       send ACK to coordinator;
       State := b_i;
    else
       release all resources and locks for this transaction;
       send ACK to coordinator;
       State := c_i;
    end if;
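A matching sketch of the cohort side, assuming both awaited messages arrive and are passed in as arguments. The function `run_cohort` and its parameters are hypothetical; the state names follow the pseudocode as written.

```python
def run_cohort(transaction_ok, decision):
    """Trace one cohort through one run.

    transaction_ok: whether the local transaction succeeded (assumed input)
    decision:       "COMMIT" or "ABORT", as received from the coordinator
    Returns (states, log, sent, undone).
    """
    states, log, sent = ["q_i"], [], []
    undone = False

    # The COMMIT_REQUEST message is assumed to have arrived.
    if transaction_ok:
        log += ["UNDO", "REDO"]
        sent.append("AGREED")
        states.append("w_i")
    else:
        sent.append("ABORT")
        states.append("a_i")

    # The COMMIT/ABORT message is assumed to have arrived as `decision`.
    if decision == "ABORT":
        undone = True  # stands in for undoing the transaction
        states_after = "b_i"
    else:
        states_after = "c_i"
    # In both branches resources and locks are released and ACK is sent.
    sent.append("ACK")
    states.append(states_after)
    return states, log, sent, undone
```

The sketch never times out, which mirrors the pseudocode: a cohort that has sent AGREED simply waits for the coordinator's decision.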
Why is there provision for timeout and resending of messages just in one place? What about the other places where a site is awaiting a message? For example, what happens if a cohort failure causes delay in replying to a COMMIT_REQUEST message? Should the coordinator time out? How should it recover?
Site failures: What is the difference between what we are accomplishing with a commit protocol in the current context and what we were accomplishing with the Byzantine Agreement protocols?
The above code can be further abstracted to the following state transition diagrams.
The diagrams use the following notation:
The state transitions correspond to message send and receive events.
What is a finite automaton?
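One way to make the finite-automaton view concrete is a transition table keyed by (state, event). The sketch below encodes the coordinator's states from the code earlier in these notes; the event names (`start`, `any_abort`, `all_agreed`, `all_acked`) and the function `run` are hypothetical labels chosen for illustration, not the notation of the diagrams.

```python
# (current state, event) -> (next state, message sent on the transition)
TRANSITIONS = {
    ("q_1", "start"): ("w_1", "COMMIT_REQUEST"),
    ("w_1", "any_abort"): ("a_1", "ABORT"),
    ("w_1", "all_agreed"): ("c_1", "COMMIT"),
    ("a_1", "all_acked"): ("f_1", None),
    ("c_1", "all_acked"): ("f_1", None),
}

def run(events, state="q_1"):
    """Feed a sequence of events to the automaton.

    Returns the final state and the list of messages sent along the way.
    An event with no transition from the current state raises KeyError.
    """
    messages = []
    for event in events:
        state, message = TRANSITIONS[(state, event)]
        if message is not None:
            messages.append(message)
    return state, messages
```

The table has finitely many states and each transition depends only on the current state and the incoming event, which is what makes it a finite automaton.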
One site never gets more than one state transition ahead of the rest of the participating sites.
The 2-Phase Commit Protocol has this synchronous property. In each state a site waits for replies from all the sites to which it sent messages in the previous state transition, before it makes the transition to its next state.
This property makes it easier to analyze the effects of the protocol, because it reduces the number of global states (combinations of local states) we need to consider.
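The reduction can be illustrated with a small count. The sketch assumes a deliberately simplified model in which every site moves through the same linear chain of local states, so a site's local state is just the number of transitions it has made; real protocols branch, so the exact numbers differ. The function `count_global_states` is a hypothetical helper.

```python
from itertools import product

def count_global_states(num_sites, num_local_states):
    """Count all global states and those allowed by the synchronous property.

    A global state is a tuple giving each site's progress (0 .. num_local_states - 1).
    It is counted as synchronous if no site is more than one transition
    ahead of any other site.
    """
    all_states = 0
    synchronous = 0
    for progress in product(range(num_local_states), repeat=num_sites):
        all_states += 1
        if max(progress) - min(progress) <= 1:
            synchronous += 1
    return all_states, synchronous
```

With 3 sites and 4 local states each, the full product has 64 global states, of which 22 satisfy the synchronous property; the gap widens as sites are added.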
We can augment the state diagram to include transitions for timeouts (T) and recovery after a local site failure (F), as follows.