Concurrency Control in Distributed Systems (2/23/2000)
============================================================================

Basic transaction primitives:

  BEGIN_TRANSACTION: mark the start of a transaction
  END_TRANSACTION  : terminate transaction and attempt to commit
  ABORT_TRANSACTION: terminate transaction and restore old values
  READ : read data from a file/object
  WRITE: write data to a file/object

NESTED TRANSACTIONS: While inside a transaction, you can initiate a
subtransaction that is visible only within the original transaction.
It has the same properties as a transaction, except that if the outer
transaction aborts, it does as well.

============================================================================

Basic transaction properties: ACID

  Atomic
  Consistent
  Isolated (serializable)
  Durable

=============================================================================

Example: Serializability (from book)

  BEGIN_TRANSACTION    BEGIN_TRANSACTION    BEGIN_TRANSACTION
    x = 0;               x = 0;               x = 0;
    x = x+1;             x = x+2;             x = x+3;
  END_TRANSACTION      END_TRANSACTION      END_TRANSACTION

SERIALIZABILITY requires that the results of a series of transactions be
identical to SOME serialized version of those transactions, even though in
practice they might be running concurrently!

=============================================================================

Implementing Distributed Transactions

PROBLEM ONE: How to make changes isolated and atomic?

  Issue: any changes made by a transaction should only be visible to that
  transaction or any nested subtransaction.
  Intentions logs (aka write-ahead logs):

    - modify files (records) in place
    - record a `change record' in a log on stable storage whenever you
      modify data
    - change record includes old and new values

  Example (from book):

    Execution:            Log after each statement
    ----------            ------------------------
    x = y = 0;
    BEGIN_TRANSACTION
      x = x + 1;          x: {0,1}
      y = y + 2;          x: {0,1}, y: {0,2}
      x = y * y;          x: {0,1}, y: {0,2}, x: {1,4}
    END_TRANSACTION

  If transaction commits:
    - write `COMMIT' record to log
    - if not already done, propagate changes to `real' data

  If transaction aborts:
    - need to roll back changes
    - start from the end of the log, work your way backwards
    - apply inverse of logged changes

  Log can also be used for crash recovery
    - undo uncommitted transactions

  Alternative implementation: shadow blocks (shadow resources)

PROBLEM TWO: How do we ensure atomicity across machines?

  Issue: No obvious single operation that demarcates the `yes/no'
  decision, a la the log write in a single-node database.

  Conventional solution: two phase commit (Gray 1978)

  Select one COORDINATOR and N COHORTS (subordinates)

  PHASE ONE:

    Coordinator:                           Cohort(s):
    ------------                           ----------
    1. Write PREPARE record in log
    2. Multicast PREPARE message to
       all cohorts  ------------------->
                                           3. Write READY record in log
    5. Collect replies  <---------------   4. Reply OK to coordinator

    ------------------------------------------------

  PHASE TWO:

    6. Write COMMITTED record in log (****)
    7. Multicast COMMIT message to
       all cohorts  ------------------->
                                           8. Write COMMIT record in log
                                           9. Commit changes
    11. Collect replies  <-------------    10. Reply OK to coordinator

  The point marked (****) is the ATOMIC COMMIT POINT. If any system
  crashes before this point, the transaction aborts. If any system
  crashes after this point, it will complete (eventually).
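The two-phase commit exchange above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real implementation: the `Cohort` class, the `can_prepare` flag, the `ABORTED`/`ABORT` records, and the use of plain lists in place of logs on stable storage are all hypothetical. The notes only discuss crashes, so a cohort that cannot get ready is modeled here as simply replying not-OK.

```python
class Cohort:
    """Hypothetical in-memory cohort; `log` stands in for stable storage."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # assumption: models a cohort that cannot get ready
        self.log = []
        self.committed = False

    def on_prepare(self):
        # Steps 3-4: write READY to the log, then reply OK.
        if not self.can_prepare:
            return False
        self.log.append("READY")
        return True

    def on_commit(self):
        # Steps 8-10: write COMMIT to the log, commit changes, reply OK.
        self.log.append("COMMIT")
        self.committed = True
        return True

    def on_abort(self):
        # Assumed abort path; the record name is illustrative.
        self.log.append("ABORT")
        return True


def two_phase_commit(coordinator_log, cohorts):
    """Run one round of the protocol; return True if the transaction committed."""
    # Phase one: steps 1, 2 and 5.
    coordinator_log.append("PREPARE")
    replies = [c.on_prepare() for c in cohorts]

    if not all(replies):
        # Still before the commit point, so the transaction aborts.
        coordinator_log.append("ABORTED")
        for c in cohorts:
            c.on_abort()
        return False

    # Step 6: the atomic commit point (****).
    coordinator_log.append("COMMITTED")
    # Phase two: steps 7 and 11.
    acks = [c.on_commit() for c in cohorts]
    return all(acks)


log_ok = []
good = [Cohort("a"), Cohort("b")]
committed = two_phase_commit(log_ok, good)

log_bad = []
mixed = [Cohort("a"), Cohort("b", can_prepare=False)]
aborted = not two_phase_commit(log_bad, mixed)
```

Note that the decision is taken by a single append to the coordinator's log: everything before it can still abort, everything after it must eventually complete.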
=============================================================================

Optimistic versus Pessimistic Concurrency Control

  Pessimistic:
    - idea: ensure no conflicts occur
    - lock-based concurrency control
    - deadlocks are a real problem

  Optimistic:
    - idea: assume no conflicts, and act accordingly
    - concurrency control based on time stamps
    - detect conflicts at commit time -- abort conflicted transactions

=============================================================================

Some details

Pessimistic (locking) -- transactions acquire locks before using a
resource. If a transaction completes, the new versions of the protected
data overwrite the older versions, and the locks are all released.

  Need to guarantee serializability.

  Two-phase locking:
    * Divide execution into GROWING PHASE and SHRINKING PHASE.
    * During growing phase, process acquires all of the locks it will
      require (cannot modify protected data). If it cannot acquire a
      lock, it releases all locks, delays, and starts over.
    * During shrinking phase, process can modify protected data and
      release locks

  Variant: strict two-phase locking
    * system acquires locks as side-effect of accessing data
    * processes modify local copies of protected data
    * at end of transaction, local copies overwrite saved ones (via
      intentions log and two-phase commit) and all locks are released
    * always serializable
    * eliminates "cascading aborts"

  Issue: Granularity of locking (size, r/w, ...)

Optimistic concurrency control:

  Idea: Individual processes don't worry about potential concurrency
  (serializability problems) -- just barrel ahead and let things sort
  themselves out later (politician's ideal solution).

  Implementation:
    * Keep track of data a process reads. If it is changed by a
      different process before this one commits, abort transaction when
      it tries to commit. Assumes private copies of data.
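A minimal sketch of the bookkeeping described under Implementation, in Python. The notes do not say how a change is detected, so the per-item version counter is an assumption, as are the names `Store` and `Transaction`. Writes go to a private copy and reach the shared store only if validation succeeds at commit time.

```python
class Store:
    """Hypothetical shared store; each item carries a version counter."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}


class Transaction:
    def __init__(self, store):
        self.store = store
        self.read_versions = {}  # item -> version seen at first read
        self.private = {}        # private copies of written items

    def read(self, key):
        if key in self.private:
            return self.private[key]
        # Remember which version of the item this transaction depended on.
        self.read_versions.setdefault(key, self.store.version[key])
        return self.store.data[key]

    def write(self, key, value):
        # Only the private copy changes until commit.
        self.private[key] = value

    def commit(self):
        # Validation: abort if anything we read has changed since we read it.
        for key, seen in self.read_versions.items():
            if self.store.version[key] != seen:
                return False
        # No conflict: publish the private copies.
        for key, value in self.private.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True


store = Store({"x": 0})
t1, t2 = Transaction(store), Transaction(store)
t1.write("x", t1.read("x") + 1)
t2.write("x", t2.read("x") + 2)
first = t1.commit()   # nothing has changed since t1 read x
second = t2.commit()  # x changed after t2 read it, so t2 must abort
```

Neither transaction ever waits for the other; the cost is that the work done by `t2` is thrown away.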
  + Maximum concurrency -- conflicts are rare
  + Deadlock free
  - Potential for lots of wasted work, especially as workload increases
  - Cascaded rollbacks (another way to state above problem)

Timestamps: variant of optimistic concurrency control

  * Every transaction gets a logical timestamp when it starts
  * Maintain read and write timestamps with each data item (file),
    denoting logical timestamp of last *committed* transaction to
    read/write it
  * If a read or write is attempted, compare transaction's timestamp with
    timestamp of file:
      * If file is older, everything is ok.
      * If file is younger, serializability error - abort transaction
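The timestamp checks can be sketched as follows, in Python. This is a simplified illustration, not a complete scheme: the names `Item`, `Transaction` and `Abort` are hypothetical, the logical clock is just a counter, and the exact comparison rules (a read is checked against the write timestamp, a write against both) are one common choice that the notes do not spell out.

```python
import itertools

_clock = itertools.count(1)  # hypothetical logical clock


class Abort(Exception):
    """Raised on a serializability error."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of last committed reader
        self.write_ts = 0  # timestamp of last committed writer


class Transaction:
    def __init__(self):
        self.ts = next(_clock)  # assigned when the transaction starts
        self.reads = set()
        self.writes = {}        # item -> tentative value

    def read(self, item):
        if item.write_ts > self.ts:  # item is younger than we are
            raise Abort("read too late")
        self.reads.add(item)
        return self.writes.get(item, item.value)

    def write(self, item, value):
        if max(item.read_ts, item.write_ts) > self.ts:
            raise Abort("write too late")
        self.writes[item] = value

    def commit(self):
        # Re-check, since other transactions may have committed meanwhile.
        for item in self.reads:
            if item.write_ts > self.ts:
                raise Abort("read invalidated")
        for item in self.writes:
            if max(item.read_ts, item.write_ts) > self.ts:
                raise Abort("write invalidated")
        # Only committed transactions update the item timestamps.
        for item in self.reads:
            item.read_ts = max(item.read_ts, self.ts)
        for item, value in self.writes.items():
            item.value = value
            item.write_ts = max(item.write_ts, self.ts)


x = Item(0)
old, new = Transaction(), Transaction()  # old starts first, so old.ts < new.ts
new.write(x, new.read(x) + 1)
new.commit()
try:
    old.read(x)  # x was written by a younger transaction
    outcome = "ok"
except Abort:
    outcome = "aborted"
```

Here the older transaction arrives after a younger one has already committed a write to `x`, so the item is younger than the transaction and the read is rejected.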