Chapter 7, Deadlocks

7.1 System Model
• A system consists of a set of resources
• The resources can be grouped into types
• Processes compete to have (unique) access to instances of the types

• Examples of resource types:
  – Memory locations
  – CPU cycles
  – Files
  – Object locks
  – I/O devices (printers, drives, etc.)
• When classifying by type, each item has to be interchangeable
• Otherwise, it’s necessary to classify into (distinct) subtypes

• Each process can request (and hold) as many resources as needed to complete its work
• Each process may request one or more instances of one or more types of resource
• It makes no sense for a process to request more instances of resources than exist
• Such a request can only be denied and such a process cannot do whatever it supposedly was intended to do

• Request/Use sequence
  – Request the resource
  – This request will either be granted or not
  – If granted, use the resource
  – After use, release the resource

• Request and release are forms of system calls
• As usual, you may make explicit calls, or the compiler may generate them from your source code
• Examples:
  – You can open and close files
  – You can request and release devices
  – You can allocate and de-allocate (free) memory locations

• The O/S manages access to shared system resources
• User level resource sharing may also be implemented in synchronized application code
• The general plan of action for synchronization described in the previous chapter applies to system level resource management

• Aspects of system resource management:
• The O/S keeps a list of resources and records which have been allocated to which process
• For processes that couldn’t acquire a resource, there is a separate waiting list for each resource
• Processes’ statuses implicitly (due to the presence of their PCB in a waiting list) or explicitly (due to a value recorded in their PCB) reflect whether the processes have acquired or are waiting for certain resources

• Deadlock is a
condition involving a set of processes (and a set of resources)
• Verbal definition of deadlock:
• Each process in the set of deadlocked processes is waiting for an event (namely, the release of a lock = the release of a resource) which can only be caused by another process in that set

• Keep in mind the idea that you can have resources and you can have locks on the resources
• Access to the resource is managed through access to or possession of the lock
• This was the basis of the car/car title analogy
• Hypothetically, access to resources could be granted directly, without some lock concept between the process and the resource
• In that case, deadlock could literally occur on the resources

• In practice, access to a resource is managed through a semaphore, a lock, or some other control mechanism
• As a consequence, deadlocking typically occurs on the stand-in, the lock or other mechanism, not on the resource itself

• Under correct synchronization, deadlock should not be possible on one resource
• Deadlock can occur with as few as two contending processes and two shared resources
• Deadlocks can occur on instances of one kind of resource type or instances of various kinds of resource type
• The number of processes and resources involved in a deadlock can be arbitrarily large

• As a student, the place you are most likely to concretely encounter deadlock is in Java thread programming
• Although it doesn’t necessarily use the term “lock”, mutual exclusion meets the requirements for a process action that has the effect of a lock.
Mutual exclusion means that one thread locks another out of a critical section
• As soon as you have threads, you have the potential for shared resources, and a critical section itself can qualify as the shared resource
• As soon as you have shared resources, you have to do synchronization so that your code is thread safe
• As soon as you write code with synchronization, assuming >1 process and >1 resource, you have the potential for deadlock

7.2 Deadlock Characterization
• Necessary conditions for deadlock:
  – Mutual exclusion (locking): There are common resources that can’t be shared without concurrency control
  – Hold and wait: Once processes have acquired resources (locks) they’re allowed to hold them while waiting for others

• No pre-emption: Resources can’t be pre-empted (swiped from other processes). A process can only release its resources voluntarily
• Circular wait: This condition is, in a sense, redundant. The four conditions are not completely independent (a circular wait already implies hold and wait), and this one is essentially a complete statement of the underlying problem:
  – If processes can’t acquire resources, they wait. If process x is waiting for a resource held by process y and y is waiting for a resource held by x, this is circular wait

• Resource allocation graphs
• Let processes, Pi, be represented as circles (labeled)
• Let resources, Ri, be represented as boxes with a dot for each instance of the type
• Let a request by a process for a resource be represented by an arrow from the process to the resource
• Let the granting of requests, if successful, be immediate.
A request arrow will therefore only be shown in cases where the request could not be granted
• Let the granting of requests also be atomic
• Let the granting, or assignment, of a resource to a process be represented by an arrow from the resource to the process
• If there is more than one instance of the resource, the arrow should go from the specific dot representing that instance
• Illustrations follow

• With a single instance of each resource type, a cycle in the resource allocation graph implies a deadlock
• A diagram of the simple, classical case follows

• The book’s next example includes multiple resource types and multiple processes
• It also includes multiple dots per box, representing multiple instances of a resource
• This principle holds: If there is no cycle in the graph, there is no deadlock
• The example shows waiting, but no deadlock
• A diagram follows

• With more than one instance of a resource type, no cycle still means no deadlock
• However, if there is a cycle in the graph, there may or may not be deadlock
• The next example illustrates a case with deadlock

• The next example illustrates a case where there is a cycle in the resource allocation graph, but there is no deadlock
• There is no deadlock because, of the multiple instances of resource R1 which are held, one is held by P2, which is not in the cycle. Similarly, one instance of R2 is held by P4, which is not in the cycle
• If P2 gives up R1, R1 will immediately be assigned to P1, and the cycle in the graph will disappear. Likewise, if P4 gives up R2, R2 will immediately be assigned to P3, and the cycle in the graph will disappear

7.3 Methods for Handling Deadlocks
• There are three major categories which lead to subsections in the book
• 1. Use techniques so that a system never enters the deadlocked state
  – A. Deadlock prevention
  – B. Deadlock avoidance
• 2. Allow systems to deadlock, but support:
  – A. Deadlock detection
  – B. Deadlock recovery

• 3.
Ignore the problem: in effect, implement processing without regard to deadlocks
• If problems occur (system performance slows, processing comes to a halt) deal with them on a special case basis
• The justification for this is that formal deadlock may occur rarely, and there are other reasons that systems go down
• Administrative tools have to exist to re-initiate processing in any case. Deadlock is just one of several different cases where they are necessary

• In extreme cases, rebooting the system may be the solution
• Consider these observations:
  – How many times a year, on average, do you press CTRL+ALT+DEL on a Windows based system?
  – Hypothesize that in a given environment, one formal deadlock a year would occur
  – Under these circumstances, would it be worthwhile to implement a separate deadlock handling mechanism?
• In the interests of fairness, note this: In fact, in simple implementations of Unix, what has been described is the level of implemented support for deadlock handling
• It may not be elegant, but it’s easy, and suitable for simple systems

• Deadlock handling in Java
• The book’s explanation may leave something to be desired
• The book’s example program is so confusing that I will not pursue it
• In short, although a fair amount of Java synchronization was covered in the last chapter, I will be limiting myself to the general discussion of deadlock handling in this chapter

• The situation can be summarized in this way: Java doesn’t contain any specific deadlock handling mechanisms
• If threaded code may be prone to deadlocking, then it’s up to the application programmer to devise the deadlock handling
• It is worth keeping in mind that the Java API contains these methods, which apply to threads, and have been deprecated:
• suspend(), resume(), and stop()

• Part of the reason for deprecating them is that they have characteristics which impinge on deadlock
• The suspend() method causes a thread to be suspended, but it
will continue to hold all locks it has acquired
• The resume() method causes a thread to start again, but it can only be called by another thread
• If the suspended thread holds locks required by the other thread which can resume it, deadlock will result

• The stop() method isn’t directly deadlock prone
• As pointed out some time ago, it is prone to lead to inconsistent state
• Consider this typical sequence of events:
  – 1. Acquire a lock
  – 2. Access a shared data structure
  – 3. Release the lock

• When stop() is called, all locks held by the thread are immediately released
• If, in confused code, stop() is called when step 2 is in progress, locks will be released before the point where they should be
• This can lead to inconsistent state

• It may seem odd that a programmer would call stop() somewhere during step 2, but
• Hamlet: ... There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy. Hamlet Act 1, scene 5, 159–167

• In short
• In Java, it’s the programmer’s problem to write code that isn’t deadlock prone
• There are definitely things in the Java API to avoid if you are worried about deadlock
• The book’s example program is not the ideal vehicle for seeing how to solve this as a programmer
• I don’t have the time to dream up a better example
• …

7.4 Deadlock Prevention
• Recall the preconditions for deadlock:
• Mutual exclusion (locking)
• Hold and wait
• No pre-emption
• Circular wait (redundant)
• The basic idea behind deadlock prevention is to implement a protocol/code where at least one of the preconditions can’t hold or is disallowed

• Disallowing mutual exclusion
  – This is not an option
  – It’s true that without mutual exclusion, deadlock is impossible, but this consists solely of wishing the problem away
  – The whole point of the last chapter was the fact that in some cases mutual exclusion is necessary, and it is therefore necessary to be able to manage the deadlock that comes along with it
  – The book finally
gives a rock-bottom simple example of a shared resource managed by the O/S where mutual exclusion is necessary: Having given the printer to one process, it is out of the question to interrupt it and hand the resource over to another process in the middle of a print job

• Disallowing hold and wait
  – This is doable
  – There are two basic approaches:
    • A. Request (and acquire) all needed resources before proceeding to execute
    • B. Only request needed resource(s) at a point when no others are held
  – Both of these are a kind of block acquisition. B is a finer grained version of it than A: a single process may request and release >1 block of resources over time

• Problems with disallowing hold and wait:
  – Low resource utilization, because processes grab and hold everything they need, even when they’re not using it
  – If B is impractical, A is forced, but A is the more drastic choice where more unused resources are held for longer times
  – Starvation is possible if a process needs a large set of resources and has to be able to acquire them all at the same time

  – Note that this is not practical in Java. Java doesn’t have a block request mechanism
  – In general, the underlying goal of a multi-tasking system is concurrency. Disallowing hold and wait reduces concurrency
  – Processes that aren’t deadlocked may not be able to run because they require a resource another process is holding but not using

• Disallowing no pre-emption
  – In other words, implementing a protocol which allows one process to take locks/resources away from other processes
  – There are two approaches, given below

• 1. Self-pre-emption: The requesting process gives up resources
  – If a process holds resources and requests something that’s unavailable, it releases the resources it has acquired
  – It will have to start over from scratch

• 2.
External pre-emption: The requesting process takes resources
  – If a process requires a resource held by another process, and the other process is waiting for resources, then the holding process is required to release its resources and the requesting process takes what it needs
  – In this case, the process that was pre-empted will have to start over from scratch
  – If the required resource is held by a process that is active, that is, it is not waiting for other resources in order to proceed, then the requesting process will have to wait

• Note that pre-emption based protocols are related in nature to context switching
• Registers and other system components are resources
• One process can interrupt, or pre-empt another
• These protocols work because the state of the pre-empted process can be saved and restored
• Note that pre-emption isn’t very practical with resources like printers
• Chaos results if the printed output consists of interleaved print jobs

• Disallowing circular wait
  – Although “circular wait” is in a sense redundant, it encapsulates the idea behind deadlock and suggests a solution
• 1. Number the resource types
• 2. Only let processes acquire items of types in increasing resource type order
• 3. A process has to request all items of a type at the same time

• Why does this work?
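The numbering rule can be made concrete with a minimal Java sketch. It is an illustration under stated assumptions, not the book's code: each resource type has a single instance guarded by a `ReentrantLock`, and the names `OrderedLocking`, `RankedResource`, `acquireInOrder`, and `releaseAll` are hypothetical. Two threads name the same two resources in opposite orders, but because acquisition is always sorted by rank, one of them may wait while neither can deadlock.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

final class OrderedLocking {
    // Hypothetical resource: a lock plus the number assigned to its type.
    static final class RankedResource {
        final int rank;
        final ReentrantLock lock = new ReentrantLock();
        RankedResource(int rank) { this.rank = rank; }
    }

    // Acquire all requested resources in ascending rank order,
    // regardless of the order in which the caller listed them.
    static List<RankedResource> acquireInOrder(List<RankedResource> wanted) {
        List<RankedResource> sorted = new ArrayList<>(wanted);
        sorted.sort(Comparator.comparingInt((RankedResource r) -> r.rank));
        for (RankedResource r : sorted) {
            r.lock.lock();
        }
        return sorted;
    }

    // Release in reverse order of acquisition.
    static void releaseAll(List<RankedResource> held) {
        for (int i = held.size() - 1; i >= 0; i--) {
            held.get(i).lock.unlock();
        }
    }

    // Shared counter; only touched while both locks are held.
    private static int counter;

    static void work(List<RankedResource> wanted, int rounds) {
        for (int i = 0; i < rounds; i++) {
            List<RankedResource> held = acquireInOrder(wanted);
            try {
                counter++;
            } finally {
                releaseAll(held);
            }
        }
    }

    // Runs two threads that list the resources in opposite orders.
    static int runDemo() {
        RankedResource r1 = new RankedResource(1);
        RankedResource r2 = new RankedResource(2);
        counter = 0;
        Thread t1 = new Thread(() -> work(Arrays.asList(r1, r2), 1000));
        Thread t2 = new Thread(() -> work(Arrays.asList(r2, r1), 1000));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println("completed rounds: " + runDemo());
    }
}
```

If each thread instead locked the resources in the order it listed them, the second thread would take R2 and then ask for R1, which is exactly the out-of-order acquisition the rule forbids.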
• Examine the classic case of deadlock, and observe how it doesn’t meet the requirements listed above:

• The order of actions for P1:
  – Acquire R1
  – Request R2
• The order of actions for P2:
  – Acquire R2
  – Request R1
• P2 did not acquire/try to acquire resources in ascending order by type

• If all processes acquire in ascending order, waiting may result, but deadlock can’t result
• This is the verbal explanation
  – Suppose process 1 holds resources a, b, c, and requests resource x
  – Suppose process 2 holds x
  – Process 1 will have to wait for process 2
  – However, if process 2 already holds x, because it acquires in ascending order, it cannot be waiting for resources a, b, or c
  – It may already have acquired copies of those resources that it needs, or it may not need them at all, but it will not be going back to try and get them
  – Therefore, it is not possible for process 1 and process 2 to be deadlocked

• A variation on the previous idea (which involves a serious cost)
• Suppose a process reaches an execution point where it becomes clear that a resource with a smaller number is now needed
• The process will have to release all resources numbered higher than that one and acquire them again, in ascending order

7.5 Deadlock Avoidance
• The previous discussion was of techniques for deadlock prevention
• Deadlock prevention reduces concurrency
• Deadlock avoidance is based on having more information about resource requests
• With sufficient knowledge, requests may be ordered/granted in a way that will not lead to deadlock
• A simple starting point for the discussion: require all processes to declare their resource needs up front, then work from there

• Definitions needed for deadlock avoidance
• Safe state:
• Informally: The system can allocate resources to processes in some order without deadlock
• Formally: there exists a safe sequence
• Definition: A safe sequence, <P1, P2, …, Pn>, has this property:
• For all Pj, Pj’s resource requests can be satisfied
by free resources or resources held by Pi, where i < j.

• Notice how this is related to the idea of circular wait
• Under the definition given, Pj can be waiting, but only on Pi where i < j
• Pj can’t be waiting on some Pk where k > j
• Waiting only goes in sequential order, ascending by subscript
• As a consequence, there can be no circular waiting

• According to these definitions
• 1. A safe state is not deadlocked
• 2. A deadlocked state is not safe
• 3. There are unsafe states that are not yet deadlocked, but could or will lead to deadlock
• The nub of deadlock avoidance is point 3
• Under deadlock prevention, you avoid going into a deadlocked state
• Under deadlock avoidance, you avoid going into an unsafe state

• Resource allocations can cause a system to move from a safe to an unsafe state
• These may be allocations of currently free resources
• Because processes pre-declare their needs, an unsafe state (leading to possible deadlock) can be foreseen based on other requests and allocations that will be made

• There are several algorithms for deadlock avoidance
• They all prevent a system from entering an unsafe state
• The algorithms are based on pre-declaration of needs, but not immediate acquisition of all resources up front
• Like deadlock prevention algorithms, they tend to have the effect of reducing concurrency, but the reduction is smaller since not all acquisitions happen up front

• Resource allocation graph algorithm for deadlock avoidance
• Let the previous explanation of the graph stand:
  – Let a request by a process for a resource be represented by an arrow from the process to the resource
  – When a request is made, it may be granted or not, but if it is, the granting is atomic
  – Let the granting, or assignment, of a resource to a process be represented by an arrow from the resource to the process
• Let a new kind of edge, a claim edge, be added to the notation
  – A claim edge goes in the same direction as a request,
but it’s dashed
  – A claim edge represents a future request that will be made by a process

• The resource allocation graph algorithm says:
  – Requests for resources can only be granted if they don’t lead to cycles in the graph
  – When identifying cycles, the dashed claim edges are included

• What follows is a sequence of diagrams showing two processes progressing towards an unsafe state
• Initially, the diagram only shows claim edges. These are the pre-declared requests of the processes
• The diagrams show a sequence of actual requests and granting of them
• The last request can’t be granted because it would lead to an unsafe state

P1 acquires R1

P2 requests R1, P2 requests R2, and P2 acquires R2. This leads to the third state, which is unsafe. From that state, if P1 then requested R2 before anything was released, deadlock would result. The middle state is the last safe state. Therefore, P2’s request for R2, the transition to the third state, can’t be granted.

• Reiteration of aspects of the resource allocation graph algorithm for deadlock avoidance
• This algorithm is based on pre-declaring all claims
• The pre-declaration requirement can be relaxed by accepting new claims if all of a process’s edges are still only claims
• The algorithm would require the implementation of graph cycle detection. The authors say this can be done in O(n²)

• The banker’s algorithm
  – The resource allocation graph algorithm doesn’t handle multiple instances of each resource type
  – The banker’s algorithm does
  – This algorithm is as exciting as its name would imply
  – It basically consists of a bunch of bookkeeping
  – It’s messy to do, but there is nothing cosmic about the idea
  – I’m not covering it

7.6 Deadlock Detection
• If you don’t do deadlock prevention or deadlock avoidance, then you allow deadlocks
• At this point, if you choose to handle deadlocks, two capabilities are necessary:
  – 1. Deadlock detection
  – 2.
Deadlock recovery
• (Remember that some systems may simply ignore deadlocks.)

• Detecting deadlocks with single instances of resource types
• This can be done with a wait-for graph (WFG)
• This is like a resource allocation graph, but it’s not necessary to record the resources
• The key information is whether one process is waiting on another
• A cycle in the graph still indicates deadlock
• A simple illustration of a resource allocation graph and a comparable wait-for graph follows

• Wait-for graph algorithm implementation
• 1. The system maintains a WFG, adding and removing edges with requests and releases
• 2. The system periodically searches the graph for cycles. This is O(n²), where n is the number of vertices

• Several instances of a resource type
• This is analogous to the banker’s algorithm
• Instead of a WFG, it’s necessary to maintain some NxM data structures and algorithms
• I am uninterested in the details

• Deadlock detection algorithm usage
• When should deadlock detection be invoked?
• This depends on two questions:
  – 1. How often is deadlock likely to occur?
  – 2. How many processes are likely to be involved?
• The general answer is, the more likely you are to have deadlock, and the worse it’s likely to be, the more often you should check

• Checking for deadlock is a trade-off
• Checking isn’t computationally cheap
• Checking every time a request can’t be satisfied would be extreme
• If deadlock is a real problem on a given system, checking only once an hour might be extreme in the other direction
• Deadlock will tend to degrade measures of system performance, such as CPU utilization
• A system (or administrator) might adopt a rule of thumb like this: Trigger deadlock detection when CPU utilization falls below a certain threshold, like 40%.
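The wait-for graph bookkeeping and the periodic cycle search described above, for the single-instance case, can be sketched in Java. This is a minimal illustration under stated assumptions, not the book's code: processes are identified by strings, the class `WaitForGraph` and its methods are hypothetical names, and cycles are found with a depth-first search. A DFS takes time proportional to the number of vertices plus edges, which stays within the O(n²) bound quoted above because a graph on n vertices has at most n² edges.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class WaitForGraph {
    // waitsOn.get(p) is the set of processes that p is waiting for.
    private final Map<String, Set<String>> waitsOn = new HashMap<>();

    // Record that 'waiter' is blocked on a resource held by 'holder'.
    void addEdge(String waiter, String holder) {
        waitsOn.computeIfAbsent(waiter, k -> new HashSet<>()).add(holder);
        waitsOn.computeIfAbsent(holder, k -> new HashSet<>());
    }

    // Remove the edge when the request is granted or withdrawn.
    void removeEdge(String waiter, String holder) {
        Set<String> out = waitsOn.get(waiter);
        if (out != null) {
            out.remove(holder);
        }
    }

    // True if some set of processes is waiting in a cycle.
    boolean hasCycle() {
        Set<String> finished = new HashSet<>(); // fully explored vertices
        Set<String> onPath = new HashSet<>();   // vertices on the current DFS path
        for (String p : waitsOn.keySet()) {
            if (dfs(p, finished, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean dfs(String p, Set<String> finished, Set<String> onPath) {
        if (onPath.contains(p)) {
            return true;  // reached a vertex already on this path: cycle
        }
        if (finished.contains(p)) {
            return false; // already explored, no cycle through here
        }
        onPath.add(p);
        for (String q : waitsOn.get(p)) {
            if (dfs(q, finished, onPath)) {
                return true;
            }
        }
        onPath.remove(p);
        finished.add(p);
        return false;
    }

    public static void main(String[] args) {
        WaitForGraph g = new WaitForGraph();
        g.addEdge("P1", "P2");
        g.addEdge("P2", "P3");
        System.out.println(g.hasCycle()); // waiting, but no cycle
        g.addEdge("P3", "P1");
        System.out.println(g.hasCycle()); // P1 -> P2 -> P3 -> P1
    }
}
```

A detector built this way would call `hasCycle()` on whatever schedule the trade-off above suggests, for example when CPU utilization drops below the chosen threshold.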
7.7 Recovery from Deadlock
• Recovery from deadlock, when detected, can be manual, done by an administrator
• It can also be an automatic feature built into a deadlock handling system
• Overall, deadlock recovery falls into two categories:
  – 1. Abort processes to break cycles
  – 2. Pre-empt resources from processes to break cycles without aborting

• Process termination (abortion)
• There are basically two approaches:
  – 1. Abort all deadlocked processes. This potentially wastes a lot of work
  – 2. Abort one process at a time among those that are deadlocked
• Approach 2 also has disadvantages
  – The underlying problem is that there may be >1 cycle, and a given process may be in more than one cycle
  – If one process at a time is aborted, deadlock detection will have to be done after each abortion to see whether the system is deadlock free

• The overall problem with abortion is the potential for leaving resources in an inconsistent state
• If a process is only partially finished, and it has made changes to resources, an advanced system will log changes and roll them back when the abortion is done
• That is, abortion isn’t complete until there is no trace of the process’s existence left

• Aside from the question of rollback, if selective abortion is done, then there need to be criteria for picking a victim. For example:
• Process priority
• Time already spent computing, time remaining (% completion)
• What resources are held
• How many more resources are needed? (% completion as measured by resources)
• How many (other?) processes will need to be terminated?
• Whether the process is interactive or batch…

• Resource pre-emption
• This is an even finer scalpel. Three questions remain
• Victim selection: How do you choose a victim for pre-emption (what cost function)?
• Rollback: In what way can you bring a partially finished process back to a safe state where it could be restarted and would run to completion correctly, short of aborting it altogether?
• Starvation: How do you make sure a single process isn’t repeatedly pre-empted?

• Consider the general principle illustrated by deadlock
• The problem, deadlock, arises due to concurrency
• The mindless “solution” to the problem is to eliminate concurrency
• Barring that, a solution at one extreme is to ignore deadlock, hoping that it is infrequent, and overcoming it by gross means, such as rebooting
• From there, possible solutions get more and more fine-grained, each leading to its own problems to solve
• You can do prevention, avoidance, or detection and recovery
• With detection and recovery you have to decide how fine-grained the recovery mechanism is, whether rollback can be implemented, whether it is possible to pre-empt at the level of individual resources rather than whole processes…
• This continues ad infinitum, until you’ve had enough and you decide that a certain level of solution is cost-effective for the system under consideration

The End