New Worker-Centric Scheduling Strategies for Data-Intensive Grid Applications
Steve Ko, Ramses Morales, and Indranil Gupta
Department of Computer Science, University of Illinois at Urbana-Champaign
Distributed Protocols Research Group
http://dprg.cs.uiuc.edu

Our Thesis Statement
Worker-centric scheduling (or just-in-time scheduling) is more efficient than task-centric scheduling at exploiting locality of interest in data-intensive Grid applications.
- Worker-centric vs. task-centric
- Why worker-centric is more desirable
- Proposed worker-centric scheduling heuristics
- Performance evaluation
(Slide callout: Our contribution!)

Background on Grid Model
[Figure: Grid model with a Global Fileserver, a Global Scheduler holding multiple tasks, and sites (Site 1 …. Site 5), each with a File Cache and workers (Worker A, Worker B, …. Worker H), each worker with a Queue]
- Multiple tasks
  - One Grid application = multiple parallel tasks
  - The list of tasks: static and finite
- Global scheduler
  - Independent scheduling of multiple Grid apps
  - Our focus: scheduling tasks of the same Grid app
- File cache per site
  - Caches files from the global file server
  - Limited in size (LRU in our experiments)

Background on Scheduling
- Scheduling for data-intensive Grid applications
  - Goal: to reuse the files in the local cache of a site
- Characteristics
  - Accessing a large set of files
  - Locality of interest
- Many data-intensive Grid applications in physics, earth science, and astronomy

Background on Data-intensive Grid
- Data-intensive Grid applications access a large set of files
  - E.g., coadd (Sloan Digital Sky Survey southern-hemisphere coaddition): each task accesses up to 181 files (~900MB)
  - File transfer time is a major bottleneck
- Data-intensive Grid applications exhibit locality of interest
  - A set of files accessed by one task is also
likely to be accessed together by other tasks.
  - Different tasks have a high degree of file sharing
  - E.g., coadd: 90% of files are accessed by 6 or more tasks.

Background on Locality of Interest
- 1000 random pairs of files from coadd
- Ratio between the actual number of tasks (C) accessing both files and the expected number (a/T * b/T * T): C/(a/T * b/T * T)
  - a: # of tasks accessing file A
  - b: # of tasks accessing file B
  - T: total # of tasks
- High correlation

Task-Centric vs. Worker-Centric
- Key difference: worker’s availability for execution
  - Task-centric refers to schedulers that don’t consider it.
  - Worker-centric refers to schedulers that consider it.
- Task-centric scheduling
  - The global scheduler assigns a task to a worker, whether or not the worker can execute it immediately.
- Worker-centric scheduling
  - Task-assignment time is determined by each worker based on its availability for execution.

Task-Centric vs. Worker-Centric
Task-centric scheduling
[Figure: Global Scheduler with tasks; Queue, Queue, ….; Worker A (running), Worker B (available), …., Worker H (running)]

Task-Centric vs. Worker-Centric
Worker-centric scheduling
Figure: Global Scheduler, tasks, ….
Worker A (running), Worker B (available), …., Worker H (running)

Why Worker-Centric for Data-Intensive Grid Applications?
- Reminder: data-intensive Grid applications
  - Locality of interest
  - Major bottleneck: file transfer time
- Inherent problems with task-centric scheduling
  - Unbalanced task assignment
  - Long latency between scheduling and execution
- Worker-centric scheduling does not suffer from these problems

Why Not Task-Centric Scheduling?
- Unbalanced task assignment
  - Many tasks are assigned to a site with popular files.
  - [Figure: Global Scheduler with tasks (all tasks require File0); Queue, Queue; File1, File0; Worker A (running), Worker B (running)]
  - [Figure: Global Scheduler; Queue; File1, File0; Worker A (running), Worker B (available)]
  - Can be fixed
    - Storage-affinity based scheduling (by Santos-Neto et al.): replicating tasks to the idle workers
    - Ranganathan et al.: replicating popular files
  - Worker-centric scheduling does not suffer from this problem.
    - No need for additional mechanisms
- Long latency between scheduling and execution
  - Tasks are assigned to a worker and stored in the queue of each worker for later execution.
  - Storage is limited, and thus files are replaced.
  - Result: files might no longer reside in the storage at execution time.
  - Information gathered at scheduling time becomes stale by execution time.

Why Worker-Centric for Data-Intensive Grid Applications?
- Two inherent problems with task-centric scheduling
  - Unbalanced task assignment
  - Long latency between scheduling and execution
- Worker-centric scheduling does not suffer from these problems

Worker-Centric Scheduling Heuristics
- Goal: reducing the total execution time by exploiting the locality of interest
- [Figure: tasks, Worker A (running), Worker B (available), and the Global Fileserver, with four numbered steps]
  - (1) Signal availability
  - (2) Find the best match (using a metric)
  - (3) Send a task
  - (4) Retrieve files

Worker-Centric Scheduling Heuristics
- Goal: select the best task for the worker
- 1st approach: overlap
  - Counting the number of files that are needed by a given task and also present in the local storage (i.e., intersection); used by storage-affinity
  - Goal: reuse the existing files
  - [Figure: Files in the local cache, Files Needed]
- 2nd approach: rest
  - The inverse of the number of files that need to be transferred (i.e., difference)
  - Goal: reduce the file transfers
  - [Figure: Files in the local cache, Files Needed]
- 3rd approach: probabilistic rest
  - Mostly the same as rest, except randomly choosing one out of the top N tasks
  - Intuition: avoid being too greedy (a better worker might come along right after the assignment)
  - Experimental results show that top 2 is good.
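The metrics described in these heuristics slides can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `overlap`, `rest`, and `pick_task`, the dictionary layout of pending tasks, the uniform random choice among the top N, and the treatment of a task with nothing left to transfer (ranked first) are all assumptions made for the example.

```python
import random


def overlap(needed, cached):
    """overlap: number of needed files already present in the site cache."""
    return len(needed & cached)


def rest(needed, cached):
    """rest: inverse of the number of files that still must be transferred."""
    missing = len(needed - cached)
    # Assumption: a task that needs no transfers gets the best possible score.
    return float("inf") if missing == 0 else 1.0 / missing


def pick_task(pending, cached, metric=rest, top_n=1, rng=random):
    """Pick a task for a worker that just became available.

    pending maps a hypothetical task id to the set of files the task needs;
    cached is the set of files in the worker's site cache.
    top_n=1 is the greedy choice; top_n=2 mimics probabilistic rest.
    """
    ranked = sorted(pending, key=lambda t: metric(pending[t], cached), reverse=True)
    # Assumption: uniform random choice among the top_n best-scoring tasks.
    return rng.choice(ranked[:top_n])


cached = {"f0", "f1"}
pending = {
    "t1": {"f0", "f1", "f2", "f3", "f4"},  # 2 files cached, 3 to transfer
    "t2": {"f0", "f5"},                    # 1 file cached, 1 to transfer
    "t3": {"f6", "f7"},                    # 0 files cached, 2 to transfer
}
print(pick_task(pending, cached, metric=overlap))  # t1
print(pick_task(pending, cached, metric=rest))     # t2
```

With this cache, overlap prefers t1 (the most reused files) even though it needs three transfers, whereas rest prefers t2, which needs only one. Calling `pick_task(pending, cached, top_n=2)` returns either t2 or t3.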
Several other metrics are described in the paper.

Performance Evaluation
- Simulation using SimGrid
- Workload: coadd trace with 6000 tasks accessing 53390 files
- Grid environment
  - 1 global scheduler and 1 global file server
  - 90 sites with up to 10 workers each
  - One file server per site
- Main metrics
  - Makespan (total execution time)
  - # of file transfers
- Comparison to task-centric storage-affinity
  - Storage-affinity: overlap metric with task replication to idle workers

Capacity Variation
[Figures: Makespan vs. Capacity; File transfers vs. Capacity]
- Strong correlation between the two
- Worker-centric is much better with smaller capacities

Worker Variation
[Figures: Makespan vs. Workers; # of file transfers vs. Workers]
- Positive: more workers, more processing
- Negative: more workers, more contention at the file cache

Site Variation
[Figures: Makespan vs. Sites; # of file transfers vs. Sites]
- The rest metric is the most useful

Summary
- Worker-centric scheduling is more efficient than task-centric scheduling for data-intensive Grid applications.
  - Exploiting locality of interest is the key.
- Worker-centric scheduling can avoid
  - Unbalanced task assignment
  - Long latency between scheduling and execution
- Worker-centric scheduling performs better especially with limited resources.
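As a small companion to the locality-of-interest measurement from the background slides, the ratio C/(a/T * b/T * T) can be computed directly from a task-to-files trace. The sketch below assumes a hypothetical trace layout (a list holding one set of file names per task) and a hypothetical function name `locality_ratio`; it is not taken from the authors' tooling.

```python
def locality_ratio(task_files, file_a, file_b):
    """Return C / (a/T * b/T * T) for one pair of files.

    task_files is a list with one set of file names per task (assumed layout).
    a/T * b/T * T is the number of tasks expected to access both files if the
    two files were accessed independently, so a ratio above 1 indicates that
    the files are accessed together more often than chance would predict.
    """
    T = len(task_files)
    a = sum(1 for files in task_files if file_a in files)
    b = sum(1 for files in task_files if file_b in files)
    C = sum(1 for files in task_files if file_a in files and file_b in files)
    if T == 0 or a == 0 or b == 0:
        raise ValueError("both files must be accessed by at least one task")
    expected = (a / T) * (b / T) * T
    return C / expected


# Toy trace with four tasks; file names are made up for illustration.
trace = [{"x", "y"}, {"x", "y"}, {"z"}, {"w"}]
print(locality_ratio(trace, "x", "y"))  # 2.0
print(locality_ratio(trace, "x", "z"))  # 0.0
```

In the toy trace, x and y always appear together, so they are accessed jointly twice as often as independence would predict, while x and z never co-occur.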