Grid Scheduling
Cécile Germain-Renaud

Scheduling
• Job
  – A computation to run on a machine
  – Possibly with network access, e.g. input/output files (coarse grain) or communication with other jobs (the DAG model)
• Schedule
  – s(J) = date at which execution of task J begins
  – Alloc(J) = machine assigned to J
• One of the oldest Computer Science problems
• Principles of classification: [Graham et al. Optimization and approximation in deterministic sequencing and scheduling: A survey. Ann. Discrete Math. 5 (1979), 287-326]
• Computer-aided classification of complexity results (4536 at the time of the paper) [Lageweg et al. Computer-aided complexity classification of combinatorial problems. CACM 25(11), 1982]

Classical scheduling in HPC
• Context: parallel computing/computers
• Application = Directed Acyclic Graph (T, E, w, c)
  – T = set of sequential tasks
  – E = dependence constraints
  – w(t) = computational cost of task t
  – c(t,t’) = communication cost (data sent from t to t’)
• Infrastructure
  – P identical processors
  – With or without preemption, dedicated (no sharing)
• An optimization problem with objective function
  Makespan = total execution time: S(T) = max over t in T of (s(t) + w(t))
• Complexity
  – NP-complete for independent tasks and no communication (E = ∅, P = 2, c = 0)
  – NP-complete for UET-UCT graphs (w = c = 1)
  – Very old: without communication, list scheduling provides a (2 - 1/P) approximation

Scheduling in Institutional Grids
• Institutional: federation of resources
  – Accounted for: fair-share on the medium to long time scale is a primary constraint
  – Partially autonomous: local policies must be allowed
• Grid
  – Permanent regime: on-line decisions
  – Large scale: strongly distributed
    • Information system
    • Scheduling services
• Relevant contexts
  – Autonomous, multi-agent systems
  – Auction algorithms
  – Service Level Agreement (SLA) technology

EGEE gLite Scheduling
[Figure: diagram with several UIs, brokers, and a site (node) containing Proc, CE and a local scheduler]

EGEE gLite Scheduling
[Figure: the same diagram, adding a BDII to which the site publishes]

EGEE gLite Scheduling
[Figure: the same diagram; the site publishes to the BDII, and the broker queries it and ranks]
The information published is
– Static: e.g. which type of VO is accepted
– Dynamic: expected traversal time

EGEE gLite Scheduling
[Figure: the same diagram with the Publish, Query and Rank steps]
• Rank: may be any user-defined function, e.g. avoid "bad" machines
• Default is locality first, expected traversal time second

EGEE gLite Scheduling
[Figure: the same diagram with Publish, Update and Query steps, adding a BDII broker cache]

Not only academic
[Figure: overhead ratio versus execution time (s)]
• Long waiting times
• When EGEE was not so heavily loaded

Batch scheduling
• Very complex policies
• Maximise throughput under constraints
  – Weighted fair-share: VOs, type of jobs
  – Priorities
  – Hardware requirements
  – Advance reservations
• An indication of job duration is given by the type of queue: infinite, long, medium, short, and exotic ones
[B. Bode et al. The Portable Batch Scheduler and the Maui Scheduler on Linux Clusters]

Classical vs Grid
• (Relatively) easy:
  – Throughput instead of makespan, together with a master-slave graph instead of a DAG, makes it possible, for instance, to define in polynomial time cyclic schedules that are asymptotically optimal, but not local [Y. Robert] [A. Rosenberg]
• Moderately difficult: information about
  – Applications
  – Infrastructures
    • The same program on different data may run at very different speeds
    • The network performance is dynamic
• Really difficult
  – Queues managed by local policies
  – On-line decisions

Information and Scheduling (I)
• Considerable work has been done on predicting CPU load in shared environments: desktops, clusters, desktop grids [P.A. Dinda, R. Wolski, J.
Schopf]
  – The basic technique is linear time-series analysis:
    $\theta(B)\, a_t = \phi(B)\,(1 - B)^d z_t$
  – Self-similarity and epochal behavior
  – Usual goal is the prediction of the next value
  – Applied to soft real-time scheduling on shared clusters
  – Practical application in NWS

Information and scheduling (III)
• Less work on predicting the behavior of dedicated systems
• Papers are on parallel systems, mostly based on time-series techniques, but at least one is based on a genetic algorithm [Downey, Foster, Wolski]
• The traces are much more difficult to access
• No time slice; irregular time series: the records are event-driven
• Which analysis?
  – Average waiting time: clear but not very useful for prediction
  – Fitting a distribution: not convincing for parallel systems
  – Predicting an upper bound with a confidence interval: metric of success?

Information and grid
• We cannot directly log the entire state of the system
  – Access rights
  – Size
• Currently available data
  – The lifecycle of jobs going through certain brokers
  – The job ranking at the same brokers
  – The detailed behavior of the queues on certain sites
  – Certain = LAL + possibly other mainstream
• Easy to get
  – Summary data about the lifecycle of all jobs
  – From which it could be possible to reconstruct the detailed state and dynamics of the CE

What should we learn?
• Learning besides time series makes sense in a grid: massive use of community programs instead of (?) sparse runs of a very long and complex digital experiment
• Information as sketched before
  – Beware: this will not be a steady-state system
    • New users, new machines and new software are the expected regime for some years from now
    • A community-based resource will tend to display correlated activity
  – Is there an invariant social graph? Is it a feature?
• System algorithms, e.g. a site scheduler or the broker
  – Validation?
• Scheduling algorithms
  – Validation?
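
Illustration: next-value prediction with a linear time-series model
The slides name linear time-series analysis as the basic technique for load prediction, with the next value as the usual goal. The sketch below shows only the simplest special case, an AR(1) model (one autoregressive term, no differencing, no moving-average term), fitted by least squares. The function names `fit_ar1` and `predict_next` and the sample load values are illustrative assumptions, not taken from the slides or from NWS.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of z_t - mu = phi * (z_{t-1} - mu) + a_t.

    Returns (mu, phi). This is an assumed, minimal AR(1) special case
    of the general linear model, not a full ARIMA fit.
    """
    if len(series) < 3:
        raise ValueError("need at least three samples")
    mu = mean(series)
    centered = [z - mu for z in series]
    # Regress each centered value on its predecessor.
    num = sum(cur * prev for cur, prev in zip(centered[1:], centered[:-1]))
    den = sum(prev * prev for prev in centered[:-1])
    phi = num / den if den else 0.0  # constant series: no dependence to fit
    return mu, phi


def predict_next(series):
    """One-step-ahead prediction from the fitted AR(1) model."""
    mu, phi = fit_ar1(series)
    return mu + phi * (series[-1] - mu)


if __name__ == "__main__":
    load = [1, 3, 1, 3, 1, 3]  # hypothetical load samples
    print(predict_next(load))
```

On the alternating sample above the fitted coefficient is -1, so the predicted next value is 1.0. Real load traces would call for model-order selection and differencing, which this sketch leaves out.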