Int. J. Production Economics 133 (2011) 586–595

journal homepage: www.elsevier.com/locate/ijpe

Scheduling with processing set restrictions: PTAS results for several variants

Leah Epstein a,*, Asaf Levin b

a Department of Mathematics, University of Haifa, 31905 Haifa, Israel
b Faculty of Industrial Engineering and Management, The Technion, Haifa, Israel

Article history: Received 29 April 2009; Accepted 25 April 2011; Available online 13 May 2011

Keywords: Scheduling; Approximation scheme

Abstract

We consider multiprocessor scheduling to minimize makespan. Each job has a given processing time and, in addition, a subset of machines associated with it, also called its processing set. Each job has to be assigned to one machine in its set, thus contributing to the load of this machine. We study two variants of this problem on identical machines: the case of nested processing sets, and the case of tree-hierarchical processing sets. In addition, we consider uniformly related machines with a special case of inclusive processing sets, which has a clear motivation. We design polynomial time approximation schemes for these three variants. The first case resolves one of the open problems stated in the survey by Leung and Li (2008).

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

We study multiprocessor scheduling problems with processing set restrictions (Leung and Li, 2008). Such problems are motivated by situations where machines have different expertise levels with respect to the tasks which need to be processed. These restrictions can result from lack of memory of some processors, limited speed, or specific components which only subsets of the machines are equipped with. In makespan minimization problems, which are the type of problem considered in this paper, a set of input jobs is partitioned among a number of machines.
Each machine receives a subset of the job set, and the load of a machine is the total time required to process all the jobs which were assigned to it. The makespan is the maximum load of any machine, and the goal is to minimize the makespan.

The most general variant of multiprocessor scheduling with processing set restrictions is the model of restricted assignment (Azar et al., 1995). We denote the set of jobs by $\mathcal{J}$. Each job $j \in \mathcal{J}$, of processing time $p_j$, is provided with a subset $M_j$ of the machine set $M = \{1, 2, \ldots, m\}$. The number of machines $|M| = m$ is arbitrary, and it is seen to be a part of the input. Job $j$ needs to be assigned to one of the machines of $M_j$ to be processed there. The sets $M_j$ of the different jobs do not necessarily need to satisfy any particular relation.

We use OPT to denote an optimal solution as well as its cost (i.e., its makespan). For a polynomial time algorithm $A$, we denote its cost by $A$ as well. The approximation ratio of $A$ is the infimum $R$ such that for any input, $A \le R \cdot \mathrm{OPT}$. If the approximation ratio of $A$ is at most $r$, then we say that it is an $r$-approximation.

* Corresponding author. E-mail addresses: [email protected] (L. Epstein), [email protected] (A. Levin). 0925-5273/\$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.ijpe.2011.04.024

The case of identical machines (Graham, 1966) is a special case of restricted assignment, where for each job $j$, $M_j = M$, that is, every job can run on any machine. This problem is known to be NP-hard in the strong sense, and therefore it does not admit a fully polynomial time approximation scheme (FPTAS) unless P = NP. Such a scheme is a class of approximation algorithms which contains a $(1+\varepsilon)$-approximation for any $\varepsilon > 0$, where an algorithm of this class must have a running time which is polynomial in the input size and in $1/\varepsilon$ (see also Kang and Ng, 2007).
The identical machines variant admits, however, a polynomial time approximation scheme (PTAS), a similar concept, where the running time does not need to be polynomial in $1/\varepsilon$, i.e., $\varepsilon$ is seen as a constant in the running time calculation. The first PTAS for this problem was designed by Hochbaum and Shmoys (1987).

On the other hand, restricted assignment is an important special case of scheduling on unrelated machines (Lenstra et al., 1990; Liao and Lin, 2003; Weng et al., 2001), in which the time to process job $j$ on machine $i$, denoted by $p_{i,j}$, may be influenced by the properties of the job and of the machine. For this case, approximation algorithms with an approximation ratio of $2$ and $2 - 1/m$ are known (Lenstra et al., 1990; Shchepin and Vakhania, 2005). Moreover, an inapproximability result was shown in Lenstra et al. (1990), showing that unless P = NP, the problem cannot be approximated within a factor smaller than $3/2$. The proof in fact holds for the more specialized restricted assignment model, and therefore, it is probably possible to design polynomial time approximation schemes only for special cases. In this paper, we consider special cases which were studied in the past.

While the current best approximation ratio for the restricted assignment scheduling problem is still $2 - 1/m$, recently Ebenlendr et al. (2008) improved this result for the special case where $|M_j| = 2$ for each job $j$, and obtained a $7/4$-approximation for this case.

It is reasonable to assume that some hierarchy can be defined on the machine set. The inclusive or hierarchical model (Kafura and Shen, 1977; Bar-Noy et al., 2001) refers to a case where for any two jobs $j$ and $j'$ such that $|M_j| \le |M_{j'}|$, $M_j \subseteq M_{j'}$ holds. In other words, the machines can be renamed so that each $M_j$ is a suffix of the machine set. A PTAS for this case was shown by Ou et al. (2008).
In the same paper (Ou et al., 2008), a fast heuristic with an approximation ratio of at most $4/3$ was developed for this case, improving previous results on fast heuristics by Kafura and Shen (1977), Spyropoulos and Evans (1985), Hwang et al. (2004), and Glass and Kellerer (2007).

An interesting generalization is the nested version, where for any two jobs $j$ and $j'$ such that $|M_j| \le |M_{j'}|$, either $M_j \subseteq M_{j'}$ or $M_j \cap M_{j'} = \emptyset$ holds. In other words, the set of distinct processing sets is a laminar family of sets which can be represented by a tree, where a vertex $v$ is the parent of a vertex $u$ if the corresponding processing sets $V$ and $U$ satisfy $U \subsetneq V$, and there is no processing set $X$ which satisfies $U \subsetneq X \subsetneq V$. It is moreover possible to assume that each processing set corresponds to a consecutive subset of machines, and that for every machine, the set which consists of this machine as a single element is a leaf (for which the set of jobs is possibly empty). Therefore, the size of the tree is polynomial in $m$ (see Section 3). Glass and Kellerer (2007) analyzed a fast greedy heuristic for this variant and showed that its approximation ratio is $2 - 1/m$. An improved $7/4$-approximation algorithm for this version was recently obtained by Huo and Leung (2010). This last algorithm has an approximation ratio of $5/4$ for $m = 2$ and $3/2$ for $m = 3$.

In the tree-hierarchical model (Bar-Noy et al., 2001), a rooted tree $T$, whose set of vertices is $M$, is given. For each job $j$, the set $M_j$ consists of all vertices in the path from the root $R_T$ to a vertex $y_j \in M$. The work of Bar-Noy et al. (2001) mostly considered online algorithms, but it is not difficult to see that an off-line 2-approximation is implied by their results. This model corresponds to a situation where there is a hierarchy of employees, so that there is a director, and each additional employee has a unique direct boss.
This hierarchy is presented as a tree, where the root of the tree is the director, and the direct boss of each employee is the parent node of his node in the tree. The ancestors of a node are all its bosses, including the director. Any work that a certain employee can do can be done by any of his bosses.

Uniformly related machines are a generalization of identical machines (and a special case of unrelated machines), where machine $i$ has a speed $s_i$, and the time to process job $j$ on machine $i$ is $p_j / s_i$. Hochbaum and Shmoys (1988) designed a PTAS for this model. In this paper we consider a special case of the inclusive model of uniformly related machines, which we call the speed hierarchical model, where the subset of machines which can process a given job $j$ is determined by a job parameter $s_j$ which is a lower bound on the speed which this job requires, that is, $M_j = \{ i \mid s_i \ge s_j \}$. The motivation for this model is the scenario in which a job must be completed within a certain amount of time once it starts running, and therefore a machine can process a job if it is fast enough given the speed requirement of the job. A slightly different motivation for the same model is the case where the speeds of the machines are monotonically increasing in the expertise level of the machines and each job has a minimum expertise level at which it can be processed. This case models the familiar situation in which the more senior worker, who can handle a larger set of tasks, is also faster in accomplishing these tasks.

Note that the case where $m$ is seen as a constant, which is not a part of the input, is NP-hard, but as was shown by Horowitz and Sahni (1976), it admits an FPTAS for the most general case of unrelated machines (see also Woeginger, 2009), and therefore for each one of the variants studied here. Therefore, we focus on the case where $m$ is a part of the input.
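The speed hierarchical model defined above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function names `processing_set` and `makespan`, the 0-based machine indices, and the example speeds and jobs are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def processing_set(speeds, required_speed):
    """Machines fast enough for a job: M_j = {i : s_i >= s_j}."""
    return {i for i, s in enumerate(speeds) if s >= required_speed}


def makespan(speeds, jobs, assignment):
    """Maximum machine load; job j placed on machine i takes p_j / s_i time.

    jobs is a list of (p_j, required_speed) pairs and assignment[j] is the
    machine index chosen for job j. A ValueError is raised when a job is
    placed on a machine outside its processing set.
    """
    loads = [Fraction(0) for _ in speeds]
    for j, (p, required_speed) in enumerate(jobs):
        i = assignment[j]
        if i not in processing_set(speeds, required_speed):
            raise ValueError(f"job {j} cannot run on machine {i}")
        loads[i] += Fraction(p) / Fraction(speeds[i])
    return max(loads)


# Hypothetical instance: three machines, three jobs.
speeds = [1, 2, 4]
jobs = [(4, 1), (6, 2), (8, 4)]  # (processing time, required speed)
sets = [processing_set(speeds, req) for _, req in jobs]
```

In this instance the processing sets form a chain under inclusion, which is the inclusive structure the model is a special case of: a job with a higher speed requirement can only use a subset of the machines available to a job with a lower requirement.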
As mentioned above, since the problem is strongly NP-hard already for identical machines, which is a special case of all three models studied here, the best result which we can expect to achieve is a PTAS for each one of the three models. Note that there is no direct relation between the three models; each of them is an interesting and applicable generalization of the case of identical machines and a special case of the model of restricted assignment. Prior to our work, for each one of the variants which we study, a polynomial time algorithm of constant approximation ratio was known, while a PTAS was not known.

Our results: We design a polynomial time approximation scheme for each one of the three variants: the nested model on identical machines, the tree-hierarchical model on identical machines, and the speed hierarchical model on uniformly related machines. The first variant resolves an open problem stated in Leung and Li (2008). Specifically, open problem 2 in the conclusion section of Leung and Li (2008) is concerned with the existence of a PTAS for the nested variant and for the general case of restricted assignment. While the general case does not admit a PTAS by the results of Lenstra et al. (1990), in this paper we show that the nested variant in fact admits a PTAS.

2. Preliminaries

In this section we present common parts of the three schemes (for the three variants). We note that although these parts are common, the details of the correctness proofs differ, and these differences are explained later, in the specific sections.

Throughout the paper, we let $\varepsilon > 0$ be such that $1/\varepsilon$ is a large enough integer (a sufficient assumption for all sections is $\varepsilon \le 1/5$). The outline of our schemes is as follows. We start by "guessing" the value of OPT within a multiplicative factor of $1 + \varepsilon$. By scaling the processing times of all jobs we can assume that $\mathrm{OPT} \le 1$.
We further define the so-called rounded-down instance, which possesses the significant property that the optimal makespan of the rounded-down instance is within a factor of $1 + O(1)\varepsilon$ of the optimal makespan of the original instance. The second phase is an application of a dynamic programming procedure along the tree representing the processing sets in a bottom-up fashion (in the nested case) or in a top-down fashion (in the tree-hierarchical case), or along the set of machines in an increasing order of speeds (in the speed hierarchical model for uniformly related machines), to solve the scheduling problem of the rounded instance optimally or approximately. The last phase of the scheme involves the transformation of the solution of the rounded instance into a feasible solution to the original instance while making sure that the resulting solution has a makespan which remains $1 + O(\varepsilon)$.

We next describe some of the common details. The first step of our schemes computes a value, later called $t$, which is a $(1+\varepsilon)$-approximation of the value of OPT in a sense which is defined in what follows. In order to find the final value $t$, the schemes use the variable $t$ as a parameter. The three variants can be seen as special cases of scheduling on unrelated machines. Hence one can use a 2-approximation algorithm for this problem due to Lenstra et al. (1990) to approximate this problem. Alternatively, for the nested variant we can use the 2-approximation algorithm of Glass and Kellerer (2007) (or the algorithm of Huo and Leung, 2010, which approximates within an even smaller factor), and for the tree-hierarchical model we can use the 2-approximation algorithm of Bar-Noy et al. (2001). We run such a 2-approximation algorithm and denote the makespan of the resulting solution by UB. Let $p_{\max} = \max_{1 \le j \le n} p_j$. We define the following lower bound LB on the optimal makespan.
In all three variants we know that the optimal makespan is at least $UB/2$, and in the nested model and in the tree-hierarchical model we know that the optimal makespan is at least $p_{\max}$. Hence, both in the nested variant and in the tree-hierarchical model we let $LB = \max\{UB/2, p_{\max}\}$, whereas in the speed hierarchical model on uniformly related machines we let $LB = UB/2$. In all cases, we use a parameter $\mu > 0$, and we let $\mu = 5$ in the tree-hierarchical model, $\mu = 6$ in the nested version and $\mu = 7$ in the speed hierarchical model on uniformly related machines. We then apply a binary search (or a geometric-mean binary search) on the interval $[LB, UB]$ to find an interval $[a, b]$ such that the following conditions hold: $b/a \le 1 + \varepsilon$ (or $b = LB$), the solution of our scheme when we use the value $t = b$ returns a feasible solution whose makespan is at most $(1 + \mu\varepsilon)b$, and the solution of our scheme when using the value $t = a$ returns a feasible solution whose makespan is larger than $(1 + \mu\varepsilon)a$. The value of $t$ used in a given step is called the guess of OPT. Note that the number of iterations in this binary search is constant (for a constant value of $\varepsilon$). The final value of $t$ is $b$.

Assume that the current iteration of the scheme uses a value of $t$ such that $\mathrm{OPT} \le t$. It suffices to show that for this case our scheme returns a solution whose makespan is at most $(1 + \mu\varepsilon)t$. For each value of $t$, before the execution of the scheme, we apply scaling on the processing times of all jobs, and hence we can assume without loss of generality that $t = 1$. We conclude the following claim.

Claim 1. To derive a polynomial time approximation scheme for any of the three variants, it suffices to devise a polynomial time algorithm $A$ such that for a fixed constant value of $\mu$, for any set of jobs, $A$ acts in one of the following ways. It either returns a feasible schedule whose makespan is at most $1 + \mu\varepsilon$, or it returns an answer that there is no feasible solution to the problem whose makespan is at most 1.

2.1. Creating the rounded-down instance for the nested variant and for the tree-hierarchical model

The first step is to define small jobs and large jobs. We say that a job $j$ is a small job if its processing time is at most $\varepsilon$, and otherwise we say that $j$ is a large job. We let $\mathcal{L}$ be the set of large jobs, and $\mathcal{S}$ the set of small jobs. Consider a large job $j$, whose processing time is $p_j > \varepsilon$. We round its processing time and denote the rounded processing time of $j$ by $p'_j = \max_{k = 0, 1, 2, \ldots} \{ \varepsilon + k\varepsilon^2 : \varepsilon + k\varepsilon^2 \le p_j \}$. That is, $p'_j$ is a rounded-down value of the form $\varepsilon + k\varepsilon^2$ for integer values of $k$.

We next partition the set of small jobs $\mathcal{S}$ into subsets of common machine processing sets. The partition is defined by formulating an equivalence relation on the small jobs, such that $j \sim j'$ (where $j, j' \in \mathcal{S}$) if $M_j = M_{j'}$. We denote by $\mathcal{S}^1, \ldots, \mathcal{S}^p$ the set of equivalence classes of this relation. We also let $M^i$ be the common machine processing set of the equivalence class $\mathcal{S}^i$, i.e., $M^i = M_j$ for any job $j \in \mathcal{S}^i$. For $i = 1, 2, \ldots, p$, we denote by $s_i$ the total processing time of all jobs in $\mathcal{S}^i$, i.e., $s_i = \sum_{j \in \mathcal{S}^i} p_j$. For all $i = 1, 2, \ldots, p$, we replace the jobs in $\mathcal{S}^i$ by a set of $\lfloor s_i / \varepsilon \rfloor$ new jobs, where each of them has a size of $\varepsilon$, and its processing set is $M^i$. The new jobs are called slices. Such a process was applied to the entire set of small jobs in Alon et al. (1998), but due to the distinct processing sets, we apply it to multiple sets of small jobs.

The set of large jobs with the rounded processing times together with the set of slices allow us to define a new instance of the problem, which we call the rounded-down instance. In the rounded-down instance we have jobs of processing times in the set $S = \{ \varepsilon, \varepsilon + \varepsilon^2, \ldots, \varepsilon + (1/\varepsilon^2 - 1/\varepsilon)\varepsilon^2 \}$. We note that the jobs of size $\varepsilon$ in this instance are either slices or large jobs whose rounded-down size is $\varepsilon$. We note that $|S| = 1/\varepsilon^2 - 1/\varepsilon + 1$ is a constant (i.e., less than $1/\varepsilon^2$).
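The construction of the rounded-down instance can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `round_down_instance`, the representation of a processing set as a `frozenset` of machine indices, and the use of exact rational arithmetic are assumptions made for illustration. Processing times are assumed to be already scaled so that the guess $t$ equals 1.

```python
from collections import defaultdict
from fractions import Fraction


def round_down_instance(jobs, eps):
    """Build the rounded-down instance from a list of (p_j, M_j) pairs.

    eps is a Fraction with 1/eps an integer. Returns (large, slices):
    large is a list of (rounded p_j, M_j) for the large jobs, and slices
    maps each processing set to the number of size-eps slices that
    replace the small jobs having exactly that processing set.
    """
    eps = Fraction(eps)
    large = []
    small_total = defaultdict(Fraction)  # total small-job time per processing set
    for p, machines in jobs:
        p = Fraction(p)
        if p <= eps:
            # Small job: only its contribution to its class total matters.
            small_total[machines] += p
        else:
            # Large job: largest value eps + k*eps^2 not exceeding p.
            k = (p - eps) // (eps * eps)
            large.append((eps + k * eps * eps, machines))
    # floor(s_i / eps) slices for the class with total processing time s_i.
    slices = {m: int(total // eps) for m, total in small_total.items()}
    return large, slices
```

For example, with `eps = Fraction(1, 4)`, a large job of size 3/5 is rounded down to 1/4 + 5/16 = 9/16, and three small jobs of size 1/5, two with processing set A and one with processing set B, yield one slice for A and none for B. Exact fractions are used here only to avoid floating-point errors at the rounding boundaries.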
For every vertex $v$ in $T$ whose associated processing set is $V$, and for every $s = \varepsilon + i_s \varepsilon^2 \in S$, we denote by $n_v(i_s)$ the number of jobs whose machine processing set is $V$ and whose size in the rounded-down instance is $s$. Note that these values can be computed easily from the input. We recall that the rounded-down instance is solved optimally by a dynamic programming procedure.

2.2. Transforming the solution to the rounded instance of the nested and the tree-hierarchical variants into a solution to the original instance

We next describe common features of the last step of the schemes for the nested and the tree-hierarchical model. In this step we need to transform the solution $\mathrm{SOL}_r$ to the rounded-down instance, whose makespan is at most $1 + \varepsilon$, into a solution SOL to the original instance. Large jobs are scheduled in SOL on the same machine as they are scheduled in $\mathrm{SOL}_r$. Note that reverting the processing time of the large jobs from their rounded-down values to their original values may increase the makespan of the schedule by a multiplicative factor of at most $1 + \varepsilon$ (since the ratio between the original processing time and the rounded-down processing time is at most $1 + \varepsilon$). This creates a solution in which the makespan is at most $(1 + \varepsilon)^2$.

It remains to show how to schedule the small jobs. We first define an allocation of space to the small jobs on each machine. Let $\gamma = 3$ in the nested variant, and $\gamma = 2$ in the tree-hierarchical variant. Assume that machine $i$ has exactly $d_i$ slices allocated to it in $\mathrm{SOL}_r$; then the total space reserved on machine $i$ for possible usage by small jobs is $(d_i + \gamma)\varepsilon$. Keeping the scheduled large jobs together with the space reserved for the small jobs gives us a schedule whose makespan is at most $(1 + \varepsilon)^2 + \gamma\varepsilon < 1 + \mu\varepsilon$, where the last inequality holds since $\varepsilon < 1$. We need to show that this space for small jobs suffices to schedule all the small jobs. We pack the small jobs into the spaces in a greedy fashion.
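The makespan bound for the large jobs plus the reserved space can be verified directly. The derivation below is a short worked check that uses only $0 < \varepsilon < 1$ and the parameter values fixed in the text ($\gamma = 3$, $\mu = 6$ in the nested variant; $\gamma = 2$, $\mu = 5$ in the tree-hierarchical variant), so that $\mu = \gamma + 3$ in both cases:

```latex
\begin{align*}
(1+\varepsilon)^2 + \gamma\varepsilon
  &= 1 + (2+\gamma)\varepsilon + \varepsilon^2 \\
  &< 1 + (3+\gamma)\varepsilon
     && \text{since } \varepsilon^2 < \varepsilon \text{ for } 0 < \varepsilon < 1 \\
  &= 1 + \mu\varepsilon .
\end{align*}
```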
The definition of this greedy fashion differs between the two models. In Sections 3 and 4, we present the different details of the schemes for the nested variant and for the tree-hierarchical variant together with the correctness proofs of the different steps.

3. PTAS for the nested version

In this section we present a PTAS for the nested version. As noted above, the laminar family of sets $\{ M_j : j \in \mathcal{J} \}$ can be represented by a rooted tree $T = (V, E)$ such that each distinct set in the family has a corresponding vertex in $T$, where a vertex $v$ is the parent of a vertex $u$ if the corresponding processing sets $V$ and $U$ satisfy $U \subsetneq V$, and there is no processing set $X$ which satisfies $U \subsetneq X \subsetneq V$.

We first show that the tree contains at most $2m - 1$ vertices, and thus the number of vertices is polynomial. We actually prove a stronger property, that the subtree of a vertex $v$ with a processing set $V$ has a size of at most $2|V| - 1$, using induction. This is clearly true for a leaf. For an inner vertex $v$ with a processing set $V$, let $V_1, V_2, \ldots, V_k$ be the processing sets of its children. We have $|V| \ge \sum_{i=1}^{k} |V_i|$, as well as $|V| \ge |V_1| + 1$. Thus, the size of the subtree is at most $1 + \sum_{i=1}^{k} (2|V_i| - 1) \le 2|V| - k + 1 \le 2|V| - 1$ for $k \ge 2$, and for $k = 1$ the size of the subtree is at most $1 + (2|V_1| - 1) = 2|V_1| \le 2|V| - 1$. We conclude that the number of equivalence classes in the relation used for the definition of slices in the rounded-down instance is at most $2m - 1$.

Lemma 2. There is a feasible solution $\mathrm{SOL}_r$ to the rounded-down instance whose makespan is at most $1 + \varepsilon$.

Proof. Consider the solution OPT to the original instance, and recall that $\mathrm{OPT} \le 1$. We next define $\mathrm{SOL}_r$ by modifying OPT. For every large job $j$, $\mathrm{SOL}_r$ processes $j$ on the same machine as OPT. Let $c_i$ denote the total processing time of small jobs on machine $i$ in OPT.
We let $x_i = \lfloor c_i / \varepsilon \rfloor + 1$ be the quota on the number of slices which are allowed to be used on machine $i$, and thus $\varepsilon x_i \ge c_i$. We next allocate the slices to machines such that every machine $i$ receives at most $x_i$ slices, and all slices are allocated to machines which can process them. We allocate the slices to machines in a bottom-up fashion. Consider a leaf of the tree $T$. Then, all slices which have exactly the processing set corresponding to this leaf are assigned to be processed by the machines of this set (of the leaf), after which the leaf is removed from the tree (note that the set of the parent is a superset of the set of the leaf, so all the machines of the set of the removed leaf appear in the set of its parent). Specifically, we allocate the slices which can be processed by the machines of the leaf to these machines such that the quota of each machine is not exceeded. We then update the quota of all machines (according to the number of allocated slices of each machine) and we continue to apply this procedure recursively on the remaining tree.

It remains to show that this process can indeed be applied in every vertex of the tree. Assume otherwise, that is, assume that $v$ is a vertex in $T$ corresponding to a machine processing set $V = M^i$, and assume that $v$ is the first vertex for which we are not able to allocate all slices of $\mathcal{S}^i$ to machines in $M^i$ without exceeding the quota of these machines. We let $r_i = \sum_{\ell \in M^i} x_\ell$ and $C_i = \sum_{\ell \in M^i} c_\ell$, that is, the total quota of the machines in $M^i$ and their total load of small jobs in OPT. Note that in OPT the total processing time of the machines in $M^i$ which is used for small jobs whose processing sets are (not necessarily proper) subsets of $M^i$ is at most $C_i \le \varepsilon r_i$. Since every slice has size $\varepsilon$, and the slices of each class have a total size of at most that of the small jobs which they replace, the number of slices of all machine sets which correspond to vertices in the subtree of $T$ rooted at $v$ is at most $C_i / \varepsilon \le r_i$. Hence, the total quota of machines in $M^i$ is at least as large as the number of slices in the subtree rooted at $v$.
Therefore, there is a sufficient quota to allocate all the slices with processing set $V$, which leads to a contradiction.

Since the rounded-down size of a large job is no larger than its size, and the conversion of small jobs increases the load of a machine $i$ by at most $\varepsilon x_i - c_i \le \varepsilon$, the makespan of the defined solution is at most $\mathrm{OPT} + \varepsilon \le 1 + \varepsilon$. □

We next present a dynamic programming procedure which finds a feasible solution whose makespan is at most $1 + \varepsilon$. Such a schedule exists by Lemma 2.

Lemma 3. There exists a polynomial time dynamic programming procedure to find a feasible schedule to the rounded-down instance whose makespan is at most $1 + \varepsilon$ if such a solution exists.

Proof. Our dynamic programming procedure assumes that every vertex $v$ of the tree initially contains a vector of length $|S|$, $N_v = (n_v(0), n_v(1), \ldots, n_v(|S| - 1))$. The procedure is based on solving the following type of feasibility problems. We define a class of decision problems $P_v(a_0, a_1, \ldots, a_{|S|-1})$, where $v$ is a vertex of the tree (for which the corresponding processing set is $V$), and $\{a_0, a_1, \ldots, a_{|S|-1}\}$ is a set of values, where $0 \le a_i \le n$ is an integer for every $0 \le i \le |S| - 1$. The goal is to consider the vectors $N_u$ for all vertices $u$ of the subtree of $v$ (including $v$), together with $a_i$ additional items of size $\varepsilon + i\varepsilon^2$ for each $i$, where these last jobs may be assigned to any machine of $V$, and to check whether it is possible to assign all these jobs to the machines in $V$ without exceeding a makespan of $1 + \varepsilon$, so that the processing set conditions are kept. The vector $(a_0, a_1, \ldots, a_{|S|-1})$ represents the numbers of jobs that could have been assigned to a wider class of machines, but are nevertheless supposed to be assigned to a machine of $V$. Thus, $a_i$ is the number of jobs whose rounded-down processing time is $\varepsilon + i\varepsilon^2$ and whose processing set is a (proper) superset of $V$, to be considered by the problem $P_v(a_0, a_1, \ldots, a_{|S|-1})$, which tests whether all of them can be assigned validly on the machines of $V$, together with jobs that cannot be assigned to any machine in $M \setminus V$. We note that for every vertex of the tree, the number of such subproblems (for all possible vectors $(a_0, a_1, \ldots, a_{|S|-1})$) is polynomial in the input size, and since the tree has a polynomial size (in $m$), we conclude that the number of such problems is polynomial.

The goal of the algorithm is to check whether all jobs can be assigned. Since there is no processing set which is a proper superset of the root, this goal is achieved by computing $P_r(0, 0, \ldots, 0)$, where $r$ is the root of the tree.

We first make a simplifying assumption that each leaf of $T$ has a processing set which consists of a single machine, and each machine has a leaf which corresponds to the processing set which contains only this machine. In order to make this assumption, we first assume that the processing set of the root is $M$. Otherwise, we add a new root as a parent of the current root. Moreover, if for some machine there is no leaf which corresponds to it as a processing set, then such a leaf vertex is created, and the lowest vertex of the tree which contains it in its processing set will become its parent. The new vertices are added each with an empty set of jobs that have the corresponding processing set as their processing set.

If $v$ is a leaf (and thus we assume $|V| = 1$), then the problem $P_v(a_0, a_1, \ldots, a_{|S|-1})$ can be easily solved by computing the total processing time of the jobs in this subproblem. If it is larger than $1 + \varepsilon$ then the solution to this feasibility problem is NO, and otherwise it is YES.

We next present an algorithm to solve $P_v(a_0, a_1, \ldots, a_{|S|-1})$ assuming that all the subproblems associated with children of $v$ in $T$ have been already solved. We denote by $v_1, v_2, \ldots, v_q$ the set of children of $v$ in $T$.
For all $i$, we denote by $b_i$ the number of jobs for which both the (rounded-down) processing time is equal to $\varepsilon + i\varepsilon^2$ and the processing set is equal to $V$ in the subproblem $P_v(a_0, a_1, \ldots, a_{|S|-1})$, i.e., $b_i = a_i + n_v(i)$. We solve this feasibility problem by an application of another dynamic programming procedure. We denote by $P_i(c_0, c_1, \ldots, c_{|S|-1})$ the corresponding problem while assuming that $v$ has only $v_1, \ldots, v_i$ as its children (and we remove the subtree rooted at $v_j$ for all $i + 1 \le j \le q$). We would like to solve $P_q(b_0, b_1, \ldots, b_{|S|-1})$.

We define the answer to $P_0(0, 0, \ldots, 0)$ to be YES, and $P_0(c_0, c_1, \ldots, c_{|S|-1})$ is defined as NO if $c_j > 0$ for at least one component $j$. We next show how to compute $P_i(c_0, c_1, \ldots, c_{|S|-1})$. We check if there is a vector $(d_0, d_1, \ldots, d_{|S|-1})$ (where $0 \le d_k \le c_k$ for every $k$), such that both the solution to $P_{v_i}(d_0, d_1, \ldots, d_{|S|-1})$ is YES and the solution to $P_{i-1}(c_0 - d_0, c_1 - d_1, \ldots, c_{|S|-1} - d_{|S|-1})$ is YES. That is, there is a partition of the jobs defined by the vector $(c_0, c_1, \ldots, c_{|S|-1})$, where the jobs defined by the vector $(d_0, d_1, \ldots, d_{|S|-1})$ are assigned to machines of the processing set of $v_i$, and the other jobs are assigned to machines of the processing sets of the first $i - 1$ children. If such a vector does not exist, then the solution to this problem is NO. Clearly this test can be carried out in constant time for each such vector $(d_0, d_1, \ldots, d_{|S|-1})$. As a result, the entire computation of $P_i(c_0, c_1, \ldots, c_{|S|-1})$ takes polynomial time (as the number of such vectors is polynomial). Hence, we can solve the problem $P_v(a_0, a_1, \ldots, a_{|S|-1})$ in polynomial time, and so the entire feasibility problem is solvable in polynomial time as well.

We next note that we need to solve the scheduling problem and not the feasibility problem (that is, finding the actual schedule, and not only testing its existence).
However, by keeping backtracks throughout the computation we can easily reveal a feasible solution to the scheduling problem whose makespan is at most 1 þ e if such a solution exists. & are given a rooted tree T ¼ ðM,ET Þ, whose vertex set is the set of machines M, and its root is denoted by RT . We define the rounded-down instance as specified above. Note that the number of equivalence classes in the relation used for the definition of slices in the rounded-down instance is at most m. Thus, if opt r1, the process described above is successful. It remains to transform the solution SOLr to the rounded-down instance into a solution SOL to the original instance. To do so, we need to specify the allocation of small jobs into the spaces. We pack the small jobs into the spaces in a greedy fashion as follows. We pack first the jobs whose processing set is most restrictive. That is, we sort the small jobs in a way that if small jobs j and j0 have processing sets U and V which correspond to vertices u and v, respectively, of the tree, and u is an ancestor of v, then j0 appears before j in the ordering of the small jobs. Then, we use a first fit type heuristic to pack the jobs into the space allocated to the small jobs (so each job is scheduled to the first machine in its processing set which still can process it without violating the condition that the total processing time of jobs in this machine is at most its space). We next prove that this allocation procedure of small jobs to machines can always schedule all jobs. Proof. Consider the solution OPT to the original instance, and recall that optr 1. For every large job j, SOLr processes j on the same machine as OPT. Let ci denote the total processing time of small jobs on machine i in OPT. We let xi ¼ bci =ec þ 1 be the quota on the number of slices allocated to machine i. 
We next allocate the slices to machines such that every machine i receives at most xi slices, and all slices are allocated to machines which can process them. We allocate the slices to machines in a bottomup fashion. Consider a leaf of the tree T which is associated with machine i. Then, if there are sufficient slices (that is, at least xi Þ which include machine i in their processing sets then, we assign exactly xi such slices to be processed by machine i, and otherwise we assign all such slices to machine i. In the first case, the remainder must be processed by an ancestor, while in the second case, the subtree of the vertex which corresponds to machine i can be removed from the tree since the slices associated with it do not affect any other vertex. In both cases we continue by removing the leaf from the tree for the purposes of the assignment of slices (since all the remaining jobs which have machine i in their processing set have the machines associated with the ancestors of the removed leaf in their processing set). We continue to apply this procedure on the remaining tree. It is left to show that this process can indeed be applied in every vertex of the tree. To show this claim we will show that for every vertex of the tree, the total processing time of slices allocated to machines in its rooted subtree, is at least the total processing time of small jobs assigned to the machines in this subtree in OPT. Assume otherwise, that is, assume that v is a vertex in T corresponding to a machine i, and assume that v is the first vertex for which the total processing time of slices allocated to machines in its rooted subtree, is smaller than the total processing time of small jobs assigned to the machines in this subtree in OPT. First note that if there is a descendant of v whose corresponding machine is j, for which this procedure did not assign xj slices, then we can remove the subtree rooted at j from the tree, and prove the claim on the remaining tree. 
This is because the allocation in the subtree was successful (in the sense that no slices associated with any vertices of the subtree remain unassigned) and the number of slices of this subtree does not affect the allocation of the slices in the rest of the tree (as any other slice cannot be allocated to machines in the subtree rooted at $j$). Therefore, we can assume that for each machine in the subtree rooted at $v$ we allocated a number of slices equal to its quota. Hence, in this subtree the processing time allocated to slices of each machine is larger than the total processing time of small jobs on these machines according to OPT. Therefore, there is a sufficient quota to be allocated to all the slices which are associated with $v$ or a descendant of $v$, which leads to a contradiction. □

Lemma 4. The allocation procedure of small jobs is able to allocate each job to a machine in its processing set, such that for every machine $i$ the total processing time of small jobs allocated to machine $i$ is at most $(d_i+3)\varepsilon$.

Proof. Assume otherwise. That is, let $j$ be a job with processing set $V$ which corresponds to the vertex $v$ of the tree $T$, such that $j$ is the first job which the allocation procedure is not able to allocate to a machine in $V$. Note that since $j$ is a small job, we conclude that for every machine $i \in V$ the total processing time of jobs allocated to machine $i$ prior to $j$ is strictly larger than $(d_i+2)\varepsilon$. Summing up over all jobs which were allocated to machines in $V$ and appear in the list of small jobs before $j$ does, we conclude that the total processing time of small jobs whose processing set is either $V$ or a proper subset of $V$ is strictly larger than $\sum_{i \in V}(d_i+2)\varepsilon = \sum_{i \in V} d_i\varepsilon + 2|V|\varepsilon$. In the rounded-down instance, by the feasibility of our solution to this instance, the total size of the slices whose processing set is either $V$ or a proper subset of $V$ is at most $\sum_{i \in V} d_i\varepsilon$.
Since our rounding of small jobs into slices may decrease the total processing time by at most $\varepsilon$ per processing set, we conclude that the total processing time of all small jobs whose processing set is either $V$ or a subset of $V$ is at most $\sum_{i \in V} d_i\varepsilon + (2|V|-1)\varepsilon$, where the last term $(2|V|-1)\varepsilon$ follows from the fact (proved above) that the tree $T$ has at most $2|V|-1$ vertices in the subtree rooted at $v$. This leads to a contradiction. □

Hence, since the running time of our algorithm (for a fixed value of $\varepsilon$) is polynomial, we have established the following theorem.

Theorem 5. The nested version has a polynomial time approximation scheme.

4. PTAS for the tree-hierarchical variant

In this section we present a PTAS for the tree-hierarchical version. In the tree-hierarchical model (Bar-Noy et al., 2001), we

Lemma 6. There is a feasible solution SOLr to the rounded-down instance whose makespan is at most $1+\varepsilon$.

We next describe how to use the tree structure for a dynamic programming procedure which finds a feasible solution whose makespan is at most $1+\varepsilon$. Such a schedule exists by Lemma 6.

Lemma 7. There is a polynomial time dynamic programming procedure to find a feasible schedule to the rounded-down instance whose makespan is at most $1+\varepsilon$ if such a solution exists.

Proof. Our dynamic programming procedure is based on solving two types of feasibility problems for each vertex. These two dynamic programs are solved simultaneously. The first class of decision problems for a vertex $v$ is denoted by $P_v$. Specifically, we consider problems of the form $P_v(a_0,a_1,\dots,a_{|S|-1})$, where $v$ is a vertex of the tree, and $\{a_0,a_1,\dots,a_{|S|-1}\}$ is a set of integral values which satisfy $0 \le a_i \le n$ for every $0 \le i \le |S|-1$.
The value of $a_i$ represents the number of jobs to be subtracted from the subtree rooted at $v$ whose rounded-down processing time is $\varepsilon + i\varepsilon^2$, in the sense that these jobs would be assigned to ancestors of $v$. More formally, this problem is to find out if there is a feasible solution to the machine scheduling problem with the machine set corresponding only to the subtree rooted at $v$, and the set of jobs of the rounded-down instance which are associated with $v$ or a descendant of $v$, so that the number of jobs of size $\varepsilon + i\varepsilon^2$ from this subtree which are not processed by machines in this subtree, i.e., are processed on machines which correspond to ancestors of $v$ in $T$ (not including $v$), is at most $a_i$. We note that for every vertex of the tree the number of such subproblems is polynomial in the input size, and since the tree has polynomial size, we conclude that the number of such problems is polynomial. The goal of the algorithm is to compute $P_{R_T}(0,0,\dots,0)$.

The second class of decision problems for a vertex $v$ is denoted by $\tilde{P}_v$. We consider problems of the form $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$, where $v$ is a vertex of the tree, and $\{b_0,b_1,\dots,b_{|S|-1}\}$ is a set of integral values which satisfy $0 \le b_i \le n$ for every $0 \le i \le |S|-1$. The value of $b_i$ represents the number of jobs which can run on $v$, whose rounded-down processing time is $\varepsilon + i\varepsilon^2$, which will not run on a machine which is a proper descendant of $v$, but will be assigned to $v$ or to an ancestor of $v$. The difference between this problem and $P_v$ is that here the values $b_i$ include the jobs to be executed on $v$ and not only the jobs which are received by its proper ancestors. The problem $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$ will therefore find out if there is a feasible solution to the machine scheduling problem of the machines in the subtree of $v$, excluding $v$, and the set of jobs of the rounded-down instance which are associated with a vertex in the subtree rooted at $v$ (including $v$), where the number of jobs of size $\varepsilon + i\varepsilon^2$ from this subtree which are processed by machines which do not correspond to proper descendants of $v$ (i.e., machines which correspond to either $v$ or ancestors of $v$ in $T$) is at most $b_i$.

If $v$ is a leaf then problem $P_v(a_0,a_1,\dots,a_{|S|-1})$ can be easily solved by computing the total processing time of the jobs which should be assigned to $v$, which is $\sum_{i=0}^{|S|-1} (n_v(i)-a_i)^+ \cdot (\varepsilon + i\varepsilon^2)$, where $z^+ = \max\{z,0\}$ (that is, the total processing time of the jobs which are associated with $v$, neglecting $a_i$ such jobs of size $\varepsilon + i\varepsilon^2$). If this total processing time (of all jobs which $v$ can execute, except for those which are neglected) is larger than $1+\varepsilon$ then the solution to this feasibility problem is NO and otherwise it is YES. As for the problem $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$ for a leaf, the solution to this feasibility problem is YES if and only if all $i = 0,1,\dots,|S|-1$ satisfy $b_i \ge n_v(i)$. This requirement is due to the fact that the jobs which are associated with $v$ must be processed by $v$ or an ancestor of $v$, and since $v$ is a leaf, there are no additional such jobs.

We next present an algorithm to solve $P_v(a_0,a_1,\dots,a_{|S|-1})$ assuming that $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$ was computed for all values of $(b_0,b_1,\dots,b_{|S|-1})$. We first enumerate all the possible assignments of jobs which can be executed on $v$, excluding jobs which are assigned to its proper descendants. An assignment is a partition of these jobs into jobs which $v$ receives, and jobs that its ancestors would receive. Since we are solving $P_v(a_0,a_1,\dots,a_{|S|-1})$, we know how many jobs of each size the ancestors of $v$ receive. Such an assignment is therefore a vector $(b_0,b_1,\dots,b_{|S|-1})$ of integers for which we have $\sum_{i=0}^{|S|-1} (b_i-a_i) \cdot (\varepsilon + i\varepsilon^2) \le 1+\varepsilon$, such that for all $i$, we have both $a_i \le b_i$ and $b_i$ is no larger than the total number of jobs of size $\varepsilon + i\varepsilon^2$ which are associated with a descendant of $v$ or with $v$. Note that the number of such vectors is a constant (for a constant value of $\varepsilon$), since every job has size at least $\varepsilon$ and hence the $\ell_1$ distance of such a vector from $(a_0,a_1,\dots,a_{|S|-1})$ is at most $(1+\varepsilon)/\varepsilon$. If there is at least one such vector for which the solution to the feasibility problem $\tilde{P}_v$ is YES, then also the solution to the feasibility problem $P_v(a_0,a_1,\dots,a_{|S|-1})$ is YES.

We solve $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$ by an application of another dynamic programming procedure. We assume that all the subproblems for $P$ and $\tilde{P}$, associated with children of $v$ in $T$, have been already solved. We denote by $v_1,v_2,\dots,v_q$ the set of children of $v$ in $T$. We denote by $\tilde{P}_v^{i}(c_0,c_1,\dots,c_{|S|-1})$ the problem $\tilde{P}_v(c_0,c_1,\dots,c_{|S|-1})$ in a tree where $v$ has only $v_1,\dots,v_i$ as its children (and we remove the subtree rooted at $v_j$ for all $j \ge i+1$). In order to solve $\tilde{P}_v(b_0,b_1,\dots,b_{|S|-1})$, we solve $\tilde{P}_v^{q}(b_0,b_1,\dots,b_{|S|-1})$, which is equivalent. We define $\tilde{P}_v^{0}(c_0,c_1,\dots,c_{|S|-1})$ to be YES if and only if for every $0 \le t \le |S|-1$, $c_t \ge n_v(t)$ holds, which is consistent with the definition of $\tilde{P}_v(c_0,c_1,\dots,c_{|S|-1})$ in the case where $v$ is a leaf. We next show how to compute $\tilde{P}_v^{i}(c_0,c_1,\dots,c_{|S|-1})$. In order to do that, we examine the assignment of the subtree of the child $v_i$ of $v$. Specifically, we consider which jobs are not received by $v_i$ and its subtree. We denote a vector of such jobs by $(d_0,d_1,\dots,d_{|S|-1})$. In order for $\tilde{P}_v^{i}(c_0,c_1,\dots,c_{|S|-1})$ to be feasible, there must be a combination of two subproblems which are feasible. We check if there is a vector $(d_0,d_1,\dots,d_{|S|-1})$ such that both the solution to $P_{v_i}(d_0,d_1,\dots,d_{|S|-1})$ is YES and, in addition, the solution to $\tilde{P}_v^{i-1}(c_0-d_0,c_1-d_1,\dots,c_{|S|-1}-d_{|S|-1})$ is YES. That is, the set of jobs not processed by the first $i-1$ children of $v$ (and their subtrees) complements the set of jobs not processed by the next child of $v$ and its subtree. If such a vector does not exist, then the solution to this problem is NO. Clearly this test can be carried out in constant time for each such vector $(d_0,d_1,\dots,d_{|S|-1})$ and so the entire computation of $\tilde{P}_v^{i}(c_0,c_1,\dots,c_{|S|-1})$ takes polynomial time (as the number of such vectors is polynomial). Hence, we can solve the problem $P_v(a_0,a_1,\dots,a_{|S|-1})$ in polynomial time and so the entire feasibility problem is solvable in polynomial time. We next note that we need to solve the scheduling problem and not the feasibility problem (that is, we need to find a schedule). However, by keeping backtracks throughout the computation we can easily reveal a feasible solution to the scheduling problem whose makespan is at most $1+\varepsilon$ if such a solution exists. □

Thus, if $\mathrm{OPT} \le 1$, the process described above is successful. It remains to transform the solution SOLr to the rounded-down instance into a solution SOL to the original instance. To do so, we need to specify the allocation of small jobs into the spaces. We pack the small jobs into the spaces in a greedy fashion: we pack first the jobs whose processing set is less restrictive. That is, we sort the small jobs in a way that if small jobs $j$ and $j'$ are associated with vertices $u$ and $v$, respectively, and $u$ is an ancestor of $v$, then $j'$ appears first in the order of the small jobs. Then, we use the following heuristic to pack the jobs into the space allocated to the small jobs.
Each job is scheduled to the machine in its processing set which corresponds to the lowest vertex that can still process it without violating the condition that the total processing time of jobs on this machine is at most its space. We next prove that this allocation procedure of small jobs to machines can always schedule all jobs.

Lemma 8. The allocation procedure of small jobs is able to allocate each job to a machine in its processing set, such that for every machine $i$ the total processing time of small jobs allocated to machine $i$ is at most $(d_i+2)\varepsilon$.

Proof. Assume otherwise. That is, let $j$ be a job which is associated with a vertex $v$ of the tree $T$, such that $j$ is the first job which the allocation procedure is not able to allocate to a machine in its processing set. Note that this failure is declared if $j$ could not be assigned to any machine in its processing set, including the machine of the root. Let $X$ and $X'$ be subsets of vertices, which are defined as follows. $X'$ contains all vertices $u$ for which there exists a job $\tilde{j}$ such that there was an unsuccessful attempt to assign $\tilde{j}$ to the machine of $u$. Consider the connected components of the forest induced by $X'$. We let $X$ be the vertex set of the component of this forest which contains the root of $T$. Let $U$ denote the set of machines associated with the vertices of $X$. This means that for every machine of $U$, there exists some job for which there was an unsuccessful attempt to assign it. The subtrees whose vertices are not in $X$ are such that their roots do not have such a job, and hence all jobs assigned to machines of $U$ are those whose processing sets are contained in $U$, i.e., each one of them is associated with a vertex of $X$. Note that since all allocated jobs are small, we conclude that for every machine $i \in U$ the total processing time of jobs allocated to machine $i$ (excluding $j$, which is not allocated at all) is strictly larger than $(d_i+1)\varepsilon$.
Summing up over all these jobs, we conclude that the total processing time of small jobs which are associated with a vertex in $X$ is strictly larger than $\sum_{i \in U}(d_i+1)\varepsilon = \sum_{i \in U} d_i\varepsilon + |U|\varepsilon$. By the feasibility of our solution to the rounded-down instance, the total size of the slices which are associated with a vertex in $X$ is at most $\sum_{i \in U} d_i\varepsilon$. Since our rounding of small jobs into slices decreases the total processing time by less than $\varepsilon$ per vertex, we conclude that the total processing time of all small jobs which are associated with a vertex of $X$ is smaller than $\sum_{i \in U} d_i\varepsilon + |U|\varepsilon$, and this is a contradiction. □

Hence, since the running time of our algorithm (for a fixed value of $\varepsilon$) is polynomial, we have established the following theorem.

Theorem 9. The tree-hierarchical version has a polynomial time approximation scheme.

5. A PTAS for the speed hierarchical model

In this section, we present our last PTAS. We recall that $s_i$ is the speed of machine $i \in M$, and a job $j$ has a minimum speed level $s^j$, that is, $M_j = \{i \in M : s_i \ge s^j\}$. Our first step is to partition the machines into classes of machines with approximately the same speed. That is, first we assume without loss of generality that the machines are sorted in non-decreasing order of their speeds $s_1 \le s_2 \le \dots \le s_m$. Note that according to this order, the processing set of each job is a suffix of machines. To simplify notation, we let $s_0 = 0$. Next, we define an equivalence relation on the machines, so that two machines $i,i'$ are equivalent if $\lceil \log_{1/\varepsilon} s_i \rceil = \lceil \log_{1/\varepsilon} s_{i'} \rceil$. We use the term class for an equivalence class of this relation. We let the set of classes be $C_1,\dots,C_p$. The set of available jobs for class $C_i$, denoted by $A_i$, is defined to be $A_0 = \emptyset$ for $i=0$, and as follows for $i>0$. Assume that $s'_i$ is the maximum speed of any machine in $C_i$, and denote by $b_i = (1/\varepsilon)^{\lceil \log_{1/\varepsilon} s'_i \rceil}$ an upper bound on the speed of each machine in this class.
$A_i$ consists of all jobs whose processing time is at most $s'_i$ and whose expertise level is at most $s'_i$. That is, $A_i = \{j \in J : p_j \le s'_i,\ s^j \le s'_i\}$. These jobs are available for at least one machine in the class. We next partition the jobs in $A_i$ into small and large jobs with respect to the class $C_i$. We say that a job $j \in A_i$ is a small job with respect to $C_i$ if its processing time is at most $b_i\varepsilon^2$, and the other jobs in $A_i$ (i.e., the jobs of $A_i$ of size in $(\varepsilon^2 b_i, s'_i]$) are called large jobs with respect to $C_i$. Note that a job may be small with respect to some classes and large with respect to other classes (by definition, there are at most two classes in which a given job is large). Moreover, note that if a job $j$ is small with respect to $C_i$, then for every machine $k \in C_i$ the load incurred by scheduling this job on machine $k$ is at most $\varepsilon$ (because the speed of each machine in $C_i$ is larger than $\varepsilon b_i$). For every class $C_i$ we denote by $\mathcal{S}_i$ the set of small jobs with respect to $C_i$, and by $L_i$ the set of large jobs with respect to $C_i$. We have $A_i = \mathcal{S}_i \cup L_i$.

We next describe the rounding of the large jobs with respect to $C_i$. The rounded-down processing time of a job $j \in L_i$ such that $\varepsilon^2 b_i < p_j \le \varepsilon b_i$ is $\max\{b_i(\varepsilon^2 + t\varepsilon^3) : b_i(\varepsilon^2 + t\varepsilon^3) \le p_j,\ t \in \mathbb{Z}\}$. The rounded-down processing time of a job $j \in L_i$ such that $\varepsilon b_i < p_j \le b_i$ is $\max\{b_i(\varepsilon + t\varepsilon^2) : b_i(\varepsilon + t\varepsilon^2) \le p_j,\ t \in \mathbb{Z}\}$. Note that the maximum ratio between the processing time of a job $j \in L_i$ and its rounded-down processing time is at most $1+\varepsilon$. Moreover, note that if $j \in L_i \cap L_{i+1}$ then $p_j > \varepsilon b_i$, so the rounded-down processing time of job $j$ in these two classes is the same (since in this case $b_{i+1} = b_i/\varepsilon$, and $p_j \in (\varepsilon^2 b_{i+1}, \varepsilon b_{i+1}]$).

We next describe the rounding of the small jobs. Let $S^i = \mathcal{S}_i \cap A_{i-1}$ be the subset of $\mathcal{S}_i$ whose elements were available to at least one class prior to $C_i$, i.e., they were available to the fastest machine whose speed is at most $\varepsilon b_i$.
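The class bound $b_i$ and the rounding of large jobs described above can be sketched as follows. This is a minimal illustration, not code from the paper: the function names are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is an assumption made here to avoid floating-point error in the logarithm and in the floor.

```python
from fractions import Fraction
from math import floor


def class_index(speed, eps):
    """Return ceil(log_{1/eps}(speed)) for speed > 0, using exact arithmetic."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    base = 1 / eps              # 1/eps > 1 for 0 < eps < 1
    k, bound = 0, Fraction(1)   # invariant: bound == base ** k
    while bound < speed:        # grow until base**k >= speed
        bound *= base
        k += 1
    while bound / base >= speed:  # shrink while base**(k-1) >= speed
        bound /= base
        k -= 1
    return k


def class_bound(k, eps):
    """Upper bound b = (1/eps)^k on the speeds of the class with index k."""
    return (1 / eps) ** k


def round_down_large(p, b, eps):
    """Rounded-down processing time of a job that is large for a class with bound b."""
    if eps**2 * b < p <= eps * b:
        base, step = eps**2 * b, eps**3 * b   # grid b*(eps^2 + t*eps^3)
    elif eps * b < p <= b:
        base, step = eps * b, eps**2 * b      # grid b*(eps + t*eps^2)
    else:
        raise ValueError("job is not large with respect to this class")
    t = floor((p - base) / step)              # largest t with grid value <= p
    return base + t * step
```

For instance, with `eps = Fraction(1, 2)` and a class bound of 4, a job of processing time 7/2 falls in the second range and is rounded down to 3, so the ratio 7/6 stays below $1+\varepsilon$.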
We first partition $\mathcal{S}_i \setminus S^i$ according to the first machine which can schedule the jobs with respect to the required expertise level. That is, for each machine $t \in C_i$, we let $S^i_t = \{j \in \mathcal{S}_i \setminus S^i : s_{t-1} < s^j \le s_t\}$. For every $t$, we replace the jobs in $S^i_t$ with new jobs which are called slices, whose size is $b_i\varepsilon^2$, and the number of such slices is $\lfloor \sum_{j \in S^i_t} p_j/(b_i\varepsilon^2) \rfloor$. These slices are available for machines $t,t+1,\dots,m$. The rounded-down instance consists of the jobs which are the set of slices of all classes together with the set of other jobs with their rounded-down processing times. Specifically, for each other job on which the process of conversion into slices was not applied, there exists a class in which it is available for at least one machine as a large job. The processing time of such a job in the rounded-down instance is its rounded-down processing time, which is defined in a class in which it is available as a large job for at least one machine. The processing time of the slices is defined above. Note that unlike the previous sections, here there may be slices of very different sizes. All these sizes, however, are powers of $1/\varepsilon$.

Lemma 10. There is a feasible solution to the rounded-down instance whose makespan is at most $1+2\varepsilon$.

Proof. Consider the solution OPT to the original instance, and recall that $\mathrm{OPT} \le 1$. For every job $j$ in the rounded-down instance which is not a slice, SOLr is defined to process $j$ on the same machine as it is assigned to in OPT. Let $c_t$ denote the total processing time of the other jobs (which were replaced by slices in the rounded-down instance) on machine $t \in C_i$ in OPT. We let $x_t = (2 + \lfloor c_t/(b_i\varepsilon^2) \rfloor) \cdot b_i\varepsilon^2$ be the quota on the total processing time of slices allocated to machine $t$.
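The number of slices that replace a group $S^i_t$ and the quota $x_t$ just defined can be computed as in the following sketch. The helper names `slice_count` and `slice_quota` are hypothetical, and exact fractions are again an assumption of the example rather than something the paper prescribes.

```python
from fractions import Fraction
from math import floor


def slice_count(processing_times, b, eps):
    """Number of slices of size b*eps^2 that replace one group of small jobs."""
    size = b * eps**2
    total = sum(processing_times, Fraction(0))
    return floor(total / size)


def slice_quota(c_t, b, eps):
    """Quota x_t = (2 + floor(c_t / (b*eps^2))) * b*eps^2 for a machine of the class."""
    size = b * eps**2
    return (2 + floor(c_t / size)) * size
```

Because of the two extra slices in the quota, $x_t$ always exceeds $c_t$ by more than one slice size, which is the slack used in the contradiction argument of the proof.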
We next allocate the slices to machines such that for every machine $t$, we allocate slices of a total processing time of at most $x_t$ to machine $t$, and all slices are allocated to machines which can process them. We allocate the slices to machines according to the sorted order of the machines (that is, starting from the slowest machine of index 1, and finishing with the fastest machine of index $m$). Consider the slowest machine which we have not considered yet. Assume that this is machine $t \in C_i$. Then, we schedule all slices which can be processed on machine $t$ unless we exceed the quota. If the quota would be exceeded, then we pick the slices in an arbitrary order as long as we do not exceed the quota $x_t$, and schedule them on machine $t$. After scheduling the slices on machine $t$, we move to scheduling slices on machine $t+1$. It remains to show that this process allocates all slices. Slices originate in jobs that were never available as large jobs. For a machine $t$, the slices which can be processed on this machine were converted into slices in the class of machine $t$ or earlier. Thus all the sizes of such slices are at most $\varepsilon s_t$. Assume that the process was not successful. That is, assume that there exists a non-allocated slice at the end of the procedure. Denote by $t$ the fastest machine for which the above procedure was able to assign all available slices to machine $t$ (if there is no such machine then we let $t=0$). Hence, we can assume that the instance consists of machines $t+1,t+2,\dots$ and we need to schedule the slices which become available starting from machine $t+1$. Note that in each of the machines $t' \ge t+1$, where $t' \in C_{i'}$, the total processing time of the slices allocated to machine $t'$ is at least $(1 + \lfloor c_{t'}/(b_{i'}\varepsilon^2) \rfloor) \cdot b_{i'}\varepsilon^2 > c_{t'}$. This is because the size of each available slice for this machine is at most $b_{i'}\varepsilon^2$. Hence, the total processing time of the slices allocated to machines $t+1,t+2,\dots$
is strictly larger than the processing time of the jobs which were replaced by these slices. Hence, since the processing time of the slices which become available at each machine is not larger than the total processing time of the jobs which were replaced by these slices, we derive a contradiction to the assumption that there is a non-allocated slice at the end of the procedure. The claim follows because scheduling jobs with an additional $2b_i\varepsilon^2$ processing time on a machine from class $C_i$ may increase the makespan by at most $2\varepsilon$, as machines from this class have speed of at least $b_i\varepsilon$. □

We next describe how to approximate the optimal solution to the rounded-down instance. That is, we will present a dynamic programming procedure which finds a feasible solution whose makespan is at most $1+3\varepsilon$. Recall that there is such a schedule with makespan at most $1+2\varepsilon$ by Lemma 10.

Lemma 11. There is a polynomial time dynamic programming procedure to find a feasible schedule to the rounded-down instance whose makespan is at most $1+3\varepsilon$ if the makespan of OPT is at most 1.

Proof. Let $k = 1/\varepsilon^2 - 1/\varepsilon$ and $r = 2k = 2/\varepsilon^2 - 2/\varepsilon$. For a machine class $C_i$ and $0 \le a \le r$, we define $f^i_a$ as follows. If $a \le k-1$, then $f^i_a = (\varepsilon^2 + a\varepsilon^3)b_i$, and otherwise $f^i_a = (\varepsilon + (a-k)\varepsilon^2)b_i$. Our dynamic programming procedure will be described by three levels, where each of them is solved by its own dynamic programming procedure. The first level of our dynamic programming procedure is to solve the following feasibility problems for each class $C_i$. A feasibility problem $P^i(\alpha_0,\alpha_1,\dots,\alpha_{k-1},\beta_0,\beta_1,\dots,\beta_r)$ is defined for all non-negative integer values $0 \le \alpha_a \le n$, $0 \le a \le k-1$, and $0 \le \beta_a \le n$, $0 \le a \le r$. $\alpha_a$ is the number of jobs (or slices) of size $f^i_a$ which were already available in the previous class (i.e., belong to $A_{i-1}$) and still need to be scheduled.
$\beta_a$ is the number of jobs of size $f^i_a$ which belong to $A_i$, but would still need to be scheduled after we have completed the scheduling of the jobs to the machines of class $C_i$. To solve this feasibility problem we consider the problem $P^i_{t,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0,\gamma_1,\dots,\gamma_r)$, which is the same problem if we delete machines up to $t-1$ from the set of machines, and we remove the set of jobs which become available prior to machine $t$; however, we keep exactly $\gamma_a$ jobs of size $f^i_a$ from this set which are available from machine $t$. Hence the solution to $P^i(\alpha_0,\alpha_1,\dots,\alpha_{k-1},\beta_0,\beta_1,\dots,\beta_r)$ is exactly the solution to $P^i_{\tilde{t},(\beta_0,\beta_1,\dots,\beta_r)}(\alpha_0,\alpha_1,\dots,\alpha_{k-1},0,0,\dots,0)$, where $\tilde{t}$ is the first machine in $C_i$ (according to the sorted order of machines). The starting conditions are as follows: $P^i_{t+1,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0,\gamma_1,\dots,\gamma_r)$, for $t$ being the last machine in class $C_i$, is YES if $\gamma_a = \beta_a$ for all $a$, and otherwise the answer is NO. We next show how to solve $P^i_{t,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0,\gamma_1,\dots,\gamma_r)$ assuming we have solved all $P^i_{t+1,(\beta_0,\beta_1,\dots,\beta_r)}$ problems. We denote by $y_a$ the number of jobs (or slices) of size $f^i_a$ which start being available on machine $t$ (and were not available on machine $t-1$). Hence, we try all possible vectors $(d_0,d_1,\dots,d_r)$ such that if we schedule on machine $t$ exactly $d_a$ jobs of size $f^i_a$, the load on this machine is at most $1+2\varepsilon$, and we try to find such a vector such that the answer to $P^i_{t+1,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0+y_0-d_0,\gamma_1+y_1-d_1,\dots,\gamma_r+y_r-d_r)$ is YES. If this is possible then the answer to $P^i_{t,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0,\gamma_1,\dots,\gamma_r)$ is YES, and otherwise it is NO. Throughout the computation we assume that if at least one of the arguments of $P^i_{t+1,(\beta_0,\beta_1,\dots,\beta_r)}(\gamma_0+y_0-d_0,\gamma_1+y_1-d_1,\dots,\gamma_r+y_r-d_r)$ is negative, then the answer to this feasibility problem is NO. We note that the number of such vectors $(d_0,\dots,d_r)$ is polynomial in the input size (for fixed values of $\varepsilon$).
Moreover, for every class, the number of such subproblems is polynomial in the input size, and since there is a polynomial number of classes, we conclude that the number of such problems is polynomial, and all of them can be solved in polynomial time.

We next show how to use these building blocks in another dynamic programming procedure which solves the feasibility problem. Let $\tilde{P}^i(\alpha_0,\alpha_1,\dots,\alpha_{k-1})$ be the feasibility problem of scheduling jobs with makespan at most $1+2\varepsilon$ consisting of the jobs which become available starting from the first machine in class $C_i$ together with additional $\alpha_a$ jobs of size $f^i_a$ for every $a$, where these additional jobs were available prior to this class and we did not schedule them on machines from the prior classes. Clearly, $\tilde{P}^{p+1}(\alpha_0,\alpha_1,\dots,\alpha_{k-1})$ is YES if and only if $\alpha_a = 0$ for all $a$. We need to compute $\tilde{P}^1(0,0,\dots,0)$.

We next show how to solve the $\tilde{P}^i$ problems based on the solution to the $\tilde{P}^{i+1}$ problems. The solution to $\tilde{P}^i(\alpha_0,\alpha_1,\dots,\alpha_{k-1})$ is YES if and only if there exists a non-negative vector $(\eta_0,\eta_1,\dots,\eta_r)$ for which all the following conditions hold.

1. $P^i(\alpha_0,\alpha_1,\dots,\alpha_{k-1},\eta_0,\eta_1,\dots,\eta_r)$ is YES.
2. We define a vector $(\eta'_0,\dots,\eta'_{k-1})$ using the following rules. For $a \ge 1$ we let $\eta'_a$ be equal to the value of $\eta_{a+k}$ if $b_i = \varepsilon b_{i+1}$, and otherwise we let $\eta'_a$ be 0. Regarding $\eta'_0$, if $b_i = \varepsilon b_{i+1}$, we let $C = b_i \sum_{b=0}^{k} \eta_b f^i_b$ be the total size of the jobs which remained from class $C_i$ and are small jobs of class $C_{i+1}$, and otherwise we let $C = b_i \sum_{b=0}^{r} \eta_b f^i_b$. We let $\eta'_0$ be $\lfloor C/(b_{i+1}\varepsilon^2) \rfloor$, that is, the rounded-down number of slices (with processing time according to the one of class $C_{i+1}$) which we need to schedule to get a feasible solution that packs the jobs remaining at the end of the machines of class $C_i$. The second condition which the vector $(\eta_0,\eta_1,\dots,\eta_r)$ needs to satisfy is that $\tilde{P}^{i+1}(\eta'_0,\dots,\eta'_{k-1})$ is YES.

We note that to get a feasible solution to the rounded-down instance, we need to follow the tracks of a path resulting in a YES answer. When we follow the rounding down of the total size into an integer number of slices, we schedule the remaining slices on the first machine of the class. This incurs an additional $\varepsilon$ to the resulting load of the first machine in a class. □

It remains to transform the solution SOLr to the rounded-down instance into a solution SOL to the original instance. A job which is scheduled in SOLr on a machine where it is a large job is scheduled in SOL on the same machine as in SOLr. It remains to show how to schedule the remaining jobs. We first define an allocation of space to these remaining jobs on each machine. Assume that machine $t$ of class $C_i$ has exactly $d_t$ slices allocated to it in SOLr; then the total space for remaining jobs which is allocated to machine $t$ is $(d_t+2)\varepsilon^2 b_i$. We next show that this space suffices to schedule all the remaining jobs. We schedule the remaining jobs using the spaces in a greedy fashion according to the following rule. For every job $j$, we let $m(j)$ be the first machine on which we can schedule job $j$ as a small job (note that this means also that $m(j)$ belongs to the processing set of job $j$). Then, if there are two remaining jobs $j$ and $j'$ such that $m(j) < m(j')$, then we decide upon the schedule of $j'$ before we consider $j$. The schedule of such a remaining job is defined via the last fit heuristic, i.e., each job is scheduled on the last machine where its allocation would keep the threshold on the load. We next prove that this allocation procedure of remaining jobs to machines can always schedule all jobs.

Lemma 12.
The allocation procedure of remaining jobs is able to allocate each job to a machine in its processing set, such that, for every machine, the total processing time of remaining jobs allocated to this machine is at most its allocated space.

Proof. For a machine $t$ of class $C_i$, we denote $b^t = b_i$. Hence, the space in each machine $t$ is $(d_t+2)\varepsilon^2 b^t$. We let the available load of machine $t$ be the total space which has not already been used to pack remaining jobs on this machine. We note that if there is a remaining job $j$ with $m(j) \le t$ which the scheduling procedure was not able to allocate to machine $t$, then the available load of machine $t$ at this time is smaller than $\varepsilon^2 b^t$ (this is because such a job $j$ would have incurred an increase of at most $\varepsilon^2 b^t$ on the total load of machine $t$, if it were scheduled there). Assume by contradiction that the claim does not hold. That is, let job $j$ be the first job which cannot be scheduled on any machine of its processing set. This means that for all machines $t = m(j),m(j)+1,\dots$ the available load is smaller than $\varepsilon^2 b^t$. Hence, the total processing time of the remaining jobs preceding job $j$ (in our allocation order) is strictly larger than $\sum_{t=m(j)}^{m} (d_t+1)\varepsilon^2 b^t$. Note that in SOLr we schedule all the slices which replaced the jobs preceding $j$, including job $j$, on the machines $t = m(j),m(j)+1,\dots$ (because we did not schedule these jobs as large jobs). The total processing time of these slices is therefore at most $\sum_{t=m(j)}^{m} d_t\varepsilon^2 b^t$. Our rounding down of job sizes (when we convert them into slices) decreases the total processing time by at most $\varepsilon^2 b^t$ for each machine $t$. Therefore, the total processing time of these jobs is at most $\sum_{t=m(j)}^{m} (d_t+1)\varepsilon^2 b^t$, which is a contradiction. □

The process of converting the slices into jobs increases the makespan by at most $2\varepsilon$. Hence, the makespan of the resulting solution is at most $1+3\varepsilon+2\varepsilon = 1+5\varepsilon$.
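The last fit rule analyzed in Lemma 12 can be sketched as follows. This is an illustration under assumptions that are not in the paper: machines are indexed from 0 in non-decreasing order of speed, each remaining job is given as a pair `(processing_time, first_machine)` where `first_machine` plays the role of $m(j)$, and `space[t]` is the space $(d_t+2)\varepsilon^2 b^t$ already computed by the caller.

```python
def last_fit(jobs, space):
    """Schedule remaining jobs by last fit, jobs with larger m(j) first.

    jobs  : list of (processing_time, first_machine) pairs.
    space : list with the space reserved for remaining jobs on each machine.
    Returns a dict job index -> machine; raises ValueError if a job does not fit.
    """
    used = [0] * len(space)
    assignment = {}
    # Jobs whose first admissible machine is later are decided earlier.
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][1], reverse=True)
    for j in order:
        p, first = jobs[j]
        # Scan admissible machines from the last one down to m(j).
        for t in range(len(space) - 1, first - 1, -1):
            if used[t] + p <= space[t]:
                used[t] += p
                assignment[j] = t
                break
        else:
            raise ValueError(f"job {j} does not fit within the allocated space")
    return assignment
```

Scanning from the last machine downward keeps the slower machines free for the jobs considered later, whose $m(j)$ is smaller and which therefore have more admissible machines.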
Note that reverting the processing times of all jobs whose sizes were rounded from their rounded-down values to their original values may increase the makespan of the schedule by a multiplicative factor of at most $1+\varepsilon$ (since the ratio between the original processing time and the rounded-down processing time of such a job is at most $1+\varepsilon$). Such jobs are the jobs which are assigned as large jobs, and also a part of the jobs which participate in slices, specifically, jobs which are available as large jobs on at least one machine. Therefore the final solution has a makespan of at most $(1+5\varepsilon)(1+\varepsilon) = 1+6\varepsilon+5\varepsilon^2 \le 1+7\varepsilon$, where the inequality holds since $\varepsilon \le 1/5$.

Hence, since the running time of our algorithm (for a fixed value of $\varepsilon$) is polynomial, we have established the following theorem.

Theorem 13. The speed hierarchical model has a polynomial time approximation scheme.

6. Concluding remarks

In this paper, we presented three polynomial time approximation schemes for the makespan minimization problem on parallel machines with processing set restrictions. The three variants which we consider were studied before, but no polynomial time approximation schemes for these variants were known prior to the current research. Specifically, the survey paper (Leung and Li, 2008), which motivated our study, states the existence of a polynomial time approximation scheme for the nested variant as a major open problem. We solve this open problem in Section 3, by presenting a PTAS for the nested variant. We also present a PTAS for the tree-hierarchical model, which is an additional variant of restricted assignment with identical speed machines studied in the literature. Even though the types of allowed processing sets are different in the two models, these two schemes have many common features and properties. We conclude this paper by presenting a PTAS for a special case of uniformly related machines with processing set restrictions, which is called the speed hierarchical model.
This study is also motivated by Leung and Li (2008), where it is mentioned that finding a PTAS for special cases of uniformly related machines with processing set restrictions is an important open problem. Our result is a first step toward finding such a scheme. It would be interesting to generalize our result for the speed hierarchical model to uniformly related machines with any inclusive processing sets, and perhaps even for nested processing sets.

Two additional relevant articles were recently published independently of this work: (Li and Wang, 2010; Muratore et al., 2010).

References

Alon, N., Azar, Y., Woeginger, G.J., Yadid, T., 1998. Approximation schemes for scheduling on parallel machines. Journal of Scheduling 1 (1), 55–66.
Azar, Y., Naor, J., Rom, R., 1995. The competitiveness of on-line assignments. Journal of Algorithms 18, 221–237.
Bar-Noy, A., Freund, A., Naor, J., 2001. On-line load balancing in a hierarchical server topology. SIAM Journal on Computing 31, 527–549.
Ebenlendr, T., Krčál, M., Sgall, J., 2008. Graph balancing: a special case of scheduling unrelated parallel machines. In: Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA2008), pp. 483–490.
Glass, C.A., Kellerer, H., 2007. Parallel machine scheduling with job assignment restrictions. Naval Research Logistics 54 (3), 250–257.
Graham, R.L., 1966. Bounds for certain multiprocessing anomalies. Bell System Technical Journal 45, 1563–1581.
Hochbaum, D.S., Shmoys, D.B., 1987. Using dual approximation algorithms for scheduling problems: theoretical and practical results. Journal of the ACM 34 (1), 144–162.
Hochbaum, D.S., Shmoys, D.B., 1988. A polynomial approximation scheme for scheduling on uniform processors: using the dual approximation approach. SIAM Journal on Computing 17 (3), 539–551.
Horowitz, E., Sahni, S., 1976. Exact and approximate algorithms for scheduling nonidentical processors. Journal of the ACM 23 (2), 317–327.
Huo, Y., Leung, J.Y.-T., 2010. Parallel machine scheduling with nested processing set restrictions. European Journal of Operational Research 204 (2), 229–236.
Hwang, H.-C., Chang, S.Y., Lee, K., 2004. Parallel machine scheduling under a grade of service provision. Computers and Operations Research 31 (12), 2055–2061.
Kafura, D.G., Shen, V.Y., 1977. Task scheduling on a multiprocessor system with independent memories. SIAM Journal on Computing 6 (1), 167–187.
Kang, L., Ng, C.T., 2007. A note on a fully polynomial-time approximation scheme for parallel-machine scheduling with deteriorating jobs. International Journal of Production Economics 109 (1–2), 180–184.
Lenstra, J.K., Shmoys, D.B., Tardos, É., 1990. Approximation algorithms for scheduling unrelated parallel machines. Mathematical Programming 46 (1–3), 259–271.
Leung, J.Y.-T., Li, C.-L., 2008. Scheduling with processing set restrictions: a survey. International Journal of Production Economics 116 (2), 251–262.
Liao, C.-J., Lin, C.-H., 2003. Makespan minimization for two uniform parallel machines. International Journal of Production Economics 84 (2), 205–213.
Li, C.-L., Wang, X., 2010. Scheduling parallel machines with inclusive processing set restrictions and job release times. European Journal of Operational Research 200 (3), 702–710.
Muratore, G., Schwarz, U.M., Woeginger, G.J., 2010. Parallel machine scheduling with nested job assignment restrictions. Operations Research Letters 38 (1), 47–50.
Ou, J., Leung, J.Y.-T., Li, C.-L., 2008. Scheduling parallel machines with inclusive processing set restrictions. Naval Research Logistics 55 (4), 328–338.
Shchepin, E.V., Vakhania, N., 2005. An optimal rounding gives a better approximation for scheduling unrelated machines. Operations Research Letters 33 (2), 127–133.
Spyropoulos, C.D., Evans, D.J., 1985. Generalized worst-case bounds for an homogeneous multiprocessor model with independent memories—completion time performance criterion. Performance Evaluation 5 (4), 225–234.
Weng, M.X., Lu, J., Ren, H., 2001. Unrelated parallel machine scheduling with setup consideration and a total weighted completion time objective. International Journal of Production Economics 70 (3), 215–226.
Woeginger, G.J., 2009. A comment on parallel-machine scheduling under a grade of service provision to minimize makespan. Information Processing Letters 109 (7), 341–342.