Anticipation of Stochastic Customer Requests in Vehicle Routing: Value Function Approximation based on a Dynamic Lookup-Table

Marlin W. Ulmer, Dirk C. Mattfeld, Felix Köster
Technische Universität Braunschweig, Germany, [email protected]

December 4, 2014

Abstract

We present a vehicle routing problem with stochastic customer requests. A single vehicle has to serve customers in a service area. In the beginning, a set of customers is known and has to be served. During the day, stochastic requests arrive. Respecting a time limit, the dispatcher has to decide whether to reject or confirm each request. Effective decision making requires the anticipation of future requests. We apply approximate dynamic programming (ADP), evaluating appearances and actions regarding the expected number of future customer confirmations. As indicators for the expected value, we use the parameters time and slack. The values are stored in a lookup table (LT). We show that the (static) interval length of the LT-axes significantly impacts the approximation process and the solution quality. Small intervals result in high solution quality, large intervals in a fast and reliable approximation. To combine both advantages, we introduce a dynamic LT. The dynamic LT adapts interval lengths to the problem specifics during the approximation process. The dynamic LT proves to be a generic approach providing an efficient approximation process and effective decision making.

keywords: vehicle routing, stochastic customer requests, value function approximation, approximate dynamic programming, dynamic lookup table

1 Introduction

In this paper, we study a single-period dynamic vehicle routing problem with stochastic customer requests. We assume that a non-capacitated vehicle collects parcels during the day. The vehicle starts and ends its tour in a depot. Early request customers have to be served and are known in advance (e.g., are postponed customers from the day before). During the service period, new requests arrive stochastically within the service area. Arriving at a customer provides a set of new requests.
The dispatcher has to decide whether to permanently accept or reject new requests. The confirmed subset of requests has to be added to the planned tour. Our objective is to maximize the number of collected parcels considering the given time limit reflecting the driver's working hours. As a result, not every request can be served. So, in some cases, rejecting feasible customers in one area might be promising in order to serve more future requests in another.

In last mile delivery, challenges for logistic service providers increase. Especially for courier, express, and parcel services, customers expect reasonably priced, fast, and reliable service (Ehmke 2012). Due to a significant increase in e-commerce sales, the number of shipped parcels has grown significantly in recent years. Beside delivery, service providers offer pickup of parcels (Hvattum et al. 2006). For delivery, regular tours are defined by service network design (Crainic 2000). For parcel pickup, additional vehicles are scheduled dynamically, because some pickups are requested in the course of the day (Lin et al. 2010). For these vehicles, a subset of the pickup requests is stochastic and not known in the beginning of the service period. Further, the requesting customers may be arbitrarily distributed in the whole service region and the locations may not be known beforehand. To include new pickup requests in the tour, the provider has to dynamically adapt the current plan (Gendreau et al. 2006). Due to working hour restrictions, service providers may not be able to serve all dynamic requests. Some requests have to be rejected (Menge and Hebes 2011). Decisions about new requests significantly impact future outcomes. So, myopic plans may lead to substantial detours and many customer rejections in the future. To address this risk, the anticipation of future requests in current decision making is mandatory (Meisel 2011).
For the anticipation of future requests, service providers can derive probabilities of customer requests for certain regions of the service area by making prognoses about customer behavior (Dennis 2011) and analyzing historical data. The focus of this paper is on providing anticipatory confirmation policies. Given a set of new requests, such a policy allows for decisions about confirming or rejecting a request. In the decision process, each decision results in a known system appearance. Such an appearance consists of the point of time and the locations of the vehicle and the remaining customers. These appearances have a significant impact on the number of future confirmations. To exploit this impact, we apply a value function approximation using approximate dynamic programming (ADP, Powell 2007). ADP considers both the immediate and expected future confirmations. The immediate number depends directly on the applied decision. The expected number of future confirmations is approximated for every appearance. Due to the multiplicity of possible appearances, a distinct approximation of every appearance value is not feasible. Therefore, ADP assigns a group of appearances to a simplified (post decision) state of lower dimensionality. Usually, an appearance is reduced to a vector of key parameters (for the given problem, time and slack). To reduce dimensionality, the parameter realizations are assigned to intervals. These intervals generate the state space.

The state space can be described as a lookup table (LT, Sutton and Barto 1998). Every axis of the LT represents a parameter. The axes are divided in equidistant intervals. The number of different entries in the table, and therefore the size of the state space, is inversely proportional to the interval length. The selection of the interval length is essential for the success of ADP. A large interval length, i.e., a small LT, allows a frequent observation of the entries and therefore a fast and reliable approximation. Nevertheless, it may group heterogeneous appearances to a single entry (George et al. 2008). This results in a high value deviation within the observations of an entry. The decision quality decreases. A small interval length, i.e., a large LT, allows a detailed differentiation of the appearances, but entries are observed rarely. So, the approximation requires high computational effort and a large memory consumption (Sutton and Barto 1998). In this paper, we combine the advantages of both small and large interval lengths by defining a dynamic LT (DLT). The DLT is a generic approach and adapts to the problem specifics.

DLTs dynamically change the interval lengths of the parameters according to the approximation process. Assuming different approximation behavior within the LT, "interesting" areas of the DLT are considered in more detail, while the rest of the DLT stays in the original design. This allows both a fast and reliable approximation and, if necessary, a detailed differentiation of the appearances. We compare different DLTs with LTs of static interval length for instances of real-world size. All approaches allow anticipation and increase the number of served customers significantly. For the static LT, we experience a high variance in solution quality regarding different interval lengths. An a priori determination of a suitable interval length is not possible. The DLTs provide excellent solution quality and significantly reduce the required memory consumption.

This paper is outlined as follows. In §2, we present and discuss the related literature focusing on vehicle routing problems with stochastic customer requests. In §3, we recall the general terminology and modus operandi of ADP using a (static) lookup table. To analyze the impact of the LT selection on the approximation process, we present an example highlighting the trade-off between approximation efficiency and solution quality using aggregation within the LT. We use this motivation to develop a new approach using a DLT in §4. In §5, we define the vehicle routing problem and present a variety of real-world sized instances differing in customer distribution, region size, and ratio of dynamic customers. For these instances, we apply ADP using different LTs and analyze the approximation process and the solution qualities in §6. We especially evaluate the approximation efficiency to highlight the advantages of a dynamic lookup table. The paper concludes with a summary of the results and directions for future research in §7.

2 Literature review

We consider a stochastic and dynamic vehicle routing problem (Kall and Wallace 1994). The problem is stochastic, because not all information is provided in the beginning, but is revealed over time. It is dynamic, because the problem setting allows decision making during the service period. The literature on stochastic and dynamic vehicle routing is vast. Stochastic impacts are uncertain travel times (Thomas and White III 2007, Lecluyse et al. 2009, Ehmke et al. 2015), service times (Daganzo 1984, Larsen et al. 2002), stochastic customer demands (Erera and Daganzo 2003, Sungur et al. 2008), and requests (Psaraftis 1980, Ichoua et al. 2007). An extensive overview of the different problem settings is given by Pillac et al. (2013). In the sequel, we focus on work considering stochastic customer requests. For these problems, decisions consider both routing and request confirmations. We first present myopic approaches. Then, we review the anticipatory approaches.

2.1 Myopic vs. Anticipatory

Solution approaches can be divided into myopic and anticipatory. Myopic approaches only use already revealed information, while anticipatory approaches exploit further knowledge (Butz et al. 2003). To deal with stochastic customer requests, myopic approaches are mainly focused on routing, following a greedy confirmation policy. First come, first serve policies are applied by Psaraftis (1980, 1995), Bertsimas and Van Ryzin (1991), Swihart and Papastavrou (1999), and Larsen et al. (2002). Tassiulas (1996) partitions the service region and subsequently serves the subareas. Gendreau et al. (1999, 2006) combine tabu search and an adaptive memory with a rolling horizon algorithm to dispatch customer requests to a fleet of vehicles. Other frequent myopic approaches are basic waiting strategies (e.g., wait at start, wait at end), for instance applied by Mitrović-Minić and Laporte (2004).

For customer anticipation, knowledge about future requests has to be incorporated in the decision process. This knowledge allows forecasts and can be derived from historic observations or depends on predictions of future events. In most approaches, information about the future regards the positions (or distributions) of possible customers and their request behavior over time (Hvattum et al. 2006). Nevertheless, customers can request service at arbitrary locations at every point in time. So, we experience a nearly infinite number of possible problem outcomes and appearances. Due to the high dimensionality of the information, it is necessary to aggregate information to apply optimization algorithms (Rogers et al. 1991, Provost and Fawcett 2013). To include forecasts into optimization, all anticipatory approaches apply a simplification of the problem or within the algorithm. These, mainly static, simplifications have a significant impact on algorithm efficiency and solution quality. On the one hand, only simplified information allows the efficient application of solution algorithms. On the other hand, a simplification has to maintain the problem specific characteristics to effectively achieve good solution qualities (Meisel and Mattfeld 2010). Simplification is applied in the problem setting, by using decomposition, and within the stochastic optimization model. To anticipate future customers, predefined policy functions using aggregated information (e.g., waiting strategies), sampling approaches reducing the number of possible outcomes, and approaches approximating the value of certain groups of appearances are applied.

2.2 Problem Simplification and Decomposition

Many anticipatory approaches use a geographical aggregation (Campbell 2006) to simplify the problem structure. Here, possible customers are represented by a set of nodes in a graph model. So, the vast number of possible customer locations is simplified to a known set of reasonable size. For some problems, this bears the risk of a discrepancy between (aggregated) decision and practical implementation (Ulmer and Mattfeld 2013) and therefore, of inefficient solutions. To achieve suitable practical actions, an appropriate disaggregation policy is required. Another approach to reduce dimensionality is problem decomposition. In some cases, a greedy confirmation policy is combined with anticipatory routing (Thomas and White III 2007). Alternatively, a predefined routing algorithm is applied, while anticipation is confined to the confirmation policy (Schmid 2012). In both cases, the number of possible decisions, and so the decision space dimensionality, can be reduced significantly.

2.3 Policy Function Approximation

In a policy function approximation, decision policies follow certain rules depending on aggregated information. For vehicle routing problems, not every possible future customer is considered. Instead, the information is translated into some key figures to evaluate customers (Powell et al. 1988, van Hemert and La Poutré 2004), certain routes (Branke et al. 2005), or waiting locations and waiting times (Thomas 2007). Mainly, these approaches determine certain routes and waiting locations to insert new requests efficiently in the scheduled tour. Larsen et al. (2004) propose to wait on idle points with high probabilities of future requests. Branke et al. (2005) maximize the probability to include new requests in the tour by using evolutionary algorithms and waiting strategies to find the best tour and waiting position. Ichoua et al. (2006) partition the service area into zones and calculate the different request probabilities. Waiting is only applied near zones with request probability higher than a certain threshold. Thomas (2007) introduces a center-of-gravity waiting strategy. He dynamically calculates the center of gravity for the remaining (feasible) potential customers. The vehicle waits at the customer right before this center. So, it does not pass by a majority of potential customers. Policy function approximations allow the application of efficient optimization algorithms. Nevertheless, the anticipation of future requests is limited due to the static rules and the generally high information aggregation.

2.4 Sampling

Monte Carlo sampling is used to reduce the dimensionality of the optimization problem. Sampling approaches simulate a set of future events to evaluate current decisions. The set of all possible outcomes is represented by a limited number of sampled outcomes. Each sampled outcome generates a lookahead model (Powell 2014). Optimization is applied to these models. Hence, the outcome space is simplified, while the detailed level of information within the outcomes can be maintained. Sampling allows a more detailed consideration of future events, but often requires significant computational effort, depending on the number of generated samples. To anticipate stochastic customer requests in vehicle routing, future customer requests are simulated. These requests are used to evaluate different routes and decisions. Bent and Van Hentenryck (2003, 2004) introduce a multiple plan approach and a multiple scenario approach, where future customer requests are sampled and integrated in plans containing a set of routes. This approach is also used by Sungur et al. (2010) and Flatberg et al. (2007). Ghiani et al. (2009) use short term sampling for a pick-up and delivery problem. Hvattum et al. (2006) approach a real world case study with sampling. They use historical data of customer requests to minimize the expected travel costs. Ghiani et al. (2012) compare the center-of-gravity waiting approach with a sampling method and achieve nearly similar results. Even though the sampling approach requires high computational effort, it is not able to surpass the efficient policy function approximations.
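To make the sampling idea concrete, the following toy sketch (our illustration, not an algorithm from the literature reviewed above) scores each candidate decision by its average outcome over a set of sampled future-request scenarios. All names, the scenario generator, and the evaluation function are hypothetical stand-ins.

```python
import random

# Hypothetical sketch of sampling-based decision evaluation: each candidate
# decision is scored by its average outcome over sampled future scenarios.

def sample_scenario(rng):
    """Stand-in for a sampled sequence of future customer requests:
    here simply a random number of future requests."""
    return rng.randint(0, 5)

def evaluate(decision, scenario):
    """Toy evaluation: immediate confirmations plus the share of sampled
    future requests the (hypothetical) routing can still accommodate."""
    immediate, capacity_left = decision
    return immediate + min(scenario, capacity_left)

def best_decision(decisions, n_samples=1000, seed=42):
    rng = random.Random(seed)
    scenarios = [sample_scenario(rng) for _ in range(n_samples)]
    def score(d):
        return sum(evaluate(d, s) for s in scenarios) / n_samples
    return max(decisions, key=score)

# Decision = (immediate confirmations, remaining capacity for future requests).
# Accepting now (1 confirmation, little slack left) vs. rejecting (0, much slack).
print(best_decision([(1, 1), (0, 4)]))
```

In this toy setting, rejecting the current request wins because the preserved slack accommodates more sampled future requests on average; the computational cost grows linearly with the number of samples, which is the effort/accuracy trade-off described above.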
2.5 Value Function Approximation

A decision mainly depends on the expected future contributions, i.e., the value of an appearance (Bellman 1957a). For problem settings of real-world size, calculating these expected values is computationally intractable and value function approximation (VFA) is applied. Among others, approximation is achieved using ADP (Powell 2009). ADP approximates appearance values by simulation and, therefore, finds reasonable decisions combining immediate and expected future contributions. Again, aggregation techniques are necessary to reduce the manifold appearances to a suitable set of states, which can be efficiently evaluated. Mainly, an appearance is assigned to a vector of key parameters. The parameter selection is essential to achieve anticipation. The parameters must enable grouping similar appearances to identical states. The similarity of appearances depends on the problem specifics and the applied routing and confirmation heuristics. Different routing heuristics may significantly impact the value of an appearance.

In some cases, the value of a state is calculated directly by a weighted set of basis functions depending on the parameters. The basis functions express hypotheses about the correlations between the key parameters and the value. During the simulation runs, the weights are approximated using multiple linear regression. Meisel et al. (2011) use a basis function considering the slack and the average additional time to include the remaining possible customers. They state that the number of expected future confirmations directly depends on the ratio between slack and average time to include a customer.

To allow a direct and unbiased mapping between the key parameters and a value, each parameter vector can be assigned to a value stored in a lookup table (LT). This enables a more detailed evaluation of the appearances without the use of any a priori hypothesis. Given a known set of possible customers combined with decomposition of the algorithm, this allows a nearly exact appearance representation. Meisel et al. (2009) use a vector of customer statuses to describe an appearance. Each state contains information about the time, the vehicle position, and the set of possible future customers. Considering a priori unknown customers distributed in the whole service area, this approach is no longer applicable. Schmid (2012) partitions the service area into zones. A state consists of the number of vehicles and customers in the different zones. In combination with cheapest insertion for routing, Ulmer et al. (2014) use a LT containing the numerical parameters time and slack as appearance representation, achieving an acceptable confirmation policy at the expense of a large state space and therefore, high memory consumption and slow approximation.

The dimensionality of a LT is highly limited due to computational and memory reasons (Sutton and Barto 1998). To reduce dimensionality, the numerical parameters of a LT can be aggregated to intervals. George et al. (2008) analyze the influence of the aggregation level on the approximation process. Results show that the aggregation level has a significant impact on the approximation process and the solution quality. On the one hand, a highly aggregated LT allows for a frequent visit of the entries and supports a fast approximation. On the other hand, for a detailed consideration of the problem specifics, a fine-grained level of aggregation is necessary. In essence, we experience a trade-off between method efficiency and solution quality.

To reduce the number of approximation iterations and to improve solution quality, a dynamical adaption of the LT to the problem and algorithm specifics is promising. George et al. (2008) provide the first approach to combine different levels of aggregation. The values are calculated using a set of LTs differing in level of aggregation. George et al. depict that for some problems this approach results in a fast learning process with good solutions. Nevertheless, the calculation effort and memory consumption using multiple LTs increase drastically, caused especially by the LT with the lowest level of aggregation. Thus, we propose to use a single LT with areas of different aggregation levels (interval lengths), adapted dynamically regarding the approximation process.
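As a rough illustration of such a lookup table (a minimal sketch under our own assumptions, not the implementation of Ulmer et al. 2014), the following class maps a parameter vector such as (time, slack) to an LT entry via a fixed interval length per axis and stores one approximated value per entry:

```python
from collections import defaultdict

# Hypothetical sketch of a static lookup table over numerical key parameters:
# each parameter is discretized with a fixed interval length, and the
# resulting index tuple addresses one LT entry.

class StaticLookupTable:
    def __init__(self, interval_lengths, initial_value=0.0):
        self.interval_lengths = interval_lengths   # one length per parameter
        self.values = defaultdict(lambda: initial_value)

    def entry(self, params):
        """Map a parameter vector, e.g. (time, slack), to an LT index tuple."""
        return tuple(int(p // i) for p, i in zip(params, self.interval_lengths))

    def value(self, params):
        return self.values[self.entry(params)]

    def update(self, params, realized, alpha):
        """Smoothing update: V <- (1 - alpha) * V + alpha * realized."""
        e = self.entry(params)
        self.values[e] = (1 - alpha) * self.values[e] + alpha * realized

lt = StaticLookupTable(interval_lengths=(30.0, 10.0))  # e.g., minutes per axis
lt.update((95.0, 17.0), realized=4.0, alpha=0.5)
print(lt.entry((95.0, 17.0)))   # (3, 1)
print(lt.value((100.0, 12.0)))  # 2.0: same entry, so same approximated value
```

Note how (95.0, 17.0) and (100.0, 12.0) fall into the same entry: larger interval lengths make such groupings more frequent, which is exactly the aggregation trade-off discussed above.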
3 Approximate Dynamic Programming

To define the dynamic LT in §4, we first recall the terminology and modus operandi of ADP. We focus on ADP using post decision states (PDSs) and a LT for evaluation. ADP consists of three main modules, depicted in Figure 1: On the operational level, ADP simulates a problem. For decision making, it extracts an appearance and a set of possible actions, which are assigned to the according set of post decision states. The immediate contributions of the actions and the set of PDSs are used in a Markov Decision Process (MDP, Bellman 1957b) to achieve a decision. For evaluation of the PDSs, the MDP draws on the value function approximation. Finally, the provided decision is assigned back to the specific action applied to the simulation. The outcome of an iteration is used to update the VFA. During the iterations, the VFA approximates the expected PDS values. This subsequently leads to a more accurate evaluation and the solution quality increases. To achieve an efficient and effective approximation process, the interfaces between system and MDP, i.e., the assignments of an appearance-action pair to a PDS for evaluation, are essential.

Figure 1: ADP-Algorithm: Interaction between Problem-System and MDP

For optimization, ADP uses the functionality of a Markov Decision Process (Bellman 1957b). MDPs provide optimal solutions for stochastic and dynamic multi-stage optimization problems of small size. Given a small set of possible states and the probabilities for transitions between the states, an MDP allows the calculation of the optimal decision considering immediate and expected future contributions. In each state s of a finite set S = {s_0, ..., s_n}, a subset of possible decisions depending on state s is given: X_s ⊆ X = {x_1, ..., x_m}. The outcome of each combination (s, x) is known beforehand and is defined as the PDS (s, x) = p ∈ P = S × X. A realization ω of the (stochastic) transition function Ω: P → S leads to the subsequent state s′ = ω(s, x). Each decision x ∈ X in state s ∈ S generates an immediate contribution C: S × X → R. Additionally, it leads to a more or less valuable PDS p. The expected value V(p) is calculated by the (discounted) sum of the expected following contributions. So, the value function can be defined by V: P → R. To select a decision, the immediate contribution C(s, x) and the PDS-value V(p) = V(s, x) are combined. An optimal decision x*_s for state s fulfills:

    x*_s = arg max_{x ∈ X_s} (C(s, x) + V(s, x))    (1)

The expected value of a PDS is calculated recursively using an adaption of the Bellman equation (Bellman 1957a):

    V(s, x) = ∑_{s′ ∈ S} π(s′, (s, x)) V(s′)    (2)

Here, π: P × S → [0, 1] is the probability of the transition between (s, x) and s′. For many problem settings, the number of possible system appearances and actions is vast, so it is impossible to apply the plain MDP. Further, π is not accessible, respectively not computationally tractable. Hence, to apply an MDP to a more complex problem, appearances are grouped into states. The state values are approximated using simulation.

3.1 Value Function Approximation

In every decision point in the operational system (appearance), a set of possible actions is given. The appearance and the actions are assigned to a set of post decision states to apply the MDP, as seen in Figure 1. The MDP chooses the decision leading to the most valuable PDS considering the immediate contributions of the actions, as seen in (1). The expected values are provided by the value function approximation. The chosen decision is reassigned to the according action, which is applied to the system appearance in the simulation. By updating the state values, ADP approximates the value function during the iterations. Here, every post decision state is initialized with value V̂_0. During the simulation, the values are updated with the realized values V̂_run as given in (3):

    V̂_{i+1} = (1 − α) V̂_i + α V̂_run    (3)

The parameter α represents the stepsize of the approximation process. The more frequently a state is visited, the more accurate is the approximation. If states are only visited sparsely, the approximation and the solution quality might be impaired (Sutton and Barto 1998, p. 193). In multistage problems, V̂_run contains the subsequent contributions of one iteration. Hence, further ineffective solutions impact V̂_run. By choosing an appropriate stepsize and initial values, V̂ approximates the values of the PDSs within the iterations.

3.2 State Space Representation

The simulation depicted in Figure 1 models the problem on an operational level, i.e., appearances and actions consist of every detailed information provided by the simulation. To apply the MDP, these appearance-action pairs have to be assigned to post decision states by aggregation of information. Therefore, a vector of key parameters (e.g., point of time) k_i ∈ K_i ∀ i ∈ {1, ..., n} is derived from appearance and action and is used for evaluation. Mainly, parameters can be binary, ordinal, or numerical. There are two different approaches for evaluation: value approximation using a lookup table and the application of weighted basis functions (Powell 2007). In this paper, we consider PDS spaces represented by LTs, because they allow a more accurate and unrestricted mapping between appearances and values. Nevertheless, they often lead to computational intractability and consume large amounts of memory. In the following, we define the classical static LT and show the impact of the LT size on the approximation process and solution quality. As a result, we derive the motivation for the dynamic LT.
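Before turning to the static LT, note that update (3) with the declining stepsize α = 1/n (after the n-th observation) reproduces the plain average of the observed realizations, which is exactly the averaging used in the example of §3.2.1. A short check (our illustration, not from the paper):

```python
# Illustration: applying update (3) with the declining stepsize alpha = 1/n
# after the n-th observation makes V-hat equal the plain average of all
# realized values V_run observed so far.

def smoothed(realizations):
    v_hat, n = 0.0, 0
    for v_run in realizations:
        n += 1
        alpha = 1.0 / n                        # declining stepsize
        v_hat = (1 - alpha) * v_hat + alpha * v_run
    return v_hat

obs = [4.0, 6.0, 5.0, 9.0]
print(smoothed(obs))           # 6.0
print(sum(obs) / len(obs))     # 6.0, identical to the running average
```

A constant stepsize would instead weight recent realizations more heavily, which matters when the underlying values drift during the approximation process.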
3.2.1 Static Lookup Table

A PDS p is represented by a set of key parameters p = (k_1, k_2, ..., k_n), k_i ∈ K_i, ∀ i ∈ {1, ..., n}. With no loss of generality, we only consider numerical parameters. The resulting LTs might be extended by additional non-numerical parameters. The PDS space is evaluated by an n-dimensional LT. Each axis of the LT represents a parameter. The axes are divided into static intervals. The PDSs are assigned to the entries of the static LT (SLT) by mapping parameter values to the LT-intervals. The size of a SLT is defined by the interval lengths and is essential for the approximation process. Small interval lengths allow a highly accurate representation of an appearance. Nevertheless, they impede the approximation process and are computationally challenging. Additionally, a large SLT requires a significant amount of memory (Sutton and Barto 1998). For a certain parameter, let K^1 be the parameter set with the smallest interval length. Now, within K^1, we can aggregate the parameters to intervals of size I. For a particular parameter, this leads to:

    |K^I| ≤ |K^1| / I    (4)

Let I be the interval length of all K_i^I, i ∈ {1, ..., n}, and P^I the resulting LT of the post decision state space. Then, combining several parameters, we can calculate the impact of uniform aggregation on the size of P^I:

    |P^I| = ∏_{i=1}^{n} |K_i^I| ≤ (1 / I^n) ∏_{i=1}^{n} |K_i^1|    (5)

In essence, the number of possible LT-entries decreases significantly given larger interval lengths within the parameter sets.

Let us show by a simplified, single-stage example how the interval length impacts the approximation process. Given a single parameter, in the lowest level of aggregation the according SLT contains four entries P = {p_1, ..., p_4}. Additionally, we consider an aggregated SLT P̄ = {p̄}. The value of every entry in P follows a normal distribution with expected value V(p_i) and variance σ²(p_i). Additionally, let ν_i be the probability of observing p_i. The according values are shown in Table 1. Here, the entry values behave heterogeneously; the expected values and deviations rise from p_1 to p_4. The according values of p̄ are a result of the aggregation.

Table 1: Expected Entry Values and Deviation

             p_1      p_2      p_3      p_4      p̄
    V        1.0      2.0      4.0      6.0      3.0
    σ²       0.0      1.0      3.0      4.0      4.3
    ν        0.2      0.3      0.4      0.1      1.0
    n*_i     0        1,537    4,610    6,146    6,590

To show the impact of the SLT size on the approximation process, we calculate the expected necessary number of iterations n*_i for every entry and the total number n* for termination, i.e., for sufficient approximation in every entry. As a termination criterion, we allow a difference of the average values V̂ to the expected values of up to 0.05. Further, we calculate the number of iterations n̄ for the aggregated entry p̄ and compare the results. For each entry, we calculate the distribution of the average of the realizations (i.e., α = 1/n). Then, we derive the probability P_k that the average lies in the allowed deviation range after k entry iterations. We determine n*_i as the minimal number of iterations with probability higher than 95.0%. Let V̂_l(p_i) be the value of the l-th entry realization of p_i. Then, the minimum number of iterations for entry p_i can be calculated as seen in Formula (6):

    n*_i = arg min_{k ∈ N+} { P_k( |V(p_i) − (1/k) ∑_{l=1}^{k} V̂_l(p_i)| ≤ 0.05 ) ≥ 0.95 }    (6)

Additionally, the total number of iterations n* is the maximum number of individual iterations considering the probability of observing the entry:

    n* = max_{i ∈ {1,...,4}} { n*_i / ν_i }    (7)

Hence, rarely visited entries increase the necessary number of iterations of the algorithm. In the example, p_4 requires the most visits for termination with n*_4 = 6,146. Due to the probability ν_4 = 0.1 of observing p_4, the expected number of runs for termination of the whole process is n* = n*_4 / 0.1 = 61,460.

We now show that aggregation can reduce the number of iterations significantly. Therefore, the entries are aggregated to one single entry P̄ = {p̄}. The new expected value V(p̄) is the weighted sum of the single expected values:

    V(p̄) = ∑_{i=1}^{4} ν_i V(p_i) = 3.0    (8)

The variance σ²(p̄) can be calculated as:

    σ²(p̄) = ∑_{i=1}^{4} ν_i (σ²(p_i) + V(p_i)²) − (∑_{i=1}^{4} ν_i V(p_i))² = 4.3    (9)

The probability distribution of V(p̄) is the weighted sum of the single distributions. The number of necessary iterations to achieve a maximal deviation of 0.05 with a probability of at least 95% is n̄ = 6,590.
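The aggregated values in Table 1 can be verified directly: V(p̄) follows (8) as the mixture mean, and σ²(p̄) follows (9) as the mixture variance over the four entry distributions (our re-computation):

```python
# Verifying the aggregated entry's moments from Table 1: the mean of p-bar
# follows (8) and its variance follows (9), i.e., the law of total variance
# over the mixture of the four entry distributions.

v   = [1.0, 2.0, 4.0, 6.0]   # expected entry values V(p_i)
var = [0.0, 1.0, 3.0, 4.0]   # entry variances sigma^2(p_i)
nu  = [0.2, 0.3, 0.4, 0.1]   # observation probabilities nu_i

mean_bar = sum(n * x for n, x in zip(nu, v))
var_bar  = sum(n * (s + x * x) for n, s, x in zip(nu, var, v)) - mean_bar ** 2

print(mean_bar)  # 3.0
print(var_bar)   # 4.3 (up to floating-point rounding)
```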
The number of necessary iterations is reduced by 89.3% compared to P. For our example, aggregation allows a significantly faster approximation. Nevertheless, aggregation has a large impact on decision making. As we can see from (9), the variance of V(p̄) exceeds the variance of all original entries p_1, ..., p_4. Aggregation results in a rise of the deviation of the entry value. We additionally experience a bias |V(p̄) − V(p_i)| of up to 3.0 for all former states. Using p̄ results in a less accurate representation and may lead to ineffective decisions.

We now show the influence of the aggregation on the solution quality. We consider a decision point s and two possible decisions x_a, x_b. Decision x_a leads to post decision state (or entry) p_1, x_b to p_4. The immediate contributions are C(x_a) = 2.0 and C(x_b) = 1.0. Considering (1), in P, the overall values are C(x_a) + V(p_1) = 3.0 and C(x_b) + V(p_4) = 7.0. Hence, we choose x_b and achieve an expected overall outcome of 7.0. With the aggregated SLT P̄, the two decisions result in the same entry p̄. Hence, decision x_a is chosen, with an overall outcome of 3.0. Due to aggregation, we experience a loss of 4.0.

In essence of the example, aggregation allows a faster approximation, but simultaneously leads to a loss in solution quality. Obviously, the selection of the interval lengths strongly impacts the approximation process. A small tuning results in a significantly different and hard to predict behavior of the approximation process. Given an SLT, the interval lengths have to be defined
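The aggregation loss in this example can be traced numerically; the following snippet re-computes the worked example (entry values and immediate contributions as above):

```python
# Re-computation of the worked example: decision x_a leads to entry p1,
# x_b to p4. Under the fine-grained SLT the entry values differ; under the
# aggregated SLT both decisions map to p-bar with value 3.0.

C      = {"x_a": 2.0, "x_b": 1.0}      # immediate contributions
V_fine = {"x_a": 1.0, "x_b": 6.0}      # V(p1), V(p4)
V_agg  = {"x_a": 3.0, "x_b": 3.0}      # both decisions map to V(p-bar)

def choose(values):
    """Decision rule (1): maximize immediate contribution plus PDS value."""
    return max(C, key=lambda x: C[x] + values[x])

fine_choice = choose(V_fine)                   # "x_b": 1.0 + 6.0 = 7.0
agg_choice  = choose(V_agg)                    # "x_a": 2.0 + 3.0 > 1.0 + 3.0
realized = C[agg_choice] + V_fine[agg_choice]  # true value of aggregated choice
print(fine_choice, agg_choice, 7.0 - realized) # loss of 4.0 due to aggregation
```

The aggregated table cannot distinguish the two post decision states, so the decision rule falls back on the immediate contributions alone and picks the worse option.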