Serial Inventory Systems with Markov-Modulated Demand: Derivative Bounds, Asymptotic Analysis, and Insights

Li Chen
Samuel Curtis Johnson Graduate School of Management, Cornell University, Ithaca, NY 14853

Jing-Sheng Song
Fuqua School of Business, Duke University, Durham, NC 27708

Yue Zhang
Fuqua School of Business, Duke University, Durham, NC 27708

In this paper we consider the inventory control problem for serial supply chains with continuous, Markov-modulated demand (MMD). Our goal is to reduce the computational complexity by resorting to certain approximation techniques, and, in doing so, to gain a deeper understanding of the problem. To this end, we analyze the problem in several new ways. We first perform a derivative analysis of the problem's optimality equations, and develop general, analytical solution bounds for the optimal policy. Based on the bound results, we derive a simple procedure for computing near-optimal heuristic solutions for the problem. These simple solutions reveal a closer relationship with the primitive model parameters. Second, we perform asymptotic analysis with long replenishment lead time and establish an MMD central limit theorem. We further show that the relative errors between our heuristics and the optimal solutions converge to zero as the lead time becomes sufficiently long, with the rate of convergence being the square root of the lead time. Our numerical results reveal that our heuristic solutions can achieve near-optimal performance even under relatively short lead times. Third, we show that, by leveraging the Laplace transformation, the optimal policy becomes computationally tractable under the gamma distribution family. This enables us to numerically compare various heuristic solutions with the optimal solution, and to demonstrate that our heuristic outperforms existing heuristics in most cases.
Finally, we observe that the internal fill rate and demand variability propagation in an optimally controlled supply chain under MMD exhibit behaviors different from those under stationary demand.

Key words: multi-echelon inventory systems, Markov-modulated demand, derivative, asymptotic analysis
History: file version October 8, 2015

1. Introduction

The increasingly open global economy has made it possible for companies to seek the best available resources and technologies worldwide and to serve new markets. As a result, many supply chains have been stretched long and thin, often consisting of multiple production and distribution stages across different countries and continents. This new supply chain configuration brings new challenges to managers. For example, demand shocks, which could be political, economic, technological, or climatic, in one region can quickly propagate to other regions, causing shortages in some places but oversupplies in others. Thus, companies that traditionally operate their supply chains in a relatively stable environment, such as within one country, must now learn how to effectively rationalize inventory along the global supply chain to hedge against greater environmental uncertainties.

To help managers meet this new challenge, we consider in this paper a serial supply chain model that involves demand uncertainties driven by dynamic environmental factors. Specifically, we assume that customer demand in each period is influenced by a "world" state that evolves according to a discrete-time Markov chain. This demand process is known as the Markov-modulated demand (MMD) process in the literature (Iglehart and Karlin 1962 and Song and Zipkin 1993). The MMD process is useful and practical for its structured data requirement and modeling flexibility.
To construct the Markov chain, one can first identify demand scenarios of different phases in a product development process or in a product life cycle, and then assess the likelihood of each scenario as well as the transition probabilities between the scenarios, much as in the construction of a decision tree (Song and Zipkin 1996b). Chen and Song (2001) showed that a world-state-dependent echelon base-stock policy is optimal for serial inventory systems with MMD. Computing the optimal policy, however, is a nontrivial task. Chen and Song (2001) developed an iterative optimization algorithm, which requires solving a set of integral renewal equations in each iteration. Their algorithm is effective for the discrete demand case (as the integral renewal equations reduce to linear equations), but remains computationally challenging for the continuous demand case. Muharremoglu and Tsitsiklis (2008) showed that the state-dependent echelon base-stock policy continues to be optimal for more general systems with (exogenous and sequential) stochastic lead times and discrete demand processes (including MMD). They decomposed the problem into a series of single-unit single-demand problems, and developed an efficient dynamic programming algorithm for computing the optimal policy. Due to the nature of their single-unit decomposition approach, their algorithm cannot be applied to the continuous demand case. In addition, while representing a significant improvement over the standard dynamic programming algorithms, these exact algorithms work somewhat like a “black box”—numbers in and numbers out, lacking the transparency between the model parameters (inputs) and the optimal inventory decisions (outputs). In this paper we focus on the continuous demand case. Our goal is two-fold. First, we seek to develop new approximation techniques to simplify the computational complexity. 
Such solution techniques enable managers to make swift decisions in choosing appropriate inventory levels along the supply chain to hedge against environmental fluctuations. Second, we strive to make the black box more transparent by establishing a direct relationship between the primitive model parameters and the simple solutions. Such a relationship sheds light on both the qualitative and quantitative effects of the environmental uncertainties. It also helps guide managers to invest wisely to obtain the right information (or problem parameters) and to react quickly in the right direction. To achieve the above goals, we analyze the problem in the following new ways.

Derivative Bounds and Heuristics

Our first approach is to perform a derivative analysis of the problem's optimality equations and develop derivative bounds. The solution bounds obtained from our derivative bounds can be computed without solving the integral renewal equations required in the algorithm of Chen and Song (2001). The derivative bound expressions also reveal a closer relationship with the primitive model parameters, which was not apparent in the algorithms of Chen and Song (2001) and Muharremoglu and Tsitsiklis (2008). From the derivative analysis, we obtain general, analytical solution bounds for serial inventory systems with MMD. The existing bounding approaches for problems with nonstationary demand all build on a simplifying assumption that requires the optimal state-dependent echelon base-stock level to be achievable in each period (e.g., Dong and Lee 2003 and Shang 2012). This assumption, however, does not hold in general, and the "lower bound" obtained from these approaches may fail to bound the optimal solution under MMD. In contrast, our derivative-based approach does not require such an assumption, and thus our solution bounds work in general.
In addition, our solution bounds can be viewed as a generalization of those obtained by Shang and Song (2003) for the independent and identically-distributed (i.i.d.) demand process. When the state space of the Markov chain degenerates to a singleton, the demand process becomes i.i.d. and our bounds reduce to theirs, whereas their bounding approach does not extend to the MMD case (see Appendix D for a detailed discussion). Computing our derivative-based solution lower bound involves optimization over demand state permutations, which can be challenging if there is a large number of states. To simplify computation, we develop easy-to-compute heuristic solutions that only require evaluating a linear combination of derivatives of the newsvendor cost functions of each demand state. To our knowledge, this derivative-based heuristic is the first of its kind in approximating the optimal policy for both single-stage and serial inventory systems with MMD. Moreover, with this heuristic, we can establish a mapping between the underlying Markov chain transition probabilities and the linear weight of each demand state, making the solution process more transparent.

Asymptotic Analysis with Long Lead Time

Our second approach is to perform an asymptotic analysis of the problem with long lead time. In doing so, we first establish an MMD central limit theorem for the lead time demand (i.e., demand aggregated over the lead time periods). We find that, as the lead time becomes sufficiently long, the lead time demands with different initial states all converge to the same normal distribution. To prove this result, we use the concept of α-mixing to show that demands far apart in time under MMD are almost independent (Billingsley 1995).
This finding reveals an inherent averaging effect of aggregation under MMD: While the demand distributions of different states in a single period may be drastically different, the aggregated (lead time) demand increasingly resembles that of an i.i.d. process as the aggregation period increases. Leveraging our derivative analysis and the MMD central limit theorem, we obtain asymptotic results for the solution bounds under long replenishment lead time. Specifically, we show that the relative errors between our solution upper and lower bounds converge to zero as the lead time increases, with the convergence rate being the square root of the lead time. Accordingly, heuristic solutions that fall between our solution bounds, such as our derivative-based heuristic and its simplified variants, are guaranteed to converge to the optimal solution at a minimum speed of the square root of the lead time. This observation suggests that, when the lead time is sufficiently long, the seemingly complex supply chain problem with MMD has surprisingly simple solutions. In proving the above result, we make a new methodological contribution to the literature: one can apply our analysis approach to establish similar results for other inventory systems.

Exact Evaluation and Observations

While solving integral renewal equations is generally challenging for continuous demand, we discover an exception. When demand belongs to the gamma distribution family, we find that the exact algorithm of Chen and Song (2001) becomes tractable with the aid of the Laplace transformation and its inverse (see Appendix C for details). This finding enables us to numerically compare our derivative-based heuristic solutions with the optimal policy under relatively short lead times, complementing the asymptotic results established under long lead times. From our numerical studies, we observe that our heuristic achieves near-optimal performance in most cases.
We further compare our derivative-based heuristic with the decoupling heuristic proposed by Abhyankar and Graves (2001). Their heuristic, designed for a two-stage system with MMD, is based on the following simplifying assumption: the upstream stage provides a 100% internal fill rate, i.e., it can always fulfill orders from the downstream stage. As a result, the two-stage system is decoupled into two single-stage systems, with the downstream stage operating under a state-dependent installation base-stock policy, and the upstream stage operating under a static installation base-stock policy. Our numerical results show that our derivative-based heuristic outperforms their decoupling heuristic in most cases, with an average performance improvement of 4.6%. Besides the performance difference, we note that our derivative-based heuristic can be applied to systems with any number of stages, whereas their decoupling heuristic only works for two-stage systems.

It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand, an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage (see Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD, a natural question is whether the internal fill rate would continue to be low, or become much higher as assumed in the decoupling heuristic of Abhyankar and Graves (2001). To obtain insights into this question, we conduct numerical experiments to measure the (optimal) internal fill rate under MMD. We find that, contrary to the observations documented for the i.i.d. demand case, when the lead time at the upstream stage is long, the internal fill rate can be high under MMD (around 94-96%; the target fill rate at the customer-facing stage is around 94%), suggesting that the decoupling heuristic may yield a good approximation to the optimal policy in this case.
On the other hand, when the lead time at the upstream stage is relatively short, the internal fill rate tends to be low (around 84-93%, lower than the 94% target fill rate at the customer-facing stage). As a result, the decoupling heuristic yields a poor approximation, and our derivative-based heuristic has a clear advantage in this case, as it does not require the high internal fill rate assumption.

In addition, we investigate numerically the demand variability propagation effect, or the bullwhip effect (see Chen and Lee 2009, 2012 and the references therein). Interestingly, we observe that the bullwhip effect can be significantly dampened under the optimal policy, indicating that the state-dependent inventory policy under MMD may smooth demand variability propagation in the supply chain (see Appendix E for details). This contrasts with the existing observations under autoregressive AR(1) demand in the literature, suggesting interesting future research directions.

Finally, we note that our derivative bounds are derived based on the optimality equations of the exact algorithm. Chen and Song (2001) showed that the exact algorithm can be extended to systems with a fixed setup cost at the upmost stage as well as to assembly systems. Thus, all our results can be extended accordingly to those more general systems.

The rest of this paper is organized as follows. We provide a brief literature review in §2. We then analyze a single-stage inventory problem in §3. This lays the foundation for the analysis of serial inventory systems in §4. §5 presents the numerical studies, and §6 concludes the paper. All proofs are presented in Appendix A, and all other supplementary results are included in Appendices B-E.

2. Literature Review

The serial system structure we consider in this paper is the classic multi-echelon inventory model, first developed and analyzed by Clark and Scarf (1960), and further studied by many researchers in various dimensions (see Zipkin 2000 for a review).
It serves as an important baseline model and a key building block for more complex supply chain structures. Much of the literature on this classic model assumes an i.i.d. demand process. Under this assumption, it has been shown that a stationary echelon base-stock policy is optimal. Earlier work focused on the computational efficiency of the optimal policy; e.g., see Federgruen and Zipkin (1984) and Chen and Zheng (1994). More recently, researchers have developed simple bounds and heuristics that aim to increase the transparency of the factors that determine the optimal policy. See, for example, Gallego and Zipkin (1999), Dong and Lee (2003), Shang and Song (2003), Gallego and Ozer (2005), and Chao and Zhou (2007). Our work extends their efforts to systems with MMD.

The MMD process, which extends the i.i.d. demand process to incorporate dynamic state evolution, was first introduced by Iglehart and Karlin (1962) to study a single-stage inventory model, but only gained popularity in the inventory literature in the last few decades; see, for example, Song and Zipkin (1993), Ozekici and Parlar (1993), Beyer and Sethi (1997), and Sethi and Cheng (1997). An important special case of MMD is the cyclic demand model, which various authors explored; see, for example, Karlin (1960), Zipkin (1989), Aviv and Federgruen (1997), and Kapuscinski and Tayur (1998). Several authors also adopted MMD in serial inventory systems, but focused on different aspects of the problem than we do. For example, Song and Zipkin (1992, 1996a, 2009) and Abhyankar and Graves (2001) analyzed specific types of policies, Parker and Kapuscinski (2004) characterized the optimal policy for a two-echelon capacitated system, Huh and Janakiraman (2008) used a sample-path approach to establish the optimality of the echelon base-stock policies, Angelus (2011) considered the complexity of incorporating secondary market sales, and Abouee-Mehrizi et al.
(2014) conducted an exact analysis of capacitated two-echelon inventory systems with priorities.

The most closely related works to ours are Chen and Song (2001) and Muharremoglu and Tsitsiklis (2008). These authors showed that, for a serial inventory system, a state-dependent echelon base-stock policy is optimal. They also presented exact algorithms to compute the optimal policy. Using a decomposition approach similar to that in Muharremoglu and Tsitsiklis (2008), Janakiraman and Muckstadt (2009) developed efficient algorithms for computing the optimal policies for capacitated and lost-sales serial systems. We do not consider these system features in our paper.

There has also been research in the literature on deriving solution bounds for serial inventory systems with nonstationary demand processes. For example, Dong and Lee (2003) developed lower bounds for optimal policies for serial systems with time-correlated demand. The time-series demand model requires past demand information, whereas the MMD model we consider here focuses on external factors. Shang (2012) extended the bounding approach of Shang and Song (2003) to derive solution bounds for serial inventory systems with independent but nonidentical demands (which is a special case of MMD). As discussed earlier, the bounding approaches employed in these papers all build on a simplifying assumption that the optimal state-dependent echelon base-stock level is achievable in each period. This assumption does not hold in general; as a result, the "lower bound" obtained from these approaches may fail to bound the optimal solution under MMD. Our paper complements this literature by deriving solution bounds that work in general under MMD.

3. Single-Stage Inventory Systems

For expositional purposes, we consider a single-stage inventory problem in this section. This allows us to clearly introduce the key ideas to be used later in the analysis for serial inventory systems.
Specifically, the demand for a product is met with on-hand inventory in each period; when there is a stockout, we assume the unmet demand is fully backlogged. Inventory is replenished from an external source with ample supply. The replenishment lead time is a constant of $L$ periods. At the end of each period, a unit inventory holding cost $h$ and a unit backlog penalty cost $b$ are charged on the on-hand inventory and backorders, respectively. The planning horizon is infinite, and the objective is to minimize the long-run average cost of the inventory system. Because the linear ordering cost under the long-run average cost is a constant, we can assume the ordering cost is zero without loss of generality (see Federgruen and Zipkin 1984).

The demand process is nonstationary and has an embedded Markov chain structure (i.e., the MMD process). Specifically, demand in each period is a nonnegative random variable denoted by $D_k$, where $k$ is the demand state that determines the continuous demand distribution with density function $f_k(\cdot)$ and cumulative distribution function $F_k(\cdot)$. The demand state in period $t$ follows a Markov chain $W = \{W(t), t = 1, 2, \ldots\}$, with a total of $K$ states, denoted by $\{1, 2, \ldots, K\}$. The Markov chain is time-homogeneous. Let $P = (p_{ij})$ denote the transition probability matrix, where $p_{ij}$ is the one-step transition probability from state $i$ to $j$. Without loss of generality, we assume the Markov chain is irreducible, which implies that all states communicate with each other. (When the Markov chain is reducible, under the long-run average cost criterion, the problem is equivalent to one involving only the irreducible class of the chain.) Also let $D_k[t, t']$ denote the total demand in periods $t, \ldots, t'$ with the demand state being $k$ in period $t$, and let $D_k[t, t')$ denote the total demand in periods $t, \ldots, t' - 1$.
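To make the demand model concrete, the following is a minimal simulation sketch of the lead time demand $D_k[t, t+L]$, assuming 0-based state indices. The two-state transition matrix, the gamma demand parameters, and the function name `simulate_lead_time_demand` are hypothetical choices for illustration only; they are not taken from the paper's numerical study.

```python
import random

def simulate_lead_time_demand(P, samplers, k, L, rng):
    """Draw one sample of D_k[t, t+L]: total demand over the L+1 periods
    t, ..., t+L, with the demand state equal to k (0-based) in period t."""
    total, state = 0.0, k
    for _ in range(L + 1):
        total += samplers[state](rng)   # demand of the current state
        # One-step transition using row `state` of the matrix P.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total

# Hypothetical example: state 0 = low demand, state 1 = high demand.
P = [[0.9, 0.1],
     [0.3, 0.7]]
samplers = [
    lambda rng: rng.gammavariate(2.0, 5.0),    # gamma demand with mean 10
    lambda rng: rng.gammavariate(2.0, 20.0),   # gamma demand with mean 40
]

rng = random.Random(0)
samples = [simulate_lead_time_demand(P, samplers, 1, 4, rng) for _ in range(2000)]
mean_from_high = sum(samples) / len(samples)
print(f"estimated E[D_k[t, t+L]] starting in the high state: {mean_from_high:.1f}")
```

Samples generated this way can stand in for the lead time demand distribution $F_{k,L}$ whenever a closed form is not available.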
In each period, the sequence of events is as follows: 1) at the beginning of a period, the demand state $k$ is observed; 2) an inventory replenishment order is placed with the supplier; 3) a shipment ordered $L$ periods ago is received from the supplier; 4) demand arrives during the period; and 5) at the end of the period, inventory holding and backorder costs are assessed.

3.1 Preliminaries

It has been shown that a state-dependent base-stock policy $\{s^*(k), k = 1, \ldots, K\}$ is optimal for the problem (see Iglehart and Karlin 1962, Beyer and Sethi 1997, Chen and Song 2001). The policy works as follows: When the demand state is $k$, if the inventory position is below the base-stock level $s^*(k)$, order up to this level; otherwise, do not order. Given the demand state $k$ and inventory position $y$, the single-period expected cost is

$$G(k, y) = h \cdot \mathsf{E}\big(y - D_k[t, t+L]\big)^+ + b \cdot \mathsf{E}\big(D_k[t, t+L] - y\big)^+, \tag{1}$$

where $(x)^+ = \max\{x, 0\}$. The above function is the well-known newsvendor cost function, which is convex in $y$. Its minimizer, denoted by $\bar{s}(k) = \arg\min_y G(k, y)$, is termed the myopic base-stock level. Specifically, $\bar{s}(k)$ solves the following first-order condition:

$$\partial_y G(k, y) = -b + (b + h) F_{k,L}(y) = 0, \quad \text{or} \quad F_{k,L}(y) = \frac{b}{b + h}, \tag{2}$$

where we use "$\partial_y$" to denote the first-order derivative with respect to $y$, and $F_{k,L}(\cdot)$ is the cumulative distribution of the lead time demand $D_k[t, t+L]$. We shall also refer to the function $\partial_y G(k, y)$ as the "newsvendor derivative function" throughout the paper. For later usage, we define the smallest myopic base-stock level among all states and its corresponding demand state as

$$s^{\min} = \min_{1 \le k \le K} \{\bar{s}(k)\}, \qquad k^{\min} = \arg\min_{1 \le k \le K} \{\bar{s}(k)\}.$$

Although the myopic base-stock level $\bar{s}(k)$ is easy to compute, it is not necessarily optimal. The demand state in the next period may result in a (stochastically) much smaller demand distribution; therefore, stocking up to the myopic base-stock level in the current period may cause overstocking in the next period.
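The first-order condition (2) says that the myopic base-stock level is the $b/(b+h)$ quantile of the lead time demand. The sketch below estimates it from samples and checks it against a sample-average version of the cost (1). The gamma samples are a hypothetical stand-in for draws of $D_k[t, t+L]$, and the function names and cost values are assumptions made for illustration.

```python
import math
import random

def newsvendor_cost(samples, y, h, b):
    """Sample-average estimate of G(k, y) = h E(y - D)^+ + b E(D - y)^+."""
    return sum(h * max(y - d, 0.0) + b * max(d - y, 0.0) for d in samples) / len(samples)

def myopic_base_stock(samples, h, b):
    """Empirical b/(b+h) quantile: the smallest sample value y whose
    empirical distribution function satisfies F(y) >= b/(b+h)."""
    ordered = sorted(samples)
    idx = math.ceil(b / (b + h) * len(ordered)) - 1
    return ordered[max(idx, 0)]

# Hypothetical stand-in for samples of the lead time demand of one state.
rng = random.Random(1)
samples = [rng.gammavariate(4.0, 10.0) for _ in range(4000)]

h, b = 1.0, 9.0
s_bar = myopic_base_stock(samples, h, b)
print(f"myopic base-stock level: {s_bar:.2f}")
print(f"cost at s_bar: {newsvendor_cost(samples, s_bar, h, b):.3f}")
```

Repeating the calculation for every state and taking the minimum gives sample-based estimates of $s^{\min}$ and $k^{\min}$.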
More sophisticated methods are needed to determine the optimal policy.

Exact Algorithm. Chen and Song (2001) developed the following $K$-iteration algorithm to compute the optimal policy. The algorithm starts with a partition of the state space of $W$. In each iteration $i$, let $V^i$ denote the set that contains the states for which we have not found the optimal solution yet, and $U^i$ the complementary set of $V^i$. Specifically, in the first iteration $i = 1$, let $V^1 = \{1, \ldots, K\}$ and $U^1 = \emptyset$. Define $G^1(k, y) = G(k, y)$ as given in (1), solve the single-period problem, and let $s^1(k) = \bar{s}(k)$ denote the solution for state $k$. Find the demand state that gives the smallest solution among $V^1$. Denote such a state as $1^* = \arg\min_{k \in V^1} s^1(k) = \arg\min_{1 \le k \le K} \{\bar{s}(k)\}$. Then, the optimal solution for state $1^*$ is just $s^*(1^*) = s^1(1^*) = s^{\min}$. Next, update the state sets by $V^2 = V^1 \setminus \{1^*\}$ and $U^2 = U^1 \cup \{1^*\}$, and proceed to iteration $i = 2$.

In iteration $i + 1$ ($1 \le i < K$), for each $k \in V^{i+1}$, solve the cost minimization problem $\min_y \{G^{i+1}(k, y)\}$, where

$$G^{i+1}(k, y) = G^i(k, y) + \sum_{u \in U^{i+1}} p_{ku}\, \mathsf{E}\{R^i(u, y - D_k)\}, \tag{3}$$

and for any $u \in U^{i+1}$,

$$R^i(u, y) = \begin{cases} \displaystyle\sum_{u' \in U^{i+1}} p_{uu'}\, \mathsf{E}\{R^i(u', y - D_u)\} & \text{if } u \in U^i, \\[2ex] \displaystyle \underline{G}^i(i^*, y) + \sum_{u' \in U^{i+1}} p_{i^*u'}\, \mathsf{E}\{R^i(u', y - D_{i^*})\} & \text{if } u = i^*, \end{cases} \tag{4}$$

with $\underline{G}^i(i^*, y) = G^i(i^*, \max\{y, s^i(i^*)\})$. It can be shown that $G^{i+1}(k, y)$ is convex in $y$. Let $s^{i+1}(k) = \arg\min_y G^{i+1}(k, y)$. Also let $(i+1)^* = \arg\min_{k \in V^{i+1}} s^{i+1}(k)$, i.e., the demand state that gives the smallest $s^{i+1}(k)$ among $V^{i+1}$. Then the optimal solution for state $(i+1)^*$ is given by $s^*((i+1)^*) = s^{i+1}((i+1)^*)$. Update the state sets by $V^{i+2} = V^{i+1} \setminus \{(i+1)^*\}$ and $U^{i+2} = U^{i+1} \cup \{(i+1)^*\}$. Repeat the above procedure until reaching the final iteration $K$.

In summary, the above algorithm finds the optimal base-stock level for each demand state by sorting the demand states based on the solutions to the cost minimization problems $\min_y \{G^{i+1}(k, y)\}$ in each iteration.
Although the algorithm is simpler and more efficient than the standard dynamic programming algorithms, it remains quite complicated because each iteration builds on the results obtained in the previous iterations. The main difficulty lies in determining the function $R^i$ at each iteration, as given in (4). For the discrete demand case, computing $R^i$ is relatively easy, as it involves solving a set of linear equations (see Chen and Song 2001). For the continuous demand case, however, determining $R^i$ requires solving a set of integral renewal equations.

To illustrate the last point more clearly and to facilitate our subsequent analysis, we introduce the following matrix notation. Let $m = (m_1, \ldots, m_K)$ denote a permutation of the sequence of demand states $\{1, \ldots, K\}$. Define the following sub-matrices of transition probabilities under $m$: for $i = 1, \ldots, K - 1$,

$$P^{(i)}_m = \begin{pmatrix} p_{m_1 m_1} & \cdots & p_{m_1 m_i} \\ \vdots & \ddots & \vdots \\ p_{m_i m_1} & \cdots & p_{m_i m_i} \end{pmatrix}. \tag{5}$$

Let $m^* = (1^*, \ldots, K^*)$ denote the optimal demand state sequence determined by the exact algorithm. Let $D^{(i)}_{m^*}(x)$ be the following diagonal matrix involving demand densities for states $1^*, \ldots, i^*$:

$$D^{(i)}_{m^*}(x) = \begin{pmatrix} f_{1^*}(x) & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & f_{i^*}(x) \end{pmatrix}. \tag{6}$$

Also let $\mathbf{R}^i(y) = [R^i(1^*, y), R^i(2^*, y), \ldots, R^i(i^*, y)]^T$. Then, we can rewrite (4) in a matrix form as follows:

$$\mathbf{R}^i(y) = \int_0^\infty D^{(i)}_{m^*}(x) \cdot P^{(i)}_{m^*} \cdot \mathbf{R}^i(y - x)\, dx + e_i \cdot \underline{G}^i(i^*, y), \tag{7}$$

where $\underline{G}^i(i^*, y) = G^i(i^*, \max\{y, s^i(i^*)\})$ as in (4), and $e_i$ is the unit vector with the $i$-th element being one.

Our Approach. Based on the exact algorithm, it is useful to let $[k]$ denote the iteration in which the optimal solution $s^*(k)$ for state $k$ is obtained. In other words,

$$s^*(k) = \arg\min_{y \ge 0} G^{[k]}(k, y).$$

Because $G^{[k]}(k, y)$ is convex in $y$, its derivative is increasing in $y$. If we can find bounding functions for its derivative, then the roots of the bounding functions become bounds for the optimal solution.
Thus, our approach is to develop bounding functions that involve only the primitive model parameters, so that the resulting solution bounds will depend only on the primitive model parameters. To this end, by repeatedly applying (3) and (4) and then taking derivatives, we observe that

$$\partial_y G^{[k]}(k, y) = \partial_y G(k, y) + \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y \mathsf{E}\{R^i(u, y - D_k)\}. \tag{8}$$

The first term is simple, given by (2), so what remains is to find simple functions to bound $\partial_y \mathsf{E}\{R^i(u, y - D_k)\}$. Note that all functions in the exact algorithm are continuously differentiable and all the expectations are finite. Therefore, we can interchange the derivative and expectation (or integral) operators in our subsequent derivations, e.g., $\partial_y \mathsf{E}\{R^i(u, y - D_k)\} = \mathsf{E}\{\partial_y R^i(u, y - D_k)\}$.

3.2 Derivative and Solution Bounds

We now derive bounds for $\partial_y G^{[k]}(k, y)$. Write $\underline{G}^i(i^*, y) = G^i(i^*, \max\{y, s^i(i^*)\})$ as in (4), and observe that $\partial_y \underline{G}^i(i^*, y) = (\partial_y G^i(i^*, y))^+$. Also, from equation (4), it is straightforward to show that $\partial_y R^i(\cdot, y) = 0$ for $y \le s^i(i^*)$. Using a standard result in renewal equations (Sgibnev 2001), we obtain

Lemma 1 For any $u \in U^{i+1}$, $R^i(u, y)$ is increasing convex in $y$. Moreover, for any $k = 1, \ldots, K$, $G^i(k, y)$ is convex in $y$ and $\underline{G}^i(i^*, y)$ is increasing convex in $y$.

According to the above Lemma, the second term of (8) is nonnegative, so we immediately have

$$\partial_y G^{[k]}(k, y) \ge \partial_y G(k, y) = -b + (b + h) F_{k,L}(y).$$

Thus, $\partial_y G(k, y)$ serves as a simple lower bound function for $\partial_y G^{[k]}(k, y)$.

We now proceed to derive an upper bound for $\partial_y G^{[k]}(k, y)$. Because $W$ is irreducible, it follows that $I^{(i)} - P^{(i)}_m$ is invertible for any $i = 1, \ldots, K - 1$, where $I^{(i)}$ is the $i \times i$ identity matrix and $P^{(i)}_m$ is defined in (5). Thus, we can define the following parameters: for $i = 1, \ldots, K - 1$,

$$\beta_i(m) = \frac{\big| I^{(i-1)} - P^{(i-1)}_m \big|}{\big| I^{(i)} - P^{(i)}_m \big|},$$

where $|\cdot|$ is the matrix determinant and we adopt the convention that $|I^{(0)} - P^{(0)}_m| = 1$.
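As a small illustration of the definition of $\beta_i(m)$, the sketch below evaluates the determinant ratio directly from a transition matrix. The three-state matrix is hypothetical, states are indexed from 0, and the helper names are assumptions; the cofactor-expansion determinant is chosen only for brevity and suits small $K$.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only).
    The empty matrix has determinant 1, matching |I^(0) - P^(0)| = 1."""
    if not A:
        return 1.0
    total = 0.0
    for j, a in enumerate(A[0]):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * a * det(minor)
    return total

def i_minus_p(P, m, i):
    """Return I^(i) - P_m^(i), built from the first i states of permutation m."""
    return [[(1.0 if r == c else 0.0) - P[m[r]][m[c]] for c in range(i)]
            for r in range(i)]

def beta(P, m, i):
    """beta_i(m) = |I^(i-1) - P_m^(i-1)| / |I^(i) - P_m^(i)| for i = 1, ..., K-1."""
    return det(i_minus_p(P, m, i - 1)) / det(i_minus_p(P, m, i))

# Hypothetical three-state transition matrix and state ordering (0-based).
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
m = (0, 1, 2)
betas = [beta(P, m, i) for i in (1, 2)]
print(betas)
```

For $i = 1$ the ratio reduces to $1 / (1 - p_{m_1 m_1})$, which gives a quick sanity check on any implementation.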
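Taking the derivative with respect to $y$ on both sides of equation (7), and writing $\underline{G}^i(i^*, y) = G^i(i^*, \max\{y, s^i(i^*)\})$, we have

$$\begin{aligned} \partial_y \mathbf{R}^i(y) &= \int_0^\infty D^{(i)}_{m^*}(x) \cdot P^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y - x)\, dx + e_i \cdot \partial_y \underline{G}^i(i^*, y) \\ &= \int_0^y D^{(i)}_{m^*}(x) \cdot P^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y - x)\, dx + e_i \cdot \partial_y \underline{G}^i(i^*, y) \\ &\le P^{(i)}_{m^*} \cdot \partial_y \mathbf{R}^i(y) + e_i \cdot \partial_y \underline{G}^i(i^*, y), \end{aligned}$$

where the inequality follows from the fact that $\partial_y R^i(\cdot, y)$ is increasing in $y$ and the probability distribution is less than one. Therefore,

$$\big( I^{(i)} - P^{(i)}_{m^*} \big)\, \partial_y \mathbf{R}^i(y) \le e_i \cdot \partial_y \underline{G}^i(i^*, y). \tag{9}$$

Let $e$ be the $i$-dimensional vector with all elements being one. With some additional argument, we obtain the following upper bound on $\partial_y \mathbf{R}^i(y)$:

Lemma 2 For $i = 1, \ldots, K - 1$,

$$\partial_y \mathbf{R}^i(y) \le \big( I^{(i)} - P^{(i)}_{m^*} \big)^{-1} \cdot e_i \cdot \partial_y \underline{G}^i(i^*, y) \le e \cdot \beta_i(m^*) \cdot \partial_y \underline{G}^i(i^*, y).$$

Applying both Lemmas 1 and 2 to the second term of (8) yields

$$\begin{aligned} \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y \mathsf{E}\{R^i(u, y - D_k)\} &\le \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \beta_i(m^*)\, \partial_y \underline{G}^i(i^*, y) \\ &\le (1 - p_{kk}) \sum_{i=1}^{K-1} \beta_i(m^*)\, \partial_y \underline{G}^i(i^*, y), \end{aligned}$$

where the second inequality relaxes the range of the second summation to make it independent of the set $U^i$. We can further show that (see the detailed derivation in Appendix A)

$$\sum_{i=1}^{K-1} \beta_i(m^*)\, \partial_y \underline{G}^i(i^*, y) \le \sum_{i=1}^{K-1} \alpha_i(m^*) \big( \partial_y G(i^*, y) \big)^+ \le \Delta(y),$$

where

$$\alpha_i(m) = \sum_{j=i+1}^{K-1} \alpha_j(m) \cdot \phi_{j,i}(m) + \beta_i(m), \quad i = 1, \ldots, K - 1, \tag{10}$$

with $\phi_{j,i}(m) = [p_{m_j m_1}, p_{m_j m_2}, \ldots, p_{m_j m_i}]^T \big( I^{(i)} - P^{(i)}_m \big)^{-1} e_i$, and

$$\Delta(y) = \max_{m \in S} \sum_{i=1}^{K-1} \alpha_i(m) \cdot \big( \partial_y G(m_i, y) \big)^+, \tag{11}$$

with $S$ being the set of all permutations of $\{1, \ldots, K\}$. Thus, we have obtained an upper bound function for $\partial_y G^{[k]}(k, y)$ based on a linear combination of simple newsvendor derivative functions. Note that both $\beta_i(m)$ and $\alpha_i(m)$ depend only on the primitive model parameters, namely the transition matrix $P$. The following proposition summarizes the above bound results:

Proposition 1 For any state $k = 1, \ldots, K$, the following holds:

$$\partial_y G(k, y) \le \partial_y G^{[k]}(k, y) \le \partial_y G(k, y) + (1 - p_{kk}) \Delta(y).$$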
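The recursion (10) and the maximization (11) can be sketched as follows. This is an illustrative reading rather than the authors' implementation: states are 0-based, $\phi_{j,i}(m)$ is treated as the scalar obtained by multiplying the row of transition probabilities by $(I^{(i)} - P^{(i)}_m)^{-1} e_i$, and $\beta_i(m)$ is read off as the last entry of that same solution vector, which equals the determinant ratio by Cramer's rule. The function names, the example matrix, and the callable `dG` (standing for the newsvendor derivative $\partial_y G(k, y)$) are assumptions.

```python
from itertools import permutations

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def alpha(P, m):
    """Weights alpha_1(m), ..., alpha_{K-1}(m) from the backward recursion (10)."""
    K = len(P)
    a = [0.0] * K                                   # a[i] stores alpha_i, 1 <= i <= K-1
    for i in range(K - 1, 0, -1):
        A = [[(r == c) - P[m[r]][m[c]] for c in range(i)] for r in range(i)]
        x = solve(A, [0.0] * (i - 1) + [1.0])       # (I - P_m^(i))^{-1} e_i
        beta_i = x[i - 1]                           # determinant ratio, by Cramer's rule
        phi = [sum(P[m[j - 1]][m[c]] * x[c] for c in range(i)) for j in range(i + 1, K)]
        a[i] = beta_i + sum(a[j] * phi[j - i - 1] for j in range(i + 1, K))
    return a[1:]

def delta(P, dG, y):
    """Delta(y) of (11): maximum over all state permutations m."""
    return max(
        sum(w * max(dG(m[i], y), 0.0) for i, w in enumerate(alpha(P, m)))
        for m in permutations(range(len(P)))
    )

# Hypothetical three-state transition matrix.
P = [[0.6, 0.3, 0.1],
     [0.2, 0.5, 0.3],
     [0.1, 0.2, 0.7]]
weights = alpha(P, (0, 1, 2))
print(weights)
```

Because `delta` enumerates all $K!$ permutations, it is practical only for a small number of states, which is the computational burden that motivates the heuristics developed later in this section.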
When $K = 1$, $\Delta(y) \equiv 0$, so the inequalities in the above proposition become binding and the result reduces to the derivative of the single-period cost function. This is intuitive because when $K = 1$, the demand process reduces to an i.i.d. process.

With the derivative lower bound shown above, it follows that the myopic base-stock level $\bar{s}(k)$ is an upper bound for the optimal solution $s^*(k)$. This result is implied by the exact algorithm of Chen and Song (2001). Symmetrically, from the derivative upper bound shown above, we can obtain a lower bound for the optimal solution $s^*(k)$. Specifically, let $\underline{s}(k)$ be the solution to the following equation:

$$\partial_y G(k, y) + (1 - p_{kk}) \Delta(y) = -b + (b + h) F_{k,L}(y) + (1 - p_{kk}) \Delta(y) = 0. \tag{12}$$

We obtain the following result:

Corollary 1 For any demand state $k = 1, \ldots, K$, the following holds:

$$s^{\min} \le \underline{s}(k) \le s^*(k) \le \bar{s}(k),$$

where $\bar{s}(k)$ and $\underline{s}(k)$ are determined by (2) and (12), respectively. The inequalities become equalities when $k = k^{\min}$.

While the bounds $s^{\min}$ and $\bar{s}(k)$ have appeared in the literature (see Song and Zipkin 1993), the lower bound $\underline{s}(k)$ is new to the literature. This new lower bound is tighter than the existing one, $s^{\min}$. Note that computing the new lower bound involves optimization over demand state permutations, which can be challenging if there is a large number of states. To further simplify computation, we develop easy-to-compute heuristic solutions in the next subsection.

3.3 Heuristics

The insights gained from the derivation of the derivative bounds lead us next to construct heuristics that fall between the solution lower and upper bounds. To do so, we augment the derivative lower bound by a positive component that is less than $(1 - p_{kk}) \Delta(y)$. Specifically, we first sort the myopic base-stock levels $\bar{s}(k)$ in increasing order and let $\tilde{m} = (\tilde{m}_1, \ldots, \tilde{m}_K)$ denote the resulting order sequence of the states. Suppose that $k = \tilde{m}_j$.
Based on Lemma 2, we can make the following approximation for the second term of $\partial_y G^{[k]}(k, y)$ in (8):

\[
\begin{aligned}
\sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}} p_{ku}\, \partial_y E\{R^i(u, y - D_k)\}
&\approx \sum_{i=1}^{j-1} p_{k\tilde{m}_i} \cdot \partial_y E\{R^i(\tilde{m}_i, y - D_k)\} \\
&\approx \sum_{i=1}^{j-1} p_{k\tilde{m}_i} \cdot \beta_i(\tilde{m}) \cdot E\left\{\left(\partial_y G(\tilde{m}_i, y - D_k)\right)^+\right\} \\
&\approx \frac{1}{2} \sum_{i=1}^{j-1} p_{k\tilde{m}_i} \cdot \beta_i(\tilde{m}) \cdot F_k(y - \bar{s}(\tilde{m}_i)) \left(\partial_y G(\tilde{m}_i, y)\right)^+ .
\end{aligned}
\]

Here, in the first step we approximate $m^*$ by $\tilde{m}$ and replace $U^{i+1}$ by $\{\tilde{m}_i\}$. The second step follows from Lemma 2 with $\tilde{m}$ replacing $m^*$. The last step is based on the following integral approximation:

\[
E\left\{\left(\partial_y G(\tilde{m}_i, y - D_k)\right)^+\right\} = \int_0^{y - \bar{s}(\tilde{m}_i)} \left(\partial_y G(\tilde{m}_i, y - x)\right)^+ f_k(x)\,dx \approx \frac{1}{2} F_k(y - \bar{s}(\tilde{m}_i)) \cdot \left(\partial_y G(\tilde{m}_i, y)\right)^+ .
\]

Now let $s^a(k)$ denote a heuristic solution determined by the following equation:

\[
\partial_y G(k, y) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k\tilde{m}_i} \cdot \beta_i(\tilde{m}) \cdot F_k(y - \bar{s}(\tilde{m}_i)) \left(\partial_y G(\tilde{m}_i, y)\right)^+ = 0. \tag{13}
\]

Note that equation (13) involves only a linear combination of the newsvendor derivative functions of different demand states. Thus, solving for $s^a(k)$ only requires a direct evaluation of the lead time demand distributions of different demand states. With this heuristic, we also establish a direct relationship between the underlying Markov chain transition probabilities and the linear weights in equation (13), making the solution process more transparent. Moreover, we have

Corollary 2 For any demand state $k = 1, ..., K$,

\[
s^{\min} \le \underline{s}(k) \le s^a(k) \le \bar{s}(k).
\]

The inequalities become equalities when $k = k^{\min}$.

The above corollary shows that the heuristic solution obtained from (13) indeed falls between the solution lower and upper bounds. A detailed illustration of our bound and heuristic results is provided in Appendix B for a two-state example (i.e., $K = 2$). In the example, we show that, when a transition from a high demand state to a low demand state is highly likely, one should reduce the optimal inventory level for the high demand state, and our heuristic solution moves in the same direction as the optimal solution.
In addition, our heuristic solution provides a closer approximation to the optimal solution when there is a high chance of a transition from a low demand state to a high demand state (see Appendix B for details).

Combining Corollaries 1 and 2, we observe that $s^*(k)$ and $s^a(k)$ fall in the same interval $[\underline{s}(k), \bar{s}(k)]$. Therefore, if the interval is tight, the heuristic solution $s^a(k)$ becomes a close approximation to the optimal solution $s^*(k)$. In the next subsection, we identify sufficient conditions to ensure that such an interval is tight.

3.4 Asymptotic Analysis with Long Lead Time

Intuitively, when the replenishment lead time $L$ increases, the lead time demands starting from different states may converge to the same distribution. As a result, the gap between $s^{\min}$ and $\bar{s}(k)$ will narrow. Below we formally prove this result by first establishing an MMD central limit theorem for the lead time demand.

First, consider the case in which $W$ has a stationary distribution $\pi = [\pi(1), ..., \pi(K)]$. Let $D_\pi(t)$ be the demand in period $t$ under the stationary distribution, that is, $D_\pi(t)$ follows the distribution of $D_k$ with probability $\pi(k)$. Let $\mu_k = E\{D_k\}$ and $\sigma_k^2 = \mathrm{var}\{D_k\}$. The mean and variance of $D_\pi(t)$ are given by

\[
\mu = E\{D_\pi(t)\} = \sum_{k=1}^{K} \mu_k \pi(k), \qquad \mathrm{var}\{D_\pi(t)\} = \sum_{k=1}^{K} \left(\sigma_k^2 + \tilde{\mu}_k^2\right) \pi(k), \tag{14}
\]

where $\tilde{\mu}_k = \mu_k - \mu$ is the difference between $\mu_k$ and $\mu$. Let $D_\pi[t, t+L]$ denote the lead time demand under the stationary distribution $\pi$. We have

\[
E\{D_\pi[t, t+L]\} = (L+1)\mu, \qquad \mathrm{cov}\{D_\pi(t), D_\pi(t+s)\} = \sum_{k=1}^{K} \sum_{l=1}^{K} \tilde{\mu}_k \tilde{\mu}_l\, p^{(s)}_{kl}\, \pi(k), \quad s = 1, 2, ..., \tag{15}
\]

where $p^{(s)}_{kl}$ is the $s$-step transition probability from state $k$ to state $l$. We can further show (see the proof of Lemma 3 in Appendix A)

\[
\sigma^2 = \lim_{L \to \infty} \frac{\mathrm{var}\{D_\pi[t, t+L]\}}{L+1} = \mathrm{var}\{D_\pi(t)\} + 2 \sum_{s=1}^{\infty} \mathrm{cov}\{D_\pi(t), D_\pi(t+s)\}. \tag{16}
\]

Thus, the variance limit $\sigma^2$ contains two parts: the single-period demand variance and the covariance across different periods.
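The moment formulas (14) to (16) can be evaluated numerically once the transition matrix and the per-state demand means and variances are given. The sketch below is only illustrative: the function names, the power-iteration routine for the stationary distribution, and the truncation of the infinite covariance sum at a finite lag (`max_lag`) are assumptions made here, not part of the paper. It presumes an ergodic chain whose lag covariances decay quickly.

```python
def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def stationary_distribution(P, iters=5000):
    """Stationary distribution by power iteration (assumes an ergodic chain)."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[k] * P[k][j] for k in range(K)) for j in range(K)]
    return pi


def mmd_moments(P, mu, var, max_lag=200):
    """Return (mu, var{D_pi(t)}, sigma^2) following (14)-(16).

    mu[k], var[k] are the mean and variance of D_k. The infinite sum in (16)
    is truncated at max_lag, which is an assumption of this sketch.
    """
    K = len(P)
    pi = stationary_distribution(P)
    mean = sum(mu[k] * pi[k] for k in range(K))
    dev = [m - mean for m in mu]  # mu_k - mu
    var_single = sum((var[k] + dev[k] ** 2) * pi[k] for k in range(K))  # (14)
    cov_sum = 0.0
    P_s = [row[:] for row in P]  # s-step transition matrix, starting at s = 1
    for _ in range(max_lag):
        # lag-s covariance from (15)
        cov_sum += sum(pi[k] * dev[k] * sum(P_s[k][l] * dev[l] for l in range(K))
                       for k in range(K))
        P_s = mat_mul(P_s, P)
    return mean, var_single, var_single + 2.0 * cov_sum  # (16)


# Noncyclic three-state chain with gamma demand (shape n_k, rate theta_k):
# mean n_k / theta_k, variance n_k / theta_k**2.
P_NONCYCLIC = [[0.3, 0.1, 0.6], [0.5, 0.2, 0.3], [0.3, 0.5, 0.2]]
MU = [1.0, 1.0, 10.0]
VAR = [1.0, 0.5, 50.0]
```

When every row of the transition matrix equals the stationary distribution, the demand process is i.i.d., all lag covariances vanish, and the function returns $\sigma^2 = \mathrm{var}\{D_\pi(t)\}$, which is a convenient sanity check.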
It is interesting to note that the covariance structure under MMD depends only on the demand mean of each state. Thus, besides the demand variance at each state, another major source of variability under MMD comes from the variation of the demand means.

Let $N(0, 1)$ denote the standard normal random variable with distribution function $\Phi(\cdot)$. We can establish the following result for the lead time demand distribution under MMD:

Lemma 3 (MMD Central Limit Theorem) Suppose that the Markov chain $W$ has a stationary distribution and $D_k$ has finite moments for any demand state $k$. If $\sigma \ne 0$, then as $L \to \infty$,

\[
\frac{D_k[t, t+L] - (L+1)\mu}{\sigma \sqrt{L+1}} \xrightarrow{\text{dist.}} N(0, 1),
\]

where $\mu$ and $\sigma$ are defined in (14) and (16), respectively.

The above lemma confirms the intuition that the lead time demands with different initial states converge to the same distribution as the lead time increases. To prove this result, we use the concept of $\alpha$-mixing to show that demands far apart in time under MMD are almost independent (Billingsley 1995). This finding reveals an inherent averaging effect of aggregation under MMD: While the demand distributions of different states in a single period may be drastically different, the aggregated (lead time) demand increasingly resembles that of an i.i.d. process as the aggregation period increases.

Based on the result, when $L$ is sufficiently long, the lead time demand $D_k[t, t+L]$ can be approximated by a normal distribution with mean $(L+1)\mu$ and variance $(L+1)\sigma^2$, independent of the initial demand state $k$. Consequently, the myopic base-stock level for state $k$ can be approximated by the following formula:

\[
\bar{s}(k) \approx (L+1)\mu + z^* \sigma \sqrt{L+1}, \quad k = 1, ..., K, \tag{17}
\]

where $z^* = \Phi^{-1}(b/(b+h))$. In other words, all myopic base-stock levels $\bar{s}(k)$ converge to the same value, and so does $s^{\min}$.

Next, consider a cyclic Markov chain $W$. In this case $W$ does not have a stationary distribution.
However, due to its cyclic nature, as the lead time increases, the lead time demands starting from different states share a growing common component. As a result, we can show that, for any two states $k$ and $k'$, the lead time demand distributions $F_{k,L}(y)$ and $F_{k',L}(y)$ converge to the same limit distribution for every $y$ as $L$ goes to infinity. Thus, it continues to hold that all myopic base-stock levels $\bar{s}(k)$ converge to the same value in this case. The following proposition summarizes our discussion above and further determines the convergence rate:

Proposition 2 Suppose that the Markov chain $W$ either has a stationary distribution or is cyclic, and that $D_k$ has finite moments for any demand state $k$. For any demand state $k = 1, ..., K$,

\[
\lim_{L \to \infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - s^{\min}}{s^{\min}} = 0.
\]

The above proposition shows that the relative percentage error between $\bar{s}(k)$ and $s^{\min}$ converges to zero as $L$ goes to infinity, at a rate of $o\left(1/\sqrt{L+1}\right)$. Thus, when the lead time is sufficiently long, we can use the closed-form expression (17) or the more sophisticated $s^a(k)$ (Corollary 2) to approximate $s^*(k)$.

In proving the above convergence result, we leverage the fact that $s^{\min} = \min_{k \in \{1,...,K\}} \bar{s}(k)$, which is the minimum of the myopic base-stock levels of different states. Since the myopic base-stock level of a given state depends only on the lead time demand distribution associated with that state, we can then apply the MMD central limit theorem to obtain the convergence rate result for our solution bounds.

4. Serial Inventory Systems

In this section we present our main results for the general serial inventory system with $N > 1$ stages. Specifically, random customer demand arises in every period at Stage 1, Stage 1 orders from Stage 2, ..., and Stage $N$ orders from an external supplier with ample supply. We also call the external supplier Stage $N + 1$. If a stockout occurs at Stage 1, we assume the unmet demand is fully backlogged.
The production-transportation lead time from Stage $n + 1$ to Stage $n$ is a constant of $L_n$ periods. Let $L'_n = \sum_{j=1}^{n} L_j$ denote the total lead time from Stage $n + 1$ to Stage 1. At the end of each period, a unit (installation) inventory holding cost $H_n$ is charged for on-hand inventories at Stage $n$ ($1 \le n \le N$) and a unit backlog penalty cost $b$ is charged for backlogs at Stage 1. Following the convention in the literature, we define the echelon inventory holding cost at Stage $n$ as $h_n = H_n - H_{n+1} > 0$, with $H_{N+1} = 0$. The demand process at Stage 1 follows the MMD process as described in §3.

We assume that all the replenishment activities in a period happen at the beginning of the period after observing the demand state. At Stage $n$ ($n > 1$), the sequence of events is as follows: an order from Stage $n - 1$ is received, an order is placed with Stage $n + 1$, a shipment is received from Stage $n + 1$, and then a shipment is sent to downstream Stage $n - 1$. For Stage 1, an order is placed at the beginning of the period, a shipment is received from Stage 2, and then the demand arrives during the period. The planning horizon is infinite and the objective is to minimize the long-run average cost of the serial inventory system. As in the single-stage problem, we assume the ordering cost is zero without loss of generality. In what follows, we shall assign a subscript "$n$" to the corresponding functions and variables for Stage $n$.

4.1 Preliminaries

Let $IL_n(t)$ denote the echelon inventory level at Stage $n$ by the end of period $t$, which includes the total on-hand inventory at Stages $1, ..., n$, plus the total inventory in transit to Stages $1, ..., n - 1$, minus the backorders at Stage 1. Also let $B(t)$ denote the backorder level at Stage 1 by the end of period $t$.
Then, the system inventory holding and penalty cost at the end of period $t$ can be written as

\[
\sum_{n=2}^{N} H_n \left[IL_n(t) - IL_{n-1}(t)\right] + H_1 \left[IL_1(t) + B(t)\right] + bB(t) = \sum_{n=1}^{N} h_n IL_n(t) + (b + H_1)B(t),
\]

where the right-hand side is a sum of a sequence of echelon-related costs from Echelon 1 to $N$. Consider first the Echelon 1 cost. Under the long-run average cost criterion, we can charge the cost of period $t + L_1$ to period $t$ without affecting the cost assessment. Specifically, given the demand state $k$ and inventory position $y$ at the beginning of period $t$, the expected cost at the end of period $t + L_1$ is given by

\[
E\{h_1 IL_1(t + L_1) + (b + H_1)B(t + L_1)\} = h_1 \cdot E\left(y - D_k[t, t + L_1]\right)^+ + (b + H_2) \cdot E\left(D_k[t, t + L_1] - y\right)^+ .
\]

Thus, we can define the single-period expected cost function at Echelon 1 as

\[
G^1_1(k, y) = h_1 \cdot E\left(y - D_k[t, t + L_1]\right)^+ + (b + H_2) \cdot E\left(D_k[t, t + L_1] - y\right)^+ . \tag{18}
\]

Recall that when $N = 1$, $h_1 = H_1$ and $H_2 = 0$. Therefore, in the single-stage system, $G^1_1(k, y)$ reduces to $G(k, y)$ as defined in (1). Chen and Song (2001) showed that the echelon decomposition technique of Clark and Scarf (1960) for the i.i.d. demand case can be extended to the MMD process. That is, the optimal state-dependent echelon base-stock policy can be determined sequentially from Echelons 1 to $N$, with the aid of an induced penalty function between the successive echelons.

Exact Algorithm. Chen and Song (2001) developed the following algorithm for computing the optimal policy. Start with Stage 1. Apply the $K$-iteration algorithm described in §3.1 to $G^1_1(k, y)$ defined in (18), and obtain the optimal echelon base-stock levels $s^*_1(k)$ for Stage 1. For Stage $n \ge 2$, first define the following induced penalty functions from Stage $n - 1$ to Stage $n$: for $k = 1, ..., K$,

\[
G_{n-1,n}(k, y) = G^{[k]}_{n-1}\left(k, \min\{y, s^*_{n-1}(k)\}\right) - G^{[k]}_{n-1}\left(k, s^*_{n-1}(k)\right), \tag{19}
\]

where $G^{[k]}_{n-1}(k, \cdot)$ is the function obtained from the $K$-iteration algorithm for Stage $n - 1$ (see §3.1).
With the induced penalty functions, define the following cost functions for Stage $n$: for $k = 1, ..., K$,

\[
G^1_n(k, y) = E\left\{h_n \left(y - D_k[t, t + L_n]\right) + G_{n-1,n}\left(W_k(t + L_n),\, y - D_k[t, t + L_n)\right)\right\}, \tag{20}
\]

where $W_k(t + L_n)$ is the demand state at the beginning of period $t + L_n$ given the state $k$ in period $t$. Now apply the $K$-iteration algorithm again to $G^1_n(k, y)$, and obtain the optimal echelon base-stock levels $s^*_n(k)$ for Stage $n$. Repeat the above procedure until reaching the final Stage $N$.

Thus, the computation procedure for the $N$-stage system requires $N \times K$ iterations. Besides the difficulty in determining the function $R^i_n$ at each iteration as discussed in §3.1, determining the induced penalty function $G_{n-1,n}$ in the objective function (20) presents an additional challenge.

Our Approach. As in §3.2, we seek to derive simple bounds for the optimal policy by performing a derivative analysis of the two key equations (19) and (20) in the exact algorithm. Analogous to the expression (8) in the single-stage system, we can show

\[
\partial_y G^{[k]}_n(k, y) = \partial_y G^1_n(k, y) + \sum_{i=1}^{[k]-1} \sum_{u \in U^{i+1}_n} p_{ku}\, \partial_y E\{R^i_n(u, y - D_k)\}. \tag{21}
\]

For each Stage $n$ and demand state $k$, we define the following newsvendor cost function parameterized by $\zeta$ ($h_n \le \zeta \le \sum_{i=1}^{n} h_i$):

\[
\tilde{G}_n(k, y \mid \zeta) = \zeta \cdot E\left(y - D_k[t, t + L'_n]\right)^+ + (b + H_{n+1}) \cdot E\left(D_k[t, t + L'_n] - y\right)^+ . \tag{22}
\]

Its derivative function $\partial_y \tilde{G}_n(k, y \mid \zeta) = -(b + H_{n+1}) + (b + H_{n+1} + \zeta) F_{k,L'_n}(y)$ will play an important role in our bound and heuristic development. Note that when $N = 1$, $\zeta \equiv h_1 = h$, $H_2 = 0$, and the above newsvendor cost function reduces to (1).

4.2 Derivative and Solution Bounds

For illustrative purposes, consider the cost function for Stage 2. From (19) and Proposition 1, we have

\[
\partial_y G_{1,2}(k, y) = \partial_y G^{[k]}_1\left(k, \min\{y, s^*_1(k)\}\right) \ge \partial_y G^1_1\left(k, \min\{y, s^*_1(k)\}\right) = \min\left\{0,\, \partial_y G^1_1(k, y)\right\}.
\]

Because $H_1 \ge H_2$,

\[
\partial_y G^1_1(k, y) = -(b + H_2) + (b + H_1) F_{k,L_1}(y) \ge -(b + H_2) + (b + H_2) F_{k,L_1}(y).
\]
Observe that the right-hand side of the above inequality is never positive. It follows that

\[
\partial_y G_{1,2}(k, y) \ge -(b + H_2) + (b + H_2) F_{k,L_1}(y).
\]

With this inequality, we obtain

\[
\begin{aligned}
\partial_y G^1_2(k, y) &= h_2 + E\left\{\partial_y G_{1,2}\left(W_k(t + L_2),\, y - D_k[t, t + L_2)\right)\right\} \\
&\ge h_2 + E\left\{-(b + H_2) + (b + H_2) F_{W_k(t+L_2),L_1}\left(y - D_k[t, t + L_2)\right)\right\} \\
&= -(b + H_3) + (b + H_2) F_{k,L'_2}(y) = \partial_y \tilde{G}_2(k, y \mid h_2),
\end{aligned}
\]

where the second equality follows from $E\{F_{W_k(t+L_2),L_1}(y - D_k[t, t + L_2))\} = F_{k,L'_2}(y)$. By an argument analogous to that used in Proposition 1, we have

\[
\partial_y G^{[k]}_2(k, y) \ge \partial_y \tilde{G}_2(k, y \mid h_2). \tag{23}
\]

Repeating the above argument from Stage 2 to Stage $n$, we can obtain a lower bound for $\partial_y G^{[k]}_n(k, y)$.

Developing the derivative upper bound for the serial system, however, is more complex, because we need to bound the induced penalty function $G_{n-1,n}$ in (20). Nevertheless, we can show the following result, which generalizes Proposition 1 to the serial system. Denote $h_{[1,n]} = \sum_{i=1}^{n} h_i$.

Proposition 3 For any stage $n = 1, ..., N$ and any state $k = 1, ..., K$, the following holds:

(a) $\partial_y G^{[k]}_n(k, y) \ge \partial_y \tilde{G}_n(k, y \mid h_n)$,

(b) $\partial_y G^{[k]}_n(k, y) \le \partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + (1 - p_{kk})\Delta_n(y) + \sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(k, y)$,

where $\Delta_n(y)$ and $\bar{\Delta}_{j,n}(k, y)$ are recursively defined as follows: for $n = 1, ..., N$,

\[
\Delta_n(y) = \max_{m \in S} \sum_{i=1}^{K-1} \alpha_i(m) \left[\partial_y \tilde{G}_n(m_i, y \mid h_{[1,n]}) + \sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(m_i, y)\right]^+ ,
\]

with $\bar{\Delta}_{j,n}(k, y) = E\left\{\Delta_j\left(y - D_k[t, t + L'_n - L'_j)\right)\right\}$ for $1 \le j \le n - 1$ and $\alpha_i(m)$ given by (10).

It is straightforward to verify that $\Delta_1(y) = \Delta(y)$ as defined in (11). Thus, the above result reduces to that of Proposition 1 when $N = 1$. When $N > 1$, there is an extra term $\sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(k, y)$ on the right-hand side of the inequality. This term captures the dependence of Echelon $n$ on all the downstream echelon base-stock levels from Echelons 1 to $n - 1$.
Based on Proposition 3, let $\bar{s}_n(k)$ be the solution to the equation

\[
\partial_y \tilde{G}_n(k, y \mid h_n) = -(b + H_{n+1}) + (b + H_n) F_{k,L'_n}(y) = 0, \tag{24}
\]

and let $\underline{s}_n(k)$ be the solution to the equation

\[
\partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + (1 - p_{kk})\Delta_n(y) + \sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(k, y) = 0. \tag{25}
\]

It follows that

Corollary 3 For any stage $n = 1, ..., N$ and any state $k = 1, ..., K$,

\[
\underline{s}_n(k) \le s^*_n(k) \le \bar{s}_n(k).
\]

The above result generalizes Corollary 1 to serial inventory systems. To our knowledge, Corollary 3 presents the first general, analytical solution bounds for serial inventory systems with MMD. The existing bounding approaches for problems with nonstationary demand all build on a simplifying assumption that the optimal state-dependent echelon base-stock level is achievable in each period (e.g., Dong and Lee 2003, Theorem 6 and Shang 2012, Theorem 1). This assumption, however, does not hold in general, and the "lower bound" obtained from these approaches may fail to bound the optimal solution under MMD. In contrast, our derivative-based bounding approach does not require such an assumption and accounts for the dependence on the downstream echelon base-stock levels through the $\Delta$ terms in Proposition 3. Thus, our solution bounds work in general.

To see the above point more clearly, observe that by ignoring the $\Delta$ terms in the derivative upper bound in Proposition 3, we obtain a newsvendor derivative function. Let $s^v_n(k)$ be the root of this function, i.e., the solution to

\[
\partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) = -(b + H_{n+1}) + (b + H_1) F_{k,L'_n}(y) = 0. \tag{26}
\]

Thus, $s^v_n(k)$ is the "lower bound" for the optimal solution $s^*_n(k)$ under the assumption that the optimal state-dependent echelon base-stock level is achievable in each period (e.g., Dong and Lee 2003 and Shang 2012). However, for Stage 1, we have $h_{[1,1]} = h_1$, and thus, from Proposition 3 and Corollary 3, $s^v_1(k) = \bar{s}_1(k)$. Thus, $s^v_1(k)$ is not a solution lower bound, but instead a solution upper bound under MMD.
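Equations (24) and (26) are newsvendor fractile conditions, so $\bar{s}_n(k)$ and $s^v_n(k)$ are quantiles of the lead time demand $D_k[t, t + L'_n]$ at the levels $(b + H_{n+1})/(b + H_n)$ and $(b + H_{n+1})/(b + H_1)$, respectively. The sketch below estimates these quantiles by Monte Carlo simulation. The simulation approach, the function names, the gamma demand with shape and rate parameters, and the convention that the lead time demand covers $L'_n + 1$ periods beginning in state $k$ are all assumptions of this illustration rather than the paper's computational method.

```python
import random


def simulate_lead_time_demand(P, shape, rate, k, periods, n_samples, rng):
    """Sorted samples of total demand over `periods` periods, starting in state k.

    States are 0-based. Demand in state j is gamma with shape[j] and rate[j];
    random.gammavariate takes a scale parameter, hence 1 / rate[j].
    """
    K = len(P)
    states = range(K)
    samples = []
    for _ in range(n_samples):
        state, total = k, 0.0
        for _ in range(periods):
            total += rng.gammavariate(shape[state], 1.0 / rate[state])
            state = rng.choices(states, weights=P[state])[0]
        samples.append(total)
    samples.sort()
    return samples


def empirical_quantile(sorted_samples, p):
    """Empirical p-quantile of an already sorted sample."""
    idx = min(len(sorted_samples) - 1, int(p * len(sorted_samples)))
    return sorted_samples[idx]


def newsvendor_levels(P, shape, rate, b, H, L, k, n_samples=50_000, seed=0):
    """Estimate (s_bar, s_v) for stages n = 1..N and initial state k.

    H = [H_1, ..., H_N] are installation holding costs (H_{N+1} = 0) and
    L = [L_1, ..., L_N] are stage lead times.
    """
    rng = random.Random(seed)
    H_ext = list(H) + [0.0]
    s_bar, s_v = [], []
    total_lead_time = 0
    for n in range(len(H)):
        total_lead_time += L[n]  # L'_n
        samples = simulate_lead_time_demand(
            P, shape, rate, k, total_lead_time + 1, n_samples, rng)
        s_bar.append(empirical_quantile(
            samples, (b + H_ext[n + 1]) / (b + H_ext[n])))   # root of (24)
        s_v.append(empirical_quantile(
            samples, (b + H_ext[n + 1]) / (b + H_ext[0])))   # root of (26)
    return s_bar, s_v


# Cyclic three-state chain with the gamma parameters of the base case in §5.1.
P_CYCLIC = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
SHAPE = [1.0, 2.0, 2.0]
RATE = [1.0, 2.0, 0.2]
```

For Stage 1 the two fractiles coincide, so the estimates satisfy $s^v_1(k) = \bar{s}_1(k)$; for Stage 2 the fractile in (26) is smaller, giving $s^v_2(k) \le \bar{s}_2(k)$. With $b = 50$, $H = (3, 1)$, and $L = (1, 1)$, the Stage 1 estimate for the first state should land close to the value 4.63 reported in Table 1, up to sampling error.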
For Stage $n > 1$, it is also not guaranteed that $s^v_n(k) \le s^*_n(k)$ under MMD (see Table 1 in §5 for the counterexamples marked with a "†" sign).

When the state space of the Markov chain degenerates to a singleton, i.e., $K = 1$, the demand process becomes i.i.d. In this case, the optimal state-dependent echelon base-stock level is always achievable in each period, and the $\Delta$ terms in Proposition 3 vanish. As a result, $\partial_y G^{[k]}_n(k, y)$ is bounded below and above by simple newsvendor derivative functions and $s^v_n(k)$ becomes a true lower bound in this case. It is interesting to note that under the i.i.d. demand process, the solution bounds obtained from our derivative bounds coincide with those obtained by Shang and Song (2003) through their single-stage approximation approach. However, their bounding approach does not extend to the more general MMD process, as it also requires the assumption that the optimal state-dependent echelon base-stock level is achievable in each period (see Appendix C for a detailed discussion).

Our solution bounds can be computed without solving the integral renewal equations required in the exact algorithm. However, computing the solution lower bound involves optimization over demand state permutations, which can be challenging when the number of states is large. To further simplify computation, we develop easy-to-compute heuristic solutions in the next subsection.

4.3 Heuristics

Based on the definition of $s^v_n(k)$ given in (26), by Proposition 3, it follows that $\underline{s}_n(k) \le s^v_n(k) \le \bar{s}_n(k)$. Hence, although $s^v_n(k)$ is not guaranteed to be a solution lower bound, it can still serve as a simple heuristic solution for our problem. We can further improve this heuristic based on the idea used in the single-stage problem in §3.3. Specifically, we can sort $s^v_n(k)$ in increasing order and let $\tilde{m}_n = (\tilde{m}_{n,1}, ..., \tilde{m}_{n,K})$ denote the resulting order sequence of the states. Suppose that $k = \tilde{m}_{n,j}$.
Based on Lemma 2, we can make the following approximation (parameterized by $\zeta$) for $\partial_y G^{[k]}_n(k, y)$ in (21):

\[
\begin{aligned}
\partial_y G^{[k]}_n(k, y)
&\approx \partial_y \tilde{G}_n(k, y \mid \zeta) + \sum_{i=1}^{j-1} p_{k\tilde{m}_{n,i}} \cdot \partial_y E\{R^i_n(\tilde{m}_{n,i}, y - D_k)\} \\
&\approx \partial_y \tilde{G}_n(k, y \mid \zeta) + \sum_{i=1}^{j-1} p_{k\tilde{m}_{n,i}} \cdot \beta_i(\tilde{m}_n) \cdot E\left\{\left(\partial_y \tilde{G}_n(\tilde{m}_{n,i}, y - D_k \mid \zeta)\right)^+\right\} \\
&\approx \partial_y \tilde{G}_n(k, y \mid \zeta) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k\tilde{m}_{n,i}} \cdot \beta_i(\tilde{m}_n) \cdot F_k\left(y - s^v_n(\tilde{m}_{n,i})\right) \left(\partial_y \tilde{G}_n(\tilde{m}_{n,i}, y \mid \zeta)\right)^+ ,
\end{aligned}
\]

where the parameter $\zeta$ takes a value in $h_n \le \zeta \le h_{[1,n]}$, and the last step follows from the same integral approximation as in the single-stage problem in §3.3. Next, let $s^a_n(k \mid \zeta)$ denote a heuristic solution determined by the following equation:

\[
\partial_y \tilde{G}_n(k, y \mid \zeta) + \frac{1}{2} \sum_{i=1}^{j-1} p_{k\tilde{m}_{n,i}} \cdot \beta_i(\tilde{m}_n) \cdot F_k\left(y - s^v_n(\tilde{m}_{n,i})\right) \left(\partial_y \tilde{G}_n(\tilde{m}_{n,i}, y \mid \zeta)\right)^+ = 0, \tag{27}
\]

where the above equation involves only a linear combination of the newsvendor derivative functions of different demand states. Thus, solving for the heuristic solution $s^a_n(k \mid \zeta)$ only requires a direct evaluation of the lead time demand distributions of different demand states. With this heuristic, we also establish a mapping between the underlying Markov chain transition probabilities and the linear weights in equation (27), making the solution process more transparent. Moreover, we have the following result:

Corollary 4 For any stage $n = 1, ..., N$ and any state $k = 1, ..., K$,

\[
\underline{s}_n(k) \le s^a_n(k \mid \zeta) \le \bar{s}_n(k).
\]

Combining Corollaries 3 and 4, we observe that $s^a_n(k \mid \zeta)$ and the optimal solution $s^*_n(k)$ fall in the same interval $[\underline{s}_n(k), \bar{s}_n(k)]$, suggesting that $s^a_n(k \mid \zeta)$ can be a close approximation for the optimal solution $s^*_n(k)$ as long as the interval is tight. In the next subsection we identify sufficient conditions to ensure that such an interval is tight.

4.4 Asymptotic Analysis with Long Lead Time

Recall from Corollary 1 that in the single-stage system the solution lower bound $\underline{s}(k)$ is further bounded below by $s^{\min} = \min_{k \in \{1,...,K\}} \bar{s}(k)$.
However, in the serial inventory system, this property is no longer available. Instead, we need to work directly with $s^{\min}_n = \min_{k \in \{1,...,K\}} \underline{s}_n(k)$ for $n \ge 1$, with $\underline{s}_n(k)$ determined from the derivative upper bound expression (25). From Proposition 3, the derivative upper bound for the serial inventory system is more complex than that for the single-stage system. Thus, in order to apply the MMD central limit theorem to derive asymptotic results, we need to further relax the derivative upper bound to obtain a more workable expression (see the proof of Proposition 4 in Appendix A). Overcoming these technical difficulties, we can establish the following result, which generalizes Proposition 2 to serial inventory systems.

Proposition 4 Suppose that the Markov chain $W$ either has a stationary distribution or is cyclic, and that $D_k$ has finite moments for any demand state $k$. For any Stage $n = 1, ..., N$ and any demand state $k = 1, ..., K$,

\[
\limsup_{L'_n \to \infty} \sqrt{L'_n + 1} \cdot \frac{\bar{s}_n(k) - s^{\min}_n}{s^{\min}_n} = c,
\]

where $c$ is a nonnegative constant.

The above proposition shows that the relative percentage error between $\bar{s}_n(k)$ and $s^{\min}_n$ converges to zero as the lead time goes to infinity. The convergence rate is $O\left(1/\sqrt{L'_n + 1}\right)$, where $L'_n$ is the total lead time from Stage $n + 1$ to Stage 1. Therefore, according to Corollary 4, the heuristic solution $s^a_n(k \mid \zeta)$ is guaranteed to be a close approximation to the optimal solution $s^*_n(k)$ when the total lead time $L'_n$ is sufficiently long. In other words, when the lead time is sufficiently long, the seemingly complex supply chain problem with MMD has surprisingly simple solutions. This is because the aggregated (lead time) demand under MMD increasingly resembles that of an i.i.d. process as the lead time increases. Moreover, in the serial inventory system, the total lead time at an upstream stage is always greater than the total lead time at a downstream stage.
Thus, Proposition 4 suggests that the heuristic solution at the upstream stage tends to be a closer approximation to the optimal solution than does the heuristic solution at the downstream stage. Finally, in proving the above result, we make a new methodological contribution to the literature: one can apply our analysis approach to establish similar results for other inventory systems.

5. Numerical Studies

In this section we conduct numerical studies to evaluate our bound and heuristic performance. With the aid of the Laplace transform and its inverse, we find that the optimal policy for our problem becomes computationally tractable under the gamma demand distribution (see Appendix C for details). This finding enables us to benchmark our bounds and heuristic solutions against the optimal solution. Specifically, we assume that the single-period demand $D_k$ under demand state $k$ follows a gamma probability density

\[
f(x \mid n_k, \theta_k) = \frac{(\theta_k)^{n_k} x^{n_k - 1} e^{-\theta_k x}}{\Gamma(n_k)}, \quad x \ge 0, \tag{28}
\]

where $n_k$ is the shape parameter and $\theta_k$ the rate parameter ($1/\theta_k$ is the scale parameter). The demand mean is $\mu_k = n_k/\theta_k$. It is worth emphasizing that, unlike the optimal policy, our solution bounds and heuristic solutions can be computed under any demand distribution.

We focus on presenting numerical results of two-stage systems. The numerical insights from systems with more than two stages are similar to what we report below, and are thus omitted for brevity. Recall from our asymptotic analysis that, as the number of stages increases, the heuristic solutions at upstream stages tend to be closer approximations to the optimal solution than those at the downstream stages. Also note that the heuristic solutions (as well as the optimal solution) are computed echelon by echelon from downstream to upstream stages, according to the echelon decomposition technique of Clark and Scarf (1960). For each echelon, the solutions are computed by assuming ample supply at the immediate upstream stage.
Thus, the Echelon 1 and 2 solutions in a two-stage system would be the same as the Echelon 1 and 2 solutions in an $N$-stage system (with $N \ge 2$). Combining this observation with our asymptotic results, we believe that focusing on two-stage systems in our numerical studies is sufficient, as the numerical performance observed in such systems is informative about what to expect in systems with more stages.

5.1 Base Case

We first present results for two demand cases. Both involve three demand states, with each state following a gamma distribution. The parameters for these gamma distributions are $(n_i, \theta_i) = (1, 1)$, $(2, 2)$, and $(2, 0.2)$, for $i = 1, 2, 3$, representing demand means of 1, 1, and 10, respectively. The transition matrices for these two cases are given below:

\[
\text{Cyclic demand: } \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}, \qquad
\text{Noncyclic demand: } \begin{pmatrix} 0.3 & 0.1 & 0.6 \\ 0.5 & 0.2 & 0.3 \\ 0.3 & 0.5 & 0.2 \end{pmatrix}.
\]

We set the unit penalty cost to be $b = 50$, and the unit echelon holding costs to be $h_1 = 2$ and $h_2 = 1$ for Stages 1 and 2, respectively. This echelon holding cost structure corresponds to unit installation holding costs of $H_1 = 3$ for Stage 1 and $H_2 = 1$ for Stage 2, representing the value-adding process of moving product from Stage 2 to Stage 1. Based on these cost parameters, the implicit fill rate at the customer-facing Stage 1 is $50/(50 + 3) = 94.3\%$. The lead times are set as $(L_1, L_2) = (1, 1)$, $(1, 3)$, and $(3, 1)$.

For each scenario, we compute five solutions: the optimal solution $s^*_n(k)$, the lower bound $\underline{s}_n(k)$, the upper bound $\bar{s}_n(k)$, the newsvendor heuristic $s^v_n(k)$ defined by (26), and the derivative-based heuristic $s^a_n(k \mid \zeta)$ defined by (27). Because we have two stages and three states, we compare the latter four policies with the optimal solution using a weighted relative solution error percentage metric. Specifically, for a given policy $s_n(k)$, the weighted relative solution error percentage is defined as

\[
\frac{\sum_{k,n} \left|s_n(k) - s^*_n(k)\right|}{\sum_{k,n} s^*_n(k)} \times 100\%.
\]
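The error metric above is straightforward to evaluate. The short sketch below applies it to the cyclic demand scenario with $(L_1, L_2) = (1, 1)$, using the base-stock levels reported in Table 1; the function and variable names are introduced here purely for illustration.

```python
def weighted_relative_solution_error(policy, optimal):
    """Sum of absolute deviations from the optimal levels, divided by the sum
    of the optimal levels, expressed as a percentage."""
    numerator = sum(abs(s - s_opt) for s, s_opt in zip(policy, optimal))
    denominator = sum(optimal)
    return 100.0 * numerator / denominator


# Table 1, cyclic demand, (L1, L2) = (1, 1); entries ordered as
# (n, k) = (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3).
S_OPT = [4.63, 24.79, 22.75, 23.99, 22.39, 29.24]   # optimal solution
S_V = [4.63, 26.45, 26.50, 25.09, 25.09, 25.09]     # newsvendor heuristic
S_A = [4.63, 26.45, 24.25, 25.09, 25.09, 25.09]     # derivative-based heuristic

error_v = weighted_relative_solution_error(S_V, S_OPT)
error_a = weighted_relative_solution_error(S_A, S_OPT)
```

Rounded to one decimal place, these evaluate to 10.5% for the newsvendor heuristic and 8.7% for the derivative-based heuristic, in agreement with the solution error row of Table 1 for this scenario.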
We evaluate the long-run average system costs under the policies of $s^*_n(k)$, $s^v_n(k)$, and $s^a_n(k \mid \zeta)$. For each parameter scenario, we simulate the system for 50,000 periods and compute the average cost over these 50,000 periods. To reduce variance in the simulation results, we use the same random demand sample path for different policies. For the heuristic policies $s^v_n(k)$ and $s^a_n(k \mid \zeta)$, we compute the relative cost error percentage with respect to the cost under the optimal policy $s^*_n(k)$. For the heuristic $s^a_n(k \mid \zeta)$, we find that it performs well in most cases when $\zeta = h_{[1,n]}$, so we let $s^a_n(k \mid \zeta) = s^a_n(k \mid h_{[1,n]})$ in our reporting below.

The detailed numerical results for the three-state demand cases are given in Table 1. From the table, the foremost observation is that the solution "lower bound" $s^v_n(k)$ obtained by the bounding approaches of Dong and Lee (2003) and Shang (2012) may be greater than the optimal solution $s^*_n(k)$. We mark all such counterexamples with a "†" sign in the table. The numerical results confirm our earlier discussion that $s^v_1(k)$ is actually a solution upper bound and $s^v_2(k)$ may be greater than $s^*_2(k)$ in many cases.

Table 1 Comparison of policies in two-stage systems with three-state demand.

Cyclic demand

| $(L_1, L_2)$ | $n$ | $k$ | $s^*_n(k)$ | $\underline{s}_n(k)$ | $\bar{s}_n(k)$ | $s^v_n(k)$ | $s^a_n(k \mid \zeta)$ |
|---|---|---|---|---|---|---|---|
| (1, 1) | 1 | 1 | 4.63 | 4.63 | 4.63 | 4.63 | 4.63 |
| | 1 | 2 | 24.79 | 19.70 | 26.45 | 26.45† | 26.45 |
| | 1 | 3 | 22.75 | 19.74 | 26.50 | 26.50† | 24.25 |
| | 2 | 1 | 23.99 | 19.79 | 31.42 | 25.09† | 25.09 |
| | 2 | 2 | 22.39 | 20.11 | 31.42 | 25.09† | 25.09 |
| | 2 | 3 | 29.24 | 20.11 | 31.42 | 25.09 | 25.09 |
| | Solution error % | | | 18.6% | 57.6% | 10.5% | 8.7% |
| | Cost error % | | 59.43 | | | 2.7% | 2.1% |
| (1, 3) | 1 | 1 | 4.63 | 4.63 | 4.63 | 4.63 | 4.63 |
| | 1 | 2 | 24.79 | 19.70 | 26.45 | 26.45† | 26.45 |
| | 1 | 3 | 22.75 | 19.74 | 26.50 | 26.50† | 24.25 |
| | 2 | 1 | 33.35 | 22.82 | 33.55 | 27.22 | 27.22 |
| | 2 | 2 | 43.76 | 27.88 | 48.70 | 40.96 | 40.96 |
| | 2 | 3 | 42.31 | 28.39 | 48.74 | 40.99 | 39.00 |
| | Solution error % | | | 28.2% | 38.8% | 9.1% | 9.0% |
| | Cost error % | | 67.46 | | | 2.7% | 2.5% |
| (3, 1) | 1 | 1 | 28.63 | 28.60 | 28.63 | 28.63 | 28.63 |
| | 1 | 2 | 28.58 | 28.58 | 28.58 | 28.58 | 28.58 |
| | 1 | 3 | 39.37 | 34.67 | 42.95 | 42.95† | 40.94 |
| | 2 | 1 | 27.21 | 27.22 | 33.55 | 27.22† | 27.22 |
| | 2 | 2 | 38.87 | 32.05 | 48.70 | 40.96† | 40.96 |
| | 2 | 3 | 41.29 | 32.05 | 48.74 | 40.99 | 39.00 |
| | Solution error % | | | 10.2% | 41.4% | 2.9% | 2.9% |
| | Cost error % | | 82.35 | | | 0.6% | 0.1% |

Noncyclic demand

| $(L_1, L_2)$ | $n$ | $k$ | $s^*_n(k)$ | $\underline{s}_n(k)$ | $\bar{s}_n(k)$ | $s^v_n(k)$ | $s^a_n(k \mid \zeta)$ |
|---|---|---|---|---|---|---|---|
| (1, 1) | 1 | 1 | 23.13 | 21.12 | 23.40 | 23.40† | 23.22 |
| | 1 | 2 | 19.03 | 19.03 | 19.03 | 19.03 | 19.03 |
| | 1 | 3 | 29.74 | 25.49 | 31.57 | 31.57† | 30.07 |
| | 2 | 1 | 27.60 | 24.44 | 36.52 | 28.60† | 28.48 |
| | 2 | 2 | 25.22 | 23.56 | 33.83 | 25.94† | 25.94 |
| | 2 | 3 | 36.48 | 27.08 | 45.10 | 36.43 | 35.08 |
| | Solution error % | | | 12.7% | 41.4% | 2.4% | 2.1% |
| | Cost error % | | 73.83 | | | 0.7% | 0.3% |
| (1, 3) | 1 | 1 | 23.13 | 21.12 | 23.40 | 23.40† | 23.22 |
| | 1 | 2 | 19.03 | 19.03 | 19.03 | 19.03 | 19.03 |
| | 1 | 3 | 29.74 | 25.49 | 31.57 | 31.57† | 30.07 |
| | 2 | 1 | 45.06 | 35.21 | 52.76 | 43.00 | 42.87 |
| | 2 | 2 | 42.21 | 34.30 | 50.09 | 40.40 | 40.40 |
| | 2 | 3 | 52.36 | 37.86 | 60.70 | 50.43 | 49.10 |
| | Solution error % | | | 18.2% | 30.3% | 3.7% | 3.6% |
| | Cost error % | | 84.79 | | | 0.7% | 0.8% |
| (3, 1) | 1 | 1 | 39.32 | 37.95 | 39.48 | 39.48† | 39.35 |
| | 1 | 2 | 36.85 | 36.85 | 36.85 | 36.85 | 36.85 |
| | 1 | 3 | 45.91 | 41.53 | 47.48 | 47.48† | 46.08 |
| | 2 | 1 | 42.56 | 40.18 | 52.76 | 43.00† | 42.87 |
| | 2 | 2 | 39.72 | 39.47 | 50.09 | 40.40† | 40.40 |
| | 2 | 3 | 50.13 | 42.38 | 60.70 | 50.43† | 49.10 |
| | Solution error % | | | 6.3% | 41.9% | 1.2% | 0.9% |
| | Cost error % | | 101.79 | | | 0.5% | 0.1% |

Both $s^a_n(k \mid \zeta)$ and $s^v_n(k)$ are good approximations to the optimal solution, with the derivative-based heuristic $s^a_n(k \mid \zeta)$ outperforming $s^v_n(k)$ in most cases. As the Stage 1 lead time $L_1$ increases from one to three periods, the echelon base-stock levels for different states become close to each other, and both heuristics' approximation to the optimal solution improves, resulting in less than 1% system cost errors. This corroborates the convergence result of Proposition 4 and suggests that the heuristics can achieve near-optimal performance even under relatively short lead times. It is also interesting to note that under the cyclic demand with $L_1 = L_2 = 1$, the echelon base-stock levels for $s^a_2(k \mid \zeta)$ are identical across all states, as are those for $\bar{s}_2(k)$ and $s^v_2(k)$. This is because $(L'_2 + 1) \bmod K = 0$, which leads to identical lead time demand distributions with different initial states.
As a robustness check, we conduct an additional numerical study for two five-state demand cases. The numerical insights are essentially the same as in the three-state demand cases (see Appendix E for details).

5.2 Effect of Transition Probability

To illustrate the effect of transition probability on our bound and heuristic performance, we parameterize the Markov chain transition probability matrix as follows:

\[
\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ q & 0 & 1 - q \end{pmatrix},
\]

where we vary the transition probability $q$ from 0.1 to 1. When $q = 1$, the transition matrix corresponds to the cyclic demand case; otherwise, it represents the noncyclic demand case. The cost parameters are set to be the same as the base case. The lead times are set as $(L_1, L_2) = (2, 1)$. For the highest demand state 3, we plot the optimal solution $s^*_n(3)$, the lower bound $\underline{s}_n(3)$, the upper bound $\bar{s}_n(3)$, the newsvendor heuristic $s^v_n(3)$, and the derivative-based heuristic $s^a_n(3 \mid \zeta)$ (with $\zeta = h_{[1,n]}$). The plots for Stages 1 and 2 are shown in Figure 1 (a) and (b), respectively.

[Figure 1: Impact of transition probability on the solutions in a two-stage system. Panels: (a) Stage 1, (b) Stage 2. Horizontal axis: $q$, from 0.1 to 1. Vertical axis: base-stock level for state 3. Curves: $s^*_n(3)$, $\underline{s}_n(3)$, $\bar{s}_n(3)$, $s^v_n(3)$, $s^a_n(3)$.]

From the figure, the optimal echelon base-stock level for the highest demand state decreases as the transition probability $q$ increases (i.e., the probability of jumping from the highest demand state to a lower demand state increases). Our bounds and heuristics follow the same decreasing trend as the optimal solution. Specifically, the derivative-based heuristic $s^a_n(3 \mid \zeta)$ tracks the optimal solution closely as the transition probability $q$ increases.
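One way to build intuition for the decreasing trend is to look at how the long-run share of time spent in each state changes with $q$. The sketch below is an illustrative side calculation and not part of the paper's analysis: it uses the invariant distribution of the parameterized chain, which works out to $(q, q, 1)/(1 + 2q)$, and it assumes the base-case state means of 1, 1, and 10 carry over to this experiment. The function names are hypothetical.

```python
def transition_matrix(q):
    """Parameterized transition matrix of Section 5.2."""
    return [[0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0],
            [q, 0.0, 1.0 - q]]


def invariant_distribution(q):
    """Solution of pi P = pi with entries summing to one: (q, q, 1) / (1 + 2q)."""
    z = 1.0 + 2.0 * q
    return [q / z, q / z, 1.0 / z]


def long_run_mean_demand(q, state_means=(1.0, 1.0, 10.0)):
    """Long-run average demand per period (state means are an assumption)."""
    pi = invariant_distribution(q)
    return sum(p * m for p, m in zip(pi, state_means))


def is_invariant(q, tol=1e-12):
    """Check pi P = pi for the closed form above."""
    P = transition_matrix(q)
    pi = invariant_distribution(q)
    pi_next = [sum(pi[k] * P[k][j] for k in range(3)) for j in range(3)]
    return all(abs(a - b) < tol for a, b in zip(pi, pi_next))
```

Under these assumptions, the share of time spent in the high demand state, $1/(1 + 2q)$, falls from about 0.83 at $q = 0.1$ to $1/3$ at $q = 1$, and the long-run mean demand per period falls from 8.5 to 4. This is only suggestive of, not a substitute for, the exact comparison in Figure 1.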
5.3 Comparison with an Alternative Heuristic

In this subsection we compare our derivative-based heuristic with an alternative heuristic proposed by Abhyankar and Graves (2001). Motivated by a case study of Teradyne Inc. (a major electronic test equipment manufacturer), Abhyankar and Graves (2001) considered an inventory hedging problem for a two-stage system with a two-state MMD process. Their heuristic is based on the following decoupling assumption: the upstream stage is assumed to provide a 100% internal fill rate, i.e., it can always fulfill orders from the downstream stage. As a result, the two-stage system is decoupled into two single-stage systems, with the downstream stage operating under a state-dependent installation base-stock policy, and the upstream stage operating under a static installation base-stock policy. The cost objective function considered in Abhyankar and Graves (2001) is different from ours: they treated backorders as negative inventory, and aimed to minimize expected inventory holding cost subject to a customer fill rate constraint. Nevertheless, we can still implement their decoupling heuristic idea in our problem setting, and compare its performance with that of our derivative-based heuristic.

We design the following numerical study to (approximately) match the model parameters of Abhyankar and Graves (2001). Their demand model is a continuous-time two-state Markov demand process. The low-state demand rate is 7.5 per time unit, with an expected cycle length of 45 time units. The high-state demand rate is 12.5 per time unit, with an expected cycle length of 45 time units. The total lead time of the system is 30 time units. We convert their continuous-time model to our discrete-time model as follows. We first assume that each discrete period consists of 5 time units. As a result, the mean demands for the low and high states are 7.5 × 5 = 37.5 and 12.5 × 5 = 62.5, respectively.
The expected cycle length for each state is 45/5 = 9 periods. To approximate these parameters, we assume that the demand follows a gamma distribution with parameters $(n_i, \theta_i) = (3, 0.1)$ and $(6, 0.1)$ for $i = 1, 2$, with $i = 1$ (mean 30) representing the low demand state and $i = 2$ (mean 60) representing the high demand state. We also assume that the transition probability matrix is given by

$$\begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}.$$

Under this transition matrix, the system stays in each state for $1/p$ periods on average. Therefore, when $p = 0.1$, the expected cycle length is 10 periods, approximately matching the original cycle length given in Abhyankar and Graves (2001). To illustrate the effect of the transition probability $p$, we let $p$ vary from 0.1 to 0.9. Since each discrete period consists of 5 time units, the total lead time (30 time units) becomes $L_1 + L_2 = 6$ periods. We let $L_1$ vary from 1 to 5 while keeping the total lead time constant at 6 periods. We set the stockout penalty cost $b = 50$ and installation holding costs $H_1 = 3$, $H_2 = 1$, which are the same as in the base case. To sum up, we have a total of 9 × 5 = 45 parameter cases for comparing our heuristic with the decoupling heuristic.

To implement the decoupling heuristic idea in our problem setting, we first use our single-stage heuristic to compute the state-dependent installation base-stock levels for Stage 1. For Stage 2, we need to compute a static installation base-stock level. Note that the transition probability matrix under consideration is doubly stochastic. Hence, the stationary distribution is uniform, $(1/2, 1/2)$. We use a gamma distribution with parameters $(n, \theta) = (9, 0.2)$ to approximate the demand under the stationary distribution (with a matching demand mean of 45). Based on this demand distribution, we use a target fill rate of 97.7%, as suggested in Abhyankar and Graves (2001), to determine the installation base-stock level for Stage 2.
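The parameter matching described above can be sanity-checked with a few lines of arithmetic. The sketch below assumes that θ is a rate parameter, so that a gamma(n, θ) demand has mean n/θ (this is the reading consistent with the means 30, 60, and 45 quoted in the text), and it approximates the stationary distribution of the two-state chain by repeated multiplication. All function names are hypothetical.

```python
def two_state_matrix(p):
    """Symmetric two-state chain: switch state with probability p."""
    return [[1.0 - p, p],
            [p, 1.0 - p]]


def expected_sojourn(p):
    """Expected number of consecutive periods spent in a state (geometric)."""
    return 1.0 / p


def gamma_mean(shape, rate):
    """Mean of a gamma distribution, assuming a shape/rate parameterization."""
    return shape / rate


def stationary_by_iteration(P, iters=10_000):
    """Approximate the stationary distribution by iterating pi <- pi * P.
    Converges for 0 < p < 1 because the chain is then aperiodic."""
    pi = [1.0, 0.0]
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]
    return pi


if __name__ == "__main__":
    low, high = gamma_mean(3, 0.1), gamma_mean(6, 0.1)
    for p in [0.1, 0.9]:
        pi = stationary_by_iteration(two_state_matrix(p))
        mix_mean = pi[0] * low + pi[1] * high
        print(f"p = {p}: sojourn = {expected_sojourn(p):.2f} periods, "
              f"pi = {[round(x, 3) for x in pi]}, stationary mean demand = {mix_mean:.2f}")
    print("mean of the gamma(9, 0.2) approximation:", gamma_mean(9, 0.2))
```

Only the mean is matched in this check; it does not reproduce the fill-rate calculation used to set the Stage 2 base-stock level.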
We then convert the installation base-stock levels into the echelon base-stock levels by a standard technique (Zipkin 2000). We let $s^d_n(k)$ denote the resulting echelon base-stock levels, so that we can compare them with the optimal echelon base-stock levels and our heuristic solutions.

Recall that our derivative-based heuristic is parametrized by $\zeta$, with $h_n \le \zeta \le h_{[1,n]}$. We find the heuristic performs well at either of the two extreme values of $\zeta$, so we present both cases $s^a_n(k \mid h_n)$ and $s^a_n(k \mid h_{[1,n]})$ below. We evaluate the long-run average costs of the two-stage system under the heuristic policies and the optimal policy, and then compute the percentages of relative solution/cost errors.

When the transition probability $p$ is small (e.g., $p = 0.1$), the transition between the two states occurs less frequently. The demand process behaves more like an i.i.d. process, with occasional shifts in the demand state. On the other hand, when $p$ is large (e.g., $p = 0.9$), the demand process becomes more dynamic, as it transitions between the two states more frequently. The latter case is similar to the three-state and five-state demand cases considered earlier (where the probability of transitioning to a different state is no less than 0.7). Table 2 presents the detailed numerical results for the two extreme transition probability cases of $p = 0.1$ and $0.9$.

Table 2: Comparison of policies in a two-stage system with two-state demand.

$p = 0.1$

| $(L_1, L_2)$ | $n$ | $k$ | $s^*_n(k)$ | $s^a_n(k \mid h_{[1,n]})$ | $s^a_n(k \mid h_n)$ | $s^d_n(k)$ |
|---|---|---|---|---|---|---|
| (1, 5) | 1 | 1 | 117.9 | 117.9 | 117.9 | 110.1 |
| | 1 | 2 | 185.1 | 180.6 | 180.6 | 171.6 |
| | 2 | 1 | 438.6 | 356.1 | 392.0 | 458.4 |
| | 2 | 2 | 533.3 | 471.9 | 512.2 | 519.9 |
| | Solution error % | | | 11.6% | 5.7% | 4.3% |
| | Cost error % | | 414.0 | 29.5% | 4.8% | 2.3% |
| (2, 4) | 1 | 1 | 168.3 | 168.3 | 168.3 | 158.8 |
| | 1 | 2 | 253.8 | 247.4 | 247.4 | 236.5 |
| | 2 | 1 | 434.4 | 356.1 | 392.0 | 455.7 |
| | 2 | 2 | 527.7 | 471.9 | 512.2 | 533.4 |
| | Solution error % | | | 10.2% | 4.6% | 3.9% |
| | Cost error % | | 501.8 | 22.3% | 3.6% | 2.9% |
| (3, 3) | 1 | 1 | 218.6 | 218.6 | 218.6 | 207.7 |
| | 1 | 2 | 318.4 | 310.9 | 310.9 | 298.4 |
| | 2 | 1 | 428.5 | 356.1 | 392.0 | 452.5 |
| | 2 | 2 | 519.9 | 471.9 | 512.2 | 543.2 |
| | Solution error % | | | 8.6% | 3.5% | 5.3% |
| | Cost error % | | 591.3 | 16.9% | 2.6% | 4.5% |
| (4, 2) | 1 | 1 | 269.0 | 269.0 | 269.0 | 256.8 |
| | 1 | 2 | 380.0 | 371.9 | 371.9 | 358.0 |
| | 2 | 1 | 420.4 | 356.1 | 392.0 | 448.6 |
| | 2 | 2 | 509.7 | 471.9 | 512.2 | 549.8 |
| | Solution error % | | | 7.0% | 2.5% | 6.5% |
| | Cost error % | | 681.3 | 11.7% | 1.6% | 6.1% |
| (5, 1) | 1 | 1 | 319.6 | 319.6 | 319.6 | 306.3 |
| | 1 | 2 | 439.4 | 430.9 | 430.9 | 415.7 |
| | 2 | 1 | 403.4 | 356.1 | 392.0 | 443.5 |
| | 2 | 2 | 496.7 | 471.9 | 512.2 | 552.9 |
| | Solution error % | | | 4.9% | 2.1% | 8.0% |
| | Cost error % | | 768.1 | 6.3% | 0.7% | 7.9% |

$p = 0.9$

| $(L_1, L_2)$ | $n$ | $k$ | $s^*_n(k)$ | $s^a_n(k \mid h_{[1,n]})$ | $s^a_n(k \mid h_n)$ | $s^d_n(k)$ |
|---|---|---|---|---|---|---|
| (1, 5) | 1 | 1 | 148.0 | 148.0 | 148.0 | 139.9 |
| | 1 | 2 | 156.9 | 156.9 | 156.9 | 148.2 |
| | 2 | 1 | 423.8 | 410.7 | 447.5 | 488.2 |
| | 2 | 2 | 447.5 | 433.1 | 467.2 | 496.6 |
| | Solution error % | | | 2.3% | 3.7% | 11.1% |
| | Cost error % | | 357.3 | 1.3% | 1.6% | 10.1% |
| (2, 4) | 1 | 1 | 197.1 | 197.1 | 197.1 | 187.3 |
| | 1 | 2 | 227.2 | 226.9 | 226.9 | 216.7 |
| | 2 | 1 | 421.9 | 410.7 | 447.5 | 484.3 |
| | 2 | 2 | 444.7 | 433.1 | 467.2 | 513.6 |
| | Solution error % | | | 1.8% | 3.8% | 11.7% |
| | Cost error % | | 424.6 | 0.6% | 1.8% | 10.4% |
| (3, 3) | 1 | 1 | 264.2 | 264.2 | 264.2 | 253.0 |
| | 1 | 2 | 276.5 | 276.5 | 276.5 | 264.9 |
| | 2 | 1 | 417.7 | 410.7 | 447.5 | 497.8 |
| | 2 | 2 | 442.3 | 433.1 | 467.2 | 509.7 |
| | Solution error % | | | 1.2% | 3.9% | 12.2% |
| | Cost error % | | 490.7 | 0.2% | 2.4% | 10.9% |
| (4, 2) | 1 | 1 | 314.0 | 314.0 | 314.0 | 301.5 |
| | 1 | 2 | 339.5 | 339.3 | 339.3 | 326.4 |
| | 2 | 1 | 416.0 | 410.7 | 447.5 | 493.2 |
| | 2 | 2 | 438.8 | 433.1 | 467.2 | 518.2 |
| | Solution error % | | | 0.7% | 4.0% | 12.1% |
| | Cost error % | | 552.0 | 0.0% | 3.0% | 10.9% |
| (5, 1) | 1 | 1 | 375.0 | 375.0 | 375.0 | 361.4 |
| | 1 | 2 | 389.5 | 389.5 | 389.5 | 375.5 |
| | 2 | 1 | 410.2 | 410.7 | 447.5 | 498.6 |
| | 2 | 2 | 436.3 | 433.1 | 467.2 | 512.7 |
| | Solution error % | | | 0.2% | 4.2% | 11.9% |
| | Cost error % | | 609.1 | 0.0% | 4.1% | 11.5% |

From Table 2, we observe that, when $p$ is small, our heuristic $s^a_n(k \mid \zeta)$ performs better with $\zeta = h_n$. When $p$ is large, our heuristic $s^a_n(k \mid \zeta)$ performs better with $\zeta = h_{[1,n]}$. This latter observation accords with our earlier observations in the three-state and five-state demand cases. These results indicate that, when the demand process is more dynamic, we should set $\zeta = h_{[1,n]}$ in our heuristic; otherwise, we should use $\zeta = h_n$.
Comparing our heuristic with the decoupling heuristic, we note that our heuristic performs better in most cases, with an average performance improvement of 4.6%. The most dramatic improvement occurs in the more dynamic demand case of $p = 0.9$, where the improvement can be as high as 11.5%. However, in the less dynamic demand case of $p = 0.1$, when the Stage 2 lead time is long (i.e., $L_2 \ge 4$), the decoupling heuristic performs better than our heuristic, with an improvement of 0.7-2.5%. Performance difference aside, we note that our derivative-based heuristic can be applied to systems with any number of stages, whereas the decoupling heuristic only works for two-stage systems. Figure 2 illustrates the heuristic performance comparison under other $p$ values, where the relative cost error percentages of the three heuristic policies are plotted for the two extreme lead time cases of $(L_1, L_2) = (1, 5), (5, 1)$. The plots confirm our observations from Table 2.

[Figure 2: Relative cost error percentages under various heuristics. Panels: (a) $(L_1, L_2) = (1, 5)$; (b) $(L_1, L_2) = (5, 1)$. x-axis: $p$; y-axis: heuristic cost errors; curves: $s^a_n(k \mid h_{[1,n]})$, $s^a_n(k \mid h_n)$, $s^d_n(k)$.]

It has been demonstrated in the literature that, in a serial inventory system with i.i.d. demand, an optimal internal fill rate is usually much lower than the fill rate at the customer-facing stage (see Choi et al. 2004, Shang and Song 2006, and references therein). For systems with MMD, a natural question is whether the internal fill rate would continue to be low, or become much higher as assumed in the decoupling heuristic. To obtain insights into this question, we compute the system internal fill rates (at Stage 2) under the optimal policy for the two extreme lead time cases of $(L_1, L_2) = (1, 5), (5, 1)$.
The results are plotted in Figure 3.

[Figure 3: Internal fill rates under optimal policy. x-axis: $p$; y-axis: system internal fill rate under the optimal solution; curves: lead time (1, 5) and lead time (5, 1).]

Figure 3 shows that, contrary to the observations documented for the i.i.d. demand case, when the lead time at the upstream stage is long (i.e., $L_2 = 5$), the internal fill rate can be high under MMD (ranging between 94-96%; the target fill rate at the customer-facing stage is 94.3%), suggesting that the decoupling heuristic may yield a good approximation to the optimal policy. However, from Table 2, the decoupling heuristic still performs poorly when $p$ is large. This is because the static installation base-stock level computed under the decoupling heuristic tends to be a poor approximation to the optimal policy as the demand process becomes more dynamic. Figure 3 also shows that, when the lead time at the upstream stage is relatively short (i.e., $L_2 = 1$), the internal fill rate tends to be low (ranging between 84-93%, lower than the 94.3% target fill rate at the customer-facing stage). As a result, the decoupling heuristic yields a poor approximation, and our derivative-based heuristic has a clear advantage in this case, as it does not require the high internal fill rate assumption. We conduct an additional numerical study on the internal fill rates under the three-state demand cases. The findings are similar to the above observations (see Appendix E).

6. Conclusion

We have addressed in this paper supply chain inventory management in a volatile demand environment driven by uncertain external factors. Our goal is to develop simple analytical tools and insights to cope effectively with dynamic demand uncertainties. To this end, we have focused on simplifying the inventory decision tools for serial inventory systems with MMD. We make several contributions to the literature.
First, we have developed simple solution bounds and approximations for these systems. Our derivative analysis approach enables us to obtain a general solution lower bound that does not require the restrictive assumptions used in previous works. Second, we have performed the first asymptotic analysis for these systems with long lead time. We prove the MMD central limit theorem and characterize the asymptotic mean and variance. We also show the convergence rate of our bounds and approximations to the optimal policy. These results help reveal the effect of lead time on the optimal policy and provide an analytical assessment of the effectiveness of our bounds and heuristic policies (i.e., the relative errors). Third, to evaluate the performance of our solution bounds and heuristic policies, we use the gamma demand distribution family and show that, under this family, both the optimal solutions and the solution bounds can be computed easily by leveraging the Laplace transformation and its inverse. To our knowledge, existing algorithms in the literature for serial inventory systems with MMD only address discrete demand (such as the Poisson distribution). With the gamma demand distribution, we have presented the first set of numerical studies of serial systems under MMD.

We have demonstrated through numerical studies that our approximations are effective. These results greatly simplify computation and, hence, facilitate implementation. For instance, the bounds provide a convenient benchmark for the maximum and minimum inventory needed for each possible demand state, which is useful for budgeting purposes. The heuristics can serve as quick decision tools. Perhaps even more importantly, these simple solutions can be obtained by solving equations that depend on the primitive model parameters only, making them much more transparent than the exact optimal solution.
In particular, they shed light on how the transition probability among the states (and thus the environmental uncertainty) affects the optimal inventory position.

The MMD model is particularly attractive because the flexibility of Markov chains enables us to account for environmental uncertainties. Advanced information technology, such as hand-held devices, GPS, and social media, as well as abundant data associated with these technologies, makes it easier to construct an MMD model. Real-time data also makes it easier to observe the demand state and, hence, to implement the MMD model. For example, one can detect the predictive seasonality pattern from real-time data and construct a Markov chain with a cyclic demand state transition matrix. For demand processes with more complex dynamics, such as in the Teradyne case of Abhyankar and Graves (2001), one can construct a discrete-time Markov chain based on the observed data, as we have demonstrated in the numerical study.

Several extensions of our problem merit further investigation. We have observed numerically that the optimal internal fill rate under MMD may be significantly higher than that observed in previous studies under i.i.d. demand. It would be interesting to identify analytical conditions that guarantee a high internal fill rate, such that the serial inventory system can be decoupled into separate single-stage systems to facilitate the implementation of the decoupling heuristic. Another extension is to investigate the demand variability propagation effect, or the bullwhip effect (see Chen and Lee 2009, 2012 and the references therein), in serial inventory systems under MMD. We have observed numerically that the optimal policy can dampen the bullwhip effect significantly, suggesting that the state-dependent inventory policy under MMD may smooth demand variability propagation in the supply chain (see Appendix E for details).
Providing analytical explanations for this system behavior can be an interesting future research topic.

References

Abhyankar, H., S. Graves. 2001. Creating an inventory hedge for Markov-modulated Poisson demand: An application and model. Manufacturing & Service Oper. Management 3(4) 306–320.
Abouee-Mehrizi, H., O. Baron, O. Berman. 2014. Exact analysis of capacitated two-echelon inventory systems with priorities. Manufacturing & Service Oper. Management 16(4) 561–577.
Angelus, A. 2011. A multiechelon inventory problem with secondary market sales. Management Sci. 57(12) 2145–2162.
Aviv, Y., A. Federgruen. 1997. Stochastic inventory models with limited production capacity and periodically varying parameters. Probability in the Engineering and Informational Sciences 11 107–135.
Beyer, D., S. Sethi. 1997. Average cost optimality in inventory models with Markovian demands. J. Optim. Theory Appl. 92(3) 497–526.
Billingsley, P. 1995. Probability and Measure. 3rd Edition, John Wiley & Sons.
Brenner, J. L. 1959. Relations among the minors of a matrix with dominant principal diagonal. Duke Math. J. 26(4) 563–567.
Chao, X., S. Zhou. 2007. Probabilistic solution and bounds for serial inventory system with discounted and average cost. Naval Res. Logist. 54(6) 623–631.
Chen, F., J.-S. Song. 2001. Optimal policies for multiechelon inventory problems with Markov-modulated demand. Oper. Res. 49(2) 226–234.
Chen, F., Y.-S. Zheng. 1994. Lower bounds for multi-echelon stochastic inventory systems. Management Sci. 40(11) 1426–1443.
Chen, L., H.L. Lee. 2009. Information sharing and order variability control under a generalized demand model. Management Sci. 55(5) 781–797.
Chen, L., H.L. Lee. 2012. Bullwhip effect measurement and its implications. Oper. Res. 60(4) 771–784.
Choi, K.-S., J.G. Dai, J.-S. Song. 2004. On measuring supplier performance under vendor-managed-inventory programs in capacitated supply chains. Manufacturing & Service Oper. Management 6(1) 53–72.
Clark, A.J., H. Scarf. 1960. Optimal policies for a multi-echelon inventory problem. Management Sci. 6 475–490.
Dong, L., H.L. Lee. 2003. Optimal policies and approximations for a serial multi-echelon inventory system with time-correlated demand. Oper. Res. 51(6) 969–980.
Federgruen, A., P. Zipkin. 1984. Computational issues in an infinite horizon multi-echelon inventory problem with stochastic demand. Oper. Res. 32 818–836.
Gallego, G., O. Ozer. 2005. A new algorithm and a new heuristic for serial supply systems. Oper. Res. Letters 33(4) 349–362.
Gallego, G., P. Zipkin. 1999. Stock positioning and performance estimation in serial production-transportation systems. Manufacturing & Service Oper. Management 1(1) 77–88.
Huh, W. T., G. Janakiraman. 2008. A sample-path approach to the optimality of echelon order-up-to policies in serial inventory systems. Oper. Res. Letters 36(5) 547–550.
Iglehart, D., S. Karlin. 1962. Optimal policy for dynamic inventory process with nonstationary stochastic demands. Studies in Applied Probability and Management Science. K. Arrow, S. Karlin, H. Scarf (eds.). Chapter 8, Stanford University Press, Stanford, California.
Janakiraman, G., J.A. Muckstadt. 2009. A decomposition approach for a class of capacitated serial systems. Oper. Res. 57(6) 1384–1393.
Karlin, S. 1960. Dynamic inventory policy with varying stochastic demands. Management Sci. 6(3) 231–258.
Kapuscinski, R., S. Tayur. 1998. A capacitated production-inventory model with periodic demand. Oper. Res. 46(6) 899–911.
Muharremoglu, A., J.N. Tsitsiklis. 2008. A single-unit decomposition approach to multiechelon inventory systems. Oper. Res. 56(5) 1089–1103.
Ozekici, S., M. Parlar. 1993. Periodic-review inventory models in random environments. Working paper, School of Business, McMaster University, Hamilton, Ontario, Canada.
Parker, R.P., R. Kapuscinski. 2004. Optimal policies for a capacitated two-echelon inventory system. Oper. Res. 52(5) 739–755.
Sgibnev, M. S. 2001. Stone decomposition for a matrix renewal measure on a half-line. Sbornik: Mathematics 192(7) 1025–1033.
Sethi, S., F. Cheng. 1997. Optimality of (s, S) policies in inventory models with Markovian demand. Oper. Res. 45(6) 931–939.
Shang, K.H. 2012. Single-stage approximations for optimal policies in serial inventory systems with non-stationary demand. Manufacturing & Service Oper. Management 14(3) 414–422.
Shang, K.H., J.-S. Song. 2003. Newsvendor bounds and heuristic for optimal policies in serial supply chains. Management Sci. 49(5) 618–638.
Shang, K., J.-S. Song. 2006. A closed-form approximation for serial inventory systems and its application to system design. Manufacturing & Service Oper. Management 8(4) 394–406.
Song, J.-S., P. Zipkin. 1992. Evaluation of base stock policies in multiechelon inventory systems with state dependent demands part I: State independent policies. Naval Res. Logist. 39(5) 715–728.
Song, J.-S., P. Zipkin. 1993. Inventory control in a fluctuating demand environment. Oper. Res. 41(2) 351–370.
Song, J.-S., P. Zipkin. 1996a. Evaluation of base-stock policies in multiechelon inventory systems with state-dependent demands: Part II: State-dependent depot policies. Naval Res. Logist. 43 381–396.
Song, J.-S., P. Zipkin. 1996b. Managing inventory with the prospect of obsolescence. Oper. Res. 44(1) 215–222.
Song, J.-S., P. Zipkin. 2009. Inventories with multiple supply sources and networks of queues with overflow bypasses. Management Sci. 55(3) 362–372.
Zipkin, P. 1989. Critical number policies for inventory models with periodic data. Management Sci. 35(1) 71–80.
Zipkin, P. 2000. Foundations of inventory management. McGraw-Hill, New York.

Appendices for Online Companion

Appendix A: Proofs

Proof (Lemma 1) Prove by induction. For case $i = 1$, because $G^1(\cdot, y)$ is convex in $y$, it follows that $G^1(1^*, y)$ is increasing convex in $y$. Thus, by definition (4), we have

$$\partial_y R^1(1^*, y) = \partial_y G^1(1^*, y) + p_{1^*1^*}\, E\left\{\partial_y R^1(1^*, y - D_{1^*})\right\}.$$
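Because the first term on the right-hand side is nonnegative, by a standard result of renewal equations (see Sgibnev 2001, Theorem 3, p. 1032), it follows that $\partial_y R^1(1^*, y) \ge 0$. Taking derivative on both sides of the equation, by the same argument, we obtain $\partial_y^2 R^1(1^*, y) \ge 0$. Therefore, $R^1(1^*, y)$ is increasing convex in $y$. Next, by the definition of $G^2(\cdot, y)$, we have $G^2(\cdot, y)$ is convex in $y$. Hence, it follows that $G^2(2^*, y)$ is increasing convex in $y$. By repeating the above induction proof step, we obtain the desired result.

Proof (Lemma 2) We first show $\partial_y \mathbf{R}^i(y) \le \left(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\right)^{-1} \cdot \mathbf{e}^i \cdot \partial_y G^i(i^*, y)$. When $i = 2$, (9) can be written as

$$(1 - p_{1^*1^*})\,\partial_y R^2(1^*, y) - p_{1^*2^*}\,\partial_y R^2(2^*, y) \le 0;$$
$$-p_{2^*1^*}\,\partial_y R^2(1^*, y) + (1 - p_{2^*2^*})\,\partial_y R^2(2^*, y) \le \partial_y G^2(2^*, y).$$

Recall that $\partial_y R^i(\cdot, y)$ and $\partial_y G^i(\cdot, y)$ are always nonnegative. It follows that

$$\partial_y R^2(1^*, y) \le \frac{p_{1^*2^*}}{(1 - p_{1^*1^*})(1 - p_{2^*2^*}) - p_{1^*2^*}p_{2^*1^*}}\,\partial_y G^2(2^*, y) = \frac{p_{1^*2^*}}{\left|\mathbf{I}^{(2)} - \mathbf{P}^{(2)}_{m^*}\right|}\,\partial_y G^2(2^*, y);$$
$$\partial_y R^2(2^*, y) \le \frac{1 - p_{1^*1^*}}{(1 - p_{1^*1^*})(1 - p_{2^*2^*}) - p_{1^*2^*}p_{2^*1^*}}\,\partial_y G^2(2^*, y) = \frac{1 - p_{1^*1^*}}{\left|\mathbf{I}^{(2)} - \mathbf{P}^{(2)}_{m^*}\right|}\,\partial_y G^2(2^*, y).$$

On the other hand, it is easy to verify

$$\left(\mathbf{I}^{(2)} - \mathbf{P}^{(2)}_{m^*}\right)^{-1} \mathbf{e}^2 = \left|\mathbf{I}^{(2)} - \mathbf{P}^{(2)}_{m^*}\right|^{-1} \begin{pmatrix} p_{1^*2^*} \\ 1 - p_{1^*1^*} \end{pmatrix}.$$

Therefore, we have $\partial_y \mathbf{R}^2(y) \le \left(\mathbf{I}^{(2)} - \mathbf{P}^{(2)}_{m^*}\right)^{-1} \cdot \mathbf{e}^2 \cdot \partial_y G^2(2^*, y)$. When $i > 2$, following the same proof steps, we can show the inequality holds in general. Recall from Lemma 1 that $\partial_y \mathbf{R}^i(y)$ is always nonnegative. This implies that $\left(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\right)^{-1} \cdot \mathbf{e}^i \ge 0$.

Next, let $(a)_c$ denote the minor of an element $a$ in $\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}$. From the standard result of matrix inverse, we have

$$\left(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\right)^{-1} \cdot \mathbf{e}^i = \left|\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\right|^{-1} \begin{pmatrix} (-1)^{i+1}(-p_{i^*1^*})_c \\ (-1)^{i+2}(-p_{i^*2^*})_c \\ \vdots \\ (1 - p_{i^*i^*})_c \end{pmatrix}.$$

Also note that $\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}$ is a diagonally dominant matrix (because $1 - p_{i^*i^*} \ge \sum_{j=1}^{i-1} p_{i^*j^*}$). By Brenner (1959, Corollary 1), it follows that

$$\begin{pmatrix} (-1)^{i+1}(-p_{i^*1^*})_c \\ (-1)^{i+2}(-p_{i^*2^*})_c \\ \vdots \\ (1 - p_{i^*i^*})_c \end{pmatrix} \le \begin{pmatrix} (1 - p_{i^*i^*})_c \\ (1 - p_{i^*i^*})_c \\ \vdots \\ (1 - p_{i^*i^*})_c \end{pmatrix},$$

where $(1 - p_{i^*i^*})_c = \left|\mathbf{I}^{(i-1)} - \mathbf{P}^{(i-1)}_{m^*}\right|$. Therefore, we obtain $\left(\mathbf{I}^{(i)} - \mathbf{P}^{(i)}_{m^*}\right)^{-1} \mathbf{e}^i \le \mathbf{e} \cdot \beta_i(m^*)$, and the desired result follows.

Proof (Proposition 1) The lower bound result is straightforward from Lemma 1. For the upper bound result, from the discussion preceding Proposition 1, we know

$$\partial_y G^{[k]}(k, y) \le \partial_y G(k, y) + (1 - p_{kk}) \sum_{i=1}^{K-1} \beta_i(m^*)\,\partial_y G^i(i^*, y),$$

where $m^*$ is the optimal state sequence. We want to further relax the second term of the right-hand side to make it independent of the optimal state sequence. Note that, for $i = 1, \dots, K-1$,

$$\begin{aligned}
\partial_y G^i(i^*, y) &= \max\left\{0, \partial_y G^i(i^*, y)\right\} \\
&= \max\left\{0,\; \partial_y G^1(i^*, y) + \sum_{l=1}^{i-1}\sum_{k=1}^{l} p_{i^*k^*}\, E\left\{\partial_y R^l(k^*, y - D_{i^*})\right\}\right\} \\
&\le \left(\partial_y G^1(i^*, y)\right)^+ + \sum_{l=1}^{i-1}\sum_{k=1}^{l} p_{i^*k^*}\,\partial_y R^l(k^*, y) \\
&\le \left(\partial_y G^1(i^*, y)\right)^+ + \sum_{l=1}^{i-1} \phi_{i,l}(m^*)\,\partial_y G^l(l^*, y), \qquad \text{(A1)}
\end{aligned}$$

where the first inequality follows from Lemma 1 and the second inequality follows from Lemma 2, with $\phi_{i,l}(m^*) = [p_{m^*_i m^*_1}, \dots, p_{m^*_i m^*_l}]^T \left(\mathbf{I}^{(l)} - \mathbf{P}^{(l)}_{m^*}\right)^{-1} \mathbf{e}^l$. Define the following vector and matrix notations:

$$\partial_y \mathbf{G} = \left[\partial_y G^1(1^*, y),\; \partial_y G^2(2^*, y),\; \dots,\; \partial_y G^{K-1}((K-1)^*, y)\right]^T,$$
$$(\partial_y \mathbf{G}^1)^+ = \left[(\partial_y G^1(1^*, y))^+,\; (\partial_y G^1(2^*, y))^+,\; \dots,\; (\partial_y G^1((K-1)^*, y))^+\right]^T,$$
$$\Psi = \begin{pmatrix} 0 & 0 & \cdots & 0 & 0 \\ \phi_{2,1}(m^*) & 0 & \cdots & 0 & 0 \\ \phi_{3,1}(m^*) & \phi_{3,2}(m^*) & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \phi_{K-1,1}(m^*) & \phi_{K-1,2}(m^*) & \cdots & \phi_{K-1,K-2}(m^*) & 0 \end{pmatrix}.$$

Then, the inequalities of (A1) can be written in a matrix form as $\left(\mathbf{I}^{(K-1)} - \Psi\right) \partial_y \mathbf{G} \le (\partial_y \mathbf{G}^1)^+$. Note that $\mathbf{I}^{(K-1)} - \Psi$ is a lower triangular matrix with ones on the diagonal, and hence its inverse must also be a lower triangular matrix with ones on the diagonal. From the fact that the elements in $\Psi$ are all nonnegative, it follows that $\mathbf{I}^{(K-1)} - \Psi$ is invertible and its inverse has all nonnegative elements. Therefore, we have $\partial_y \mathbf{G} \le \left(\mathbf{I}^{(K-1)} - \Psi\right)^{-1} (\partial_y \mathbf{G}^1)^+$.

Further define $\beta(m^*) = [\beta_1(m^*), \dots, \beta_{K-1}(m^*)]^T$. Then, we have

$$\sum_{i=1}^{K-1} \beta_i(m^*)\,\partial_y G^i(i^*, y) = \beta(m^*)^T \partial_y \mathbf{G} \le \beta(m^*)^T \left(\mathbf{I}^{(K-1)} - \Psi\right)^{-1} (\partial_y \mathbf{G}^1)^+.$$

Let $\alpha(m^*) = [\alpha_1(m^*), \dots, \alpha_{K-1}(m^*)] = \beta(m^*)^T \left(\mathbf{I}^{(K-1)} - \Psi\right)^{-1}$, where $\alpha_i(m^*)$ can be determined recursively by (10). Therefore, we conclude that

$$\begin{aligned}
\partial_y G^{[k]}(k, y) &\le \partial_y G(k, y) + (1 - p_{kk}) \sum_{i=1}^{K-1} \alpha_i(m^*) \left(\partial_y G^1(i^*, y)\right)^+ \\
&\le \partial_y G(k, y) + (1 - p_{kk}) \max_{m \in S} \sum_{i=1}^{K-1} \alpha_i(m) \left(\partial_y G(m_i, y)\right)^+ \\
&= \partial_y G(k, y) + (1 - p_{kk})\,\Delta(y),
\end{aligned}$$

where the last equality follows from the definition of $\Delta(y)$ given in (11).

Proof (Corollary 1) $\underline{s}(k) \le s^*(k) \le \bar{s}(k)$ is a direct result from Proposition 1. By the definition of $s_{\min}$, we have $s_{\min} \le \bar{s}(k)$. Note that $\partial_y G(k, y)$ is increasing in $y$. Thus, it follows from the definition of $\bar{s}(k)$ that $\partial_y G(k, s_{\min}) \le 0$. Moreover, note that $(\partial_y G(k, s_{\min}))^+ = 0$ for any $k$. Hence, it is straightforward to verify that $\partial_y G(k, s_{\min}) + (1 - p_{kk})\Delta(s_{\min}) \le 0$, which implies that $s_{\min} \le \underline{s}(k)$ by the definition of $\underline{s}(k)$ in (12).

Proof (Corollary 2) By the definition of $\alpha_i(m)$, we have $\alpha_i(m) \ge \beta_i(m)$. Therefore,

$$\begin{aligned}
\partial_y G(k, y) &\le \partial_y G(k, y) + \frac{1}{2}\sum_{i=1}^{j-1} p_{k\tilde{m}_i} \cdot \beta_i(\tilde{m}) \cdot F_k(y - \bar{s}(\tilde{m}_i)) \left(\partial_y G(\tilde{m}_i, y)\right)^+ \\
&\le \partial_y G(k, y) + (1 - p_{kk}) \sum_{i=1}^{j-1} \alpha_i(\tilde{m}) \left(\partial_y G(\tilde{m}_i, y)\right)^+ \\
&\le \partial_y G(k, y) + (1 - p_{kk})\,\Delta(y),
\end{aligned}$$

which implies the desired result.

Proof (Lemma 3) Let $\pi^t_k = [\pi^t_k(1), \dots, \pi^t_k(K)]$ denote the distribution of demand states at period $t$ given an initial state $k \in \{1, \dots, K\}$. Then, we have $\lim_{t \to \infty} \pi^t_k = \pi$, where $\pi$ is the Markov chain stationary distribution. Let $X^t_k$ denote the demand in period $t$ given the initial state $k$, then

$$X^t_k = \begin{cases} D_1^{(t)} & \text{with prob. } \pi^t_k(1), \\ D_2^{(t)} & \text{with prob. } \pi^t_k(2), \\ \;\vdots \\ D_K^{(t)} & \text{with prob. } \pi^t_k(K), \end{cases}$$

where $D_i^{(t)}$ has the same distribution as $D_i$. Note that $D_{i_1}^{(t_1)}, \dots, D_{i_n}^{(t_n)}$ are all independent of each other for any given sets of $\{i_1, \dots, i_n\}$ and $\{t_1, \dots, t_n\}$, as long as $t_j \ne t_{j'}$ for $j \ne j'$.

Consider first an alternative Markov chain with the same transition matrix but starting with the stationary distribution. Let $Y^t$ denote the demand in each period under this Markov chain.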
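Then, we have $Y^t = D_i^{(t)}$ with probability $\pi(i)$ for any $t \ge 1$. Hence, $E\{Y^t\} = \sum_{i=1}^{K} \mu_i \pi(i) = \mu$. Let $\tilde{Y}^t = Y^t - \mu$. Then we have $E\{\tilde{Y}^t\} = 0$. It is also easy to check that

$$E\left\{(\tilde{Y}^t)^2\right\} = \sum_{k=1}^{K} \left(\sigma_k^2 + \mu_k^2 - \mu^2\right)\pi(k), \quad \text{and} \quad E\left\{\tilde{Y}^t \tilde{Y}^{t+s}\right\} = \sum_{k=1}^{K}\sum_{l=1}^{K} (\mu_k - \mu)\,\mu_l\, p^{(s)}_{kl}\,\pi(k),$$

where $p^{(s)}_{kl}$ is the $s$-step transition probability from state $k$ to state $l$. Since the Markov chain $W$ has a stationary distribution and the state space is finite, by Theorem 8.9 in Billingsley (1995, p. 131), there exist constants $A \ge 0$ and $0 \le \rho < 1$ such that, for any $k$, $l$ and $s$,

$$\left|p^{(s)}_{kl} - \pi(l)\right| \le A\rho^s. \qquad \text{(A2)}$$

Since $\{\tilde{Y}^t\}$ is stationary, from the above condition, it can be verified that $\{\tilde{Y}^t\}$ is $\alpha$-mixing with an exponential decay rate (see Billingsley 1995, p. 363). Let $\tilde{S}^L = \tilde{Y}^1 + \cdots + \tilde{Y}^{L+1}$. By Theorem 27.4 of Billingsley (1995, p. 364), it follows that the series below converges absolutely and that, as $L \to \infty$,

$$\begin{aligned}
(L+1)^{-1}\,\mathrm{var}\left\{\tilde{S}^L\right\} &\to E\left\{(\tilde{Y}^1)^2\right\} + 2\sum_{s=1}^{\infty} E\left\{\tilde{Y}^1 \tilde{Y}^{1+s}\right\} \\
&= \sum_{k=1}^{K} \left(\sigma_k^2 + \mu_k^2 - \mu^2\right)\pi(k) + 2\sum_{s=1}^{\infty}\sum_{k=1}^{K}\sum_{l=1}^{K} (\mu_k - \mu)\,\mu_l\, p^{(s)}_{kl}\,\pi(k) \\
&= \sigma^2.
\end{aligned}$$

If $\sigma > 0$, we further have, as $L \to \infty$,

$$\frac{\tilde{S}^L}{\sigma\sqrt{L+1}} \xrightarrow{\text{dist.}} N(0, 1). \qquad \text{(A3)}$$

Now consider the original Markov chain. Define $\tilde{X}^t_k = X^t_k - \mu$ and $S^L_k = \tilde{X}^1_k + \cdots + \tilde{X}^{L+1}_k$. Given a fixed $y$ value ($y \ge 0$), for any $N \ge 0$, we can write the following:

$$\begin{aligned}
&P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) \\
&\quad= \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y \;\middle|\; X^{N+1}_k = D^{(N+1)}_{i_{N+1}}, \dots, X^{L+1}_k = D^{(L+1)}_{i_{L+1}}\right) \\
&\qquad\qquad \times P\left(X^{N+1}_k = D^{(N+1)}_{i_{N+1}}, \dots, X^{L+1}_k = D^{(L+1)}_{i_{L+1}}\right) \\
&\quad= \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L - N + 1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot p^{(N+1)}_{k, i_{N+1}}\, p_{i_{N+1}, i_{N+2}} \cdots p_{i_L, i_{L+1}},
\end{aligned}$$

and

$$P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) = \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L - N + 1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot \pi(i_{N+1})\, p_{i_{N+1}, i_{N+2}} \cdots p_{i_L, i_{L+1}}.$$

Therefore, comparing the two expressions, we obtain

$$\begin{aligned}
&\left| P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right| \\
&\quad\le \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L - N + 1)\mu}{\sigma\sqrt{L+1}} \le y\right) \left| p^{(N+1)}_{k, i_{N+1}} - \pi(i_{N+1}) \right| p_{i_{N+1}, i_{N+2}} \cdots p_{i_L, i_{L+1}} \\
&\quad\le A\rho^{N+1} \sum_{i_{N+1}=1}^{K} \cdots \sum_{i_{L+1}=1}^{K} P\left(\frac{D^{(N+1)}_{i_{N+1}} + \cdots + D^{(L+1)}_{i_{L+1}} - (L - N + 1)\mu}{\sigma\sqrt{L+1}} \le y\right) \cdot p_{i_{N+1}, i_{N+2}} \cdots p_{i_L, i_{L+1}} \\
&\quad\le AK\rho^{N+1},
\end{aligned}$$

where the second inequality follows from the condition (A2) and the last inequality is a result of relaxing the inner probability term $P(\cdot)$ to be one. For any $\epsilon > 0$, there exists $N_\epsilon$ such that $AK\rho^N \le \epsilon/3$ for $N > N_\epsilon$. Fixing $N$, it is easy to check that the following results hold:

$$\lim_{L \to \infty} P\left(\frac{\tilde{S}^L}{\sigma\sqrt{L+1}} \le y\right) = \lim_{L \to \infty} P\left(\frac{\sum_{i=1}^{N} \tilde{Y}^i + \tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) = \lim_{L \to \infty} P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right),$$

and, similarly,

$$\lim_{L \to \infty} P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) = \lim_{L \to \infty} P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right).$$

Therefore, from the above limiting results, there exists $L(\epsilon, N)$ such that for $L > L(\epsilon, N)$,

$$\left| P\left(\frac{\tilde{S}^L}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right| < \epsilon/3, \quad \text{and}$$
$$\left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) \right| < \epsilon/3.$$

Hence, for $L > L(\epsilon, N)$,

$$\begin{aligned}
\left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{S}^L}{\sigma\sqrt{L+1}} \le y\right) \right|
&\le \left| P\left(\frac{S^L_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) \right| \\
&\quad+ \left| P\left(\frac{\tilde{X}^{N+1}_k + \cdots + \tilde{X}^{L+1}_k}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) \right| \\
&\quad+ \left| P\left(\frac{\tilde{Y}^{N+1} + \cdots + \tilde{Y}^{L+1}}{\sigma\sqrt{L+1}} \le y\right) - P\left(\frac{\tilde{S}^L}{\sigma\sqrt{L+1}} \le y\right) \right| \\
&\le \epsilon/3 + AK\rho^{N+1} + \epsilon/3 \le \epsilon.
\end{aligned}$$

Since $\epsilon$ is chosen arbitrarily, it follows that $\frac{\tilde{S}^L}{\sigma\sqrt{L+1}}$ and $\frac{S^L_k}{\sigma\sqrt{L+1}}$ converge to the same distribution as $L$ goes to infinity. From (A3), the desired result follows.

Proof (Proposition 2) Consider first the case when the Markov chain $W$ has a stationary distribution. Let $\eta = b/(b + h)$. For any initial state $k$, the myopic base-stock level $\bar{s}(k)$ is the solution of

$$P\left(X^1_k + X^2_k + \cdots + X^{L+1}_k \le \bar{s}(k)\right) = \eta.$$

The above equation can be further written as

$$P\left(\frac{X^1_k + X^2_k + \cdots + X^{L+1}_k - (L+1)\mu}{\sqrt{L+1}\,\sigma} \le \frac{\bar{s}(k) - (L+1)\mu}{\sqrt{L+1}\,\sigma}\right) = \eta.$$

Let $L$ go to infinity. Since the left-hand side of the inequality inside the parentheses converges to $N(0, 1)$ in distribution, it follows that $\lim_{L \to \infty} \frac{\bar{s}(k) - (L+1)\mu}{\sqrt{L+1}\,\sigma} = \Phi^{-1}(\eta)$, where $\Phi(\cdot)$ is the standard normal distribution function. Hence, it follows that $\lim_{L \to \infty} \frac{\bar{s}(k) - (L+1)\mu}{(L+1)\sigma} = 0$, or, equivalently, $\lim_{L \to \infty} \frac{\bar{s}(k)}{(L+1)\sigma} = \frac{\mu}{\sigma}$. Now for different initial states $k$ and $k'$, we have

$$\sqrt{L+1}\,\frac{\bar{s}(k) - \bar{s}(k')}{\bar{s}(k)} = \frac{(\bar{s}(k) - \bar{s}(k'))/(\sqrt{L+1}\,\sigma)}{\bar{s}(k)/((L+1)\sigma)} = \frac{(\bar{s}(k) - (L+1)\mu)/(\sqrt{L+1}\,\sigma) - (\bar{s}(k') - (L+1)\mu)/(\sqrt{L+1}\,\sigma)}{\bar{s}(k)/((L+1)\sigma)},$$

where the numerator converges to zero and the denominator converges to a positive constant, as $L$ goes to infinity. Therefore, we conclude that, for any $k$ and $k'$,

$$\lim_{L \to \infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - \bar{s}(k')}{\bar{s}(k)} = 0,$$

which implies $\lim_{L \to \infty} \sqrt{L+1} \cdot \frac{\bar{s}(k) - s_{\min}}{s_{\min}} = 0$.

When the Markov chain $W$ is cyclic, for any given lead time $L$, we can write $L = mK + l$, where $l = L \bmod K$. The first $mK$ periods of aggregated demand of $D_k(L)$ and $D_{k'}(L)$ are exactly the same. As $L$ increases to infinity, the residual part is negligible compared with the first $mK$ periods of aggregated demand. Therefore, we can show that, for any states $k$ and $k'$, the lead time demand distributions $F_{k,L}(y)$ and $F_{k',L}(y)$ converge to the same limit distribution for every $y$ as $L$ goes to infinity. Hence, the same convergence rate result can be obtained. We omit the details here.

Proof (Proposition 3) By repeatedly applying the procedure we used to obtain (23), we can prove by induction that

$$\partial_y G^{[k]}_n(k, y) \ge \partial_y G^1_n(k, y) \ge -(b + H_{n+1}) + (b + H_n)F_{k, L'_n}(y) = \partial_y \tilde{G}_n(k, y \mid h_n).$$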
By applying (3) repeatedly, taking the derivative with respect to $y$, and applying Lemma 2, we have
\[
\begin{aligned}
\partial_y G_n^{[k]}(k,y) &= \partial_y G_n^{1}(k,y) + \sum_{i=1}^{[k]-1} \sum_{u \in U_n^{i+1}} p_{ku}\, \partial_y E\left\{ R_n^{i}(u, y - D_k) \right\} \\
&\le \partial_y G_n^{1}(k,y) + \sum_{i=1}^{[k]-1} \sum_{u \in U_n^{i+1}} p_{ku}\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y) \\
&\le \partial_y G_n^{1}(k,y) + \sum_{i=1}^{K-1} (1 - p_{kk})\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y),
\end{aligned}
\tag{A4}
\]
where the last inequality utilizes the facts that $[k] \le K$, the transition probabilities $p_{ku}$ sum up to 1, and $\partial_y R_n^{i}(u,y)$ is increasing and nonnegative.

For ease of exposition, denote $v_j = W_k(t + L'_n - L'_{j-1})$. By using the derivatives of (19) and (20), we have
\[
\begin{aligned}
\partial_y G_n^{1}(k,y) &= h_n + E\left\{ \partial_y G_{n-1,n}(v_n, y - D_k[t, t+L_n)) \right\} \\
&= h_n + E\left\{ \partial_y G_{n-1}^{[v_n]}\left(v_n, \min\{ y - D_k[t, t+L_n),\, s_{n-1}^{[v_n]}(v_n) \}\right) - \partial_y G_{n-1}^{[v_n]}\left(v_n, s_{n-1}^{[v_n]}(v_n)\right) \right\} \\
&= h_n + E\left\{ \partial_y G_{n-1}^{[v_n]}\left(v_n, \min\{ y - D_k[t, t+L_n),\, s_{n-1}^{[v_n]}(v_n) \}\right) \right\} \\
&\le h_n + E\left\{ \partial_y G_{n-1}^{[v_n]}(v_n, y - D_k[t, t+L_n)) \right\},
\end{aligned}
\tag{A5}
\]
where the inequality utilizes the fact that the derivative $\partial_y G_{n-1}^{[v_n]}(v_n, y)$ is increasing in $y$ (Chen and Song 2001). By applying (3) to (A5), we have
\[
\begin{aligned}
\partial_y G_n^{1}(k,y) &\le h_n + E\left\{ \partial_y G_{n-1}^{1}(v_n, y - D_k[t, t+L_n)) \right\}
+ E\left\{ \sum_{i=1}^{[v_n]-1} \sum_{u \in U_{n-1}^{i+1}} p_{v_n u}\, E_{D_{v_n}} \partial_y R_{n-1}^{i}(u, y - D_k[t, t+L_n) - D_{v_n}) \right\} \\
&\le h_n + E\left\{ \partial_y G_{n-1}^{1}(v_n, y - D_k[t, t+L_n)) \right\}
+ E\left\{ \sum_{i=1}^{[v_n]-1} \sum_{u \in U_{n-1}^{i+1}} p_{v_n u}\, \partial_y R_{n-1}^{i}(u, y - D_k[t, t+L_n)) \right\},
\end{aligned}
\tag{A6}
\]
where the last inequality follows from Lemma 1.

Recall that $\partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) = -(b + H_{n+1}) + (b + H_1) F_{k,L'_n}(y)$. By repeatedly applying the above techniques and using Lemma 2, we have
\[
\begin{aligned}
\partial_y G_n^{1}(k,y) &\le h_n + h_{n-1} + \cdots + h_2 + E\left\{ \partial_y G_1^{1}(v_2, y - D_k[t, t + L'_n - L_1)) \right\} \\
&\quad + E\left\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u \in U_j^{i+1}} p_{v_{j+1} u}\, \partial_y R_j^{i}(u, y - D_k[t, t + L'_n - L'_j)) \right\} \\
&= \partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + E\left\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u \in U_j^{i+1}} p_{v_{j+1} u}\, \partial_y R_j^{i}(u, y - D_k[t, t + L'_n - L'_j)) \right\} \\
&\le \partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + E\left\{ \sum_{j=1}^{n-1} \sum_{i=1}^{[v_{j+1}]-1} \sum_{u \in U_j^{i+1}} p_{v_{j+1} u}\, \beta_i(m_j^*)\, \partial_y G_j^{i}(i^*, y - D_k[t, t + L'_n - L'_j)) \right\} \\
&\le \partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + E\left\{ \sum_{j=1}^{n-1} \sum_{i=1}^{K-1} \beta_i(m_j^*)\, \partial_y G_j^{i}(i^*, y - D_k[t, t + L'_n - L'_j)) \right\},
\end{aligned}
\tag{A7}
\]
where the equality follows from $v_2 = W_k(t + L'_n - L_1)$ and $E\{F_{v_2,L_1}(y - D_k[t, t + L'_n - L_1))\} = F_{k,L'_n}(y)$. Combining (A4) and (A7) leads to
\[
\partial_y G_n^{[k]}(k,y) \le \partial_y \tilde{G}_n(k, y \mid h_{[1,n]}) + M_n(k,y),
\]
where
\[
M_n(k,y) = \sum_{j=1}^{n-1} \sum_{i=1}^{K-1} \beta_i(m_j^*)\, E\left\{ \partial_y G_j^{i}(i^*, y - D_k[t, t + L'_n - L'_j)) \right\} + \sum_{i=1}^{K-1} (1 - p_{kk})\, \beta_i(m_n^*)\, \partial_y G_n^{i}(i^*, y),
\]
with $m_j^*$ being the optimal sequence in the $j$th stage. With the same analysis as in the proof of Proposition 1, we can further relax $M_n(k,y)$ as follows:
\[
M_n(k,y) \le \sum_{j=1}^{n-1} E\left\{ \sum_{i=1}^{K-1} \alpha_i(m_j^*) \left( \partial_y G_j^{1}(i^*, y - D_k[t, t + L'_n - L'_j)) \right)^+ \right\} + (1 - p_{kk}) \sum_{i=1}^{K-1} \alpha_i(m_n^*) \left( \partial_y G_n^{1}(i^*, y) \right)^+ ,
\tag{A8}
\]
where $\alpha_i(m_j^*)$ can be determined recursively by (10). From (A7) and Lemma 2, we also have
\[
\begin{aligned}
\left( \partial_y G_j^{1}(i^*, y) \right)^+ &\le \max\left\{ 0,\; g_j(i^*, y) + \sum_{s=1}^{j-1} E\left\{ \sum_{l=1}^{K-1} \beta_l(m_s^*)\, \partial_y G_s^{l}(l^*, y - D_{i^*}[t, t + L'_j - L'_s)) \right\} \right\} \\
&\le \left( g_j(i^*, y) + \sum_{s=1}^{j-1} E\left\{ \sum_{l=1}^{K-1} \alpha_l(m_s^*) \left( \partial_y G_s^{1}(l^*, y - D_{i^*}[t, t + L'_j - L'_s)) \right)^+ \right\} \right)^+ .
\end{aligned}
\tag{A9}
\]
Combining (A8) and (A9), we can verify that
\[
M_n(k,y) \le (1 - p_{kk})\, \Delta_n(y) + \sum_{j=1}^{n-1} \bar{\Delta}_{j,n}(k,y),
\]
where $\Delta_n(y)$ and $\bar{\Delta}_{j,n}(k,y)$ are defined in Proposition 3. We arrive at the desired result.

Proof (Corollary 3) The proof follows the same argument as in the proof for Corollary 1 and is omitted here.

Proof (Corollary 4) The proof follows the same argument as in the proof for Corollary 2 and is omitted here.

Proof (Proposition 4) For ease of exposition, we present a proof for N = 2.
The general case with $N > 2$ can be shown following the same argument.

Consider first the case when the Markov chain $W$ has a stationary distribution. When $L'_1$ goes to infinity, we can apply Proposition 2 to obtain the result for Stage 1. Thus, we only need to check Stage 2. When $L'_2$ goes to infinity, either $L_1$ or $L_2$ goes to infinity. For brevity, we shall consider below the case of $L_1$ going to infinity and $L_2$ staying finite. The other cases can be shown similarly. By Proposition 3, $\bar{s}_2(k)$ is the solution to
\[
-(b + H_3) + (b + H_2) F_{k,L'_2}(y) = 0, \quad \text{or equivalently,} \quad F_{k,L'_2}(\bar{s}_2(k)) = \frac{b + H_3}{b + H_2}.
\]
With a similar argument to that used in the proof of Proposition 2, we have
\[
\lim_{L'_2 \to \infty} \frac{\bar{s}_2(k) - (L'_2 + 1)\mu}{\sqrt{L'_2 + 1}\,\sigma} = \Phi^{-1}(\eta),
\]
where $\eta = (b + H_3)/(b + H_2)$.

We next establish a further relaxation of the derivative upper bound given in Proposition 3. Define the following functions recursively:
\[
\hat{\Delta}_n(y) = \bar{\alpha} \sum_{k=1}^{K} \left[ (b + H_1) F_{k,L'_n}(y) + \sum_{j=1}^{n-1} \tilde{\Delta}_{j,n}(k,y) \right],
\qquad
\tilde{\Delta}_{j,n}(k,y) = E\left\{ \hat{\Delta}_j\left( y - D_k[t, t + L'_n - L'_j) \right) \right\},
\]
where $\bar{\alpha} = \max_{i \in \{1,\ldots,K-1\},\, m \in S} \alpha_i(m)$. It is easy to verify that $\Delta_2(y) \le \hat{\Delta}_2(y)$ and $\bar{\Delta}_{1,2}(k,y) \le \tilde{\Delta}_{1,2}(k,y)$. Then, we have
\[
\partial_y G_2^{[k]}(k,y) \le \partial_y \tilde{G}_2(k, y \mid h_{[1,2]}) + (1 - p_{kk})\, \hat{\Delta}_2(y) + \tilde{\Delta}_{1,2}(k,y).
\]
Therefore, we obtain a relaxed solution lower bound, denoted by $s'_2(k)$, from solving the equation
\[
\partial_y \tilde{G}_2(k, y \mid h_{[1,2]}) + (1 - p_{kk})\, \hat{\Delta}_2(y) + \tilde{\Delta}_{1,2}(k,y) = 0.
\]
Expanding the terms $\hat{\Delta}_2(y)$ and $\tilde{\Delta}_{1,2}(k,y)$ based on their definitions yields
\[
\begin{aligned}
\hat{\Delta}_1(y) &= \bar{\alpha} \sum_{k=1}^{K} (b + H_1) F_{k,L_1}(y), \\
\tilde{\Delta}_{1,2}(k,y) &= E\left\{ \hat{\Delta}_1\left( y - D_k[t, t + L'_2 - L'_1) \right) \right\}
= \bar{\alpha}(b + H_1)\, E\left\{ \sum_{l=1}^{K} F_{l,L_1}\left( y - D_k[t, t + L_2) \right) \right\} \\
&= \bar{\alpha}(b + H_1) \sum_{l=1}^{K} P\left( D_l[t, t + L_1] + D_k[t, t + L_2) \le y \right), \\
\hat{\Delta}_2(y) &= \bar{\alpha} \sum_{k=1}^{K} \left[ (b + H_1) F_{k,L'_2}(y) + \tilde{\Delta}_{1,2}(k,y) \right] \\
&= \bar{\alpha}(b + H_1) \sum_{k=1}^{K} F_{k,L'_2}(y) + \bar{\alpha}^2 (b + H_1) \sum_{k=1}^{K} \sum_{l=1}^{K} P\left( D_l[t, t + L_1] + D_k[t, t + L_2) \le y \right).
\end{aligned}
\]
Therefore, $s'_2(k)$ is the solution to the following equation:
\[
\begin{aligned}
& (b + H_1) F_{k,L'_2}(y) + (1 - p_{kk})\, \bar{\alpha} (b + H_1) \sum_{k=1}^{K} F_{k,L'_2}(y) \\
& \quad + (1 - p_{kk})\, \bar{\alpha}^2 (b + H_1) \sum_{k=1}^{K} \sum_{l=1}^{K} P\left( D_l[t, t + L_1] + D_k[t, t + L_2) \le y \right) \\
& \quad + \bar{\alpha} (b + H_1) \sum_{l=1}^{K} P\left( D_l[t, t + L_1] + D_k[t, t + L_2) \le y \right) = b + H_3,
\end{aligned}
\]
where the left-hand side involves a linear combination of the lead time demand distributions with the same time length. From Lemma 3, we have
\[
\frac{D_k[t, t + L'_2] - (L'_2 + 1)\mu}{\sqrt{L'_2 + 1}\,\sigma} \xrightarrow[L'_2 \to \infty]{d} N(0,1).
\]
Also recall that $L_1$ goes to infinity and $L_2$ stays finite, as $L'_2$ goes to infinity. It is straightforward to verify that
\[
\frac{D_l[t, t + L_1] + D_k[t, t + L_2) - (L'_2 + 1)\mu}{\sqrt{L'_2 + 1}\,\sigma} \xrightarrow[L'_2 \to \infty]{d} N(0,1).
\]
With a similar argument to that used in the proof of Proposition 2, we have
\[
\lim_{L'_2 \to \infty} \frac{s'_2(k) - (L'_2 + 1)\mu}{\sqrt{L'_2 + 1}\,\sigma} = \Phi^{-1}(\eta'),
\]
where $\eta' = (b + H_3)/[(b + H_1)(1 + \bar{\alpha} K (2 - p_{kk}) + \bar{\alpha}^2 K^2 (1 - p_{kk}))]$. It is easy to verify that $\eta' < \eta$. Therefore, we have, for any $k$ and $k'$,
\[
\begin{aligned}
\sqrt{L'_2 + 1} \cdot \frac{\bar{s}_2(k) - s'_2(k')}{s'_2(k')}
&= \sqrt{L'_2 + 1} \cdot \frac{(\bar{s}_2(k) - s'_2(k'))/(\sqrt{L'_2 + 1}\,\sigma)}{s'_2(k')/(\sqrt{L'_2 + 1}\,\sigma)} \\
&= \frac{(\bar{s}_2(k) - (L'_2 + 1)\mu)/(\sqrt{L'_2 + 1}\,\sigma) - (s'_2(k') - (L'_2 + 1)\mu)/(\sqrt{L'_2 + 1}\,\sigma)}{(L'_2 + 1)^{-1/2} \left[ (s'_2(k') - (L'_2 + 1)\mu)/(\sqrt{L'_2 + 1}\,\sigma) \right] + \mu/\sigma} \\
&\xrightarrow[L'_2 \to \infty]{} \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\end{aligned}
\]
This implies that
\[
\lim_{L'_2 \to \infty} \sqrt{L'_2 + 1} \cdot \frac{\bar{s}_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')} = \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\]
Since $\underline{s}_2(k) \ge s'_2(k)$, by the definition of $s_2^{\min}$, we have
\[
0 \le \frac{\bar{s}_2(k) - s_2^{\min}}{s_2^{\min}} \le \frac{\bar{s}_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')}.
\]
Hence,
\[
0 \le \limsup_{L'_2 \to \infty} \sqrt{L'_2 + 1} \cdot \frac{\bar{s}_2(k) - s_2^{\min}}{s_2^{\min}}
\le \lim_{L'_2 \to \infty} \sqrt{L'_2 + 1} \cdot \frac{\bar{s}_2(k) - \min_{k'} s'_2(k')}{\min_{k'} s'_2(k')}
= \frac{\Phi^{-1}(\eta) - \Phi^{-1}(\eta')}{\mu/\sigma}.
\]
Therefore, we arrive at
\[
\limsup_{L'_2 \to \infty} \sqrt{L'_2 + 1} \cdot \frac{\bar{s}_2(k) - s_2^{\min}}{s_2^{\min}} = c,
\]
where $c$ is a nonnegative constant.

When the Markov chain $W$ is cyclic, we can apply the same argument as used in the proof of Proposition 2 to obtain the convergence result. We omit the details here.
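The limiting constant in the proof above can be evaluated numerically once the primitives are fixed. The following is a minimal sketch, not part of the original analysis: the function name `prop4_limit_bound` and all numerical inputs (including the value used for $\bar{\alpha}$ and the asymptotic moments $\mu$ and $\sigma$) are hypothetical and chosen only for illustration. It evaluates $\eta$, $\eta'$, and the upper bound $(\Phi^{-1}(\eta) - \Phi^{-1}(\eta'))/(\mu/\sigma)$ on the constant $c$ for the case $N = 2$.

```python
from statistics import NormalDist


def prop4_limit_bound(b, H1, H2, H3, alpha_bar, K, p_kk, mu, sigma):
    """Evaluate eta, eta' and the bound (Phi^{-1}(eta) - Phi^{-1}(eta')) / (mu / sigma).

    All arguments are illustrative inputs supplied by the user; alpha_bar stands
    for the maximum of alpha_i(m) used in the proof of Proposition 4.
    """
    eta = (b + H3) / (b + H2)
    # Factor multiplying (b + H1) in the definition of eta'.
    inflation = 1 + alpha_bar * K * (2 - p_kk) + alpha_bar**2 * K**2 * (1 - p_kk)
    eta_prime = (b + H3) / ((b + H1) * inflation)
    z = NormalDist().inv_cdf  # standard normal quantile function
    bound = (z(eta) - z(eta_prime)) / (mu / sigma)
    return eta, eta_prime, bound


# Hypothetical two-stage instance: h1 = h2 = 1, so H1 = 2, H2 = 1, H3 = 0.
eta, eta_prime, bound = prop4_limit_bound(
    b=9.0, H1=2.0, H2=1.0, H3=0.0, alpha_bar=0.5, K=3, p_kk=0.4, mu=1.0, sigma=1.0
)
print(f"eta = {eta:.4f}, eta' = {eta_prime:.4f}, bound on c = {bound:.4f}")
```

Because $\eta' < \eta$ whenever the inflation factor exceeds one, the returned bound is nonnegative, in line with the statement that $c$ is a nonnegative constant.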
Appendix B: A Two-State Example for the Single-Stage System

To illustrate our bounds and heuristic solutions, let us consider a single-stage system with a two-state MMD demand process. For ease of exposition, we write the transition matrix of $W$ as
\[
P = \begin{pmatrix} 1-p & p \\ q & 1-q \end{pmatrix}.
\]
Its stationary distribution is $\pi = (q/(p+q),\, p/(p+q))$. According to Corollary 1, the solution upper bounds are just the myopic base-stock levels $\bar{s}(k)$, which solve (2). Without loss of generality, we assume $\bar{s}(1) < \bar{s}(2)$. Thus, the optimal solution for state 1 is given by $s^*(1) = \bar{s}(1) = s^{\min} = \underline{s}(1) = s^a(1)$. It remains to determine the solution lower bound for state 2, $\underline{s}(2)$, and the heuristic solution $s^a(k)$.

Given the two demand states, there are two possible state sequences: $m_1 = (1,2)$ and $m_2 = (2,1)$. It is easy to verify that $P^{(1)}_{m_1} = 1 - p$ and $P^{(1)}_{m_2} = 1 - q$, so that $\beta_1(m_1) = \alpha_1(m_1) = 1/p$ and $\beta_1(m_2) = \alpha_1(m_2) = 1/q$. From Corollary 1, $\underline{s}(2)$ is the solution of (12), which in this case reduces to
\[
q \cdot \partial_y G(1,y) + p \cdot \partial_y G(2,y) = 0.
\]
This is equivalent to
\[
\frac{q}{p+q} \cdot F_{1,L}(y) + \frac{p}{p+q} \cdot F_{2,L}(y) = \frac{b}{b+h}, \quad \text{or} \quad F_{\pi,L}(y) = \frac{b}{b+h}.
\]
Comparing the above equation with (2), we can see that $\underline{s}(2)$ is a myopic base-stock level with an "effective" lead time demand distribution, which is a weighted average of the lead time demand distributions of states 1 and 2 (or the "stationary" lead time demand distribution).

Next, we illustrate how our heuristic solution works. With the assumption $\bar{s}(1) < \bar{s}(2)$, the heuristic sequence is $\tilde{m} = (1,2)$. Therefore, by equation (13), $s^a(2)$ is given by
\[
\partial_y G(2,y) + \frac{q}{2p} \cdot F_2(y - \bar{s}(1)) \left( \partial_y G(1,y) \right)^+ = 0,
\]
which is equivalent to
\[
F_{2,L}(y) + \frac{q}{2p} \cdot F_2(y - \bar{s}(1)) \left( F_{1,L}(y) - \frac{b}{b+h} \right)^+ = \frac{b}{b+h}.
\tag{A10}
\]
To gain further insight, we next assume that the demand distribution for state $k$ is exponential with mean $1/\theta_k$, with $\theta_1 > \theta_2$. We also assume $L = 0$ to allow for explicit expressions.
This yields $F_k(y) = F_{k,0}(y) = 1 - e^{-\theta_k y}$, $\partial_y G(k,y) = h - (h+b) e^{-\theta_k y}$, $\bar{s}(k) = \ln(1 + b/h)/\theta_k$ for $k = 1, 2$, and $\bar{s}(1) < \bar{s}(2)$. Thus,
\[
s^*(1) = \bar{s}(1) = \underline{s}(1) = s^a(1) = \ln(1 + b/h)/\theta_1 .
\]
To determine the optimal solution for state 2, note that $s^*(2) > s^*(1)$. From (3), using the Laplace transform technique described in Appendix C, we can show that $s^*(2)$ is determined by the following first-order condition:
\[
\partial_y G^{[2]}(2,y) = h - (h+b) e^{-\theta_2 y} + \frac{qh}{p} \cdot \left( 1 - \frac{p\theta_1 e^{-\theta_2 (y - s^*(1))} - \theta_2 e^{-p\theta_1 (y - s^*(1))}}{p\theta_1 - \theta_2} \right) = 0.
\tag{A11}
\]
From (A10), we can show that the heuristic $s^a(2)$ solves the following equation: for $y \ge s^*(1)$,
\[
h - (h+b) e^{-\theta_2 y} + \frac{qh}{p} \cdot \frac{\left( 1 - e^{-\theta_1 (y - s^*(1))} \right) \left( 1 - e^{-\theta_2 (y - s^*(1))} \right)}{2} = 0.
\tag{A12}
\]
The difference between the above two first-order conditions lies in the last terms, which can be approximated respectively as follows by using a Taylor expansion to the second order:
\[
1 - \frac{p\theta_1 e^{-\theta_2 (y - s^*(1))} - \theta_2 e^{-p\theta_1 (y - s^*(1))}}{p\theta_1 - \theta_2} \approx \frac{1}{2}\, p\, \theta_1 \theta_2\, (y - s^*(1))^2 + o\left( (y - s^*(1))^2 \right),
\]
and
\[
\frac{\left( 1 - e^{-\theta_1 (y - s^*(1))} \right) \left( 1 - e^{-\theta_2 (y - s^*(1))} \right)}{2} \approx \frac{1}{2}\, \theta_1 \theta_2\, (y - s^*(1))^2 + o\left( (y - s^*(1))^2 \right).
\]
Therefore, $s^a(2)$ is a close approximation to $s^*(2)$ to the second order if $p$ is close to one. This condition implies that there is a high likelihood that the demand will stay in the high state; even if it transitions back to state 1 in one period, it will jump to state 2 with a high probability in the next period. The following proposition further characterizes how $s^a(2)$ and $s^*(2)$ are affected by the Markov chain transition probability $q$:

Proposition 5 Under a two-state MMD with exponential density and zero lead time, both the optimal solution $s^*(2)$ and the heuristic $s^a(2)$ are decreasing in the transition probability $q$.

Proof Denote $s^* = s^*(1)$ for simplification. Because $s^*(2) \ge s^*$, we can introduce the following change of variable: $y = s^* + x$ with $x \ge 0$.
Substituting this into (A11), we obtain
\[
\begin{aligned}
\partial_x G^{2}(2, s^* + x) &= \frac{p+q}{p}\, h - \left( h + b + h\, \frac{q\theta_1}{p\theta_1 - \theta_2}\, e^{\theta_2 s^*} \right) e^{-\theta_2 (x + s^*)} + h\, \frac{q\theta_2 \cdot e^{p\theta_1 s^*}}{p^2\theta_1 - p\theta_2}\, e^{-p\theta_1 (x + s^*)} \\
&= \frac{p+q}{p}\, h - (h+b) e^{-\theta_2 (x + s^*)} - h\, \frac{q\theta_1}{p\theta_1 - \theta_2}\, e^{-\theta_2 x} + h\, \frac{q\theta_2}{p^2\theta_1 - p\theta_2}\, e^{-p\theta_1 x}.
\end{aligned}
\]
By substituting $s^* = \ln(1 + b/h)/\theta_1$ into the above equation, we have
\[
\partial_x G^{2}(2, s^* + x) = \frac{p+q}{p}\, h - (h+b) \left( \frac{h}{h+b} \right)^{\theta_2/\theta_1} e^{-\theta_2 x} - h\, \frac{q\theta_1}{p\theta_1 - \theta_2}\, e^{-\theta_2 x} + h\, \frac{q\theta_2}{p^2\theta_1 - p\theta_2}\, e^{-p\theta_1 x}.
\]
Taking the derivative of the above expression with respect to $q$, we obtain
\[
\partial_q \left( \partial_x G^{2}(2, s^* + x) \right) = \frac{h}{p} - h\, \frac{\theta_1}{p\theta_1 - \theta_2}\, e^{-\theta_2 x} + h\, \frac{\theta_2}{p^2\theta_1 - p\theta_2}\, e^{-p\theta_1 x}.
\]
Note that $\partial_q (\partial_x G^{2}(2, s^* + x))|_{x=0} = 0$, $\partial_q (\partial_x G^{2}(2, s^* + x))|_{x=+\infty} = h/p > 0$, and
\[
\partial_x \left( \partial_q\, \partial_x G^{2}(2, s^* + x) \right) = h\, \frac{\theta_1 \theta_2}{p\theta_1 - \theta_2} \left( e^{-\theta_2 x} - e^{-p\theta_1 x} \right) \ge 0 \quad \text{for } x \ge 0.
\]
Therefore, we must have $\partial_q (\partial_x G^{2}(2, s^* + x)) \ge 0$ for $x \ge 0$, which implies that $s^*(2)$ decreases as $q$ increases.

Next we examine our heuristic. Since $s^a(2) \ge s^*$, we can also apply the above change of variable to (A12). Therefore, $s^a(2)$ is given by the solution to
\[
h - (h+b) e^{-\theta_2 (x + s^*)} + \frac{qh}{p} \cdot \frac{(1 - e^{-\theta_1 x})(1 - e^{-\theta_2 x})}{2} = 0.
\]
Clearly the left-hand side of the above equation is increasing in $q$, which implies that our heuristic solution for state 2 also decreases as $q$ increases.

Appendix C: Gamma Demand Distribution and Laplace Transform

Recall from (3) and (4) that the exact algorithm requires convolutions involving the demand distributions $F_k$. In this section, we assume the demand distributions under different demand states all belong to the gamma distribution family, with the probability density function given by (28). We show that, in this case, the optimal policy can be computed relatively easily by leveraging the Laplace transform and its inverse.

Denote the Laplace transform of $\partial_y R^i(k,y)$ by
\[
r^i(k, \lambda) = \int_0^\infty e^{-\lambda y}\, dR^i(k, y),
\]
and denote the Laplace transform of $\partial_y G^i(i^*, y)$ by
\[
g^i(i^*, \lambda) = \int_0^\infty e^{-\lambda y}\, dG^i(i^*, y).
\]
Given the optimal demand state sequence $m^* = (1^*, \ldots, K^*)$, define
\[
\mathbf{r}^i(\lambda) = [r^i(1^*, \lambda), \ldots, r^i(i^*, \lambda)]^T, \qquad \mathbf{g}^i(\lambda) = [0, \ldots, 0, g^i(i^*, \lambda)]^T .
\]
Because the Laplace transform of a gamma density $f(x \mid n_k, \theta_k)$ is given by $\theta_k^{n_k}/(\lambda + \theta_k)^{n_k}$, the Laplace transform of $D^{(i)}_{m^*}(x)$ defined in (6) is given by
\[
d^{(i)}_{m^*}(\lambda) =
\begin{pmatrix}
\dfrac{\theta_{1^*}^{n_{1^*}}}{(\lambda + \theta_{1^*})^{n_{1^*}}} & \cdots & 0 \\
\vdots & \ddots & \vdots \\
0 & \cdots & \dfrac{\theta_{i^*}^{n_{i^*}}}{(\lambda + \theta_{i^*})^{n_{i^*}}}
\end{pmatrix}.
\]
Under the Laplace transform, and noting that $I^{(i)} - d^{(i)}_{m^*}(\lambda) P^{(i)}_{m^*}$ is strictly diagonally dominant, (7) can be simplified to
\[
\mathbf{r}^i(\lambda) = \left( I^{(i)} - d^{(i)}_{m^*}(\lambda) P^{(i)}_{m^*} \right)^{-1} \mathbf{g}^i(\lambda).
\]
Here, because $d^{(i)}_{m^*}(\lambda)$ has a simple form, the inverse Laplace transform of $\mathbf{r}^i(\lambda)$ can be fairly easily determined by a Laplace transform table or by standard software packages such as Matlab. Thus, we can obtain $\partial_y R^i(k,y)$ from the inverse Laplace transform, and substitute it into (3) to compute the optimal policy.

It is worth emphasizing here that the computation of the solution bounds and heuristic in Corollary 2 does not require the function $\partial_y R^i(k,y)$; it only involves evaluation of simple newsvendor derivative functions. Thus, unlike the optimal policy, the solution bounds and heuristic proposed in this paper can be computed for other general demand distributions.

Appendix D: Single-Stage Approximation under Stationary Demand

Shang and Song (2003) established solution bounds for a serial inventory system with i.i.d. demand, which is a special case of our problem. Our results relate to theirs as follows. Let $\Omega_N = (N, (h_i, L_i)_{i=1}^N, b, D)$ denote the original $N$-stage serial system. Shang and Song (2003) showed that under i.i.d. demand, the echelon $n$ problem is equivalent to an $n$-stage serial system $\Omega_n = \left( n, (h_i, L_i)_{i=1}^n, b + \sum_{i=n+1}^N h_i, D \right)$. That is, the echelon $n$ problem is the same as the original $N$-stage system truncated at stage $n$, except that the backorder cost rate is modified to be $b + \sum_{i=n+1}^N h_i$. They also showed that (the cost of) $\Omega_n$ is bounded by a lower bound system $\Omega_n^l = \left( n, (\underline{h}_i, L_i)_{i=1}^n, b + \sum_{i=n+1}^N h_i, D \right)$, with $\underline{h}_n = h_n$ and $\underline{h}_i = 0$ for $i < n$, and an upper bound system $\Omega_n^u = \left( n, (\bar{h}_i, L_i)_{i=1}^n, b + \sum_{i=n+1}^N h_i, D \right)$, with $\bar{h}_n = \sum_{i=1}^n h_i$ and $\bar{h}_i = 0$ for $i < n$. They further showed that the optimal solution to the lower (upper) bound system is an upper (lower) bound for the optimal echelon base-stock solution of the original system. Because the two bounding systems are equivalent to single-stage systems, where the myopic base-stock level is optimal, their solution bounds are simple single-stage myopic base-stock levels, and one can regard these solutions as single-stage approximations.

We can adapt the same two bounding systems to our problem with MMD. Specifically, the derivative of the single-period cost function for demand state $k$ in the lower bound system $\Omega_n^l$ is given by
\[
\partial_y G_n(k, y \mid \Omega_n^l) = -\left( b + \sum_{i=n+1}^N h_i \right) + \left( b + \sum_{i=n+1}^N h_i + \underline{h}_n \right) F_{k,L'_n}(y) = -(b + H_{n+1}) + (b + H_n) F_{k,L'_n}(y).
\tag{A13}
\]
And its counterpart in the upper bound system $\Omega_n^u$ is given by
\[
\partial_y G_n(k, y \mid \Omega_n^u) = -\left( b + \sum_{i=n+1}^N h_i \right) + \left( b + \sum_{i=n+1}^N h_i + \bar{h}_n \right) F_{k,L'_n}(y) = -(b + H_{n+1}) + (b + H_1) F_{k,L'_n}(y).
\tag{A14}
\]
Observe that (A13) is exactly the same as the derivative lower bound in Proposition 3(a). On the other hand, (A14) is only the first term in the derivative upper bound in Proposition 3(b). Under the i.i.d. demand process, i.e., $K = 1$, the $\Delta$ terms in Proposition 3(b) vanish, and we recover the solution bound results in Shang and Song (2003). However, their bounding approach cannot be fully extended to the general MMD process: while the myopic base-stock level of the lower bound system remains an upper bound for the optimal solution of the original system, the myopic base-stock level of the upper bound system is no longer guaranteed to be a lower bound (see Table 1 for the counterexamples marked with a "†" sign).
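For the i.i.d. special case ($K = 1$), the two single-stage approximations from (A13) and (A14) reduce to newsvendor fractiles of the lead time demand distribution. The sketch below is illustrative only and rests on several assumptions that are not taken from the paper: per-period demand is gamma with an integer shape (so that lead time demand over $L'_n + 1$ periods is Erlang), $L'_n = L_1 + \cdots + L_n$, $H_j = h_j + \cdots + h_N$, and the helper names `erlang_cdf`, `solve_fractile`, and `echelon_bounds` are hypothetical.

```python
import math


def erlang_cdf(y, shape, rate):
    """CDF of a gamma distribution with integer shape (Erlang) and given rate."""
    if y <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= rate * y / j  # (rate * y)^j / j!
        total += term
    return 1.0 - math.exp(-rate * y) * total


def solve_fractile(target, shape, rate):
    """Solve erlang_cdf(y) = target for y by bisection (0 < target < 1)."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, shape, rate) < target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, shape, rate) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def echelon_bounds(n, h, lead_times, b, shape, rate):
    """Return (lower, upper) bounds on the echelon-n base-stock level, i.i.d. demand.

    upper solves (A13) = 0: F(y) = (b + H_{n+1}) / (b + H_n)
    lower solves (A14) = 0: F(y) = (b + H_{n+1}) / (b + H_1)
    """
    def H(j):  # H_j = h_j + ... + h_N (zero when j exceeds N)
        return sum(h[j - 1:])

    periods = sum(lead_times[:n]) + 1          # L'_n + 1 periods of demand
    ld_shape = periods * shape                 # sum of i.i.d. gammas is gamma
    upper = solve_fractile((b + H(n + 1)) / (b + H(n)), ld_shape, rate)
    lower = solve_fractile((b + H(n + 1)) / (b + H(1)), ld_shape, rate)
    return lower, upper


# Hypothetical two-stage instance with gamma(2, 0.2) demand per period.
lo2, up2 = echelon_bounds(n=2, h=[1.0, 1.0], lead_times=[1, 1], b=9.0, shape=2, rate=0.2)
print(f"echelon 2: lower bound {lo2:.2f}, upper bound {up2:.2f}")
```

For $n = 1$ the two fractiles coincide, so the bounds collapse to the single myopic base-stock level; with exponential demand and zero lead time this gives $\ln(1 + b/h)/\theta$, matching the expression used in Appendix B.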
Appendix E: Additional Numerical Results

Five-State Demand Cases

As a robustness check, we conduct an additional numerical study for two five-state demand cases. Each of the five states follows a gamma distribution, with parameters given by $(n_i, \theta_i) = (1, 1), (2, 2), (3, 3), (2, 0.2)$, and $(2, 0.2)$ for $i = 1, 2, 3, 4, 5$, representing demand means of 1, 1, 1, 10, and 10, respectively. We set the demand distributions for states 4 and 5 to be the same in these two cases, to illustrate the fact that the optimal echelon base-stock levels for these two states may be different (because the lead time demand distributions starting from these two states may be different under MMD). The transition matrices for these two cases are given as follows:

Cyclic demand:
\[
\begin{pmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0
\end{pmatrix},
\]
Noncyclic demand:
\[
\begin{pmatrix}
0.10 & 0.10 & 0.20 & 0.40 & 0.20 \\
0.35 & 0.10 & 0.20 & 0.10 & 0.25 \\
0.10 & 0.20 & 0.20 & 0.10 & 0.40 \\
0.50 & 0.05 & 0.25 & 0.15 & 0.05 \\
0.30 & 0.05 & 0.25 & 0.30 & 0.10
\end{pmatrix}.
\]
We set the cost parameters to be the same as in our base case. Table 3 below presents the detailed numerical results for the two five-state demand cases. As we can see, the numerical insights are essentially the same as those observed in the three-state demand cases (Table 1).

Internal Fill Rates

We design an additional numerical study to shed light on the internal fill rate under the optimal and heuristic policies (i.e., $s^*_n(k)$, $s^v_n(k)$, and $s^a_n(k \mid h_{[1,n]})$). For this purpose, we simulate the two-stage inventory system for 50,000 periods to measure the realized fill rates at both stages. In each period, we measure the fill rate (= fulfilled order quantity/placed order quantity) at each stage and then average the fill rates across 50,000 periods. We present the results for the base case of three-state MMD in Table 4. The numerical insights from the five-state demand cases are essentially the same, and we omit them for brevity.
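The demand input to such a simulation can be generated directly from the five-state specification above. The following is a minimal sketch of an MMD demand generator only; it does not simulate the two-stage inventory system or reproduce the reported fill rates. The function name `simulate_mmd`, the starting state, the seed, and the timing convention (demand is drawn from the current state before the chain transitions) are assumptions made for illustration.

```python
import random
from statistics import fmean

# Transition matrices and gamma parameters (shape n_i, rate theta_i) of the five-state cases.
CYCLIC = [
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
]
NONCYCLIC = [
    [0.10, 0.10, 0.20, 0.40, 0.20],
    [0.35, 0.10, 0.20, 0.10, 0.25],
    [0.10, 0.20, 0.20, 0.10, 0.40],
    [0.50, 0.05, 0.25, 0.15, 0.05],
    [0.30, 0.05, 0.25, 0.30, 0.10],
]
GAMMA_PARAMS = [(1, 1.0), (2, 2.0), (3, 3.0), (2, 0.2), (2, 0.2)]


def simulate_mmd(P, params, periods, start_state=0, seed=0):
    """Simulate states and demands of a Markov-modulated gamma demand process."""
    rng = random.Random(seed)
    state = start_state
    states, demands = [], []
    for _ in range(periods):
        shape, rate = params[state]
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        demands.append(rng.gammavariate(shape, 1.0 / rate))
        states.append(state)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return states, demands


states, demands = simulate_mmd(CYCLIC, GAMMA_PARAMS, periods=50_000)
print(f"average demand per period (cyclic case): {fmean(demands):.2f}")
```

Under the cyclic chain each state is visited equally often, so the long-run average demand should be close to $(1 + 1 + 1 + 10 + 10)/5 = 4.6$, which offers a quick sanity check of the generator.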
From Table 4, we observe that the fill rates under the enhanced heuristic $s^a_n(k \mid h_{[1,n]})$ are close to those under the optimal policy $s^*_n(k)$, except for the cyclic demand case with lead times $(L_1, L_2) = (1, 1)$. In this case, as can be seen from Table 1, the Stage 2 heuristic policy is the same across different states (because the lead time demand distributions are the same across different states), whereas the optimal policy varies depending on the state. This explains why the Stage 2 fill rate under the heuristic policies is significantly lower than that under the optimal policy.

We observe that, under the optimal policy, the Stage 2 fill rate need not be high, especially when the Stage 1 lead time is relatively long. However, the difference between the internal and external fill rates is not as significant as what is observed under the i.i.d. demand. The behavior of the system also differs depending on whether $W$ is cyclic. In the noncyclic demand case with lead times $(L_1, L_2) = (3, 1)$, the realized fill rate at Stage 2 is only 78%. However, when Stage 2 has longer lead time than Stage 1, i.e., $(L_1, L_2) = (1, 3)$, the realized fill rate at Stage 2 is 98% under cyclic demand and 93% under noncyclic demand. This observation suggests that the high internal fill rate assumption under the decoupling heuristic holds when the downstream lead time is shorter than the upstream lead time (e.g., Abhyankar and Graves 2001). However, this assumption may be problematic when the downstream lead time is longer than the upstream lead time.

Table 3   Comparison of policies in a two-stage system under the five-state demand cases.
Columns for each demand case: s* = $s^*_n(k)$, s_lo = $\underline{s}_n(k)$, s_up = $\bar{s}_n(k)$, s^v = $s^v_n(k)$, s^a = $s^a_n(k)$.

                                     Cyclic demand                          Noncyclic demand
(L1,L2) n  k                  s*    s_lo    s_up     s^v     s^a |      s*    s_lo    s_up     s^v     s^a
(1,1)   1  1                4.47    4.03    4.63   4.63†    4.49 |   23.12   21.39   23.40  23.40†   23.20
           2                3.88    3.88    3.88    3.88    3.88 |   20.01   20.01   20.01   20.01   20.01
           3               26.44   13.17   26.44   26.44   26.44 |   22.03   20.92   22.22  22.22†   22.05
           4               34.21   23.74   40.85  40.85†   38.84 |   30.15   24.06   31.58  31.58†   30.48
           5               20.65   13.21   26.50  26.50†   24.25 |   32.91   24.98   35.04  35.04†   33.61
        2  1                6.14    5.09    6.50    5.40    5.40 |   28.97   24.83   36.92  28.98†   28.91
           2               25.02    6.50   31.35  25.03†   25.03 |   28.18   24.47   35.25   27.31   27.31
           3               34.15    7.37   46.58  38.85†   38.85 |   29.63   24.75   36.57   28.60   28.51
           4               37.45    7.41   46.63  38.89†   36.90 |   38.27   25.96   46.71   38.02   36.76
           5               27.34    6.72   31.42   25.09   22.88 |   39.98   26.09   49.05  40.13†   38.75
        Soln error %           -   58.5%   20.5%    9.9%    8.5% |       -   19.0%   14.8%    2.2%    2.0%
        Cost error %       66.46       -       -    4.3%    2.2% |   79.23       -       -    0.6%    0.6%
(1,3)   1  1                4.47    4.03    4.63   4.63†    4.49 |   23.12   21.39   23.40  23.20†   23.40
           2                3.88    3.88    3.88    3.88    3.88 |   20.01   20.01   20.01   20.01   20.01
           3               26.44   13.17   26.44   26.44   26.44 |   22.03   20.92   22.22  22.22†   22.05
           4               34.21   23.74   40.85  40.85†   38.84 |   30.15   24.06   31.58  31.58†   30.48
           5               20.65   13.21   26.50  26.50†   24.25 |   32.91   24.98   35.04  35.04†   33.61
        2  1               36.35   27.14   48.69  40.94†   40.94 |   47.85   34.15   54.72  44.87†   44.77
           2               42.87   27.44   48.69   40.94   40.94 |   46.07   33.61   52.79   42.98   42.98
           3               48.41   27.44   48.69   40.94   40.94 |   47.73   33.99   54.18   44.32   44.23
           4               47.81   27.44   48.69   40.94   40.94 |   55.72   35.49   63.27   52.90   51.79
           5               43.07   27.14   48.69   40.94   40.94 |   57.58   35.62   65.52   54.97   53.67
        Soln error %           -   36.8%   12.2%   11.6%   10.1% |       -   25.8%   10.3%    4.9%    4.9%
        Cost error %       72.90       -       -    7.7%    6.2% |   89.63       -       -    1.0%    1.2%
(3,1)   1  1               28.57   28.57   28.57   28.57   28.57 |   40.79   39.58   40.93  40.93†   40.82
           2               42.53   31.96   42.90  42.90†   42.90 |   38.81   38.81   38.81   38.81   38.81
           3               41.77   31.97   42.94  42.94†   42.94 |   40.22   39.39   40.32  40.32†   40.20
           4               40.17   31.97   42.95  42.95†   40.94 |   47.86   41.83   49.22  49.22†   48.08
           5               28.57   28.57   28.57   28.57   28.57 |   49.79   42.30   51.57  51.57†   50.24
        2  1               40.83   33.20   48.69  40.94†   40.94 |   45.13   42.09   54.72   44.87   44.77
           2               40.48   33.20   48.69  40.94†   40.94 |   43.33   41.73   52.79   42.98   42.98
           3               39.41   33.45   48.69  40.94†   40.94 |   45.20   42.01   54.18   44.32   44.23
           4               44.62   33.48   48.69   40.94   40.94 |   52.96   43.06   63.27   52.90   51.79
           5               43.91   33.44   48.69   40.94   40.94 |   54.65   43.18   65.52  54.97†   53.67
        Soln error %           -   18.2%    9.9%    3.3%    2.8% |       -    9.8%   11.5%    1.1%    1.0%
        Cost error %       87.90       -       -    0.6%    0.6% |  108.59       -       -    0.4%    0.1%

Table 4   Fill rates in a two-stage system under the three-state demand cases.
Columns for each demand case: s* = $s^*_n(k)$, s^v = $s^v_n(k)$, s^a = $s^a_n(k \mid h_{[1,n]})$.

                     Cyclic demand          Noncyclic demand
(L1,L2) n         s*     s^v     s^a |      s*     s^v     s^a
(1,1)   1        88%     89%     89% |     89%     90%     89%
        2        88%     60%     77% |     88%     85%     86%
(1,3)   1        90%     90%     90% |     89%     89%     88%
        2        98%     87%     96% |     93%     90%     89%
(3,1)   1        91%     92%     91% |     92%     92%     92%
        2        81%     80%     77% |     78%     74%     75%

The Bullwhip Effect

The bullwhip effect, or the amplification of demand variability propagating from the downstream to the upstream stages of a supply chain, has been extensively studied in the literature (see Chen and Lee 2009, 2012 and the references therein). Most studies of the bullwhip effect have focused on either a single-stage system or a two-stage system with decentralized decision making. In a serial supply chain with centralized decision making, it is known that a stationary echelon base-stock policy is optimal under i.i.d. demand (Clark and Scarf 1960). As a result, the replenishment order at each stage in each period is simply the replacement of the demand in the previous period, so there is no amplification of demand variability in such a system.
However, when demand follows an MMD process, the optimal inventory policy is state-dependent, and it is not clear whether demand variability would be amplified or dampened in such a system. To investigate, we simulate the two-stage inventory system for 50,000 periods to measure the order variability amplification ratio (i.e., the bullwhip effect ratio) at both stages. We first calculate the sample variances of external demand, orders placed by Stage 1, and orders placed by Stage 2. Then, the bullwhip effect ratio for Stage 1 is determined by the ratio of Stage 1 order variance over the external demand variance, and the bullwhip effect ratio for Stage 2 (or more precisely echelon 2) is determined by the ratio of Stage 2 order variance over the external demand variance. We present the results of the base case of three-state MMD in Table 5. The numerical insights of the five-state demand cases are essentially the same, and we omit them for brevity.

Table 5   Bullwhip effect ratio in a two-stage system under the three-state demand cases.
Columns for each demand case: s* = $s^*_n(k)$, s^v = $s^v_n(k)$, s^a = $s^a_n(k \mid h_{[1,n]})$.

                     Cyclic demand          Noncyclic demand
(L1,L2) n         s*     s^v     s^a |      s*     s^v     s^a
(1,1)   1       0.85    0.98    1.09 |    0.63    0.69    0.64
        2       0.64    1.00    1.00 |    0.70    0.67    0.63
(1,3)   1       1.15    1.02    1.19 |    0.63    0.68    0.63
        2       0.75    0.78    0.91 |    0.64    0.65    0.62
(3,1)   1       0.68    0.77    0.66 |    0.63    0.67    0.63
        2       0.49    0.76    0.89 |    0.65    0.65    0.62

From Table 5, we observe that the bullwhip effect ratios under both heuristics closely track those under the optimal policy for the noncyclic demand. When the demand is cyclic, the bullwhip effect ratios under both heuristics tend to be higher than those under the optimal policy. This observation accords with our earlier observations that the heuristics tend to perform better when demand is noncyclic. An interesting observation from the above table is that the bullwhip effect ratio under the optimal policy is less than one in most cases, indicating a sweeping demand variability dampening effect.
Specifically, for noncyclic demand, the bullwhip effect ratios for Stages 1 and 2 are consistently at or below 0.70. To further investigate this phenomenon, we compute the autocorrelation of sequential demand under the noncyclic demand. We find that $\rho(1) = -0.122$, $\rho(2) = -0.002$, $\rho(3) = 0.010$, $\rho(4) = -0.003$, and $\rho(5) = 0.000$, where $\rho(i) = \mathrm{Corr}\{D(t), D(t+i)\}$. Thus, there exists only a weak negative correlation among sequential demands, which cannot fully account for the large magnitude of the variability dampening effect. Chen and Lee (2012) showed that the existence of system capacity constraints can help dampen the bullwhip effect. In our system with MMD, there is no capacity constraint, yet we still observe that the bullwhip effect can be significantly dampened under the optimal policy. This suggests that the state-dependent inventory policy under MMD may have an inherent smoothing effect on the order variability. Finding the root cause for this intriguing observation is beyond the scope of this paper, and we shall leave it for future research.
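The two statistics used in this subsection, the bullwhip effect ratio and the lag-$i$ demand autocorrelation $\rho(i)$, can be computed from simulated series as in the following minimal sketch. The helper names `autocorrelation` and `bullwhip_ratio` and the toy series are hypothetical, and the estimator shown (the Pearson correlation between the series and its lagged copy) is one common choice that may differ from the one used to produce the reported values.

```python
import math
from statistics import fmean, variance


def autocorrelation(series, lag):
    """Sample correlation between D(t) and D(t + lag) for lag >= 1."""
    x, y = series[:-lag], series[lag:]
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def bullwhip_ratio(orders, demand):
    """Ratio of the sample variance of orders to the sample variance of demand."""
    return variance(orders) / variance(demand)


# Toy series for illustration only (not data from the paper).
toy_demand = [3.0, 7.0, 2.0, 9.0, 4.0, 6.0, 1.0, 8.0]
toy_orders = [2.0 * d for d in toy_demand]  # doubling demand quadruples the variance
print(bullwhip_ratio(toy_orders, toy_demand))
print(autocorrelation(toy_demand, lag=1))
```

A ratio below one indicates that orders are less variable than external demand, which is the dampening pattern reported for the optimal policy in Table 5.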