Adding Unimodality or Independence Makes Interval Probability Problems NP-Hard

Daniel J. Berleant
Electrical and Computer Engineering
Iowa State University
Ames, IA 50011, USA
[email protected]

Olga Kosheleva
NASA Pan-American Center for Earth and Environmental Studies (PACES)
University of Texas at El Paso
El Paso, TX 79968, USA
[email protected]

Hung T. Nguyen
Dept. of Mathematical Sciences
New Mexico State Univ.
Las Cruces, NM 88003, USA
[email protected]

Abstract

In many real-life situations, we only have partial information about probabilities. This information is usually described by bounds on moments, on probabilities of certain events, etc., i.e., by characteristics $c(p)$ which are linear in terms of the unknown probabilities $p_j$. If we know interval bounds on some such characteristics, $\underline{a}_i \le c_i(p) \le \overline{a}_i$, and we are interested in a characteristic $c(p)$, then we can find the bounds on $c(p)$ by solving a linear programming problem. In some situations, we also have additional conditions on the probability distribution: e.g., we may know that the two variables $x_1$ and $x_2$ are independent, or that for each value of $x_2$, the corresponding conditional distribution for $x_1$ is unimodal. We show that adding each of these conditions makes the corresponding interval probability problem NP-hard.

Keywords: interval probability, unimodality, independence, NP-hard.

1 Introduction

Interval probability problems can often be reduced to linear programming (LP). In many real-life situations, in addition to the intervals $[\underline{x}_i, \overline{x}_i]$ of possible values of the unknowns $x_1, \ldots, x_n$, we also have partial information about the probabilities of different values within these intervals. This information is usually given in terms of bounds on standard characteristics $c(p)$ of the corresponding probability distribution $p$, such as the $k$-th moment $M_k \stackrel{\rm def}{=} \int x^k \cdot \rho(x)\,dx$ (where $\rho(x)$ is the probability density), the values of the cumulative distribution function (cdf) $F(t) \stackrel{\rm def}{=} {\rm Prob}(x \le t) = \int_{-\infty}^{t} \rho(x)\,dx$ of some of the variables, etc. Most of these characteristics are linear in terms of $\rho(x)$, and many other characteristics, such as central moments, are combinations of linear characteristics: e.g., the variance $V$ can be expressed as $V = M_2 - M_1^2$.

A typical practical problem is when we know the ranges of some of these characteristics, $\underline{a}_i \le c_i(p) \le \overline{a}_i$, and we want to find the range of possible values of some other characteristic $c(p)$. For example, we may know the bounds on the marginal cdfs for the variables $x_1$ and $x_2$, and we want to find the range of values of the cdf for $x_1 + x_2$.

In such problems, the range of possible values of $c(p)$ is an interval $[\underline{a}, \overline{a}]$. To find $\underline{a}$ (correspondingly, $\overline{a}$), we must minimize (correspondingly, maximize) the linear objective function $c(p)$ under linear constraints, i.e., solve a linear programming (LP) problem; see, e.g., [15, 16, 17, 20]. Other simple examples of linear conditions include bounds on the values of the density function $\rho(x)$; see, e.g., [14].
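As an illustration, here is a minimal sketch (with made-up numbers; the discretization, the bounds, and all names are our own, not from the paper) of how such a range computation reduces to two LP runs: we bound ${\rm Prob}(x \le 3)$ for a distribution on ten points, given interval bounds on its mean.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical discretization: x takes values 0..9, unknown probabilities p_0..p_9.
xs = np.arange(10, dtype=float)

# Known linear characteristic: the mean M1 = xs . p lies in [4.0, 5.0].
mean_lo, mean_hi = 4.0, 5.0

# Characteristic of interest: c(p) = Prob(x <= 3) = sum of p_i over xs[i] <= 3.
c = (xs <= 3).astype(float)

# Interval bound on the mean, written as two one-sided constraints A_ub p <= b_ub.
A_ub = np.vstack([xs, -xs])
b_ub = np.array([mean_hi, -mean_lo])
A_eq = np.ones((1, len(xs)))      # probabilities sum to 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * len(xs)       # each p_i in [0, 1]

# Lower bound: minimize c . p; upper bound: maximize, i.e., minimize -c . p.
lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(f"Prob(x <= 3) in [{lo.fun:.3f}, {-hi.fun:.3f}]")
```

Each of the two linprog calls solves one of the two optimization problems; the same pattern extends directly to any finite list of interval bounds on linear characteristics.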
Comment. Several more complex problems can also be described in LP terms. For example, when we select a new strategy for a company (e.g., for an electric company), one reasonable criterion is that the expected monetary gain should not be smaller than the expected gain for a previously known strategy. In many cases, for each strategy, we can estimate the probability of different production values, e.g., the probability $F(t) = {\rm Prob}(x \le t)$ that we will produce an amount $\le t$. However, the utility $u(t)$ corresponding to producing $t$ depends on future prices and is not well known; therefore, we cannot predict the exact value of the expected utility $\int u(x) \cdot \rho(x)\,dx$.

One way to handle this situation is to require that for every monotonic utility function $u(t)$, the expected utility under the new strategy (with probability density function (pdf) $\rho(x)$ and cdf $F(x)$) is larger than or equal to the expected utility under the old strategy (with pdf $\rho_0(x)$ and cdf $F_0(x)$): $\int u(x) \cdot \rho(x)\,dx \ge \int u(x) \cdot \rho_0(x)\,dx$. This condition is called first order stochastic dominance. It is known that this condition is equivalent to the condition that $F(x) \le F_0(x)$ for all $x$. Indeed, the condition is equivalent to
\[ \int_0^t u(x) \cdot (\rho(x) - \rho_0(x))\,dx \ge 0. \]
Integrating by parts, we conclude that
\[ -\int_0^t u'(x) \cdot (F(x) - F_0(x))\,dx \ge 0; \]
since $u(x)$ is non-decreasing, the derivative $u'(x)$ can be an arbitrary non-negative function; so, the above condition is indeed equivalent to $F(x) \le F_0(x)$ for all $x$. Each of these inequalities is linear in terms of $\rho(x)$, so optimizing a linear objective function under the constraints $F(x) \le F_0(x)$ is also an LP problem.

This requirement may be too restrictive; in practice, preferences have the property of risk aversion: it is better to gain a value $x$ with probability 1 than to have either $0$ or $2x$ with probability $1/2$ each. In mathematical terms, this condition means that the corresponding utility function $u(x)$ is concave. It is therefore reasonable to require that for all such risk-averse utility functions $u(x)$, the expected utility under the new strategy is larger than or equal to the expected utility under the old strategy. This condition is called second order stochastic dominance (see, e.g., [7, 8, 18, 19]), and it is known to be equivalent to the condition that $\int_0^t F(x)\,dx \le \int_0^t F_0(x)\,dx$ for all $t$. Indeed, the condition is equivalent to
\[ \int_0^t u(x) \cdot (\rho(x) - \rho_0(x))\,dx \ge 0 \]
for every concave function $u(x)$. Integrating by parts twice, we conclude that
\[ \int_0^t u''(x) \cdot \left( \int_0^x F(z)\,dz - \int_0^x F_0(z)\,dz \right) dx \ge 0. \]
Since $u(x)$ is concave, the second derivative $u''(x)$ can be an arbitrary non-positive function; so, the above condition is indeed equivalent to $\int_0^t F(x)\,dx \le \int_0^t F_0(x)\,dx$ for all $t$. The cdf $F(x)$ is a linear combination of the values $\rho(x)$; thus, its integral $\int F(x)\,dx$ is also linear in $\rho(x)$, and hence the above condition is still linear in terms of the values $\rho(x)$. Thus, we again have an LP problem; for details, see [2].

Most of the corresponding LP problems can be efficiently solved. Theoretically, some of these LP problems have infinitely many variables $\rho(x)$, but in practice, we can discretize each coordinate and thus get an LP problem with finitely many variables. There are known efficient algorithms and software for solving LP problems with finitely many variables. These algorithms require polynomial time ($\le n^k$) to solve problems with $\le n$ unknowns and $\le n$ constraints; these algorithms are actively used in imprecise probabilities; see, e.g., [1, 3, 4, 5, 6].

For example, for the case of two variables $x_1$ and $x_2$, we may know the probabilities $p_i = p(x_1 \in [i, i+1])$ and $q_j = p(x_2 \in [j, j+1])$ for finitely many intervals $[i, i+1]$. Then, to find the range of possible values of, e.g., ${\rm Prob}(x_1 + x_2 \le k)$, we can consider the following linear programming problem: the unknowns are $p_{i,j} \stackrel{\rm def}{=} p(x_1 \in [i, i+1] \,\&\, x_2 \in [j, j+1])$, the constraints are $p_{i,j} \ge 0$, $p_{i,1} + p_{i,2} + \ldots = p_i$, and $p_{1,j} + p_{2,j} + \ldots = q_j$, and the objective function is $\sum_{i,j:\, i+j \le k} p_{i,j}$.
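This two-variable LP is easy to set up directly. The sketch below (illustrative marginals only; the variable names are ours) computes the smallest and largest possible values of ${\rm Prob}(x_1 + x_2 \le k)$ over all joint distributions consistent with the given marginals.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical marginals on intervals [i, i+1), i = 0..3, for x1 and x2.
p = np.array([0.1, 0.4, 0.3, 0.2])      # marginal of x1
q = np.array([0.25, 0.25, 0.25, 0.25])  # marginal of x2
n = len(p)
k = 3  # we bound Prob(x1 + x2 <= k)

# Unknowns: joint probabilities p_{i,j}, flattened row-major.
# The objective selects the cells with i + j <= k.
c = np.array([1.0 if i + j <= k else 0.0 for i in range(n) for j in range(n)])

# Equality constraints: row sums equal p_i, column sums equal q_j.
A_eq = np.zeros((2 * n, n * n))
for i in range(n):
    A_eq[i, i * n:(i + 1) * n] = 1.0    # sum_j p_{i,j} = p_i
for j in range(n):
    A_eq[n + j, j::n] = 1.0             # sum_i p_{i,j} = q_j
b_eq = np.concatenate([p, q])
bounds = [(0, None)] * (n * n)

lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
hi = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(f"Prob(x1 + x2 <= {k}) in [{lo.fun:.3f}, {-hi.fun:.3f}]")
```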
Comment. The only LP problems for which there may not be an efficient solution are problems involving a large number of variables $v$. If we discretize each variable into $n$ intervals, then overall, we need $n^v$ unknowns $p_{i_1, i_2, \ldots, i_v}$ ($1 \le i_1 \le n$, $1 \le i_2 \le n$, ..., $1 \le i_v \le n$) to describe all possible probability distributions. When $v$ grows, the number of unknowns grows exponentially with $v$ and thus, for large $v$, becomes unrealistically large. It is known (see, e.g., [12]) that this exponential increase in complexity is inherent to the problem: e.g., for $v$ random variables $x_1, \ldots, x_v$ with known marginal distributions, the problem of finding the exact bounds on the cdf for the sum $x_1 + \ldots + x_v$ is NP-hard.

Beyond LP. There are important practical problems which lie outside LP. One example is problems involving independence, where constraints are linear in $p(x, y) = p(x) \cdot p(y)$ and thus bilinear in $p(x)$ and $p(y)$. In this paper, we show that the corresponding range estimation problem is NP-hard.

Another example of a condition which cannot be directly described in terms of LP is the condition of unimodality. For a one-variable distribution with probabilities $p_1, \ldots, p_n$, unimodality means that there exists a value $m$ (the "mode") such that $p_i$ increases (non-strictly) until $m$ and then decreases after $m$:
\[ p_1 \le p_2 \le \ldots \le p_{m-1} \le p_m; \quad p_m \ge p_{m+1} \ge \ldots \ge p_{n-1} \ge p_n. \]
When the location of the mode is known, we get several linear inequalities, so we can still use efficient techniques such as LP; see, e.g., [9, 21]. In the 1-D case, if we do not know the location of the mode, we can try all $n$ possible locations and solve the $n$ corresponding LP problems. Since each LP problem requires polynomial time, running $n$ such problems still requires polynomial time.

In the 2-D case, it is reasonable to consider the situation when, e.g., for every value of $x_2$, the corresponding conditional distribution for $x_1$ is unimodal. In this case, to describe this as an LP problem, we must select a mode for every $x_2$. If there are $n$ values of $x_2$, and at least 2 possible choices of mode location for each, then we get an exponential number, $2^n$, of possible choices; a brute-force sketch of this enumeration is given after the definitions below. In this paper, we show that this problem is indeed NP-hard, and therefore, unless P = NP, no algorithm can solve it in polynomial time.

Comment. Other possible restrictions on probability may involve bounds on the entropy of the corresponding probability distributions; such problems are also, in general, NP-hard [11].

2 Adding Unimodality Makes Interval Probability Problems NP-Hard

Definition 1. Let $n_1 > 0$ and $n_2 > 0$ be given integers.

• By a probability distribution, we mean a collection of real numbers $p_{i_1, i_2} \ge 0$, $1 \le i_1 \le n_1$ and $1 \le i_2 \le n_2$, such that $\sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} p_{i_1, i_2} = 1$.

• We say that the distribution $p_{i_1, i_2}$ is unimodal in the 1st variable (or simply unimodal, for short) if for every $i_2$ from 1 to $n_2$, there exists a value $m$ such that $p_{i_1, i_2}$ grows with $i_1$ for $i_1 \le m$ and decreases with $i_1$ for $i_1 \ge m$:
\[ p_{1,i_2} \le p_{2,i_2} \le \ldots \le p_{m,i_2}; \quad p_{m,i_2} \ge p_{m+1,i_2} \ge \ldots \ge p_{n_1,i_2}. \]

• By a linear constraint on the probability distribution, we mean a constraint of the type
\[ \underline{b} \le \sum_{i_1=1}^{n_1} \sum_{i_2=1}^{n_2} b_{i_1,i_2} \cdot p_{i_1,i_2} \le \overline{b} \]
for some given values $\underline{b}$, $\overline{b}$, and $b_{i_1,i_2}$.

• By an interval probability problem under unimodality constraint, we mean the following problem: given a finite list of linear constraints, check whether there exists a unimodal distribution which satisfies all these constraints.
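The following brute-force sketch (our own naming and constraint format, with 0-based mode indices) makes the enumeration explicit for the problem of Definition 1: once the mode vector is fixed, all unimodality conditions are linear, so each candidate is one LP feasibility check; the outer loop over all $n_1^{n_2}$ mode vectors is what grows exponentially.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def feasible_unimodal(n1, n2, B, b_lo, b_hi):
    """Naive check: is there a distribution p[i1, i2], unimodal in the 1st
    variable, with b_lo[r] <= sum B[r][i1, i2] * p[i1, i2] <= b_hi[r] for
    every r? Tries every mode vector (m_1, ..., m_{n2}); exponential in n2."""
    idx = lambda i1, i2: i1 * n2 + i2          # flatten p[i1, i2] row-major
    for modes in itertools.product(range(n1), repeat=n2):
        A_ub, b_ub = [], []
        for r, Br in enumerate(B):             # interval constraints, both sides
            row = np.asarray(Br, float).ravel()
            A_ub += [row, -row]
            b_ub += [b_hi[r], -b_lo[r]]
        for i2, m in enumerate(modes):         # unimodality: rise to m, fall after
            for i1 in range(n1 - 1):
                row = np.zeros(n1 * n2)
                if i1 < m:   # p[i1, i2] <= p[i1+1, i2]
                    row[idx(i1, i2)], row[idx(i1 + 1, i2)] = 1, -1
                else:        # p[i1, i2] >= p[i1+1, i2]
                    row[idx(i1, i2)], row[idx(i1 + 1, i2)] = -1, 1
                A_ub.append(row)
                b_ub.append(0.0)
        res = linprog(np.zeros(n1 * n2), A_ub=np.array(A_ub), b_ub=b_ub,
                      A_eq=np.ones((1, n1 * n2)), b_eq=[1.0],
                      bounds=[(0, None)] * (n1 * n2))
        if res.status == 0:                    # feasible for this mode vector
            return True
    return False
```

For small $n_1$ and $n_2$ this is usable; Theorem 1 below says that, unless P = NP, no algorithm can avoid this kind of blow-up in general.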
Theorem 1. The interval probability problem under unimodality constraint is NP-hard.

Comment. So, under the unimodality constraint, even checking whether a system of linear constraints is consistent, i.e., whether the range of a given characteristic is non-empty, is computationally difficult (NP-hard).

Proof. We will show that if we could check, for every system of linear constraints, whether this system is consistent or not under unimodality, then we would be able to solve the partition problem, which is known to be NP-hard [10, 13]. The partition problem is as follows: given $n$ positive integers $s_1, \ldots, s_n$, check whether there exist $n$ integers $\varepsilon_i \in \{-1, 1\}$ for which $\varepsilon_1 \cdot s_1 + \ldots + \varepsilon_n \cdot s_n = 0$.

Indeed, for every instance of the partition problem, we form the following system of constraints: $n_1 = 3$, $n_2 = n$, and

• $p_{2,i_2} = 0$ for every $i_2 = 1, \ldots, n_2$;
• $p_{1,i_2} + p_{2,i_2} + p_{3,i_2} = 1/n$ for every $i_2 = 1, \ldots, n_2$;
• $\sum_{i_2=1}^{n_2} (-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2}) = 0$.

Let us prove that this system is consistent if and only if the original instance of the partition problem has a solution.

"If" part. If the original instance has a solution $\varepsilon_i \in \{-1, 1\}$, then, for every $i_2$ from 1 to $n_2$, we can take $p_{2+\varepsilon_{i_2}, i_2} = 1/n$ and $p_{i_1, i_2} = 0$ for $i_1 \ne 2 + \varepsilon_{i_2}$. In other words:

• if $\varepsilon_{i_2} = -1$, then we take $p_{1,i_2} = 1/n$ and $p_{2,i_2} = p_{3,i_2} = 0$;
• if $\varepsilon_{i_2} = 1$, then we take $p_{1,i_2} = p_{2,i_2} = 0$ and $p_{3,i_2} = 1/n$.

The resulting distribution is unimodal: indeed, for each $i_2$, its mode is the value $2 + \varepsilon_{i_2}$. Let us check that it satisfies all the desired constraints. It is easy to check that for every $i_2$, we have $p_{2,i_2} = 0$ and $p_{1,i_2} + p_{2,i_2} + p_{3,i_2} = 1/n$. Finally, due to our choice of $p_{i_1,i_2}$, we conclude that $-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2} = \frac{1}{n} \cdot \varepsilon_{i_2} \cdot s_{i_2}$ and thus,
\[ \sum_{i_2=1}^{n_2} (-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2}) = \frac{1}{n} \cdot \sum_{i_2=1}^{n_2} \varepsilon_{i_2} \cdot s_{i_2} = 0. \]

"Only if" part. Vice versa, let us assume that we have a unimodal distribution $p_{i_1,i_2}$ for which all the desired constraints are satisfied. Since the distribution is unimodal, for every $i_2$, there exists a mode $m_{i_2} \in \{1, 2, 3\}$ for which the values $p_{i_1,i_2}$ increase for $i_1 \le m_{i_2}$ and decrease for $i_1 \ge m_{i_2}$. This mode cannot be equal to 2, because otherwise the value $p_{2,i_2} = 0$ would be the largest of the three values $p_{1,i_2}$, $p_{2,i_2}$, and $p_{3,i_2}$; hence all three values would be 0, which contradicts the constraint $p_{1,i_2} + p_{2,i_2} + p_{3,i_2} = 1/n$. Thus, this mode is either 1 or 3:

• if the mode is 1, then due to monotonicity, we have $0 = p_{2,i_2} \ge p_{3,i_2}$, hence $p_{3,i_2} = p_{2,i_2} = 0$;
• if the mode is 3, then due to monotonicity, we have $p_{1,i_2} \le p_{2,i_2} = 0$, hence $p_{1,i_2} = p_{2,i_2} = 0$.

In both cases, for each $i_2$, only one value of $p_{i_1,i_2}$ is different from 0: the value $p_{m_{i_2},i_2}$. Since the sum of the three values is $1/n$, this non-zero value must be equal to $1/n$. If we denote $\varepsilon_{i_2} \stackrel{\rm def}{=} m_{i_2} - 2$, then we conclude that $\varepsilon_{i_2} \in \{-1, 1\}$. For each $i_2$, we have $-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2} = \varepsilon_{i_2} \cdot s_{i_2} \cdot (1/n)$; hence from the constraint
\[ \sum_{i_2=1}^{n_2} (-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2}) = \frac{1}{n} \cdot \sum_{i_2=1}^{n_2} \varepsilon_{i_2} \cdot s_{i_2} = 0, \]
we conclude that $\sum \varepsilon_{i_2} \cdot s_{i_2} = 0$, i.e., that the original instance of the partition problem has a solution. The theorem is proven.
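As a sanity check on this reduction (not part of the original paper), one can build the constraint system of the proof from a partition instance and compare the answer of the brute-force feasibility checker sketched in Section 1 with a direct search over sign vectors; all helper names below are ours.

```python
import itertools
import numpy as np
# assumes feasible_unimodal from the earlier sketch is in scope

def theorem1_system(s):
    """Constraints from the proof of Theorem 1: n1 = 3, n2 = n = len(s).
    Rows are 0-based: row 0 is i1 = 1, row 1 is i1 = 2, row 2 is i1 = 3."""
    n = len(s)
    B, lo, hi = [], [], []
    for i2 in range(n):                  # p_{2,i2} = 0
        M = np.zeros((3, n)); M[1, i2] = 1.0
        B.append(M); lo.append(0.0); hi.append(0.0)
    for i2 in range(n):                  # p_{1,i2} + p_{2,i2} + p_{3,i2} = 1/n
        M = np.zeros((3, n)); M[:, i2] = 1.0
        B.append(M); lo.append(1.0 / n); hi.append(1.0 / n)
    M = np.zeros((3, n))                 # sum_i2 s_i2 * (p_{3,i2} - p_{1,i2}) = 0
    M[0, :] = -np.asarray(s, float)
    M[2, :] = np.asarray(s, float)
    B.append(M); lo.append(0.0); hi.append(0.0)
    return B, lo, hi

def brute_partition(s):
    """Direct exponential search for signs with sum(eps_i * s_i) == 0."""
    return any(sum(e * v for e, v in zip(eps, s)) == 0
               for eps in itertools.product((-1, 1), repeat=len(s)))

for s in ([3, 1, 2, 2], [2, 3, 4, 100]):   # first has a partition, second does not
    B, lo, hi = theorem1_system(s)
    assert feasible_unimodal(3, len(s), B, lo, hi) == brute_partition(s)
```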
Comment. The above constraints are not just mathematical tricks; they have a natural interpretation if for $x_1$, we take the values $-1$, $0$, and $1$ as corresponding to $i_1 = 1, 2, 3$, and for $x_2$, we take the values $s_1, \ldots, s_n$. Then:

• the constraint $p_{2,i_2} = 0$ means that ${\rm Prob}(x_1 = 0) = 0$;
• the constraint $p_{1,i_2} + p_{2,i_2} + p_{3,i_2} = 1/n$ means that ${\rm Prob}(x_2 = s_{i_2}) = 1/n$ for all $n$ values $s_{i_2}$; and
• the constraint $\sum_{i_2=1}^{n_2} (-s_{i_2} \cdot p_{1,i_2} + s_{i_2} \cdot p_{3,i_2}) = 0$ means that the expected value of the product is 0: $E[x_1 \cdot x_2] = 0$.

So, the difficult-to-solve problem is to check whether it is possible that $E[x_1 \cdot x_2] = 0$ and ${\rm Prob}(x_1 = 0) = 0$ for some unimodal distribution for which the marginal distribution on $x_2$ is uniform.

3 Adding Independence Makes Interval Probability Problems NP-Hard

In general, in statistics, independence makes problems easier. We will show, however, that for interval probability problems, the situation is sometimes the opposite: adding an independence assumption turns easy-to-solve problems into NP-hard ones.

Definition 2. Let $n_1 > 0$ and $n_2 > 0$ be given integers.

• By an independent probability distribution, we mean a collection of real numbers $p_i \ge 0$, $1 \le i \le n_1$, and $q_j \ge 0$, $1 \le j \le n_2$, such that $\sum_{i=1}^{n_1} p_i = \sum_{j=1}^{n_2} q_j = 1$.

• By a linear constraint on the independent probability distribution, we mean a constraint of the type
\[ \underline{b} \le \sum_{i=1}^{n_1} a_i \cdot p_i + \sum_{j=1}^{n_2} b_j \cdot q_j + \sum_{i=1}^{n_1} \sum_{j=1}^{n_2} c_{i,j} \cdot p_i \cdot q_j \le \overline{b} \]
for some given values $\underline{b}$, $\overline{b}$, $a_i$, $b_j$, and $c_{i,j}$.

• By an interval probability problem under independence constraint, we mean the following problem: given a finite list of linear constraints, check whether there exists an independent distribution which satisfies all these constraints.

Comment. Independence means that $p_{i,j} = p_i \cdot q_j$ for every $i$ and $j$. The above constraints are linear in terms of these probabilities $p_{i,j} = p_i \cdot q_j$.

Theorem 2. The interval probability problem under independence constraint is NP-hard.

Proof. To prove this theorem, we will reduce the same known NP-hard problem as in the proof of Theorem 1, the partition problem, to the problem in question. For every instance of the partition problem, we form the following system of constraints: $n_1 = n_2 = n$, and

• $p_i - q_i = 0$ for every $i$ from 1 to $n$;
• $S_i \cdot p_i - p_i \cdot q_i = 0$ for every $i$ from 1 to $n$, where
\[ S_i \stackrel{\rm def}{=} \frac{2 \cdot s_i}{\sum_{k=1}^{n} s_k}. \]

Let us prove that this system is consistent if and only if the original instance of the partition problem has a solution.

Indeed, if the original instance has a solution $\varepsilon_i \in \{-1, 1\}$, then, for every $i$ from 1 to $n$, we can take $p_i = q_i = \frac{1 + \varepsilon_i}{2} \cdot S_i$, i.e.:

• if $\varepsilon_i = -1$, we take $p_i = q_i = 0$;
• if $\varepsilon_i = 1$, we take $p_i = q_i = S_i$.

Let us show that for this choice, $\sum_{i=1}^{n} p_i = \sum_{j=1}^{n} q_j = 1$. Indeed,
\[ \sum_{i=1}^{n} p_i = \sum_{i=1}^{n} \frac{1 + \varepsilon_i}{2} \cdot S_i = \frac{1}{2} \cdot \sum_{i=1}^{n} S_i + \frac{1}{2} \cdot \sum_{i=1}^{n} \varepsilon_i \cdot S_i. \]
By definition of $S_i = \frac{2 \cdot s_i}{\sum_{k=1}^{n} s_k}$, we have
\[ \sum_{i=1}^{n} S_i = 2 \cdot \frac{\sum_{i=1}^{n} s_i}{\sum_{k=1}^{n} s_k} = 2 \quad \text{and} \quad \sum_{i=1}^{n} \varepsilon_i \cdot S_i = 2 \cdot \frac{\sum_{i=1}^{n} \varepsilon_i \cdot s_i}{\sum_{k=1}^{n} s_k}. \]
Since $\sum_{i=1}^{n} \varepsilon_i \cdot s_i = 0$, the second sum is 0, hence $\sum_{i=1}^{n} p_i = 1$. In both cases $\varepsilon_i = \pm 1$, we have $S_i \cdot p_i - p_i \cdot q_i = 0$, so all the constraints are indeed satisfied.

Vice versa, if the constraints are satisfied, this means that for every $i$, we have $p_i = q_i$ and $S_i \cdot p_i - p_i \cdot q_i = p_i \cdot (S_i - q_i) = p_i \cdot (S_i - p_i) = 0$, so $p_i = 0$ or $p_i = S_i$. Thus, the value $p_i / S_i$ is equal to 0 or 1; hence the value $\varepsilon_i \stackrel{\rm def}{=} 2 \cdot (p_i / S_i) - 1$ takes the value $-1$ or 1. In terms of $\varepsilon_i$, we have $p_i / S_i = \frac{1 + \varepsilon_i}{2}$, hence $p_i = \frac{1 + \varepsilon_i}{2} \cdot S_i$. Since $\sum_{i=1}^{n} p_i = 1$, we conclude that
\[ \sum_{i=1}^{n} p_i = \frac{1}{2} \cdot \sum_{i=1}^{n} S_i + \frac{1}{2} \cdot \sum_{i=1}^{n} \varepsilon_i \cdot S_i = 1. \]
We know that $\frac{1}{2} \cdot \sum_{i=1}^{n} S_i = 1$, hence $\sum_{i=1}^{n} \varepsilon_i \cdot S_i = 0$. We know that this sum is proportional to $\sum_{i=1}^{n} \varepsilon_i \cdot s_i$, hence $\sum_{i=1}^{n} \varepsilon_i \cdot s_i = 0$, i.e., the original instance of the partition problem has a solution. The theorem is proven.
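A quick numerical check of this second reduction (again a brute-force sketch under our own naming, feasible only for small $n$): by the "vice versa" argument above, any solution of the constraint system has $p_i \in \{0, S_i\}$, so it suffices to enumerate those choices, which we do in exact rational arithmetic.

```python
import itertools
from fractions import Fraction

def theorem2_consistent(s):
    """Consistency of the Theorem 2 system: p_i = q_i, each p_i in {0, S_i},
    sum of p_i equal to 1, where S_i = 2 * s_i / sum(s)."""
    total = sum(s)
    S = [Fraction(2 * v, total) for v in s]
    # Enumerate the choices p_i in {0, S_i} forced by the 'vice versa' argument.
    return any(sum(Si for Si, pick in zip(S, picks) if pick) == 1
               for picks in itertools.product((0, 1), repeat=len(s)))

def brute_partition(s):
    """Direct exponential search for signs with sum(eps_i * s_i) == 0."""
    return any(sum(e * v for e, v in zip(eps, s)) == 0
               for eps in itertools.product((-1, 1), repeat=len(s)))

for s in ([3, 1, 1, 2, 2, 1], [2, 3, 4, 100]):
    assert theorem2_consistent(s) == brute_partition(s)
    print(s, "->", brute_partition(s))
```

The equivalence is exactly the one used in the proof: selecting $p_i = S_i$ corresponds to $\varepsilon_i = 1$, and the selected $S_i$ sum to 1 precisely when the selected $s_i$ sum to half the total.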
Acknowledgements

This work was supported in part by NASA under cooperative agreement NCC5209, NSF grant EAR-0225670, NIH grant 3T34GM008048-20S1, and Army Research Lab grant DATM-05-02-C-0046. The authors are very thankful to the participants of the 4th International Symposium on Imprecise Probabilities and Their Applications ISIPTA'05 (Carnegie Mellon University, July 20–24, 2005), especially to Mikelis Bickis (University of Saskatchewan, Canada), Arthur P. Dempster (Harvard University), and Damjan Škulj (University of Ljubljana, Slovenia), for valuable discussions.

References

[1] D. Berleant, M.-P. Cheong, C. Chu, Y. Guan, A. Kamal, G. Sheblé, S. Ferson, and J. F. Peters, Dependable handling of uncertainty, Reliable Computing, 2003, Vol. 9, No. 6, pp. 407–418.

[2] D. Berleant, M. Dancre, J. Argaud, and G. Sheblé, Electric company portfolio optimization under interval stochastic dominance constraints, In: F. G. Cozman, R. Nau, and T. Seidenfeld (eds.), Proceedings of the 4th International Symposium on Imprecise Probabilities and Their Applications ISIPTA'05, Pittsburgh, Pennsylvania, July 20–24, 2005, pp. 51–57.

[3] D. Berleant, L. Xie, and J. Zhang, Statool: a tool for Distribution Envelope Determination (DEnv), an interval-based algorithm for arithmetic on random variables, Reliable Computing, 2003, Vol. 9, No. 2, pp. 91–108.

[4] D. Berleant and J. Zhang, Using Pearson correlation to improve envelopes around the distributions of functions, Reliable Computing, 2004, Vol. 10, No. 2, pp. 139–161.

[5] D. Berleant and J. Zhang, Representation and problem solving with the Distribution Envelope Determination (DEnv) method, Reliability Engineering and System Safety, 2004, Vol. 85, No. 1–3.

[6] D. Berleant and J. Zhang, Using Pearson correlation to improve envelopes around the distributions of functions, Reliable Computing, 2004, Vol. 10, No. 2, pp. 139–161.

[7] A. Borglin and H. Keiding, Stochastic dominance and conditional expectation: an insurance theoretical approach, The Geneva Papers on Risk and Insurance Theory, 2002, Vol. 27, pp. 31–48.

[8] A. Chateauneuf, On the use of capacities in modeling uncertainty aversion and risk aversion, Journal of Mathematical Economics, 1991, Vol. 20, pp. 343–369.

[9] S. Ferson, RAMAS Risk Calc 4.0, CRC Press, Boca Raton, Florida.

[10] M. R. Garey and D. S. Johnson, Computers and Intractability: a Guide to the Theory of NP-Completeness, W. H. Freeman and Company, San Francisco, CA, 1979.

[11] G. J. Klir, G. Xiang, and O. Kosheleva, Estimating information amount under interval uncertainty: algorithmic solvability and computational complexity, Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems IPMU'06, Paris, France, July 2–7, 2006 (to appear).

[12] V. Kreinovich and S. Ferson, Computing best-possible bounds for the distribution of a sum of several variables is NP-hard, International Journal of Approximate Reasoning, 2006, Vol. 41, pp. 331–342.

[13] V. Kreinovich, A. Lakeyev, J. Rohn, and P. Kahl, Computational Complexity and Feasibility of Data Processing and Interval Computations, Kluwer, Dordrecht, 1997.

[14] V. G. Krymsky, Computing interval estimates for components of statistical information with respect to judgements on probability density functions, In: J. Dongarra, K. Madsen, and J. Wasniewski (eds.), PARA'04 Workshop on State-of-the-Art in Scientific Computing, Springer Lecture Notes in Computer Science, 2005, Vol. 3732, pp. 151–160.
[15] V. Kuznetsov, Interval Statistical Models, Radio i Svyaz, Moscow, 1991 (in Russian).

[16] V. P. Kuznetsov, Interval methods for processing statistical characteristics, Proceedings of the International Workshop on Applications of Interval Computations APIC'95, El Paso, Texas, February 23–25, 1995 (a special supplement to the journal Reliable Computing), pp. 116–122.

[17] V. P. Kuznetsov, Auxiliary problems of statistical data processing: interval approach, Proceedings of the International Workshop on Applications of Interval Computations APIC'95, El Paso, Texas, February 23–25, 1995 (a special supplement to the journal Reliable Computing), pp. 123–129.

[18] D. Škulj, Generalized conditioning in neighbourhood models, In: F. G. Cozman, R. Nau, and T. Seidenfeld (eds.), Proceedings of the 4th International Symposium on Imprecise Probabilities and Their Applications ISIPTA'05, Pittsburgh, Pennsylvania, July 20–24, 2005.

[19] D. Škulj, A role of Jeffrey's rule of conditioning in neighborhood models, to appear.

[20] P. Walley, Statistical Reasoning with Imprecise Probabilities, Chapman & Hall, N.Y., 1991.

[21] J. Zhang and D. Berleant, Arithmetic on random variables: squeezing the envelopes with new joint distribution constraints, Proceedings of the 4th International Symposium on Imprecise Probabilities and Their Applications ISIPTA'05, Pittsburgh, Pennsylvania, July 20–24, 2005, pp. 416–422.