Intern. Conf., July 19-21, Yalta, Ukraine
Text of presentation:

ON METHODOLOGICAL PECULIARITIES OF APPLIED OPTIMIZATION PROBLEMS

Vitaly A. Perepelitsa
Department of Economic Cybernetics, Zaporozhye National University, Ukraine

An applied optimization problem is a two-level problem. At the bottom level we have the structuring problem, i.e., the problem of modeling the input data. At the upper level we traditionally consider the problem of finding an "optimal" solution.

Let a concrete applied optimization problem be formulated for the system being modeled. A question always arises: how adequate is this problem to the real system under consideration? We preface the question of adequacy with the following argument. Mathematical modeling usually deals with evolutionary systems, whose reflected parameters change with time. Consequently, the input data of a real applied optimization problem are predicted values of its parameters. These predicted values are usually obtained by forecasting the time series of the parameter in question.

Let us examine concrete time series (TS) of parameter values for a range of real applied optimization problems.

Figure 1: Time series of annual yield of grain crops in Zaporizhya region (1930-2001). Axes: years (horizontal), centners (vertical).

Figure 1 represents the time series of the "yield" parameter. This parameter is related to a well-known optimization problem, namely the "assignment problem".

Figure 2: Time series of daily stock quotations of "Russian Joint-stock Company United Power Grids" (RJSC UPG) (01.04.2002-31.03.2005). Axes: dates (horizontal), RUR (vertical).

Figure 2 represents the time series of stock quotations of an electric power company. This parameter is related to the "investor problem".

Figure 3: Time series of daily tourist flow into the mountain-ski settlement Dombai (01.05.2003-01.11.2005). Axes: days (horizontal), number of tourists (vertical).

Figure 3 represents the time series of the number of tourists arriving at a mountain-ski base. This parameter is relevant to the problem of optimal allocation of the resources of this base.

Figure 4: Time series of monthly flow of the Kuban river (01.1926-12.2003). Axes: months (horizontal), volume of monthly flow (vertical).

Figure 4 represents the time series of a mountain river flow. This parameter is related to the problem of finding an optimal flood control strategy.

Figure 5: Time series of average monthly solar activity (January 1981-December 2005). Axes: months (horizontal), monthly number of sunspots (vertical).

Figure 6: Time series of average annual solar activity (1700-2005). Axes: years (horizontal), mean monthly number of sunspots during the year (vertical).

Figures 5 and 6 represent time series of solar activity.

Figure 7: Time series of the average monthly number of children aged 0 to 15 who contracted acute respiratory disease (ARD) (January 1993-December 2005). Axes: months (horizontal), number of diseased (vertical).

Figure 8: Time series of daily wholesale trading volume (25.07.2005-05.02.2006). Axes: days (horizontal), daily volume of wholesale trading (vertical).

Figure 7 represents the time series of the number of diseased children; Figure 8 represents the time series of wholesale trade volumes. Figures 7 and 8 reflect the parameters of the well-known "optimal inventory control problem".

Figure 9: Time series of monthly electricity consumption in one district (Prikubansky district, January 2000-December 2005).
Axes of Figure 9: months (horizontal), volume of monthly electricity consumption (vertical).

Figure 9 represents the time series of electric power consumption volumes in a district. This parameter is relevant to the risk control problem in a power network.

Thus, we have to solve the applied optimization problems mentioned above: the assignment problem, the investor problem, the inventory control problem, etc. Algorithms exist for solving these problems when the parameters are precise numbers. In reality, however, each of these parameters is a result of forecasting. Visual inspection of the time series in Figures 1-9 shows that a reliable forecast of such a time series cannot be a precise number; a predicted parameter value in the form of a precise number is inadequate. Let us mention some reasons for this inadequacy. First, these time series possess memory; consequently, the independence condition does not hold for their levels. Second, the ranges of these time series are comparable to the mean values of their levels.

Thus, the question of adequacy emerges at the bottom level of applied optimization. The problems of the upper level are determined by the results of the bottom level. If the forecast values are intervals, then we have an optimization problem with interval data at the upper level. If the forecast values are fuzzy sets, then we have an optimization problem with fuzzy data at the upper level. Classical algorithms of mathematical programming are inapplicable in both cases.

Let us return to the bottom level of applied optimization. This is the level where we obtain forecast values for the parameters of the problem under consideration.
The bottom level consists of two sublevels: 1) preforecasting analysis; 2) forecasting itself.

Forecasting of time series is soft computing. L. Zadeh formulated the following definition:

Soft computing = fuzzy systems + neural network + genetic algorithm   (1)

Here, the "genetic algorithm" performs the optimizing tuning of the membership functions of fuzzy sets, and the "neural network" computes the base values of the membership function. However, this "Zadeh formula", applied to the time series in Figures 1-9, did not justify itself. We propose a cardinal correction of the "Zadeh formula" (1): instead of "neural network" we use "cellular automaton", and instead of "genetic algorithm" we use "phase analysis". We then obtain the following formula of soft computing:

Soft computing = fuzzy systems + cellular automaton + phase analysis   (2)

Forecasting on the basis of formula (2) is performed after the preforecasting analysis of the time series:

Preforecasting analysis of time series = fractal analysis of time series + phase analysis of time series + fuzzy system   (3)

The idea of fractal analysis is presented in the book [1]. The algorithm of R/S-analysis is the basic one for fractal analysis; H. Hurst proposed this algorithm half a century ago [2]. Here R is the range of the given time series, S is the standard deviation of the time series, and R/S is the normalized range. The modified algorithm of R/S-analysis [3] constructs two trajectories for a given time series: the R/S-trajectory and the H-trajectory. Such trajectories for three different time series are shown in Figures 10, 11 and 12.
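A minimal Python sketch of the two trajectories is given here for illustration only. It does not reproduce the modified algorithm of [3], whose details are not stated in this text. It assumes the classical rescaled-range construction, in which R is the range of the cumulative deviations from the segment mean, applied to growing initial segments z_1, ..., z_k, and it assumes Hurst's empirical relation R/S = (k/2)^H for the point estimate H_k. The function name rs_trajectories and the base-10 logarithm on the horizontal axis are likewise assumptions.

```python
import math


def rs_trajectories(z):
    """Return (R/S-trajectory, H-trajectory) for the initial segments of z.

    Each trajectory is a list of points whose first coordinate is
    log10(k), k being the number of observations in the segment.
    Assumed (not taken from [3]): classical R/S on initial segments and
    the estimate H_k = log(R/S) / log(k / 2).
    """
    rs_points, h_points = [], []
    for k in range(3, len(z) + 1):      # k >= 3 keeps log(k / 2) positive
        segment = z[:k]
        mean = sum(segment) / k

        # Cumulative deviations of the segment from its own mean.
        cumulative, acc = [], 0.0
        for value in segment:
            acc += value - mean
            cumulative.append(acc)

        r = max(cumulative) - min(cumulative)                      # range R
        s = math.sqrt(sum((v - mean) ** 2 for v in segment) / k)   # deviation S
        if r == 0 or s == 0:
            continue                    # degenerate segment, no point emitted

        rs = r / s
        x = math.log10(k)
        rs_points.append((x, math.log10(rs)))
        h_points.append((x, math.log(rs) / math.log(k / 2)))
    return rs_points, h_points


if __name__ == "__main__":
    series = [12.0, 15.0, 14.0, 18.0, 21.0, 19.0, 23.0, 26.0]  # made-up data
    rs_traj, h_traj = rs_trajectories(series)
    for (x, log_rs), (_, h) in zip(rs_traj, h_traj):
        print(f"log10(k)={x:.3f}  log10(R/S)={log_rs:.3f}  H={h:.3f}")
```

Segments for which R or S is zero are skipped, so a constant series yields empty trajectories rather than a division error.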
Figure 10: Typical R/S- and H-trajectories for the time series of stock quotations of "RJSC UPG" (see Figure 2). Horizontal axis: log (number of observations). Point 3 is marked on both trajectories.

Figure 11: Typical R/S- and H-trajectories for the time series of average monthly solar activity (see Figure 5). Horizontal axis: log (number of observations). Point 4 is marked on both trajectories.

Figure 12: Typical R/S- and H-trajectories for the time series of average annual solar activity (see Figure 6). Horizontal axis: log (number of observations). Point 11 is marked on both trajectories.

In Figure 12, the H-trajectory demonstrates that the Hurst exponent H stably keeps a value close to one. The R/S-trajectory shows a stable trend from point one to point eleven. The trend changes at point eleven, so the depth of memory about the start of the time series is eleven. On the whole, the H-trajectory and the R/S-trajectory demonstrate good trend stability of the initial segment of the time series considered in Figure 12. In Figure 10, the H-trajectory and the R/S-trajectory show the absence of trend stability. In Figure 11, the H-trajectory and the R/S-trajectory show an intermediate situation.

We successively apply the R/S-analysis algorithm to different pieces of the given time series and thus obtain an estimate of the memory depth for the whole time series. This estimate has the form of a fuzzy set

$M(Z) = \{(l, \mu(l))\}$,   (4)

where $\mu(l)$ is the value of the membership function for the depth value $l$, and $Z = \langle z_i \rangle$, $i = 1, 2, \dots, n$, is the given time series. For clarity, we represent the fuzzy set (4) graphically.

Figure 13: Memory depth fuzzy set of the time series of stock quotations of "RJSC UPG" (see Figure 2). Axes: depth $l$ (horizontal), $\mu(l)$ (vertical).

Figure 13 shows the fuzzy set (4) of the memory depth of the time series presented in Figure 2.
The membership function $\mu(l)$ attains its maximal value at the minimal possible depth $l = 3$, i.e., there is almost no memory in this time series. One can assert that this time series has very poor preforecasting characteristics: it exhibits an alternation of positive and negative increments.

Figure 14: Memory depth fuzzy set of the time series of the daily number of people who contracted ARD (see Figure 7). Axes: depth $l$ (horizontal), $\mu(l)$ (vertical).

Figure 15: Memory depth fuzzy set of the time series of average annual solar activity (see Figure 6). Axes: depth $l$ (horizontal), $\mu(l)$ (vertical).

Figure 15 shows good preforecasting characteristics; Figure 14 shows average preforecasting characteristics.

The parameters of the cyclic component of a time series constitute an important preforecasting characteristic. Phase analysis is applied to reveal this component and to estimate its parameters. For a time series $Z = \langle z_i \rangle$, $i = 1, \dots, n$, the basis of this analysis is the phase trajectory $\Phi(Z) = \{(z_i, z_{i+1})\}$, $i = 1, \dots, n-1$ [3]. Figure 16 gives a graphical representation of the phase trajectory of the time series presented in Figure 5.

Figure 16: Phase trajectory of the time series of average monthly solar activity (see Figure 5) (first level of hierarchy of the cyclic component). Axes: $z_i$ (horizontal), $z_{i+1}$ (vertical).

Figure 17: (a) typical quasi-cycle in the decomposition of the phase trajectory of the time series of average monthly solar activity; (b) distribution of the relative frequency of quasi-cycle lengths L.
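The phase trajectory defined above, and a length distribution of the kind shown in panel (b) of Figure 17, can be sketched in a few lines of Python. This is an illustrative sketch only: the rule by which [3] splits the phase trajectory into quasi-cycles is not given in this text, so the quasi-cycle lengths are taken here as an already available input, and the names phase_trajectory and length_distribution, as well as the sample numbers, are hypothetical.

```python
from collections import Counter


def phase_trajectory(z):
    """Return the points (z_i, z_{i+1}), i = 1..n-1, of the phase trajectory."""
    return list(zip(z[:-1], z[1:]))


def length_distribution(lengths):
    """Return the relative frequency of each quasi-cycle length.

    `lengths` is assumed to come from some decomposition of the phase
    trajectory into quasi-cycles; that decomposition is not implemented here.
    """
    counts = Counter(lengths)
    total = len(lengths)
    return {length: counts[length] / total for length in sorted(counts)}


if __name__ == "__main__":
    series = [110, 125, 140, 160, 150, 130, 115, 120]   # made-up levels
    print(phase_trajectory(series))

    cycle_lengths = [5, 4, 5, 6, 5, 4, 7, 5]            # made-up lengths
    print(length_distribution(cycle_lengths))
```

Keeping the decomposition step outside the sketch avoids presenting a guessed splitting rule as the method of [3].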
Axes of Figure 17: (a) $z_i$ (horizontal), $z_{i+1}$ (vertical); (b) quasi-cycle length L (horizontal), relative frequency (vertical).

Figure 18 gives the phase trajectory of the time series presented in Figure 6. Phase trajectories are broken into quasi-cycles. Examples of typical quasi-cycles are shown in Figures 17 and 19. Here the main parameter is the frequency distribution of the quasi-cycle lengths.

Figure 18: Phase trajectory of the time series of average annual solar activity (see Figure 6) (second level of hierarchy of the cyclic component). Axes: $z_i$ (horizontal), $z_{i+1}$ (vertical).

Figure 19: (a) typical quasi-cycle in the decomposition of the phase trajectory of the time series of average annual solar activity (see Figure 18); (b) distribution of the relative frequency of quasi-cycle lengths. Axes of panel (b): quasi-cycle length $l$ (horizontal), frequency $h(l)$ (vertical).

The results of phase analysis provide the following data:
• presence or absence of a cyclic component in the time series;
• structure of the set of all quasi-cycles, i.e., the values of their lengths and the frequencies of these lengths;
• trajectory of evolution of the quasi-cycle centers;
• trajectory of evolution of the sizes of the bounding rectangles of the quasi-cycles [3].

We especially note that phase analysis of a time series can be carried out together with the procedure of aggregation of the time series levels. We are then able to reveal the hierarchical structure of the cyclic component of the given time series. Consider, for instance, the time series of solar activity from 1700 A.D. to 2005 A.D. (Figure 6). The cyclic component of this time series has a three-level hierarchy [3]. The first level, i.e., the bottom one of the hierarchy, is constituted by the quasi-cycles of the phase trajectory of the average monthly time series (Figures 16 and 17).
The average length of these quasi-cycles is $l^{1}_{mean} \approx 5$ months. The second level, i.e., the middle one of the hierarchy, is formed by the quasi-cycles of the phase trajectory of the average annual time series (Figures 18 and 19), where $l^{2}_{mean} \approx 11$ years. The third level of the hierarchy is constituted by centennial quasi-cycles, i.e., quasi-cycles with average length $l^{3}_{mean} \approx 100$ years (the phase trajectory of the third level is shown in Figure 20).

Figure 20: Phase trajectory of local maxima of the time series of average annual solar activity (third level of hierarchy of the cyclic component).

We note that the cyclic component of the infant morbidity time series (Figure 7) also has a three-level hierarchy. Revealing the hierarchy of the cyclic component of a given time series is important for forecasting it: the hierarchy allows the forecasting horizon to be extended many times. For example, the solar activity time series may have a forecasting horizon of tens of years.

Let us return to the second sublevel of the bottom level of applied optimization. According to (1)-(3), the apparatus of cellular automata is the mathematical basis of this sublevel. We especially note the well-known result of von Neumann: every Turing machine can be realized by a cellular automaton (see the joint issue of the journal [4]). It implies that every algorithm can be realized by a cellular automaton. The cellular automaton is a basic model for forecasting time series with memory. A cellular automaton handles not numbers but states. In other words, the cells of the automaton remember not numbers but linguistic values, for instance "low", "medium", "high". Before being entered into the memory of the cellular automaton, a numeric time series is transformed into a linguistic time series. One of the stages of this transformation is depicted in Figure 21.
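The transformation of a numeric series into a linguistic one can be illustrated by a short Python sketch. It rests on assumptions that are not stated in the text: the band between the bottom envelope line (BEL) and the upper envelope line (UEL) is split into equal parts, one per term, and both envelopes are supplied as ready-made sequences. The function name to_linguistic and the sample numbers are hypothetical.

```python
def to_linguistic(z, bel, uel, terms=("low", "medium", "high")):
    """Map each level of z to a linguistic term.

    bel and uel are the bottom and upper envelope values at the same time
    points as z. Assumption of this sketch: the band [bel, uel] is divided
    into len(terms) equal sub-bands, lowest sub-band first.
    """
    labels = []
    for value, lower, upper in zip(z, bel, uel):
        width = upper - lower
        if width <= 0:
            raise ValueError("upper envelope must lie above bottom envelope")
        position = (value - lower) / width          # 0 at BEL, 1 at UEL
        index = int(position * len(terms))
        index = max(0, min(index, len(terms) - 1))  # clamp values on or outside the band
        labels.append(terms[index])
    return labels


if __name__ == "__main__":
    yields = [12.0, 20.0, 28.0, 30.0]     # made-up numeric levels
    bottom = [10.0, 10.0, 10.0, 10.0]     # made-up BEL
    top = [30.0, 30.0, 30.0, 30.0]        # made-up UEL
    print(to_linguistic(yields, bottom, top))
```

Clamping keeps levels that touch or slightly cross an envelope inside the term set instead of raising an index error.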
Figure 21: Transformation of a numerical time series into a linguistic time series. UEL: upper envelope line; BEL: bottom envelope line. The band between the envelopes is marked low, medium, high. Horizontal axis: years (1952-2002).

The cellular-automaton forecast is a fuzzy set. Figure 22 shows a forecasting result for the yield time series (see Figure 1). Here we can see three forms of representation of the obtained yield forecast:
• first, in the form of a linguistic fuzzy set;
• second, in the form of a numeric fuzzy set;
• third, in the form of a number obtained by the defuzzification procedure.

Figure 22: Illustrative example of forecasting results.
Linguistic fuzzy set: $U_{n+1} = \{(L; 0.03), (M; 0.49), (H; 0.48)\}$, where L, M, H denote low, medium, high.
Numerical fuzzy set: $Y_{n+1} = \{(18.9; 0.03), (26.8; 0.49), (32.0; 0.48)\}$
Forecast in the form of a number: $y_{n+1} = \sum_{t=1}^{3} \mu_t \cdot y'_t = 0.03 \cdot 18.9 + 0.49 \cdot 26.8 + 0.48 \cdot 32.0 = 28.6$ centners per hectare
Forecasting error: ~9.8 %

Figure 23: Instead of cellular-automaton forecasting (fuzzy systems + cellular automaton) we use hybrid time series forecasting: triple hybrid = fuzzy systems + cellular automaton + phase analysis.

We should pay special attention to hybrid time series forecasting methods, for example, the triple hybrid "cellular automaton + phase analysis + fuzzy system" (Figure 23). Figures 24 and 25 demonstrate the idea of this hybrid (see Figure 24, where the two-month quasi-cycles on intervals B and C show the property of similarity). The quasi-cycles of the phase trajectory "optimize" the values of the membership function of a cellular-automaton forecast such as the one presented in Figure 22. The error of hybrid forecasting is 20-30 per cent lower than the error of the cellular-automaton forecast.
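The third form of the forecast in Figure 22, a single number obtained by defuzzification, can be sketched in Python as a membership-weighted sum over the numeric fuzzy set. The function name defuzzify and the pair layout (value, membership) are assumptions of this sketch. Evaluated literally with the three pairs printed in Figure 22, the sum comes to about 29.06, slightly above the reported 28.6; the sketch does not try to resolve which of the printed numbers is intended.

```python
def defuzzify(fuzzy_set):
    """Return the membership-weighted sum of the values of a numeric fuzzy set.

    fuzzy_set is a list of (value, membership) pairs. As in Figure 22, the
    memberships are assumed to sum to one, so no normalization is applied.
    """
    return sum(membership * value for value, membership in fuzzy_set)


if __name__ == "__main__":
    # Pairs (value, membership) as printed in Figure 22.
    forecast = [(18.9, 0.03), (26.8, 0.49), (32.0, 0.48)]
    print(round(defuzzify(forecast), 2))
```

If the memberships of a forecast do not sum to one, dividing the result by their sum would be a natural variant, but that is not what Figure 22 shows.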
Figure 24: Time series of the weekly number of people who contracted ARD. B: base interval of the time series; C: forecasting interval of the time series. Axes: weeks (horizontal), number of diseased (vertical).

Figure 25: Two-month quasi-cycle of the phase trajectory of the weekly time series (see Figure 24, base interval B, January 6, 2002 - March 3, 2002). Axes: $y_j$ (horizontal), $y_{j+1}$ (vertical); both axes are divided into the bands low, medium, high.

In several cases it is possible to replace the fuzzy set with an interval. Then, at the upper level, we consider an optimization problem with interval data.

Suppose that the forecast values of the parameters of the upper-level optimization problem have been obtained at the bottom level. Let us formulate important statements relevant to the upper level of the applied optimization problem. As an illustration, consider the well-known perfect matching problem on a graph $G = (V, E)$, $|V| = n$. The set of all feasible solutions $X = \{x\}$ consists of subgraphs $x = (V_1, V_2, E_x)$, $E_x \subseteq E$, whose edge set $E_x$ is a perfect matching in $G$, $|E_x| = n$ for all $x \in X$. A weight $\omega(e)$ is assigned to every edge $e \in E$; this weight is either a fuzzy number or an interval $\omega(e) = [\omega_1(e), \omega_2(e)]$. The objective function

$F(x) = \sum_{e \in E_x} \omega(e) \to \mathrm{extr}$, $\mathrm{extr} \in \{\min, \max\}$   (5)

correspondingly takes values that are fuzzy numbers or intervals. In this case, pairs of incomparable solutions $x', x'' \in X$ appear. Then, in the general case, the problem with objective function (5) has no optimum, but it has a Pareto set (PS) $\tilde{X} \subseteq X$. The algorithmic problem then arises of finding the Pareto set $\tilde{X}$ or a complete set of alternatives (CSA) $X^0 \subseteq \tilde{X}$. A subset $X^0 \subseteq \tilde{X}$ is called a CSA if its cardinality $|X^0|$ is minimal and, at the same time, the equality $F(X^0) = F(\tilde{X})$ holds.
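The appearance of incomparable solutions, the Pareto set and a CSA can be illustrated on a tiny instance by brute force. The Python sketch below makes several assumptions that go beyond the text: the graph is a complete bipartite graph encoded as an n-by-n matrix of interval weights (lower, upper); interval values of the objective function are compared endpoint by endpoint, so that one solution dominates another if it is no worse in both endpoints and differs in at least one; and all matchings are enumerated, which is feasible only for very small n. The function name pareto_and_csa and the sample weights are hypothetical.

```python
from itertools import permutations


def pareto_and_csa(weights, sense="min"):
    """Enumerate perfect matchings and return (Pareto set, CSA).

    weights[i][j] = (w1, w2) is the interval weight of edge (i, j).
    A matching is a permutation p with row i matched to column p[i].
    Both results are dicts; the Pareto set maps matching -> (F1, F2),
    the CSA maps each distinct Pareto value (F1, F2) -> one matching.
    """
    n = len(weights)
    sign = 1 if sense == "min" else -1

    # Interval value of the objective: endpoint-wise sums over the matching.
    values = {}
    for p in permutations(range(n)):
        f1 = sum(weights[i][p[i]][0] for i in range(n))
        f2 = sum(weights[i][p[i]][1] for i in range(n))
        values[p] = (f1, f2)

    def dominates(a, b):
        return a != b and all(sign * u <= sign * v for u, v in zip(a, b))

    pareto = {p: f for p, f in values.items()
              if not any(dominates(g, f) for g in values.values())}

    # CSA: keep one representative per distinct objective value.
    csa = {}
    for p, f in pareto.items():
        csa.setdefault(f, p)
    return pareto, csa


if __name__ == "__main__":
    w = [[(1, 3), (2, 2)],
         [(2, 2), (1, 3)]]              # made-up interval weights
    pareto_set, csa = pareto_and_csa(w)
    print(pareto_set)
    print(csa)
```

In this made-up instance the two matchings have the values (2, 6) and (4, 4): neither is better in both endpoints, so both are Pareto optimal and no single optimum exists.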
Let us formulate a number of statements for optimization problems on graphs with fuzzy or interval weights. The following statements have been established:

1. In the classical formulation, the following problems are polynomially solvable: the perfect matching problem, the spanning tree problem, the shortest path problem, and others. These problems become intractable when the weights of the graph edges are fuzzy or interval [4].

2. The class of such intractable problems contains polynomially solvable subclasses. One example is the subclass defined by graphs whose edges are weighted by intervals of equal length, $\omega_2(e) - \omega_1(e) = \mathrm{const}$.

3. Every problem with objective function (5) on graphs $G = (V, E)$ with interval weights $\omega(e) = [\omega_1(e), \omega_2(e)]$ is equivalent to a two-criterion problem with the vector objective function $F(x) = (F_1(x), F_2(x))$, where $F_\nu(x) = \sum_{e \in E_x} \omega_\nu(e) \to \mathrm{extr}$, $\nu = 1, 2$ [4, 5].

4. The theory of algorithms with estimates is constructive for discrete optimization problems with fuzzy or interval data. In particular, statistically effective algorithms [6], polynomial asymptotically exact algorithms [7], sufficient conditions of polynomial solvability, and so on exist for these problems.

5. All known operations of "summation of fuzzy numbers" are based on set-theoretic operations. In other words, fuzzy systems theory has no arithmetic that satisfies the informal interpretation of applied optimization problems.

References

1. Peters Edgar E. Chaos and Order in the Capital Markets. A New View of Cycles, Prices and Market Volatility. John Wiley & Sons, Inc., New York, Toronto, 1996.
2. Hurst H.E. Long-term Storage Capacity of Reservoirs. Transactions of the American Society of Civil Engineers, 116, 1951.
3. Perepelitsa V.A., Tebueva F.B., Temirova L.G. Data Structuring by Methods of Nonlinear Dynamics under Process of Two-level Modelling. Stavropol Book Publisher, Stavropol, 2006 (in Russian).
4. Perepelitsa Vitaliy A., Kozina Galina L.
Interval Discrete Models and Multiobjectivity. Complexity Estimates. Interval Computations, No. 3 (1993), pp. 51-59. Institute for New Technologies, St. Petersburg-Moscow.
5. Emelichev V.A., Perepelitsa V.A. On Cardinality of the Set of Alternatives in Discrete Many-criterion Problems. Discrete Mathematics and Applications, Vol. 2, No. 5, pp. 461-471 (1992). VSP, Utrecht, the Netherlands; Tokyo, Japan.
6. Perepelitsa V.A. On Two Problems from the Theory of Graphs. American Mathem. Society, "Soviet Math. Dokl.", Vol. 11 (1970), No. 5, 1971, pp. 1376-1379.
7. Perepelitsa V.A. An Asymptotic Approach to the Solution of Certain Extremal Problems on Graphs. Problems of Cybernetics, No. 26 (1973), pp. 291-314. Nauka, Moscow (in Russian).