JOURNAL OF APPLIED COMPUTER SCIENCE Vol. 19 No. 2 (2011), pp. 61-71

Multilevel Time Series Complexity

Bohdan Kozarzewski
University of Information Technology and Management
Faculty of Applied Computer Science
H. Sucharskiego 2, 35-225 Rzeszów, Poland
[email protected]

Abstract. A simple and fast algorithm for quantifying time series complexity, which follows a newly developed nonparametric complexity measure of symbolic sequences, is proposed. In order to obtain a complexity measure over many time scales, I suggest using multilevel wavelet decomposition of the time series instead of coarse-graining. As examples, the multilevel complexity of series generated by the Henon map, as well as of data downloaded from the PhysioBank database (synthetic series, gait dynamics, and interbeat heart rate), is calculated.

Keywords: complexity, wavelet decomposition, biomedical signals.

1. Introduction

The notion of time series complexity still remains somewhat abstract. There is no precise formal definition of time series complexity; it is only vaguely defined, and many alternatives have been proposed. There is also no agreement on how to quantify time series complexity. The mathematical definition of symbolic sequence complexity due to Kolmogorov relies on information theory: the complexity of a symbolic sequence is the length of the shortest binary input to a universal Turing machine that reproduces the sequence. Unfortunately, the definition is of little help in practical applications. On the other hand, complexity seems to be essential for understanding the mechanisms underlying complex systems.

For practical purposes, two main approaches to quantifying time series complexity are in use. The first relies on information entropy as a tool for defining complexity. The complexity measure is defined as the difference between two entropies: the sum of local entropies and the global entropy [1], or the entropies of the time series computed for two subsets of different length [2]. See also [3], where a modified version of the complexity measure called approximate entropy [4] is used. The second approach is close to the Kolmogorov definition and measures the complexity of a symbolic sequence. The algorithm proposed by Ke and Tong [5] consists of rules for parsing strings of symbols from a finite alphabet; it is a significant modification of the Lempel-Ziv complexity algorithm [6]. The authors call their measure the lattice complexity.

In the present paper I focus my attention on the complexity of binary symbolic sequences, as they can represent, to some extent, the systems whose complexity I would like to estimate. The paper is organized as follows: in Sec. 2 I analyze the lattice complexity and compare this measure of complexity with the sample entropy for series generated by the Henon map and for some data taken from the PhysioBank database [7]. In Sec. 3 I turn to complexity measures across multiple temporal scales and suggest multilevel wavelet decomposition of the signal instead of the multiscale (coarse-graining) method developed in [2].

2. Lattice complexity

Among the motivations for using the techniques of symbolic dynamics in the study of dynamical systems is that symbolic dynamics provides a rigorous treatment of chaotic systems. In the following I will use the simplest symbolic representation of the system, obtained by partitioning the state space into two sets and labeling each element of this partition by "0" or "1".
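Such a two-set partition can be realized in many ways. The short Scilab sketch below splits the state space at the median; this particular threshold is an illustrative choice of mine, not the symbolization actually used later in the paper (the sign-of-increment rule introduced in Sec. 3).

// Illustrative sketch only: partition the state space into two sets at the
// median and label the samples "0"/"1".
function s = symbolize(x)
    thr = median(x);               // boundary between the two partition sets
    s = string(bool2s(x >= thr));  // vector of "0"/"1" one-character strings
endfunction

// Example: a 16-element random series mapped to a binary sequence
x = rand(1, 16);
disp(strcat(symbolize(x)));        // prints a 16-character string of "0"/"1"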
In searching for an adequate measure of the complexity of binary strings, two limiting cases must be considered: regular strings (such as periodic sequences) and random ones. A good measure of physical complexity is expected to yield a vanishing complexity for both cases, while the other strings, which appear to encode a lot of information, are considered complex. The nonparametric measure proposed in [5] satisfies the above conditions; in my opinion it is the best time series complexity measure available at present. The algorithm proposed there consists of rules for parsing a sequence of symbols from a finite alphabet into specific subsequences, called lattices. A lattice is a subsequence with the following properties: it includes an iterative sequence as its prefix; it remembers the history of the sequence and can repeat any series of successive operations held in that memory; and the last symbol of a lattice must be inserted into the lattice unless the end of the series is reached. The Appendix contains a listing, in Scilab [8] code, of the corresponding three-step procedure for extracting a lattice from a sequence.

The lattice complexity of a symbol sequence is simply the number of lattices in the sequence. I make minor changes to the original definition: in order to make the lowest complexity value equal to zero, 1 is subtracted from the original measure, and to avoid large numbers the result is divided by n log2 n, where n is the length of the series. As a result, the lattice complexity measure used in the present paper is restricted to the range between zero and approximately one.

Complexity analysis helps to detect whether there is any mechanism, or some dynamical system, behind an output time series, and is essential for understanding the mechanisms underlying dynamical systems. Complexity analysis of biological time series (recordings of gait dynamics or heart interbeat intervals, for example) appears to be useful in discriminating whether they come from healthy persons or from patients with some disease.

To gain some insight into the lattice complexity I compare the two measures described above, i.e. the sample entropy (SE) and the lattice complexity (Lc), for several series. SE quantifies the unpredictability of time series data. It reflects the likelihood that similar observations will not be followed by further similar observations. Let (x_1, ..., x_N) represent a time series of length N and u_m(i) = (x_i, x_{i+1}, ..., x_{i+m-1}) be a vector of size m. Let n_i^m(r) be the number of vectors u_m(j) within distance r from the vector u_m(i). The distances between vectors are calculated as the maximum absolute difference between their corresponding components. If I define

Φ^m(r) = (1 / (N - m)) Σ_{i=1, i≠j}^{N-m} ln [ n_i^m(r) / (N - m) ],    (1)

then the approximate entropy for a finite time series is given by

SE(m, r, N) = Φ^m(r) - Φ^{m+1}(r).    (2)

Sample entropy is a function of the parameters m and r; it depends only weakly on the time series length N when N exceeds approximately 10^3.

Figure 1. Complexity measure of the Henon map generated series for b = 0.3. Left: sample entropy; right: lattice complexity.

As a test time series I use the output of the chaotic Henon map, which is usually regarded as two-dimensional but can be rewritten as

x_i = b x_{i-2} - a x_{i-1}^2 + 1.    (3)

This map reduces to the logistic map when b = 0, becomes conservative when b = 1, and is dissipative between the two cases. I focus my analysis on the parameter b = 0.3, where the complexity of the system significantly exceeds zero.
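A minimal sketch of how a series obeying Eq. (3) can be generated is given below; the initial conditions and the length of the discarded transient are my own illustrative choices, not values specified in the paper.

// Sketch: generate a Henon-map series according to Eq. (3).
// Initial conditions and transient length are illustrative assumptions; for
// some values of a in the scanned range the orbit diverges, as noted in the text.
function x = henon_series(n, a, b)
    m = n + 100;                        // 100 transient samples to be discarded
    x = zeros(1, m);
    x(1) = 0.1; x(2) = 0.1;             // arbitrary initial conditions
    for i = 3:m
        x(i) = b*x(i-2) - a*x(i-1)^2 + 1;
    end
    x = x(101:m);                       // drop the transient
endfunction

// Example: one of the series behind Fig. 1 (b = 0.3, a swept over [1, 1.42])
y = henon_series(2^15, 1.3, 0.3);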
I allow the parameter a to vary within the range 1 ≤ a ≤ 1.42, where in different subranges the series is periodic, chaotic, or divergent. For a within this range, with increment 0.005, I generate series of 2^15 elements and calculate for each series its sample entropy (with parameters as in [3], m = 1 and r = 0.15) as well as its lattice complexity. One can see that the complexity is close to zero, with a small chaotic component, in the parameter range where the series is mostly periodic, and equals zero above approximately a = 1.42. In the parameter range where the complexity takes larger values, the series exhibits more or less complex dynamics. There is a qualitative similarity in the overall behavior of the sample entropy and the lattice complexity; however, there are substantial differences in the details.

3. Multiscale (multiresolution) lattice complexity

The measures of time series complexity discussed above quantify the degree of complexity on a single time scale only. However, the output of a dynamical system very often has a complicated temporal structure on different scales. Examples are the outputs of multiple physiologic control mechanisms; in particular, heart rate variability is an output that operates over a wide range of time scales. Costa et al. [2] have proposed a method for calculating multiscale entropy (MSE) from complex signals. For a time series (x_1, ..., x_N) they construct a coarse-grained set of time series y^(l) by averaging l data points (x_i, x_{i+1}, ..., x_{i+l-1}) in consecutive non-overlapping windows. The number l is called the scale of the coarse-graining procedure. The length of the coarse-grained series decreases with the scale as N/l, which results in an increasing error of the sample entropy at higher scales. The sample entropy indeed depends on the scale factor l and, what is more, the character of the SE(l) function depends on the exponent of the power-law correlations of the time series. Scale-dependent sample entropy makes it possible to discriminate, for example, between output signals generated by a dynamical system in different environments, or between healthy persons and patients with some health disorders. As we will see, so does the multilevel lattice complexity; however, there are significant differences between the two approaches to scale-dependent complexity. To explain them I consider some examples.

3.1. Surrogate time series

One might assume that the simplest way to obtain a scale-dependent lattice complexity is to adopt the coarse-graining procedure. To test this hypothesis I selected four surrogate time series downloaded from the public domain PhysioBank archives [7]. They belong to the synthetic data category and are characterized by different exponents α. The signals of interest are labeled 0117, 0517, 0917 and 1517, where "01" means α = 0.1 and "17" means signal length 2^17, and so on. The exponent α measures the degree of correlations in the signal, F(n) ∼ n^α, where F is the root mean square fluctuation function and n is the box length. For more details see [9]. More precisely, I restricted myself to shorter signals of length 2^14 and coarse-graining scales from 0 to 10, where scale 0 corresponds to the original, non-coarse-grained signal (the coarse-graining step itself is sketched in the code fragment below). The results shown in the left plot of Fig. 2 are a bit surprising: within numerical error, the lattice complexity does not depend on the coarse-graining scale. This means that the coarse-graining algorithm fails to quantify the lattice complexity across multiple scales. There are at least two methods for developing a scale-dependent lattice complexity.
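For reference, the coarse-graining step of [2] referred to above can be sketched as follows; this is a minimal version, and the function name is my own.

// Sketch of the coarse-graining of Costa et al. [2]: average l consecutive
// samples in non-overlapping windows; l = 1 returns a copy of the original series.
function y = coarse_grain(x, l)
    n = floor(length(x) / l);           // coarse-grained length, about N/l
    y = zeros(1, n);
    for j = 1:n
        y(j) = mean(x((j-1)*l+1 : j*l));
    end
endfunction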
The first method consists of wavelet decomposition of the signal and the use of approximation wavelet coefficients instead of the coarse-grained series. The second could be a fine-graining procedure that converts the digital series into a symbolic one with the use of a refining alphabet of sufficiently large size; that, however, leads to a significant increase in computational complexity. Here I restrict myself to the former method and the simplest alphabet of size 2. The details of the method are as follows.

The wavelet decomposition of a numerical series at some level l is a structure [A_l, D_l] that contains the decomposition vector A_l and the bookkeeping vector D_l. The decomposition vector A_l is composed of the approximation coefficients at level l and the detail coefficients at levels 1, 2, ..., l. There is evidence that the decomposition vectors have the same complexity as the original time series at the corresponding level. Performing the wavelet decomposition of a signal, I collect the decomposition vectors at levels from 1 to some reasonable maximum level l_max, which depends on the length of the time series and the particular wavelets used. In the next step, the relative differences of consecutive elements of the decomposition vectors (and of the original series, considered as level l = 0) are calculated,

r(i) = (A(i) - A(i-1)) / A(i-1),    (4)

and then transformed into a binary symbolic sequence according to the following rule:

r(i) → 0 if r(i) < 0, 1 otherwise.    (5)

In this way I obtain l_max + 1 symbolic series (each of the same length as the signal itself), representing the original time series and the decomposition vectors at all levels considered. The set of lattice complexities Lc(l) for all levels is what I call the multilevel lattice complexity. The multilevel lattice complexity depends to some extent on the wavelet family used and very weakly on the time series length. In the following the Haar wavelet family will be used.

Now I am going to analyze some examples which show the potential ability of the multilevel complexity to discriminate between time series generated by particular dynamics. At the beginning I turn again to the surrogate time series discussed earlier. I perform a multilevel decomposition of the time series (instead of coarse-graining) and calculate the lattice complexity according to the algorithm described above. The wavelet decomposition clearly discriminates between series with different correlations and can be used as a tool for extracting time series structure on multiple scales.

Figure 2. Lattice complexity of synthetic series over a range of scales. Left: coarse-grained scaling; right: multilevel wavelet decomposition.

3.2. Gait dynamics

Now I test the discrimination ability of the multilevel complexity measure on data containing stride intervals. Human gait is one of the complex mechanisms of interaction of the human body with the environment. It is known that there is random variation in the stride interval of humans during walking and that this variability exhibits long-time correlations. The fractal and multifractal properties of stride interval time series were studied using, among other tools, the distribution of local Hölder exponents [10]. The authors established that the stride interval time series is more complex than a monofractal phenomenon and that under different gait conditions a slightly multifractal and non-stationary time series emerges. Besides, many diseases affect gait cycle duration and general gait dynamics. A better understanding of gait dynamics may therefore serve as a diagnostic and prognostic tool for therapeutic intervention.
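For completeness, before the gait data are analyzed, the multilevel procedure of Eqs. (4)-(5) can be sketched in code. The Haar analysis step is written out explicitly (pairwise sums and differences scaled by 1/sqrt(2)) rather than taken from a wavelet toolbox; the sketch assumes the series length is divisible by 2^l, and for another wavelet family a toolbox routine would be substituted.

// Sketch: decomposition vector at level l for the Haar family,
// [approximation at level l, details at levels l, ..., 1], as described above.
// Assumes length(x) is divisible by 2^l.
function A = haar_decomposition_vector(x, l)
    a = x; D = [];
    for k = 1:l
        ao = a(1:2:$); ae = a(2:2:$);
        d = (ao - ae) / sqrt(2);        // detail coefficients at level k
        a = (ao + ae) / sqrt(2);        // approximation coefficients at level k
        D = [d, D];                     // coarser details go in front of finer ones
    end
    A = [a, D];
endfunction

// Sketch of Eqs. (4)-(5): relative increments mapped to a binary sequence
// (assumes no zero entries in A).
function s = symbolize_increments(A)
    r = (A(2:$) - A(1:$-1)) ./ A(1:$-1);   // Eq. (4)
    s = strcat(string(bool2s(r >= 0)));    // Eq. (5): "0" if r < 0, "1" otherwise
endfunction

The binary sequence for level 0 is obtained by applying the symbolization directly to the original series; the lattice complexities of the resulting l_max + 1 sequences form Lc(l).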
I focus my attention on the time series of intervals between successive strides in human gait, and restrict myself to studying the multilevel complexity of the stride intervals of young healthy individuals under different circumstances. More precisely, the data I selected are the stride interval time series of 5 healthy young men walking at a fast pace in both free and metronomically triggered conditions. The relevant data were taken from the PhysioBank signal archives, Gait databases (the 5 longest files from the set si01.fast to si10.fast and from *.metfast). Fig. 3 shows the averaged results for both fast and metronomic fast walking.

Figure 3. Lattice complexity distribution of spontaneous fast and metronomic fast walking.

Again, the wavelet decomposition clearly discriminates between the stride interval time series recorded under the two conditions at the lower levels. The result suggests that the output of walking at a fast pace in free conditions is more complex at medium scales and indistinguishable from the metronomically triggered walking output at long scales.

3.3. Human heart interbeat rate

Heart rate variability is among the relatively simple tools for studying the physiologic mechanisms responsible for the control of heart rate fluctuations, in which the autonomic nervous system appears to play the primary role. Heart rate variability typically shows a complex behaviour which is believed to reflect the complexity of a central physiologic control system. Changes in this complex behaviour have been observed in many clinical states: autonomic neuropathy, heart transplantation, congestive heart failure, and other cardiac and non-cardiac diseases. Heart rate variability also depends on age as well as on the behavioral state of the individual, e.g. usual daytime activity and sleep at night. In the present example I use the multilevel complexity to answer the question whether there is any characteristic difference in scaling behavior between the heart dynamics of young and elderly individuals. I used the interbeat heart rate records from a mini collection of the PhysioBank signal archives called the Fantasia Database Subset. This collection consists of 10 heart beat time series (about 5000 elements long) from two groups of healthy men: five young (average age about 26 years) and five elderly (average age about 74 years). Fig. 4 shows the averaged result of the multilevel lattice complexity analysis for each of the groups.

Figure 4. Lattice complexity distribution of heart interbeat rate for young and elderly individuals.

The lattice complexity combined with wavelet decomposition clearly discriminates between the interbeat time series of the two groups of individuals at all scales. It is interesting to note that the lattice complexity of the elderly group is higher at all levels, in contrast to the entropy results of [3], which indicate a loss of complexity with disease and aging. The topic needs further investigation.

4. Conclusions

The main objective of the present paper was to test the ability of the lattice complexity to distinguish effectively, on many time scales, between signals generated by healthy individuals under different conditions, or by healthy and diseased individuals. All the examples considered show that the lattice complexity is able to do the job. There is hope that the multilevel lattice complexity of the output of a dynamical system may allow us to learn more about the underlying mechanism and prove to be of practical importance.
References

[1] Rajkovič, M., Entropic nonextensivity as a measure of time series complexity, [on-line]. Access: arXiv:nlin/0404019v1, 2004, [2010-12-01].
[2] Costa, M., Peng, C.-K., Goldberger, A. L., and Hausdorff, J. M., Multiscale entropy analysis of human gait dynamics, Physica A, Vol. 330, 2003, pp. 53-60.
[3] Costa, M., Goldberger, A. L., and Peng, C.-K., Multiscale entropy analysis of biological signals, Physical Review E, Vol. 71, No. 2, 2005, pp. 021906-1 - 021906-17.
[4] Pincus, S. M., Assessing serial irregularity and its implications for health, Ann. N. Y. Acad. Sci., 2001, pp. 245-267.
[5] Ke, D.-G. and Tong, Q.-Y., Easily adaptable complexity measure for finite time series, Physical Review E, Vol. 77, No. 5, 2008, pp. 066215-1 - 066215-8.
[6] Lempel, A. and Ziv, J., On the complexity of finite sequences, IEEE Trans. Inform. Theor., Vol. IT-22, 1976, pp. 75-81.
[7] PhysioBank, PhysioBank Archive Index, [on-line]. Access: http://www.physionet.org, [2010-12-01].
[8] Scilab-5.0.3, Consortium Scilab (INRIA, ENPC), [on-line]. Access: http://www.scilab.org, [2010-12-01].
[9] Xu, L., Ivanov, P. C., Hu, K., Carbone, A., and Stanley, H. E., Quantifying signals with power-law correlations, Physical Review E, Vol. 71, No. 5, 2005, pp. 051101-1 - 051101-14.
[10] Scafetta, N., Griffin, L., and West, B. J., Hölder exponent spectra for human gait, Physica A, Vol. 328, 2003, pp. 561-583.

Appendix

function La=Lattice(y,x)
// y - part of the series already partitioned into lattices
// x - the rest of the series
n=length(x);
//1- - - - - iterative prefix
P=x(1); i=2;
k=strindex(P,x(i));
while(isempty(k))
    P=strcat([P,x(i)]);
    if(i==n), La=P; return; end;
    i=i+1;
    k=strindex(P,x(i));
end;
Q=strcat([P,x(i)]);
//2- - - - - repetition of the remembered history
j=k; P=""; R="";
while(k)
    j=j+1;
    P=strcat([P,x(j)]);
    i=i+1;
    R=strcat([R,x(i)]);
    if(i==n), La=P; return; end;
    k=strcmp(P,R);
end;
//3- - - - - extension over the memory
Q=strcat([Q,R]);
P=Q(1:$-1);
i=length(Q);
S=strcat([y,P]);
k=strindex(S,Q);
while(~isempty(k))
    if(i==n), La=Q; return; end;
    i=i+1;
    Q=strcat([Q,x(i)]);
    k=strindex(S,Q);
end;
La=Q;
endfunction
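A hypothetical driver showing how the routine above could be used to obtain the normalized measure of Sec. 2 is sketched below. It assumes that Lattice(y, x) returns the next lattice as a string when given the already-parsed part y (a string) and the remainder x (a vector of one-character strings); this interface is my reading of the listing, not something stated in the paper.

// Hypothetical driver (interface assumptions stated above): parse the whole
// symbolic series into lattices and apply the normalization of Sec. 2,
// Lc = (number of lattices - 1) / (n log2 n).
function c = lattice_complexity(s)
    n = size(s, "*");                  // s: vector of "0"/"1" one-character strings
    y = ""; pos = 1; count = 0;
    while pos <= n
        La = Lattice(y, s(pos:n));     // next lattice of the unparsed remainder
        count = count + 1;
        y = y + La;                    // extend the already-parsed part
        pos = pos + length(La);        // length(La) = number of symbols in the lattice
    end
    c = (count - 1) / (n * log(n) / log(2));
endfunction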