Similarity Measure Based on Partial Information of Time Series

Advisor: Dr. Hsu
Graduate: You-Cheng Chen
Authors: Xiaoming Jin, Yuchang Lu, Chunyi Shi

Outline
Motivation
Objective
Introduction
Retrieval and Representation of Partial Information
System Setup
Results and Discussion
Conclusions
Personal Opinion

Motivation
Whether a similarity measurement is “good” is ultimately judged by humans.

Objective
To propose a model for the retrieval and representation of the partial information in time series.

Introduction
The model has three objectives:
(1) Retrieve the partial information.
(2) Represent the partial information in a compressed form.
(3) Remain compatible with most existing similarity models.

Retrieval and Representation of Partial Information

3.1 General Description
A time series is written as $X = (X(1), \ldots, X(N))$.
Definition 1: Use a rule $F$ to decompose $X$ into a set of time series $X' = (X'_1, \ldots, X'_T)$.
Definition 2:
(1) Segment $X$ into a set of sub-series of length $r$, $X_j = (X(jr - r + 1), \ldots, X(jr))$.
(2) $X'_{jk}$ is the $k$-th $F$-based component of sub-series $X_j$. Use a mapping rule $T$ to map each $X'_{jk}$ to a value $R_k(j)$.
Definition 3: $K = (K_1, \ldots, K_W)$ is the orders of all the representing sequences of interest.
$A_K = (A_{K_1}, \ldots, A_{K_W})$, where $A_{K_n}$ is the degree of the user's interest in the $K_n$-th component.
$\sum_{n=1}^{W} A_{K_n} X'_{K_n}$ is the portion of partial information of interest.
Definition 4: $R(m) = A_{K_{\mathrm{MOD}(m,W)}} \, R_{K_{\mathrm{MOD}(m,W)}}(\lceil m/W \rceil)$ is the full representing sequence (FRS) of the partial information $\sum_{n=1}^{W} A_{K_n} X'_{K_n}$.
Definition 5: Given two time series $X$ and $Y$, $MD(X, Y) = D(FRS(X), FRS(Y))$.
To sum up, a representing model for partial information is specified by:
(1) Decomposition method $F$
(2) Representation method $T$
(3) Distance measurement $D$

Example 1
Use $F$ to decompose the time series into two components:
(1) Local fluctuating movement $S'_1$
(2) Global movement $S'_2$
$R_1(j) = \begin{cases} 1 & \text{if } S'_{j1} \text{ is a fluctuation} \\ 0 & \text{otherwise} \end{cases}$
$FRS(X) = R_1$, and the length of $FRS(X)$ is $200/8 = 25$.

3.2 Practical Method
Let $H$ be the transform matrix of a given orthonormal discrete transform, so $T_j = H X_j$.
We denote the results of the discrete transform of the sub-series $X_j$ and $Y_j$ by $DT(X_j) = XT_j$ and $DT(Y_j) = YT_j$.
The $k$-th component of $X$ is $X'_k(n) = T_{\lceil n/r \rceil}(k) \, IB_k(n - (\lceil n/r \rceil - 1) r)$, where $IB_m = (H^{-1}_{0,m}, H^{-1}_{1,m}, \ldots, H^{-1}_{r-1,m})$.
The $k$-th representing sequence is $R_k(m) = T_m(k)$.
Then $FRS(X)$ can be calculated as $R(m) = A_{K_{\mathrm{MOD}(m,W)}} \, T_{\lceil m/W \rceil}(K_{\mathrm{MOD}(m,W)})$.
$MD(X, Y) = \sqrt{\sum_{j=1}^{q} \sum_{n=1}^{W} \left( XT_j(K_n) - YT_j(K_n) \right)^2 A_{K_n}^2} = L_2\left( \sum_{n=1}^{W} X'_{K_n} A_{K_n}, \; \sum_{n=1}^{W} Y'_{K_n} A_{K_n} \right)$
We use the DCT (discrete cosine transform) in our experiments.

4. System Setup
4.1 Evaluation of Similarity Measurement Based on Partial Information
We use hierarchical agglomerative clustering (HAC) to cluster the FRSs.
$Sim(C_i, S_j) = 2 \, |C_i \cap S_j| / (|C_i| + |S_j|)$
$Sim(C, S) = \left( \sum_i \max_j Sim(C_i, S_j) \right) / k$

5. Results and Discussion
We used historical stock data and considered only the time series of closing prices.
Step 1: Use the DCT to decompose each time series and to represent the partial information.
Step 2: Use $E = (E_1, \ldots, E_r)$ to represent the chosen portion.
Step 3: Use $E$ to calculate $K$ together with $A$, then generate the FRS of each time series.
Step 4: Calculate $MD$ and cluster the time series.
11, 15, 14, 10, 19, 10, 14, 17, 14
3, 3, 3, 3, 2, 4, 4, 5, 5

Conclusions
The experimental results could help in designing a more effective and more efficient similarity measurement.

Personal Opinion
The similarity measurement can be further improved by increasing the weight of the meaningful components.
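As a rough illustration of the pipeline in the slides (segment, transform with the DCT, keep the weighted coefficients of interest, compare with a Euclidean distance, then score a clustering), the following is a minimal sketch and not the authors' implementation. The function names dct_matrix, frs, md and clustering_similarity are hypothetical, coefficient orders in K are 0-based here, a trailing partial segment is simply dropped, and D is assumed to be the L2 distance.

```python
import math


def dct_matrix(r):
    """Orthonormal DCT-II matrix H (r x r), so that T_j = H * X_j."""
    rows = []
    for k in range(r):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / r)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * r))
                     for n in range(r)])
    return rows


def frs(x, r, K, A):
    """Full representing sequence (assumed layout): for each segment of
    length r, emit the weighted coefficients A[i] * T_j(K[i]) in order."""
    H = dct_matrix(r)
    out = []
    # Non-overlapping segments; a trailing partial segment is dropped.
    for start in range(0, len(x) - r + 1, r):
        seg = x[start:start + r]
        for k, a in zip(K, A):
            coeff = sum(H[k][n] * seg[n] for n in range(r))
            out.append(a * coeff)
    return out


def md(x, y, r, K, A):
    """MD(X, Y) = D(FRS(X), FRS(Y)), with D assumed to be the L2 distance."""
    fx, fy = frs(x, r, K, A), frs(y, r, K, A)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(fx, fy)))


def clustering_similarity(C, S):
    """Sim(C, S) = (sum_i max_j Sim(C_i, S_j)) / k, where k = len(C) and
    Sim(C_i, S_j) = 2 |C_i & S_j| / (|C_i| + |S_j|). Clusters are sets."""
    def sim(c, s):
        return 2 * len(c & s) / (len(c) + len(s))
    return sum(max(sim(c, s) for s in S) for c in C) / len(C)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]
    # Keep only the two lowest-order coefficients, weighted equally.
    print(md(x, y, r=4, K=[0, 1], A=[1.0, 1.0]))
    print(clustering_similarity([{1, 2}, {3, 4}], [{1, 2, 3, 4}]))
```

Because the transform is orthonormal, keeping every coefficient with unit weights reproduces the ordinary Euclidean distance between the two series; restricting K or changing A is what makes the measure depend on only part of the information.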