
Key Pose Selection and Compression of Human Motion Capture Data
Pengjie Wang1,2, Xin Yang3, Teer Ba2, Wei Li2
1. College of Information and Communication Engineering, Dalian University of Technology, Dalian, China
2. College of Computer Science & Engineering, Dalian Nationalities University, Dalian, China
3. College of Computer, Dalian University of Technology, Dalian, China
Abstract: Large volumes of motion capture data are being collected with the advance of data acquisition technologies. One of the fundamental problems in the reuse of motion capture data is how to select key poses and compress the data, in order to obtain an overview of a motion and eliminate redundancy. In this paper, we propose to select key poses by adapting a pose distance definition from previous work to measure the similarity of consecutive poses. We then eliminate poses that are similar to previous ones within a specified threshold. Since this scheme employs a precise pose measurement, users can obtain detailed key poses from a motion. Finally, based on the proposed pose selection scheme, we employ LZW encoding to further compress the reduced frames. With the proposed key frame selection and compression method, we can navigate a long motion and reduce its size effectively. Our method can be widely used in computer animation, computer games, virtual reality, etc.
Keywords: Computer animation; Motion capture; Key frame extraction; Key pose selection
1. Introduction
Key pose selection is a fundamental operation in data-driven computer animation. It can be used in motion compression [1-3], overview of motion databases [4, 5], motion synthesis [6], motion retrieval [7], and activity detection and recognition [8]. Previous key pose selection methods focus on genetic algorithms and curve fitting. Liu et al. [9] propose to combine a genetic algorithm with the probabilistic simplex method to obtain an optimized key frame extraction method. Zhang et al. [10] propose to extract key frames from motion capture data by introducing a multiple-population genetic algorithm, which considers both global and local search. Matsuda et al. [11] introduce a refinement technique for handwritten curves to motion capture data. Lim et al. [12] treat a motion as a high-dimensional curve and apply curve simplification to obtain the most distant points as key frames. Yang et al. [13] and Wang et al. [14] improve the curve simplification method by introducing a layered structure and a GPU computing platform, respectively.
However, traditional curve simplification methods are mostly based on tree traversal to obtain key poses under a distance threshold. There are usually two traversal policies, depth-first and breadth-first. The depth-first policy tends to favor some local segments over others, while the breadth-first policy mechanically visits each segment to generate one new pose at a time. Different from these tree traversal methods, we introduce the pose distance defined for motion graphs [15] into key pose generation, and determine key poses by eliminating those whose distance to previous frames is within a threshold. In this scheme, we obtain key poses by defining a distance threshold, without employing any tree traversal policy, and the scheme can therefore produce detailed key poses. Based on this key pose selection, we introduce LZW encoding to further compress the reduced frames. Our proposed method can be used in virtual reality, video games, etc.
2. Pose distance based key pose selection
We adopt the pose distance measurement from the motion graphs paper [15], where a distance between two poses is defined. We make two minor modifications to this definition: we do not use pose windows, and we assume that all joint weights are equal. Our adapted pose distance is computed as follows. First, for two poses A and B, we convert them to the joint position sets p_i and p_i'. Second, we compute the rotation angle θ and the translation (x_0, z_0) that are applied to p_i', rotating it about the y axis and then translating it on the floor plane; after this step, p_i' is as close to p_i as possible. Finally, we obtain the distance by accumulating the squared differences between corresponding joint positions of the two position sets, as Equation (3) shows,
D(A, B) = \sum_{i=1}^{n} \left\| p_i - T_{Rotation} \, T_{Translation} \, p_i' \right\|^2 ,   (3)

where

T_{Rotation} =
\begin{bmatrix}
\cos\theta & 0 & -\sin\theta & 0 \\
0 & 1 & 0 & 0 \\
\sin\theta & 0 & \cos\theta & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} ,   (4)

T_{Translation} =
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
x_0 & 0 & z_0 & 1
\end{bmatrix} .   (5)
In the above equations, T_{Rotation} is the rotation matrix applied to p_i', and T_{Translation} is the translation matrix applied to p_i'. For the detailed expressions of θ, x_0 and z_0, please refer to the motion graphs paper [15]; note that we set all the weights in that paper's equation to 1.
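To make the computation concrete, the following Python sketch evaluates Equation (3). It assumes that the optimal θ, x_0 and z_0 can be obtained by a standard rigid alignment of the joint positions in the ground (x-z) plane, which is what the closed-form solution of [15] reduces to when all weights are 1; the function name, array layout and NumPy usage are illustrative, not part of the original method.

```python
import numpy as np

def pose_distance(p, p_prime):
    """Per-joint squared differences after aligning p_prime to p in the
    ground (x-z) plane; summing them gives D(A, B) of Equation (3).

    p, p_prime : (n, 3) arrays of corresponding joint positions (x, y, z),
    assuming unit weights for all joints.
    """
    a = p_prime[:, [0, 2]]              # horizontal coordinates to be aligned
    b = p[:, [0, 2]]                    # horizontal reference coordinates
    a_c = a - a.mean(axis=0)            # centre both point sets
    b_c = b - b.mean(axis=0)

    # Optimal rotation angle about the y axis (2-D rigid alignment; the
    # closed form of [15] with all weights equal to 1 reduces to this).
    num = np.sum(b_c[:, 1] * a_c[:, 0] - b_c[:, 0] * a_c[:, 1])
    den = np.sum(b_c[:, 0] * a_c[:, 0] + b_c[:, 1] * a_c[:, 1])
    theta = np.arctan2(num, den)

    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])   # rotation in the x-z plane
    aligned = a @ rot.T
    aligned += b.mean(axis=0) - aligned.mean(axis=0)   # translation (x0, z0)

    diff = p - p_prime                  # y components are unchanged by the transform
    diff[:, [0, 2]] = b - aligned       # replace x, z components with aligned residuals
    return np.sum(diff ** 2, axis=1)    # per-joint squared differences

# D(A, B) of Equation (3) is then simply pose_distance(A, B).sum().
```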
Equation (3) gives the distance definition. However, there are situations in which the distance between two poses is small but some of their key joints differ considerably. Therefore, a second measure, the distance variance, is introduced before the definition in Equation (3) can be used to select key poses. The variance is defined as
V(D(A, B)) = \sum_{i=1}^{n} \left( \left\| p_i - T_{\theta, x_0, z_0} \, p_i' \right\|^2 - \bar{D} \right)^2 ,   (6)

where

\bar{D} = \frac{1}{n} \sum_{i=1}^{n} \left\| p_i - T_{\theta, x_0, z_0} \, p_i' \right\|^2 .   (7)
After defining the pose distance and its variance, we can use this combined measure to quantify the difference between neighboring frames. If the distance (as defined in Equation (3)) and the variance (as defined in Equation (6)) between frame i and frame i+1 are less than thresholds t1 and t2, respectively, frame i+1 is regarded as a redundant frame and is eliminated.
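As an illustration, the sketch below combines Equations (3), (6) and (7) into a selection loop. The text above compares frame i with frame i+1; consistent with the discussion in Section 5, this sketch compares each frame with the most recently kept key pose, which is one plausible reading. The thresholds t1 and t2 and the frame representation are the same illustrative assumptions as in the previous sketch.

```python
def pose_variance(sq_diffs):
    """Variance of the per-joint squared differences, Equations (6) and (7)."""
    d_bar = sq_diffs.mean()                   # Equation (7)
    return np.sum((sq_diffs - d_bar) ** 2)    # Equation (6)

def select_key_poses(frames, t1, t2):
    """Return indices of key poses; frames is a list of (n, 3) joint-position arrays.

    A frame is dropped as redundant when both its distance and its variance
    with respect to the last kept key pose stay below t1 and t2, respectively.
    """
    keys = [0]                                # the first frame is always kept
    for i in range(1, len(frames)):
        sq = pose_distance(frames[keys[-1]], frames[i])
        if sq.sum() >= t1 or pose_variance(sq) >= t2:
            keys.append(i)                    # dissimilar enough: keep as a key pose
    return keys
```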
3. Motion compression and decompression
In data compression, we try to reduce the data size as much as possible. There are usually two ways to do so. The first is to store the differences between similar neighboring frames; the second is to keep the important frames and eliminate the unimportant ones. The key pose selection of Section 2 falls into the latter class, where many redundant frames have already been eliminated. The remaining frame data is then compressed with the LZW encoding algorithm. To obtain better compression efficiency, we split the whole motion data into three parts: the motion information part, the skeleton part and the pose index data part.
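A minimal LZW encoder over the serialized bytes of each part could look like the following sketch. The serialization of the three parts is not specified in the paper, so compress_motion and its byte-string arguments are purely illustrative.

```python
def lzw_encode(data: bytes) -> list:
    """Textbook LZW: emit dictionary codes for the longest known prefixes."""
    table = {bytes([i]): i for i in range(256)}   # initial single-byte entries
    next_code, w, codes = 256, b"", []
    for byte in data:
        wb = w + bytes([byte])
        if wb in table:
            w = wb                                # extend the current match
        else:
            codes.append(table[w])                # emit code for the matched prefix
            table[wb] = next_code                 # learn the new sequence
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(table[w])
    return codes

def compress_motion(motion_info: bytes, skeleton: bytes, pose_index_data: bytes):
    """Compress the three parts separately, mirroring the split described above."""
    return [lzw_encode(part) for part in (motion_info, skeleton, pose_index_data)]
```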
4. Results and discussion
Using our key pose method, we can easily obtain key poses to overview a motion. Results are shown in Figure 1 for a walking motion (left) and a basketball dribbling motion (right). We can see that detailed poses can be chosen with this method by setting distance thresholds. Compared with the previous curve simplification method, the pose distance method is better at selecting detailed poses; the reason might be our carefully designed pose distance and variance metrics.
Figure 1. Key poses for the walking (left) and basketball dribbling (right) motions obtained with the proposed key pose selection method.
In Figure 2, we give a comparison between the original running motion (labeled 1) and the decompressed motion (labeled 2). We can see that the decompressed motion is nearly identical to the original one. There might be two reasons for this result: first, our pose selection method effectively removes the redundant frames; second, we employ a lossless encoding algorithm.
Figure 2. Comparison between the original poses (labeled 1) and the decompressed poses (labeled 2) for a running motion.
5. Conclusions and discussion
In this paper, we present a novel scheme for key pose selection and compression of motion capture data. Our scheme can capture the detailed key poses of a motion and reduce the data size effectively, and it can be used in data-driven animation applications. However, for a segment composed of similar poses, our method favors the first frame over later ones, which are often eliminated because they are similar to the first frame in terms of pose distance and variance. This can be seen as a limitation of our method. To address it, a window might be introduced to optimize the selection among candidate poses; how to evaluate a good key pose candidate within such a window then becomes important, and investigating this technique will be a topic of our future research.
Acknowledgements
This work is partially sponsored by CCF-Tencent Open Fund (Project Number CCF-Tencent IAGR20140112) and NSFC (Project
Number: 61300089).
References
[1] L. Váša and G. Brunnett. Rate-distortion optimized compression of motion capture data, Computer Graphics Forum, 2014, 33(2), pp. 283-292.
[2] Pengjie Wang, Zhigeng Pan, Mingmin Zhang, Rynson W.H. Lau and Haiyu Song. The alpha parallelogram predictor: a lossless compression method for motion capture data, Information Sciences, 2013, 232, pp. 1-10.
[3] I. Lin, J. Peng, C. Lin and M. Tsai. Adaptive motion data representation with repeated motion analysis, IEEE Transactions on Visualization and Computer Graphics, 2011, 17(4), pp. 527-538.
[4] J. Assa, Y. Caspi and D. Cohen-Or. Action synopsis: pose selection and illustration, ACM Transactions on Graphics (TOG), 2005, 24(3), pp. 667-676.
[5] J. Assa, D. Cohen-Or, I. C. Yeh and T. Y. Lee. Motion overview of human actions, ACM Transactions on Graphics (TOG), 2008, 27(5), pp. 115:1-115:10.
[6] R. Kawasaki, Y. Kitamura and F. Kishino. Extraction of motion individuality in sports and its application to motion of characters with different figures, Proceedings of Computer Graphics International, 2003, pp. 306-311.
[7] Pengjie Wang, Rynson W.H. Lau, Zhigeng Pan, Jiang Wang and Haiyu Song. An Eigen-based Motion Retrieval Method for Real-time Animation, Computers & Graphics, 2014, 38(2), pp. 255-267.
[8] Z. P. Zhao and A. M. Elgammal. Information theoretic key frame selection for action recognition, Proceedings of the British Machine Vision Conference, 2008.
[9] Xianmei Liu, Aimin Hao and Dan Zhao. Optimization-based key frame extraction for motion capture animation, The Visual Computer, 2013, 29(1), pp. 85-95.
[10] Qiang Zhang, Shulu Zhang and Dongsheng Zhou. Keyframe Extraction from Human Motion Capture Data Based on a Multiple Population Genetic Algorithm, Symmetry, 2014, 6(4), pp. 926-937.
[11] K. Kondo and K. Matsuda. Keyframe extraction method for motion capture data, Journal for Geometry and Graphics, 2004, 8(1), pp. 81-90.
[12] I. S. Lim and D. Thalmann. Key-posture extraction out of human motion data by curve simplification, Proceedings of the 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2001, pp. 1167-1169.
[13] Tao Yang, Jun Xiao, Fei Wu and Yueting Zhuang. Extraction of keyframe of motion capture data based on layered curve simplification, Journal of Computer-Aided Design & Computer Graphics, 2006, 18(11), pp. 1691-1697.
[14] Pengjie Wang, Mingmin Zhang, Haiyu Song, Haiwei Wang, Mingliang Xu and Gengdai Liu. Key frame extraction from motion capture data based on GPU, ICIC Express Letters, Part B: Applications, 2011, 2(1), pp. 209-214.
[15] L. Kovar, M. Gleicher and F. Pighin. Motion graphs, ACM Transactions on Graphics, 2002, 21(3), pp. 473-482.
[16] M. Tournier, X. Wu, N. Courty, E. Arnaud and L. Reveret. Motion compression using principal geodesic analysis, Proceedings of Eurographics, Munich, 2009, pp. 355-364.
[17] L. Váša and G. Brunnett. Rate-distortion optimized compression of motion capture data, Computer Graphics Forum, 2014, 33(2), pp. 283-292.