Interactive Dancing Game with Real-time Recognition of Continuous Dance Moves from 3D Human Motion Capture

Jeff K.T. Tang, Jacky C.P. Chan, Howard Leung
Department of Computer Science, City University of Hong Kong, 83 Tat Chee Avenue, Kowloon, Hong Kong
[email protected], [email protected], [email protected]

ABSTRACT
We have implemented an interactive dancing game using optical 3D motion capture technology. We propose a Progressive Block Matching algorithm to recognize the dance moves performed by the player in real time, which enables a virtual partner to recognize and respond to the player's movement without a noticeable delay. The completion progress of a move is tracked progressively, and the virtual partner's move is rendered in synchronization with the player's current action. Our interactive dancing game contains moves of various difficulty levels that suit both novices and skillful players. By animating the virtual partner in response to the player's movements, the player becomes immersed in the virtual environment. A user test was performed to evaluate our game subjectively, and the feedback from the subjects is positive.

Categories and Subject Descriptors
D.5.1 [Multimedia Information Systems]: Artificial, augmented and virtual realities; Animations. I.2.0 [Artificial Intelligence]: General; Cognitive simulation. I.2.1 [Applications and Expert Systems]: Games. I.3.7 [Three-Dimensional Graphics and Realism]: Animation; Virtual reality.

General Terms
Algorithms, Measurement, Performance, Design, Experimentation, Human Factors.

Keywords
Human-Computer Interaction, Interactive Dancing Game, 3D Human Motion Capture, Continuous Motion Recognition.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICUIMC'11, February 21–23, 2011, Seoul, Korea.
Copyright 2011 ACM 978-1-4503-0571-6…$10.00.

1. INTRODUCTION
There is an increasing demand from game players for games with enhanced functionality in dimensions such as realism, interaction, complexity, and multi-player support. Recent technological advances, including higher computational speed, greater 3D processing power, and more sophisticated motion sensing devices, allow game developers to produce more advanced games to satisfy players. Video games have become a ubiquitous part of many people's lives, and players spend a great deal of time on them. Virtual Reality games in particular attract many people because they enjoy being immersed in the virtual environment.

In this paper, we propose an interactive dancing game based on 3D motion capture technology. The virtual avatar is able to dance collaboratively by recognizing what the human player is dancing in real time.

In a real situation, two partnered dancers communicate through their body movements. Hence, how the human player interacts with the computer avatar becomes a very important issue. The virtual partner must be able to respond to the player's dance move promptly, so the system must recognize the player's movement without significant delay. In this paper, we propose a Progressive Block Matching algorithm to perform the real-time recognition of continuous dance moves.

In our proposed interactive dancing game, a real-time motion data acquisition method is needed.
There are many available real-time sensor-based input technologies. In particular, we use 3D optical motion capture technology because it allows us to capture human motion precisely. We have implemented the interactive dancing game following a well-known design principle: a good game should be easy to play but hard to master [1]. There are two modes in the game: a training mode and a freestyle mode. The training mode aims to help players get familiar with the dance moves, while the freestyle mode allows players to dance freely. The dance moves span various difficulty levels suitable for both novices and skillful players. While the interaction makes the game more fun, marks are given when the player collaborates with the virtual partner to complete a move successfully. A bonus mark is awarded to the player when he/she performs several complete collaborations continuously. We conducted a user study of our game and received positive responses from the players.

The paper is organized as follows. Section 2 presents related work. Section 3 provides an overview of our proposed system. Section 4 covers the details of our real-time recognition algorithm for continuous dance moves. Section 5 explains the tuning of the system parameters. Section 6 describes the gameplay of our interactive dancing game. The user studies are presented in Section 7. Conclusions and future work are provided in Section 8.

2. RELATED WORK
In order to make the computer more helpful to humans, researchers have spent a lot of time enhancing the interaction between humans and computers. Babu et al. [2] proposed a multimodal social agent framework called "Marve", which recognizes human faces and speech as multimodal input and lets a virtual human respond through a rule-based system. Jaksic et al. [3] proposed a virtual salesperson in an online shop that can give appropriate feedback to customers by monitoring their emotion from facial expressions.
Apart from facial expression, body language is also an important way to express emotion. Iwadate et al. [4] proposed an interactive dance system that can identify the emotion expressed in a dance video sequence and control the multimedia according to that emotion. The identification is based on three features: motion speed, openness of the body, and acceleration of motion; for example, speedy motion together with light openness of the body represents happiness.

With the advance of sensor-based input technology, capturing human motion data has become more common. Marker-based optical motion capture systems are gaining popularity in the movie industry, where the captured human motion data is used to render virtual characters with realistic motions. Beyond animation rendering, other researchers are interested in analyzing the motion data to understand it better. Li et al. [5] proposed a motion recognition algorithm that segments motion at equal intervals and selects motion features by Singular Value Decomposition (SVD) to be classified using a Support Vector Machine (SVM). With a state space approach, Darby et al. [6] used the Hidden Markov Model (HMM) to recognize human actions. It can predict the move performed by the player using the past frames and is a robust method for modeling time series. However, an enormous amount of training data is needed to ensure the robustness of the recognition; otherwise, recognition may fail due to over-fitting. In this paper, we also perform motion recognition on 3D motion data captured by an optical motion capture system. Nevertheless, the difference between our approach and the approaches in [5][6] is that we focus more on the real-time issue and target continuous recognition without noticeable delay.

Some researchers focus on building dance-related applications. Calvert et al. [7] worked on the notation of dance motions for recording and editing.
Animations can be generated by the computer according to the dance notation, so that a choreographer can rehearse an idea with the computer before actually meeting a real dancer. Ebenreuter et al. [8] further compared dance notation with motion capture technology and 3D animation on their ability to record and edit dance motion. The result showed that the three technologies have their own advantages in different aspects such as ease of use, cost, etc.

Dance education is another topic that attracts the interest of various researchers. Magnenat-Thalmann et al. [9] proposed a web 3D platform for dance learning. They use motion capture technology to capture expert dance movements and play them back through a web interface. Leung et al. [10] proposed a performance training tool that focuses on dance learning. It can evaluate how well a person performs and gives useful feedback. Our application also contains some elements of dance learning, but with more emphasis on the entertainment aspects.

Human-computer interaction through dance is an important topic studied by various parties. Dance Dance Revolution is a famous video game played using a dance mat: the player has to step on the correct zone in time in order to win. Although the player cannot really learn dancing through this game, it is fun to play. Tsuruta et al. [11] proposed a virtual dance collaboration system that can identify some simple moves like jumping and waving a hand, after which the virtual avatar performs the same moves. Nevertheless, their method is not applicable to longer moves, while dance motions are rarely as simple and short as jumping. Reidsma et al. [12] proposed a rap dance system in which virtual dancers are driven by the beat detected from sound, music or dance video clips. Apart from the indicated beat, it does not really matter how the player moves.
In our application, we would like to provide an interesting application in which the human-computer interaction is based on the movement of the whole body.

3. SYSTEM OVERVIEW
In our proposed game, a human dancer (the player) can interact with the computer dancer as a virtual partner. Figure 1 shows the architecture of our proposed system. An optical 3D motion capture system is used to capture the real-time movement of the human dancer. The motion data is digitized and recognized on a PC. The data server delivers the motion templates that are necessary for the motion recognition. The player's move is analyzed continuously, and the system generates the most appropriate move for the virtual partner, which is animated and shown to the player on the screen.

Figure 1. The system diagram.

The data server contains collaborative dance templates that are captured in advance. In our current system, we use A-go-go dance, which contains funny interactive moves between the male and female dancers. Each dance template consists of dance moves captured from a male dancer and a female dancer simultaneously. Two examples of A-go-go moves are shown in Figure 2. A-go-go dance moves are chosen because they are highly collaborative. In some moves the movements of both dancers are symmetric to each other, as shown in Figure 2(a), while in others the gestures of the two dancers are totally different, as shown in Figure 2(b). The pre-stored data in the data server thus contains synchronized movements between the two dancers. This is important because during gameplay the system needs to perform continuous recognition of the player's move in order to render the corresponding move of the virtual dance partner and facilitate interaction. A detailed description of our continuous recognition algorithm for real-time dance moves is provided in the next section.

Figure 2. Example A-go-go dance moves: (a) symmetric move, (b) collaborative move.

4. PROPOSED INTERACTIVE DANCE FRAMEWORK
During the game, the player performs dance moves continuously. As a result, our system needs to recognize the moves performed by the player in a continuous manner. More importantly, the system needs to generate the virtual partner's dance move corresponding to the player's dance move in real time. This means that the system cannot wait until the player finishes his/her move before carrying out the recognition module. Rather, the recognition needs to be performed in a real-time manner in which the delay for generating the response cannot be too high. The continuous recognition of real-time dance moves can be represented by a finite state machine (FSM). We propose a progressive block matching approach that is able to recognize the player's move in real time. In the next subsection, we first explain the states in our finite state machine representation and the interaction between the states. In the subsequent subsection, we explain the progressive matching algorithm in detail.

4.1 FINITE STATE MACHINE REPRESENTATION
There are eight template dance moves stored in our system, as shown in Figure 3. Given the player's input move, the system needs to determine whether the input move belongs to one of the template dance moves or is an unrecognized move. We propose a finite state machine to represent the different states in the recognition process. There are four states in our finite state machine: (1) Idle state, (2) Start state, (3) Response state, and (4) Completion state. Figure 4 shows the state diagram illustrating the flow between these states. The change of state is triggered by the block matching cost. Here we describe the FSM at a high level. At the beginning, the system is in the Idle state since the input move is unrecognized. In general, there are Nm chains of states corresponding to a total of Nm template moves.

Figure 3.
The eight template A-go-go moves (key postures of the motion templates for moves 1-8).

In the Start state of each template move, the input motion is divided into blocks (shorter segments) to be matched. The input block is compared with the beginning block of each template move, and a block matching cost is computed. If the block matching cost is larger than a threshold THcost, then the input move does not belong to that particular template move and the system returns to the Idle state. On the other hand, if the block matching cost between the input block and the beginning block of the best-matched template move is lower than the threshold THcost, then the system enters the Response state for that template move. During the Response state, the system continues to check whether the player keeps performing the same motion as the template move, and keeps track of the percent completion of the player's move. If the accumulated error from the template move grows beyond a certain threshold value, the system goes back to the Idle state. Otherwise, if the player completes 100% of his/her move, the Completion state is triggered. In this state, a score is awarded to the player. The system then goes back to the Idle state, prepared to recognize the player's next move.

Figure 4. State diagram of our interactive model.

4.2 REAL-TIME MOTION RECOGNITION ALGORITHM FOR CONTINUOUS DANCE MOVES
Our recognition algorithm is developed based on the comparison between two postures, so the frame matching cost between two postures needs to be defined. To account for temporal variations such as speed inconsistency, a frame correspondence is determined by searching for the best match for each input frame. In a collaborative dancing game, the virtual avatar needs to understand what its partner (the human player) is doing. Hence, motion recognition needs to be performed in real time. In our proposed method, the input stream is processed in blocks, and a move is recognized as one of the templates based on the Block Matching Cost. The percentage completion (i.e., the progress of collaboration) between the player and the virtual partner is monitored progressively. In the following sub-sections, we first introduce the Frame Matching Cost, which forms part of the formulation of the Block Matching Cost.

4.2.1 FRAME MATCHING COST
Each frame of the motion corresponds to a posture. The frame matching cost (i.e., the cost for matching a frame of the player's move and a frame of a template move) is equivalent to the problem of finding the matching cost between the postures at those frames.

Figure 5 shows the 20 joints and 5 end-sites (marked as *) in the human model we have used. Among the joints, there are 6 end-effectors (marked as +). Joint angle differences are used in our similarity metric because they are comparable across motions performed by people of different body sizes. The end-sites are the termini of the human model hierarchy, hence they are not considered in our matching cost, which is derived from the joint angle differences. The end-effectors are also ignored because they are often inconsistent even when the same motion is performed.

Figure 5. The joints and end-sites in our human model.

The frame matching cost Sim(P1, P2) between a pair of postures P1 and P2 is given by the weighted sum of joint angle differences. The weight wi of each joint angle is given by the distance from that joint to its hierarchical end-site. For example, the weight of the left-shoulder joint is the distance measured from the left-shoulder joint to the left-hand joint, which is equal to the sum of the lengths of the upper and lower arms. Assume that there are NJ joint angles extracted for each posture.
Denote these joint angles from posture P1 by θ1(i) and those from posture P2 by θ2(i), where i = 1, 2, …, NJ. The frame matching cost Sim(P1, P2) is given by equation (1):

  Sim(P1, P2) = Σ_{i=1}^{NJ} wi |θ1(i) − θ2(i)|    (1)

4.2.2 FRAME CORRESPONDENCE
There exist temporal variations when two people perform the same movement. Even when a move is performed by the same person several times, the speed may differ. To account for this deviation, reference frames from the template move should be searched in order to determine the best match for each frame of the input move. Dynamic Time Warping (DTW) is often used to match sequential time data. However, DTW is not suitable in our application because we do not know when the player finishes the current move, and the virtual partner needs to make a decision within a short time in order to deliver a prompt collaborative response with little delay; otherwise, the interaction appears asynchronous and unnatural. Hence, we propose a block-based matching method, which imitates DTW by locally matching blocks of frames of varying sizes in ascending time order.

The frame correspondence process is illustrated in Figure 6. In Figure 6(a), the best match f(1) for frame 1 of the input block is searched within the matching range w in the template move. The size w depends on the size of the input block NB; in particular, we set w = 1/4 of NB frames. For each frame i of the input block, the best-matched frame f(i) is defined as the frame of the template move that yields the minimum frame matching cost locally within the matching range. Next, as illustrated in Figure 6(b), the best match f(1) is set as the starting frame of the next matching range in order to search for the best match f(2) for frame 2 of the input block. This process is repeated until every frame in the input block corresponds to a frame in the template move, as illustrated in Figure 6(c).
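As an illustration, the frame matching cost of equation (1) and the local window search described above can be expressed in code. This is a minimal sketch under our own assumptions (NumPy arrays of joint angles, hypothetical function names), not the authors' implementation.

```python
import numpy as np

def frame_cost(p1, p2, w):
    """Frame matching cost Sim(P1, P2) of equation (1): weighted sum of
    joint-angle differences. p1, p2: arrays of NJ joint angles; w: per-joint
    weights (distance from the joint to its hierarchical end-site)."""
    return float(np.sum(w * np.abs(p1 - p2)))

def frame_correspondence(input_block, template, w):
    """Greedy local matching that imitates DTW in ascending time order.
    For each input frame, the best match f(i) is searched within a window
    of NB/4 template frames starting at the previous best match."""
    n_b = len(input_block)
    w_range = max(1, n_b // 4)            # matching range w = NB / 4
    f, costs = [], []
    start = 0
    for frame in input_block:
        window = range(start, min(start + w_range, len(template)))
        best = min(window, key=lambda j: frame_cost(frame, template[j], w))
        f.append(best)
        costs.append(frame_cost(frame, template[best], w))
        start = best                      # next range starts at the best match
    # Correspondence f and the average frame matching cost of the block.
    return f, float(np.mean(costs))
```

With the block size NB = 26 tuned in Section 5, the matching range is 26/4 ≈ 6 frames, so each input frame is compared only against a small local window of the template rather than the whole sequence.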
The average frame matching cost is used to determine whether the player is performing the same move. The frame correspondence is useful in block matching, which is discussed in Section 4.2.3.

4.2.3 BLOCK MATCHING COST
The block matching cost is used to (1) determine whether the input move is similar to one of the template moves in the Start state; and (2) determine whether the input move is still similar to that particular template move in the Response state.

In the Idle state, the stream first needs to be aligned to the starting block of a candidate move. The system has to decide which template move is the most similar to the input. This decision is governed by a trained threshold, described in Section 5. Once the input move is recognized as one of the templates, the Start state is triggered.

In the Start state, the block matching cost between the input block and the beginning block of each template move is computed. The block matching cost is defined as the average local matching cost among all corresponding frames, as described in the previous section. The template move that yields the minimum block matching cost is identified. If this minimum block matching cost is below a threshold THcost, then the system triggers the Response state for that particular move.

In the Response state, the block matching cost is computed as the cumulative mean of the matching cost of the current block and all previous blocks since the Start state. In other words, it is the average of the frame matching costs between all corresponding frames up to the current frame. The advantage of using the cumulative mean is that its value changes more stably, which makes it easier to set a threshold for making the decision, especially in marginal cases.
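The state flow of Section 4.1, driven by this cumulative-mean block cost, can be sketched as below. The class and the injected `match_block` helper are hypothetical names of our own: `match_block` is assumed to return a block's average frame matching cost together with the index of the last matched template frame, and the Idle and Start states are folded into a single branch for brevity.

```python
class MoveRecognizer:
    """Sketch of the Idle/Start/Response/Completion state machine."""
    IDLE, START, RESPONSE, COMPLETION = range(4)

    def __init__(self, templates, thresholds, match_block):
        self.templates = templates    # template moves (lists of frames)
        self.th = thresholds          # per-template thresholds THcost(n)
        self.match_block = match_block
        self.reset()

    def reset(self):
        self.state = self.IDLE
        self.move = None              # index of the candidate template
        self.cost_sum = 0.0           # running sum of block costs
        self.n_blocks = 0
        self.progress = -1            # last matched template frame index m

    def feed(self, block):
        """Consume one block of NB input frames; return (state, PC%)."""
        if self.state in (self.IDLE, self.START):
            # Start state: compare with the beginning block of each template.
            results = [self.match_block(block, t, 0) for t in self.templates]
            n = min(range(len(results)), key=lambda i: results[i][0])
            cost, m = results[n]
            if cost < self.th[n]:
                self.state, self.move = self.RESPONSE, n
                self.cost_sum, self.n_blocks, self.progress = cost, 1, m
            else:
                self.reset()          # unrecognized: stay in Idle
        elif self.state == self.RESPONSE:
            cost, m = self.match_block(block, self.templates[self.move],
                                       self.progress + 1)
            self.cost_sum += cost
            self.n_blocks += 1
            self.progress = m
            # Cumulative mean of the block costs since the Start state.
            if self.cost_sum / self.n_blocks > self.th[self.move]:
                self.reset()          # too much deviation: back to Idle
            elif self.progress >= len(self.templates[self.move]) - 1:
                self.state = self.COMPLETION   # 100% complete: award score
        if self.move is None:
            return self.state, 0
        # Percentage completion PC% = m / M * 100 (Section 4.2.4).
        return self.state, 100 * (self.progress + 1) // len(self.templates[self.move])
```

After the Completion state, the caller would reset the machine so that the next move starts again from Idle; error handling and the distinct Start state are omitted here.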
If the block matching cost is above the threshold THcost, then the input move deviates too much from the template move and the system goes back to the Idle state. Otherwise, the system stays in the Response state and continues checking the next block until the current move has been completed.

Figure 6. Frame correspondence: (a) searching for the best-matched frame for the 1st frame; (b) searching for the best-matched frame for the 2nd frame; (c) frame correspondence of the input block.

4.2.4 PERCENTAGE COMPLETION
For the virtual partner to dance collaboratively, it is too late to wait for the player to finish a move before recognizing it. In the proposed real-time block matching method, we need to identify the current status of the player's move and generate the appropriate response from the virtual dance partner. During the Response state, the block matching repeats, and hence a sequence of matched block pairs is formed progressively until the player's move is completely recognized from the motion stream. Because the player's move is matched continuously, the system can identify the status of the player's move and generate the appropriate response for the virtual dance partner.

Hence, we keep track of the percentage completion of the player's move during the Response state. Figure 7 illustrates how the percentage completion PC% can be estimated for a particular template move based on the frame correspondence. It is calculated as the time index m of the template frame matched with the current frame of the input stream over the total number of frames M of the recognized template move, i.e., PC% = m/M × 100%.

Figure 7. Percent completion (best matches between blocks of the input stream and the template move at 5%, 40%, 80% and 100%).

5. PARAMETER TUNING
In our real-time recognition algorithm, two parameters NB and THcost(n) are required. The parameter NB is the block size, i.e., the number of frames required in the Start state for the system to decide that the player's input move belongs to a template move. NB also determines the matching range in the block matching of the Response state, as stated in Section 4.2.2. The parameter THcost(n) is the threshold for the block matching cost that determines whether the player's move is similar to the n-th template move, where 1 ≤ n ≤ N and N is the number of template classes.

A total of 48 motion samples (eight template moves performed by two dancers in three trials) have been captured to tune these two parameters. THcost(n) is trained using the cumulative mean CCM(i), the cumulative matching cost at the i-th frame of the matching, where 1 ≤ i ≤ M and M is the minimum length of the two template moves being compared. The calculation of CCM(M) is given in equation (2), where PH(i) and PK(i) are the i-th frames of any two motion template samples. If these two template samples are very similar, they are likely from the same template and the cumulative cost should be small.

  CCM(M) = (1/M) Σ_{i=1}^{M} Sim(PH(i), PK(i))    (2)

Figures 8 and 9 show the cumulative frame matching costs CCM(i), where 1 ≤ i ≤ M, for Template Move 2 and Template Move 6 respectively. The unmatched cases (crosses) are denoted by CCM_unmatch(i), the cumulative mean of the matching cost at the i-th frame when the template moves PH(i) and PK(i) are from different classes. The matched cases (circles) are denoted by CCM_match(i), the cumulative mean of the matching cost at the i-th frame when PH(i) and PK(i) are from the same class. Ideally, for all frames, CCM_unmatch(i) should be higher than CCM_match(i). For 1 ≤ n ≤ N, the set of thresholds at each frame i of the n-th template, THcost(n) = {TH(1), TH(2), …, TH(i), …, TH(60)}, is given by the mid-value between the minimum of the unmatched group, min({CCM_unmatch}), and the maximum of the matched group, max({CCM_match}), shown by dashed lines in Figures 8 and 9. This mid-value is regarded as the boundary between similar and dissimilar template move samples. Each set of thresholds differs between templates because the power to distinguish matched from unmatched samples differs between template moves.

To find a suitable block size NB for each template, we consider the distribution of the matched/unmatched groups again. Basically, NB is determined by the minimal number of frames Z such that TH(i) > max(CCM_match(i)) and TH(i) < min(CCM_unmatch(i)) holds for each template move. Note that Z differs between template classes. In the training result, Z is equal to 1 for Template Move 2, as illustrated in Figure 8, while it is equal to 26 for Template Move 6, as illustrated in Figure 9. The parameter NB is determined as the maximum value of Z over all template moves. This means that the input stream is divided into blocks of 26 frames to be processed. Hence, the recognition time is roughly 26/60, or about 0.43 second.

Figure 8. Threshold tuning for Template Move 2.

Figure 9. Threshold tuning for Template Move 6.

Figure 10 illustrates the process of checking the completion progress of an input move using our tuned thresholds. The postures below the graph are the key frames of the input move, and the postures above the curve are the corresponding postures of the template move. This shows that the frame correspondence can be determined even if the input move has some temporal variations from the template move, and that the percent completion of an input move can be monitored progressively.
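A minimal sketch of this tuning procedure is shown below, under the assumption that the pairwise cost curves (frame matching costs of matched and unmatched template pairs) have already been computed with Sim; the function names are our own, and Z is taken as the first frame index at which the matched and unmatched groups separate.

```python
import numpy as np

def cumulative_means(cost_curves):
    """CCM(i): cumulative mean of frame matching costs up to frame i,
    for each pairwise cost curve (equation (2))."""
    c = np.asarray(cost_curves, dtype=float)
    steps = np.arange(1, c.shape[1] + 1)
    return np.cumsum(c, axis=1) / steps

def tune_thresholds(matched, unmatched):
    """Per-frame thresholds TH(i): mid-value between the largest matched
    CCM and the smallest unmatched CCM, plus the minimal frame count Z
    after which the two groups are separated."""
    ccm_match = cumulative_means(matched)      # same-class template pairs
    ccm_unmatch = cumulative_means(unmatched)  # different-class pairs
    hi = ccm_match.max(axis=0)                 # max(CCM_match(i)) per frame
    lo = ccm_unmatch.min(axis=0)               # min(CCM_unmatch(i)) per frame
    th = (hi + lo) / 2.0                       # mid-value boundary TH(i)
    separated = hi < lo                        # frames with no overlap
    z = int(np.argmax(separated)) + 1 if separated.any() else None
    return th, z
```

The block size NB would then be the maximum Z over all template moves, matching the paper's choice of 26 frames.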
Figure 10. The mapping between an input move and a template move.

6. GAMEPLAY
Currently, the template dance moves in our interactive dancing game belong to A-go-go, a dance that was popular in the 1960s. The purpose of the game is to let the player learn A-go-go and have fun at the same time. In the game, a virtual partner dances with the player. Those who are new to A-go-go can use the "training mode" to practice with their virtual partners first. Once they become experienced, the "freestyle mode" is designed for them to show off their dancing skills. In the following sections, the setup of the game, the training mode and the freestyle mode are discussed.

6.1 SETUP OF THE GAME
Figure 11 shows an example of how a player dances with his virtual partner. The player wears a tight suit attached with optical markers as the sensor-based input. Watching the screen in front of him/her, the player can see his/her virtual partner and other messages that appear in the game, such as the score and the recognized template moves. The rendering is done with OpenGL, and the game interface is developed with MFC. The player can select between the training mode and the freestyle mode.

Figure 11. (a) A player dancing with his virtual partner; (b) a screenshot showing a sample of the image shown on the screen in (a).

6.2 TRAINING MODE
The training mode is for players to get familiar with the different A-go-go template moves. There are eight template moves, each labeled as either easy or difficult. Players can first learn the easy template moves and then the difficult ones. Each template move is trained in three steps. In the first step, the player watches a demonstration of a template player's move and the corresponding virtual partner's move. This lets the player know how he/she should move and gives an impression of how the virtual partner will respond to his/her move.
In the second step, the player has to perform the move as much as he/she ca n in 20 seconds and understand whether he/she and the virtual partner are able to dance toge ther as a team. The virtual partner cannot react with the desired move if the play er dances poorly. One mark will be awarded to the player for each completed move to motivate the player. In the third s tep, the player can see the playback of the captured move. If the play er is successful in making the moves, he can see his avatar dancing with the virtual partner. If the player fails to dance correctly, he/she should find out what is wrong with his/her moves. By going th rough the training mode, the pla yer will soon be familiar with all the eight template moves and is ready for the freestyle mode. 6.3 FREESTYPE MODE In the freestyle mode, the player can dance freely with any of the eight template moves with the virtual partner. The player will get scores if he/she completes a template move. One mark is awarded to the player for each completed easy template move and three marks are awarded to the player for each completed d ifficult template move. To intro duce more challenges, the pl ayer may perform some “combo” moves which correspond to a sequence of certain completed moves. Once the player successfully performs a combo, a message will be shown on the screen to notify the player with a bonus score of 10 marks. After the performance, the player can also watch the playback to see his/her performance. accurate. The average marks of questions 4 and 5 are both 4.1, which proves our recognition method is acceptably accurate. Table 1. The six questions in the questionnaire No. Question Avg. Mark 1. Do you agree that the game is fun? 4.7 2. Do you agree that y ou know more about A-go-go? 4.7 3. Do you agree that the motion of the virtual dancer is smooth? 4.0 4. Do you agree that the virtual dancer can follow your move? 4.1 5. 
7. USER STUDIES
Seven subjects (one female and six males) are invited to try out our interactive dancing game. Two of them have prior experience of A-go-go dancing. After an introduction to the game, each subject played for twenty minutes. Afterwards, each subject was asked to fill in a questionnaire with the questions listed in Table 1, putting down a mark from 1 (totally disagree) to 5 (totally agree) for each statement. The average marks given by the subjects are also shown in Table 1.

The average mark of Question 1 is 4.7, which suggests that our game is fun. Question 2 also obtained an average mark of 4.7, showing that our game can help players learn A-go-go. The average mark of Question 3 is 4.0, which shows that most players thought our interactive dancing game recognized their actions well in a real-time situation, and that the virtual partner was rendered smoothly without much delay or jitter. The statement "Do you agree that the virtual dancer can perform the correct move as you desire?" received an average mark of 4.1. To give the player the impression that the virtual partner follows his or her move and reacts with the correct response, the recognition of the player's moves and the identification of the percent completion must be performed without noticeable delay.

Apart from the quantitative results of the questionnaires, some valuable opinions were collected from the subjects. Several comments concerned the usability of the system. One subject said that the demonstration in the training mode was quite hard to follow, as the display speed was too fast for a novice and the viewpoint was not suitable. The display of music was also a concern for one subject, who thought that music could help him move at the right pace. Another suggested that it would be easier for players to learn if comments on their performance could be given. It is also interesting how important scoring is in the game: some subjects reported that they became much more eager to play after they knew that another subject's score was higher than theirs. For motion recognition, one subject suggested that some calibration could adapt the system to user variations. This is a good idea, as it can make our system more robust; we will consider using the motion data captured in the training mode to tune the system's thresholds in the near future. Other comments about the implementation of the system will also be addressed in the future.

8. CONCLUSION AND FUTURE WORK
In this paper, we have described an interactive dancing game that provides an environment for a human player to dance with a virtual partner. 3D motion capture technology is used as the sensor-based input. We highlight the proposed real-time recognition algorithm for a continuous motion stream. A block matching approach is introduced that performs local frame matching block by block in forward time sequence. In our proposed method, the threshold for each template is trained with samples performed by different players and is hence adaptive to style variation. In this work, the A-go-go dance motions have been considered.

A prototype dancing game is built based on this algorithm. It is designed to suit players with different skill levels: novices can practice in the training mode, while skillful players are awarded bonus points when they interact with the virtual partner in combos. The scoring scheme aims to motivate the player to keep playing the game. The virtual partner in the virtual environment allows the player to interact with it, so that he or she gets immersed in the virtual dance room. Some subjects were invited to try the system, and their feedback was mostly positive.

Our proposed work can be further applied to applications or games using optical motion capture devices or video-based systems such as the Xbox Kinect [13].
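The block matching recognition and per-template threshold training summarized above can be illustrated with a minimal sketch. All names, the block size, the distance metric, and the mean-plus-deviation threshold rule below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

BLOCK = 10  # frames per block (assumed block size)

def frame_distance(a, b):
    """Euclidean distance between two pose vectors (e.g. joint positions)."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

def train_threshold(template, samples, k=2.0):
    """Derive a per-template threshold from training samples:
    mean + k * std of sample-to-template frame distances (assumed rule)."""
    dists = [np.mean([frame_distance(s[i], template[i])
                      for i in range(min(len(s), len(template)))])
             for s in samples]
    return float(np.mean(dists) + k * np.std(dists))

class ProgressiveMatcher:
    """Tracks how far a player has progressed through one template move
    by matching captured frames block by block in forward time sequence."""

    def __init__(self, template, threshold):
        self.blocks = [template[i:i + BLOCK]
                       for i in range(0, len(template), BLOCK)]
        self.threshold = threshold
        self.next_block = 0  # blocks must match in forward time order

    def feed(self, block_frames):
        """Feed one block of captured frames; return completion in [0, 1]."""
        if self.next_block < len(self.blocks):
            tmpl = self.blocks[self.next_block]
            n = min(len(block_frames), len(tmpl))
            d = np.mean([frame_distance(block_frames[i], tmpl[i])
                         for i in range(n)])
            if d <= self.threshold:
                self.next_block += 1  # block matched: progress advances
        return self.next_block / len(self.blocks)
```

Because progress is reported after every block rather than only at the end of the move, a game loop built on this pattern can animate the virtual partner in step with the player's completion percentage.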
Games can be designed to allow a player to collaborate with virtual characters to complete a task, which is especially useful for performing arts that require a high level of collaboration (e.g., street dance).

As future work, in order to enhance the recognition performance, we will provide a training mode to let the system learn to accommodate style variations among different players. Meanwhile, more template moves will be included in the system. Another direction is to consider interactive moves of other types that require closer collaboration between the player and the virtual partner, such as holding hands as in Waltz. Collision detection between the avatars representing the player and the virtual dance partner is an interesting problem to be solved, which can be extended to a more sophisticated interaction model between multiple real and virtual players.

9. ACKNOWLEDGMENTS
The work described in this paper was substantially supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No. CityU 1165/09E].

10. REFERENCES
[1] Moore, M.E., and Sward, J. 2007. Introduction to the Game Industry. Prentice Hall. (ISBN: 0-13-168743-3), Chapter 8, pp. 249-276.
[2] Babu, S., Schmugge, S., Inugala, R., Rao, S., Barnes, T., and Hodges, L.F. 2005. Marve: a prototype virtual human interface framework for studying human-virtual human interaction. Lecture Notes in Computer Science, vol. 3661, pp. 120-133.
[3] Jaksic, N., Branco, P., Stephenson, P., and Encarnação, L.M. 2006. The effectiveness of social agents in reducing user frustration. In Proceedings of the CHI '06 Extended Abstracts on Human Factors in Computing Systems (Montréal, Québec, Canada, April 22-27, 2006). ACM, New York, NY, pp. 917-922.
[4] Iwadate, Y., Inoue, M., Suzuki, R., Hikawa, N., Makino, M., and Kanemoto, Y. 2000. MIC Interactive Dance System: an emotional interaction system. In Proceedings of the Fourth International Conference on Knowledge-Based Intelligent Engineering Systems and Allied Technologies, vol. 1, pp. 95-98.
[5] Li, C., Zheng, S.Q., and Prabhakaran, B. 2007. Segmentation and Recognition of Motion Streams by Similarity Search. ACM Transactions on Multimedia Computing, Communications and Applications (ACM TOMCCAP), vol. 3(3), Article 16, August 2007.
[6] Darby, J., Li, B., and Costen, N. 2008. Activity Classification for Interactive Game Interfaces. International Journal of Computer Games Technology, vol. 2008, Article ID 751268, 7 pages.
[7] Calvert, T., Wilke, L., Ryman, R., and Fox, I. 2005. Applications of Computers to Dance. IEEE Computer Graphics and Applications, vol. 25, no. 2, pp. 6-12, Mar/Apr 2005.
[8] Ebenreuter, N. 2005. Dance Movement: A Focus on the Technology. IEEE Computer Graphics and Applications, vol. 25, no. 6, pp. 80-83, Nov/Dec 2005.
[9] Magnenat-Thalmann, N., Protopsaltou, D., and Kavakli, E. 2008. Learning How to Dance Using a Web 3D Platform. Lecture Notes in Computer Science, vol. 4823, pp. 1-12.
[10] Leung, H., Chan, J., Tang, K.T., and Komura, T. 2007. Ubiquitous Performance Training Tool Using Motion Capture Technology. In Proceedings of the First International Conference on Ubiquitous Information Management and Communication (ICUIMC 2007), pp. 185-194, Suwon, Korea, 8-9 February 2007.
[11] Tsuruta, S., Kawauchi, Y., Woong, C., and Hachimura, K. 2007. Real-Time Recognition of Body Motion for Virtual Dance Collaboration System. In Proceedings of the 17th International Conference on Artificial Reality and Telexistence, pp. 23-30, 28-30 Nov. 2007.
[12] Reidsma, D., Nijholt, A., Poppe, R.W., Rienks, R.J., and Hondorp, G.H.W. 2006. Virtual Rap Dancer: Invitation to Dance. In Proceedings of the CHI '06 Extended Abstracts on Human Factors in Computing Systems, pp. 263-266.
[13] Microsoft Corporation. 2010. Xbox Kinect. Available: http://www.xbox.com/en-US/kinect