TVE 17 006 juni
Degree project, 30 credits
June 2017

Investigating the Effects of Trend Lines with the Microworld GridRail

Kristina Mach

Department of Engineering Sciences (Institutionen för teknikvetenskaper)
Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet)

Abstract

In 1995 a research collaboration was launched between the Department of Information Technology at Uppsala University and the Swedish Transport Administration (Trafikverket). Together they built a software prototype, STEG, for controlling trains and planning new routes. The aim of the recent research has been to understand why STEG is appreciated, how to improve it, and to collect information about how to build future systems.

As an initial explanation of the generally positive reception of STEG, different hypotheses were constructed. This thesis is part of a systematic variation and investigation of one of these hypotheses. The other two will be investigated in future experiments.

In order to test the hypothesis, a microworld was developed: a simplified game of a complex system. A few experiments have previously been conducted which did not support the first hypothesis, that the trend lines cause a positive effect, based on how quickly the participants finished playing GridRail. More experiments are needed in order to look for explanations.

The purpose of this study was to see how the game play was affected by the absence or presence of trend lines. A between-subjects study was conducted in which 32 participants used the microworld GridRail, a game with the purpose of driving 6 trains to opposite stations without collisions as fast as possible.
The results do not show any significant difference in how fast the participants finished each trial between the conditions with and without trend lines. However, comments in the verbalisations and notes on how many trains each participant drove at the same time indicate that the lines do affect how participants play, even though the finish time is not affected.

Supervisor: Anton Axelsson
Subject reader: Anders Jansson
Examiner: Nóra Masszi

Contents

1 Introduction
  1.1 Previous experiments
  1.2 Questions to be answered
  1.3 Delimitations
2 Background
  2.1 Early research
  2.2 GridRail
  2.3 Recent evaluation experiments with GridRail
  2.4 Future experiments
3 Theory
  3.1 Dynamic Decision Making
  3.2 Microworld
  3.3 Think Aloud
    3.3.1 Pre-processing
    3.3.2 Basic assumptions for the Think Aloud
    3.3.3 Problems with Think Aloud
4 Methods
  4.1 Participants
  4.2 Design
  4.3 Pilot study
  4.4 Material
  4.5 Procedure
5 Results
  5.1 Learning curves
  5.2 Effects of trend lines
    5.2.1 Boxplot - Mean time per trial
    5.2.2 Boxplot - Mean time before each trial
    5.2.3 ANOVA for Mean time
  5.3 Analyses of collisions
    5.3.1 Division by lines
    5.3.2 Division by verbalisation
    5.3.3 Participants without trend line condition
    5.3.4 Participants with trend line condition
    5.3.5 ANOVA for collisions
    5.3.6 ANOVA for collisions in Order 2
  5.4 Effects of verbalisations
    5.4.1 Mean time based on verbalisation
    5.4.2 Variable interaction: Verbal and Order
  5.5 Analyses of verbalisations
    5.5.1 Summation from table 11
    5.5.2 Breaking down numbers from notes
    5.5.3 Notes from experiment by trend line condition and verbalisation order
    5.5.4 Verbalisation transcript
6 Discussion
  6.1 Analysis of results
  6.2 Discussion of methods used
  6.3 Conclusion
  6.4 Future research
    6.4.1 GridRail
    6.4.2 Experiment
7 References

1. Introduction

In 1995 a research collaboration was launched between the Department of Information Technology at Uppsala University and the Swedish Transport Administration (Trafikverket). Together with Trafikverket, Uppsala University built a software prototype, STEG, for controlling trains and planning new routes. The prototype has since been used in Norrköping and Boden with mainly positive results. The aim of the recent research has been to understand why STEG is appreciated, how to improve it, and to collect information about how to build future systems.

As an initial explanation of the generally positive reception of STEG, the following three general hypotheses were identified:

• the visualized trend lines in the interface support the train traffic controller when he/she is making predictions; or
• the spatial layout in the interface makes it easier for the controller to map this analog representation to the real geographic area and thus facilitates his/her understanding of the actions going on in the domain; or
• the ability to re-plan with the help of direct interactive feedback supports an efficient and cognitively less demanding way of working with the interface.

This thesis is a part of a systematic variation and investigation of the first of these hypotheses.
The other two will be investigated in future experiments. In order to test the hypothesis, a microworld was developed: a simplified game of a complex system. A few experiments have previously been conducted which did not support the first hypothesis, that the trend lines cause a positive effect, based on how quickly the participants finished playing GridRail. However, more experiments are needed in order to look for explanations, using different methods or approaches. In this thesis we explore the impact of the trend lines based on what the participants say during the game and how they play, rather than on the output measured in time. The data are gathered using a Think Aloud method, by observing and taking notes, and by recording the screen and voice activity.

1.1 Previous experiments

The latest research on STEG aimed to understand why the train traffic controllers found the system helpful and what researchers should consider when building similar systems in the future. STEG is a complex system in which the traffic controller can see the ongoing process, plan and execute new routes, and see a prediction of future traffic. In order to test which components of this system are beneficial, the system has been divided into subcomponents, one of which is the trend lines. The trend lines have been evaluated in earlier research, based on the hypothesis that they were part of why STEG was perceived as a good system; the outcome was that they did not help the user to finish a task of driving trains more quickly. It is, however, problematic to conclude that the trend lines are not helpful in a big and complex system based on a single variable, time. Multiple experiments are needed to give more data upon which a conclusion can be based. For example, we do not know whether participants with trend lines actually noticed and used the lines, and we do not know whether the participants steered and planned the routes differently based on the trend lines.
Further investigations that approach the same experiment from new angles are therefore of interest.

1.2 Questions to be answered

Building on earlier research, two more specific research questions have been formulated to give a broader basis for investigations into decision making.

Hypothesis: The presence of trend lines in a simple dynamic system will accelerate learning and improve performance.

• Are the trend lines in GridRail helping the users?
• Can the participants' descriptions of how they solved the task give input as to whether the trend lines are helpful?

1.3 Delimitations

Because parts of STEG are isolated and tested in the microworld GridRail, where the idea is to drive 6 trains to an opposite station, the results cannot be directly interpreted as a result for how train controllers perceive the complex system STEG, even though they can give an indication of how people react to trend lines. The novice participants had never used the system STEG or the microworld GridRail; by using only novice test persons, a better comparison can be made between participants, as well as a comparison of improvement over successive games. This selection of participants is limiting in the sense that the real controllers are not novice users; the results will therefore not have an impact on how to improve STEG today. That would require a test of STEG in its natural environment with the traffic controllers as participants. The experiment will not help us to understand whether the trend lines in STEG are helpful in the complex system when a controller has been learning it over a longer time.

2. Background

2.1 Early research

In 1995 a research collaboration was launched between the Department of Information Technology at Uppsala University and the Swedish Transport Administration (Trafikverket).
The aim of this project was to gather knowledge about traffic control systems and about how to develop a future interface for a more effective and operational planning system, given high safety and a good work environment. Heavy demands on train network capacity, together with higher speeds and higher demands on trains being on time, called for, and still call for, better planning tools. Research activities included description and analysis of the control tasks, design of new control principles, user interface design, decision support systems, work organization and work place design (Andersson et al., 2015).

Previous research resulted in a body of knowledge about the interaction between humans in various roles and various technical support programs, and in a basic analysis of train control operation. Some highlights of the insights from the collected information are listed below:

• Technical systems must be built according to the needs and domain-specific knowledge of professional individuals.
• In future developments, a focal point should be the professional environment and the cognitive aspects.
• Increasing the situation awareness of future controllers is of high importance.
• Develop a real-time plan which can be adjusted, and make it available for all agents to see, so that all agents feel "in the loop".
• The real-time plan should execute automatically without changing the operator's planned routes.
• The planning interface should contain all information needed, and changes in the train traffic should be made in the same interface.

(Andersson et al., 2015)

Real-time control systems for railways have to be dynamic, safe and precise, which results in a complex system. The traffic controllers are trained to handle large amounts of information, and they can overview, interpret and use in real time an almost unlimited amount of information. Therefore, the interface was not built to be simple and downscaled but instead to be more efficient and usable
(Andersson et al., 2013).

The complexity is defined by the properties of the information given by the process that is being controlled (Andersson et al., 2013). The research group found that interfaces fail if the developer misunderstands the complexity and therefore does not provide the skilled operators with enough tools to work with (Andersson et al., 2013). Humans need a regular environment to be able to predict outcomes, and they also have to learn these regularities by practising them. When this is achieved, expertise and intuitive skills are developed. In other words, train controllers have learned to handle intense rush hours and to plan new routes in near real time as new information flows in. Consequently, focus was set on understanding the work process of tracking diagrams and panels, and how controllers solve conflicts when they occur (Andersson et al., 2013).

A model was introduced to steer the research in the right direction for systems supporting human control of complex, dynamic processes. The model is called GMOC, an acronym for Goal, Model, Observability and Controllability (Andersson et al., 2015). These four areas are seen as necessary to take into account when building systems for human operators to achieve control.

1. Goals
The goal is the basic specification of what the operator will, must, should, or wants to achieve: a specification of the objectives of the control process.

2. Models
The model includes an understanding of the user's work, fellow employees and the control systems; in other words, their whole work environment.

3. Observability
Observability refers to the ability of the system to produce information that can be used and interpreted by the operator. It can be visual observation of information, such as an interface, signal lights, or other digital instruments.

4. Controllability
Controllability means the ability of the system to provide means for the operator to gain control over the process.
It refers to different means of interaction, for example keyboards, buttons, other steering mechanisms or voice control (Tschirner, 2015).

From the information collected over the years and from the analysis of complexity, the research group developed prototypes for the real-time interface. After evaluation experiments in laboratory environments, the prototypes were developed into a full-scale system named STEG. It was later implemented in Norrköping (see figure 1) and Boden, two of the control centres in Sweden. The implementations in the two control centres resulted in improved support for the controllers and simplified the re-planning (Andersson et al., 2015).

Before STEG, the controllers used a paper-based time-distance graph for re-planning routes (see figure 2) and for remembering what needs to be done at what point in time. The paper-based planning required a lot of memory from the controller, which resulted in a heavy cognitive workload. This is because, during re-planning, the controller drew new timetable lines on a paper sheet and used it to remember what had to be done at which point in time. At the same time, new information could appear on the screens or by phone, and because the new paper-based plan was not integrated into the system there was a big risk that the different automatic functions in the system would work against the new plan. Therefore, the controller always had to be alert in different systems and remember the planned changes. With STEG the planning is done in the software, with results showing immediately relative to other trains, and the plan is integrated into the automatic system (see figure 3) (Andersson et al., 2013).

Figure 1. Control center in Norrköping

Figure 2. Paper-based time-distance graph

Figure 3. STEG

2.2 GridRail

After the implementation of STEG in Norrköping and Boden the research continued, but instead of focusing on the need, the question was why the traffic controllers actually liked the system.
The process of focusing on usability during all stages of development, and further on during the system's whole life cycle, is called User-centered system design (UCSD) (Gulliksen et al., 2003); it provides valuable information when building future systems. The hypotheses for why the controllers liked STEG came down to the following:

• Trend lines visualizing the trains' future positions give support to the train traffic controller
• The geographic representation in the software system makes it easier for the traffic controllers to plan their routes
• The ability to re-plan in the interface makes it more usable than the paper-based planning

The ongoing research set out to explore these hypotheses and started with the first one mentioned above: that the trend lines improved performance. In order to test just the trend lines, they were isolated from the whole system. A useful tool when testing a complex and dynamic system is to build a microworld (see sections 3.1 and 3.2), a simplified model of the real and complex system STEG.

2.3 Recent evaluation experiments with GridRail

Two experiments have been done with the microworld GridRail, which was programmed to be a simple simulation of the complex system STEG (more about microworlds in sections 3.1 and 3.2). In 2016, Experiment 1 was conducted by Sercan Caglarca to explore whether the presence of trend lines in a simple dynamic system accelerates learning and improves performance. This was based on the hypothesis that the trend lines were part of why STEG was perceived as a good system. The trend lines visualize the history, the present situation and the future position of the planned traffic (Caglarca, 2015). Additionally, the experiment examined whether the introduction of a time target, where the participants were given a finish time to aim at, would affect the behaviour of the participants. The experiment was divided into two blocks with 20 attempts in each block.
Different instructions were given depending on the game, with or without trend lines (Caglarca, 2015).

The first hypothesis was falsified. It took the group with trend lines more time to complete the task in the first 20 attempts than the group without trend lines. In the last 20 attempts, the participants without trend lines almost stopped improving their performance, while the participants with trend lines kept improving slightly, so the performance levels of the two test groups converged (Caglarca, 2015).

For the conditions with and without a target, results showed that if the participants did not have to finish the game within a specific time, there was no significant difference in performance between the groups with or without trend lines. However, if the participants were told to aim at 32 seconds for each round, the participants with trend lines performed significantly worse. Furthermore, a post-hoc comparison showed that the mean time spent in a trial was higher for participants with trend lines and a time target than for participants without trend lines and with a time target. The trend lines thus did not help the participants to solve the puzzle when a target was introduced. Instead, the effect was negative, with a higher mean time spent on a trial (Caglarca, 2015).

It was concluded that the presence of trend lines did not improve performance and slowed down learning among the novice users over 40 trials. Experimental evidence also confirmed that when trend lines are visible to the participants, the introduction of a target imposed a heavy cognitive load, which led to the task being perceived as more difficult (Caglarca, 2015). Caglarca suggested slowing down the trains in order to decrease the difficulty and to see whether the results then differed.
Moreover, he suggested exploring the relation between performance and long periods of practice, where the participants could get two or three days of practice (Caglarca, 2015).

Another recent experiment with GridRail was also done by Sercan Caglarca, later summarised by Axelsson, and it built on the previous experiment mentioned above. This time the experiment focused on how the performance and learning of novice users are affected by the presence of trend lines in the interface of a dynamic system, and whether the introduction of a target actually affects user behaviour. A final question was how the perceived difficulty is affected by the presence of trend lines and a target (Axelsson, 2016). Participants used GridRail similarly to Experiment 1, except that a session contained 3 blocks of 20 trials instead of 2, and the participants returned the next day for a second session of 3 blocks of 20 trials, and a month later for a third session of 1 block of 20 trials (Axelsson, 2016).

The results showed that Performance Time means were higher in the condition with both trend lines and a target time compared to the conditions with only trend lines or only a target time. There was also a significant interaction in perceived task difficulty when the participants had both the trend lines and a time target (Axelsson, 2016).

2.4 Future experiments

In the recent experiments, the cognitive workload has been shown to be higher when participants are exposed to trend lines. This can be explained by the fact, as mentioned earlier, that humans need to learn regularities by practising them in order to achieve expertise or intuitive skills. By having the participants come back another day, an attempt was made in the most recent experiment to increase their expertise. However, the participants might not have understood that the trend lines were something they actually should learn to use, and it is possible that they did not reflect upon them.
The instructions for the participants with trend lines were extended with the following sentence: "In the top of the game you will see lines representing a prognosis of the trains future horizontal positions". In a two-page instruction manual this might have been overlooked. One of the most basic rules when it comes to writing a text or designing a website is that the more important something is, the more prominent it should be. A larger heading, bolder text or even a distinctive colour are examples of how a reader can identify and understand what is important in the text. Another important aspect is that people often scan a text and do not always read it word by word; this might be different when a participant is given an instruction manual rather than browsing a website. However, we tend to think that our behaviour is more sensible than it really is (Krug, 2006).

To see whether the participants perform better when actively learning and noticing the trend lines, the participants will receive an instruction in which the importance of the trend lines is explicitly described. Furthermore, to understand whether the participants actually use the lines, even if they appear on the screen, a Think Aloud (TA) study will be conducted, which can add insight into the intentions and strategy of the participants. During the analysis, a comparison will be made between what the participants said they did, according to the TA recordings, and what was seen during the recordings. This might help to shed light on previous experiments, where the analysis was based on an interview, and also to understand more about the perceived cognitive difficulty and the actual difficulty.

3. Theory

3.1 Dynamic Decision Making

Dynamic decision making means that the choices made are based on a series of earlier decisions. In turn, the decisions depend on a changing world, which forces the choices to be made in real time and without long deliberation (Brehmer, 1993, 211).
So a decision is based on earlier decisions, and a series of decisions is needed to reach a goal. In other words, the most recent choice made will constrain the choices that follow, and the state of the problem will depend on the decision maker's actions in real time (Edwards, 1963).

Dynamic decision making can be studied in the field where a real system is in use, such as a train traffic control centre. However, the data collection is time consuming, and the data gathering itself can be dangerous because it requires some level of intervention in the usual work process, whose effects on the real-time planning of trains cannot be foreseen (Logie, 2011). When studying dynamic decision making in an isolated part of a program, using a microworld is helpful (Brehmer, 1992).

The term microworld was introduced by Seymour Papert and refers to a "subset of reality or a constructed reality so as to allow a human learner to exercise particular powerful ideas or intellectual skills" (Papert, 1980). So the idea of microworlds emerged from a learning perspective. Nowadays, however, microworlds are used in the evaluation of decision making and in usability studies. A microworld can be described as a programmed simulation of an environment which enables a dynamic process where the user manipulates and explores the program.

3.2 Microworld

The use of microworlds fits the purpose of studying decision making in a complex environment like train traffic control. With a microworld study, the inclusion of dynamic dimensions, stress and real-time decision making is feasible. This is because the main purpose of a microworld is to simulate dynamic decision making, so by nature it has to have a level of complexity. Microworlds are designed to give feedback that is not a hundred percent clear: the user has to search, interpret and then decide what the next step should be.
Also, the feedback is not given immediately, which is characteristic of a complex system where the user has to use a feedforward strategy (Brehmer, 1992). The progress of the simulation is therefore not controlled by the time it takes for the user to process and think; instead it is an ongoing process.

Four conditions should be remembered when building a microworld. There has to be a goal for using the system, the user must understand which state the system is in, and the system itself has to be interactive so that the user is able to change the state with various actions. Furthermore, the microworld must be a model of a system, or at least the controller should behave as if he or she has some kind of familiarity with the system and its controls (Brehmer, 1992).

3.3 Think Aloud

For an evaluation experiment, different methods can be chosen. One of the most used methods is the Think Aloud (TA) method. TA testing will not prove whether one interface is better than another, but instead provides input on why an interface is or is not preferred. When we are using an interface, anything can make us stop and think, even if we do not reflect upon it, such as colouring or clever names. In general, people do not like to be disturbed by having to think about the interface when they want to complete a task (Krug, 2006). By using TA we want to capture those kinds of sudden and unexpected thoughts (Krug, 2006).

There are two types of TA: Retrospective and Concurrent. In the latter, the participants are asked to talk aloud while completing a task or a set of tasks; this type enables an understanding of what participants think about the interface and how the process of using the trend lines plays out. Instead of talking aloud while completing the task, the Retrospective approach asks the participant to do the task in silence.
Only afterwards does the participant comment on their experience of the interface, which can provide additional insight into the participant’s intentions and strategy. To summarise: the Retrospective type focuses on why the participant used the interface in a specific way, and the Concurrent type focuses on how the participant uses the interface (Hanington and Martin, 180). In this study the TA is used to extract information on how the students use GridRail and what they actually think about and use as tools when solving the task. Furthermore, a good reason for choosing a TA rather than only an interview after the participants have used GridRail is that humans are good at rationalising. They may make up convincing reasons for their behaviour after the event, presumably making use of theories about what is appropriate. If parts of the task are done unconsciously, they are not available for report later on. The person reporting may say what he genuinely thought he did, but this may not be what he actually did (Bainbridge, 1999).

For the TA experiment to give results, the observed performance must first be translated into data and then analysed. To choose suitable analysis tools, the data must be divided into hard and soft data. Hard data is data in the form of numbers or graphs and can therefore be used in various statistical analyses. Soft data, on the other hand, consists of interpretations and opinions. In this thesis we are more interested in hard data, and to turn a Think Aloud into hard data the recordings should be transcribed (Ericsson and Simon, 1993).

3.3.1 Pre-processing

Words that are repetitions and sounds of stress are eliminated in the transcription. This step is called pre-processing. When the transcription is done, the next step is to encode the text by translating it into the terminology of the theoretical model used.
The terminology is determined beforehand, and the translation is then done by a human who judges the information in each segment independently of the surrounding segments (Ericsson and Simon, 1993). During the translation each phrase can be analysed by content analysis. At this stage phrases are categorised by type and the number of phrases in each category is counted. The categories of phrase types can be based on the referents of content words in the phrases, their syntax or the implied cognitive processes (Bainbridge, 1999).

3.3.2 Basic assumptions for the Think Aloud

• Verbal behaviour is a recordable behaviour.
• The cognitive processes that generate verbalisations are a subset of the cognitive processes that generate any kind of recordable response behaviour.
• The participant’s behaviour can be viewed as a search through a problem space, accumulating knowledge about the problem situation as he or she goes on.
• Each step in the search involves the application of an operator to knowledge held by the participant. Application of the operator brings new knowledge, moving the subject to a new point in the problem space.
• The verbalisations of the subject correspond to some part of the information he or she is currently holding, and usually to information that has recently been acquired.
• The information consists primarily of knowledge required as inputs to the operators, new knowledge produced by operators, and symbols representing active goals and sub goals (Ericsson and Simon, 1993).

3.3.3 Problems with Think Aloud

The participant may not report what is obvious to him. He might collect information that he does not mention while reporting other activities. Also, participants might use beginner’s methods, or do things in sequence rather than several things at the same time, because that is easier to describe (Bainbridge, 1999). Furthermore, regardless of whether the thoughts are verbal or non-verbal, someone who is reporting can choose what to make public.
And because participants want to help, they often say what they think the experimenter wants to hear instead of what they actually think. The fact that something is not mentioned in the protocol does not prove that the operator does not know it. For this reason, it is good to follow up the Think Aloud with an interview (Bainbridge, 1999).

4. Methods

Every piece of research based on usability is part of the ongoing project of understanding users of complex systems and how to build better systems (Kuniavsky, 2012). The future users may be train controllers or strangers who have never used complex systems, but regardless of the user, the more experiments we conduct the bigger our knowledge bank will be.

4.1 Participants

To make experiments easier to compare, the differences in participant demographics should be small. Earlier experiments were run with participants from Uppsala University’s technical programs and PhD students, with an average age around 25 (Sercan, 2015, 45). Table 1 shows the requirements. The preference for participants who have taken a course in HCI is based on the choice of method. A TA study is only successful if the participants do not forget to talk aloud while conducting the experiment. People who are familiar with this type of study might feel more comfortable during a TA and might understand that they should think aloud rather than explain what they do.

Table 1. Demographics for recruiting

| Demographics | Preference |
|---|---|
| Ages | 20-35 |
| Gender | AX |
| University education | Ongoing |
| Preference (not compulsory) | People who have taken at least one HCI course |
| Experience with GridRail | None |
| Experience with computer products | High; at least two technical courses |
| Targeting single or multiple group | Single group |
| Undesirable characteristics | Experience with GridRail |

4.2 Design

The intention of the experiment is to understand whether or not the trend lines help the users to plan the train routes in the most efficient way.
Earlier research (see part 2.1) tracked the time it took each participant to complete each trial and compared trial times between the two groups, with and without lines. The results showed that participants with lines did not finish the trials faster than participants without lines. To further investigate whether the lines help to decrease the cognitive workload, these experiments will not focus only on time. Participants’ expressions and reactions during game play, and how they plan and drive the trains, will also be included. The expressions and reactions are most easily collected via a Think Aloud (see section 3.3), because the thoughts are produced spontaneously during the game, and collecting them afterwards, for example via a survey, would not give the precise content. To collect data about how the trains are driven, the number of collisions will be counted. Participants who play the game with the intersection lines may, perhaps subconsciously, play so that the trains do not collide, because the idea of the intersection lines is to show where the trains intersect and thereby help to avoid collisions. A conceivable scenario is that participants without lines speed up the trains to finish the game more quickly, without considering that the trains should not collide. Therefore, collisions will be counted, and the instructions for both groups will state explicitly that the goal is to finish the game as quickly as possible without collisions. This also increases the chance that the complexity is high for both groups.

During the experiments only one evaluator will be present in the room together with the participant. More evaluators could affect the participant, and we want to examine how participants solve the game when they behave as they would in their own home. In other words, having only one evaluator present may make them more relaxed.
To compare the outcome between participants, a controlled setting, or at least similar, isolated rooms, is needed, so that outside influences do not affect the experiments differently. Therefore, one room will be booked for all the experiments and the setting will look similar for each participant (see figures 6 and 7). There will be two blocks of 30 min each, distributed on two different occasions (see tables 2 and 3). During the second block the TA session and a structured interview will be conducted. Half of the participants will have explicit information about how to use trend lines.

Table 2. Independent variables

| | Lines | No lines |
|---|---|---|
| TA in Block 1 | 8 participants | 8 participants |
| TA in Block 2 | 8 participants | 8 participants |

Table 3. Dependent variables

| | 1 | 2 | 3 |
|---|---|---|---|
| Collected data | Time per trial | Number of collisions per trial | Think Aloud transcript |

4.3 Pilot study

To test whether the experiment setup was feasible, a pilot study was conducted with the same game construction that was used in earlier research (see part 1.2), but with a larger number of trials, divided into two blocks with 20 trials per block and 40 trials in total. The participants, both with and without lines, remarked that the trains were moving slowly. The slow trains could be an obstacle to detecting whether the lines decrease the cognitive workload or not. The idea of using microworlds (see section 3.2) is to evaluate a complex system. Therefore, we increased the speeds of the trains, keeping them proportional to each other, to be sure to challenge the participants so that cognitive ability can later be compared between the groups.

4.4 Material

For this experiment a MacBook Pro (Retina, 13-inch, Late 2012) was used together with an external mouse. A computer was necessary in order to use the microworld GridRail. GridRail is software for testing how participants react or change their behaviour based on trend lines.
The interface of GridRail is shown in figures 4 and 5 below.

Figure 4. GridRail interface

Figure 5. GridRail interface

4.5 Procedure

All participants were given a time slot of 2 hours for the experiment, which took place in the same room (see figures 6 and 7). For the results to be comparable, the input from the surroundings should be similar. The evaluator is one outside factor that influences the behaviour of the participant. To decrease this influence, the evaluator followed a manuscript for the conversation with the participant (see table 4). The only ad hoc conversation was the answers to the participants’ questions, which naturally differ from situation to situation. The middle part of the manuscript was said before the block with verbalisation. For half of the participants this was done after block 1, and for the other half the whole text was read in one piece. Depending on the participant’s line condition, an instruction for GridRail was read after the first manuscript (see appendix 1 and 2).

Before the start of GridRail a survey on demographics and technology use was conducted on the screen. This makes it possible to exclude participants in order to make the group more homogeneous. After the participant had filled in the question sheet, the first trial began. To see the screen but still be out of sight, the evaluator sat behind the participant, taking notes. When 20 trials were over, GridRail showed a text saying that the participant had a 5 min break. The participant was then told that he or she could go to the toilet or just stand up, and could start again when ready. When the participants came back from the break, they were either told that they did not have to verbalise in this block or read the verbalisation instructions (see table 4).

Table 4. Manuscript for the experiment

Welcome

This experiment will be held in English - I hope it’s ok with you.
Please take a seat in front of the computer. Follow the text while I read the instructions.

With your permission, we’re going to record what you do on the computer screen and what you have to say. The screen recording will be used only to help us in the experiment because I don’t have to take as many notes. If you would, I’m going to ask you to sign something for us. It simply says that we have your permission to record you.

(Say this before TA-block)

This session we want to hear exactly what you do, so please talk aloud while you think, don’t worry that you’re going to say anything wrong. As we go along, I’m going to ask you to talk out loud, to tell me what’s going through your mind. This will help us. If you have questions, just ask. I will not be able to answer them right away, since we’re interested in how people do when they don’t have someone sitting next to them, but I will try to answer any questions you still have when we’re done. Do you have any questions before we begin?

To demonstrate how to talk during a Talk Aloud I will talk aloud while counting the windows in my mother’s home. Now it is your turn, please talk aloud while mentioning one country in each continent.

Good, now let’s begin!

Figure 6. Settings for the experiment

Figure 7. Settings for the experiment

5. Results

5.1 Learning curves

Figure 8 contains four graphs. The y-axis shows the time in seconds and the x-axis shows the trials. In both rows, the first graph represents the 20 trials in which the participants verbalised and the second graph represents the block in which there was no verbalisation. In other words, the trials in the first row go from 1-20 and then 21-40, and in the second row from 21-40 and then 1-20. The red lines represent participants with the trend line condition and the blue lines represent participants without the trend line condition.

Figure 8. Time - Interaction between all the variables
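The curves described for figure 8 are per-trial mean times within each group. As a minimal sketch of that aggregation, assume a hypothetical log with one record per participant and trial. The field names `lines`, `verbal_block`, `trial` and `seconds`, and all sample values, are illustrative assumptions and are not taken from the GridRail logs.

```python
from collections import defaultdict
from statistics import mean

TRIALS_PER_BLOCK = 20  # from the design: two blocks of 20 trials each


def learning_curves(records):
    """Mean completion time per trial for each (verbal_block, lines) group.

    Each record is a dict with hypothetical keys:
      lines        - True if the participant played with trend lines
      verbal_block - 1 or 2, the block in which the participant verbalised
      trial        - trial number, 1..40
      seconds      - completion time for that trial
    """
    times = defaultdict(list)
    for r in records:
        times[(r["verbal_block"], r["lines"], r["trial"])].append(r["seconds"])

    curves = defaultdict(dict)
    for (verbal_block, lines, trial), values in sorted(times.items()):
        curves[(verbal_block, lines)][trial] = mean(values)
    return dict(curves)


def verbalised(record):
    """True if the trial falls in the block where this participant verbalised."""
    block = 1 if record["trial"] <= TRIALS_PER_BLOCK else 2
    return block == record["verbal_block"]


# Made-up sample values, only to show the shape of the input and output.
sample = [
    {"lines": True, "verbal_block": 1, "trial": 1, "seconds": 100.0},
    {"lines": True, "verbal_block": 1, "trial": 1, "seconds": 120.0},
    {"lines": True, "verbal_block": 1, "trial": 21, "seconds": 90.0},
    {"lines": False, "verbal_block": 2, "trial": 1, "seconds": 130.0},
]
curves = learning_curves(sample)
print(curves[(1, True)])  # one curve: trial number -> mean seconds
```

Each key of the result corresponds to one coloured curve in one row of the figure; `verbalised` can be used to split a curve into the verbalised and non-verbalised panel.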
5.2 Effects of trend lines

This section presents figures and statistical analyses for the research question of whether the trend lines in GridRail help the users. The results are divided into the mean time it took the participants to finish each trial and the mean time the participants spent thinking before each trial.

5.2.1 Boxplot - Mean time per trial

The boxplot in figure 9 shows the mean time per trial, divided between participants who played with and without lines, to visualise the difference. The statistical analysis is shown in table 6, and table 5 gives the boxplot in numbers.

Figure 9. Time - Interaction between all the variables

Table 5. Box Plot.

| Variable | No Lines 1-20 | No Lines 21-40 | Lines 1-20 | Lines 21-40 |
|---|---|---|---|---|
| Mean | 119.700 | 87.917 | 110.485 | 89.721 |
| Minimum | 78.604 | 74.154 | 87.986 | 73.182 |
| Median | 101.858 | 87.116 | 101.378 | 91.387 |
| Maximum | 303.540 | 128.806 | 208.230 | 103.933 |

5.2.2 Boxplot - Mean time before each trial

The boxplot in figure 10 shows how long each participant waited before starting the next trial; this is assumed to be time in which the participants think and plan their next trial.

Figure 10. Planning time before each trial

5.2.3 ANOVA for Mean time

Table 6 shows the result of the multi-way analysis of variance (ANOVA). The ANOVA was conducted with performance time as the dependent variable; the conditions were Block (whether the participant started with verbalisation or not) and Lines (whether the participant had trend lines or not). A decision criterion of 5 percent is used for the analysis. With a multi-way ANOVA for the mean time we can see the variation within and between variables. This helps us understand if and how the intersection lines affected the mean time of the trials. With several variables involved, it is important to separate the impact of the lines from that of the other independent variables.

Table 6. ANOVA for mean time.
| Effect | SS | Degr. of Freedom | MS | F | p |
|---|---|---|---|---|---|
| (1)Lines | 0 | 1 | 0 | 0.000 | 0.998052 |
| (2)Block | 23709 | 1 | 23709 | 2.902 | 0.099523 |
| Lines*Block | 4004 | 1 | 4004 | 0.490 | 0.489642 |
| Error | 228727 | 28 | 8169 | | |
| (3)Verbal | 24608 | 1 | 24608 | 7.310 | 0.011525 |
| Verbal*Lines | 2549 | 1 | 2549 | 0.757 | 0.391560 |
| Verbal*Block | 164518 | 1 | 164518 | 48.873 | 0.000000 |
| Verbal*Lines*Block | 1167 | 1 | 1167 | 0.347 | 0.560636 |
| Error | 94254 | 28 | 3366 | | |
| (4)Trial | 305227 | 19 | 16065 | 16.608 | 0.000000 |
| Trial*Lines | 28297 | 19 | 1489 | 1.540 | 0.067102 |
| Trial*Block | 24766 | 19 | 1303 | 1.348 | 0.147693 |
| Trial*Lines*Block | 35625 | 19 | 1875 | 1.938 | 0.010044 |
| Error | 514594 | 532 | 967 | | |
| Verbal*Trial | 30158 | 19 | 1587 | 1.882 | 0.013361 |
| Verbal*Trial*Lines | 17920 | 19 | 943 | 1.118 | 0.327615 |
| Verbal*Trial*Block | 198045 | 19 | 10423 | 12.360 | 0.000000 |
| Trial*Lines*Block | 43679 | 19 | 2299 | 2.726 | 0.000122 |
| Error | 448637 | 532 | 843 | | |

Explanation of variables

Verbalisation: Mixes both blocks but groups the trials by whether or not there was verbalisation during those trials.
Block: Mixes trials with and without verbalisation but groups the trials by whether the participant starts or ends with verbalisation.
Trial: There are 20 trials per block, so the trials go from 1-20.
Lines: Separates the trials in which trend lines were present from the trials without trend lines.

Results from table 6

Main effect

There is no significant main effect of Lines or Block (p=0.998, p=0.100). However, there are significant main effects of Verbalisation and Trial (p=0.012, p<0.001).

Two-interaction effect

There are no significant interaction effects between Lines and Block (p=0.490), Verbal and Lines (p=0.392), Trial and Lines (p=0.067) or Trial and Block (p=0.148). Interaction effects are found between Verbal and Block (p<0.001) and between Verbal and Trial (p=0.013).
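As a consistency check on table 6, each F-ratio should equal the mean square of the effect divided by the mean square of its error term. The sketch below recomputes the ratios from the printed values. Which error term each effect is tested against is an assumption read off the table layout (every effect is paired with the next Error row below it), and the function names are illustrative.

```python
# Mean squares (MS) and F-ratios as printed in table 6. The pairing of each
# effect with an error term is an assumption based on the table layout: every
# effect is tested against the next "Error" row below it.
STRATA = [
    (8169, {"Lines": (0, 0.000), "Block": (23709, 2.902),
            "Lines*Block": (4004, 0.490)}),
    (3366, {"Verbal": (24608, 7.310), "Verbal*Lines": (2549, 0.757),
            "Verbal*Block": (164518, 48.873),
            "Verbal*Lines*Block": (1167, 0.347)}),
    (967, {"Trial": (16065, 16.608), "Trial*Lines": (1489, 1.540),
           "Trial*Block": (1303, 1.348), "Trial*Lines*Block": (1875, 1.938)}),
    (843, {"Verbal*Trial": (1587, 1.882), "Verbal*Trial*Lines": (943, 1.118),
           "Verbal*Trial*Block": (10423, 12.360),
           "Trial*Lines*Block": (2299, 2.726)}),  # label as printed in the table
]


def f_ratio(ms_effect, ms_error):
    """F statistic: effect mean square over error mean square."""
    return ms_effect / ms_error


def check_table(strata, rel_tol=0.01):
    """Return (effect, recomputed F, reported F, agrees) for every row."""
    rows = []
    for ms_error, effects in strata:
        for name, (ms_effect, reported_f) in effects.items():
            f = f_ratio(ms_effect, ms_error)
            agrees = abs(f - reported_f) <= rel_tol * reported_f
            rows.append((name, round(f, 3), reported_f, agrees))
    return rows


for name, f, reported, agrees in check_table(STRATA):
    status = "ok" if agrees else "MISMATCH"
    print(f"{name:20s} recomputed F = {f:7.3f}  reported F = {reported:7.3f}  {status}")
```

Small deviations in the third decimal are expected, because the table prints mean squares rounded to whole numbers.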
Three-interaction effect

There is no significant effect between Verbal, Lines and Block (p=0.561) or between Verbal, Trial and Lines (p=0.328). However, effects are seen between Trial, Lines and Block (p=0.010) and between Verbal, Trial and Block (p<0.001).

5.3 Analyses of collisions

Looking at figure 11 we can see a difference between Lines and Trial. Figures 12, 13 and 14 are different ways of representing the mean collisions per trial with respect to whether the participants had trend lines or not. Figure 12 separates the trials with respect to verbalisation and order; there we can see that the participants who verbalised in block 2 have a lower collision rate when playing GridRail with trend lines. Figures 12 and 14 show that the participants with lines started and ended with a lower mean time, except in block 1 when participants started with verbalisation.

The multi-way analysis of variance (ANOVA) for collisions gives the variation within and between variables. A decision criterion of 5 percent is used for the analysis. The ANOVA for collisions was done to see whether the impact of trend lines on collisions in GridRail is significant. There is no significant effect when looking at the trials overall. However, because of the visual difference in collisions in figure 12, we conducted another ANOVA (see table 8), and the results show a statistically significant impact of lines on collisions when playing in block 2.

5.3.1 Division by lines

Figure 11. Collision Mean per trial

5.3.2 Division by verbalisation

In the figures, Order 1 shows the first trials (1-20) and Order 2 shows the last trials (21-40). The results are then divided into two columns, where Verb 1 shows data from participants who verbalised in the first block and Verb 2 shows data from participants who verbalised in block 2.

Figure 12. Collisions - Interaction between all the variables

5.3.3 Participants without trend line condition

Figure 13. Collisions - Trials without trend lines

5.3.4 Participants with trend line condition

Figure 14. Collisions - Trials with trend lines

5.3.5 ANOVA for collisions

Table 7. ANOVA for collisions.

| Effect | Degr. of Freedom | MS | F | p |
|---|---|---|---|---|
| Lines | 1 | 153.32 | 1.920 | 0.177 |
| Block | 1 | 3.94 | 0.049 | 0.826 |
| Lines*Block | 1 | 197.66 | 2.476 | 0.127 |
| Verbal | 1 | 21.788 | 1.826 | 0.187 |
| Line*Verbal | 1 | 5.126 | 0.430 | 0.518 |
| Order*Verbal | 1 | 0.413 | 0.035 | 0.854 |
| Line*Order*Verbal | 1 | 9.976 | 0.836 | 0.368 |
| Trial | 19 | 3.162 | 1.007 | 0.451 |
| Line*Trial | 19 | 2.153 | 0.686 | 0.835 |
| Order*Trial | 19 | 1.943 | 0.619 | 0.893 |
| Line*Order*Trial | 19 | 1.951 | 0.621 | 0.891 |
| Verb*Trial | 19 | 3.247 | 1.065 | 0.384 |
| Line*Verb*Trial | 19 | 2.868 | 0.941 | 0.532 |
| Order*Verb*Trial | 19 | 2.879 | 0.945 | 0.527 |
| Line*Order*Verb*Trial | 19 | 2.573 | 0.844 | 0.654 |

5.3.6 ANOVA for collisions in Order 2

Table 8. ANOVA for collisions.

| Summary | 1 Block | 2 Block | Total |
|---|---|---|---|
| Without Lines: Count | 20 | 20 | 40 |
| Without Lines: Sum | 51.125 | 63.125 | 114.25 |
| Without Lines: Average | 2.556 | 3.156 | 2.856 |
| Without Lines: Variance | 0.398 | 0.422 | 0.492 |
| With Lines: Count | 20 | 20 | 40 |
| With Lines: Sum | 27.625 | 27.5 | 55.125 |
| With Lines: Average | 1.381 | 1.375 | 1.378 |
| With Lines: Variance | 0.132 | 0.169 | 0.147 |
| Total: Count | 40 | 40 | |
| Total: Sum | 78.75 | 90.625 | |
| Total: Average | 1.968 | 2.265 | |
| Total: Variance | 0.612 | 1.101 | |

| Source of Variation | SS | df | MS | F | p-value | F-crit |
|---|---|---|---|---|---|---|
| Sample | 43.697 | 1 | 43.697 | 155.634 | 4.489E-20 | 3.966 |
| Columns | 1.762 | 1 | 1.762 | 6.278 | 1.436E-2 | 3.966 |
| Interaction | 1.837 | 1 | 1.837 | 6.545 | 1.250E-2 | 3.966 |
| Within | 21.338 | 76 | 0.280 | | | |
| Total | 68.635 | 79 | | | | |

5.4 Effects of verbalisations

The following results address the research question of whether participants can describe how they solved the task and give input on whether the trend lines are helpful. The figures show how verbalisation affects the mean time and what participants said during the verbalisation.

5.4.1 Mean time based on verbalisation

In figure 15 we can see the spread of mean time per trial based on the variable Verbal.
The variable is divided into two groups: trials played during verbalisation (1) and trials without verbalisation (2). This visualisation, together with the ANOVA in table 6, shows how verbalisation affects the mean time. Figure 15 shows that the trials in which the participants verbalised have both a higher maximum and a lower minimum than the trials in which there was no verbalisation. Overall, the trials without verbalisation had a lower mean time.

Figure 15. Mean time grouped by verbalisation condition

5.4.2 Variable interaction: Verbal and Order

In figure 16 we can see that participants finish the game faster in block 2 regardless of when they verbalise, but the variation within a group is bigger when the group is verbalising. The mean time for all trials in block 2 is similar regardless of whether one starts or ends with verbalisation. The fastest time in each group is similar for the trials in block 1, but the group that verbalises in the second block reaches a faster time.

Order 1 & Verbal 1: Block 1 trials for participants who verbalised in block 1.
Order 2 & Verbal 1: Block 2 trials for participants who verbalised in block 2.
Order 1 & Verbal 2: Block 2 trials for participants who verbalised in block 1.
Order 2 & Verbal 2: Block 1 trials for participants who verbalised in block 2.

Figure 16 shows the interaction between the order of the verbalisation and the verbalisation, and how it affects the mean time per trial.

Figure 16. Interaction between Verbalisation and Order

5.5 Analyses of verbalisations

The notes taken during the experiment are divided by both verbalisation and trend line condition and give information about the number of trains used at the same time. This indicates a difference in how GridRail is played depending on trend line condition. Table 9, which is obtained from the notes, indicates a difference in how participants play GridRail based on how many trains they use at the same time.
Reading the verbalisations in table 11, we can see that participants express that they feel stressed in all four conditions.

5.5.1 Summation from table 11

Comments about the trend lines:
• Now I’m going to think more about to look at the collision lines, they are hard to look for but very helpful
• Now I’m totally looking at the lines so much more than previously.
• I’m thinking that I should use the lines more but I’m stressed because I don’t want to collide and get a better time
• I’m just thinking of the lines and where they intersect
• You see directly when the trains are going to collide and then you also try to buy some time
• Its hard now to align (the trend lines) and when I have more trains it get problems

Positive comments from participants with lines:
• And the nice thing with this strategy is that I save the fast train last
• Its nice when you dont have to think about all the things at the same time
• You get happy when you see the time because you can finish the game very quickly.
• I’m getting closer to my best time so it makes me happy.
• This feels better timewise.
• That feels quite good timewise.
• Now I’m going to think more about to look at the collision lines, they are hard to look for but very helpful.

Positive comments from participants without lines:
• that is awesome.
• Oh no thats amazing. Now its quite easy because I can move them all straight forward.
• I think it was quite good how I did. Why not, it worked this time.
• This was quicker!
• That was good, I think Im improving. I stopped thinking
• This was pretty good

5.5.2 Breaking down numbers from notes

The data in table 9 is taken from table 10.

Table 9. Numbers from Notes.

| Condition | Driving multiple trains | Plans for a long time |
|---|---|---|
| No lines VB 1 | 3 | 3 |
| No lines VB 2 | 5 | 5 |
| Lines VB 1 | 7 | 2 |
| Lines VB 2 | 5 | 2 |

5.5.3 Notes from experiment by trend line condition and verbalisation order

Table 10. Notes.
No lines VB 1: Has difficulties driving the trains and focuses on rules; Driving with multiple trains but not succeeding in steering without colliding; First just one train and then almost in block 2 the participant changes to multiple trains; Thinks a lot before every trial but still holds on to one tactic; One train at a time, does not run smoothly; Thinks a lot before every trial and reads instruction frequently; Many trains but collides a lot, changes tactic to driving 1-2 trains with lower speed; Drives slowly but still collides a lot

No lines VB 2: Focus mostly on speeds rather than Lines; Uses one strategy, mostly drives with two trains; Thinks a lot before every trial, has different strategies; Multiple trains; Plans for a long while and tries to measure actual distance on the screen; Many trains at the same time. Thinks a lot before. Does different tests within one trial; Many trains. Tries to calculate distances and speed. Thinks a lot before each trial; Many trains but gets stuck in trying to get trains into one intersection. Many collisions; Starts with more than 3 trains, thinks a lot. Draws a plan on a paper.

Lines VB 1: Uses multiple trains; New tactic in block 2; First one train at a time, understands the lines in the middle of block 2; New strategies and drives multiple trains at the same time; Many trains at the same time. Thinks a lot before each trial. First just one train at a time and then to multiple trains, tries to calculate speed; Many trains at the same time (sometimes more than 4) new strategies; Starts with 1-2 trains and then drives 3 trains at the same time in block 2. New strategies

Lines VB 2: Testing a lot without a special tactic; Uses more than 2 trains at the same time, changing tactics a lot; Many trains at the same time and new strategies; More than 2 trains at the same time, slowly and does not speed up after aligning the lines; Many trains at the same time. Hardly any collisions. Forgets one train many times. Drives few (mostly 1-2) trains at the same time, thinks a lot before each trial; Many trains. Thinks a lot before each trial. Tries to calculate distance on paper; Drives only 1-2 trains at the same time. Confuses direction of train

5.5.4 Verbalisation transcript

Table 11. Verbalisation transcript.

Lines and Verbalisation in 1 block

| ID | Verb |
|---|---|
| 2 | Why is it stopping. I keep forgetting. |
| 2 | Whoops thats going to crash. Thats not god. |
| 5 | Not fast enough for the controlls. Its hard to get it right. |
| 5 | I could definitey done that one better. |
| 5 | Its hard to control, I want to go with multiple trains and its hard to |
| 5 | control all the trains at the same time |
| 5 | This did not work out at all. I dont learn from my misstakes. |
| 5 | You should not be tired when you do this. |
| 5 | I want to make a better time but when I try it I keep on crashing. |
| 12 | Its hard to focus the game pulls you in and its hard to multitask |
| 12 | We are not really made for it. Its all about moving multiple trains at the same time. |
| 12 | I totally see this beging used in education. |
| 12 | Where there job consists of making the train move without any collision ahappen |
| 14 | I guess I failed. |
| 14 | I’m thinking to try have them moving at the same time and then just pass each other. |
| 14 | I had problems. |
| 18 | The average time is sinking but im trying to get it |
| 18 | Im starting to get tired. A little bit better but still bad strategy |
| 18 | im trying the "meeting in two places" strategy but it does not get any better. |
| 22 | Im trying to take in consideration the length of the road and the speed of the train |
| 22 | I dont know how to do this because I dont know the exact distance. Its mostly my intuition |
| 25 | Im thinking that my goal is to just click the trains once and not have to stop them at all. |
| 25 | No Im going to think more about to look more at the collisions lines |
| 25 | they are hard to look for but very helpful. |
Lines and Verbalisation in 2 block 4 And the nice thing with this strategy is that I save the fast train last 4 Its nice when you dont have to think about all the things at the same time 4 Its hard now to align. When I have more trains it get problems, 4 im trying to not do to many stuff simultaneously 4 because you forget easily if two trains collide 4 so im trying to do as easy as you want for yourself and at the same time. 4 It gets harder when I have the fastest and if.. 4 I only have two trains to think about its a low risk. 4 You see directly when the trains are going to collide and.. 4 then you also try to buy some time 7 Okey its bad 7 Okey I should not have done that. That did not work.. 7 changing direction. Still a long waiting. Continued on next page 37 ID 7 9 9 9 9 9 13 13 13 13 13 13 13 13 13 13 13 13 13 13 19 19 19 19 19 19 23 23 23 23 23 23 28 32 32 32 32 32 32 32 32 32 32 32 32 38 Verb Maybe this was to much It so hard to controll them all if you move many but you still want to use many to get a lower time This gets so stressfull if you have all the trains. This was to much clicking. Its always the same speed even though I think that I’am succeeding. Sometimes its easy to forget that you need an empty spot.. in the parking so you can fit another train. But once you’ve solved that its pretty okey. Sometimes when I think that there were passangers in the trains I think they would get sick because I speed it up and then slow it down. Its so weird because its only a game but its stressfull to see where the trains will can see collide. You get happy when you see the time because you can finish the game very quickly. Im getting closer to my best time so it makes me happy. I want to train many trains simultanously but if they are to close and something happens then you don’t have the time to stop them. And they will collide with each other even though they travel at the same direction. 
It's like every time I get a pretty fast time I get too excited and then I overthink so I forget one and make stupid mistakes. That feels quite good timewise. I don't know how to speed up and get as many trains at the same time as possible. This is going to be bad. This feels better timewise. I'm too stressed, I'm doing stupid mistakes when I'm stressed. I'm getting annoyed with myself but it might be a bad choice. It's not the worst decision I made. I always end up with the slowest one so what I want to go for is the average speed meets the slowest one to meet but it is really hard to make that really happen as smooth as possible. Didn't make much of a difference. Now I'm only thinking move faster, go faster and I'm already thinking about my next trial. Now I'm totally looking at the lines so much more than previously. I'm thinking of my time constantly because I want to be better. I'm thinking that I should use the lines more but I'm stressed because I don't want to collide and get a better time. I'm just thinking of the lines and where they intersect, even though my time is not the best I like it more because they give me more control over the situation and I like that. I was too fast in my mind. I'm going to go with more trains and see what happens. It's easier if only two trains are going against each other. I know that I have to change strategy but I don't know how. Everything was going so good and then I forgot everything.

No lines and verbalisation in block 1
ID: 1 1 1 1 1 1 1 1 1 1 1 6 6 6 6 6 6 6 6 6 6 10 10 10 10 10 15 15 15 15 15 15 15 15 20 21 21 27 27 27 31 31 31 31
Verb: That is awesome. Wow, it did not go, it sucks, I understand. This one was a tough one. I don't know how to play this. Oh it worked, let's do this stupid tactic again. Will it crash? Oh no, that's amazing. Now it's quite easy because I can move them all straight forward. It was not easy but it was not too difficult but I would say more difficult than easy.
I think it was quite good how I did. Why not, it worked this time. How is that possible? I am doing too many things at the same time. That was stupid. What am I doing. I tried a new way to do it but I just got lost. I think it's easier to do it as I did before. It's difficult. This was quicker. Okay, this one was a bit quicker. Bad. This is messy. This is worse. It's getting quite stressful. Okay, I forgot about the white one. Not too bad. This is not good timing. Okay. It was also really quick even if it was not good. No, now it's messy, I was trying to increase the train but it did not do much for the timing. I'll try this again because I did some errors. I'm thinking if I can improve it somehow. Because now I just do trial and error. I don't know if this is the best strategy but sometimes.. you just have to pick something and optimise it. It's easy to solve but hard to master. Okay, so this did not work out too well. This did not go too well. You need to have a sense for multitasking for this to work. It still takes over a minute. The hardest part is that you want all trains to move simultaneously because you want the fastest time but then it's hard to control all the trains. I'm feeling a bit frustrated. It feels as if it should not be that hard. Okay, so that seems like a good way of doing this. It's hard to avoid collision. I think I have messed it up already. This is not fun. It's not hard to play but it's hard to understand the rules. Oh no, they would collide. The red and the white train can meet if you make them to max speed but the hard part is not make them crash and also turn on the other trains. This game is quite hard I think. Before I controlled the speed of all trains and it's a little bit confusing so if I control one and the other are in between. It did not work out at all. I am actually a bit confused.

No lines and verbalisation in block 2
ID: 3 3 3 3 3 3 8 8 8 11 16 16 16 16 16 17 17 17 17 17 17 17 17 17 17 30 30
Verb: I thought wrong.
I'm not sure what to do, maybe put the grey in the position, no, then they would crash. That was a failed attempt. No, that won't work. Maybe I should try something else. And now we have a crash. I think I know what to do now, I just need to do it precisely. So I just tried to do three trains at the same time but I failed. I'm trying to use many trains at the same time but it does not work, let's try it again. So now I tried to do three trains at the same time but failed. I'm trying this new game. It was very strange that it did not really do what I thought it will do. It's really fun. It was not as easy as I thought. That was good, I think I'm improving. I stopped thinking, I just keep getting worse. This was pretty good. I'm feeling that I'm too smart but I can't play this. I don't really know the logic behind this strategy but trial and error has shown that this strategy is effective. I'm starting to think that I should do something else but I don't know how to do it more efficiently. I tried to make the crossing with all the trains at the same time but it's really hard and apparently I also forgot one train. I think that if I do this properly my time will be better than my record. I don't have any better ideas unfortunately. I feel kind of stuck and that I don't do progress anymore. Maybe I could do a quicker execution. No, that was bad, really bad. This was a bad idea, they collide, and let's try something else.

6. Discussion

6.1 Analysis of results

To answer our research questions, whether the trend lines in GridRail help the users and whether the participants' descriptions of how they solved the task give input on whether the trend lines are helpful, we will go through the results. In section 5.1 the learning curves show that there is no significant difference in how the participants' mean time decreases between the conditions with and without trend lines.
Looking at the boxplot in section 5.2.1 and the analysis of variance for mean time per trial, the main effect of trend lines is not significant in the mixed ANOVA, but the variable is included in a significant three-way interaction between trials, lines and the block in which the participants verbalised. We can see a difference in the mean time each participant took to think before starting a new trial (see the boxplot in section 5.2.2). Participants with trend lines did not take as long pauses between the trials. There may be various reasons for this. Section 3.1 describes the chain of decisions that have to be made in order to reach a goal, and notes that the cognitive workload is lower after repeating the same task multiple times because it is connected to the short-term memory. One idea is that participants with trend lines stick to one or a few strategies, repeating them, and therefore do not have to go back to the process of making new decisions and re-learning. Additionally, because the mean time of the trials does not differ in the same way as the planning time, and the participants with trend lines do not re-plan their actions to the same extent, it could indicate that they are satisfied with their strategy. That is, the participants align the trend lines so that no collisions occur and then wait for the trial to finish.

On the subject of different playing strategies, figures 14 and 12 indicate a difference in the number of collision incidents per trial between participants with and without trend lines. The mixed ANOVA for all the trials and conditions gave no significant results. However, a two-way ANOVA constructed for trials where verbalisation was done in block 2 showed a statistically significant effect of trend lines for participants who verbalised in block 2.
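
To make the kind of between-groups comparison discussed above concrete, the following is a minimal sketch of how the F statistic of a one-way ANOVA is computed. It is a deliberate simplification: the analyses in this thesis were a mixed ANOVA and a two-way ANOVA, which involve further factors. The function name `one_way_anova_f` and the collision counts are hypothetical and are not data from the experiment.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the F statistic for a one-way, between-groups ANOVA.

    groups: list of lists, one list of observations per condition.
    """
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical collisions per trial (invented numbers, NOT study data).
with_lines = [1.0, 2.0, 3.0]
without_lines = [2.0, 3.0, 4.0]
f_value = one_way_anova_f([with_lines, without_lines])
print(f"F(1, 4) = {f_value:.2f}")
```

For these toy numbers the statistic is F = 1.5. In a real analysis it would be compared against the F distribution with (1, 4) degrees of freedom to obtain a p-value.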
One could argue that keeping an eye on the trend lines requires the same mental resources as actually planning and driving the trains, and that the mean time results support this (more about cognitive workload can be read in section 3.1). One could also argue that these experiments have given more legitimacy to earlier experiments, in which the hypothesis about trend lines being helpful was falsified, adding transparency and depth for further discussions. Yet, as mentioned above, there is a difference in how long each participant waits and thinks before each trial. Participants with trend lines appear more satisfied with their chosen option, while participants without lines might not remember entirely how they arranged the route, since they do not see the intersection lines throughout the trial, which makes them try new ideas. Another conceivable scenario is that the participants without lines are exposed to a higher cognitive workload, resulting in high cognitive fatigue and longer resting and planning periods in between trials (read more about cognitive fatigue in section 3.1).

The trend lines seem to make the participants play the game differently, even though the resulting mean time is similar. A difference is also observed in the number of trains driven at the same time. Participants with trend lines use multiple trains at the same time more often, as seen in table 9.

So, to answer whether the trend lines in GridRail are helping the users, we can look at what the participants said. As seen in subsection 5.5, in the few cases where the participants mention the trend lines, the majority express a positive opinion (5 out of 6). We can also see that the trend lines are helpful when it comes to avoiding collisions, as mentioned above.

Based on the statistical analysis in table 6 we found that verbalisation has a significant main effect on how well participants perform.
In figure 15 we saw that the participants who verbalised in the first 20 trials had a higher mean time than participants who did not verbalise in the first 20 trials. However, the improvement between block 1 and block 2 was significantly higher for the group who did not verbalise in block 2 and hence, in the last 20 trials, the mean time evened out; both groups ended with almost the same mean time.

The participants' descriptions of how they solved the task have given input on whether the trend lines are helpful. However, the descriptions have not given more insight into previous experiments made on GridRail; comments about trend lines were few.

6.2 Discussion of methods used

One of the goals of the experiment was to gain new insight and shed light on previous experiments. However, these experiments have instead raised new questions about trend lines. The instructions given to the participants told them to drive the trains to the opposite stations, without collision, as fast as possible. The mean time data showed that the lines did not have any significant effect. However, when looking at the videos of the trials we can see a difference in behaviour. We tried to highlight this with data on the number of collisions, how long it takes for a participant to plan before each trial and how many trains the participants drive at the same time; data that hint at a difference in behaviour but do not statistically prove it. Both the comments from the participants and the notes taken during the trials showed that some participants with lines did not use them in the beginning of the game. After noticing or understanding the trend lines, the participants expressed positive emotions. Excluding trend lines based on mean time per trial, before running an experiment based on the behavioural differences, would therefore not give the entire picture.
Further investigation is needed into the differences in the actual planning of routes before we conclude that trend lines do not help the users. This is because the aim of driving the trains, for both the controller and the participant using the microworld, is to smoothly plan and execute new routes with multiple trains. Measuring the time should therefore be just one of many factors.

6.3 Conclusion

The purpose of this study was to see how the game play was affected by the absence or presence of trend lines. A between-subjects study was conducted in which 32 participants used an already developed microworld called GridRail, a game with the purpose of driving 6 trains to opposite stations without collisions as fast as possible. There was no significant difference in how fast the participants finished each trial between the conditions with and without trend lines. However, comments in verbalisations and notes of how many trains each participant drives at the same time are an indication that the lines do affect how participants play, even though the finish time is not affected.

6.4 Future research

6.4.1 GridRail

GridRail should be made easier, without any exceptions to the playing rules. For example, it should be possible to enter an intersection even if there is one paused train on it, so that the train does not stop until it actually bumps into the other train. Furthermore, the trend lines should also show when trains from the same station will collide with each other (currently the lines only show when a train will bump into a train from the opposite station), and the circles that indicate how close two trains can be without colliding should be visible all the time (currently only the circle of the clicked/marked train is shown). These changes would make it easier for the players to focus on the actual task of driving the trains to opposite stations without collision and planning their routes.
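
The proposed extension of the trend lines to trains from the same station can be illustrated as a simple catch-up calculation. This is a minimal sketch assuming constant speeds on a single shared track; the function name, parameters and units are hypothetical and are not taken from GridRail's implementation. The `safety_radius` parameter plays the role of the circle that indicates how close two trains may be.

```python
from typing import Optional


def time_to_catch_up(gap: float, speed_behind: float, speed_ahead: float,
                     safety_radius: float = 0.0) -> Optional[float]:
    """Time until the rear train comes within safety_radius of the train ahead.

    Returns None if the rear train never catches up. All values are in
    arbitrary but consistent units (hypothetical, e.g. pixels and pixels/s).
    """
    distance_to_close = gap - safety_radius
    if distance_to_close <= 0:
        return 0.0  # Already inside the safety circle.
    closing_speed = speed_behind - speed_ahead
    if closing_speed <= 0:
        return None  # The train ahead is at least as fast: no collision.
    return distance_to_close / closing_speed


# A fast train starts 100 units behind a slower one from the same station.
print(time_to_catch_up(gap=100, speed_behind=30, speed_ahead=20))
# The same gap, but with the faster train in front.
print(time_to_catch_up(gap=100, speed_behind=20, speed_ahead=30))
```

One possible design would be to draw a same-station trend line whenever the returned time is shorter than the time the train ahead needs to reach its next meeting point; whether that fits GridRail's track model is an open assumption.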
The situation caused by the different rules and exceptions makes it difficult for the players; they tend to focus on what is allowed and on why the trains sometimes fit into the intersection and sometimes not. Since the lines do not show when a train will collide with a train from the same station, and participants with trend lines have more trains moving at the same time, this could explain why participants with lines have almost the same collision rate. Retrieving the number of collisions directly from the database would give more reliable results than counting collisions from the trial videos, since human error can give wrong numbers.

6.4.2 Experiment

Another useful change for future experiments is to have participants verbalise during the whole session, not just one block. Participants who started to verbalise during the first block continued subconsciously during the second block, which could have caused a longer mean time in the second block for that group. This change is also motivated by the significant effect of verbalisation that we obtained. Another useful change would be to have a collision counter visible after each trial, so that participants both with and without lines aim to drive the trains smoothly while driving multiple trains. Observations of how often players change tactics could be useful for understanding the difference in cognitive workload and for investigating further whether cognitive fatigue is more common among participants without trend lines.

7. References

Andersson, A. W., Jansson, A., Sandblad, B., Tschirner, S. Recognizing Complexity: Visualization for Skilled Professionals in Complex Work Situations. Uppsala University, Department of Information Technology. 2013.
Andersson, A. W., Sandblad, B., Tschirner, S., Jansson, A. Framtida tågtrafikstyrning, Sammanfattande forskningsrapport, Slutrapport från FOT-projektet. Uppsala University, Department of Information Technology. 2015.
Axelsson, A. Experiment write-up.
Division of Visual Information and Interaction. Uppsala University. 2016.
Bainbridge. Verbal reports as evidence of the process operator's knowledge. International Journal of Human-Computer Studies, 51(2). 1999.
Brehmer, B. Dynamic decision making: human control of complex systems. Acta Psychol. 81. 1992.
Brehmer, B., Dörner, D. Experiments with computer-simulated microworlds: Escaping both the narrow straits of the laboratory and the deep blue sea of the field study. Comput. Hum. Behav. 9. 1993.
Caglarca, S. Investigating the effects of trends in an interface to a dynamic system. Diploma thesis, Uppsala University, Department of Information Technology. 2015.
Edwards, W. Dynamic decision theory and probabilistic information processing. Human Factors, 4. 1963.
Ericsson, K. A., Simon, H. A. Protocol analysis, revised edition. MIT Press. 1993.
Gulliksen, J., Göransson, B., Boivie, I., Blomkvist, S., Persson, J., Cajander, Å. Key principles for user-centred systems design. Behaviour & Information Technology. November-December 2013.
Hanington, B., Martin, B. Universal Methods of Design: 100 Ways to Research Complex Problems, Develop Innovative Ideas, and Design Effective Solutions. Rockport Publishers. 2012.
Krug, S. Don't make me think - A Common Sense Approach to Web Usability. New Riders Publishing, Berkeley, California, 2 edition. 2006.
Kuniavsky, M. Observing the User Experience, volume 2. Elsevier Inc. Rockport Publishers. 2012.
Logie, R. H. The functional organization and capacity limits of working memory. Current Directions in Psychological Science, 20. 2011.
Papert, S. Computer-based microworlds as incubators for powerful ideas. The computer in the school: Tutor, tool, tutee. New York: Teacher's College Press. 1980.
Sandblad, B., Andersson, A. W., Kauppi, A., Isaksson-Lutteman, G. Human-Computer Interaction, Dept of Information Technology, Uppsala University, Sweden.
Tschirner, S. The GMOC model - supporting development of systems for human control. Diploma thesis, Uppsala University.
Department of Information Technology. 2015.

Appendix 1

Instructions

Goal: Drive trains from one station to the other without collision
Time: Approximately 60 minutes
Break: 5 min between sessions

Welcome!
You will play a game with the aim of driving trains to an opposite station without collision. You will get 20 trials per session.

Line example
A tool you can use to avoid collisions with the trains is available in the form of a line which gives a prediction of where the trains will meet.
The lines of the three trains (purple, white and grey) show that the trains will intersect at a point where there is only one train track = collision.
Here, the lines of the three trains (purple and grey) show that the trains will intersect at a point where there exists a meeting point = no collision.

Interface and control
Beneath the train rails you will find a control panel for selection of train and speed. The circles represent each train correlating to the train's colour and the speed goes from backwards (drag bar to the left), standstill (middle) and forward (drag bar to the right).

Trials
Goal: Move the three greyscale trains to the left station and the coloured trains to the right station.
Total amount of trials: 40 trials in total divided into two sessions of 20 trials.
Trial time: The timer starts once you select the first train. The time count will give you a hint of which solution is most optimum. It is shown after the trial.
After one trial: You will be given a possibility to start a new game by clicking on the start button.

If you have any questions, please ask the test leader now.
Good luck!

Appendix 2

Instructions

Goal: Drive trains from one station to the other without collision
Time: Approximately 60 minutes
Break: 5 min between sessions

Welcome!
You will play a game with the aim of driving trains to an opposite station without collision. You will get 20 trials per session.

Collisions
To avoid collisions, you have to be aware of the intersections.
A point where there is only one train track = collision.
A point where there exists a meeting point = no collision.

Interface and control
Beneath the train rails you will find a control panel for selection of train and speed. The circles represent each train correlating to the train's colour and the speed goes from backwards (drag bar to the left), standstill (middle) and forward (drag bar to the right).
Please note: You cannot click directly on a train to select it.

Trials
Goal: Move the three greyscale trains to the left station and the coloured trains to the right station.
Total amount of trials: 40 trials in total divided into two sessions of 20 trials.
Trial time: The timer starts once you select the first train. The time count will give you a hint of which solution is most optimum. It is shown after the trial.
After one trial: You will be given a possibility to start a new game by clicking on the start button.

If you have any questions, please ask the test leader now.
Good luck!