Evaluation of IVIS/ADAS using driving simulators
Comparing performance measures in different environments

Thomas Engen, Lone-Eirin Lervåg, Terje Moen
SINTEF Transport Research

1. BACKGROUND

Evaluation of traffic safety measures has traditionally been conducted as observational studies, such as before-and-after studies or comparative studies. Experimental studies have traditionally been seen as somewhat difficult to conduct, but they are now becoming more readily available. In this paper we demonstrate how to measure changes in performance due to the use of ADAS/IVIS and show how real-life conditions can be represented in experiments. Results from experiments are compared to observational studies, and experiments conducted in different environments are evaluated and compared. We demonstrate how to measure changes in performance and outline how this can be used to evaluate ADAS/IVIS. We show that experimental studies in driving simulators, on test tracks and in real traffic can be valid tools, but at the same time they demand special expertise to create good experimental designs. In several different projects we have compared results from driving simulators with results from observational studies in real traffic, other experiments and literature reviews.

2. INTRODUCTION

2.1 Experimental studies

Experimental studies have traditionally been seen as somewhat difficult to conduct, but they are now becoming more readily available. For studies of IVIS/ADAS, experimental designs are needed because we want to study the effects of IVIS/ADAS prior to mass production and common use in vehicles. At the same time we must validate the use of experimental methods to improve the truthfulness of the inferences we make through the experiments.

2.2 Behavioural validation

Physical validation and behavioural validation are the two main approaches to validating simulators. Physical validation concerns parameters such as how the simulated car performs compared to a real-world car. Behavioural validation is an assessment of how the driver reacts and performs within the virtual world of a simulation. This paper deals with behavioural validation. Different methods for behavioural validation exist (see for example Kaptein, Theeuwes et al. 1996). In this paper the validation of behaviour in the driving simulator is based on reducing four threats to validity (Cook and Campbell 1979):

• Statistical conclusion validity: The degree to which appropriate statistics are used to conclude whether the presumed independent and dependent variables co-vary; the validity of inferences about the correlation (co-variation) between treatment and outcome.
• Internal validity: The degree to which the result of an experiment can be attributed to the manipulation of the independent variable rather than to some other, uncontrolled variable.
• Construct validity: The certainty with which a measurement device accurately measures the theoretical construct it is designed to measure.
• External validity: The degree to which the results of an experiment can be generalized to different persons, settings, treatment variables and measurement variables.

Evaluation research places a great deal of emphasis on statistical conclusion validity and internal validity. Confidence intervals and other statistical tools are used to reduce this specific threat.
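To make the statistical conclusion validity point concrete, the following minimal sketch (with invented reaction-time data; it is not taken from the studies reported in this paper) computes a 95% confidence interval for a within-subject change in a performance measure between a baseline drive and a drive with an IVIS task active:

# Illustrative sketch only: 95% confidence interval for a within-subject
# change in a driving performance measure. All numbers are made up.
import numpy as np
from scipy import stats

# Hypothetical per-driver mean brake reaction times (seconds)
baseline = np.array([0.92, 1.05, 0.88, 1.10, 0.97, 1.02, 0.95, 1.08])
with_ivis = np.array([1.15, 1.22, 1.04, 1.31, 1.18, 1.25, 1.11, 1.27])

diff = with_ivis - baseline            # within-subject differences
n = len(diff)
mean_diff = diff.mean()
sem = diff.std(ddof=1) / np.sqrt(n)    # standard error of the mean difference
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95% critical value
ci = (mean_diff - t_crit * sem, mean_diff + t_crit * sem)

print(f"Mean increase in reaction time: {mean_diff:.3f} s")
print(f"95% CI: [{ci[0]:.3f}, {ci[1]:.3f}] s")
# If the interval excludes zero, treatment and outcome co-vary in the sense
# discussed under statistical conclusion validity.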
Within traffic safety research, methods such as controlling for regression effects and for factors like increases in traffic volume are used to reduce the threat to internal validity. Threats to construct validity are limited because effects are traditionally measured by accident rates. Meta-analyses are used to reduce the threat to external validity.

2.3 Measuring the influence of IVIS/ADAS

There are several different methods for measuring the influence of IVIS/ADAS. One must put special emphasis on construct validity when selecting performance indicators and measures that accurately describe the effects of IVIS/ADAS. Theories both of workload influence and of measuring primary and secondary tasks have been developed. All of these theories rely on the possibility of measuring different variables. Measurable variables can be sorted into two main categories:

• Physiological measures (heart rate variability, respiration rate variability, galvanic skin response, muscle tension)
• Driving performance (lateral control, longitudinal control, visual management, interaction with other vehicles)

Physiological measures are difficult to obtain outside experimental settings. We have primarily focused on studies where we can compare results for lateral control, longitudinal control and interaction with other vehicles. Driving performance can be measured with several methods: registrations can be made both of naturalistic driving tasks and of artificial secondary tasks that are not normally part of driving but may indicate a change in workload.

Table 1: Driving performance indicators
• Continuous registrations – naturalistic driving tasks: lane tracking, speed, time gap, steering wheel reversal; artificial driving tasks: line tracking
• Incident registrations – naturalistic driving tasks: reaction time, number of errors; artificial driving tasks: Peripheral Detection Task

Further description and discussion of different performance indicators and measures can be found in a report from the FESTA project (Kircher 2008).

2.4 Theoretical properties of driving simulators, test tracks and real traffic evaluation

When doing evaluation, one must be aware that the validity of the results cannot be absolute but will always involve some uncertainty. Validity "refer[s] to the approximate truth of an inference" (Cook and Campbell 1979). Evidence of validity may come from other sources of information, such as past findings and theories. Through a literature review (Engen 2008) we have found research projects that involve validation of driving simulators through:

• Direct comparison with real-life data
• Comparison of the driving simulator with physiological tests and a questionnaire
• Expert testing
• Validation compared to specific driver characteristics
• Stability over time and driver characteristics
• Driver training

In general, most of the research found that driving simulators were to some extent valid tools for behavioural research.
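As an illustration of the kind of numerical comparison such behavioural validation typically rests on, the following minimal sketch (with invented data, not taken from the studies reviewed) compares both the mean and the spread of a performance measure between a simulator and the road; the variance comparison anticipates the findings in sections 3.2 and 4.1:

# Illustrative sketch only: compare mean and spread of a performance
# measure recorded in the simulator and on the road. Data are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical per-driver mean speeds (km/h) in the two environments
sim_speed = rng.normal(loc=82.0, scale=3.0, size=40)   # simulator: smaller spread
road_speed = rng.normal(loc=83.0, scale=6.0, size=40)  # road: more stochastic variation

# Do the means differ? (Welch's t-test, no equal-variance assumption)
t_stat, p_mean = stats.ttest_ind(sim_speed, road_speed, equal_var=False)

# Do the variances differ? (Levene's test is robust to non-normality)
w_stat, p_var = stats.levene(sim_speed, road_speed)

print(f"Mean speed  sim: {sim_speed.mean():.1f}  road: {road_speed.mean():.1f}  p = {p_mean:.3f}")
print(f"SD of speed sim: {sim_speed.std(ddof=1):.1f}  road: {road_speed.std(ddof=1):.1f}  p = {p_var:.3f}")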
Although these can all be valid tools for research, one must be aware that the different methods and environments have their specific drawbacks:

• Driving simulator – cannot produce a real feeling of danger, and realism may be reduced by the limited resolution of the equipment (screens, audio, …)
• Test track – lacks the danger of interaction with other vehicles
• Real traffic – extreme and dangerous situations cannot be tested

2.5 Evaluation procedure

To minimize the threat to construct validity, we have created a four-step procedure for designing experiments for the evaluation of IVIS/ADAS:

1. Describe which characteristics of the IVIS/ADAS are to be evaluated.
2. Decide which measures and performance indicators can be used for the evaluation.
3. Create scenarios where the performance indicators can be measured and calculated.
4. Combine several scenarios to form a research design.

If one of the purposes of the evaluation is to compare the effects of the IVIS/ADAS with previous research, one should seek to use the same performance indicators, the same scenarios and, to a large extent, the same research design.

2.6 Environments used for the experiments

Experimental studies can be conducted in different environments:

• Driving simulator
• Test track
• Real traffic

The driving simulator used for the studies presented in this paper consists of a 1997 Renault Scenic with a three-axis moving platform, a vibration system in the chassis and a four-channel sound system. The visual representation of the road is presented on three screens in front of the driver and two screens behind the driver, using a total of five projectors. The three front screens are rear-projected and together provide a 180° horizontal and 47° vertical field of view. The two screens behind the vehicle together provide a 90° horizontal and 47° vertical field of view.

The instrumented vehicle is a Volvo V70 2.4s. Data sources in the car can be divided into two parts: information collected from standard sensors built into the car by the manufacturer, and data collected from extra sensors specially mounted on the instrumented car.

The test track used for validation of the driving simulator is a model of the real-world test track "Lånke".

Figure 1: Test track model in real life and in the driving simulator

3. COMPARISON OF RESULTS BETWEEN DRIVING SIMULATORS AND REAL TRAFFIC

3.1 Reaction time

The reaction time studies conducted in the driving simulator were compared to real-life measurements, previous research, and measurements of reaction time in a video-based simulator. Reaction time in the driving simulator was measured by introducing near-collision situations. Designing near-collision situations using an instrumented vehicle might be unethical, and involving test drivers in an accident clearly is; measuring near-collision situations is therefore most practical in a driving simulator. Creating near-collision situations in a driving simulator nevertheless poses at least two serious problems:

• The near-collision situation is designed by the researcher and represents that one specific situation; it is difficult to generalize the results.
• Presenting more than one near-collision situation can make the test driver aware of the purpose of the project.

The driving simulator results were compared to several other sources:

• A literature review documenting earlier research and results on reaction times
• Measurements of reaction times at junctions in real traffic
• Previous tests in a video-based simulator
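Before turning to the results, the following minimal sketch illustrates how a brake reaction time might be extracted from time-stamped logs of such a near-collision scenario; the sampling rate, signal names and pedal threshold are hypothetical and do not describe the simulator software used here:

# Illustrative sketch only: extract a brake reaction time from logged
# signals. Log format, signal names and values are hypothetical.
import numpy as np

def brake_reaction_time(t, event_onset, brake_pedal, threshold=0.1):
    """Seconds from the onset of the near-collision event until the
    subject's brake pedal position first exceeds `threshold`."""
    after = (t >= event_onset) & (brake_pedal > threshold)
    if not after.any():
        return None  # subject never braked
    return float(t[after][0] - event_onset)

# Hypothetical 20 Hz log: the lead vehicle brakes hard at t = 12.0 s
t = np.arange(0.0, 20.0, 0.05)
brake_pedal = np.where(t >= 13.1, 0.8, 0.0)  # subject starts braking at 13.1 s

rt = brake_reaction_time(t, event_onset=12.0, brake_pedal=brake_pedal)
print(f"Reaction time: {rt:.2f} s")  # approximately 1.10 s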
The reaction times found in the driving simulator varied considerably between situations. This was, however, reasonable and comparable to the results from all the other sources (Engen and Giæver 2004). In this case study, the most important threat to the validity of the test was that subjects might learn which measurements were being used. For reaction time it is essential that each situation comes as a surprise to the test subjects, but we found that the subjects' alertness increased after only one incident.

Figure 2: Mean reaction time (s) in eight different traffic situations (bars show means; error bars show the 95% CI of the mean).

3.2 Speed and lateral position

Speed and lateral position were measured under comparable conditions through observational studies in real traffic and experiments in the driving simulator. Typical Norwegian rural roads were used, and speed and lateral position were analysed in several situations.

Figure 3: Effects of road characteristics on mean speed in the driving simulator. Mean speeds on a straight road and in curves with radii from 360 m to 2500 m, for two lane/shoulder-width configurations on an 8.5 m wide road, ranged from about 81 to 84 km/h (error bars: 95% CI).

The results were similar in real traffic and the driving simulator (Giæver and Engen 2005), but there were some key differences. It should be emphasized that even though measuring speed and lateral position is relatively easy, finding one real-world speed that can be compared to data from the driving simulator is not. The difference in means between the simulator and a roadside measurement could be just as small as the difference between two different roadside measurements. The most important result from the study of speed and lateral position was that the driving simulator results showed a smaller standard deviation than real-world measurements. This is to be expected, because real-world measurements are more exposed to stochastic variability. The control of confounding variables possible in a driving simulator can produce more precise results, but at the same time a good understanding of these confounding variables is needed to create sound scenarios.

3.3 Time gap

Time gaps were measured both in the driving simulator and with the instrumented vehicle (Engen 2008). This case study was primarily intended to test the method, not to determine precise time gaps. In the driving simulator, both the mean time gap and its standard deviation were much smaller than those obtained with the instrumented vehicle. The time gaps found with the instrumented vehicle were comparable to previous results from roadside registrations (Giæver 1993).

Figure 4: Time gap distributions in the driving simulator (mean 1.13 s, SD 0.64 s) and in the instrumented vehicle (mean 2.57 s, SD 1.12 s); a maximum time gap cut-off was used for the statistics.
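As an illustration only, the following sketch shows how the time gap indicator might be derived from logged car-following data; the signal names, the cut-off value and the data are invented rather than taken from the study:

# Illustrative sketch only: compute the time gap indicator from logged
# following situations. Signal names, cut-off and data are invented.
import numpy as np

def time_gaps(gap_distance_m, own_speed_ms, max_gap_s=15.0, min_speed_ms=1.0):
    """Time gap (s) to the lead vehicle for every logged sample."""
    gap_distance_m = np.asarray(gap_distance_m, dtype=float)
    own_speed_ms = np.asarray(own_speed_ms, dtype=float)
    valid = own_speed_ms > min_speed_ms          # ignore near-standstill samples
    gaps = gap_distance_m[valid] / own_speed_ms[valid]
    return gaps[gaps <= max_gap_s]               # cap gaps, as in Figure 4 (assumed value)

# Invented example: roughly 25 m behind a lead vehicle at about 22 m/s (~80 km/h)
rng = np.random.default_rng(0)
distance = rng.normal(25.0, 5.0, size=1000)
speed = rng.normal(22.0, 1.5, size=1000)

gaps = time_gaps(distance, speed)
print(f"Mean time gap {gaps.mean():.2f} s, SD {gaps.std(ddof=1):.2f} s")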
In the case of the time gap measurements in the driving simulator, the importance of understanding confounding variables was even more evident. The driving simulator scenario was designed in an overly simplistic way compared to a real-world situation, which led to very small time gaps. Similarly, the lack of control over both the instrumentation and the traffic situation probably led to recorded time gaps that were too large compared to a queued situation. As in the speed and lateral position case study, the standard deviation in the simulator study was much smaller than the standard deviation obtained with the instrumented vehicle.

4. COMPARISON OF RESULTS BETWEEN DRIVING SIMULATOR AND TEST TRACK

4.1 Speed and steering wheel movement

Comparing driving performance under the influence of alcohol in the driving simulator and on the test track revealed that the values of most traffic behaviour variables in the simulator did not differ considerably from the corresponding values on the test track. The variables tested were: response time, speed, steering wheel reversals, steering wheel movement speed, number of cones knocked down during serpentine driving, stopping, distance to a tracking line and self-reported experience (Sakshaug 2008). Mean speed and steering wheel reversals per second showed largely the same pattern on the test track as in the simulator.

Figure 5: Round track driving. Mean speed distributions on the test track (mean 42.8 km/h, SD 7.3, N = 300) and in the simulator (mean 44.1 km/h, SD 9.8, N = 307).

Previously we found that the standard deviation of speed was smaller in the driving simulator than in real road registrations. That was not the case here: the standard deviation in the driving simulator was not smaller than on the test track. This is probably because the test track, unlike road registrations, has no uncontrolled factors caused by other traffic.

Figure 6: Serpentine driving. Mean speed distributions on the test track (mean 33.4 km/h, SD 3.9, N = 144) and in the simulator (mean 33.7 km/h, SD 5.3, N = 145).

4.2 Determining distance to an object in the driving simulator

Compared to the real world, a driver's ability to judge the distance to an object in a simulator is quite different (Sakshaug 2008). This is because the simulator image is two-dimensional, whereas the real world is three-dimensional. In addition, the g-forces in the driving simulator used were smaller than in the real world, giving the driver fewer cues about the lateral and longitudinal acceleration of the vehicle. During a serpentine exercise conducted both on a test track and on a corresponding simulated test track, subjects knocked down or touched more cones in the simulator than on the real-world test track. It was also easier to assess the distance to a stop line on the test track than in the simulator.
Figure 7: Serpentine driving. Mean number of cones knocked down (serpentine cones and start/stop-line cones) in the simulator and on the test track, by test condition (baseline before, sober test drive, test drives at BAC levels 1–3, and baseline after, sober).

However, the mean distance from the line was found to vary with test-drive category in the same way in both environments. The serpentine driving task was clearly more difficult in the simulator than on the test track.

5. CONCLUSION

Experimental studies are important tools for the evaluation of IVIS/ADAS, but they demand special expertise to create good experimental designs. For example, an important strength of driving simulator experiments lies in the control of confounding variables, but this also produces results with less variance than experiments in real traffic. In this paper we have described several experimental methods for evaluating IVIS/ADAS and outlined some of their constraints and advantages. Further research using experimental methods will increase the validity of such methods.

BIBLIOGRAPHY

Cook, T. D. and D. T. Campbell (1979). Quasi-experimentation: Design & Analysis Issues for Field Settings. Chicago, Rand McNally College Pub. Co.

Engen, T. (2008). Use and Validation of Driving Simulators. Doctoral thesis, Faculty of Engineering Science and Technology, Department of Civil and Transport Engineering, Norwegian University of Science and Technology, Trondheim.

Engen, T. and T. Giæver (2004). Reaksjonstid i vegtrafikken. Trondheim, SINTEF Teknologi og samfunn, Veg og samferdsel.

Giæver, T. (1993). Trafikkavvikling under vinterforhold. SINTEF Rapport. Trondheim, SINTEF Samferdselsteknikk.

Giæver, T. and T. Engen (2005). Testing av visuell midtdeler. Trondheim, SINTEF Teknologi og samfunn, Veg og samferdsel.

Kaptein, N. A., J. Theeuwes, et al. (1996). "Driving simulator validity: Some considerations." Transportation Research Record (1550): 30-36.

Kircher, K. (2008). D2.1 – A Comprehensive Framework of Performance Indicators and their Interaction. FESTA project report.

Sakshaug, K. (2008). VALIDAD Pilot Experiment in Simulator and on Test Track. Results of Data Analysis. Trondheim, SINTEF: 37.