Analysing PIAAC Log Data

Possibilities and Tools
Heiko Rölke, DIPF
Overview
• Introduction to Log Data Analysis
• Tool Support for PIAAC Log Data – the LogDataAnalyzer
• Predicting Item Success & Classifying Problem Solvers – PIAAC (& PISA) Log Data Use Cases
Assessment Log Data
Advantage of Computer-Based Assessments
Log Data:
• Interaction Data
• Para Data
Assessment Log Data
• Mouse clicks
• Mouse movement
• Keystrokes
• Other input devices
– Audio
– Video
– Movement
–…
What to do with all this data?
Data Mining!
Automatic discovery of patterns in large data sets:
• Classification
• Cluster analysis
• Anomaly Detection
• Process Mining
• …
Classical approaches such as statistics and visualization are still possible and useful!
Example Research Questions
• Result corrections:
Did test takers change their mind? For the better?
• Strategies:
– Information gathering
– Problem solving
• Relation between education/job/… and item
interactions?
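A "result correction" question can be approached by collecting, per question, the order in which options were selected. The sketch below is only an illustration: the events are hypothetical, and reading an id such as u06arg1_prop2 (the id style that appears in the raw log) as "question 1, option 2" is an assumption. Whether a change was for the better would additionally need the scoring key, which is not part of this sketch.

```python
import re

# Hypothetical selection events as (time, element id).
# Assumption: in "<unit>arg<N>_prop<M>", N is the question and M the option.
selections = [
    (42577, "u06arg1_prop2"),
    (53962, "u06arg5_prop1"),
    (59521, "u06arg1_prop1"),  # made-up event: a second choice for question 1
]

ID_PATTERN = re.compile(r"arg(\d+)_prop(\d+)$")


def answer_histories(selections):
    """Map each question to the time-ordered list of selected options."""
    histories = {}
    for _, element_id in sorted(selections):
        match = ID_PATTERN.search(element_id)
        if match:
            question, option = match.groups()
            histories.setdefault(question, []).append(option)
    return histories


def changed_answers(histories):
    """Keep only questions where more than one distinct option was chosen."""
    return {q: opts for q, opts in histories.items() if len(set(opts)) > 1}


histories = answer_histories(selections)
changes = changed_answers(histories)
print(changes)
```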
So Log Data Analysis is Great.
But…
The challenge of log files:
• Different degrees of granularity
• No real standard exists
• File format conversion is time-consuming and error-prone
• Much work just to filter data
•…
Log file at first sight…
…2FitemRDFid%3E%3CitemLabel%20%2F%3E%3CitemComment%20%2F%3E%3CitemTrace
%3E%3CtaoEvent%20Name%3D%27taoPIAAC%27%20Type%3D%27START%27%20Time%3
D%270%27%3ETEST%5FTIME%3D452%3C%2FtaoEvent%3E%3CtaoEvent%20Name%3D%2
7stimulus%27%20Type%3D%27RADIO%5FBTN%27%20Time%3D%2742577%27%3Eid%3Du
06arg1%5Fprop2%3C%2FtaoEvent%3E%3CtaoEvent%20Name%3D%27stimulus%27%20Typ
e%3D%27RADIO%5FBTN%27%20Time%3D%2753962%27%3Eid%3Du06arg5%5Fprop1%3C
%2FtaoEvent%3E%3CtaoEvent%20Name%3D%27stimulus%27%20Type%3D%27RADIO%5F
BTN%27%20Time%3D%2759521%27%3Eid%3Du06arg4%5Fprop2%3C%2FtaoEvent%3E%3
CtaoEvent%20Name%3D%27stimulus%27%20Type%3D%27RADIO%5FBTN%27%20Time%3
D%2765773%27%3Eid%3Du06arg3%5Fprop2%3C%2FtaoEvent%3E%3CtaoEvent…
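The raw log is percent-encoded (URL-encoded) XML, so a first step towards readability is simply decoding it. A minimal sketch using Python's standard library, applied to one event copied from the excerpt above (assuming the line breaks in the excerpt are only slide wrapping):

```python
from urllib.parse import unquote

# One event from the raw excerpt, with the slide's line wraps removed.
raw = (
    "%3CtaoEvent%20Name%3D%27stimulus%27%20Type%3D%27RADIO%5FBTN%27"
    "%20Time%3D%2742577%27%3Eid%3Du06arg1%5Fprop2%3C%2FtaoEvent%3E"
)

# unquote() turns %XX escapes back into characters (%3C -> "<", %27 -> "'").
decoded = unquote(raw)
print(decoded)
# <taoEvent Name='stimulus' Type='RADIO_BTN' Time='42577'>id=u06arg1_prop2</taoEvent>
```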
Log file example – decoded –
899914 15
{"sender":"15","type":"unloading","data":""}
900006 15
{"sender":"15","type":"variable_change","data":
{"name":"item_score","value": "<?xml version=\"1.0\" encoding=\"ASCII\" ?>
\u000a<cbascoringresultmm:ItemScore xmi:version=\"2.0\" xmlns:xmi=\
"http://www.omg.org/XMI\" xmlns:cbascoringresultmm=\
"http://cbascoringresultmm/1.0\" name=\"Item15_Algen_Task1\"
result=\"true\" nbHits=\"1\" hitWeight=\"1\" nbMisses=\"0\"
missWeight=\"0\" creditClass=\"1\" creditWeight=\"1\" execTime=\"67\"
execTimeTotal=\"67\" reactionTime=\"15\" reactionTimeTotal=\"15\"
nbInteractions=\"7\"
nbInteractionsTotal=\"7\">\u000a
<hitList name=\"Item15_Algen_Hit1\" weight=\"1\" />\u000a
</cbascoringresultmm:ItemScore>\u000a"}}
900009 15
{"sender":"15","type":"variable_change","data":{
"name":"_15_NumberOfActions_state","value":{"score":"7"}}}
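Once decoded, this format is a sequence of records: a header line with two numbers, followed by a JSON object. A hedged sketch of reading such records is shown below. It assumes that each JSON object sits on a single line in the real file (the slide wraps them), and that the first header number is a timestamp and the second a sender id; the slide does not label either field.

```python
import json

# Two records in the layout shown above (JSON unwrapped onto one line each).
sample = '''899914 15
{"sender":"15","type":"unloading","data":""}
900009 15
{"sender":"15","type":"variable_change","data":{"name":"_15_NumberOfActions_state","value":{"score":"7"}}}
'''


def parse_records(text):
    """Pair each header line with the JSON payload line that follows it."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    records = []
    for header, payload in zip(lines[0::2], lines[1::2]):
        stamp, sender = header.split()
        records.append({
            "stamp": int(stamp),      # assumed to be a timestamp
            "sender": sender,         # assumed to be the sending item/component
            "event": json.loads(payload),
        })
    return records


records = parse_records(sample)
for rec in records:
    print(rec["stamp"], rec["event"]["type"])
```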
Solution: Tool Support
OECD contracted GESIS and DIPF to
• Make the PIAAC log data available
– Store data
– Develop and enforce access procedures
• Develop tool support
– Simplify analysis
– Document log file format
Example…
<itemTrace>
<Event Name='taoPIAAC' Type='START' Time='0'> TEST_TIME=332</Event>
<Event Name='stimulus' Type='TEXTLINK' Time='19652'>
id=u010a_default_txt5|*$href=unit10apage20|*$target=_self</Event>
<Event Name='stimulus' Type='HISTORY_ADD' Time='19908'>id=unit10apage20</Event>
<Event Name='stimulus' Type='TOOLBAR' Time='37341'> id=toolbar_find_btn</Event>
<Event Name='stimulus' Type='TOOLBAR' Time='40258'> id=toolbar_find_btn</Event>
<Event Name='PIAAC' Type='NEXT_INQUIRY' Time='44364'> REQUEST</Event>
<Event Name='PIAAC' Type='NEXT_BUTTON' Time='44364'> id=next_button</Event>
<Event Name='stimulus' Type='CONFIRMATION_OPENED' Time='44366'> type=TASK
</Event>
<Event Name='PIAAC' Type='BUTTON' Time='44367'>id=nextInquiry_button</Event>
<Event Name='PIAAC' Type='DOACTION' Time='44367'> action=tao_item.nextInquiry
</Event>
<Event Name='stimulus' Type='BUTTON' Time='47191'>id=endtask_txt3</Event>
<Event Name='stimulus' Type='CONFIRMATION_CLOSED' Time='47191'>action=OK
</Event>
<Event Name='PIAAC' Type='NEXT_ITEM' Time='47192'> REQUEST</Event>
<Event Name='PIAAC' Type='END' Time='47193'> END</Event>
</itemTrace>
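A trace like this can be read with any XML parser. The following sketch uses Python's standard xml.etree.ElementTree on an abridged copy of the trace above; the dictionary keys (name, type, time, payload) are illustrative choices, not part of the log format.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the example trace (a subset of its events).
trace_xml = """<itemTrace>
<Event Name='taoPIAAC' Type='START' Time='0'> TEST_TIME=332</Event>
<Event Name='stimulus' Type='TEXTLINK' Time='19652'>id=u010a_default_txt5|*$href=unit10apage20|*$target=_self</Event>
<Event Name='stimulus' Type='HISTORY_ADD' Time='19908'>id=unit10apage20</Event>
<Event Name='stimulus' Type='TOOLBAR' Time='37341'> id=toolbar_find_btn</Event>
<Event Name='PIAAC' Type='END' Time='47193'> END</Event>
</itemTrace>"""


def parse_trace(xml_text):
    """Return the trace's events as a list of plain dictionaries."""
    root = ET.fromstring(xml_text)
    events = []
    for ev in root.iter("Event"):
        events.append({
            "name": ev.get("Name"),
            "type": ev.get("Type"),
            "time": int(ev.get("Time")),
            "payload": (ev.text or "").strip(),
        })
    return events


events = parse_trace(trace_xml)
for ev in events:
    print(ev["time"], ev["type"], ev["payload"])
```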
Log Information I
• Time stamps:
– Overall time (time on task)
– Time till first interaction
• Answer
– Raw answer
– Possibly several answers in sequence: answer 1, answer 2, …, answer n
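The two timing indicators can be derived directly from event times. A minimal sketch, using (type, time) pairs from the example trace: which event types count as a test-taker "interaction" is an assumption here (everything except START and END), and the values stay in the log's own time unit, which the slides do not specify.

```python
# (event type, time) pairs taken from the example trace.
events = [
    ("START", 0),
    ("TEXTLINK", 19652),
    ("HISTORY_ADD", 19908),
    ("TOOLBAR", 37341),
    ("TOOLBAR", 40258),
    ("NEXT_INQUIRY", 44364),
    ("END", 47193),
]


def time_on_task(events):
    """Time between the START and END events."""
    start = next(t for typ, t in events if typ == "START")
    end = next(t for typ, t in events if typ == "END")
    return end - start


def time_to_first_interaction(events, system_types=("START", "END")):
    """Time from START to the earliest event not listed in system_types."""
    start = next(t for typ, t in events if typ == "START")
    first = min(t for typ, t in events if typ not in system_types)
    return first - start


print(time_on_task(events))               # 47193
print(time_to_first_interaction(events))  # 19652
```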
Log Information II
• Navigation elements
– Frequency of page visits
– Number of page visits
– (relative) time on start page
– (relative) time on relevant pages
– Relevant pages visited (yes/no)
– Sequence of visited pages
• Item specific elements
– Number of bookmarks
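The navigation indicators listed above can be computed from a time-ordered list of page visits, for instance one built from HISTORY_ADD events. The sketch below uses made-up page ids, times, and a made-up set of relevant pages; the function name and the returned keys are hypothetical.

```python
from collections import Counter

# Hypothetical (page id, entry time) pairs and item end time.
visits = [("start", 0), ("page20", 20), ("start", 35), ("page31", 50), ("page20", 60)]
end_time = 100
relevant_pages = {"page20"}  # assumption: defined per item by the researcher


def navigation_indicators(visits, end_time, relevant_pages, start_page="start"):
    pages = [page for page, _ in visits]
    # A page is left when the next page is entered, or when the item ends.
    exits = [t for _, t in visits[1:]] + [end_time]
    time_on = Counter()
    for (page, entered), left in zip(visits, exits):
        time_on[page] += left - entered
    total = end_time - visits[0][1]
    return {
        "visit_counts": dict(Counter(pages)),
        "n_page_visits": len(pages),
        "rel_time_start_page": time_on[start_page] / total,
        "rel_time_relevant": sum(time_on[p] for p in relevant_pages) / total,
        "relevant_visited": any(p in relevant_pages for p in pages),
        "sequence": pages,
    }


indicators = navigation_indicators(visits, end_time, relevant_pages)
print(indicators)
```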
Log File Documentation
Only technical documentation exists
• DataImportExport document
Aim
• Make log files understandable
for non-PIAAC researchers
• Examples (free items + logs), enriched
screenshots
• Links from item interactions to log events
Web Site
LogDataAnalyzer Tool
Tool & Data Availability
Development will be finished in early 2017.
• Check OECD PIAAC website for news.
Many round 1 (r1) countries share their data:
Austria, Belgium (Flanders), Denmark, Estonia,
Finland, France, Germany, Ireland, Italy, Korea,
Netherlands, Norway, Poland, Slovakia, Spain,
UK, US
Use Case I
Background:
• PIAAC field test data from Germany
• Domain: PS-TRE
– Item "job search"
• N = 182
Potential Analysis
• Problem solving processes
• Identification of groups of test takers showing
similar interaction behaviour
• Possible relation between item interaction
processes and item score
• Etc.
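One simple way to look for test takers with similar interaction behaviour is to compare their event-type sequences pairwise. The sketch below is not the method used in the use case: it swaps in a plain sequence-similarity ratio from Python's difflib and a greedy "leader" grouping, with hypothetical test-taker ids, sequences, and threshold.

```python
from difflib import SequenceMatcher

# Hypothetical event-type sequences per test taker.
sequences = {
    "p1": ["TEXTLINK", "TOOLBAR", "NEXT_BUTTON"],
    "p2": ["TEXTLINK", "TOOLBAR", "TOOLBAR", "NEXT_BUTTON"],
    "p3": ["NEXT_BUTTON"],
}


def similarity(a, b):
    """Similarity in [0, 1] based on matching subsequences."""
    return SequenceMatcher(None, a, b).ratio()


def group_by_similarity(sequences, threshold=0.7):
    """Greedy grouping: join the first group whose leader is similar enough."""
    groups = []  # each group is a list of ids; the first id is the leader
    for pid, seq in sequences.items():
        for group in groups:
            if similarity(sequences[group[0]], seq) >= threshold:
                group.append(pid)
                break
        else:
            groups.append([pid])
    return groups


groups = group_by_similarity(sequences)
print(groups)
```

The result depends on the order of test takers and on the threshold, so this is only a starting point before a proper cluster analysis.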
Classification – Item "Job Search"
Use Case II
Background:
• International PISA Data (2012)
• Domain: PS
– Item "TICKETS"
• N = 26525
– Sub-sample for manual coding: N = 223
Research Question(s)
• Identify different problem solvers:
– Apathetic
– Goal oriented
– Feature explorer
• Automatically classify test-takers
Result of Supervised Learning
93.27% correct classification