Capturing and Analyzing Low-Level Events from the Code Editor

YoungSeok Yoon, Institute for Software Research (ISR)
Brad A. Myers, Human-Computer Interaction Institute (HCII)
School of Computer Science, Carnegie Mellon University

PLATEAU 2011

Motivation
• When trying to understand what problems the developers have, researchers investigate…
  ▪ how frequently do those problems occur?
  ▪ what kinds of problems do developers have?
  ▪ in what context does a certain event occur?
  ▪ when / how do the developers use certain features?
  ▪ in what sequence do the developers complete certain tasks?
• When trying to evaluate existing tools…
  ▪ how do the developers actually use those tools?
  ▪ exactly how are they used in detail?
  ▪ how frequently do developers use the tools?
  ▪ do the developers use the tools as expected?
  ▪ how much time do they spend using the tools?
  ▪ what types of errors do they make?

How do we answer these questions?
• Essentially, we are collecting usage data from the developers
• The usage data can be collected by:
  ▪ asking the developers
  ▪ observing the developers
  ▪ mining software repositories
  ▪ logging the developers’ behavior
• However, previous tools make it difficult or impossible to get / analyze fine-grained code editing history

Asking the developers
• Interviews, surveys, focus groups
• Limitations
  ▪ Responses may not be reliable
  ▪ Developers perform many operations quite automatically
    • They might not remember the specific occasions

Observing the developers
• Contextual inquiries, lab studies
• Think aloud + videotape
• Limitations
  ▪ Requires manual inspection of the videotape
    • Can be time-consuming and error-prone
• [Coman2008], [Ko2003], [Ko2005], ...

Mining software repositories
• Has been used for studying software evolution
• Limitations
  ▪ Cannot know exactly what happened between two consecutive revisions
    • We may miss some important user behavior
  ▪ Need to infer from the check-in comments and other clues
• [Aversano2007], [Bettenburg2009], [Kim2005], [Murphy-Hill2009], …

Eclipse Usage Data Collector (UDC)
• Logging tool for aggregate data
• Collects usage information from all the Eclipse users who consented to upload their data to UDC
• Limitations
  ▪ Does not capture the sequences of events
  ▪ Misses some important commands executed
    • e.g., UDC ignores backspace, moving the cursor with arrow keys, …
  ▪ Command-specific parameters are not captured

Other Logging Tools
• Mylyn Monitor [Kersten2006]
  ▪ Textual-level changes are not captured
  ▪ Focuses on more abstract user interaction data on the entire IDE
• Syde / Replay [Hattori2011]
  ▪ Only commands which change the source code are logged
  ▪ AST-level, not textual level

Our new logging tool: FLUORITE
Full of Low-level User Operations Recorded In The Editor
• FLUORITE is a publicly available Eclipse plug-in that captures low-level code editing events and produces XML log files
• This is a tool for researchers
• Can be used to overcome the previously listed limitations
• For each event, FLUORITE logs:
  ▪ timestamp
  ▪ event type / command ID
  ▪ event-specific parameters

Three types of events
• FLUORITE logs three different types of events:
  ▪ Commands: all the user events in the code editor
    • (e.g., typing new text, moving the cursor, copying, …)
  ▪ Document Changes
    • logged whenever the active file is modified
    • contain the actual inserted/deleted code, resulting code length, …
    • make it possible to reproduce snapshots of the files at any point
  ▪ Annotations
    • logged when the developer manually adds annotations to help the researcher

An example
• [Editor screenshots] Click, then Shift + ↓, then Del

Resulting FLUORITE log

  <Command __id="2" _type="MoveCaretCommand" caretOffset="142"
           docOffset="142" timestamp="3977"/>
  <Command __id="3" _type="EclipseCommand"
           commandID="eventLogger.styledTextCommand.SELECT_LINE_DOWN"
           timestamp="5598"/>
  <DocumentChange __id="4" _type="Delete" docASTNodeCount="22"
                  docActiveCodeLength="125" docExpressionCount="10" docLength="151"
                  endLine="9" length="39" offset="142" startLine="8" timestamp="7186">
    <text><![CDATA[ System.out.println("Hello World!"); ]]></text>
  </DocumentChange>
  <Command __id="5" _type="EclipseCommand"
           commandID="org.eclipse.ui.edit.delete" timestamp="7202"/>

① The cursor was moved by a mouse click (Command __id="2")
② One line of code was selected by Shift + ↓ (Command __id="3")
③ The “Delete” key was pressed (Command __id="5")
④ The actual code deleted by the “delete” command (DocumentChange __id="4")

FLUORITE log files
• The log files are written in XML format
  ▪ Anyone can build their own automatic analyzer!
• Is FLUORITE practical enough to use?
  ▪ It has already been useful for a couple of studies
    • Our own exploratory study with 12 developers
    • Dörner’s evaluation study of the Euklas system
  ▪ Size of the logs
    • Average log size: 236.8 KB / hr = 9.25 MB / week
    • Could be reduced to 1 MB / week if the logs were compressed
  ▪ Performance
    • There was no measurable performance loss during our study

Built-in analyses
• We also provide a log analyzer which has several built-in analyses
• Our study focused on when and how the developers backtrack while editing code
  ▪ The analyses were built for this purpose

Example Analysis: Command distribution report [figure]

Example Analysis: Keystroke distribution report [figure]

Example Analysis: Code size growth graph [figure]

Events View [figure]

Example Analysis: Code editing pattern detection
• FLUORITE logs enable us to detect code editing patterns composed of sequences of events
• Examples (not all of them are implemented)
  ▪ Typo correction
  ▪ Parameter tuning
  ▪ Commenting out / uncommenting
  ▪ Cutting/copying and pasting within a project
  ▪ Manual refactoring (e.g., rename variable)
• Preliminary implementations

  Pattern            Counts      Rate        Precision
  Typo correction    274 / 288   13.6 / hr   95.14%
  Parameter tuning   52 / 98     2.6 / hr    53.06%

Conclusion
• FLUORITE web page: http://www.cs.cmu.edu/~fluorite/
• FLUORITE is a publicly available, open-source tool for Eclipse which can be used when conducting studies
• FLUORITE turned out to be useful for our own study, and we hope that it will help you too!

Questions?
• FLUORITE web page: http://www.cs.cmu.edu/~fluorite/
• Acknowledgements
  ▪ National Science Foundation (NSF) CCF-0811610
  ▪ Korea Foundation for Advanced Studies (KFAS)

References
• [Aversano2007] Aversano, L., Cerulo, L. and Di Penta, M. 2007. How Clones are Maintained: An Empirical Study. In Proc. 11th European Conf. on Soft. Maint. and Reengineering (CSMR’07). 81-90.
• [Bettenburg2009] Bettenburg, N., Weyi, S., Ibrahim, W., Adams, B., Ying, Z. and Hassan, A. E. 2009. An Empirical Study on Inconsistent Changes to Code Clones at Release Level. In Proc. 16th Working Conf. on Reverse Eng. (WCRE’09). 85-94.
• [Coman2008] Coman, I. D. and Sillitti, A. 2008. Automated Identification of Tasks in Development Sessions. In Proc. 16th IEEE Int’l Conf. on Program Comprehension (ICPC’08). 212-217.
• [Kersten2006] Kersten, M. and Murphy, G. C. 2006. Using task context to improve programmer productivity. In Proc. 14th ACM SIGSOFT Int’l Symp. on Foundations of Soft. Eng. (FSE’06). 1-11.
• [Kim2005] Kim, M., Sazawal, V., Notkin, D. and Murphy, G. 2005. An empirical study of code clone genealogies. In Proc. 10th Euro. Soft. Eng. Conf. & 13th ACM SIGSOFT Int’l Symp. on Foundations of Soft. Eng. (ESEC/FSE’05). 187-196.
• [Ko2003] Ko, A. J. and Myers, B. A. 2003. Development and evaluation of a model of programming errors. In Proc. IEEE Symp. on Human Centric Computing Languages and Environments (HCC’03). 7-14.
• [Ko2005] Ko, A. J., Aung, H. H. and Myers, B. A. 2005. Design requirements for more flexible structured editors from a study of programmers’ text editing. In Proc. Extended Abstracts of CHI'2005. 1557-1560.
• [Hattori2011] Hattori, L., D’Ambros, M., Lanza, M. and Lungu, M. 2011. Software Evolution Comprehension: Replay to the Rescue. In Proc. 19th IEEE Int’l Conf. on Program Comprehension (ICPC’11). 161-170.
• [Murphy-Hill2009] Murphy-Hill, E., Parnin, C. and Black, A. P. 2009. How we refactor, and how we know it. In Proc. 31st Int’l Conf. on Soft. Eng. (ICSE’09). 287-297.

BACKUP SLIDES

Longer example log

  <Command __id="2" _type="MoveCaretCommand" caretOffset="103"
           docOffset="103" timestamp="11073"/>
  <Command __id="3" _type="EclipseCommand"
           commandID="eventLogger.styledTextCommand.SELECT_LINE_DOWN"
           timestamp="13001"/>
  <DocumentChange __id="4" _type="Delete" docASTNodeCount="15"
                  docActiveCodeLength="86" docExpressionCount="4" docLength="112"
                  endLine="8" length="39" offset="103" startLine="7" timestamp="14969">
    <text><![CDATA[ System.out.println("Hello World!"); ]]></text>
  </DocumentChange>
  <Command __id="5" _type="EclipseCommand"
           commandID="org.eclipse.ui.edit.delete" timestamp="14985"/>
  <DocumentChange __id="6" _type="Insert" docASTNodeCount="22"
                  docActiveCodeLength="125" docExpressionCount="10" docLength="151"
                  length="39" offset="103" timestamp="17564">
    <text><![CDATA[ System.out.println("Hello World!"); ]]></text>
  </DocumentChange>
  <Command __id="7" _type="UndoCommand" timestamp="17570"/>

Example Analysis: Command distribution report

  Command        Count (Percentage)
  InsertString   5797 (31.48%)
  LINE_DOWN      5693 (10.67%)
  DELETE_PREV    4495 (10.48%)
  MoveCaret      3586 (8.63%)
  LINE_UP        2751 (8.27%)

Example Analysis: Keystroke distribution report

  Key         Count (Percentage)
  ↓           5797 (12.64%)
  Backspace   5693 (12.41%)
  ↑           4495 (9.80%)
  →           3586 (7.82%)
  ←           2751 (6.00%)
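
Because the logs are XML, a command distribution like the one in the report can be approximated with a few lines of standard-library Python. The sketch below is not the FLUORITE analyzer. It reuses the events from the longer example log, wraps them in an `<Events>` root element (an assumption, since the slides show only a fragment and not the real root), and labels Eclipse commands by the last segment of `commandID` (also an assumption about how the report shortens names).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Events from the longer example log. The <Events> wrapper is an assumption:
# the slides show only a fragment, not the real root element of a log file.
SAMPLE_LOG = """<Events>
<Command __id="2" _type="MoveCaretCommand" caretOffset="103" docOffset="103" timestamp="11073"/>
<Command __id="3" _type="EclipseCommand" commandID="eventLogger.styledTextCommand.SELECT_LINE_DOWN" timestamp="13001"/>
<DocumentChange __id="4" _type="Delete" docLength="112" length="39" offset="103" timestamp="14969">
<text><![CDATA[ System.out.println("Hello World!"); ]]></text>
</DocumentChange>
<Command __id="5" _type="EclipseCommand" commandID="org.eclipse.ui.edit.delete" timestamp="14985"/>
<DocumentChange __id="6" _type="Insert" docLength="151" length="39" offset="103" timestamp="17564">
<text><![CDATA[ System.out.println("Hello World!"); ]]></text>
</DocumentChange>
<Command __id="7" _type="UndoCommand" timestamp="17570"/>
</Events>"""


def command_label(event):
    """Name a <Command>: last segment of commandID if present, else _type.

    This labeling rule is a guess, not the analyzer's documented behavior.
    """
    command_id = event.get("commandID")
    if command_id:
        return command_id.rsplit(".", 1)[-1]
    return event.get("_type")


def command_distribution(xml_text):
    """Return (label, count, percentage) tuples, most frequent first."""
    root = ET.fromstring(xml_text)
    counts = Counter(command_label(e) for e in root.iter("Command"))
    total = sum(counts.values())
    return [(label, n, 100.0 * n / total) for label, n in counts.most_common()]


for label, n, pct in command_distribution(SAMPLE_LOG):
    print(f"{label:20s} {n:5d} ({pct:.2f}%)")
```

DocumentChange events are skipped here on purpose: only `<Command>` elements are counted, which matches the idea of a command distribution.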
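
The claim that Document Changes make it possible to reproduce snapshots of a file can be illustrated by replaying Delete and Insert events over an initial text. This is a sketch under stated assumptions: the two-event log and the initial document are hypothetical (the whitespace in the slide's CDATA text was collapsed, so its offsets cannot be replayed exactly), the `<Events>` root is assumed, and `docLength` is read as the document length after the change, which is consistent with the values 112 and 151 in the longer example log.

```python
import xml.etree.ElementTree as ET

# Hypothetical initial document and log, using the attribute names from the slides.
INITIAL = "int x = 1;\nint y = 2;\n"

LOG = """<Events>
<DocumentChange __id="1" _type="Delete" offset="11" length="11" docLength="11" timestamp="100">
<text><![CDATA[int y = 2;
]]></text>
</DocumentChange>
<DocumentChange __id="2" _type="Insert" offset="11" length="11" docLength="22" timestamp="200">
<text><![CDATA[int y = 2;
]]></text>
</DocumentChange>
</Events>"""


def apply_change(document, change):
    """Apply one DocumentChange element to the document text."""
    offset = int(change.get("offset"))
    length = int(change.get("length"))
    text = change.findtext("text", default="")
    kind = change.get("_type")
    if kind == "Delete":
        # The logged text should match what is being removed.
        if document[offset:offset + length] != text:
            raise ValueError("deleted text does not match the document")
        updated = document[:offset] + document[offset + length:]
    elif kind == "Insert":
        updated = document[:offset] + text + document[offset:]
    else:
        # Other change types may exist in real logs; they are not handled here.
        raise ValueError(f"unsupported change type: {kind}")
    # Assumption: docLength is the length of the document after the change.
    if len(updated) != int(change.get("docLength")):
        raise ValueError("docLength mismatch after applying change")
    return updated


def snapshots(initial, xml_text):
    """Return (timestamp, document) pairs, starting with the initial text at 0."""
    document = initial
    result = [(0, document)]
    for change in ET.fromstring(xml_text).iter("DocumentChange"):
        document = apply_change(document, change)
        result.append((int(change.get("timestamp")), document))
    return result


for timestamp, text in snapshots(INITIAL, LOG):
    print(timestamp, repr(text))
```

The consistency checks on the deleted text and on `docLength` are a design choice: they make a replay fail loudly if the initial document does not match the log.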
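
The figures reported for log size and pattern detection can be checked with simple arithmetic. The sketch below assumes that "Counts" means correct detections over total detections, and that the weekly log size corresponds to a 40-hour week with 1 MB = 1024 KB. Neither assumption is stated on the slides, but both reproduce the reported numbers.

```python
def precision(correct, detected):
    """Precision in percent, reading "Counts" as correct / detected (assumed)."""
    return 100.0 * correct / detected


def weekly_log_size_mb(kb_per_hour, hours_per_week=40):
    """Weekly log size in MB; the 40-hour week and 1024 KB per MB are assumptions."""
    return kb_per_hour * hours_per_week / 1024


print(f"Typo correction precision:  {precision(274, 288):.2f}%")
print(f"Parameter tuning precision: {precision(52, 98):.2f}%")
print(f"Log size: {weekly_log_size_mb(236.8):.2f} MB / week")
```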