Timna: A Framework for Automatically Combining Aspect Mining Analyses
David Shepherd (1), Jeffrey Palm (2), Lori Pollock (1), Mark Chu-Carroll (3)
(1) University of Delaware, (2) Northeastern University, (3) IBM

Introduction - What is AOP?
Aspect Oriented Programming
Language Support for Crosscutting Concerns (CCCs)
[Figure, from aspectj.com: each rectangle represents a source file; red lines represent the source code lines implementing a concept]

Introduction - A Closer Look at AOP

    class Line {
        private Point p1, p2;

        Point getP1() { return p1; }
        Point getP2() { return p2; }

        void setP1(Point p1) { this.p1 = p1; Display.update(); }
        void setP2(Point p2) { this.p2 = p2; Display.update(); }
    }

    aspect DisplayUpdating {
        after(): call(void Line.setP1(Point)) ||
                 call(void Line.setP2(Point)) {
            Display.update();
        }
    }

(figures from aspectj.com)
• Benefits of Refactoring into AOP
  • Increased "ilities"
    – readability
    – maintainability
    – extensibility
  • Crosscutting concerns (CCCs) are explicit

Working Assumptions
1. AOP can provide benefits in modularity for some problems
2. Crosscutting concerns are dangerous; awareness is essential

Targeted Problem: Mining
• Legacy Applications
  – refactor into AOP
Aspect mining is the process of finding these candidates
(figures generated by the AJDT)

Targeted Problem: Mining
• Applications written with AOP still have problems...
[Figure: 3 different programmers implement 1 concept]

State of the Art
Researchers currently create a single analysis to perform aspect mining
Examples
• Fan-in Analysis [Marin et al, WCRE 04]
• Code Clone Analysis [Shepherd et al, SERP 05]
• Dynamic Analysis [Breu et al, ASE 04]
• ...
Little work has been done on combining analyses

State of the Art
• Fan-In [Marin et al, WCRE 04]

Good Candidate for Refactoring (checkPermission)

    public void credit(float amount) {
        AccessController.checkPermission(
            new BankingPermission("accountOperation"));
        _balance = _balance + amount;
    }

    public void debit(float amount) throws InsufficientBalanceException {
        AccessController.checkPermission(
            new BankingPermission("accountOperation"));
        if (_balance < amount) {
            throw new InsufficientBalanceException("Insufficient total balance");
        } else {
            _balance = _balance - amount;
        }
    }

Bad Candidate for Refactoring (next)

    public void checkOut(SpecialList items) {
        SpecialIterator it = items.iterator();
        while (it.hasNext())
            checkOut(it.next());
    }

    public void markItems(SpecialList items) {
        SpecialIterator it = items.iterator();
        while (it.hasNext())
            ((Item) it.next()).mark();
    }

State of the Art
• Clone Detection [Shepherd et al, SERP 05]

Conventional Transaction Management

    public void debit(float amount) throws InsufficientBalanceEx... {
        UserTransaction ut = ...;
        try {
            ut.begin();
            ... business logic ...
            ut.commit();
        } catch (Exception ex) {
            ut.rollback();
            // rethrow after logging and wrapping
        }
    }

    public void credit(float amount) {
        UserTransaction ut = ...;
        try {
            ut.begin();
            ... business logic ...
            ut.commit();
        } catch (Exception ex) {
            ...

Remaining Challenges
• Combining Analyses
  – if (code clone & fan-in high), more likely to be a refactoring candidate
  – Humans do this during manual mining
• Running a large number of analyses
  – Methods with void return types
  – Getters and setters
  – ...
We invented several new analyses and use 11 analyses in total
Our framework (Timna) combines analyses to make a decision

Key Insight
Currently, humans are the best miners. What is their process?
1. Manual inspection
2. Learn to identify candidates in a specific system
3. Generalize to other systems
4. Apply in other systems

Automated Approach
1. Create Training Data
   1. Manually tag a known program
   2. Automatically run the individual mining analyses
2. Learn
   • Output: set of rules to classify
   • boolean or categories
3. Classify unknown programs
   • Output: refactoring candidates

Approach - Learning
Steps: 1. Create Training Data  2. Learn  3. Apply

    Method Identifier   Attributes   Classification
    toolDone            {?, ?, ?}    refactor
    setTool             {?, ?, ?}    don't refactor
    exit                {?, ?, ?}    don't refactor
    Start               {?, ?, ?}    refactor

[Diagram: Known Program -> Manual Tagging -> Classification Table (Method A: Class 2, Method B: Class 1, Method C: Class 1, Method D: Class 3) -> Mining Analyses (Fan-in, No Parameters, Code Clone Pairings) -> Augmented Classification Table (each method with its attributes and class) -> Machine Learning (Learn Rules) -> Classification Rules]

Approach - Learning
Steps: 1. Create Training Data  2. Learn  3. Apply

    Method Identifier   Attributes      Classification
    toolDone            {6, false, 3}   refactor
    setTool             {1, true, 1}    don't refactor
    exit                {0, false, 2}   don't refactor
    Start               {4, false, 2}   refactor

Learned rules:
1. If (Fan-in > 5 and is-Void = true), then (refactor)
2. If (true), then (don't refactor)
Final Result: the rules are the only output passed to the Classifying Phase
[Diagram: same learning pipeline as on the previous slide]

Approach - Classifying
Classify Unknown Program

    Method Identifier   Attributes      Classification
    showPrompt          {6, false, 3}   refactor
    takeOrder           {1, true, 1}    don't refactor
    sendMessage         {0, false, 2}   don't refactor
    end                 {4, false, 2}   refactor

[Diagram: Unknown Program -> Classification Table (Method A to Method D) -> Mining Analyses (Fan-in, No Parameters, Code Clone Pairings) -> Augmented Classification Table (each method with its attributes) -> Classifier, using the Classification Rules -> Completed Classification Table (Method A: Class 2, Method B: Class 1, Method C: Class 1, Method D: Class 3)]

Evaluation Questions
1. Does combining analyses increase precision and recall?
2. Are generated rules effective on other programs?
3. Does categorical tagging increase performance?
   Tagged in two different ways:
   – Boolean: either refactor or don't refactor
   – Categorical: either don't refactor or a reason (category) why to refactor
4. What is the (time) overhead?
5. Can rules help direct research and evaluate new analyses?

Experimental Setup
Subject Programs
• Training Program (JHotDraw, 11K LOC)
• Testing Program (PetStore, 9K LOC)
Steps
1. Train
2. Test on Training Program
3. Test on Testing Program
Metrics
• Precision and Recall
• Time

Experimental Results
Precision = (number of good candidates returned) / (number of candidates returned)
Recall = (number of good candidates returned) / (number of actual good candidates)
Program not tagged, so can't calculate recall
[Charts: results for Timna: Cat and Timna: Bool]

Experimental Results
1. Does combining analyses increase precision and recall?
Why is Fan-In performing poorly? Single analyses work well for specific cases, but fail to find all aspects.
[Charts: results for Timna: Cat and Timna: Bool]
In this case, combining analyses does increase precision and recall.

Experimental Results
2. Are generated rules effective on other programs?
[Charts: results for Timna: Cat and Timna: Bool]
In this case, the rules effectively mine from the testing program.

Experimental Results
3. Does categorical tagging increase performance?
[Charts: results for Timna: Cat and Timna: Bool]
In this case, the categorical tagging and the boolean tagging perform similarly.

Experimental Results
4. What is the (time) overhead?

              Timna: Cat   Timna: Bool   Timna: Cat   Timna: Bool
    Learn     6.24s        1.88s         ---          ---
    Analyze   5m28s        5m28s         2m04s        2m04s

Learn: only done once
Analyze: can be done incrementally, at each compile/edit, or overnight
From these results, we believe Timna could be integrated into an IDE without degrading response time.

Experimental Results
5. Can rules help direct research and evaluate new analyses?
• [WARE 05] elaborates on use in evaluating new analyses
  • if an analysis does not appear in the rules, it is providing no new information
• Human readable rules can help define style

Contributions
• Technique to combine mining analyses to automatically identify refactoring candidates
• Demonstrated how to apply machine learning to learn good AOP style from canonical examples
  – generate human readable rules
• Invented several (7) novel mining analyses during our initial use of Timna
• Experimentally showed evidence that combining analyses can improve performance

Possible Application
Provide hints, shaded by level of confidence
"Of course! Whenever I change something in the drawing, I should check to see if I damaged the drawing. Moving this concept to an aspect can eliminate a lot of similar calls from my OOP code."

Future Work
• Examine other aspect categorization
  – "Sorts" (Marin et al, ICSM 05)
• Extend with additional analyses
  – NLP-based analyses
• Apply to more unknown programs
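
Sketch: applying learned rules and computing the metrics

The two learned rules on the second Approach - Learning slide and the precision and recall definitions from the Experimental Results slides can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not Timna's implementation: the class name `TimnaRuleSketch`, the method names, and the four example methods (`m1` to `m4`) with their attribute values and manual tags are all hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

public class TimnaRuleSketch {

    // Ordered rule list from the slide:
    //   1. If (Fan-in > 5 and is-Void = true), then (refactor)
    //   2. If (true), then (don't refactor)
    public static String classify(int fanIn, boolean isVoid) {
        if (fanIn > 5 && isVoid) {
            return "refactor";
        }
        return "don't refactor"; // default rule: always matches
    }

    // Precision = good candidates returned / candidates returned.
    public static double precision(Set<String> returned, Set<String> actualGood) {
        if (returned.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        Set<String> goodReturned = new HashSet<>(returned);
        goodReturned.retainAll(actualGood);
        return (double) goodReturned.size() / returned.size();
    }

    // Recall = good candidates returned / actual good candidates.
    public static double recall(Set<String> returned, Set<String> actualGood) {
        if (actualGood.isEmpty()) {
            return 0.0; // convention chosen for this sketch
        }
        Set<String> goodReturned = new HashSet<>(returned);
        goodReturned.retainAll(actualGood);
        return (double) goodReturned.size() / actualGood.size();
    }

    public static void main(String[] args) {
        // Hypothetical methods and attribute values (not data from the study).
        String[] names = {"m1", "m2", "m3", "m4"};
        int[] fanIn = {7, 1, 9, 6};
        boolean[] isVoid = {true, true, false, true};

        // Collect every method the rules classify as a refactoring candidate.
        Set<String> returned = new HashSet<>();
        for (int i = 0; i < names.length; i++) {
            if (classify(fanIn[i], isVoid[i]).equals("refactor")) {
                returned.add(names[i]);
            }
        }

        // Hypothetical manual tags: the methods a human marked as good candidates.
        Set<String> actualGood = new HashSet<>();
        actualGood.add("m1");
        actualGood.add("m3");

        System.out.println("returned  = " + returned.size() + " candidates");
        System.out.println("precision = " + precision(returned, actualGood));
        System.out.println("recall    = " + recall(returned, actualGood));
    }
}
```

With these made-up inputs the rules return `m1` and `m4`; only `m1` is tagged as good, so precision and recall are both 0.5. Recall needs the manual tags, which is why it cannot be computed for a program that was never tagged.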