1 The Present and Future Role of Probability in Software Engineering
Trial Lecture, Friday 30 December
Siv Hilde Houmb
Siv Hilde Houmb, The Present and Future Role of Probability in Software Engineering
2 Outline
• Short introduction to SE and Probability
• Probability as Decision Support for SE
• Project/Software Estimation in SE
• The Present Role of Probability in Project Estimation
• The Future Role of Probability in Project Estimation
  – Treat estimation as a probabilistic phenomenon
  – Combine estimation with quantitative risk assessment to yield realistic estimates
  – Estimation as a System Identification problem
• Concluding Remarks
• References
3 Software Engineering
Software engineering (SE) is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software [1].
4 [Figure: Software Processes (M1 Requirement, M2 Design, M3 Implementation, Maintenance) together with Product Management, End-Product Development, and Project Management Methodology and Tools; quality attributes Security, Safety, Reliability; dimensions Cost and Time]
5 Probability Theory
Probability theory is the branch of mathematics concerned with analysis of random phenomena [1].
• Probability theory is concerned with random variables, stochastic processes, and events
• These are mathematical abstractions of non-deterministic events or measured quantities that may either be single occurrences or evolve over time in an apparently random fashion
6 Probability as Decision Support in SE
• The goal of a software project is to produce reliable and effective software that meets its requirements and expectations
• This involves many DECISIONS
  – Which methodology and tools are most effective?
  – What are the project risks?
  – What are expected and realizable quality goals (safety, security, reliability)?
  – What are the risks to meeting these goals?
  – How much should and will the development cost in money and resources?
  – What are the risks to meeting the deadline and the budget?
7 Estimation in SE
The motivation for looking into software/project estimation is to find out whether the situation there is the same as in security “estimation” (quantification)
[Figure: estimation process with the elements User Specs, Count Function Points, UFP, Compute TCF, Apply Organization Particulars, Product Size, Size, Estimation tool (COCOMO), Cost, Duration/Time, Schedule, Development Process Planning Tool, Project Plans. From page 184 in Bernstein & Yuhas (2005) [10]]
8 Project Estimation
• Concerns estimating time (T), schedule (S) and cost (C)
  – Time is often aggregated into a quantitative value denoted Effort
  – Cost C must usually be within some Budget B
• Decisions concern arriving at estimates of time and budget that are close to the “real” values
• The ultimate goal is S = T and B = C, meaning that the a priori estimates equal the a posteriori observations
9 Estimation Methods
• (Repeatable) Wideband Delphi Estimation
  – Uses two types of meetings, kick-off and estimation meetings, and results in an agreed-upon estimate
• PROBE (Proxy Based Estimation)
  – Individual estimates based on a database storing the prior experience of a particular engineer
• COCOMO II (COnstructive COst MOdel)
  – Uses 5 scale factors and 17 cost drivers (Post-Architecture model) to estimate size and complexity and from that derive the required effort
• The Planning Game
  – For XP (Extreme Programming)
  – Estimation as a game between the engineers and the stakeholders using user/usage stories
10 The Present Role of Probability (1)
• Estimation is done as traditional business analysis with little real statistics: one attempts to obtain best estimates with little or no formal representation of the uncertainty in the estimates
• Software estimation's cone of uncertainty
  – Uncertainty in the estimates
  – Variability in the events and activities of a project
11 The Present Role of Probability (2)
  – "If I give you another week to work on your estimate, can you refine it so that it contains less uncertainty?"
  – This does not work
  – The reason the estimates contain variability is that the software project itself contains variability, not that the estimates happen to differ at different points in time
  – The only way to reduce the variability in the estimate is to reduce the variability in the project itself, or to capture that variability in the estimation model
12 Interpretation of Probability
• Classical (frequentist) interpretation
  – Estimate P’ of the probability P of event E, where uncertainty is an estimate of the relative distance between the actual P and P’
  – Uses/requires historical or empirical data
• Bayesian or subjectivist interpretation – B. de
Finetti (1973) [12,13]
  – The belief P’ of event E, where uncertainty expresses how certain the source is that P’ is P
  – Uses expert judgment, historical data and other available sources
• Predictive Bayesian interpretation – T. Aven (2003) [21]
  – Estimates a subject’s uncertainty about P for a future event E
  – Uses expert judgment on historical data and experience
13 Future Role of Probability (1)
1. Treat project estimation as a probabilistic phenomenon according to the subjectivist or Bayesian interpretation (general idea from Pfleeger (1999) [14] and Fenton et al. (1999/2000/2004) [11,19,20])
• Rather than looking at the variability in the process/software and the uncertainty in the estimates separately, based on historical data and the classical interpretation of probability, one can combine them under the subjectivist or Bayesian interpretation of probability
14 Future Role of Probability (2)
2. Combine estimation with quantitative risk assessment to yield realistic estimates (Kansala (1997) [6]) and to support trade-off analysis (Fenton et al. (1999/2000/2004) [11,19,20])
3. Estimation seen as a (probabilistic) System Identification Problem (from Ramil (2000) [7])
15 Project Estimation as a probabilistic phenomenon
• COCOMO II was developed based on 83 projects, examining the relation between a priori values (estimates) and a posteriori values (observations) using regression analysis
• Regression-type models (Chulani, Boehm and Steece (2000) [24])
  – Might lead to misunderstanding about cause and effect
  – Represent a static analysis and need a lot of historical data (no missing data points and no outliers)
  – Assume that there is an actual value
  – Treat estimates as “objective”
  – Assume that all goes well during the project, and so do not account for the variability
  – Must be calibrated to an organization
16 A Causal Approach to Estimation
• Using causal models (as in software reliability engineering) based on the metrics in, e.g.,
COCOMO II extends the estimation model to a dynamic decision-support and risk analysis tool
• Causal analysis using Bayesian Belief Nets (BBNs) (Fenton et al. (1999/2004) [11,20]) handles
  – Diverse process and product variables (to express variability in each and the dependencies between them)
  – Empirical evidence and expert judgment
  – Genuine cause and effect relationships
  – Uncertainty
  – Incomplete information
• No additional metrics are needed, neither in the data collection nor in the sophistication of the metrics
  – The BBN topology simply expresses the current metrics, extended to handle disparate information sources in sets of conditional probability statements
17 BBN Topology for Estimation
• A BBN consists of
  – A graphical network (a directed acyclic graph, DAG) with nodes and arcs
    • Nodes represent uncertain variables
    • Arcs model the causal relationships between the variables
  – Probability tables
    • Provide the probability of each state of a node's variable (conditional on the states of its parent nodes)
• We will look at a BBN topology for estimating resources in a project from Fenton and Neil (1999) [11] that can be used both for trade-off analysis and for project risk analysis
18 Classical versus Causal [11]
19 Required Resources Subnet (from Fenton and Neil (1999) [11])
20 Example: Problem Size of 1400-1500 FP
21 Example: Require High Accuracy
22 Propagating Evidence in BBN (1)
• Evidence propagates through the topology using Bayes' method
• Bayes' method
  – Estimate a prior probability (P(A)), the initial belief
  – Collect evidence/information (P(B))
    • Perform experiments
    • Historical data
    • Collect expert opinions
    • Other information sources
  – Update the prior estimate to a posterior estimate (P(A|B))
23 Propagating Evidence in BBN (2)
• Bayes' rule is used to update the network
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
• Bayes' rule updates our belief about a hypothesis A in the light of new evidence B
  – Our prior belief P(A) is updated to the posterior belief P(A|B) by multiplying P(A) by the likelihood P(B|A) that B will occur if A is true, and dividing by the probability P(B) of the evidence
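A minimal sketch of the Bayes' rule update for a single node with discrete states, written in Python. The states ("on_schedule", "late"), the evidence ("milestone M1 slipped") and all probabilities are hypothetical illustration values, not figures from Fenton and Neil [11], and `bayes_update` is an assumed helper name. A real BBN tool propagates evidence through the whole network; this only shows the one-step update P(A) → P(A|B).

```python
from typing import Dict


def bayes_update(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return the posterior P(A|B) for every state A.

    prior[a]      is P(A = a), the initial belief
    likelihood[a] is P(B | A = a), how probable the observed evidence B is under a
    """
    # P(B) follows from the law of total probability: sum over a of P(B|a) P(a)
    evidence = sum(likelihood[a] * prior[a] for a in prior)
    if evidence == 0:
        raise ValueError("the evidence has zero probability under this prior")
    # Bayes' rule: P(a|B) = P(B|a) P(a) / P(B)
    return {a: likelihood[a] * prior[a] / evidence for a in prior}


# Hypothetical prior belief about the project outcome (assumed values)
prior = {"on_schedule": 0.7, "late": 0.3}

# Hypothetical likelihood of the evidence B = "milestone M1 slipped"
# under each outcome (assumed values)
likelihood = {"on_schedule": 0.2, "late": 0.8}

posterior = bayes_update(prior, likelihood)

for state, p in posterior.items():
    print(f"P({state} | M1 slipped) = {p:.3f}")
```

With these assumed numbers P(B) = 0.2 · 0.7 + 0.8 · 0.3 = 0.38, so the belief in "late" rises from 0.3 to 0.24 / 0.38, roughly 0.63. The posterior can serve as the prior for the next piece of evidence, which is how repeated observations refine an estimate during a project.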
24 Concluding Remark – Benefits of Causal Model/BBN (1)
• Explicit modeling of variability in projects and uncertainty in estimates
• Explicit modeling of cause-effect relationships
• Can combine diverse types of information
• Makes explicit those assumptions that were previously hidden
• Intuitive graphical format makes it easier to understand chains of complex and seemingly contradictory reasoning
25 Concluding Remark – Benefits of Causal Model/BBN (2)
• Ability to forecast missing data
• Support for ‘what-if?’ analysis and forecasting of effects of process changes
• Use of subjectively or objectively derived probability distributions
• Rigorous mathematical semantics for the model
• No need to do any of the complex Bayesian calculations by hand, as tools like HUGIN do that
26 "Not everything that can be counted counts, and not everything that counts can be counted."
Source unknown; placed on Einstein’s office door at Princeton (commonly attributed to Albert Einstein)
27 References (1)
1. IEEE Standard Glossary of Software Engineering Terminology, IEEE Std 610.12-1990.
2. A. Stellman and J. Greene. Applied Software Project Management. O’Reilly, 2005.
3. B. Boehm et al. Software Cost Estimation with COCOMO II. Addison Wesley, 2000.
4. B. Boehm. Software Engineering Economics. Englewood Cliffs, N.J., Prentice-Hall, 1981.
5. F. P. Brooks Jr. The Mythical Man-Month: Essays on Software Engineering. Addison Wesley, USA, 1975.
6. K. Kansala. Integrating Risk Assessment with Cost Estimation. IEEE Software, Vol. 14, No. 4 (1997), pp. 61-67.
7. J. Ramil. Why COCOMO Works Revisited or Feedback Control as a Cost Factor. Submitted to FEAST 2000 International Workshop on Feedback in Software and Business Processes, July 10-12, Imperial College, London, 2000.
8. S. McConnell. Software Estimation: Demystifying the Black Art. Microsoft Press, 2006.
9. B. Boehm and K. Sullivan. Software economics status and prospects.
Information and Software Technology, Volume 41, No. 14 (1999), pp. 937-946.
28 References (2)
10. L. Bernstein and C.M. Yuhas. Trustworthy Systems Through Quantitative Software Engineering. John Wiley & Sons, 2005.
11. N. Fenton and M. Neil. Software Metrics and Risk. 2nd European Software Measurement Conference, 1999.
12. B. de Finetti. Theory of Probability, Volume 1. John Wiley & Sons, 1973.
13. B. de Finetti. Theory of Probability, Volume 2. John Wiley & Sons, 1973.
14. S.L. Pfleeger. Albert Einstein and Empirical Software Engineering. IEEE Computer, 32(10):32-38, October 1999.
15. C. Wohlin, P. Runeson, M. Höst, C.O. Ohlsson, B. Regnell and A. Wesslén. Experimentation in Software Engineering: An Introduction. Kluwer Academic Publishers, 2000.
16. R. Cooke. Experts in Uncertainty: Opinion and Subjective Probability in Science. Oxford University Press, 1991.
17. M. Jørgensen and K.J. Moløkken-Østvold. How Large Are Software Cost Overruns? Critical Comment on the Standish Group’s CHAOS Report. Information and Software Technology, 48(4):297-301, 2006.
29 References (3)
18. M. Jørgensen. Estimation of Software Development Work Effort: Evidence on Expert Judgment and Formal Models. International Journal of Forecasting, 23(3):449-462, 2007.
19. N. Fenton and M. Neil. Software metrics: roadmap. In Proceedings of the 22nd International Conference on Software Engineering: Future of Software Engineering Track, pages 357-370, 2000.
20. N. Fenton, W. Marsh, M. Neil, P. Cates, S. Forey and M. Tailor. Making resource decisions for software projects. Proceedings of the 26th International Conference on Software Engineering (ICSE04), pages 391-406, 2004.
21. T. Aven. Foundations of Risk Analysis: A Knowledge and Decision-Oriented Perspective. Wiley, 2003.
22. S. Chulani. Incorporating Bayesian Analysis to Improve the Accuracy of COCOMO II and Its Quality Model Extension. Ph.D. Qualifying Exam Report, University of Southern California, February 1998.
23. S. Chulani, B.
Boehm and B. Steece. Bayesian Analysis of Empirical Software Engineering Cost Models. IEEE Transactions on Software Engineering, Special Issue on Empirical Methods in Software Engineering, Vol. 25, No. 4, July/August 1999.
30 References (4)
24. S. Chulani, B. Boehm and B. Steece. From Multiple Regression to Bayesian Analysis for Calibrating COCOMO II. Proceedings of the 21st Annual Conference of the International Society of Parametric Analysts (ISPA), 2000.
25. H. Wang, F. Peng, C. Zhang and A. Pietschker. Software Project Level Estimation Model Framework based on Bayesian Belief Networks. Proceedings of the Sixth International Conference on Quality Software. IEEE Computer Society, pages 209-218, 2006.
26. I. Stamelos, L. Angelis, P. Dimou and E. Sakellaris. On the use of Bayesian belief networks for the prediction of software productivity. Information & Software Technology, 45(1):51-60, 2003.
31 Project Resource BN (from Fenton et al. (2004) [20])
32 Subnet for Total Effective Effort (from Fenton et al. (2004) [20])
33 Example: Resource Prediction (from Fenton et al. (2004) [20])