Software Cost Estimation with COCOMO II

COCOMO II Overview
LiGuo Huang
Computer Science and Engineering
Southern Methodist University
1
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
2
COCOMO Background
• COCOMO - the “COnstructive COst MOdel”
– COCOMO II is the update to COCOMO 1981
– ongoing research with annual calibrations made available
• Originally developed by Dr. Barry Boehm and published in the 1981 book Software Engineering Economics
• COCOMO II is described in the book Software Cost Estimation with COCOMO II
• COCOMO can be used as a framework for cost estimation
and related activities
3
COCOMO II Model Objectives
• Provide accurate cost and schedule estimates for software
projects
• Enable organizations to easily recalibrate, tailor, or extend
COCOMO II to better fit their unique situations
• Provide careful, easy-to-understand definitions of the
model’s inputs, outputs, and assumptions
• Provide a constructive model
• Provide a normative model
• Provide an evolving model
4
COCOMO II Black Box Model
Inputs:
• Software product size estimate
• Software product, process, computer, and personnel attributes
• Software reuse, maintenance, and increment parameters
• Software organization's project data (for local calibration)
Outputs (from COCOMO):
• Software development and maintenance costs (effort) and schedule estimates, distributed by phase, activity, and increment
• COCOMO locally calibrated to the organization's data
5
Software Estimation Accuracy
• Effect of uncertainties over time
[Funnel chart: the relative size range of estimates narrows over time, from 4x (and 0.25x on the low side) during Feasibility, through the Operational Concept, Life Cycle Objectives (Plans/Rqts.), and Life Cycle Architecture (Design) milestones, to x at Initial Operating Capability (Develop and Test). Axes: Relative Size Range vs. Phases and Milestones.]
6
Major COCOMO II Features
• Multi-model coverage of different development
sectors
• Variable-granularity cost model inputs
• Flexibility in size inputs
– SLOC
– function points
– application points
– other (use cases ...)
• Range vs. point estimates per funnel chart
7
COCOMO Uses for
Software Decision Making
• Making investment decisions and business-case
analyses
• Setting project budgets and schedules
• Performing tradeoff analyses
• Cost risk management
• Development vs. reuse decisions
• Legacy software phaseout decisions
• Software reuse and product line decisions
• Process improvement decisions
8
Productivity Ranges
• COCOMO provides a natural framework to identify high-leverage productivity improvement factors and estimate their payoffs.

Cost Factor                        Productivity Range
Documentation                      1.27
Database Size                      1.28
Required Development Schedule      1.29
Personnel Continuity               1.48
Platform Volatility                1.49
Main Storage Constraint            1.57
Multisite Development              1.60
Required Reuse                     1.64
Execution Time Constraint          1.67
Use of Software Tools              1.72
Required Reliability               1.85
Product Complexity                 2.21
Personnel Experience               3.37
Personnel Capability               4.14
9
COCOMO Submodels
• Applications Composition Model: involves rapid development or
prototyping efforts to resolve potential high-risk issues such as user
interfaces, software/system interaction, performance, or technology
maturity.
– sized with application points (weighted screen elements, reports and
3GL modules)
• Early Design model: explores alternative software/system architectures and concepts of operation
– sized with function points
– uses a coarse-grained set of 7 cost drivers
• Post-Architecture model: covers the actual development and maintenance of a software product
– sized with source instructions and/or function points, with modifiers for reuse and software breakage
– uses a set of 17 multiplicative cost drivers and a set of 5 factors determining the project's scaling exponent
10
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
11
COCOMO Nominal-schedule
Effort Formulation
PM_NS (person-months) = A × (Size)^E × Π (i = 1 to n) EM_i

Where:
– A is a constant derived from historical project data (currently A = 2.94 in COCOMO II.2000)
– Size is in KSLOC (thousand source lines of code), or converted from function points or object points
– E is an exponent for the diseconomy of scale, dependent on five additive scale drivers:

  E = B + 0.01 × Σ (j = 1 to 5) SF_j

  where B = 0.91 and SF_j is the weighting factor for the jth scale driver
– EM_i is the effort multiplier for the ith cost driver; the geometric product results in an overall effort adjustment factor to the nominal effort
– n = # of cost drivers = 16 (excludes SCED)
Automated translation effects are not included.
12
COCOMO Effort Formulation
PM (person-months) = A × (Size)^E × Π (i = 1 to 17) EM_i

# of cost drivers = 17 (including SCED)

(A small code sketch follows.)
13
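A minimal Python sketch of the effort equation above, using the COCOMO II.2000 constants A = 2.94 and B = 0.91; the scale-factor weights and effort multipliers passed in are illustrative placeholders, not an endorsement of particular calibrated values.

  from math import prod  # Python 3.8+

  A, B = 2.94, 0.91  # COCOMO II.2000 calibration constants

  def effort_pm(ksloc, scale_factors, effort_multipliers):
      # E = B + 0.01 * sum(SF_j); PM = A * Size^E * product(EM_i)
      E = B + 0.01 * sum(scale_factors)
      return A * ksloc ** E * prod(effort_multipliers)

  # Hypothetical project: 100 KSLOC, five scale-factor weights, all 17 EMs nominal (1.0)
  print(effort_pm(100, [3.72, 3.04, 4.24, 3.29, 4.68], [1.0] * 17))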
Diseconomy of Scale
• Nonlinear relationship when exponent > 1
[Chart: Person-Months (0 to 16,000) vs. KSLOC (0 to 1,000) for exponents E = 1.226, E = 1.00, and E = 0.91; with E > 1, effort grows faster than linearly with size.]
14
COCOMO Schedule Formulation
TDEV (months) = C × (PM_NS)^F × (SCED% / 100)

Where:
– TDEV is the schedule estimate of calendar time in months from the requirements baseline to acceptance
– C is a constant derived from historical project data (currently C = 3.67 in COCOMO II.2000)
– PM_NS is the estimated person-months excluding the SCED effort multiplier

  F = D + 0.2 × 0.01 × Σ (j = 1 to 5) SF_j = D + 0.2 × (E − B)

  where D = 0.28 and B = 0.91
– SCED% is the compression/expansion percentage in the SCED cost driver
• This is the COCOMO II.2000 calibration
• The formula can vary to reflect process models for reusable and COTS software, and the effects of application composition capabilities. (A code sketch follows.)
15
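A companion sketch for the schedule equation, with the COCOMO II.2000 constants given above; PM_NS is the nominal-schedule effort (computed with SCED excluded), and the function name is mine.

  C, D = 3.67, 0.28  # COCOMO II.2000 schedule constants

  def tdev_months(pm_ns, scale_factors, sced_pct=100):
      # F = D + 0.2 * 0.01 * sum(SF_j); TDEV = C * PM_NS^F * (SCED% / 100)
      F = D + 0.2 * 0.01 * sum(scale_factors)
      return C * pm_ns ** F * (sced_pct / 100)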
Multiple Module Effort Estimation
1. Sum the sizes for all components:

   Size_Aggregate = Σ (i = 1 to n) Size_i

2. Apply the project-level drivers (the Scale Factors and SCED) to the aggregated size to derive the overall basic effort for the total project:

   PM_Basic = A × (Size_Aggregate)^E × SCED

3. Determine each component's basic effort:

   PM_Basic(i) = (Size_i / Size_Aggregate) × PM_Basic

4. Apply the component-level cost drivers (excluding SCED) to each component's basic effort (see the sketch after this list):

   PM_i = PM_Basic(i) × Π (j = 1 to 16) EM_j

5. Sum each component's effort:

   PM_Aggregate = Σ (i = 1 to n) PM_i

6. Schedule is estimated by repeating steps 2 to 5 without SCED used in step 2. Then use the schedule estimating formula.
16
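A sketch of steps 1 to 5, reusing A, B, and prod from the earlier effort example; component_ems is a hypothetical per-component list of effort multipliers (SCED excluded).

  def multimodule_effort(sizes_ksloc, scale_factors, component_ems, sced_em=1.0):
      E = B + 0.01 * sum(scale_factors)
      size_agg = sum(sizes_ksloc)                      # step 1: aggregate size
      pm_basic = A * size_agg ** E * sced_em           # step 2: project-level basic effort
      pm_aggregate = 0.0
      for size, ems in zip(sizes_ksloc, component_ems):
          pm_basic_i = (size / size_agg) * pm_basic    # step 3: apportion by size
          pm_aggregate += pm_basic_i * prod(ems)       # step 4: component-level EMs
      return pm_aggregate                              # step 5: sum component efforts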
Coverage of Different Processes
• COCOMO II provides a framework for tailoring the model
to any desired process
• Original COCOMO was predicated on the waterfall process
– single-pass, sequential progression of requirements, design, code, test
• Modern processes are concurrent, iterative, incremental, and
cyclic
– e.g. Rational Unified Process (RUP), the USC Model-Based
Architecting and Software Engineering (MBASE) process
• Effort and schedule are distributed among different phases and activities per the work breakdown structure of the chosen process
17
Common Process Anchor Points
• Anchor points are common process milestones around
which cost and schedule budgets are organized
• COCOMO II submodels address different
development stages anchored by these generic
milestones:
– Life Cycle Objectives (LCO)
• inception: establishing a sound business case
– Life Cycle Architecture (LCA)
• elaboration: commit to a single architecture and elaborate it to cover all major risk
sources
– Initial Operational Capability (IOC)
• construction: commit to transition and support operations
18
RUP Phase Distributions
Phase           Effort %   Schedule %
Inception           5          10
Elaboration        20          30
Construction       65          50
Transition         10          10
COCOMO Total      100         100
Project Total     100         100
19
Waterfall Phase Distributions
Phase                Effort %   Schedule %
Plans & Rqts.            7          20
Product Design          17          26
Programming             58          48
Integration & Test      25          26
Transition              12          12.5
COCOMO Total           100         100
Project Total          119         132.5
20
MBASE Phase Distributions
Phase           Effort %   Schedule %
Inception           6          12.5
Elaboration        24          37.5
Construction       76          62.5
Transition         12          12.5
COCOMO Total      100         100
Project Total     118         125
• see COCOMO II book for complete phase/activity
distributions
21
COCOMO II Output Ranges
• COCOMO II provides one-standard-deviation optimistic and pessimistic estimates.
• These ranges reflect sources of input uncertainty per the funnel chart.
• They apply to effort or schedule for all of the stage models.
• They represent 80% confidence limits: outcomes fall below the optimistic estimate 10% of the time, and above the pessimistic estimate 10% of the time.

Stage    Optimistic Estimate    Pessimistic Estimate
  1           0.50 E                 2.0 E
  2           0.67 E                 1.5 E
  3           0.80 E                 1.25 E

(E = the most likely estimate.)
22
COCOMO Tailoring and Enhancements
• Calibrate effort equations to organizational
experience
– USC COCOMO has a calibration capability
• Consolidate or eliminate redundant cost driver
attributes
• Add cost drivers applicable to your organization
• Account for systems engineering, hardware and
software integration
23
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
24
Cost Factors
• Significant factors of development cost:
– scale drivers are sources of exponential effort variation
– cost drivers are sources of linear effort variation
• product, platform, personnel and project attributes
• effort multipliers associated with cost driver ratings
– Defined to be as objective as possible
• Each factor is rated between very low and
very high per rating guidelines
– relevant effort multipliers adjust the cost up or down
25
Scale Factors
• Precedentedness (PREC)
– Degree to which system is new and past experience applies
• Development Flexibility (FLEX)
– Need to conform with specified requirements
• Architecture/Risk Resolution (RESL)
– Degree of design thoroughness and risk elimination
• Team Cohesion (TEAM)
– Need to synchronize stakeholders and minimize conflict
• Process Maturity (PMAT)
– SEI CMM process maturity rating
26
Cost Drivers
• Product Factors
– Reliability (RELY)
– Data (DATA)
– Complexity (CPLX)
– Reusability (RUSE)
– Documentation (DOCU)
• Platform Factors
– Time constraint (TIME)
– Storage constraint (STOR)
– Platform volatility (PVOL)
• Personnel Factors
– Analyst capability (ACAP)
– Programmer capability (PCAP)
– Applications experience (APEX)
– Platform experience (PLEX)
– Language and tool experience (LTEX)
– Personnel continuity (PCON)
• Project Factors
– Software tools (TOOL)
– Multisite development (SITE)
– Required schedule (SCED)
27
Example Cost Driver
- Required Software Reliability (RELY)
• Measures the extent to which the software
must perform its intended function over a
period of time.
• Ask: what is the effect of a software failure?
RELY ratings:
  Very Low:   slight inconvenience
  Low:        low, easily recoverable losses
  Nominal:    moderate, easily recoverable losses
  High:       high financial loss
  Very High:  risk to human life
  Extra High: (not applicable)
28
Example Effort Multiplier Values for
RELY
RELY Rating   Defect Impact                   Effort Multiplier   Typical Product
Very High     Loss of human life              1.26                Safety-critical
High          High financial loss             1.10                Commercial quality leader
Nominal       Moderate, recoverable loss      1.00                In-house support software
Low           Low, easily recoverable loss    0.92                Commercial cost leader
Very Low      Slight inconvenience            0.82                Early beta-test

[Chart: Relative Cost/Source Instruction (0.8 to 1.3) vs. Added Testing Time (%) (0 to 54).]

E.g., a highly reliable system costs 26% more than a nominally reliable system (1.26/1.0 = 1.26), and a highly reliable system costs 54% more than a very low reliability system (1.26/0.82 = 1.54).
29
Scale Factors
• Sum the scale factor weights W_i across all of the factors to determine a scale exponent, E, using E = 0.91 + 0.01 × Σ W_i

Scale Factors (W_i), rated Very Low / Low / Nominal / High / Very High / Extra High:
– Precedentedness (PREC): thoroughly unprecedented / largely unprecedented / somewhat unprecedented / generally familiar / largely familiar / thoroughly familiar
– Development Flexibility (FLEX): rigorous / occasional relaxation / some relaxation / general conformity / some conformity / general goals
– Architecture/Risk Resolution (RESL)*: little (20%) / some (40%) / often (60%) / generally (75%) / mostly (90%) / full (100%)
– Team Cohesion (TEAM): very difficult interactions / some difficult interactions / basically cooperative interactions / largely cooperative / highly cooperative / seamless interactions
– Process Maturity (PMAT): weighted average of "Yes" answers to CMM Maturity Questionnaire

* % significant module interfaces specified, % significant risks eliminated
30
Precedentedness (PREC) and
Development Flexibility (FLEX)
• Elaboration of the PREC and FLEX rating scales (Feature: Very Low / Nominal-High / Extra High):

Precedentedness
– Organizational understanding of product objectives: General / Considerable / Thorough
– Experience in working with related software systems: Moderate / Considerable / Extensive
– Concurrent development of associated new hardware and operational procedures: Extensive / Moderate / Some
– Need for innovative data processing architectures, algorithms: Considerable / Some / Minimal

Development Flexibility
– Need for software conformance with pre-established requirements: Full / Considerable / Basic
– Need for software conformance with external interface specifications: Full / Considerable / Basic
– Premium on early completion: High / Medium / Low
31
Architecture / Risk Resolution (RESL)
• Use a subjective weighted average of the characteristics, each rated Very Low / Low / Nominal / High / Very High / Extra High:

– Risk Management Plan identifies all critical risk items, establishes milestones for resolving them by PDR: None / Little / Some / Generally / Mostly / Fully
– Schedule, budget, and internal milestones through PDR compatible with Risk Management Plan: None / Little / Some / Generally / Mostly / Fully
– Percent of development schedule devoted to establishing architecture, given general product objectives: 5 / 10 / 17 / 25 / 33 / 40
– Percent of required top software architects available to project: 20 / 40 / 60 / 80 / 100 / 120
– Tool support available for resolving risk items, developing and verifying architectural specs: None / Little / Some / Good / Strong / Full
– Level of uncertainty in key architecture drivers (mission, user interface, COTS, hardware, technology, performance): Extreme / Significant / Considerable / Some / Little / Very Little
– Number and criticality of risk items: > 10 Critical / 5-10 Critical / 2-4 Critical / 1 Critical / > 5 Non-Critical / < 5 Non-Critical
32
Team Cohesion (TEAM)
• Use a subjective weighted average of the
characteristics to account for project turbulence
and entropy due to difficulties in synchronizing
the project's stakeholders.
• Stakeholders include users, customers, developers,
maintainers, interfacers, and others
Characteristics, each rated Very Low / Low / Nominal / High / Very High / Extra High:
– Consistency of stakeholder objectives and cultures: Little / Some / Basic / Considerable / Strong / Full
– Ability, willingness of stakeholders to accommodate other stakeholders' objectives: Little / Some / Basic / Considerable / Strong / Full
– Experience of stakeholders in operating as a team: None / Little / Little / Basic / Considerable / Extensive
– Stakeholder teambuilding to achieve shared vision and commitments: None / Little / Little / Basic / Considerable / Extensive
33
Process Maturity (PMAT)
• Two methods based on the Software
Engineering Institute's Capability Maturity
Model (CMM)
• Method 1:
Overall Maturity Level
(CMM Level 1 through 5)
• Method 2:
Key Process Areas
(see next slide)
34
Key Process Areas
• Decide the percentage of compliance for each of the KPAs, as determined by a judgment-based averaging across the goals for all 18 Key Process Areas:

  PMAT = 5 × Σ (i = 1 to n) [(KPA%_i / 100) × (1 / n)]

• Each KPA is rated on a compliance scale: Almost Always (> 90%), Frequently (60-90%), About Half (40-60%), Occasionally (10-40%), Rarely If Ever (< 10%), Does Not Apply, or Don't Know. The Key Process Areas begin:
  1 Requirements Management
  2 Software Project Planning
  3 Software Project Tracking and Oversight
  4 Software Subcontract Management
(See COCOMO II Model Definition Manual for remaining details; a small code sketch follows.)
35
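A small sketch of this averaging, assuming KPAs rated "Does Not Apply" or "Don't Know" are simply left out of the list (and out of n):

  def pmat_epml(kpa_compliance_pct):
      # Equivalent process maturity level: 5 x the average KPA compliance fraction
      n = len(kpa_compliance_pct)
      return 5 * sum(pct / 100 for pct in kpa_compliance_pct) / n

  print(pmat_epml([95, 80, 50, 25]))  # hypothetical compliance percentages -> 3.125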
Cost Drivers
• Product Factors
• Platform Factors
• Personnel Factors
• Project Factors
36
Product Factors
• Required Software Reliability (RELY)
– Measures the extent to which the software must
perform its intended function over a period of time.
Ask: what is the effect of a software failure?

RELY ratings:
  Very Low:   slight inconvenience
  Low:        low, easily recoverable losses
  Nominal:    moderate, easily recoverable losses
  High:       high financial loss
  Very High:  risk to human life
37
Product Factors (cont.)
• Data Base Size (DATA)
– Captures the effect large data requirements have on
development to generate test data that will be used to exercise
the program.
– Calculate the data/program size ratio (D/P):

  D/P = Database Size (bytes) / Program Size (SLOC)

DATA ratings:
  Low:        D/P < 10
  Nominal:    10 ≤ D/P < 100
  High:       100 ≤ D/P < 1000
  Very High:  D/P ≥ 1000
38
Product Factors (cont.)
• Product Complexity (CPLX)
– Complexity is divided into five areas:
• control operations,
• computational operations,
• device-dependent operations,
• data management operations, and
• user interface management operations.
– Select the area or combination of areas that characterize the
product or a sub-system of the product.
– See the module complexity table, next several slides
39
Product Factors (cont.)
• Module Complexity Ratings vs. Type of
Module
– Use a subjective weighted average of the attributes,
weighted by their relative product importance.
Control Operations
  Very Low:   Straight-line code with a few non-nested structured programming operators: DOs, CASEs, IF-THEN-ELSEs. Simple module composition via procedure calls or simple scripts.
  Low:        Straightforward nesting of structured programming operators. Mostly simple predicates.
  Nominal:    Mostly simple nesting. Some intermodule control. Decision tables. Simple callbacks or message passing, including middleware-supported distributed processing.
  High:       Highly nested structured programming operators with many compound predicates. Queue and stack control. Homogeneous distributed processing. Single-processor soft real-time control.
  Very High:  Reentrant and recursive coding. Fixed-priority interrupt handling. Task synchronization, complex callbacks, heterogeneous distributed processing. Single-processor hard real-time control.
  Extra High: Multiple resource scheduling with dynamically changing priorities. Microcode-level control. Distributed hard real-time control.

Computational Operations
  Very Low:   Evaluation of simple expressions: e.g., A = B + C * (D - E).
  Low:        Evaluation of moderate-level expressions: e.g., D = SQRT(B**2 - 4*A*C).
  Nominal:    Use of standard math and statistical routines. Basic matrix/vector operations.
  High:       Basic numerical analysis: multivariate interpolation, ordinary differential equations. Basic truncation, roundoff concerns.
  Very High:  Difficult but structured numerical analysis: near-singular matrix equations, partial differential equations. Simple parallelization.
  Extra High: Difficult and unstructured numerical analysis: highly accurate analysis of noisy, stochastic data. Complex parallelization.
40
Product Factors (cont.)
• Module Complexity Ratings vs. Type of
Module
– Use a subjective weighted average of the attributes,
weighted by their relative product importance.
Device-dependent Operations
  Very Low:   Simple read, write statements with simple formats.
  Low:        No cognizance needed of particular processor or I/O device characteristics. I/O done at GET/PUT level.
  Nominal:    I/O processing includes device selection, status checking and error processing.
  High:       Operations at physical I/O level (physical storage address translations; seeks, reads, etc.). Optimized I/O overlap.
  Very High:  Routines for interrupt diagnosis, servicing, masking. Communication line handling. Performance-intensive embedded systems.
  Extra High: Device timing-dependent coding, micro-programmed operations. Performance-critical embedded systems.

Data Management Operations
  Very Low:   Simple arrays in main memory. Simple COTS-DB queries, updates.
  Low:        Single file subsetting with no data structure changes, no edits, no intermediate files. Moderately complex COTS-DB queries, updates.
  Nominal:    Multi-file input and single file output. Simple structural changes, simple edits. Complex COTS-DB queries, updates.
  High:       Simple triggers activated by data stream contents. Complex data restructuring.
  Very High:  Distributed database coordination. Complex triggers. Search optimization.
  Extra High: Highly coupled, dynamic relational and object structures. Natural language data management.

User Interface Management Operations
  Very Low:   Simple input forms, report generators.
  Low:        Use of simple graphic user interface (GUI) builders.
  Nominal:    Simple use of widget set.
  High:       Widget set development and extension. Simple voice I/O, multimedia.
  Very High:  Moderately complex 2D/3D, dynamic graphics, multimedia.
  Extra High: Complex multimedia, virtual reality.
41
Product Factors (cont.)
• Required Reusability (RUSE)
– Accounts for the additional effort needed to construct
components intended for reuse.
RUSE ratings:
  Low:        none
  Nominal:    across project
  High:       across program
  Very High:  across product line
  Extra High: across multiple product lines

• Documentation match to life-cycle needs (DOCU)
– What is the suitability of the project's documentation to its life-cycle needs?

DOCU ratings:
  Very Low:   many life-cycle needs uncovered
  Low:        some life-cycle needs uncovered
  Nominal:    right-sized to life-cycle needs
  High:       excessive for life-cycle needs
  Very High:  very excessive for life-cycle needs
42
Platform Factors
• Platform
– Refers to the target-machine complex of hardware and
infrastructure software (previously called the virtual
machine).
• Execution Time Constraint (TIME)
– Measures the constraint imposed upon a system in
terms of the percentage of available execution time
expected to be used by the system.
TIME ratings:
  Nominal:    ≤ 50% use of available execution time
  High:       70%
  Very High:  85%
  Extra High: 95%
43
Platform Factors (cont.)
• Main Storage Constraint (STOR)
– Measures the degree of main storage constraint
imposed on a software system or subsystem.
STOR ratings:
  Nominal:    ≤ 50% use of available storage
  High:       70%
  Very High:  85%
  Extra High: 95%
• Platform Volatility (PVOL)
– Assesses the volatility of the platform (the complex of
hardware and software the software product calls on to
perform its tasks).
PVOL ratings:
  Low:        major change every 12 mo.; minor change every 1 mo.
  Nominal:    major: 6 mo.; minor: 2 wk.
  High:       major: 2 mo.; minor: 1 wk.
  Very High:  major: 2 wk.; minor: 2 days
44
Personnel Factors
• Analyst Capability (ACAP)
– Analysts work on requirements, high level design and detailed design.
Consider analysis and design ability, efficiency and thoroughness, and
the ability to communicate and cooperate.
ACAP ratings:
  Very Low:   15th percentile
  Low:        35th percentile
  Nominal:    55th percentile
  High:       75th percentile
  Very High:  90th percentile
• Programmer Capability (PCAP)
– Evaluate the capability of the programmers as a team rather than as
individuals. Consider ability, efficiency and thoroughness, and the
ability to communicate and cooperate.
PCAP ratings:
  Very Low:   15th percentile
  Low:        35th percentile
  Nominal:    55th percentile
  High:       75th percentile
  Very High:  90th percentile
45
Personnel Factors (cont.)
• Applications Experience (AEXP)
– Assess the project team's equivalent level of experience with this type
of application.
AEXP ratings:
  Very Low:   ≤ 2 months
  Low:        6 months
  Nominal:    1 year
  High:       3 years
  Very High:  6 years
• Platform Experience (PEXP)
– Assess the project team's equivalent level of experience with this
platform including the OS, graphical user interface, database,
networking, and distributed middleware.
PEXP ratings:
  Very Low:   ≤ 2 months
  Low:        6 months
  Nominal:    1 year
  High:       3 years
  Very High:  6 years
46
Personnel Factors (cont.)
• Language and Tool Experience (LTEX)
– Measures the level of programming language and software tool
experience of the project team.
LTEX ratings:
  Very Low:   ≤ 2 months
  Low:        6 months
  Nominal:    1 year
  High:       3 years
  Very High:  6 years
• Personnel Continuity (PCON)
– The scale for PCON is in terms of the project's annual personnel
turnover.
PCON ratings (annual personnel turnover):
  Very Low:   48% / year
  Low:        24% / year
  Nominal:    12% / year
  High:       6% / year
  Very High:  3% / year
47
Project Factors
• Use of Software Tools (TOOL)
– Assess the usage of software tools used to develop the product in
terms of their capabilities and maturity.
TOOL ratings:
  Very Low:   edit, code, debug
  Low:        simple, front-end, back-end CASE, little integration
  Nominal:    basic life-cycle tools, moderately integrated
  High:       strong, mature life-cycle tools, moderately integrated
  Very High:  strong, mature, proactive life-cycle tools, well integrated with processes, methods, reuse
48
Project Factors (cont.)
• Multisite Development (SITE)
– Assess and average two factors: site collocation and
communication support.
SITE: Collocation ratings:
  Very Low:   International
  Low:        Multi-city and multi-company
  Nominal:    Multi-city or multi-company
  High:       Same city or metro area
  Very High:  Same building or complex
  Extra High: Fully collocated

SITE: Communications ratings:
  Very Low:   Some phone, mail
  Low:        Individual phone, FAX
  Nominal:    Narrowband email
  High:       Wideband electronic communication
  Very High:  Wideband electronic communication, occasional video conferencing
  Extra High: Interactive multimedia
• Required Development Schedule (SCED)
– Measure the imposed schedule constraint in terms of the
percentage of schedule stretch-out or acceleration with respect to a
nominal schedule for the project.
SCED ratings (percentage of nominal schedule):
  Very Low:   75% of nominal
  Low:        85%
  Nominal:    100%
  High:       130%
  Very High:  160%
49
Cost Factor Rating
• Whenever an assessment of a cost driver is
between the rating levels:
– always round to the lower rating
– e.g. if a cost driver rating is between High and
Very High, then select High.
50
Cost Driver Rating Level Summary
(VL = Very Low, L = Low, N = Nominal, H = High, VH = Very High, XH = Extra High)

RELY:  slight inconvenience (VL); low, easily recoverable losses (L); moderate, easily recoverable losses (N); high financial loss (H); risk to human life (VH)
DATA:  D/P < 10 (L); 10 ≤ D/P < 100 (N); 100 ≤ D/P < 1000 (H); D/P ≥ 1000 (VH)
CPLX:  (see Complexity Table)
RUSE:  none (L); across project (N); across program (H); across product line (VH); across multiple product lines (XH)
DOCU:  many life-cycle needs uncovered (VL); some life-cycle needs uncovered (L); right-sized to life-cycle needs (N); excessive for life-cycle needs (H); very excessive for life-cycle needs (VH)
TIME:  ≤ 50% use of available execution time (N); 70% (H); 85% (VH); 95% (XH)
STOR:  ≤ 50% use of available storage (N); 70% (H); 85% (VH); 95% (XH)
PVOL:  major change every 12 mo., minor every 1 mo. (L); major 6 mo., minor 2 wk. (N); major 2 mo., minor 1 wk. (H); major 2 wk., minor 2 days (VH)
51
Cost Driver Rating Level Summary (cont.)
(VL = Very Low, L = Low, N = Nominal, H = High, VH = Very High, XH = Extra High)

ACAP:  15th percentile (VL); 35th (L); 55th (N); 75th (H); 90th (VH)
PCAP:  15th percentile (VL); 35th (L); 55th (N); 75th (H); 90th (VH)
PCON:  48% / year (VL); 24% (L); 12% (N); 6% (H); 3% (VH)
AEXP:  ≤ 2 months (VL); 6 months (L); 1 year (N); 3 years (H); 6 years (VH)
PEXP:  ≤ 2 months (VL); 6 months (L); 1 year (N); 3 years (H); 6 years (VH)
LTEX:  ≤ 2 months (VL); 6 months (L); 1 year (N); 3 years (H); 6 years (VH)
TOOL:  edit, code, debug (VL); simple, front-end, back-end CASE, little integration (L); basic life-cycle tools, moderately integrated (N); strong, mature life-cycle tools, moderately integrated (H); strong, mature, proactive life-cycle tools, well integrated with processes, methods, reuse (VH)
SITE, Collocation:  International (VL); multi-city and multi-company (L); multi-city or multi-company (N); same city or metro area (H); same building or complex (VH); fully collocated (XH)
SITE, Communications:  some phone, mail (VL); individual phone, FAX (L); narrowband email (N); wideband electronic communication (H); wideband electronic communication, occasional video conferencing (VH); interactive multimedia (XH)
SCED:  75% of nominal (VL); 85% (L); 100% (N); 130% (H); 160% (VH)
52
Dependencies of Cost Factor Ratings
• RUSE, RELY and DOCU:
– RELY should be rated at least one level below
the RUSE rating
– DOCU rating should be at least Nominal for
Nominal and High RUSE ratings; at least High
for Very High and Extra High RUSE ratings
53
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
54
Reused and Modified Software
• Effort for adapted software (reused or
modified) is not the same as for new
software.
• Approach: convert adapted software into
equivalent size of new software.
55
Nonlinear Reuse Effects
• The reuse cost function does not go through the origin due to a cost of about
5% for assessing, selecting, and assimilating the reusable component.
• Small modifications generate disproportionately large costs primarily due to
the cost of understanding the software to be modified, and the relative cost of
interface checking.
[Figure: relative cost vs. amount modified, for data on 2,954 NASA modules [Selby, 1988]. The observed costs lie well above the usual linear assumption: about 0.046 relative cost at 0% modified, roughly 0.55 at 25%, 0.70 at 50%, and 0.75 at 75%, reaching 1.0 at 100%.]
56
COCOMO Reuse Model
• A nonlinear estimation model to convert adapted
(reused or modified) software into equivalent
size of new software:
AAF = 0.4 × (DM) + 0.3 × (CM) + 0.3 × (IM)

For AAF ≤ 50:
  ESLOC = ASLOC × [AA + AAF × (1 + 0.02 × (SU) × (UNFM))] / 100

For AAF > 50:
  ESLOC = ASLOC × [AA + AAF + (SU) × (UNFM)] / 100
57
COCOMO Reuse Model (cont.)
• ASLOC - Adapted Source Lines of Code
• ESLOC - Equivalent Source Lines of Code
• AAF - Adaptation Adjustment Factor
• DM - Percent Design Modified. The percentage of the adapted software's design which is modified in order to adapt it to the new objectives and environment.
• CM - Percent Code Modified. The percentage of the adapted software's code which is modified in order to adapt it to the new objectives and environment.
• IM - Percent of Integration Required for Modified Software. The percentage of effort required to integrate the adapted software into an overall product and to test the resulting product, as compared to the normal amount of integration and test effort for software of comparable size.
• AA - Assessment and Assimilation. The effort needed to determine whether a fully-reused software module is appropriate to the application, and to integrate its description into the overall product description. See table.
• SU - Software Understanding. Effort increment as a percentage. Only used when code is modified (zero when DM = 0 and CM = 0). See table.
• UNFM - Unfamiliarity. The programmer's relative unfamiliarity with the software, applied multiplicatively to the software understanding effort increment (0-1).
58
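A minimal sketch of the reuse model above; DM, CM, IM, AA, and SU are percentages (0-100) and UNFM is 0-1. The function name and defaults are mine, not from the model definition.

  def esloc(asloc, dm, cm, im, aa=0.0, su=0.0, unfm=0.0):
      aaf = 0.4 * dm + 0.3 * cm + 0.3 * im
      su_term = su * unfm if (dm or cm) else 0.0  # SU applies only to modified code
      if aaf <= 50:
          return asloc * (aa + aaf * (1 + 0.02 * su_term)) / 100
      return asloc * (aa + aaf + su_term) / 100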
Assessment and Assimilation
Increment (AA)
AA Increment   Level of AA Effort
0              None
2              Basic module search and documentation
4              Some module Test and Evaluation (T&E), documentation
6              Considerable module T&E, documentation
8              Extensive module T&E, documentation
59
Software Understanding
Increment (SU)
• Take the subjective average of the three categories.
• Do not use SU if the component is being used
unmodified (DM=0 and CM =0).
Structure
  Very Low:   Very low cohesion, high coupling, spaghetti code.
  Low:        Moderately low cohesion, high coupling.
  Nominal:    Reasonably well-structured; some weak areas.
  High:       High cohesion, low coupling.
  Very High:  Strong modularity, information hiding in data / control structures.

Application Clarity
  Very Low:   No match between program and application world views.
  Low:        Some correlation between program and application.
  Nominal:    Moderate correlation between program and application.
  High:       Good correlation between program and application.
  Very High:  Clear match between program and application world views.

Self-Descriptiveness
  Very Low:   Obscure code; documentation missing, obscure or obsolete.
  Low:        Some code commentary and headers; some useful documentation.
  Nominal:    Moderate level of code commentary, headers, documentation.
  High:       Good code commentary and headers; useful documentation; some weak areas.
  Very High:  Self-descriptive code; documentation up-to-date, well-organized, with design rationale.

SU Increment to ESLOC
  Very Low: 50   Low: 40   Nominal: 30   High: 20   Very High: 10
60
Programmer Unfamiliarity
(UNFM)
• Only applies to modified software
UNFM Increment   Level of Unfamiliarity
0.0              Completely familiar
0.2              Mostly familiar
0.4              Somewhat familiar
0.6              Considerably familiar
0.8              Mostly unfamiliar
1.0              Completely unfamiliar
61
Commercial Off-the-Shelf (COTS) Software
• Current best approach is to treat as reuse
• A COTS cost model is under development
• Calculate effective size from external interface files and
breakage
• Have identified candidate COTS cost drivers
62
Reuse Parameter Guidelines
Reuse parameter guidelines by code category:

• New (all original software): reuse parameters not applicable.
• Adapted (changes to pre-existing software): DM 0%-100%, normally > 0%; CM 0+%-100%, usually > DM and must be > 0%; IM 0%-100+%, usually moderate and can be > 100%; AA 0%-8%; SU 0%-50%; UNFM 0-1.
• Reused (unchanged existing software): DM 0%; CM 0%; IM 0%-100%, rarely 0% but could be very small; AA 0%-8%; SU and UNFM not applicable.
• COTS (off-the-shelf software; often requires new glue code as a wrapper around the COTS): DM 0%; CM 0%; IM 0%-100%; AA 0%-8%; SU and UNFM not applicable.
63
Automatically Translated Code
• Reengineering vs. conversion
• Automated translation is considered separate activity
from development
– Add the term (1- AT/100) to the equation for ESLOC
64
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
65
Lines of Code
• Code size is expressed in KSLOC
• Source Lines of Code (SLOCs) = logical source statements
• Logical source statements = data declarations + executable
statements
• Executable statements cause runtime actions
• Declaration statements are nonexecutable statements that
affect an assembler's or compiler's interpretation of other
program elements
66
Lines of Code Counting Rules
• Standard definition for counting lines
– Based on the SEI definition checklist from CMU/SEI-92-TR-20
– Modified for COCOMO II
• When a line or statement contains more than one type, classify it as the type with the highest precedence; precedence is listed in ascending order below.

Statement type                           Included?
1. Executable                            Yes
2. Non-executable:
3.   Declarations                        Yes
4.   Compiler directives                 Yes
5.   Comments:
6.     On their own lines                No
7.     On lines with source code         No
8.     Banners and non-blank spacers     No
9.     Blank (empty) comments            No
10. Blank lines                          No
67
Lines of Code Counting Rules (cont.)
• (See COCOMO II book for remaining details)
How produced                                 Included?
1. Programmed                                Yes
2. Generated with source code generators     No
3. Converted with automated translators      Yes
4. Copied or reused without change           Yes
5. Modified                                  Yes
6. Removed                                   No

Origin                                                                   Included?
1. New work: no prior existence                                          Yes
2. Prior work: taken or adapted from:
3.   A previous version, build, or release                               Yes
4.   Commercial, off-the-shelf software (COTS), other than libraries     No
5.   Government furnished software (GFS), other than reuse libraries     No
6.   Another product                                                     No
7.   A vendor-supplied language support library (unmodified)             No
8.   A vendor-supplied operating system or utility (unmodified)          No
9.   A local or modified language support library or operating system    Yes
68
Counting with Function Points
• Used in both the Early Design and the Post-Architecture models.
• Based on the amount of functionality in a software
product and project factors using information
available early in the project life cycle.
• Quantify the information processing functionality
with the following user function types:
69
Counting with Function Points (cont.)
– External Input (Inputs)
• Count each unique user data or user control input type
that both
– Enters the external boundary of the software system being
measured
– Adds or changes data in a logical internal file.
– External Output (Outputs)
• Count each unique user data or control output type that
leaves the external boundary of the software system being
measured.
70
Counting with Function Points (cont.)
– Internal Logical File (Files)
• Count each major logical group of user data or control
information in the software system as a logical internal
file type. Include each logical file (e.g., each logical group
of data) that is generated, used, or maintained by the
software system.
– External Interface Files (Interfaces)
• Files passed or shared between software systems should
be counted as external interface file types within each
system.
71
Counting with Function Points (cont.)
– External Inquiry (Queries)
• Count each unique input-output combination, where an input causes an immediate output, as an external inquiry type.
• Each instance of the user function types is then
classified by complexity level. The complexity
levels determine a set of weights, which are applied
to their corresponding function counts to determine
the Unadjusted Function Points (UFP) quantity.
72
Counting with Function Points (cont.)
• The usual Function Point procedure involves
assessing the degree of influence of fourteen
application characteristics on the software
project.
• The contributions of these characteristics are
inconsistent with COCOMO experience, so
COCOMO II uses Unadjusted Function Points
for sizing.
73
Unadjusted Function Points
Counting Procedure
• Step 1 - Determine function counts by type
– The unadjusted function counts should be counted by a lead technical
person based on information in the software requirements and design
documents.
– The number of each of the five user function types should be counted
• Internal Logical File (ILF)
– Note: the word file refers to a logically related group of data and not the physical implementation of those groups of data.
• External Interface File (EIF)
• External Input (EI)
• External Output (EO)
• External Inquiry (EQ)
74
Unadjusted Function Points
Counting Procedure (cont.)
• Step 2 - Determine complexity-level function counts
– Classify each function count into Low, Average and High complexity
levels depending on the number of data element types contained and the
number of file types referenced. Use the following scheme:
For ILF and EIF (rows: Record Element Types; columns: Data Element Types 1-19 / 20-50 / 51+):
  1:       Low   Low   Avg
  2-5:     Low   Avg   High
  6+:      Avg   High  High

For EO and EQ (rows: File Types referenced; columns: Data Element Types 1-5 / 6-19 / 20+):
  0 or 1:  Low   Low   Avg
  2-3:     Low   Avg   High
  4+:      Avg   High  High

For EI (rows: File Types referenced; columns: Data Element Types 1-4 / 5-15 / 16+):
  0 or 1:  Low   Low   Avg
  2-3:     Low   Avg   High
  3+:      Avg   High  High
75
Unadjusted Function Points
Counting Procedure (cont.)
• Step 3 - Apply complexity weights
– Weight the number in each cell using the following scheme. The weights
reflect the relative value of the function to the user.
Function Type               Low   Average   High
Internal Logical Files       7      10       15
External Interface Files     5       7       10
External Inputs              3       4        6
External Outputs             4       5        7
External Inquiries           3       4        6

• Step 4 - Compute Unadjusted Function Points
– Add all the weighted function counts to get one number, the Unadjusted Function Points (a small code sketch follows).
76
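A short sketch of steps 1 to 4: tally each (function type, complexity level) count against the weights above. The data layout is my own choice, not part of the counting procedure.

  FP_WEIGHTS = {  # function type -> (Low, Average, High) weights
      "ILF": (7, 10, 15),
      "EIF": (5, 7, 10),
      "EI":  (3, 4, 6),
      "EO":  (4, 5, 7),
      "EQ":  (3, 4, 6),
  }

  def unadjusted_fp(counts):
      # counts: {("ILF", "Low"): 4, ("EI", "High"): 2, ...}
      level = {"Low": 0, "Average": 1, "High": 2}
      return sum(FP_WEIGHTS[ftype][level[lvl]] * n
                 for (ftype, lvl), n in counts.items())

  print(unadjusted_fp({("ILF", "Low"): 4, ("EI", "High"): 2}))  # 4*7 + 2*6 = 40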
Requirement Volatility (REVL)
• REVL adjusts the effective size to account for requirements evolution and volatility:

  Size = (1 + REVL/100) × Size_D

  where Size_D is the reuse-equivalent size of the delivered software (a one-line sketch follows)
77
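A one-line sketch of the REVL adjustment (function name mine):

  def effective_size(size_d_ksloc, revl_pct):
      return (1 + revl_pct / 100) * size_d_ksloc  # inflate reuse-equivalent size by REVL%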
Sizing Software Maintenance
• (Size)M = [(Base Code Size) × MCF] × MAF
• Maintenance Change Factor (MCF):

  MCF = (Size Added + Size Modified) / (Base Code Size)

• Equivalently, when the added and modified sizes are known directly: (Size)M = (Size Added + Size Modified) × MAF
• Maintenance Adjustment Factor (MAF), using the reuse model's SU and UNFM (a sketch follows):

  MAF = 1 + (SU / 100) × UNFM
78
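A sketch of the maintenance sizing above; sizes in KSLOC, SU in percent, UNFM in 0-1:

  def maintenance_size(added_ksloc, modified_ksloc, su=0.0, unfm=0.0):
      maf = 1 + (su / 100) * unfm
      return (added_ksloc + modified_ksloc) * maf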
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
79
Software Maintenance
• SCED cost driver is not used in maintenance effort estimation
• RUSE cost driver is not used in maintenance effort estimation
• RELY cost driver has a different set of effort multipliers:
– See Table 2.41 in COCOMO II book
• Apply the scaling exponent E to the changed KSLOC (added and modified, not deleted) rather than to the total legacy system (a code sketch follows):

  PM_M = A × (Size_M)^E × Π (i = 1 to 15) EM_i

• The average maintenance staffing level: FSP_M = PM_M / T_M
– Any desired maintenance activity duration T_M may be used
80
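A sketch of the maintenance effort estimate, reusing A, B, and prod from the earlier examples; the 15 maintenance effort multipliers (SCED and RUSE excluded, and RELY's maintenance values used) are supplied by the caller:

  def maintenance_effort(size_m_ksloc, scale_factors, maintenance_ems):
      E = B + 0.01 * sum(scale_factors)
      return A * size_m_ksloc ** E * prod(maintenance_ems)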
COCOMO II Demo
81
Agenda
• COCOMO Introduction
• Basic Estimation Formulas
• Cost Factors
• Reuse Model
• Sizing
• Software Maintenance Effort
• COCOMO Tool Demo
• Data Collection
82
Cost Driver Ratings Profile
• Need to rate cost drivers in a consistent and
objective fashion within an organization.
• Cost driver ratings profile:
– Graphical depiction of historical ratings to be
used as a reference baseline to assist in rating
new projects
– Used in conjunction with estimating tools to
gauge new projects against past ones
objectively
83
Example Cost Driver Ratings Profile
[Figure: example profile. For each cost driver (RELY, DATA, CPLX, ...), the historical ratings of projects PROJ1-PROJ6 are plotted across the Very Low through Extra High rating scale, alongside the rating-level definitions (e.g., for RELY: slight inconvenience; low, easily recoverable losses; moderate, easily recoverable losses; high financial loss; risk to human life). The profile forms a reference baseline for rating new projects.]
84
Techniques to Generate Cost Driver
Ratings Profile
• Single person
– Time efficient, but may impose bias and person
may be unfamiliar with all projects
• Group
– Converge ratings in a single meeting (risks domination by a single individual)
– Wideband Delphi technique (longer calendar time,
but minimizes biases). See Software Engineering
Economics, p. 335
85
COCOMO Dataset Cost Metrics
• Size (SLOC, function points)
• Effort (person-hours)
• Schedule (months)
• Cost drivers
• Scale drivers
• Reuse parameters
86
Recommended Project Cost Data
• For each project, report the following at the
end of each month and for each release:
– SIZE
• Provide the total system size developed to date, reporting new code size and reused/modified code size separately. This can be at the project level or a lower level, as the data supports and is reasonable. For languages not supported by code-counting tools, such as assembly code, report the number of physical lines separately for each language.
– EFFORT
• Provide cumulative staff-hours spent on software development
per project at the same granularity as the size components.
87
Recommended Project Cost Data (cont.)
– COST DRIVERS AND SCALE DRIVERS
• For each reported size component, supply the cost driver ratings
for product, platform, personnel and project attributes. For each
reported size component, supply scale driver ratings.
– REUSE PARAMETERS
• For each component of reused/modified code, supply reuse
parameters AA, SU, UNFM, DM, CM and IM.
• See Appendix C in COCOMO II book for
additional data items
• Post-mortem reports are highly recommended
88
Effort Staff-Hours Definition
• Standard definition
– Based on the SEI definition checklist from CMU/SEI-92-TR-21
– Modified for COCOMO II
• Does not include unpaid overtime, production and
deployment activities, customer training activities
• Includes all personnel except level 3 or higher
software management (i.e. directors or above who
timeshare among projects)
• Person-month is defined as 152 hours
89
Further Information
• B. Boehm, C. Abts, W. Brown, S. Chulani, B.
Clark, E. Horowitz, R. Madachy, D. Reifer, B.
Steece, Software Cost Estimation with
COCOMO II, Prentice-Hall, 2000
• B. Boehm, Software Engineering Economics.
Englewood Cliffs, NJ, Prentice-Hall, 1981
90