Using Domain Information to Categorize Effort Distribution

University of Southern California
Center for Systems and Software Engineering
Thomas Tan
Brad Clark
Ye Yang
Table of Contents
• Introduction
• Research Project
– Research Purpose
– Data Collection and Processing
– Initial Results
• Being Dr. Boehm’s PhD Students
• Special Video Tribute to Dr. Boehm from USC
CSSE and the CS department
Symposium in Honor of Professor Barry W. Boehm
Introduction
• USC CSSE Software Cost Estimation Team
– Works under the direction of Dr. Boehm on research projects in
software cost estimation, software metrics, and measurement
– Particular emphasis on the COCOMO model
– Also maintains the CodeCount Tool
• Currently composing a Software Cost
Estimation Manual for the US Government
– A study of application domains and operating environments is
also part of this effort
Research Purpose
• The overall goal is to study application
domains and operating environments of
software projects and how these factors affect
the project’s effort
– Simple cost-effort relationships using these two
factors
– Effects of these factors on effort distribution
– Extend the current COCOMO II model to
incorporate the effects of these factors
Application Domains and Effort Distribution
• Currently working on application domain’s
effect on effort distribution.
• Key questions to answer are:
1. Do we see different effort distribution patterns
from different application domains?
2. Do we see differences between effort distribution
patterns based on application domains and the
COCOMO II model’s generic effort distribution?
3. What information can we extract from application
domains that can be used to explain these
differences?
Effort Activities Definitions
• From Software Resource Data Report (SRDR) definitions
• Compliant with COCOMO II Waterfall definitions; the following table
illustrates the mapping between COCOMO II phases and SRDR
activities
Phase                                   Activities
Requirement                             Software requirements analysis
Architecture                            Software architecture and detailed design
Coding, unit testing                    Coding and Unit Testing
Integration and Qualification Testing   Software integration and system/software integration;
                                        Qualification/acceptance testing
Domain Definitions
• Adapted from Reifer’s application types in combination with the US Air Force Software
Cost Estimation Handbook’s application types
• 21 application domains and 8 operating environments. We only show the 8 domains that
we use for this analysis.
Domain: Definition
Business: Software that automates business functions, stores and retrieves data, processes orders, manages/tracks the
flow of materials, combines data from different sources, or uses logic and rules to process information.
Command and Control: Software that enables decision makers to manage dynamic situations and respond in real time. Software
provides timely and accurate information for use in planning, directing, coordinating and controlling resources
during operations. Software is highly interactive with a high degree of multi-tasking.
Communications: Software that controls the transmission and receipt of voice, data, digital and video information. The software
operates in real-time or in pseudo real-time in noisy environments.
Mission Management: Software that enables and assists the operator in performing mission management activities including scheduling
activities based on vehicle, operational and environmental priorities.
Mission Planning: Software used for scenario generation, feasibility analysis, route planning, and image/map manipulation. This
software considers the many alternatives that go into making a plan and captures the many options that lead to
mission success.
Simulation: Software used to evaluate scenarios and assess empirical relationships that exist between models of physical
processes, complex systems or other phenomena. The software typically involves running models using a
simulated clock in order to mimic real world events.
Sensor Control and Processing: Software used to control and manage sensor transmitting and receiving devices. This software enhances,
transforms, filters, converts or compresses sensor data typically in real-time. This software uses a variety of
algorithms to filter noise, process data concurrently in real-time and discriminate between targets.
Weapon Delivery and Control: Software used to select, target, and guide weapons. Software is typically complex because it involves
sophisticated algorithms, fail-safe functions and must operate in real-time.
Data Collection and Processing
• Data from source-sanitized SRDR data collection
• Experts from the government side helped fill in the domain and
environment parameters for all data points
• Because the background of the data is unknown, we assume the data
are normally distributed and run normality tests to verify that
assumption
• Select only data points with full effort information, i.e., each must
report effort for requirements, architecture & design, coding & unit
testing, and integration & qualification testing
• Also test backfilling the data points that are missing effort data; more
details on the next slide
• Calculate means and medians for each domain subset
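The selection and summary steps above can be sketched roughly as follows. This is a minimal illustration, not the team's actual tooling: the record layout, the activity keys, and the sample hours are all hypothetical, and it assumes each project's hours are first converted to percentages of its own total before averaging.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical activity keys standing in for the four SRDR activity groups.
ACTIVITIES = ("requirements", "arch_design", "code_unit_test", "integration_qt")

def effort_percentages(hours):
    """Convert per-activity hours into percentages of the project's total."""
    total = sum(hours[a] for a in ACTIVITIES)
    return {a: 100.0 * hours[a] / total for a in ACTIVITIES}

def summarize_by_domain(records):
    """Return {domain: {activity: (mean %, median %)}} over complete records."""
    by_domain = defaultdict(list)
    for rec in records:
        hours = rec["hours"]
        # Keep only data points that report effort for every activity.
        if any(hours.get(a) is None for a in ACTIVITIES):
            continue
        by_domain[rec["domain"]].append(effort_percentages(hours))
    return {
        domain: {
            a: (mean([r[a] for r in rows]), median([r[a] for r in rows]))
            for a in ACTIVITIES
        }
        for domain, rows in by_domain.items()
    }

# Hypothetical data points, not taken from the SRDR data set.
records = [
    {"domain": "Business", "hours": {"requirements": 100, "arch_design": 100,
                                     "code_unit_test": 100, "integration_qt": 100}},
    {"domain": "Business", "hours": {"requirements": 10, "arch_design": 20,
                                     "code_unit_test": 30, "integration_qt": 40}},
    {"domain": "Business", "hours": {"requirements": None, "arch_design": 50,
                                     "code_unit_test": 80, "integration_qt": 70}},
    {"domain": "Simulation", "hours": {"requirements": 20, "arch_design": 30,
                                       "code_unit_test": 30, "integration_qt": 20}},
]
summary = summarize_by_domain(records)
```

In this sketch the third Business record is dropped because its requirements hours are missing, so the Business averages are computed from two data points only.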
Data Backfilling
• Motivation: many data points are missing data for some effort
activities, for instance hours for requirements or for
qualification testing
• Goal: check whether backfilling can improve the analysis results by
adding more data points
• Experiment: use the average percentages from the full-information
data set to backfill data points missing at most two effort activities;
the result is a backfilled set from which new average percentages
can be calculated
• Results: not as useful as expected. Results before and after
backfilling are similar, and the initial reading is that backfilling costs
extra effort that is probably not worth the insignificant improvement
in analysis results
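One plausible reading of the backfilling experiment is sketched below. The slides do not spell out the exact procedure, so this is an assumption: the project's total effort is estimated from the activities that were reported, using the average percentages of the full-information set, and the missing activities are then filled from that estimated total. The activity keys, `avg_pct` values, and sample hours are hypothetical.

```python
# Hypothetical activity keys standing in for the four SRDR activity groups.
ACTIVITIES = ("requirements", "arch_design", "code_unit_test", "integration_qt")

def backfill(hours, avg_pct, max_missing=2):
    """Fill missing activity hours from average percentages.

    Returns a new dict with every activity filled, or None when more than
    `max_missing` activities are missing (the data point is then skipped).
    """
    missing = [a for a in ACTIVITIES if hours.get(a) is None]
    if not missing:
        return dict(hours)
    if len(missing) > max_missing:
        return None
    known = [a for a in ACTIVITIES if a not in missing]
    known_hours = sum(hours[a] for a in known)
    known_share = sum(avg_pct[a] for a in known) / 100.0
    # Assumption: reported activities make up their average share of the total.
    estimated_total = known_hours / known_share
    filled = dict(hours)
    for a in missing:
        filled[a] = estimated_total * avg_pct[a] / 100.0
    return filled

# Hypothetical average percentages from a full-information data set.
avg_pct = {"requirements": 20.0, "arch_design": 25.0,
           "code_unit_test": 30.0, "integration_qt": 25.0}

partial = {"requirements": None, "arch_design": 250,
           "code_unit_test": 300, "integration_qt": 250}
filled = backfill(partial, avg_pct)

too_sparse = {"requirements": None, "arch_design": None,
              "code_unit_test": None, "integration_qt": 250}
skipped = backfill(too_sparse, avg_pct)
```

Because the filled values are derived from the existing averages, a backfilled set tends to pull toward those same averages, which is consistent with the similar before-and-after results reported above.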
Initial Results
• The following table shows the average percentages for
domains:
Domain                           Requirement   Arch & Design   Code & Unit Test   Integration & QT
Business                         20.98%        22.55%          24.96%             31.51%
Command & Control                17.82%        25.47%          36.98%             19.74%
Communications                   14.21%        28.32%          33.42%             24.05%
Mission Management               18.33%        16.51%          29.39%             35.77%
Mission Planning                 15.66%        13.30%          44.08%             26.95%
Simulation                       15.69%        28.31%          30.43%             25.56%
Sensors Control and Processing   9.98%         39.42%          24.43%             26.17%
Weapons Delivery and Control     13.06%        20.37%          33.05%             33.52%
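As a quick sanity check on the averages reported for the eight domains, the short sketch below encodes them as data, confirms that each domain's four percentages sum to roughly 100%, and picks out the activity with the largest share in each domain. The variable names and the rounding tolerance are choices made for this illustration.

```python
ACTIVITIES = ("Requirement", "Arch & Design", "Code & Unit Test", "Integration & QT")

# Average percentages per domain, as reported in the slide's table.
DOMAIN_AVERAGES = {
    "Business": (20.98, 22.55, 24.96, 31.51),
    "Command & Control": (17.82, 25.47, 36.98, 19.74),
    "Communications": (14.21, 28.32, 33.42, 24.05),
    "Mission Management": (18.33, 16.51, 29.39, 35.77),
    "Mission Planning": (15.66, 13.30, 44.08, 26.95),
    "Simulation": (15.69, 28.31, 30.43, 25.56),
    "Sensors Control and Processing": (9.98, 39.42, 24.43, 26.17),
    "Weapons Delivery and Control": (13.06, 20.37, 33.05, 33.52),
}

# Each row should sum to 100% up to rounding of the published figures.
row_sums = {domain: sum(row) for domain, row in DOMAIN_AVERAGES.items()}
all_rows_consistent = all(abs(s - 100.0) <= 0.02 for s in row_sums.values())

# Activity with the largest average share in each domain.
dominant = {
    domain: ACTIVITIES[row.index(max(row))]
    for domain, row in DOMAIN_AVERAGES.items()
}

for domain, activity in dominant.items():
    print(f"{domain}: largest share is {activity}")
```

The dominant activity varies by domain (for example, Arch & Design for Sensors Control and Processing, but Code & Unit Test for Mission Planning), which gives a first informal look at how the distributions differ.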
Domains’ Effort Distributions Are Different
• Use simple ANOVA to test the following:
– H0: effort distributions are the same
– Ha: effort distributions are not all the same from domain to domain
Activity Group                        F        P-value   Result*
Plan & Requirements                   0.9165   0.4995    Can't reject
Architecture & Design                 3.7064   0.0019    Reject
Code & Unit Testing                   1.8000   0.1020    Barely reject
Integration & Qualification Testing   2.1125   0.0542    Reject
*Based on 90% confidence level
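For readers unfamiliar with the test, the sketch below shows how a one-way ANOVA F statistic is computed for one activity group, and how a 90% confidence rule (p < 0.10) applies to the p-values reported on this slide. The `groups` sample is hypothetical, since the underlying SRDR percentages are not given here; only `reported_p` is copied from the slide. Turning an F statistic into a p-value needs the F distribution, which a statistics package would normally supply.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical percentages for one activity group in three domains.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat, df_b, df_w = one_way_anova_f(groups)

# P-values as reported on the slide; ALPHA reflects the 90% confidence level.
ALPHA = 0.10
reported_p = {
    "Plan & Requirements": 0.4995,
    "Architecture & Design": 0.0019,
    "Code & Unit Testing": 0.1020,
    "Integration & Qualification Testing": 0.0542,
}
decisions = {
    group: "reject" if p < ALPHA else "cannot reject"
    for group, p in reported_p.items()
}
```

Under a strict p < 0.10 rule, Code & Unit Testing (p = 0.1020) sits just above the cutoff, which may be why the slide labels it a borderline case.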
Differences against COCOMO II
• Use an independent one-sample t-test to test the following:
– H0: the domain average is the same as the COCOMO average
– Ha: the domain average is not the same as the COCOMO average
– Tests are run for every domain on every activity group
Activity Group                        COCOMO Average   Results*
Plan & Requirements                   7%               All domains reject except Sensor Control and Processing
Architecture & Design                 42%              All domains reject except Sensor Control and Processing
Code & Unit Testing                   33%              Only Business domain rejects
Integration & Qualification Testing   25%              Only Mission Management and Weapon Delivery and Control domains reject
* Again, a 90% confidence level is used to determine the result
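A single one of these tests can be sketched as follows: compute the one-sample t statistic for a domain's percentages against the COCOMO average, then compare it with a two-sided critical value at the 90% confidence level. The `sample` values are hypothetical, not SRDR data; the 42% figure is the COCOMO average for Architecture & Design from the slide, and 2.132 is the standard two-sided critical t value for alpha = 0.10 with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error, n - 1

COCOMO_ARCH_DESIGN = 42.0  # COCOMO average for Architecture & Design (%)

# Hypothetical Architecture & Design percentages for one domain's projects.
sample = [22.0, 27.5, 31.0, 25.5, 29.0]

# Two-sided critical value for alpha = 0.10 with df = 4, from a t table.
T_CRIT_90_DF4 = 2.132

t_stat, df = one_sample_t(sample, COCOMO_ARCH_DESIGN)
reject = abs(t_stat) > T_CRIT_90_DF4
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

With real data the critical value depends on each domain's sample size, so it would be looked up per test or replaced by a p-value from a statistics package.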
Next Steps
• Continue extracting information from
application domains to explain the effort
distribution differences
• Use the resulting average percentages on a
test set to estimate effort distribution; compare
the results against those from the COCOMO II model
and report improvements or otherwise
• Explore the effect of operating environment
using the same procedures as for
application domains
Being Dr. Boehm’s PhD Students
• We got the best go-to guy
– Dr. Boehm always knows what to do
• We got the best teacher
– Dr. Boehm always gives us directions and hints to
find the pieces to answer our questions
• We got the best role model
– Dr. Boehm always shares his work and shows us
how to research, solve, and present
Dr. Boehm is our leader
• "If your actions inspire others to dream more,
learn more, do more and become more, you
are a leader."
-- John Quincy Adams
Special Video Tribute to Dr. Boehm
• From Julie, Shang-hua, Jun, Rick, Julieta, Binh,
Lizsl, Steve, Sue, Pongtip, Nupul, and Qi