Data and Statistics

Slides Prepared by
Juei-Chao Chen
Fu Jen Catholic University
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Chapter 1
STATISTICS in PRACTICE
BUSINESS WEEK
• Most issues of Business Week
provide an in-depth report on a
topic of current interest. Often,
the in-depth reports contain
statistical facts and summaries
that help the reader understand
the business and economic
information.
• Business Week also uses statistics
and statistical information in
managing its own business.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Chapter 1
Data and Statistics
1.1
1.2
1.3
1.4
1.5
1.6
Applications in Business and Economics
Data
Data Sources
Descriptive Statistics
Statistical Inference
Computers and Statistical Analysis
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.1 Applications in
Business and Economics
•
•
•
•
•
Accounting
Finance
Marketing
Production
Economics
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Applications in
Business and Economics
• Accounting
Public accounting firms use statistical
sampling procedures when conducting audits for
their clients.
• For instance, the audit staff selects a subset of the accounts
called a sample. After reviewing the accuracy of the
sampled accounts, the auditors draw a conclusion as to
whether the accounts receivable amount shown on the
client’s balance sheet is acceptable.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Applications in
Business and Economics
•
Economics
Economists use statistical information in making forecasts
about the future of the economy or some aspect of it.
• For instance, the analysts review a
variety of financial data including
price/earnings ratios and dividend
yields. By comparing the information
for an individual stock with information
about the stock market averages, a financial analyst
can begin to draw a conclusion as to whether an individual
stock is over- or under priced.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Applications in
Business and Economics
•
Marketing
Electronic point-of-sale scanners at retail checkout counters
are used to collect data for a variety of marketing
research applications.
• For example, Brand managers can review the
Scanner statistics and the promotional activity
statistics to gain a better understanding of the
Relationship between promotional activities and sales.
Such analyses often prove helpful in establishing future
marketing strategies for the various products.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Applications in
Business and Economics
•
Production
A variety of statistical quality control
charts are used to monitor the output
of a production process.
• For example, that a machine fills containers with
12 ounces of a soft drink. Periodically, a production
worker selects a sample of containers and computes
the average number of ounces in the sample. Properly
interpret the average can help determine when
adjustments are necessary to correct a production
process.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Applications in
Business and Economics
• Finance
Financial advisors use price-earnings ratios and dividend
yields to guide their investment recommendations.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.2 Data
•
•
•
•
•
Data and Data Sets
Elements, Variables, and Observations
Scales of Measurement
Qualitative and Quantitative Data
Cross-Sectional and Time Series Data
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.2 Data
Data and Data Sets
• Data are the facts and figures collected,
summarized, analyzed, and interpreted.
• The data collected in a particular study are
referred to as the data set.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Elements, Variables, and Observations
• The elements are the entities on which data are
collected.
• A variable is a characteristic of interest for the
elements.
• The set of measurements collected for a
particular element is called an observation.
• The total number of data values in a data set is
the number of elements multiplied by the
number of variables.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Data, Data Sets,
Elements, Variables, and Observations
•
•
•
the data set contains 8 elements.
five variables: Exchange, Ticker Symbol, Market Cap,
Price/Earnings Ratio, Gross Profit Margin.
observations: the first observation (DeWolfe Companies)
is AMEX, DWL, 36.4, 8.4, and 36.7.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Data, Data Sets,
Elements, Variables, and Observations
Variables
Element
Names
Company
Dataram
EnergySouth
Keystone
LandCare
Psychemedics
Stock
Exchange
AMEX
OTC
NYSE
NYSE
AMEX
Annual
Earn/
Sales($M) Share($)
73.10
74.00
365.70
111.40
17.60
0.86
1.67
0.86
0.33
0.13
Data Set
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Nominal scale
When the data for a variable consist of labels
or names used to identify an attribute of the
element.
For example, gender, ID number,
“exchange variable” in Table 1.1
nominal data can be recorded using a
numeric code. We could use “0” for female,
and “1” for male.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Nominal
Example:
Students of a university are classified by the
school in which they are enrolled using a
nonnumeric label such as Business,
Humanities, Education, and so on.
Alternatively, a numeric code could be used for
the school variable (e.g. 1 denotes Business, 2
denotes Humanities, 3 denotes Education, and
so on).
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Ordinal scale
if the data exhibit the properties of nominal
data and the order or rank of the data is
meaningful.
For example, questionnaire: a repair service
rating of excellent, good, or poor.
Ordinal data can be recorded using a numeric
code. We could use 1 for excellent, 2 for good,
and 3 for poor.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Ordinal
Example:
Students of a university are classified by their
class standing using a nonnumeric label such
as Freshman, Sophomore, Junior, or Senior.
Alternatively, a numeric code could be used for
the class standing variable (e.g. 1 denotes
Freshman, 2 denotes Sophomore, and so on).
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Interval
the data show the properties of ordinal data
and the interval between values is expressed in
terms of a fixed unit of measure.
Example: SAT scores, temperature.
Interval data are always numeric.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Interval
Example:
three students with SAT scores of 1120, 1050,
and 970 can be ranked or ordered in terms of
best performance to poorest performance.
In addition, the differences between the scores
are meaningful. For instance, student 1 scored
1120 – 1050 =70 points more than student 2,
while student 2 scored 1050 – 970 = 80 points
more than student 3.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Ratio
the data have all the properties of interval
data and the ratio of two values is meaningful.
• Ratio scale requires that a zero value be
included to indicate that nothing exists for the
variable at the zero point.
• For example, distance, height, weight, and
time use the ratio scale of measurement.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
• Ratio
Example:
Melissa’s college record shows 36 credit hours
earned, while Kevin’s record shows 72 credit
hours earned. Kevin has twice as many credit
hours earned as Melissa.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Qualitative and Quantitative Data
• Data can be further classified as either qualitative
or quantitative.
• The statistical analysis appropriate for a
particular variable depends upon whether the
variable is qualitative or quantitative.
• If the variable is qualitative, the statistical
analysis is rather limited.
• In general, there are more alternatives for
statistical analysis when the data are quantitative.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Qualitative Data
• Labels or names used to identify an attribute
of each element
• Qualitative data are often referred to as
categorical data
• Use either the nominal or ordinal scale of
measurement
• Can be either numeric or nonnumeric
• Appropriate statistical analyses are rather
limited
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Quantitative Data
• Quantitative data indicate how many or how
much:
discrete, if measuring how many
continuous, if measuring how much
• Quantitative data are always numeric.
• Ordinary arithmetic operations are
meaningful for quantitative data.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Scales of Measurement
Data
Qualitative
Numerical
Nominal
Ordinal
Quantitative
Nonnumerical
Nominal
Ordinal
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Numerical
Interval
Ratio
Slide ‹#›
Cross-Sectional Data
Cross-sectional data are collected at the same or
approximately the same point in time.
Example: data detailing the number of building
permits issued in June 2003 in each of the counties
of Ohio
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Time Series Data
Time series data are collected over several time
periods.
Example: data detailing the number of building
permits issued in Lucas County, Ohio in each of
the last 36 months
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Data Sources
• Existing Sources
Within a firm – almost any department
Business database services – Dow Jones & Co.
Government agencies - U.S. Department of Labor
Industry associations – Travel Industry Association
of America
Special-interest organizations – Graduate Management
Admission Council
Internet – more and more firms
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.3 Data Sources
• Existing Sources
• Statistical Studies
• Data Acquisition Errors
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Data Sources
• Statistical Studies
In experimental studies the variables of interest
are first identified. Then one or more factors are
controlled so that data can be obtained about how
the factors influence the variables.
In observational (nonexperimental) studies no
attempt is made to control or influence the
variables of interest.
a survey is a
good example
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Data Acquisition Considerations
Time Requirement
•
Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.
Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.
Data Errors
• Using any data that happens to be available or
that were acquired with little care can lead to poor
and misleading information.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.4 Descriptive Statistics
• Descriptive statistics are the tabular,
graphical, and numerical methods used to
summarize data.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Example: Hudson Auto Repair
The manager of Hudson Auto
would like to have a better
understanding of the cost
of parts used in the engine
tune-ups performed in the
shop. She examines 50
customer invoices for tune-ups. The costs of parts,
rounded to the nearest dollar, are listed on the next
slide.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Example: Hudson Auto Repair
•
Sample of Parts Cost for 50 Tune-ups
91
71
104
85
62
78
69
74
97
82
93 57
72 89
62 68
88 68
98 101
75 52 99
66 75 79
97 105 77
83 68 71
79 105 79
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
80
75
65
69
69
97 62
72 76
80 109
67 74
62 73
Slide ‹#›
Tabular Summary:
Frequency and Percent Frequency
Parts
Cost ($)
50-59
60-69
70-79
80-89
90-99
100-109
Parts
Frequency
2
13
16
7
7
5
50
Percent
Frequency
4
26
(2/50)100
32
14
14
10
100
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Graphical Summary: Histogram
Tune-up Parts Cost
18
16
Frequency
14
12
10
8
6
4
2
Parts
50-59 60-69 70-79 80-89 90-99 100-110 Cost ($)
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Numerical Descriptive Statistics
• The most common numerical descriptive
statistic is the average (or mean).
• Hudson’s average cost of parts, based on the
50 tune-ups studied, is $79 (found by summing
the 50 cost values and then dividing by 50).
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.5 Statistical Inference
Population
- the set of all elements of interest in a
particular study
Sample - a subset of the population
Statistical inference - the process of using data obtained
from a sample to make estimates
and test hypotheses about the
characteristics of a population
Census - collecting data for a population
Sample survey - collecting data for a sample
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
Process of Statistical Inference
1. Population
consists of all
tune-ups. Average
cost of parts is
unknown.
2. A sample of 50
engine tune-ups
is examined.
4. The sample average
is used to estimate the
population average.
3. The sample data
provide a sample
average parts cost
of $79 per tune-up.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
1.6 Computers and Statistical Analysis
• Statistical analysis often involves working with large
amounts of data.
• Computer software is typically used to conduct the
analysis.
• Statistical software packages such as Microsoft Excel
and Minitab are capable of data management,
analysis, and presentation.
• Instructions for using Excel and Minitab are
provided in chapter appendices.
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›
End of Chapter 1
© 2006 by Thomson Learning, a division of Thomson Asia Pte Ltd..
Slide ‹#›