
STAT 111
Principles of statistics
YASAR UNIVERSITY
2010-2011 FALL
ASSIST. PROF. DR. R. SERKAN ALBAYRAK
HTTP://YASARUNIVERSITY.YAHOOBOARD.NET/
Brief Introduction
 Syllabus
 Course Materials
 Grading-Next Slide
 Etc.
Previously in Principles of Statistics…2007
[Bar chart: Count by Letter Grades (A, A-, B+, B, B-, C+, C, C-, D+, D, F)]
Previously in Principles of Statistics…2008
[Bar chart: Total by Letter Grades (A through F)]
Chapter 1
What is Statistics
 Statistics can be thought of as a whole subject or
discipline ...
 It can be thought of as the methods used to collect,
process and/or interpret data ...
 It can be thought of as the collections of data
gathered by those methods ...
 It can also be thought of as specially calculated
figures (e.g. averages) that characterize a collection ...
What is Statistics
 Statistics are like a bikini; What is revealed is
interesting; What is concealed is crucial. - R.
Taylor
 Statistics is the science and art of making decisions
based on quantitative evidence.
“The most fundamental
principle of all in
gambling is simply equal
conditions, e.g. of
opponents, of bystanders,
of money, of situation, of
the dice box, and of the
die itself. To the extent to
which you depart from
that equality, if it is in
your opponent’s favor,
you are a fool, and if in
your own, you are
unjust.”
Girolamo Cardano
1501 – 1576
PROPOSITION IV
“Suppose now that I am
playing against someone with
the agreement that the first of
us to win three times will take
the stake. And suppose that I
have already won twice and my
opponent has already won
once. I want to know how
much of the money should fall
to me if we do not wish to
continue the game, but rather
to divide equitably the money
we are playing for.”
Christiaan Huygens
1629 – 1695
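A minimal sketch of one way to work out the division in Proposition IV, assuming every round is a fair 50/50 game (an assumption added here; the quotation does not state it). The function name `share` is hypothetical. Each player's share is taken to be the probability that this player would have won had play continued.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def share(a_needs: int, b_needs: int) -> Fraction:
    """Fraction of the stake owed to player A, who still needs `a_needs`
    wins while player B still needs `b_needs` wins.

    Assumption (not in the slide): each round is won by A or B with
    probability 1/2.
    """
    if a_needs == 0:      # A has already reached the target
        return Fraction(1)
    if b_needs == 0:      # B has already reached the target
        return Fraction(0)
    half = Fraction(1, 2)
    # Average over the two equally likely outcomes of the next round.
    return half * share(a_needs - 1, b_needs) + half * share(a_needs, b_needs - 1)


# First to three wins; I have won twice, my opponent once.
my_share = share(1, 2)
print(my_share)  # 3/4
```

Under that assumption, three quarters of the stake falls to the player who is ahead.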
Branches of Statistics
Descriptive Statistics
Involves organizing,
summarizing, and
displaying data.
Inferential Statistics
Involves using
sample data to draw
conclusions about a
population.
Branches of Statistics
 The objective of descriptive statistics methods is
to summarize a set of observations.
 The objective of inferential statistics methods is
to make inferences (predictions, decisions) about a
population based on information contained in a
sample, and to quantify the level of uncertainty in
our decisions.
Example: Descriptive and Inferential Statistics
Decide which part of the study represents the
descriptive branch of statistics. What conclusions
might be drawn from the study using inferential
statistics?
A large sample of men, aged 48,
was studied for 18 years. For
unmarried men, approximately
70% were alive at age 65. For
married men, 90% were alive
at age 65.
Solution: Descriptive and Inferential Statistics
Descriptive statistics involves statements such as “For
unmarried men, approximately 70% were alive at age 65” and
“For married men, 90% were alive at 65.”
A possible inference drawn from the study is that being
married is associated with a longer life for men.
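A rough sketch of the two branches applied to this study. The slide reports only percentages, so the group sizes below are HYPOTHETICAL and chosen purely for illustration. The two-proportion z statistic is one common inferential tool, swapped in here as an example; the slide does not say which method the study used.

```python
import math

# HYPOTHETICAL group sizes: the study's actual counts are not given.
n_unmarried, n_married = 1000, 1000
alive_unmarried = 700  # 70% of the hypothetical 1000
alive_married = 900    # 90% of the hypothetical 1000

# Descriptive statistics: summarize what was observed in the sample.
p_unmarried = alive_unmarried / n_unmarried
p_married = alive_married / n_married
print(f"alive at 65: unmarried {p_unmarried:.0%}, married {p_married:.0%}")

# Inferential statistics: ask whether the difference could plausibly be
# due to chance alone. Two-proportion z statistic with a pooled
# proportion under the hypothesis of no difference.
pooled = (alive_unmarried + alive_married) / (n_unmarried + n_married)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_unmarried + 1 / n_married))
z = (p_married - p_unmarried) / se
print(f"z = {z:.2f}")
```

With these made-up counts z is far above the usual 1.96 cutoff, so the difference would be hard to attribute to chance. Note that this supports an association, not a claim that marriage causes longer life.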
PLAN
 We will follow
 Logic
 Statistics (Descriptive)
 Probability
 Statistics (Inferential)
Valid Rules of Reasoning
 An ARGUMENT is a sequence of statements, one of
which is called the CONCLUSION. The other
statements are PREMISES (assumptions). The
argument presents the premises—collectively— as
evidence that the conclusion is true.
Example: If A is true then B is true. A is true.
Therefore, B is true.
The CONCLUSION is that B is true. The PREMISES are If A is true then B is
true and A is true. The premises support the conclusion that B is true. The
word "therefore" is not part of the conclusion: It is a signal that the
statement after it is the conclusion.
 The words thus, hence, so, and the phrases it follows that, we see that, and
so on, also flag conclusions. The words suppose, let, given, assume, and so
on, flag premises.
 A concrete argument of the form just given might be:
 If it is sunny, I will wear sandals. It is sunny. Therefore, I will wear
sandals.
 Here, A is "it is sunny" and B is "I will wear sandals."
 We usually omit the words "is true." So, for example, the previous
argument would be written
 If A then B. A. Therefore, B.
 The statement not A means A is false.
Validity & Soundness
An argument is VALID if the conclusion must be true
whenever the premises are true.
If an argument is valid and its premises are true, the
argument is SOUND.
Cheese more than a billion years old is stale.
The Moon is made of cheese. The Moon is
more than a billion years old. Therefore, the
Moon is stale cheese.
VALID but NOT SOUND!
Some Valid Rules of Reasoning
 A or not A. (LAW OF THE EXCLUDED MIDDLE)
 Not (A and not A).
 A. Therefore, A or B.
 A. B. Therefore, A and B.
 A and B. Therefore, A.
 Not A. Therefore, not (A and B).
 A or B. Not A. Therefore, B. (DENYING THE DISJUNCT)
 Not (A and B). Therefore, (not A) or (not B). (DE MORGAN)
 Not (A or B). Therefore, (not A) and (not B). (DE MORGAN)
 If A then B. A. Therefore, B. (AFFIRMING THE ANTECEDENT, MODUS
PONENDO PONENS, "affirming by affirming")
 If A then B. Not B. Therefore, not A. (DENYING THE CONSEQUENT, MODUS
TOLLENDO TOLLENS, "denying by denying")
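The rules above can be checked mechanically with a truth table. This is a minimal sketch, assuming "If A then B" is read as material implication (false only when A is true and B is false). The helper names `is_valid` and `implies` are hypothetical.

```python
from itertools import product


def implies(a: bool, b: bool) -> bool:
    """Material implication: 'if a then b'."""
    return (not a) or b


def is_valid(premises, conclusion, n_vars: int = 2) -> bool:
    """An argument is valid if the conclusion is true in every truth
    assignment that makes all the premises true."""
    for values in product([True, False], repeat=n_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # premises true, conclusion false
    return True


# If A then B. A. Therefore, B.
modus_ponens = is_valid(
    [lambda a, b: implies(a, b), lambda a, b: a], lambda a, b: b)
# If A then B. Not B. Therefore, not A.
modus_tollens = is_valid(
    [lambda a, b: implies(a, b), lambda a, b: not b], lambda a, b: not a)
# A or B. Not A. Therefore, B.
denying_disjunct = is_valid(
    [lambda a, b: a or b, lambda a, b: not a], lambda a, b: b)
# Not (A and B). Therefore, (not A) or (not B).
de_morgan = is_valid(
    [lambda a, b: not (a and b)], lambda a, b: (not a) or (not b))

print(modus_ponens, modus_tollens, denying_disjunct, de_morgan)
# True True True True
```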
Common Formal Fallacies
 A or B. Therefore, A.
 A or B. A. Therefore, not B. (AFFIRMING THE DISJUNCT)
 Not both A and B are true. Not A. Therefore, B.
 If A then B. B. Therefore, A.
 If A then B. Not A. Therefore, not B.
 If A then B. C. Therefore, B.
 If A then B. Not C. Therefore, not A.
 If A then B. A. Therefore, C.
 If A then B. Not B. Therefore, not C.
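A formal fallacy is exposed by a single counterexample: a truth assignment in which every premise is true and the conclusion is false. The sketch below searches for such assignments, again assuming "If A then B" means material implication. The helper names are hypothetical; "affirming the consequent" and "denying the antecedent" are the usual textbook names for two of the unnamed forms in the list.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material implication: 'if x then y'."""
    return (not x) or y


def counterexamples(premises, conclusion, names=("A", "B", "C")):
    """Return every truth assignment in which all premises are true
    but the conclusion is false. An empty list means the form is valid."""
    found = []
    for values in product([True, False], repeat=len(names)):
        env = dict(zip(names, values))
        if all(p(env) for p in premises) and not conclusion(env):
            found.append(env)
    return found


# If A then B. B. Therefore, A.  (affirming the consequent)
affirming_consequent = counterexamples(
    [lambda e: implies(e["A"], e["B"]), lambda e: e["B"]],
    lambda e: e["A"],
)
# If A then B. Not A. Therefore, not B.  (denying the antecedent)
denying_antecedent = counterexamples(
    [lambda e: implies(e["A"], e["B"]), lambda e: not e["A"]],
    lambda e: not e["B"],
)
# A or B. A. Therefore, not B.  (affirming the disjunct)
affirming_disjunct = counterexamples(
    [lambda e: e["A"] or e["B"], lambda e: e["A"]],
    lambda e: not e["B"],
)

for name, found in [("affirming the consequent", affirming_consequent),
                    ("denying the antecedent", denying_antecedent),
                    ("affirming the disjunct", affirming_disjunct)]:
    print(name, "->", found[0])
```

In the first two forms the counterexample has A false and B true; in the third, A and B are both true.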
AD HOMINEM (PERSONAL ATTACK)
 NANCY CLAIMS THE DEATH PENALTY IS A GOOD THING. BUT NANCY ONCE
SET FIRE TO A VACANT WAREHOUSE. NANCY IS EVIL. THEREFORE, THE
DEATH PENALTY IS A BAD THING.
 THIS ARGUMENT DOES NOT ADDRESS NANCY'S ARGUMENT, IT JUST SAYS SHE MUST
BE WRONG (ABOUT EVERYTHING) BECAUSE SHE IS EVIL. WHETHER NANCY IS GOOD
OR EVIL IS IRRELEVANT: IT HAS NO BEARING ON WHETHER HER ARGUMENT IS
SOUND.
 THIS IS A FALLACY OF RELEVANCE: IT ESTABLISHES THAT NANCY IS BAD, THEN
EQUATES BEING BAD AND NEVER BEING RIGHT. IN SYMBOLS, THE ARGUMENT IS
IF A THEN B. A. THEREFORE C. (IF SOMEBODY SETS FIRE TO A VACANT WAREHOUSE,
THAT PERSON IS EVIL. NANCY SET FIRE TO A VACANT WAREHOUSE. THEREFORE,
NANCY'S OPINION ABOUT THE DEATH PENALTY IS WRONG.)
 AD HOMINEM IS LATIN FOR "TOWARDS THE PERSON." AN AD HOMINEM ARGUMENT
ATTACKS THE PERSON MAKING THE CLAIM, RATHER THAN THE PERSON'S REASONING.
A VARIANT OF THE AD HOMINEM ARGUMENT IS "GUILT BY ASSOCIATION."
BAD MOTIVE
 BOB CLAIMS THE DEATH PENALTY IS A GOOD THING. BUT
BOB'S FAMILY BUSINESS MANUFACTURES CASKETS. BOB
BENEFITS WHEN PEOPLE DIE, SO HIS MOTIVES ARE SUSPECT.
THEREFORE, THE DEATH PENALTY IS A BAD THING.
 THIS ARGUMENT DOES NOT ADDRESS BOB'S ARGUMENT, IT
ADDRESSES BOB'S MOTIVES. HIS MOTIVES ARE IRRELEVANT: THEY
HAVE NOTHING TO DO WITH WHETHER HIS ARGUMENT FOR THE
DEATH PENALTY IS SOUND.
 THIS IS RELATED TO AN AD HOMINEM ARGUMENT. IT, TOO,
ADDRESSES THE PERSON, NOT THE PERSON'S ARGUMENT. HOWEVER,
RATHER THAN CONDEMNING BOB AS EVIL, IT IMPUGNS HIS MOTIVES
IN ARGUING FOR THIS PARTICULAR CONCLUSION.
TU QUOQUE (LOOK WHO'S TALKING)
 AMY SAYS PEOPLE SHOULDN'T SMOKE CIGARETTES
IN PUBLIC BECAUSE CIGARETTE SMOKE HAS A
STRONG ODOR. BUT AMY WEARS STRONG PERFUME
ALL THE TIME. AMY IS CLEARLY A HYPOCRITE.
THEREFORE, SMOKING IN PUBLIC IS FINE.
 THIS ARGUMENT DOES NOT ENGAGE AMY'S ARGUMENT: IT
ATTACKS HER FOR THE (IN)CONSISTENCY OF HER
OPINIONS IN THIS MATTER AND IN SOME OTHER MATTER.
WHETHER AMY WEARS STRONG FRAGRANCES HAS
NOTHING TO DO WITH WHETHER HER ARGUMENT AGAINST
SMOKING IS SOUND.
TWO WRONGS MAKE A RIGHT
 YES, I HIT BILLY. BUT SALLY HIT HIM FIRST.
 THIS ARGUMENT CLAIMS IT IS FINE TO DO SOMETHING WRONG
BECAUSE SOMEBODY ELSE DID SOMETHING WRONG. THE
ARGUMENT IS OF THE FORM: IF A THEN B. A. THEREFORE C.
(IN WORDS: IF SALLY HIT BILLY, IT'S OK FOR BILLY TO HIT
SALLY. SALLY HIT BILLY. THEREFORE, IT'S OK FOR ME TO HIT
BILLY.)
 GENERALLY, THE TWO-WRONGS-MAKE-A-RIGHT ARGUMENT
SAYS THAT THE JUSTIFIED WRONG HAPPENED AFTER THE
EXCULPATORY WRONG, OR WAS LESS SEVERE. FOR INSTANCE,
SALLY HIT BILLY FIRST, OR SALLY HIT BILLY HARDER THAN I
DID, OR SALLY PULLED A KNIFE ON BILLY.
AD BACULUM (APPEAL TO FORCE)
 IF YOU DON'T GIVE ME YOUR LUNCH MONEY, MY BIG BROTHER WILL
BEAT YOU UP. YOU DON'T WANT TO BE BEATEN UP, DO YOU?
THEREFORE, YOU SHOULD GIVE ME YOUR LUNCH MONEY.
 THIS ARGUMENT APPEALS TO FORCE: ACCEPT MY CONCLUSION, OR ELSE. IT
IS NOT A LOGICAL ARGUMENT, BUT IT CAN BE QUITE PERSUASIVE NONETHELESS.
 IT IS AN ARGUMENT THAT IF YOU DO NOT ACCEPT THE CONCLUSION (AND
GIVE ME YOUR LUNCH MONEY), SOMETHING BAD WILL HAPPEN (YOU WILL
GET BEATEN)—NOT AN ARGUMENT THAT THE CONCLUSION IS CORRECT. THE
FORM OF THE ARGUMENT IS IF A THEN B. B IS BAD. THEREFORE, NOT A.
HERE, A IS "YOU DON'T GIVE ME YOUR LUNCH MONEY," B IS "YOU WILL BE
BEATEN UP."
AD MISERICORDIAM (APPEAL TO PITY)
 YES, I DOWNLOADED MUSIC ILLEGALLY—BUT MY GIRLFRIEND LEFT ME AND
I LOST MY JOB SO I WAS BROKE AND I COULDN'T AFFORD TO BUY MUSIC AND
I WAS SO SAD THAT I WAS BROKE AND THAT MY GIRLFRIEND WAS GONE
THAT I REALLY HAD TO LISTEN TO 100 VARIATIONS OF SHE CAUGHT THE
KATY.
 THIS ARGUMENT JUSTIFIES AN ACTION NOT BY CLAIMING THAT IT IS
CORRECT, BUT BY AN APPEAL TO PITY: EXTENUATING CIRCUMSTANCES OF A
SORT.
 AD MISERICORDIAM IS LATIN FOR "TO PITY." IT IS AN APPEAL TO
COMPASSION RATHER THAN TO REASON. ANOTHER EXAMPLE:
 YES, I FAILED THE FINAL. BUT I NEED TO GET AN A IN THE CLASS OR I
[WON'T GET INTO BUSINESS SCHOOL] / [WILL LOSE MY SCHOLARSHIP] /
[WILL VIOLATE MY ACADEMIC PROBATION] / [WILL LOSE MY 4.0 GPA]. YOU
HAVE TO GIVE ME AN A!
AD POPULUM (BANDWAGON)
 MILLIONS OF PEOPLE SHARE COPYRIGHTED MP3 FILES AND
VIDEOS ONLINE. THEREFORE, SHARING COPYRIGHTED MUSIC
AND VIDEOS IS FINE.
 THIS "BANDWAGON" ARGUMENT CLAIMS THAT SOMETHING IS MORAL
BECAUSE IT IS COMMON. COMMON AND CORRECT ARE NOT THE SAME.
WHETHER A PRACTICE IS WIDESPREAD HAS LITTLE BEARING ON
WHETHER IT IS LEGAL OR MORAL. THAT MANY PEOPLE BELIEVE
SOMETHING IS TRUE DOES NOT MAKE IT TRUE.
 AD POPULUM IS LATIN FOR "TO THE PEOPLE." IT EQUATES THE
POPULARITY OF AN IDEA WITH THE TRUTH OF THE IDEA: EVERYBODY
CAN'T BE WRONG. FEW TEENAGERS HAVE NOT MADE AD POPULUM
ARGUMENTS: "BUT MOM, EVERYBODY IS DOING IT!"
STRAW MAN (TURKISH: BOSTAN KORKULUĞU, "SCARECROW")
 BOB: SLEEPING A FULL 12 HOURS ONCE IN A WHILE IS A
HEALTHY PLEASURE.
 SAMANTHA: IF EVERYBODY SLEPT 12 HOURS ALL THE
TIME, NOTHING WOULD EVER GET DONE; THE
REDUCTION IN PRODUCTIVITY WOULD DRIVE THE
COUNTRY INTO BANKRUPTCY. THEREFORE, NOBODY
SHOULD SLEEP FOR 12 HOURS.
 SAMANTHA ATTACKED A DIFFERENT CLAIM FROM THE
ONE BOB MADE: SHE ATTACKED THE ASSERTION THAT IT
IS GOOD FOR EVERYBODY TO SLEEP 12 HOURS EVERY
DAY. BOB ONLY CLAIMED THAT IT WAS GOOD ONCE IN A
WHILE.
RED HERRING (TURKISH: DIKKATI BAŞKA YERE ÇEKMEK, "DIVERTING ATTENTION ELSEWHERE")
 ART: TEACHER SALARIES SHOULD BE INCREASED TO ATTRACT BETTER
TEACHERS.
 BETTE: LENGTHENING THE SCHOOL DAY WOULD ALSO IMPROVE
STUDENT LEARNING OUTCOMES. THEREFORE, TEACHER SALARIES
SHOULD REMAIN THE SAME.
 ART ARGUES THAT INCREASING TEACHER SALARIES WOULD ATTRACT BETTER
TEACHERS. BETTE DOES NOT ADDRESS HIS ARGUMENT: SHE SIMPLY ARGUES
THAT THERE ARE OTHER WAYS OF IMPROVING STUDENT LEARNING OUTCOMES.
ART DID NOT EVEN USE STUDENT LEARNING OUTCOMES AS A REASON FOR
INCREASING TEACHER SALARIES. EVEN IF BETTE IS CORRECT THAT
LENGTHENING THE SCHOOL DAY WOULD IMPROVE LEARNING OUTCOMES, HER
ARGUMENT IS SIDEWAYS TO ART'S: IT IS A DISTRACTION, NOT A REFUTATION.
 A RED HERRING ARGUMENT DISTRACTS THE LISTENER FROM THE REAL TOPIC.
 RED HERRING ARGUMENTS ARE VERY COMMON IN POLITICAL DISCOURSE.
EQUIVOCATION
 ALL MEN SHOULD HAVE THE RIGHT TO VOTE. SALLY IS NOT A
MAN. THEREFORE, SALLY SHOULD NOT NECESSARILY HAVE
THE RIGHT TO VOTE.
 THIS IS AN EXAMPLE OF EQUIVOCATION, A FALLACY FACILITATED BY
THE FACT THAT A WORD CAN HAVE MORE THAN ONE MEANING.
 THIS ARGUMENT USES THE WORD MAN IN TWO DIFFERENT WAYS. IN
THE FIRST PREMISE, THE WORD MEANS HUMAN WHILE IN THE
SECOND, IT MEANS MALE. GENERALLY, EQUIVOCATION IS
CONSIDERED A FALLACY OF RELEVANCE, BUT THIS EXAMPLE FITS OUR
DEFINITION OF A FALLACY OF EVIDENCE.
 THE LOGICAL FORM OF THIS ARGUMENT IS IF A THEN B. NOT C.
THEREFORE, B IS NOT NECESSARILY TRUE.
Others (Generalizability)
 Trident (4/5)
Trident® sugarless gum used to advertise that "4 out
of 5 dentists surveyed recommend Trident®
sugarless gum for their patients who chew gum."
 Yale University Graduates
Data
 In its broadest sense, Statistics is the science of
drawing conclusions about the world from data. Data
are observations (measurements) of some quantity
or quality of something in the world.
 "Data" is a plural noun; the singular form is
"datum." Our lives are filled with data: the weather,
weights, prices, our state of health, exam grades,
bank balances, election results, and so on. Data come
in many forms, most of which are numbers, or can be
translated into numbers for analysis.
 There are several important questions to keep in
mind when you evaluate quantitative evidence:
 Are the data relevant to the question asked?
 Was the data collection fair, or might there have
been some conscious or unconscious BIAS that
influenced the results or made some cases less likely
to be observed?
 Do the data make sense?
Data ~ Information
 Qualitative Data : Consists of attributes, labels, or
nonnumerical entries.
Major
Place of birth
Eye color
Hot/Warm/Cold
 Population density: low/medium/high
 Height: short/medium/tall
 Young/Middle-aged/Old
 Social class: lower/middle/upper
 Family size: fewer than 3, 3–5, more than 5
 Rural/Urban area
 Type of climate
 Gender
 Ethnicity
 Zip code
 Hair color
 Country of origin
Data ~ Information
 Quantitative Data : Numerical measurements or
counts.
Age
Weight of a letter
Temperature
 Temperature in °C
 Population density: people per square mile
 Height in inches
 Height in centimeters
 Body mass index (BMI)
 Age in seconds
 Income in dollars
 Family size (#people)
Example – Classifying Data by Type
 The base prices of several vehicles are shown in the
table. Which data are qualitative data and which
are quantitative data? (Source: Ford Motor Company)
Solution – Classifying Data by Type
Qualitative Data
(Names of vehicle
models are
nonnumerical
entries)
Quantitative Data
(Base prices of
vehicles models are
numerical entries)
 The fact that a category is labeled with a
number does not make the variable
quantitative!
 The real issue is whether arithmetic with the values
makes sense.
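A small sketch of the "does arithmetic make sense?" test. All values below are hypothetical and chosen only for illustration. Zip codes are stored as strings because they are labels; counting them is meaningful, but averaging them is not, even though Python will happily compute a number.

```python
from collections import Counter
from statistics import mean

# HYPOTHETICAL records, for illustration only.
zip_codes = ["35100", "35100", "06420", "34710"]   # labels, not amounts
heights_cm = [162.0, 175.5, 180.0, 168.5]          # measurements

# Qualitative variable: counting categories is meaningful.
most_common_zip, count = Counter(zip_codes).most_common(1)[0]
print(most_common_zip, count)   # 35100 2

# Quantitative variable: arithmetic is meaningful.
mean_height = mean(heights_cm)
print(mean_height)              # 171.5

# The computation runs, but the result describes nothing in the world:
# there is no such thing as an "average zip code".
zip_mean = mean(int(z) for z in zip_codes)
print(zip_mean)                 # 27832.5
```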
Levels of Measurement
Nominal level of measurement
 Qualitative data only
 Categorized using names, labels, or qualities
 No mathematical computations can be made
Ordinal level of measurement
 Qualitative or quantitative data
 Data can be arranged in order
 Differences between data entries are not meaningful
Example – Classifying data by level
 Two data sets are shown. Which data set consists of
data at the nominal level? Which data set consists of
data at the ordinal level? (Source: Nielsen Media Research)
Solution – Classifying data by level
Ordinal level (lists the
rank of five TV
programs. Data can be
ordered. Difference
between ranks is not
meaningful.)
Nominal level (lists the
call letters of each
network affiliate. Call
letters are names of
network affiliates.)
Levels of Measurement
Interval level of measurement
 Quantitative data
 Data can be ordered
 Differences between data entries are meaningful
 Zero represents a position on a scale (not an inherent
zero – zero does not imply “none”)
Levels of Measurement
Ratio level of measurement
 Similar to interval level
 Zero entry is an inherent zero (implies “none”)
 A ratio of two data values can be formed
 One data value can be expressed as a multiple of
another
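A short sketch of the interval/ratio distinction using temperature; this example is an illustration added here, not taken from the slides. On the Celsius scale zero is only a position on the scale, so differences are meaningful but ratios are not. On the Kelvin scale zero is an inherent zero, so ratios are meaningful too. The two readings are hypothetical.

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius to kelvins."""
    return c + 273.15


# Two HYPOTHETICAL temperature readings in degrees Celsius.
t1_c, t2_c = 10.0, 20.0
t1_k, t2_k = celsius_to_kelvin(t1_c), celsius_to_kelvin(t2_c)

# Differences agree on both scales: interval-level arithmetic is fine.
diff_c = t2_c - t1_c
diff_k = t2_k - t1_k
print(diff_c, round(diff_k, 2))   # 10.0 10.0

# Ratios disagree: "20 is twice 10" is an artifact of where Celsius
# puts its zero. Only the Kelvin ratio has a physical meaning.
ratio_c = t2_c / t1_c
ratio_k = t2_k / t1_k
print(ratio_c, round(ratio_k, 3))  # 2.0 1.035
```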
Example – Classifying data by level
 Two data sets are shown. Which data set consists
of data at the interval level? Which data set
consists of data at the ratio level? (Source: Major League
Baseball)
Solution – Classifying data by level
Interval level
(Quantitative data. Can
find a difference between
two dates, but a ratio does
not make sense.)
Ratio level (Can find
differences and write
ratios.)
Summary of Four Levels of Measurement
Level of      Put data in   Arrange data   Subtract      Determine if one data value
Measurement   categories    in order       data values   is a multiple of another
Nominal       Yes           No             No            No
Ordinal       Yes           Yes            No            No
Interval      Yes           Yes            Yes           No
Ratio         Yes           Yes            Yes           Yes
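The summary has a simple cumulative structure: each level permits everything the previous level permits, plus one more operation. A minimal sketch of that rule follows; the names `LEVELS`, `OPERATIONS`, and `allowed` are hypothetical.

```python
# Levels and operations, each listed from least to most demanding.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
OPERATIONS = ["categorize", "order", "subtract", "ratio"]


def allowed(level: str, operation: str) -> bool:
    """A level permits an operation if the operation's position in
    OPERATIONS is no higher than the level's position in LEVELS."""
    return OPERATIONS.index(operation) <= LEVELS.index(level)


# Rebuild the Yes/No summary table from the rule.
table = {
    level: {op: allowed(level, op) for op in OPERATIONS}
    for level in LEVELS
}

for level, row in table.items():
    answers = ["Yes" if ok else "No" for ok in row.values()]
    print(f"{level:<9}", *answers)
```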
Variable, Value & Data
 One of the most problematic relationships.
 What is a variable, really?
 What is a value?
 What is data?
 How are they related?
[Diagram: Variable; Values (theoretical); Data (observed)]
Example
Variable: New York Yankees’ World Series victories
Values: 1901, 1902, … (all possible years)
Data: 1923, 1927, 1928, …
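One possible way to picture the three notions in code, as a sketch only. The variable is a name, the values are everything it could theoretically take, and the data are what was actually observed. Only the three years shown on the slide are included; the slide's list continues and is not filled in here. The function name `possible_values` is hypothetical.

```python
from itertools import count, islice

# The variable: the thing we are measuring.
variable = "New York Yankees' World Series victories"


def possible_values(start: int = 1901):
    """Theoretical values: every year from `start` onward (unbounded)."""
    return count(start)


# Observed data: only the years listed on the slide (the list continues).
observed_data = [1923, 1927, 1928]

first_values = list(islice(possible_values(), 3))
print(first_values)  # [1901, 1902, 1903]

# Every observation must be one of the theoretical values.
print(all(year >= 1901 for year in observed_data))  # True
```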