Chapter 3: Reliability and Validity
This chapter corresponds to Chapter 6 of your book (“Just the Truth”).
What it is: Reliability and validity are terms that refer to the quality of the measures used in a
research study. Reliability refers to the consistency and validity refers to the accuracy of the
measure. There are several types of reliability (test-retest, parallel forms, internal consistency,
interrater) as well as several types of validity (content, criterion, construct). These different types
of reliability and validity are used for different types of measures but can also work together. The
more types of reliability and validity a measure demonstrates, the more confident we can be in
the quality of the measure.
When to use it: Reliability and validity are important any time you measure anything (so
basically in every research study). The exact nature of a study determines which type of
reliability and/or validity you should assess (see your text for details).
Using SPSS to calculate Internal Consistency Reliability (Cronbach’s Alpha) (dataset:
Chapter3Example1.sav)
The rest of this chapter will focus on internal consistency reliability (assessed with Cronbach’s
Alpha). We are focusing on this topic specifically because it is the most commonly used type
of reliability as well as the most straightforward to demonstrate in SPSS. Cronbach’s alpha
usually ranges from 0 to 1 and tells you how internally consistent a group of items is. In other words,
Cronbach’s alpha tells you the extent to which a group of items measures the same thing. The
closer the value of Cronbach’s alpha is to 1, the more consistent the items in a measure.
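For readers who want to see what is behind the number, here is a minimal sketch of the standard formula for Cronbach’s alpha, alpha = (k / (k - 1)) * (1 - sum of the item variances / variance of the total score), where k is the number of items. The chapter itself uses SPSS; the Python function name and the responses below are made up for illustration and are not the chapter’s data set.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of responses per item; every list covers the
    same respondents in the same order.
    """
    k = len(items)
    # Total score for each respondent across the k items.
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Made-up responses from 6 people to 3 items on a 1-7 scale.
item1 = [5, 6, 3, 7, 2, 4]
item2 = [5, 7, 2, 6, 3, 4]
item3 = [6, 6, 3, 7, 1, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))
```

Because people who score high on one of these made-up items also score high on the others, the result is close to 1.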
In this example you will use SPSS to calculate Cronbach’s alpha for The Meaning in Life
Questionnaire. Here is what that measure looks like:
The Meaning in Life Questionnaire (Steger, Frazier, Oishi, & Kaler, 2006)
Please take a moment to think about what makes your life feel important to you. Please respond
to the following statements as truthfully and accurately as you can, and also please remember
that these are very subjective questions and that there are no right or wrong answers. Please
answer according to the scale below:
Absolutely Untrue   1   2   3   4   5   6   7   Absolutely True
1. I understand my life’s meaning.
2. My life has a clear sense of purpose.
3. I have a good sense of what makes my life meaningful.
4. I have discovered a satisfying life purpose.
5. My life has no clear purpose.
Open up the data set Chapter3Example1.sav. It should look like this:
You’ll see that this data set includes the responses of 100 people to the 5 questions from the
Meaning in Life Questionnaire. Each question is called an item and is labeled with a variable
name that tells you which question number it refers to (e.g., MIL1 is “I understand my life’s
meaning.”).
Take a second and go back and look at the 5 items included in the meaning in life
questionnaire. Do any of these items jump out at you? Does any item seem like it might not
belong with the others?
You may have noticed that item 5 is different from the other 4 items. A person who believes their
life has meaning would respond with high numbers to items 1-4 BUT with a low number to item
5. This is called a reverse coded item, meaning that it is scored in the opposite direction from the
rest of the measure. Whereas responses with high numbers on items 1-4 indicate high meaning in life,
responses with low numbers on item 5 indicate high meaning in life.
How to handle reverse coded items
When you have a measure that includes reverse coded items, you must use the recode
procedure in SPSS to reverse people’s responses to those items. This means that a 7 is turned
into a 1, a 6 is turned into a 2 and so on…
In this example, we’ll need to reverse score one item: MIL5.
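The recoding rule can also be written as a single formula: new value = (lowest scale value + highest scale value) - old value, which is 8 - old value on a 1-to-7 scale. The sketch below illustrates that arithmetic in Python; the function name and the responses are made up for illustration, and the chapter’s actual procedure is the SPSS recode described next.

```python
def reverse_code(response, scale_min=1, scale_max=7):
    """Flip a response on a scale that runs from scale_min to scale_max."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside the scale")
    return scale_min + scale_max - response


# Made-up responses to item 5 (not the chapter's data set).
mil5 = [7, 6, 4, 2, 1]
mil5r = [reverse_code(x) for x in mil5]
print(mil5r)  # [1, 2, 4, 6, 7]
```

Note that the midpoint of the scale (4) stays the same, which is a quick way to check that a recode was set up correctly.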
To recode an item, highlight the “transform” menu and then click on “recode into different
variables” as shown below:
That will bring up the following window:
Highlight the item you want to recode (MIL5) and click the arrow to move it over to the box on
the right. Next, type in a name for the new recoded variable. You can name it whatever
you want, but a handy way to do this is to simply add an “r” (“r” stands for “recoded”) after the
existing variable name (e.g., “MIL5r”). You’ll have to click the “change” button in the output
variable box to make the new variable name show up in the “numeric variable -> output
variable” box. Your screen should look like this:
Next click on the “old and new values” button to bring up the following window:
Type the value of the lowest number in your scale (i.e., a “1”) in the “value” box under the word
“Old Value”. Then type the value of the highest number in your scale (i.e., a “7”) in the “value”
box under the word “New Value”. In this example, this tells SPSS to turn 1’s into 7’s. Now click
the “Add” button to make the numbers show up in the “Old -> New” box. Continue this process
for each number that was a response option. The 2’s become 6’s, the 3’s become 5’s and so
on. Your screen should look like this:
Once you have done this for all the numbers click “continue” and then “ok”. Take a look at the
data window to confirm that your new variable is there – it will be located in the last column of
the data set. Spot-check some of the values in your new recoded variable to make sure the
recoding worked (e.g., does it look like the 1’s were switched to 7’s?).
Calculating Cronbach’s Alpha
Once you have reverse scored any reverse coded items, you’re ready to calculate Alpha. To
begin, click “Analyze” and then highlight “Scale”. Next click on “Reliability Analysis” as shown
below:
That will bring up the following window:
Next highlight the first item that is part of your scale and click the arrow to move it over to the
“Items” box. Continue until you have done this for each of the items in your scale. Be careful
with the reverse coded item: you only want to include the reverse scored version (not both).
Your screen should look like this:
Whenever you compute Cronbach’s alpha, it’s helpful to ask SPSS for an extra bit of output that
tells you what the alpha would be if you dropped any of the items. This helps you identify “bad
items”. If SPSS tells you that the alpha for a scale would be quite a bit higher if an item were
dropped – this is a potential bad item. It may be a reverse coded item you forgot to recode or an
item that should potentially be dropped from the scale because it is poorly worded or it
measures something different. To get this extra output click on the “statistics” button and then
check the box in the “Descriptives for” box that is labeled “Scale if item deleted”. Your screen
should look like the following:
Now click continue and then OK. Your output will look like the following:
Case Processing Summary
                          N        %
Cases   Valid           100    100.0
        Excluded(a)       0       .0
        Total           100    100.0
a. Listwise deletion based on all variables in the procedure.
This box tells you the number of participants in your sample that completed enough items
to be included in the analysis.
Reliability Statistics
Cronbach's Alpha    N of Items
.846                5
Cronbach's Alpha: this is the value of Cronbach’s Alpha for your scale.
N of Items: this is the number of items in your scale.
Item-Total Statistics
         Scale Mean if    Scale Variance if    Corrected Item-      Cronbach's Alpha
         Item Deleted     Item Deleted         Total Correlation    if Item Deleted
MIL1     19.4300          19.783               .685                 .808
MIL2     19.2300          18.502               .740                 .791
MIL3     18.6000          19.636               .667                 .811
MIL4     19.3000          19.646               .643                 .817
MIL5r    18.4400          18.128               .573                 .846
This table is the extra output we requested.
The last column (“Cronbach's Alpha if Item Deleted”) is the only part we need to pay attention to.
Remember that we’re looking for items where alpha would go UP if the item were deleted. You can see
that all of our items are “good items” because alpha would stay the same or go down if the item
were deleted.
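To make the “alpha if item deleted” idea concrete, here is a rough Python sketch that recomputes alpha with each item left out in turn and flags any item whose removal would raise it. It uses the standard formula for Cronbach’s alpha; the function names, item names, and responses are made up for illustration (the fourth item plays the role of a reverse coded item that someone forgot to recode), and SPSS’s own output remains the reference.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items (one list of responses per item)."""
    k = len(items)
    totals = [sum(responses) for responses in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


def alpha_if_item_deleted(items_by_name):
    """Alpha recomputed with each item dropped, keyed by the dropped item."""
    results = {}
    for name in items_by_name:
        remaining = [col for other, col in items_by_name.items() if other != name]
        results[name] = cronbach_alpha(remaining)
    return results


# Made-up responses from 6 people on a 1-7 scale (not the chapter's data set).
scale = {
    "item1": [5, 6, 3, 7, 2, 4],
    "item2": [5, 7, 2, 6, 3, 4],
    "item3": [6, 6, 3, 7, 1, 5],
    "unrecoded": [3, 2, 5, 1, 6, 4],  # runs opposite to the other items
}

overall = cronbach_alpha(list(scale.values()))
dropped = alpha_if_item_deleted(scale)

# A potential "bad item" is one whose removal would make alpha go up.
flagged = [name for name, value in dropped.items() if value > overall]
print(flagged)  # ['unrecoded']
```

In this made-up case the overall alpha is very poor, and removing the unrecoded item is the only change that improves it, which is exactly the pattern the last column of the SPSS table is meant to reveal.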
Interpreting the Output
We have two goals when we look at our output for Cronbach’s alpha.
1. Make sure that our alpha is “good”.
2. Make sure we don’t have any “bad items”.
Interpreting the value of alpha can be somewhat subjective because people may have different
ideas about what’s a “good enough” value for alpha. Just remember that the closer alpha is to 1,
the better. In psychology, people generally think of an alpha that is higher than .80 as “good
enough” and an alpha between .70 and .80 as OK. Anything below .70 may be a reason to
worry.
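As a small illustration, the rules of thumb above can be written as a simple lookup. This is only a sketch of the conventions described in this chapter; the function name is made up, and the cutoffs are guidelines rather than strict rules.

```python
def describe_alpha(alpha):
    """Label an alpha value using this chapter's rules of thumb."""
    if alpha > 0.80:
        return "good enough"
    if alpha >= 0.70:
        return "OK"
    return "possible reason to worry"


print(describe_alpha(0.846))  # good enough
```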
If you find any potential “bad items” (items whose deletion would increase alpha),
go back to your scale and try to figure out what might be going on with that item. Should it be
reverse coded if it wasn’t? If it was reverse coded, was that a mistake? If there’s no coding
problem, does it seem like the item might be hard to understand or is measuring something
different than the rest of the scale? Sometimes, researchers delete such “bad items” from the
scale for the sake of higher reliability. This is a judgment call that has to be made by the
researcher (often, after years of practice).
Now what?
If you’ve decided that your alpha is “good enough” for you, the next step is usually to compute
an average score for each participant. This is because what we really want to know is a
person’s average meaning in life rather than their responses to 5 different items. Having an
internally reliable set of items tells you that it’s ok to average people’s responses on the items
because they each measure the same thing.
You may be wondering why researchers use multiple items in the first place if you have to mess
with all of this alpha stuff and then average the items into a single number anyway. The reason
has to do with validity. We know that asking one single item is not always a very good way to
accurately measure what we’re interested in. This is based on the same logic as taking multiple
exams in a single class over the course of a semester. Imagine if you only took one test in this
class at the end of the semester. That might not be the most accurate way to assess how much
you know about the topic – what if you are just having a bad day? In the same way that taking
multiple tests makes our assessment of your knowledge of the topic more accurate, so does
asking multiple items about one thing.
To compute a mean, click on “transform” and then click on “compute variable” as seen below:
This will bring up the following window:
In the Target Variable box you should type a name for the average score. For example,
“avgMIL” for average meaning in life.
Next, you will want to tell SPSS how to calculate the mean by putting a command in the
“Numeric Expression” box. To compute a mean, the command is Mean(item1,item2,item3…).
So type “mean(“ in the numeric expression box. Next highlight the first item and click the arrow
to move it into the box. Continue this for each item, separating the items with a comma. Be sure
to include ONLY the recoded version of item 5 (e.g., MIL5r), not the original MIL5. When you have
done this for each item, type in the closing parenthesis “)”. Your screen should look like this:
Now click OK and then navigate to the data view window to make sure your new variable is
there; it can be found at the end of the data set. Spot-check your new average variable (e.g., is
the new variable the correct average of the five items for Participant 1?). You will be able to
use this variable in many of the analyses you will learn about during the rest of this semester. For
example, you might use a t-test to compare meaning in life between 2 groups or you might use
a regression analysis to predict meaning in life from some predictor. Exciting times are ahead!
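The averaging step described above can be sketched in Python as follows. The variable names MIL1 through MIL5r and avgMIL follow the naming suggested in this chapter, but the responses are made up for illustration and are not the chapter’s data set.

```python
# Items that make up the scale. Only the recoded version of item 5 is used.
scale_items = ["MIL1", "MIL2", "MIL3", "MIL4", "MIL5r"]

# Made-up responses for three participants.
participants = [
    {"MIL1": 6, "MIL2": 7, "MIL3": 6, "MIL4": 5, "MIL5r": 6},
    {"MIL1": 2, "MIL2": 3, "MIL3": 2, "MIL4": 1, "MIL5r": 2},
    {"MIL1": 4, "MIL2": 5, "MIL3": 4, "MIL4": 4, "MIL5r": 3},
]

# Add an average score for each participant.
for person in participants:
    person["avgMIL"] = sum(person[item] for item in scale_items) / len(scale_items)

print([person["avgMIL"] for person in participants])  # [6.0, 2.0, 4.0]
```

Working through one participant by hand like this is the same spot-check recommended above for the SPSS result.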
Practice Problem (answers in appendix)
In this problem you will assess the reliability of the openness to experience subscale of The Big
Five Inventory (John, Donahue, & Kentle, 1991). People who are high in openness to
experience are more intellectually curious and willing to think about different ideas. This
measure is designed to tap the extent to which somebody is open to experiences.
Here is what the scale looks like:
Please choose a number next to each statement to indicate the extent to which you agree or
disagree with that statement.
Disagree Strongly   1   2   3   4   5   Agree Strongly
1. I see myself as someone who is original, comes up with new ideas
2. I see myself as someone who is curious about many different things
3. I see myself as someone who is ingenious, a deep thinker
4. I see myself as someone who has an active imagination
5. I see myself as someone who is inventive
6. I see myself as someone who values artistic, aesthetic experiences
7. I see myself as someone who prefers work that is routine
8. I see myself as someone who likes to reflect, play with ideas
9. I see myself as someone who has few artistic interests
10. I see myself as someone who is sophisticated in art, music, or literature.
The dataset “chapter3problem1.sav” includes the responses of 100 people to these 10
questions.
A. Read the items carefully and look for any reverse coded items (hint: there are two
items). Use SPSS to reverse code the items you believe are reverse coded – write
below which items you reverse coded.
B. Use SPSS to calculate an alpha for this measure. What is the alpha?
C. Do you see any potential “bad items”? What do you think might explain these bad items?
D. Calculate a mean for the openness scale then use the descriptive procedure to find the
range, mean, and standard deviation for the scale. Write these values below.