Administering Panel Surveys on Amazon Mechanical Turk:
A Guide to Within-Subjects Experiments
Kyle A. Dropp1
January 13, 2015
Comments Welcome!
This document provides step-by-step instructions for implementing panel surveys embedded
with experiments. Data analysis will be conducted in R, surveys will be programmed in
Qualtrics, and cases will be collected via Amazon Mechanical Turk (AMT). This tutorial
assumes very limited knowledge or familiarity with R, Qualtrics, Amazon Mechanical Turk,
HTML or server applications.2 All supporting materials are available at kyleadropp.com/panels.
1 Assistant Professor, Department of Government, Dartmouth College, [email protected]. I thank Solomon Messing for providing information on MTurkR - http://solomonmessing.wordpress.com/2013/06/24/streamline-your-mechanical-turk-workflow-with-mturkr/
2 More advanced applications require basic knowledge of PHP, SQL and MySQL.
Contents
1. Purpose and Outline
2. Preliminaries: Installing R, Accessing Qualtrics / Amazon Mechanical Turk
2.1. Installing R
2.2. Logging into Qualtrics
2.3. Logging into Amazon Mechanical Turk
3. Administering The First Survey Wave
3.1. Programming Survey in Qualtrics
3.1.1. Importing Survey
3.1.2. Foreign Direct Investment Experiment
3.1.3. Confirmation Code
3.1.4. Obtaining Worker ID
3.2. Programming in Amazon Mechanical Turk
3.3. Fielding Wave 1
3.3.1. Evaluating Results from Wave 1
4. Administering The Second Survey Wave
4.0.2. Import Wave 2 into Qualtrics
4.0.3. Branch Logic and Wave 2 Treatment
4.1. Managing Wave 2 from R
4.1.1. Install MTurkR
4.1.2. Enter Amazon Mechanical Turk credentials
4.1.3. Invite Wave 1 respondents to Wave 2
4.1.4. Compensating Respondents
5. Data Analysis
Materials: kyleadropp.com/panels
1. Purpose and Outline
This document provides step-by-step instructions for implementing panel surveys embedded
with experiments. Panels are a powerful tool involving multiple interviews with the same
respondent that can identify changes in public opinion (or behavior) over time, improve
conclusions made from one-time, between-subjects experiments, increase the precision of
statistical estimates and address important methodological questions. This tutorial provides
step-by-step instructions for implementing multiple, complex treatments across survey waves.
The specific experiment in this tutorial involves a newspaper article depicting a purchase of
a battery maker by a foreign country in the first survey and then varies the foreign country
(but keeps everything else the same) in the second survey wave. The goal of this experiment
is to determine whether Americans are more likely to favor investments by some foreign
countries but oppose similar investments from other countries.
Data analysis will be conducted in R, surveys will be programmed in Qualtrics, and cases
will be collected via Amazon Mechanical Turk (AMT). This tutorial assumes very limited
knowledge or familiarity with R, Qualtrics, Amazon Mechanical Turk, HTML or server
applications.3 All supporting materials are available at kyleadropp.com/panels
This document is organized in the following manner. First, it includes preliminary information concerning R, Qualtrics and Amazon Mechanical Turk. Second, it provides information
for conducting the first survey wave, such as programming the survey instrument in Qualtrics,
creating an assignment in Amazon Mechanical Turk (AMT), and inviting respondents to the
survey. Third, it provides information for conducting the second survey wave, such as linking
respondents across surveys, programming the survey instrument, creating the assignment,
inviting respondents to the survey, and compensating respondents. Finally, we analyze the
data using statistical software.
3 More advanced applications require basic knowledge of PHP, SQL and MySQL.
2. Preliminaries: Installing R, Accessing Qualtrics / Amazon Mechanical Turk
2.1. Installing R. We will use the freely available R statistical software in this tutorial.
Instructions on how to download R are provided at the following websites:
• Windows - http://cran.r-project.org/bin/windows/base/
• MacOS X - http://cran.r-project.org/bin/macosx/
– Download ‘R-3.1.1-snowleopard.pkg’ if you have Mac OS X 10.6 to Mac OS X
10.8 and download ‘R-3.1.1-mavericks.pkg’ if you have Mac OS X 10.9 or higher.
You can find out your Mac version by clicking the Apple icon on the top left
of your desktop, then clicking ‘About this Mac,’ and then looking at the Version
number.
I strongly encourage you to download RStudio, a user-friendly interface for R. RStudio is
an integrated development environment for R with separate panels for the script, console,
datasets and plots. You will be able to execute all programming code within this environment.
Here are instructions for installing RStudio:
• Go to the ‘Installers’ section at http://www.rstudio.com/products/rstudio/download/.
As of September 13, 2014, the latest version for Mac OS X is titled ‘RStudio 0.98.1056
- Mac OS X 10.6+ (64-bit)’ and the latest version for PC is ‘RStudio 0.98.1056 - Windows XP/Vista/7/8.’
2.2. Logging into Qualtrics. Qualtrics is an interface for programming surveys. Log in
to your Qualtrics account (e.g., tuck.qualtrics.com/ for Dartmouth College, http://stanforduniversity.qualtrics.com/ for Stanford University, princeton.qualtrics.com
for Princeton University). University affiliates should have free access to an account, but see
your department or university administrator if you have difficulty accessing Qualtrics.
2.3. Logging into Amazon Mechanical Turk. Go to https://www.mturk.com/mturk/welcome and select ‘Requester’ in the upper right corner. If you do not have an account,
select ‘Create an Account.’ Otherwise, select ‘sign in’ to access your
account.
3. Administering The First Survey Wave
In this section, we use Qualtrics to program a simple survey and Amazon Mechanical Turk
to collect interviews.
3.1. Programming Survey in Qualtrics. Prior to collecting cases via Amazon Mechanical Turk (AMT), we will program a survey in Qualtrics, an interface for programming
surveys. See the previous section for Qualtrics login information.
3.1.1. Importing Survey. Please log in to Qualtrics. On the ‘Edit Survey’ tab, select ‘Advanced Options’ on the far right, then select ‘Import Survey,’ and browse to the file ‘panel wave1 tutorial.qsf’
in the supporting materials folder (located at http://kyledropp.weebly.com/panels.html).
Choose the file and then select ‘Import’ (see Figure 1 for the survey import). This
survey is the first wave of the study and includes a basic experiment on how Americans’
preferences toward foreign direct investment vary based on the foreign country making the
investment.
Figure 1. Import Survey into Qualtrics
3.1.2. Foreign Direct Investment Experiment. Respondents view a fake newspaper article
describing a battery making company that has been acquired. By random draw, the company
purchasing the battery maker is Japanese, German, Chinese, or American. After viewing
the article, respondents state whether they favor or oppose the acquisition. See Figure 2
below for a picture of the newspaper article in the ‘Japanese’ condition:
Figure 2. Foreign Direct Investment Experiment - Survey 1
In a subsequent question battery, respondents state whether they believe the acquisition will
harm national security, lead to job losses, or harm American culture and values. See Figure
3.
Figure 3. Question Battery Evaluating Foreign Direct Investment - Survey 1
Select ‘Survey Flow,’ the third option from the left on the ‘Edit Survey’ tab, and scroll
down to a Web Service item. Figure 4 displays a variable called ‘randCountry’ that is
a random integer between 1 and 4. When ‘randCountry’ equals 1, the respondent sees
an article with Germany purchasing the battery maker. The other values (‘randCountry’ = 2, 3, or 4) display the Chinese, Japanese, and American treatments, respectively.
See Figures 4 and 5.
Figure 4. Survey Flow - Creating the ‘randCountry’ Variable
Figure 5. Survey Flow - Assigning Values to the ‘randCountry’ Variable
In Wave 2, the respondent will view a similar article but will be randomly assigned to another
value of ‘randCountry.’ That is, the respondent who sees the Germany treatment in Wave 1
will have an equal probability of viewing the American, Chinese, or Japanese treatment in
Wave 2. This design creates a powerful between-subjects and within-subjects analysis.
3.1.3. Confirmation Code. Respondents access this survey through Amazon Mechanical Turk
(see next section). To confirm that they have completed the survey, we generate a confirmation code between 5,000,000 and 9,999,999, place it at the end of the survey, and ask
respondents to enter the code into the AMT interface to receive payment. See Figure 6.
Figure 6. Creating Confirmation Code in Survey Flow
A corresponding text question at the end of the survey provides the confirmation code
generated from the web service. See Figure 7 below.
Figure 7. Confirmation Code Item in Survey
3.1.4. Obtaining Worker ID. Each worker on Amazon Mechanical Turk (AMT) has an ID
called a ‘Worker ID’ that we will obtain to link individuals across multiple survey waves. On
the ‘Edit Survey’ tab, select ‘Survey Flow.’ There is an empty embedded data variable called
‘MID’ in your Qualtrics survey flow that obtains each AMT Worker ID. When AMT workers
enter the survey, they will have an ‘MID’ appended to their URL and the empty embedded
data variable captures this value. When we are using Survey Sampling International (SSI),
an online panel vendor, the respondent identifier is typically called ‘psid’ or ‘pid’. See Figure
8 below.
Figure 8. Creating Variable for Worker ID
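To make the URL mechanism concrete, here is a minimal R sketch of how a Worker ID can ride along on the survey link as an ‘MID’ query parameter and be read back out. The survey link and Worker IDs are invented placeholders, and the helper `extract_mid` is a hypothetical name used only for this illustration; Qualtrics itself performs the capture through the embedded data field.

```r
# Hypothetical survey link and Worker IDs, for illustration only.
survey_link <- "https://example.qualtrics.com/SE/?SID=SV_xxxxxxxx"
worker_ids <- c("A1EXAMPLE", "A2EXAMPLE")

# Append each Worker ID as an 'MID' query parameter, one link per worker.
links <- paste0(survey_link, "&MID=", worker_ids)

# Recover the MID from a link, mimicking what the embedded data field stores.
extract_mid <- function(link) sub(".*[?&]MID=([^&]*).*", "\\1", link)
recovered <- extract_mid(links)

# The round trip should return the original IDs unchanged.
stopifnot(identical(recovered, worker_ids))
```

If the embedded data field in Survey Flow has a different name than the query parameter, the value will not be captured, so the two names need to match exactly.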
3.2. Programming in Amazon Mechanical Turk. Log in to Amazon Mechanical Turk
as a ‘Requester.’ If you are not familiar with AMT, there are many good tutorials.4 Enter
the ‘Create’ tab in Amazon Mechanical Turk, select ‘New Project,’ then select ‘Survey,’ and
then click ‘Create Project.’
4 http://docs.aws.amazon.com/AWSMechTurk/latest/AWSMechanicalTurkGettingStartedGuide/Welcome.html
Figure 9. Creating New Project in Amazon Mechanical Turk - Part 1
Here are the details I suggest entering for this project. Enter the title ‘Answer a short
survey’ and description ‘Answer a short, fun survey’ and keywords ‘surveys.’ Set the reward
to $0.20, assign 100 HITs, allot 1 hour to complete, have the HIT expire in 6 hours, and
have the results automatically approved in 6 hours. See Figure 10.
Figure 10. Creating New Project in Amazon Mechanical Turk - Part II
Select the ‘Advanced’ tab on the bottom right, select ‘Worker requirements,’ click ‘Customize
Worker Requirements..’ in the drop down menu, specify Location as ‘United States,’ HIT
Approval rate ‘greater than or equal to 95’, and Number of HITs Approved ‘greater than or
equal to 100.’
Figure 11. Creating New Project in Amazon Mechanical Turk - Part III
Select ‘Design Layout’ to move to the next pane and click ‘OK’ when prompted regarding Master
Workers. You will now add a brief description of the survey, a link to the survey in Qualtrics,
and a confirmation code for the AMT respondent to enter upon completion of the Qualtrics
survey.
Open the ‘AMT wave1.html’ file in the supporting materials folder online. Please note that
you should open this file in a text editor such as TextEdit or TextWrangler rather than in
a browser window. Return to the Amazon Mechanical Turk web page and click ‘Source’ on
the right side of the ‘Design Layout’ page. Paste the entire HTML file into the body. This
file includes code with a brief description of the project, commands to extract the AMT
respondent’s unique Worker ID, and a confirmation code. Modify lines 1, 4, and 5 that
pertain to survey length and eligibility. In line 28, modify the quoted ‘href’ portion to the
full Qualtrics link for the survey, which you can find by clicking on the ‘Distribute Survey’
tab in Qualtrics.5
If you click ‘Source’ again, the body should look like this (See Figure 13 below) and contain a
warning message regarding JavaScript. You can ignore this warning message. Click ‘Preview’
to preview the HIT (See Figure 14 below) and click ‘Finish’ to finish the programming portion
of the HIT.
5 Thanks to these scholars for the HTML code - http://www.academia.edu/1803170/Screening_participants_from_previous_studies_on_Amazon_Mechanical_Turk_and_Qualtrics
Figure 12. Design Layout in Amazon Mechanical Turk - HTML source
Figure 13. Design Layout in Amazon Mechanical Turk
3.3. Fielding Wave 1. Now click the ‘Create’ tab in AMT, find your Project, select ‘New
Batch,’ click ‘Next,’ and then click ‘Publish HITs.’ You may need to add funds to your
account. Your HIT is now live and you are collecting data! You can review the status of
your project by clicking the ‘Manage’ tab. There are a number of methods for reviewing and
approving completed HITs. From the least to most automated: you can select the ‘Results’
tab on a given project and review the confirmation codes for completed HITs; you can merge
a .csv of the results from Amazon with your Qualtrics .csv file (on the ‘confCode’ variable)
and check whether the confirmation code is correct; or you can use the package ‘MTurkR’
to batch approve respondents who have provided the correct confirmation code.
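The merge-and-check option can be sketched in a few lines of R. This is an illustration on simulated data, not the author's script: the column names ‘WorkerId’, ‘AssignmentId’ and ‘enteredCode’ are assumptions about the AMT export, and the merge here is on Worker ID rather than on ‘confCode’ itself, so that mismatched codes remain visible instead of silently dropping out.

```r
# Simulated stand-ins for the two downloads. Only 'MID' and 'confCode'
# come from the tutorial; the other column names are assumptions.
amt <- data.frame(WorkerId = c("A1", "A2", "A3"),
                  AssignmentId = c("as1", "as2", "as3"),
                  enteredCode = c(5123456, 7000001, 9999999),
                  stringsAsFactors = FALSE)
qualtrics <- data.frame(MID = c("A1", "A2", "A3"),
                        confCode = c(5123456, 7000001, 6000000),
                        stringsAsFactors = FALSE)

# Keep every AMT submission, attaching the code Qualtrics generated.
checked <- merge(amt, qualtrics, by.x = "WorkerId", by.y = "MID", all.x = TRUE)

# A submission passes only if a survey record exists and the codes agree.
checked$codeOK <- !is.na(checked$confCode) &
  checked$enteredCode == checked$confCode

# Assignment IDs that could be approved; the rest warrant a manual look.
approve <- checked$AssignmentId[checked$codeOK]
approve
```

Workers who mistype a code or omit their ID will show up with `codeOK` equal to FALSE, which makes them easy to review by hand before rejecting anything.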
3.3.1. Evaluating Results from Wave 1. After you have finished Wave 1 data collection,
log in to Qualtrics, select ‘View Results,’ click ‘Download Data,’ then scroll down and click the
highlighted CSV to download a .csv. See Figure 16.
The key variables in this file are the Worker ID (‘MID’), the Wave 1 treatment assignment
(‘randCountry’), the confirmation code (‘confCode’), support for the acquisition (‘Q16’), and
post-test mechanism questions (e.g., Q25 1 through Q25 5).
Figure 14. Design Layout in Amazon Mechanical Turk - Preview
Figure 15. Survey Progress in Manage Tab
Using R,6 I randomized the Wave 2 treatment assignment (‘randCountry2’) so that each respondent receives a treatment different from the one seen in Wave 1. The resulting file has
been saved as ‘wave2_assignment.csv’ online. The R code file has been saved as ‘wave2 assignment.R.’
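## Read file with completed Wave 1 responses from the supporting materials site.
## The export has two header rows: take the variable names from the first row,
## then read the data itself, skipping both header rows.
df0 = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=0, header=TRUE)
df1 = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=2, header=FALSE)
names(df1) = names(df0)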
6 R can be downloaded here - http://cran.us.r-project.org/
Figure 16. Download Survey Results from Qualtrics
## Add Randomization to Wave 2
df1$randCountry2 = rep(NA, nrow(df1)) ## create empty variable
conds = 1:4 ## list of possible treatments
for(i in seq_len(nrow(df1))){ ## loop through Wave 1 rows
  hline1 = df1$randCountry[i] ## identify Wave 1 treatment
  ## sample from the non-Wave 1 treatments
  df1$randCountry2[i] = sample(conds[conds != hline1], 1)
}
df1$randCountry2
write.csv(df1[order(df1$randCountry2),], "wave2_assignment.csv", row.names=FALSE) ## specify location
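Before uploading the assignments, it may be worth confirming that the randomization behaves as intended. The following sketch repeats the same sampling idea on simulated Wave 1 treatments (the sample size and seed are arbitrary choices for the illustration) and checks that nobody receives the same treatment twice.

```r
set.seed(1)  # arbitrary seed so the illustration is reproducible

# Simulated Wave 1 treatments for 200 hypothetical respondents.
wave1 <- sample(1:4, 200, replace = TRUE)
conds <- 1:4

# For each respondent, draw one of the three treatments not seen in Wave 1.
wave2 <- vapply(wave1, function(w) sample(conds[conds != w], 1), integer(1))

# No respondent should repeat a treatment.
stopifnot(all(wave2 != wave1))

# Cross-tabulation: the diagonal should be empty and the off-diagonal
# cells within each row should be roughly balanced.
table(wave1, wave2)
```

Calling `set.seed()` before the real randomization as well would make the assignment file reproducible, which is an optional addition rather than part of the original script.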
In an ideal world, we would wait a few weeks to administer Wave 2 of this panel. For this
tutorial, I will administer it less than 24 hours after Wave 1. Let’s get to it.
4. Administering The Second Survey Wave
This section will provide instructions for administering Wave 2. See the Appendix (or message me) for methods for administering Wave 2 treatments using servers and MySQL.
4.0.2. Import Wave 2 into Qualtrics. Log in to Qualtrics and upload the survey ‘panel tutorial wave2.qsf’
in the same way you uploaded the first survey at the beginning of the tutorial (i.e., ‘Edit Survey,’ ‘Advanced Options,’ ‘Import Survey’). This survey contains the same questions and question
blocks as Wave 1.
4.0.3. Branch Logic and Wave 2 Treatment. On the ‘Edit Survey’ tab, select ‘Survey Flow.’
Scroll down to see the new branch logic. Open the file ‘wave2_assignment.csv’, which is sorted
by Wave 2 treatment assignment (‘randCountry2’) and also includes the AMT Worker ID.
I have copied the AMT Worker IDs associated with each unique treatment into the logic of
the Survey Flow.
Figure 17. Survey Flow in Survey 2
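Copying Worker IDs by hand is error-prone with more than a handful of respondents, so one option is to let R build the list for each treatment. The sketch below uses a small simulated stand-in for the assignment file with invented Worker IDs; the comma separator is chosen purely for illustration, since the separator Qualtrics needs depends on how the branch condition is written.

```r
# Simulated stand-in for the assignment file (hypothetical Worker IDs).
assignment <- data.frame(MID = c("A1", "A2", "A3", "A4", "A5"),
                         randCountry2 = c(1, 2, 2, 4, 1),
                         stringsAsFactors = FALSE)

# Group Worker IDs by Wave 2 treatment and collapse each group into a
# single string that can be pasted into the corresponding branch.
id_lists <- sapply(split(assignment$MID, assignment$randCountry2),
                   paste, collapse = ",")
id_lists
```

Treatments with no assigned respondents simply do not appear in the result, which is a quick way to spot an empty cell before fielding.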
We have assigned embedded data variables to each of the four possible Wave 2 treatment
assignments. Now, when respondents enter the survey we will link their ‘MID’ to their Wave
2 treatment group. Next, we use branching logic to assign respondents to an appropriate
treatment. See Figure 18 where we have created the variables ‘headline’ and ‘country’ that
will appear in the newspaper article.
Figure 18. Branch Logic for Wave 2 Assignment
4.1. Managing Wave 2 from R. This section utilizes R to invite respondents who took
the first survey to the second survey wave. Then, it uses R to compensate these survey
respondents.
4.1.1. Install MTurkR.
library(devtools)
## install.packages("MTurkR") if you don't have the package installed
library(MTurkR)
4.1.2. Enter Amazon Mechanical Turk credentials. You must provide two unique identifiers,
your AWS Access Key ID and your AWS Secret Access Key, to control AMT in R.
## Sign into Amazon Mechanical Turk with AWS Access Key ID (‘xxxx’)
## and AWS Secret Access Key (‘yyyy’)
credentials(c("xxxx","yyyy"))
AccountBalance()
Here is how you acquire your AWS Access Key ID. First, go to http://aws.amazon.com/,
select ‘Security Credentials’ in the top right, sign into your account, click ‘Continue to Security
Credentials,’ click the ‘Access Keys’ tab, and you will see your Access Key ID. Copy this into
the ‘xxxx’ portion of the credentials command. This is the first value. Now, to find your AWS
Secret Access Key, select ‘Security Credentials’ in the yellow box below your Access Key
ID, select ‘Access Credentials,’ and click ‘Show’ under Secret Access Key. This is the AWS
Secret Access Key. Copy this into the ‘yyyy’ portion of the credentials command.
Figure 19. Security Credentials for Amazon Mechanical Turk
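Because the two keys grant access to your account, you may prefer not to type them into a script that will be shared. One possible pattern, sketched below, reads them from environment variables instead. The variable names `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` follow the usual AWS convention but are an assumption here, not something the tutorial's materials require, and the call to credentials() is left commented out so the snippet runs without MTurkR loaded.

```r
# Read the two keys from environment variables (assumed names) rather
# than hard-coding them in the script.
keys <- c(Sys.getenv("AWS_ACCESS_KEY_ID"),
          Sys.getenv("AWS_SECRET_ACCESS_KEY"))

# Sys.getenv() returns "" for unset variables, so check before using them.
if (any(keys == "")) {
  message("Set both environment variables before calling credentials().")
}

## credentials(keys)  # uncomment once MTurkR is loaded and the keys are set
```

The variables can be set for the current session with Sys.setenv() or placed in a private startup file that is kept out of shared folders.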
4.1.3. Invite Wave 1 respondents to Wave 2. Now, you will contact workers to invite them to
the Wave 2 panel. Use the ContactWorker() command to send a message to each respondent,
with a subject line and a body that mentions the bonus. A typical subject line is “Thanks for
completing my HIT!”, and a typical body is
“I will pay a $0.20 bonus if you complete a brief follow-up study. The survey
can be completed at http://tuck.qualtrics.com/SE/?SID=SV_ePSM9ygoprOmMkZ&MID=xxxxx”
Please note the MID is the unique Amazon Mechanical Turk ID for the respondent. Here is
the appropriate R code (using the same Wave 1 data, here stored in the data frame df). See the ‘Distribute Survey’ link in
Qualtrics to obtain the correct Qualtrics survey link. Use the paste command to append
each respondent’s ‘MID’ to the survey link. This code is available
in the file ‘invite workers.R’ in the supporting materials.
df0 = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=0, header=TRUE)
df = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=2, header=FALSE)
names(df) = names(df0)
df = df[as.character(df$MID) != "",] ## drop rows without a Worker ID
## subject line, shared by all messages
subject <- "Complete a brief follow-up survey for a $0.20 bonus!"
## one message body per worker, each ending in that worker's MID
body <- paste("The survey can be completed at ",
"http://tuck.qualtrics.com/SE/?SID=SV_6LOw3p5hGeaF9uR&MID=",
df$MID, sep="")
workerIds <- as.character(df$MID)
d <- ContactWorker(subjects=subject, msgs=body, workers=workerIds)
Your screen will look like this when you run the command.
Figure 20. Output From Contact Worker Command
Respondents will now take Wave 2 of the survey. Download the results from the ‘View Results’ tab after a sufficient number of individuals have responded. Typically, I have received
Wave 2 response rates of 70% or higher. The Wave 2 data file is saved as ‘data_wave2.csv.’
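The Wave 2 response rate can be computed directly from the Worker IDs in the two data files. This is a minimal sketch on invented IDs; in practice the two vectors would come from the ‘MID’ columns of the Wave 1 and Wave 2 downloads.

```r
# Hypothetical Worker IDs from each wave; one worker submitted Wave 2 twice.
wave1_ids <- c("A1", "A2", "A3", "A4", "A5")
wave2_ids <- c("A2", "A4", "A5", "A5")

# A Wave 1 respondent counts as retained if the ID appears in Wave 2 at all,
# so duplicate submissions do not inflate the rate.
returned <- wave1_ids %in% unique(wave2_ids)
response_rate <- mean(returned)
response_rate
```

Wave 2 IDs that never appeared in Wave 1, for example from a forwarded link, can be listed with `setdiff(wave2_ids, wave1_ids)` and excluded before analysis.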
4.1.4. Compensating Respondents. After the worker has submitted the job, or HIT, you must
pay him or her using the GrantBonus() command. First, in the ‘Manage’ tab in AMT, select
‘Results’ on your latest batch of HITs, click ‘Download,’ then click ‘here’ on the Download
Batch Results page to download an individual-level file with completed HITs.
The file includes the ‘AssignmentId’ column, along with a ‘WorkerId’ column. Merge this
dataset with the Wave 2 results file on the ‘MID’ variable to determine which respondents
successfully entered Wave 2. Then, using AMT Worker IDs and Assignment IDs, send a
bonus to the workers.
wave1AMT = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/wave1_amt_results.csv"), skip=0, header=TRUE)
wave2Data = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave2.csv"), skip=0, header=TRUE)
## keep Wave 2 respondents who match a Wave 1 AMT assignment
merge1 = merge(wave2Data, wave1AMT, by.y="WorkerId", by.x="MID")
dim(merge1)
a1 <- as.character(merge1$MID) ## Worker IDs
b1 <- as.character(merge1$AssignmentId) ## Wave 1 Assignment IDs
c1 <- ".20" ## bonus amount in dollars
d1 <- "Thanks for your great work on my HIT! I really appreciate it!"
## Uncomment the next line to send the payments
##GrantBonus(workers=a1,assignments=b1,amounts=c1,reasons=d1)
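Since bonuses cannot easily be undone once sent, a short check before uncommenting the GrantBonus() line may be useful. The sketch below, on simulated data with invented IDs, drops repeated workers so that nobody is paid twice and totals the bonus amount; it does not account for any fees AMT may add.

```r
# Simulated merged file: worker A2 appears twice (e.g., a repeated submission).
merged <- data.frame(MID = c("A1", "A2", "A2", "A3"),
                     AssignmentId = c("as1", "as2", "as2", "as3"),
                     stringsAsFactors = FALSE)

# Keep one row per worker so each person receives a single bonus.
to_pay <- merged[!duplicated(merged$MID), ]

# Total bonus outlay at $0.20 per worker, excluding any AMT fees.
bonus <- 0.20
total_cost <- nrow(to_pay) * bonus
total_cost
```

Comparing `total_cost` against the account balance before paying avoids a batch that fails partway through.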
Figure 21. Download Completed HITs
5. Data Analysis
You now have two separate .csv files: one containing completed responses to Wave 1 and a second
containing responses to Wave 2. Merge the two files based on the respondent’s Amazon
Mechanical Turk ID and then start your between-subjects and within-subjects analysis.
Here is code to merge the files and examine the treatment assignments. A figure below
demonstrates that the Wave 1 treatment assignments differ from Wave 2 assignments.
df0 = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=0, header=TRUE)
wave1data = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave1.csv"), skip=2, header=FALSE)
names(wave1data) = names(df0)
wave2data = read.csv(url("http://kyledropp.weebly.com/uploads/1/2/0/9/12094568/data_wave2.csv"), skip=0, header=TRUE)
dim(wave1data)
dim(wave2data)
## suffix variable names by wave so they stay distinct after the merge
names(wave1data) = paste(names(wave1data), "_W1", sep="")
names(wave2data) = paste(names(wave2data), "_W2", sep="")
waves12 = merge(wave1data, wave2data, by.x="MID_W1", by.y="MID_W2")
## compare Wave 1 and Wave 2 headlines for ten respondents
cbind(waves12$headline_W1[11:20], waves12$headline_W2[11:20])
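As a starting point for the within-subjects analysis, the sketch below computes each respondent's change in support between waves and averages it by treatment pair. It runs on simulated data: the column names `randCountry_W1`, `randCountry_W2`, `support_W1` and `support_W2`, the 1 to 5 response scale, and the sample size are all assumptions made for the illustration rather than features of the actual data files.

```r
set.seed(2)  # arbitrary seed for a reproducible illustration
n <- 120

# Simulated treatments: Wave 2 always differs from Wave 1.
w1 <- sample(1:4, n, replace = TRUE)
w2 <- vapply(w1, function(w) sample(setdiff(1:4, w), 1), integer(1))

# Simulated support ratings on an assumed 1-5 scale.
panel <- data.frame(randCountry_W1 = w1,
                    randCountry_W2 = w2,
                    support_W1 = sample(1:5, n, replace = TRUE),
                    support_W2 = sample(1:5, n, replace = TRUE))

# Within-subject change in support from Wave 1 to Wave 2.
panel$change <- panel$support_W2 - panel$support_W1

# Confirm the design: no respondent saw the same country twice.
stopifnot(all(panel$randCountry_W1 != panel$randCountry_W2))

# Mean change for each (Wave 1, Wave 2) treatment pair.
change_by_pair <- aggregate(change ~ randCountry_W1 + randCountry_W2,
                            data = panel, FUN = mean)
head(change_by_pair)
```

With real data, a positive mean change for a given pair would indicate more support for the Wave 2 country than for the Wave 1 country among the same respondents; with the random ratings simulated here the values carry no substantive meaning.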