Final Presentation

Students: Ilya Paskhover, Itay Gal
Supervisors: Oleg Rokhlenko, Nadav Golbandi
Definitions
▪ AMT - Amazon Mechanical Turk, a crowdsourcing internet marketplace that makes it possible to coordinate the use of human intelligence to perform computational tasks.
▪ HIT - Human Intelligence Task.
▪ Requester - a user who publishes HITs and pays for their completion.
▪ Worker - a user who completes HITs and gets paid for it.
Goals
▪ Build a framework that enables a researcher to run experiments to find the best possible slots for advertising on a web page.
Getting acquainted with Amazon Mechanical Turk.
Defining the HIT structure.
Generating a formatted HIT.
Sending the HIT using the AMT API.
Receiving and displaying the HIT results.
Designing a GUI for the framework.
Workflow
[Workflow diagram: a New HIT is created by loading data from the XML files (templates.xml, images.xml, articles.xml, coolDowns.xml); Sending HIT involves hits.xml and Amazon MTurk; the HITs List leads to Receiving Results and Result Display.]
Used technologies
▪ AMT SDK
▪ Java
▪ XML
▪ HTML
▪ JavaScript
▪ Swing
Methodology
▪ HIT Structure
Every HIT consists of four screens shown to the worker one after another.
1) Instructions:
Explains to the worker in AMT what to do.
Methodology
▪ HIT Structure (cont.)
2) Article:
An article that the worker has to read. Each article, together with a random set of images, is arranged in a chosen template.
Template - A table of 10 rows and 10 columns, divided into boxes. Each box can be filled with a paragraph or an image. A box can range in size from 1×1 up to 10×10 cells (the whole table).
The requester can attach a question to each paragraph or image. The question is added automatically to the questionnaire.
Article - A set of paragraphs; each paragraph can carry a question.
Images - The database contains sets of images. Each set represents a different category, such as “animals” or “fruits”. For every HIT, a random set of images is selected in such a way that only one image is taken from each set.
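The image selection rule can be illustrated with a minimal Java sketch. The class name `ImagePicker`, the method `pickOnePerCategory`, and the file names are hypothetical and not taken from the framework. The sketch also assumes that "only one image from each set" means one image is drawn from every non-empty set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Hypothetical helper: picks one random image from every category set. */
final class ImagePicker {

    /**
     * Returns one image per non-empty category, in the iteration order of the map.
     * imageSets maps a category name (e.g. "animals") to its image file names.
     */
    static List<String> pickOnePerCategory(Map<String, List<String>> imageSets, Random rng) {
        List<String> picked = new ArrayList<>();
        for (List<String> set : imageSets.values()) {
            if (set.isEmpty()) {
                continue; // nothing to choose from in this category
            }
            picked.add(set.get(rng.nextInt(set.size())));
        }
        return picked;
    }

    public static void main(String[] args) {
        // Illustrative data only; the real sets would come from images.xml.
        Map<String, List<String>> sets = new LinkedHashMap<>();
        sets.put("animals", Arrays.asList("cat.jpg", "dog.jpg"));
        sets.put("fruits", Arrays.asList("apple.jpg", "pear.jpg", "plum.jpg"));
        System.out.println(pickOnePerCategory(sets, new Random()));
    }
}
```

Passing the `Random` instance in as a parameter makes the selection reproducible with a fixed seed, which may be useful when an experiment has to be repeated.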
Methodology (cont.)
▪ HIT Structure (cont.)
3) Cool down:
A screen with an unrelated task that can take from a few seconds up to a few minutes. Its purpose is to put a short gap between reading the article and answering the questions.
Methodology (cont.)
▪ HIT Structure (cont.)
4) Questionnaire:
A set of questions the worker has to answer. The set is built while the article is generated: whenever a box contains a question, that question is added to the questionnaire, and the questions are presented in that order.
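The way the questionnaire follows from the boxes can be sketched as below. This is an illustration only: the `Box` and `QuestionnaireBuilder` classes are hypothetical, and the sketch assumes that "that order" means the order in which the boxes are processed during article generation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of a template box: its content plus an optional question. */
final class Box {
    final String content;  // paragraph text or image file name
    final String question; // null when the requester attached no question

    Box(String content, String question) {
        this.content = content;
        this.question = question;
    }
}

/** Hypothetical builder that derives the questionnaire from the boxes. */
final class QuestionnaireBuilder {

    /** Collects the questions of the given boxes, preserving box order. */
    static List<String> build(List<Box> boxesInOrder) {
        List<String> questions = new ArrayList<>();
        for (Box box : boxesInOrder) {
            if (box.question != null) {
                questions.add(box.question);
            }
        }
        return questions;
    }

    public static void main(String[] args) {
        // Illustrative boxes; real content would come from the article and image data.
        List<Box> boxes = Arrays.asList(
            new Box("First paragraph", "What was the first paragraph about?"),
            new Box("cat.jpg", null),
            new Box("Second paragraph", "Which claim did the second paragraph make?"));
        System.out.println(build(boxes));
    }
}
```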
Methodology (cont.)
▪ AMT API
➢ Using the AMT SDK
We chose the Amazon Java SDK to interact with AMT. The SDK provides basic functionality for interacting with AMT, and Java lets us use Swing as the GUI framework.
▪ The HIT is formatted with HTML, which provides much more flexibility in designing the layout of the article and images.
Methodology (cont.)
▪ AMT API (cont.)
▪ AMT supports single-page HITs only. JavaScript is used to manipulate the HTML so that a single page behaves as a multi-page form.
▪ A worker should not be able to see the article and the images again after seeing the questionnaire. Using cookies, we identify the worker's state and prevent access to the article once the worker has moved on to the questionnaire page.
▪ Every time the requester opens a HIT's details, the results are downloaded from AMT and updated in our storage.
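One possible shape of the single-page, multi-screen approach is sketched below in Java, the language of the framework. This is not the project's actual code. The class `MultiScreenHit`, the cookie name `hitScreen`, and the generated markup are all assumptions. The generator wraps each screen in a hidden `div` and appends a small script that shows one screen at a time and records the furthest screen reached in a cookie, so reloading the page cannot bring the worker back to the article.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical generator: turns several screens into one HTML page with forward-only navigation. */
final class MultiScreenHit {

    // Client-side logic (JavaScript): show one screen at a time and remember the
    // furthest screen reached in a cookie, so a reload resumes there and
    // never returns to an earlier screen.
    private static final String SCRIPT = String.join("\n",
        "function furthest() {",
        "  var m = document.cookie.match(/(?:^|; )hitScreen=(\\d+)/);",
        "  return m ? parseInt(m[1], 10) : 0;",
        "}",
        "function show(i) {",
        "  var screens = document.querySelectorAll('.screen');",
        "  for (var k = 0; k < screens.length; k++) {",
        "    screens[k].style.display = (k === i) ? 'block' : 'none';",
        "  }",
        "  document.cookie = 'hitScreen=' + i + '; path=/';",
        "}",
        "function next() { show(furthest() + 1); }",
        "show(furthest());",
        "");

    /** Wraps each screen in a hidden div; every screen except the last gets a Next button. */
    static String render(List<String> screens) {
        StringBuilder html = new StringBuilder("<html><body>\n");
        for (int i = 0; i < screens.size(); i++) {
            html.append("<div class=\"screen\" id=\"screen").append(i)
                .append("\" style=\"display:none\">\n")
                .append(screens.get(i)).append('\n');
            if (i < screens.size() - 1) {
                html.append("<button type=\"button\" onclick=\"next()\">Next</button>\n");
            }
            html.append("</div>\n");
        }
        html.append("<script>\n").append(SCRIPT).append("</script>\n</body></html>\n");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(Arrays.asList(
            "<p>Instructions</p>", "<p>Article</p>", "<p>Cool down</p>", "<p>Questionnaire</p>")));
    }
}
```

The sketch uses a single cookie for simplicity. A real HIT would probably need a cookie name that is unique per HIT, so that state from one HIT does not carry over to the next.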
Methodology (cont.)
▪ GUI
▪ The GUI contains five screens that let the requester define the HIT structure and details and view the available HITs, including the results for each HIT.
▪ The GUI is built using Swing.
▪ Data storage
▪ Template definitions, available images, articles, cool downs and HITs (including all internal data) are saved in XML files in a well-formatted, easily readable form.
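As an illustration of loading such data, the sketch below parses an image catalogue with the standard Java DOM API. The XML layout (`images`, `set`, `image` elements with `category` and `file` attributes) and the class name `ImageSetLoader` are assumptions. The actual schema of images.xml is not shown in this presentation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical loader for an assumed images.xml layout. */
final class ImageSetLoader {

    /** Parses the XML text and returns category name -> image file names, in document order. */
    static Map<String, List<String>> loadFromString(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, List<String>> sets = new LinkedHashMap<>();
            NodeList setNodes = doc.getElementsByTagName("set");
            for (int i = 0; i < setNodes.getLength(); i++) {
                Element set = (Element) setNodes.item(i);
                List<String> files = new ArrayList<>();
                NodeList imageNodes = set.getElementsByTagName("image");
                for (int j = 0; j < imageNodes.getLength(); j++) {
                    files.add(((Element) imageNodes.item(j)).getAttribute("file"));
                }
                sets.put(set.getAttribute("category"), files);
            }
            return sets;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here.
            throw new IllegalArgumentException("Could not parse image sets", e);
        }
    }

    public static void main(String[] args) {
        // Assumed layout, for illustration only.
        String xml = "<images>"
            + "<set category=\"animals\"><image file=\"cat.jpg\"/><image file=\"dog.jpg\"/></set>"
            + "<set category=\"fruits\"><image file=\"apple.jpg\"/></set>"
            + "</images>";
        System.out.println(loadFromString(xml));
    }
}
```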
Completed goals
Framework requirements definition.
Data Structure design.
Creating HITs.
Interacting with the AMT API.
Loading and storing data using XML files.
Building a GUI.
Displaying basic processing of the results.
Documentation.
Ant Installer.
Conclusions
▪ The AMT API for Java offers very limited support for creating custom-designed HIT templates, so we decided to implement some parts of the HIT in HTML.
▪ AMT does not allow HITs with more than one page. Since we needed multi-screen HITs that prevent the worker from going back to a previous screen, we had to use JavaScript to present several screens within a single page.
Conclusions (cont.)
▪ Frequent meetings proved crucial for understanding the project requirements and adjusting the implementation accordingly.
▪ XML is easy to use and a convenient way to store structured data. Since it is a common standard, it is very well documented.
▪ Java Swing (used with Eclipse) is a simple and friendly framework for designing a GUI, and it is well documented.