PowerPoint - Advanced Defense Lab

22nd USENIX Security
(August, 2013)
Xinyu Xing, Wei Meng, Dan Doozan,
Georgia Institute of Technology
Alex C. Snoeren,
UC San Diego
Nick Feamster, and Wenke Lee,
Georgia Institute of Technology
Outline
• Introduction
• Overview and Attack Model
• Pollution Attacks on YouTube
• Google Personalized Search
• Pollution Attacks on Amazon

2013/9/3
A Seminar at Advanced Defense Lab
Introduction

Modern Web services increasingly rely on
personalization to improve the quality of
their customers’ experience.

Many services with personalized content
log their users’ Web activities.
This paper...

We demonstrate that contemporary
personalization mechanisms are
vulnerable to exploitation.
Our Attack

We show that YouTube, Amazon, and
Google are all vulnerable to the same
class of cross-site scripting attack, which
we call a pollution attack, that allows
third parties to alter the customized
content.

A distinguishing feature of our attack is
that it does not exploit any vulnerability
in the user’s Web browser.
Overview and Attack Model

The main instrument that a service
provider can use to affect the content
that a user sees is modifying the choice
set.

When a user issues a query, a service’s
personalization algorithm affects the
user’s choice set for that query.
Overview and Attack Model (cont.)

In this paper, we focus on how changes
to a user’s history can affect the choice
set, holding other factors fixed.

This attack requires three steps:
1. Model the service’s personalization
algorithm.
2. Create a “seed” to pollute the user’s history.
3. Inject the seed with a vector of false clicks.
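To illustrate how a change to the history alone can change the choice set, here is a minimal toy sketch. The catalog, the topic labels, and the `choice_set` ranking rule are all hypothetical and are not taken from any real service; the real algorithms are far more complex.

```python
from collections import Counter

# Hypothetical catalog mapping item -> topic (assumption, not from any real service).
CATALOG = {
    "cat_video_1": "cats",
    "cat_video_2": "cats",
    "news_clip_1": "news",
    "news_clip_2": "news",
    "target_item": "gadgets",
    "gadget_review": "gadgets",
}


def choice_set(history, k=2):
    """Return the k unseen items whose topic occurs most often in the history."""
    topic_counts = Counter(CATALOG[item] for item in history)
    unseen = [item for item in CATALOG if item not in history]
    # Rank by topic frequency in the history (descending), then by name for determinism.
    ranked = sorted(unseen, key=lambda item: (-topic_counts[CATALOG[item]], item))
    return ranked[:k]


clean_history = ["cat_video_1", "news_clip_1"]
# The "seed" here is gadget_review; the false clicks are modeled as extra history entries.
polluted_history = clean_history + ["gadget_review"] * 3

print(choice_set(clean_history))
print(choice_set(polluted_history))
```

In this toy model, all other factors are held fixed and only the history differs between the two calls, which mirrors the setting described on the slide.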
Pollution Attacks on YouTube

Personalization rules
• Considers only those videos that the user
watched for a long period of time
• Relies on similar viewing histories
• Does not recommend a video the user has
already watched
• Two of the suggested videos are recommended
based upon personalization
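The rules above can be sketched as a small model. This is only an illustration under stated assumptions: the 60-second threshold, the `co_watched` lookup table, and the function name are hypothetical, and treating every video in the log as "already watched" is one possible reading of the third rule.

```python
MIN_WATCH_SECONDS = 60   # assumed threshold for "a long period of time"
PERSONALIZED_SLOTS = 2   # from the slide: two suggested videos are personalized


def personalized_suggestions(watch_log, co_watched):
    """Pick personalized suggestions from a watch log.

    watch_log:  list of (video_id, seconds_watched) tuples.
    co_watched: dict mapping video_id -> videos watched by users with
                similar viewing histories (hypothetical lookup table).
    """
    # Rule 1: only videos watched for long enough count as history.
    counted = [vid for vid, secs in watch_log if secs >= MIN_WATCH_SECONDS]
    # Rule 3: never suggest something that already appears in the log.
    watched = {vid for vid, _ in watch_log}

    suggestions = []
    for vid in counted:
        # Rule 2: candidates come from similar viewing histories.
        for candidate in co_watched.get(vid, []):
            if candidate not in watched and candidate not in suggestions:
                suggestions.append(candidate)
    # Rule 4: only a fixed number of slots are personalized.
    return suggestions[:PERSONALIZED_SLOTS]


log = [("a", 200), ("b", 5), ("c", 90)]
related = {"a": ["c", "x"], "b": ["y"], "c": ["x", "z", "w"]}
print(personalized_suggestions(log, related))
```

In the example, video `b` was watched for only 5 seconds, so its related video `y` never becomes a candidate.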
Preparing Seed Videos
[Figure: video channel (C), ΩS, ΩT]
Inject Seed Videos

We see the video:
• http://www.youtube.com/user_watch?plid=<value>&video_id=<value>

We watch for a period of time:
• http://www.youtube.com/set_awesome?plid=<value>&video_id=<value>
Experimental Design
Account x Relationship
• New account, new relationship: two 3-minute videos (with about 65 sequentially watching)
• New account, existing relationship: 100 channels (in top 2000) X 25 videos
• Existing account (22 volunteers), existing relationship: channel OnlyyouHappycamp X 15 videos
Evaluation

We evaluated the effectiveness of our
pollution attacks by logging in as the
victim user and viewing 114
representative videos.
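One way to turn such an evaluation into a number is a promotion rate. The sketch below assumes the rate is the fraction of viewed pages whose suggestion list contains at least one target video; this definition, the function name, and the sample data are assumptions made for illustration, not details stated on the slide.

```python
def promotion_rate(suggestion_lists, target_videos):
    """Fraction of viewed pages whose suggestions include at least one target.

    suggestion_lists: one list of suggested video IDs per viewed video page.
    target_videos:    IDs of the videos being promoted.
    """
    if not suggestion_lists:
        return 0.0
    targets = set(target_videos)
    # A page counts as a hit if its suggestions overlap the target set.
    hits = sum(1 for suggestions in suggestion_lists if targets & set(suggestions))
    return hits / len(suggestion_lists)


# Hypothetical observations from four viewed pages.
pages = [["t1", "a"], ["b", "c"], ["t2"], ["d"]]
print(promotion_rate(pages, ["t1", "t2"]))
```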
Evaluation (New Accounts)
Successfully
• We computed:
• the Pearson correlation between the
showing frequencies and the lengths of the
target videos
○ 0.54 => medium
• the Pearson correlation between the
showing frequencies and the view counts of
the target videos
○ 0.23 => moderate
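For reference, the Pearson correlation used above can be computed as in the following sketch. The function is a plain textbook implementation; the sample numbers are hypothetical and are not the measurements behind the 0.54 and 0.23 values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centered products and squares; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: how often each target video was shown vs. its length in seconds.
showing_frequencies = [3, 5, 2, 8, 6]
video_lengths = [120, 300, 90, 600, 240]
print(round(pearson(showing_frequencies, video_lengths), 2))
```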
Evaluation (Existing Accounts)

For the existing channel OnlyyouHappycamp
• 14 of the 22 volunteers (64%)
• Ten of our volunteers shared their histories
• The majority of the videos recommended to
users for whom our attacks have low promotion
rates are longer and have higher view
counts than our target videos.
Google Personalized Search

We describe two classes of personalization
algorithms:
• contextual personalization
• persistent personalization
Identifying Search Terms

Contextual Personalization
• The keywords injected into a user’s search
history should be both relevant to the
promoting keyword and unique to the
website being promoted.
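The "unique to the website" criterion can be read as a set difference. The sketch below assumes each page is described by a list of keywords and keeps only those that no other page shares; the function name and the sample keyword lists are hypothetical, and the relevance criterion is not modeled here.

```python
def unique_keywords(promoted_keywords, other_pages_keywords):
    """Keep the promoted page's keywords that appear on no other page.

    promoted_keywords:    keywords describing the promoted page.
    other_pages_keywords: one keyword list per competing page.
    Comparison is case-insensitive; the original spelling is preserved.
    """
    seen_elsewhere = set()
    for keywords in other_pages_keywords:
        seen_elsewhere.update(k.lower() for k in keywords)
    return [k for k in promoted_keywords if k.lower() not in seen_elsewhere]


# Hypothetical keyword lists.
promoted = ["LED lamp", "solar charger"]
others = [["led lamp", "desk"], ["usb"]]
print(unique_keywords(promoted, others))
```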
Identifying Search Terms (cont.)

Persistent Personalization
• In this case, the size of the keyword set
should be larger than that used for a
contextual attack in order to have a greater
effect on the user’s search history.

An attacker can safely inject roughly 50
keywords a minute using cross-site
request forgery.
• We assume an attacker can inject at most 25
keywords into a user’s profile.
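Taken together, the two figures above imply a rough time budget. The arithmetic below is only a back-of-the-envelope sketch: it assumes the injection rate is constant, and the function name is hypothetical.

```python
KEYWORDS_PER_MINUTE = 50  # rate quoted on the slide
MAX_KEYWORDS = 25         # upper bound assumed on the slide


def injection_seconds(num_keywords, rate_per_minute=KEYWORDS_PER_MINUTE):
    """Seconds needed to inject num_keywords at a constant rate."""
    return num_keywords * 60 / rate_per_minute


print(injection_seconds(MAX_KEYWORDS))  # 30.0
```

Under that assumption, the 25-keyword bound corresponds to about half a minute of injection time.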
Contextual Personalization
[Figure: flow diagram. Surviving labels: 5,761 search terms from made-in-china.com; Google results; 30 URLs; 151,363 unique URLs; URLs having unique <meta> keywords; 2,136 URLs; 1,739 search terms]
2,136 URLs for Contextual
Personalization
Persistent Personalization
[Figure: flow diagram. Surviving labels: 551 search terms from made-in-china.com; Google results; 30 URLs; 151,363 unique URLs; URLs having unique Google AdWords keywords; 15,979 URLs]
Evaluation

Contextual Personalization
[Figure: evaluation results for contextual personalization. Surviving labels: 44%, 1.1%, 62.8%, 28%]
Evaluation (cont.)

Persistent Personalization
[Figure: evaluation results for persistent personalization. Surviving labels: 17%, 4.3%, 22.7%, ??%]
Evaluation (cont.)

Real Users
• 97.1% of our 729 previously successful
contextual attacks remain successful.
• Only 77.78% of the persistent pollution
attacks that work on fresh accounts achieve
similar success.
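As a quick sanity check on the first figure, the percentage can be converted back into an approximate count. The helper name below is hypothetical, and the result is an implied value obtained by rounding, not a number reported on the slide.

```python
def implied_count(percent, total):
    """Nearest whole number of cases that a percentage of a total corresponds to."""
    return round(total * percent / 100)


successful = implied_count(97.1, 729)
print(successful)                        # 708
print(round(100 * successful / 729, 1))  # 97.1
```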
Pollution Attacks on Amazon
Amazon tailors a customer’s homepage
based on the previous purchase,
browsing, and searching behavior of the
user.
• We focused on the personalized
recommendations Amazon generates
based on the browsing and searching
activities.

Amazon Recommendations

Amazon’s personalization is based on
history that is maintained by the user’s
web browser.
• Session cookie
Identifying Seed Products and Terms

Visit-Based Pollution
• The attacker visits the Amazon page of the
targeted product and retrieves the related
products that are shown on that page.

Search-Based Pollution
• An attacker could use a natural language
toolkit to automatically extract a candidate
keyword set from the targeted product’s
name.
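The keyword-extraction step can be sketched without any external toolkit. The version below is a simplified stand-in that uses a regular expression and a tiny hand-written stopword list in place of a full natural language toolkit; the stopword list, the function name, and the sample product name are all assumptions.

```python
import re

# Small hand-written stopword list (assumption; a real toolkit ships a larger one).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def candidate_keywords(product_name):
    """Extract lowercase candidate keywords from a product name.

    Drops stopwords, single-character tokens, and duplicates,
    while keeping the original word order.
    """
    tokens = re.findall(r"[a-z0-9]+", product_name.lower())
    keywords = []
    for token in tokens:
        if token in STOPWORDS or len(token) < 2 or token in keywords:
            continue
        keywords.append(token)
    return keywords


# Hypothetical product name.
print(candidate_keywords("The Art of Computer Programming, Volume 1"))
```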