ARE THESE ADS SAFE? DETECTING HIDDEN ATTACKS
THROUGH THE MOBILE APP-WEB INTERFACE
VAIBHAV RASTOGI, RUI SHAO, YAN CHEN, XIANG PAN, SHIHONG ZOU, AND RYAN RILEY
PRESENTED BY HELEN ZHAO
MOBILE SECURITY PROBLEM
 Mobile phones are everywhere
 Android holds about 80% of the worldwide mobile
market share
 But it is particularly susceptible to malware and
scams
ANDROID SECURITY
 Users are able to install apps from unverified
sources
 APKs from the internet
 Third Party App Stores
 Third party app stores are used almost
exclusively in China
 May not have as robust integrity and
security checks as the Google Play Store
ANDROID ATTACKS
 Trojans are the most common form of
attack, because application sandboxing makes
drive-by-download attacks difficult
 Trojans are applications that have some
useful function but also hide some
malicious behaviour
WHERE ARE THESE ATTACKS COMING FROM?
 Malicious applications
 Possibly benign, legitimate
applications that knowingly or
unknowingly host malicious ads
 This paper focuses on the second type
ADS IN ANDROID
 Many applications on app stores are free
 They rely on revenue from ads
 Ads are links
 Ads can come from ad networks/aggregators, such as Google Ads
 These ads are connected to the web, and hence form an app-web interface
HOW ARE ADS ADDED TO AN APP?
 The developer embeds the ad into the app statically
 Ad network code (e.g. an API call) that is responsible for
serving up ads is added to the app
REDIRECTION
 When you click on an ad…
 Ad networks generally don’t run independently: they often
bid against each other directly or through ad exchanges, or
sell/delegate ad space to each other
 This leads to an ad click being redirected many times, often
through all the different ad networks’ channels
 This is called the redirection chain
 The final ad page that the ad redirects to is called the
landing page
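A redirection chain can be pictured as repeatedly following the "next hop" until a page no longer redirects. The sketch below illustrates that idea only: it uses a hypothetical in-memory table of redirects (all domain names are invented) instead of real HTTP requests, and it is not the tool described in the paper.

```python
from typing import Dict, List


def follow_redirects(start: str, redirects: Dict[str, str],
                     max_hops: int = 20) -> List[str]:
    """Return the chain of URLs visited, starting at `start`.

    `redirects` maps a URL to the URL it redirects to (hypothetical data).
    Stops at a URL with no further redirect, on a loop, or after `max_hops`.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in redirects and len(chain) <= max_hops:
        current = redirects[current]
        if current in seen:  # redirect loop: stop instead of spinning forever
            break
        chain.append(current)
        seen.add(current)
    return chain


# Invented example: an ad click passes through two ad networks and an exchange.
HYPOTHETICAL_REDIRECTS = {
    "http://adnet-a.example/click": "http://exchange.example/bid",
    "http://exchange.example/bid": "http://adnet-b.example/r",
    "http://adnet-b.example/r": "http://shop.example/offer",
}

chain = follow_redirects("http://adnet-a.example/click", HYPOTHETICAL_REDIRECTS)
landing_page = chain[-1]  # the last URL in the chain is the landing page
```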
GOAL OF THE PAPER
 Analyse and understand mobile attacks
through ads via the app-web interface
How?
Creating an analysis tool, deploying it
for 2 months and analysing 600,000
applications in the US and in China
HOW THE TOOL WORKS
 The tool is an analysis framework that follows three steps
 Triggering the UI – clicking on all the web links
 Detecting malicious content in triggered pages
 Provenance – Determining where and from whom the malicious
content originated
HOW: TRIGGERING THE UI
 Can’t just use static analysis on the app to identify ad links as ad
networks dynamically load the ads
 Created an automated tool that ran applications in an emulator in a
virtual machine – dynamic app analysis
 Extracted features and code elements from the displayed UI and
constructed a hierarchy of the widgets within it, e.g. buttons, panels
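To make the widget hierarchy idea concrete, here is a minimal sketch of walking a UI tree and collecting the clickable widgets. The `Widget` class, its fields, and the widget names are hypothetical stand-ins for whatever the real tool extracts from the emulator.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Widget:
    """Hypothetical node in a UI hierarchy (not an Android API class)."""
    name: str
    clickable: bool = False
    children: List["Widget"] = field(default_factory=list)


def clickable_widgets(root: Widget) -> List[str]:
    """Depth-first walk returning names of clickable widgets in display order."""
    found = []
    stack = [root]
    while stack:
        widget = stack.pop()
        if widget.clickable:
            found.append(widget.name)
        # Push children reversed so the first child is visited first.
        stack.extend(reversed(widget.children))
    return found


# Invented screen: a root panel holding a button, a nested ad banner, a label.
screen = Widget("root_panel", children=[
    Widget("ok_button", clickable=True),
    Widget("inner_panel", children=[Widget("ad_banner", clickable=True)]),
    Widget("title_label"),
])

targets = clickable_widgets(screen)
```

Each name in `targets` would then be a candidate for an automated click.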
ISSUES WITH TRIGGERING UI
 Ran into issues with WebViews, as they
appear opaque in the flat UI hierarchy
 Used a graphics-based algorithm to find
clickable buttons/widgets
 Looks for convex, bounded
contours
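The slides do not give the graphics algorithm itself. As a rough illustration of the "convex contour" criterion only, the sketch below tests whether an already-extracted closed contour (a list of corner points) is convex by checking that it always turns in the same direction. It assumes a simple, non-self-intersecting contour and says nothing about how the authors extract contours from a screenshot.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def is_convex(contour: List[Point]) -> bool:
    """True if a simple closed polygon turns the same way at every corner."""
    n = len(contour)
    if n < 3:
        return False
    sign = 0
    for i in range(n):
        ax, ay = contour[i]
        bx, by = contour[(i + 1) % n]
        cx, cy = contour[(i + 2) % n]
        # z-component of the cross product of edges a->b and b->c
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            if sign == 0:
                sign = 1 if cross > 0 else -1
            elif (cross > 0) != (sign > 0):
                return False  # turn direction flipped: a concave corner
    return sign != 0  # all-collinear points do not bound a region


# A rectangular button outline versus an L-shaped region (invented shapes).
button_outline = [(0, 0), (4, 0), (4, 2), (0, 2)]
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
```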
CAPTURING INFORMATION
 To analyse the information, needed to capture and store:
 the links
 redirection chains
 landing pages
CAPTURING INFORMATION
 Redirection chains
 Created a custom browser that behaves as a user would
 Gets around time-based checks, e.g. those used by Google to prevent ad click fraud
 Landing Pages
 Landing pages were dynamically analysed, and every web link within
them was recorded and visited
 Often landing pages ask the user to download some file – potential trojans
 Stored and recorded any files downloaded
DETECTING MALICIOUS CONTENT
 Used information in the VirusTotal system to determine whether a
URL/file is malicious or not
 VirusTotal is a database that aggregates results from 50+ blacklists and
50+ anti-virus systems
 Checked all the URLs/downloaded files against VirusTotal
 Anti-virus systems are prone to false positives, so a file/URL needed to
be flagged by three different systems before the authors considered
it malicious
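The flagging rule amounts to a simple vote count. The sketch below assumes the per-scanner verdicts are already available as a dictionary; the scanner names are invented, no VirusTotal API is called, and reading "three" as "at least three" is an assumption.

```python
from typing import Dict

FLAG_THRESHOLD = 3  # from the slide; interpreted here as "at least three"


def is_malicious(verdicts: Dict[str, bool],
                 threshold: int = FLAG_THRESHOLD) -> bool:
    """True if at least `threshold` scanners flagged the URL or file."""
    flagged = sum(1 for hit in verdicts.values() if hit)
    return flagged >= threshold


# Hypothetical verdicts for two URLs (scanner names are placeholders).
url_one = {"scanner_a": True, "scanner_b": True, "scanner_c": False,
           "scanner_d": True}
url_two = {"scanner_a": True, "scanner_b": False, "scanner_c": False,
           "scanner_d": True}
```

With these inputs, `url_one` crosses the threshold and `url_two` does not, which is how a single false positive gets filtered out.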
PROVENANCE
 Once a malicious URL or file is detected, need to:
 Determine where the malicious content originated
 Find who is responsible for the malicious content
PROVENANCE
Two types of malicious ads:
 Ads redirecting to malicious landing pages
 Examine the redirection chain to find out who owns the URLs that finally
redirect to the malicious page
 Malicious links embedded in the application
 Need to find which block of code called the link
 Could be ad network code or developer code
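For the first case, provenance can be read off a recorded redirection chain by walking backwards from the malicious landing page. The following is a minimal sketch under that assumption; the URLs are invented, and mapping a host to its actual owner is left out.

```python
from typing import List
from urllib.parse import urlparse


def hosts_leading_to_landing(chain: List[str]) -> List[str]:
    """Hosts that redirected towards the landing page, nearest first.

    `chain` is a list of URLs ending with the landing page. The landing
    page's own host and repeated hosts are skipped.
    """
    if not chain:
        return []
    landing_host = urlparse(chain[-1]).hostname
    hosts = []
    for url in reversed(chain[:-1]):
        host = urlparse(url).hostname
        if host != landing_host and host not in hosts:
            hosts.append(host)
    return hosts


# Invented chain ending at a scam page.
recorded_chain = [
    "http://adnet-a.example/click",
    "http://exchange.example/bid?id=1",
    "http://adnet-b.example/r",
    "http://scam.example/win",
]

suspects = hosts_leading_to_landing(recorded_chain)
```

The first entry in `suspects` is the host that handed the user to the malicious page, so it is the natural starting point for assigning responsibility.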
FINDING THE RESPONSIBLE AD NETWORK
Method used:
 Identified loosely coupled libraries
 Clustered them according to their set of API endpoints
 Manually determined if a cluster was an ad network based on the
library
 Identified 201 unique ad networks
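One simple way to picture the clustering step is to group libraries whose sets of API endpoints are identical. The slides do not say how similarity was measured, so exact-match grouping is an assumption here, and the package names and endpoints are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def cluster_by_endpoints(libraries: Dict[str, Iterable[str]]) -> List[List[str]]:
    """Group library names that contact exactly the same set of endpoints."""
    clusters = defaultdict(list)
    for name, endpoints in libraries.items():
        clusters[frozenset(endpoints)].append(name)
    # Sort for a stable, readable result.
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical libraries found in different apps.
HYPOTHETICAL_LIBRARIES = {
    "com.example.adsdk.v1": ["ads.example/get", "ads.example/track"],
    "com.example.adsdk.v2": ["ads.example/track", "ads.example/get"],
    "org.sample.analytics": ["stats.sample/report"],
}

clusters = cluster_by_endpoints(HYPOTHETICAL_LIBRARIES)
```

Each resulting cluster would then be inspected by hand to decide whether it is an ad network.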
TOOL SUMMARY
DEPLOYMENT
 The system was run on the Northwestern University campus in the US and
the Zhejiang University campus in China
 Location is important for ads
 Ran for two months
 Required little human intervention
RESULTS
Metric                                                US               CHINA
App to web link launches                              1,000,000        415,000
Malicious URLs                                        948              1,475
Unique domains hosting above malicious URLs           64               139
Malicious / total file downloads from landing pages   271/468 (~58%)   435/1097 (~40%)
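The rounded percentages for malicious downloads can be checked directly from the counts on this slide. The short sketch below only redoes that arithmetic; the function name is illustrative.

```python
def malicious_share(malicious: int, total: int) -> float:
    """Percentage of downloads that were malicious, to one decimal place."""
    return round(100 * malicious / total, 1)


us_share = malicious_share(271, 468)      # counts reported for the US
china_share = malicious_share(435, 1097)  # counts reported for China
```

These come out at 57.9% and 39.7%, matching the rounded ~58% and ~40% shown.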
RESULTS
[Two chart slides, each with a US panel and a China panel; the images were not preserved]
SCAMS DETECTED
 The Armor for Android anti-virus scan trojan
accounted for 244/271 malicious apps
downloaded in the US, and 102/435
malicious apps downloaded in China
 Run by the malicious ad network Tapcontext
 Caught by the tool at least 20 days
before Google Safe Browsing caught it
OTHER SCAMS DETECTED
 Win free iPhone/iPad
 Personal information gathering
 Fake Movie Player Malware
 SMS trojans
REFLECTION
Good:
 Wide-reaching – 600,000 apps tested in two
countries
 Created a tool that can be used by government
agencies/Google
 Well-researched: many algorithms and tools were
based on previous studies
CRITICISM
Improvements and Issues:
 Applications that used native code were excluded (30%)
 Tool relied on dynamic triggering of ads: there may have been malicious ads in
an ad library that were never triggered
 UI triggering was blocked by things such as login screens
 Ethics – running the experiment involved clicking on ads and generating
revenue for ad networks
 Malware detection relied entirely on VirusTotal and its database
 Focuses on identifying well-known malware instead of new malware
THANK YOU