Analysis and Identification of Malicious JavaScript Code

This article was downloaded by: [rami alsalman]
On: 25 December 2012, At: 07:03
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House,
37-41 Mortimer Street, London W1T 3JH, UK
Information Security Journal: A Global Perspective
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/uiss20
Analysis and Identification of Malicious JavaScript Code
a
a
b
Mohammad Fraiwan , Rami Al-Salman , Natheer Khasawneh & Stefan Conrad
c
a
Department of Computer Engineering, Jordan University of Science and Technology, Irbid,
Jordan
b
Department of Software Engineering, Jordan University of Science and Technology, Irbid,
Jordan
c
Institute of Computer Science, Heinrich-Heine University, Düesseldorf, Germany
Version of record first published: 15 Feb 2012.
To cite this article: Mohammad Fraiwan , Rami Al-Salman , Natheer Khasawneh & Stefan Conrad (2012): Analysis and
Identification of Malicious JavaScript Code, Information Security Journal: A Global Perspective, 21:1, 1-11
To link to this article: http://dx.doi.org/10.1080/19393555.2011.624160
PLEASE SCROLL DOWN FOR ARTICLE
Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions
This article may be used for research, teaching, and private study purposes. Any substantial or systematic
reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to
anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents
will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should
be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims,
proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in
connection with or arising out of the use of this material.
Information Security Journal: A Global Perspective, 21:1–11, 2012
Copyright © Taylor & Francis Group, LLC
ISSN: 1939-3555 print / 1939-3547 online
DOI: 10.1080/19393555.2011.624160
Downloaded by [rami alsalman] at 07:03 25 December 2012
Analysis and Identification of Malicious
JavaScript Code
Mohammad Fraiwan1,
Rami Al-Salman1,
Natheer Khasawneh2,
and Stefan Conrad3
1Department of Computer
Engineering, Jordan University
of Science and Technology,
Irbid, Jordan
2Department of Software
Engineering, Jordan University
of Science and Technology,
Irbid, Jordan
3Institute of Computer Science,
Heinrich-Heine University,
Düesseldorf, Germany
ABSTRACT Malicious JavaScript code has been actively and recently utilized as a vehicle for Web-based security attacks. By exploiting vulnerabilities
such as cross-site scripting (XSS), attackers are able to spread worms, conduct
Phishing attacks, and do Web page redirection to “typically” porn Web sites.
These attacks can be preemptively prevented if the malicious code is detected
before executing. Based on the fact that a malignant code will exhibit certain
features, we propose a novel classification-based detection approach that will
identify Web pages containing infected code. Using datasets of trusted and
malicious Web sites, we analyze the behavior and properties of JavaScript code
to point out its key features. These features form the basis of our identification
system and are used to properly train the various classifiers on malicious and
benign data. Performance evaluation results show that our approach achieves
a 95% or higher detection accuracy, with very small (less than 3%) false positive and false negative ratios. Our solution surpasses the performance of the
comparable literature.
KEYWORDS classification, malicious JavaScript, Web testing
1. INTRODUCTION
Address correspondence to
Mohammad Fraiwan,
Department of Computer Engineering,
Jordan University of Science
and Technology, P.O. Box 3030 Irbid,
22110, Jordan.
E-mail: [email protected]
JavaScript, as a programming/scripting language, is used in Web development to add more features, effects, and improvements to the end-user
experience. For example, JavaScript can help reduce the server-side load by
conducting validation checks of filled forms prior to sending them to the
Web server. Thus, it is shifting some of the computation to the users’ side.
However, enabling such capabilities can endanger the end-user if the Web page
is infected with malicious JavaScript code. Although JavaScript can be disabled
by the Web browser, many Web sites require this feature to be enabled in
order to view their JavaScript intensive content. Malicious JavaScript code is
typically injected around benign code. Moreover, attackers will obfuscate this
code to avoid detection mechanisms (e.g., de-compilation; see FIGURE 1).
This infected code is used to spread worms and viruses, install Malware, conduct Phishing attacks, spread spam, and redirect Web pages to typically porn
Web sites. The most recent attack employing JavaScript code at the time of this
1
will identify Web pages containing infected code. The
classification model is built based on attributes of
malicious JavaScript codes, giving end users the option
to make the decision on whether he/she wants to view
and/or run the possibly infected Web content.
The objectives of this paper are as follows:
Downloaded by [rami alsalman] at 07:03 25 December 2012
FIGURE 1 Obfuscated JavaScript malicious code. The code in
this example can be used to import a malicious flash content from
a malicious Web site. (color figure available online.)
writing has affected the Twitter social networking Web
site. The JavaScript code was used to spread worms
by injecting false tweets into users’ pages (“Twitter
scrambles,” 2010).
Although it is imperative to use and enable
JavaScript in browsers, it is hard to stop malicious
code from downloading into end users’ (i.e., victims)
machines. Attackers use various techniques to lure
end users into downloading the infected Web content (e.g., P2P file sharing, pop up ads, spam emails).
In the aforementioned Twitter attack, users were only
required to move the mouse over the false tweets to get
infected, without even clicking (“Twitter scrambles,”
2010). Moreover, most countermeasures that flag certain Web sites as dangerous may be ineffective and
outdated (Ma, Saul, Savage, & Voelker, 2009).
For example, predefined blacklists, such as
PhishTank (PhishTank, 2010) are ineffective as
the attackers are fast in changing their IP addresses and
URLs (Moore, Clayton, & Stern, 2009). Continuously
updating these lists requires high overhead, making it
difficult to keep up with the attackers. Also, Fast Flux
attacks, where attackers exploit a continually changing
network of compromised nodes, are becoming more
prevalent (Holz, Gorecki, Rieck, & Freiling, 2008), thus
inundating traditional countermeasures. JavaScript
should always be enabled for the correct operation
of a wide variety of Web pages. However, harmful
code can be easily and inadvertently downloaded and
run. Thus, we target the execution of the malicious
code. Attacks can be preemptively prevented if this
code is detected before executing. Because a malignant
code will exhibit certain features, we propose a novel
classification-based detection approach. This approach
M. Fraiwan et al.
1. We study and analyze the syntax and semantics
of malicious JavaScript code to point out its key
distinguishing features and marks.
2. Based on these feature, we propose the use of classification algorithms to single out malicious code.
3. We conduct performance evaluation studies on
various datasets to show the effectiveness of our
proposed scheme.
4. We compare our work to recent related solutions in
the literature.
The remainder of this paper is organized as follows.
We discuss the related work in section 2. The analysis
of malicious and benign JavaScript codes is presented
in section 3. The system architecture is described
in section 4. Section 5 describes the performance
evaluation and comparison experiments. We conclude
in section 6.
2. RELATED WORK
There has been a plethora of research into the detection of JavaScript malicious code. In this section, we
survey the literature landscape, clarify key aspects and
issues, and point out avenues for improvement and
contribution. The majority of the works in the literature focus on a small subset of the malicious JavaScript
features or distinguishing dimensions. These works
failed in several ways:
1. They did not relate the JavaScript code with the
content or other parts of the Web page.
2. They did not consider the behavior and semantics
of the JavaScript code.
3. They focus on the behavior and ignore other factors.
The result of these shortcomings is a poor detection
accuracy of malicious JavaScript code.
To detect any anomaly, it is natural to look for the
behavior of that anomaly and its main characteristics as an indication of its existence. For example, a
user may click on a trusted link (e.g., www.yahoo.
2
Downloaded by [rami alsalman] at 07:03 25 December 2012
com); however, the malicious code will intentionally
send a wrong request for an unavailable page (e.g.,
yahoo.com/IAMNOTAVAILABLE.html). After that,
the trusted Web site will naturally return “Failed status
404 No document exists.” The user is then directed
to a malicious Web site containing harmful code
or open several pop-up windows. In an attempt to
solve this problem, Hallaraker & Vigna (2005) monitor the behavior of the malicious code. Based on
the number of windows opened as pop-up (i.e., using
window.open()), they decide whether this code is
malicious or not. The problem with their approach
is that it will have many false positives (i.e., benign
code classified as malicious) because they only consider the code behavior before and after running. The
reputation of the URL also plays a role in its classification as malicious or not. Ma et al. (2009) used
the Yahoo-PhishTank database of malicious Web sites
in conjunction with other features to train a Support
Vector Machine (SVM) classifier. This classifier is later
used to identify malicious JavaScript code. They have
reported an accuracy of 80%. Similarly, Likarish & Jung
(2009) used a Web crawler to collect both malicious
and benign JavaScript code. The collected data are later
fed to a classifier for training. As before, the classifier is
used at run time to detect malicious code.
Other machine learning-based approaches have also
been developed. Kolter and Maloof (2006) use classification to detect malicious executables using a variety of inductive methods (e.g., Naïve Bayes). Several
JavaScript-related tools and plug-ins exist in practice; Caffeine Monkey (Feinstein & Peck, 2007) is
a customized release of the Mozilla SpiderMonkey
(Mozilla C, n.d.) that has the capability to analyze
and detect obfuscated malicious JavaScript and Flash
code. Google Caja (google-caja, n.d.)is another tool
that effectively sandboxes mistrusted code. It allows
the containing application to completely control access
rights of external content. Similarly in (Dewar, Holz, &
Freiling, 2010), JavaScript code is executed in an isolated environment and every critical action is logged.
An analysis of these logs decides whether the site
is malicious. Some researchers have suggested adding
policies that allow Web sites to control unexpected
JavaScript execution (Jim, Swamy, & Hicks, 2007). This
is especially helpful for Web sites importing and using
external content. However, if the accessed Web site is
already malicious, then there is no protection for the
end user.
3
The use of classification techniques in the detection
of obfuscated malicious JavaScript code was also proposed in (Likarish, Jung, & Jo, 2009; Cova, Kruegel,
& Vigna, 2010). Likarish et al. used features based
on the syntax and quality features of the code (e.g.,
number of lines, average characters per line, number
of strings, argument length). These features are hardly
distinguishing features of a malicious code, which in
turn will result in poor detection accuracy. Cova et al.
uses features based mainly on the code execution (e.g.,
number of dynamic code executions, number and target of redirections). In our work, we look at features
that have to do with the semantics and the behavior of the code rather than the quality features of the
code. In comparison, our work produces much higher
detection accuracy; see section 5.
3. ANALYSIS OF MALICIOUS
JAVASCRIPT CODE
We identify four categories of features that we used
in comparing benign and malicious codes: URL properties, JavaScript code output, JavaScript code behavior, and JavaScript code content. The features per
category are as follows:
3.1. URL Properties
(a) Is the URL blacklisted by trusted entities (e.g.,
Symantec)?
(b) Is the corresponding IP address blacklisted?
3.2. JavaScript Code Output
We check the output of the JavaScript code against
the list of the most frequent words in comparable malicious JavaScript code. The main intuition here is that
malicious JavaScript code will try to solicit users with
a Web content of certain nature (e.g., monetary or sexual). The feature is good if the most frequent words
in malicious codes are different from the benign ones,
which is mostly true.
Table 1 show that the most common words appearing in the output of a malicious code differ greatly
from those of a benign one. This feature can be used to
distinguish the two types of JavaScript codes. Table 2
shows the two top 30 lists. Fortunately, these two lists
are highly disjointed as only one word “privacy” is
shared among the two lists.
Analysis of Malicious JavaScript
TABLE 1 A Comparison of the Most Frequent Words
Appearing in the Output of Both Malicious and Benign
JavaScript Codes. It is Clear that the Two Lists are Highly
Disjointed
Downloaded by [rami alsalman] at 07:03 25 December 2012
Number of
common
Top words
10
20
30
50
0
1
1
6
70
7
100
14
Common words
–
Privacy
Privacy
privacy, com, terms, jobs, information,
business
privacy, com, terms, jobs,information,
business, company
privacy, com, terms, jobs,information,
business, company, see, email, here, sign,
services, register, http
TABLE 2 The List of Top 30 Most Frequent Words Appearing
in Malicious and Benign URLs. Only Keyword Privacy is Shared
Among the Two Lists
Type of URL
Malicious
Benign
Most frequent words
rights, fill, reserved, privacy, bill, work,
online, have, home, password, policy,
may, free, loans, permission, financial,
money, information, making, personal,
link, business, make, top, post, jobs,
credit, debt, forum, apply, please
email, sign, 2010, keep, help, English,
Facebook, logged, site careers, terms,
forgot, privacy, advertising developers,
people, friends, mobile, find, badges,
life, share connect, search, helps, just,
directory, use, center, pages, can
TABLE 3 Percentage of Malicious and Benign Code Samples
Containing More than a Certain Number of the Most Frequent
Words Appearing in Malicious Code
using >2
using >5
using >10
86%
35%
6%
5%
0%
0%
Malicious
JS code
Benign JS
code
Table 3 is a key to this distinguishing feature. The
classification algorithm needs to learn the optimal
bound on the number of matching words from the
malicious set before flagging this feature as true (i.e.,
indicator of malicious code). This table shows that
by analyzing malicious and benign code samples, the
choice of more than two words is the best bound. The
M. Fraiwan et al.
table shows that by increasing this number (e.g., more
than 5) a large percentage of malicious samples will
be excluded. However, only a small percentage of the
benign samples contain these words. For example, only
6% of the malicious samples—and no benign samples—
contain more than 10 of the most frequent words in
malicious JS codes. On the other hand, about 86% of
the malicious samples contain more than 2 words, but
only 5% of the benign ones contain more than 2. Thus,
we see it as a distinguishing feature.
3.3. JavaScript Code Behavior
Check for the eval() and unescape() functions.
The combination of these functions is extensively used
by attackers to execute obfuscated code. For example, in FIGURE 1, the function find($) replaces the
obfuscated code with the actual intended malicious
JavaScript code by means of character replacement.
Then, the replacement is followed by script execution. The character replacement is done using the
following method: first, a for() loop in the function
inserts ‘%’ at appropriate places in the obfuscated code.
After that the unescape() function converts the hexadecimal code to the actual characters. For example,
unescape(“It%27s%20me%21”) will result in producing the string “It’s me.” Finally, the eval(find($)) function call is used to execute the malicious code. Similar
functions, with comparable capabilities, can also be
checked.
Table 4 show the percentage of malicious
and benign code samples containing eval(),
unescape(), the combination of eval()/unescape(),
and XMLHTTPREQUEST (feature 3c). It is clear
that the eval() and unescape() functions alone are
not strong indicators of malicious code. However, the
combination of both is a strong indicator of malicious
behavior.
TABLE 4
Percentage of Malicious and Benign Code Samples
Containing Eval(), Unescape(), Combination of Eval()/Unescape(),
and XMLHTTPREQUEST. Details of the Dataset are in Section 5
eval()
and
eval() unescape() unescape() XMLHTTPREQUEST
Malicious
JS code
Benign JS
code
8%
8%
8%
3%
2%
27%
0%
90%
4
Downloaded by [rami alsalman] at 07:03 25 December 2012
Communicate with other programs (e.g., Applets).
The malicious JavaScript code can communicate indirectly with outside entities (e.g., the attacker’s machine)
by utilizing Applets and other similar components.
This way, the JavaScript code will be more benign looking. However, it is extremely dangerous as it may be
used to steal confidential user information.
If the JavaScript code communicates via
the Document Object Model (DOM) XML
HTTPREQUEST object with other servers, then
this may be an indication of malice intent. For
example, users’ information, including user names and
passwords, can be communicated to other malicious
servers. However, as Table 4 shows, this function
is extensively used by benign code as well. Extra
attention need to be paid on the destination of the
communication and the type of information being
communicated.
If the JavaScript code changes its behavior by
redirecting users to other locations, then we can
draw the same aforementioned conclusions. For
example, redirect using the line document. location
(“unsafeURL.com”).
(a) Data Collection.
3.4. JavaScript Code Content
This feature is related to the size-behavior and
how the content of the code can be changed at run
time. Much like the example in FIGURE 1, the deobfuscation of the malicious code and its behavior may
lead to changes in the size of the source code file.
Some of these indicators may be considered circumstantial or not enough for a clear-cut judgment.
However, their combined existence provides a strong
case to flag the code as malicious. Moreover, some
of these features can only be detected at run time,
which may seem like the user will need to run the
harmful code and get infected in order to detect it.
Fortunately, browser plug-ins can be easily developed
to do a kind of test run of the suspicious content in a
controlled manner before displaying to the user. Such
a solution will suggest security versus response time
tradeoff, which can be given as a “settings” option to
the end user.
4. BUILDING THE CLASSIFICATION
MODEL
Figure 2 shows the steps taken to build the classification model. We first collect and process the
5
(b) Classifier Training. NN: Neural Networks,
DT: Decision Tree, SVM: Support Vector Machine.
FIGURE 2
Steps in building the classification model. (color
figure available online.)
classification data (section 4.1), extract features of
malicious code (section 4.2), and finally train the
classifiers (section 4.3).
4.1. Data Collection
In analyzing malicious JavaScript code (see
Figure 2a), we collected two types of data: benign and
malicious. For the latter, we used data from PhishTank,
which contains about 3,000 malicious Web sites URL
information. We trimmed this list to about 100, as
many of the Web sites in the original list are closed,
moved, or renamed. As for the benign list, we used
data from the well-known Alexa Web information
database.
4.2. Feature Extraction
The next step is to extract the previously identified
features from the collected JavaScript codes and
Analysis of Malicious JavaScript
Downloaded by [rami alsalman] at 07:03 25 December 2012
generate the features vector. For features 1.a and 1.b,
the URL and IP will be mapped into string values in
the vector. These values are set based on whether the
IP or URL is blacklisted. To implement feature 2.a,
we generate a list of the top most frequent words in
both malicious and benign JavaScript-based Web sites.
To this end, we used the Sphider search engine. Sphider
accepts URLs and extracts the top most frequent words
that appear in the input URLs. The same procedure is
done for the list of trusted Web sites.
As for the remaining features, there are some useful
JavaScript analysis tools in the literature that we
have used (JavaScript debugger, n.d.; Firebug, n.d.).
We used Firebug in detecting features 3.a, 3.b, and
3.c. Firebug is a suite of Wed development tools that
gave us the capability to monitor, analyze, and edit
Web pages. Feature 3.d is detected by parsing the
JavaScript code. Moreover, we used the JavaScript
Debugger in analyzing the code content with respect
to its size-behavior as indicated in feature 4.a. This
Firefox add-on gave us the capability to debug the
JavaScript code during run time.
4.3. Classifier Training
The Waikato Environment for Knowledge Analysis
(WEKA) toolkit (Weka 3, n.d.) and its associated algorithms were used to generate four classification models:
Neural Networks (NN), Decision Tree (DT), Support
Vector Machine (SVM), and Naïve Bayes. These classifiers are trained and later used to check the input
JavaScript code at run time (see Figure 3). The WEKA
input is required to be in a certain file format; we used
the “.arff” (Attribute-Relation File Format) file format.
A vector of the eight aforementioned features was generated for each input sample. The URL and IP features
were of type String while all other features were of type
Boolean. Five-fold cross-validation was used to train the
classification algorithms. In this training method, the
training data are repeatedly divided into five parts of
equal size, out of which four are used for training and
one for testing.
4.4. Implementation Details
In this section, we go through some of the
implementation-specific details not previously covered. In extracting the values of features, step 5 of
Figure 3, some thresholds need to be defined and
M. Fraiwan et al.
FIGURE 3
The malicious JavaScript detection system at work.
(color figure available online.)
set based on the input. These thresholds are implementation specific and can be provided as a “control
settings” option for the user. The user then decides
based on his/her security preferences. As for our implementation, we chose the following values in setting
the corresponding variable in the features vector; an
example of the features vector is shown in FIGURE 4.
• For feature 2.a, we check the top 30 most frequent
words in the output of the JavaScript output. The
feature is flagged as true if the code shares more than
two words with the corresponding list of most frequent words in the malicious data set (see Tables 1
and 2). Previously identified shared words (e.g., the
word “privacy”) are excluded from this check.
• As for feature 3.a, the Boolean value in the features
vector corresponding to this feature will be set as
malicious if the eval() and unescape() function calls
each appear once.
• Regarding feature 3.b, the JavaScript code is checked
if it uses DOM tree to access other HTML documents. If it does, then the corresponding flag is set as
malicious. Remember that DOM is a platform and
an interface that can be used by many benign codes.
However, its existence in conjunction with the other
features constitutes enough circumstantial evidence
to decide that the JavaScript code is malicious.
• Feature 3.c is decided by checking if the JavaScript
code uses DOM XMLHTTPREQUEST object to
communicate with dubious external servers.
6
URL,IP-address,Term Frequency,EVAL,usafe ccmm,aax enable redirection, size_b,Maldols
http://twurl.nl/sdp7na,87.255.53.196.yes.no.yes.no.yes.yes.yes
http://www.tedde.com/lllkeu.com.174.132.193.77.yes.no.yes.no.no.no.yes
http://www.techvillagebd.com.174.37.53.156.yes.yes.no.no.yes.no.yes
http://www.loddonalive.com.au.113.20.11.241.yes.no.yes.no.yes.yes.yes
http://www.hiLlo/.67.222.21.73.yes.no.yes.no.no.yes.yes
.
.
.
Downloaded by [rami alsalman] at 07:03 25 December 2012
FIGURE 4 An example of the features vector.
• Feature 3.d is decided by checking if the codes
make redirections to other malicious Web pages.
Such a check can be conducted by verifying if the code uses methods such as document.location(attakWebsite.com), as previously
mentioned.
• For feature 4, the code is run offline and in a controlled environment to check the code size before
and after execution. If the code maintains its size,
then it is considered benign when setting the corresponding feature value.
four are used for training and one for testing. The
final approach is 10-fold cross-validation, which is the
method used by the authors in (Likarish et al., 2009).
We implement it to compare the performance of our
approach to theirs. As for the performance metrics, we
are mainly concerned with the classification accuracy
defined in Eq. (1) as:
Using Apache server and PHP scripting language,
an interface was built to insert features data into a
Mysql database. After that, another PHP-based tool
called PhpMyAdmin (http://www.phpmyadmin.net/
home page/index.php) was used to export data in
Comma Separated Values (CSV) format. Another code
was written to transform the CSV files into the WEKArequired “.arff” format.
Eq. (1) comes in two flavors per se, one for correctly classified benign codes divided by the number
of benign samples. The accuracy can also be expressed
in terms of the number of correctly classified malicious code divided by the total number of malicious
samples. This definition is used in Figure 7. The same
goes for the other performance metrics. From a benign
code perspective, the False Negative (FN) is defined
by the percentage of benign code classified as malicious (see Eq. (2)). The same logic applies to the
malicious code perspective. While the False Positive
(FP) ratio represents the percentage of malicious code
classified as benign (see Eq. (3), which is the complement of the Accuracy and included here for the sake of
completeness.
5. PERFORMANCE EVALUATION
5.1. Experimental Setup
We present the results of subjecting our proposed
malicious JavaScript code detection system to a test
group of 100 malicious JavaScript code samples and
100 benign JavaScript code samples. The main goal
of this performance evaluation is to study and analyze the detection correctness of the proposed solution
and compare it to similar studies in the literature.
Three independent types of studies were conducted.
In the first one, the testing dataset is the same as the
training dataset, which gives us an indicator of the classifiers’ ability to learn from the training set. Second,
five-fold cross-validation is used for testing. This is a
standard way of testing and evaluating machine learning approaches (Kolter & Maloof, 2006), where the data
set is repeatedly divided into five subsets, of which
7
Accuracy =
FN =
FP =
#Correctly Classified Benign JS codes
×100%
Total #Benign Samples
(1)
#Benign JS codes classified as Malicious
× 100%
Total #Benign Samples
(2)
#Malicious JS codes classified as Benign
× 100%
Total #Benign Samples
(3)
5.2. Results
Figures 5a and 5b show that the four classifiers are
highly accurate when tested with same set as the training set, with 100% accuracy in most cases. The decision
Analysis of Malicious JavaScript
Downloaded by [rami alsalman] at 07:03 25 December 2012
(a) Malicious code, testing set same as training set.
(b) Benign code, testing set same as training set.
(c) Malicious code, testing using 5-fold cross validation.
(d) Benign code, testing using 5-fold cross validation.
FIGURE 5 Classification accuracy of the four classifiers. The input data is 100 benign and 100 malicious codes. Testing is done using
five-fold cross validation or using the same training data. (color figure available online.)
tree approach is slightly less accurate in comparison
to the other classifiers (97.1% accuracy for the malicious code). More importantly, testing using five-fold
cross-validation results in very high accuracy (+95%)
for all the classifiers except for the neural networks one,
which performed badly with an accuracy of only ∼ 40%
(see Figures 5c and 5d). This is because in our evaluation, we used five-fold cross-validation (i.e., 20 vectors
of 5 elements each). The Neural Network (NN) classifier was sensitive due to the small size of the data set.
This resulted in low performance.
As for the false negative ratio, the same trend
applies as before (see Figure 6). The ratio is low (less
that 3%) for all the classifiers except for the neural networks, which is consistent with the accuracy
results. We also compare our results to those of
the classification-based malicious JavaScript detection
approach in Likarish et al. (2009). using 10-fold cross
validation (see Figures 7 and 8). Figure 7 clearly demonstrates the superior accuracy of our proposed scheme.
Please note that Ripper is just another type of neural networks classifier that had been used in Likarish
et al. (2009). However, in comparing our work to the
literature (i.e., Likarish et al.), we used the same method
M. Fraiwan et al.
employed by Likarish et al., which is 10-fold crossvalidation (10 vectors of 10 elements each). Hence,
Ripper was less affected by the size of the vectors and
achieved the higher accuracy 97% (Figure 7). As for
Figure 8, the results show that their approach is plagued
with false negatives, which is nearly nonexistent in
ours.
5.3. Discussion
5.3.1. Data Set and Classification Algorithms
The data set used in this paper is based on Phishtank,
which collects Phishing Web sites. Similar evaluation
can be done using other data sets that focus on drive-by
or any other form of attacks, but this is out of the
scope of this paper. As of the classification algorithms,
WEKA provides a rich set of these, and we only
reported those that achieve the highest performance.
5.3.2. Effect of Blacklisted URLs and IPs
Figure 9 shows that the accuracy drops negligibly
when these features are excluded. In fact, the accuracy of the Neural Network classifier increases due to
8
(a)
(b)
3.5
3
Ratio (%)
Ratio (%)
2
1.5
1
Naïve Bayes
0
0
SVM
0 0
Decision Tree
(d)
70
0 0
Naïve Bayes
SVM
0 0
Decision Tree
60
FN
58
FN
50
40.8
40
30
Neural Network
70
FP
60
FP
Ratio (%)
Ratio (%)
1
0
0
Neural Network
50
Downloaded by [rami alsalman] at 07:03 25 December 2012
1.5
0.5
0
42.2
40
30
20
20
10
2
0
2
2
1
0.5
60
FN
2.5
2
1
(c)
FP
3
3
FN
2.5
0
3.5
FP
2.9
0
Naïve Bayes
2
0
SVM
10
4.9 3
0
Decision Tree
Neural Network
0
2
Naïve Bayes
0
2
SVM
3
5
Decision Tree
Neural Network
FIGURE 6 The false positive and false negative ratios of the four classifiers. The same testing procedure as in FIGURE 5. (color figure
available online.)
FIGURE 7 The accuracy of our proposed scheme in comparison to the work in Likarish et al. (2009). The accuracy here is
based number of correctly classified malicious code divided by
the total number of malicious samples. (color figure available
online.)
its sensitivity to the irrelevance of these two features.
The reason for these changes was pointed out in the
paper introduction; these blacklisted URLs and IPs
are outdated, as attackers may be quick in changing
them.
Having an up-to-date blacklist of malicious Web
sites definitely inundates the need for other features
or any other approach. However, a JavaScript code
from a not-yet-blacklisted URL or IP may redirect or
9
FIGURE 8 A comparison of the False Negative performance
of our systems versus the work in Likarish et al. (2009). False
negative definition according to Eq. (2). (color figure available
online.)
communicate with a blacklisted entity. Thus, the check
for communication and redirection may be needed as
done with some of the proposed features.
5.3.3. Handling Obfuscation
Obfuscation is common in drive-by download
attacks. We did not discuss thoroughly how this is handled, but typically deobfuscation needs to be done.
Analysis of Malicious JavaScript
Downloaded by [rami alsalman] at 07:03 25 December 2012
(a) Malicious code and the classifier without IP feature.
(b) Benign code and the classifier without the IP feature.
(c) Malicious code and the classifier without the URL feature.
(d) Benign code and the classifier without the URL feature.
FIGURE 9 Classification accuracy of the four classifiers without IP or URL features. The input data is 100 benign and 100 malicious
codes. Testing is done using five-fold cross-validation. (color figure available online.)
This problem has been discussed in (Cova et al., 2010),
and not all of our features will require deobfuscation.
5.3.4. Features Extraction
The feature extraction is done automatically by
parsing the http files. A code was written to perform
this task.
5.3.5. Runtime Overhead
The runtime and response time overhead of the tool
was noticed to be negligible, which is understandable
given the huge processing power of modern PCs.
6. CONCLUSION AND FUTURE WORK
The JavaScript programming/scripting language is
a ubiquitous tool in Web development. It is useful
in adding features, effects, and improvements to end
users as well as to the server-side of Web applications.
However, much like any technology, such capabilities
M. Fraiwan et al.
open the door for security exploits and adversaries.
Distinguishing malicious JavaScript code from benign
one is not an easy task for the typical user. In this paper,
we have analyzed the behavior of malicious JavaScript
code to point out its key features. Those features, when
considered together, have been shown to distinguish
malicious code from benign. To this end, we have
built a classification model based on those features.
Such a model can be readily integrated with a browser
(e.g., as a plug-in) to protect end users from malicious JavaScript code quite effectively. Performance
evaluation results show that our proposed approach
is quite effective in pointing out malicious JavaScript
code. However, as attackers get more sophisticated,
there is a continuous need for analysis and update of
malicious code features and distinguishing footprints.
Future work will focus on coming up with new distinguishing features. Also, there is a need to measure the
performance impact of the proposed scheme on the
perceived Web browsing experience.
10
Downloaded by [rami alsalman] at 07:03 25 December 2012
REFERENCES
Twitter scrambles to block worms. (2010, September 21). BBC News
Technology. Retrieved from http://www.bbc.co.uk/news/technology11382469
Ma, J., Saul, L., Savage, S., and Voelker, M. (2009). Beyond blacklists: Learning to detect malicious Web sites from suspicious URLs.
In Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, Paris, France.
PhishTank: A collaborative clearing house for data and Information about
Phishing on the Internet. (n.d.). Retrieved from www.phishtank.com/
Moore, T., Clayton, R., and Stern, H. (2009). Temporal correlations
between spam and Phishing Web sites. In Proceedings of the 2nd
USENIX Workshop on Large-Scale Exploits and Emergent Threats.
Holz, T., Gorecki, C., Rieck, K., and Freiling, F. (2008). Measuring and
detecting fast-flux service networks. In Proceedings of the 15th
Network & Distributed System Security Symposium (NDSS).
Hallaraker, O. and Vigna, G. (2005). Detecting malicious JavaScript code
in Mozilla. In Proceedings of the 10th IEEE International Conference
on Engineering of Complex Computer Systems, Washington, DC.
Likarish, P. and Jung, E. (2009). A targeted Web crawling for building malicious JavaScript collection. In Proceeding of the ACM First
International Workshop on Data-intensive Software Management and
Mining, Hong Kong, China.
Kolter, Z. and Maloof, M. (2006). Learning to detect and classify malicious
executables in the wild. Journal of Machine Learning Research, 7.
Feinstein, B. and Peck, C. (2007). Caffeine monkey: Automated collection, detection and analysis of malicious JavaScript. Black Hat.
Mozilla C implementation of JavaScript. (n.d.). Retrieved from http://
www.mozilla.org/js/spidermonkey/
google-caja: A source-to-source translator for securing JavaScript based
Web content. (n.d.). Retrieved from http://www.code.google.com/p/
google-caja/
Dewar, A., Holz, T., and Freiling, F. (2010). ADSandbox: Sandboxing
JavaScript to fight malicious Web sites. Symposium on Applied
Computing (SAC), Sierre, Switzerland.
Jim, T., Swamy, N., and Hicks, M. (2007). Defeating Script injection attacks with browser enforced embedded policies. In 16th
International World Wide Web Conference.
Likarish, P., Jung, E., and Jo, I. (2009). Obfuscated malicious JavaScript
detection using classification techniques. In 4th International
Conference on Malicious and Unwanted Software (MLWARE).
Free traffic metrics, search analytics, demographics, and more for
Websites. (n.d.). Retrieved from www.alexa.com/
Sphider: PHP search engine. (n.d.). Retrieved from http://www.sphider.eu/
Cova, M., Kruegel, C., and Vigna, G. (2010). Detection and analysis of drive-by-download attacks and malicious JavaScript code. In
Proceedings of WWW 2010 Conference, North Carolina.
Java Script Debugger Mozilla Firefox add-on. (n.d.). Retrieved from https:/
/addons.mozilla.org/firefox/addon/216
Firebug development tool Mozilla Firefox add-on. (n.d.). Retrieved from
https://addons.mozilla.org/firefox/addon/1843
Weka 3: Data mining software in Java. (n.d.). Retrieved from www.cs.
waikato.ac.nz/ml/weka/
The phpMyAdmin tool. (n.d.). Retrieved from http://www.phpmyadmin.
net/home_page/index.php
Yue, C., and Wang, H. (2009). Characterizing insecure JavaScript practices on the Web. In Proceedings of WWW 2009 Conference, Madrid,
Spain.
11
BIOGRAPHIES
Mohammad Fraiwan received his B.S. (with honors) degree in Computer Engineering from Jordan
University of Science and Technology (JUST), Jordan,
and Ph.D. degree from Iowa State University (ISU),
USA, in 2003 and 2009 respectively. He is currently
an assistant professor in the Computer Engineering
Department at JUST. His main research interests
include network monitoring, biomedical applications,
and data mining. He worked at Akamai Technologies,
Boston, in summer of 2008, and as a visiting scientist
at ISU during the summers of 2009 and 2010.
Rami Al-Salman is currently a Ph.D. student in
Germany. In Spring of 2011, he received his Masters
degree from the department of Computer Engineering
at Jordan University of Science and Technology
(JUST). He received his B.S. degree in Computer
Information Systems from JUST. He worked at the
Heinrich-Heine University in Duesseldorf, Germany
during the summer of 2010. His research interests include security, data mining, and Information
Assurance.
Natheer Khasawneh has been an assistant professor
in the Department of Software Engineering at Jordan
University of Science and Technology since 2005. He
received his B.S. in Electrical Engineering from Jordan
University of Science and Technology in 1999. He
received his Masters and Ph.D. degrees in Computer
Engineering from the University of Akron, Akron,
Ohio, USA in the years 2002 and 2005 respectively. His
research interests are data mining, biomedical signals
analysis, software engineering and web engineering.
Currently he is the vice director of the Computer and
Information Center at Jordan University of Science
and Technology.
Stefan Conrad is currently a full professor in
the Institute of computer Science at the University
of Düsseldorf, Germany. He received his Ph.D. in
Computer Science from the Braunschweig University
of Technology, Germany in 1994. His research interests include database design, knowledge discovery, data
mining, and multimedia information retrieval.
Analysis of Malicious JavaScript