
Behavior-based Proactive Detection of
Unknown Malicious Codes
Jianguo Ding, Pascal Bouvry, Jian Jin, Yongtao Hu
Faculty of Science, Technology and Communication (FSTC)
University of Luxembourg
[email protected]
The 4th International Conference on Internet Monitoring and Protection
(ICIMP 2009)
Venice, Italy, May 24-28, 2009
• Current methods for detecting malicious codes
• Proposed methods
• Experiment results
• Discussion
• Future work
• Malicious code is defined as any program
(including macros and scripts) that is specifically
coded to cause an unexpected (and usually
unwanted) event on a user’s PC or a server.
• Malicious code includes:
  – Viruses
  – Trojan horses
  – Worms
  – Back doors
  – Spyware
  – Adware
  – …
Antivirus technologies
• Signature-based scanning
• Heuristic analysis
• Cyclic redundancy check (CRC) scanners
• Vaccination technology
• Behavior blockers
• Immunizers
• Snapshot technology
• Sandbox technology
[Figure: malicious code statistics for 2008; source: http://bt.ins.com]
• Motivation
  – Increase in the amount and variety of malicious codes
  – Polymorphism and metamorphism in malicious codes
  – Signature-based scanning lacks a mechanism to identify unknown malicious codes
  – In antivirus applications, experts have to analyze every reported unknown malicious code manually, then define a signature for it and update the virus database
• Try to establish an automatic mechanism to assist in classifying and identifying unknown malicious codes
Behavior analysis of malicious codes
– Static analysis: disassemble the binary code and analyze the program's functions without executing it (a minimal sketch follows this list).
  • Can cover the complete program
  • Faster than dynamic analysis
  • Hard to identify the real behavior of malicious codes
– Dynamic analysis: analyze the code during runtime.
  • Immune to obfuscation attempts
  • Can deal with self-modifying programs
  • But the environment must be cleaned up after every dynamic test
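The talk does not name a disassembler; as a minimal sketch of the static side, the snippet below uses the Capstone library (an assumption, not the authors' toolchain) to disassemble a few x86 bytes without ever executing them.

```python
# Minimal static-analysis sketch: inspect instructions without execution.
# Capstone is assumed here; the talk does not specify a disassembler.
from capstone import Cs, CS_ARCH_X86, CS_MODE_32

# Tiny 32-bit x86 stub: push ebp; mov ebp, esp; pop ebp; ret
code = b"\x55\x89\xe5\x5d\xc3"

md = Cs(CS_ARCH_X86, CS_MODE_32)
for insn in md.disasm(code, 0x1000):
    print(f"0x{insn.address:x}:\t{insn.mnemonic}\t{insn.op_str}")
```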
Example: risk evaluation of CopyFile in a 32-bit Windows system
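The risk evaluation itself appears as a figure in the original slide. As a hypothetical sketch of the idea, the snippet below scores a CopyFile event higher when a PE file lands in a system-sensitive directory; the directory list, extensions, and weights are illustrative assumptions, not values from the talk.

```python
# Hypothetical risk scoring for a CopyFile event (all constants illustrative).
SENSITIVE_DIRS = (r"c:\windows\system32", r"c:\windows")
PE_EXTENSIONS = (".exe", ".dll", ".sys")

def copyfile_risk(src: str, dst: str) -> float:
    """Score a CopyFile(src, dst) event; only dst matters in this toy model."""
    dst = dst.lower()
    risk = 0.0
    if any(dst.startswith(d) for d in SENSITIVE_DIRS):
        risk += 0.6  # writing into a system-sensitive directory
    if dst.endswith(PE_EXTENSIONS):
        risk += 0.4  # dropping a PE file
    return risk

print(copyfile_risk(r"c:\temp\a.exe", r"c:\windows\system32\a.exe"))  # 1.0
```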
Classification of malicious behaviors
• File-related behavior
  – 1. Creating PE (Portable Executable) files under a system-sensitive directory.
  – 2. Copying PE files to a system-sensitive directory.
  – 3. Writing system-sensitive files such as PE files.
• Process-related behavior
  – 1. Writing the memory of other processes.
  – 2. Creating a remote thread in other processes.
  – 3. Terminating other processes.
• Window-related behavior
  – 1. Hooking the keyboard.
  – 2. Hiding windows.
Classification of malicious behaviors (cont.)
• Network-related behavior
  – 1. Binding and listening to a port.
  – 2. Initiating an HTTP connection.
• Registry-related behavior
  – 1. Creating and setting a registry key for automatic running.
  – 2. Setting the registry to lower security checks.
• Windows service behavior
  – 1. Terminating the Windows Update service.
  – 2. Terminating the Windows firewall.
  – 3. Opening the telnet service.
  – 4. Opening an FTP session.
We define 35-dimension feature vectors (behaviors) to measure the behavior of malicious codes.
Definition of the 35-dimension feature vector (event operations) for malicious codes in a 32-bit system
[Table: the 35 event operations, shown as a figure in the original slide]
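A hedged sketch of how such a feature vector might be encoded, assuming one 0/1 entry per monitored event operation; the event names below are examples drawn from the behavior classes above, not the authors' exact 35-entry list (which is given as a figure).

```python
# Sketch: encode a sample's observed event operations as a feature vector.
import numpy as np

EVENTS = [
    "create_pe_in_system_dir", "copy_pe_to_system_dir", "write_system_file",
    "write_other_process_memory", "create_remote_thread", "terminate_process",
    "hook_keyboard", "hide_window", "bind_and_listen_port", "init_http_conn",
    "set_autorun_registry_key", "lower_security_registry",
    "stop_windows_update", "stop_firewall", "open_telnet", "open_ftp_session",
]  # ...the paper defines 35 such event operations in total

INDEX = {name: i for i, name in enumerate(EVENTS)}

def feature_vector(observed_events):
    """Mark which monitored behaviors a sample exhibited (0/1 per event)."""
    v = np.zeros(len(EVENTS))
    for e in observed_events:
        v[INDEX[e]] = 1.0
    return v

v_x = feature_vector({"copy_pe_to_system_dir", "set_autorun_registry_key"})
```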
System architecture
[Figure: overall system architecture shown in the original slide]
Analysis methods
• Statistic-based method
  – Set up the event benchmarks V_M for malicious code and V_B for benign code. For an unknown test code X, retrieve and calculate its feature vector V_X.

    V_M = {V_M1, V_M2, ..., V_M35}
    V_B = {V_B1, V_B2, ..., V_B35}
    V_X = {V_X1, V_X2, ..., V_X35}
• Calculate the Euclidean distances:

    d_MX = ||V_X − V_M||
    d_BX = ||V_X − V_B||

• If d_MX <= d_BX, X is classified as malicious code; if d_MX > d_BX, X is classified as benign code.
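A minimal sketch of this nearest-benchmark rule, assuming NumPy vectors; how V_M and V_B are estimated from the training samples (e.g., as per-dimension event frequencies) is an assumption here.

```python
# Sketch of the statistic-based method: label by the nearer benchmark vector.
import numpy as np

def classify(v_x: np.ndarray, v_m: np.ndarray, v_b: np.ndarray) -> str:
    d_mx = np.linalg.norm(v_x - v_m)  # Euclidean distance to malicious V_M
    d_bx = np.linalg.norm(v_x - v_b)  # Euclidean distance to benign V_B
    return "malicious" if d_mx <= d_bx else "benign"

# Toy 35-dimension benchmarks in place of the learned ones.
rng = np.random.default_rng(0)
v_m, v_b = rng.random(35), rng.random(35)
print(classify(rng.random(35), v_m, v_b))
```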
MoE (Mixture of Experts) Neural Network model

    y_q(P_q) = Σ_{c=1}^{C} g_{c,q}(P_q) · a_{c,q}(P_q)

where the g_{c,q} are gating weights over the C experts and the a_{c,q} are the expert outputs.
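A minimal sketch of this combination rule, assuming a softmax gating network and sigmoid experts; the slide does not specify the expert or gate form or the training procedure, so these choices are assumptions.

```python
# Sketch of a mixture-of-experts output: y(p) = sum_c g_c(p) * a_c(p).
import numpy as np

def moe_output(p, gate_w, expert_ws):
    scores = gate_w @ p                         # one gating score per expert
    g = np.exp(scores - scores.max())
    g /= g.sum()                                # softmax gating weights g_c(p)
    a = 1.0 / (1.0 + np.exp(-(expert_ws @ p)))  # sigmoid expert outputs a_c(p)
    return float(g @ a)                         # mixture output in (0, 1)

rng = np.random.default_rng(1)
C, dim = 4, 35                                  # 4 experts over 35-dim features
y = moe_output(rng.random(dim),
               rng.standard_normal((C, dim)),   # gate parameters (toy)
               rng.standard_normal((C, dim)))   # expert parameters (toy)
```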
Experiment (1)
• Collected samples:
  – Malicious sample code: 7240 (from the ANTIY security lab)
  – Benign sample code: 2821 (screened with Norton, McAfee, and Kaspersky)
• Test distribution:

                     Sum    Training   Test
  Malicious sample   8823   5000       3823
  Benign sample      2821   2000       821
Experiment (2)
• We use VMware Workstation 5.5.3 on a desktop with an Intel Core 2 Duo E4400 CPU @ 2.00 GHz.
• For the statistical model, training takes 0.3054 seconds and testing takes 0.0435 seconds.
• For the MoE model, training takes 68.2880 seconds and testing takes 0.3882 seconds.
Traditional statistical definitions of the evaluation error
• True Positive (TP): ratio of malicious executables correctly classified as malicious.
• True Negative (TN): ratio of benign executables correctly classified as benign.
• False Negative (FN): ratio of malicious executables incorrectly classified as benign.
• False Positive (FP): ratio of benign executables incorrectly classified as malicious.
Thus TP + FN = 1 and TN + FP = 1.
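A small sketch computing these per-class rates; the identities TP + FN = 1 and TN + FP = 1 hold by construction, since each rate is taken within one true class.

```python
# Sketch: per-class evaluation rates as defined above.
def rates(y_true, y_pred):
    """y_true / y_pred: sequences of 'malicious' or 'benign' labels."""
    mal = [p for t, p in zip(y_true, y_pred) if t == "malicious"]
    ben = [p for t, p in zip(y_true, y_pred) if t == "benign"]
    tp = sum(p == "malicious" for p in mal) / len(mal)
    tn = sum(p == "benign" for p in ben) / len(ben)
    return {"TP": tp, "FN": 1 - tp, "TN": tn, "FP": 1 - tn}

print(rates(["malicious", "malicious", "benign"],
            ["malicious", "benign", "benign"]))
# {'TP': 0.5, 'FN': 0.5, 'TN': 1.0, 'FP': 0.0}
```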
Experiment results
Accuracy of the detection for malicious code and benign code.

Detection results (%) for the training dataset:

  Approach             TP      FN      TN       FP      Average (TP+TN)
  Statistical method   98.00   2.00    56.20    43.80   77.1
  MoE method           79.00   21.00   100.00   0.00    89.5

Detection results (%) for the test dataset:

  Approach             TP      FN      TN      FP      Average (TP+TN)
  Statistical method   96.01   3.99    65.12   34.88   80.17
  MoE method           75.20   24.80   99.00   1.00    87.10
Discussion
• From an application point of view, a higher FN is more dangerous.
• A single method cannot achieve an optimal result.
• Theoretically, the detection rate cannot reach 100%, because the definition of malicious codes changes with different users.
Future work
• Identify the causes of the difference between the two proposed methods
• Use a larger sample data set (100,000) for experiments
• Find ways to improve the accuracy of proactive detection

Thanks to the ANTIY laboratory for providing the malicious sample codes.
Q&A
Thank you!