X1 Application: Computational Fluid Dynamics - MINDS

MINDS: Data Mining Based
Network Intrusion Detection System
Vipin Kumar
[email protected]
Army High Performance Computing Research Center
University of Minnesota
http://www.cs.umn.edu/research/minds/
Team Members: Eric Eilertson, Paul Dokas, Levent Ertoz, Ben Mayer,
Aleksandar Lazarevic, Michael Steinbach, George Simon,
Varun Chandola, Mark Shaneck, Jaideep Srivastava,
Zhi-Li Zhang, Yongdae Kim, Vipin Kumar
AHPCRC
Information Assurance

The sophistication and severity of cyber attacks are increasing
ARL, the Army, DoD and other U.S. Government agencies are major targets for sophisticated state-sponsored cyber terrorists
• Cyber strategies can be a major force multiplier and equalizer
• Across DoD, computer assets have been compromised and information has been stolen, putting technological advantage and battlefield superiority at risk
Security mechanisms always have inevitable vulnerabilities
• Firewalls are not sufficient to ensure security in computer networks
• Insider attacks
[Figure: Incidents Reported to Computer Emergency Response Team/Coordination Center, by year, 1990 to 2002; vertical axis 0 to 90000]
[Figure: Spread of SQL Slammer worm 10 minutes after its deployment]
Information Assurance
• Intrusion Detection System
– Combination of software and hardware that attempts to perform intrusion detection
– Raises the alarm when a possible intrusion happens
• Traditional intrusion detection system (IDS) tools are based on signatures of known attacks
• Limitations
– Signature database has to be manually revised for each new type of discovered intrusion
– Substantial latency in deployment of newly created signatures across the computer system
– They cannot detect emerging cyber threats
– Not suitable for detecting policy violations and insider abuse
– Do not provide understanding of network traffic
– Generate too many false alarms
Example of SNORT rule (MS-SQL “Slammer” worm), www.snort.org:
any -> udp port 1434 (content:"|81 F1 03 01 04 9B 81 F1 01|"; content:"sock"; content:"send")
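To illustrate why signature matching only catches what it has already seen, the rule above can be sketched as a plain predicate. This is a minimal sketch, not SNORT's matching engine: the `Packet` record and the function name are hypothetical, and only the port, byte pattern and keywords come from the rule on the slide.

```python
from dataclasses import dataclass

# Byte pattern and keywords copied from the rule shown on the slide.
SLAMMER_BYTES = bytes.fromhex("81F10301049B81F101")
SLAMMER_KEYWORDS = (b"sock", b"send")


@dataclass
class Packet:
    """Hypothetical minimal packet record (not SNORT's internal type)."""
    protocol: str
    dst_port: int
    payload: bytes


def matches_slammer_signature(pkt: Packet) -> bool:
    """Return True only when every condition of the rule holds."""
    if pkt.protocol != "udp" or pkt.dst_port != 1434:
        return False
    if SLAMMER_BYTES not in pkt.payload:
        return False
    return all(keyword in pkt.payload for keyword in SLAMMER_KEYWORDS)
```

A payload that differs from the stored byte pattern in a single byte no longer matches, which is the limitation the slide describes for new and variant attacks.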
Data Mining for Intrusion Detection
• Increased interest in data mining based intrusion detection
– Attacks for which it is difficult to build signatures
– Unforeseen/Unknown/Emerging attacks
• Misuse detection
– Building predictive models from labeled data sets (instances
are labeled as “normal” or “intrusive”) to identify known intrusions
– High accuracy in detecting many kinds of known attacks
– Cannot detect unknown and emerging attacks
• Anomaly detection
– Detect novel attacks as deviations from “normal” behavior
– Potential high false alarm rate - previously unseen (yet legitimate) system
behaviors may also be recognized as anomalies
Data Mining for Intrusion Detection

Training Set
Tid  SrcIP          Start time  Dest IP         Dest Port  Number of bytes  Attack
1    206.135.38.95  11:07:20    160.94.179.223  139        192              No
2    206.163.37.95  11:13:56    160.94.179.219  139        195              No
3    206.163.37.95  11:14:29    160.94.179.217  139        180              No
4    206.163.37.95  11:14:30    160.94.179.255  139        199              No
5    206.163.37.95  11:14:32    160.94.179.254  139        19               Yes
6    206.163.37.95  11:14:35    160.94.179.253  139        177              No
7    206.163.37.95  11:14:36    160.94.179.252  139        172              No
8    206.163.37.95  11:14:38    160.94.179.251  139        285              Yes
9    206.163.37.95  11:14:41    160.94.179.250  139        195              No
10   206.163.37.95  11:14:44    160.94.179.249  139        163              Yes

Test Set
Tid  SrcIP          Start time  Dest IP         Number of bytes  Attack
1    206.163.37.81  11:17:51    160.94.179.208  150              ?
2    206.163.37.99  11:18:10    160.94.179.235  208              ?
3    206.163.37.55  11:34:35    160.94.179.221  195              ?
4    206.163.37.37  11:41:37    160.94.179.253  199              ?
5    206.163.37.41  11:55:19    160.94.179.244  181              ?

Misuse Detection – Building Predictive Models (Learn Classifier, Model)
Anomaly Detection
Summarization of attacks using association rules
Rules Discovered:
{Src IP = 206.163.37.95, Dest Port = 139, Bytes ∈ [150, 200]} --> {ATTACK}

Key Technical Challenges
• Large data size
• High dimensionality
• Temporal nature of the data
• Skewed class distribution
• Data preprocessing
• On-line analysis
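As a small illustration of how a discovered rule of this form can be checked against a connection record, the sketch below tests the rule's left-hand side. The `Connection` record and function name are hypothetical; the source IP, port and byte range are the ones in the rule above, and the sample values in the usage lines are taken from training rows 2, 1 and 8.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical record holding only the fields the rule refers to."""
    src_ip: str
    dst_port: int
    n_bytes: int


def matches_rule(conn: Connection) -> bool:
    """Check {Src IP = 206.163.37.95, Dest Port = 139, Bytes in [150, 200]}."""
    return (
        conn.src_ip == "206.163.37.95"
        and conn.dst_port == 139
        and 150 <= conn.n_bytes <= 200
    )


row_2 = Connection("206.163.37.95", 139, 195)   # satisfies all three conditions
row_1 = Connection("206.135.38.95", 139, 192)   # different source IP
row_8 = Connection("206.163.37.95", 139, 285)   # byte count outside the range
print(matches_rule(row_2), matches_rule(row_1), matches_rule(row_8))
```

The predicate only says whether a connection falls under the summary; how the rule was mined and scored is not covered by this sketch.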
MINDS – Minnesota INtrusion Detection System
[Diagram: MINDS system. Components: network; data capturing device (tcpdump, net flow tools); filtering; feature extraction; known attack detection (detected known attacks); anomaly detection (anomaly scores); human analyst (labels); association pattern analysis (summary and characterization of attacks); detected novel attacks]
• Data mining based intrusion detection system
• Incorporated into Interrogator architecture at ARL Center for Intrusion Monitoring and Protection (CIMP)
• Helps analyze data from multiple sensors at DoD sites around the country
• MINDS anomalies are used as the primary key when viewing related alerts from other tools (SNORT, Jids, etc.)
• MINDS is the first effective anomaly intrusion detection system used by ARL
• Routinely detects attacks and intrusive behavior not detected by widely used intrusion detection systems
– Insider Abuse / Policy Violations / Worms / Scans
Feature Extraction Module
• Three groups of features
– Basic features of individual TCP connections
• source & destination IP (Features 1 & 2)
• source & destination port (Features 3 & 4)
• Protocol (Feature 5)
• Duration (Feature 6)
• Bytes per packet (Feature 7)
• number of bytes (Feature 8)
– Time based features
• For the same source (destination) IP address, number of unique destination
(source) IP addresses inside the network in last T seconds – Features 9 (13)
• Number of connections from source (destination) IP to the same destination
(source) port in last T seconds – Features 11 (15)
– Connection based features
• For the same source (destination) IP address, number of unique destination
(source) IP addresses inside the network in last N connections - Features 10 (14)
• Number of connections from source (destination) IP to the same destination
(source) port in last N connections - Features 12 (16)
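The time based features can be sketched with a sliding window per source address. This is a minimal sketch under stated assumptions: the `Flow` record and function name are hypothetical, flows are assumed to arrive sorted by start time, the current flow is counted in its own window, and the "inside the network" restriction is left out. It computes the source-side counts (Features 9 and 11); the destination-side and last-N-connections variants would follow the same pattern.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow record with only the fields used here."""
    time: float      # start time in seconds
    src_ip: str
    dst_ip: str
    dst_port: int


def time_window_features(flows, window_seconds):
    """Yield (flow, unique_dst_ips, same_port_count) for time-sorted flows."""
    recent = defaultdict(deque)   # src_ip -> flows seen in the last T seconds
    for flow in flows:
        history = recent[flow.src_ip]
        # Drop flows that have fallen out of the window.
        while history and flow.time - history[0].time > window_seconds:
            history.popleft()
        history.append(flow)
        unique_dst = len({f.dst_ip for f in history})
        same_port = sum(1 for f in history if f.dst_port == flow.dst_port)
        yield flow, unique_dst, same_port
```

A source that touches many distinct destinations on one port within the window gets high values for both counts, which is the behavior these features are meant to expose.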
Detection of Anomalies on Real Network Data
Anomalies/attacks picked by MINDS include scanning activities, worms, and non-standard behavior
such as policy violations and insider attacks. Many of the attacks detected by MINDS have
already been on the CERT/CC list of recent advisories and incident notes.
• Some illustrative examples of intrusive behavior detected using MINDS at U of M
• Scans
– Detected scanning for Microsoft DS service on port 445/TCP
• Undetected by SNORT since the scanning was non-sequential (very slow). Rule added to SNORT in September 2002
– Detected scanning for Oracle server
• Undetected by SNORT because the scanning was hidden within another Web scan
– Detected a distributed Windows networking scan from multiple source locations
• Policy Violations
– Identified machine running Microsoft PPTP VPN server on non-standard ports
• Undetected by SNORT since the collected GRE traffic was part of the normal traffic
– Identified compromised machines running FTP servers on non-standard ports, which is a policy
violation
• Example of anomalous behavior following a successful Trojan horse attack
– Detected computers on the network apparently communicating with outside computers over a
VPN or on IPv6
• Worms
– Detected several instances of the Slapper worm that were not identified by SNORT since they were
variations of existing worm code
– Detected unsolicited ICMP ECHOREPLY messages to a computer previously infected with the
Stacheldraht worm (a DDoS agent)
Typical Anomaly Detection Output
– January 26, 2003 (48 hours after the “slammer” worm)

Columns 9 to 16 are features 9 to 16; features 1 to 8 are 0 for every row.

score     srcIP          sPort  dstIP          dPort  protocol  flags  packets  bytes     9     10  11    12  13  14  15  16
37674.69  63.150.X.253   1161   128.101.X.29   1434   17        16     [0,2)    [0,1829)  0.81  0   0.59  0   0   0   0   0
26676.62  63.150.X.253   1161   160.94.X.134   1434   17        16     [0,2)    [0,1829)  0.81  0   0.59  0   0   0   0   0
24323.55  63.150.X.253   1161   128.101.X.185  1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
21169.49  63.150.X.253   1161   160.94.X.71    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
19525.31  63.150.X.253   1161   160.94.X.19    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
19235.39  63.150.X.253   1161   160.94.X.80    1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
17679.1   63.150.X.253   1161   160.94.X.220   1434   17        16     [0,2)    [0,1829)  0.81  0   0.58  0   0   0   0   0
8183.58   63.150.X.253   1161   128.101.X.108  1434   17        16     [0,2)    [0,1829)  0.82  0   0.58  0   0   0   0   0
7142.98   63.150.X.253   1161   128.101.X.223  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
5139.01   63.150.X.253   1161   128.101.X.142  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
4048.49   142.150.Y.101  0      128.101.X.127  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
4008.35   200.250.Z.20   27016  128.101.X.116  4629   17        16     [2,4)    [0,1829)  0     0   0     0   0   0   1   0
3657.23   202.175.Z.237  27016  128.101.X.116  4148   17        16     [2,4)    [0,1829)  0     0   0     0   0   0   1   0
3450.9    63.150.X.253   1161   128.101.X.62   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
3327.98   63.150.X.253   1161   160.94.X.223   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2796.13   63.150.X.253   1161   128.101.X.241  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2693.88   142.150.Y.101  0      128.101.X.168  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2683.05   63.150.X.253   1161   160.94.X.43    1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2444.16   142.150.Y.236  0      128.101.X.240  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2385.42   142.150.Y.101  0      128.101.X.45   2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
2114.41   63.150.X.253   1161   160.94.X.183   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
2057.15   142.150.Y.101  0      128.101.X.161  2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1919.54   142.150.Y.101  0      128.101.X.99   2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1634.38   142.150.Y.101  0      128.101.X.219  2048   1         16     [2,4)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1596.26   63.150.X.253   1161   128.101.X.160  1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1513.96   142.150.Y.107  0      128.101.X.2    2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1389.09   63.150.X.253   1161   128.101.X.30   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1315.88   63.150.X.253   1161   128.101.X.40   1434   17        16     [0,2)    [0,1829)  0.82  0   0.57  0   0   0   0   0
1279.75   142.150.Y.103  0      128.101.X.202  2048   1         16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1237.97   63.150.X.253   1161   160.94.X.32    1434   17        16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0
1180.82   63.150.X.253   1161   128.101.X.61   1434   17        16     [0,2)    [0,1829)  0.83  0   0.56  0   0   0   0   0

• Anomalous connections that correspond to the “slammer” worm
• Anomalous connections that correspond to the ping scan
• Connections corresponding to UM machines connecting to “half-life” game servers
Summarization Using Association Patterns
[Diagram components: Anomaly Detection System; ranked connections (attack, normal); Discriminating Association Pattern Generator; update; Knowledge Base]
1. Build normal profile
2. Study changes in normal behavior
3. Create attack summary
4. Detect misuse behavior
5. Understand nature of the attack
R1: TCP, DstPort=1863 → Attack
…
R100: TCP, DstPort=80 → Normal
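One way to picture a discriminating pattern generator is to count how often small attribute combinations occur in the connections ranked as attacks versus those ranked as normal, and keep the combinations that are frequent in the first set and rare in the second. The sketch below is an assumption-laden illustration, not the MINDS algorithm: the function names, the thresholds and the restriction to patterns of at most two attributes are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def itemsets(record, max_size=2):
    """Yield every combination of up to max_size (attribute, value) pairs."""
    items = sorted(record.items())
    for size in range(1, max_size + 1):
        yield from combinations(items, size)


def discriminating_patterns(attack, normal, min_support=0.5, min_ratio=2.0):
    """Return patterns frequent in `attack` and comparatively rare in `normal`."""
    attack_counts = Counter(p for r in attack for p in itemsets(r))
    normal_counts = Counter(p for r in normal for p in itemsets(r))
    patterns = []
    for pattern, count in attack_counts.items():
        attack_support = count / len(attack)
        normal_support = normal_counts[pattern] / len(normal)
        if attack_support < min_support:
            continue
        # Small constant avoids dividing by zero for patterns unseen in normal traffic.
        ratio = attack_support / (normal_support + 1e-9)
        if ratio >= min_ratio:
            patterns.append((pattern, attack_support, normal_support))
    return sorted(patterns, key=lambda p: (-p[1], p[2]))
```

With attack records on TCP port 1863 and normal records on TCP port 80, the port 1863 patterns are returned while "protocol is TCP" alone is dropped, since it is equally common in both sets.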
Typical MINDS Output
[Table omitted: ranked MINDS summaries with columns score, c1, c2, src IP, sPort, dst IP, dPort, protocol, flags, packets, bytes, and features 1 to 16]
• UM computer connecting to a remote FTP server, running on port 5002
• Summarized TCP reset packets received from 64.156.X.74, which is a victim of a DoS attack; we were observing backscatter, i.e. replies to spoofed packets
• Summarization of FTP scan from a computer in Columbia, 200.75.X.2
• Summary of IDENT lookups, where a remote computer tries to get the user name
• Summarization of a USENET server transferring a large amount of data
Typical MINDS Output
[Table omitted: ranked MINDS summaries with columns score, c1, c2, src IP, sPort, dst IP, dPort, prot, flags, packets, bytes, and features 1 to 16]
• UM computers doing bulk transfers
• Attack on Real-Media server (reported by CERT on September 9, 2003: RealNetworks media server RTSP protocol parser buffer overflow)
• 8200/tcp traffic related to gotomypc.com, which allows users to remotely control a desktop (involves a third party)
• Mysterious traffic currently being investigated
Typical MINDS Output
[Table omitted: ranked MINDS summaries with columns score, c1, c2, src IP, sPort, dst IP, dPort, protocol, flags, packets, bytes, and features 1 to 16]
• UMN computers doing bulk transfers
• 160.94.122.142 is running a rogue FTP server on 60000/TCP
• UMN computers doing large transfers via BitTorrent to many outside hosts
• This computer is scanning for computers on port 139/TCP. The majority of the packets are 192 bytes or 144 bytes, except for the second summary (score 88.2)
• UMN computer running a RealMedia server that was not known to the analyst
• Odd-looking P2P traffic to/from a UMN computer (potentially KaZaA or Gnutella)
• The remote computer was scanning for 57/TCP, where RESET packets are sent back from computers that do not have 57/TCP open
Scan Detection
• Despite the importance of scan detection, its value is often overlooked
– Lack of good tools for scan detection
• Existing methods either miss stealth scans or give too many false alarms
• Fast scans are easy to catch using existing schemes but stealth
scans are very difficult to recognize
• MINDS employs our new methodology for detecting network scans
– Makes use of powerful new heuristics
• Only considers flows with a small number of packets
• Only considers scans in a subnet (not the whole internet)
– Makes effective use of usage information
• Touches to rare IP / port combinations are more suspicious than others
• A scanner will hit machines where the service is not available resulting in a low count
• Very low False Alarm rate
– Evaluation of 36 million flows over a 30-minute window at the University of
Minnesota showed 2583 alarms but only 22 false alarms
– Evaluation on an hour of data at the ARL showed 1150 scans reported, but only 5
false alarms
• Routinely finds compromised machines at ARL-CIMP
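The heuristics listed above (small flows only, and extra suspicion for touches to rarely used IP / port combinations) can be sketched as follows. This is only an illustration under assumptions: the `Flow` record, the function name and all three thresholds are hypothetical, and `usage_counts` stands for whatever usage history is available, passed as a `collections.Counter` keyed by (destination IP, destination port).

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical flow record with only the fields used here."""
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int


def suspected_scanners(flows, usage_counts, max_packets=3,
                       rare_below=2, min_rare_touches=3):
    """Return sources that touch several rarely used IP/port combinations."""
    rare_touches = defaultdict(set)
    for flow in flows:
        if flow.packets > max_packets:
            continue                      # only consider flows with few packets
        target = (flow.dst_ip, flow.dst_port)
        if usage_counts[target] < rare_below:
            rare_touches[flow.src_ip].add(target)
    return {src for src, targets in rare_touches.items()
            if len(targets) >= min_rare_touches}
```

A client repeatedly contacting a busy web server is ignored because its target has a high usage count, while a source probing port 445 on several hosts that never offered that service is flagged.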
Detecting Suspicious Ports for Possible Worm Activity
• We find destinations located within the network for which
there is a high connection failure rate on specific ports for
inbound, non-scan connections
• Then we find ports on which there are many such destinations
• The existence of these ports indicates a potential worm or
slow scan
• This warrants targeted and more detailed data collection and
analysis that cannot be done easily on the entire data
– Packet content analysis
– Signature generation
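The two counting steps described above can be sketched directly. This is a minimal sketch, assuming the input has already been limited to inbound, non-scan connections; the function name, the tuple layout and both thresholds are hypothetical.

```python
from collections import defaultdict


def suspicious_ports(connections, min_failure_rate=0.8, min_destinations=3):
    """connections: iterable of (dst_ip, dst_port, failed) tuples.

    Returns {port: number of destinations with a high failure rate on it}.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for dst_ip, dst_port, failed in connections:
        totals[(dst_ip, dst_port)] += 1
        if failed:
            failures[(dst_ip, dst_port)] += 1

    # Step 1: destinations with a high connection failure rate on a port.
    failing_destinations = defaultdict(set)
    for (dst_ip, dst_port), total in totals.items():
        if failures[(dst_ip, dst_port)] / total >= min_failure_rate:
            failing_destinations[dst_port].add(dst_ip)

    # Step 2: ports on which there are many such destinations.
    return {port: len(ips) for port, ips in failing_destinations.items()
            if len(ips) >= min_destinations}
```

The returned ports are candidates for the targeted collection and packet content analysis mentioned in the last bullet, not confirmed worm activity.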
IP / port pairs for which a large percentage of connections failed
IP / port pairs for which a large percentage of
connections failed (only for ports with many hits)
[Figure: map of the IPv4 address space by first octet (0 to 255). Labeled blocks include 3 GE, 9 IBM, 12 ATT, 13 Xerox, 15 HP, 16 HP, 17 Apple, 18 MIT, 19 Ford, 20 CSC, 24 Cable, 32 ATT, 35 Merit Networks, 38 PSI, 40 Eli Lilly, 44 Am Rad Digi Com, 45 Interop Show Net, 47 Nortel, 48 Prudential, 52 DuPont, 53 Chrysler, 54 Merck, as well as Halliburton and AOL. Legend: APNIC (Asia), US Military, RIPE (Europe), USPS, IANA Reserved, Private Use, LACNIC (Lat. Am.), ARIN, Loopback, Japan Inet, UK Government, Public Data Network, SITA (French), Multicast]

999 unique sources (Min: 1, Max: 28, Avg: 1)
1126 unique destinations (Min: 1, Max: 55, Avg: 1)
1516 total flows involved
1472 scan flows on port 80 (found by scan detector)

7982 unique sources (Min: 1, Max: 16, Avg: 1)
6184 unique destinations (Min: 1, Max: 28, Avg: 1)
9930 total flows involved
9406 scan flows on port 445 (found by scan detector)
Clustering
• Useful for detecting modes of behavior
– Shared Nearest Neighbor (SNN) clustering works quite well at determining modes of behavior
• Not distracted by “noise” in the data
• SNN is CPU intensive, O(N^2)
• Requires storing an N x K matrix
– K (number of neighbors) is typically between 10 and 20
– K should be about the size of the smallest expected mode
• Clustered 850,000 connections collected over one hour at one US Army fort
• Took 10 hours using 3 quad 2.8 GHz servers and 4 2 GHz workstations (total of 16 CPUs)
• Required around 100 MB of memory per PE for the distance calculations
– 500 MB of memory for the final clustering step on a single PE
• Found 3135 clusters
– Largest clusters around 500 records, smallest cluster 10 records
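The O(N^2) cost and the N x K storage mentioned above both come from the nearest-neighbor lists that SNN is built on. The sketch below shows only that core step under stated assumptions: connections are represented as numeric tuples, distance is Euclidean, and similarity is counted only when two points appear in each other's lists (one common SNN variant; the slides do not say which variant MINDS uses). Function names are hypothetical.

```python
import math


def k_nearest_neighbors(points, k):
    """Brute-force k-NN lists: N lists of K indices, O(N^2) distance computations."""
    neighbors = []
    for i, p in enumerate(points):
        distances = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in distances[:k]})
    return neighbors


def snn_similarity(neighbors, i, j):
    """Number of neighbors shared by points i and j; 0 unless each lists the other."""
    if i not in neighbors[j] or j not in neighbors[i]:
        return 0
    return len(neighbors[i] & neighbors[j])
```

Two points in the same dense group share most of their neighbors and get a high similarity, while an isolated "noise" point shares few neighbors with anything, which is why the method is not distracted by noise.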
Detecting Large Modes of Network Traffic Using Clustering
• Large clusters of VPN traffic (hundreds of connections)
• Used between forts for secure sharing of data and working remotely

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Packets  Bytes
20040407.10:00:00.428036  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:00:00.685520  0:00:03   A       -1        B       -1        gre    237  1        556
20040407.10:00:00.748920  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:01:44.138057  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:01:59.267932  0:00:00   A       -1        B       -1        gre    237  1        96
20040407.10:02:44.937575  0:00:01   A       -1        B       -1        gre    237  1        556
20040407.10:04:00.717395  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:04:30.976627  0:00:01   A       -1        B       -1        gre    237  1        556
20040407.10:04:46.106233  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:05:46.715539  0:00:00   A       -1        B       -1        gre    237  1        556
20040407.10:06:16.975202  0:00:01   A       -1        B       -1        gre    237  1        556
20040407.10:06:32.105013  0:00:00   A       -1        B       -1        gre    237  1        556

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Packets  Bytes
20040407.10:00:40.685522  0:00:03   B       -1        A       -1        gre    237  1        96
20040407.10:00:58.748922  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:01:44.138059  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:02:14.678442  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:02:44.937577  0:00:01   B       -1        A       -1        gre    237  1        96
20040407.10:03:15.308206  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:04:30.976629  0:00:01   B       -1        A       -1        gre    237  1        96
20040407.10:06:16.975204  0:00:01   B       -1        A       -1        gre    237  1        96
20040407.10:06:32.105015  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:06:47.234837  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:07:02.367471  0:00:00   B       -1        A       -1        gre    237  1        96
20040407.10:07:17.494574  0:00:00   B       -1        A       -1        gre    237  1        96
Detecting Unusual Modes of Network Traffic Using Clustering
• Clusters involving GoToMyPC.com (Army data)
• Policy violation, allows remote control of a desktop

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:10.428036  0:00:00   A       4125      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:00:40.685520  0:00:03   A       4127      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:00:58.748920  0:00:00   A       4138      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:01:44.138057  0:00:00   A       4141      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:01:59.267932  0:00:00   A       4143      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:02:44.937575  0:00:01   A       4149      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:00.717395  0:00:00   A       4163      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:30.976627  0:00:01   A       4172      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:04:46.106233  0:00:00   A       4173      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:05:46.715539  0:00:00   A       4178      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:06:16.975202  0:00:01   A       4180      B       8200      tcp    123  ***AP*SF  5        248
20040407.10:06:32.105013  0:00:00   A       4181      B       8200      tcp    123  ***AP*SF  5        248

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:40.685522  0:00:03   B       8200      A       4127      tcp    123  ***AP*SF  4        211
20040407.10:00:58.748922  0:00:00   B       8200      A       4138      tcp    123  ***AP*SF  4        211
20040407.10:01:44.138059  0:00:00   B       8200      A       4141      tcp    123  ***AP*SF  4        211
20040407.10:02:14.678442  0:00:00   B       8200      A       4145      tcp    123  ***AP*SF  4        211
20040407.10:02:44.937577  0:00:01   B       8200      A       4149      tcp    123  ***AP*SF  4        211
20040407.10:03:15.308206  0:00:00   B       8200      A       4153      tcp    123  ***AP*SF  4        211
20040407.10:04:30.976629  0:00:01   B       8200      A       4172      tcp    123  ***AP*SF  4        211
20040407.10:06:16.975204  0:00:01   B       8200      A       4180      tcp    123  ***AP*SF  4        211
20040407.10:06:32.105015  0:00:00   B       8200      A       4181      tcp    123  ***AP*SF  4        211
20040407.10:06:47.234837  0:00:00   B       8200      A       4182      tcp    123  ***AP*SF  4        211
20040407.10:07:02.367471  0:00:00   B       8200      A       4183      tcp    123  ***AP*SF  4        211
20040407.10:07:17.494574  0:00:00   B       8200      A       4184      tcp    123  ***AP*SF  4        211
Detecting Unusual Modes of Network Traffic Using Clustering
• Clusters involving mysterious ping and SNMP traffic

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  ICMP Type  ICMP Code  # Packets  # Bytes
20040407.10:01:00.181261  0:00:00   A       1176      B       161       udp    123                        1          95
20040407.10:01:23.183183  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:02:54.182861  0:00:00   A       1514      B       161       udp    123                        1          95
20040407.10:03:03.196850  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:04:45.179841  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:06:27.180037  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:09:48.420365  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:11:04.420353  0:00:00   A       3013      B       161       udp    123                        1          95
20040407.10:11:30.420766  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:12:47.421054  0:00:00   A       3329      B       161       udp    123                        1          95
20040407.10:13:12.423653  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84
20040407.10:14:53.420635  0:00:00   A       -1        B       -1        icmp   123  8          0          1          84

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  ICMP Type  ICMP Code  # Packets  # Bytes
20040407.10:01:00.181488  0:00:00   B       161       A       1176      udp    63                         1          103
20040407.10:01:23.183291  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:01:55.180590  0:00:00   B       161       A       1326      udp    63                         1          234
20040407.10:02:54.184537  0:00:00   B       161       A       1514      udp    63                         1          134
20040407.10:03:03.196958  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:04:45.179965  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:05:09.180542  0:00:00   B       161       A       1927      udp    63                         1          234
20040407.10:06:27.180159  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:09:48.420410  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:11:30.420773  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:13:12.423663  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
20040407.10:14:53.421019  0:00:00   B       -1        A       -1        icmp   254  0          0          1          84
Detecting Unusual Modes of Network Traffic Using Clustering
• Clusters involving unusual repeated ftp sessions
• Further investigations revealed a misconfigured Army computer was trying to contact Microsoft

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:10:57.097108  0:00:00   A       3004      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:27.113230  0:00:00   A       3007      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:37.111176  0:00:00   A       3008      B       21        tcp    123  ***AP*SF  7        318
20040407.10:11:57.118231  0:00:00   A       3011      B       21        tcp    123  ***AP*SF  7        318
20040407.10:12:17.125220  0:00:00   A       3013      B       21        tcp    123  ***AP*SF  7        318
20040407.10:12:37.132428  0:00:00   A       3015      B       21        tcp    123  ***AP*SF  7        318
20040407.10:13:17.146391  0:00:00   A       3020      B       21        tcp    123  ***AP*SF  7        318
20040407.10:13:37.153713  0:00:00   A       3022      B       21        tcp    123  ***AP*SF  7        318
20040407.10:14:47.178228  0:00:00   A       3031      B       21        tcp    123  ***AP*SF  7        318
20040407.10:15:47.199100  0:00:00   A       3040      B       21        tcp    123  ***AP*SF  7        318
20040407.10:16:07.206450  0:00:00   A       3042      B       21        tcp    123  ***AP*SF  7        318

Start Time                Duration  Src IP  Src Port  Dst IP  Dst Port  Proto  TTL  Flags     Packets  Bytes
20040407.10:00:06.627895  0:00:01   B       21        A       2924      tcp    123  ***AP*SF  7        449
20040407.10:00:16.633872  0:00:01   B       21        A       2925      tcp    123  ***AP*SF  7        449
20040407.10:00:36.638794  0:00:01   B       21        A       2927      tcp    123  ***AP*SF  7        449
20040407.10:01:16.652664  0:00:01   B       21        A       2932      tcp    123  ***AP*SF  7        449
20040407.10:01:26.659694  0:00:01   B       21        A       2933      tcp    123  ***AP*SF  7        449
20040407.10:01:56.666816  0:00:01   B       21        A       2937      tcp    123  ***AP*SF  7        449
20040407.10:02:06.670680  0:00:01   B       21        A       2938      tcp    123  ***AP*SF  7        449
20040407.10:02:56.687932  0:00:01   B       21        A       2944      tcp    123  ***AP*SF  7        449
20040407.10:03:26.698413  0:00:01   B       21        A       2947      tcp    123  ***AP*SF  7        449
20040407.10:04:06.712495  0:00:01   B       21        A       2952      tcp    123  ***AP*SF  7        449
20040407.10:05:06.733731  0:00:01   B       21        A       2961      tcp    123  ***AP*SF  7        449
20040407.10:06:16.758442  0:00:01   B       21        A       2969      tcp    123  ***AP*SF  7        449
MINDS: CRITICAL TO COMPLETE FUNCTIONALITY
[Diagram labels. Analysis approaches: Header Analysis; Packet-Based Signature Detection; Session-Based Signature Detection; Behavior Analysis (MINDS). Activity detected: Simple Scans; Scans with Automatic Virus Attacks; Viruses and Worms; Scans with Target Responses; Compromises; New and Variant Attacks; Anomaly Detection and New Attacks]
Army Research Laboratory (ARL), supported by the AHPCRC and the MINDS initiative, successfully monitors and analyzes network data to protect ARL and its Army and DoD customer infospace
Current MINDS Research and Development Work
• Correlation of suspicious events across network sites
– Helps detect sophisticated attacks not identifiable by single site
analyses
– Scalable anomaly detection
– Distributed correlation algorithms
– Grids & middleware
• Analysis of long term data (months/years)
– Uncover suspicious stealth activities (e.g. insiders
leaking/modifying information)