Non-negative Matrix Factorization F

IBM Research
Non-Negative Residual Matrix Factorization
w/ Application to Graph Anomaly Detection
Hanghang Tong and Ching-Yung Lin
SIAM-DM 2011, Mesa AZ, USA, April 28-30, 2011
© 2011 IBM Corporation
Large Graphs are Everywhere!
Q: How to find patterns? e.g., community, anomaly, etc.
[Figure: example graphs. Internet Map [Koren 2009]; Food Web [2007]; Terrorist Network [Krebs 2002]; Social Network [Newman 2005]; Protein Network [Salthe 2004]; Web Graph]
Matrix Tool for Finding Graph Patterns
• A Typical Procedure:
Graph → Adj. Matrix A → A = F x G + R
– Low-rank matrices (F, G): community
– Residual matrix (R): anomalies
An Illustrative Example
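The decomposition A = F x G + R can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: it uses a truncated SVD as a stand-in for the low-rank step (the slides do not say which factorization the illustrative example uses), and the function name `low_rank_plus_residual` is hypothetical. With this stand-in, F, G, and R may all contain negative entries.

```python
import numpy as np

def low_rank_plus_residual(A, rank):
    """Split A into a rank-`rank` part F @ G and a residual R = A - F @ G.

    Assumption: truncated SVD is used for the low-rank part; this is a
    generic choice, not the method proposed in the talk.
    """
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    F = U[:, :rank] * s[:rank]   # scale the leading left singular vectors
    G = Vt[:rank, :]             # leading right singular vectors
    R = A - F @ G                # whatever the low-rank part does not explain
    return F, G, R

# Toy graph: two 3-node cliques plus one bridging edge (2, 3).
A = np.zeros((6, 6))
for block in (range(0, 3), range(3, 6)):
    for i in block:
        for j in block:
            if i != j:
                A[i, j] = 1.0
A[2, 3] = A[3, 2] = 1.0

F, G, R = low_rank_plus_residual(A, rank=2)
```

Entries of R with large magnitude are the candidates one would inspect as anomalies; their signs are unconstrained in this sketch.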
Improve Interpretation by Non-negativity
• A Typical Procedure:
Graph → Adjacency Matrix A → A = F x G + R (F x G: community; R: anomalies)
• An Example
– Non-negative Matrix Factorization: F >= 0; G >= 0 (for community detection)
– Non-negative Residual Matrix Factorization (This Paper): R(i,j) >= 0 for A(i,j) > 0 (for anomaly detection)
Anomaly Detection on Graphs
• Social Networks
– 'Popularity contest'
• Computer Networks
– Spammer, Port Scanner, Vulnerable Machines, etc.
• Financial Transaction Networks
– Fraudulent transaction (e.g., money-laundering ring), scammer
• Criminal Networks
– New criminal trend
• Tele-communication Networks
– Tele-marketer
Key Observation: Abnormal Behavior → Actual Activities
Optimization Formulation
• General Case
– Objective: Weighted Frobenius Form, with a Weight on each entry (Common in Any Matrix Factorization)
– Constraint: Non-negative residual (Unique in This Paper)
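The equation on this slide did not survive extraction. A plausible reconstruction from the labels above and from the constraint stated earlier (R(i,j) >= 0 for A(i,j) > 0) is given below; the symbol W for the weight matrix and the row/column indexing F(i,:), G(:,j) are assumptions, not taken from the slide.

```latex
\begin{aligned}
\min_{F,\,G}\quad & \sum_{i,j} W(i,j)\,\bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^{2}
  && \text{(weighted Frobenius form)} \\
\text{s.t.}\quad & R(i,j) = A(i,j) - F(i,:)\,G(:,j) \;\ge\; 0
  \quad \text{for all } (i,j) \text{ with } A(i,j) > 0
  && \text{(non-negative residual)}
\end{aligned}
```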
Optimization Formulation
• 0/1 Weight Matrix (Major Focus of the Paper)
– Objective: 0/1 weight (Common in Any Matrix Factorization)
– Constraint: Non-negative residual (Unique in This Paper)
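The equation for this special case is also missing from the extracted text. One reading, assuming the 0/1 weight is 1 exactly on the observed edges (A(i,j) > 0) and 0 elsewhere, is that the objective sums over edges only. That assumption is not stated on the slide, although it is consistent with the later claim of complexity linear in the number of edges.

```latex
\begin{aligned}
\min_{F,\,G}\quad & \sum_{(i,j)\,:\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^{2} \\
\text{s.t.}\quad & F(i,:)\,G(:,j) \;\le\; A(i,j)
  \quad \text{for all } (i,j) \text{ with } A(i,j) > 0
\end{aligned}
```

The constraint is the non-negative residual condition R(i,j) >= 0 rewritten in terms of F and G.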
Optimization Formulation with 0/1 Weight Matrix
• NrMF with 0/1 Weight Matrix
• Q: How to find 'optimal' F and G?
– D1: Quality → C1: non-convexity of optimization objective
– D2: Scalability → C2: large size of the graph
Optimization Method: Batch Mode
• Basic Idea 1: Alternating
– Not convex wrt F and G, jointly
– But convex if fixing either F or G
• Basic Idea 2: Separation
– The argmin over G, s.t. the constraints, separates: for each j, solve a subproblem s.t. its own constraints
– Each subproblem is a Standard Quadratic Programming Prob.
Overall Complexity: Polynomial → Can we do better?
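The separated subproblem was shown as an image and is not recoverable verbatim. Under the assumption of the 0/1 weight on edges, and with F held fixed, a hedged reconstruction of the per-column problem is the following (the notation G(:,j) for column j of G is an assumption):

```latex
\begin{aligned}
\text{for each } j:\quad
\min_{G(:,j)}\quad & \sum_{i\,:\,A(i,j) > 0} \bigl(A(i,j) - F(i,:)\,G(:,j)\bigr)^{2} \\
\text{s.t.}\quad & F(i,:)\,G(:,j) \;\le\; A(i,j)
  \quad \text{for all } i \text{ with } A(i,j) > 0
\end{aligned}
```

The objective is a convex quadratic in G(:,j) and the constraints are linear, which is why each subproblem is a standard quadratic program. The update for F with G fixed is symmetric, with one subproblem per row of F.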
Optimization Method: Incremental Mode
• Basic Idea 1: Recursive
• Basic Idea 2: Alternating
• Basic Idea 3: Separation
Procedure: Adjacency Matrix A → Initialize: R = A → Do r times: [Rank-1 Approximation → Update Residual Matrix R] → Output Final Residual Matrix
Rank-1 Approximation: QP for a single variable w/ boundary constraints; can be solved in constant time
Overall Complexity: Linear wrt # of edges
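The procedure above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the 0/1 weight is placed on the edges (A(i,j) > 0), uses dense NumPy arrays for readability (so it does not achieve the linear complexity claimed on the slide), and the names `clipped_update`, `nrmf_incremental`, and `n_alt` are hypothetical. Each single-variable QP is solved by taking the unconstrained least-squares optimum and clipping it to the interval [low, up] implied by the non-negative residual constraints.

```python
import numpy as np

def clipped_update(R, mask, g):
    """For each row i, solve the single-variable QP
        min_q  sum_j mask[i, j] * (R[i, j] - q * g[j])**2
        s.t.   q * g[j] <= R[i, j]  wherever mask[i, j] is True,
    assuming R is non-negative on the masked entries."""
    f = np.zeros(R.shape[0])
    for i in range(R.shape[0]):
        cols = np.flatnonzero(mask[i])
        if cols.size == 0:
            continue
        gj, rj = g[cols], R[i, cols]
        denom = np.dot(gj, gj)
        if denom == 0.0:
            continue                      # q = 0 is feasible and optimal
        q = np.dot(rj, gj) / denom        # unconstrained optimum
        pos, neg = gj > 0, gj < 0
        up = np.min(rj[pos] / gj[pos]) if pos.any() else np.inf
        low = np.max(rj[neg] / gj[neg]) if neg.any() else -np.inf
        f[i] = min(max(q, low), up)       # clip to the boundary constraints
    return f

def nrmf_incremental(A, rank, n_alt=10):
    """Recursive rank-1 fitting; the residual stays >= 0 on the edges of A."""
    A = np.asarray(A, dtype=float)
    mask = A > 0                          # assumed 0/1 weight: edges only
    R = A.copy()                          # Initialize: R = A
    F = np.zeros((A.shape[0], rank))
    G = np.zeros((rank, A.shape[1]))
    for k in range(rank):                 # Do r times
        f = np.zeros(A.shape[0])
        g = np.ones(A.shape[1])
        for _ in range(n_alt):            # alternate between f and g
            f = clipped_update(R, mask, g)
            g = clipped_update(R.T, mask.T, f)
        F[:, k], G[k, :] = f, g
        R = R - np.outer(f, g)            # Update Residual Matrix R
    return F, G, R

A = np.array([[0, 3, 1, 0],
              [3, 0, 2, 0],
              [1, 2, 0, 5],
              [0, 0, 5, 0]], dtype=float)
F, G, R = nrmf_incremental(A, rank=2)
```

In this sketch non-negativity is only enforced where A(i,j) > 0; residual entries on non-edges are unconstrained and can be negative.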
Experimental Evaluation
[Figure: Effectiveness. Accuracy vs. Anomaly Type]
[Figure: Efficiency. Wall-clock Time vs. # of edges]
Batch Method vs. Incremental Method
[Figure: Log Wall-clock time (sec.) vs. Data Set, for Batch Method and Incremental Method]
Conclusion
• Problem Formulation: Non-negative Residual Matrix Factorization
– a new matrix factorization for interpretable graph anomaly detection
• Optimization Methods
– Batch: straightforward, polynomial time complexity
– Incremental: linear time complexity
• Future Work
– Other interpretable properties (sparseness) for anomaly detection
– Matrix Factorization w/ Total Non-negativity
Thank you!
[email protected]
(We are hiring at IBM Research!)
Visual Comparison
[Figure: labels low, q, up]