An Introduction to Graph Theory for Security

#RSAC
SESSION ID: AIR-W10F
An Introduction to Graph Theory for
Security People Who Can’t Math Good
Andrew Hay
CISO
DataGravity, Inc.
@andrewsmhay
Session Overview
A gentle introduction to graph theory
Graphs in everyday life
Freely available tools
The application of graphs in a security context
Summary & Application
A Gentle Introduction to Graph
Theory
If You’re Anything Like Me…
You completely zone out when you see something like this
source: http://article.sapub.org/10.5923.j.am.20150505.01.html
What Is A Graph?
[Figure: a chart with series A, B, and C on an axis from 0 to 5]
A Graph Is…
A graph is a collection of
vertices (i.e. nodes, dots), where a vertex is an entity which represents some object (e.g. a person, a place, etc.)
edges (i.e. relationships, lines), where an edge represents the relationship between two vertices
source: http://tinkerpop.apache.org/
A Graph Is (continued)…
source: http://tinkerpop.apache.org/
Diagram above shows a graph with two vertices
One with a unique identifier of 1
Another with a unique identifier of 3
There is an edge connecting the two with a unique identifier of 9
It is important to consider that the edge has a direction which goes
out from vertex 1 and in to vertex 3
A Graph Is (continued)…
To give some meaning to this basic structure, vertices and edges can
each be given labels to categorize them
You can now see that vertex 1 is a person vertex and vertex 3 is a
software vertex
source: http://tinkerpop.apache.org/
A Graph Is (continued)…
They are joined by a created edge which allows you to see that a
person created software
The label and the id are reserved attributes of vertices and edges, but
you can add your own arbitrary properties as well
source: http://tinkerpop.apache.org/
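The structure described on these slides can be sketched in a few lines of plain Python. This is only an illustration, not TinkerPop's API: the class names, the field names (out_v, in_v) and the property values are assumptions. Only the ids (1, 3, 9) and the labels (person, software, created) come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: int       # reserved attribute: unique identifier
    label: str    # reserved attribute: category of the vertex
    properties: dict = field(default_factory=dict)  # arbitrary extras


@dataclass
class Edge:
    id: int
    label: str
    out_v: int    # id of the vertex the edge goes out from
    in_v: int     # id of the vertex the edge goes in to
    properties: dict = field(default_factory=dict)


# Ids and labels follow the slides; the property values are made up.
vertices = {
    1: Vertex(1, "person", {"name": "alice"}),
    3: Vertex(3, "software", {"name": "example-tool"}),
}
edges = [Edge(9, "created", out_v=1, in_v=3, properties={"year": 2017})]


def describe(edge: Edge) -> str:
    """Render an edge as 'out-label -[edge-label]-> in-label'."""
    out_label = vertices[edge.out_v].label
    in_label = vertices[edge.in_v].label
    return f"{out_label} -[{edge.label}]-> {in_label}"


print(describe(edges[0]))  # person -[created]-> software
```

Reading the edge from out_v to in_v is what gives "a person created software" its direction.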
So What Is A Graph?
[Figure: the chart with series A, B, and C on an axis from 0 to 5]
So What Is A Graph?
[Figure: a chart on an axis from 0 to 5, captioned with the words Chart, Graph, and Plot]
A Little More Advanced Graph Theory
You’ll often hear the words network and
graph used interchangeably…and there is
nothing wrong with that
If the edges in a network are directed (i.e.
pointing in only one direction) the network is
called a directed network or a directed graph,
sometimes digraph for short
When drawing a directed network, the edges
are typically drawn as arrows indicating the
direction
source: http://mathinsight.org/definition/network
A Little More Advanced Graph Theory
If all edges are bidirectional, or undirected, the network is an
undirected network (or undirected graph)
source: http://mathinsight.org/definition/network
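The difference between the two kinds of network is easy to see in code. The sketch below is a minimal illustration with hypothetical node names (A, B, C) and a made-up helper, build_adjacency. It stores the same edge list once as a directed graph and once as an undirected graph.

```python
from collections import defaultdict


def build_adjacency(edge_list, directed):
    """Return {node: set(neighbors)} for a list of (source, target) pairs."""
    adjacency = defaultdict(set)
    for source, target in edge_list:
        adjacency[source].add(target)
        adjacency[target]  # touch the key so every node appears in the result
        if not directed:
            # An undirected edge can be followed in both directions.
            adjacency[target].add(source)
    return dict(adjacency)


edge_list = [("A", "B"), ("B", "C")]  # hypothetical node names

digraph = build_adjacency(edge_list, directed=True)
graph = build_adjacency(edge_list, directed=False)

print(sorted(digraph["B"]))  # ['C']
print(sorted(graph["B"]))    # ['A', 'C']
```

In the directed version B only points at C. In the undirected version B is connected to both A and C.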
A Little More Advanced Graph Theory
Variations
A small undirected network where the nodes and
edges have different types, as indicated by their
colors and line styles
source: http://mathinsight.org/image/small_undirected_node_edge_types_network
A small directed network where the edges and
nodes have different weights, as indicated by their
sizes
source: http://mathinsight.org/image/small_directed_weighted_nodes_edges_network
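Types and weights are just extra attributes on the nodes and edges. Here is a small sketch with invented data: the host names, edge types, and weights are all placeholders, and the two helper functions are hypothetical.

```python
# Node attributes: a type (which a tool might draw as a color)
# and a weight (which a tool might draw as a size).
nodes = {
    "laptop-01": {"type": "host", "weight": 1},
    "fileserver": {"type": "host", "weight": 5},
    "stu": {"type": "user", "weight": 2},
}

# Edge attributes, keyed by (source, target).
edges = {
    ("stu", "laptop-01"): {"type": "logs_into", "weight": 40},
    ("laptop-01", "fileserver"): {"type": "connects_to", "weight": 7},
    ("stu", "fileserver"): {"type": "logs_into", "weight": 1},
}


def edges_of_type(edge_type):
    """List the (source, target) pairs whose edge has the given type."""
    return sorted(pair for pair, attrs in edges.items() if attrs["type"] == edge_type)


def heaviest_edge():
    """Return the (source, target) pair with the largest weight."""
    return max(edges, key=lambda pair: edges[pair]["weight"])


print(edges_of_type("logs_into"))  # [('stu', 'fileserver'), ('stu', 'laptop-01')]
print(heaviest_edge())             # ('stu', 'laptop-01')
```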
Graphs In Everyday Life
Graphs in Everyday Life: Internet
Everyone has seen a visual representation of
the Internet
Often, colors indicate operator of network,
country, etc.
Structure determined by sending a storm of
IP packets out randomly across the network
Each packet is programmed to self-destruct
after a delay, and when this happens, the
packet failure notice reports back the path
the packet took before it died
source: http://mathinsight.org/image/internet_map_jurvetson_2004
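Once the paths have been collected, turning them into a graph is the easy part. The sketch below does no probing at all. It assumes the hop-by-hop paths have already been reassembled from the failure notices, and it uses placeholder addresses from the RFC 5737 documentation ranges. The function name edges_from_paths is hypothetical.

```python
def edges_from_paths(paths):
    """Turn hop-by-hop paths into a set of undirected links between routers."""
    links = set()
    for path in paths:
        # Each consecutive pair of hops is one link in the map.
        for hop_a, hop_b in zip(path, path[1:]):
            links.add(frozenset((hop_a, hop_b)))
    return links


# Hypothetical paths; the first two hops are shared, the last hop differs.
paths = [
    ["192.0.2.1", "198.51.100.1", "203.0.113.7"],
    ["192.0.2.1", "198.51.100.1", "203.0.113.9"],
]

links = edges_from_paths(paths)
print(len(links))  # 3
```

The shared first link is stored only once, which is why many overlapping paths collapse into one map.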
Graphs in Everyday Life: TSP
Travelling salesman problem (TSP)
“Given a list of cities and the distances between each pair of cities, what is the
shortest possible route that visits each city exactly once and returns to the
origin city?”
source: https://www.mathworks.com/help/optim/examples/travelling-salesman-problem.html
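For a handful of cities the question can be answered by simply trying every route. The sketch below is a brute-force illustration, not a practical solver: the four cities and their distances are made up, and the number of routes grows factorially as cities are added.

```python
from itertools import permutations

# Hypothetical symmetric distances between four cities.
distances = {
    frozenset(("A", "B")): 10,
    frozenset(("A", "C")): 15,
    frozenset(("A", "D")): 20,
    frozenset(("B", "C")): 35,
    frozenset(("B", "D")): 25,
    frozenset(("C", "D")): 30,
}


def route_length(route):
    """Sum the legs of a closed route (the last city links back to the first)."""
    legs = zip(route, route[1:] + route[:1])
    return sum(distances[frozenset(leg)] for leg in legs)


def shortest_route(cities, origin):
    """Try every ordering of the other cities and keep the shortest tour."""
    others = [city for city in cities if city != origin]
    candidates = ((origin,) + perm for perm in permutations(others))
    return min(candidates, key=route_length)


best = shortest_route(["A", "B", "C", "D"], origin="A")
print(best, route_length(best))  # ('A', 'B', 'D', 'C') 80
```

The cities are vertices, the distances are edge weights, and the answer is the cheapest cycle through all of them.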
Graphs in Everyday Life: More Examples…
Mapping
Google maps, self-driving cars, etc.
“Hey, Siri, how do I get to 1 Main Street?”
Perception/Attitude Analysis
What hashtags are trending right now?
Which Presidential candidate is being talked
about most on which social media platform?
And, of course, security!
source: https://datasemantics.files.wordpress.com/2013/12/graph3.png
Freely Available Tools
Including clients, databases, and programming modules
Tools: Google Fusion Tables
support.google.com/fusiontables/answer/2566732?hl=en – Network Graph
Basic network mapping tool
Some useful filter functionality
Lacks deep customization options and analysis
functionality
Can produce insightful visualizations
developers.google.com/fusiontables
Create, update, and delete tables and table data
Issue SQL-like queries
Tools: Graphviz
www.graphviz.org
Open source graph visualization software
The Graphviz layout programs take
descriptions of graphs in a simple text
language, and make diagrams in useful
formats: images, SVG, PDF, PostScript, or
an interactive graph browser
Many useful features for diagrams: options
for colors, fonts, tabular node layouts, line
styles, hyperlinks, and custom shapes
source: http://www.graphviz.org/content/profile
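The "simple text language" is called DOT, and it is easy to generate from a script. The sketch below is a minimal example, assuming edges arrive as (source, label, target) triples. The helper name to_dot is hypothetical, and it does not escape quotes inside names.

```python
def to_dot(edges, name="G"):
    """Render (source, label, target) triples as a DOT directed graph."""
    lines = [f"digraph {name} {{"]
    for source, label, target in edges:
        lines.append(f'    "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot([("person", "created", "software")])
print(dot_text)
```

Saving the output as graph.dot and running `dot -Tsvg graph.dot -o graph.svg` should produce a diagram, provided Graphviz is installed.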
Tools: Visual Investigative Scenarios (VIS)
vis.occrp.org
Designed to assist investigative journalists,
activists and others in mapping complex
business or crime networks
Helps investigators understand and explain
corruption, organized crime and other
wrongdoing, and translate complex
narratives into a simple, universal visual
language
Customizable, dynamic HTML5 visualization
templates
Illustrate entities, networks and complex
configurations of data
Tools: Gephi
gephi.org
source: https://gephi.org/screenshots/
Desktop tool for performing powerful
network analysis and creating network
visualizations
Described as being like Photoshop™ but for
graph data
The user interacts with the representation and
manipulates the structures, shapes and
colors to reveal hidden patterns
Designed to help data analysts make
hypotheses, intuitively discover patterns, and
isolate structure singularities or faults
during data sourcing
Tools: OpenGraphiti
www.opengraphiti.com
OpenGraphiti is a free and open
source 3D data visualization engine
created by Thibault Reuille of
OpenDNS
Designed for data scientists to
visualize semantic networks and to
work with them
It offers an easy-to-use API with several associated libraries to create custom-made datasets
Tools: Maltego
www.paterva.com/web7/buy/maltego-clients/maltego-ce.php
Maltego CE is the community edition
Available for free for everyone after a
quick registration
Interactive data mining tool
Renders directed graphs for link analysis
Used in online investigations for finding
relationships between pieces of
information from various sources
located on the Internet
source: www.paterva.com
Tools: Maltego (continued…)
www.paterva.com/web7/buy/maltego-clients/casefile.php
CaseFile is Paterva's answer to the offline
intelligence problem
Allows analysts to examine links between
offline data
Same graphing application as Maltego without
the ability to run transforms
CaseFile gives you the ability to quickly add, link
and analyze data
source: www.paterva.com
Graph Databases: neo4j
neo4j.com
Graph database management system developed by
Neo Technology, Inc
ACID-compliant transactional database with native
graph storage and processing
Implemented in Java
Accessible from software written in other languages
using the Cypher Query Language
Exposes a transactional HTTP endpoint
source: https://neo4j.com/
Graph Databases: OrientDB
orientdb.com
Open source NoSQL database management system
Written in Java
Multi-model database, supporting graph,
document, key/value, and object models
Relationships are managed as in graph databases
with direct connections between records
Supports schema-less, schema-full, and schema-mixed modes
source: http://orientdb.com/orientdb/
Graph Databases: Titan
titan.thinkaurelius.com
Scalable graph database optimized for storing and querying graphs
  Containing hundreds of billions of vertices and edges
  Distributed across a multi-machine cluster
Support for various storage backends
Support for global graph data analytics, reporting, and ETL through
integration with big data platforms
Native integration with the TinkerPop graph stack
source: http://titan.thinkaurelius.com/
Graph Stack: Apache TinkerPop
tinkerpop.apache.org
Open source Graph Computing Framework
Goal is to make it easy for developers to create
graph applications by providing APIs and tools that
simplify their endeavors
Abstraction layer over different graph databases
and different graph processors
As an abstraction layer, TinkerPop provides a way to
avoid vendor lock-in to a specific database or
processor
source: https://tinkerpop.apache.org/
Development Modules
NetworkX
networkx.github.io
Package for the creation, manipulation,
and study of the structure, dynamics,
and functions of complex networks
Graph-tool
graph-tool.skewed.de
Manipulation and statistical analysis of
graphs
SNAP for Python
snap.stanford.edu/snappy/
General purpose, high performance
system for analysis and manipulation of
large networks
Written in C++ and optimized for
maximum performance and compact
graph representation
Scales to massive networks with
hundreds of millions of nodes, and
billions of edges
Development Modules
semanticnet
github.com/ThibaultReuille/semanticnet
Small Python library to create semantic
graphs in JSON
Datasets can then be visualized with
OpenGraphiti
Plotly for Python
plot.ly/ipython-notebooks/networkgraphs
Store position as node attribute data
Add, change, delete nodes, node color,
connections, etc.
Development Modules
vis.js
visjs.org
Designed to be easy to use, to handle large
amounts of dynamic data, and to enable
manipulation of and interaction with the
data
sigmajs
sigmajs.org
Allows developers to integrate network
exploration in rich Web applications
JSNetworkX
jsnetworkx.org
JavaScript port of the NetworkX graph
library
Cytoscape.js
js.cytoscape.org
Fully featured graph library written in
pure JS
Designed for users first, for both front-facing
app and developer use cases
The Application Of Graphs In A
Security Context
Scenario: Incident Response
The Application Of Graphs In A Security Context
Scenario: Incident Response
“We had a data breach, what was taken, and who was involved?”
[Figure: graph with nodes Stu, Mary, Rahim, CC, and SSN]
Scenario: Incident Response
“We had a data breach, what was taken, and who was involved?”
[Figure: graph with nodes Stu, SSN, CC, and HTTP Proxy, and edges labeled upload and download]
Scenario: Incident Response
What would this
look like in a tool?
Using Google’s
experimental Fusion
Tables we can easily
graph this
Easy to show links,
directionality, and
node colors
Scenario: Incident Response
Type by Name shows who has interacted with what data
Scenario: Incident Response
Action by Name shows who has performed what actions
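The same two views can be sketched without any tool. The example below assumes the proxy logs have been reduced to (user, action, data type) rows. The names come from the slides, but the rows themselves are invented for illustration, and the helper users_who is hypothetical.

```python
from collections import defaultdict

# Hypothetical proxy log rows: (user, action, data_type).
events = [
    ("Stu", "download", "SSN"),
    ("Stu", "upload", "CC"),
    ("Mary", "download", "CC"),
    ("Rahim", "download", "SSN"),
]

type_by_name = defaultdict(set)    # user -> data types touched
action_by_name = defaultdict(set)  # user -> actions performed
for user, action, data_type in events:
    type_by_name[user].add(data_type)
    action_by_name[user].add(action)


def users_who(action, data_type):
    """Answer 'who performed this action on this kind of data?'."""
    return sorted({u for u, a, d in events if a == action and d == data_type})


print(sorted(type_by_name["Stu"]))     # ['CC', 'SSN']
print(sorted(action_by_name["Stu"]))   # ['download', 'upload']
print(users_who("download", "SSN"))    # ['Rahim', 'Stu']
```

Each row is one edge from a person to a kind of data, labeled with the action. The two dictionaries are the "Type by Name" and "Action by Name" views of the same edges.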
Scenario: Actor Tracking
The Application Of Graphs In A Security Context
Scenario: Actor Tracking
“New Phishing Campaign Targets South-East Asia”*
http://www.minerva-labs.com/post/new-phishing-campaign-targets-southeast-asia
Malware variant that was distributed via phishing emails in South-East
Asia.
The binary mimicked Navicat and had multiple info-stealing
capabilities, and possibly a later-stage, POS-oriented module.
* source: https://app.threatconnect.com/auth/incident/incident.xhtml?incident=3440670
Scenario: Actor Tracking
Let’s load the indicators of
compromise (IOC) from the blog
post into a tool
This time, we’ll use Maltego
Community Edition (CE)
source: www.paterva.com
Scenario: Actor Tracking
Add the various
elements that you want
to track
Hashes
Domains
IP addresses
Email addresses
etc.
Scenario: Actor Tracking
Use the transforms to enrich the data
VirusTotal Public
ThreatCrowd
PassiveTotal
  Get Passive DNS with Time
  Get Whois Details
  Whois Search by Email Address
Avoid running “All Transforms”
Scenario: Actor Tracking
Zooming in we can see interesting associations…like how the malware
hashes are being recognized
Scenario: Actor Tracking
Zooming in we can see interesting associations…like how the domains
are associated with the same registrant email address
Scenario: Actor Tracking
Zooming in we can see interesting associations…like how the domains
are associated with the same IP address
Scenario: Actor Tracking
We can also enrich the data
with…all of the other domains
registered using that email
address
Scenario: Actor Tracking
As you can imagine, this can quickly get out of hand…
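One way to keep the pivoting under control is to cap how many hops you follow from the starting indicator. The sketch below is a plain breadth-first walk over a tiny, entirely hypothetical indicator graph. None of the hashes, domains, addresses, or emails are real, and this is not how Maltego transforms are implemented.

```python
from collections import deque

# Hypothetical indicator graph: every value below is a placeholder.
links = {
    "hash:aaaa": {"domain:bad.example"},
    "domain:bad.example": {"email:registrant@example.org", "ip:192.0.2.10"},
    "email:registrant@example.org": {"domain:other1.example", "domain:other2.example"},
    "ip:192.0.2.10": {"domain:other1.example"},
}


def pivot(start, max_hops):
    """Breadth-first walk from one indicator, stopping after max_hops pivots."""
    seen = {start: 0}  # indicator -> number of hops from the start
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in links.get(node, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


print(len(pivot("hash:aaaa", max_hops=2)))  # 4
print(len(pivot("hash:aaaa", max_hops=3)))  # 6
```

Even in this toy graph, one extra hop pulls in every other domain tied to the registrant email. On real data the growth is much faster.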
General Suggestions
Just because you CAN graph or run a transform on
something…
Consider using only the data you need for a
particular task or project
If you want to experiment with different
transforms, data points, nodes, edges, etc…
USE A NEW GRAPH AND DON’T TINKER
WITH THE MAIN ONE
Summary & Application
Summary
The general application of graph theory doesn’t require
an advanced degree in mathematics
Especially once you know the basics
The connection of related information (read: nodes &
edges) helps represent the data
Both visually and programmatically
There are a growing number of tools to help create graph
associations, store graph data, and programmatically
traverse and modify said data
Pick what works best for you and your environment
source: https://en.wikipedia.org/wiki/Travelling_salesman_problem
Apply What You Have Learned Today
Next week you should:
Take a look at the various free tools and see which one(s) resonate
In the first three months following this presentation you should:
Begin graphing connections for a simple project (e.g. threat actor tracking)
Use your graph project to teach your team or peers the value
Within six months you should:
Have a firm grasp of your own graph project
Look to introduce graph relationships, where applicable, to current security
projects