Graph Analysis with Node Differential Privacy Node Differential

Graph Analysis with
Node Differential Privacy
Sofya Raskhodnikova
Penn State University
Joint work with
Shiva Kasiviswanathan (GE Research),
Kobbi Nissim (Ben-Gurion U. and Harvard U.),
Ad S
Adam
Smith
i h (Penn
(P
S
State))
1
Publishing information about graphs
Many datasets can be represented as graphs
•
•
•
•
“Friendships” in online social network
“Friendships”
in online social network
Financial transactions
E il
Email communication
i ti
Romantic relationships
image source http://community.expressorsoftware.com/blogs/mtarallo/36-extracting-datafacebook-social-graph-expressor-tutorial.html
Privacy is a
big issue!
American J. Sociology,
Bearman, Moody, Stovel
2
Private analysis of graph data
Graph G
Users
Trusted
curator
(
queries
i
answers
)
Government,
researchers
researchers,
businesses
(or)
malicious
adversary
• Two conflicting goals:
T
fli i
l utility and privacy
ili
d i
– utility: accurate answers
– privacy: ?
image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
3
Differential privacy for graph data
Graph G
Users
Trusted
curator
(
A
queries
i
answers
)
Government,
researchers
researchers,
businesses
(or)
malicious
adversary
• Intuition: neighbors are datasets that differ only in some neighbors are datasets that differ only in some
information we’d like to hide (e.g., one person’s data) image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
4
Two variants of differential privacy for graphs
• Edge differential privacy
G:
Two graphs are neighbors if they differ in one edge.
• Node differential privacy
G:
Two graphs are neighbors if one can be obtained from the other by deleting a node and its adjacent edges
by deleting a node and its adjacent edges.
5
Node differentially private analysis of graphs
Graph G
Users
Trusted
curator
(
A
queries
i
answers
)
Government,
researchers
researchers,
businesses
(or)
malicious
adversary
• Two conflicting goals:
T
fli ti
l utility and privacy
tilit
d i
– Impossible to get both in the worst case
• Previously: no node differentially private no node differentially private
algorithms that are accurate on realistic graphs
image source http://www.queticointernetmarketing.com/new-amazing-facebook-photo-mapper/
6
Our contributions
• First node differentially private algorithms that are accurate for sparse graphs
accurate for sparse graphs
– node differentially private for all graphs
– accurate
acc rate for a subclass of graphs, which includes for a s bclass f
h
hi h i l d
• graphs with sublinear (not necessarily constant) degree bound
• graphs where the tail of the degree distribution is not too heavy
graphs where the tail of the degree distribution is not too heavy
• dense graphs
• Techniques for node differentially private algorithms
ec ques o ode d e e a y p a e a go
s
• Methodology for analyzing the accuracy of such algorithms on realistic networks
algorithms on realistic networks
Concurrent work on node privacy [Blocki Blum Datta
Concurrent work on node privacy Blum Datta Sheffet 13]
7
Our contributions: algorithms
…
…
Frequency
…
Degrees
8
Our contributions: accuracy analysis
(1+o(1))-approximation
9
Previous work on
differentially private computations on graphs
Edge differentially private algorithms
Edge differentially private algorithms
•
•
•
•
number of triangles, MST cost [Nissim Raskhodnikova Smith 07]
degree distribution [Hay Rastogi
degree distribution
[Hay Rastogi Miklau Suciu 09, Hay Li Miklau
09 Hay Li Miklau Jensen 09]
Jensen 09]
small subgraph counts [Karwa Raskhodnikova Smith Yaroslavtsev 11]
cuts [Blocki Blum Datta Sheffet 12]
Edge private against Bayesian adversary (weaker privacy)
• small subgraph
ll b
h counts t [Rastogi
[
Hay Miklau
kl Suciu 09]]
Node zero‐knowledge
Node zero
knowledge private (stronger
private (stronger privacy)
• average degree, distances to nearest connected, Eulerian, cycle‐free graphs for dense graphs [Gehrke Lui Pass 12]
10
Differential privacy basics
Graph G
Users
Trusted
curator
(
A
statistic
t ti ti f
)
approximation
to f(G)
Government,
researchers
researchers,
businesses
(or)
malicious
adversary
11
Global sensitivity framework [DMNS’06]
12
“Projections” on graphs of small degree
Goal: privacy for all graphs
13
Method 1: Lipschitz extensions
14
1
s
1
1'
2
2'
3
3'
4
4'
5
5'
t
15
1
s
1/ 1
1'
2
2'
3
3'
4
4'
5
5'
t
16
1
s
1
1'
2
2'
3
3'
4
4'
5
5'
6
6'
t
17
 via Smooth Sensitivity framework via Smooth Sensitivity framework [NRS
[NRS’07]
07]
18
Our results
via Lipschitz
extensions
} via generic reduction
19
Conclusions
• It is possible to design node differentially private algorithms with good utility on sparse graphs
with good utility on sparse graphs
– One can first test whether the graph is sparse privately
• Directions for future work
– Node‐private synthetic graphs
– What are the right notions of privacy for network data?
20
Lipschitz extensions via linear/convex programs
21
Our results
…
…
Frequency
…
…
Degrees
22
Our results
…
…
(1 o(1)) approximation
(1+o(1))-approximation
23
G
A
T
T(G)
query f
S
24
Generic Reduction via Truncation
Frequency
…
…
d
Degrees
25
Smooth Sensitivity of Truncation
#(nodes of
degree above d)
26
Releasing Degree Distribution via Generic Reduction
G
A
T
T(G)
qqueryy f
S
27