CISPA - Internet Society

saarland
university
computer science
On Epigenomic Privacy:
Tracking Personal MicroRNA
Expression Profiles over Time
Michael Backes, Pascal Berrang, Anne Hecksteden,
Mathias Humbert, Andreas Keller and Tim Meyer
21st February 2016
CISPA
Center for IT-Security, Privacy
and Accountability
Epigenetics
“epi”: above, over (Greek)
“genetics”: origin (Greek)
Definition: study of cellular and phenotypic trait variations stemming from causes other than changes in the genotype
These causes include external factors such as: in-utero and childhood development, environmental chemicals, aging, diet.

MicroRNA (miRNA)
discovered in the early 1990s
Definition: small non-coding RNA molecules that regulate gene expression in plants/animals
60% of genes coding human proteins are regulated by miRNAs

MicroRNA Expression Profiles
Real-valued numbers quantifying whether and how much miRNAs are active in a given set of cells/tissue.
What is the purpose of MicroRNAs?
But all cells have the same genes!
Chromosomes: carry hereditary information in long strings of DNA
(a gene is a region of DNA)
What makes the cells different:
gene expression
(which genes are active in a cell)
Graphics: genographic.nationalgeographic.com
miRNAs regulate most human genes!
↳ important for normal and diseased cells:
neurodegenerative diseases (e.g., Alzheimer’s), heart diseases, diabetes, majority of cancers
More on DNA and MicroRNAs!

DNA
• contains recipes for what a cell potentially can do,
• is (mostly) fixed over time,
• can hint at risks of getting a disease,
• has been researched a lot.

miRNAs
• expression regulates what a cell really does,
• expression changes over time,
• can tell whether you carry a disease,
• so far, have been largely overlooked (in privacy)!

Common belief: no privacy threats from miRNAs, because of temporal variability
[Figure labels: hospital server, black market, samples at t1 and t2, identification, matching]
Cyber attacks against healthcare companies have increased by 72% within one year
Athletes’ dataset
Participants: 29
Points in time: 2 (before and after exercising)
Time shift: 1 week
Disease: none
Sample types: blood-based and plasma-based
1,189 miRNAs per sample
Lung cancer dataset
Participants: 26
Points in time: 8
Time shift: mostly 3 months
Disease: lung cancer
Sample type: plasma-based
1,189 miRNAs per sample
[Timeline figure: sampling points at -?, 0, 3, 6, 9, 12, 15, 18 months, labeled "before surgery" and "after surgery"]
Notation (figure):
$r_k^{t_1}$: miRNA expression profile with index $k$, sampled at time $t_1$ (1,189 miRNAs per sample)
$\{r_i^{t_1}\}_{i=1}^n$, $\{r_i^{t_2}\}_{i=1}^n$: the $n$ profiles sampled at times $t_1$ and $t_2$
Preprocessing (figure): $r_k^{t_j}$ (1,189 miRNAs per sample) → PCA + whitening → $\bar{r}_k^{t_j}$ (vector with $m$ dimensions)
PCA: smaller dimensionality $m$ + uncorrelated components
whitening: unit variance
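The preprocessing step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `fit_pca_whitening` and `whiten`, the use of NumPy's SVD, the synthetic data, and the choice of `m = 5` are all assumptions made for the example. The slides do not say on which samples the transformation is fitted.

```python
import numpy as np

def fit_pca_whitening(X, m):
    """Fit PCA with whitening on X (rows = samples, columns = miRNAs).

    Returns the feature mean, the first m principal axes, and the
    standard deviation of the data along each of those axes.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # SVD of the centered data: rows of Vt are the principal axes.
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:m]
    # Standard deviation along each retained axis.
    scale = S[:m] / np.sqrt(X.shape[0] - 1)
    return mean, components, scale

def whiten(X, mean, components, scale):
    """Project X onto the principal axes and rescale to unit variance."""
    return (X - mean) @ components.T / scale

# Synthetic stand-in for real profiles: 29 samples, 1,189 features.
rng = np.random.default_rng(0)
profiles = rng.normal(size=(29, 1189))
mean, components, scale = fit_pca_whitening(profiles, m=5)
reduced = whiten(profiles, mean, components, scale)  # shape (29, 5)
```

On the data used for fitting, the resulting components are uncorrelated and each has unit variance, which matches the two properties listed on the slide.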
Identification Attack
Given a profile $r_k^{t_1}$ from time $t_1$ and the profiles $\{r_i^{t_2}\}_{i=1}^n$ from time $t_2$, select the profile $r_{i^*}^{t_2}$ with
$$i^* = \arg\min_i \left\| \bar{r}_i^{t_2} - \bar{r}_k^{t_1} \right\|_2$$
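The arg-min rule of the identification attack amounts to a nearest-neighbor search under the Euclidean distance. The sketch below assumes the profiles have already been preprocessed; the function name `identify` and the toy two-dimensional vectors are hypothetical and only serve to illustrate the rule.

```python
import math

def identify(target_t1, candidates_t2):
    """Return the index i* of the t2 profile closest to the t1 target
    in Euclidean (L2) distance."""
    return min(
        range(len(candidates_t2)),
        key=lambda i: math.dist(candidates_t2[i], target_t1),
    )

# Toy preprocessed profiles (made-up values, 2 dimensions each).
profiles_t2 = [[0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]]
target_t1 = [1.8, 2.1]
best = identify(target_t1, profiles_t2)  # index of the closest t2 profile
```

Each target is handled independently, so nothing in this rule prevents two different targets from being mapped to the same candidate.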
Identification Attack
[Figure: identification success rates; values shown: 42%, 76%, 22%, 28%]
similar number of PCA dimensions
80% overlap in top 10 miRNAs of first PCA component
Identification Attack
[Figure annotations: top 2: >80%; top 2: >40%]
Matching Attack
Given the profiles $\{r_i^{t_1}\}_{i=1}^n$ from time $t_1$ and $\{r_i^{t_2}\}_{i=1}^n$ from time $t_2$, find the one-to-one assignment
$$\sigma^* = \arg\min_\sigma \sum_{i=1}^n \left\| \bar{r}_{\sigma(i)}^{t_2} - \bar{r}_i^{t_1} \right\|_2$$
minimum weight assignment on bipartite graph
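The matching objective can be illustrated with an exhaustive search over all one-to-one assignments. This is a sketch for tiny inputs only: the function name `match` and the toy vectors are hypothetical, and brute force is swapped in here purely for clarity. Its running time grows factorially, so for datasets of the size shown in the slides a polynomial-time assignment solver (for example the Hungarian algorithm, available as `scipy.optimize.linear_sum_assignment`) would be the practical choice.

```python
import math
from itertools import permutations

def match(profiles_t1, profiles_t2):
    """Return sigma as a tuple, where sigma[i] is the index of the t2
    profile assigned to the i-th t1 profile, minimizing the summed
    Euclidean distance over all one-to-one assignments."""
    n = len(profiles_t1)

    def total_cost(sigma):
        return sum(
            math.dist(profiles_t2[sigma[i]], profiles_t1[i]) for i in range(n)
        )

    return min(permutations(range(n)), key=total_cost)

# Toy preprocessed profiles (made-up values); the t2 list is shuffled.
profiles_t1 = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
profiles_t2 = [[5.2, 4.9], [0.1, -0.1], [0.9, 1.2]]
sigma = match(profiles_t1, profiles_t2)
```

Unlike independent nearest-neighbor identification, the assignment uses every t2 profile exactly once.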
Matching Attack
[Figure: matching success rates; values shown: 55%, 90%, 48%, 29%]
similar number of PCA dimensions
Matching Attack
[Figure annotations: <80%; <100 miRNAs]
Matching Attack
success rate remains more or less constant in the first year
Identification Attack: 76%, 28%
Matching Attack: 90%, 48%
Downside of Identification Attack
[Figure: samples at t1 and t2]
Identification Attack: 42%, 22%
Matching Attack: 55%, 29%
Common belief: no privacy threats from miRNAs,
because of temporal variability
↳ belief is unjustified: linkability as high as 90% for blood-based samples
Thank you! Questions?
• belief is unjustified: linkability as high as 90% for blood-based samples
• there are in fact privacy threats inherent to epigenetic data
• blood is easier to link than plasma
• matching is more successful than identification
• success rate remains more or less constant in the first year