PIR (Private Information Retrieval)

Private Location-based Query Processing Using PIR
Layla Pournajaf
Guest Lecture
Data Privacy and Security
Computer Science Department
Nov. 2016
Emory University
Motivation
I want the list of nearest restaurants
Give me your location first
Private Information Retrieval
Suppose that a server maintains a database consisting of
N sequential blocks. PIR protocols enable a client to retrieve
the ith block from the server, without the server discovering
which block was requested (i.e., index i).
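To make the interface concrete, here is a minimal sketch of a toy PIR. It uses the classic two-server XOR construction, which assumes two non-colluding servers holding identical copies of the database. This is not one of the computational or hardware-based schemes covered in these slides; it only illustrates that block i can be fetched without either server seeing i. All function names are illustrative.

```python
import secrets

def make_queries(n, i):
    """Client: two bit vectors that differ only at position i.
    Each vector on its own is uniformly random, so it reveals nothing about i."""
    q1 = [secrets.randbelow(2) for _ in range(n)]
    q2 = q1.copy()
    q2[i] ^= 1
    return q1, q2

def answer(db, query):
    """Server: XOR together every block whose query bit is set."""
    acc = bytes(len(db[0]))
    for block, bit in zip(db, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, block))
    return acc

def pir_retrieve(db, i):
    """Client: XOR of the two answers cancels every block except block i."""
    q1, q2 = make_queries(len(db), i)
    a1 = answer(db, q1)  # would be sent to server 1
    a2 = answer(db, q2)  # would be sent to server 2
    return bytes(a ^ b for a, b in zip(a1, a2))
```

All blocks must have the same length; each server touches the whole database for every query, which matches the linear processing cost noted for non-hardware schemes.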
PIR Implementations
• Computational PIR methods [3] rely on the computational intractability
of well-known problems. However, they entail expensive operations linear
in the database size, which lead to prohibitive processing costs (on the
order of thousands of seconds even for moderate database sizes).
• Secure hardware PIR [4] is currently the only practical PIR mechanism.
It relies on a tamper-resistant CPU that is positioned at the server and is
trusted by the clients. This CPU receives a client block request, which is
unreadable by the server. It obliviously extracts the requested block from
the server’s disk, and returns it to the client in an encrypted form
decipherable solely by the client. This paradigm leads to constant
communication cost, and amortized polylogarithmic computational cost.
The latter translates to processing times close to a second even for
Gigabyte databases.
Private Information Retrieval
SCOP: A trusted secure coprocessor implementation of PIR protocol
System model: PIR-aware query processing
Location-based Queries Using PIR
• Server-side Indexing of Data
• Spatial Indexing
• Compact Storage
• Enable data retrieval in a minimal number of PIR requests
• Client-side query processing
• Query processing with minimal information
• Same number of PIR requests for any query
Nearest Neighbor Query Using PIR [2]:
Server-side Indexing of Data: Voronoi Tessellation
Nearest Neighbor Queries Using PIR [2]
• Server-side Indexing of Data
• Spatial Indexing (Voronoi Tessellation)
• Efficient Storage (Storage of max 3 points per cell)
• Data retrieval in a minimal number of PIR requests (1 request per query)
• Client-side query processing
• Query processing with minimal information (find the index of the cell containing the query point, then request that index)
• Same number of PIR requests for any query (1 request per query)
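The one-request-per-query idea can be sketched as follows. This is a simplified illustration, not the construction from [2]: it overlays a regular grid on the unit square and stores in each grid cell a conservative candidate set (every site that could be the nearest neighbor of some location in that cell), which stands in for "sites whose Voronoi cells intersect the grid cell". It does not enforce the 3-point bound mentioned above. The `pir_fetch` argument is a hypothetical placeholder for a real PIR call.

```python
import math
import random

def min_dist(p, box):
    """Smallest distance from point p to the box (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    dx = max(x0 - p[0], 0.0, p[0] - x1)
    dy = max(y0 - p[1], 0.0, p[1] - y1)
    return math.hypot(dx, dy)

def max_dist(p, box):
    """Largest distance from point p to any location inside the box."""
    x0, y0, x1, y1 = box
    dx = max(abs(p[0] - x0), abs(p[0] - x1))
    dy = max(abs(p[1] - y0), abs(p[1] - y1))
    return math.hypot(dx, dy)

def build_blocks(sites, g):
    """Server: one equal-sized block per cell of a g x g grid."""
    blocks = []
    for row in range(g):
        for col in range(g):
            box = (col / g, row / g, (col + 1) / g, (row + 1) / g)
            # No query in this cell has its nearest site farther than `bound`.
            bound = min(max_dist(s, box) for s in sites)
            blocks.append([s for s in sites if min_dist(s, box) <= bound])
    size = max(len(b) for b in blocks)
    dummy = (math.inf, math.inf)  # padding so all blocks look alike
    return [b + [dummy] * (size - len(b)) for b in blocks]

def cell_index(q, g):
    """Client: computed locally from the public grid geometry."""
    col = min(int(q[0] * g), g - 1)
    row = min(int(q[1] * g), g - 1)
    return row * g + col

def nearest(q, blocks, g, pir_fetch):
    block = pir_fetch(blocks, cell_index(q, g))  # exactly one request
    return min(block, key=lambda s: math.dist(q, s))
```

The client only needs the grid resolution `g`; the choice of nearest point happens locally after the single block arrives.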
K-Nearest Neighbor Queries Using PIR [1]
• Server-side Indexing of Data
• Spatial Indexing (Grid Structure)
• Efficient Storage (not very efficient: too many dummy points)
• Data retrieval in a minimal number of PIR requests (empty cells may be requested to no purpose; a cell with too many points may need several requests)
• Client-side query processing
• Query processing with minimal information
• Same number of PIR requests for any query
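The storage cost of a plain grid can be seen in a small sketch. It assumes points in the unit square, a g x g grid, and fixed-size blocks of `capacity` slots; the names `build_grid_blocks`, `directory` and `dummy_ratio` are illustrative and not taken from [1].

```python
from collections import defaultdict

DUMMY = None  # placeholder record used only for padding

def build_grid_blocks(points, g, capacity):
    """Bucket points into grid cells, then emit fixed-size blocks.
    Returns (blocks, directory); directory[cell] lists the block
    numbers that hold that cell's points."""
    cells = defaultdict(list)
    for x, y in points:
        col = min(int(x * g), g - 1)
        row = min(int(y * g), g - 1)
        cells[row * g + col].append((x, y))
    blocks, directory = [], {}
    for cell in range(g * g):
        pts = cells[cell]
        directory[cell] = []
        # An empty cell still gets one all-dummy block; a crowded cell
        # spills over into several blocks.
        for start in range(0, max(len(pts), 1), capacity):
            chunk = pts[start:start + capacity]
            chunk += [DUMMY] * (capacity - len(chunk))
            directory[cell].append(len(blocks))
            blocks.append(chunk)
    return blocks, directory

def dummy_ratio(blocks):
    """Fraction of stored slots that carry no real point."""
    slots = sum(len(b) for b in blocks)
    return sum(b.count(DUMMY) for b in blocks) / slots
```

With skewed data most slots end up as padding, while dense cells need more than one request, which are the two drawbacks listed above.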
Query Plan
• Goal:
• Ensuring the same number of PIR requests per query
• Approach:
• Find the maximum number of PIR requests needed over all possible query points (QP)
• Regardless of the query location, every user should request exactly QP blocks
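A minimal sketch of this padding rule, assuming a hypothetical table `blocks_needed` that maps each query cell to the block numbers a query in that cell really needs (how that table is derived is not shown here):

```python
import secrets

def plan_size(blocks_needed):
    """QP: the largest number of requests any query cell would need."""
    return max(len(b) for b in blocks_needed.values())

def padded_requests(cell, blocks_needed, n_blocks):
    """Block numbers to request for a query in `cell`, padded to QP.
    PIR hides which block each request targets, so the dummy targets
    can be arbitrary; only the number of requests must be uniform."""
    qp = plan_size(blocks_needed)
    real = list(blocks_needed[cell])
    dummies = [secrets.randbelow(n_blocks) for _ in range(qp - len(real))]
    return real + dummies
```

Every client then issues the same number of requests, so the request count leaks nothing about the query location.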
kNN – Smarter Indexing [1]
References
[1] S. Papadopoulos, S. Bakiras, and D. Papadias, “Nearest neighbor search with strong location privacy,” Proceedings of the VLDB Endowment, vol. 3, no. 1-2, pp. 619–629, 2010.
[2] A. Khoshgozaran, C. Shahabi, and H. Shirani-Mehr, “Location privacy: going beyond k-anonymity, cloaking and anonymizers,” Knowledge and Information Systems, vol. 26, no. 3, pp. 435–465, 2011.
[3] G. Ghinita, P. Kalnis, A. Khoshgozaran, C. Shahabi, and K.-L. Tan, “Private queries in location based services: anonymizers are not necessary,” in Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. ACM, 2008, pp. 121–132.
[4] P. Williams and R. Sion, “Usable PIR,” in NDSS, 2008.
Thank You!
My Research: Reverse k-Nearest Neighbor Queries Using PIR
• Goal: answer an RkNN query without the LBS deducing the location of the client.
• Example: p2 is the nearest neighbor of q, while p1 and p4 are the reverse nearest neighbors of q.
Reverse k-Nearest Neighbor Queries
• Pruning:
• Shrink the search space by removing the points that cannot be the answer
• The remaining points are considered candidates
• Verification:
• Compute kNN for all candidates. For each candidate, if its kNN set includes the query point, it is an answer to the RkNN query
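The verification step can be sketched in a few lines. This is a brute-force, non-private illustration of the definition only; the function names are illustrative, and `candidates` stands for whatever set the pruning step leaves over.

```python
import math

def knn(p, points, k):
    """The k points of `points` closest to p (p itself excluded)."""
    others = [x for x in points if x != p]
    return sorted(others, key=lambda x: math.dist(p, x))[:k]

def rknn(q, data, candidates, k):
    """Keep each candidate whose k nearest neighbors include q."""
    answers = []
    for c in candidates:
        if q in knn(c, data + [q], k):
            answers.append(c)
    return answers
```

For example, with q = (0, 0) and data points (1, 0), (5, 0) and (5.5, 0), only (1, 0) has q as its nearest neighbor, so it is the only reverse nearest neighbor of q for k = 1.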
Pruning Strategies
Reverse k-Nearest Neighbor Queries Using PIR: Server-Side Indexing Method 1
• DB_cnt: <c.pre, c.count>
• DB_loc: <p.id, p.x, p.y, p.ptr>
• DB_dtl: <p.id, p.payload>
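One possible way to populate these three tables is sketched below. The field meanings are assumptions, since the slide gives only the field names: `pre` is taken to be the number of points stored in cells that come earlier in Hilbert order, `count` the number of points in the cell, and `ptr` the row of the point's record in DB_dtl. The Hilbert mapping is the standard iterative conversion from grid coordinates to curve position.

```python
from collections import namedtuple

Cnt = namedtuple("Cnt", "pre count")   # one row per grid cell (assumed meaning)
Loc = namedtuple("Loc", "id x y ptr")  # one row per point
Dtl = namedtuple("Dtl", "id payload")  # one row per point

def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve of an n x n grid
    (n must be a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def build_tables(points, n):
    """points: (id, x, y, payload) tuples with 0 <= x, y < 1."""
    def cell_of(p):
        return hilbert_index(n, int(p[1] * n), int(p[2] * n))
    ordered = sorted(points, key=cell_of)
    db_loc = [Loc(pid, x, y, ptr) for ptr, (pid, x, y, _) in enumerate(ordered)]
    db_dtl = [Dtl(pid, payload) for pid, _, _, payload in ordered]
    counts = [0] * (n * n)
    for p in ordered:
        counts[cell_of(p)] += 1
    db_cnt, pre = [], 0
    for c in counts:
        db_cnt.append(Cnt(pre, c))
        pre += c
    return db_cnt, db_loc, db_dtl
```

Under these assumptions a client that holds DB_cnt can work out locally that the points of the cell with Hilbert index h sit in rows pre to pre + count - 1 of DB_loc.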
Solution 1 Overview
Client
• Aware of grid structure, query plan and DB_cnt
• Pruning Phase
• Verification Phase:
• Retrieve the location coordinates and find the results
Server
• Indexing Data:
• Create Hilbert curve on fixed grid
• Create DB_cnt, DB_loc and DB_dtl
• Query Plan:
• Determine candidate and influenced cells
• Uncertain half-plane pruning for uncertain queries
• Modified Verification
Pruning method in Query Plan
• Uncertain half-plane pruning for uncertain queries
[Figure: uncertain half-plane pruning around query Q; labels include points A, B, C, D, E, M, N, O, P, R and HM:B, HP:A, H’M:B, H’N:E, H’O:D, H’P:A]
Reverse k-Nearest Neighbor Queries Using PIR: Server-Side Indexing Method 2
• Hilbert R-tree (HRT): hybrid structure based on B+-tree and R-tree
• Internal nodes: <MBR, LHV, Ptr>
• Leaf nodes: <mbr, Pid>
• Nodes and points are placed in the HRT according to their Hilbert number
• Nodes are separated based on the largest Hilbert value (LHV)
• Node overflow is handled by applying the s-to-(s+1) splitting policy
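A rough sketch of how leaf-level entries of such a tree could be produced. It is a bulk-loading simplification: points are sorted by Hilbert number and packed into leaves of fixed capacity, so the dynamic insertion and overflow handling described above are not modeled. Each entry records the leaf's bounding rectangle, its largest Hilbert value and its point count; treating the third field as a count is an assumption, and all names are illustrative.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) on the Hilbert curve of an n x n grid
    (n must be a power of two)."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:  # rotate the quadrant
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def pack_leaves(points, n, capacity):
    """points: (id, x, y) with integer grid coordinates in [0, n).
    Returns (entries, leaves): entries[i] = (mbr, lhv, count) describes
    leaves[i], the list of point ids stored in that leaf."""
    ordered = sorted(points, key=lambda p: hilbert_index(n, p[1], p[2]))
    entries, leaves = [], []
    for start in range(0, len(ordered), capacity):
        leaf = ordered[start:start + capacity]
        xs = [p[1] for p in leaf]
        ys = [p[2] for p in leaf]
        mbr = (min(xs), min(ys), max(xs), max(ys))
        lhv = max(hilbert_index(n, p[1], p[2]) for p in leaf)
        entries.append((mbr, lhv, len(leaf)))
        leaves.append([p[0] for p in leaf])
    return entries, leaves
```

Because leaves follow Hilbert order, points that are close on the curve tend to share a leaf, so one bounding rectangle can cover several neighboring grid cells.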
Solution 2 Overview
Client
• Aware of Hilbert R-tree structure, query plan and level_0 of HRT (<MBR, LHV, c>)
• Pruning Phase
• Verification Phase
• Advantage: instead of probing small cells of the grid, assess the MBR, which might contain more cells
Server
• Indexing Data:
• Create Hilbert R-tree structure
• Create DB_loc and DB_dtl
• Query Plan:
• Uncertain half-plane pruning for uncertain queries
• Modified Verification
Thank You!