Private and Secure Secret Shared MapReduce
Shlomi Dolev¹, Yin Li², and Shantanu Sharma¹
¹ Ben-Gurion University of the Negev, Israel
² Xinyang Normal University, China
30th Annual IFIP WG 11.3 Conference on Data and Applications Security
and Privacy (DBSec 2016), Trento, Italy

Outline
• Introduction
• System Settings
• Overview of the Approach
• Count Operation
• Search and Fetch Operation
• Other Operations
• Conclusion

Introduction
• Why is it required to ensure privacy?
– Users send data to the clouds
– Curious mappers and reducers can
• Store useful data
• Learn the given job
• Where is it required?
– Banking, financial, retail, and healthcare

Introduction
• What do others do?
– Work on encrypted data
– Authentication, and compress + encrypt data
– Encrypt data-at-rest
– Secure storage of data in HDFS
– Provide authentication before using a Hadoop cluster
• These approaches make data ‘computationally secure’.
• But for how long will it stay secure?
• Goal: make data information-theoretically secure

System Settings
[Figure: a data owner holding the database; three clouds, each with a master process (STEP 3), secret-shares of the database, mappers (M), and reducers (R); and a user side that performs interpolation and obtains the final results (STEP 4).]
Notations: M: Mapper; R: Reducer

Adversarial Setting
• Honest-but-curious adversary
– Wants to gain knowledge
– But executes computations honestly

Parameters of Analysis
• Communication cost
• Computational cost
• Number of rounds

Overview of the Approach
• Accumulating Automata*
– Make shares of data (or input split)
– Send these shares to mappers
• Mappers do not know the computation and data
– Mappers have a defined accumulating automata
– Example: Search a pattern “LO” in the string “LOXLO”
*S. Dolev, N. Gilboa, X. Li, “Accumulating Automata and Cascaded Equations Automata for Communicationless Information Theoretically Secure Multi-Party Computation: Extended Abstract,” SCC@ASIACCS, pages 21-29, 2015.

Overview of the Approach
Example: Search a pattern “LO” in the string “LOXLO”

Secret-shares of the five symbols and initial node values at each mapper:

Mapper    L         O          X        L         O          N1  N2  N3
Mapper 1  {v3,v4}   {v5,v10}   {v1,v1}  {v4,v5}   {v15,v20}  2   1   1
Mapper 2  {v5,v6}   {v10,v19}  {v2,v2}  {v6,v7}   {v20,v29}  3   4   8
Mapper 3  {v7,v9}   {v15,v28}  {v3,v3}  {v7,v9}   {v15,v28}  4   9   27
Mapper 4  {v9,v12}  {v20,v37}  {v4,v4}  {v9,v12}  {v20,v37}  5   16  64

Node updates at step k+1:
$N_{1,M}^{(k+1)} = v_0$
$N_{2,M}^{(k+1)} = N_{1,M}^{(k)} \cdot v_1$
$N_{3,M}^{(k+1)} = N_{3,M}^{(k)} + N_{2,M}^{(k)} \cdot v_2$

Reducer inputs: v140, v698, v1964, v4226
Result: LO, 2
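The example can be checked with a short sketch. It runs the three-node automaton for “LO” on “LOXLO” in the clear, then interpolates the four reducer inputs from the slide. Two things here are assumptions rather than statements from the slides: that N1 is held at 1 (that is, v0 = 1), and that Mappers 1 to 4 evaluate their polynomials at x = 1, 2, 3, 4. The function names are illustrative.

```python
from fractions import Fraction

def count_pattern_lo(text):
    """Run the three-node accumulating automaton for "LO" on plain symbols."""
    n1, n2, n3 = 1, 0, 0                      # N1 = 1, N2 = N3 = 0 initially
    for symbol in text:
        v1 = 1 if symbol == "L" else 0        # 1 when the symbol is 'L'
        v2 = 1 if symbol == "O" else 0        # 1 when the symbol is 'O'
        # All nodes are updated at once from their step-k values.
        n1, n2, n3 = 1, n1 * v1, n3 + n2 * v2  # assumes v0 = 1
    return n3

def interpolate_at_zero(points):
    """Lagrange interpolation: value at x = 0 of the polynomial through points."""
    total = Fraction(0)
    for j, (xj, yj) in enumerate(points):
        term = Fraction(yj)
        for m, (xm, _) in enumerate(points):
            if m != j:
                term *= Fraction(xm, xm - xj)
        total += term
    return total

clear_count = count_pattern_lo("LOXLO")
# Reducer inputs from the slide, paired with the assumed points x = 1..4.
reducer_inputs = [(1, 140), (2, 698), (3, 1964), (4, 4226)]
shared_count = interpolate_at_zero(reducer_inputs)
print(clear_count, shared_count)
```

Both computations agree with the result “LO, 2” shown on the slide: the mappers only ever see shares, and the count appears only after interpolation.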

Creating Secret-Shares
• Consider only English words
– Represent each letter as a unary vector of 26 bits:
• ‘A’ is represented as (1_1, 0_2, 0_3, . . ., 0_26)
– Make secret-shares of every bit by selecting different polynomials of an identical degree
– Since we use different polynomials for creating secret-shares of each bit, multiple occurrences of a word in a database have different secret-shares
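A minimal sketch of this step, assuming Shamir-style sharing over a prime field. The modulus `PRIME`, the number of clouds, the polynomial degree, and all function names are choices made for illustration; the slides do not fix them.

```python
import random
import string

PRIME = 2_147_483_647  # field modulus (an assumption, not from the slides)

def unary_encode(letter):
    """26-bit unary vector of an uppercase letter: 'A' -> (1, 0, ..., 0)."""
    return [1 if letter == c else 0 for c in string.ascii_uppercase]

def share_bit(bit, num_clouds, degree, rng):
    """Share one bit with a fresh random polynomial of the given degree."""
    coeffs = [bit] + [rng.randrange(1, PRIME) for _ in range(degree)]
    return [sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
            for x in range(1, num_clouds + 1)]

def share_letter(letter, num_clouds, degree, rng):
    """Share every bit of the unary vector; result[cloud][position]."""
    per_bit = [share_bit(b, num_clouds, degree, rng) for b in unary_encode(letter)]
    return [[per_bit[pos][cloud] for pos in range(26)] for cloud in range(num_clouds)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 from clouds x = 1..len(shares)."""
    xs = range(1, len(shares) + 1)
    secret = 0
    for xj, yj in zip(xs, shares):
        num, den = 1, 1
        for xm in xs:
            if xm != xj:
                num = num * xm % PRIME
                den = den * (xm - xj) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

rng = random.Random(2016)
first_a = share_letter("A", num_clouds=3, degree=1, rng=rng)
second_a = share_letter("A", num_clouds=3, degree=1, rng=rng)
decoded = [reconstruct([first_a[c][pos] for c in range(3)]) for pos in range(26)]
```

Because each bit gets its own polynomial, `first_a` and `second_a` differ even though both encode ‘A’, so a cloud cannot tell that the two occurrences are the same letter.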

Count Operation
• String-matching based
– Matches a value of a relation against a pattern, where both the value and the pattern are in the form of secret-shares
• Two phases
– Phase 1: Privacy-preserving counting in the clouds
– Phase 2: Result reconstruction at the user-side

Count Operation
• Working in the cloud: A mapper
– Creates an automaton of x+1 nodes, where x is the length of the pattern p
– Initializes the values of these nodes
– The first node is assigned the value one (N1 = 1) and all the other nodes are assigned the value zero (Ni = 0)

Count Operation
• Working in the cloud: Count ‘John’
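Name: Adam, John, John
Automaton: nodes N1 to N5, with edge values v1, v2, v3, v4
Initial state: N1 = 1, N2 = 0, N3 = 0, N4 = 0, N5 = 0
Tuple “Adam”: v1 = J * A, v2 = o * d, v3 = h * a, v4 = n * m
Tuple “John”: v1 = J * J, v2 = o * o, v3 = h * h, v4 = n * n
State after the first “John”: N1 = 1, N2 = 0, N3 = 0, N4 = 0, N5 = 1
State after the second “John”: N1 = 1, N2 = 0, N3 = 0, N4 = 0, N5 = 2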
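The counting example can be sketched end to end under some assumptions: letters use the unary encoding with degree-1 polynomials over a prime field, each letter comparison is the inner product of the two shared unary vectors, and the user contacts enough clouds to interpolate the resulting degree-8 polynomial. `PRIME`, `DEGREE`, and all function names are illustrative, not taken from the slides.

```python
import random
import string

PRIME = 2_147_483_647                      # field modulus (assumed)
DEGREE = 1                                 # degree of each sharing polynomial (assumed)
WORD_LEN = 4                               # every name here has four letters
NUM_CLOUDS = 2 * DEGREE * WORD_LEN + 1     # the result has degree 2 * DEGREE * WORD_LEN

def share(secret, rng):
    coeffs = [secret] + [rng.randrange(1, PRIME) for _ in range(DEGREE)]
    return [sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
            for x in range(1, NUM_CLOUDS + 1)]

def share_word(word, rng):
    """result[cloud][letter][bit]: shares of each letter's unary encoding."""
    bits = [[share(int(ch == c), rng) for c in string.ascii_lowercase]
            for ch in word.lower()]
    return [[[bits[l][b][cloud] for b in range(26)] for l in range(len(word))]
            for cloud in range(NUM_CLOUDS)]

def mapper_count(pattern_sh, column_sh):
    """One cloud's work: match every value against the pattern on shares only."""
    n_last = 0                                       # final node starts at 0
    for value_sh in column_sh:
        match = 1                                    # N1 = 1
        for p_letter, v_letter in zip(pattern_sh, value_sh):
            # v_i is a share of 1 if the two letters are equal, else of 0
            v_i = sum(p * v for p, v in zip(p_letter, v_letter)) % PRIME
            match = match * v_i % PRIME
        n_last = (n_last + match) % PRIME
    return n_last

def reconstruct(shares):
    """User side: Lagrange interpolation at x = 0."""
    xs = range(1, len(shares) + 1)
    secret = 0
    for xj, yj in zip(xs, shares):
        num, den = 1, 1
        for xm in xs:
            if xm != xj:
                num = num * xm % PRIME
                den = den * (xm - xj) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

rng = random.Random(7)
names = ["Adam", "John", "John"]
pattern = share_word("John", rng)
column = [share_word(name, rng) for name in names]
outputs = [mapper_count(pattern[c], [value[c] for value in column])
           for c in range(NUM_CLOUDS)]
count = reconstruct(outputs)
```

Each multiplication of two shares raises the polynomial degree, which is why this sketch needs nine clouds for a four-letter pattern even though every input share has degree 1.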

Count Operation
• Working at the user side
• Result reconstruction – a simple interpolation operation

Search and Fetch Operation
– PHASE 1: Finding addresses of tuples containing p
– PHASE 2: Fetching all the tuples containing p

Unary Occurrence
– No need to know the address
– Multiply
• The string-matching result for each tuple is 0 or 1, in the form of secret-shares
• Multiply the result with the tuple
– Add the values of each attribute

Pattern: Adam

Name  Department
Adam  CS
John  EC
John  CS

Match results (Name, Department):

Name  Department
1     1
0     0
0     0
1     1
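The multiply-then-add step can be sketched as follows. To keep it short, the sketch makes several simplifying assumptions that are not from the slides: attribute values are encoded as integers by the hypothetical `encode` helper instead of letter by letter, the 0/1 match results are shared directly with degree-1 polynomials as a stand-in for the private string match, and `PRIME` is an arbitrary field modulus.

```python
import random

PRIME = 2_147_483_647   # field modulus (assumed)
NUM_CLOUDS = 3          # a product of two degree-1 sharings has degree 2

def share(secret, rng):
    """Degree-1 shares of `secret` for clouds x = 1..NUM_CLOUDS."""
    slope = rng.randrange(1, PRIME)
    return [(secret + slope * x) % PRIME for x in range(1, NUM_CLOUDS + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    xs = range(1, len(shares) + 1)
    secret = 0
    for xj, yj in zip(xs, shares):
        num, den = 1, 1
        for xm in xs:
            if xm != xj:
                num = num * xm % PRIME
                den = den * (xm - xj) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

def encode(text):
    """Hypothetical encoding of a short ASCII string as one field element."""
    return int.from_bytes(text.encode("ascii"), "big")

def decode(number):
    return number.to_bytes((number.bit_length() + 7) // 8, "big").decode("ascii")

rng = random.Random(11)
relation = [("Adam", "CS"), ("John", "EC"), ("John", "CS")]
# Stand-in for the private string match against the pattern "Adam".
match_sh = [share(int(name == "Adam"), rng) for name, _ in relation]
name_sh = [share(encode(name), rng) for name, _ in relation]
dept_sh = [share(encode(dept), rng) for _, dept in relation]

def cloud_fetch(cloud, attribute_sh):
    """Multiply each match share with the attribute share, then add the column."""
    return sum(m[cloud] * a[cloud] for m, a in zip(match_sh, attribute_sh)) % PRIME

fetched = tuple(
    decode(reconstruct([cloud_fetch(c, column) for c in range(NUM_CLOUDS)]))
    for column in (name_sh, dept_sh)
)
```

Since only one tuple matches, every other row contributes a share of zero and the column sum reconstructs to the single matching tuple, here `("Adam", "CS")`, without the user ever learning its address.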

Unary Occurrence
• Working at the user side
– A simple interpolation

Multiple Occurrences
• Tradeoff
– Number of rounds vs. computational load at the user side
– Naïve algorithm and a database partitioning algorithm

• The first way: Naïve Algorithm
– Requires only 2 rounds, but a lot of computation at the user side
– The user learns the addresses of the matching tuples

Pattern: John (multiplied with each Name value)

Name  Department  Result
Adam  CS          0
John  EC          1
John  CS          1

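• The first way: Naïve Algorithm – how to fetch?
– Suppose there are L occurrences
– Create an L × n matrix M, where n is the number of tuples
– Multiply M by the relation to obtain the L matching tuples

Match results (Name): 0, 1, 1

M:
0 1 0
0 0 1

Relation:
Name  Department
Adam  CS
John  EC
John  CS

M * Relation:
Name  Department
John  EC
John  CS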
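A sketch of the matrix-based fetch, assuming the user already knows the matching addresses from the first round. As simplifications not taken from the slides, values are encoded as integers by a hypothetical `encode` helper, both M and the relation are shared with degree-1 polynomials over an arbitrary prime field, and addresses are zero-based.

```python
import random

PRIME = 2_147_483_647   # field modulus (assumed)
NUM_CLOUDS = 3          # share products have degree 2, so 3 points suffice

def share(secret, rng):
    slope = rng.randrange(1, PRIME)
    return [(secret + slope * x) % PRIME for x in range(1, NUM_CLOUDS + 1)]

def reconstruct(shares):
    xs = range(1, len(shares) + 1)
    secret = 0
    for xj, yj in zip(xs, shares):
        num, den = 1, 1
        for xm in xs:
            if xm != xj:
                num = num * xm % PRIME
                den = den * (xm - xj) % PRIME
        secret = (secret + yj * num * pow(den, -1, PRIME)) % PRIME
    return secret

def encode(text):
    return int.from_bytes(text.encode("ascii"), "big")

def decode(number):
    return number.to_bytes((number.bit_length() + 7) // 8, "big").decode("ascii")

def selection_matrix(addresses, n):
    """L x n matrix with a single 1 per row, at the address to fetch."""
    return [[int(col == addr) for col in range(n)] for addr in addresses]

rng = random.Random(23)
relation = [("Adam", "CS"), ("John", "EC"), ("John", "CS")]
addresses = [1, 2]                       # rows containing "John", known to the user
M = selection_matrix(addresses, len(relation))

m_sh = [[share(bit, rng) for bit in row] for row in M]
rel_sh = [[share(encode(value), rng) for value in row] for row in relation]

def cloud_multiply(cloud):
    """One cloud multiplies its shares of M with its shares of the relation."""
    n = len(relation)
    return [[sum(m_sh[i][k][cloud] * rel_sh[k][j][cloud] for k in range(n)) % PRIME
             for j in range(2)] for i in range(len(M))]

products = [cloud_multiply(c) for c in range(NUM_CLOUDS)]
fetched = [tuple(decode(reconstruct([products[c][i][j] for c in range(NUM_CLOUDS)]))
                 for j in range(2)) for i in range(len(M))]
```

The clouds see only shares of M, so they learn neither which rows were selected nor what those rows contain.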

• The second way: Database Partitioning Algorithm
– Requires more than 2 rounds, but less computation at the user side
– Partitions the database to learn the addresses
– Then fetches tuples using the solution suggested in the naïve algorithm

[Figure: the database is partitioned over Q&A Rounds 1, 2, and 3; each partition is labeled with its number of occurrences (2, 0, or 1).]
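The round structure can be illustrated with a small simulation. Here `private_count` is a hypothetical stand-in that counts in the clear; in the scheme it would be the secret-shared count operation. The halving strategy and the stopping rule (split until a block holds no match or only matches) are assumptions made for this sketch and may differ from the partitioning the authors use.

```python
def private_count(column, pattern, lo, hi):
    """Stand-in for the secret-shared count over tuples lo..hi-1."""
    return sum(1 for value in column[lo:hi] if value == pattern)

def find_addresses(column, pattern):
    """Return (addresses, rounds) by repeatedly counting inside partitions."""
    addresses = []
    pending = [(0, len(column))]
    rounds = 0
    while pending:
        rounds += 1
        # One Q&A round: request the count of every pending block at once.
        counts = [private_count(column, pattern, lo, hi) for lo, hi in pending]
        next_pending = []
        for (lo, hi), cnt in zip(pending, counts):
            if cnt == 0:
                continue                          # no occurrence in this block
            if cnt == hi - lo:
                addresses.extend(range(lo, hi))   # every tuple in the block matches
            else:
                mid = (lo + hi) // 2              # split and ask again next round
                next_pending += [(lo, mid), (mid, hi)]
        pending = next_pending
    return sorted(addresses), rounds

names = ["Adam", "John", "John"]
found, rounds_used = find_addresses(names, "John")
```

The user only interpolates one count per block per round, which is the trade-off named above: more rounds in exchange for less user-side computation than the naïve algorithm.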

Other Operations
• Equijoin
– Use two layers of clouds, where the first layer performs the fetch operation and the second layer performs the equijoin operation
• Range query
– By using 2’s complement
– Count the occurrences of numbers that lie in the range, and then fetch those tuples
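One way to read the 2’s complement idea is that the sign bit of x - a tells whether x is below a, and that this bit can be computed with additions and multiplications only, which are the operations available on secret-shares. The sketch below works on plain bits to show the arithmetic; the bit width, the function names, and the salary data are hypothetical, and the exact circuit in the paper may differ.

```python
WIDTH = 16  # bits per number, including the sign bit (assumed)

def to_bits(value, width=WIDTH):
    """Two's-complement bits, least significant first."""
    return [(value >> i) & 1 for i in range(width)]

def sign_of_difference(x_bits, a_bits):
    """Sign bit of x - a, computed as x + NOT(a) + 1 with arithmetic only."""
    carry = 1                                    # the "+1" of two's complement
    result_bit = 0
    for xb, ab in zip(x_bits, a_bits):
        nb = 1 - ab                              # NOT as arithmetic
        s = xb + nb - 2 * xb * nb                # XOR(xb, nb)
        result_bit = s + carry - 2 * s * carry   # sum bit = XOR(s, carry)
        carry = xb * nb + carry * s              # carry out
    return result_bit                            # last bit computed is the sign

def in_range(x, low, high):
    """1 if low <= x <= high, else 0, using only sign bits and products."""
    not_below = 1 - sign_of_difference(to_bits(x), to_bits(low))
    not_above = 1 - sign_of_difference(to_bits(high), to_bits(x))
    return not_below * not_above

salaries = [1200, 3400, 5600, 800]               # hypothetical column
range_count = sum(in_range(s, 1000, 4000) for s in salaries)
```

Because `in_range` yields a 0/1 indicator, it can play the same role as the string-matching result in the count and fetch operations.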

Conclusion
• Privacy-preserving operations based on MapReduce
– A way to create secret-shares
– Count, search, and fetch operations
– Equijoin and range queries

Shlomi Dolev¹, Yin Li², and Shantanu Sharma¹
¹ Department of Computer Science, Ben-Gurion University of the Negev, Israel
{dolev,sharmas}@cs.bgu.ac.il
² Department of Computer Science, Xinyang Normal University, China
yunfeiyangli@gmail.com
Presentation is available at
http://www.cs.bgu.ac.il/~sharmas/publication.html
