Cloud Storage Oriented Cipher-text Search Protocol

Contents
1. Introduction
1.1 Background
1.2 Purpose
1.3 Application
1.4 Terminology
1.5 Symbol description
1.6 Normative reference
2. Overview
2.1 Protocol overview
2.2 Design Philosophy
2.3 Requirements for Design
3. Data Types
3.1 Definition
3.1.1 File
3.1.2 Array
3.1.3 Index
3.1.4 Token
3.1.5 Proof
3.1.6 Hash table
3.1.7 Merkle Hash Tree
3.2 Implementation
4. Message Types
5. File Storage
5.1 Overview
5.2 Generate index
5.3 Generate keys
5.4 Encryption
5.5 Generate
5.6 Upload file
5.7 Store files
6. File Search
6.1 Overview
6.2 Generate search token
6.3 Send search token
6.4 Search
6.5 Return result
6.6 Decryption
7. Challenge and Proof
7.1 Overview
7.2 Generate challenge
7.3 Send challenge
7.4 Generate proof
7.5 Send proof
7.6 Validate proof
8. File Update
8.1 Overview
8.2 Generate Keys
8.3 Generate update token
8.4 Send token/file
8.5 Update files
8.6 Update index
8.7 Update Search Authentication Token
8.8 Return new DSA
8.9 Update DSA
9. Error Handling
1. Introduction
1.1 Background
In recent years, with the rapid development of cloud computing, cloud storage, one of
its most important components, has become a research hotspot. Technically speaking,
cloud storage refers to a system in which large numbers of network storage devices of
different types work together. These devices use cluster applications, grid technology,
and distributed file systems to provide storage and business access.
At present, the rapid development of computer technology and Internet applications
is causing data to grow exponentially, and the demand for storage keeps increasing.
Under this trend, cloud storage not only brings people cheap storage services but also
challenges traditional data storage services. As a new storage method, cloud storage
has significant advantages over traditional storage. First, in cloud storage, storage
exists as a service. A user who needs storage requests an appropriate amount of space
from the cloud storage service provider instead of building and managing a storage
platform himself. This ensures full utilization of storage resources and reduces the
user's storage costs. Second, cloud storage can provide data backup, disaster recovery,
load balancing, and other functions, so when some storage nodes are upgraded or
damaged it can still serve users normally and avoid service interruption. In addition,
authorized users can access the cloud storage service from any place through the
network. This flexibility will play a great role in promoting the development of the
mobile Internet. The scalability, low cost, location-independent access, and easy
management of cloud storage pose a great challenge to traditional storage methods.
At the same time, cloud storage also has problems. In cloud storage, all data are
handed over to the cloud storage provider and users lose absolute control of their
data, which inevitably raises concerns about data security. Cloud storage providers do
offer security measures, such as the SSL (Secure Sockets Layer) and TLS (Transport
Layer Security) protocols for data transmission, data encryption, and firewall
settings. However, because the cloud service provider (CSP) manages everything
centrally, data security depends entirely on the security of the cloud storage system,
the quality of its data administrators, and other factors under the CSP's control. In
addition, data stored in the cloud has become a primary target for malicious users and
hackers, and if data isolation fails, the private data of users may be leaked. Although
cloud storage providers offer users a Service Level Agreement (SLA) that describes the
service levels they provide, various uncontrolled factors still worry users. Data
security is therefore a key problem in cloud storage. A 2012 survey by TwinStrata shows
that only 20% of users are willing to store their private data in the cloud, while
about 50% are willing to use cloud storage for data backup, archival storage, disaster
recovery, and so on. Thus, the problem of data security hinders the adoption of cloud
storage.
Cloud storage vendors such as Microsoft, VMware, Amazon, and Google have all launched
their own cloud storage services and give certain assurances of data security, such as
various encryption and authentication measures, to protect the privacy of users. But
security incidents still occur. In 2005, encrypted tapes belonging to Bank of America
were lost, resulting in the disclosure of a large amount of customer information. In
April 2011, the information of nearly 77,000,000 online customers of Sony, including
credit card data, was stolen. In June of the same year, Google was attacked and the
mailbox accounts of some important personnel were stolen. These security incidents have
caused users to lose trust in cloud service providers.
1.2 Purpose
From the perspective of protecting data privacy, this protocol constructs a dynamic
cipher-text search model and a search verification model for cloud storage. The models
enable users to store their private data with an untrusted party. Even if the data are
stolen, no information about the plaintext is disclosed. The models also support
keyword-based search and the dynamic addition and deletion of files.
1.3 Application
Systems based on this protocol can be used by confidential departments and sensitive
commercial sectors. These departments and sectors can store large amounts of secret
information on the cloud server in cipher-text form and retrieve it whenever necessary.
1.4 Terminology
Keyword. A keyword is a word that summarizes and condenses the content of a file. In
this protocol it refers to the words selected as identifiers of the files.
Linked list. A linked list is a storage structure whose physical storage units are
non-contiguous and non-sequential. The logical order of the data elements is given by
the chain of pointers. The list consists of a series of nodes (each element in the list
is called a node), which can be generated dynamically at runtime. Each node consists of
two parts: a data field that stores the data element and a pointer field that stores
the address of the next node. Compared with a sequential list, a linked list makes
insertion and deletion more convenient.
Array. An array is an ordered collection of variables of the same type, organized so
that they can be handled easily.
Pseudo-random function. A pseudo-random function (PRF) is an efficiently computable
keyed function whose output, to anyone who does not know the key, cannot in practice be
distinguished from that of a truly random function.
Token. In networking, a token is a special frame that controls which station may occupy
the medium, as distinct from data frames and other control frames. In this protocol a
token is a data format that represents the type of a message and the data it carries.
Inverted index. An inverted index arises in applications that need to find records by
the value of an attribute. Each entry of the index table contains an attribute value and
the addresses of all records that have that value. Record positions are determined from
attribute values, rather than attribute values from records.
Encryption. Encryption transforms the original plaintext files or data, according to a
certain algorithm, into unreadable code, often referred to as "cipher-text". The
original content can be recovered only after the corresponding key is supplied. In this
way, data are protected from being stolen or read by unauthorized people.
MD5. The MD5 message-digest algorithm (Message Digest Algorithm, version 5) is a hash
function widely used in the computer security field to check the integrity of messages.
The algorithm transforms a byte string of arbitrary length into a fixed-length 128-bit
value.
Cloud storage. Cloud storage developed out of cloud computing. It focuses on providing
users with online storage services over the Internet. Cloud storage uses software to
make a large number of storage devices of different types cooperate in providing
external data storage services.
Digital signature. A digital signature is data appended to a data unit, or a
cryptographic transformation of the data unit, that allows the recipient to confirm the
source and integrity of the data unit and protects the data against forgery.
Key. A key is a parameter that is input to the algorithm converting plaintext into
cipher-text or cipher-text into plaintext.
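Two of the terms above, the inverted index and the pseudo-random function, carry most of the protocol. The following is a minimal Python sketch of both and is not part of the protocol text. The names build_inverted_index and prf are hypothetical, and HMAC-SHA256 is an assumed stand-in for the PRF, which this document does not fix.

```python
import hashlib
import hmac
import os
from typing import Dict, List, Set


def build_inverted_index(files: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Map each keyword to the sorted list of file identifiers containing it."""
    index: Dict[str, List[str]] = {}
    for file_id in sorted(files):
        for keyword in files[file_id]:
            index.setdefault(keyword, []).append(file_id)
    return index


def prf(key: bytes, message: bytes) -> bytes:
    """Keyed PRF, instantiated here (as an assumption) with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()


# Hypothetical example data: three files and the keywords chosen for each.
files = {"f1": {"cloud", "search"}, "f2": {"cloud"}, "f3": {"search", "token"}}
index = build_inverted_index(files)

key = os.urandom(32)               # secret key kept by the client
label = prf(key, b"cloud")         # same key and input always give the same output
```

Without the key, the 32-byte label reveals nothing practical about the keyword, which is why the protocol can hand such values to the server.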
1.5 Symbol description

Symbol                   Description
F = {f_1, ..., f_n}      The file set, consisting of n files
#F                       The number of files
f_i : {w_1, ..., w_m}    A file containing m keywords
W                        The set of keywords
#W                       The number of keywords
f_w                      The set of files containing the keyword w
#f_w                     The number of files containing the keyword w
L_w                      The linked list made up of the files containing keyword w
L_f                      The linked list made up of all keywords in file f
δ                        Inverted index
free                     A special keyword satisfying free ∉ W
A_s                      Search array, used to store the keyword linked lists
T_s                      Dictionary, used to record the head node of each linked list
A_d                      Deletion array, used to store the file linked lists
T_d                      Dictionary, used to record the head node of each linked list
γ                        Encrypted index, defined as γ = (A_s, T_s, A_d, T_d)
Function description:
The algorithms used in the search part:
Gen(1^k): Runs on the client. Generates the keys for the symmetric encryption algorithm
and the pseudo-random functions.
Enc(F, δ, K): Uses the key to encrypt the user's files and keyword information into the
cipher-text and the encrypted index.
SrchToken(K, w): Uses the user's keyword to generate the corresponding search token.
Search(c, γ, τ_s): Uses the search token to perform the search operation on the
encrypted index.
AddToken(K, f, δ_f): Generates the add token from the file to be added and the
corresponding keywords.
Add(γ, τ_a, c_f): Uses the received add token to add the file and update the stored
encrypted index.
DelToken(K, f, δ_f): Generates the deletion token from the file to be deleted.
Del(γ, τ_d, c): Uses the received deletion token and the file to be deleted to update
the stored encrypted index.
The algorithms used in the authentication part:
Gen(1^k): Generates the key used in the algorithms. The client is responsible for
keeping the key.
Auth(K, δ, F): Generates the search authenticator when files are stored.
Chall(K, w): Run by the client. Generates the challenge for a search on a given
keyword.
Prove(α, ch): Run by the server. Generates the authentication path for a given search.
Vrfy(st, π, f_w): The verification algorithm. Run by the client to verify the proof sent
by the server.
AddToken(K, f, δ_f): Generates the add token from the file to be added and the
corresponding keywords.
Add(γ, τ_a, c_f): Uses the received add token to add the file and update the stored
encrypted index.
DelToken(K, f, δ_f): Generates the deletion token from the file to be deleted.
Del(γ, τ_d, c): Uses the received deletion token and the file to be deleted to update
the stored encrypted index.
Update(st, α, τ_a): Updates the DSA state.
1.6 Normative reference
Kamara S, Lauter K. Cryptographic cloud storage[J]. Financial Cryptography and
Data Security, 2010: 136-149.
2. Overview
2.1 Protocol overview
The system uses a client/server (C/S) architecture and is composed of two entities, the
client and the server. The main functions of the client are key generation, data
encryption/decryption, authenticator generation, token generation, and so on. The main
functions of the server are searching, proof generation, update operations, and so on.
The overall framework is shown in Figure 2.1.
Figure 2.1 The frame structure of the cloud storage system
In Figure 2.1: ① file encryption/decryption at the client; ② generation of the search
token from keywords at the client; ③ generation of add/delete tokens from the files to
be added/deleted at the client; ④ storage of the files uploaded by the user at the
server; ⑤ search on the encrypted index at the server according to the received search
token; ⑥ update of files at the server according to the received add/delete token;
⑦ data exchange between the client and the server.
As can be seen from Figure 2.1, the main function of the client is to obtain the
original data from users (including the files to be uploaded, the search keywords, and
the files to be updated), process the data, and upload them to the cloud. The main
function of the server is to receive the data sent by the client and perform the
corresponding operations, mainly storing, searching, and updating.
2.2 Design Philosophy
The protocol analyzes the security of existing cloud storage systems and puts forward a
secure framework for cloud storage systems.
Fig. 2.2 Complete security model of cloud storage. The user's STORE operation on files
uses SSE.Enc() and DSA.Auth(); SEARCH on a keyword uses SSE.Search() and DSA.Chall();
UPDATE of a file uses SSE.Update() and DSA.Update().
The model of the secure cloud storage system is built on the Searchable Symmetric
Encryption (SSE) algorithm, combined with the secure cloud storage system architecture.
With the SSE algorithm, the user encrypts the data and the index and sends the
cipher-text and the secure index to the cloud service provider for storage. During a
search operation, the cloud service provider searches the secure index using the search
token generated by the user and returns the matching cipher-text set to the user. The
user then decrypts the received result and obtains the plaintext files corresponding to
the search keyword. In addition, users can add and delete files at any time while the
correctness of the index is still guaranteed. In order to verify the search results
returned by the server, a dynamic search authentication (DSA) algorithm is designed.
The algorithm is based on an improved Merkle authentication dictionary and can validate
the correctness of the search result. It also supports token-based update operations
and achieves higher efficiency in communication and computation.
Searchable encryption
The model involves only two entities. One is the owner of the confidential data, who
wishes to store the data in the cloud and prevent illegal access to them. This entity is
called the user (Client). The other is the cloud storage service provider, which exposes
a storage interface, stores the data, and performs specific search operations on them.
It is called the server (Server).
As described above, in order to guarantee the security of the data to the greatest
possible extent, essentially all operations that process user data are placed at the
client, including file encryption, index encryption, and keyword processing. The server
only needs to store the files and perform a limited retrieval function.
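Fig. 2.3 Searchable encryption model of the cloud environment. The user's index is
turned into the encrypted index by SSE.Enc(·) and the files are turned into ciphertext
by symmetric encryption; both are uploaded to cloud storage, on which the user then
issues SSE.Search(w) and SSE.Update(F).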
As can be seen from Fig. 2.3, the user selects on his computer the set of files to be
stored, preprocesses the files, and then uploads them to the cloud. The preprocessing
consists of two parts that execute simultaneously. One part encrypts the file set with a
symmetric encryption algorithm to obtain the cipher-text set and uploads it to the cloud
storage server. The other part constructs the index from the keywords of the files,
encrypts the index with a special encryption method, and stores the result, called the
encrypted index, in the cloud. The storage of the files and the index is managed by the
cloud storage service provider. Users only need to upload the files, without caring
about the details of file storage. When a user searches for a keyword, the client
generates the search token corresponding to the keyword using the method provided by the
algorithm and sends the token to the cloud storage server. The server then performs the
search operation and returns the result.
The key to constructing the searchable encryption algorithm lies in the encryption of
the file index. In order to provide a better search experience, this protocol uses
keywords specified by the user in advance. After the keyword information is obtained,
the keywords are preprocessed. For each keyword, a keyword linked list is built from the
files that contain it, with each file identifier written into a node of the list. All
the keyword linked lists together form the inverted index. In order to ensure that the
server cannot obtain useful information from the index, pseudo-random functions (PRFs)
are used to encrypt the inverted index. The nodes of the encrypted index are stored at
random positions of the search array A_s, and the head node of each list is recorded in
the dictionary T_s (also called the search table). The processed arrays and dictionaries
are stored on the server. Because their elements are all encrypted, the server cannot
obtain plaintext information directly from the search array or the search table. When
the user searches for a keyword, the client processes the keyword to obtain the search
token, which contains the information designating the position of the keyword in the
encrypted index. After the server receives the search token, it reads the user's
encrypted index, performs the search operation, obtains the file identifiers, and sends
the corresponding cipher-text to the client.
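The construction described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the protocol's exact encoding: HMAC-SHA256 stands in for the PRFs, nodes are masked by XOR with a pad derived from the keyword and a per-node random value, and the names build, search, NODE, and END as well as the node layout are hypothetical. Every keyword is assumed to have at least one file, and file identifiers are assumed to fit in 16 bytes.

```python
import hashlib
import hmac
import os
import random
import struct

NODE = struct.Struct(">16sI")   # assumed layout: 16-byte file id, 4-byte next address
END = 0xFFFFFFFF                # assumed marker for "no next node"


def prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    # zip() stops at the shorter input, so the pad is truncated to the data length.
    return bytes(x ^ y for x, y in zip(a, b))


def build(index, k1, k2, k3, size):
    """Return (search array, search table) for a plaintext inverted index."""
    cells = list(range(size))
    random.SystemRandom().shuffle(cells)           # random positions in the array
    array = [os.urandom(NODE.size + 16) for _ in range(size)]  # unused cells: random filler
    table = {}
    for w, ids in index.items():
        wb = w.encode()
        addrs = [cells.pop() for _ in ids]
        for i, fid in enumerate(ids):
            nxt = addrs[i + 1] if i + 1 < len(addrs) else END
            r = os.urandom(16)                      # per-node randomness
            pad = prf(prf(k3, wb), r)
            array[addrs[i]] = xor(NODE.pack(fid.encode(), nxt), pad) + r
        # Head address is stored under F_K1(w) and masked with G_K2(w).
        table[prf(k1, wb)] = xor(struct.pack(">I", addrs[0]), prf(k2, wb))
    return array, table


def search(array, table, token):
    """Server side: follow one keyword's list using only the token."""
    t1, t2, t3 = token
    if t1 not in table:
        return []
    addr = struct.unpack(">I", xor(table[t1], t2))[0]
    found = []
    while addr != END:
        body, r = array[addr][:NODE.size], array[addr][NODE.size:]
        fid, addr = NODE.unpack(xor(body, prf(t3, r)))
        found.append(fid.rstrip(b"\0").decode())
    return found


k1, k2, k3 = os.urandom(32), os.urandom(32), os.urandom(32)
index = {"cloud": ["f1", "f2"], "search": ["f1", "f3"]}
array, table = build(index, k1, k2, k3, size=16)
token = (prf(k1, b"cloud"), prf(k2, b"cloud"), prf(k3, b"cloud"))
results = search(array, table, token)
```

The server only ever sees the masked array, the masked table, and the three token values, so it can walk exactly one list per token and learns nothing about the other keywords.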
Users of cloud storage may add or delete files at any time, so the protocol must support
dynamic addition and deletion. The previous discussion shows that the key to search lies
in the construction of the encrypted index. To ensure that search remains efficient and
correct after the user adds and deletes files, the encrypted index must be updated
whenever files are added or deleted. When the user adds a file, the keywords it contains
may already exist in the index or may be new. In either case, only the corresponding
keyword linked lists need to be updated, which is not difficult. When the user deletes a
file, the file contains several keywords, and its node may be at any position in each
keyword's linked list. Every node of each linked list for those keywords must therefore
be traversed, and after the node is deleted the continuity of the linked list must be
restored. Deletion is therefore complex and inefficient.
In order to update the encrypted index more efficiently when a file is deleted, a file
linked list is built from the keywords of each file. All the file linked lists together
form the file index. The file index is encrypted and stored at random positions of an
array called the deletion array A_d, and the head node of each linked list is stored in
the dictionary T_d (the deletion table). When a file is deleted, the deletion array is
used to find the positions in A_s of the keyword nodes corresponding to the file, A_s is
updated accordingly, and the file's linked list is deleted from A_d. In order to ensure
that the server cannot obtain the user's file information from the arrays, unused cells
of the arrays are filled with random strings. At the same time, so that a free node can
be found in A_s when a file is added, the idle nodes of the array need to be recorded.
This protocol uses a special keyword to construct a linked list of idle nodes and stores
its head node in the search table, in the same way as for the inverted index.
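The reason for keeping a second, file-oriented index can be seen in a plaintext sketch. The class below is an illustration only: it ignores encryption, random placement, and the free list, and the name DualIndex and its methods are hypothetical. It keeps a keyword-to-files map (the role of A_s and T_s) and a file-to-keywords map (the role of A_d and T_d), so deleting a file touches only that file's own keywords.

```python
from typing import Dict, Iterable, List, Set


class DualIndex:
    """Plaintext stand-in for the search index plus the deletion index."""

    def __init__(self) -> None:
        self.by_keyword: Dict[str, Set[str]] = {}   # keyword -> file ids
        self.by_file: Dict[str, Set[str]] = {}      # file id -> keywords

    def add(self, file_id: str, keywords: Iterable[str]) -> None:
        kws = set(keywords)
        self.by_file[file_id] = kws
        for w in kws:
            self.by_keyword.setdefault(w, set()).add(file_id)

    def delete(self, file_id: str) -> None:
        # Only the keywords of this file are visited; no other list is scanned.
        for w in self.by_file.pop(file_id, set()):
            files = self.by_keyword[w]
            files.discard(file_id)
            if not files:
                del self.by_keyword[w]

    def search(self, keyword: str) -> List[str]:
        return sorted(self.by_keyword.get(keyword, set()))


idx = DualIndex()
idx.add("f1", ["cloud", "search"])
idx.add("f2", ["cloud"])
idx.delete("f1")
```

After the deletion, a search for "cloud" returns only f2 and the keyword "search" no longer appears in the index.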
Search verification
The cloud storage model introduced above realizes keyword-based cipher-text search.
However, it lacks a verification mechanism for the search operation, so it is
incomplete. The model is therefore extended with a function for verifying search
results.
The protocol uses the Merkle hash tree (MHT) as the basic authentication structure. The
file linked list associated with each keyword serves as the data source of a leaf node
in the MHT. The value of each node is calculated with a one-way hash function, and a
full binary tree is constructed from these values (to simplify the operations). The root
node of the authentication tree serves as the verification value and is stored by the
user for subsequent verification. The authentication tree itself serves as the
authenticator and is stored by the server. When a user searches for the files
corresponding to a keyword, the challenge for this keyword and the search token are
generated at the same time. The server performs the search operation according to the
search token and, at the same time, generates the verification path according to the
challenge. After the user obtains the search results and the proof, he decrypts the
result and calculates the value of the corresponding leaf node in the MHT. He then
calculates the final verification value from the proof and compares it with the value
stored at the client. If they are the same, the verification passes. Otherwise the
verification fails and the operation is terminated.
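The challenge, proof, and verification steps above can be sketched as follows. This is a minimal illustration under assumptions: SHA-256 is used as the one-way hash (the document does not fix one here), the challenge is taken to be simply the index of the leaf, the number of leaves is a power of two, and the function names build_tree, prove, and verify are hypothetical.

```python
import hashlib
from typing import List


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return all levels of the tree: levels[0] holds leaf hashes, levels[-1] the root."""
    level = [h(x) for x in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Server side: collect the sibling hash at every level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])   # sibling of the current node
        index //= 2
    return path


def verify(root: bytes, leaf: bytes, index: int, path: List[bytes]) -> bool:
    """Client side: recompute the root from the leaf and the path."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Hypothetical leaves: one encoded file list per keyword.
leaves = [b"w1:f1,f2", b"w2:f1,f3", b"w3:f3", b"w4:f2"]
levels = build_tree(leaves)
root = levels[-1][0]          # the only value the client must keep
proof = prove(levels, 2)      # produced by the server for the challenged leaf
ok = verify(root, leaves[2], 2, proof)
```

The proof holds one hash per tree level, so its size grows with the height of the tree rather than with the number of keywords.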
2.3 Requirements for Design
In order to make full use of the storage service provided by cloud storage, let the
server perform the search operation, and ensure that the server cannot obtain any useful
information during the interaction, this protocol designs a cipher-text search method.
First of all, the user selects the files to be stored and adds some keywords describing
each file. The keyword index is then constructed from this keyword information. In order
to ensure that the index does not reveal file information, the index must be encrypted
using a special procedure. The user's files are encrypted with a symmetric encryption
algorithm such as AES, and the cipher-text and the encrypted index are sent together to
the cloud storage server for storage.
When the user searches for a keyword, he inputs the keyword, and the client processes it
to obtain the keyword token and sends the token to the server. After receiving the
keyword token, the server searches the user's encrypted index, finds the cipher-text
corresponding to the token, and returns the result to the user. Note that in this
process the server does not know which keyword the user specified. The only useful
information it can obtain is which files correspond to which token.
Using this idea, a cipher-text search method supporting keyword search is constructed.
It satisfies the user's demand for storing confidential data in cloud storage and gives
the server the ability to search transparently.
3. Data Types
3.1 Definition
3.1.1 File
This section introduces the file types supported by the protocol. The file operations in
the protocol are file upload and file update.
The file types supported by this protocol are text files (files with suffixes such as
.txt, .doc, .docx, .pptx, .xls), audio files (files with suffixes such as .mp3, .wav),
and video files (files with suffixes such as .avi, .mp4).
3.1.2 Array
A_s: Search array. The linked list built for a keyword from the files that contain it is
called the keyword linked list. Each file identifier is written into the corresponding
node of the list. All the keyword linked lists together form the inverted index. In
order to ensure that the server cannot acquire any useful information from the index,
pseudo-random functions are used to encrypt the inverted index. The encrypted index is
stored at random positions of the search array A_s, and the head node of each linked
list is stored in the dictionary T_s (search table).
A_d: Deletion array. In order to update the encrypted index efficiently, a linked list
is built from the keywords of each file and is called the file linked list. All the file
linked lists together form the file index. The file index is encrypted and stored at
random positions of an array called the deletion array A_d. The head node of each linked
list is stored in the dictionary T_d.
3.1.3 Index
This protocol contains two kinds of indexes: the inverted index and the encrypted index.
The inverted index is constructed from the keyword information of the files.
The encrypted index is the inverted index encrypted with a special method.
3.1.4 Token
This protocol uses tokens to perform search operations and update operations. τ denotes
a token, which may be a search token, an add token, or a deletion token. The search
token is denoted τ_s, the add token τ_a, and the deletion token τ_d.
Search token. The format of the search token is
τ_s := (F_{K_1}(w), G_{K_2}(w), P_{K_3}(w)),
in which w represents the keyword and K_1, K_2, K_3 represent the keys. When the user
retrieves the files containing a certain keyword, the client first processes the keyword
to obtain the corresponding search token. After the server receives the search token, it
reads the user's encrypted index and calls the corresponding search algorithm to obtain
the cipher-text set corresponding to the search token. Finally the cipher-text set is
sent to the user.
Add token. The format of the add token is
τ_a := (F_{K_1}(f), G_{K_2}(f), λ_1, ..., λ_{#f}),
in which f represents the file to be added. When the user adds a file, the client first
generates the add token from the file and its keyword information. The client encrypts
the file and sends the add token and the encrypted file to the server. The server
receives the encrypted file, reads the user's encrypted index, and updates the encrypted
index using the add token.
Deletion token. The format of the deletion token is
τ_d := (F_{K_1}(f), G_{K_2}(f), P_{K_3}(f), id(f)),
in which f represents the file to be deleted. The process of deleting a file is similar
to that of adding a file. The user may not have a local copy of the file when he wants
to delete it, so the deletion algorithm needs to download the file to be deleted from
the server and then generate the deletion token from the file.
3.1.5
Proof
Proof. The format of the proof is $\pi := \{\pi_i, 1 \le i \le h\}$, in which $h$ is
the height of the authenticator $\alpha$. When a user searches for files with a certain
keyword, the client generates the search token and the challenge corresponding to this
search. The server executes the search operation according to the search token and at the
same time generates the authentication path according to the challenge; this path is the
proof.
3.1.6
Hash table
A hash table is a data structure that gives direct access to records based on the key
value. It maps the key value to a position in the table so that records can be found
quickly. The mapping function is called the hash function, and the array which stores
the records is called the hash table.
3.1.7
Merkle Hash Tree
The Merkle hash tree (MHT) is a tree-based authentication structure that can be used
to verify data integrity. Its computation uses only a one-way hash function.
The Merkle hash tree used in this protocol has $2^l$ leaf nodes, so it is a full binary
tree, which is also a complete binary tree; for this reason the two terms are used
interchangeably when describing the Merkle hash tree.
To initialize the Merkle hash tree, the documents to be authenticated are mapped to leaf
nodes and the tree is built bottom-up with the hash function until the root is reached.
The verifier then only needs to record the value of the root node and sends the hash tree,
as the authenticator, to the untrusted server. In the verification stage, the verifier
generates a verification challenge for some leaf node. The server receives the challenge,
generates the verification path corresponding to the challenged leaf node and transmits it
to the verifier. The verifier checks the path and the data against the stored root value.
Compared with retrieving the complete data and recomputing the hash, this method has far
lower computation and communication costs.
The figure below is an example showing how to generate a hash tree:
[Figure: binary hash tree over the data Y1, ..., Y8, with root H(1,8,Y), internal nodes
H(1,4,Y), H(5,8,Y), H(1,2,Y), H(3,4,Y), H(5,6,Y), H(7,8,Y) and leaves H(1,1,Y), ..., H(8,8,Y)]
Fig.2.3 The structure of the Merkle hash tree
Assume that the data set is $Y = \{Y_1, \ldots, Y_8\}$ and that each item $Y_i$ is the
data source of a leaf node. The value of each leaf node is computed with the one-way hash
function $F$:
$H(i, i, Y) = F(Y_i)$.
After the leaf values have been computed, the values of every two sibling nodes are mapped
with $F$ to the value of their parent node, and the whole hash authentication tree is built
in this way. $F$ takes the two child values as input and outputs a fixed-length value:
$H(i, j, Y) = F(H(i, (i+j-1)/2, Y), H((i+j+1)/2, j, Y))$.
The root node value of the hash tree is stored by the verifier as the verification value,
while the hash tree itself is stored by the third-party server.
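The recursive definition of H(i, j, Y) can be illustrated with a short sketch. It assumes SHA-256 as the one-way hash function F (the reference implementation in this protocol stores MD5 values instead) and a number of leaves that is a power of two; the function names are illustrative.

```python
import hashlib


def F(*parts: bytes) -> bytes:
    """Assumed one-way hash function: SHA-256 over the concatenated inputs."""
    h = hashlib.sha256()
    for p in parts:
        h.update(p)
    return h.digest()


def H(i: int, j: int, Y: list) -> bytes:
    """Value of the node covering leaves i..j (1-indexed, inclusive)."""
    if i == j:
        return F(Y[i - 1])               # leaf: H(i, i, Y) = F(Y_i)
    mid = (i + j - 1) // 2               # last leaf of the left subtree
    return F(H(i, mid, Y), H(mid + 1, j, Y))


Y = [f"Y{n}".encode() for n in range(1, 9)]
root = H(1, 8, Y)                        # the value kept by the verifier
```

For eight leaves, the root is F(H(1,4,Y), H(5,8,Y)), matching the figure: (1+8-1)/2 = 4 and (1+8+1)/2 = 5.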
3.2
Implementation
The index in the protocol can be implemented with a two-dimensional array. The Merkle
hash tree can be implemented with a full binary tree. The remaining data structures can
be implemented with plain variables. The reference implementation of the data structures
in the protocol is given below.
Data structure | Definition | Description
Index | CArray<CStringArray*,CStringArray*> | A two-dimensional array, used to store the file index and the inverted index
Hash table | hash_map<CString,char[16]> | Hash table storing the correspondence between each element and its MD5 value, to avoid repeated computation of the hash value
MHT node | {char[16]} | The structure of an MHT node
Search array(1) | char fileID[8] | File ID
 | short loc_pre | The position of the previous node in the list
 | short loc_next | The position of the next node in the list
Search array(2) | bool flag | Indicates whether an array node has been used
Deletion array(1) | short loc_d_next | The position of the next node in the list
 | short loc_d_dual_pre | The coordinate of the previous node of the dual node in Ad
 | short loc_d_dual_next | The coordinate of the next node of the dual node in Ad
 | short loc_s | The coordinate of the dual node in As
 | short loc_s_pre | The coordinate of the previous node of the dual node
 | short loc_s_next | The coordinate of the next node of the dual node
 | char fk1w[16] | The record of the value FK1(w)
Deletion array(2) | bool flag | Identifier
 | char complexStr[32] | The record of the XOR value
 | char randomStr[16] | The record of the random string
The entrance address of MHT leaf | char fk1w[16] | The record of the entrance address of the MHT leaf node
 | int loc |
Proof | char updateInfo[16] | The definition of the structure of proof
 | char prove[ProveSize] |
Search token | char fk1w[16] | The definition of the structure of search token
 | char gk2w[16] |
 | char pk3w[16] |
Add token(1) | char fk1w[16] | The definition of the structure of add token
 | char gk2w[16] |
 | char complexStr[16] |
 | char randomStr[16] |
Add token(2) | int count | The structure of add token, in which count represents the number of keywords in the file
 | char fk1free[16] |
 | char gk2free[16] |
 | char fk1f[16] |
 | char gk2f[16] |
 | char pk3f[16] |
Deletion token(1) | char fk1f[16] | The definition of the structure of deletion token when every file is calculated
 | char gk2f[16] |
 | char pk3f[16] |
 | char fk1free[16] |
 | char gk2free[16] |
Deletion token(2) | int count | The structure of deletion token, in which count represents the number of the files
4.
Message Types
For the interaction between the client and the server, the format of the transmitted
message is defined as follows.
The message contains the following fields:
1. Message type
The Message type field indicates the type of the transmitted message. The field uses
8 bits, and the first 4 bits are used to distinguish between an operation message and a
notification message.
An operation message carries a specific operation, and its first 4 bits are set to 0000.
A notification message reports whether an operation has been performed correctly, and its
first 4 bits are set to 0001.
(a) Add files (first upload). When the client adds files for the first time, it
sends the encrypted files and the indexes to the server. The field is set to 0x01.
(b) Search operation. When the user performs a search operation, the client
generates the search token and sends it to the server. The field is set to 0x02. When the
server returns the search results to the user, the field is set to 0x03.
(c) Authentication operation. When authentication of the search data is requested,
the client sends the challenge to the server and the field is set to 0x04. When the server
has generated the proof, it sends the proof to the client and the field is set to 0x05.
(d) Update operation. When the user needs to add files to the server (not the
first upload), the client sends the new files and the add token to the server, and the
server uses them to update the files and the index. The field is set to 0x06. When the
user needs to delete files on the server, the client sends the deletion token to the
server. The field is set to 0x07. When the server executes a DSA status update, it sends
the new DSA state to the client for authentication. The field is set to 0x08.
(e) Operation notification. It is used by the server to report whether the operation
was successful. If the operation succeeded, the field is set to 0x00. If the operation
failed, the server returns an error message and the field is set to 0x01.
2. Length
This field gives the size, in bytes, of the Data part of the message.
3. Direction
This field gives the direction of message transmission. It is set to 0x00 when the
client sends the message to the server, and to 0x01 when the server sends the message to
the client.
4. Type
This field gives the type of the data carried in the Data field:
0x00: encrypted file
0x01: index
0x02: add token
0x03: deletion token
0x04: search token
0x05: challenge
0x06: proof
0x07: search authenticator
0x08: DSA state
0x09: error information
5. Data
This field is used to store the data to be transmitted.
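The message layout above can be sketched as a pack/unpack pair. Only the 8-bit width of the Message type field is given by this protocol; the one-byte Direction and Type fields, the 4-byte big-endian Length field and the function names are assumptions made for illustration.

```python
import struct

# Field order follows the list above: message type, length, direction, type.
# Assumed widths: 1 byte, 4 bytes (big-endian), 1 byte, 1 byte.
HEADER = struct.Struct(">BIBB")


def pack_message(msg_type: int, direction: int, data_type: int, data: bytes) -> bytes:
    """Prefix the data with a header whose Length field is len(data)."""
    return HEADER.pack(msg_type, len(data), direction, data_type) + data


def unpack_message(raw: bytes):
    """Split a raw message into (msg_type, direction, data_type, data)."""
    msg_type, length, direction, data_type = HEADER.unpack_from(raw)
    data = raw[HEADER.size:HEADER.size + length]
    if len(data) != length:
        raise ValueError("truncated message")
    return msg_type, direction, data_type, data


# A search-token message: type 0x02, client to server (0x00), data type 0x04.
raw = pack_message(0x02, 0x00, 0x04, bytes(48))
```

Under these assumed widths the header occupies 7 bytes, so the 48-byte token above produces a 55-byte message.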
5.
File Storage
5.1
Overview
When the user uploads files, he first chooses the files to upload from the local
disk and attaches some user-specified keywords to each file as its description. After
the files have been chosen, the client preprocesses the data: it generates the
encrypted index, the search authenticator and the encrypted files, and uploads them to
the server for storage.
[Figure: the client sends the cipher-text c, the encrypted index and the search
authenticator to the server; the server receives and stores the files]
Fig.5.1.1 The flow chart of file storage section
The flow chart of the file storage section is shown in Figure 5.1.1. First the client
generates the keyword index from the keywords. Then the client encrypts the index and
the files to obtain the cipher-text c and the encrypted index λ. The client generates
the Merkle hash tree, also known as the search authenticator, and sends the hash tree,
the cipher-text c and the encrypted index λ to the server. The server receives these
files, stores them locally, and returns a message to the client to inform it whether the
operation has been performed successfully.
5.2
Generate inverted index
The operation of generating the inverted index is performed locally on the client.
For each keyword, a linked list of the files containing that keyword is constructed from
the keyword information of the files. All the file linked lists together form the
inverted index. The inverted index is then encrypted to obtain the encrypted index. The
flow chart of generating the inverted index is shown in Figure 5.2.1.
[Figure: flow chart. Traverse the file list; for each file, traverse its keyword list; if
the index already includes the current keyword, add the file path to it, otherwise add the
keyword and the file path; finish when both traversals are complete]
Fig.5.2.1 The flow chart of generating the inverted index
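The traversal in the flow chart can be sketched as follows. The input shape (a mapping from file path to its keyword list) and the function name are assumptions for illustration; ordinary Python lists stand in for the linked lists.

```python
from collections import defaultdict
from typing import Dict, List


def build_inverted_index(files: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each keyword to the list of file paths that contain it."""
    index = defaultdict(list)
    for path, keywords in files.items():      # traverse the file list
        for w in keywords:                    # traverse the keyword list
            if path not in index[w]:          # skip duplicated keywords
                index[w].append(path)
    return dict(index)


files = {"f1": ["w1", "w2"], "f2": ["w2", "w3"], "f3": ["w2", "w3"]}
inverted = build_inverted_index(files)
# inverted == {"w1": ["f1"], "w2": ["f1", "f2", "f3"], "w3": ["f2", "f3"]}
```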
5.3
Generate keys
The operation of generating keys is performed locally on the client.
Key generation process
$Gen(1^k)$: The input $1^k$ is the system security parameter. Select three $k$-bit
strings $K_1$, $K_2$, $K_3$ at random as the keys of the pseudorandom functions. Compute
$K_4 \leftarrow SKE.Gen(1^k)$ as the key of the symmetric encryption algorithm. The
algorithm outputs the key $K = (K_1, K_2, K_3, K_4)$.
$Gen(1^k)$ runs on the client and generates the keys of the symmetric encryption
algorithm and of the pseudorandom functions. The generated key is used only locally on
the client, so key management on the client is simple: the client only needs to store the
key, and no key distribution is involved. Note that file encryption, index encryption
and update operations all require the key. If the user's key is lost, the user will be
unable to retrieve their data from the cloud storage server.
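A minimal sketch of the key generation step, assuming that SKE.Gen simply draws a random k-bit key (the protocol does not fix the symmetric cipher here) and that k is a multiple of 8; the function name `gen` is illustrative.

```python
import secrets


def gen(k_bits: int = 128):
    """Return K = (K1, K2, K3, K4), each a random k-bit string.

    K1..K3 key the pseudorandom functions; K4 stands in for SKE.Gen(1^k).
    """
    if k_bits <= 0 or k_bits % 8 != 0:
        raise ValueError("k must be a positive multiple of 8")
    n = k_bits // 8
    return tuple(secrets.token_bytes(n) for _ in range(4))


K = gen(128)
```

The `secrets` module draws from the operating system's cryptographic random source, which is the property needed for key material.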
5.4
Encryption
The operation of encryption is performed locally on the client.
The process of encryption includes file encryption and index encryption.
After the encrypted index and the authenticator have been generated, the user's plaintext
files are encrypted. File encryption is relatively simple: loop over the collection of
files and encrypt each file with the symmetric encryption algorithm. Before encryption,
the file name and the keyword information are appended to the end of the plaintext file,
so that deletion tokens can be generated from the file in later processing.
To generate the encrypted index, first traverse the inverted index to generate the MD5
values of the files and keywords, and fill in the search array and the search table.
Then traverse the file linked lists, and use the MD5 values of the files and keywords to
fill in the deletion array and its search table. After the traversal is complete,
construct the free linked list and store it in the array. Finally, write the generated
search array, deletion array and two search tables into a file and save it on disk.
The flow chart of generating the encrypted index is shown in Figure 5.4.1.
[Figure: flow chart. Generate the inverted index and initialize the arrays and search
tables; traverse the inverted list and its keyword lists to fill the search array and the
search table; traverse the file list and its keyword lists to fill the delete array and the
delete table, recording the coordinate of each list head node; then construct the free
list, modify the search/delete arrays and record the head node of the list in the search
table]
Fig.5.4.1 The flow chart of generating encrypted index
The encryption algorithm is described in detail through the following formal definition.
Encryption algorithm process
$Enc(F, \delta, K)$: Input the key $K$, the file set $F = \{f_1, \ldots, f_n\}$ and the
inverted index $\delta$, and process as follows:
1. Initialize the arrays $A_s$, $A_d$ and the dictionaries $T_s$, $T_d$.
2. For each keyword $w \in W$, process as follows:
(a) Create the linked list $L_w$, which contains $\#f_w$ nodes $(N_1, \ldots, N_{\#f_w})$.
These nodes are stored at random positions in the array $A_s$. Define
$N_i := (\langle id_i, loc_s(N_{i-1}), loc_s(N_{i+1}) \rangle \oplus H_1(P_{K_3}(w), r_i), r_i)$,
where $id_i$ is the identifier of the $i$-th file and $r_i$ is a random string.
$loc_s(N_0)$ and $loc_s(N_{\#f_w+1})$ are both defined as 0.
(b) Store the address of the head node of each linked list $L_w$ in the search table
$T_s$, defined as $T_s[F_{K_1}(w)] := \langle loc_s(N_1), loc_d(N_1^*) \rangle \oplus G_{K_2}(w)$,
where $loc_d(N_1^*)$ is the coordinate of the dual node of $N_1$ in the array $A_d$.
3. For each file $f \in F$, process as follows:
(a) Construct the linked list $L_f$, which contains $\#f$ nodes $(D_1, \ldots, D_{\#f})$,
and store the nodes at random positions in the array $A_d$. Every node $D_i$ is associated
with a keyword $w$, and therefore also with a node $N$ in the linked list $L_w$. $N_{-1}$
and $N_{+1}$ denote the previous node and the next node of $N$ in the keyword linked list.
The node $D_i$ is defined as
$D_i := (\langle loc_d(D_{i+1}), loc_d(N^*_{-1}), loc_d(N^*_{+1}), loc_s(N), loc_s(N_{-1}), loc_s(N_{+1}), F_{K_1}(w) \rangle \oplus H_2(P_{K_3}(f), r_i'), r_i')$,
where $r_i'$ is a random string. $loc_d(D_{\#f+1})$ is defined as 0.
(b) Store the address of the head node of each linked list $L_f$ in the search table
$T_d$, defined as $T_d[F_{K_1}(f)] := loc_d(D_1) \oplus G_{K_2}(f)$.
4. Select $z$ unused nodes $(F_1, \ldots, F_z)$ in $A_s$ and $(F_1', \ldots, F_z')$ in $A_d$
at random to form the free list $L_{free}$. For each $1 \le i \le z$, define
$A_s[loc_s(F_i)] := (loc_s(F_{i+1}), loc_d(F_i'))$. Set $loc_s(F_{z+1})$ to 0 and store the
head node of the linked list $L_{free}$ in the search table $T_s$ as $T_s[free] := loc_s(F_1)$.
5. Fill the other unused nodes in the arrays $A_s$ and $A_d$ with random strings.
6. Encrypt each file $f_i \in F$ to get the cipher-text $c_i := SKE.Enc(K_4, f_i)$.
7. The algorithm outputs the encrypted file set $c = (c_1, \ldots, c_{\#F})$ and the
encrypted index $\lambda := (A_s, T_s, A_d, T_d)$.
Note that the MD5 values of all the files and keywords have already been calculated in
the process of generating the encrypted index. To improve efficiency, the data and their
MD5 values are written into the hash table (used as a dictionary) for later use.
To aid understanding, a simple example of constructing an encrypted index is given here.
Suppose that there are three files to be uploaded:
$f_1 : (w_1, w_2)$, $f_2 : (w_2, w_3)$, $f_3 : (w_2, w_3)$, where $w$ denotes a keyword of
the file. First, the inverted index is constructed from the files and the keyword
information. Then the inverted index and the files are used to construct the encrypted
index. The results are shown in Figure 5.4.2.
Fig.5.4.2 Structure of the encrypted index
The detailed construction process is as follows:
① Construct one file linked list per keyword according to the keyword information.
All the linked lists together form the inverted index.
② Define two fixed-length arrays As and Ad, and initialize them.
③ For every node in the inverted index, compute the contents of the node according to
the formula and write them into the array As. Write the coordinate of the head node of
each linked list into Ts.
④ For each keyword of each file, calculate the contents of the node and write
them into the array Ad. Write the coordinate of the head node of each file linked list
into Td.
⑤ Construct the free list and write the coordinate of the head node of the
linked list into Ts.
5.5
Generate search authenticator
The operation of generating the search authenticator is performed locally on the client.
After the encrypted index has been generated, the inverted index and the dictionary that
records all the MD5 values are already available, so the search authenticator can be
constructed quickly from this information. Note that each node of the arrays As and Ad
carries an identification flag indicating whether the node has been used. To hide this
information from the server, the flag is omitted and only the other contents are written
when the arrays are written to disk.
The key step in generating the search authenticator is calculating the values of the leaf
nodes. First initialize the MHT array according to the number of keywords in the inverted
index, and then traverse the inverted index. For each linked list in the inverted index,
read the corresponding MD5 values, compute the leaf value and write the result into the
corresponding leaf position of the MHT array. After the inverted index has been traversed,
compute upward from the leaf nodes to obtain the whole MHT authentication tree. Then write
the tree into a file, take the value of the root node of the MHT as the authentication
value and write it into the user's key file. The process flow of generating the search
authenticator is shown in Figure 5.5.1.
[Figure: flow chart. Initialize the MHT array; traverse the inverted list and, for each
keyword, read the MD5 of the keyword and of its files, calculate the MHT leaf value, fill it
into the MHT array and record the coordinate of the leaf; when the traversal is finished,
generate the MHT]
Fig.5.5.1 The process flow of generating the search authenticator
The following formal definition describes the process of generating the search
authenticator in detail.
The process of generating the search authenticator
$Auth(K, \delta, F)$: Input the user key $K$, the file set $F$ and the inverted index
$\delta$, and process as follows:
1. For each keyword $w \in W$, compute
$\alpha_w := \langle F_{K_1}(w), \sum_{f \in f_w} IH(G_{K_2}(w, f)) \rangle$.
2. Use $\{\alpha_w : w \in W\}$ as the leaf nodes to construct the MHT. $\alpha$ stands for
the MHT and $st$ stands for the value of the root node of the MHT.
3. The algorithm outputs the authenticator $\alpha$ and the DSA state $st$.
Executing the algorithm DSA.Auth yields the search authenticator $\alpha$ and the
authentication value $st$. The client is responsible for storing $st$. The search
authenticator $\alpha$ and the results generated by SSE.Enc are stored on the server.
5.6
Upload file
The operation of uploading files is completed by both the client and the server,
which is an interactive process.
After the client has encrypted the files and generated the search authenticator and the
encrypted index, it uploads the encrypted files, the encrypted index and the search
authenticator to the server for storage.
The data filled into each field of the message differs according to what is uploaded.
• When the client sends the encrypted files to the server, each field of the
message is filled as follows:
Message type field: 0x01, it indicates the operation is adding files the first time.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x00, it indicates the data portion carries the encrypted files.
Length field and Data field will be filled based on the actual situation.
• When the client sends the encrypted index to the server, each field of the
message is filled as follows:
Message type field: 0x01, it indicates the operation is adding files the first time.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x01, it indicates the data portion carries the index.
Length field and Data field will be filled based on the actual situation.
• When the client sends the search authenticator to the server, each field of the
message is filled as follows:
Message type field: 0x01, it indicates the operation is adding files the first time.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x07, it indicates the data portion carries the search authenticator.
Length field and Data field will be filled based on the actual situation.
5.7
Store files
The operation of storing files is performed on the server.
When the server receives a connection request from the client, it first checks
whether the user exists. If the user exists, the server directly stores the received
files in the user's corresponding folder. If the user does not exist, the server creates
a new folder and stores the files in it.
6.
File Search
6.1
Overview
When the user retrieves the files that contain some keyword, the client first uses the
algorithm SSE.SrchToken() to process the keyword and obtain the corresponding search
token. When the server receives the search token, it reads the user's encrypted index and
uses the algorithm SSE.Search() to search. The server then finds the cipher-text set
corresponding to the search token and sends the result to the user. The user receives and
decrypts the cipher-text. Through this process, the user obtains the file set
corresponding to the keyword without divulging any useful information to the server.
[Figure: sequence diagram. The client runs SSE.SrchToken(K_SSE, w) on the search keyword w
with the SSE key K_SSE and sends the search token to the server; the server runs SSE.Search
on the search token, the encrypted index and the cipher-text set c to obtain I_w and returns
c_W := {c_i in c : i in I_w}, the cipher-text set after searching; the client runs
SSE.Dec(K_SKE, c_W) with the symmetric encryption key K_SKE]
Fig. 6.1.1 Sequence Diagram of Search Operation
6.2
Generate search token
The operation of generating the search token is performed locally on the client.
$SrchToken(K, w)$: Input the keyword $w$ and the key $K$, and output the search token
$\tau_s := (F_{K_1}(w), G_{K_2}(w), P_{K_3}(w))$.
6.3
Send search token
The operation of sending the search token is completed by both the client and the
server, which is an interactive process. The search token generated by the client is
sent to the server.
• When the client sends the search token to the server, each field of the
message is filled as follows:
Message type field: 0x02, it indicates the operation of sending the search token
belongs to search part.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x04, it indicates the data portion carries the search token.
Length field and Data field will be filled based on the actual situation.
6.4
Search
At any time, the user can input a keyword and send a request to the server to
query all the files which contain the keyword.
$Search(c, \lambda, \tau_s)$: Input the encrypted index $\lambda$, the search token
$\tau_s$ and the cipher-text set $c$, and process as follows:
1. Compute $(\alpha_1, \alpha_1') := T_s[F_{K_1}(w)] \oplus G_{K_2}(w)$ to find the
coordinate of the head node $N_1$ of the keyword linked list $L_w$. $\alpha_1$ stands for
the coordinate of $N_1$ in $A_s$ and $\alpha_1'$ stands for the coordinate of $N_1^*$ in
$A_d$.
2. Express the content of $N_1$ in $A_s$ as $(v_1, r_1)$ and compute
$(id_1, 0, loc_s(N_2)) := v_1 \oplus H_1(P_{K_3}(w), r_1)$ to get the file identifier
$id_1$ corresponding to the node $N_1$ and the coordinate $loc_s(N_2)$ of the next node in
$A_s$.
3. If $loc_s(N_2)$ is not equal to zero, repeat step 2 for the next node. If
$loc_s(N_2)$ is equal to zero, the algorithm stops.
$I_w = \{id_1, \ldots, id_m\}$ stands for the set of file identifiers found by the search.
Find the cipher-text corresponding to each identifier and output $\{c_i\}_{i \in I_w}$.
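The list walk performed by the search algorithm can be sketched as follows. This is a simplified illustration under several assumptions: HMAC-SHA256 and SHA-256 stand in for the pseudorandom functions and for H1, the dual-node coordinate in the deletion array is omitted from the table entry, location 0 is reserved as the null pointer, and all names are hypothetical.

```python
import hashlib
import hmac
import secrets
import struct

NODE = struct.Struct(">8sHH")  # file id, previous location, next location


def prf(key: bytes, label: bytes, data: bytes) -> bytes:
    return hmac.new(key, label + data, hashlib.sha256).digest()[:16]


def h1(p_w: bytes, r: bytes) -> bytes:
    """Stand-in for H1, truncated to the node size."""
    return hashlib.sha256(p_w + r).digest()[:NODE.size]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def token(keys, w: bytes):
    """(F_K1(w), G_K2(w), P_K3(w))"""
    return tuple(prf(k, lab, w) for k, lab in zip(keys, (b"F", b"G", b"P")))


def store_list(A_s, T_s, keys, w: bytes, file_ids):
    """Client side: write the masked list for keyword w at random free slots."""
    F_w, G_w, P_w = token(keys, w)
    free = [i for i in range(1, len(A_s)) if A_s[i] is None]
    locs = secrets.SystemRandom().sample(free, len(file_ids))
    for i, fid in enumerate(file_ids):
        prev_loc = locs[i - 1] if i > 0 else 0
        next_loc = locs[i + 1] if i + 1 < len(locs) else 0
        r = secrets.token_bytes(16)
        A_s[locs[i]] = (xor(NODE.pack(fid, prev_loc, next_loc), h1(P_w, r)), r)
    T_s[F_w] = xor(struct.pack(">H", locs[0]), G_w)  # masked head location


def search(A_s, T_s, tau_s):
    """Server side: unmask the head location, then follow the list."""
    F_w, G_w, P_w = tau_s
    if F_w not in T_s:
        return []
    (loc,) = struct.unpack(">H", xor(T_s[F_w], G_w))
    ids = []
    while loc != 0:
        v, r = A_s[loc]
        fid, _prev, loc = NODE.unpack(xor(v, h1(P_w, r)))
        ids.append(fid.rstrip(b"\0"))
    return ids


keys = [secrets.token_bytes(16) for _ in range(3)]
A_s, T_s = [None] * 64, {}
store_list(A_s, T_s, keys, b"w2", [b"f1", b"f2", b"f3"])
print(search(A_s, T_s, token(keys, b"w2")))  # [b'f1', b'f2', b'f3']
```

Without the token, each stored node is an XOR-masked blob plus a random string, so the server cannot follow a list until the client reveals the token for that keyword.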
6.5
Return result
The operation of returning the result is completed by both the client and the
server, which is an interactive process. The search result generated by the server is
sent to the client.
• When the server sends the search result to the client, each field of the
message is filled as follows:
Message type field: 0x03, it indicates the operation of returning the search result
belongs to the search part.
Direction field: 0x01, it indicates the message is sent from the server to the
client.
Type field: 0x00, it indicates the data portion carries the cipher-text.
Length field and Data field will be filled based on the actual situation.
6.6
Decryption
The operation of decryption is performed locally on the client.
Because the files which the user receives from the server are encrypted, the client
needs to use the same symmetric encryption algorithm to decrypt them.
$Dec(K, c)$: Input the key $K$ and the cipher-text set $c = \{c_1, \ldots, c_m\}$, and
compute $f_i := SKE.Dec(K, c_i)$ for each $c_i$ to get the plaintext.
7.
Challenge and Proof
7.1
Overview
The verify operation must be performed together with a search operation. The diagram
below is a sequence diagram of the authentication process. The process includes two
parts: challenge and prove. First, the client generates the challenge corresponding to a
particular keyword search and sends it to the server. After the server receives the
challenge, it reads the user's search authenticator and generates the proof for that
search. After the client receives the proof, it decrypts the file set returned by the
search process and validates the result. Finally, the client judges whether the server
has behaved legitimately according to the output of the verification algorithm.
[Figure: sequence diagram. The client runs SSE.SrchToken(K_SSE, w) and DSA.Chall(K_DSA, w)
and sends the search token and the challenge ch to the server; the server runs SSE.Search
on the search token, the encrypted index and the cipher-text set c to obtain I_w, runs
DSA.Prove on ch and the search authenticator, and returns the cipher-text set
c_W := {c_i in c : i in I_w} together with the proof; the client runs SSE.Dec(K_SKE, c_W)
and then DSA.Vrfy on the searched files f_w, the proof and the DSA state st]
Fig. 7.1.1 Sequence Diagram of Authentication Algorithm
7.2
Generate challenge
The operation of generating the challenge is performed locally on the client.
$Chall(K, w)$: Input the key $K$ and the search keyword $w$, and then compute and output
the challenge $ch := F_{K_1}(w)$.
7.3
Send challenge
The operation of sending the challenge is completed by both the client and the server,
which is an interactive process. The challenge generated by the client is sent to the
server.
• When the client sends the challenge to the server, each field of the message
is filled as follows:
Message type field: 0x04, it indicates the operation of sending challenge belongs
to the authentication part.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x05, it indicates the data portion carries the challenge.
Length field and Data field will be filled based on the actual situation.
7.4
Generate proof
The operation of generating the proof is performed on the server.
$Prove(\alpha, ch)$: Input the search authenticator $\alpha$ and the challenge $ch$, and
process as follows:
1. Traverse $\alpha$ and find the first leaf node $M$ whose element is $ch$.
2. Traverse from node $M$ to the root node, and record the sibling node value $\pi_i$
of every node on the traversal path.
3. Output the proof $\pi := \{\pi_i, 1 \le i \le h\}$, where $h$ is the height of the
authenticator $\alpha$.
7.5
Send proof
The operation of sending the proof is completed by both the client and the server,
which is an interactive process. The proof generated by the server is sent to the client.
• When the server sends the proof to the client, each field of the message
is filled as follows:
Message type field: 0x05, it indicates the operation of sending proof belongs
to the authentication part.
Direction field: 0x01, it indicates the message is sent from the server to the
client.
Type field: 0x06, it indicates the data portion carries the proof.
Length field and Data field will be filled based on the actual situation.
7.6
Validate proof
The operation of validating the proof is performed locally on the client.
$Vrfy(st, \pi, f_w)$: Input the files $f_w$ returned by the search, the proof $\pi$ and
the state $st$, and process as follows:
1. Compute $\sum_{f \in f_w} IH(G_{K_2}(w, f))$ to get the value of the leaf node
$\rho_0$.
2. For each $1 \le i \le h$, compute $\rho_0 := IH(\rho_0, \pi_i)$ to get the validation
value $\rho_0$.
3. If $\rho_0 = st$, the validation is passed and the algorithm outputs 1. Otherwise it
outputs 0.
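The proof generation and validation steps can be sketched together as a generic Merkle path check. The sketch makes several assumptions: SHA-256 replaces the hash IH, the leaf position is passed explicitly so the verifier knows on which side each sibling lies, the number of leaves is a power of two, and all function names are illustrative.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Assumed hash: SHA-256 over the concatenated inputs."""
    return hashlib.sha256(b"".join(parts)).digest()


def build_levels(leaves):
    """Return every tree level, leaf level first and root level last."""
    levels = [[h(x) for x in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([h(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index: int):
    """Server: sibling values on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])   # sibling of the current node
        index //= 2
    return proof


def verify(root: bytes, leaf: bytes, index: int, proof) -> bool:
    """Client: recompute the root from the leaf and the proof."""
    node = h(leaf)
    for sibling in proof:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node == root


leaves = [f"leaf{n}".encode() for n in range(8)]
levels = build_levels(leaves)
st = levels[-1][0]                       # root value kept by the client
pi = prove(levels, 5)
print(verify(st, leaves[5], 5, pi))      # True
```

The proof holds one sibling value per level, so its length equals the tree height (3 for eight leaves), which is what keeps the communication cost low.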
8.
File Update
8.1
Overview
The timing diagram of the update operation is shown below.
[Figure: sequence diagram. The client runs SSE.AddToken(K_SSE, f, inverted index) and
DSA.AddToken(K_DSA, f, inverted index) and sends the cipher-text c_f of the files and the
add token, which consists of the SSE add token and the DSA add token, to the server; the
server runs SSE.Add on the encrypted index and DSA.Add on the search authenticator and
returns the update information; the client runs DSA.Update on the DSA state st]
Fig. 8.1.1 Sequence Diagram of Add Files
When a user adds a file, the client first uses the file and its keywords to generate
the add tokens (the SSE token and the DSA token) and encrypts the file. Then the client
uploads the cipher-text and the add tokens. The server updates the encrypted index and
the search authenticator using the tokens and at the same time stores the cipher-text.
Then it returns the update information to the client. The client updates the local DSA
state using the update information.
The process of deleting files is similar to that of adding files. When deleting files,
the availability of a local copy of the file has to be considered: if no local copy
exists, the deletion algorithm first needs to download from the server the files which
the user wants to delete, and the client then generates the deletion token from those
files. In what follows we assume that the client has copies of the files to be removed.
8.2 Generate Keys
Key generation process
Gen(1^k): 1^k is the security parameter of the system. Select two k-bit strings
uniformly at random according to the security parameter.
The key generation algorithm Gen(1^k) runs on the client and generates the key of the
DSA algorithm. The generated key consists of two parts, which are the keys of two
pseudo-random functions. As with the key generation of the SSE algorithm, the DSA key
is generated and used only on the client, and the client is responsible for keeping
the key.
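A minimal sketch of Gen(1^k) is given below. It assumes that k is a multiple of 8 and
uses the operating system's cryptographic random source through Python's secrets
module. The function name gen and the byte-string key representation are illustrative
choices, not part of the protocol.

```python
import secrets

def gen(k: int) -> tuple[bytes, bytes]:
    """Return two independent k-bit keys, one per pseudo-random function."""
    if k <= 0 or k % 8 != 0:
        raise ValueError("k must be a positive multiple of 8 in this sketch")
    return secrets.token_bytes(k // 8), secrets.token_bytes(k // 8)

# Example: a 128-bit security parameter yields two 16-byte keys.
K1, K2 = gen(128)
```

Both keys stay on the client. Nothing in this step is sent to the server.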
8.3 Generate update token
The operation of generating the update token is performed locally on the client.
The update operation includes the add operation and the deletion operation, so the
update token is either an add token or a deletion token.
Generating the add token:
AddToken(K, f, δ_f): Input the key K, the file f which is to be added and the
inverted index δ_f, and process as follows:
1. For each keyword w_i (1 ≤ i ≤ #f) of the file, compute
   λ_i := (F_K1(w_i), G_K2(w_i), ⟨id(f), 0, 0⟩ ⊕ H1(P_K3(w_i), r_i), r_i,
           ⟨0, 0, 0, 0, 0, 0, F_K1(w_i)⟩ ⊕ H2(P_K3(f), r'_i), r'_i),
   where r_i and r'_i are fixed-length random strings.
2. Compute τ_a := (F_K1(f), G_K2(f), λ_1, ..., λ_#f) from the results of step 1.
3. Encrypt the file: c_f ← SKE.Enc_K4(f).
4. The algorithm outputs the add token τ_a and the cipher-text c_f.
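Steps 1 and 2 can be sketched as follows. Everything concrete here is an assumption
made for illustration: HMAC-SHA256 truncated to 16 bytes stands in for the
pseudo-random functions F, G and P, SHAKE-256 stands in for H1 and H2, every field is
16 bytes wide, and the names prf, mask, xor and add_token are hypothetical. The SKE
encryption of step 3 is left out because the Python standard library has no symmetric
cipher.

```python
import hashlib
import hmac
import secrets

W = 16  # bytes per field; an illustrative choice

def prf(key: bytes, msg: bytes) -> bytes:
    """HMAC-SHA256 truncated to W bytes, standing in for F, G and P."""
    return hmac.new(key, msg, hashlib.sha256).digest()[:W]

def mask(label: bytes, seed: bytes, r: bytes, n: int) -> bytes:
    """Stand-in for H1/H2: n pseudo-random bytes derived from (seed, r)."""
    return hashlib.shake_256(label + seed + r).digest(n)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def add_token(K, file_id: bytes, file_bytes: bytes, keywords: list[bytes]):
    """Return (F_K1(f), G_K2(f), lambda_1, ..., lambda_#f); file_id must be W bytes."""
    K1, K2, K3 = K
    zero = bytes(W)
    lams = []
    for w in keywords:
        r, r2 = secrets.token_bytes(W), secrets.token_bytes(W)
        s_plain = file_id + zero + zero          # <id(f), 0, 0>
        d_plain = zero * 6 + prf(K1, w)          # <0, 0, 0, 0, 0, 0, F_K1(w)>
        lams.append((
            prf(K1, w),
            prf(K2, w),
            xor(s_plain, mask(b"H1", prf(K3, w), r, 3 * W)),
            r,
            xor(d_plain, mask(b"H2", prf(K3, file_bytes), r2, 7 * W)),
            r2,
        ))
    return (prf(K1, file_bytes), prf(K2, file_bytes), *lams)

# Example: one file with three keywords.
K = tuple(secrets.token_bytes(16) for _ in range(3))
fid = hashlib.sha256(b"report.txt").digest()[:W]
doc = b"cloud storage search"
tau_a = add_token(K, fid, doc, [b"cloud", b"storage", b"search"])
```

The third and fifth components of each λ_i are masked templates: the pointer fields are
zero so that the server can later fill them in by XOR without learning the mask.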
Generating the deletion token:
DelToken(K, f, δ_f): Input the key K and the file f, and compute
τ_d := (F_K1(f), G_K2(f), P_K3(f), id(f)). The algorithm outputs the deletion token τ_d.
8.4 Send token/file
After generating the update token, the client sends it to the server. Sending the token
is an interactive process completed by both the client and the server. When the client
adds a file, it must upload the encrypted file in addition to the add token.
Add files:
The process of adding files is divided into two sub-processes: sending the token
and sending the files.
When the client sends the token to the server, each field of the message is
filled as follows:
Message type field: 0x06, it indicates the operation of adding files belongs to the
update part.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x02, it indicates the data portion carries the add token.
Length field and Data field will be filled based on the actual situation.
When the client sends the encrypted file to the server, each field of the
message is filled as follows:
Message type field: 0x06, it indicates the operation of adding files belongs to the
update part.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x00, it indicates the data portion carries the encrypted file.
Length field and Data field will be filled based on the actual situation.
Delete files:
When the client sends the token to the server, each field of the message is
filled as follows:
Message type field: 0x07, it indicates the operation of deleting files belongs to
the update part.
Direction field: 0x00, it indicates the message is sent from the client to the
server.
Type field: 0x00, it indicates the data portion carries the deletion token.
Length field and Data field will be filled based on the actual situation.
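The field values above can be illustrated with a small framing sketch. The wire layout
used here (one byte each for the message type, direction and type fields, followed by a
4-byte big-endian length and the data) is an assumption made for the example, and the
names pack_message and unpack_message are hypothetical. The field widths defined in the
message format section of this protocol take precedence.

```python
import struct

# Assumed layout: message type, direction, type, length (big-endian), then data.
HEADER = struct.Struct(">BBBI")

MSG_ADD, MSG_DELETE = 0x06, 0x07
DIR_CLIENT_TO_SERVER = 0x00
TYPE_ADD_TOKEN, TYPE_ENCRYPTED_FILE = 0x02, 0x00

def pack_message(msg_type: int, direction: int, data_type: int, data: bytes) -> bytes:
    """Prefix the data with the four header fields."""
    return HEADER.pack(msg_type, direction, data_type, len(data)) + data

def unpack_message(frame: bytes):
    """Split a frame into (message type, direction, type, data)."""
    msg_type, direction, data_type, length = HEADER.unpack_from(frame)
    data = frame[HEADER.size:HEADER.size + length]
    if len(data) != length:
        raise ValueError("truncated frame")
    return msg_type, direction, data_type, data

# Adding a file takes two messages: the add token, then the encrypted file.
token_frame = pack_message(MSG_ADD, DIR_CLIENT_TO_SERVER, TYPE_ADD_TOKEN, b"token-bytes")
file_frame = pack_message(MSG_ADD, DIR_CLIENT_TO_SERVER, TYPE_ENCRYPTED_FILE, b"ciphertext")
```

Because the Length field is explicit, the receiver can reject a frame whose data
portion is shorter than announced, which corresponds to the operation error case of
a message filled in incorrectly.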
8.5 Update files
The operation of updating files is performed locally on the server.
When the server receives the files from the client, it stores them locally.
8.6 Update index
The operation of updating the index is performed locally on the server.
Update index (add file operation)
Adding file process
Add(γ, τ_a, c_f): Input the cipher-text c_f, the encrypted index γ and the add token τ_a,
and process as follows:
1. Store the cipher-text: c' := c_f ∪ c.
2. Parse τ_a as (τ_1, τ_2, λ_1, ..., λ_#f). For each λ_i (1 ≤ i ≤ #f), process as follows:
(a) Find the coordinate φ of the head node M of the free list L_free in A_s through
T_s[free].
(b) Compute (φ_{-1}, φ*) := A_s[φ] to find φ_{-1}, the coordinate of the next free node,
and φ*, the coordinate of the dual node of M in A_d.
(c) Set T_s[free] := φ_{-1} to update the linked list L_free.
(d) Compute (α_1, α_1*) := T_s[λ_i[1]] ⊕ λ_i[2] to find the coordinate α_1 of the head
node N of the linked list L_wi in A_s, and the coordinate α_1* of the dual node of N.
(e) Let A_s[α_1] be (N, r). Set A_s[α_1] := (N ⊕ ⟨0, φ, 0⟩, r), so that node M becomes
the head node of the linked list L_wi.
(f) Set A_s[φ] := (λ_i[3] ⊕ ⟨0, 0, α_1⟩, λ_i[4]) to update the content of node M.
(g) Set T_s[λ_i[1]] := (φ, φ*) ⊕ λ_i[2] to update the search table T_s.
(h) Let A_d[α_1*] be (D, r). Set A_d[α_1*] := (D ⊕ ⟨0, φ*, 0, 0, φ, 0, 0⟩, r) to
update the content of node N*.
(i) Set A_d[φ*] := (λ_i[5] ⊕ ⟨φ*_{-1}, 0, α_1*, φ, 0, α_1, λ_i[1]⟩, λ_i[6]) to update the
content of node M*.
(j) Set T_d[τ_1] := φ* ⊕ τ_2 to update the table T_d.
3. The algorithm outputs the new encrypted file set c' and the updated encrypted index
γ' = (A_s, T_s, A_d, T_d).
Update index (delete file operation)
Deleting file process
Del(γ, τ_d, c): Input the encrypted index γ, the encrypted file set c and the deletion
token τ_d, and process as follows:
1. Parse τ_d as (τ_1, τ_2, τ_3, id). Compute α'_1 := T_d[τ_1] ⊕ τ_2 to find the coordinate
of the head node D_1 of the linked list L_f in A_d.
2. For each node D_i (1 ≤ i ≤ #f) in the linked list L_f, process as follows:
(a) Compute (α_1, ..., α_6, μ) := D ⊕ H2(τ_3, r), in which (D, r) := A_d[α'_i].
(b) Fill A_d[α'_i] with a random string.
(c) Compute φ := T_s[free] to find the address of the head node of the linked list L_free.
(d) Set T_s[free] := α_4, so that N, the dual node of D_i, becomes the head node of the
linked list L_free.
(e) Set A_s[α_4] := (φ, α'_i) to update the content of N, the dual node of D_i, and
place it in the linked list L_free.
(f) Let N_{-1} be the previous node of N in the keyword linked list. Set
A_s[α_5] := (β_1, β_2 ⊕ α_4 ⊕ α_6, β_3, r_{-1}), in which (β_1, β_2, β_3, r_{-1}) := A_s[α_5].
Set A_d[α_2] := (β_1, β_2, β_3 ⊕ α'_i ⊕ α_3, β_4, β_5, β_6 ⊕ α_4 ⊕ α_6, μ*, r*_{-1}),
in which (β_1, ..., β_6, μ*, r*_{-1}) := A_d[α_2].
(g) Let N_{+1} be the next node of N in the keyword linked list. Set
A_s[α_6] := (β_1, β_2 ⊕ α_4 ⊕ α_5, β_3, r_{+1}), in which (β_1, β_2, β_3, r_{+1}) := A_s[α_6].
Set A_d[α_3] := (β_1, β_2 ⊕ α'_i ⊕ α_2, β_3, β_4, β_5 ⊕ α_4 ⊕ α_5, β_6, μ*, r*_{+1}),
in which (β_1, ..., β_6, μ*, r*_{+1}) := A_d[α_3].
(h) Set α'_{i+1} := α_1, and execute (a) for the next node.
3. Delete the file with the identifier id_f from the encrypted file set: c' := c \ c_f.
4. The algorithm outputs the new encrypted file set c' and the updated encrypted index
γ' = (A_s, T_s, A_d, T_d).
The process flow of deleting files is approximately the same as that of adding files.
The client generates the deletion token using the files and the keywords and sends it
to the server. The server reads the encrypted index, updates it using the deletion
token, and deletes the corresponding files from the encrypted file set.
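The list handling behind Add and Del can be sketched on a simplified model of the
search array A_s. The sketch below is an illustration under several assumptions: only
A_s and T_s are modelled (no deletion array, no cipher-texts), address 0 is reserved as
the null address, the search table is kept unmasked, and each node stores its own mask
so that the example can check results. In the protocol the mask is derived from H1 and
is not known to the server. The point shown is that the server patches a masked pointer
by XORing in (old address ⊕ new address), without ever removing the mask.

```python
import secrets

NIL = 0  # slot 0 is reserved as the null address in this sketch

def make_index(size):
    """Slots 1..size-1 form the free list; Ts['free'] is its head."""
    As = [None] + [{"free_next": i - 1} for i in range(1, size)]
    return As, {"free": size - 1}

def add(As, Ts, keyword, file_id):
    phi = Ts["free"]                          # head of the free list
    if phi == NIL:
        raise MemoryError("search array is full")
    Ts["free"] = As[phi]["free_next"]         # advance the free list
    head = Ts.get(keyword, NIL)               # current head of the keyword list
    m = secrets.randbits(32)                  # stands in for the H1 mask
    As[phi] = {"id": file_id, "prev": NIL ^ m, "next": head ^ m, "mask": m}
    if head != NIL:
        As[head]["prev"] ^= NIL ^ phi         # old head: prev changes NIL -> phi
    Ts[keyword] = phi
    return phi

def delete(As, Ts, keyword, addr, prev, nxt):
    """Unlink slot `addr`; `prev` and `nxt` are the addresses the token reveals."""
    if prev != NIL:
        As[prev]["next"] ^= addr ^ nxt        # patch the pointer without unmasking
    else:
        Ts[keyword] = nxt
    if nxt != NIL:
        As[nxt]["prev"] ^= addr ^ prev
    As[addr] = {"free_next": Ts["free"]}      # push the slot onto the free list
    Ts["free"] = addr

def postings(As, Ts, keyword):
    """Key-holder view: unmask the next pointers and walk the list."""
    out, cur = [], Ts.get(keyword, NIL)
    while cur != NIL:
        node = As[cur]
        out.append(node["id"])
        cur = node["next"] ^ node["mask"]
    return out

As, Ts = make_index(5)
a1 = add(As, Ts, "cloud", "f1")
a2 = add(As, Ts, "cloud", "f2")
a3 = add(As, Ts, "cloud", "f3")
before = postings(As, Ts, "cloud")
delete(As, Ts, "cloud", a2, prev=a3, nxt=a1)
after = postings(As, Ts, "cloud")
```

After the deletion, the freed slot is the head of the free list and is the first one
reused by the next add operation.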
8.7 Update Search Authenticator
The operation of updating the search authenticator is performed locally on the server.
So that the user can still validate search results after files are added or deleted,
the search authenticator must be updated after each add or delete operation. The DSA
state (hash tree) stored by the user also needs the corresponding update operation.
When the user needs to add a file, the client first generates the add token using the
file and its keywords, and then encrypts the file. Finally it sends the add token and
the encrypted file to the server. After the server receives the data, it reads the
user's search authenticator, updates it according to the add token, and stores the
user's encrypted file. The process flow of deleting files is roughly the same, except
that the file storage operation is replaced by a file deletion operation. The specific
operations are as follows:
Generate update information:
Add(α, τ_a): Input the search authenticator α and the add token τ_a, and process as
follows:
1. Parse the add token as (τ_1, ..., τ_n, h_1, ..., h_n).
2. For 1 ≤ i ≤ n:
(a) Find the first leaf node M := θ_i whose value is τ_i in the authenticator α.
(b) Parse θ_i as (θ_{i,1}, θ_{i,2}). Compute θ'_i := (θ_{i,1}, θ_{i,2} + IH(h_i)).
(c) Record the critical path ρ_i from M to the root.
3. Reconstruct the MHT with the nodes (θ'_1, ..., θ'_n) and the nodes that do not change,
to get the new authenticator α'.
4. Output the new authenticator α' and the update information ρ := (ρ_1, ..., ρ_n).
8.8 Return new DSA
The operation of returning the DSA state is an interactive process completed by both
the client and the server. The new DSA state generated by the server is sent to the
client.
When the server sends the DSA state to the client, each field of the message
is filled as follows:
Message type field: 0x07, it indicates the operation of returning DSA belongs to
the update part.
Direction field: 0x01, it indicates the message is sent from the server to the
client.
Type field: 0x00, it indicates the data portion carries the DSA state.
Length field and Data field will be filled based on the actual situation.
8.9 Update DSA
The operation of updating the DSA state is performed locally on the client.
The DSA state is the value of the root node. It is stored locally on the client and
used in the authentication operation.
Update(st, ρ, τ_a): Input the DSA state st, the update information ρ and the add token
τ_a, and process as follows:
1. Let the token be (τ_1, ..., τ_n, h_1, ..., h_n), the update information be
(ρ_1, ..., ρ_n), and the leaf node corresponding to ρ_i be θ_i = (θ_{i,1}, θ_{i,2}).
2. Validate ρ_i using the state st, for 1 ≤ i ≤ n. If the validation passes, continue.
Otherwise the algorithm outputs ⊥.
3. Compute θ'_i := (θ_{i,1}, θ_{i,2} + h_i), for 1 ≤ i ≤ n.
4. Use (θ'_1, ..., θ'_n) to update st.
5. Output the new state value st'.
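The client-side update can be sketched for a single token as follows. This is an
illustration under stated assumptions rather than the protocol's definition: SHA-256
serves as the tree hash, the incremental hash IH is modelled as SHA-256 read as an
integer and accumulated by addition modulo 2^256 (chosen only to make the example
runnable, not as a security recommendation), the client applies IH(h) to the leaf so
that its new root matches the one the server builds in section 8.7, and the names ih,
leaf_hash, root_from_path and update are hypothetical.

```python
import hashlib

MOD = 1 << 256

def ih(x: bytes) -> int:
    """Stand-in incremental hash value of one element."""
    return int.from_bytes(hashlib.sha256(x).digest(), "big")

def leaf_hash(tag: bytes, acc: int) -> bytes:
    return hashlib.sha256(b"leaf" + tag + acc.to_bytes(32, "big")).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"node" + left + right).digest()

def root_from_path(leaf: bytes, path) -> bytes:
    """Fold a leaf hash with its (sibling, sibling_is_left) path up to the root."""
    value = leaf
    for sibling, sibling_is_left in path:
        value = node_hash(sibling, value) if sibling_is_left else node_hash(value, sibling)
    return value

def update(st: bytes, tag: bytes, acc: int, path, h: bytes):
    """Check the old leaf against st, then return the new root, or None on failure."""
    if root_from_path(leaf_hash(tag, acc), path) != st:
        return None                                  # corresponds to the reject output
    new_acc = (acc + ih(h)) % MOD
    return root_from_path(leaf_hash(tag, new_acc), path)

# Two-leaf tree: the client only knows the root st and receives leaf 0 with its path.
leaf0, leaf1 = (b"t0", 5), (b"t1", 9)
st = node_hash(leaf_hash(*leaf0), leaf_hash(*leaf1))
path0 = [(leaf_hash(*leaf1), False)]
new_st = update(st, b"t0", 5, path0, b"h")

# What the server obtains by rebuilding the tree with the updated leaf.
server_root = node_hash(leaf_hash(b"t0", (5 + ih(b"h")) % MOD), leaf_hash(*leaf1))
```

With several tokens the paths would have to be applied one after another, since
changing one leaf alters sibling values on the paths of the others.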
9. Error Handling
Error Definition
Result | Reason | Action
Log-on error | Password is incorrect / cannot connect to the Internet | Message pops up, and input again / check the Internet
Uploading error | Cannot connect to the Internet / file is occupied | Message pops up, and check the network / close occupied file
Operation error | The content of the message is filled in incorrectly (for example, the message transmission direction) | Message pops up, and check the content filled in
