Robust Watermarking For Relational Database

International Journal of Communication and Computer Technologies (www.ijccts.org)
Volume 01 – No.41, Issue: 05 May 2013
ISSN NUMBER : 2278-9723
Loganayaki.A
Department of CSE, VMKVEC.
ABSTRACT - A watermark is information embedded in data that can be used to prove ownership, for example by identifying the owner, origin, or recipient of the content. The watermark is embedded securely and imperceptibly, so its presence is not noticeable in the data. We implemented a proof of concept of our watermarking technique and showed experimentally that it is resilient to tuple deletion, alteration, and insertion attacks. The technique also provides detectability, i.e., the original data can be retrieved only with the help of the key value. The watermark is robustly embedded into the original content, which means that the original values can still be recovered even if alterations occur. The system is blind, because knowledge of the original content is not needed for the watermarking process. Furthermore, watermark detection is blind, that is, it requires knowledge of neither the original data nor the watermark.
1. INTRODUCTION
Proving ownership rights on outsourced relational databases is a crucial issue in today's internet-based application environments and in many content distribution applications. In this paper, a new mechanism is proposed for proof of ownership based on the secure embedding of a robust, imperceptible watermark in relational data. The proposed mechanism mainly involves encoding and decoding on a numerical attribute of the relational database.
In the first phase, the original data is partitioned and a partition number is assigned to every tuple of the relation using a cryptographic hash function (MD5). In the second phase, the desired watermark W is selected, a bit bi is selected from the partitioned data, and that bit is changed according to W. Whenever an original data value changes because of a watermark bit, the data usability constraints are checked. In the third phase, after the watermark has been inserted into the partitions, all partitions are merged to obtain the complete watermarked data. During decoding, a majority voting algorithm is used to recover the correct watermark.
The watermark is a piece of data that is embedded securely and imperceptibly. Imperceptible embedding means that the presence of the watermark is unnoticeable in the data.
2. EXISTING SYSTEM
In the existing technique, the original data is encrypted with public and private keys using the RSA algorithm.
Encryption is the process of converting the original content into cipher text to make it unreadable to anyone except those possessing special knowledge, usually referred to as a key.
A public key and a private key are used together. The public key is used for encryption and is known to both the authors and the users, whereas the private (or secret) key is used for decryption and is known only to its owner. The combined use of public and private keys is known as asymmetric encryption.
In our assessment, this approach is not secure enough, because the original data can be decrypted with simple manipulations, so the original content can be retrieved easily. This is the main drawback of the existing system.
3. PROPOSED SYSTEM
In the proposed system, a bit encoding algorithm is used to encode the data. It converts the original string into ASCII codes and then into binary digits. Data padding then inserts 0s and 1s into the encoded binary value, and the result is sent to the receiver.
On the receiver side, user authentication is checked first. If the receiver is an administrator, he first checks whether the data has been modified by comparing the data sent by the sender admin with the original data stored in the database. If the values in the two tables match, the receiver obtains the original data through the decoding process. If the values in the two tables do not match, an alert box informs the receiver that the data has been modified.
Benefits:
*Proof of ownership.
*Even if intruders hack the data, they obtain only a wrong value, not the original value.
*If any modifications are made while the data is transferred, the original data can easily be recovered with the help of the key.
The architecture of the watermarking model is shown in the block diagram in Fig. 1.
As shown in the block diagram, the proposed approach consists of two main techniques:
1) Single Bit Encoding
2) Single Bit Decoding
Fig. 1. Watermarking Model with Encoder and Decoder
The original data set D is partitioned into m partitions based on a cryptographic hash function (MD5). The single bit encoding function takes three input parameters: the secret key Ks, the number of partitions m, and the watermark W, which is known only to the copyright owner. It transforms the original data into the watermarked data DW. During watermarking with single bit encoding, some of the data is modified, but the usability constraints are maintained using the usability matrix. Single bit encoding comprises the data partitioning and watermark embedding algorithms. Single bit decoding comprises the majority voting and watermark detection algorithms.
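The encoder and decoder described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes integer attribute values, assumes that partition i carries watermark bit (i mod |W|) in the least significant bit of each value, and models the usability constraint as a maximum allowed absolute change `max_change`. The names `embed_bit`, `embed_watermark`, and `decode_watermark` are hypothetical.

```python
from collections import Counter


def embed_bit(value: int, bit: int, max_change: int = 1) -> int:
    """Set the least significant bit of value to bit, if usability allows it."""
    marked = (value & ~1) | bit
    # Assumed usability constraint: a value may move by at most max_change.
    return marked if abs(marked - value) <= max_change else value


def embed_watermark(partitions, watermark, max_change=1):
    """Embed bit (i mod len(watermark)) into every value of partition i."""
    return [
        [embed_bit(v, watermark[i % len(watermark)], max_change) for v in part]
        for i, part in enumerate(partitions)
    ]


def decode_watermark(partitions, wm_len):
    """Recover each watermark bit by majority vote over all values carrying it."""
    votes = [Counter() for _ in range(wm_len)]
    for i, part in enumerate(partitions):
        for v in part:
            votes[i % wm_len][v & 1] += 1
    # Ties (and bits with no votes) fall back to 0 in this sketch.
    return [1 if c[1] > c[0] else 0 for c in votes]


watermark = [1, 0, 1]
partitions = [[10, 12, 15], [7, 9, 20], [4, 6, 8],
              [3, 5, 11], [2, 13, 14], [21, 22, 30]]
marked = embed_watermark(partitions, watermark)

# Simulate an alteration attack that flips the embedded bit in two values.
attacked = [list(p) for p in marked]
attacked[0][0] ^= 1
attacked[4][1] ^= 1
recovered = decode_watermark(attacked, len(watermark))
```

Because every watermark bit is embedded in many values, a few altered or deleted tuples do not change the majority, which is why repeated embedding and voting go together in this sketch.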
If hackers attack the data, they obtain only the duplicate value that is stored in the other database. Thus the original content is not shared with any intruder, only with the intended receiver.
4. APPLICATIONS
Watermarking techniques are used mainly in three applications: stock market, weather report, and medical research. These three applications rely on a centralized database with very high security. First, the given database is partitioned based on its records. The watermark is then embedded in the partitioned database. During embedding, the given data is hidden with the help of the watermark key, i.e., the original data is converted into ASCII codes and then into binary, and stored in a separate database. Data padding then inserts 0s and 1s into the encoded binary value, and the result is sent to the receiver.
On the receiver side, user authentication is checked first. If the receiver is an administrator, he first checks whether the data has been modified by comparing the data sent by the sender admin with the original data stored in the database.
If the values in the two tables match, the receiver obtains the original data through the decoding process. If the values in the two tables do not match, an alert box informs the receiver that the data has been modified.
If the accessing person is the administrator, full permission to read and write the database is granted. If the accessing person is the receiver-side admin, they can only view and store the database. Otherwise the person is an end user and is only allowed to read the database.
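The access rules above can be summarized as a role-to-permission table. The sketch below is illustrative only: the role names, the action names, and the function `is_allowed` are hypothetical, and the end user's access is interpreted here as read-only.

```python
# Hypothetical role and action names; they are assumptions for illustration.
PERMISSIONS = {
    "sender_admin": {"read", "write"},
    "receiver_admin": {"read", "store"},
    "end_user": {"read"},
}


def is_allowed(role: str, action: str) -> bool:
    """Return True if the role may perform the action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, set())
```

For example, `is_allowed("end_user", "write")` evaluates to False, while `is_allowed("sender_admin", "write")` evaluates to True.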
5. SYSTEM DESIGN
5.1 Input Design
Input design is the process of converting user-originated inputs to a computer-based format. Input design is one of the most expensive phases of the operation of a computerized system and is often the major problem of a system.
The sender admin first logs in to the system, encodes the data using the public key, and then modifies the data using the watermark bit. The data is then sent to the receiver.
Sender admin -> Login -> Encode data -> Add watermark -> Send
Figure 2: Sender Admin Flowchart
5.2 Output Design
Output design generally refers to the results and information that the system generates for its end users. Output is the main reason for developing the system and the basis on which users evaluate the usefulness of the application. In any system, the output design determines the input to be given to the application.
The receiver admin logs in to the system and compares the received data with the data already stored in the database. Using the private key, the receiver decodes the data, updates it, and publishes it to the users.
The output is secure and hackers cannot modify the data. A hacker can view only the watermarked data. Only the receiver can decode the data and obtain the original message.
Receiver admin -> Login -> Receive data -> Compare data -> Decode -> Update
Figure 3: Receiver Admin Flowchart
User -> Login -> View data -> Location search
Figure 4: User Flowchart
The user can view the data by clicking the location
of the required place.
6. ALGORITHMS
Data Partitioning:
The data is partitioned using a message authentication code (MAC) built on the cryptographic hash function Message Digest 5 (MD5). The data partitioning algorithm partitions the data set based on a secret key Ks. The data set D is a database relation with scheme D(P, A0, A1, ..., Av-1), where P is the primary key attribute, A0, A1, ..., Av-1 are v attributes that are candidates for watermarking, and |D| is the number of tuples in D. The data set D is partitioned into m non-overlapping partitions, namely {S0, ..., Sm-1}, such that each partition Si contains on average approximately |D| / m tuples from D. Partitions do not overlap, that is, for any two partitions Si and Sj with i ≠ j, we have Si ∩ Sj = { }. For each tuple r ∈ D, the data partitioning algorithm computes a message authentication code (MAC), which is considered to be secure, from a cryptographic hash function as given below:
H(Ks || H(r.P || Ks)),
where r.P is the primary key of the tuple r, H( ) is a secure hash function, and || is the concatenation operator. Tuples are assigned to partitions using the computed MAC. For a tuple r, its partition assignment is given by
Partition(r) = H(Ks || H(r.P || Ks)) mod m.
The secure hash function used is Message Digest 5 (MD5). A cryptographic hash function is a transformation that takes an input (or 'message') and returns a fixed-size string, which is called the hash value (sometimes termed a message digest, a digital fingerprint, a digest, or a checksum). In various standards and applications, the two most commonly used hash functions are MD5 and SHA-1. In cryptography, MD5 is a widely used, partially insecure cryptographic hash function with a 128-bit hash value. An MD5 hash is typically expressed as a 32-digit hexadecimal number. To obtain the MD5 hash value, the primary key r.P of the row is concatenated with the secret key Ks, and a partition is then assigned to each row using Partition(r) = H(Ks || H(r.P || Ks)) mod m. An attacker cannot predict the tuple-to-partition assignment without knowledge of the secret key Ks and the number of partitions m, which are kept secret. Keeping m secret in addition to Ks makes it harder for the attacker to regenerate the partitions.
Data Partitioning Algorithm:
Input: Data set D, secret key Ks, number of partitions m
1. {S0, S1, ..., Sm-1} ← { }
2. for each tuple r ∈ D
3.     Partition(r) = H(Ks || H(r.P || Ks)) mod m
4.     insert r into S_Partition(r)
5. return S0, S1, ..., Sm-1
Output: Data partitions S0, S1, ..., Sm-1
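The partitioning algorithm above can be sketched in Python with the standard `hashlib` module. This is a minimal sketch under assumptions that the paper does not fix: tuples are represented as dictionaries, the primary key is converted to a UTF-8 string before hashing, and the 128-bit digest is interpreted as a big-endian unsigned integer before taking it modulo m. The names `partition_index` and `partition_data` are hypothetical.

```python
import hashlib
from collections import defaultdict


def md5(data: bytes) -> bytes:
    """H(.) from the text: the raw 16-byte MD5 digest."""
    return hashlib.md5(data).digest()


def partition_index(primary_key: str, secret_key: bytes, m: int) -> int:
    """Partition(r) = H(Ks || H(r.P || Ks)) mod m."""
    inner = md5(primary_key.encode("utf-8") + secret_key)
    outer = md5(secret_key + inner)
    # Assumed convention: read the digest as a big-endian unsigned integer.
    return int.from_bytes(outer, "big") % m


def partition_data(tuples, secret_key: bytes, m: int, key_attr: str = "P"):
    """Assign every tuple to exactly one of the m partitions S0 .. Sm-1."""
    partitions = defaultdict(list)
    for r in tuples:
        index = partition_index(str(r[key_attr]), secret_key, m)
        partitions[index].append(r)
    return [partitions[i] for i in range(m)]


# Example data set: 1000 tuples with primary key P and one attribute A0.
rows = [{"P": i, "A0": 100 + i} for i in range(1000)]
parts = partition_data(rows, b"example-secret-key", m=8)
```

Each tuple lands in exactly one partition, so the partitions are non-overlapping by construction, and the assignment depends only on the primary key and the secret key, so it can be recomputed at decoding time without the original data.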
7. DESCRIPTION
Data Insertion
The data insertion module contains applications for collecting data from the admin side. In the weather information module, the admin inserts details such as sunrise, visibility, pressure, humidity, and location. Every change in the weather details should be updated by the admin day by day.
Encoding data
A bit encoding algorithm is used to encode the data. It converts the original string into ASCII codes and then into binary digits. These binary digits are stored in a memory stream, and each 0 is then converted into 1 and each 1 into 0.
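The encoding step can be illustrated as follows. This is a minimal sketch, assuming 8 binary digits per ASCII character and a plain string instead of a memory stream; the function names `to_bit_string`, `flip_bits`, and `from_bit_string` are hypothetical.

```python
def to_bit_string(text: str) -> str:
    """Encode text as ASCII and render each byte as 8 binary digits."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def flip_bits(bits: str) -> str:
    """Convert every 0 into 1 and every 1 into 0."""
    return bits.translate(str.maketrans("01", "10"))


def from_bit_string(bits: str) -> str:
    """Inverse of to_bit_string: group the digits into bytes and decode them."""
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("ascii")


encoded = flip_bits(to_bit_string("Hi"))         # "1011011110010110"
decoded = from_bit_string(flip_bits(encoded))    # "Hi"
```

Flipping the bits a second time restores the original digits, so the receiver can invert this step without any additional information.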
Data Padding
Some bits are padded onto the original content before the data is sent to the receiver side. If hackers attack the data, they obtain only the duplicate value that is stored in the other database. Thus the original content is not shared with any intruder, only with the intended receiver.
7.1 Watermark Detection
Access Permission
The system checks, based on the username and password, whether the user has the receiver admin role, and if so allows the receiver to detect the watermark data.
Data Retrieval
Data can be retrieved by the receiver admin as well as by end users. Users with the admin role [receiver side] can access the data and then save it with the help of the key. Users with the end-user role [visitors] can only view the data. The receiver checks whether the data has been modified by comparing the data sent by the sender admin with the original data stored in the database. If the values in the two tables match, the receiver obtains the original data through the decoding process. If the values in the two tables do not match, an alert box informs the receiver that the data has been modified.
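The comparison of the received table with the stored table can be sketched as below. This is an illustrative sketch only: rows are assumed to be dictionaries keyed by a primary key column `P`, the function name `find_modified_rows` is hypothetical, and a printed message stands in for the alert box.

```python
def find_modified_rows(received, stored, key="P"):
    """Return the primary keys of rows altered, inserted, or deleted
    relative to the stored copy."""
    stored_by_key = {row[key]: row for row in stored}
    received_by_key = {row[key]: row for row in received}
    # Altered or newly inserted rows: no identical stored row exists.
    changed = [k for k, row in received_by_key.items()
               if stored_by_key.get(k) != row]
    # Deleted rows: present in the stored copy but not received.
    missing = [k for k in stored_by_key if k not in received_by_key]
    return sorted(changed + missing)


stored = [{"P": 1, "humidity": 40}, {"P": 2, "humidity": 55}, {"P": 3, "humidity": 61}]
received = [{"P": 1, "humidity": 40}, {"P": 2, "humidity": 99}]

modified = find_modified_rows(received, stored)
if modified:
    print(f"Alert: data modified for keys {modified}")
```

In this example the row with key 2 was altered and the row with key 3 was deleted, so both keys are reported.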
7.2 Searching Technique
View Data
The receiver admin and the end user can search the historical data stored in the database. The required data can be viewed by selecting the location of the state on the map.
8. CONCLUSION
Watermarking algorithms are often used in larger systems designed to achieve certain goals, e.g., the prevention of illegal copying. Database watermarking can be used to combat database piracy, where somebody takes somebody else's database, puts their own name on it, and then goes into competition with the original database producer.
Protection of digital assets from piracy is usually based on embedding digital watermarks into the data. Watermarking approaches do not prevent copying; rather, they deter illegal copying by providing a means of establishing the original ownership of a redistributed copy. In this paper, a new watermarking technique is proposed for relational data. It embeds watermark bits in the data while maintaining the data's meaning and value, because it applies the usability matrix whenever an original value is changed. Data partitioning is done using a cryptographic hash function (MD5). The proposed technique handles alteration, deletion, and insertion attacks more effectively. The watermark resilience is improved by repeated embedding of the watermark and by using the majority voting technique in the watermark decoding phase. Moreover, the watermarking algorithm can be applied to more than one attribute of the same table.