International Journal of Advances in Engineering, 2015, 1(3), 192-195
ISSN: 2394-9260 (printed version); ISSN: 2394-9279 (online version); url: http://www.ijae.in

SHORT COMMUNICATION

Synonym Query Using Multi-keyword Search Using Cloud Computing

B. Preethi, M. Shahin and K. RamaDevi
S.K.P Engineering College, India
[email protected]
Received 24 February 2015 / Accepted 21 March 2015

Abstract-As cloud computing becomes more flexible and effective in terms of management, data owners are motivated to outsource their complex data systems from local sites to the commercial public cloud. To keep the data secure, however, sensitive data has to be encrypted before outsourcing, which makes traditional data utilization based on plaintext keyword search obsolete. Considering the large number of data users and documents in the cloud, the search service must allow multi-keyword queries and provide result similarity ranking to meet the need for effective data retrieval. Retrieving all the files that contain a queried keyword is not affordable in the pay-as-you-use cloud paradigm. Because of the advantages of cloud computing, more and more data owners centralize their sensitive data in the cloud. The proposed approach draws on natural language processing and can be summarized in two aspects: multi-keyword ranked search to achieve more accurate search results, and synonym-based search to support synonym queries. Finally, the experimental results indicate that our method performs better than the original MRSE scheme.

Keywords-Multi-keyword search, ranking, synonym-based search.

I. INTRODUCTION

Due to the rapid expansion of data, data owners tend to store their data in the cloud to relieve the burden of data storage and maintenance. However, as the cloud customers and the cloud server are not in the same trusted domain, outsourced data may be exposed to risk.
Thus, before being sent to the cloud, sensitive data needs to be encrypted to protect data privacy and combat unsolicited access. Fuzzy keyword search methods [2-4] have been developed. One such method is a privacy-aware bed-tree that supports fuzzy multi-keyword search. This approach uses edit distance to build fuzzy keyword sets, and Bloom filters are constructed for every keyword. It then constructs an index tree for all files in which each leaf node is a hash value of a keyword. Li et al. [3] exploit edit distance to quantify keyword similarity and construct storage-efficient fuzzy keyword sets. In particular, a wildcard-based fuzzy set construction approach is designed to reduce storage overhead. Wang et al. [4] employ wildcard-based fuzzy sets to build a private trie-traverse searching index. These fuzzy search methods tolerate minor typos and format inconsistencies. Unfortunately, data encryption, which restricts the user's ability to perform keyword search and further demands the protection of keyword privacy, makes traditional plaintext search methods fail for encrypted cloud data. Ranked search greatly improves system usability by returning the matching files in a ranked order according to certain relevance criteria (e.g., keyword frequency).

Background and Related Work: Many organizations and companies store their most valuable information in the cloud to protect their data from viruses and hacking. A benefit of this computing model is deep search, which can be made easy for cloud users. Ranked search improves system usability by returning the matching files in a ranked order according to certain relevance criteria (e.g., keyword frequency). Because directly outsourcing relevance scores would leak a lot of sensitive information and compromise keyword privacy, we propose asymmetric encryption combined with ranked results for the queried data, which returns only the expected data and also supports fuzzy as well as exact keyword search.

Existing system: This section considers existing approaches in relation to synonym-based search.
Existing search approaches such as ranked search and multi-keyword search enable cloud customers to find the most relevant data quickly. They also reduce network traffic by sending only the most relevant data in response to a user request. In a real search scenario, however, a user may search with synonyms of the predefined keywords rather than with exact or fuzzy matching keywords, because the user lacks exact knowledge of the data. These approaches support only exact or fuzzy keyword search. That is, there is no tolerance of synonym substitution and/or syntactic variation, which are typical user searching behaviors and happen very frequently. Therefore, synonym-based multi-keyword ranked search over encrypted cloud data remains a challenging problem.

Drawbacks of existing system
1. Single-keyword search without ranking
2. Boolean-keyword search without ranking
3. Single-keyword search with ranking
4. Relevant data is not retrieved

II. PROPOSED SYSTEM

To overcome these limitations, this paper proposes an efficient and flexible searchable scheme that supports both multi-keyword ranked search and semantic-based search. The Vector Space Model (VSM) is used to address multi-keyword search and result ranking. Using VSM, a document index is built, i.e., each document is expressed as a vector in which each dimension value is the Term Frequency (TF) weight of the corresponding keyword. Another vector is generated in the query phase. It has the same dimension as the document index, and each of its dimension values is an Inverse Document Frequency (IDF) weight. The cosine measure is then used to calculate the similarity between the document and the search query. To address the problem of secure multi-keyword search over encrypted cloud data, two schemes are proposed following the principles of coordinate matching and inner product similarity.
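The VSM ranking described above can be illustrated with a minimal Python sketch. It works on plaintext token lists only and leaves out the encryption of the index and query vectors. The function names, the length-normalized TF weight, and the smoothed IDF formula log(1 + N/df) are assumptions made for illustration; the paper does not specify the exact weighting.

```python
import math
from collections import Counter


def tf_vector(tokens, vocabulary):
    """Document index vector: one TF weight per vocabulary keyword.

    Assumption: TF is the raw count divided by the document length.
    """
    counts = Counter(tokens)
    total = len(tokens) or 1
    return [counts[w] / total for w in vocabulary]


def idf_query_vector(query_terms, documents, vocabulary):
    """Query vector: IDF weight for queried keywords, 0 for all others.

    Assumption: IDF is smoothed as log(1 + N / df).
    """
    n = len(documents)
    vector = []
    for w in vocabulary:
        df = sum(1 for doc in documents if w in doc)
        if w in query_terms and df:
            vector.append(math.log(1 + n / df))
        else:
            vector.append(0.0)
    return vector


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_terms, documents):
    """Return (score, document index) pairs, most similar document first."""
    vocabulary = sorted({w for doc in documents for w in doc})
    query = idf_query_vector(set(query_terms), documents, vocabulary)
    scores = [(cosine(tf_vector(doc, vocabulary), query), i)
              for i, doc in enumerate(documents)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


docs = [["cloud", "search", "cloud"], ["keyword", "ranking"], ["cloud", "keyword"]]
for score, doc_id in rank(["cloud", "search"], docs):
    print(doc_id, round(score, 3))
```

A document that shares no keyword with the query receives a score of 0 and is ranked last, which is what allows the server to return only the top-ranked files.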
Design Goal:
1. User Interface
2. Search Space

After the login process, the cloud user can enter the search space page. This is the environment in which the user searches for content on the cloud server. The Search Space is the interface between the user and the cloud server.

Input from User: The input text for the search process is obtained from the user.

Data Preprocessing

Stop Word Removal: Stop words are words that are filtered out before or after the processing of natural language data (text). The stop word list is controlled by human input and is not automated. Stop words are typically the most common short function words, such as "the", "is", "at", "which" and "on".

Porter Stemming: Some stemmers employ a lookup table that contains relations between root forms and inflected forms. To stem a word, the table is queried to find a matching inflection. If a matching inflection is found, the associated root form is returned. For example, a stemming algorithm reduces the words "fishing", "fished", "fish" and "fisher" to the root word "fish".

Figure 1. Proposed system

Ontology Clustering: Words ending in -nym are often used to describe different classes of words and the relations between words.
Hypernym: A word that has a more general meaning than another.
Synonym: One of two (or more) words that have the same (or very similar) meaning.

The Artificial Intelligence literature contains many definitions of ontology (e.g., WordNet). An ontology includes machine-interpretable definitions of the basic concepts in a domain and the relations among them. The results produced by the sentence-based, document-based, corpus-based and combined-approach concept analysis have higher quality than those produced by a single-term similarity analysis.

III. METHODOLOGY

Multi-Keyword Ranked Search: Existing systems, such as exact or fuzzy keyword search, support only single-keyword search.
These schemes do not retrieve the data relevant to the user's query; therefore, multi-keyword ranked search over encrypted cloud data remains a very challenging problem. To meet this challenge, an effective and flexible searchable scheme is proposed that supports multi-keyword ranked search. To address multi-keyword search and result ranking, the Vector Space Model (VSM) is used to build the document index; that is, each document is expressed as a vector in which each dimension value is the Term Frequency (TF) weight of its corresponding keyword. A new vector is also generated in the query phase. This vector has the same dimension as the document index, and each of its dimension values is an Inverse Document Frequency (IDF) weight. The cosine measure can then be used to compute the similarity of a document to the search query [1]. To improve search efficiency, a tree-based index structure is used, namely a balanced binary tree. The searchable index tree is constructed from the document index vectors, so the related documents can be found by traversing the tree.

Semantic Based Search: When searching data on the cloud server, the user may be unaware of the exact words to search for, yet existing schemes have no tolerance of synonym substitution or syntactic variation, which are typical user searching behaviors and happen very frequently. To solve this problem, a semantic-based search method is used. To improve information search, search engines need to understand what the user wants so that they can answer objectively. One prerequisite is that the resources carry information that is helpful to searches. The Semantic Web proposes to clarify the meaning of resources by annotating them with metadata (data about data). By associating metadata with resources, semantic searches can be significantly improved compared with traditional searches.
Semantic search allows users to express in natural language what they want to find. Here the enhanced E-TFIDF algorithm is proposed for improving document searches. It is optimized for scenarios in which users want to find a document but do not remember the exact words used, whether plural or singular forms were used, or whether a synonym was used. The algorithm takes into consideration:
1) the number of direct words of the search expression that are in the document;
2) the number of word variations (plural/singular or different verb conjugations) of the search expression that are in the document;
3) the number of synonyms of the words in the search expression that are in the document.
Weights are assigned to each of these components as the fuzziness part of the algorithm [7].

RSA Algorithm

This algorithm is used to encrypt and decrypt file contents. It is an asymmetric algorithm. The RSA algorithm involves three steps: key generation, encryption and decryption.

Key generation

RSA involves a public key and a private key. The public key can be known to everyone and is used for encrypting messages. Messages encrypted with the public key can only be decrypted using the private key. The keys for the RSA algorithm are generated in the following way:
1. Choose two distinct prime numbers a and b.
2. Compute n = ab; n is used as the modulus for both the public and private keys.
3. Compute φ(n) = (a - 1)(b - 1), where φ is Euler's totient function.
4. Choose an integer e such that 1 < e < φ(n) and gcd(e, φ(n)) = 1, i.e., e and φ(n) are coprime. e is released as the public key exponent; an e having a short bit-length results in more efficient encryption.
5. Determine d as the modular multiplicative inverse of e modulo φ(n), i.e., d·e ≡ 1 (mod φ(n)); d is kept as the private key exponent.

B. K-Nearest Neighbour

K-nearest neighbor search identifies the top k nearest neighbors to the query. This technique is commonly used in predictive analytics to estimate or classify a point based on the consensus of its neighbors. K-nearest neighbor graphs are graphs in which every point is connected to its k nearest neighbors.
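As an illustration of the k-nearest neighbor search and k-nearest neighbor graph just described, the following is a minimal brute-force sketch in Python. It assumes points are plain coordinate tuples compared by Euclidean distance; the function names are hypothetical, and the sketch does not use an index and so is not the optimal multi-step search.

```python
import heapq
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_nearest(query, points, k):
    """Return the k (distance, index) pairs closest to query, nearest first."""
    candidates = ((euclidean(query, p), i) for i, p in enumerate(points))
    return heapq.nsmallest(k, candidates)


def knn_graph(points, k):
    """Connect every point to its k nearest neighbours, excluding itself."""
    graph = {}
    for i, p in enumerate(points):
        # Ask for k + 1 results because p itself is normally returned at distance 0.
        neighbours = k_nearest(p, points, k + 1)
        graph[i] = [j for _, j in neighbours if j != i][:k]
    return graph


points = [(0, 0), (1, 0), (5, 5), (0, 2)]
print([i for _, i in k_nearest((0, 0.5), points, 2)])
print(knn_graph(points, 1))
```

The brute-force version computes the distance to every point for each query, so it is only practical for small data sets; an index-based ranking avoids most of these exact distance evaluations.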
The basic idea of the algorithm is as follows: the value of d_max is decreased in step with the ongoing exact evaluation of the object similarity distance for the candidates. At the end of the step-by-step refinement, d_max reaches the optimal query range ε_d and prevents the method from producing more candidates than necessary, thus fulfilling the r-optimality criterion. In the listing below, d_f is the distance used by the index ranking and d_o is the exact object distance.

Nearest Neighbor Search (q, k) // optimal algorithm
1. Initialize ranking = index.increm-ranking (F(q), d_f)
2. Initialize result = new sorted-list (key, object)
3. Initialize d_max = ∞
4. While o = ranking.getnext and d_f(o, q) ≤ d_max do
5. If d_o(o, q) ≤ d_max then result.insert (d_o(o, q), o)
6. If result.length ≥ k then d_max = result[k].key
7. Remove all entries from result where key > d_max
8. End while
Report all entries from result where key ≤ d_max

CONCLUSION

The proposed semantic search with the WordNet methodology makes the search process more efficient. The proposed scheme can return not only the exactly matched files, but also the files that include terms semantically related to the query keyword. The co-occurrence probability of terms is used to obtain the semantic relationship of keywords in the dataset. It offers an appropriate semantic distance between terms to accomplish the query keyword extension. To guarantee security and efficiency, the data is encrypted before being outsourced to the cloud, which also secures the datasets, indexes and keywords. The data owner then groups the indexes and forms the ontology based on the documents that contain syntactically and semantically similar words.
The overall performance evaluation of this scheme includes the cost of metadata construction, the time necessary to build the index and the ontology, and the efficiency of the search. The WordNet methodology makes the search scheme still more efficient for the user. By employing this technique, the keywords used for searching are also protected and a better search mechanism can be achieved.

REFERENCES

[1] Zhangjie Fu, Xingming Sun, Nigel Linge and Lu Zhou, “Achieving Effective Cloud Search Services: Multi-keyword Ranked Search over Encrypted Cloud Data Supporting Synonym Query,” IEEE Transactions on Consumer Electronics, Vol. 60, No. 1, February 2014.
[2] J. Li, Q. Wang, C. Wang, N. Cao, K. Ren, and W. Lou, “Fuzzy keyword search over encrypted data in cloud computing,” Proceedings of IEEE INFOCOM’10 Mini-Conference, San Diego, CA, USA, pp. 1-5, Mar. 2010.
[3] C. Wang, N. Cao, J. Li, K. Ren, and W. Lou, “Secure ranked keyword search over encrypted cloud data,” Proceedings of IEEE 30th International Conference on Distributed Computing Systems (ICDCS), pp. 253-262, 2010.
[4] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, “Privacy-preserving multi-keyword ranked search over encrypted cloud data,” Proceedings of IEEE INFOCOM 2011, pp. 829-837, 2011.
[5] Q. Chai and G. Gong, “Verifiable symmetric searchable encryption for semi-honest-but-curious cloud servers,”