Mining Queries From Search Results: A Survey

Imperial Journal of Interdisciplinary Research (IJIR)
Vol-2, Issue-12, 2016
ISSN: 2454-1362, http://www.onlinejournal.in
Anju G R1 & Karthik M2
1 Student, Mohandas College of Engineering and Technology
2 Asst Professor, Mohandas College of Engineering and Technology
Abstract: This paper addresses the problem of finding
the facets of a query, which explain and summarize the
content covered by the query. Most queries in web search
are ambiguous, and understanding the search intent of
users is essential for satisfying their search needs.
There are many methods for mining queries from their
search results. This paper presents a survey of these
methods.
Keywords: query facet, facet search, summarization
1. Introduction
Automatically mining facets from search results is a
difficult task, and many approaches have been proposed
for it. Most queries are ambiguous or multifaceted. For
example, ‘harry shum’ is an ambiguous query, which
may refer to an American actor, a vice president of
Microsoft, or another person named Harry Shum.
‘Xbox’ is a multifaceted query. When people search
for ‘xbox’, they may be looking for information on
different facets of an Xbox, such as ‘online game’,
‘homepage’, and ‘marketplace’. We address the problem
of finding query facets, which are multiple groups of
words and phrases. A query may have multiple facets
that summarize the information about the query from
different perspectives. Sample facets for a few queries
illustrate this. Facets for the query "watches" cover
the knowledge about watches in five distinct aspects:
brands, gender categories, supporting features, styles,
and colors. The query "visit Beijing" has a query facet
about popular resorts in Beijing.
We observe that important pieces of information about a
query are usually presented in list styles and repeated
many times among the top retrieved documents. Therefore
we propose aggregating frequent lists within the top
search results to mine query facets, and we implement a
system to do so. More specifically, the system extracts
lists from free text, HTML tags, and repeat regions
contained in the top search results, groups them into
clusters based on the items they contain, then ranks the
clusters and items based on how the lists and items
appear in the top results. We propose two models, the
Unique Website Model and the Context Similarity Model,
to rank query facets. In the Unique Website Model, we
assume that lists from the same website may contain
duplicated information, while different websites are
independent and each can contribute a separate vote for
weighting facets. However, we find that sometimes two
lists can be duplicated even if they are from different
websites. For instance, mirror websites use different
domain names but publish duplicated content and contain
the same lists. Some content originally created by one
website may be re-published by other websites, so the
same lists contained in that content may appear multiple
times on different websites. Moreover, different websites
may publish content using the same software, and the
software may generate duplicated lists on different
websites.
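The voting idea behind the Unique Website Model can be illustrated with a
small sketch. This is not the system described above: it skips list
extraction and clustering, and it simply assumes the lists have already
been extracted as (site, items) pairs. The function name, the sample
sites, and the sample items are all hypothetical.

```python
from collections import defaultdict


def rank_items_by_unique_sites(lists):
    """Score each item by the number of distinct websites listing it.

    `lists` is an iterable of (site, items) pairs. All lists from the
    same site share a single vote per item, which is the assumption
    made by the Unique Website Model.
    """
    sites_per_item = defaultdict(set)
    for site, items in lists:
        for item in items:
            # Normalize case and whitespace so "Casio" and "casio" match.
            sites_per_item[item.strip().lower()].add(site)
    scores = {item: len(sites) for item, sites in sites_per_item.items()}
    # Highest score first; ties are broken alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical lists extracted from the top results for "watches".
extracted = [
    ("a.example", ["Rolex", "Casio", "Seiko"]),
    ("a.example", ["Rolex", "Casio"]),  # same site: no extra vote
    ("b.example", ["Casio", "Seiko", "Citizen"]),
    ("c.example", ["casio", "Rolex"]),
]
ranked = rank_items_by_unique_sites(extracted)
```

Because votes are counted per site, this sketch still over-counts mirror
sites and re-published content, which is exactly the weakness of the
Unique Website Model noted above.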
2. Mining Approaches
There are basically three types of approaches
available:
1) Query based reformulation
2) Query based summarization
3) Entity Search
2.1 Query Based Reformulation
Most standard information retrieval models use a
single source of information for query formulation
tasks such as query weighting. Query reformulation
modifies the keywords submitted by the user to the
search engine, and the reformulated query is used as
the input to the search engine's ranking algorithm.
The primary goal of query reformulation is therefore
to improve the overall quality of the ranking process.
Query facets are different. For example, for the query
“what is the fastest animals in the world” in Table 1,
we generate a dimension “cheetah, pronghorn
antelope, lion, Thomson’s gazelle, wildebeest, ...”
which includes animal names that are direct answers
to the query rather than query reformulations.
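As a rough illustration of how a reformulation step might modify the
user's keywords before they reach the ranking algorithm, the sketch
below expands each term with alternatives. The synonym table, the
function name, and the "OR" query syntax are assumptions made for this
example only; real systems typically learn reformulations from query
logs or other sources.

```python
def reformulate(query, synonyms):
    """Expand each query term with its known alternatives.

    `synonyms` maps a term to a list of alternative terms. It is a
    hypothetical, hand-built table used only for illustration.
    """
    expanded = []
    for term in query.lower().split():
        alternatives = [term] + synonyms.get(term, [])
        # Preserve order while dropping duplicate alternatives.
        unique = list(dict.fromkeys(alternatives))
        if len(unique) == 1:
            expanded.append(unique[0])
        else:
            expanded.append("(" + " OR ".join(unique) + ")")
    return " ".join(expanded)


SYNONYMS = {"fastest": ["quickest"], "animals": ["animal"]}
new_query = reformulate("fastest animals world", SYNONYMS)
```

The reformulated string, not the original keywords, would then be passed
to the ranking algorithm.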
Query reformulation is based on two steps:
• The first is the processing stage, also called
  query refinement
• The second processing stage alters the query on a
  structural level
Fig 1: idea behind query reformulation
2.1.1 Advantages
• Easy to implement
• Fast retrieval
Disadvantages:
• User must read all the documents
• Time consuming
2.2 Query-Based Summarization
Query-based summarization produces a summary of a
document that satisfies a request for information
expressed by a query. The summary is a sequence of
sentences that can be extracted from the documents.
Depending on the number of sources, a summary is
either a single-document summary or a multi-document
summary.
Fig 2: idea behind query based summarization
Summary construction methods are of two types:
• Abstractive
• Extractive
Steps in query based summarization:
• Identification of the correct sections
• Summary generation
2.2.1 Advantages
• It is a very fast retrieval method
• More accurate results are obtained
2.3 Entity Search
Entity search engines aim to provide the user with
entities and the relationships between these entities,
instead of links to web pages. Information is
everywhere. The traditional search approach is to scan
every document in any order, which is time consuming
for users.
Fig 3: traditional vs. entity search
An entity refers to any object or thing that can be
uniquely identified in the world. Each entity is matched
with other entities.
Benefits of entity search
• It is categorized into a taxonomy
• The first task is to make a decision
• More structured compared to others
• More understandable by humans
• Increases precision
• Less time consuming
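The difference between returning links and returning entities can be
sketched with a toy in-memory store. The Entity class, the relation
names, and the sample data below are hypothetical and chosen only to
show the shape of an entity-level result; they do not describe any
particular entity search engine.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    kind: str
    # Maps a relation name to the names of related entities.
    relations: dict = field(default_factory=dict)


def entity_search(query, entities):
    """Return entities whose name shares a word with the query."""
    terms = set(query.lower().split())
    return [e for e in entities if terms & set(e.name.lower().split())]


# Toy entity store (illustrative data only).
ENTITIES = [
    Entity("Xbox", "product", {"made_by": ["Microsoft"]}),
    Entity("Microsoft", "company", {"makes": ["Xbox"]}),
    Entity("Beijing", "city"),
]
hits = entity_search("xbox marketplace", ENTITIES)
```

Each hit carries its type and its relationships, so the user sees the
entity and its connections directly instead of scanning documents.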
Fig 4: idea about entity search
4. Conclusion
As a first approach to finding query facets, the
list-aggregation method can be improved in many
aspects. For instance, semi-supervised bootstrapping
list extraction algorithms can be used to iteratively
extract more lists from the top results. Specific
website wrappers can also be used to extract
high-quality lists from authoritative websites. Adding
these lists may improve both the precision and the
recall of query facets. Part-of-speech information can
be used to further check the homogeneity of lists and
improve the quality of query facets. We will explore
these topics to refine facets in the future. We will
also investigate other topics related to finding query
facets. Good descriptions of query facets may help
users to better understand the facets. Automatically
generating meaningful descriptions is an interesting
research topic.