
Context-Aware Query Classification
Huanhuan Cao, Derek Hao Hu, Dou Shen, Daxin Jiang,
Jian-Tao Sun, Enhong Chen, Qiang Yang
Microsoft Research Asia
SIGIR 2009
2010.04.27
Summarized and presented by Sang-il Song, IDS Lab., Seoul National University
Query Classification
– Query Classification (QC)
  – Understanding the user's search intent
  – Classifying user queries into predefined target categories
– Differences from traditional text classification
  – Queries are usually very short
  – Many queries are ambiguous and may belong to multiple categories
– Approaches
  – Augmenting the queries with extra data (search results)
  – Leveraging unlabeled data to help improve the accuracy of supervised learning
  – Expanding the training data by automatically labeling some queries in click-through data via self-training
– These approaches do not consider the user's behavior history
Context-Aware Query Classification
– Motivating example
  – The query "jaguar" without context: ambiguous whether the user is interested in "car" or "animal"
  – The query "jaguar" with "BMW" as an adjacent query in the session: clear that the user is interested in "car"
– Context information
  – Adjacent queries
  – Clicked URLs
– This paper models the context information with Conditional Random Fields (CRFs)
User Session
– User search session
  – A sequence of observations o = (o_1, ..., o_T)
  – Each observation o_t consists of a query q_t and a set of URLs U_t clicked by the user for q_t (see the data-structure sketch below)
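Below is a minimal data-structure sketch of a session in Python; the class and field names (Observation, Session, query, clicked_urls) are illustrative choices, not notation from the paper.

from dataclasses import dataclass, field
from typing import List

@dataclass
class Observation:
    """One step o_t of a search session: a query and the URLs clicked for it."""
    query: str                                              # q_t
    clicked_urls: List[str] = field(default_factory=list)   # U_t

@dataclass
class Session:
    """A user search session: a sequence of observations o = (o_1, ..., o_T)."""
    observations: List[Observation] = field(default_factory=list)

# Example session built from the running example used later in the deck
session = Session([
    Observation("world cup", ["worldcup.fifa.com"]),
    Observation("fifa", ["fifa10.ea.com"]),
])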
Taxonomy
– A tree of categories
– Each node corresponds to a predefined category
Conditional Random Field
– An undirected graphical model over an input (observation) sequence
– The pairwise potentials p_ij are defined by feature functions
– Motivation for using CRFs
  – Suitable for capturing context information
  – Does not need any prior knowledge
  – Flexible enough to incorporate richer features
[Figure: a graphical model with states s1-s4 connected by pairwise potentials p_ij]
Context-Aware QC with CRF
[Figure: an example session with the queries "world cup", "fifa", and "fifa news", the clicked URLs worldcup.fifa.com, fifa10.ea.com, and fifaworldcup.ea.com, and the candidate category labels "soccer" and "game", annotated with per-edge scores]
Conditional Probability
– Category label sequence c = (c_1, ..., c_T)
– Observation sequence o = (o_1, ..., o_T)
– Conditional probability
  – p(c \mid o) = \frac{1}{Z(o)} \prod_{t=1}^{T} \Phi(c_{t-1}, c_t, o, t)
  – Z(o): normalization factor
– Potential function
  – \Phi(c_{t-1}, c_t, o, t) = \exp\!\left( \sum_{k} \lambda_k f_k(c_{t-1}, c_t, o, t) \right)
  – f_k: feature function
  – \lambda_k: weight of f_k
(A toy numeric sketch of this probability follows.)
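The following is a toy Python sketch of how the conditional probability above can be evaluated by brute force for a tiny label set; the labels, feature functions, and weights are made-up placeholders rather than the paper's actual features.

import math
from itertools import product

# p(c | o) = (1/Z(o)) * prod_t exp( sum_k lambda_k * f_k(c_{t-1}, c_t, o, t) )
LABELS = ["soccer", "game"]

def log_potential(prev_label, label, obs, t, w):
    """sum_k lambda_k * f_k(c_{t-1}, c_t, o, t) for three toy feature functions."""
    score = 0.0
    if label == "soccer" and "cup" in obs[t]:   # local feature: query keyword
        score += w["kw_soccer"]
    if label == "game" and "fifa" in obs[t]:    # local feature: query keyword
        score += w["kw_game"]
    if prev_label == label:                     # contextual feature: adjacent labels agree
        score += w["same_label"]
    return score

def conditional_probability(label_seq, obs, w):
    """Compute p(c | o) by enumerating all label sequences (toy-sized only)."""
    def seq_score(seq):
        return sum(log_potential(seq[t - 1] if t > 0 else None, seq[t], obs, t, w)
                   for t in range(len(seq)))
    Z = sum(math.exp(seq_score(seq)) for seq in product(LABELS, repeat=len(obs)))
    return math.exp(seq_score(label_seq)) / Z

obs = ["world cup", "fifa", "fifa news"]
weights = {"kw_soccer": 1.0, "kw_game": 0.5, "same_label": 0.8}
print(conditional_probability(("soccer", "soccer", "soccer"), obs, weights))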
Training and Classification
– Training
  – Given training data D = \{(o^{(i)}, c^{(i)})\}_{i=1}^{N}
  – Objective: find the set of parameters \Lambda = \{\lambda_k\} that maximizes the conditional log-likelihood L(\Lambda) = \sum_{i=1}^{N} \log p(c^{(i)} \mid o^{(i)})
– Classification: infer the category label c_t for the test query q_t as c_t^{*} = \arg\max_{c} p(c_t = c \mid o)
(A training/prediction sketch with an off-the-shelf CRF library follows.)
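The paper does not specify an implementation; as an assumption, the sketch below uses the third-party sklearn-crfsuite library to show what training and prediction on labeled sessions could look like. The feature function, sessions, and labels are toy placeholders, not the paper's features or data.

import sklearn_crfsuite  # assumed library choice: pip install sklearn-crfsuite

def query_features(session, t):
    """Toy per-query feature dict standing in for the paper's local features."""
    query, clicked_urls = session[t]
    feats = {"bias": 1.0}
    for term in query.split():
        feats["term=" + term] = 1.0
    for url in clicked_urls:
        feats["clicked=" + url] = 1.0
    return feats

# Hypothetical labeled sessions: each session is a list of (query, clicked_urls) pairs.
sessions = [
    [("world cup", ["worldcup.fifa.com"]), ("fifa news", ["fifaworldcup.ea.com"])],
    [("jaguar price", []), ("bmw dealers", [])],
]
labels = [["soccer", "soccer"], ["car", "car"]]

X = [[query_features(s, t) for t in range(len(s))] for s in sessions]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X, labels)  # maximizes the (regularized) conditional log-likelihood

test_session = [("fifa", ["fifa10.ea.com"])]
X_test = [[query_features(test_session, t) for t in range(len(test_session))]]
print(crf.predict(X_test))  # most likely label sequence for the test session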
Features
– Local features
  – Query terms: uses the query terms themselves
  – Pseudo feedback: uses an external Web directory
  – Implicit feedback: uses an external Web directory plus click information
– Contextual features
  – Direct association between adjacent labels: uses the previous labels
  – Taxonomy-based association between adjacent labels: uses the taxonomy structure
Local Features
– Query terms
  – The elementary feature
  – Too sparse: the training data cannot cover the query terms sufficiently
– Pseudo feedback
  – Use the top M results returned by an external Web directory for the query
  – Map each result's directory category to a category in the target taxonomy
  – General label confidence: for each target category, the number of returned search results whose category labels, after mapping, equal that category (a toy sketch follows)
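A toy sketch of the general label confidence from pseudo feedback; the directory categories, the mapping to the target taxonomy, and the normalization by the number of mapped results are illustrative assumptions.

from collections import Counter

def general_label_confidence(top_result_categories, to_target_taxonomy, M=20):
    """Share of the top-M directory results whose category maps to each target category."""
    mapped = [to_target_taxonomy.get(c) for c in top_result_categories[:M]]
    counts = Counter(c for c in mapped if c is not None)
    total = sum(counts.values()) or 1
    return {category: n / total for category, n in counts.items()}

# Hypothetical top results for a query and a hypothetical directory-to-taxonomy mapping
results = ["Recreation/Sports/Soccer", "Games/Video Games", "Recreation/Sports/Soccer"]
mapping = {"Recreation/Sports/Soccer": "Sports", "Games/Video Games": "Entertainment"}
print(general_label_confidence(results, mapping))  # {'Sports': 2/3, 'Entertainment': 1/3}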
Local Features (contd.)
– Implicit feedback
  – Similar to pseudo feedback, but uses click information
  – Produces a click-based label confidence score
  – Calculation (a cosine-similarity sketch follows)
    1. Use the Web directory to get the categories corresponding to the clicked URLs
    2. Obtain a document collection for each possible category
    3. Build a vector space model for each category
    4. Use the cosine similarity between the term vector of each category and the snippets of the clicked pages as the confidence score
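A toy sketch of step 4; the bag-of-words category vectors and the clicked-page snippet below are made-up stand-ins for the vector space model built from the external Web directory.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical per-category term vectors (assumed to be built from directory documents)
category_vsm = {
    "Sports": Counter({"cup": 3, "match": 2, "team": 2}),
    "Entertainment": Counter({"game": 3, "console": 2, "play": 1}),
}
clicked_snippet = Counter("fifa world cup match schedule and team news".split())

scores = {c: cosine(vec, clicked_snippet) for c, vec in category_vsm.items()}
print(scores)  # click-based label confidence scores per category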
Contextual Features
– Direct association between adjacent labels
  – Based on the occurrence of pairs of labels in the training data
  – The higher the weight of a pair (c_i, c_j), the larger the probability that c_i transits into c_j
– Taxonomy-based association between adjacent labels
  – Because the training data is limited in size, some transitions may never occur
  – Uses the structure of the taxonomy
  – The association between two sibling categories is stronger than that between two non-sibling categories (a small sketch follows)
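A small sketch of the taxonomy-based association idea; the parent map is a hypothetical taxonomy fragment and the weights 1.0 / 0.5 / 0.0 are arbitrary illustrative values.

# Parent of each level-two category in a hypothetical taxonomy fragment
PARENT = {
    "Soccer": "Sports", "Tennis": "Sports",
    "Travel & Vacation": "Living", "Finance": "Living",
}

def taxonomy_association(c_prev: str, c_curr: str) -> float:
    """Higher when adjacent labels are the same or siblings in the taxonomy tree."""
    if c_prev == c_curr:
        return 1.0                       # identical categories
    if PARENT.get(c_prev) is not None and PARENT.get(c_prev) == PARENT.get(c_curr):
        return 0.5                       # sibling categories (share a parent)
    return 0.0                           # otherwise

print(taxonomy_association("Soccer", "Tennis"))  # 0.5 -> siblings under "Sports"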
Experimental Setup
– Taxonomy of ACM KDD Cup'05
  – Used as the target taxonomy
  – 7 level-one categories
  – 67 level-two categories
– Data set
  – 10,000 sessions extracted from one day's search log
  – Each session contains at least two queries
  – Three human labelers labeled the queries of each session
Baseline
– Bridging classifier (BC)
  – Trains a classifier on an intermediate taxonomy
  – Bridges the queries and the target taxonomy in the online step of QC
  – Outperformed the winning approach of KDD Cup'05
– Collaborating classifier (CC)
  – A naïve context-aware approach
  – Defines a score function for a query q and a category c using BC
  – Combines the BC scores of the current and previous queries with the association between the previous category and the estimated current category
Evaluation
– For a test query q, let C_q denote its set of true category labels
– Given the classification results, let \hat{C}_q^{K} denote the set of the top-K predicted category labels
– Recall = \frac{|C_q \cap \hat{C}_q^{K}|}{|C_q|}
– Precision = \frac{|C_q \cap \hat{C}_q^{K}|}{|\hat{C}_q^{K}|}
– F1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}
(A toy sketch of these metrics follows.)
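A toy Python sketch of the top-K metrics above, using the standard set-based definitions; how the paper aggregates them over queries and labelers may differ in detail, and the example labels are hypothetical.

def prf_at_k(true_labels: set, predicted_topk: list):
    """Precision, recall, and F1 of the top-K predicted labels against the true label set."""
    hits = len(true_labels & set(predicted_topk))
    recall = hits / len(true_labels) if true_labels else 0.0
    precision = hits / len(predicted_topk) if predicted_topk else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Hypothetical query with two true labels and two predicted labels (K = 2)
print(prf_at_k({"Sports", "Entertainment"}, ["Sports", "Shopping"]))  # (0.5, 0.5, 0.5)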
Results
[Chart: the average overall recall]
CRF-B: CRF with basic features (query terms, general label confidence, and direct association between adjacent labels)
CRF-B-C: CRF-B + click-based label confidence
CRF-B-C-T: CRF-B-C + taxonomy-based association
Results (contd.)
[Chart: the average overall precision]
[Chart: the average overall F1 score]
Case Study
– Without considering context, the query "Santa Fe" has many possible search intents
  – General information about Santa Fe => Information\Local & Regional
  – Travel information about Santa Fe => Living\Travel & Vacation
Conclusions
– A novel approach for leveraging context information to classify queries, modeling search sessions with CRFs
– The approach consistently outperforms a non-context-aware baseline and a naïve context-aware baseline
  – Demonstrates the effectiveness of context information
Discussions
– Experiments on a real data set clearly show that this approach outperforms the non-context-aware baseline
– The first-query problem
  – No search context is available when a query is located at the beginning of a session
– The experiments are rather limited
  – The effect of session size is not studied
  – The effect of taxonomy height is not studied
Q&A
Thank you