Creating a New Faceted Browsing Function for Millennium WebPAC Pro
Li, Yiu On, Senior Assistant Librarian
Leung, Roger, Information Technology Officer
Hong Kong Baptist University Library
9th HKIUG Meeting
University of Hong Kong Library
8th Dec., 2009
Outline
1. What is Faceted Browsing?
2. Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches
3. Architecture of the New Faceted Browsing Function in WebPAC Pro
4. BU Faceted Browsing Function and Encore: A Comparison
5. Conclusion
1. What is Faceted Browsing?
1.1 Definition of Faceted Browsing
Faceted browsing, also known as faceted searching or faceted navigation, is a special navigation interface designed for record searching and browsing. It displays aspects of result sets in multiple classification and categorization schemes (e.g. related authors, titles, subject headings, material types, locations, languages, publication years).
1.2 Advantages of Faceted Browsing
Unlike a single, pre-determined, hierarchical scheme, faceted browsing gives users the ability:
- To find items along multiple dimensions and attributes
- To explore new directions in dynamic taxonomies (i.e. divisions into ordered groups or categories)
- To refine or narrow down their searches
- To switch easily between searching and browsing: users can search with their own terminology while browsing the organizations and categories suggested by the faceted classifications
- To see the number and contents of each suggested category
“For experienced Web users, faceted navigation isn’t something that needs to be explained”
-- Marshall Breeding. "Next-Generation Library Catalogs". Library Technology Reports, vol. 43, no. 4, July-August 2007, p. 12.
1.3 Use of Faceted Browsing in Commercial Sites
- Faceted browsing has become a well-established user interface convention
- A 2003 survey reported that 69% of 75 leading commercial sites used faceted browsing; every site in the computers, gifts, kitchenware, and music/video categories used it
  -- Use of Faceted Classification, http://www.webdesignpractices.com/navigation/facets.html
- e.g. Amazon, the largest online bookstore
In Amazon, faceted browsing includes:
1. New Releases
2. Department
3. Formats
4. Binding
5. Shipping Options
6. Award Winners
7. Promotion
8. Avg. Customer Review
9. Condition
1.4 Implementation of Faceted Browsing in WebPAC
If librarians can implement this common faceted browsing function in the WebPAC environment, we can change WebPAC from a traditional searching tool into a powerful information discovery tool
2. Implementations of Faceted Browsing in Traditional WebPAC: Two Approaches
2.1 Need for Adding New Web 2.0 Functions to WebPAC
- More and more librarians are dissatisfied with the limited functionality of traditional WebPAC interfaces
- To win the support of the new generation of web users, we need to add Web 2.0 technologies such as faceted browsing, interactive tag clouds, federated search, and social networking tools
2.2 Next Generation WebPAC

A WebPAC equipped with new Web 2.0 functions goes by different names, including:
- Next Generation Library Catalog
- SmartCat
- Library Catalog 2.0
2.3 Two Different Approaches
In Hong Kong, two different approaches have been adopted to build the Next Generation Library Catalog:
1. New Functions in New WebPAC (NFNW)
2. New Functions in Current WebPAC (NFCW)
(Note: in this presentation, we use the faceted browsing function as a representative example of the Web 2.0 functions)
2.4 NFNW Development Logic
The development logic of the New Functions in New WebPAC approach may be summarized as:
1. We MUST add a faceted browsing function to WebPAC
2. The existing Millennium WebPAC Pro environment is too old and CANNOT accommodate this transformation
3. Thus, we need to develop a new WebPAC to implement new Web 2.0 technology
(NOTE: this argument is invalid; we discuss it further under NFCW later)
2.5 Two Models of NFNW
Two different development models of New Functions in New WebPAC:
1. Encore
   - III product
   - Relatively high annual subscription fee
   - CUHK, HKU, PolyU have purchased it
2. Scriblio
   - An open-source software
   - Enhanced and used at HKUST
2.6 Disadvantages of NFNW

Many powerful existing functions of WebPAC Pro are missing in Encore:
1. Exact Author, Title, Subject Searching
2. Scope searching
3. Limit results to items with "Available" status
4. Search History
5. Author/title/subject authority list (e.g. Author search = Strauss, Johann, 1825-1899)
6. Modify/Limit this Search command
7. Advanced Keyword Search Form
Powerful existing WebPAC Pro functions are missing in Encore
2.7 NFNW = Dual WebPAC System
1. As a result, Encore cannot replace the “traditional” WebPAC Pro
2. If patrons want to use those “old” advanced search functions, they have to use the “traditional” WebPAC Pro
3. Thus, the Encore (New Functions in New WebPAC) approach is, in reality, a dual WebPAC system: Encore (New WebPAC) + WebPAC Pro (Traditional WebPAC)
Encore at the University of Queensland Library
http://encore.library.uq.edu.au/iii/encore/search/C%7CSStrauss%7COrightresult%7CU1?lang=eng&suite=def
Click to access WebPAC Pro for more “old/classic” advanced searching capabilities
2.8 Disadvantages of a Dual WebPAC System
1. Patrons have to learn how to use two different WebPAC systems, which may cause inconvenience and confusion
2. Library staff spend more time and effort maintaining two search interfaces, so the maintenance cost is high
3. Systems staff waste time re-inventing a “new” interface rather than concentrating on the design of the faceted browsing function
2.9 Building Next Generation Library
Catalog – the Second Approach
1. The development logic of the New Functions in New WebPAC approach is based on an invalid argument: “the existing Millennium WebPAC Pro environment is too old and CANNOT accommodate any Web 2.0 functions”
2. However, our study shows that WebPAC Pro is a comparatively open and flexible environment, and we can add in-house developed scripts to the interface
3. Thus, we decided to add a faceted browsing function to the existing WebPAC Pro interface
4. This is a more logical, simple, and direct approach, and I call it: New Functions in Current WebPAC (NFCW)
2.10 Merits of the NFCW Approach
1. Faceted browsing is inserted into WebPAC Pro and becomes an integral part of it
   - All the powerful existing WebPAC Pro functions are kept
   - The new add-on faceted browsing functions are fully compatible with the existing WebPAC Pro functions
2. The new add-on faceted browsing functions strengthen the existing WebPAC Pro searching capabilities
3. A single interface avoids the unnecessary inconvenience, inconsistency, and confusion caused by a dual WebPAC system
4. Library staff save the time and effort of maintaining two different WebPAC systems
5. There is no need to re-invent a new WebPAC interface, so software development cost and cycle time are greatly reduced
2.11 New Faceted Browsing in HKBU WebPAC Pro
1. Based on the NFCW development logic, HKBU has recently installed a new faceted browsing function on the staging port of WebPAC Pro
2. Currently, only some 222,000 records are uploaded to this database for testing
3. HKBU WebPAC Pro staging port:
   http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=a&searcharg=plato&searchscope=11&SORT=D&extended=0&searchlimits=&searchorigarg=asmith+adam
New In-house Developed Faceted Browsing Function in HKBU WebPAC Pro
3. Architecture of the New Faceted Browsing Function in WebPAC Pro
3.1 Systems Requirements
1. Hardware
   - x86-based PC/server
   - Our server configuration:
     - Dual Xeon quad-core CPU
     - 32 GB memory
     - 1 TB HDD space
   NOTE: 220,000 bib records are uploaded for testing; MySQL and Sphinx together use 7.8 GB
2. Software
   - Perl 5 with marc2xml and marc-charset (for MARC to XML conversion)
   - MySQL 5 (for data storage)
   - Sphinx (for building the index and searching data)
   - IIS with ASP 3.0 (for user interface and data conversion)
- System requirements are minimal
- No dedicated server is needed
- No special high-end programming language is required (Perl, MySQL, and Sphinx are free, open-source software)
3.2 Program Workflow

Two major parts:
1. Construct a bibliographic record database for facet analysis
2. Create a special iFrame in WebPAC Pro for displaying facets
3.3 Construction of a New Bibliographic Database
1. Metadata are required to calculate facets
2. Thus, we build a separate database to store the raw data for creating facets instead of using the records in the Innopac system
3. All MARC records are exported from the Innopac system and uploaded to our in-house developed bibliographic database
4. An indexing program is designed to extract facets according to 11 categories below:
   Variable Fields
   1. Author
   2. Title
   3. Subject
   4. Publisher
   Fixed Fields
   5. Scope
   6. Language
   7. Material Type
   8. Location
   9. Publication Year
   10. Call No. (browsing only)
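The extraction of variable field facets can be sketched roughly as follows. This is an illustration only, not the HKBU indexing program: it is written in Python rather than Perl, the record layout (a list of tag and subfield pairs) is a simplification, the sample record is made up, and the MARC mapping (100$a, 245$a, 650$a, 260$b) is an assumed, commonly used one.

```python
from collections import defaultdict

# Assumed mapping from facet category to (MARC tag, subfield code).
# These are common MARC 21 locations, not necessarily the ones HKBU uses.
VARIABLE_FIELD_FACETS = {
    "author": ("100", "a"),
    "title": ("245", "a"),
    "subject": ("650", "a"),
    "publisher": ("260", "b"),
}


def extract_facets(record):
    """Collect facet values from one simplified MARC record.

    `record` is a hypothetical structure: a dict whose "fields" entry is a
    list of (tag, {subfield_code: value}) pairs.
    """
    facets = defaultdict(list)
    for tag, subfields in record["fields"]:
        for category, (facet_tag, code) in VARIABLE_FIELD_FACETS.items():
            if tag == facet_tag and code in subfields:
                # Drop the trailing ISBD punctuation that MARC subfields carry.
                facets[category].append(subfields[code].strip(" /:;,."))
    return dict(facets)


# Made-up sample record for illustration.
record = {
    "id": "b1000001",
    "fields": [
        ("100", {"a": "Smith, Adam,", "d": "1723-1790."}),
        ("245", {"a": "The wealth of nations /"}),
        ("260", {"b": "Oxford University Press,", "c": "1993."}),
        ("650", {"a": "Economics."}),
    ],
}

facets = extract_facets(record)
print(facets)
```

Fixed field facets such as language or material type would be read from coded positions rather than subfields, which this sketch leaves out.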
3.4 Facets Variable Fields
Below are the facet variable fields for Author Search = Smith, Adam
3.5 Facets Fixed Fields
Below are the facet fixed fields for Author Search = Smith, Adam
3.6 Insert Facets on WebPAC Pro
1. WebPAC Pro is an open environment: we can insert scripts and create an iFrame to display facets on the brief citation browse page and the bib record page
3.7 iFrame Tag on Briefcit.html

An example of iFrame tag:
<iFrame src="http://lib.hkbu.edu.hk/facet/browse/index.asp?searchterm" width="100%" height="100%" frameborder="0" scrolling="no">
</iFrame>
3.8 Facet Display Program
1. Take the search term as input by extracting it from the WebPAC search URL, e.g.
   http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=X&searcharg=china&searchscope=11&SORT=DZ&extended=0&SUBMIT=Search&searchlimits=&searchorigarg=Xchina
2. Pass the search term to the bibliographic database as an SQL search query
3. Extract the search results from the in-house bibliographic database, then calculate and group the facet values
4. Display the faceted categories and values in the iFrame
Workflow diagram:
1. Create facet iFrame in briefcit.html
2. Extract search term and pass to iFrame
3. SQL to bibliographic db
4. Return facet values
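The facet display steps can be sketched as below. This is a minimal illustration under several assumptions, not the ASP program used at HKBU: Python's built-in sqlite3 stands in for MySQL and Sphinx, a simple LIKE match stands in for the real search, and the table name `bib`, its columns, and the sample rows are all hypothetical.

```python
import sqlite3
from urllib.parse import parse_qs, urlparse

# Hypothetical facet columns; the real database schema is not given in the slides.
FACET_COLUMNS = ("language", "material_type")


def extract_search_term(webpac_url):
    """Step 1: pull the search term out of the WebPAC search URL."""
    params = parse_qs(urlparse(webpac_url).query)
    return params["searcharg"][0]


def facet_counts(conn, term, column):
    """Steps 2-3: query the bibliographic table and group one facet's values."""
    if column not in FACET_COLUMNS:
        # Column names cannot be bound as parameters, so check them explicitly.
        raise ValueError(f"unknown facet column: {column}")
    sql = (
        f"SELECT {column}, COUNT(*) AS n FROM bib "
        f"WHERE title LIKE ? GROUP BY {column} ORDER BY n DESC, {column}"
    )
    return conn.execute(sql, (f"%{term}%",)).fetchall()


# Tiny in-memory stand-in for the in-house bibliographic database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bib (title TEXT, language TEXT, material_type TEXT)")
conn.executemany(
    "INSERT INTO bib VALUES (?, ?, ?)",
    [
        ("China and the world", "English", "Book"),
        ("Modern China", "English", "E-book"),
        ("A history of China", "Chinese", "Book"),
        ("History of Japan", "English", "Book"),
    ],
)

url = (
    "http://hkbulib.hkbu.edu.hk:2082/search~S11/?searchtype=X&searcharg=china"
    "&searchscope=11&SORT=DZ&extended=0&SUBMIT=Search&searchlimits="
    "&searchorigarg=Xchina"
)
term = extract_search_term(url)
# Step 4 would render these (value, count) pairs inside the iFrame.
facets = {column: facet_counts(conn, term, column) for column in FACET_COLUMNS}
print(term, facets)
```

Each facet is a list of (value, count) pairs, most frequent first, which is the shape needed to show the number of records beside each suggested category.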
4. BU Faceted Browsing Function and Encore: A Comparison
4.1 Facets Variable Fields
1. Encore cannot provide facet values for variable fields like Author, Title, and Subject
2. Thus, Encore cannot provide a meaningful refinement alternative for variable field categories
3. An example: Author search = Smith, Adam
Encore Variable Field Facets
- No facet values
- Only keyword search links are provided:
  - Keyword-Author
  - Keyword-Title
  - Keyword-Subject
- Fails to provide meaningful alternatives for users to refine/limit their search
Author Facets in HKBU


- Names of Chinese translators are provided
- Ebook collections
Title Facets in HKBU
List of Adam Smith’s most important works:
- Wealth of Nations
- Theory of Moral Sentiments
Chinese translation titles for Wealth of Nations (原富, 國富論) are provided
Subject Facets in HKBU
Contributions of Adam Smith in subject areas:
- Economics
- Ethics
4.2 New Facets Variable Field - Publisher
1. BU provides “Publisher” as a new facet variable field
2. Users may choose to refine the search by publisher, e.g. Oxford University Press
4.3 Publication Year Facets
1. In BU, publication years are grouped into 10-year ranges instead of a long list of single years as in Encore
2. A 10-year list is easier for browsing, searching, and collection analysis
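Grouping publication years into 10-year ranges takes only a few lines. The sketch below is one possible way to do it in Python, not the code used at HKBU; the function names and the sample years are hypothetical.

```python
from collections import Counter


def decade_label(year):
    """Map a publication year to its 10-year range, e.g. 1993 -> '1990-1999'."""
    start = year // 10 * 10
    return f"{start}-{start + 9}"


def year_facets(years):
    """Count records per 10-year range, most recent range first."""
    counts = Counter(decade_label(year) for year in years)
    # Labels start with a four-digit year, so string order matches time order.
    return sorted(counts.items(), reverse=True)


# Made-up publication years for illustration.
result = year_facets([1776, 1993, 1998, 2005])
print(result)
```

Four single-year entries collapse into three ranges here, which is why the grouped list stays short even for large result sets.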
4.4 Encore Only Provides Keyword Searching
1. In Encore, keyword searching is the only searching capability
2. Without exact Author, Title, and Subject search, the searching process becomes more complex and difficult
3. In BU, users can still use exact Author, Title, and Subject search, and refine the search by facets
4. It is difficult to do a keyword search on authors with common last names and first names, e.g. Adam Smith


- Find all records containing Adam and Smith
- NOTE: the first two are not written by the Adam Smith we are looking for, the British economist



- An exact Author search is much easier and more straightforward
- Adam Smith was born in the 18th century, so entry #2 is the one we are looking for
- Facets are also helpful for refining the search
5. It is also difficult to search by keyword for subject headings containing common terms, e.g. Philosophy -- History -- China


- In Encore, find all records containing philosophy and history and China
- NOTE: Many are irrelevant records
- An exact Subject search is much easier and more straightforward
- Facets are very useful for refining the search
4.5 Call Number Analysis
- Unavailable in Encore
- e.g. Keyword = Plato
- To help users browse the class number list, the program displays both the class number and the scope of its content
4.6 Fully Compatible with All WebPAC Searching Functions
Unavailable in Encore:
1. Exact Author, Title, Subject Searching
2. Scope searching
3. Limit results to items with "Available" status
4. Search History
5. Author/title/subject authority list (e.g. Author search = Strauss, Johann, 1825-1899)
6. Modify/Limit this Search command
7. Advanced Keyword Search Form
5. Conclusion
5.1 Benefit

Creating a new faceted browsing function in the existing WebPAC Pro environment is beneficial:
1. WebPAC Pro can be re-engineered and upgraded to become a Next Generation Library Catalog
2. This upgrade keeps all the advanced searching functionalities of WebPAC Pro
3. Upgrade cost is low because there is no need to re-invent a new WebPAC
4. Annual maintenance cost is low because there is no need to maintain two different WebPAC interfaces
5. The development cycle is faster because we can concentrate our work on designing new functions
5.2 Future Development

In the second phase, we will add the following new Web 2.0 functions:
1. Cloud Tagging
2. Newly Added Book List
3. RSS*
4. User Book Rating
5. User Book Review/Comment
6. Adding Google/Amazon table of contents*
7. Most Common Search Terms*
*Not available in Encore
[Figure: RSS, Cloud Tagging, User Book Rating, User Book Comment/Review, Google/Amazon table of contents, Recently Added List]
Demo
Thank you
Demo examples
BU WebPAC Staging Port
http://hkbulib.hkbu.edu.hk:2082/search/X?SEARCH=plato&SORT=D&l=&m=&p=&b=&Da=&Db=&searchscope=11
- KW = Plato (scope = Multimedia)
- AU = 張五常 (publisher = 香港經濟日報)
- AU = Strauss, Johann, 1825-1899 (subject = Waltzes (Orchestra))
- SU = Kant (author = 牟宗三)
- KW = China (pub year = pre 1900)