User-centric SN for calculating cooperation index

Digital Enterprise Research Institute
Extracting and Utilizing Social Networks from Log Files of Shared
Workspaces
Peyman Nasirifard, Vassilios Peristeras, Conor Hayes and Stefan Decker
10th IFIP Working Conference on VIRTUAL ENTERPRISES
Thessaloniki, Greece, 7-9 October 2009
© Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
www.deri.ie
Outline
Introduction and Problem Definition
Object-centric social network for extracting expertise
User-centric social network for calculating the cooperation index
Prototypes





Expert Finder

Holmes
Evaluation
Conclusion
Q and A
Introduction and Problem Definition

Online shared workspaces provide various services
for online collaboration


BSCW, SharePoint
Difficult to find people with appropriate expertise
in intra- and inter-organizational settings


People do not update their profiles regularly
Difficult to spot "who works with whom" or "who
the senior person within a community is"

People do not maintain their social networks frequently
Problem Definition


To find people with specific expertise
To understand who works with whom and to what
extent
Our approach
We use:

Log files from collaborative working environments (CWEs)

Social Network Analysis

Semantic technologies (RDF) to represent the
extracted Social Network
Social Network Analysis

Social Network Analysis has a lot of potential

Overt and Latent social networks exist among professionals

Online social networks can be divided into two main types


Object-centric (e.g., based on videos, music)

User-centric
We use both types in our work

We use object-centric SN for extracting expertise

We use user-centric SN for calculating the cooperation index
– Cooperation index: an index that determines how closely two people
work together
Log files

Log files of shared workspaces contain rich
information and can be further analyzed

A log record contains at minimum Subject
(e.g., user), Object (e.g., document) and
Action/Verb (e.g., read, revise)


Person with ID 123 revised
the document with ID 456
We use these three elements to
generate RDF triples for processing
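
The mapping from a log record to a triple can be sketched as follows. This is only an illustration, not the prototypes' code: the `LogRecord` type, the `to_ntriple` helper and the `http://example.org/cwe/` namespace are assumptions introduced here, since the slides do not name the vocabulary that was used.

```python
from dataclasses import dataclass

# Hypothetical namespace; the slides do not name the vocabulary actually used.
BASE = "http://example.org/cwe/"


@dataclass(frozen=True)
class LogRecord:
    subject: str  # who acted, e.g. user ID "123"
    action: str   # what was done, e.g. "revise"
    obj: str      # what was acted on, e.g. document ID "456"


def to_ntriple(record: LogRecord) -> str:
    """Serialize one log record as a single N-Triples line."""
    s = f"<{BASE}user/{record.subject}>"
    p = f"<{BASE}action/{record.action}>"
    o = f"<{BASE}document/{record.obj}>"
    return f"{s} {p} {o} ."


# "Person with ID 123 revised the document with ID 456"
print(to_ntriple(LogRecord(subject="123", action="revise", obj="456")))
```

Keeping subject, action and object as separate fields means the same records can later feed both the object-centric and the user-centric network.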
Object-centric Social Networks
for
extracting expertise
Finding Experts

First step: Key-phrase Extraction


Documents are analysed based on NLP techniques to
identify phrases that occur frequently
Second step: Log File Analysis


To identify the documents a user interacts with and how
Third step: Assigning Expertise


A user is an expert in topic X if s/he created or revised a
document that contains topic X.
A user is familiar with topic Y if s/he only read a document
that contains topic Y.
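
The three steps above might be combined roughly as in the sketch below. The sample data, the `assign_expertise` helper and the choice to drop "familiar" topics that a user is already an expert in are assumptions made for illustration; the key phrases are taken as given rather than extracted.

```python
from collections import defaultdict

# Hypothetical inputs: key phrases per document (step 1) and
# (user, action, document) events from the log file (step 2).
doc_phrases = {
    "doc1": {"social network analysis", "rdf"},
    "doc2": {"web services"},
}
events = [
    ("alice", "create", "doc1"),
    ("bob", "read", "doc1"),
    ("bob", "revise", "doc2"),
    ("alice", "read", "doc2"),
]

EXPERT_ACTIONS = {"create", "revise"}


def assign_expertise(events, doc_phrases):
    """Return (expert, familiar) maps from user to a set of topics."""
    expert = defaultdict(set)
    familiar = defaultdict(set)
    for user, action, doc in events:
        topics = doc_phrases.get(doc, set())
        if action in EXPERT_ACTIONS:
            expert[user] |= topics
        elif action == "read":
            familiar[user] |= topics
    # Assumption: "familiar" covers topics the user has only read about,
    # so topics the user is already an expert in are dropped.
    for user in familiar:
        familiar[user] -= expert.get(user, set())
    return dict(expert), dict(familiar)


expert, familiar = assign_expertise(events, doc_phrases)
print(expert)
print(familiar)
```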
Overall Approach
User-centric SN for calculating
cooperation index
From Object-centric to User-centric
[Figure: mapping from Action (object-centric) to Relationship (user-centric)]
Assigning weights to social networks


First step: Build user-centric social network

Previous slide

Depth is also considered (e.g., depth one means that just one
document connects two persons)
Second step: Assign weights to relationships


User-defined weights with default values (e.g., Read-Read is a
low-weighted relationship, Create-Create a high-weighted one)
Third step: Calculate cooperation index

Sum up the weights
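
A minimal sketch of these three steps at depth one follows. The numeric weights, the `cooperation_indices` helper and the rule of adding one weight per pair of actions on a shared document are assumptions; the slides only state that Read-Read is low-weighted and Create-Create high-weighted. Action pairs without a weight (e.g., those involving Delete) contribute zero here.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical default weights for pairs of actions on the same document.
# Keys are alphabetically sorted action pairs; users could override them.
DEFAULT_WEIGHTS = {
    ("create", "create"): 5.0,
    ("create", "revise"): 4.0,
    ("revise", "revise"): 4.0,
    ("create", "read"): 2.0,
    ("read", "revise"): 2.0,
    ("read", "read"): 1.0,
}


def cooperation_indices(events, weights=None):
    """Sum relationship weights for every pair of users sharing a document."""
    weights = DEFAULT_WEIGHTS if weights is None else weights
    # Step 1: group actions by document, then by user (depth one).
    by_doc = defaultdict(lambda: defaultdict(set))
    for user, action, doc in events:
        by_doc[doc][user].add(action)

    index = defaultdict(float)
    for users in by_doc.values():
        for u, v in combinations(sorted(users), 2):
            # Steps 2 and 3: weight each action pair and sum the weights.
            for a in users[u]:
                for b in users[v]:
                    index[(u, v)] += weights.get(tuple(sorted((a, b))), 0.0)
    return dict(index)


events = [
    ("alice", "create", "doc1"),
    ("bob", "revise", "doc1"),
    ("bob", "read", "doc1"),
    ("carol", "read", "doc1"),
    ("alice", "read", "doc2"),
    ("carol", "read", "doc2"),
]
print(cooperation_indices(events))
```

Passing a different `weights` dictionary corresponds to the user-defined weights mentioned above.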
Overall Approach
Prototypes

Expert Finder


http://purl.oclc.org/projects/expertui
Holmes (Cooperation Index calculator)

http://purl.oclc.org/projects/holmes

The prototypes are SOA-based

The prototypes use the BSCW shared workspace

The prototypes use log files of BSCW, in particular those of the
Ecospace project over a period of three years
– Around 183 users and several thousand events were extracted
from the log files

Expert Finder uses around 50 deliverables of the Ecospace project
Snapshot: Expert Finder
Snapshot: Holmes
Evaluation with 12 participants

We asked people to take a look at their
cooperation indices


All participants confirmed that the presented results were
relevant to them
So far, we have considered four main document events
(i.e., Create, Revise, Delete, and Read) and only
relationships at a depth of one. The approach can be
easily extended to cover more document events as well
as greater depths.
– Combining events and assigning weights to them can create
overhead for users.

In a more complex model for calculating Cooperation
Indices, different weights can be assigned to documents
based on their importance for the collaboration process.
Evaluation with 12 participants
Issue: Meaningless expertise
Solution: The confidence values (provided by the NLP package)
were used as a threshold to identify the phrases that
have a higher probability of being a meaningful key phrase.
The key phrases were filtered accordingly.

Issue: Organization expertise profile
Solution: An expertise profile may be built for an organization by
unifying the expertise of all members of that
organization.

Issue: Similar phrases
Solution: Some phrases were conceptually the same, but
reported several times. One partial solution to this
problem could be using WordNet to infer the semantics
of the terms and merge relevant terms.

Issue: Irrelevant expertise
Solution: Version history of the shared workspace may be utilized
to infer the exact contribution of a user (e.g., by using
diff)
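
Two of the solutions above, confidence-based filtering and the organization-level profile, are simple enough to sketch. The sample phrases, the threshold of 0.5 and both helper names are assumptions; the slides do not state which threshold or NLP package was used.

```python
# Hypothetical (phrase, confidence) pairs as an NLP package might report them.
extracted = [("social network", 0.91), ("last week", 0.22), ("rdf", 0.78)]


def filter_phrases(scored_phrases, threshold=0.5):
    """Keep phrases whose confidence meets the (assumed) threshold."""
    return {phrase for phrase, conf in scored_phrases if conf >= threshold}


def organization_profile(member_expertise):
    """Union the expertise sets of all members of one organization."""
    profile = set()
    for topics in member_expertise.values():
        profile |= topics
    return profile


print(filter_phrases(extracted))
print(organization_profile({"alice": {"rdf"}, "bob": {"rdf", "web services"}}))
```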
Tools and technology overview





Social Network Analysis
Log files from CWEs
NLP techniques for Phrase Extraction
RDF for representing object-centric and user-centric
Social Networks
Web Services for exposing functionalities
Conclusion and Future Work




We presented our approach for extracting
expertise from online shared workspaces
We also presented our approach for calculating an
index that determines how closely two people
worked together in the past
Addressing the points (and shortcomings) mentioned
in the evaluation is one of our future directions
Using the temporal aspects of log files is another
future direction

Calculating the cooperation index over a period of time
Thank You!
Q and A