Use-Cases Proof of Concept Collaboration Infrastructure

indi-2010-010-013

Project: SURFworks / Proof of Concept COIN
Project Year: 2010
Project Manager: Frank Pinxt
Author(s): Hulsebosch, Wladimir Mufty, Frank Benneker
Completion Date: 30-09-2010
Version: 1.0
Summary
This document describes a set of use-cases for the realisation of the Proof of Concept
Collaboration Infrastructure (COIN).
The use-cases are divided into three areas:
• Use-case eResearch;
• Use-case UvACommunities;
• Use-case Mashups / Federated Widgets.
This document forms the basis for the realization of the Proof of Concept COIN.
This publication is licensed under Creative Commons “Attribution 3.0 Unported”.
More information on this license can be found at http://creativecommons.org/licenses/by/3.0/
Colophon
Programme line: SURFworks 2010
Part: Proof of Concept
Activity: 5.1
Deliverable: Use-cases Proof of Concept Collaboration Infrastructure
Access rights: public
External party: Novay
This project was made possible by the support of SURF, the collaborative organisation for higher
education institutes and research institutes aimed at breakthrough innovations in ICT. More
information on SURF is available on the website www.surf.nl.
5 things you should know about Use-Cases Proof of Concept COIN
Scenario
In 2010 a number of proofs of concept (PoCs) will be realized. The content of
these PoCs will be based on the results of the technology and market research.
The use-cases in this report serve as the basis for the realization of the PoCs.
What is it?
This report describes several use case scenarios as input for the PoCs. Use-cases
are an effective and common way to capture the functional requirements of a
software system.
The goal is to define criteria that need to be applied to several PoCs that will be
built and evaluated in SURFworks’ COIN.
These criteria are used to determine whether available services or tools are
suitable (enough) to be used as functional building blocks in COIN.
For whom is it?
This report will be the input for the developers of the PoC COIN.
How does it work?
What can you do with it?
The purpose of this report is foremost to serve as the input for the realization of
the PoC COIN.
More information
Not applicable
Contents
1. eResearch Use-cases
1.1 Introduction
1.2 Context, conditions and criteria
1.3 Use case 1: Exchange of large data sets
1.3.1 Required functionalities
1.3.2 Key criteria for PoC success
1.4 Use case 2: Sharing data
1.4.1 Required functionalities
1.4.2 Key criteria for PoC success
1.5 Use case 3: Related work search
1.5.1 Required functionalities
1.5.2 Key criteria for PoC success
1.6 Use case 4: Reference management
1.6.1 Required functionalities
1.6.2 Key criteria for PoC success
1.7 Appendix
Products publication and reference management
Products data transport
Products for cloud storage and sharing
2. UvA Communities
2.1 Introduction
2.2 Context, conditions and criteria
2.3 Use case
2.3.1 Required functionalities
2.3.2 Key criteria for PoC success
3. Federated Widgets
3.1 Introduction
3.2 Context, conditions and criteria
3.3 Use case: Use of Federated Widgets
3.3.1 Required functionalities
3.3.2 Key criteria for Proof of Concept success
4. Mashups
4.1 Introduction
4.2 Context, conditions and criteria
4.3 Use case: Mashup Technology for Research and Education
4.3.1 Required functionalities
4.3.2 Key criteria for Proof of Concept success
1. eResearch Use-cases
1.1 Introduction
Researchers increasingly collect and use large data sets for conducting their research. They
frequently do so in inter-disciplinary and collaborative research settings. Such settings demand
efficient data management solutions that facilitate scalable storage, simple exchange, and convenient
sharing of data.
Once the data has been analyzed, the outcome should be put in its related scientific context. This
process involves, amongst others, finding related work and being able to smoothly embed references
to it in the scientific paper during the writing process.
This report outlines several use case scenarios that are associated with the scientific data sharing and
publication processes. The goal is to define criteria that need to be applied to several proofs-of-concept
(PoCs) that will be built and evaluated in SURFworks’ COIN. These criteria are used to determine
whether available services or tools are suitable (enough) to be used as functional building blocks in
COIN.
1.2 Context, conditions and criteria
The starting point for the use cases is a market scan for e-Science (collaboration) tools in COIN. The scan
provided a basis for a recommendation for one or more PoCs in the areas of data sharing and
publication.
The use cases are positioned in a research collaboration context involving academic researchers from
multiple universities. The collaboration has a limited lifetime that is in the order of several years. In
the use cases we assume a setup in which (possibly large) data collections of different nature are
shared or transferred between scientists. The data can either be stored in a single central location, or
can be distributed over different locations. Centralized storage is assumed in the cloud; distributed
storage typically involves peer-to-peer storage solutions. Existing COIN functionality such as
SURFteams for group management and SURFfederatie for federated access will be taken into account
while analyzing the use cases.
Moreover, the use cases will be described from a service point of view, i.e. they will focus on a single
aspect of the data management process (exchange, sharing, literature search, reference
management). This allows for better assessment of potential services’ suitability for a COIN PoC. The
assessment will be based on use case specific as well as a number of high-level criteria. The high-level
criteria are:
1. Popularity of the service amongst researchers.
2. The service or product’s business model: is it a commercial or a non-commercial (i.e. open
source) service?
3. Added value of COIN for the service. Which features or functionality of COIN make the service
easier to work with or add new and useful functionality?
4. Is the service offered as a standalone software application or as a web application?
5. The amount of effort needed to make the service part of COIN, i.e. COIN-isation effort.
6. Is the owner of the service willing to cooperate with the COIN-isation effort?
7. The strategic relevance of the service for SURFnet and/or the COIN platform. Important aspects to
take into account here are service popularity for attracting more users to COIN, the capability
of introducing new target user groups that are associated with a certain service, the added
value of certain service combinations, and the possibility of creating strategic partnerships with
service vendors.
1.3 Use case 1: Exchange of large data sets
This use case illustrates the simple exchange of large files. Janet studies human movement, and she has
just finished a patient observation experiment. She has recorded the experiment, which took several
hours, using a modern high-resolution camera. The resulting movie file is rather large (~35 GB) and is
stored on a local PC in the hospital’s laboratory. For further analysis of the experiment she needs to
transfer the file [1] to her own desktop PC and to her colleague John's PC. Because the file is too large to
be sent as an email attachment, Janet uses a very simple file exchange application [2] that is offered by
the COIN platform. After logging in [3], this application allows Janet to upload the file and to specify the
users she wants to transfer the file to. A few minutes later, John receives an email from Janet that
contains a link for downloading the file. Before Janet arrives back at the university premises, John has
already started analyzing the movie.
1.3.1 Required functionalities
In order to realize exchange of large data sets the following functionality is required:
1. The researcher should be able to transfer large files to multiple destinations. Often a file is too
large to send via email. Services exist that facilitate smooth exchange of such large files (see
below).
2. Exchanging files should be very simple. FTP-based transfer often tends to be too difficult to set up
for many researchers, particularly when doing research from a foreign location (e.g. firewall
settings and other security measures typically prevent such file transfers). The graphical interface
should be intuitive and friendly for the researcher.
3. Access control functionality must be present to guarantee the confidentiality and integrity of the
exchanged files. Research data often should be treated confidentially, i.e. other researchers should
not be able to receive exchanged data. Ideally, the researcher should be able to use his own
account credentials to log in and enter a personal data exchange environment.
The following services offer the required functionality and should be considered for a PoC:
• FileSender
• Lobber
• Big File Sender 2.0
• DropSend
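The exchange flow in the use case (upload, name the recipients, send each recipient a download link) can be illustrated with a minimal sketch. It is not how any of the listed services is implemented; the secret key, the host name filesender.example.org, the one-week lifetime and all function names are assumptions made for illustration only.

```python
import hashlib
import hmac
from urllib.parse import urlencode

# Hypothetical server-side secret; a real service would keep this private.
SECRET_KEY = b"demo-secret"


def make_download_token(file_id: str, recipient: str, expires_at: int) -> str:
    """Sign (file, recipient, expiry) so the link cannot be altered or reused by others."""
    message = f"{file_id}|{recipient}|{expires_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_download_link(file_id: str, recipient: str, now: int,
                       ttl_seconds: int = 7 * 24 * 3600) -> str:
    """Build the link that would be e-mailed to one recipient (host name is a placeholder)."""
    expires_at = now + ttl_seconds
    query = urlencode({
        "file": file_id,
        "to": recipient,
        "exp": expires_at,
        "sig": make_download_token(file_id, recipient, expires_at),
    })
    return f"https://filesender.example.org/download?{query}"


def is_download_allowed(file_id: str, recipient: str, expires_at: int,
                        token: str, now: int) -> bool:
    """Accept the download only if the link is unexpired and the signature matches."""
    if now > expires_at:
        return False
    expected = make_download_token(file_id, recipient, expires_at)
    return hmac.compare_digest(expected, token)
```

Signing the recipient into the link is one simple way to address the confidentiality requirement: a link forwarded to, or guessed by, another person fails the check.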
1.3.2 Key criteria for PoC success
Key technical criteria for the requirements discussed in the previous section can now be provided.
Although these criteria are not comprehensive in every aspect, they provide a good indication of the
services under investigation and therefore can be used in the decision to start the development of a
PoC.
Technical criteria that may have an impact on the feasibility of a PoC with one of these tools are:
1. User friendliness.
2. Federated access should be supported.
A brief assessment of several solutions against the use case specific criteria is given in the table below.
Exchange service       User friendliness   Federated access
FileSender             Good                Provided
Lobber                 Good                Provided
Big File Sender 2.0    Moderate            Not provided
DropSend               Good                Not provided
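As an illustration of how such an assessment could be applied, the sketch below encodes the ratings of the four exchange services and filters them. The rule that a candidate must support federated access and be rated "Good" on user friendliness is an assumption for the example, not a decision stated in this report; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExchangeService:
    name: str
    user_friendliness: str   # rating as given in the assessment
    federated_access: bool   # True if federated access is provided


# Ratings copied from the assessment of the four candidate exchange services.
CANDIDATES = [
    ExchangeService("FileSender", "Good", True),
    ExchangeService("Lobber", "Good", True),
    ExchangeService("Big File Sender 2.0", "Moderate", False),
    ExchangeService("DropSend", "Good", False),
]


def shortlist(candidates):
    """Keep services meeting both use case criteria (assumed selection rule)."""
    return [
        s.name
        for s in candidates
        if s.federated_access and s.user_friendliness == "Good"
    ]
```

With these ratings, `shortlist(CANDIDATES)` keeps FileSender and Lobber.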
Assessment of the candidate services against the above-mentioned high-level criteria results in the
following table:
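Service: FileSender
  Popularity: Popular
  Commercial/academic: Open source.
  Added value of COIN: High, via SURFfederatie and SURFteams
  Software or web application: Web
  Willingness to cooperate: Not relevant.
  COINisation effort: Small, supports federated access.
  Strategic relevance for SURFnet: High

Service: Lobber
  Popularity: In Norway
  Commercial/academic: Open source, for academic usage mainly.
  Added value of COIN: High, via SURFfederatie and SURFteams
  Software or web application: Web
  Willingness to cooperate: Not relevant.
  COINisation effort: Small, supports federated access.
  Strategic relevance for SURFnet: High, cooperation with NORDUnet and reuse of tools from another NREN.

Service: Big File Sender 2.0
  Popularity: Limited
  Commercial/academic: Commercial, non-academic.
  Added value of COIN: High, via SURFfederatie and SURFteams
  Software or web application: Desktop application
  Willingness to cooperate: Unknown
  COINisation effort: Large
  Strategic relevance for SURFnet: High

Service: DropSend
  Popularity: Limited
  Commercial/academic: Commercial, non-academic.
  Added value of COIN: High, via SURFfederatie and SURFteams
  Software or web application: Web
  Willingness to cooperate: Unknown
  COINisation effort: Large
  Strategic relevance for SURFnet: High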
1.4 Use case 2: Sharing data
This use case illustrates typical data sharing behavior in a cross-domain scientific project. It involves a
researcher, Alice, of University A who has generated a considerable amount of scientific data, i.e. she
has created a PDF file of an old and valuable historic document. The data and its meta-data are stored
somewhere [1] for further analysis and are available via a common internet browser. Before the document
can be analyzed, the data needs to be processed [2], i.e. an OCR conversion of the PDF file is
needed. This conversion is carried out via a robust OCR tool that has been developed by Bob of
University B. Instead of downloading the file to her own computer and sending it to the OCR tool, Alice
delegates the OCR tool access [3] to the PDF file and the conversion starts. Somewhat later, the OCR
conversion file has been stored as well. Now Alice can start analyzing the old document’s language
style. Due to copyright restrictions, access to the file is limited and under the control of Alice. Since
she participates in a collaboration project, Alice creates a group [4] of researchers that are allowed
access [5] to the file. This group includes researchers from University A as well as from other universities.
Tom, from University C, is one of the group members. He is particularly interested in the historic
meaning of the scanned document. After providing his username and password credentials to his local
authentication [6] server, Tom is granted access to the file and is amazed by the historic value it
contains.
1.4.1 Required functionalities
Based upon the above-described functionalities, the following functional requirements can be identified
(these represent the underlined pieces of text in the case description):
1. Storage capacity. As already mentioned, storage can be done locally on a central database,
somewhere on a central location in the cloud, or distributed via peer-to-peer storage mechanisms.
Services that support cloud storage and sharing include, amongst others, Dropbox, Box.net, JungleDisk,
and Mozy. More advanced cloud storage products are GPFS and GlusterFS. These two products will
be PoC-ed in GigaPort3 EDS and are out of scope of this deliverable. Peer-to-peer storage products
are Wuala, CrashPlan, Cucku, and PowerFolder.
2. Data processing. Tools for processing the data are often required. These tools may run at a remote
or local site. Since these tools are often use case specific they will not be taken into account during
the further analysis of the use case.
3. Delegation of access. Alice delegates the conversion application access to the PDF-file. Somehow,
she must have given the application an access token. Such functionality could for instance be
supported via standard delegation mechanisms like OAuth.
4. Group functionality. Alice is able to create a group of researchers that is allowed access to the file.
SURFteams could facilitate this very well.
5. Group-based access. The SURFteams group-attribute could be used for access control decisions.
The file storage service, however, must be able to interpret this group attribute.
6. Federated identity. Tom accesses the file after his own university authentication server has
authenticated him. His identity has been federated with the remote storage capacity. The
SURFfederatie is able to offer such functionality.
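Requirements 3, 5 and 6 can be made concrete with a minimal sketch of a storage service that issues time-limited delegation tokens and takes access decisions from group attributes. This is a simplified illustration, not the OAuth protocol and not the SURFteams or SURFfederatie interface; the class, the attribute names and the example identifiers are hypothetical, and the group memberships are assumed to arrive as attributes after federated login.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class StoredFile:
    owner: str
    allowed_groups: set = field(default_factory=set)
    delegation_tokens: dict = field(default_factory=dict)  # token -> expiry time

    def delegate(self, requested_by: str, now: float, ttl: float = 3600.0) -> str:
        """Issue a temporary access token that the owner can hand to a third-party tool."""
        if requested_by != self.owner:
            raise PermissionError("only the data owner may delegate access")
        token = secrets.token_urlsafe(16)
        self.delegation_tokens[token] = now + ttl
        return token

    def token_grants_access(self, token: str, now: float) -> bool:
        """A tool (e.g. an OCR service) presents the token instead of user credentials."""
        expiry = self.delegation_tokens.get(token)
        return expiry is not None and now <= expiry

    def user_has_access(self, user_id: str, group_memberships) -> bool:
        """Group-based decision: group attributes are assumed to come from the
        user's federated login, so the storage service never sees a password."""
        if user_id == self.owner:
            return True
        return bool(self.allowed_groups & set(group_memberships))
```

In the scenario, Alice would call `delegate` once for the OCR tool, while Tom's access would be decided by `user_has_access` using the group attribute released after he authenticates at his own university.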
1.4.2 Key criteria for PoC success
Key technical criteria for the requirements discussed in the previous section can now be provided.
Although these criteria are not comprehensive in every aspect, they provide a good indication of the
products under investigation and therefore can be used in the decision to start the development of a
PoC.
Looking at these requirements it is obvious that they have an impact on the tools for storing and
sharing of data. Technical criteria that may have an impact on the feasibility of a PoC with one of these
tools are:
1. Storage and sharing tools must be able to deal with federated access credentials and delegation of
access via e.g. tokens. Furthermore, they must be capable of handling group functionality, in
particular group-based access via group-related credential (attributes).
2. The data owner must be able to delegate third parties access to (pieces of) data stored elsewhere.
Such delegation may have a temporal character.
3. Group functionality should be under the control of the data owner. Group members must be part of
the identity federation.
Looking at the criteria it is obvious that they mostly have an impact on the storage tools.
A brief assessment of several solutions against the criteria is given in the table below.
Storage & sharing service   Federated access   Delegation support   Sharing via group
Dropbox (cloud)             Not provided       Not provided         Shared access is provided
Box.net (cloud)             Not provided       Not provided         Not provided
JungleDisk (cloud)          Not provided       Not provided         Provided
Mozy (cloud)                Not provided       Not provided         Not provided
LionShare (cloud)           Not provided       Not provided         Provided
Wuala (p2p)                 Not provided       Not provided         Available via MyGroups
CrashPlan (p2p)             Not provided       Not provided         Not provided
Cucku (p2p)                 Not provided       Not provided         Not provided
PowerFolder (p2p)           Not provided       Not provided         Provided
Clearly, most of the desired functionality is not supported by the candidate services. This will have an
impact on the COIN-isation effort needed. These and other criteria are assessed in the table below.
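Service: Dropbox
  Popularity: Popular
  Commercial/academic: Open source, also academic usage.
  Added value of COIN: High, federated access and group management features are of added value.
  Software or web application: Web application
  Willingness to cooperate: Not relevant, is open source.
  COINisation effort: Large, federated and group-based access functionality needed.
  Strategic relevance for SURFnet: Moderate

Service: Box.net
  Popularity: According to the website 60.000 businesses use it.
  Commercial/academic: Commercial, non-academic usage mainly.
  Added value of COIN: High, federated access and group management features are of added value.
  Software or web application: Web application
  Willingness to cooperate: Possibly, the SURFnet community might be attractive for Box.net to support.
  COINisation effort: Large, federated and group-based access functionality needed.
  Strategic relevance for SURFnet: Moderate

Service: JungleDisk
  Popularity: Unknown
  Commercial/academic: Commercial, non-academic usage mainly.
  Added value of COIN: High, federated access and group management features are of added value.
  Software or web application: Desktop application
  Willingness to cooperate: Possibly, the SURFnet community might be attractive for Box.net to support.
  COINisation effort: Large, federated and group-based access functionality needed.
  Strategic relevance for SURFnet: Moderate

Service: Mozy
  Popularity: More than 1 million customers and 50,000 business customers use it.
  Commercial/academic: Commercial, non-academic usage mainly.
  Added value of COIN: High, federated access and group management features are of added value.
  Software or web application: Web application
  Willingness to cooperate: Possibly, the SURFnet community might be attractive for Box.net to support.
  COINisation effort: Large, federated and group-based access functionality needed.
  Strategic relevance for SURFnet: Moderate

Service: LionShare
  Popularity: Mainly in the US, status unclear.
  Commercial/academic: Open source, for academic usage mainly.
  Added value of COIN: High
  Software or web application: Desktop application
  Willingness to cooperate: Small, is open source.
  COINisation effort: Large
  Strategic relevance for SURFnet: Moderate

Service: Wuala
  Popularity: Popular, very mature and friendly
  Commercial/academic: Commercial
  Added value of COIN: High
  Software or web application: Software
  Willingness to cooperate: Small, added value of a closed Wuala community in COIN is not attractive for Wuala.
  COINisation effort: Large
  Strategic relevance for SURFnet: Moderate

Service: CrashPlan
  Popularity: Limited
  Commercial/academic: For free or via paid subscription for online version
  Added value of COIN: High
  Software or web application: Both
  Willingness to cooperate: Small
  COINisation effort: Large, the product itself is not mature yet.
  Strategic relevance for SURFnet: Small

Service: Cucku
  Popularity: Limited
  Commercial/academic: For free
  Added value of COIN: High
  Software or web application: Software
  Willingness to cooperate: Small
  COINisation effort: Large, the product itself is not mature yet.
  Strategic relevance for SURFnet: Small

Service: PowerFolder
  Popularity: Limited
  Commercial/academic: For free with limited functionality or for sale
  Added value of COIN: High
  Software or web application: Software
  Willingness to cooperate: Small
  COINisation effort: Large, the product itself is not mature yet.
  Strategic relevance for SURFnet: Small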
1.5 Use case 3: Related work search
After having intensively analyzed the data and compared it with several other data sets and related
literature, Alice decides to write a paper about the outcome of her team’s work. They have found some
interesting linguistic deviations in the old document that seem to point to a strong influence of a
foreign language. Similar foreign influences have been reported in the literature, she found out while
browsing through several online collections [1] of published scientific content. Several of these collections
offer Alice the opportunity to create a personal profile [2]. The availability of such a profile allows the
collections to optimally retrieve relevant information for Alice and notify her about new work. Alice
decides to create a personal profile at several collection sites. Interestingly, there seems to be another
researcher who is also very active in the area of old language analysis. Alice decides to query for
more work of this researcher. She uses the digital unique identifier [3] of the researcher for this purpose.
Just a few days before submitting the paper to a conference, Alice is notified [4] via email that
a possibly interesting paper has recently been published. Indeed, the paper is very relevant as it
confirms Alice’s conclusions and she decides to add it to her own paper. Now the paper is as solid as a
rock and she is confident that it will be accepted for publication.
1.5.1 Required functionalities
Based upon the above-described functionalities, the following functional requirements can be identified
(these represent the underlined pieces of text in the case description):
1. Online collections of published content. These collections allow the researcher to find and retrieve
relevant published content on the basis of key words and/or domain experts. Examples of such
collections are PubMed, NARCIS, and ScienceDirect.
2. Personal profile options for customized search and retrieval.
3. The availability and use of unique author identifiers to simplify (cross-repository) search actions.
4. Notifications. Researchers should be notified of new publications in their area of interest.
1.5.2 Key criteria for PoC success
Based upon the identified requirements, a number of criteria can be identified that are relevant for
further PoC efforts:
1. Searching for content may be done in the online repositories themselves but may also be done via
a meta-search option. Tools like Scopus, Scirus, Scitopia, OpenDOAR, and Google Scholar offer such
functionality.
2. The researcher should be able to specify a personal profile for customized content offerings.
Ideally, this should be a single profile that can be used for multiple repositories.
3. A unique identifier of a researcher (e.g. Digital Author Identifier) could be used to efficiently search
for publications of that researcher across multiple repositories.
4. Notifications of new related publications from multiple repositories should be presented to the user.
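Criteria 1, 3 and 4 can be illustrated with a small sketch that searches several repositories by a unique author identifier, removes duplicate titles, and reports which hits are new since the previous search. The repositories are plain in-memory lists here; the repository names, the identifier format and the function names are placeholders, and no real repository interface is implied.

```python
def search_by_author(repositories, author_id):
    """Search all repositories for one author identifier.

    repositories: {repository_name: [(title, author_id), ...]}
    Returns a list of (repository_name, title), keeping the first
    occurrence when the same title appears in several repositories.
    """
    seen_titles = set()
    hits = []
    for repo_name, records in repositories.items():
        for title, record_author in records:
            key = title.casefold()
            if record_author == author_id and key not in seen_titles:
                seen_titles.add(key)
                hits.append((repo_name, title))
    return hits


def new_hits(previous_hits, current_hits):
    """Return the hits the researcher has not been notified about yet."""
    known = {title.casefold() for _, title in previous_hits}
    return [hit for hit in current_hits if hit[1].casefold() not in known]
```

A notification service could run `search_by_author` periodically and e-mail only the result of `new_hits`, which matches the scenario in which Alice is told about a recently published paper.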
A brief technical assessment of several online repositories against the requirements and criteria is
given in the table below.
Online repository   Available via meta search engines       Personal profile   Identifiers     Notifications
ScienceDirect       Yes, via e.g. Scopus and Scirus         Provided           Not supported   Provided
PubMed              Yes                                     Provided           Not supported   Provided
Narcis              Yes, though with low priority ranking   Provided           DAI             Provided
The assessment of the online repositories against the high-level PoC criteria is summarized in the table
below.
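Service: ScienceDirect
  Popularity: Popular
  Commercial/academic: Commercial, academic.
  Added value of COIN: Limited
  Software or web application: Web
  Willingness to cooperate: Maybe, is Elsevier and already member of the SURFfederatie
  COINisation effort: Small, in case of a widget. Widgets are available.
  Strategic relevance for SURFnet: Moderate

Service: PubMed
  Popularity: Popular
  Commercial/academic: Commercial, academic.
  Added value of COIN: Limited
  Software or web application: Web
  Willingness to cooperate: Unknown
  COINisation effort: Small, in case of a widget. Widgets are available.
  Strategic relevance for SURFnet: Moderate

Service: Narcis
  Popularity: In the Netherlands
  Commercial/academic: Not commercial, academic.
  Added value of COIN: Limited
  Software or web application: Web
  Willingness to cooperate: Likely
  COINisation effort: Small in case of a widget.
  Strategic relevance for SURFnet: High, good for Dutch research community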
1.6 Use case 4: Reference management
While browsing for related work literature in several online repositories, Alice decides to make use of a
reference management tool for collecting all references [1] she finds relevant. She shares [2] this collection
of related work references with the co-authors of the paper they are writing. The co-authors are also
able to add new references to the collection. While writing the paper, Alice finds it easy to copy and
paste the references [3] from the collection into the Word document according to the format [3] that the
journal template requires.
1.6.1 Required functionalities
The following functional requirements can be distilled from the use case description:
1. Reference management. References and citations of related work must be handled efficiently to
streamline the paper-writing process. Popular tools that facilitate such handling are Zotero,
Mendeley, Refworks, and Endnote.
2. Reference sharing. Other research team members must be able to add new references or make
use of existing ones via a shared collection. Again, Zotero offers such sharing support for
groups or collaborations.
3. Reference reuse and format support. Reference management tools must support multiple reference
formats.
1.6.2 Key criteria for PoC success
The following technical criteria are relevant for PoC success:
1. Reference management tools must be able to interoperate with multiple publication sources.
2. Access to the shared reference collection must be possible across multiple universities.
3. Simple copy-paste functionality that takes into account the right reference format.
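The third criterion, producing the same references in whatever format a journal template requires, can be sketched as follows. The two styles ("author-year" and "numeric") are simplified illustrations invented for this example; they are not the output format of any of the tools discussed, and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    authors: tuple   # e.g. ("Jones, A.", "Lee, K.")
    year: int
    title: str
    venue: str


def format_author_year(ref: Reference) -> str:
    """Simplified author-year style."""
    return f"{', '.join(ref.authors)} ({ref.year}). {ref.title}. {ref.venue}."


def format_numeric(ref: Reference, number: int) -> str:
    """Simplified numbered style."""
    return f'[{number}] {", ".join(ref.authors)}, "{ref.title}", {ref.venue}, {ref.year}.'


def bibliography(refs, style: str):
    """Render one shared collection in the style a journal template asks for."""
    if style == "author-year":
        ordered = sorted(refs, key=lambda r: (r.authors[0], r.year))
        return [format_author_year(r) for r in ordered]
    if style == "numeric":
        return [format_numeric(r, i) for i, r in enumerate(refs, start=1)]
    raise ValueError(f"unknown style: {style}")
```

Keeping the reference data separate from the formatting functions is what lets co-authors share one collection while each target journal gets its own layout.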
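Reference manager: Zotero
  Interoperability with online databases: Yes, with e.g. PubMed, CiteSeer, and IEEE Xplore
  Group support for sharing references: Available in the form of a shared collection of references
  Multiple reference formats and copy-paste: Supported, Word/Writer extensions.

Reference manager: Mendeley
  Interoperability with online databases: Yes, with e.g. PubMed, CiteSeer, and IEEE Xplore
  Group support for sharing references: Available
  Multiple reference formats and copy-paste: Supported

Reference manager: Refworks
  Interoperability with online databases: With PubMed; not with IEEE Xplore.
  Group support for sharing references: Available
  Multiple reference formats and copy-paste: Supported

Reference manager: Endnote
  Interoperability with online databases: Yes, with e.g. PubMed; not with IEEE Xplore and CiteSeer.
  Group support for sharing references: Not supported
  Multiple reference formats and copy-paste: Supported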
For a complete comparison of reference management software we refer to
http://en.wikipedia.org/wiki/Comparison_of_reference_management_software.
Product: Zotero
  Popularity: Moderate
  Commercial/academic: Free download (GPL license).
  Added value of COIN: High, federated and group access functionality are of added value.
  Software or web application: Firefox browser plugin and Word/Writer extensions
  Willingness to cooperate: Likely
  COINisation effort: Small, unless support for multiple browsers is required
  Strategic relevance for SURFnet: Moderate

Product: Mendeley
  Popularity: Increasingly popular
  Commercial/academic: Free, proprietary license. Not open source. Academically oriented
  Added value of COIN: High, federated and group access functionality are of added value.
  Software or web application: Desktop and web components
  Willingness to cooperate: Unknown
  COINisation effort: Large in terms of federated access and group management.
  Strategic relevance for SURFnet: Moderate

Product: Refworks
  Popularity: Popular
  Commercial/academic: Commercial
  Added value of COIN: High, federated and group access functionality are of added value.
  Software or web application: Centrally hosted website
  Willingness to cooperate: Unknown
  COINisation effort: Large in terms of federated access and group management.
  Strategic relevance for SURFnet: Moderate

Product: Endnote
  Popularity: Popular
  Commercial/academic: Commercial
  Added value of COIN: High, federated and group access functionality are of added value.
  Software or web application: Web account
  Willingness to cooperate: Unknown
  COINisation effort: Large in terms of federated access and group management.
  Strategic relevance for SURFnet: Moderate
1.7 Appendix
Products publication and reference management
Zotero is an easy-to-use research tool that helps the researcher gather, organize, and analyze sources
and then share the results of his research. Zotero is a free, open source add-on for the Firefox
browser. Zotero enables users to manage bibliographic data and to store web-page snapshots and
other electronic objects. Through a separate add-on, it also allows citation in text (in Microsoft Word
and OpenOffice.org Writer) and can automatically create bibliographies in various formats. On many
major research websites such as digital libraries, PubMed, Google Scholar, Google Books,
Amazon.com, and even Wikipedia, Flickr and YouTube, Zotero detects when a book, article, or other
resource is being viewed and with a mouse click finds and saves the full reference information to a
local file. Collaboration functionalities include group library-creation to collaborate with other Zotero
users, group publication of libraries, and the discovery of researchers that are working on a similar
topic. Writing in Word is facilitated with a Zotero Word plugin that enables easy creation of end- or
footnotes.
NARCIS is the National Academic Research and Collaborations Information System. NARCIS has been
developed by the KNAW to increase the visibility and retrievability of Dutch scientific research. NARCIS
gives access to scientific information consisting of (open access) publications from the repositories of
all the Dutch universities, KNAW, NWO, and a number of research institutes, the datasets of the
institute DANS, as well as descriptions of research projects, institutes and researchers. This means
that NARCIS cannot (yet) be used as an entry point to complete overviews of researchers'
publications. Research news from, among others, Intermediair Nieuws, Science Guide and several
universities is presented on the homepage of NARCIS with access to the full articles. The news content
is refreshed every hour. Using an RSS feed the user can be notified of new open access publications,
datasets and the latest research in his area of interest. The NARCIS widget enables the user to
dynamically post his 20 most recent NARCIS publications to his blog or website.
ScienceDirect is one of the largest online collections of published scientific research in the world. It is
operated by the publisher Elsevier. It allows users to find and retrieve relevant publications and to be
notified of new relevant entries. The SciVerse extension of ScienceDirect offers sophisticated search
and retrieval tools and integrated external sources that enable users to maximize the effectiveness of
their knowledge discovery process.
PubMed is a free digital archive of biomedical and life sciences journal literature at the U.S. National
Institutes of Health (NIH). It provides search and retrieval of publications, abstracts and citations.
RSS-feed functionality allows the user to be notified of recently added literature. A Clipboard feature
allows collection of selected citations from one or more searches for saving, printing, e-mailing,
ordering, or storing in a MyCollection.
Products for data transport
FileSender is a web based application that allows authenticated users to securely and easily send
arbitrarily large files to other users. Authentication of users is provided through SAML2, LDAP and
RADIUS. Users without an account can be sent an upload voucher by an authenticated user. The
purpose of FileSender is to send a large file to someone, have that file available for download for a
certain number of downloads and/or a certain amount of time, and after that automatically delete the
file. The software is not intended as a permanent file publishing platform. FileSender is released under
the BSD license. It is open source software and available for free.
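The availability rule described above (a file stays downloadable for a certain number of downloads and/or a certain amount of time, and is deleted afterwards) can be sketched as follows. This is only an illustration under stated assumptions: the `Transfer` class, its field names and the `purge_expired` helper are hypothetical and do not reflect FileSender's actual data model or code.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Transfer:
    """Hypothetical record of one uploaded file (not FileSender's real schema)."""
    filename: str
    uploaded_at: datetime
    max_downloads: int   # assumed limit on the number of downloads
    max_age: timedelta   # assumed limit on how long the file is kept
    downloads: int = 0

    def is_available(self, now: datetime) -> bool:
        """A file is offered only while both limits still hold."""
        within_count = self.downloads < self.max_downloads
        within_time = now < self.uploaded_at + self.max_age
        return within_count and within_time

    def download(self, now: datetime) -> bool:
        """Register one download; refuse it once the file has expired."""
        if not self.is_available(now):
            return False
        self.downloads += 1
        return True


def purge_expired(transfers: list[Transfer], now: datetime) -> list[Transfer]:
    """Return only the transfers that are still available; the rest would be deleted."""
    return [t for t in transfers if t.is_available(now)]
```

Running `purge_expired` periodically would model the automatic deletion step; a real service would also remove the stored file itself.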
Lobber is a BitTorrent-based service designed to be deployed with federated authentication in order to
deliver a simple data distribution service. Lobber was designed and built with the needs of most
research and education federations in mind.
Big File Sender 2.0 facilitates the transfer of large files to any PC.
DropSend is a software tool that allows users to send large files as though by email, through a small
desktop client.
Products for cloud storage and sharing
Dropbox is a Web-based file hosting service operated by Dropbox, Inc. which uses cloud computing to
enable users to store and share files and folders with others across the Internet using file
synchronization. The Dropbox client enables users to drop any file into a designated folder that is then
synced with Dropbox's web service and to any other of the user's computers and devices with the
Dropbox client. Users may also upload files manually through a web browser. Through these usages, it
can be an alternative to couriering physical removable media, and other traditional forms of file
transfer, such as FTP and email attachments. While Dropbox functions as a storage service, its focus is
on synchronization and sharing. Though the desktop client has no restriction on individual file size, files
uploaded via the web site are limited to a maximum of 300 MB per file.
Box.net provides a cloud content management solution for individuals and companies ranging from
small businesses to large corporations. Box.net's online file storage makes it easy to securely share
content as a link or a shared folder with anyone, inside or outside the institution. It enables the
creation of an online workspace where the user can share project files, add comments, assign tasks,
start discussions or create new content.
JungleDisk is an online backup tool that stores its data in Amazon S3 or Rackspace Cloud Files. For
personal use, JungleDisk offers easy-to-use, automatic online backups, storage, and real-time sync
between one or more computers. In business settings, it offers desktop and server backups, file
sharing and real-time sync designed for multiple users and teams.
Mozy is an online backup service that allows both Windows and Mac users to back up an unlimited
amount of data and files to offsite servers. The cloud service allows users to back up data
continuously, manually or on a schedule. MozyFree, a free version of Mozy, allows users to back up
2 GB of data from up to two computers for an unlimited period of time without cost.
The LionShare P2P project is an innovative effort to facilitate legitimate file-sharing among individuals
and educational institutions around the world. By using Peer to Peer (P2P) technology and
incorporating features such as authentication, directory servers, and owner controlled sharing of files,
LionShare promises secure file-sharing capabilities for the easy exchange of image collections, video
archives, large data collections, and other types of academic information. In addition to authenticated
file-sharing capabilities, LionShare will also provide users with resources for organizing, storing, and
retrieving digital files.
2. UvA Communities
2.1 Introduction
Modern universities are developing into loose conglomerates of (inter)disciplinary expertise that have
a high degree of connectedness with society in the broader sense. 21st century universities may also
be regarded as 'knowledge servers' in which a number of communities create, share, publish and apply
knowledge. Learning and research, in other words, are becoming community-wide activities. To
support the formation and development of academic communities, the University of Amsterdam has
tailored Sakai, the community (open) source collaboration and learning platform, and created an
integrated community system that meets basic requirements for community support: UvA
Communities.
UvA Communities aims to place this observation in the context of the 21st century
university by creating a ubiquitous community platform that can host a number of different
communities and users, including individuals from outside the university. Within one system, users can
become members of one or more communities, and communities may be formed around a range of
topics, from a general field of study or research to an ad hoc problem area or course sites for
an innovative educational setting.
2.2 Context, conditions and criteria
The context and principal aim of the use case is to provide a collaboration service that users from
multiple organizations can use. Its very purpose is to allow people from outside the university
organization to connect within the realm of the university and to give insiders the means to easily
connect outside the university, all having access to the same set of tools and services, which are
extensible with tailor-made functionality. People from many different cultural and scientific
backgrounds collaborate on a variety of topics and take on multiple roles in multiple communities.
Its relationship with COIN is direct. The COIN infrastructure provides a suitable layer to facilitate
user access and multiple ad hoc group relationships for users from multiple organizations, so that they
can work, learn and share in a true spirit of collaboration using the UvA Communities (Sakai) platform.
The COIN infrastructure also provides a platform to expose UvA Communities functionalities as widget
or gadget services in other platforms, based on the adoption and implementation of OpenSocial in COIN.
The criteria for the PoC are:
1. There is an intrinsic need for inter-organizational (university) collaboration using a collaboration
suite from one of the partners.
2. The COIN infrastructure adds value by providing federated access, group management tools and
a gateway to exchange OpenSocial-based information structures with more than one service.
2.3 Use case
The primary use case of the PoC is the use of the COIN infrastructure to log on and to form and
organize groups (and group relations), with the objective of using UvA Communities as the platform for
collaboration. From the user's perspective there are two options:
1. Use the portal provided by SURFnet to log on to the COIN infrastructure and
a. Start a new collaboration: choose UvA Communities as the primary collaboration
service, establish a group and invite users to the group. To finalize the chain of actions,
link the group to a specific working area (site); or
b. Navigate in the (COIN) portal to the section for collaborating with the other users that are
invited to use UvA Communities. A gadget shows the presence status of the other
users.
2. Use UvA Communities to log on, using the COIN infrastructure for federated access, and
a. Start a new collaboration, using UvA Communities as the primary collaboration service.
In a specific working area (site) a group is set up using the group(er) interface of
COIN and other users are invited to join the group. To finalize the chain of actions,
the group is linked to this specific working area (site); or
b. Navigate to the specific site in UvA Communities using its native interface and start
collaborating. The native presence gadget shows who is online.
2.3.1 Required functionalities
In order to realize the integration of COIN and UvA Communities the following functionalities are
required:
1. UvA Communities can link users and groups from an external provider to
specific native working sites.
2. The COIN infrastructure can provide user and group information to external services
based on clearly defined information standards.
3. UvA Communities can provide (limited) functionality to external systems, in this case
presence status.
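The first two functionalities can be illustrated with a small sketch in which group information arrives from an external provider and is applied to a linked working site. Everything here is an assumption made for illustration: the payload imitates an OpenSocial-style collection with an `entry` list, the field names (`id`, `title`, `members`) are invented, and `SiteRegistry` is a hypothetical stand-in that does not represent the Sakai or COIN APIs.

```python
import json

# Assumed payload: an OpenSocial-style collection with an "entry" list.
# The field names ("id", "title", "members") are illustrative assumptions.
SAMPLE_PAYLOAD = json.dumps({
    "entry": [
        {"id": "coin:group:escience", "title": "eScience project",
         "members": ["alice@uva.example", "bob@leiden.example"]},
    ]
})


class SiteRegistry:
    """Hypothetical stand-in for the platform's working sites (not the Sakai API)."""

    def __init__(self):
        self.group_to_site = {}   # external group id -> site id
        self.site_members = {}    # site id -> set of user ids

    def link_group(self, group_id, site_id):
        """Link an external group to a native working site."""
        self.group_to_site[group_id] = site_id
        self.site_members.setdefault(site_id, set())

    def sync(self, payload):
        """Copy the membership of every linked group onto its site; return the sites updated."""
        updated = []
        for group in json.loads(payload).get("entry", []):
            site_id = self.group_to_site.get(group["id"])
            if site_id is None:
                continue  # group is not linked to any site, so ignore it
            self.site_members[site_id] = set(group.get("members", []))
            updated.append(site_id)
        return updated
```

For example, after `registry.link_group("coin:group:escience", "site-42")`, a call to `registry.sync(SAMPLE_PAYLOAD)` would give the site the two members listed in the payload.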
2.3.2 Key criteria for PoC success
Based on the use case and the requirements, the following relevant criteria for a successful PoC can be
identified:
1. A user is able to log on to UvA Communities or the COIN portal using their own organization ID
and credentials.
2. The chain of actions needed to link a newly established COIN group (with invited users) to a
working site in UvA Communities is completely automatic.
3. The COIN infrastructure provides the correct user and group information. UvA Communities
has the tools to process the data provided in such a way that the internal mechanisms to
create sites, establish groups and link users to the intended working site are performed
adequately.
4. The specific user interface(s) needed for this PoC are user-friendly and intuitive.
5. Within the SURFnet COIN portal it is possible to select UvA Communities as a collaboration
service.
3. Federated Widgets
3.1 Introduction
The use of widget technology within web applications has grown steadily over the past
years. A widget is a small add-on application that can typically be embedded on a user profile page
within a social web network portal. Widgets can provide various kinds of information and functionality
within the context of the user.
The adoption of these small and simple add-on applications by large networking portals like Facebook,
Hyves, Orkut and LinkedIn did not only result in a widely popular feature for the end-user. Their
popularity and the variety of possible uses also prompted widget developers to
standardize widget development and the technology used. The OpenSocial standard is a set of APIs
offered by social network portals. Functions in these APIs can access and execute core functionalities
within the participating social networks. Widgets using the OpenSocial standard are interoperable with
any social network portal supporting the standard.
This deliverable contains a use case scenario that covers the usage of widget technology within a
federated context. The related criteria are defined and shall be applied to a Proof of Concept that will
be part of SURFworks' project COIN. The use case will be described from the point of view of an end-user.
3.2 Context, conditions and criteria
The “Proof of Concept Collaboration Infrastructure” (2009) showed that widgets could be implemented
in a federated OpenSocial environment. SURFnet successfully used widgets based on the
OpenSocial standard within this recently developed infrastructure.
The starting point for this use case is the “Technology Scouting on federated Widgets” conducted in
2010. The Technology Scouting covered the possible technological scenarios for the
implementation of OpenSocial within a federated environment. Besides the scenario analyses, the
Technology Scouting gave an overview of specific issues in the field of security and privacy. These
issues arise when federated environments and OpenSocial widgets are used in conjunction with
each other.
An OpenSocial widget does not have to overcome any major technical challenges to obtain data from
the portal where it is embedded. A widget automatically retrieves the data through the OpenSocial
“container” in which it is placed. Every portal has its own OpenSocial container and therefore
returns its own data.
Retrieving data in a widget from outside the container is possible but poses some
technical difficulties, for instance the (external) location of the
requested data or the possibility that the data sources are not identical for individual users.
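The difference between container data and external data can be made concrete with a minimal sketch. It is a simplified model, not OpenSocial code: the `Container` class, the `external_groups` function and the `provider_locations` mapping are hypothetical names introduced only to show why the widget needs to know where each user's external data lives.

```python
class Container:
    """Hypothetical OpenSocial-like container: it only knows its own portal's data."""

    def __init__(self, portal, groups_by_user):
        self.portal = portal
        self._groups = groups_by_user

    def groups_for(self, user):
        """Data from the hosting portal is available without extra routing."""
        return list(self._groups.get(user, []))


def external_groups(user, provider_locations, providers):
    """Look up data outside the container.

    provider_locations maps a user to the (assumed) name of the external
    source holding that user's data; different users may map to different sources.
    """
    location = provider_locations.get(user)
    if location is None or location not in providers:
        return []  # the widget cannot tell where this user's external data lives
    return list(providers[location].get(user, []))
```

The container call needs only the user, whereas the external lookup also needs a routing table, which mirrors the data routing difficulty named above. Trust between the widget and the external source is not modelled here.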
The Technology Scouting discusses four possible solution scenarios to overcome these challenges. All
four scenarios have their advantages and disadvantages in terms of applicability, flexibility and
compatibility.
The Technology Scouting on federated Widgets provides clear solution paths for the challenges of
retrieving external data within a federated widget. The biggest challenges are related to data routing
and trust issues.
3.3 Use case: Use of Federated Widgets
This use case describes the usage of a simple federated widget within a web portal that supports
OpenSocial. The University of Amsterdam offers its students a web portal. Within this web portal each
student can choose which widgets he or she wants to see. These widgets can offer a variety of
information such as (UvA) news, a personal calendar and the teams to which the
student belongs.
(Part 1)
Otis is a 2nd year student of information sciences at the University of Amsterdam. He often uses the
UvA web portal for managing his homework. Otis logs on with his personal university credentials and
browses to the UvA web portal. In this portal Otis uses a widget that displays all of his UvA groups;
these groups are the study teams Otis is participating in.
Besides the UvA groups widget Otis also uses the SURFteams widget. The SURFteams widget is a
federated widget that displays the SURFteams in which he is participating. With these two widgets
Otis can keep a clear view of the things he should do for the different teams he is participating in
(figure 1).
(Part 2)
Because of the large number of teams, Otis has decided to use a new widget that offers the
option to select multiple team and group providers. Otis configures the widget to show the
UvA groups and the SURFteams he is a member of. After applying all settings he can view all of his
teams and groups within one widget (figure 2).
Figure 1. UvA Portal with an OpenSocial Container holding Widget UvA Groups, Widget SURFteams and Widget X, fed by UvA Groups, SURFteams and Service X.
Figure 2. UvA Portal with an OpenSocial Container holding one widget "All groups and teams" and Widget X, fed by UvA Groups, SURFteams and Service X.
3.3.1 Required functionalities
Based upon the above use case the following required functionalities can be identified:
• The user should have access to the University web portal.
• Configuring and adding the group and team widgets should be very simple.
• The user should be able to see the source of each group (UvA or SURFteams).
• The user can choose to show only selected team providers instead of always showing
all possible providers.
• The web portal should support the OpenSocial standard.
• The group widgets (from the UvA and SURFteams) should use the OpenSocial standard.
• The user is able to join the teams/groups and access them. Existing services like SURFteams
and SURFfederatie can facilitate this.
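Two of these functionalities, showing the source of each group and letting the user select which providers to display, can be sketched as a simple merge step. The function name `merge_groups` and the dictionary layout are assumptions for illustration and are not part of the OpenSocial standard or of SURFteams.

```python
def merge_groups(groups_by_provider, selected=None):
    """Combine groups from several providers into one labelled, sorted list.

    groups_by_provider: dict mapping a provider name to a list of group names.
    selected: optional collection of provider names to show; None shows all.
    """
    rows = []
    for provider, groups in groups_by_provider.items():
        if selected is not None and provider not in selected:
            continue  # the user chose not to show this provider
        for name in groups:
            rows.append({"group": name, "source": provider})
    # Sort by group name, then source, so the display order is stable
    return sorted(rows, key=lambda r: (r["group"].lower(), r["source"]))
```

Called with `{"UvA": [...], "SURFteams": [...]}` and `selected={"UvA"}`, the function would return only the UvA groups, each tagged with its source.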
3.3.2 Key criteria for Proof of Concept success
Criteria that may have an impact on the success of the Proof of Concept:
• The user friendliness of the widget
  o Simple and attractive GUI
  o Loading/rendering feedback to user
  o In style of the web portal and with common widget methods
  o Simple configuration options (no programming skills required)
• Complete results. If results are not present or incomplete, the widget should display a clear but
friendly error message.
• Success should not depend on all data providers being available. If a specific data provider
isn’t available, the widget should keep working for the other providers and retry at a later
moment.
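The last two criteria (a friendly message for incomplete results, and continued operation when a provider is unavailable) can be sketched as follows. The function names and message texts are hypothetical, and each provider is modelled as a plain callable rather than a real network request.

```python
def collect_groups(providers):
    """Query every provider; keep partial results and note which ones to retry.

    providers: dict mapping a provider name to a zero-argument callable that
    returns a list of group names or raises an exception when unavailable.
    """
    results, retry_later = {}, []
    for name, fetch in providers.items():
        try:
            results[name] = fetch()
        except Exception:
            retry_later.append(name)  # do not let one failure break the widget
    return results, retry_later


def status_message(results, retry_later):
    """Build a clear but friendly message about missing or incomplete results."""
    if not results and retry_later:
        return "No groups could be loaded right now. Please try again later."
    if retry_later:
        return "Some groups are missing: " + ", ".join(retry_later) + " will be retried."
    return ""
```

The `retry_later` list could be handed to a timer in the widget so that only the failed providers are queried again at a later moment.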
4. Mashups
4.1 Introduction
A growing number of web users combine different online information resources.
Combining two (or more) information resources on the internet can be viewed as a "mashup" of
information. The characteristic of a mashup is that it is a fast and accessible way to combine existing
information. The “mixing” of information within mashups can take place at multiple levels, from a mere
presentation of two combined web resources to scientific analysis, calculations and modification of
information using multiple datasets.
To create a mashup, a user can use a mashup tool. These tools have a graphical interface and can often
be used without programming skills or knowledge about creating web applications. The current
mashup tools offer a variety of functionality and user options.
In the Technology Scouting "Mashup Technology for Research and Education" (SURFnet 2010)
research and analysis was conducted on these tools. The Technology Scouting covered the applicability
and usefulness of mashup tools for SURFnet users, with the focus on the scientific researchers.
The Technology Scouting has shown that the available mashup tools can be divided into three main
categories:
• General purpose Mashup environments.
• Enterprise Mashup platforms.
• Scientific Workflow environments.
To analyze which type of mashup tool best fits the needs of scientific researchers, requirements
were set and analyzed.
These requirements include the desired mashup platform characteristics and properties which are of
great importance to researchers. It can be concluded that the studied mashup tools differ in their
available functionality. In addition, none of the tested tools fulfilled all of the requirements.
The types of mashup tool that currently suit the needs of researchers best are the
Enterprise Mashup platforms and the Scientific Workflow environments.
This deliverable contains a use case scenario that covers the usage of a mashup platform by scientific
researchers. The related criteria are defined and shall be applied to a Proof of Concept that will be part
of SURFworks' project COIN. The use case will be described from the point of view of an end-user.
4.2 Context, conditions and criteria
The set of identified requirements was presented to and discussed with some mashup (technology)
experts within the scientific community. Across different research fields, several factors appear to
influence the importance of the defined requirements.
The interviews showed that the success of a mashup platform does not depend on its ability to handle
very large amounts of data or to deal with a variety of user types and a diverse visiting
public.
More important requirements were (open) protocol support, the possible use of domain-specific
tools and the availability of mashup platforms. These requirements are in most cases already met by
today’s mashup tools.
The Technology Scouting identified a few requirements that were endorsed by the researchers but
are still not met by the current platforms: flexible authentication, tracing of the scientific
process and the extensibility options of a mashup platform.
4.3 Use case: Mashup Technology for Research and Education
In line with the results of the Technology Scouting "Mashup Technology for Research and Education" and
the field study among the group of mashup experts, a use case has been formed. The use case covers a
combination of requirements that are not yet met or are still not available in the mashup platforms of
interest. The use case illustrates the need for a web-based mashup platform with a strong focus on the
scientific process and a simple authentication method. The availability of open standards and the
possible reuse of mashups by the user community are also important factors.
Isis is a researcher at the University of Leiden. She works within the eScience research department.
Just like her colleagues Isis uses a web based workflow tool with extensive mashup functionality.
Because Isis is already logged on to her personal computer account she does not have to register for a
separate account to make use of the workflow-mashup tool. The tool is associated with the
SURFfederation system, just like the other services she is using.
The workflow-mashup tool is a web based version of the (client) application Taverna. Isis and her
department have used Taverna for the past years, each on their own local computers. The tool gives
her the same possibilities to mashup information resources and save them as part of a (scientific)
workflow. She can create new flows, modify existing flows and save or export them.
For her current project Isis logs on to the workflow-mashup tool via the SURFfederation system and
creates a new flow. Within the flow she combines one local dataset with an external data repository. She
executes the flow and saves the mashup results to her local computer. She then saves the flow and
exports it for sharing on the myExperiment.org website, where scientific workflows are
shared with other community members (figure 1).
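The flow in this scenario (combine a local dataset with an external repository, execute, then export the flow for sharing) can be sketched in a few lines. This is not Taverna or myExperiment code: `run_flow`, `export_flow`, the record layout and the JSON export format are all assumptions chosen to keep the example self-contained.

```python
import json


def run_flow(local_rows, external_rows, key):
    """Join a local dataset with records from an external repository on a shared key."""
    external_by_key = {row[key]: row for row in external_rows}
    merged = []
    for row in local_rows:
        match = external_by_key.get(row[key])
        if match is not None:
            merged.append({**match, **row})  # local values win on conflicts
    return merged


def export_flow(name, inputs, key):
    """Serialise the flow definition (not its results) so that it can be shared."""
    return json.dumps({"name": name, "inputs": inputs, "join_key": key}, sort_keys=True)
```

Keeping the exported definition separate from the results reflects the scenario, in which the results stay on the local computer while the flow itself is shared with the community.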
4.3.1 Required functionalities
Based upon the above use case the following required functionalities can be identified:
• The user should have access to the web-based workflow-mashup tool.
• The available functionality of the tool should correspond with the offline client version of the
tool.
• The user is able to join the teams/groups and access them. Existing services like SURFteams
and SURFfederatie can facilitate this.
• The user should have her own account within the workflow-mashup tool where her own files can
be saved and stored for later use.
• The data processing and tooling should run on a remote machine and not as a local service on
the machine of the user.
4.3.2 Key criteria for Proof of Concept success
• The GUI layout and functionality should correspond with the offline version of the tool. This will
make it easier for current users to find their way and continue with their current knowledge.
• The workflow-mashup tool should be bug free so it can execute workflows with its mashups
without any problems.
• The speed and stability of the platform should be sufficient to support regular usage.
• The possibility to save and reuse the workflows.
Figure 1. Web based Workflow-Mashup Tool: a Taverna Server behind the Workflow-Mashup Tool, combining local information resources and external information resources.