Metadata sells books - SAiW Copyright Polska

Metadata sells books:
how can Poland benefit from ARROW?
Brian Green,
Krakow, 10 December 2013
www.arrow-net.eu
ARROW Plus is a Best Practice
Network selected under the
ICT Policy Support Programme
(ICT PSP)
Agenda

What sort of metadata?

Why is good metadata so important?

Book trade metadata standards

“Books in Print” and its benefits

Developments in Poland

Links for follow-up
What sort of metadata?

Metadata: literally 'data about data,' but, for the purposes of this presentation,
structured data about books that can be used to facilitate description,
discovery, selection and trade in books and their underlying intellectual
property.

Libraries have long understood the need for accurate metadata but their needs
are different from those of publishers and booksellers.

Libraries are not much interested in commercial data such as price and
availability and most library catalogues and National Bibliographies do not
include images of book covers, detailed descriptions, author biographies,
reviews and other rich information that users expect to see on Internet
bookselling sites.

Wholesalers' databases have richer information but do not normally contain
100% of all books published and may only be available to their customers.
Why is Metadata important?

For conventional “bricks and mortar” bookshops, good metadata helps them
find books for customers and select books for stock

For printed books and e-books sold online, rich metadata is the way that
potential buyers will find and choose the books they will buy: the online
equivalent of browsing in a bookshop.

Internet booksellers both in Poland and overseas need rich information in order
to present and promote your books as well as possible
What is rich metadata?

Basic bibliographic metadata includes:

title, author, publisher/imprint, ISBN, format, publication date, subject

Trade metadata also requires:

price, availability status

Rich metadata adds:

cover image, descriptions (long and short), author biography, reviews, prizes and
awards, links to websites, e-book DRM info etc.

There is evidence from Nielsen in the UK that more complete metadata leads
to increased sales
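
The three tiers listed above can be sketched as a simple completeness check. This is a minimal illustration only: the field names are hypothetical Python keys loosely based on the slide, and the "rich" tier checks just a subset of the enhanced elements.

```python
# Field groups loosely based on the slide; the key names are hypothetical.
BASIC_FIELDS = ("title", "author", "publisher", "isbn", "format",
                "publication_date", "subject")
TRADE_FIELDS = ("price", "availability")
RICH_FIELDS = ("cover_image", "description", "author_biography", "reviews")


def metadata_level(record: dict) -> str:
    """Return the highest tier for which this and all lower tiers are filled in."""
    def complete(fields):
        # A field counts as filled unless it is missing, empty text or an empty list.
        return all(record.get(name) not in (None, "", []) for name in fields)

    if not complete(BASIC_FIELDS):
        return "incomplete"
    if not complete(TRADE_FIELDS):
        return "basic"
    if not complete(RICH_FIELDS):
        return "trade"
    return "rich"


example = {"title": "An Example Title", "author": "A. N. Author",
           "publisher": "Example Press", "isbn": "9780306406157",
           "format": "paperback", "publication_date": "2013-12-10",
           "subject": "Fiction"}
level = metadata_level(example)  # basic fields only, no price or availability
```

A publisher could run a check of this kind over a catalogue to see which records are still missing trade or rich elements.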
Helps to sell more books

UK Books in Print provider Nielsen Book noted that titles with
complete basic data records sold 70% more copies on average than
those with incomplete metadata.

Inclusion of enhanced data (description, reviews, author biography)
increased sales by a further 28%

NB: UK BiP provides free basic records but charges for inclusion of enhanced
data elements
Communicating metadata

An ever-increasing number of trading partners will ask for this metadata and
each one may require slightly different data fields

It would be cost-effective to send full metadata to a single trusted hub, if one
exists, for transmission to all channels.

Metadata is normally supplied to retailers and e-book aggregators via web-based forms, in Excel spreadsheets or in ONIX format.

ONIX for Books is the international standard for communicating product
information about printed and electronic books and has been designed to
include all the data likely to be required, including marketing and supply
information: book jackets, long and short descriptions, reviews, detailed size
and weight info, links to a/v (e.g. author interviews), websites etc.
ONIX for Books

A global standard for communication of book product information, widely used
throughout the book and e-book supply chain in North America, Europe,
Australasia and the Asia Pacific region.

XML-based, providing a consistent way for publishers, retailers and their supply
chain partners to communicate rich information about their products, but quite
complex

Maintained and supported by EDItEUR, the international book trade standards
body (www.editeur.org)

Freely available online

Latest version, ONIX for Books version 3.0, has been designed to support e-book information including format and DRM/usage constraints
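
To give a feel for what "XML-based" means in practice, here is a rough sketch that builds a heavily simplified ONIX 3.0-style product record using Python's standard library. The element names and code values follow ONIX 3.0 reference tags as commonly documented, but they should be treated as assumptions to be checked against the EDItEUR specification. Mandatory elements such as the message header, product form and publishing detail are omitted, so this fragment would not validate as it stands. The record reference scheme and the sample title and author are invented.

```python
import xml.etree.ElementTree as ET


def build_onix_product(isbn13: str, title: str, author: str) -> ET.Element:
    """Build a simplified, non-validated ONIX 3.0-style <Product> element."""
    product = ET.Element("Product")
    # Hypothetical record reference scheme, for illustration only.
    ET.SubElement(product, "RecordReference").text = f"example.{isbn13}"
    ET.SubElement(product, "NotificationType").text = "03"  # assumed: confirmed record

    identifier = ET.SubElement(product, "ProductIdentifier")
    ET.SubElement(identifier, "ProductIDType").text = "15"  # assumed: ISBN-13
    ET.SubElement(identifier, "IDValue").text = isbn13

    detail = ET.SubElement(product, "DescriptiveDetail")
    title_detail = ET.SubElement(detail, "TitleDetail")
    ET.SubElement(title_detail, "TitleType").text = "01"  # assumed: distinctive title
    title_element = ET.SubElement(title_detail, "TitleElement")
    ET.SubElement(title_element, "TitleElementLevel").text = "01"  # assumed: product level
    ET.SubElement(title_element, "TitleText").text = title

    contributor = ET.SubElement(detail, "Contributor")
    ET.SubElement(contributor, "SequenceNumber").text = "1"
    ET.SubElement(contributor, "ContributorRole").text = "A01"  # assumed: author
    ET.SubElement(contributor, "PersonName").text = author
    return product


product = build_onix_product("9780306406157", "An Example Title", "A. N. Author")
xml_text = ET.tostring(product, encoding="unicode")
```

Even this small fragment shows why the format is described as rich but quite complex: every value sits in a named, coded element rather than a free-text column.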
Benefits of ONIX

Provides book information in a widely used standard format, favoured by major
bookselling chains and Internet booksellers (e.g. Amazon)

Promotes faster and more cost-effective exchange of information from
publishers to wholesalers, booksellers, libraries both nationally and
internationally

Publishers no longer need to provide data in so many unique formats, reducing
errors and avoiding rekeying

By providing a template for the content and structure of a product record, ONIX
has helped to stimulate the introduction of better internal information systems,
capable of bringing together all the ‘metadata’ needed for the description and
promotion of new and backlist titles.

The same core data can also be used to produce advance information sheets,
catalogues and other promotional material, to feed publisher websites, and to
meet the needs of the wider supply chain.
Could ONIX become standard in Poland?

Basis of ARROW Books in Print and links to ARROW system

Basis of metadata for new Polish ISBN system

Major Polish wholesalers (e.g. Azymut, Olesiejuk) already using ONIX to
communicate with overseas partners

ONIX-based service providers (e.g. eLibri) could provide a way for non-technical publishers to generate ONIX

ONIX for RROs already used by IFRRO members for exchange of distribution
payment information and rights and repertoire information.
Book trade standards

Computer systems hate ambiguity and need to speak the same “language” if
communication / interoperability is to be optimised

Before 1970, every major bookseller and distributor used their own numbering
systems to identify books on their computer systems. Through ISO,
representatives of several countries agreed on the ISBN as an international
standard for identifying books

Now ISBN is universal in 160 countries. Imagine the book trade today without
ISBNs

Before 2000, every bookselling chain (Internet and physical) demanded book
information in their own format. Then the major UK and US publishers and
booksellers agreed on ONIX for books, a standard format for communicating
that information, saving rekeying, errors and time.
Some new standards to look out for

ISTC (International Standard Text Code), identifies the underlying work and
can be used to link different manifestations of that work e.g. paperback,
hardback, filmscript, e-book

Linking manifestations of the same work is an important part of ARROW

Each ISTC is a unique number, assigned through a central registration system
to a textual work with a unique set of metadata about that work

When a new record is presented for registration, if another record has already
been registered with the same metadata, the system will output the ISTC of the
matching metadata record already held on the system.

Translations are regarded as new works, but linked to the original work

SAiW “Polska Ksiazka” is one of the early ISTC Agencies
[Slide graphic: ISTC example, Hamlet by Shakespeare]
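
The registration rule described on this slide (the same metadata always resolves to the same identifier, and translations are new works linked to the original) can be sketched as a toy registry. This is an illustration under stated assumptions, not the actual ISTC registration system: the `DEMO-WORK-` identifiers, the choice of metadata fields and the normalisation rule are all hypothetical.

```python
class WorkRegistry:
    """Toy registry: identical work metadata always yields the same identifier."""

    def __init__(self):
        self._ids = {}       # normalised metadata -> work identifier
        self._original = {}  # translation identifier -> original work identifier
        self._counter = 0

    @staticmethod
    def _normalise(*parts):
        # Assumed rule: ignore case and extra whitespace when comparing metadata.
        return tuple(" ".join(p.lower().split()) for p in parts)

    def register(self, title, author, language, translation_of=None):
        """Return (identifier, created); created is False for a matching record."""
        key = self._normalise(title, author, language)
        if key in self._ids:
            return self._ids[key], False
        self._counter += 1
        work_id = f"DEMO-WORK-{self._counter:04d}"  # hypothetical identifier format
        self._ids[key] = work_id
        if translation_of is not None:
            self._original[work_id] = translation_of
        return work_id, True

    def original_of(self, work_id):
        return self._original.get(work_id)


registry = WorkRegistry()
hamlet, created = registry.register("Hamlet", "William Shakespeare", "eng")
same, created_again = registry.register("  hamlet ", "William  Shakespeare", "ENG")
polish, _ = registry.register("Hamlet", "William Shakespeare", "pol",
                              translation_of=hamlet)
```

Here the second registration returns the identifier already held, while the Polish translation receives its own identifier with a link back to the original work.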
Some new standards to look out for

ISNI (International Standard Name Identifier), identifies public names of
contributors to creative works and those active in their distribution

Includes authors, translators, illustrators, publishers, imprints as well as
creators in other media industries (music, film, etc.)

Attempts to deal with ambiguities in names, e.g. multiple authors with the same
name, different spelling/transliteration of the same name

Also links pseudonyms (where the name is public)

ISNI holds public records of over 6.8 million identities, including:


6.4 million individuals

400,000 organisations
The ISNI database is a cross-domain resource, contributed to by 25 institutions
and databases, and 38 major national and research libraries including the
Virtual International Authority File (VIAF)
[Slide graphic: ISNI example, Olga Tokarczuk]
Thema

A new global subject classification system for books

Hierarchical system based on BIC subject categories

Participants from 15 countries and the Pan Arab Group: Austria, Canada, Denmark,
France, Germany, Italy, Netherlands, Norway, Russia, South Africa, Spain,
Sweden, Switzerland, the United Kingdom and the United States. Managed by
EDItEUR and free to use

Version 1.0 published at Frankfurt Book Fair, October 2013

“Sunrise date” for implementation 31 December 2013

Already multi-lingual - English, French, German, Italian, Russian, Arabic
versions with more languages coming

Unlike library classifications such as UDC, it is tailored for commercial use
within the book trade and considers booksellers' needs
The ARROW project and metadata

Metadata providers to the project included:

the cultural sector represented by national libraries (national bibliographies)

the collective management organisations which maintain a network for the collective
management of textual reproduction rights on behalf of authors and publishers
(“reproduction rights organisations” or RROs)

the organisations which create and maintain “books in print” databases across
Europe

One of the crucial questions that the ARROW system asks is whether a title is
still commercially available in print or available as an ebook. This requires a
database of commercially available books, i.e. Books in Print

ARROW Plus aims to help establish sustainable Books in Print services in
partner countries that do not yet have them: Greece, Hungary, Latvia,
Lithuania, Poland and Portugal
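
The availability question that ARROW asks can be pictured as a lookup across all manifestations of a work in a Books in Print database. The sketch below is only an illustration: the record structure, the `work_id` field (the role an ISTC would play) and the sample identifiers are assumptions, not the ARROW data model.

```python
from dataclasses import dataclass


@dataclass
class Manifestation:
    product_id: str    # placeholder product identifier (an ISBN in practice)
    work_id: str       # hypothetical work-level identifier linking manifestations
    product_form: str  # e.g. "paperback", "hardback", "ebook"
    available: bool    # current availability as reported to Books in Print


def commercially_available(work_id, books_in_print):
    """Return the forms in which a work is currently available (empty if none)."""
    return sorted({m.product_form for m in books_in_print
                   if m.work_id == work_id and m.available})


catalogue = [
    Manifestation("demo-product-1", "demo-work-1", "hardback", False),
    Manifestation("demo-product-2", "demo-work-1", "ebook", True),
    Manifestation("demo-product-3", "demo-work-2", "paperback", False),
]
forms_work_1 = commercially_available("demo-work-1", catalogue)
forms_work_2 = commercially_available("demo-work-2", catalogue)
```

In this toy catalogue the first work is out of print in hardback but still in commerce as an e-book, while the second is not commercially available at all.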
ARROW shared BiP platform

Aimed to create software capable of meeting all the requirements of different
organisations in agreed countries with minimal need for local adaptation

All participating countries had the opportunity of expressing requirements

Development cost funded by ARROW Plus

On-going maintenance and enhancement costs could also be shared amongst all
countries using the system and therefore should be less per user.
What is Books in Print?

What it is NOT

NOT only about printed books

NOT a National Bibliography


which includes books in and out of print (no discrimination)

has no updated price and availability data
NOT a Wholesaler/Distributor/Internet bookseller database

which is limited to books carried by that wholesaler/distributor/bookseller

often has only limited information and contact details for publishers and distributors

availability refers to wholesaler or bookseller rather than the publisher
Books in Print

A trusted book trade hub for product information

Provides aggregated information to the entire book supply chain

Listing of all books available

… or soon to be published

Including e-books

Contact details for publishers / distributors

Comprehensive in coverage

Includes descriptive/marketing information

Provides updated information on current price, availability and source (i.e.
distributor)

Normally available to retailers and the book trade, but PK plans to use it
only as an internal resource
What are the benefits of BiP to the book
trade?

Provides all the information that potential purchasers need to discover, make a
purchase decision and obtain the book

Accurate and up-to-date data about every book available for sale

Encourages backlist sales and facilitates customer orders

Promotes export sales (books and licences)

Enables electronic order routing and other value-added services (sales
data collection, anti-piracy systems etc.)

Facilitates online bookstores (including ebook aggregators)

Enables licensing of in-commerce books for digitisation

Provides a “hub” for disseminating book information to the trade

Saves time and money and reduces errors

Helps sell more books
A hub for book information

Currently publishers must provide product data in many different formats to
wholesalers, booksellers, ebook platforms, National Library ISBN Agency etc.

This is time consuming, wasteful, costly and can lead to errors and
inconsistencies

A Books in Print service can provide a “hub” to which publishers can send their
product information in agreed formats

The Books in Print hub can check the data and convert it into the formats
required by wholesalers, distributors, booksellers, libraries etc.

This checked and formatted data can also be made available back to the
publisher to provide a well-formed product database for producing catalogues
and other promotional material
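
The check-and-convert role of the hub can be sketched as follows. This is a minimal illustration, assuming each recipient wants a different subset of fields. The recipient profiles and field names are hypothetical. The ISBN-13 check uses the standard alternating 1 and 3 weighting.

```python
# Hypothetical recipient profiles: which fields each channel wants to receive.
RECIPIENT_FIELDS = {
    "bookseller": ["isbn", "title", "author", "price", "description"],
    "library": ["isbn", "title", "author", "publisher"],
}


def isbn13_is_valid(isbn: str) -> bool:
    """Check the ISBN-13 check digit (weights alternate 1, 3 across 13 digits)."""
    digits = isbn.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))
    return total % 10 == 0


def check(record: dict) -> list:
    """Return a list of problems found in a publisher's record."""
    problems = []
    if not isbn13_is_valid(record.get("isbn", "")):
        problems.append("invalid ISBN-13")
    for name in ("title", "author", "publisher"):
        if not record.get(name):
            problems.append(f"missing {name}")
    return problems


def distribute(record: dict) -> dict:
    """Check one record, then produce one tailored copy per recipient."""
    problems = check(record)
    if problems:
        raise ValueError("; ".join(problems))
    return {recipient: {f: record[f] for f in fields if f in record}
            for recipient, fields in RECIPIENT_FIELDS.items()}


record = {"isbn": "978-0-306-40615-7", "title": "An Example Title",
          "author": "A. N. Author", "publisher": "Example Press",
          "price": "39.90 PLN", "description": "A short description."}
outputs = distribute(record)
```

The publisher sends one record; the hub rejects it if the basic checks fail and otherwise produces a version for each channel.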
Which European countries have BiP
services?

Belgium

Denmark

Finland

France (includes French-speaking Belgium & Switzerland)

Germany (includes Austria and German-speaking Switzerland)

Greece

Italy (includes Italian-speaking Switzerland)

Netherlands

Norway

Romania

Spain

Sweden

UK (includes Ireland)
Ownership of Books in Print

Belgium: Meta4Books (non-profit industry association) - ISBN

Denmark: The Danish Booksellers Association

Finland: Kirjavälitys Oy (wholesaler)

France: Cercle de la Librairie (professional association) - ISBN

Germany: MVB (marketing arm of PA/BA) - ISBN

Greece: OSDEL (RRO)

Italy: Informazioni Editoriali (bibliographic company)

Netherlands: Centraal Boekhuis (wholesaler) - ISBN

Norway: Publishers, booksellers and wholesalers consortium

Romania: National Book Centre (public/private funds)

Spain: Federation of Spanish Publishers Guilds - ISBN

Sweden: The three major publishing houses in Sweden + the
leading wholesaler

UK: Nielsen Book Services (commercial company) - ISBN
Collaboration is a good idea

In many countries the BiP and ISBN Agency are co-located. Where they are
not, there is close collaboration between the two functions

National Libraries managing legal deposit schemes are well-equipped to collect
and aggregate good quality comprehensive data but not to maintain dynamic
data such as price and availability or other market data

Collaboration with ISBN agencies/National Libraries and trade bodies is the
best solution

All parties benefit if the quality of metadata is improved

Poland could be a good example of this
Polish Books in Print

Good book trade databases from wholesalers and booksellers already exist in
Poland

PK is anxious not to interfere with commercial interests of those companies so
the PK “Books in Print” will be an internal database linking to the ARROW
system

Services already exist in Poland to provide a hub for the distribution and
conversion of data (e.g. eLibri)

Initial data has been ingested from the National Library

Major wholesaler(s) will provide their records and update availability information

In the future, publishers will be expected to enter the initial information in
advance of publication and keep it updated

This may be using online forms or by batch delivery of data for multiple titles in
agreed formats (e.g. Excel templates, ONIX)

There will be exchange of information from ISBN registration
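
The data flow outlined on this slide (an initial load from the National Library, followed by wholesaler updates of trade data) might look roughly like the merge below. This is a sketch under stated assumptions: records are plain dictionaries keyed by ISBN, and the field names and sample values are hypothetical rather than taken from the PK system.

```python
def merge_feeds(library_records, wholesaler_updates):
    """Combine bibliographic records with trade data, keyed by ISBN."""
    merged = {rec["isbn"]: dict(rec) for rec in library_records}
    for update in wholesaler_updates:
        # A wholesaler may list a title the library feed has not supplied yet.
        target = merged.setdefault(update["isbn"], {"isbn": update["isbn"]})
        for key in ("price", "availability"):
            if key in update:
                target[key] = update[key]
    return merged


library_records = [
    {"isbn": "demo-isbn-1", "title": "First Example", "publisher": "Example Press"},
    {"isbn": "demo-isbn-2", "title": "Second Example", "publisher": "Example Press"},
]
wholesaler_updates = [
    {"isbn": "demo-isbn-1", "price": "29.90 PLN", "availability": "in stock"},
    {"isbn": "demo-isbn-3", "availability": "not yet published"},
]
database = merge_feeds(library_records, wholesaler_updates)
```

Descriptive fields come from the bibliographic source and are left untouched, while price and availability are overwritten by the most recent trade update.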
Potential benefits from ARROW

Membership of the ARROW system helps avoid unauthorised use of your
intellectual property and may provide licensing income from digitisation
projects

Use of standards promoted by ARROW can save time, cost and errors

Better metadata in international standard formats can lead to increased sales
at home and overseas

Potential for full scale, trade-accessible Polish Books in Print in the future if
the trade decides it would be useful.
Some useful links

ONIX for Books:
http://www.editeur.org/83/Overview/

ISBN FAQs:
http://isbn-international.org/faqs

ISTC:
http://www.istc-international.org/

ISNI:
http://www.isni.org/

THEMA:
http://www.editeur.org/151/Thema/
http://www.arrow-net.eu
FURTHER INFORMATION
Brian Green
[email protected]