presented

RePEc, a digital commons for
economics
Thomas Krichel
2013-02-05
???
• RePEc is a congenial and highly original
initiative.
• It is very poorly understood.
• It has been running close to 20 years.
structure of talk
• some history
• extent of RePEc
• reasons for success
• challenges
• soapbox
History
• It started with me as a research assistant
in the Economics Department of
Loughborough University of Technology in
1990.
• a predecessor of the Internet allowed me
to download free software without effort
• but academic papers had to be gathered
in a painful way
CoREJ
• published by HMSO
– Photocopied tables of contents of recently
published economics journals received at the
Department of Trade and Industry
– Typed list of the working papers recently
received by the University of Warwick
library
• The latter was the more interesting.
working papers
• early accounts of research findings
• published by economics departments
– in universities
– in research centers
– in some government offices
– in multinational administrations
• disseminated through exchange
agreements
• important because of the 4-year publishing
delay
1991-1992
• I planned to circulate the Warwick working
paper list over listserv lists
• I argued it would be good for them
– increase incentives to contribute
– increase revenue for ILL
• After many trials, Warwick refused.
• Towards the end of that time, I was offered a
lectureship, and decided to start working on
my own collection.
1993: BibEc and WoPEc
• Fethy Mili of Université de Montréal had a
good collection of papers and gave me his
data.
• I put his bibliographic data on a gopher
and called the service "BibEc"
• I also gathered the first ever online
electronic working papers on a gopher and
called the service "WoPEc".
NetEc consortium
• BibEc: printed papers
• WoPEc: electronic papers
• CodEc: software
• WebEc: web resource listings
• JokEc: jokes
• HoPEc: a lot of Ec!
why?
• In the 90s it was clear to me that open
access to scientific publication would bring
tremendous benefits.
• The aim was to place open access
documents in the same system with toll-gated
access documents to allow the
former to compete more effectively with
the latter.
WoPEc to RePEc
• WoPEc was a catalog record collection
• WoPEc remained the largest web access
point
• but getting contributions was tough
• In 1996 I wrote the basic architecture for
RePEc.
– ReDIF
– Guildford Protocol
1996: RePEc principle
• Many archives
– archives offer metadata about digital objects (mainly
working papers)
• One database
– The data from all archives forms one single logical
database despite the fact that it is held on different
servers.
• Many services
– users can access the data through many interfaces.
– providers of archives offer their data to all interfaces
at the same time. This provides for an optimal
distribution.
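The "many archives, one database, many services" principle can be sketched in a few lines. This is only an illustration, not RePEc's actual software: the archive contents, the handle RePEc:xxx:wpaper:0001, and the function names are hypothetical, and the archives are plain in-memory dictionaries rather than remote servers.

```python
# Hypothetical stand-ins for archives held on different servers.
# Each archive maps a handle to a metadata record.
archives = {
    "sur": {
        "RePEc:sur:surrec:9601": {
            "title": "Dynamic Aspect of Growth and Fiscal Policy",
        },
    },
    "xxx": {  # made-up archive code, for illustration only
        "RePEc:xxx:wpaper:0001": {"title": "An Example Working Paper"},
    },
}


def build_database(archives):
    """Merge the records of all archives into one logical database."""
    database = {}
    for archive_code, records in archives.items():
        for handle, record in records.items():
            # Assumption: handles are globally unique, so a dict keyed
            # by handle is enough to hold the combined data.
            database[handle] = dict(record, archive=archive_code)
    return database


def search_service(database, word):
    """One possible service: find handles whose title contains a word."""
    word = word.lower()
    return sorted(h for h, r in database.items() if word in r["title"].lower())


def listing_service(database):
    """Another possible service: list handles grouped by archive."""
    listing = {}
    for handle, record in sorted(database.items()):
        listing.setdefault(record["archive"], []).append(handle)
    return listing


database = build_database(archives)
print(search_service(database, "fiscal"))
print(listing_service(database))
```

Both services read the same merged data, so a record offered by one archive becomes visible in every interface at once.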
RePEc is based on 1400+
archives
• MPRA
• DEGREE
• S-WoPEc
• NBER
• CEPR
• US Fed in Print
• IMF
• OECD
• MIT
• arXiv
• CO PAH
to form a 1.2M item dataset
• 500,000 working papers
• 700,000 journal articles
• 3,000 software components
• 30,000 book and chapter listings
• 35,000 author contact and publication listings
• 12,000 institutional contact listings
RePEc is used in many services
• IDEAS
• EconPapers
• NEP: New Economics
Papers
• Inomics
• RePEc Author Service
• RuPEc
• EDIRC
• LogEc
• CitEc
… describes documents
Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Person: RePEc:per:1965-0605:thomas_krichel
Author-Email: [email protected]
Author-Name: Paul Levine
Author-Email: [email protected]
Author-WorkPlace-Name: University of Surrey
Classification-JEL: C61; E21; E23; E62; O41
File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf
File-Format: application/pdf
Creation-Date: 199603
Revision-Date: 199711
Handle: RePEc:sur:surrec:9601
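A template of this kind is a list of "Field: value" lines in which a field such as Author-Name may repeat. A minimal reading sketch follows. It is not the official ReDIF parser: the function name parse_redif is hypothetical, field names are lowercased on the assumption that capitalization does not matter (the templates in this talk use both styles), and continuation lines are not handled.

```python
def parse_redif(text):
    """Parse one ReDIF-style template into a dict of lists.

    Keys are lowercased field names; each value is the list of all
    values seen for that field, in order of appearance.
    """
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, so handles and URLs stay intact.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # skip blank or malformed lines
        record.setdefault(key.strip().lower(), []).append(value.strip())
    return record


sample = """Template-Type: ReDIF-Paper 1.0
Title: Dynamic Aspect of Growth and Fiscal Policy
Author-Name: Thomas Krichel
Author-Name: Paul Levine
Classification-JEL: C61; E21; E23; E62; O41
Handle: RePEc:sur:surrec:9601
"""

record = parse_redif(sample)
jel_codes = [code.strip() for code in record["classification-jel"][0].split(";")]
# A handle is a colon-separated path: authority, archive, series, item.
handle_parts = record["handle"][0].split(":")

print(record["author-name"])
print(jel_codes)
print(handle_parts)
```

Keeping every field as a list means repeated fields need no special casing, at the cost of indexing with [0] for fields that occur once.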
… describes persons (HoPEc)
template-type: ReDIF-Person 1.0
name-full: MANKIW, N. GREGORY
name-last: MANKIW
name-first: N. GREGORY
handle: RePEc:per:1984-06-16:N__GREGORY_MANKIW
email: [email protected]
homepage: http://post.economics.harvard.edu/faculty/mankiw/mankiw.html
workplace-institution: RePEc:edi:deharus
workplace-institution: RePEc:edi:nberrus
Author-Article: RePEc:aea:aecrev:v:76:y:1986:i:4:p:676-91
Author-Article: RePEc:aea:aecrev:v:77:y:1987:i:3:p:358-74
Author-Article: RePEc:aea:aecrev:v:78:y:1988:i:2:p:173-77
….
… describes institutions
Template-Type: ReDIF-Institution 1.0
Primary-Name: University of Surrey
Primary-Location: Guildford
Secondary-Name: Department of Economics
Secondary-Phone: (01483) 259380
Secondary-Email: [email protected]
Secondary-Fax: (01483) 259548
Secondary-Postal: Guildford, Surrey GU2 5XH
Secondary-Homepage: http://www.econ.surrey.ac.uk/
Handle: RePEc:edi:desuruk
summary about RePEc
• RePEc is not an open access archive.
• It is a free abstracting and indexing
dataset collected in a collaborative
fashion.
• It treats full-text locations as attributes of
the document descriptions.
RePEc and institutional repositories
• If a RePEc archive is augmented with full
text (which it can be), it is a true example
of an institutional repository, albeit
discipline-limited.
• RePEc is the living proof that an
institutional repository (IR) system can
thrive.
data and service providers
• In classical IR thinking there is a
distinction between data providers and
service providers.
• In RePEc many service providers act as
data providers. We have more of a peer
ecology than a standard institutional
repository system.
key to success
• Have a small group of volunteers. All are
technically competent. No “stakeholder
consultation” talk.
• Disseminate as widely as possible.
• Demonstrate to authors and institutions
that it works for them.
– institutional registration
– author registration
institutional registration
• It started with one sad geezer making a list
of departments that have a web site.
• I persuaded him that his data would be
more widely used if integrated into the
RePEc database.
• Now he is a happy geezer and one of our
three crucial volunteers.
author registration
• It started when funding allowed us to hire
a crazy programmer to write an author
registration system.
• system went online as "HoPEc" in late
2000.
• has been renamed "RePEc author
service" (RAS)
• In 2003 a grant from OSI allowed for a
rewrite and expansion.
RePEc Author Service
• RePEc document data has author names
as strings.
• The authors register with RAS to list
contact details and identify the papers they
wrote.
• This is classic access control, but done by
the authors.
• In a ranking of 100 most important
economists, over 80% are registered with
RAS.
authors' incentives
• Authors perceive the registration as a way
to achieve common advertising for their
papers.
• Author records are used to aggregate
usage logs across RePEc user services
for all papers of an author.
• Stimulates an “I am bigger than you are”
mentality. Size matters!
LogEc
• Despite the existence of many user
services, a central service collects usage
data from the most important ones.
• This data is then distributed to user
services. They can then assess usage
of an item globally.
• Created and maintained by Sune
Karlsson.
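The aggregation idea can be illustrated with a small sketch. It does not describe LogEc's actual implementation: the service names, the download counts, the author handle, and the second item handle are all made up for the example.

```python
from collections import Counter

# Hypothetical download counts per item, as reported by each user service.
service_logs = {
    "service_a": {"RePEc:sur:surrec:9601": 12, "RePEc:xxx:wpaper:0001": 3},
    "service_b": {"RePEc:sur:surrec:9601": 5},
}

# Hypothetical author record: author handle -> items the author has claimed.
author_items = {
    "RePEc:per:example_author": [
        "RePEc:sur:surrec:9601",
        "RePEc:xxx:wpaper:0001",
    ],
}


def global_usage(service_logs):
    """Sum the counts for each item across all reporting services."""
    total = Counter()
    for counts in service_logs.values():
        total.update(counts)  # Counter.update adds counts per key
    return total


def author_usage(author_items, total):
    """Sum global item usage over all items claimed by each author."""
    return {
        author: sum(total[handle] for handle in handles)
        for author, handles in author_items.items()
    }


total = global_usage(service_logs)
per_author = author_usage(author_items, total)
print(total)
print(per_author)
```

Because items are identified by the same handle in every service, summing across services is a simple keyed addition, and the author records make the per-author total a second sum over claimed items.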
NEP: New Economics Papers
• This service collects data on new working
papers in RePEc.
• It makes it available to editors to filter into
close to 100 subject-specific reports.
• Editors are aided by machine learning.
• Created in 1998 by yours truly and
maintained by yours truly.
• Server sponsored by Victoria University.
Citation in Economics CitEc
• CitEc is an autonomous citation system. We
download available full texts, convert them to
text, and parse the references to identify citations.
– 458247 documents processed
– 10900917 references found
– 4411202 citations found
• Created by Jose Manuel Barrueco Cruz in
1998 and maintained by him.
• Data is widely used across RePEc services.
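By the figures above, roughly 40% of the references found (4,411,202 of 10,900,917) were resolved to citations, that is, matched to a document in the dataset. The matching step could be sketched as follows. This is a deliberately naive illustration and not CitEc's actual method: it assumes a reference counts as a citation when its normalized text contains the normalized title of a catalogued document, and the reference strings are invented.

```python
import re


def normalize(text):
    """Lowercase and reduce every run of non-alphanumerics to one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_citations(references, catalog):
    """Return (reference, handle) pairs for references that match a title.

    catalog maps handles to titles. A reference matches when it contains
    a catalogued title after normalization (a simplifying assumption).
    """
    index = {normalize(title): handle for handle, title in catalog.items()}
    citations = []
    for reference in references:
        reference_norm = normalize(reference)
        for title_norm, handle in index.items():
            if title_norm and title_norm in reference_norm:
                citations.append((reference, handle))
                break  # at most one cited document per reference
    return citations


catalog = {"RePEc:sur:surrec:9601": "Dynamic Aspect of Growth and Fiscal Policy"}
references = [
    "Krichel, T. and Levine, P. (1996). Dynamic aspect of growth and fiscal policy.",
    "An unrelated reference (2001).",
]

citations = find_citations(references, catalog)
print(citations)
```

A real system would need fuzzier matching on authors, years, and titles, which is why only part of the references found end up as citations.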
CitEc and RAS
• RAS authors can claim citations from
CitEc data.
• They can verify that the association
between reference and cited document is
correct.
EconAcademics.org
• A new service by Christian Zimmermann,
sponsored by the St. Louis Fed.
• The service monitors blogs citing work in
RePEc using links to RePEc documents.
• The service encourages discussion of
research in RePEc and inbound links.
• Brings blog posts closer to formally
published items.
• This is very slick!
MPRA
• The Munich personal RePEc archive is a
repository for authors who are not affiliated
with institutions that have a RePEc
archive.
• Launched by Ekkehart Schlicht in 2006
and sponsored by Munich University,
based on EPrints software.
• Currently over 23000 items.
CollEc
• A full collaboration graph of the RePEc
Author dataset.
• It maintains about 400,000,000 shortest
paths in a rolling, continuously updated
system.
• Started by yours truly in 2006, fully
functional since 2012.
• Server sponsored by Symplectic.
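A shortest path in a collaboration graph links two authors through a chain of co-authors. The sketch below shows the idea with a breadth-first search. It is not CollEc's implementation, which has to maintain hundreds of millions of paths incrementally; the author labels and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def coauthor_graph(papers):
    """Build an undirected graph linking every pair of co-authors."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def shortest_path(graph, source, target):
    """Return one shortest path from source to target, or None."""
    if source not in graph or target not in graph:
        return None
    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical author lists, one per paper.
papers = [["A", "B"], ["B", "C"], ["D"]]
graph = coauthor_graph(papers)
print(shortest_path(graph, "A", "C"))
print(shortest_path(graph, "A", "D"))
```

Breadth-first search suffices because every co-authorship link has the same weight; an author with no co-authors, like "D" here, is unreachable from the others.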
RePEc genealogy service
• Another new service by Christian
Zimmermann and his team at the St. Louis
Fed.
• This builds a genealogy of RePEc authors
through the use of a crowd-sourcing tool.
• Only RAS registrants may contribute.
problems
• Over time, Google has become more slick
in pointing directly to full text rather than
to RePEc pages.
• While usage is still growing and RePEc is
still growing this is not reflected in the
usage numbers.
• RePEc has expanded its user services but
that may not be sufficient to guarantee
data growth.
ArchEc
• ArchEc, created by yours truly in 2012, is
an attempt to build a dark archive.
• Later, we can find agreements with archive
maintainers to make it a light archive, and
encourage links to ArchEc rather than to
the original item.
• As an unfunded initiative it may take many
months, if not years, to complete.
some thoughts
• When a new technology disrupts an
established social system based on an old
technology, change is slow.
• The main reason is that people are
prisoners of thinking in concepts the
relevance of which has passed.
overall move
• We are witnessing the transition from an
economy of information to an economy of
attention.
• Economy of information: information is
scarce, attention plentiful.
• Economy of attention: information is
plentiful, attention is scarce.
past thinking: peer review
• Peer review means reviewing material
prior to publication.
– This makes sense when publishing / updating
is expensive.
– It makes no sense when publishing / updating
is cheap.
• Usage-based evaluation is the way to go.
past thinking: “myth of industry”
• Data providers think of the data they have
collected as something they have to
control.
• Ask Greg for public domain metadata, no
response.
• No way to get a complete copy of
EconBiz.
• “Oh we have an API”…
RePEc vs. myth of industry
• RePEc has an ftp server with almost all
data it has.
• We aggressively try to give our data away
because we believe that this is what the
community wants.
• We work for the economy of attention.
• We need to get away from proprietary
silos.
example
• RePEc gives its data to the American
Economic Association.
• They run a multi-million dollar business out
of selling a product called “EconLit”.
• We give them our data for free. We get
nothing back.
• Well almost nothing…
past thinking: subscription model
• The distribution of scholarly material is
mainly supported via massive transfers of
funds from universities to toll-gating
publishers.
• Open access ventures get a tiny share of
these funds.
• As long as subscription persists, there will
not be much progress with open access.
problem 1 of subscription model
• The subscription model is a product of the
economy of information.
• But research is fundamentally conducted
to create attention to the university’s work.
• When a university buys access to toll-gated
material, it subsidizes attention to
research conducted by others.
• The subscription model is individually
irrational.
problem 2 of subscription model
• The subscription model is not only
individually irrational, it is also collectively
irrational.
• When all institutions switch funds from
closed to open access we need no
subscriptions any more.
Am I crazy?
• Money does not make the world go round. Ideas
do.
• When RMS proposed a free replacement for
UNIX in the early 80s, most people dismissed
the idea.
• Today it is reality!
• Similarly, when I started to work on RePEc, a
totally free and improved A&I dataset, in 1993,
nobody gave it a high probability of success.
• It is a reality!
http://openlib.org/home/krichel
Thank you for your attention!