University Information System RUSSIA: Bilingual

Moscow State University
Research Computing Center
NCO Center for Information Research
University Information System RUSSIA:
Bilingual (Russian - English) Search Tools to
Integrate Data and Knowledge Products
Prepared for IASSIST
Montreal, Canada
May 2007
by
Tatyana N. Yudina, Leading researcher, Ph.D. (history)
Moscow State University Research Computing Center
[email protected];
Anna Bogomolova, Assistant professor, Ph.D. (economics)
Moscow State University Economic Faculty
[email protected]
University Information System RUSSIA Holdings
20 GB / 2,000,000 documents

Collection | Source | Retrospective | Documents
Government Documents | State Agencies Publications; Kodeks Law Firm | 1990-… | 180,000
State Duma Daily Records | State Duma Analytical Department | 1994-… | 170,000
State Statistics | State Statistics Service; RF Ministries; CIS Interstate Statistics Committee | 1996-… | More than 160,000 tables
Mass Media | Expert Weekly; AiF; NG; Izvestia; Vedomosti… | 199(7)-… |
Academic Publications | MSU Bulletins, Economic Forecasting, Sociology Research, Law… | 2000… | 20,000
State Agencies Analytical Reports, Think Tanks Reports | Ministry for Ec. Dev. and Commerce; Central Bank; Accounting Chamber; Russian-European Center for Economic Policy; Fiscal Policy Center | 1996-… | 80,000
Surveys | National Survey on Households Well-being | 2003 | 800,000; 227 questions, 45,000 respondents from 46 regions
UIS RUSSIA Collections in English
- European Court of Human Rights archive,
- Council of Europe documents,
- Publications of Kennan Institute, USA,
- OECD Health Data,
- RePEc (Research Papers in Economics, www.repec.org) abstracts and full texts.
NLP Technology in UIS RUSSIA
[Diagram labels: holdings; converters; Automatic Linguistic Text Processing / Linguistic Processors; Administrator; ORACLE; WEB www.cir.ru (Apache; OAS); file types *.HTM, *.HDR, *.LEM, *.OUT, *.POD]
Automatic Linguistic Text
Processing (ALTP)
ALTP is customized to process by content, and to integrate into the Oracle-based information system, all main types of business-prose text corpora (documents and statistical data): government publications, daily records of the parliament chambers, think tank reports, scientific journals, mass media, and public opinion polls.
Content-based processing performs:
- Conceptual Indexing,
- Coherent Summarization,
- Text Categorization.
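As a rough illustration of what conceptual indexing can look like, the sketch below maps surface terms in a text to thesaurus descriptors and counts them. The synonym table, the descriptor names, and the function `conceptual_index` are hypothetical and chosen only for this example; the actual ALTP processors are not described in this presentation.

```python
import re
from collections import Counter

# Hypothetical synonym -> descriptor table (illustrative only, not UIS RUSSIA data).
SYNONYM_TO_DESCRIPTOR = {
    "budget": "STATE BUDGET",
    "state budget": "STATE BUDGET",
    "federal budget": "STATE BUDGET",
    "jobless": "UNEMPLOYMENT",
    "unemployment": "UNEMPLOYMENT",
}


def conceptual_index(text: str, max_phrase_len: int = 2) -> Counter:
    """Count descriptors found in text, preferring the longest matching phrase."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    i = 0
    while i < len(tokens):
        # Try the longest phrase first, then fall back to shorter ones.
        for n in range(max_phrase_len, 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in SYNONYM_TO_DESCRIPTOR:
                counts[SYNONYM_TO_DESCRIPTOR[phrase]] += 1
                i += n  # skip the tokens consumed by the match
                break
        else:
            i += 1  # no descriptor starts at this token
    return counts


index = conceptual_index("The federal budget targets jobless rates; the budget grew.")
```

Indexing by descriptor rather than by word means that documents using different wording for the same concept end up under the same index entry.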
THESAURUS
for Information Retrieval
in Sociopolitical Domain
- The main element of ALTP is the Thesaurus, a hierarchical network of 45,000 descriptors, 107,000 synonyms, and 175,000 relations;
- The Thesaurus supports query refinement, reformulation, and expansion;
- Thesaurus terminology covers 95-98% of contemporary business prose: government publications, academic papers, mass media texts, statistical indicators;
- The Thesaurus is translated into English.
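To make thesaurus-based query expansion concrete, here is a minimal sketch under stated assumptions: a toy two-entry thesaurus keyed by English descriptor, each with a Russian label, synonyms, and narrower descriptors. The entries and the functions `resolve` and `expand_query` are hypothetical; the presentation does not specify how the UIS RUSSIA Thesaurus is stored or traversed.

```python
# Hypothetical miniature thesaurus (illustrative only).
THESAURUS = {
    "taxes": {"ru": "налоги", "synonyms": ["taxation"], "narrower": ["income tax"]},
    "income tax": {"ru": "подоходный налог", "synonyms": [], "narrower": []},
}


def resolve(term):
    """Return the descriptor that a term names directly or as a synonym, else None."""
    term = term.lower()
    if term in THESAURUS:
        return term
    for name, entry in THESAURUS.items():
        if term in entry["synonyms"]:
            return name
    return None


def expand_query(term, depth=1):
    """Expand a term with its descriptor, Russian label, synonyms, and narrower descriptors."""
    root = resolve(term)
    if root is None:
        return [term]  # unknown term: search for it unchanged
    terms, seen, frontier = [], set(), [(root, 0)]
    while frontier:
        name, level = frontier.pop(0)
        if name in seen:
            continue
        seen.add(name)
        entry = THESAURUS[name]
        terms += [name, entry["ru"], *entry["synonyms"]]
        if level < depth:
            frontier += [(child, level + 1) for child in entry["narrower"]]
    return terms


expanded = expand_query("taxation")
```

Because each descriptor carries both a Russian and an English label, the same expansion step can produce search terms in both languages, which is one plausible way a translated thesaurus could support bilingual search.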
Query Refinement
Interactive Query Refinement
Bilingual Search in UIS RUSSIA
Value-added Services for Assisting
Research
UIS RUSSIA provides value-added services to assist research, including:
- Cross-collection content-based search across 60 holdings exploiting SSH and the thesaurus;
- Thesaurus-based navigation to form a query;
- Query refinement exploiting informers: document content, geography, and dates;
- An annotation (short summary) complementing each full-text document;
- RF state statistics converted into relational database format (power tables);
- MS Excel 97 format available for all statistical tables, including the tables presented in analytical reports and scientific journals;
- Links to the Methodological Notes and Glossary for statistics;
- Graphics- and map-based data representation;
- Archives in English available in the interface, with an informer (table of contents) in Russian;
- JEL (Journal of Economic Literature)-based search for think tank reports and academic publications in Russian and in English;
- Subject-oriented modules: databases and knowledge products;
- A relational database to integrate social, economic, and budget data at the federal, regional, and local levels;
- Household-level data on 46 regions of RF;
- Interface and search tools in Russian for collections in English;
- A full package of documents to complement cases on Russia at the European Court of Human Rights;
- User-tailored information updates by e-mail.
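The informer-based query refinement listed above resembles faceted search. The following sketch assumes that each search result carries topic, region, and year fields; the sample records, the field names, and the functions `informers` and `refine` are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical search results (illustrative only).
RESULTS = [
    {"title": "Regional budget report", "topic": "budget", "region": "Moscow", "year": 2005},
    {"title": "Household income tables", "topic": "living standards", "region": "Tatarstan", "year": 2005},
    {"title": "Budget execution review", "topic": "budget", "region": "Tatarstan", "year": 2006},
]


def informers(results, fields=("topic", "region", "year")):
    """Count the values of each informer field across the current result set."""
    return {f: Counter(doc[f] for doc in results) for f in fields}


def refine(results, **selected):
    """Keep only documents whose fields match every selected informer value."""
    return [d for d in results if all(d[k] == v for k, v in selected.items())]


summary = informers(RESULTS)
narrowed = refine(RESULTS, topic="budget", region="Tatarstan")
```

The counts show the user how the current results are distributed by content, geography, and date, and each selected value narrows the set further.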
Subject-Oriented Modules: Databases and
Knowledge Products
- RF: State Statistics;
- RF: Budget System;
- RF: Agriculture and Food Production;
- RF: Population and Living Standards;
- National Survey on Households Well-being (NOBUS);
- Human Rights: International Organizations Documents;
- RF: Regions, Municipalities, Households.
Information System
Maintaining European Court of Human Rights’ Documents
with Interface and Value-added User Services in Russian
Available at Public Domain Site www.echr-base.ru
The multifunctional information system regularly updates:
a) the full archive of European Court of Human Rights decisions and judgments;
b) the main documents of the Council of Europe, the United Nations Organization, and the Commonwealth of Independent States;
c) publications on human rights protection produced by partners in Russia, specialists in other countries, and the ECHR Secretariat.
The system provides a friendly interface and value-added user services, including:
- flexible search instruments in Russian;
- a profile in Russian to complement each of the European Court's documents;
- access to the European Court and Council of Europe documents available in Russian;
- a special module to monitor cases against Russia, with links to the national law at issue and publications on the topic;
- hyperlinks to case law (the Court's precedents).
The ECHR archive is legally obtained from the European Court Secretariat and is updated three times a year. The December 2006 archive covers 45,000 documents in English and in French.
A training module for comparative law studies is under construction.
UIS RUSSIA and Partners
- UIS RUSSIA is the core of a distributed network of knowledge products. In 2007-2008 all main think tanks' publications will be integrated. Most partners' archives are copied to the UIS RUSSIA server. Some partners provide direct access to their servers for registered UIS RUSSIA users. This approach is preferable for future UIS RUSSIA development;
- Partners include educational portals that maintain teaching publications in economics, social sciences, and management. A cross-server search procedure is under construction.
JEL/Journal of Economic Literature Classification
System-based knowledge products integration
Perspectives
Contents and Research-assisting Services. Interface in English
- A Regions of RF database with a developed GIS component and online analysis tools to support systemic and comparative studies at the federal, regional, local, and household levels;
- Integration of holdings and databases from partners in the regions (university and think tank publications) to maintain a distributed network of high-quality resources on RF;
- A new component to maintain reports and databases of the main international organizations (WB, IMF, UNESCO), with tools for international comparative studies;
- A Finland-Russia component for comparative economic and social studies at the federal, regional, and local levels, including harmonization of indicators.
Linguistic investigations
- A "Public administration" ontology (Russian-English) to integrate data and knowledge products.
UIS RUSSIA has operated as an electronic library for economic and social research since 2000. Access is free but requires registration.
Users come from all regions of RF:
- 400+ collective users: universities, higher education institutions, colleges, Russian Academy of Sciences institutes, think tanks, government agencies;
- 5000+ individual users.
The project started in 1988 at the USA & Canada Institute. Since 1997 the team has been based at the Moscow State University Research Computing Center and unites specialists from MSU faculties, other universities, and Russian Academy of Sciences institutes.
The UIS RUSSIA project is financed by grants from Russian and foreign foundations:
- Russian Fund for Basic Research, www.rfbr.ru;
- Russian Fund for Humanities, www.rfh.ru;
- MacArthur Foundation, USA, www.macfound.ru;
- Ford Foundation, USA, www.fordfound.ru;
- Eurasia Foundation, USA, www.eurasia.msk.ru;
and by contracts with RF government agencies, think tanks, and business.
http://uisrussia.msu.ru
http://www.budgetrf.ru
http://www.echr-base.ru
Thank you!