osmosis - National Library of New Zealand

OSMOSIS
A guide for New Zealand libraries
October 2010
www.natlib.govt.nz
22/10/2010
1
Table of Contents
1. INTRODUCTION
2. WHAT IS OSMOSIS?
2.1. DESCRIPTION OF OSMOSIS
2.2. BENEFITS OF OSMOSIS
3. OSMOSIS: A THREE STAGE PROCESS
4. STEPS IN THE THREE STAGE OSMOSIS PROCESS
5. STAGE 1: PRE-OSMOSIS STAGE
5.1. INTRODUCTION
5.2. COMPLETION OF FORMS (YOUR LIBRARY PROFILE)
5.3. EXTRACT CATALOGUE RECORDS
5.4. FTP TO THE NATIONAL LIBRARY SERVER
5.5. TMQ ACTIONS
5.6. FIRST FILE LOAD AND SUBSEQUENT LOADS
6. STAGE 2: OSMOSIS PROCESSING STAGE
6.1. INTRODUCTION
6.2. TMQ PROCESS HOLDINGS
6.3. TMQ PROCESS LIBRARY PULLS
6.4. TMQ PROCESS GLOBAL FIXES
6.5. TMQ PROCESS ERRORS AND DUPLICATES
6.6. OSMOSIS SOFTWARE PROCESSING
6.7. FIRST FILE LOAD AND SUBSEQUENT LOADS
7. STAGE 3: POST OSMOSIS STAGE
7.1. INTRODUCTION
7.2. REPORTS
7.2.1. Verification email
7.2.2. BATCHLOAD Processing Summary Report [BatchloadReport.xxx.txt]
7.2.3. OSMOSIS Pre-Processing Detailed Report [OSREPORT.xxx.txt]
7.3. LOGS
7.3.1. Verification Logs [MarcErr<date>xxx.txt]
7.3.2. Holdings Cleanup [HCRUNLOG.xxx.txt]
7.3.3. Global Fix Log [OGFIXLOG.xxx.txt]
7.3.4. Coding Errors Log [OXCODLOG.xxx.txt]
7.3.5. Bad Matching Errors Log [OXMATLOG.xxx.txt]
7.3.6. Duplicate Record Logs [DUPELOG-.xxx.txt]
7.4. FILES
7.4.1. Deleted Records <eDeleted.xxx.mrc>
7.4.2. Non-Bibliographic Records <eNonBib.xxx.mrc>
7.4.3. Holdings Clean Up File HCEXCMRC.xxx.mrc
7.5. NATIONAL UNION CATALOGUE / WORLDCAT UPDATED
7.5.1. First file load and subsequent loads
1. Introduction
This guide is for libraries participating in the OSMOSIS process. It describes OSMOSIS and
its benefits and outlines processing steps involved from the initial extract of the library
catalogue through to the update of the National Union Catalogue.
Its purpose is to:
• help libraries understand what OSMOSIS is and the stages in the OSMOSIS process
• provide step-by-step instructions and / or information for each of the 3 stages of
OSMOSIS
• provide lists and descriptions of the logs, reports and files generated from the
OSMOSIS process
• provide links to related documentation
• describe, in the last paragraph in each section, differences between processing for
the first file of the catalogue you provide and for subsequent files you send.
2. What is OSMOSIS?
2.1. Description of OSMOSIS
OSMOSIS is database processing software, developed by The Marc of Quality (TMQ),
which enables holdings from your library's catalogue to be moved across to the National
Union Catalogue with very little effort on your part.
OSMOSIS uses ‘sequential snapshots’ of your library’s catalogue to identify additions and
deletions of material held. Each snapshot represents a copy of your library’s catalogue at a
point in time. Snapshots are taken on a regular basis, and the holdings that you have
added and deleted between snapshots are recorded, processed and then loaded to
the New Zealand National Union Catalogue.
2.2. Benefits of OSMOSIS
For NUC users:
• Interloan is more efficient and effective for requesters and suppliers when your
library’s holdings are accurately represented on the New Zealand National Union
Catalogue. Suppliers receive fewer requests for material no longer held, and requesters
ask the correct suppliers for the material they require.
For your library users:
• When your users start their search in the National Union Catalogue and WorldCat,
Google or Yahoo, they will find accurate holdings information reflected there.
Costs to your library:
• The cost of producing a regular file of your library’s holdings may be less than the
cost of manually adding, maintaining or deleting your library’s holdings on
the NUC.
• If your library already sends monthly loads of new holdings or deletions to the NUC
for batch processing, using OSMOSIS would give a more accurate outcome and
therefore be more efficient.
• OSMOSIS automatically identifies when the last copy of a particular item has been
deleted from your library’s OPAC, so the deletion is transferred to the NUC.
For the National Library:
• Processing of batch files should speed up: each file should be significantly smaller
because it represents additions and deletions in the intervals between
snapshots, rather than complete updates.
• Deletes will be easier to process.
3. OSMOSIS: a three stage process
The OSMOSIS process follows a series of sequential steps that can be separated into 3 main stages.
1. Stage 1, the Pre-OSMOSIS stage, is the time when:
a. you update your library profile and forms;
b. you extract a copy of your library catalogue (bibliographic records and holdings) and send
it to the National Library FTP server.
2. Stage 2 is when TMQ runs the copy of your library catalogue through the OSMOSIS processing.
3. Stage 3, the Post-OSMOSIS stage, is when:
a. your library is provided with reports, logs and files that you can use to analyse and fix
bibliographic records in your catalogue. Any records you fix will improve the quality of your
library catalogue for local use, and the fixes will also be picked up in later OSMOSIS file loads.
b. the National Library is provided with a copy of your library holdings to update the National
Union Catalogue and WorldCat.
4. Steps in the three stage OSMOSIS process
The steps within each of the three stages of OSMOSIS Processing can be represented
diagrammatically as follows:
Figure 1: Steps in the Osmosis process
5. Stage 1: Pre-OSMOSIS Stage
Figure 2: Pre-Osmosis processing
5.1. Introduction
The Pre-OSMOSIS stage is the time when you extract a copy of your library catalogue and send it to
the National Library FTP server. The following steps are included in Stage 1.
5.2. Completion of forms (your library profile)
You will be emailed advice about the four forms in which you enter information about your library,
and where to find them. This library profile provides information to help TMQ process your catalogue.
5.3. Extract catalogue records
Use the documentation provided to you by your library system vendor to extract a copy of your library
catalogue. Bibliographic and holdings records are required.
5.4. FTP to The National Library Server
Name your file using the following convention: <YYMMDD.NUC.mrc>. Then FTP your extracted
catalogue to the National Library.
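The naming convention can be applied automatically when you save your extract. The following Python sketch is illustrative only. It assumes that YYMMDD is the two-digit year, month and day of the extract, and it keeps the NUC and mrc parts exactly as written in the convention above. The function name is hypothetical; confirm the exact convention with the National Library before relying on it.

```python
from datetime import date


def extract_filename(extract_date: date) -> str:
    """Build a file name of the form YYMMDD.NUC.mrc.

    Assumption: YYMMDD is the two-digit year, month and day of the extract.
    """
    return f"{extract_date.strftime('%y%m%d')}.NUC.mrc"


print(extract_filename(date(2010, 10, 22)))  # 101022.NUC.mrc
```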
5.5. TMQ actions
TMQ will collect your file from the National Library server and run it through a verification process to
ensure that your records can be read. TMQ will then send you an email about receipt, readability and
number of records.
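If you would like to check readability and record count yourself before sending a file, a rough check is possible. The sketch below is not TMQ's verification process, which this guide does not describe. It relies only on the MARC21 exchange format: the first five characters of each record's leader give the record length, and each record ends with the record terminator byte (hex 1D). The function name is hypothetical.

```python
RECORD_TERMINATOR = 0x1D  # MARC21 record terminator byte
LEADER_LENGTH = 24        # a MARC21 leader is always 24 characters


def count_marc_records(data: bytes) -> int:
    """Count the records in a MARC21 file held in memory.

    Raises ValueError at the first record that cannot be read.
    """
    count = 0
    pos = 0
    while pos < len(data):
        length_field = data[pos:pos + 5]
        if len(length_field) < 5 or not length_field.isdigit():
            raise ValueError(f"record {count + 1}: unreadable record length")
        end = pos + int(length_field)
        if (end - pos < LEADER_LENGTH or end > len(data)
                or data[end - 1] != RECORD_TERMINATOR):
            raise ValueError(f"record {count + 1}: length does not match data")
        count += 1
        pos = end
    return count


# A made-up 35-byte record: 24-byte leader, 10 bytes of content, terminator.
sample = b"00035" + b" " * 19 + b"x" * 10 + b"\x1d"
print(count_marc_records(sample * 3))  # 3
```

A count that differs from the number of records you expected to export, or an error partway through the file, suggests that the extract should be checked before it is sent.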
5.6. First file load and subsequent loads
• Whenever it is time for you to provide a subsequent file for OSMOSIS processing, the National
Library will email you a reminder to update your library profile (or to notify TMQ
and the National Library if there is no change). Changes may include:
o Your institution profile, e.g. contacts
o Changes in NUC symbols or internal agency codes
o Changes to your processing profile, e.g. a change to your library system vendor
o Changes to your library pulls (those items that will not require OSMOSIS processing)
Recommendation: Documenting your Process
We recommend you document the process so that it is easier the next
time you undertake the Pre-OSMOSIS stage.
6. Stage 2: OSMOSIS Processing Stage
Figure 3: Osmosis processing
6.1. Introduction
Stage 2, the OSMOSIS processing stage, is the time when TMQ processes your library catalogue for
holdings, library pulls, global fixes, duplicates and other errors. The following steps are included in this
stage:
6.2. TMQ Process Holdings
If your library catalogue has separate holdings records (i.e. MARC21 for Holdings Records), these will
be merged into the bibliographic record so that they can be processed. This step will also translate
your library's internal holdings codes into the NUC symbol. Information you provided in your library
profile (see 5.2 Completion of forms) is used for this step.
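The code translation in this step can be pictured as a lookup against the translation table from your library profile. The Python sketch below is a simplified illustration, not TMQ's implementation. The internal codes and NUC symbols shown are invented, and the function name is hypothetical. Codes missing from the table are collected separately, in the same spirit as the invalid holdings codes listed in the Holdings Cleanup log (7.3.2).

```python
def translate_holdings(codes, translation_table):
    """Translate internal holdings codes into NUC symbols.

    Returns (translated_symbols, invalid_codes). A code is treated as
    invalid when it does not appear in the translation table.
    """
    translated = []
    invalid = []
    for code in codes:
        symbol = translation_table.get(code)
        if symbol is None:
            invalid.append(code)
        else:
            translated.append(symbol)
    return translated, invalid


# Invented example values; real codes come from your library profile.
table = {"MAIN": "SYMBOL-A", "BRANCH1": "SYMBOL-B"}
symbols, invalid = translate_holdings(["MAIN", "BRANCH1", "STORE"], table)
print(symbols)  # ['SYMBOL-A', 'SYMBOL-B']
print(invalid)  # ['STORE']
```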
6.3. TMQ Process Library Pulls
This step removes the categories of records that you have identified in the forms as not to be
loaded to the NUC. You will not receive any reports, logs or files for these records.
6.4. TMQ Process Global Fixes
Global fixes are made to the bibliographic records where there are inconsistencies in formatting or
MARC coding, e.g. correction and normalisation of control numbers, repair of problems in the leader,
and correction of title filing indicators.
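As an illustration of one kind of global fix, consider title filing indicators. In MARC21 the second indicator of field 245 records the number of nonfiling characters at the start of the title, so "The hobbit" has a value of 4 (the article plus the following space). The Python sketch below shows how such a value might be derived. It is an assumption-laden example, not TMQ's rule set: it handles only three English articles, and the function name is hypothetical.

```python
# Assumption: only these English articles are treated as nonfiling.
ARTICLES = ("the", "an", "a")


def nonfiling_characters(title: str) -> int:
    """Return the number of leading characters to skip when filing a title."""
    lowered = title.lower()
    for article in ARTICLES:
        prefix = article + " "
        if lowered.startswith(prefix):
            return len(prefix)
    return 0


print(nonfiling_characters("The hobbit"))  # 4
print(nonfiling_characters("An atlas"))    # 3
print(nonfiling_characters("Atlas"))       # 0
```

A simple rule like this misfiles titles in which the first word only looks like an article (for example "A to Z"), which is one reason fixes of this kind are reported in a log for you to review.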
6.5. TMQ Process Errors and Duplicates
Bibliographic records are processed to identify records that are duplicates or have other major errors.
Those records are reported to your library and will not be loaded to the NUC until you fix them. You
can fix them at any time, and the fixes will be picked up in a later OSMOSIS process.
6.6. OSMOSIS Software Processing
The OSMOSIS software process is run on subsequent loads of your catalogue. It identifies records
that have been added to or deleted from your library catalogue since the previous file of catalogue
records was sent.
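Conceptually, comparing two snapshots is a set comparison. The Python sketch below illustrates the idea using record identifiers only. It is a simplification: this guide does not describe how the OSMOSIS software actually matches records, and the identifiers and function name here are hypothetical.

```python
def compare_snapshots(previous_ids, current_ids):
    """Return (adds, deletes) between two catalogue snapshots.

    adds:    identifiers present now but not in the previous snapshot
    deletes: identifiers in the previous snapshot but no longer present
    """
    previous = set(previous_ids)
    current = set(current_ids)
    adds = sorted(current - previous)
    deletes = sorted(previous - current)
    return adds, deletes


adds, deletes = compare_snapshots(["b1", "b2", "b3"], ["b2", "b3", "b4"])
print(adds)     # ['b4']
print(deletes)  # ['b1']
```

This also shows why each file you send is retained: without the previous snapshot there is nothing to compare the new one against.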
6.7. First file load and subsequent loads
The first file load you send:
• is tested, and if there are issues you could be asked to extract your library catalogue again
(see Section 5.3).
• is retained by TMQ so that adds and deletions in the next file you provide can be
identified. (The next and subsequent loads are also retained.)
7. Stage 3: Post OSMOSIS Stage
Figure 4: Post Osmosis processing
7.1. Introduction
Stage 3, the Post OSMOSIS stage, is the final stage of OSMOSIS. This is when you receive logs, reports
and files that enable you to fix error records and so improve your local catalogue. TMQ will put
OSMOSIS reports, logs and files on a server where they will be available for you to collect.
Stage 3 is also the stage when the National Union Catalogue is updated with your holdings. You will
receive a confirmation email when this is complete.
This section contains information on the reports, logs and files generated from OSMOSIS processing
and briefly describes the update of the National Union Catalogue and WorldCat.
7.2. Reports
Description: Reports mainly contain descriptive and statistical data.
7.2.1. Verification email
The Verification email is the first report you receive and it contains statistics about the file you FTPed
to the National Library e.g. date of receipt and number of records in the file.
7.2.2. BATCHLOAD Processing Summary Report [BatchloadReport.xxx.txt]
A statistical summary report called the "Batchload Processing Summary Report" pulls everything
together at the end of the batchload pre-processing and reports on:
• the number of bibliographic (and, if necessary, holdings) records sent and received
• the number of bibliographic records fixed
• the number of bibliographic records pulled as errors
• the number of bibliographic records identified as Adds and Deletes
• a list of the reports, logs, and files generated by the processing
• links to where the library can find the reports, logs, and files generated
The "Batchload Processing Summary Report" is sent to the batchloading library just before the
processed records are sent to the National Union Catalogue for loading.
7.2.3. OSMOSIS Pre-Processing Detailed Report [OSREPORT.xxx.txt]
The OSMOSIS Pre-Processing Detailed Report lists the specific types and numbers of:
• fixes done and printed in a log
• records pulled but not printed in a log (records that you requested not be sent to the NZNUC)
• possible duplicate records pulled and printed in a log
• true duplicate records printed in a log, but not pulled
• records with invalid coding pulled and printed in a log
• records with bad matching errors pulled and printed in a log
7.2.3.1. First file load and subsequent loads [OSREPORT.xxx.txt]
For the first file you provide, the OSREPORT contains the number of records which will go to
the National Union Catalogue. For subsequent files, it contains the number of records which are
Adds and Deletes to be processed onto the National Union Catalogue.
7.3. Logs
Description: Logs generally explain a processing activity that may have caused an error and list
the System Identification numbers (Sys IDs) of the records that caused it.
7.3.1. Verification Logs [MarcErr<date>xxx.txt]
If any records in your file cannot be verified, they are dumped to text files named
MarcErr<date>xxx.txt, together with the record sequence number. Verification errors can be caused
by corrupt data in the records, bugs in export software, etc. Ideally these records should be
replaced in your catalogue.
7.3.2. Holdings Cleanup [HCRUNLOG.xxx.txt]
The Holdings CleanUp statistical report is output only for libraries sharing a catalogue; it contains:
• the number of records without any holdings information;
• a list of invalid holdings codes (codes not provided on a translation table);
• a count of the outgoing codes (internal codes translated to NUC symbols).
7.3.3. Global Fix Log [OGFIXLOG.xxx.txt]
The Global Fix Log contains the Sys IDs of the records that were fixed prior to sending to the
National Union Catalogue. An explanation of the errors is provided.
7.3.4. Coding Errors Log [OXCODLOG.xxx.txt]
The Coding Errors Log provides the Sys IDs of records that contain invalid MARC coding and so will
not be sent to the National Union Catalogue until they are fixed. An explanation of the errors is
provided.
7.3.5. Bad Matching Errors Log [OXMATLOG.xxx.txt]
The Bad Matching Errors Log provides the Sys IDs of records that could prevent matching or cause
bad matches to occur and so will not be sent to the National Union Catalogue until they are fixed. An
explanation of the errors is provided.
7.3.6. Duplicate Record Logs [DUPELOG-.xxx.txt]
The Duplicate Record Logs are provided for each category of duplicate match keys found. They
contain:
• True duplicates: Sys IDs, match keys and titles of the records identified by TMQ algorithms as
truly duplicate records that have been merged and sent on to the NZNUC
• Possible duplicates: Sys IDs, match keys and titles of the records identified by TMQ algorithms as
possible duplicate records that will not be sent on to the NZNUC until you resolve them
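When working through a duplicate log, it can help to group the listed records by match key so that each set of candidate duplicates can be reviewed together. The Python sketch below shows one way to do this. It is illustrative only: the match keys and Sys IDs are invented, the function name is hypothetical, and nothing here reflects how TMQ's algorithms build match keys or distinguish true from possible duplicates.

```python
from collections import defaultdict


def group_by_match_key(records):
    """Group Sys IDs by match key and keep keys shared by several records.

    records: iterable of (sys_id, match_key) pairs.
    """
    groups = defaultdict(list)
    for sys_id, match_key in records:
        groups[match_key].append(sys_id)
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Invented example data.
log_entries = [
    ("1001", "key-a"),
    ("1002", "key-b"),
    ("1003", "key-a"),
]
print(group_by_match_key(log_entries))  # {'key-a': ['1001', '1003']}
```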
7.4. Files
Description: Files contain actual MARC records.
Files available to you in the MARC21 format are:
7.4.1. Deleted Records <eDeleted.xxx.mrc>
If any records in your file are marked Deleted (Leader RecStat='d'), they are pulled and dumped to a
file called 'eDeleted.xxx.mrc'. This file can be checked against your catalogue. Sometimes the records
may simply contain coding errors in the leader.
7.4.2. Non-Bibliographic Records <eNonBib.xxx.mrc>
If any records in your file contain a Record Type code that is not defined by the MARC21 Bibliographic
Format, they are pulled and dumped to a file called 'eNonBib.xxx.mrc'. This file can be checked
against your catalogue. Sometimes the records may simply contain coding errors in the leader.
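Both the Deleted Records check (7.4.1) and the Non-Bibliographic Records check are made against the record leader. In MARC21, leader position 05 holds the record status and leader position 06 holds the type of record. The Python sketch below shows how a leader could be classified along those lines. It is a sketch under stated assumptions: the function name and the returned labels are hypothetical, and the order of the two checks is assumed, since this guide does not say which pull happens first.

```python
# Type of record codes (leader/06) defined by the MARC21 Bibliographic Format.
BIBLIOGRAPHIC_TYPES = set("acdefgijkmoprt")


def classify_leader(leader: str) -> str:
    """Classify a record from its leader as 'deleted', 'non-bib' or 'ok'."""
    if len(leader) < 7:
        raise ValueError("leader is too short to classify")
    if leader[5] == "d":  # record status: deleted
        return "deleted"
    if leader[6] not in BIBLIOGRAPHIC_TYPES:
        return "non-bib"
    return "ok"


print(classify_leader("00714cam a2200205 a 4500"))  # ok
print(classify_leader("00714dam a2200205 a 4500"))  # deleted
print(classify_leader("00714nz  a2200205n  4500"))  # non-bib
```

Running a check like this over your own extract before sending it can show whether records are likely to land in either file because of leader coding errors rather than because they are genuinely deleted or non-bibliographic.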
7.4.3. Holdings Clean Up File <HCEXCMRC.xxx.mrc>
This file contains a list of records without holdings and those with invalid holdings codes identified in
the Holdings Cleanup step (6.2). This file is large and only available on request.
7.5. National Union Catalogue / WorldCat updated
The National Union Catalogue is updated with your holdings additions and deletions after all TMQ
processing is complete. You will be sent a confirmation email once the update is complete. The updates
will flow through to WorldCat in the files sent to OCLC each working day.
7.5.1. First file load and subsequent loads
The first file you provide will be loaded to the NUC as a complete refresh of your library holdings.
Subsequent files will be loaded as holdings Adds and Deletes.
#284524 National Digital Library-Customer Engagement
Toll free 0508 Te Puna (0508 83 7862) Ph +64 4 474 3000 Fax +64 4 474 3042
E-mail [email protected]
http://www.natlib.govt.nz/librarians
National Library of New Zealand Te Puna Mātauranga o Aotearoa