PATCO Digitization Procedures
Georgia State University Library
Digitization of the PATCO Records funded in part by the NHPRC

Planning

Project Timeline

0-3 months
Project Manager submits the position announcements to Library Administration and initiates the search for the Scanning Tech Assistant, Archival Assistant, and Student Assistant positions.
Project Manager sets up regular project meetings and meets with the project team (all Project Staff listed) to finalize project dependencies and establish a Gantt chart of tasks and milestones.
2 review meetings (6-week interval) by project team with review of milestones.
Library Technical Assistant (LTA) and Project Archivist set up the process for statistical records, metadata template, and workflow.
Project Archivist begins preparing Series folders for workflow.
Project Manager sets up spreadsheet for metadata.
LTA begins test scanning and testing project workflow.
Project Archivist trains newly hired Archival Assistant on folder preparation, digitization workflow, and image/metadata management.
Digital Projects Librarian trains newly hired LTA and student assistants on scanning procedures and image/metadata management.
LTA meets with student assistants and establishes a scanning schedule.
Student assistants begin scanning.
Near end of the period: 3-month project review meeting by project team.

3-6 months
2 review meetings (6-week interval) by project team with review of milestones.
Ongoing scanning of documents by student assistants, LTA.
Ongoing folder preparation and management by archival assistant.
LTA and archival assistant begin checking images, correcting for skew and cropping, and migrating images and metadata into the spreadsheet.
Continuing project review meetings by project team.
LTA begins final checking and uploading of images and metadata from spreadsheets into CONTENTdm.
Coordinator for Arrangement and Description begins approval and publishing of records in CONTENTdm.
Near end of the period: 6-month project review meeting by project team.

6-9 months
2 review meetings (6-week intervals) by project team with review of milestones.
Ongoing scanning of documents by student assistants, LTA.
Ongoing final checking and uploading of images/metadata from spreadsheets into CONTENTdm.
Near end of the period: 9-month project review meeting by project team.

9-12 months
2 review meetings (6-week intervals) by project team with review of milestones.
Ongoing scanning of documents by student assistants, LTA.
Ongoing final checking and uploading of images/metadata from spreadsheets into CONTENTdm.
Near end of the period: 12-month project review meeting by project team.

12-15 months
Completed scanning of documents by student assistants, LTA.
LTA meets with student assistants about finalizing project work.
2 review meetings (6-week intervals) by project team with review of milestones.
Ongoing final checking and uploading of images/metadata from spreadsheets into CONTENTdm.
Ongoing approval and publishing of records in CONTENTdm.
Near end of the period: 15-month project review meeting by project team.

15-18 months
Web Development Librarian customizes home page/landing page in CONTENTdm for project collection.
Testing of home/landing page for project.
Near end of the period: 18-month project review meeting by project team.

18-20 months
Final checking, uploading, and approval of records.
Final project review meetings by project team.
Launch of home/landing page.

Ongoing scanning of documents by student assistants, LTA.

Document Preparation

Before handling any documents, make sure that your hands are clean and dry, free of dirt, food, lotion, and moisture. If you need to handle photographs, use gloves.
Go through each page of each file separately, keeping all pages in order.
Inspect for mold, damage, and fragility.
Using a microspatula, remove all metal (paperclips, staples) extremely carefully. There should be no damage done to any document during this process.
If something seems stuck or a document seems too fragile, consult the archivist for advice.
For all pages organized into groups (by paperclip, staple, etc.), either place the group inside a folded sheet of bond paper or group the pages using a small slip of bond paper and a stainless steel paperclip.

Scanning

Guidelines for Scanning

Before handling any documents, make sure that your hands are clean and dry, free of dirt, food, lotion, and moisture. If you need to handle photographs, use gloves.
Remove one folder at a time from the box you’re working on, and place an out card or placeholder where you have removed the folder.
Each folder in each box will be given a name using the OPUS software with the following naming convention:
  o The project name is PATCO [Series number].
  o The object name is PATCO_[Series number]_[Box number]_[Folder number], so folder one in box one of the first series for PATCO is PATCO_01_01_01.
Keep all pages in order; turn pages as if you are reading a book.
At all times, documents should be on the scanner or in the folder. Do not place them elsewhere around the scanning area; if they become misplaced, it will be very difficult to determine which folder they should be returned to.
Scan the front of the file folder (with as much of the collection, box, and folder information visible as possible) as the first page of each new object.
Be sure to scan the complete file, including any pages with even a small amount of writing; skip all blank pages, however.
Do not scan extra copies of items in the folder. Skip only exact copies; if a document contains handwritten notes, a slightly different layout, or other small changes, it must be scanned.
Scan each page separately in the center of the bed.
For books, booklets, or pamphlets, open the document and place the spine on the seam of the bed.
For pages or booklets that do not lie flat, adjust one bed or both beds and close the top cover of the scanner.
Be sure to adjust scanner settings as you go to optimize results.
Books, booklets, pamphlets, or other publications with corporate or government publishers and/or that are under copyright should not be scanned in their entireties; only PATCO-originated documents, reports, and publications should be scanned.
  o The covers of books, booklets, and pamphlets will be scanned so that their placement in the folder can be noted.
  o News clippings and magazine articles will not be scanned in their entireties, only PATCO-originated documents, reports, and publications. The titles and publication information of news clippings and magazine articles will be scanned and the body of the article will be covered or redacted. It will be accompanied by this note: This item was not scanned in its entirety because Georgia State University does not hold copyright. You may be able to obtain a copy of this document at your local library. PATCO Records Digitization Project, Georgia State University Library.
Personal, sensitive information is skipped, including social security numbers, credit card numbers, medical evaluations, etc. When you encounter such information, mask it using the statement sheet explaining that GSU is not scanning it, or mark it for redaction.

PATCO Scanning Instructions
Using OPUS FreeFlow Software with a Bookeye 3 Scanner

1. After logging into the computer station, turn on the Bookeye scanner by pressing and holding the Start button. Release when startup begins. The scanner must be on before the software can open.
2. Open the OPUS FreeFlow software by clicking the icon.
3. Maximize the window and select “New” to create a new item.
4. In this window you assign a Project and Object Name to the new item. The Object Name will become the name of the future derivative file, so it is important to follow the correct file naming convention. The ID number is assigned by OPUS and is how the original TIFF scans are named.
5. Click OK.
6. Raise the book cradle to the top position for scanning paper, or adjust it accordingly for a bound volume, using the corresponding arrow buttons on the scanner.
7. To scan, use the “Scan Now” button in the software (when it is green), the Start button on the scanner, or the foot pedal. Allow each scan time to process so that the Scan Status says Ready. The scanner automatically crops items, but this is often imperfect as seen in the example below.
8. If necessary, use the Setup menu to adjust the color or black/white preferences and specify whether you are using the glass, among other things. This can also be done using the scanner directly.
9. If you missed something or something scanned incorrectly, you can DELETE, REPLACE, or INSERT (before or after) the currently selected image (noted by a red border around the thumbnail at the bottom when selected).
10. When scanning is complete, select the Image Treatment tab. The software automatically performs “two up” image splitting, book-fold correction, cropping, de-skewing, border adjustment, and fan and gutter elimination. Click on the small page noted above to further manipulate the clips for each scan. Here you can crop, skew, and rotate your image as needed. Book curvature produced from scanning a tightly bound book may be realigned using the OPUS book curve correction tool.
11. Click “Perform IT on All Images” when you are finished. A green progress bar will let you know when this stage is complete. If you place no clips on an image, you still have to “Perform IT” to send the unedited image to the Export stage. Do this if the scan requires no cropping.
12. When complete, review your images to ensure proper image treatment before exporting. You can go back and forth between the Image Treatment and Export tabs as much as needed, performing IT on any image that requires adjustment until you are satisfied with the end result. Use “Perform IT on Current Image” if you need to change only one. Click the “Export Images” button when ready.
13. Here you specify your file type and location. “Choose Base Output Directory” is where your derivative file will be sent. You can change this setting each time as needed, down to a specific folder level. The file will end up in a new folder named with its OPUS ID number. Choose options here to create derivative PDF-A files (for upload to CONTENTdm) using JPEG compression at 300 DPI resolution with 24-bit color depth (RGB) and JPEG image format at 80 quality. OCR and intelligent interpretation are performed during derivative creation, so that the PDF is full-text searchable.
14. When finished, you must click back to the SCAN tab in order to close the item. The newly created item will remain in the MANAGE queue on the local hard drive for 7 days before it is deleted.

File Management and Metadata

File Management

It is best to copy all necessary files to the dark archive drive immediately after creation.
Derivative files can be found in the chosen output location.
Keep all TIFF files in one folder and all PDF files in a separate folder.
  o The original TIFF scans can be found organized by OPUS ID number in the Active Object Hive folder of the Working Data folder in the OPUS program folder. Move full scans to a new file folder for each folder of material. Include a copy of the metadata file created during scanning with the files.
  o For PDF derivatives, move the file from your chosen output location into a new file folder for each box of material.

File Checking Protocol

For PDF files (to be uploaded to CONTENTdm):
  o Check the file to ensure agreement with the folder name on the first scanned image of the file (file folder cover).
  o Skim each page for personal information (evaluations, private information, medical information), social security numbers, credit card numbers, copyrighted works, and duplicates.
  o Check that the number of scanned images agrees with the number of pages listed on the spreadsheet.
  o Mark any items on the spreadsheet for filename corrections or for redaction (for redaction, provide the page/scan number).
For TIFF files (for backup/archival storage):
  o Check the file to ensure agreement with the folder name on the first scanned image of the file (file folder cover).
  o Check that the file name is correctly configured (make sure the file name corresponds to the individual TIFF images).
  o Indicate if the TIFF has more than one copy of the scan.
  o Mark any items on the spreadsheets for discrepancies.

Metadata

Fill out a spreadsheet containing the following metadata fields for each object:
  o Identifier – [object name]
  o Title – [folder title]
  o Date of Original – [folder date]
  o Decade – [from folder date]
  o Description – [modified description from series scope and content note]
  o Creator – Professional Air Traffic Controllers Organization
  o Contributors – [blank]
  o Digital Publisher – Georgia State University
  o Curatorial Area – Southern Labor Archives
  o Collection – Professional Air Traffic Controllers Organization
  o Series – [series number and description]
  o Rights Information – [as listed on finding aid]
  o Citation – [as listed on finding aid]
  o Ordering – [blank]
  o Language – English
  o WAV – [blank except for audio files]
  o MP3 – [blank except for audio files]
  o Location Depicted – United States
  o Subject – [subject terms]
  o Subject (Depicted) – [blank]
  o Subject (Name) – Professional Air Traffic Controllers Organization (Washington, D.C.)
  o Source Format – Files (document groups)
  o Source Type – text
  o Source Dimensions – [page count]
  o Note – [blank]
  o Relation – [shortened, modified description from series scope and content note]
  o Publication History – [blank]
  o Format – text/pdf
  o Full Text – [from OCR performed on image during scanning process]
  o Other Format – tiff
  o Object File Name – [object name].pdf

Upload

Upload to CONTENTdm

1. Copy and paste a small set of PDF files (8-10) into an empty folder on your desktop.
2. Prepare the metadata:
  a. Copy and paste the metadata corresponding to the copied PDF files (with the header row) into a new spreadsheet.
  b. Save/export the spreadsheet as a tab-delimited text file.
  c. Open the text file and remove all quotation marks; the fastest way to do this is to perform a “Replace All” search that replaces quotation marks with nothing.
  d. Remove any additional spaces that may be at the end of the document.
  e. Save the document.
3. Open the CONTENTdm Project Client, and open the Professional Air Traffic Control Organization (PATCO) project.
4. Select the “Add Multiple Items” option on the right menu.
5. Choose the “Import using a tab-delimited text file” option and select the correct file.
6. Choose the “Import files from a directory” option and select the appropriate folder from your desktop.
7. When asked, “Do you want CONTENTdm to generate display images from items you import?”, select yes.
8. Ensure that all metadata fields correlate correctly on the Map Metadata Fields screen.
9. Click the “Add Items” button.
10. The progress of the uploads will display as a moving bar.
11. After the upload is displayed in the client, right-click on the automatically generated thumbnail, click the “Replace Thumbnail” option, and choose the PATCO thumbnail JPEG file.
12. Check the information for each uploaded file separately using the metadata spreadsheets.
13. Select all files and click the Upload for Approval button.
14. The coordinator for the project will log into the CONTENTdm Administration site. Under the “Items” tab are the options to “Approve” and “Index”, which must occur in that order. Recently uploaded content can be checked at the Approval stage and edited if necessary before being approved. Content will be present and searchable in the collection after a successful indexing.

Long-term Digital Preservation

Digital Preservation Plan

Written by Melanie L. Maxwell, PATCO Library Tech, Digital Projects, January 30, 2013
Updated by Jeremy T. Bright, PATCO Library Tech, Digital Projects, October 15, 2013

In keeping with best practices for digital file permanence, Digital Projects has been aware, since being awarded the PATCO grant, of the need for redundancy and bit checking of our dark archive, adequate storage to maintain and grow our digital collections, consistency in our digital file naming conventions and formats, and eventual format migration to avoid file obsolescence.

Development of a Comprehensive Preservation Plan
The GSU Library is currently working to develop a library-wide, long-term digital preservation plan. The need to manage the significant amount of digital content generated by the PATCO project served as a catalyst for creating a plan to manage our ever-increasing digital content.

Redundancy
In December 2012, our network systems administrator (Trevor Sookdeo) organized official contracts for off-site cloud storage by Peachnet and a more reliable local backup in the basement of the Library South building for the network drive containing our archived digital files (T:). At the time of this report, library staff are investigating geographically distributed storage options.

Bit Check
When verifying a sync, the local backup runs an MD5 cryptographic hashing function on every file, creating a 128-bit hash value that is used for the file comparison.

Storage
Currently the network drive (T:) has a holding capacity of 19 TB. At the time of this report, the T: drive has 8.2 TB of used space, leaving 10.8 TB free for additional projects.

Naming Conventions
Digital Projects applies its own naming conventions to its mass digital projects. The PATCO files are named according to archival series/box/folder (PATCO_01_01_01), while periodicals (The Great Speckled Bird and The Signal) are named according to date published (GSB_1968_3.15 and GSUS1979-04-13, respectively).

File Formats
Digital Projects keeps the original, uncompressed TIFF scan and a PDF derivative of its digital files.
These files are organized into mirrored folders by format type. Individual TIFF pages are renamed to reflect the PDF file name because our scanning software cannot name the individual TIFF pages intelligently.

Format Migration
The Digital Projects Coordinator will keep current on which file formats are reaching obsolescence and migrate as appropriate. A spreadsheet of each digital collection and its archived formats is actively managed. The level of support for each format, whether we commit to preserving it or decide to let it go, is documented.
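
As an illustration of the bit check described in this plan, the following is a minimal Python sketch of an MD5-based comparison between a primary folder and its backup copy. It is not the backup system's actual implementation, which the plan does not describe; the function names md5_of_file and compare_trees and the example paths are hypothetical and chosen only for this sketch.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 digest of one file as 32 hex characters (128 bits)."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        # Read in chunks so that large TIFF scans never have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(primary: Path, backup: Path) -> dict[str, list[str]]:
    """Compare every file under `primary` with its counterpart under `backup`.

    Returns the relative paths that are missing from the backup and the
    relative paths whose MD5 digests differ between the two copies.
    """
    report = {"missing": [], "mismatched": []}
    for source in sorted(p for p in primary.rglob("*") if p.is_file()):
        relative = source.relative_to(primary)
        copy = backup / relative
        if not copy.is_file():
            report["missing"].append(relative.as_posix())
        elif md5_of_file(source) != md5_of_file(copy):
            report["mismatched"].append(relative.as_posix())
    return report


# Hypothetical usage; both paths are assumptions, not the Library's real layout:
# report = compare_trees(Path("T:/PATCO"), Path("D:/backup/PATCO"))
# print(report["missing"], report["mismatched"])
```

An empty report means every file in the primary folder has a byte-identical copy in the backup. MD5 is adequate for detecting accidental corruption of this kind, although it is not considered secure against deliberate tampering.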