E-Text - Accessing Higher Ground

Processing PDF:
How to Go from PDF to
E-text to Audio
Gaeir Dietrich
Director
High Tech Center Training Unit
of the California Community Colleges
Foothill Community College District
PDF from Publishers
Portable document format (PDF)
Reads the same on any computer
Looks like the book
Smaller than TIFFs
Contains all the text

– Always check to make sure the book is the
right one!

Easy for publishers
Requesting through ATN

Access Text Network
– Now free for requesting files from ATN-member publishers
– Paid membership to exchange files
– www.accesstext.org

Not all publishers
– But ATN does have the largest ones
Other Resources at ATN

Accessible Textbook Finder
– http://www.accesstext.org/atf.php

Link to Publisher Lookup
– http://www.publisherlookup.org/
– Will have to contact non-ATN member
publishers directly
Using Publisher PDFs
Sometimes students can use files directly
Most often files will need further processing for student use
At the very least, large files need to be broken into chapters
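
Breaking a large file into chapters starts with working out which pages belong to which chapter. The following is a minimal sketch of that bookkeeping only, assuming you already know the page on which each chapter begins (for example from the table of contents). The function name `chapter_ranges` and its inputs are hypothetical and not part of any Adobe tool; the actual extraction would still be done in a program such as Acrobat Professional.

```python
def chapter_ranges(start_pages, total_pages):
    """Return (first_page, last_page) for each chapter, 1-based and inclusive.

    start_pages: page on which each chapter begins (hypothetical input,
                 e.g. read off the table of contents).
    total_pages: number of pages in the whole PDF.
    """
    starts = sorted(set(start_pages))
    if not starts or starts[0] < 1 or starts[-1] > total_pages:
        raise ValueError("chapter start pages must lie within the document")
    ranges = []
    for i, first in enumerate(starts):
        # A chapter ends one page before the next chapter starts;
        # the final chapter runs to the end of the file.
        last = starts[i + 1] - 1 if i + 1 < len(starts) else total_pages
        ranges.append((first, last))
    return ranges


if __name__ == "__main__":
    for first, last in chapter_ranges([1, 25, 60], 100):
        print(f"pages {first}-{last}")
```

Note that any pages before the first listed start page (front matter, for instance) are not included in a range, so list page 1 as a start page if they should be kept.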

PDF Strengths

Good format for large print
– Cropping
– Fit to page on large pages
– Print sections on large pages (tiling)

Adobe Reader has some nice features
– Change colors
– Reflow
– Limited voicing

Easy for most publishers to create
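
For tiling, it can help to estimate in advance how many sheets one enlarged page will take. The sketch below is only that arithmetic, assuming no overlap between tiles and no printer margins (real print dialogs usually add both, so the true count may be higher). The function name `tiles_needed` and its parameters are hypothetical.

```python
import math


def tiles_needed(page_w, page_h, scale, sheet_w, sheet_h):
    """Return (columns, rows, total sheets) needed to print one enlarged page.

    page_w, page_h:   original page size (any unit, e.g. inches)
    scale:            enlargement factor (2.0 means 200%)
    sheet_w, sheet_h: size of the paper being printed on (same unit)
    Assumes no tile overlap and no unprintable margins.
    """
    cols = math.ceil(page_w * scale / sheet_w)
    rows = math.ceil(page_h * scale / sheet_h)
    return cols, rows, cols * rows


if __name__ == "__main__":
    # A letter-size page enlarged to 200% and printed on letter-size sheets.
    print(tiles_needed(8.5, 11, 2.0, 8.5, 11))
```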
PDF Weaknesses

Not always fully accessible
– Screen readers do not always like them, even when they are text-based
– Reading order can be problematic
May be graphics (pictures of text)
May have too much security

As an Aside…

When faculty create PDFs…
– The PDF always started as something
else…usually a Word file
– Try to get the starting document
– Security concerns?


Word files can be password protected
– Office Button > Prepare > Encrypt Document (Word 2007)
Types of PDF Documents

Text-based
– Text can be selected

Graphical
– Picture of text (i.e., a graphic)
– Text cannot be selected
Use text-select tool to tell the difference
Files may be “locked”
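
When there are many files to sort through, a rough automated check for "locked" files can come before opening each one. The sketch below is a heuristic only: it looks for an `/Encrypt` entry in the raw bytes of the file, which is where the PDF format records that security has been applied. It cannot tell whether the security actually blocks text extraction, and it may occasionally flag a file that merely contains that string. The function names are hypothetical.

```python
def has_encrypt_entry(pdf_bytes):
    """Heuristic: True if the raw PDF bytes contain an /Encrypt key."""
    return b"/Encrypt" in pdf_bytes


def looks_locked(path):
    """Read a PDF from disk and apply the heuristic above."""
    with open(path, "rb") as f:
        return has_encrypt_entry(f.read())
```

A file flagged this way still needs to be opened and tested with the text-select tool; the check only helps decide which files to look at first.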

Processing PDFs
Adobe Acrobat Professional
Good OCR program
– ABBYY FineReader
– Nuance OmniPage

IF you are a Kurzweil campus, you will
also need Kurzweil
Adobe Tools

Adobe Reader
– Free
– Useful for students who need minimal
accessibility features
– http://www.adobe.com/products/reader/

Adobe Acrobat Professional
– Essential for alt media specialists
– Extract text, create accessible PDFs, enable
Adobe Reader features
– Discounted price: www.uscollegebuy.com
Adobe Reader

Reads aloud
– But does not highlight or track

Enlarges text
– Nice reflow feature
Changes text/background colors
Text highlighting, sticky notes, and comments
Access text-based PDFs

Process with Acrobat Pro
Cropping
Enlargement for printing
Tiling
Combining
Some text extraction
Works with text-based PDF

Processing Graphical PDFs

Must run optical character recognition (OCR)
– Computers cannot read pictures
– OCR programs recognize the “characters” in the
picture

How you process the file depends on the end
format the student wants!
Various Options

OmniPage or FineReader
– FineReader generally easier to learn
– Save to Word or HTML or Text based on student
preference

Use virtual printer with Kurzweil
– Create KESI files

Read&Write (R&W)
– Save as Word
Which One When?

Want a Word file?
– Best choice is OmniPage or FineReader

Want a Kurzweil document?
– Use Kurzweil to process the PDF

For students to do themselves?
– Whichever program they prefer
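
The choices above can be written down as a small lookup, which may be handy as a checklist for a request form or intake script. This is a sketch only: the function name `recommend_tool`, the accepted format strings, and the `student_preference` parameter are hypothetical, and the recommendations are simply those listed on the slides.

```python
OCR_PROGRAMS = "OmniPage or FineReader"


def recommend_tool(end_format, student_preference=None):
    """Map the end format a student wants to the suggested program.

    end_format:         "word", "html", "text", "kurzweil", or "kesi"
                        (hypothetical labels, case-insensitive)
    student_preference: set this when students process the files themselves;
                        whichever program they prefer wins.
    """
    if student_preference:
        return student_preference
    fmt = end_format.strip().lower()
    if fmt in {"word", "html", "text"}:
        # OCR programs are built for extraction and editing.
        return OCR_PROGRAMS
    if fmt in {"kurzweil", "kesi"}:
        # Let Kurzweil run its own OCR on the PDF.
        return "Kurzweil"
    raise ValueError(f"no recommendation for end format: {end_format!r}")


if __name__ == "__main__":
    print(recommend_tool("Word"))
    print(recommend_tool("KESI"))
```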
Why?

OCR programs are designed to make
extraction and editing easy

Document readers (R&W, Kurzweil,
etc.) are designed to make reading
easy…NOT editing.
NEVER!!!

Do NOT run OCR with FineReader or
OmniPage…save to PDF…and then
take into Kurzweil, R&W, etc.

Kurzweil, R&W, WYNN will run their
own OCR on the PDF!
– Doing OCR twice wastes time and adds errors
OCR Programs

Treat PDFs the same as a TIFF
– If you OCR scanned documents, use the
same process
Load image file
Select zones
Create templates as needed

PDF Bottom Line

Source files vs. end-user files
– Source files = for you to create alt media
from
– End-user files = alt media formats

PDF
– Consider PDFs as source files (files to
process) that sometimes double as end-user
files (for certain students with limited
access issues)
Resource Info
Gaeir Dietrich
[email protected]
408-996-6047

www.htctu.net
Alt media listserv
Manuals online
