Preview of Changes to STEP for Version 2

Preview of Changes to STEP for Version 2.0
June 26, 1998
Introduction
This document contains a description of the changes to STEP Parsons is implementing to
support non-English languages in all STEP keywords and also those changes necessary to
fully support Bibles in STEP. These changes are extensions of plans first described in to
the STEP community during its April 1997 meeting in Orlando and also in a December
1996 white paper on Bibles in STEP.
All of the described changes have been implemented in the Developers Toolkit.
Interested parties should request an update from Parsons.
Parsons is further publishing extensions to STEP that it has used to implement Greek
Lemmas and Morphology in a STEP version of the UBS Greek NT. These extensions can
be supported or ignored by STEP Book Readers.
Non-English Languages in STEP Keywords
Overview
The following keywords are affected by these changes:






Level
Glossary
LinkToLevel
LinkToBook
LinkToGlossary
SyncTo
When using the BSISG STEP Conversion Program for tagging and creating STEP books,
non-English text is placed into these keywords in the same way it is placed in the book
text. The Language tag turns the current language to the one named in the keyword. To
change it back to the preceding language or to change it to another language, simply
insert another Language keyword. The STEP Conversion Program supports the use of
the “SP” fonts for Greek and Hebrew text in keywords just as in the text.
Example:
{\Level2:{\Language:Greek}{\Language:English} 1}
(You must have the SPIonic font to see the sample Greek text above.)
If there are no Language keywords in the text of a keyword, the text is assumed to be in
English regardless of the language outside the keyword. Further, the effect of Language
keywords within a keyword is limited to the scope of the containing keyword.
Example:
{\Level2:{\Language:Greek}} This text is English since that was
the language in use prior to the Level keyword.
Storage of Non-English Text
The BSISG STEP Conversion Program alters the representation of non-English language
strings before storing it, as shown in the following example:
{\Level2:{\Language:Greek}{\Language:English} 1}
becomes
{\Level2:<Greek:><English: 1>}
In this case, the string “<Greek:><English: 1>” becomes the name of
this level and is stored in the STEP data files just as if it had been typed in English. Of
course the font information is lost upon storage, as only the character codes are stored.
This explains why it is necessary to retain the “Greek:” language tag.
Interpreting Non-English Text in Book Readers
For those Book Readers that use the BSISG STEP Programmers Toolkit, a function is
provided to convert these tagged strings into cLanguageStrings.
For those companies or individuals implementing STEP readers without the Programmers
Toolkit, information on mapping the non-English character codes is provided by BSISG.
Specifically, mappings from the “SP” fonts to Unicode can be obtained, as can be sorting
information.
Special Information on Selected Keywords
LinkToLevel, LinkToBook and LinkToGlossary
A link to any of these items can contain non-English text. The level name in the LinkTo
keyword must match the corresponding level or glossary name exactly, including all
embedded language keywords.
SyncTo
Non-English text may be used when the book is “Word sync’d”. Books using other
synchronization methods (such as Verse or Strongs) must use English.
Parsons’ Lemma and Morphology Keywords
Syntax
The Lemma and Morphology keywords are used to tag a Greek word with its
corresponding lemma and morphology. Currently only the Friberg morphological tagging
system is supported but others could be added.
Because of the complex nature of this addition to the specification, Parsons is treating it
as an extension to the STEP specification that can be implemented by Book Readers if
they so choose.
Here is an example of the syntax for the Lemma and Morphology keywords:
There must be no space or other keyword between the Greek
word and the Morphology or Lemma keywords so that the conversion program
associates the morphology and lemma with the Greek word they follow.
About the Morphological Tags
The morphology contents are based on the Analytical Greek New Testament by Timothy
and Barbara Friberg. Later, if we add other morphology, we anticipate adding a keyword
that would specify what morphology system applies to the book.
In the Friberg notation, the tag letters have meaning based on their position in the tag.
The first letter specifies the basic part of speech and the remaining letters are subcategories of that part of speech.
The above example has a simple tag morphology. Some words have complex tags which
are multiple simple tags combined with tag connectors. An example of this is:
AB-----/A--AM-S
This means that the word could be parsed as AB or A--AM-S. Since you will want the
user to be able to find this word when either of these tags is specified, each should be
given as a morphology by themselves as well as the combination. This can be done by
putting all different forms that should find this word in the morphology keyword
separated by semicolons. Parsons has chosen to specify this morphology like this:
{\Morphology:AB-----;A--AM-S;AB-----/;/A--AM-S;/AB-----;A--AM-S/;/;
A--AM-S/AB-----;AB-----/A-AM-S}
With this, the user can enter any of these combinations and find the word. In addition,
wildcards can be used to mark sub-categories that are “don't care.” The user can specify
the morphology they are interested in and if it is one way to parse this word, it will be
found. Also, the connector itself is put in as a tag so the user can find all word that have
complex tags using the OR connector, all tags that use the OR connector in association
with an adverb, etc.
Most words will not have a letter for all sub-categories. In those cases, a dash is used to
hold the position for the unused sub-categories. It is customary in the Friberg notation to
drop all ending dashes because they are placeholders to give meaning to subsequent
letters and if there are not any subsequent letters they are not needed. However for
searching purposes, the tags that go into the morphology keyword should contain trailing
dashes as well to distinguish them clearly from other tags and tag combinations.
The reason that each valid way of specifying how to parse the word is used is so that the
user can find the word specifying one of the valid ways and not get something they don't
expect. There are other more complex connectors where not all simple tags should match
the word by themselves. For instance, the “functions as” connector identifies the form in
which a word functions when it grammatically parses to a different form. Here is an
example:
APRGM-PÂPRAM-P
You should be able to find a word parsed this way by specifying APRGM-P but not by
specifying APRAM-P because the later is not a valid grammatical parsing of the word.
But you should be able to find this word with any of the following:
APRGM-P
ÂPRAM-P
APRGM-P^
^
APRGM-PÂPRAM-P
This is Parsons’ philosophy on this connector and they have opinions about the others as
well. The tags specified in the morphology keyword should reflect the desire of the editor
of the book.
Storage of Morphology and Lemma Keywords
The morphology tags are stored in the Greek Friberg Morphology language concordance
and the lemma tags in the Greek language concordance.
Bibles in STEP
General Comments
Bibles are, for the most part, just like any other STEP book that synchronizes by Bible
verse. They differ slightly in that they are the targets of the automatic verse links that the
STEP Conversion Program generates. Also, users may want to limit searches only to a
specific range of verses. As a result, STEP Book Readers need to know whether or not a
book is a Bible and how to tell where verses begin and end.
Traditionally, Bibles were printed with one verse per paragraph. Modern translations
have imposed paragraphs on the text which may or may not align nicely with verse and
chapter boundaries. Changes to the STEP specification to support Bibles should not
restrict a publisher from formatting the text any way it desires.
STEP Control Words for Bibles
The Translation Control Word
The Translation control word is used to tell the STEP Conversion Program both that this
book is a Bible and also which Bible it is. The syntax is:
{\Translation:<abbreviation from STEP spec>}
Example:
{\Translation:KJV}
Only those translations listed in the specification are supported. Before publishing a new
translation of the Bible in STEP it will be necessary to add the new translation to the
specification and to the Conversion Program.
In addition to adding the translation’s abbreviation to the list of same in the specification,
it is necessary to carefully verify the versification of the new Bible and compare it to
other supported translations. The list of STEP book numbers that make up this new Bible
needs to be created and added to the specification, and any new Bible books (or
renumbered existing books) need to be added to the list of Bible books supported by
STEP.
Effects of the Translation Control Word on SyncTo Commands
The translation named in the Translation control word will be used to interpret the verse
references in SyncTo keywords.
Book Readers can assume that when the Translation keyword is present for a book, then
a section that contains a verse SyncTo command to a specific verse (i.e. not an entire
book nor an entire chapter) consists entirely of Bible text. This is important when
determining which sections to include in a search that is limited by a verse range.
Verses that are split across multiple sections (such as might occur when a paragraph
break occurs in the midst of a verse requiring a new heading) should contain a SyncTo
identifying the verse as the “b” portion of the verse (i.e. {\SyncTo:John 3:16b}). This will
not result in an entry in the VSYNC file but will result in the section being appropriately
identified as containing Bible text.
A new data table consisting of a list of sections in the book that contain Bible text will be
added to one of the data files. Functions to access this list will be added to the STEP
Programmers Toolkit. For those developers not making use of the Toolkit, complete
details will be published as part of the STEP specification.
SyncType for Bibles
All Bibles must be verse sync’d. That is, they must contain the keyword:
{\SyncType:Verse}
Recommendations for Tagging Bibles
Placement of Level and EndViewableText Control Words
For proper operation, a section (identified by a Level command) should consist either
entirely of Bible text or of non-Bible text. (See the discussion of SyncTo above.)
It is recommended that EndViewableText be inserted at the end of chapters. Some
Bibles break logical sections of the text at boundaries slightly before or slightly after the
beginning of a new chapter. In these cases it is recommended that the “end of chapter”
rule for EndViewableText be ignored, and the EndViewableText control word be
placed where it makes the most sense for readability.
Placement of SyncTo Control Words
As mentioned above, a section containing a SyncTo command will be assumed to consist
entirely of Bible text. SyncTo commands in this case should explicitly identify the verse
by including book, chapter and verse. SyncTo commands containing verse ranges, entire
chapters or entire books will be interpreted as not identifying Bible text.
The way the publisher can make use of this convention is to identify sections containing
book titles and chapter headings with appropriate SyncTo commands without mistakenly
including these sections in searches. For example:
{\Level1:Genesis}{\SyncTo:Genesis}The Book of Genesis
{\Level2:Chapter One}{\SyncTo:Gen 1}Chapter One
{\Level3:Verse 1}{\SyncTo:Gen 1:1}In the beginning God created the heavens
and the earth…
When these conventions are followed, the Book Reader can identify the last section
above as consisting of Bible text and use this fact to perform a “Bible text only” search
on what it might otherwise treat as just another STEP book.
In a translation like the Living Bible or the Message, where verse numbers are somewhat
meaningless since whole paragraphs are paraphrased from several verses, each of the
component verses can be listed in the SyncTo commands for that section to maximize the
usefulness of Bible-text-only style searches. In this case the results won’t be as
meaningful since each section contains more than one verse. For example:
{Level3:Verse 3-4}{\SyncTo:Num 23:3}{SyncTo:Num 23:4}Then Balaam said
to the king, “Stand here by your burnt offerings and I will see if the Lord will
meet me; and I will tell you what he says to me.” So he went up to a barren
height, and God met him there. Balaam told the Lord, “I have prepared seven
altars and have sacrificed a young bull and a ram on each.”
If the user searches Numbers 23:3 for “Balaam” (a meaningless search but useful for
illustration) he’d find one occurrence in the KJV but two occurrences in the Living Bible
(quoted above). This is because the Living Bible does not distinguish where the division
is between verses 3 and 4 in Numbers 23.
The drawbacks of this approach are obvious in that it may result in spurious search hits.
But when you consider that the translators of the Living Bible didn’t seem to be
concerned with the issue of precise verse divisions and the fact that this procedure
eliminates the necessity to create artificial verse divisions where none exist, the positives
seem to outweigh the negatives.
“Missing” Verses
For Bibles that follow the KJV numbering but excluded the text of certain verses (like the
NIV and NASB), it is recommended that the publisher include a SyncTo for the
“missing” verse in the section prior to the missing verse. For example, if chapter 3 of a
book has no text for verse 12, include a SyncTo command for verse 12 in the section
containing verse 11.
An alternative is to create an empty section containing only the SyncTo for the missing
verse (and perhaps a glossary link to the text for the missing verse).
Translator’s Footnotes
Translator’s footnotes that are included with a translation should be implemented as
glossary sections. Cross-references within those footnotes will generate normal Bible
reference links just like they would if they were in a commentary or some other book. It
will be up to the Book Reader to implement any special behaviors it feels are necessary
when linking to Bible verses from within a STEP book it knows is a Bible.
Non-biblical Text (Section Headings, etc.)
Publishers should turn of Bible reference parsing around chapter and book headings for
which automatic links might be generated by the Conversion Program.
It is up to the publisher as to whether or not to turn the concordance off for section
headings. If the rules regarding the use of SyncTo commands in sections containing nonBible text are followed, Book Readers can choose to implement special features that limit
searches to only the Bible text, but the publisher may or may not want to rely on this.
On the other hand, it may be desirable to be able to search the section headings.
Valid and Invalid Assumptions for Book Readers
It is valid to assume that every verse in the Bible is covered by an appropriate SyncTo
command. It is up to the Bible publisher to verify that all verses have SyncTo commands
associated with them.
It is not valid to assume that the sections containing John 1:1-14 are non-interrupted. That
is, there may be non-biblical text in the midst of a range of sections. For the purpose of
limiting searches, it is not adequate to start with the section containing John 1:1 and end
with the section containing John 1:14.
It is not valid to assume that a Bible will have all its Level1 entries be book names,
Level2 entries be chapter numbers and Level3 entries be verse numbers. Levels can be
assigned to section headings, paragraphs, sentences or any other structure the publisher
sees fit.
### END ###

Download Report

Preview of Changes to STEP for Version 2

Paperzz.com

Your Paperzz