An Algorithm for Translating Chemical Names to Molecular Formulas
EUGENE GARFIELD
INSTITUTE FOR SCIENTIFIC INFORMATION
33 SOUTH 17th STREET
PHILADELPHIA 3, PENNSYLVANIA
PREFACE
This varityped
many requests
graphical
version
of my doctoral
I received
changes
for copies
and those noted
have been in the arrangement
conventions.
However,
The original
typing
names
my belief
that an arduous
of Linguistics
suggestions
accepted.
Many of these
morph, etc.
Linguistics
simply,
this
means
Noam Chomsky’s
linguistic
in pursuing
Syntactic
interested
this
in learning
even though the work was not written
a textbook
that
will
enable
searches
and for indexing
structure
and theory.
tion of the lexicon
primarily
to acyclic
To complete
must
benzoic
account
acid.
without
the linguistic
while
analysis
for the difference
This
chemico-linguistics,
example
will
into
a discovery
in many useful
not be trained
described
the futility
ways.
further,
I recommend
pages 17 and
However,
I do not
too difficult,
to supplement
this work by
nomenclature
for literature
understanding
of organic
analysis
the feasibility
when it occurs
was confined
cyclics,
the analyses
with pentanoic
approach
chemical
on the comple-
of handling
work. For example,
of any syllabic
Quite
For the reader
will find the reading
work, linguistic
considerable
of oic acid
procedure.
work is now in progress
establishing
now requires
in linguistics.
to use chemical
the detailed
of
before it was
1957) especially
It is my intention
In the present
definitely
also illustrates
or calcu-
of the morpheme, do-
of this statement
of this dissertation,
in meaning
This
by the Department
revisions
interpretations
background
and librarians
morphemes.
chemistry
through several
be interpreted
the procedures
getting
As a follow-up
of chemical
went
as a textbook.
scientists
in
in the preparation
as accepted
(Mouton & Co., ’s-Gravenhage,
treatise
a chemical
who helped
that one can prescribe
can
on Trans-
and the final manuscript,
The dissertation,
the theoretical
Structures
Most of the readers of
feel that anyone
data
by Mrs. Joan
in the section
such as naming
from different
is not yet so precise
that
who is interested
resulted
The vari-
He also found many errors
the many other persons
of Pennsylvania,
changes
Shapiro.
by a machine.
and participation.
of the University
task
at the end.
was performed
of omission
of footnotes.
intellectual
of minor typo-
the only other changes
Mrs. Sylvia
in both the original
performed
want to thank collectively,
this work through
found errors
and formulas
the
etc. which had to conform to university
by my secretary,
by the addition
to satisfy
With the exception
etc. have been placed
was typed primarily
Mr. Fiddler
primarily
on Transformations,
bibliography,
the indexes,
its formula is most consistently
I also
manuscript.
was done by Mrs. Joan M. Graham. Proofreading
of chemical
only strengthens
has been prepared
below in the section
which have been corrected
the copying
lating
of the original
in this edition
E. Shook and Mr. Walter Fiddler.
formations
dissertation
of the indexes,
manuscript
in this edition
TO THE FIRST ISI EDITION
acid and
to the study of
I wish to stress that it is not necessary for the reader to wait for the appearance of the above-mentioned lexicon in order to use the algorithm (procedures) described here, that is, to carry through the simple calculations. This is especially true for those with training in organic chemistry, who have already memorized enough chemical nomenclature.
In closing I should like to encourage my readers to communicate with me concerning any portion of this work.
Eugene Garfield
INSTITUTE FOR SCIENTIFIC INFORMATION
Philadelphia 3, Pa.
July 17, 1961
PREFACE
This dissertation discusses, explains, and demonstrates a new algorithm for translating chemical nomenclature into molecular formulas. In order to place the study in its proper context and perspective, the historical development of nomenclature is first discussed, as well as other related aspects of the chemical information problem. The relationship of nomenclature to modern linguistic studies is then introduced. The relevance of structural linguistic procedures to the study of chemical nomenclature is shown. The methods of the linguist are illustrated by examples from chemical discourse. The algorithm is then explained, first for the human translator and then for use by a computer. Flow diagrams for the computer syntactic analysis, dictionary look-up routine, and formula calculation routine are included. The sampling procedure for testing the algorithm is explained, and finally, conclusions are drawn with respect to the general validity of the method and the direction that might be taken for future research. A summary of modern chemical nomenclature practice is appended primarily for use by the reader who is not familiar with chemical nomenclature.
ABSTRACT
An algorithm
The
validity
several
of the
hundred
analyses
randomly
cedure
algorithm
enables
diagrams.
formulas
problem
of chemists,
of the chemist,
many systems
(C. A.) nomenclature
of so-called
matic
of chemical
It is shown
trivial
expressions
were
but eliminates
of eight
is described.
Molecular
verifying
expanded
simple
quickly
and requires
is discussed
to the problem
is only
formulas
of
the linguistic
operations.
without
a program
to include
The
in terms
The pro-
drawing
of less
structural
than
low frequency
1000 in-
morphemes,
ambiguous
more
all homonymous
of the information
of nomenclature
one language
The difficulties
from C.A.'s
nomenclature.
formulas
be handled.
of the linguist
there
successfully,
formulas
is rapid
nomenclature
that
to molecular
and by computer.
consists
molecular
routine
could
of nomenclature,
results
translation
dictionary
names
manually
were calculated
to compute
experimental
names
program.
human
translation
The approach
both
chemicals
for manual
for all chemical
from chemical
tested
of the computer
non-chemists
If the
The
was
selected
The machine
structions.
directly
algorithm
and the logic
The
exist
for translating
of chemical
in syntactically
of morphemes
systematic
I.U.P.A.C.
expressions.
is contrasted
nomenclature
analyzing
use
requirements
such
though
Chemical
as imino,
nomenclature
with that
there
Abstracts
not the use
includes
idio-
of the
most
frequently
allomorphs
such
compiled.
These
diaz,
the
pheme
gous
alkyl
thi
as
of which
morphemes
one would
transformations,
Rather
the
in primary
lish
Chemical
name
In order
historical
systems
to machine
atic
the
The completion
calculation
suitable
line
methods
field
However,
morpheme
Univ.
the
as
Mor-
homolo-
in chemical
To complete
classes,
employed
of Pennsylvania)
the
and the list
of
procedure
a
translation
by a summary
the need
by Harris,
Hiz,
for normal
Eng-
of I.U.P.A.C.
other
the procedures
information
of chemico-linguistics
nomenclature.
One can control
conclusions
but also
could
easily
is
rules
for
the experimental
can be drawn
which
of chemical
required
to the organic
for the linguist
conditions
may have
chemical
more easily
more general
to
(retrieval
facet
are more
nomenclature
must
for mechanized
would
of structural
to foreign
Conference
which
are to be used
in machine
be applied
aspects
nomenclature
the generation
the
the intellectual
in chemical
texts
retrieval,
and searching
to solve
of synonymy
of interest
Similarly,
studies
of chemical
grammar
formulas
indexing
with the manipulative
The problem
information
from the 1892 Geneva
notation,
for linguistic
analyses
of chemical
is traced
in contrast
of the detailed
and
problem
nomenclature,
syntactic
notations,
for teaching
field of study,
Rylamine.
to the procedure
is supported
to the general
handling.
of molecular
modifications
The
and
properties
a word-for-word
Projects,
data
is discussed
if computable
names,
is comparable
nomenclature
sed, In particular,
readily
mit
study
between
problem
indexing.
aminoRune=
of morphemes,
Analysis
of chemical
of the “retrieval”
be resolved
of di and az.
meanings
(an, en, yn, ium, etc.)
is not simply
linguistic
this
The relationship
amenable
such
names.
to relate
is discus
required
structural
development
the present.
were
expressions
of transformational
where
the inventory
and Discourse
chemical
generating
as 10, e, y! and
co-occurrences
idiomatic
from the referential
(R-N)
recognition
analysis
The
discourse.
in identifying
the demonstration
amines
to expand
et al (Transformations
valuable
be computed
include
have
syntactic
such
prop, but, etc.
eth,
analyses
morphemes
A list of their 200 actual
isolated,
particularly
forty
by the bonding morphemes
meth,
nomenclature as e.g.
were
cannot
are illustrated
The syntactic
grammar
are
Approximately
segments.
sulf
and
studies
meaning
classes
occurring
not only
diagrams,
searching
per-
system-
systems,
With
nomenclature.
chemist
as it can improve
nomenclature
than
application.
in normal
is a fertile
discourse,
)
TABLE
Preface to First ISI Edition
l
Preface and Abstract
,,,,.
ORGANIC
The Contradictory
Geneva
.
Rapid
Reading
of Chemical
Implications
Formula
Structural
Diagrams
Molecular
Formulas
1
...................................
3
Morphology
...................................
3
Organic
4
.................................
as a Language
4
................................
Chemistry
4
..................................
Literature.
4
5
....................................
of Chemists
6
in Analytical
Coordination
7
Center
Code.
7
..............................
Problem
..............................................
Nomenclature
Requires
More Than
8
8
..............................
Cooperation.
8
..................................................
Machine Indexing
Versus
and British
Analytical
Aspects
9
9
..............................................
American Nomenclature
Interest
9
...........................
of Indexing
........................................
Nomenclature
in Mechanical
Analysis
INTELLECTUAL
10
................................
INDEXING
TASKS
REQUIRING
STUDY
...........................................
Mechanical Reading Device
6
7
................................
Chemistry
The Indexer’s
Accelerated
2
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Chemical-Biological
Soviet
1
.....................................
..................................................
Generic Searches
ix
1
...........................
..................................................
Indexes
v
BACKGROUND
..................................................
Requirements
Manipulative
Structure
Chain
of Chemical
Systems
Information
. . . . .
.........................................
Chemistry
Volume
Notation
~rrrrrrrrr~o
...............................................
for Teaching
Increased
l
. . . . . .
- HISTORICAL
Nomenclature.
Indexing
in Syntax-Not
Organic
. . . . . .
NOMENCLATURE
Shortest
Change
.@
CHEMICAL
I.U.P.A.C.Nomenclature..
Versus
. . . . .
. . . . . . . . . . . .
Versus
Nomenclature
Longest
. .
l
. ,,,....
Goals
Oral Communication
111
List of Tables
OF CONTENTS
10
Selective
Word Recognition-Copywriter.
...................................
10
Chemical
Names
...................................
11
to Structural
.........................................
Drawing Diagrams by Machine
Recognizing
Chemical
Calculating
Molecular
Diagrams
Names
Formulas
by Machine
by Machine
..................................
................................
11
12
13
TABLE
The Quagmire
Trivial
of Chemical
Nomenclature
Designing
for Machine
between
Recognition
Nomenclature
Forms
Putative
Free
The
of Syntactic
Transformations
Organic
Chemistry
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
26
of the Study
of Chemical
Nomenclature
FOR TRANSLATING
Only One Language
Example.
Second Example
Third
Example
of Chemical
29
FORMULAS
29
. . . . . . . .
32
. . . . . . . . . . . . . . . . . . . . . . . . . , . . . . . . . .
33
NAMES INTO
. . . . . . . . . . . . . . . . . .
Formula.
Nomenclature
MOLECULAR
l
.............................
34
34
34
..................................................
and Principal
of the Longest
Fifth
Example
- Computer
not Obvious
Processing
Routine.
..............................
....................................
Procedure
Discovery
Match
......................................
Procedure
.....................................
35
36
36
36
.........................................
37
...........................................
37
Current Character Processing
Match
33
..................................................
- Human
Fully
..................
33
Example
Dictionary
28
..................................................
Fifth
Ignorability
to Linguistics
.................
...................................................
Fourth Example
Ambiguity
Formula
Nomenclature
of Nomenclature.
CHEMICAL
for the Molecular
for Molecular
21
24
Chemical
The Value
Expression
20
. . . . . . . . . . . . . . .
in Organic
for the Study
First
19
..........................
Nomenclature.
Linguistics
Equation
NOMENCLATURE
..............................
of Structural
Soffer’s
TO CHEMICAL
20
The Value
Generalized
18
.................................
Distribution
Analysis
in Organic
AN ALGORITHM
17
................................................
in Systematic
Problem
APPROACH
Environments.
and Complementary
Co-occurrences
17
............................
and Searching
LINGUISTICS
and Their
Morphemes
Variation
16
..........................................
Devices
STRUCTURAL
Linguistic
15
..........................................
Designing the Experiment
Pattern
14
.................................
Uses.
13
.
13
....................................
as a Language
Nomenclature
Relationship
. . . . . . . . . . . . . . . . . m. . . . . . . . . . , . . . . , .
Nomenclature
.................................................
Names
Treating
(continued)
..................................................
Names
Systematic
OF CONTENTS
Alpha Storage
........................................
.37
Pent-Oct
Ambiguity-Resolving
Routine
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
38
Computer
Calculation
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
38
Hydrogen
Calculation..
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
39
Routine
TABLE
Sampling
Method
OF CONTENTS
(continued)
50
...................................................
Debugging
The
Bonding
.
..............................................
Morphemes
Conclusions
.....................................................
APPENDIX.
I. U. P. A. C. Organic
ing a Detailed
Example
What’s In a Name?
51
52
57
i,
Chemical
of its use both
A Summary
Nomenclature.
in Recognition
of Principles
and Generation
of Systematic
IncludNames
56
58
................................................
Bibliography
....................................................
63
Author Index
.....................................................
65
Subject
Index
....................................................
66
LIST
of Primary
Table
I
List
Table
II
Classified
Table
III
Alphabetical
Table
IV
Transformations
Table
V
Summary
Table
VI
Inventory
Table
VII
General
Table
VIII
Dictionary
Table
IX
Pent-Oct Ambiguity Resolving Routine
Table
X
Molecular
Table
XI
Random
Table
XII
Summary
List
Morphemes
OF TABLES
Organic
of Operations
Program
Used
for Chemical
Look-Up
Formula
Nomenclature
for Human
of Morphemes
Routine
Calculation
of Chemicals
...............
27
.......................
in the Experiment
Name
23
..........................
Translation
Recognition
30
......................
30
.....................
42
...................................
Routine
Tested
of I. U. P. A. C. Nomenclature
45
.............
............................
on Computer
21
22
.............................
of Co-occurrences
in Organic
Chemistry.
...............................
of Co-occurrences
List
Sample
for Acyclic
Program
...........................
47
48
...............
54
62
ORGANIC CHEMICAL NOMENCLATURE - HISTORICAL AND BACKGROUND INFORMATION
The Contradictory
“It
pound.
is
This
state
of thought
[J
possible
Am.
and
are
of the
of affairs
remarks
Nomenclature
result
find
next
in the
constitutes
the great
advantage
to bring
of the
the same
for greater
freedom
loss
“But
obstacle
to modify
written
communication
of these
experts.
get people
inability
to speak
to make
of organic
dents
are
expression
this
is useful.”
wherever
these
nomenclature
Ber.
organic
hand
(opus
Reform
is said
about
the baffling
and the needs
in order
blend
of indexing
to simplify
actually
years.
Geneva
Nomenclature
“officially”
began
are still
taught
syno-
out subtleties
that
on the part of the
of contradictions
we
substance
p. 3905).
nomenclature
of trying
that
on the other.
committees
on chem-
This
of preparing
the “Geneva”
system
the purposes
of
is like trying
to
dictionaries.
briefly,
Arch.
Congress
oral and
the work
if one examines,
known
their
to criticize
to serve
in 1892 [Pictet,
the well
that
might be making
or intention
the task
obvious
seventy-five
cited,
is to indicate
dichotomy
is quite
does this
for the same
in chemical
purposes
remarks
ways,
Indexing
It is not my purpose
introductory
with
of names
to experts
for indexing
to bring
the round
a multiplicity
26, 1595-1631 (1892)] at
chemistry
on the
For the speaker,
of expression
To complete
apparent
difficult.
two functions
nomenclature,
much that
orientation.
such freedom
Versus
English
and like
and the ability
Oral Communication
for the past
1892)] [Tiemann,
on one’s
of indexes.”
faced
of chemical
in two or more different
in the preparation
of these
the King’s
It depends
on the other
more
indeed
compound
of comprehension.
on the one hand
chemical
of elementary
com-
clear
of the Commission
p. 3905)
of expression
nomenclature
even
nomenclature
Modern
485-520 (
prevalent
The purpose
communication
to the same
of permitting
of affairs
Report
cited,
For the listener,
to make.
sentence:
(opus
chemical
of thought.
in complete
attempts
oral
names
the other.
Thirty years ago it was not yet quite
nomenclature
several
in structure
state
1930 “Definitive
Chemistry”
expression
a serious
ical
out analogies
of the general
contradicts
to name
be difficult
speaker may
on the one hand
sentences
allow
might otherwise
to give
indicative
of Organic
clear
do indeed
chemistry
it easier
the one sentence
really , permit
Nomenclature
of organic
are quite
the opening
If it is possible
nyms
has
of Chemical
55, 3905 (1933)].
Soc.
nomenclature,
domain
of rendering
Chem.
These
They
in the
Goals
sci.
the history
phy.
of Geneva.
though
The
some
nat.
27,
I’ll stuteachers
may
now
call
it the
the 1930 Report
not
been
found
I.U.P.A.C.
mentioned
realized”
under
above.
i.e.,
only
system.
one entry
Thirty-eight
years
I enabled each
Rule
in indexes
is known
1957
report
Report
contributed
The
way
affect
not affect
portion
the
complex
functions.
of acyclic
of so-called
as acids,
etc.
failure
mean that there
past
thirty
mittees
has
Chemical
a great
2151,
that
cited
of nomenclature
and Engineering
ics
one
is
1930 Report,
The
tions.
the
present
If I had been
list
describe
to conclude
the former
organic
For that
thirty
“be
years
later
and
to note
that
the
is devoted
the hydrocarbons,
matter,
to cyclic
does
as is noted
below,
not in
it does
contain
only
The same
one kind of
is true of the
Field
domain
devoted
Words”
written
which
in the 1957 Report.
of organic
to chemical
noted
to compile
about
of organic
that
system
situation
a directory
by Patterson
nomenclature
nomenclature
in 1951 there
Washington,
were
of them.
Amer.
(Chem.
organ
not
during
the
so many com-
Chem. Soc.,
for the weekly
does
Eng.
News
1957).
This
of the Society,
be recognized
study
to make an exhaustive
basically
chemistry
corresponding
as
it is used
by a grammar
analysis
in the Geneva
intact.
Only minor
may be described
syntactic
today.
based
of chemical
from the viewpoint
are changes
of the 1930 and 1957 Reports
nomenclature
would
Change
nomenclature
there
is retained
and their
year
while
in organic
ignorant
of morphemes
it would
News.
at the development
forced
almost
of the report
is discussed,
M. Patterson
No Basic
Looking
Most
the entire
necessary
columns
had
p. 3906)
s, i.e. substances
of attention
“Words
so that
It is important
in Nomenclature
as Austin
it was
in his
function
to treat
deal
Congress
nomenclature.
Activity
Report
On the contrary,
on nomenclature
a collection
1957
not been
years.
_
May 28, page
is
of the
which
are not covered
Constant
The
dissertation.
of my research.
simple
alcohols,
officially
came
82, 5545-84 (1960)].
chemistry
of organic
of the Geneva
(opus cited,
Nomenclature
Soc.
this
aspects
description
The nomenclature
such
to affect
linguistic
the basic
function
nothing
intent
Nomenclature
Chemical
[J. Am. Chem.
“the
of the Geneva system came with
to be named
and dictionaries.”
on Organic
as the 1957 Report
compounds.
any
major
later
chemical
I.U.P.A.C.
The next
major revision
The next
on organic
rules,
At least
System
details
modified.
would
It would
to 1892.
ques-
and had compiled
this
90% of the new chemicals
prior
in the
the following
nomenclature
System.
linguist-
contained
were
by posing
how accurately
on the Geneva
nomenclature
of structural
This
analysis
made
each
be an interesting
would
determine
the
basic
list
comparison
of morphemes
was not germane
While
it is
true
of the 1892 Report
tion
of earlier
mained
already
ture
conducted
in use
after
malized
ingan
and codified
it. In other
ing the same
the value
as there
However,
in its
tion
Geneva
of organic
but
were
cified
not new.
the selection
CH3)-CH2
one
which
this chemical
est chain.
this
Neither
-CH3
the
diagram
Chain
Thus,
the
pat-
nomencla-
as an analysis
well.
con-
structure
system
system
was not quite
as e.g.
be acquired
for-
by usin read-
is adopted.
to the syntactical
practices
became
the “parent”
atoms.
one named
used
tri, eth,
by many
yl, meth,
The new rules
for the chemical
that
descrip-
3-ethylpentane.
an, e in ethylpentane.
(CH3-CH2)3-CH.
func-
Structure
of carbon
where
a useful
accepted
would
The morphemes
principle
chain
which
triethylmethane
point
served
chemistry
of syntactical
of morphemes
possible
It has
contributions
eth, yl, pent,
can be written
The
an, e
spec-
CH3-CH2-CH(CH2- structure
The older
chemicals
shall
method
in terms
There are historical
be the
of naming
of the short- reasons for
change.
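That triethylmethane and 3-ethylpentane name the same chemical can be checked by a simple carbon count. The Python sketch below is illustrative only and is not the procedure developed in this dissertation. It assumes a saturated acyclic hydrocarbon, for which the general formula CnH2n+2 holds. The tables CARBONS and MULTIPLIERS and the function alkane_formula are hypothetical names introduced here. Locants such as the 3 in 3-ethylpentane do not affect the molecular formula and are ignored.

```python
# Hypothetical lookup tables for a handful of morphemes (illustration only).
CARBONS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5}
MULTIPLIERS = {"di": 2, "tri": 3, "tetra": 4}


def alkane_formula(parent, substituents=()):
    """Molecular formula of a saturated acyclic hydrocarbon.

    parent       -- stem of the parent chain, e.g. "pent" in pentane
    substituents -- (multiplier, stem) pairs for alkyl groups, e.g.
                    ("tri", "eth") for triethyl or (None, "eth") for ethyl
    """
    carbons = CARBONS[parent]
    for multiplier, stem in substituents:
        # A missing multiplier means the group occurs once.
        carbons += MULTIPLIERS.get(multiplier, 1) * CARBONS[stem]
    hydrogens = 2 * carbons + 2  # CnH2n+2 for any acyclic alkane
    c_part = "C" if carbons == 1 else f"C{carbons}"
    return f"{c_part}H{hydrogens}"


# Shortest-chain name: tri + eth + yl + meth + an + e
print(alkane_formula("meth", [("tri", "eth")]))  # C7H16
# Longest-chain (Geneva) name: 3- + eth + yl + pent + an + e
print(alkane_formula("pent", [(None, "eth")]))   # C7H16
```

Both names yield C7H16: the change from shortest to longest chain alters how the morphemes are combined, not the formula they describe.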
Rapid
Early
as
morphologi-
of organic
results
organic
that
the Geneva
to solidification
the syntactical
syntactic
than
significant
combination
longest
had an implied
The same
some
the morphemes
by establishing
of
the morphological
nomenclature
elementary
Shortest
the morphology.
of the latter
contributes
in which
demonstrates
were
revisions
any major
any internationally
of organic
different
or at least
of triethylmethane
available
in studying
Versus
did make
as was
not only re-
presumably
analysis
the same
examination
chemistry
accepted
Conference.
the teaching
1920 edition
nomenclature,
not as universally
example
was not then
acquired
Conference
exactly
of the Geneva
would be not significantly
textbook
were
a morphological
almost
while
Longest
The
produced
simply
a
and similar examina-
of organic
did not contribute
chemists
words,
conference,
1892, 127-131)
(which
such
in 1892.
the terminology
1890 textbook
Soc.
However,
dissertation.
at the Geneva
Chem.
conference
conference.
in this
the morphology
The Geneva
have
nomenclature,
in 1891,
that
the Geneva
Conference
use.
began
Proc.
reveals
nomenclature.
could
involved
in the 1930 and 1957 Reports,
is not to underestimate
teachers
research
Armstrong,
but even
the Geneva
at the Geneva
nomenclature
practice
in 1891 would
tion in teaching
that
(e.g.
same
Report),
in organic
tern
This
“official”
nomenclature
the 1892 Geneva
cal changes
that
the
to the chemist
to the particular
and others
essentially
ducted
available
methane
easier
organic
gas.
to understand,
chemistry
naturally
As the knowledge
but still
Change
in Syntax
was concerned
of chemical
the Geneva
- Not Morphology
chemists
structure
could
with chemicals
of simpler
increased,
chemicals
not foresee
the rapid
like
structure
pentane
development
such
were
of or-
ganic
chemistry
syntax
of non
that
would
take
Contrary
credit
ganic
to general
of many
chemistry
plicated
fore
they
begin
the formal
study
remains
study
a vacuum.
um. Similarly,
Special
the special
preparation
language
chemistry.
cation.
there
Teaching
However,
chemistry
I cannot
cannot
hope
to pursue,
This
removed
Latin
from this
from or-
to be very comnomenclature
is unfortunate.
Latin.
This
is needed
be taught
be-
One can
was
from the medical
of medicine
should
It is not to
away
seem
of organic
to study
Organic
really
not
curriculum
to fill this
vacu
first.
Chemistry
dissertation
from the general
in detail,
de-
syntax.
simple.
with what
science.
chemistry
be divorced
the
of historical
are frightened
elements
students
having
to be drawn
is relatively
and quickly
the basic
for Teaching
to modify
more rapidly than
many students
in the language
of organic
are implications
change
experimental
However,
Implications
I believe
that
not taught
necessary
to be the opposite
nomenclature
for pre-medical
of medicine.
become
as a Language
too early
of the actual
again
seem
which
chemistry
are
it used to be a requirement
to the
would
Chemistry
are confronted
Students
necessary
This
chemical
of organic
words.
that
there
teachers
Organic
organic
chemical
they
recall
belief,
because
it would
it is the morphemes
Reading
the
ure but not the morphology.
of languages where
velopment
in which
place,
for the teaching
problem
all the derivative
of organic
of chemical
problems
related
communito chemical
nomenclature.
Increased
As was stated
clature
the
tried
to resolve
problem
index,
chemicals
on the indexing
new chemicals
were
by
the
1961,
of Chemical
Literature
paragraph,
the earlier
international
simultaneously
of indexing
the emphasis
prepared
in the opening
Volume
was
implications
prepared
world’s
the problem
each
chemists
year
already
ments.
machines
searching.
(cf.
and indexing
in 1892, it is quite
have
on organic
increased.
over
75,000
E. Garfield, Index
that
a few thousand
new chemicals
Chemicus,
If
chemistry.
understandable
Whereas
nomen-
were
1st Cumulative
33.)
volume
This
a problem
at the turn of the century,
Notation
This
of communicating
of nomenclature
in 1960 alone
committees
has
includes
both
increased
not only
for listing
The newer
Systems
the preoccupation
conventional
chemicals
“nomenclature”
of nomenclature
indexing
systems,
in the conventional
systems,
e.g.
but also
fashion
G. M.
experts
Dyson
and also
[(1947)
with
systems
indexing
which
for new types
Longmans,
will
requireemploy
of machine
N.Y.
1949] and
W. J. Wiswesser
(A Line
carded
the semblance
cipher
or notation
as
notation
of English
systems
systems
tation system,
do undoubtedly
the
systems
per se,
Chemical
and employ
simplify
classification
library
Formula
does
Notation,
completely
simplify
problem
cannot
place
not resolve
Crowell,
N.Y., 1954) have
symbolic
representations.
the problem
of arraying
the book
the need
of arraying
books
to locate
shelf.
one shelf
chemicals
These
formulas
on a library
on more than
completely
dis-
so-called
in indexes,
However,
at a time,
in more than
just
just
using
one place
as
a noin the
index.
The various
notation
menclature.
None
menclature.
Rather,
tation
and
the
compounds
systems
of them have
their
ability
been
inventors
to use
purpose
have
can
quite possible
acquiring
with
the
perform
research
to visualize
a speaking
idea
of mastering
methods
one to perceive
semantic
methods,
terest
a situation
to explore
chemical
information
Garfield,
private
language.
of the
general
to these
understand
information
of using
requirements.
chemical.
he must
access
Abstracts (C.A.)
In the
derstands
The organic
Subject
c
chemist
C.A.
the C.A.
indexes
system
and Formula
one can
begins
this
formal
to explain
with
of Chemists
d’etre
research,
of repeating
indexes.
(Chemical
nomenclature
indexes
then
Z. Harris
to synthesize
performed
are
typified
by the
it is of inof
in 1955 (E.
is
some
related
a particular
by others,
Chemical
Ohio).
of two methods.
one can name
more in-
to review
were
by either
helps
1955).
nomenclature
Columbus,
in
the problem
which
Abstracts,
chemical
then
it is necessary
attempting
experiments
interested
the older,
Indexing,
and how chemical
Such
find a specific
Prof.
either
certainly
in studying
and Mechanical
years
than
it is
the linguist
nomenclature
classifications)
Requirements
may spend
Indeed,
without
linguistics
linguistics
possibility
of this
in mind.
more directly
structural
Linguistics
Indexes
of chemical
of no-
of chemical
or one may be more
of chemical
of the chemist
to comprehensive
which
one can analyze
modern
categories
discussed
the raison
In order to avoid the possibility
have
Since
formal
“Structural
requirements
of no-
as economy
be analyzed
chemicals
elucidation
Information
To completely
analysis
identification
objectives
might
Similarly
of naming
to a priori
communication
of no-
Analysis
a language
chemicals.
I first
problems
a factor
many different
as grammatical
retrieval.
the pitfalls
linguistic
for the unique
now introduces
with
in which
of classifying
the possibilities
to avoid
of a formal
with such
of Linguistic
the techniques
(comparable
preoccupied
purport
program.
of that
as well
on the basis
This
analysis
knowledge
uncovering new
tuitive
linguistic
proposed
simultaneously
searching.
of this
been
been
Objectives
One
have
designed
the system
as well as for generic
the background
which
a particular
If one unchemical
in
which
one
does
ula
is interested
not have
Index.
mastery
and look for it in the alphabetic
mastery
of the C.A.
(Incidentally,
not
of the C. A. system.
a graduate
organic
nomenclature
more
Three
chemist
than
system
a few hundred
years
of full-time
to be an indexer
der
according
listed
is a simple
to the number
under
ber of carbon
no special
Index
device
of carbon
acetic
and other
atoms
in the chemical,
training
he can use
to use
chemists
in this
have
country
work are generally
if one
the Forma complete
required
to train
Abstracts.)
each
atoms
acid (ethanoic
acid)
index
chemical
contained
is
the chemist
the formula
hand,
Indexes
and other
while
On the other
has the option
indexing
in which
C2H6O
index.
one still
for Chemical
Formula
The Formula
subject
is listed
in alpha-numeric
in it. Ethyl
C2H402.
alcohol
By simply
can compute
to find the C.A.
(ethanol)
counting
the molecular
name
oris
the num-
formula.
of the chemical
With
in which
he is interested.
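The look-up just described, counting atoms and then finding the formula in its ordered place, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the indexing procedure of Chemical Abstracts. It assumes carbon-containing chemicals, writes carbon first, hydrogen second, and the remaining elements alphabetically (the common Hill convention), and orders entries by carbon count and then hydrogen count. The names molecular_formula and index_key are hypothetical.

```python
def molecular_formula(counts):
    """Build a formula string from a mapping of element symbol to atom count.

    Assumes a carbon-containing chemical: C first, H second, rest alphabetical.
    """
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(e if counts[e] == 1 else f"{e}{counts[e]}" for e in order)


def index_key(counts):
    """Sort key: carbon count, then hydrogen count, then the other elements."""
    others = sorted((e, n) for e, n in counts.items() if e not in ("C", "H"))
    return (counts.get("C", 0), counts.get("H", 0), others)


acetic_acid = {"C": 2, "H": 4, "O": 2}    # ethanoic acid, CH3-COOH
ethyl_alcohol = {"C": 2, "H": 6, "O": 1}  # ethanol, CH3-CH2-OH

entries = sorted([ethyl_alcohol, acetic_acid], key=index_key)
print([molecular_formula(c) for c in entries])  # ['C2H4O2', 'C2H6O']
```

The atom counts are typed in by hand here; obtaining them from the name itself, without first drawing a diagram, is the harder part of the problem.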
I wish
clarity.
to make clear
In actual
more
complex
brings
practice
molecules
up another
that
vital
these
one
are oversimplified
must
prepared
question,
be very cautious
today
which
can even
cording
to the I.U.P.A.C.
calculate
tural
may frequently
the
or C.A.
molecular
diagram
and then
ing aspect
of working
hydrogen
atoms
formula
proceed
in the
molecule
to add the number
else’s
diagrams.
as the
This
then
(ideographs).
the chemist
as
such
diagram,
from a name.
will invariably
and other
is the frequent
atoms
from a structural
draw a diagram
of carbon
diagram
Hydrogen
formula
in ideographs.
diagrams
to name a chemical
he can usually
of a complex
with someone
a molecular
to depict
of structural
of explanatory
Diagrams
not be able
systems,
for the purpose
in calculating
be difficult
is, the use
Structural
While a chemist
statements
atoms.
In order
to
draw its struc-
A particularly
practice
of omitting
usually
of little
are
ac-
some
annoy
of the
interest
to the
are based
on the
chemist.
All
existing
methods
assumption
that
mind
comparing
when
originally
tural
reported
diagram.
assigned
the
of naming,
chemist
will
methods
then
proceed
name will be completely
The
indexer
have
seen,
will
also
is very useful
use
provide
of handling
is indexed
by
He will
first
indexing,
coding
a structural
chemical
by Chemical
to rename
to the chemist
diagram.
information.
Abstracts
to calculate
in finding
a chemical
It is important
For example,
will
More often
to the chemist
diagram
chemicals
the indexer
it “systematically”.
incomprehensible
the structural
and ciphering
who first
the molecular
in a formula
to keep
when
first
in
a chemical
draw a s truc-
than not, the newly
prepared
formula
index.
this
the chemical.
which,
as we
Molecular
The molecular
tial
in analyzing
cal formula
erally
he
shows
sonal
experience
ingly
few
number
formulas
if there
know
is an odd number
count.
“The calculation
fied,”
(E. J.
found
Crane:
journal
requires
articles
atoms
molecular
he is
to find a chemical
trying
know the existence
ed
whether
reason
other
methods,
the
tional
system
Research
Pennsylvania
now
defunct
This
State
structure.
before
is not always
practical
and machine,
care
and checking
Amer.
his
has
case
is an odd
is justi-
Chem.
Soc.,
frequent
the
errors
been
primarily
for searchin
when
may not even
he may be interestin the literature
indexes.
g chemicals
Coordination
as
For this
Research
generically
Center
on the work of Prof.
National
useful
Code
Chemical
Code,
reported
to help
employed.
Center
system designed
Chemical
Thus
with the conventional
Coordination
based
primarily
the chemist
search.
are now extensively
Chemical-Biological
(CBCC
number
are in the hydrogen
are not especially
in this
he begins
of chemicals
is
count
are designed
they
Indeed,
of a class
system
University
Abstracts
he is interested,
chemical
also
a large
Searches
to Chemical
classification
Council.
great
p. 74).
both manual
of the
the hydrogen
cited
Chemical-Biological
is
that
(opus
searching
The most comprehensive
Surpris-
discusses
any member
e.g. hexanols. Generic
compounds.
new chemical
Crane
of a particular
in learning
on my per-
book
of related
new compound
is based
statement
same
in which
it is gen-
This
this
chemical
reason,
that
requires
Dr.
The empiri-
of each
Abstracts,
indexes
find a specific
formula
of Chemical
-
and formula
the chemist
For this
Most of the errors
present.
formulas
Generic
While the subject
formulas.
The Production
CA Today
1958, p. 86),
D.C.,
in original
Y
rule which
as it is essen-
It is significant
errors.
100,000
of other
of correct
journal.
of more than
“odd-even”
atoms.
molecular
contain
research
or empirical
and other
to a scientific
the indexing
the
molecular
hydrogen
by authors
Chemistry
role in chemical
the “calculated”
a paper
in editing
important
them through
carbon,
reported
in Analytical
another
report
submitting
chemists
Washington,
between
the chemist
when
molecular
plays
to identify
the ratio
that
prepares
also
chemicals
required
of the
formula
Formulas
Council,
of the Na-
D. Frear
of the
Washington,
1948.)
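The kind of hydrogen-count check that an "odd-even" rule makes possible can be sketched as follows. This is not a transcription of Crane's wording, which is not recoverable from the text here; it is the general valence-parity check from elementary chemistry, under the assumption that the compound is a neutral molecule with all electrons paired. In that case the total number of atoms of odd valence must be even. The element set `ODD_VALENCE` and the function names are assumptions made for illustration.

```python
import re

# Assumption for this sketch: these elements are treated as having odd valence.
ODD_VALENCE = {"H", "F", "Cl", "Br", "I", "N", "P"}


def parse_formula(formula):
    """Split a flat formula such as 'C2H4O2' into a dict of element counts."""
    counts = {}
    for element, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(digits) if digits else 1)
    return counts


def passes_parity_check(formula):
    """Return True when the number of odd-valence atoms is even.

    A False result flags a formula that cannot belong to a neutral,
    closed-shell molecule; a True result does not prove the formula correct.
    """
    counts = parse_formula(formula)
    odd_atoms = sum(n for el, n in counts.items() if el in ODD_VALENCE)
    return odd_atoms % 2 == 0


print(passes_parity_check("C2H4O2"))  # True: acetic acid
print(passes_parity_check("C2H5O2"))  # False: one hydrogen miscounted
print(passes_parity_check("C5H5N"))   # True: pyridine
```

Ions and free radicals legitimately fail this check, so it is useful only as an editing aid for catching miscounted hydrogens, not as a validity test.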
Modifications
The
CBCC
priori
assumptions
While
the
make
modifications
CBCC
chemical
code
concerning
system
the
is quite
in particular
of CBCC
is an elaborate
classes
useful,
parts
one
almost
Needed
hierarchical
may
wish
without
of the classification
system
of classification
to search
exception,
in large
chemists
schedules
files
based
on a
of chemicals.
who employ
to differentiate
it must
more
precisely
their particular
sections
of the
cals
which
ians
encounter
code
might
where
interests.
For example,
it is not sufficiently
otherwise
in using
classification
receive
systems
specific
the same
such
the laboratory
for a specific
of indexes,
what
code
as the Dewey
chemist
chemical
has
two general
and the generic
is the problem
In attempting
deal
deal
with
to satisfy
with
dozens
In other
cals,
but in France,
icals
in which
the information
The
words,
comment
all
French
intentions
over
countries,
the
that
to resolve
synonyms
for indexing
great
economic
is over
five million
that
and the Library
librar-
of Congress
for chemicals
from the chemist
these
- the
who is the user
indexes.
of the labchemist,
chemical
chemist
papers
their
the chemical
are written.
the same
little
has certain
would
strange
world
is far beyond
that
chemical
devices
preferences
indexer
He must
also
in each
foreign
for naming
chemi-
for naming
chem-
The use of machines
arise
Welch
Medical Library
from
the
standardized
to the chemical
in documenting
and user
and willing-
nomenclature.
indexer.
However,
It takes
of language.
steps
method
of
could
now required,
of chemistry.
The budget
more than
The plethora
If some
and enervating
the literature
alike.
desire
of cooperation.
vagaries
the many costly
to indexer
the obvious
This
of Chemical
prob-
Abstracts
per year.
to perform
of computers to index
Indexing
one considers
in using
obstacle
made
Cooperation
the mere question
without
been
significance
dollars
when
to cooperate
names
have
More Than
Machine
kins Machine
Turning
not only have
each
a formidable
chemical
lem has
1955)
of chemi-
problem
in searching
of naming
Requires
problems
presents
step
use
numbers
certain
Problem
in which
chemists
may sound
is a problem
a worthwhile
the
large
expand
he is a specialist.
last
nomenclature
be found
System
who prepares
synonym-producing-systems
as in other
of chemists
chemical
would
is the same
requirements
requirements
languages
Nomenclature
good
This
Decimal
search.
of the chemist
of foreign
the different
language.
ness
chemist
to distinguish
number.
The Indexer’s
must
a steroid
system.
Thus
search
chemical
indexing
Indexing
is by now no novel
chemical
information
Project
(cf.
Himwich,
Indexing
Project
W.A.
Reports,
began
H. Field,
Johns
idea.
My own investigations
in 1951 as a member
E. Garfield,
Hopkins
University:
of the Johns
J. Whittock,
Baltimore,
on
Hop-
S.V. Larkey,
1951,
1953,
Manipulative
In September
to Chemical
However,
self
with
of 1952, I presented
Abstracts
most
of Printed
Heading
Analytical
an oral report
the American
Aspects
Chemical
aspects
of the problem
Indexes
by Machines,
Lists
by Automatic
of Indexing
on a tentative
method
Society’s
work in the use of computers
manipulative
field : Preparation
tion of Subject
before
of the early
the
Versus
Committee
for scientific
rather
than
the indexes
on CA. Mechanization.
documentation
the analytical
Am. Documentation,
Punched-Card
for preparing
concerned
aspects.
(cf.
6:68-76, 1955
Techniques,
it-
E. Gar-
and Prepara
-
10:1-10,
J. Documentation,
1954).
In private
of the
communication
American
problem
Chemical
of mechanical
literature
in the field
of using
documents
Arthur
retrieval
as a substitute
as,
for example,
Chemical
Nomenclature
chine Searching
book
of organic
menkzatura
lation
as
cellent
which it
in the
work
and Translation;
Organicheskikh
“Vorschläge
treatment
of the
Proc.
Soedinenii,
zur Nomenklatur
subject
was
analysis
that
between
the
of scientific
the linguistic
prob-
has increased.
All workers
to concentrate
on problems
required
to index
scientific
Nomenclature
also
been
and
Vladutz
Intl. Conf.
New York,
nomenclature
general
have
of the need
analysis
chairman
as new criteria.
and British
of Tsukerman
Translation,
chemical
as well
scientists
aspects
intellectual
then
the relationship
awareness
the manipulative
for the costly
criteria
Soviet
than
University,
of mechanical
the general
are now more conscious
Soviet
years
passed,
State
Mechanization,
and the problem
have
are far more significant
Pennsylvania
on CA.
of languages
by the conventional
In recent
Rose,
Committee
As the years
of information
computers
Society
translation
was discussed.
lems of indexing
to Prof.
devoting
Moscow,
(cf. AX.
Tsukerman
1961). Indeed,
published
Verbindungen”,
of nomenclature.
what
in 1955(cf. A.P.
1955) ( simultaneously
Organischer
to these
There
Terentiev,
Language
for Ma-
it now a Soviet
Terentiev
published
Moscow,
problems
& A.P.
Standards on a Common
f or L.
Interscience,
first
more attention
et al, nio-
in German
1955.)
text-
trans-
It is an ex-
are not too many extant
works
to
Chemical
Abstracts,
to the
1954
critique
The work is simply
1957).
CA,
Subject
That
of nomenclature.
is not surprising.
This
chemists
among
others
lications
in this
country
Neither
Index.
which
A.M. Patterson,
interest
teams
are
in the mechanical
puters
for mechanical
working
theory,
on these
subjects.
However,
ing
one must
What
even
editor
though
then
are
to be
Cappell
a
is available
of work for several
and the staffs
Analysis
high-density
storage
of texts,
eminent
of several
on many
of texts
aspects
in an academic
to explore
an academic
sense,
pub-
there
with
index,
too well
I am only
resolution
of all extant
of using
the
problems
computer
the study
facet
of the need
seems
publications
and indexing
for every
to perform
of
can be too much research
of scientific
aware
and
use of com-
involving
scientific
of the computer
now accel-
The possible
volumes
personnel
has
many individuals
question
not that
the growing
qualified
that
problem.
the full potential
of a chemical
possibilities
of this
is not just
of finding
computers
It is not surprising
as one witnesses
a complete
the
a lifetime
L.T.
remarks
can be considered
nomenclature
in Mechanical
of high-speed,
etc.,
difficulties
be tempted
As the
work.
ante,
increasing
of Cahn
of chemical
represented
E. J. Crane,
analysis
analysis
information
the
critique
Interest
simultaneously
language,
and
has
that
of introductory
and abroad.
availability
erated
with comments
work nor
complete
Accelerated
The increasing
this
no really
is a subject
a reprint
train-
of indexing
for such
assist-
now to be “futuristic”.
such
intellectual
analyses?
INTELLECTUAL INDEXING TASKS REQUIRING STUDY
Mechanical
In the first
words.
This
guage.
would
names
premise
one would
avoid
For example,
chemical
basic
place,
the costly
one would
in a text.
of Frome’s
like
experiment
chemical
work
names
and
of indexing
formulas.
to selectively
“read”
ually
a solution.
vices.
finding
However,
or “sense”
Large
the immediate
Device
available
of manually
to index
words
Selective
In the
to have
step
like
These
Reading
then
(cf.
creating
chemical
would
J.
a device
papers
U.S.
Word Recognition
for the
At present,
printed
index
texts,
though
are now going
prospect
of devices
input
reading
in machine
by underscoring
by the computer.
Office,
lan-
pertinent
This
Report
the
was the
No. 17, 1959).
- Copywriter
chemists
is no device
sums
merely
Patent
Chemicus,
there
a computer
be analyzed
Frome,
for mechanically
available
the character
into research
which
must
underscore
which
recognition
on character
can simultaneously
would
pertinent
permit
problem
is grad-
recognition
read
one
de -
the hundreds
of different
typographical
“reading”
type
invented
unit
and
Council
built
by this
recognition
Information,
p. 949).
J. Rabinow,
and
readers
Recognition
nition
this
now
that
or by manually
other
COPYWRITER
fonts
Nevertheless,
(cf.
typographical
to accommodate
purposes
Fourth
might
Z. S, Harris,
(cf.
style
has
Annual
been
Report,
be modified
Intl.
these
a proto-
for use
Conf. on Scientific
used
by publications
typographical
styles
(cf.
1961).
Names
to Structural
obtained
a record
and
p. 30). This machine
can be built
we have
creating
the
the particular
Machines,
on the horizon.
for indexing
1961,
know
Chemical
Assuming
called
for selected
one does
character
Character
is
only
words
Washington,
machines
Since
is still
copying
writer
Resources,
in character
indexed,
now employed
for selectively
on Library
regularly
styles
some
Diagrams
form of machine
in machine
language,
input
what
either
by character
do we wish
to have
recog-
done
with
information?
Aside
from the use
systematically,
diagram
and
that
is made
for calculating
is for communication.
when it is presented
necessary
or too time
consuming,
it must
tain
complex
therefore
molecular
The organic
because
the use
While
be understood
overwhelmingly
that
this
are
prefers
to use
nomenclature
convert
chemical
names
back
glance
impossible
task.
However,
nizingand
understanding
diagrams
Opler
and Waldo have
Baird,
Display
International
this
1959,
p.711-730).
would
believe
In fact,
that
Diagrams
by Machine
chemist
considers
computer
the case.
by machine
Conference
it was
structural
itself
Formulas
on Scientific
the diagrams
like
is
too difficult
happens
name.
in order
sys-
is that
cer-
The chemist
to save
space
to use the computer
a photographic
can be drawn
as Digital
Printing
of names
It is not true either
that
is an accomplished
Information,
drawn
conversion
nor in the sense
diagrams
and M. de Backer,
not
therefore
presentation
either
or trivial
However,
quickly
to
diagrams.
can be drawn
Waldo
diagram.
most
by the Geneva
What actually
Drawing
Structural
any chemical
chemicals
of the structural
of graphic
is frequently
to name
One would
uses
a chemical
type
a semi-systematic
extensively.
name
of Chemical
) (W.H.
either
is by no means
that
This
is far from true in practice.
the chemical
shown
to comprehend
nomenclature
possible
for naming
of the primary
diagram.
of systematic
into structural
the average
by the chemist
one
is able
the use of the structural
continue
59-63, 1958
chemist
assigned
journals
That structural
formulas,
it is theoretically
configurations
At first
diagram
to him in the form of a structural
absolutely
tem,
of the structural
Computer
Chemical
National
Output,
Structures
computer
projection
technique
were
until
cannot
an
of recog“draw”.
In two separate
by a computer.
Academy
by Opler’s
in the sense
a machine
fact.
to diagrams
reports
(A. Opler
and N.
Am. Documentation
Electronically.
10:
Proc.
of Sciences,
Washington,
so realistic,
few chemists
they
were
shown
exactly
4%
how the
has
illusion
a television
drawings
spaces
rate
type
By energizing
daily
dots
raster.
complexity,
the spots,
ally the technique
the
on the IBM 718 output
of amazing
between
of the
was created
newspaper,
as
consequently
is slowed
down,
one
structural
facsimile.
be large
can
increase
computer can
of drawing
find
years
ago,
type
of analysis
concerned
Could
design
with a reasonable
be required
on a 718 display
unknown,
uncoded
chemical
been
how much effort
was
in order
to find this
which
could
(A.
computer
tube,
would
of chemical
would
eye.
the
is basic-
on the front
page
the size
of
If the transmission
the human
eye cannot
we can mechanically
diagram.
be found
then
we must
name
I first
display
procedure
determine
in such
began
for recognizing
be completed
Opler,
programming
it became
quite
not a reasonable
ten man-years
performed
This
see
and
a way that
to pursue
chemical
this
names
? A further
wheth-
ques-
and what
question
in a reasonable
the
naturally
length
of time,
for success.
chemical
had
obtain
by Machine
procedure
of the complex
any type
one can
quickly, and
where
a chemical
to draw the correct
and coded
for displaying
that
by machine,
diagram
instructed
of an experiment
Upon examination
of dots
to the point
Names
for “recognizing”
chance
mated that at least
Chemical
a structural
a computable
would
the
patterns
photos
device
one cannot
drawings.
to the naked
is no question
a procedure
be properly
tion
such
output
by computer.
If we are capable
indeed
of spots,
are line
to transmit
resolution
There
Recognizing
er we can
they
more perceptible
the
of the dots.
diagrams
One can see
and
computer
from a distance,
that
necessary
particular
combination
are examined
the illusion
it is frequently
must
easily detect the presence
print
creating
used in Wirephoto
This
the appropriate
If the drawings
thereby
tube.
Private
be required
after
that
for one person
just
to write
suitable
Communication
to produce
apparent
to reproduce
task
be required
diagram
required
1959).
conventional
to recognize
For this
line
of organic
reason
formulas
Opler
computer
analyses
known
a previously
to accomplish.
the necessary
linguistic
a single
esti-
programs
chemistry
it was ascertained
as e.g.
C-C-N(CH3)-C-C:C:O
To perform
this
the
correct
computers
methods
of displaying
were
of drawing
is
the typographical
chemicals
as e.g.
with the proponents
not available.
structural
sophisticated
This
line-formula.
do not have
communication
routines
as in the case
but also an extremely
nitionroutine,
ing
feat,
further
of all well
(G.M. Dyson
generation
routine,
complicated
flexibility
ciphers
diagrams,
known
also
requires
i.e.
a procedure
by the fact
required
were
this
that
for conventional
explored.
notation
and W.J. Wiswesser,
A search
systems
Private
indicated
most
not only
recog-
for generat-
general
purpose
line-formulas.
Other
of the literature
that such
Communication).
and
computer
Calculating
Subsequently,
stated
above,
mation,
also
Indeed,
seemed
formula
goal,
the
it is
usable
duce
difficult
Even
an output.
routine
has
a widely
method
for every
chemical
of preparing
target
to relate
the
any
a syntactic
analysis
study
recognition
of retrieving
in his
a program
As has
formula
infor-
work.
were
or institution
for generating
been
chemical
laboratory
if the molecular
publication
of finding
a recognition
might
have
still
procedure
of a sentence
of chemical
needs
formula.
In many
available.
which
prepares
molecular
formulas
for research.
procedure
to envision
used
frequently
a diagram
a useful
case
the molecular
to draw
feasibility
by Machine
of calculating
chemist
for a recognition
In the
some
Having
The
desirable
search
output.
problem
and provided
it was
the
be necessary
indexes.
reasonable
While
ever,
not
is not only
that
this is a very practical
molecular
output
formula
information
it would
Formulas
to the possibility
the molecular
it is
situations
I turned
Molecular
which
without
nomenclature,
been
any output
to some
undertaken
would
regard
routine
anyhow.
not produce
to ultimate
that results
usable
some
Howtype
use doe
of
s pro-
from a recognition
value.
limited
the
scope
of the output,
it was then
necessary
to define
and limit
the recognition capabilities.
The Quagmire
Organic
chemical
crossed
by the
million
chemicals
number
of new
c.
people
the various
have
chemicals
already
that
can
a cross-reference
that
are three
(3) Semi-Systematic
been
first
known
ly called
problem
prior
of handling
to the computer
both types
of trivial
chemist
state
thinks
There
of atoms
hundred
of affairs.
names
that could
first
before
an unlimited
are uncovered
entries.
It is necessary
one comes
be
of the several
is almost
thousand
never
every
day.
However,
to differentiate
to the conclusion
with.
names:
(1) Trivial
names,
(2) Systematic
names
and
names.
Trivial
The
quagmire
in the literature.
chemical
to deal
of chemical
or Semi-Trivial
the average
of several
by this
of recognizing
types
a horrible
New combinations
consisting
discouraged
Nomenclature
glance
reported
be made.
is too hopeless
basic
at
Naturally,
file
of the problem
it is a problem
is
chemist.
are unnecessarily
facets
There
ambitious
that
A. maintains
most
that
most
nomenclature
of Chemical
trivial
analysis
names
names
Names
must
and (b) names
be dealt
which
“words-provocateurs”
with
in two parts:
are entirely
(opus
cited,
new.
(a) names
Tsukerman
p. 4).
which
has
are
proper-
From
lem,
The
storage
provement
large
of large
dictionaries
is
recognition
dictionaries
access
rapidly
at
names
a problem
trivial
is no longer
memory
we can expect
units
low cost.
of trade
is indeed
of known
in computers
relatively
the thousands
of trivial
primarily
of machine
random
quite
in analyzing
problem
This
of view
of so-called
involved
the
the point
names
of locating
trade
trivial
names
there
a serious
no prob-
With the im-
to look up items
not underestimate
non-systematic
names
and of no basic
and other
exists
obstacle.
to be able
While I would
and other
essentially
names
the work
for chemicals,
interest
synonyms
in
to the linguist.
by reference
to standard
compendia.
Legislation
Similarly
require
legislative
elimination
terms
new
trade
names
can be dealt
action,
though
it is extremely
of the practice
of naming
“Reichstein’s
like
for people
to use
trivial names
Substance
such
names
is absolutely
the chemical
Many chemicals
can only
insulin,
penicillin,
names
to mean
chemical
or (b) named
veloped,
type,
ever
use
constantly
to the
chemist.
distinction
between
any
not
However,
eliminate
make
the case
of
it very difficult
or semi-
in biochemistry.
known
for many years,
The complete
with thousands
the
the use
names
and particularly
formula.
may one day
in our lifetimes
the use of trivial
or empirical
as was
might
is not completely
is
systematic
several
categories.
are (a) named
prescribed
have tried
chemical
of chemicals
chemists
items
that
to name
primarily
like
rules
as they
it is incredible
CA or I.U.P.A.C.
for the use
of indexers.
almost
to another.
that
becomes
as a derivative
that
chemist
(opus
cited,
for
all chemists
is completely
as Tsukerman
If you are a steroid
de-
allow
nomenclature
meaningless.
the 1957 Report
today
Consequently,
authors
has
nomenclature
stand
to think
very
systems
system
on “systematic”
in a way which
name
androstane
As the Geneva
to rely
Soviet
is used
nomenclature
a chemical
names
to name
terms.
in using
and by such
to observe
“systematic”
to existing
The I.U.P.A.C.
trivial
It is amusing
according
of basic
Indeed,
by I.U.P.A.C.
The word
to get chemists
of lexical
are written
and
list
easy.
of having
is a systematic
to get
tic Names
100% consistency.
made
name.
Systema
into
is not always
The rules
so-called
attempt
of a very
the situation
which
a trivial
standards
This
see
You don’t
in chemistry
by a molecular
which
in the selection
name to one chemist
is
names
but this
it with
faces
fall
commissions
so many exceptions
will
journals.
for many years
also
on the basis
the various
of the latter
Rigid
of many chemicals
be identified
we will
biographically.
and necessary
structure
that
methods.
etc.
Systematic
loosely
doubtful
S” by legislation,
in published
may not be understood
with by non-linguistic
new chemicals
essential
Unfortunately,
structure
not a Solution
one
foreign
the above
(opus
cited)
What is a trivial
then
androstane
p. 73) gives up
cyclopentanophenanthrene,
the
more
systematic
description.
atic name when one could
properly
Once you are convinced,
communication
It is equally
call
as I am, that
is an impossible
names
also
become
names
can be classified
be computed
absurd.
of its
being
other language
is involved).
nomenclature
is tackled as
describe
a language
linguists
quite
might
Bloomfield
personal
undescribed
organic
as
a chemist
of nomenclature
because
been
I feel
organic
lems have increased,
of indexers
at work and
the
(or whatever
If the study
then
you avoid
problem
any other
then
of organic
pitfalls
it seems
such
reason-
language.
To completely
with
complexities
or a range
languages,
I was
stimulated
Thought
such
nomenclature
prompted
in this
direction
and Reality).
as Harris
This
inevitably
as the structural
to inquire
how
by the words
type
of
of associative
focused
linguist
of
my attention
would
treat
a pre-
a general
reluctant
that
to the linguistic
familiarity
to devote
it has
certain
laboratory
with
a great
inherent
chemical
deal
with completely
nomenclature
of time
limitations
clean
systems,
to the complete
for communication
I
mastery
and re-
purposes.
In discussing
existed.
of English
in spite
language.
further
for me to come
acquired
of it. I have
where
problem,
analyze
natural
with linguists
chemical
can or cannot
language,
was not uncritical
trieval
I was
chemical
to recognize,
language.
to be a “language”
problems.
it was not possible
having
of that
or other
meanings
a sub-language
of ordinary
as you might
English
contact
whose
is a linguistic
1933) and Whorf (Language,
of treating
While
than
is
as a chemical
nomenclature
with such
(Language,
on the idea
hands,
different
deal
thought and further
viously
chemical
systematic
linguistically
are due to the failure
If nomenclature
the grammar
versus
as a Language
features
as well
of chemistry
is to write
I assumed
complexities
nomenclature
dichotomy.
the language
Since
that
a linguistic
as the trivial-systematic
able to analyze
nomenclature
many
nomenclature
expressions
Nomenclature
It displays
trivial
for human
morphemes.
with
jargon,
nomenclature
between
one treats
of the participating
in dealing
a specialized
hand,
a system-
of benzonaphthalene.
systematic
distinctions
or non-idiomatic
Treating
a derivative
of a truly
then
If, on the other
cyclopentanophenanthrene
portion
the development
as idiomatic
difficulties
to call
the phenanthrene
absurdity
from the meanings
Most
ridiculous
rather
nomenclature
than
the example
the nomenclature
Chemists
in-between
chemical
chemists
I gave
experts
systems
have
I have
tended
or communicators.
of the change
had to revise
had not followed
meetings.
nomenclature,
Between
the rules
the first
tried
to become
Naturally,
in steroid
systematic
to indicate
nomenclature
nomenclature
forces
is one which
to the facts
could
prob-
more to the requirements
both of these
and the Commission
submission
geared
that as indexing
are constantly
indicates
as they
not overcome
of the 1957 Report
this
a case
already
fact
and its publication
in
in
1960,
there
sonal
were
over
as
experience,
twelve
thousand
I examined
new steroid
that
many
In the face of such a rapid accumulation
would
do other
a good
facts
than
of what
idea
of natural
mittee
follow
cholesterol
Creation
during
This
be folly
of names
chemicals.
wait
years
to expect
for scientific
cannot
is a fact
the three
it is unreasonable
in naming
is and it would
growth.
prepared.
structures
the principle-of-least-effort
linguistic
from per-
in question.
that chemists
Even
the layman has
commissions
to ignore
for the calling
of annual
the
com-
meetings.
On the other
nicate
better
hand,
and
paper
to humans,
I suggest
that
simplifying
the
for teaching
humans
Certain
which
chemists
are
suffixes
steroids
used
ordering
repeated
to use
rules
prefix
2-ol,P-ol-hexan-oic
machine.
so that
chemical
would
it
that elsewhere
nomenclature
nomenclature
also
another
since
in terms
be most
of
rewarding
Most chemists
avoids
acid.
Not
absurdities
which
Coding
Project,
only
one
which
scheme.
equals
is
Thus
acid,
in an alphabetic
this
structures
hy-
less
substituents
In fact,
Literature
which
encounters
hexanediol
the latter
simplification
to learn,
deviation
of filing
complex
function
or more specifically
but which
file these
chemical
of the rules
it is certainly
from
12(3):6(1960)].
has a particular
a hexanediol,
However,
whether
from I.U.P.A.C.‘s
must
The
and
of a new method
Chem.
one obviously
ol. Further
easier
result
or CA
prefixes
criterion.
with
that
intervening
a chemical
is also
so complex
For example,
not care
led to the formulation
rule in naming
di.
could
structures
and to the machine.
of priority.
or by any other
prefixes
are becoming
to the reader
rules
chemical
but not in the I.U.P.A.C.
to the end of parent
chemical
hydroxy
of very complex
Chemicals
sense
I.U.P.A.C.
2,bdihydroxyhexanoic
places
make
Literature
numerical
as e.g.
acid
Steroid
general
an acid function
exanoic
does
for multiple
the
If one files
2,P-dial-h
chemical
them systematically
by complexity,
The system
the
different
of organic
by computer
anyhow.
substituents
order,
2,bhexanedioL.
in entirely
terms
interchangeably,
it is
example,
process
to the established
[cf. E. Garfield,
For
names
in the naming
to name
of adding
regard
alphabetically.
ordering rules
this
of existing
in alphabetical
complex
of organic
noticeable
it necessary
usage
are being
listed
these
are already
finding
without
re-examination
commu-
be designed
it is not at all coincidental
the teaching
chemical
chemists
e
is increasing
phens,
? In fact,
both to help
nomenclature
concerning
of analyzing
to be accelerating
This
practice
questions
Uses
can be designed
why shouldn’t
by machines
a thorough
process
practices
appear
sense.
raised
for Machine
systems
more consistently,
more easily
I have
Nomenclature
if nomenclature
to index
be understood
in this
are
steroid
of new steroids,
Designing
can
chemicals
easier
also
contains
two chemicals
could
might
be called
produce
to analyze
by
Designing
In designing
chemistry
the experiment
which
nomenclature
was
necessary.
large
I chose
The experiment
would
still
by a team
of linguists,
nomenclature,
especially
the cyclics.
more
older
the
and
specific
machine
cyclic
translation
of the
algorithmic
practical
method
of calculating
As
linguistic
analysis,
it became
confident
that
chemists
most
values
was hitherto
Another
clerk
the
will
practical
use
of this
a molecular
formula
Relationship
A by-product
clature
and chemical
completed,
very
of chemical
If he
adequate
for all
is
generic
chlorooctane.
hex
he
must
is
well
interested
become
which
contain
carbon
of elimination
for the
without
resorting
computer,
based
can
be used
manually.
in Table
V on page
to train
to
on the
I am
of the method.
to look at a problem
in the ability
One
in a way
30.
a non-chemist
and Searching
of the relationship
searches.
Hence,
analysis
could
co-occurrence
he need
morpheme,
be used
instead
a search
between
of the chemical
If the chemists
of morphemes
simple.
names
a manual,
name.
the morpheme
containing
benz,
by the
we are forced
from the analysis
as specific
quite
of the
as phen,
to delineate
the simplicity
understanding
in terms
such
been
procedure
When the computer
in any six-carbon-chain-alcohol
be the
the
Nomenclature
that results
as
searches
chemicals
interested
ol, where
generic
in which
names,
search
expression
performed
is found
from a chemical
requirements.
and deal
- to find a procedure
of chemical
is summarized
algorithm
is the clearer
of chemical
percentage
by a process
has
and appreciate
is that
between
study
searching
the parsed
perform
class
of this
that
algorithm
new
Thus,
research
formula
learn
to mechanize
etc.
the feasibility
to include
morphemes
if
formulas.
operations
quickly
sub-divided
domain
and a large
was established
of this
apparent
The complete
oxa,
entire
be expanded
of additional
program
molecular
readily
of trying
difficult.
to calculate
the
could
in the literature
as aza,
results
the
of organic
for chemical
be easily
so as to demonstrate
analysis
to molecular
I simulated
diagrams.
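To make the idea of computing a molecular formula from the morphemes of a name concrete, here is a deliberately small Python sketch. It is not the algorithm of this dissertation (that procedure is the one summarized in Table V); it covers only unbranched acyclic names built from a chain stem, one bond morpheme (an, en, yn), and one ending (e, ol, oic acid), and it ignores locants. The tables `STEMS`, `BOND_HYDROGEN_LOSS`, and `ENDINGS` and the function `molecular_formula` are assumptions introduced for illustration.

```python
import re

# Chain-length morphemes and the number of carbon atoms each contributes.
STEMS = {"meth": 1, "eth": 2, "prop": 3, "but": 4, "pent": 5,
         "hex": 6, "hept": 7, "oct": 8, "non": 9, "dec": 10}
# Hydrogens removed relative to the saturated chain (one multiple bond assumed).
BOND_HYDROGEN_LOSS = {"an": 0, "en": 2, "yn": 4}
# Ending morphemes as (change in hydrogen count, oxygen atoms added).
ENDINGS = {"e": (0, 0), "ol": (0, 1), "oic acid": (-2, 2)}

NAME_PATTERN = re.compile(r"(%s)(an|en|yn)(e|ol|oic acid)" % "|".join(STEMS))


def molecular_formula(name):
    """Compute a formula for a simple acyclic name such as '2-hexanol'."""
    # Locants and punctuation do not change the atom count, so drop them.
    stripped = re.sub(r"[\d,\-]", "", name.lower()).strip()
    match = NAME_PATTERN.fullmatch(stripped)
    if match is None:
        raise ValueError("name is outside the scope of this sketch: %r" % name)
    stem, bond, ending = match.groups()
    carbons = STEMS[stem]
    delta_h, oxygens = ENDINGS[ending]
    # A saturated unbranched chain of n carbons carries 2n + 2 hydrogens.
    hydrogens = 2 * carbons + 2 - BOND_HYDROGEN_LOSS[bond] + delta_h
    parts = [("C", carbons), ("H", hydrogens), ("O", oxygens)]
    return "".join(el + (str(n) if n > 1 else "") for el, n in parts if n > 0)


print(molecular_formula("ethanol"))         # C2H6O
print(molecular_formula("ethanoic acid"))   # C2H4O2
print(molecular_formula("pentanoic acid"))  # C5H10O2
```

A name such as benzoic acid is rejected rather than guessed at, which mirrors the restriction of the experiment to acyclic compounds.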
that
names
by-product
structural
greatest
such
portion
to be drawn
could
programmers,
number
of my experimental
of chemical
and
reported
small
class
complete
The present
co-occurrences
objective
One
of the
chemists,
some
conclusions
as this
be reasonably
by use of a relatively
other
general
chemicals
than 90% of the new compounds
literature
cyclo,
as to allow
acyclic
of tackling,
with
its scope, I had to choose
and limiting
sufficiently
in general.
the Experiment
specifies
for all hexenols
only specify
not the multiplier
name
by the computer
of conventional
hexen
nomenis
to
the type
chemical
becomes
a
and the morpheme ol.
the presence
morpheme
of hex and
as in hexa-
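The generic search described here, in which a request names the morphemes that must co-occur, can be sketched in Python as a subset test over names that have already been parsed. The sketch assumes a hypothetical table `parsed_names` in which each name is stored with its morphemes, and it treats the multiplier hexa as a morpheme distinct from the chain stem hex, so that hexachlorooctane is not retrieved by a search for six-carbon alcohols.

```python
def generic_search(parsed_names, required):
    """Return the names whose morphemes include every required morpheme."""
    required = set(required)
    return sorted(name for name, morphemes in parsed_names.items()
                  if required <= set(morphemes))


# Hypothetical parsed index: each name with its morphemes in order.
parsed_names = {
    "2-hexanol": ["hex", "an", "ol"],
    "3-hexen-1-ol": ["hex", "en", "ol"],
    "hexanoic acid": ["hex", "an", "oic", "acid"],
    "hexachlorooctane": ["hexa", "chlor", "o", "oct", "an", "e"],
    "2-pentanol": ["pent", "an", "ol"],
}

# Any six-carbon-chain alcohol: hex and ol must both be present.
print(generic_search(parsed_names, ["hex", "ol"]))
# ['2-hexanol', '3-hexen-1-ol']

# Narrowing to the hexenols adds the morpheme en.
print(generic_search(parsed_names, ["hex", "en", "ol"]))
# ['3-hexen-1-ol']
```

Because the test runs on parsed morphemes rather than on raw character strings, the search cannot confuse hex with hexa, which a plain substring match would do.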
While
that
the
reason
general
is
computer
included
requirements
methodology
exercise
it
is
short
time.
the
grammers,
Univac
prepared
program.
Dr. J. O’Connor
I program
the
is
though
All of the actual
Univac
computer
While
the study
difficult would
The
limited
compounds
is
that
I have
but reasonable
nomenclature,
at least
formulas,
difficult,
which
of resolving
drawing
This
CA’s
Ring Index
ignore
is not because
it is, but because
by different
would
chemists,
be more of a problem
(Patterson,
Cappell,
Walker,
Pattern
This
sible
leads
to find a method
usually
has
problem
of “reading”
structural
find our raw information
to deal
with the printed
formula
or for naming
in order
to completely
on this
to another
problem
using
to be far from a solution
structural
the chemical
diagram.
recognition.
topological
techniques.
to the problem.
morphemes
were
the cyclic
faced
how
added
with
in the
When we enter
some
grave
the
prob-
we can do in calculating
designations
solutions
of numbering
published
and acyclic
Certainly
with
of positional
of the chemical
before
is itself
to the problem
well
We have
chemical
information
assumed
names,
known
ring sys-
the appearance
of
1960).
for the purpose
The National
Bureau
is an exciting
problem.
all along
However,
a pattern-recognition
This
just
Devices
Whether
systematically,
mechanize
to explore
of the ambiguities
which
systems
diagrams.
in the form of printed
How-
VII to X,
Ring Index, Am. Chem. Soc., Washington,
facet
pro-
steps.
in Tables
is concerned.
to be no immediate
Recognition
logically
most
designations,
compounds
as
The coded
1000 code
obstacles.
we are indeed
of different
for older
to resolve
formulas
appear
to be tested
assistance.
between
no insurmountable
problems
used.
sample.
over the border
the syntactic
would
both for this
A few cyclic
then
been
by two University
structures.
positional
there
done
I was interested
be possible
diagrams
was
approximately
of a random
The specific
an interesting
chemicals
compounds
of molecular
structural
of the
The
work was done in a
have
by flow diagrams
to present
it would
could
terms
excursion
cyclics
We cannot
This
the use,
terns,
found
side
manpower,
cyclics,
molecular
cyclic
as far as calculation
of mechanically
lems in handling
to acyclic
the selection
of this
them
and comprises
in general
to handling
to simplify
results
reason
to thank
coding
factors.
coding
(and for
research.
it is certainly
computers
program
I wish
of this
are the pertinent
both for the input
Univac
described
has been
procedure
exciting
sufficient,
for this
be the transition
to my testing
tapes
sized
to the reader
program
concern,
medium
etc.
general
of vital
not
and T, Angell.
is
to the
is
the actual
operation
may be of interest
approach,
the Unityper
omitted
computer
incidental
and several
However,
research
the basic
computers
Any large
in this
only
to work with a programmer.
as
realm
used
of the program,
I personally
ever,
here),
of particular
relatively
well
program
we would
it is also
would
of Standards
area
that
of calculating
device
Is it pos-
of research,
has
true one
a molecular
be required
been
working
but we appear
Experiments
Preliminary
to acyclic
greatest
experiments
compounds
additional
I.U.P.A.C.
involving
does
not
linguistic
to less
with Cyclic
cyclic
affect
work is
systematized
Compounds
chemicals
the applicability
found,
nomenclature
indicate
that
restricting
of the procedure
to cyclic
not
in expanding
such
as is used
from acyclics
by Chemical
the experiment
structures.
to cyclics,
The
but from
Abstracts.
STRUCTURAL LINGUISTICS APPROACH TO CHEMICAL NOMENCLATURE
I shall
linguistic
outline
-a
on terms
natural course
of prefixes,
In principle,
ing an informant
list.
suffixes,
obtained
t Table
case
that
ments
of that
in which
language.
a morpheme
ously
basic
approach
they
occur.
Allomorphs,
is a linguistic
if one is to determine
of the
apply
(opus
On the other
I compiled
from a non-
the “syllabic”
to follow.
hand,
occurring
linguistic
analysis
statement
compounds.
- not that
study-
by interrogat-
be the most compact
for acyclic
He thinks
the linguist
of chemistry
of structural
should
frequently
uses
of nomenclature
the procedures
which
differs
cited)
the morphemes
objective
are the most
of
The word
it is a preliminary
of morphs.
Linguistic
The
etc.
to determine
of morphemes
these
be a list
radicals,
The ultimate
I is a list
it would
or roots,
of nomenclature
Tsukerman
good knowledge
He can then
from the informant.
analysis
chemist
with
for a linguist
language.
is used to indicate
In that
stems
it is possible
linguistic
the Soviet
for a chemist
of that
the morphology.
primary
how a structural
For example,
approach.
approach
to data
below
Forms
structural
To obtain
and Their
linguist
a description
morphemes,
etc.
class
it is essential
that
any particular
Environments
is to identify
of a language
are determined
that
groups
sequence
forms
by examining
one must
by a process
of occurrences
the environ-
examine
a large
of trial
and error.
be examined
is or is not an occurrence
corpus
Since
simultane-
of a morpheme.
Since the phonemes of English chemical nomenclature were assumed to be the same as those used in normal discourse, it was not considered necessary to study the phonology. (There were very definite problems encountered by chemists in using Geneva nomenclature which could have been avoided if the conference had given some attention to phonetic transcription. Thus, the adoption of yne to differentiate acetylenes from amines became necessary later on. However, the phonetic identity of ene in alkenes and ine in amines is still a problem.) For the problem of translating chemical names to formulas phonology was not investigated. This does not mean that phonological studies are not germane to the problem of analyzing chemical discourse, as indeed they are. Such studies would help uncover ambiguities resulting from suprasegmental morphemes as e.g. in dimethylphenylamine.
In linguistics
ances.
you cannot
Structural
applying
linguistics
this
technique
such
as
frequently
used
radicals].
For
ment
butyZ
such
chloride,
hexylamine,
On this
the
trial,
instead
the
the
reference
butanal,
morphs
meaning
simple
isobutane,
dihexylamine,
testing
butyl
but, yl, hex,
of these
acter
of but.
assume
that but in nembutal
substitute
any
other
nembutal
is a morph
morph
additional
evidence
in the reference
able
to express strong
dence
with
which
the
definitely
fortuitous
the morphemic
tuted
by hex
pentene,
and
as
convictions
etc.
to the morpheme
a preliminary
list
in complementary
to indicate
is not a morph
of but in nembutal
about
that
but
that
it is not the same
Then
the occurrence
We
is a difference
in
say there
is no differchar -
nem a morph.
we check
whether
one would
if there
the informant
rely
as in butane.
now proceed
in most
of its occurrences
not be
on the formal
Thus
We can
for
is not a morph.
ask the informant
Should
with
We
we can
try to make a substitution
and butane.
morph
etc.
to be in error.
the but in nembutal
then
of Tbut in
as to the morphemic
we may also
in nembutaZ
of but in nembu tal.
but is a morpheme
h exene,
{c-c-c-cl
list
evi-
we have
further
dealt
tests
as to
, etc.
to each
referring
In addition
particular
but can replace
single
to the class
it can be substi-
occurrence
pent
in pentane,
of but as the morph
of its occurrences.
In this
fashion
we
in free
variation
or
of morphemes.
may be condensed
distribution.
when
we find that
hexanol
We can now refer
Free
This
tend
there
One may call
We also
butyl-
of but.
in h exane,
pentanol,
establish
meaning
occurrence
that
would
is found
tests
analyses.
of the seg-
hex yzaminohexane,
he will
are discovered.
from the previous
but in nembutal
indicates
character
To confirm
This
that
is a difference
further
for nem and we find we cannot.
but in nembutal and we cannot.
As
and nembutol
with
element.
the occurrence
whether
occurrences
of
as a putative
one finds
allomorph,
an informant
previous
The same will be true of yl. We can now proceed
now the words
reveals
of
linguistic
of a particular
am inohexyldodecanol,
If you ask
occurring
aminobutyZdecano1,
In addition,
to be a potential
etc.
of but in each
etc.
In
for lists
A morph is defined
names
aminobutenol,
ence.
Suppose
as a morph.
of more chemical
environments.
you find the repetition
dibutylamine,
utter-
by the existence
by frequently
names
butylamine,
several
39,5867-5975(1945)
many occurrences
of chemical
butyl
in various
is facilitated
organized
to locate
chloride,
you examine
be examined
Abstracts
occurrences
examination
of yl in hexyl
find
Chemical
list
unless
the procedure
one can classify
butynal,
first
[cf.
a long
as butyl
Further
forms
nomenclature
relatively
butene,
basis
is a morpheme
linguistic
one finds
etc. Preliminarily
(tentative) allomorph.
butane,
Here
in scanning
in names
aminohexane,
that
Abstracts
becomes
example,
a sequence
requires
Chemical
It therefore
that
to chemical
compendia
elements,
decide
Variation
and Complementary
by looking
for allomorphs
In I.U.P.A.C. nomenclature
Distribution
which
occur
there
is
either
no free
variation.
While
I.U.P.A.C.
has eliminated
free variation,
thi and sulf
are allomorphs
of the morpheme
addition,
the terminal
up the
morpheme
morpheme
occurs
e is in complementary
onj. OX always
the allomorph
occurs
morphemes
of co-occurrences
in Table
distribution
was done by finding
with
texts
with
the allomorph
variance.
We do find that
distribution
the conjunctives
o of the preceding
with
sulf.
In
u and y. These
make
distribution,
of the
in complementary
allomorphs,
in Systematic
in organic
I, The morphemes
list of 1600 theoretically
positional
morpheme
whereas
on
e.
Co-occurrences
A list
not eliminated
{Sj. Thi is in complementary
{e, o, ~1. Ox and on are also
(ox,
with
it has
possible
chemical
on this
co-occurrences,
containing
Organic
list
Nomenclature
nomenclature
was compiled
were permuted
with each
199 actual co-occurrences
the co-occurrence
or from person
using
other.
were
the list
of
From the total
determined.
al kno w ledge
This
of actual
occ ur-
of the Philadelphia
Col-
rences.
Lack
lege
of co-occurrence
of Pharmacy
combinations.
and
in Table
to occur
azol,
in acyclic
do in fact
II was compiled
tested
are based,
compounds.
in chemistry,
Then
the alphabetic
Prof.
went
N. Rubin
over the preliminary
not on their
Thus,
occur
first.
by using
We systematically
as an informant.
Many of the eliminations
but their failure
olium,
was further
failure
combinations
but only
list
in Table
to occur
like
in cyclic
list
aza,
of theoretical
in organic
chemistry,
oxa,
ale,
inium,
The classified
list
structures.
III was compiled
this,
to eliminate
repe-
tition.
TABLE I
LIST OF PRIMARY MORPHEMES FOR ACYCLIC ORGANIC CHEMISTRY

 1. a         11. di        21. in        31. on**
 2. acid      12. e*        22. iod       32. ox**
 3. al        13. en        23. it        33. pent
 4. am        14. eth       24. ium       34. sulf***
 5. an        15. fluor     25. meth      35. tetr
 6. at        16. hept      26. nitr      36. thi***
 7. az        17. hex       27. o*        37. tri
 8. brom      18. hydr      28. oct       38. y*
 9. but       19. id        29. oic       39. yl
10. chlor     20. im        30. ol        40. yn

Asterisked items are allomorphs of one of the following morphemes:
* = {o, e, y}    ** = {ox, on}    *** = {sulf, thi}
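The compilation of co-occurrences described in the text (permuting the 40 primary morphemes with each other to obtain 1600 theoretical pairs, then keeping only those that actually occur) can be illustrated with a minimal sketch. The sample corpus, the function names, and the use of a plain substring test are assumptions made for illustration only; in the experiment the elimination relied on published nomenclature texts and on an informant, not on a mechanical substring match.

```python
from itertools import product

# The 40 primary morphemes of Table I (spellings as corrected from the OCR).
PRIMARY_MORPHEMES = [
    "a", "acid", "al", "am", "an", "at", "az", "brom", "but", "chlor",
    "di", "e", "en", "eth", "fluor", "hept", "hex", "hydr", "id", "im",
    "in", "iod", "it", "ium", "meth", "nitr", "o", "oct", "oic", "ol",
    "on", "ox", "pent", "sulf", "tetr", "thi", "tri", "y", "yl", "yn",
]


def theoretical_cooccurrences(morphemes):
    """Return every ordered pair of morphemes written as one string."""
    return [first + second for first, second in product(morphemes, repeat=2)]


def attested_cooccurrences(candidates, corpus):
    """Keep the candidates that appear inside at least one name of the corpus.

    A substring test over-generates (it cannot tell a true morpheme pair
    from an accidental letter sequence), so its output would still need
    review by an informant.
    """
    return sorted({c for c in candidates if any(c in name for name in corpus)})


# Hypothetical four-name corpus, chosen only to keep the example small.
SAMPLE_CORPUS = ["butanol", "hexene", "diaminopropane", "ethylamine"]

candidates = theoretical_cooccurrences(PRIMARY_MORPHEMES)
attested = attested_cooccurrences(candidates, SAMPLE_CORPUS)
print(len(candidates))
print(attested)
```

With a realistic corpus the attested list would be far longer than this toy run suggests, and pairs written with a space (such as oic acid) would need separate handling.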
TABLE
II.
CLASSIFIED
LIST
OF CO-OCCURRENCES
a
at
di
hept
in
hepta
hexa
octa
penta
tetra
oat
s ulfat
dipent
diprop
disul f
dithi
diyl
diyn
hepta
hep tan
hepten
heptyl
heptyn
ylhept
azin
in0
inyl
sulfin
e
hex
iodid
iodo
iodox
acid
acid amide
acid halide
oic acid
al
az
azid
azin
azo
azon
azox
diaz
hydraz
nitraz
brom
alon
anal
enal
thia 1
ynal
am
amat
amid
amin
amon
anam
diam
enam
sulfam
thiam
triam
ylam
an
anal
anam
ane
an0
anoic
butan
ethan
heptan
hexan
methan
octan
propan
at
ate
nitrat
brom id
bromo
ane
ate
ene
ide
ime
ine
ite
one
Yne
en
butan
buten
bu tox
bu tyl
butyn
yl but
buten
enal
enam
ene
en0
enoic
en01
enon
enyl
enyn
ethen
hepten
hexen
iden
octen
penten
propen
thien
trien
ylen
chlorid
chloro
di
dial
diam
diaz
dibrom
di but
dichlor
dien
dieth
difluor
dihept
dihex
diim
diiod
dimeth
dinitr
dioat
dioct
dioic
diol
dion
diox
0x0
OYl
sulfo
thio
Yno
iod
act
but
chlor
0
eth
ethan
ethen
ethox
ethyl
ethyn
yleth
fl uor
f luorid
fluoro
hexa
hexan
hexen
hexyl
hexyn
ylhex
it
ite
nitrit
sulfit
hydr
ium
hydrat
hydraz
hydrid
hydrox
sulfhydr
id
amid
azid
bromid
chlorid
fluorid
hydrid
ide
iden
idin
idium
ido
id ox
idyn
imid
iodid
nitrid
oxid
sulfid
ylid
im
ime
imid
imin
oxim
ylim
in
amin
idium
onium
meth
dimeth
methan
me thox
me thy1
trimeth
nitr
dinitr
nit&
ni traz
ni trid
nitrit
nitro
ni troxo
nitryl
0
an0
at0
azo
bromo
ch loro
en0
fluoro
hydro
in0
iodo
it0
nitro
oat
on0
octan
octen
octyl
octyn
yloct
oic
anoic
azo ic
dioic
enoic
.
OIC acid
onoic
thioic
ynoic
01
an01
diol
en01
01
olic
tetrol
thiol
trio1
ynol
on
amon
anon
azon
dion
enon
onium
onoic
onyl
tetron
thion
trion
ynon
OX
ethox
hydrox
idox
OX
tri
iodox
methox
nitrox
oxid
oxim
0x0
trien
trie th
trihep t
trihex
trime th
trioct
trio1
trion
triox
tripent
triprop
tri thi
triyn
OXY
pentox
propox
triox
pent
dipent
pentan
penten
pentox
pentyl
pentyn
tripent
sulf
disulf
sulfam
sulfhydr
sulfid
sulfin
sulfit
sulfo d
sulfon
tetr
tetra
tetrol
tetron
tetrox
thi
Y
OXY
Yl
butyl
enyl
ethyl
methyl
nitryl
OYl
pentyl
ProPYl
ylam
ylbut
ylen
yleth
ylhept
ylhex
ylid
ylim
ylmeth
yloct
ylpent
YlProP
ylthi
dithi
thial
thien
thio
thioic
thiol
th ion
trithi
ylthi
YnYl
Yn
diyn
ethyn
idyn
ProPYn
triyn
Yne
tri
ynol
ynon
tri but
YnYl
TABLE
1.
7d.
3.
4.
5
6:
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20
21:
22.
23.
24
25:
26.
27.
acid amide
acid halide
amat
amid
amon
anal
anam
ane
anoic
anol
anon
ate
ato
azid
azin
azo
azoic
azon
azox
bromid
bromo
butan
buten
butox
butyl
28. butyn
29. chlorid
30. chloro
31. dial
32. diam
33. diaz
34. dibrom
35. dibut
36. dichlor
37. dien
38. dieth
39. difluor
40. dihept
41. dihex
42. diim
43. diiod
44. dimeth
45. dinitr
46. dioct
47. dioat
48. dioic
49. diol
50. dion
III,
ALPHABETICAL
51. diox
52. dipent
53. diprop
54. disulf
55. dithi
56. diyl
57. diyn
58. enal
59. enam
60. ene
61. eno
62. enoic
63. enol
64. enon
65. enyl
66. enyn
67. ethan
68. ethen
69. ethox
70. ethyl
71. ethyn
72. fluorid
73. fluoro
74. hepta
75. heptan
76. hepten
77. heptyl
78. heptyn
79. hexa
80. hexan
81. hexen
82. hexyl
83. hexyn
84. hydrat
85. hydraz
86. hydrid
87. hydro
88. hydrox
89. ide
90. iden
91. idin
92. idium
93. ido
94. idox
95. idyn
96. ime
97. imid
98. imin
99. ine
100. ino
LIST
OF CO-OCCURRENCES
inyl
ite
101.
102.
103
104:
105
106:
107
108:
109.
110.
Ill*
12.
13.
14.
15.
16.
17,
18.
119.
t20.
L21,
122,
123.
124.
125.
iodid
iodo
iodox
methan
methox
methyl
nitrat
nitraz
nitrid
nitrit
nitro
nitrox
nitryl
oat
octa
octan
octen
octyl
octyn
oic acid
ol
olic
126. one
127. onium
128. ono
129. onoic
130. onyl
131. oxid
132. oxim
133. oxo
134. oxy
135. oyl
136. penta
137. pentan
138. penten
139. pentox
140. pentyl
141. pentyn
142. propan
143. propen
144. propyl
145. propyn
146. propox
147. sulfam
148. sulfat
149. sulfhydr
150. sulfid
1to
151. sulfin
152. sulfit
153. sulfo
154. sulfon
155. tetra
156. tetrol
157. tetron
158. tetrox
159. thial
160. thiam
161. thien
162. thio
163. thioic
164. thiol
165. thion
166. tribut
167. trien
168. trieth
169. trihept
170. trihex
171. trimeth
172. trioct
173. triol
174. trion
175. triox
176. tripent
177. triprop
178. trithi
179. triyn
180. ylam
181. ylbut
182. ylen
183. yleth
184. ylhept
185. ylhex
186. ylid
187. ylim
188. ylmeth
189. yloct
190. ylpent
191. ylprop
192. ylthi
193. ynal
194. yne
195. yno
196. ynoic
197. ynol
198. ynon
199. ynyl
The Problem of Syntactic Analysis in Organic Chemical Nomenclature

In analyzing sentences “syntactic analysis” means: a procedure for recognizing the structure of a particular sentence taken as a string of elements. To state the structure of a string is to assign its words to word classes, to divide the word class sequence into substrings and to say what combinations of substrings are admitted. (Z.S. Harris, H. Hiz et al.: Transformations and Discourse Analysis. Univ. of Penna. Computing Center Annual Report. 1960, p. 43)*
By analogy, syntactic
analys
a particular
ture of
Since
rupted
is of chemical
chemical
chemical
names
are
hyphens,
by spaces,
taken
as a string
often
composed
name
or brackets,
it is necessary
In some
however,
diaminopropylaminobutylhexene
but,
in a name
yl,
essary
like
ene must
hex,
to establish
be parsed
instances
bracketing
di and amino in diaminopropylbutylhexene
quite
are
exene on the other
distinct
of two
signate
kinds:
that
those
operations
In a
will
from
syntactic
was
have
been
interpreted
to mean they
mentioned
case
above
tape.
This
are
perfectly
syntactic
is perfectly
named.
procedures
It is
actual
practice
bracketing
of
significant
b i s-p-m
to simplify
are always
would
that
identify
neither
end before
values
required
would
as e.g.
hexene
chemical
morphemes
which
morpheme
routine
in this
there
rules
morphemes
which
de-
during
assumed
an d Discourse
of Linguistics
study
does
that
it only
in
on the use of brackets
of ambiguity,
the preparation
we would
all means
have
In
of the input
to be
to include
tested
additional
function.
prescribe
can be used
modified
all bracketing
is a possibility
accurately
e t h y Z am i n o p h e n y Z h y d r a z on e it might
*For a more detailed treatment see Transformations
Dept.
tactic Analysis.
University of Pennsylvania,
chemical
nomenclature,
I.U.P.A.C.
when
CA.
of operations
but = C, and those
described
as the parent
nor
between
by 2.)
and I have
recognition
nec-
in bisamino-
bis has a domain
as e.g.
or spaces;
It is further
morphemes
that
chemi-
prop, yZ, amino,
di, amino,
remember
be bracketed
I.U.P.A.C.
the “parent”
should
procedure
use of the rules
In a more ambitious
that
reader
programming.
aminopropylbutyl
legitimate
hyphens
characters.
the morpheme
uninter-
for segmenting
he uses
adjacent
for analyzing
the computer
bis will apply to those
will
case,
The computer
This
when
of alphabetic
as di = multiply
procedure
algorithmically.
of morphemes
on the one hand and bis and aminopropylbutyl
calculational
on them such
this
between
which
part,
the
relationship
di. (The
comprehensive
done
string
In the latter
designate
strings
the morphemes
of its allomorph
performed
be determined
hand.
does
the s truc-
(morphemes).
to set up a procedure
the chemist
as a continuous
the correct
propylbutylh
of elements
of long continuous
cal words into morphemes,
for recognizing
is the procedure
nomenclature
the limits
as substituents
by the substituents.
refer
of bis.
and the implied
Thus,
in the case
to =N-N-(C,H,-NHCH,),
Analysis
1959, p.
Project
1.
No.
In
15. Computable
or
Syn-
(=N-NH-C6H4-NHCH3)2
and parentheses
be no method
(A useful
paren,
for resolving
function
would
In that event
be considered
such
ambiguity
be served
the output
become
except
by pre-editing
if the computer
would
indicate
At the present
essential,
as was
determined
possible
whether
ambiguity.
time
done
appears
in this
his was
In this
there
case
to
experiment,
not followed
the name
by a
would
not
to be well-formed.)
“The successive words of each sentence are compared with the entries in a dictionary, and each is replaced by its dictionary equivalent, i.e., the class to which it belongs (e.g. verb.) The sequence of class names which now represent the sentence is scanned for class cleavage, i.e., cases where the word may belong to two or more classes (noun and verb, for example). A program is needed to decide to which class the word belongs in its grammatical context.”
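The dictionary lookup and the scan for class cleavage described in this quotation carry over to chemical morphemes. The sketch below is only an illustration: the class labels (chain, bonding, functional, multiplier) and the tiny dictionary are assumptions, with vinyl included because the text cites it as contributing both to chain length and to unsaturation.

```python
# Hypothetical morpheme-to-class dictionary; the class names are illustrative.
MORPHEME_CLASSES = {
    "eth": {"chain"},
    "but": {"chain"},
    "hex": {"chain"},
    "an": {"bonding"},
    "en": {"bonding"},
    "yn": {"bonding"},
    "ol": {"functional"},
    "amino": {"functional"},
    "di": {"multiplier"},
    "vinyl": {"chain", "bonding"},  # two-carbon chain plus one double bond
}


def classify(morphemes):
    """Replace each morpheme by the sorted list of classes it may belong to."""
    return [(m, sorted(MORPHEME_CLASSES[m])) for m in morphemes]


def class_cleavage(morphemes):
    """Return the morphemes that belong to two or more classes."""
    return [m for m in morphemes if len(MORPHEME_CLASSES[m]) > 1]


print(classify(["di", "amino", "but", "an"]))
print(class_cleavage(["vinyl", "amino"]))
```

A fuller treatment would follow the flagged morphemes with a context-dependent decision step, which this sketch deliberately leaves out.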
In the case
complex
as
in normal
would
and
pyridyl morpholinyl
This compound
If one seeks
cular formulas,
functional
complete
terminal
named
as a ketone,
each
possible
important
member
to decide
of the longest
chain
as this
without
the
is “one
alteration
to the
name
that
be applied
on chain
meth,
may appear
than,
would
in a name
but, etc.
will
take
all members
of the
clear.
elision
of
It must
be
group.
it will
as such
precedence.
of this
well-
at the end of
sometimes,
be quite
Hence
to assign
selection
can be added
other
length.
both recognize
I
constitute
is the
mole-
not appear
important
as a functional
eth, prop,
if one can array
one could
nomenclature
the choice
would
of which
designation
pI 33)
from them their
be very
sequences
is classified
will be based
series
it would
whose
p. 46). In this case,
which
so that
the par-
class
be necessary
so that
it will
to
be
The principle
which
contribute
of greatest
unsatura-
Bonding
Morphemes
length.
Yet
another
distinction
tion. Consequently,
takes
group
of the homologous
can only
rules,
(see
of morphemes
-
as the parent
is considered
of pyridine.
designed
compar
morpholine
is regarded
of calculating
of I.U.P.A.C.
is the only element
of several
were
to be as
analysis
nicotinoyl
function
classifications
categories,”
principle
classification
which
chain
cited,
the ketonic
to I.U.P.A.C.
“syntactic
not identify
morpholine
for the purposes
if the routine
A functional
opus
we could
not appear
syntactic
as a derivative
of syntactic
them according
of a compound
e.” (R.S. Cahn,
be regarded
would
in a comprehensive
In the first case,
names
hand,
A cardinal
group.
name
Another
identify
names.
also
forms
to appropriate
chemical
prin cipal
the
and produce
of classification
Otherwise
synonyms,
chemical
On the other
morpheme
formed
could
then more elaborate
words
However,
in its grammatical
cit., p. 44)
pyridyl, morpholinyl ketone,
to recognize
to be necessary.
chemical
case,
the word belongs
the problem
to be performed.
as
class
H. et. al opus
discourse,
ketone
In the second
structure,
each
have
Hiz,
nomenclature,
English
operations
ent
Z.S.,
of chemical
able
structure.
(Harris,
on even
greater
is made on the basis
the classification
significance.
based
of selecting
on bonding,
discussed
chain
lengths
below
under
3
To carry the analogy
where
the
morphemes
which determines
ing
class
tributes
chain
chemical
may belong
This
be particularly
length
and/or
both to bonding,
also
functional
context
the class
as well
Thus
class
i.e.,
therefore
be required
must
element
length-two
cleavage
of morphemes
which
t-be common
as to chain
will
assignment
true of expressions
group.
(unsaturation)
exhibits
An algorithm
to two or more classes.
grammatical
will
-
nomenclature
for a particular
cleavage.
as regards
further,
26
cases
exhibit-
be classified
vinyl
both
(CI12=CH-)
conflicting
con-
choices
accord-
ing to the circumstances.
Transformations
The analogy
that chemical
tences.
synonyms
By using
chemicals
between
exhibit
as
diaryZ
Ar f yL Ar3c yl Icetone z
Ry using
chemical
Ar
A
notation
ketones,
and normal
Alongside
sentences
relationships
we obtain
it is
each
similar
where
possible
group
can be completed
the following
Ar,-(C=O)-Ar3,
transformations
names.
Chemistry
+
#+
4Z. Ar , ylcarbonylh- pe
Ar2 oylAr3ene
these
names
transformational
an appropriate
known
tgood
chemical
in Organic
exhibited
transformations
Ar,=iZr,(C=O)
by sen-
for the class
of
and Ar4=Ar3(C=O).
Ar3ylcarbonyLArIene
to generate
of names
to those
by showing
the following
is the corresponding
z.Ar+Aqene
list
of perfectly
structural
diagram.
B
c
D
E
A rl = phen
pyridin
phen
pyridin
XYl
Ar 2 = benz
nicotin
benz
nicotin
dimethylbenz
Ar 3 = naphtha1
morph01
morph01
naphtha1
fluoren
naphthoyl
fluorenecarbonyl
n
Ar4 = naphthoyl
Group
A
phenyl
naphthyl
morpholenecarbonyl
Group B
ketone
pyridinyl*
0
benzoylnaphthalene
! fJJ
morpholyl+
ketone
nicotinoylmorpholene*
phenylcarbonylnaphthalene
pyridinylcarbonylmorpholene
naphthalylcarbonylphenene’
morpholylcarbonylpyridinene*
naphthoylphenene
morpholenecarbonylpyridinene
Ar
named
1’
Ar
2’
by these
the values
Ar
3’
and Ar4 are class
transformation
for each
Ar group.
Group C
phenyl
morpholyl
One can generate
means
that
well-formed
if one specifies
for any diaryl
ketone
names
by specifying
simply
can be
the
morpholylcarbonylphenene
ketone
morpholenecarbonylphenene
benzoylmorpholene
pyridinyl
This
The synonyms
phenylcarbonylmorpholene
morpholyl
*phenene
rules.
designations.
+ benzene
(phen
+ pyridyl
(inyl+
+ morpholinyl
+ benz)
yl)
(yl + inyl)
morpholene
pyridinene
-+ morpholine
-+ pyridine
(ene
(inene
+ ine)
+ ine)
naphthalyl
+ naphthyl
fluorenene
+
fluorene
(alyl
(enene
+ yl)
+ ene)
Group
Group
D
pyridinyl
naphthalyl”
ketone
E
xylyl
nicotinoylnaphthalene
fluorenyl
ketone
dimethylbenzoylfluorenene*
pyridinylcarbonylnaphthalene
xylylcarbonylfluorenene
naphthalylcarbonylpy
fluorenylcarbonylxylene
ridinene
naphthoylpyridinene
morpheme
tained
Ar 1, Ar,,
mentioned
between
etc.
name is not required.
qua non
in the transformation
In Table
investigation
for developing
here
syntactic
a grammatically
and Ar3 in Ar l-(C=O)-Ar3
A thorough
trated.
are
for Arl
by replacing
chemical
sine
fluorenoylxylene
only
a procedure
analysis
of normal
equations.
IV transformations
of the
to complete
correct
description
English
name will
knowledge
chemical
classes
of standardized
of the analogous
discourse
and
be ob-
of a correct
of chemical nomenclature
for the generation
the
Prior
for other
transformations
chemical
are illuswould
be a
nomenclature.
relationship
syntactic
They
that
analysis
exists
of chemical
nomenclature.
TABLE
IV.
TRANSFORMATIONS
IN ORGANIC
Aldehydes
R
bn
pent
an
but
en
Yn
prop
Rb,al
R’
CHEMISTRY
KC1 I=0
formyl
Rb,e
Rb,e
pentane
carboxaldehyde
formyl
pentane
butenal
formyl
butene
butene
propynal
formyl
propyne
propyne
pentanal
Esters
Ryl R %,oate
eth
en
pent
ethyl
pentenoa
hex
an
but
hexyl
butanoate
hept
yn
prop
heptyl
te
propynoate
R’b,oic
acid
Ryl ester
pentenoic
acid
ethyl
butanoic
acid
propynoic
hyde
carboxaldehyde
hexyl
acid
ester
ester
heptyl
ester
Ryl R’b e carboxylate
n
ethyl
pentene
carboxyla
hexyl
butane
carboxylate
heptyl
propyne
carboxylate
R-011
Rb,ol
Rb,e
pent
en
hydroxypentene
pentenol
but
yn
hydroxybutyne
bu tynol
Ethers
Roxy
car boxalde
R COOR
Alcohols
Hydroxy
carboxaldehyde
R bne
R-O-R
’
Ryl R’b yl ether
n
prop
yn
but
propoxy
butyne
propyl
butynyl
ether
hex
an
prop
hexoxy
propane
hexyl
propanyl
ether
eth
en
prop
ethoxy
propene
ethyl
propenyl
ether
(propanyl
= propyl)
te
TABLE
IV.
TRANSFORMATIONS
IN ORGANIC
Acids
Rbnoic
R
h,
R’
prop
en
propenoic
but
yn
butynoic
Rb,
acid
acid
propene
acid
butyne
carboxylic
acid
carboxylic
acid
carboxylic
aminoethane
ethylamine
Prop
aminopropane
propylamine
The Value
Study
encies
linguistic
that
have
analysis
book
enables
one
of chemical
dim e th oxy.
This
renders
realization
nition
that
uncovers
organic
in turn
be made
of organic
undoubtedly
new interesting
prefixes
provides
that
linguistic
analysis
co -occurrence
rule
the occurrence
of the
of organic
by parentheses.
This
methoxy,
of organic
to the machine
in the rules
Linguistic
from the imperfect
as dime&
convention
far from acceptable
be followed
indicates
such
to the inconsist-
development.
will result
in strings
flaw in the accepted
an insight
his torical
natural,
ambiguities
to a readjustment
bring
It would
clear
event
would introduce
(cf.
study
by no means
My remarks
are intended
a completely
one would
encounter
classes.
in the cyclic
introduce
T. ti3. R. S’rn g er,
Yo.
which
the complexities
but also
such
would
a nd
nomenclature
and the human.
nomenclature
make
which
This
would
the job of recog-
account
for the majority
British
in spelling,
Entries,
Chemical
in nomenclature
and many
analysis
way,
e.g.
prepared
to-
produced
not
and American
use of different
Society,
will
be under-
chemical
Searching
that
nomenclature
in this
of new chemicals
in analyzing
by standard
American
ambiguities
linguistic
of the methods
of chemical
of the linguistic
involved
Index
to be an exhaustive
as a summary
study
the scope
as variations
IJ. S, and British
4, Washington:
exhaustive
purports
many additional
Expanding
chemicals
other complexities
in Chemistry
this
should
morpheme
also
that
only by the I.U.P.A.C. nomenclature
vanxes
their
nomenclature
lead
of nomenclature
example,
another
nomenclature.
be required
In that
taken,
etc.
Nomenclature
in advance,
and
for the
much simpler.
analysis
day,
of Chemical
nomenclature’s
For
oxy
all numerical
It should
would
finding
might
stipulate
and
existing
Linguistics
accumulated
to uncover,
m eth,
di,
R-N
of Structural
to the study
nomenclature.
morph emes
and
approach
slowly
acid
Rylamine
Rane
eth
The
(continued)
RCOOII
A rrGnes
Amino
CHEMISTRY
names
nomenclature.
“trivial”
the CT~emical
19Sl.)
This
words,
Literature
Ad-
The Value
of the Study
Nomenclature
In a certain sense,
for testing
linguistic
possible,
procedures
as was done
of the experiment.
located
some
of the
all
place
to study
This
their
the
of the rapid
linguistics,
of ten years
that
natural
take
hundreds
changes
syntactical
knowledge.
events
there
has changed
is now a very rapid
knowledge.
in chemical
in normal
and
and by the time one has
of human
of chemical
one can observe
might
entirety
It is
to the needs
established
true in chemistry,where
accumulation
of parameters.
morphemes
course
experiment
according
on previously
in its
controlled
number
of parameters
effect
the
small
additional
language
is particularly
as a result
a relatively
of the language,
in the language,
of historical
in a period
are
a more strictly
to vary the number
knowledge
necessary
relationships.
the point of view
represents
so as to determine
occurrences
in terminology
there
experiment,
be studied
it becomes
even
change
in this
to Linguistics
of chemistry
since
As one gains
relationshipscan
Otherwise
the domain
of Chemical
Certainly
from
nomenclature
take
discourse.
AN ALGORITHM FOR TRANSLATING CHEMICAL NAMES INTO MOLECULAR FORMULAS
This
names
dissertation
into molecular
To test
restrictions
to facilitate
placed
etc.
that
entire
has
The
pertinent
domainof
been
prepared
dictionary
While
note that these
have
on the input
with
been
successful
procedure
that
for direct
chemical
translation
of chemical
account
are
These
capabilities.
computer.
now
Indeed,
using
possible,
formulas.
the dictionary
certain
were made
no such
only
restrictions
it is one of the more sig-
this
This
in which
restrictions
As will be seen,
translators,
molecular
was designed
procedure,
to train
a non-
could
be done
by complet-
of morphemes,
idioms,
homonyms,
experiment.
for each
contains,
and/or
exp erimental
eliminated
it is
nomenclature,
for this
an experiment
by human
and accurately,
of morphemes
morphemes
and output
is used
of addition
the
procedure,
an electronic
research
quickly
operations
or follow.
that
of this
to calculate,
ing, forthe
first
of this
the procedure
aspects
chemist
validity
experimentation
are necessary when
the
formulas.
the general
were
nificant
reports
multiplication
dictionary
which
for that
of morphemes
for a large
those
morpheme,
percentage
are ordinarily
the calculational
morpheme
is small,
of all known
considered
value
or those
which
it is not without
chemicals.
and the
precede
interest
to
The morphemes
to be non-systematic,
i.e.,
trivial.
The
computer
work.
procedure
could
was
be similarly
tested
on a Univac
programmed
I computer.
from the general
However,
flow diagram
any medium-sized
which
forms
or large
a part of this
TABLE V
AN ALGORITHM FOR TRANSLATING CHEMICAL NAMES TO MOLECULAR FORMULAS
SUMMARY OF OPERATIONS FOR HUMAN TRANSLATION

1. Ignore all locants (1, a, N, etc.)
2. Retain all parens.
3. Replace all morphemes by dictionary value.
4. Resolve ambiguity of any penta-octa occurrences.
5. Place + after all morphemes except multipliers.
6. If there is + at far right of parenthesized term, place it outside right paren. If there is + at far right of name, always drop it.
7. Carry out all multiplications.
8. Calculate hydrogen using hydrogen formula:
   H = 2 + 2nC + nN - nX - 2nDB.

Ambiguity Rules
1. You cannot have two multipliers in a row unless separated by paren.
2. If either of the next two morphemes is alkyl ending, it is not multiplier.
3. If not, it is multiplier.

TABLE VI
INVENTORY OF MORPHEMES USED IN THE EXPERIMENT
Calculation
P
Value
C
DB
Morpheme
Meaning
Example
al
O=(H)
ethana2
1
amide
ONH,
methanamide
1
amido
C=O(NH,)
methanamidopropane
amine
NH
methylam
amino
3
NH 2
ine
-
aminobutanol
propanol
*an
-
propane
bis
2X
bis(aminopropy1)
but
C
*ane
1
1
4
di
2X
*en
=r
but enol
*ene
=
butene
eth
C
hept
C
7
2
4
butane
diaminopropane
2
amine
2
et/Lane
2
h eptane
7
hep ta
7x
h ep taiodohexane
hex
C
h.exene
hexa
6X
hexaiodoheptane
h yd roxy
OH
/Lydroxyethanoic
=
butylidenehydroxyamine
-
=SH
iminobutanol
1
*idene
imino
* hon ding morph em e
6
7
6
6
acid
1
I
TABLE
VI (cont.)
Calculation
Value
DB
Morpheme
Meaning
Example
iodo
I-
iodoethanol
iod oso
IO-
iodosoethane
iod oxy
IO-O-
iodoxyethane
meth
Cl
methane
nitrate
-N=O(O,)
methylnitrate
1
nitrile
j&
methanenitrile
2
ni trilo
NE
ni ttiloethanol
nitro
N=O(O)
nitrobutane
2
1
nitroso
N=O
nitrosobutane
1
1
oate
O=( 0)
ethyl
2
1
act
C 8
octane
octa
8X
octaiodooctane
oic acid
O=(OH)
pentanoic
01
OH
pentanol
one
0
0x0
0
oxopentanoic
oxy
- 0-
methoxypropane
OYl
0=
pantanoyZ
pent
5
pentane
penta
5X
pentachloropentane
peroxide
-0-o
ethylmethyl
Prop
C3
prOpYne
sulfate
-o--so,-0
methyl
sulfino
IISO,-
sul finopropanoic
sulfinyl
-so-
ethylsul
r
2Z
2
pentanoate
8
8
acid
1
1
pentanone
1
acid
1
iodide
5
5
2
peroxide
3
&fate
4
2
acid
1
finyl propane
sulfo
HSO3
sulfopropanoic
sulfonyl
-SO2-
methylsulfonylbutane
tetra
4X
tetraiodobutane
4
tetrakis
4X
tetrakis(ethylamino)
4
thial
S=(H)
ethanethial
thio
-S-
methylthioethane
thiol
-SH
ethanethiol
thione
S=
propane
tri
3X
triiodopropane
tris
3X
tris(aminopropyl)
2
thione
3
amine
3
butylamine
*yl
ethylenediamine
*ylene
=
-
yn
butynal
butyne
yne
*bonding
3
acid
morpheme
2
-
2
Generalized Expression for the Molecular Formula

The result of my investigating the requirements for such an algorithm is the following simple generalized expression for a molecular formula in terms of morphemic analysis of its chemical name,

\[ \text{m.f.} = \sum_{j=1}^{J} p_j M_{i_n} \tag{1} \]

where p_j is the number of occurrences of morpheme M, i is the element (e.g. carbon, oxygen, nitrogen, etc.) and n is the number of occurrences of i in M. For chemicals which contain only carbon and hydrogen (hydrocarbons) this expression becomes

\[ \text{m.f.} = \sum_{j=1}^{J} p_j M_{C_n} + H \tag{2} \]

For chemicals containing the elements carbon, oxygen, nitrogen, sulfur, and halogen the expression can be expanded as follows:

\[ \text{m.f.} = \sum_{j} p_j M_{C_n} + \sum_{j} p_j M_{O_n} + \sum_{j} p_j M_{N_n} + \sum_{j} p_j M_{S_n} + \sum_{j} p_j M_{X_n} + H \tag{3} \]

This expression covers all chemicals tested in this experiment. The value for hydrogen is found from the following special expression

\[ H = 2 + 2\sum_{j} p_j M_{C_n} + \sum_{j} p_j M_{N_n} - \sum_{j} p_j M_{X_n} - 2\sum_{j} p_j M_{DB_n} \tag{4} \]

where M_DB is the class of morphemes which contribute to the value of double bonds and cyclics, as e.g. en, yn, and cyclo. Each of the terms in equation (4) is the summation of all morphemes which contribute to the value of that particular atomic element. Each of the terms can be expanded, as in the case of the terms relating to carbon, as follows:

\[ \sum_{j} p_j M_{C_n} = p_1 M_{C_1} + p_2 M_{C_2} + p_3 M_{C_3} + \cdots + p_\infty M_{C_\infty} \tag{5} \]

where M_{C_1} is the morpheme meth, M_{C_2} is the morpheme eth, and all the other terms are the members of the homologous series.
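The generalized expression amounts to a weighted sum: each morpheme's element counts are multiplied by its number of occurrences and added up, and hydrogen is then obtained from the carbon, nitrogen, halogen and double-bond totals. The following minimal sketch illustrates this under stated assumptions: the miniature dictionary `MORPHEME_VALUES`, the function name, and the output element order (C, H, N, O, S, X) are all hypothetical and are not taken from the original program.

```python
from collections import Counter

# Hypothetical miniature dictionary: what ONE occurrence of a morpheme
# contributes. "DB" counts double-bond equivalents, "X" halogens.
MORPHEME_VALUES = {
    "meth": {"C": 1},
    "eth": {"C": 2},
    "hex": {"C": 6},
    "amin": {"N": 1},
    "nitro": {"N": 1, "O": 2, "DB": 1},
    "ene": {"DB": 1},
    "yl": {}, "o": {}, "an": {}, "e": {},  # bonding morphemes add nothing
}


def molecular_formula(occurrences):
    """occurrences maps morpheme -> number of occurrences (p_j)."""
    totals = Counter()
    for morpheme, p in occurrences.items():
        for element, n in MORPHEME_VALUES[morpheme].items():
            totals[element] += p * n
    # Hydrogen: H = 2 + 2nC + nN - nX - 2nDB
    hydrogen = (2 + 2 * totals["C"] + totals["N"]
                - totals["X"] - 2 * totals["DB"])
    parts = [("C", totals["C"]), ("H", hydrogen), ("N", totals["N"]),
             ("O", totals["O"]), ("S", totals["S"]), ("X", totals["X"])]
    return "".join(f"{sym}{count if count != 1 else ''}"
                   for sym, count in parts if count > 0)


if __name__ == "__main__":
    # hexanitrohexatriene: nitro occurs 6 times, hex once, ene 3 times
    print(molecular_formula({"nitro": 6, "hex": 1, "ene": 3}))
```

The occurrence counts are assumed to be already known here; finding them from the name is the job of the recognition and look-up routines.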
Soffer's Equation for Molecular Formula

This expression is derived in part from Soffer's generalized expression for the molecular formula in terms of cyclic elements of structure (M.D. Soffer, Science, 127:880, 1958).

\[ \rho = 1 + \tfrac{1}{2}(2n_C + n_N - n_H - n_X) \tag{6} \]

However, Soffer's equation does not take into account such elements as oxygen and sulfur, nor does it provide for chemicals such as quaternary ammonium compounds in a direct fashion. Such compounds are covered by the generalized expression. The case of quaternary ammonium compounds is particularly interesting. The morpheme ium is classified by its DB value together with en and yn. All of these are 'bonding' morphemes. This is reasonable, as nitrogen in quaternary ammonium compounds is in a pentavalent state, as opposed to trivalent nitrogen, and thereby contributes the equivalent of a double bond. For this reason, its DB value is minus one.
Only One Language for Chemical Nomenclature

Aside from the utility of the algorithm for calculating molecular formulas, it is important to note that there is in fact only one language of organic chemistry. It is a sub-language of English, but in spite of all the different "systems" of nomenclature available for naming chemicals, there really exists only one language. All of these systems draw on the same basic dictionary of morphemes. Two chemists may name the same chemical differently, resulting in many synonyms, but they will also be able to reconstruct from the name the structural diagram of the chemical and to calculate from it the molecular formula, regardless of the specific system used. Upon cursory examination, 3-pyridyl 2-morpholinyl ketone and 2-(nicotinoyl)morpholine might not appear to be the same chemical, but drawing the structure of each would show that they are synonyms. Since there is only one language, the algorithm works equally well for Chemical Abstracts nomenclature as for I.U.P.A.C. nomenclature.

To illustrate the use of the algorithm for calculating molecular formulas, a series of examples of increasing complexity will be discussed. The first will illustrate a chemical name involving little or no ambiguity and no parenthesized expressions, requiring only the dictionary look-up routine. The second and third will illustrate the use of multipliers and parenthesized terms. The fourth combines the dictionary look-up routine with the use of an ambiguity-resolving routine. It is particularly interesting to observe that much of the complexity of computer programs for this type of analysis is due to the intricate steps required by the machine to recognize and deal with ambiguity. The human translator has no difficulty resolving it quite easily.

First Example

As a first example consider the simple chemical name methylaminoethane, in which there are no positional designations (locants) or multiplier morphemes (coefficients).
Methylaminoethane is analyzed morphemically by the human translator as follows -- meth, yl, amin, o, eth, an, e. Since these are the most frequently occurring morphemes in the language they are memorized in the first few minutes. Each morpheme is assigned the following meaning by reference to the dictionary.

meth = C
yl = +
amin = N
o = +
eth = 2C
an = +
e = +

By the process of simple addition one obtains the partially complete formula as 3C + N. When written in the conventional chemical subscript notation this becomes C3N. It now remains to calculate the hydrogen, H.

H = 2 + 2(3) + 1 - 0 - 2(0) = 9

The complete molecular formula is C3H9N.

Second Example

As a second example let us consider the chemical (3-[diethylamino]propyl)ethyl-3-amino-1,4-butanedioic acid. By a similar morphemic analysis this becomes

(0 + [2(2C) + N] + 3C) + 2C + 0 + N + 0 + 4C + 0 + 2(2O + DB)
(7C + N) + 6C + N + 4O + 2DB = 13C + 2N + 4O + 2DB = C13N2O4 + 2DB

(Here O denotes oxygen and 0 denotes zero.)

and where H = 2 + 2(13) + 2 - 0 - 2(2) = 26. Final m.f. = C13H26N2O4

Third Example

As a third example consider bis(bis[diethylamino]propylamino)butane.

2[2(2[2C] + N) + 3C + N] + 4C + 0
2[2(4C + N) + 3C + N] + 4C
2(8C + 2N + 3C + N) + 4C
16C + 4N + 6C + 2N + 4C = 26C + 6N = C26N6

H = 2 + 2(26) + 6 - 0 - 0 = 60 and the m.f. = C26H60N6

Fourth Example

Finally, consider the example of hexanitrohexatriene.

6(N + 2O + DB) + 6C + 3DB
6N + 12O + 6DB + 6C + 3DB = 6C + 6N + 12O + 9DB = C6N6O12 + 9DB

H = 2 + 2(6) + 6 - 0 - 2(9) = 2 and m.f. = C6H2N6O12
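The human procedure of carrying out the multiplications from the innermost parentheses outward can be sketched recursively. The nested representation below (a dict for a morpheme value, a tuple for "multiplier times group", a list for a sum) is an assumption made for illustration; it is not the storage format used in the experiment.

```python
from collections import Counter


def total(term):
    """Sum element counts of a nested expression.

    term is one of:
      dict            -- value of a single morpheme, e.g. {"C": 2}
      (factor, group) -- a multiplier applied to a parenthesized group
      list            -- a sum of terms
    """
    if isinstance(term, dict):
        return Counter(term)
    if isinstance(term, tuple):
        factor, group = term
        return Counter({k: factor * v for k, v in total(group).items()})
    result = Counter()
    for part in term:
        result.update(total(part))  # Counter.update adds counts
    return result


ETH, PROP, BUT, N = {"C": 2}, {"C": 3}, {"C": 4}, {"N": 1}

# bis(bis[diethylamino]propylamino)butane
# = 2[2(2[2C] + N) + 3C + N] + 4C
expression = [(2, [(2, [(2, [ETH]), N]), PROP, N]), BUT]

if __name__ == "__main__":
    sums = total(expression)
    hydrogen = 2 + 2 * sums["C"] + sums["N"]  # no halogen, no DB here
    print(dict(sums), hydrogen)
```

For the third example this gives 26 carbon and 6 nitrogen atoms, and hence 60 hydrogen atoms.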
In this
particular
potentially
case
ambiguous
the
morphemic
morpheme
analysis
algorithm
account
as nitrohexane
in a compound
such
latter
case
in hexatriene
resolved
hexa
by a simple
ambigous
morpheme
ure
This
pent-act
(as e.g.
which
is
used
Since
morphemes.
this reason,
In an expanded
morpheme
above,
find no match.
name
cedure,
a match
match
for hexa.
taneously
would
we would
residue,
then
and
which,
occurrence
For
the
list,
that
hexa
of course,
is followed
program
may
not
example
has
been
ognition
program,
are several
in hexanitro.
ending
to explain
procedure
for assigning
morpheme
letters
a thirty-one
carbon
would
would
we would
would
value
we would
go through
be called
of hexa
encounter
point
the same
letters
name
as
e, g,, a
would
again
would
three
letters
By a similar
and we would
for, as each
morpheme
having
Simulis always
been
determined,
for nitro,leaving
ambiguity-resolving
profind a
in the dictionary.)
in itrohexa
first.
no match
the last
itrohexa
a match
eight
which
with ohexatri.
against
to the
in the example
Since
with atriene,
continue
match
was
be matched
first.
at which
longest
the differentiation,
of a chemical
Ihex and /Lexa are stored
routine
where
be examined
proced-
values
Consequently,
be continued
pent-act
of the
dictionary
would
are
to the right of
the principal
letters
chain,
ene was reached,
the procedure
ambiguities
how the computer
in the dictionary
more
In the
an, en), a multiplier-
in making
nomenclature
both
(as e.g.
necessary
eight
until
These
like hex (called
to understand
would
in hexatriene.
two of the morphemes
the last
the test
buried
as the
as the final
routine
as the pre-
morpheme,
translator,
the reader
found
In order
The correct
would
the hexane
for the morphemes
the longest
the procedure
in this
of this
human
it is
for tri. Then
move on to exantro,
hexa
While
letters,
ambiguity-resolving
for membership
matter
it is an alkyl
xatriene
be no match
(To simplify
there
in hexani tro is not the same
one and/or
as nitro.
the characters
be found
the hexan
of examining
meaning
be stripped
the pent-act
checked
ceive
would
that
of chemical
would
since
Match
Learns, he has no difficulty
consists
combinationof
There
of the
vious
matching
either
recognition
found
as hentriacont,
for this
such
translator
it was
that
as to whether
entire
hexanitrohexatriene,
be found
of testing
in the
coverage
such
fact
sub-routine
in hexanitro,
the human
straightforward
of the Longest
or for that
hexan
In the experiment,
long,For
consists
morpheme
the
as
is not the multiplier
tri) or a morpheme
differentiates
match
for the
ambiguity-resolving
group in experiment).
the
and Principal
must
hexan
the
not
combinations.
Ambiguity
The
is
be self-evident
dictionary
procedure
by the very common
can apply
chosen
this
which
the algorithm
without
will test
look-up
is by no means
morpheme
with
reference
routine,
to a specific
pent-act
as one can readily
nitro and subsequently
no difficulty
all of the steps
as complex,
ambiguity-
by tri.
without
a computer,
example.
For this
in the program,
including
resolving
per-
the computer
reason,
another
the general
routine,
rec-
a n d formula
-36-
calculation
routine,
In order to test all boxes, and also the nested paren routine, it is necessary to select a chemical with several parenthesized expressions, i.e. nested parentheses.

Fifth Example -- Human Procedure

Consider the chemical 2,3,4-tris[3-bis(dibutylamino)propylamino]pentadiene-1,4. Off computer, the algorithm for this compound results in 3[2(2 C4 + N) + C3 + N] + C5 + 2 DB. Carrying out the simple multiplications and additions gives a partial molecular formula of C62N9 + 2DB. In the calculation, H = 2 + 2(62) + 9 - 2(2) = 131. m.f. = C62H131N9. The structural diagram of this chemical is shown simply to indicate how time-consuming it can be to go through the procedure of drawing such a diagram in order to calculate the molecular formula.

[Structural diagram of 2,3,4-tris[3-bis(dibutylamino)propylamino]pentadiene-1,4]
Fifth Example -- Computer Procedure

The computer procedure for analyzing the same compound is given below. Parenthetical remarks are made to help explain some of the details which would apply to all chemicals. The chemical name is punched on an IBM card or typed directly on a Unityper typewriter. The tape or card is then read into the main computer and immediately placed in a working storage unit. The entire name is processed from right to left, one character at a time. Each character in the name is brought into the computer register one at a time. The character in process at any instant is referred to as the current character.

Ignorability Discovery Not Obvious

The first part of filtering each character consists of the test for 'ignorability', i.e. is it a character which cannot enter any look-up or other operations that will contribute to the molecular formula.
It is worth
no obvious
noting
discovery
that
ignorability
of positional
and had to be carefully
Current
Since the first
able,
It will
in this
therefore
example.
it is
placed
eight
characters
character
this
current
character
Since
storage
in the alpha
signifies
storage;
the end-of-name.
names
was
how ignorable
alpha
tested
storage.
there
then
test
are handled
a paren
Immediately
experiment,
is an e, it is not ignor-
characters
for being
aren’t,we
In this
example
until
and since
it is not
we ask whether
whether
we have
the ampersand
symbol
later
there
are
a sentinel
was used
for
sentinel.
and
eight
Since we have not reached
the end-of-name,
processed
same
in exactly
characters
initiating
the
in the alpha
the dictionary
way.
storage
match
the next
This
(ntadiene),
or look-up
The dictionary
match
ary and will find a match
will be placed
ing.
ntadi.
for ene.
in a special
In this
it is not,
routine
case
it will
Numerical
will
be found
multipliers
to differentiate
in alpha
be no such
morpheme.
is encountered.
Since
it is
not,
or more
dictionary
ambiguity-resolving
case,
until
process
of the alpha
storage
we do have
the alpha
which
storage
is not on the pent-act
storage
will
that
Processing
which
routine.
routine.
again.
current
paren
will
morphemes.
match
we will
area
along
now be asked
remain
storage,
is used
during
list
with
the diction-
the morpheme
ene
with its appropriate
mean-
whether
Since
will be shifted
di and it, too, will then
code digit
Therefore,
the paren
complete
in this
out of working
Routine
the contents
storage
is now shifted
This
will be fLlLZy processed
is taken
it is empty.
to the far right
be stored
leaving
in calculation
the formula
area,
calculation
routine
from adders.
storage
paren
point,
storage
for the morpheme
Fully
The alpha
:At this
and morpheme
have a special
them
continue,
morpheme
The alpha
be DR,.
all of the characters
A match
this
calculation
will
Match
will compare
Since
character
routine.
Dictionary
the
in chemical
Processing
it is then
since
locants
our pentadiene
to discuss
unit called
i.e.
for validity.
Character
the e is not ignorable
in a special
which
checked
in processing
not be possible
terms,
will
be placed
means
that
In this
Since
then
This
Storage
time, when
a match
character
processing
cause
the computer
in a paren
whatever
case
Alpha
penta
storage
it is on the pent-act
will continue
to check
remain
in alpha
list,
it will also
until
there
the first
store
storage
will
right
is empty,
of the alpha
in alpha
storage
fornta,
if alpha
and the contents
characters
remains
is sought
must
storage
be one
and it will go through
go through
the pent-act
Pent-Ott
Since
whether
the
it is
as
pentane,
rent
an alkyl
character
fully
i.e.
the
next
processed,
pylamino
right
and
the
will
have
when
penta
was
processed,
paren
(a left
Tris
area,
paren)
will
than
will go into alpha
paren
is
the procedure
prefix,
it is determined
penta
will
then
determines
whether
be stored
The ambiguity
been
will also
found
at which
be stored
the next
in calculation
has
been
area
resolved.
be processed
ignorable
the
formula
is done
then
paren
the
character
with
a complete
Cur-
hyphen,
comma,
prime,
and colon.
will
done
of ignorable
with the previous
as a morpheme,
will
be initiated.
in which
until
paren
the next
will
Determining
whether
compares
will
always
a
each
of the integers
character
be
in the calculation
the computer
can be processed
right
the
When the end-of-nane
consisting
of an ignorable
con-
storage,
the hyphen
and placed
ignored.
characters
which
will
as a full word, since
with dibutylamino,
characters
sub-routine,
of the name
area
be encountered
routine
The presence
of a portion
also
similarly
remaining
and placed
Processing
prop will be found in alpha
be processed
calculation
list
be matched.
calculation
was
continue
by a dictionary
current
the ending
This
will
and
and am/in0 will be matched
point
in the
to be empty.
The procedure
storage
and yL will also
isencountered,
left
character
anending,
ending,
as a multiplier.
the 3 and the second
is encountered,
beginningor
ene is such
Bis will
character
an alkyl
di is a numerical
is encountered.
then
not
will continue
paren
store
as will
Since
is
Routine
is now resumed.
alpha
ignorable,
Since
Processing
area.
penta
as a Cs rather
characters
in calculation
until
ending.
processing
The eight
tinlre
prefix.
a numerical
is
would
preceding
morpheme
morpheme
Ambiguity-Resolving
1 to 8,
indicate
independently
the
of the other
portions,
The calculation area of the computer storage now contains the following sixteen calculation words. Each morpheme is followed by its appropriate additive or multiplicative calculation value. Note that parens are also stored as separate words.

Word          Value
 1. tris      3( )
 2. (         ---
 3. bis       2( )
 4. (         ---
 5. di        2( )
 6. but       C4
 7. yl        ---
 8. amino     N
 9. )         ---
10. prop      C3
11. yl        ---
12. amino     N
13. )         ---
14. penta     C5
15. di        2( )
16. ene       DB

Computer Calculation Routine

The first portion of the calculation routine disposes of parentheses by counting left and right parens and multiplying. The calculation now starts. The first word, tris, is examined and it is determined that it is a multiplier. It is then determined whether the next word is a left paren, which it is. The computer now starts counting left and right parens. We again ask if the next word is a paren. Since it is not a paren, but a multiplier, bis, the multiplication operation is not yet carried out.
Since the next word is a left paren, the registers for left and right parens will increase the count of left parens to two. The next word is di(5). Since it is not a paren, but is a multiplier, it, too, will be ignored. The next word is but. Since it is not a numerical prefix, it will be multiplied by the multiplier tris. The same process will occur for yl(7) and amino(8). When the first right paren(9) is encountered, the counts of left and right parens are not yet equal, so the next word is examined. Prop(10), yl(11), and amino(12) will also be multiplied by tris(1), as they are all contained within the parens following tris. When the last right paren(13) is reached, the registers will be equal. This will signal the computer to return to the word immediately following the first left paren, which is bis(3). Since it is a multiplier followed by a paren, a similar process will result in multiplying but(6), yl(7), and amino(8) by two. However, prop(10), yl(11), and amino(12) will not be multiplied by two, as they are not covered by the parens following bis. Before proceeding, the computer now looks at di(5). Since it is a multiplier which is not followed by a paren, the word immediately following, but(6), is multiplied by two. Penta(14) is not a multiplier, since it has been found to be C5 during the ambiguity-resolving routine. Di(15) is a multiplier, so the morpheme ene(16) will be multiplied by two. Since it is the last word in the calculation area, the paren and multiplication operations are completed. All parens and multipliers are now replaced by zeros. The calculation words are as follows:

Word          Value
 1. tris      000
 2. (         000
 3. bis       000
 4. (         000
 5. di        000
 6. but       C4 x 3 x 2 x 2 = C48
 7. yl        000
 8. amino     N x 3 x 2 = N6
 9. )         000
10. prop      C3 x 3 = C9
11. yl        000
12. amino     N x 3 = N3
13. )         000
14. penta     C5
15. di        000
16. ene       DB x 2 = DB2

The computer then adds the contents of these registers. The totals are taken and give a partial molecular formula of C62N9DB2. The hydrogen calculation is performed using the equation H = 2 + 2nC + nN - nX - 2nDB. In this case it is 131, giving a final formula of C62H131N9. The computer will now test, for experimental purposes, whether the calculated formula agrees with the formula which had been manually calculated and stored with the original data.

Calculation of Hydrogen

The calculation of hydrogen is by no means a simple, straightforward or obvious task. There are two ways of solving the problem. There is the method described in this dissertation, which
derives
form
reader
an idea
chemical
brute
Soffer’s
of the
diagram
force
formula
there
difficulties
shown
method
and
is
the
of using
on page
36, where
the
131 hydrogen
brute forcemethodof
calculating
hydrogen
in terms
time.
The
assignment
morphemes
which
IIowever,
bond
one
must
one
formation,
of t he
CII,.
not
here
must
For example,
most
commonly
It is invariably
merely
the
replace
more
associated
a knowledge
to form an ester.
The
linguist
is
prompted
mapping
bon
to the molecular
atom
the simple
known
tion2Nc
even
that
t!le
+2 where
method
reaction
chemical
i s
value
of
is implied
ethyl
group,
we assign
one
the values
rules
dealing
as between
ethanoate
is
CII,CII(CIIJCII,
very sophisticated
(opus
number
of cyclic
Soffer’s
formula
to me immediately
directly,
It was
representing
but it does
to the chemist,
solving
cited)
(ethyl
based
on
with names
an acid
acetate)
and
is not
observed
it could
that
configuration.
Ilowever,
hydrogen
will always
three
the problem
a more sophisticated
in a chemical
the
i.e.
meth
contribute
in a saturated
atoms,
to include
contribute
atoms
of hydrogen
hydrocarbon
in the
one car-
of hydrogen.
It is
calculation,
It i$
is derived
the average
value
chemist
from the relae
has no system.
for hydrogen,
configurations
that
atoms
of carbon
provides
in checking
not always
how one resolves
of hydrogen
is the number
a morpheme,
to ‘cyclic’
morpheme.
N,
one has the right
as net/~,. The morpheme
such
formula,
number
of q uickly
Soffer
to ask whether
of morphemes
not at all obvious,
buting
For example,
a chemical
This
methyZpropane
the methyl
when
task.
of chemical
and yZ.
a structure
in complexity
of
The formula for this chemical is C4H8O2, since an ester is formed from the combination of an alcohol and an acid with the elimination of a molecule of water.
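The ester relationship can be checked with simple bookkeeping: add the formulas of the alcohol and the acid, then remove one molecule of water. The sketch below uses ethanol and ethanoic acid, which combine to give ethyl ethanoate (ethyl acetate); the function name and the dict representation of a formula are assumptions for illustration only.

```python
from collections import Counter


def ester_formula(alcohol, acid):
    """Formula of the ester formed from an alcohol and an acid,
    i.e. alcohol + acid - H2O. Formulas are dicts of element counts."""
    combined = Counter(alcohol) + Counter(acid)
    combined.subtract({"H": 2, "O": 1})  # eliminate one molecule of water
    return {element: n for element, n in combined.items() if n > 0}


ETHANOL = {"C": 2, "H": 6, "O": 1}        # C2H6O
ETHANOIC_ACID = {"C": 2, "H": 4, "O": 2}  # C2H4O2

if __name__ == "__main__":
    print(ester_formula(ETHANOL, ETHANOIC_ACID))
```

This is the kind of chemical knowledge a simple summation of morpheme values would have to encode if hydrogen were assigned directly to each morpheme.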
semantic
atic
where
In adding
of morphemes,
incorporate
increases
list
of C2115+C211b+07.
from the combination
well
as oate,
methyl
the
trivial
the rules
However,
giving
a dictionary
we must
The problem
nucleus
small
and has a calculational
and propane.
the
uneconomic
to be a rather
is CII ,CII,CII,.
for methyl
then
chemistry
propane
but also
into consideration
that
To duplicate
to a relatively
appear
the
to the complex
errors.
of two morphemes
in organic
If in compiling
formation.
such
and take
on the propane
with the morpheme,
morphemes
theaddition
atoms
mapping)
To give
It is obvious
is not only difficult
consists
hand,
values
isobutane.
of chemical
containing
an alcohol
of the
one of the hydrogen
called
terms
On the other
CII,.
summation
commonly
usually
occurring
he is referred
to generate
would at first glance,
the term methyZ
by chemists.
is discussed.
is likely
(semantic
from morphology
used
method,
example
atoms
values
for hydrogen,
depart
fifth
by an algorithm
of computational
alsoaccounts
procedure
the conventional
of counting
of computer
standard
accuracy
of the
and used
terms
a group of allomorphs,
Then
its
of several
be modified
each
and
it was
statement
molecular
thousand
in Soffer’s
particularly
formula.
formulas.
as a means
possible
of the relationship
I had previously
IIowever,
for obtaining
equation
could
the “bonding”
to simplify
the
between
the
used
it did not OCCUI
the hydrogen
be replaced
morphemes
syntactic
rules
value
by a term
contrifor each
The value
is determined
The
morpheme
is
well
gen. However,
in which
identifying
an alkane
The
such
as
one
adds
in which
structure
ane.
its molecular
morpheme
an example
the ‘parent’
ending
You calculate
of heptane.
for chemicals
hydrogen
Thus,
is
ficient
The
in a chemical
chemical
formula
atom
name,
4hydroxy-3-
by starting
an oxygen
hydrogen
with
and
C7H1 6,
subtracts
two
oxygen
double
affects
the dictionary
along
Having
nitro
one must analyze
rules
for distinguishing
there
is
of the
hydrogen
with the remaining
the
semantic
between
of numerical
For example,
pent
or it may be a multiplier
verb,
one
total
a class
analysis
hydrogen
atom,
substituent,
This
procedure
of one functional
group
for hydro-
one’s
dictionary
calculations
by the syntactic
the correct
each
It is necessary
is NO,.
the
recorded
all other
is conditioned
name and for generating
in which,
bond
By confining
after
one additional
the hydroxyl
substitution
cases,
calculated
of ‘meaning’
is chosen,
atoms,
in adding
in more complex
and
contains
to morphemes
are performed,
a more
is possible.
the chemical
to know that
of one 11 atom
excluded
the assignment
the new approach
Nydroxy
with straightforward
down
procedure
for analyzing
atom of oxygen.
by the loss
it breaks
straightforward
same
of first
frequently
you add another
is balanced
works quite
tactic
method
if one considers
atoms.
but this
kanes,
is more apparent
from heptane.
formula
For hydroxy
two
is
derived
molecular
hydrogen
approach
by the previous
parent
heptanone
the
of this
the
to learn
attachments
content
value
homonyms
may be additive,
but are separated
text,
in which,
formula.
are employed
However,
more closely.
it is one nitrogen
bond.
once
It is not suf-
atom attached
The presence
It therefore
morpheme,
unfortunately
This
one finds
word,
it is
occur
as in a chain
by an intervening
that
is by a double
which
as in pentachlorohexane.
of English
a little
that
must
to
of this
be recorded
in
information.
of each
which
molecular
of the molecule.
semantic
prefixes
morpheme
methods
are ambiguous
carbon
situation
two words
necessary
in systematic
of five
e.g.
further
nomenclature.
Thus,
with morphemes
for al-
atoms,
is not unlike
in a sentence
a split
infinitive,
to provide
suchas
pentatriene
the problem
which
of syn-
are part of the
TABLE VII. GENERAL PROGRAM FOR CHEMICAL NAME RECOGNITION

[Flow chart. Recoverable box labels: BEGIN HERE: NAME PLACED IN WORKING STORE; TEST CURRENT CHARACTER (IGNORABLE / NOT IGNORABLE / END-OF-NAME); IS ALPHA STORE EMPTY?; FULLY PROCESS ALPHA; FULLY PROCESS ALPHA STORAGE AND START FORMULA CALCULATION ROUTINE; IS CURRENT CHARACTER PAREN?; RESET, FULLY PROCESS ALPHA; PAREN STORED IN CALCULATION AND MORPHEME AREA; STORE CURRENT CHARACTER IN ALPHA STORAGE; COUNT OF LETTERS IN ALPHA STORE; DOES ALPHA STORE CONTAIN EIGHT?; PROCESS ALPHA STORAGE; DICTIONARY LOOK-UP ROUTINE (MATCHED PORTION / UNMATCHED PORTION); CALCULATION AND MORPHEME STORAGE; MOVE CONTENTS OF ALPHA TO RIGHT.]
TABLE VII
CHEMICAL NOMENCLATURE ANALYSIS
COMPUTER CALCULATION OF MOLECULAR FORMULAS
GENERAL PROGRAM DESCRIPTION

1. Chemical name is typed on Unityper. Only chemical names of sixty characters or less are allowed, to simplify programming.
2. Chemical name is placed in working storage. Sixty characters are stored in five Univac words. (Left-to-right in the name is equivalent to top-to-bottom in storage.)
3. Processing starts with the character on the far right (bottom) of the name.
4. Determine whether current character is ignorable, i.e., a dash (hyphen), comma, number, prime, or delta (space).
5. If it is, then ignore it and, if alpha storage contains something, fully process* alpha storage.
6. If it is not ignorable, determine if current character is a paren.
7. If current character is a paren, fully process* alpha storage. In this case, also store the paren in the calculation area of storage.
8. If it is not a paren, then store it in alpha storage.
9. Continue processing characters which are not ignorable, counting characters as they are stored in alpha storage.
10. When the count of the number of letters in alpha storage is eight, look up the contents of the alpha storage in the morpheme dictionary, i.e. find a "match". This might be the entire eight letters or just two letters, but no less than two letters, otherwise there is an error signal.
11. When the match is found, enter the calculation value of the morpheme and the morpheme itself in the next available location of the calculation and morpheme storage area. At the same time change the count of letters in alpha storage. Move any remaining unmatched portion to the far right of alpha storage.
12. Keep on examining more characters in name until there are again eight characters in alpha storage.
13. Continue the process until all characters have been placed in storage. When end-of-name signal (&) is encountered, computer will know that processing of all characters has been completed.

* Fully process alpha storage means that whatever alphabetic characters are in alpha storage will be examined so as to identify the morpheme(s) involved. After finding a match for the right end of alpha storage the remainder will be shifted and similarly processed. However, "fully" process cannot be used if alpha storage processing was started as a result of 8 count.
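The character-filtering part of the general program (right-to-left scan, ignorable characters, parens stored as separate words) can be sketched as follows. This is a simplified illustration and not the Univac program: it omits the eight-character limit on alpha storage and the dictionary look-up, and the names `IGNORABLE` and `scan` are hypothetical. Square brackets are treated as parens.

```python
# Ignorable characters per the description: hyphen, comma, number,
# prime, and space.
IGNORABLE = set("-,'\u2019 ") | set("0123456789")
LEFT_PARENS = "(["
RIGHT_PARENS = ")]"


def scan(name):
    """Scan a chemical name from right to left, one character at a time.

    Returns the alphabetic runs and parens, in left-to-right order.
    An ignorable character or a paren closes the current run, which
    corresponds to "fully processing" alpha storage.
    """
    words = []
    alpha = ""  # letters collected so far, built up right to left
    for current in reversed(name):
        is_paren = current in LEFT_PARENS or current in RIGHT_PARENS
        if current in IGNORABLE or is_paren:
            if alpha:
                words.append(alpha)
                alpha = ""
            if is_paren:
                words.append("(" if current in LEFT_PARENS else ")")
        else:
            alpha = current.lower() + alpha
    if alpha:  # end of name reached: flush what is left
        words.append(alpha)
    words.reverse()
    return words


if __name__ == "__main__":
    print(scan("2,3,4-tris[3-bis(dibutylamino)propylamino]pentadiene-1,4"))
```

Each alphabetic run returned here would still have to be split into morphemes by the dictionary look-up routine.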
TABLE VIII. CHEMICAL NOMENCLATURE ANALYSIS: DICTIONARY LOOK-UP ROUTINE

[Flow chart. Recoverable box labels: BEGIN HERE; MATCH / NO MATCH; IS MORPHEME ON PENT-OCT LIST?; PENT-OCT AMBIGUITY SUBROUTINE; STORE MORPHEME AND ITS CALCULATION; REMOVE LEFT-MOST CHARACTER AND PLACE IN TEMPORARY STORAGE; LETTER IN ALPHA STORAGE; ERROR SIGNAL; ALPHA STORE EMPTY; ANY CHARACTERS IN TEMPORARY STORAGE?; RETURN CHARACTERS IN TEMPORARY STORAGE TO ALPHA STORE, TO FAR RIGHT; END-OF-NAME.]
TABLE VIII
CHEMICAL NOMENCLATURE ANALYSIS
DICTIONARY LOOK-UP ROUTINE

1. The longest morpheme match is looked for first. The characters in alpha storage are compared to all morphemes in dictionary.
2. If no match is found, left-most character is dropped and matching process begins again. In this way thial is matched before al.
3. Before matched morpheme is stored in calculation area, it is checked for being in pent-oct group of homonyms.
4. If the morpheme is found to be in pent-oct group, then a special ambiguity-resolving routine is initiated.
5. If morpheme is not pent-oct, it is placed in calculation and morpheme storage.
6. If alpha store is not empty, it is shifted to far right and process begins over.
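The look-up routine can be sketched in a few lines: try the whole contents of alpha storage, drop the left-most character on failure, and repeat, so that the longest morpheme at the right end is found first. The tiny `DICTIONARY`, the function names, and the use of an exception for the error signal are assumptions for illustration; the pent-oct check of steps 3 to 5 is left out.

```python
# Hypothetical miniature dictionary of morphemes.
DICTIONARY = {"meth", "eth", "pent", "penta", "an", "ane", "en", "ene",
              "di", "yl", "al", "thial"}


def look_up(alpha, dictionary=DICTIONARY):
    """Return (remainder, morpheme) for the longest dictionary entry
    found at the right end of alpha. A match must have at least two
    letters, otherwise an error is signalled."""
    for start in range(len(alpha) - 1):
        candidate = alpha[start:]  # left-most characters dropped so far
        if candidate in dictionary:
            return alpha[:start], candidate
    raise ValueError(f"error signal: no morpheme matches {alpha!r}")


def fully_process(alpha, dictionary=DICTIONARY):
    """Repeat the look-up on the remainder until alpha is empty.
    Morphemes are returned in left-to-right order."""
    morphemes = []
    while alpha:
        alpha, morpheme = look_up(alpha, dictionary)
        morphemes.append(morpheme)
    morphemes.reverse()
    return morphemes


if __name__ == "__main__":
    print(fully_process("ethanethial"))
    print(fully_process("pentadiene"))
```

Because the longest right-end match wins, ethanethial yields thial rather than al, and methane yields meth rather than eth.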
TABLE IX. PENT-OCT AMBIGUITY RESOLVING ROUTINE

[Flow chart. Recoverable box labels: IS MORPHEME FOLLOWING PENT-OCT MORPHEME AN ALKYL ENDING?; IS MORPHEME FOLLOWING PENT-OCT MORPHEME A NUMERICAL MULTIPLIER?; IS SECOND MORPHEME FOLLOWING PENT-OCT MORPHEME AN ALKYL ENDING?; STORE ALKYL MORPHEME IN CALCULATION STORAGE AREA; STORE NUMERICAL PREFIX CALCULATION WORD FOR CALCULATION STORAGE; PROCESSING CHARACTER PRECEDING CURRENT MORPHEME.]
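The decision made by the pent-oct routine can be sketched as a small function. The rule coded here is a reading of the flow chart labels and of the pentadiene and hexanitrohexatriene examples, so it should be taken as an assumption rather than as the exact routine: a pent-oct morpheme is treated as a carbon chain if it is followed by an alkyl ending, or by a numerical multiplier that is itself followed by an alkyl ending, and as a multiplier otherwise. The three word lists are hypothetical.

```python
PENT_OCT = {"penta": 5, "hexa": 6, "hepta": 7, "octa": 8}
ALKYL_ENDINGS = {"an", "ane", "en", "ene", "yn", "yne", "yl"}
MULTIPLIERS = {"di", "tri", "tetra"}


def resolve_pent_oct(morphemes, index):
    """Classify morphemes[index], a pent-oct morpheme, as 'alkyl'
    (a chain of 5 to 8 carbons) or 'multiplier'.

    morphemes is the left-to-right list of morphemes of the name; the
    morphemes to the right of index have already been matched, since
    the name is processed from right to left.
    """
    following = morphemes[index + 1:index + 3]
    if following and following[0] in ALKYL_ENDINGS:
        return "alkyl"
    if (len(following) == 2 and following[0] in MULTIPLIERS
            and following[1] in ALKYL_ENDINGS):
        return "alkyl"
    return "multiplier"


if __name__ == "__main__":
    name = ["hexa", "nitro", "hexa", "tri", "ene"]
    print(resolve_pent_oct(name, 0), resolve_pent_oct(name, 2))
```

In hexanitrohexatriene the first hexa is followed by nitro and so multiplies, while the second is followed by tri and ene and so stands for six carbons.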
TABLE X. MOLECULAR FORMULA CALCULATION ROUTINE

[Flow chart. Recoverable box labels: EXAMINE NEXT WORD IN CALCULATION AREA; IS NEXT WORD NUMERICAL PREFIX?; MULTIPLY NEXT WORD BY NUMERICAL PREFIX; IS NEXT WORD IN CALCULATION AREA PAREN?; START COUNTING LEFT AND RIGHT PARENS; ADD 1 TO LEFT OR RIGHT PAREN TOTAL; MULTIPLY; LEFT PARENS EQUAL RIGHT PAREN?; GO TO WORD IMMEDIATELY FOLLOWING FIRST LEFT PAREN; LAST CALCULATION WORD?; REPLACE ALL PARENS BY ZERO; ADD ALL WORDS IN CALCULATION AREA; PERFORM HYDROGEN CALCULATION.]
TABLE X
MOLECULAR FORMULA CALCULATION ROUTINE

1. Find a word in the calculation area which is a multiplier.
2. If the next word is not a left paren, multiply it by the multiplier, and continue looking for other multipliers.
3. If the next word is a left paren, start keeping count of left and right parens.
4. Examine every successive word.
5. If it is not a paren, multiply it by the multiplier, unless it is a numerical prefix.
6. If it is a paren, add one to the left or right paren total.
7. End process as soon as the left and right paren totals are equal.
8. Now go back to the word immediately following the first left paren and continue looking for multipliers.
9. So process each word and every word in the calculation area.
10. Replace all multipliers and parens in the calculation area by zero.
11. Now add all words in the calculation area. This gives the preliminary formula total. Now calculate H.

Each morpheme calculation value is stored as a twelve character word in which one character position is for sign, one character represents iodine, and each successive pair of numbers represents the total of double bonds, oxygen, nitrogen, sulfur and carbon. Hence, nitro is stored as +0/01/02/01/00/00, iodo as +1/00/00/00/00/00, and methyl as +0/00/00/00/00/01. Since iodine requires one character, no formula containing more than nine iodine atoms can be tested with the Univac equipment. When the final formula calculation is made, the DB value is replaced by the hydrogen count, i.e. the calculated H value.
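The calculation routine can be sketched on a flat list of calculation words, with parens disposed of by counting left and right parens rather than by recursion. The word representation (an int for a multiplier, "(" and ")" for parens, a dict for a morpheme value) and the function name are assumptions for illustration; the twelve-character Univac word format is not modelled.

```python
from collections import Counter


def calculate(words):
    """Apply multipliers to a flat list of calculation words and add.

    A multiplier followed by a left paren multiplies every morpheme
    value up to the balancing right paren; otherwise it multiplies
    only the next word. Multipliers and parens contribute nothing to
    the final sum.
    """
    values = [Counter(w) if isinstance(w, dict) else None for w in words]

    def scale(position, factor):
        values[position] = Counter(
            {k: factor * v for k, v in values[position].items()})

    for i, word in enumerate(words):
        if not isinstance(word, int) or i + 1 >= len(words):
            continue
        if words[i + 1] == "(":
            left = right = 0
            for j in range(i + 1, len(words)):
                if words[j] == "(":
                    left += 1
                elif words[j] == ")":
                    right += 1
                elif values[j] is not None:  # multipliers are skipped
                    scale(j, word)
                if left == right:
                    break
        elif values[i + 1] is not None:
            scale(i + 1, word)

    totals = Counter()
    for value in values:
        if value is not None:
            totals.update(value)
    return totals


# tris ( bis ( di but yl amino ) prop yl amino ) penta di ene
WORDS = [3, "(", 2, "(", 2, {"C": 4}, {}, {"N": 1}, ")",
         {"C": 3}, {}, {"N": 1}, ")", {"C": 5}, 2, {"DB": 1}]

if __name__ == "__main__":
    t = calculate(WORDS)
    h = 2 + 2 * t["C"] + t["N"] - t["X"] - 2 * t["DB"]
    print(dict(t), h)
```

For the fifth example this reproduces the totals C62, N9 and DB2, from which the hydrogen formula gives 131.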
Sampling
The
manual
translation
deliberately
selected
deliberately
chosen
as presenting
names
hexanitrohexadiene.
viously
morpheme
origins
test
found
is
that
structure
is
as it would
sample
chemical
at
the
to keep
test
morphemes
give
wrong
of chemicals.
randomly
selected.
pent-act
ambiguous
of parens.
The fifth
top
of each
until
required
than
did
by CA.
quite
should
acyclic,
that
amino)e
the
count.
would
37
38
No.
were
For example,
morphemes
example
correctly
inconsistently.
Names
t h an e. C.A.
same
the
as e.g.
shown
pre-
require
a name
in the
was
should
elimination
The following
which
illustrate
name
of chemicals
some
GHMNZS
to check
Abstracts.
to cover
which
of the samples
He was
contain
Chemical
Name
2-hydroxy-2-methyl-butyronitrile
2-amino-4-(methyl-thio)butyronitrile
4-methoxy-2-buten-1-ol
40
c5HllNo3
methylnitro-2-butanol
52
Cs H,N,
3,3’-iminodipropionitrile
56
C6H1,N0,
57
%H 1ZN204
2,3-dimethyl-2,3-dinitrobutane
58
c61%202s
3-(propylthio)-propanoic
I59
C,H 13NO
4-dimethylamino-2-butanone
71
C&NO
3,4-dimethyl-2-oxopentenenitrile
73
c7H1002
3-ethylidene-2,4-pentanedione
80
C,H,
1-dimethylamino-2-methyl-3-buten-2-ol
80
C,%sNO,
6-amino-4-oxo-hexanoic
acid
acid
[bis(2-hydroxyethyl)amino]-2-propanone
on
cyclic
located.
C5H100z
a
the first
from the morphemes
39
,NO
from the
in the dictionary
to Chemical
Formula
C,H,NO
attempt
each
routine.
a clerk
be obtained
that
was omitted
that would
on the
by I.U.P.A.C.
the principle
by asking
Index
could
of hundreds
which
The
of good
not be based
all the morphemes
was done
Subject
principle
ambiguity-resolving
tested
1958
located
program
a special
This
This
for hydrogen.
A basic
violates
substituent.
In any computer
obtained.
column
the
calculate
correspond.
I had deliberately
was
not
Molecular
CA Page
Some of these
is 1,1'-(ethylenediimino)di-2-propanol
hydrogen
satisfied
This
rather
nesting
indexes
name
C, A, ‘s, imino
scanning
list.
were
that contained
represent
of chemicals
the
and
example
consistently
I was
random
in CA.
l,Z-bis(Z-hydroxy-propyl
including
When
on dozens
Others
complex
to be used
A C.A,
should
all systems,
told
found
was
of compounds,
morpheme
several
involved
imino
nomenclature
tested
of these.
chemicals
nomenclature
was
difficulties.
included
Others
is typical
Certain
procedure
Method
.
- 51 -
I have
come
intentionally
under
for the
the
The
purview
person
the
interesting
was
testing,
to the list
An additional
continuous
numbers
was
as is shown
of my instructions,
the yro and ion were
had to be eliminated
not anticipated
of ignorable
it was
though
apparently
not on the list
from the computer
in preparing
was
taken
from the
in the cross-reference
the computer
Merck
index.
Index.
This
testing,
gave
was
53
hylaminopentane
2-aminoethanesulfonic
C,H,NS
2-aminoethanethiol
315
4
a-amino-a-iminoe
c2 H6N2
666
GH
were
still
Further,
be easily
added
test
acid
thane
20amino-4-methylthiobutanoic
1 ,NO,S
made that
the use of
Name
2-amino-S-diet
C,H,NO,S
difficult
a
of page
ic acid
1013
a further
by taking
4-aminobutano
C9H22N-2
As
have
1-aminobutane
c, H gN02
it could
It would
quite a scattering
738
though
but not the
done
Chemical
C4H1,N
178
acid.
difficult
of morphemes
program.
This
M. F.
S ulfonic
do not
below:
Page
Selections
they
characters.
sample
of chemicals
that
37, 38 and 52 even
N-[(2-[1,1-dimethyl-2-propynyloxy]ethoxy)methyl]diethylamine.
following:
random
series
that
for pages
In spite
to note
example
the
compounds
experiment.
sample
use of the N as a locant
to be added
the
of this
taking
One other
human
listed
could
alpha(a)
as a locant
to the list
of the algorithm
not be handled
was
of ignorable
several
by the experimental
not anticipated
acid
dictionary
as e.g.
for the computer
program,
characters.
chemists
were
asked
to coin
names
that might
be
to handle.
A few
of these
were
3,7-dimethyl-2,6-octadienal,
1,4-bis(methanesulfonoxy)butane.
The latter
3,3'-dithiobis(2-aminopropanoic
is not covered
by the experimental
acid)
and
tested
on
dictionary.
Debugging
As
a further
the Univac
puter program
compounds
eliminated.
test
I, The so-called
which
were
of the procedure,
debugging
had to be traced
well
selected,
as
fifty
procedure
meticulously.
the
computer
of the randomly
uncovered
Apparently
went
into
selected
dozens
of coding
the first
loops
compounds
twelve
on each
were
mistakes
in the com-
deliberately
one until
chosen
the bugs
were
- 52 -
A More Significant
It is obviously
sive sampling.
it fails,
However,
he will
in order
example
were
surprised
belief
the
importance
and
that
indexer.
Most
morphemes
more
is quite
could
chemists
quickly
without
Every
structures.
for him to calculate
steroid
particularly
interesting
of so-called
During
aware
the
two
latter
term
accurate.
in what
endings
the
grammers
generic
entire
chemist
about
that
draw a dia-
a dozen
chemists
This
a large
the steroid
All
confirms
my
can be helpful
enough
but the DB rules.
no matter
When
first
It certainly
memorized
knows
work,
draw a diagram.
algorithm.
device.
have
chemists
he would
a brief
anything
formulas
are
accurate.
This
how complicated
to
number
includes
nucleus
is C
the name
of
the
so it
17
may be.
I shall
this
call
ene,
the
pert-act
ambiguity-resolving
bonding
morphemes
that
DB values
the
yne are the same
between
which,
surprisingly
enough,
difficulties
an alkyl group
the more precise
the chemist
were
and an alkyl or alkane
name
radicals.
To use these
terms
to describe
alk-yl
does
not associate
bonding
all of the suffixes
morphemes.
and ium since
routine,
they
it would
of which
two double
contribute
are really
completely
some
of this
this
from memory,
may
bonds.
not
Thus
ending.
alkanes.
morpheme
to describe
group
in keeping
is quite
morphemes.
only
be obvious,
The chemist
the DB value
for nitrilo,
in-
grouped
are morphemes
the operation
need
of
is the
of the chemical.
of bonding
the chemist
pro-
Alkyl
suffixes
class
of
Neither
can now be properly
to the DB value
be more accurate
morphemes
algorithm
The members
that
definition
has no generic
encountered
the generic
bond as being
i.e.,
has been
have
for all morphemes,
of a triple
research
of this investigation,
as the alkyl
to learn
of this
hydrocarbons
idene,
yne,
Morphemes
Open chain
for hydrocarbon
Furthermore,
Bonding
or suffixes for
of the difference
ane,
think
learning
product
course
such as
esting
teaching
that
I showed
to such
it will
36.
One
term.
Invariably
already
steroid
The
a class
formula.
it, that
by more exten-
of imino.
the claim
reason,
useful
be proven
he has used
For this
be reduced
organic
quite
cyclic
simple
page
*See
graduate
once
to verify
the
can be an extremely
to calculate
complex
to calculate
of the algorithm
as in the case
test
formulas.
calculation
the algorithm
intuitively,
an informal
molecular
each
the
knows
validity
in the nomenclature
was
asked
that
the absolute
the chemist
to calculate
five
that
find ambiguities
Of further
gram
important
Test
learn
does
cyano,
In the
in terms
of
It is interthe correct
not usually
diazo,
and
DB,.
Conclusions
There are a number of important conclusions that can be drawn from this work. I believe there can be no doubt that one can calculate molecular formulas from chemical nomenclature. The
- 53 -
grammatical
work that
remains
large
that a group of chemists
able
length
be taken
of time.
in the
analysis
membership
cedure
diagrams.
ber
Further,
each
could
In fact,
arrangements
the appropriate
morpholine,
there
by nitrogen
This
oxygen
It would also
terminology
a method
of other
for translating
the work involved
nomenclature,
and
standing
ably
linguistic
linguist
alike.
written
On the
of organic
other
study
purposes, we cannot
large
dozens
50% of the effort
part
of the
work
of chemical
able
to describe
deal
with
could
that
that
names,
the
the problem
help
text
than
goes
is
into
done
both
simply
pro-
structural
small
num-
would
In the
case
of nicotinoyl
but the replacement
of carbon
considerable
analyze
arise
programming
the chemical
of that language,
languages,
such
arrive
at
as Russian,
by transliteration
indexing
in reading
new and old.
normal
article
precision
of Russian
new
and
will
chemical
grammatically
reaped
a useful
cannot
for indexing
problem
involves
be re-
of chemical
More
names,
the recognition
a corresponding
an
and other
in chemistry.
a very poor harvest
without
a suit-
chemical
the linguist
is in the analysis
have
that
be much too formidable
texts
documents
for
and under-
interesting
nomenclature
of the indexing
chemical
promise
in teaching
offers
of chemical
discourse
chemistry
great
It is not improbable
of chemistry
of analyzing
We will
holds
postulate
field
a 50% resolution
of a chemical
of synonymy,
the
that
if we are to find methods
better
to its
difficulties
one could
per se.
If the problems
expect
a machine
requires
greater
classification
I suspect
to foresee
are a relatively
For certain
it can mean
discourse.
then
according
chemico-linguistics
of normal
by linguistic analysis,
of syntactic
names.
that
solved
the type
classified
atoms.
procedures,
easily.
i. e,
chemistry
a reasonthat could
be true of displaying
there
to be so
analyses.
I believe
for the
Certainly
rings
hand,
model
obstacle.
chemical
would
as one can already,
to chemistry,
even
possible
the transformations
quite
For the chemist,
and
grammar
by similar
of the chemical
approach
nomenclature
structures.
than
most
great
i.e.
the hexagon,
by the grammatical
terminology
not be very
understand
The
chemist
should
are many shortcuts
for substituent
and by establishing
chemical
it within
The programming
and morpholine
that
completing
to include
in that
configuration,
to conclude
languages
The same
to the diagrams
be aided
be safe
it is quite
in chemistry.
in the pyridine
work would
is expanded
sophisticated,
required
is only one topological
and/or
ingenuity.
names.
is less
additions
then
not appear
there
as a part-of-speech,
categories,
but it does
any difficulty
work
is described
problem
large,
is at their disposal,
standardized
the latter
quite
have
grammatical
grammatical
generate
of topological
in making
If the
is still
would
if a computer
morpheme
in various
which
and linguists
analyses.
in which
to be completed
A
of
if we are
ability
to
- 54 -
TABLE XI
RANDOM SAMPLE OF CHEMICALS TESTED ON COMPUTER PROGRAM
butane = C4H10
2-aminoethanol = C2H7NO
1,4-bis(ethylamino)butane = C8H20N2
1,3,5-heptatriene = C7H10
1,2,3,4,5,6,7-heptaiodooctane = C8H11I7
2-[(3-aminopropyl)ethylamino]ethanol = C7H18N2O
1,4-bis[bis(3-diethylaminopropyl)amino]butane = C32H72N6
1-methylsulfonylbutane = C5H12O2S
1-propanethiol = C3H8S
3-pentanethione = C5H10S
2-methylpropanedioic acid = C4H6O4
1,6-dinitrohexane = C6H12N2O4
2,5-diaminohexanedioic acid = C6H12N2O4
4-oxo-heptanedioic acid = C7H10O5
1-dimethylamino-2-methyl-3-buten-2-ol = C7H15NO
1-ethylamino-2-methyl-3-buten-2-ol = C7H15NO
2-(hydroxymethyl)-2-propyl-1,3-propanediol = C7H16O3
3-ethyl-2-amino-3-pentanol = C7H17NO
2-propenyl-2-pentenoic acid = C8H12O2
2-ethylidene-3-methyl-1,5-pentanediol = C8H16O2
2-nitro-2-pentyl-1,3-propanediol = C8H17NO4
3-diethylamino-2-methyl-1-propanol = C8H19NO
5,5'-oxybis(2-methyl-2-pentanol) = C12H26O3
1,1-diiodo-2-nitro-1-pentene = C5H7I2NO2
pentyl nitrate = C5H11NO3
2,5-diiodo-hexanedinitrile = C6H6I2N2
1-aminobutane = C4H11N
4-aminobutanoic acid = C4H9NO2
2-amino-1-butanol = C4H11NO
2-amino-5-diethylaminopentane = C9H22N2
2-aminoethanethiol = C2H7NS
2-amino-5-hydroxypentanoic acid = C5H11NO3
1-amino-1-iminoethane = C2H6N2
3-methyl-1-pentyn+ol = C6H10O
bis(hydroxyethyl)amine = C4H11NO2
2-amino-4-methylthiobutanoic acid = C5H11NO2S
1,3-butadiene = C4H6
%hydroxy-6-octene-2,4-diynenitrile = C8H5NO
2,2-bis(hydroxymethyl)-1,3-propanediol = C5H12O4
- 55-
TABLE XI (cont.)
2-ethoxyethanol = C4H10O2
dimethylenimine = C2H5N
3,7-dimethyl-2,6-octadienal = C10H16O
3,3'-dithiobis-(2-aminopropanoic acid) = C6H12N2O4S2
1-iodo-3-iodomethyl-5-methylheptane = C9H18I2
1,4-diiodo-2-(methylbutyl)-butane = C9H18I2
methylsulfonylethane = C3H8O2S
(2-hydroxyethyl)-4-(hydroxymethyl)-3-propyl-1,6-hexanediol = C12H26O4
methylthiopropane = C4H10S
1-(propylsulfinyl)butane = C7H16OS
ethylsulfinylethane = C4H10OS
ethanamide = C2H5NO
butanediamide = C4H8N2O2
methylthiopropane = C4H10S
nitrosobutane = C4H9NO
ethylmethyl peroxide = C3H8O2
iodosoethane = C2H5IO
iodoxypropane = C3H7IO2
sulfopropanoic acid = C3H6O5S
ethanethial = C2H4S
trichloromethane = CHCl3
tetranitromethane = CN4O8
1-nitro-1,1,2,2,2-pentachloroethane = C2Cl5NO2
hexachloroethane = C2Cl6
1,1,2-trichloroethane = C2H3Cl3
octachloropropane = C3Cl8
propylnitrate = C3H7NO3
1,1,1,3,3-pentachloro-2,3-dinitro-2-trichloromethylpropane = C4Cl8N2O4
4-chloro-3-butyn-1-ol = C4H5ClO
2-methyl-1,2-dinitropropane = C4H8N2O4
1,+diamino-2-butanone = C4H10N2O
1,3,3,4,4-pentachloro-2-methylcyclobutene = C5H3Cl5
penten-4-ynol = C5H6O
4,5,5-trichloro-4-pentenylamine = C5H8Cl3N
dimethylcyclopropane = C5H10
chloropentanol = C5H11ClO
pentachlorobenzene = C6HCl5
2-aminochloronitrophenol = C6H5ClN2O3
benzenediol = C6H6O2
2,6-dichlorocyclohexanone = C6H8Cl2O
1,1,1-trichloromethyl-3-penten-2-ol = C6H9Cl3O
1-cyclopentene-1-methanol = C6H10O
chlorocyclohexane = C6H11Cl
2-amino-4-butyl-6-nitrophenol = C10H14N2O3
(1-cyclohexen-1-yl)butanone = C10H16O
2-phenyl-2,4,6-cycloheptatrien-1-one = C13H10O
7-(2,4,5-trichlorophenoxy)heptanoic acid = C13H15Cl3O3
ethyl 2-cyano-5-phenyl-2,4-pentadienoate = C14H13NO2
7-(4-dimethylaminophenyl)-2,4,6-heptatrienenitrile = C15H16N2
1,3-bis(aminophenoxy)-2-propanol = C15H18N2O3
4,6-dibutyl-3-methyl-2,4-dinitro-2,5-cyclohexadien-1-one = C15H22N2O5
2,4-dimethyl-3-octyl-2-cyclopenten-1-one = C15H26O
2-nitro-4-phenyl-1-naphthol = C16H11NO3
1-(nitrophenyl)-4-phenyl-2-butene-1,4-dione = C16H11NO4
2-(naphthyl)-2-cyclohexen-1-one = C16H14O
diphenyl-3-butynol = C16H14O
- 56 -
APPENDIX
I .U.P.A.C.
A Summary
use both
In summarizing
emphasis
has
been
quire
even
necessary
have
no difficulty
more
complex
chemist
molecules,
who comes
explanations
rules
at least
acyclic
for naming
reasonable
I suspect
he would
to the subject
that
with certain
will
nomenclature
help
and accurate
of names.
structural
would
diagrams.
in this
derivatives,
for simple
experiment.
should
chemicals.
less
For the
difficulty
on his knowledge
re-
This
a non-chemist
no more and possibly
based
of the mean-
The latter
covered
names
preferences
for the non-chemist.
in the recognition
hydrocarbons
hydrocarbon
have
Names
of understanding
chain
of its
of Systematic
for the generation
straight
perfectly
Example
organic
to the extent
the instructions
creating
a Detailed
of I.U.P.A.C.
than complete
for the
by following
Therefore,
principles
of chemistry
Nomenclature
and Generation
on didactive
rather
Chemical
Including
in Recognition
placed
a knowledge
is not
of Principles
the basic
ing of chemicalnames,
Organic
than
the
of chemistry.
Punctuation
Commas are used between numerals which refer to identical operations as in 1,2,3-tribromohexane. Colons are used between groups of numerals for similar but distinct operations as in 1,2:5,6-diisopropylidenesorbitol.
Numerals
Numerals should be placed immediately in front of the syllables to which they refer as e.g. 2-bromohexane rather than bromo-2-hexane; hexan-2-ol rather than hexanol-2. However, in the U.S. 2-hexanol would be rather commonly encountered. The numeral designates the number of the carbon atom in the longest chain of carbon atoms contained in the chemical. The variations in the use of numerals are legion and present a major obstacle to comprehension, especially in French and German literature. In some systems Greek letters are used instead of numerals. Amino acids are popularly numbered this way as in β-hydroxyalanine, which is also 2-amino-3-hydroxypropanoic acid, also known as serine.
- 57 -
Order of Substituents
Prefixes are arranged in alphabetical order. The atoms and groups are alphabetized first and the multiplying prefixes are then inserted as in: 2-bromo-1-chloro-hexane; 4-ethyl-3-methyl-hexane; and 1,1,1-trifluoro-3,3-dimethylpentane.
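The ordering rule can be illustrated with a minimal Python sketch. It assumes the substituent prefixes have already been separated from the parent name; the function names and the short list of multiplying prefixes are hypothetical and are not taken from the experimental program.

```python
import re

# Multiplying prefixes skipped when alphabetizing (illustrative subset only).
MULTIPLYING = ("di", "tri", "tetra")


def alphabetizing_key(prefix):
    """Return the part of a substituent prefix that decides its position."""
    # Drop a leading locant such as "1,1,1-" or "3,3'-".
    name = re.sub(r"^[\d,']+-", "", prefix)
    # Drop one leading multiplying prefix, if any (naive string test).
    for multiplier in MULTIPLYING:
        if name.startswith(multiplier) and len(name) > len(multiplier):
            return name[len(multiplier):]
    return name


def order_substituents(prefixes):
    """Sort prefixes alphabetically, ignoring locants and multipliers."""
    return sorted(prefixes, key=alphabetizing_key)


print("-".join(order_substituents(["3,3-dimethyl", "1,1,1-trifluoro"])))
```

The string test is deliberately naive: a substituent whose own name happens to begin with di or tri would be mis-sorted, so a fuller treatment would consult a morpheme dictionary instead.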
Elision
The terminal letter e is elided before a vowel of an organic suffix, but not in cases where the following letter is a consonant. Propane becomes propanone; hexan-2-one becomes hexane-2,3-dione.
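A short Python sketch of the elision rule follows. It is an illustration only: the function name is hypothetical, and treating y as a vowel is an assumption of the sketch rather than something stated above.

```python
VOWELS = set("aeiouy")  # counting y as a vowel is an assumption of this sketch


def attach_suffix(parent, suffix):
    """Join a parent name and a suffix, eliding a terminal e before a vowel."""
    # Skip any locants ("2,3-") to find the first letter of the suffix.
    first_letter = next((ch for ch in suffix if ch.isalpha()), "")
    if parent.endswith("e") and first_letter in VOWELS:
        parent = parent[:-1]
    # A suffix that starts with a locant is joined with a hyphen.
    separator = "-" if suffix[:1].isdigit() else ""
    return parent + separator + suffix


print(attach_suffix("propane", "one"))
print(attach_suffix("hexane", "2,3-dione"))
```

In the second call the suffix begins with the consonant d, so the terminal e of hexane is kept.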
Hyphens
These
ical
are used between
Society
derivative,
vative,
uses
hyphens
also
thiaeompound,
amide
two identical
group,
when
letters
partial
m,ethoxy-group,
In English,
to avoid
names
words
as in tetra-amino.
end in a voiced
but not after
chemical
ambiguity
a consonant
vowel
in such
The Chem-
or y as e.g.
places
in amino-
as methyl
deri-
do not end in vowels.
Parentheses
Parens are used when necessary to clarify the limits of operations but not unnecessarily. If a string of morphemes in parens is preceded by a numeral, this means that the entire parenthesized expression is a substituent of a parent structure. For example, 1-(4-amino-2-ethylphenyl)-butanol means that the entire expression 4-amino-2-ethylphenyl is attached to the first atom which is contained in a four carbon (but) chain. The word mono is understood but rarely used. However, if the chemical were 1,2 bis-(4-amino-2-ethylphenyl) butanol the entire parenthesized expression would be multiplied by two, i.e. it occurs at both the first and second carbon atoms in the chain C-C-C-C.
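The effect of a multiplier such as bis on a parenthesized expression can be sketched in Python. This is a simplified illustration under stated assumptions: the name is taken to be already segmented into tokens, only carbon and nitrogen atoms are counted, and the `VALUES` table, the token list and the function names are hypothetical rather than the coding used on the Univac.

```python
from collections import Counter

# Hypothetical atom counts for a few morphemes (carbon and nitrogen only).
VALUES = {
    "but": Counter(C=4),
    "eth": Counter(C=2),
    "phen": Counter(C=6),
    "amino": Counter(N=1),
}
MULTIPLIERS = {"di": 2, "bis": 2, "tri": 3}


def scaled(counts, factor):
    """Multiply every atom count by the given factor."""
    return Counter({atom: n * factor for atom, n in counts.items()})


def total_counts(tokens):
    """Sum morpheme values; a multiplier applies to the next word or paren group."""
    stack = [Counter()]   # one running total per open paren level
    group_factors = []    # multiplier waiting for each open paren
    pending = 1           # multiplier waiting for the next word or group
    for token in tokens:
        if token in MULTIPLIERS:
            pending = MULTIPLIERS[token]
        elif token == "(":
            stack.append(Counter())
            group_factors.append(pending)
            pending = 1
        elif token == ")":
            if not group_factors:
                raise ValueError("right paren without a matching left paren")
            inner = stack.pop()
            stack[-1].update(scaled(inner, group_factors.pop()))
        else:
            stack[-1].update(scaled(VALUES[token], pending))
            pending = 1
    if group_factors:
        raise ValueError("left paren without a matching right paren")
    return stack[0]


# bis(amino ethyl phenyl) butanol, with locants and the ol suffix left out
print(total_counts(["bis", "(", "amino", "eth", "phen", ")", "but"]))
```

For the bis example the parenthesized group contributes eight carbons and one nitrogen, which are doubled before the four carbons of the but chain are added.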
Terminology
Parent is a very ambiguous term in chemical nomenclature, especially when one considers the rules for deciding which morpheme in a name shall be considered the parent morpheme. However, no matter what name is chosen, the parent morpheme refers to that group of atoms to which all other groups of atoms in the molecule are attached. Thus benzene is the parent in nitrobenzene and ethane is the parent in ethanol. This term no longer has any chemical significance which, at one time, was true when chemicals were named on the basis of the shortest chain length.
- 58 -
Group or radical.
of these
Most
sisting
are
Any group of atoms
single
morphemes
of the morphemes
meth
commonly
but some
Function
A functional
hydroxy
group
group is a group
gives
alcoholic
atom which
is doubly
is frequently
difficult
thev
to make,
but is an important
Thereare
viously
several
discussed
one
types
trivial
hydrogen
atom such
atom
as sulfur
of names
is replaced
con-
A ketone
owes
between
its properties
what
in naming
of a chemical.
to the oxygen
is functional
chemicals
The
and what_ is not
regardless
of how
are respectively
another,
involves
tion of double
between
There
functional
are other
class
name
bonds
types
of names
such
as ethyl
such as naphthaleneacetic
However,
“composed
in this
brief survey,
wholly
[cf. I.U.P.A.C.:
of specially
Nomenclature
or group.
as oxygen,
such
atoms
sees
a chemical
4,5,8-trihydroxyoct-3,5-dienoic
acid, he
sense
diagram
out of it. The structural
atoms
hydrogen
propanol
where
names,
and
one
hy-
where
one
propanethiol,
as e.g.
atoms
in aliphatic
names
end-
are removed
by the
crea-
such
a name
from a radical
and
such
as styrene
as benzofuran
with
syllables,
formed
systematic
with or without
London:
conjunctive
oxide,
and
other
names,
i. e.
numerical
Butterworths,
1958(p.
thylaminoprop
yl)amino-7
cyclics.
names
prefixes”
4)],
In a Name?
like
probably
for this
names
concerned
Chemistry.
name
are replacement
substitution,
C-C-C-C-C=C.
names
or selected
What’s
When the layman
--
additive
and fusion
of Organic
from the pre-
involve
in pentanol,
as for example
where
we will be primarily
coined
There
aside
which
element, as
as radicofunctional,
alcohol;
acid,
are names
of specified
or hexyne
carbon
nomenclature
II.
the removal
by hexene
There
or another
and C-C-C-S
ing in ene or yne exemplified
or triple
in systematic
names.
radical
such
C-C-C-OH
name
of Names
by a group
by the hydroxy
replaces
A subtractive
names
group
group.
the mode of activity
artefact
encountered
and semi-systematic
drogen atom is replaced
which
CH3 is a methyl
or radical.
act.
Types
where
a group
Group
defines
The distinction
is called
of morphemes.
or Functional
to an alcohol.
to carbon.
together
OH is the hydroxy
of atoms which
properties
bonded
are pairs
However,
and yl.
occurring
7-bis(3-die
wonders
chemical
how it is possible
is
‘butylamine
for chemists
to make
- 59 -
CIIZ-CIIZ-CIIP-CII3
I
011 N-H
011 011
I
I
HC-C-Cd
H
I
I
C-
C=C-C-C
II
II
//o
II II
‘OH
I
II II II I II II II
(C+-CIi,),N-C-C-C-NN-C-C-C--N(CI12-CI13)3
II II II
II II II
However,
chemical
names
are surprisingly
can be derived
from a relatively
periments
Table
(see
short
simple
list
name
pal functional
noting
as simple
Including
Material
than one
as it sounds,
for Advanced
ated
precede
of the
before
this
to form
valent
aldehyde,
are named
quite
to Table
acid
En is a bonding
double
Unsaturation
bonds.
in my ex-
Barnes
complex
ketone,
XII,
simply
morpheme,
The entire
that
is,
structure
It is
functions,
worth
Course
1957) completely
a-
that is, chemicals
there
is
is no rational
a preferred
with
way
of
order
of
Cahn
before
alcohol,
a preferred
order.
Since most chemicals
letter
of these
unsaturation
nomenclature
sequences,
that
i.e.
mar-
has an assock
as are di and en which
in the basic
atoms
that
understands
morphemes
a morpheme
of hydrogen
.
etc.
in the example shown,
of short
is such
of organic
this
of the Beginning
by now that the reader
it denotes
to the removal
simple.
will specify
that each
name
princi-
ketone
into a series
it will be noted
i.e.
Unfortunately
& Noble,
one can conclude,
It is assumed
at the end of this
refers
order
a chemical
The senior,
An Outline
and others
before
by this
can be parsed
oic
--
I.U.P.A.C. does not stipulate
order,
Britain
aldehyde
name
The
used
in creating
it is quite
to do so because
Abstracts
function.
molecule.
so-called
matter
p. 43.)
above
Chemistry
Chemical
By reference
it.
of naming
is the acid
meaning.
was
and numbering.
6th Ed. New York:
function
chemical
phemes.
Organic
Study;
problem
opus cited
in the example
though
in the U.S. and Great
each
(cf.
or for that
of nomenclature
though
Degering
of this
with
the principal
pattern
(Cahn,
principle
--acid
agree
groups.”
He is well advised
precedence
which
made
Group
in understanding,
group.
this
as that
Functional
functional
explaining
would
be done
the whole
others
a discussion
more
must
out the functional
that among
voids
that
group sets
not always
number of those
and a large
VI).
thing
is to ‘(seek
to understand
of morphemes, such
Principal
The first
Y
attached
is based
carbon
chain
to carbon
atoms
on the theory
of co-
bonds.
Most Unsaturated
In naming
this
chemical
no difficulty
would
Straight
arise
Chain
concerning
the next principal
group
as
- 60 -
there
is no choice
here
greater unsaturation
I
with
the
most
were
a side
two
and a longer
chain
double
chain
a chemical,
between
and
containing
the saturation
functions
come
triple
at
with
bonds
end
would
name
The
this
third
is meant
There
criterion
not the longest
is, indeed,
regardless
ticular
a school
of the atoms
chemical,
chain
for selecting
is eight
an eight
the
prefers
chain
be the case,
priority
words,
the
e.g.
chain
if there
item in naming
the so-called
when
with
this
principal
is possible.
Chain
name
is the principle
chain
and nitrogen
is why the next
of the longest
of consecutive
whereby
can be made
of carbon
chain
were, then
morphemes
the principle
A good case
involved.
long and that
chain
that
In other
but the longest
of atoms,
would
As the second
by bonding
Longest
of a shorter
If there
This
from the right.
the proper
of thought
longest
atoms
carbon
chain
unsaturation.
bonds.
preceded
The
alternatives
be selected.
double
second
of the
perplexing
no or less
two additional
is indicated
the
sometimes
chain.
carbon
atoms.
chain
is used,
the longest
for it in many instances.
atoms
morpheme
is fourteen.
In this
The longest
to the left of dien
By
par-
carbon
is octa signifying
(C-C-C-C-C-C-C-C).
Numbering
After
making
the decision
parent, then one numbers
is attached
preceding
atoms
number.
diene
as
atoms
In our example,
pattern
the numbering
merals
sequence
each of the contiguous
the lowest
sequently
as to which
the
of atoms
giving
in the
molecule
the atom to which
the oic acid
function
will be (HO)O=C-C-C=C-C=C-C-C.
12345678
two double
bonds are located
will
the functional
is the principal
This
between
become
will
carbon
group
function,
explain
atoms
the
con-
the
nu-
3 and 4 and
5 and 6.
Substituents or Prefixes
Once
the
selection
of the parent
chain
the bonding morphemes and the principal
chains,
all
plexity
and
of which
may be regarded
of the chemical.
eighth
atoms.
In this
They
functions,
case
in turn by the morpheme
hydroxy,
this
themselves
as
are
atom attached
placement
gens
Silence,
is
to the seventh
of one hydrogen
also
substituted
replaced
butylamino
is
atom
e.g.
by a radical,
group
the butyl
Cll~ClI,ClI,C11~~II-.
Y
Y
U
parent
to name
hydroxy
3,4,8
trihydroxy.
butylamino
in the octane
atom by the amino
are three
hence
which
structure.
(NII~),
U
radical, which
Ry a similar
as well
or sub-names
the numerals
fix tri followed
name
groups,
there
by using
completed,
it only remains
as radicals,
particular
are specified
has been
as adding
as suffixes,
the substituents
depending
groups
followed
Ordinarily,
upon the com-
at the third,
fourth,
by the numerical
The remaining
means
or side
that
pre-
substituents
there
amino
in
is a nitrogen
implies
the re-
but in this case,
one of the amino
hydro-
is composed
of a four carbon
chain.
building
up process,
the last
portion
of
-61-
this
name,
(diethylam,inoprop
since
the parenthesized
ever,
on the right most
chain
is the following:
expression
has
the same
chain
by bis, it simply
repeated,
i.e.
which
means
means
group
is attached
HOW
that
we really
bis(diethylaminopropy1)
specifies that the left most amino
simply
(C~IIS)‘-N-CII~-CI~~-C~~~-N-.
Y
Y
a.4
is preceded
- N-CI$-CH,-CH&,N-,
[(C,Hs),
diethyl
nitrogen
yl) amino
the other
have,
amino.
to the third
-
bond
for this
side
The 3- preceding
carbon
atom in the
prop yZ chain.
This
sketch
of the problems.
hydroxy
name
Of interest
group
morpheme
of the rules
rather
is used.
of this
to the linguist
than 01. It is only
Were the carboxyl
chemical
3,4,8&hydroxy
and explanation
would
and the addition
is the choice
when
group
change
of this
example
of allomorph
to be made
the principal
(oic acid)
function
by another
but primarily
as a suffix
giving
does
not cover
e.g.
is an alcohol
to be replaced
considerably,
of tetrol
very complex
for ON, the
that
this
hydroxy
by the elimination
us a name
ending
all
latter
group, the
of the prefix
in octal3,5-dien-1,
4,5,8-tetrok
Since
chemicals
tions,
the reader
ferred
choice.
cyclic
nomenclature
point
but
there
it is the
if the
the
is
can
always
be pieced
case
communication
the difficulties
trying
to use
confused
state
generates
chemist
grams.
together
Perhaps
with ideographs.
this
the name
ed disucssion
of the Geneva
unnecessary,
and
that
strong,
opus
cited
ILE.,
T able
twenty-three
primary
name.
generic
ceased
a closing
be used
groups
quotation
for the purposes
both pertinent
review
of chemicals
synthesized
take
into considera-
it the principal
If that
This
forgets
take
rely
chemists
“ Prof.
P.F.
of a register
of I.U.P.A.C.
function.
in naming
because
were
it
not the
is not to underestimate
from the fact
and ironically,
as a condensed
how long,
of the variations
from the British
it is said
no matter
of morphemes.
chemists
of the Japanese
in which
be better
Wiser
i.e. cyclic,
involved,
not always
to the
to a ring,
out the chemical
arise
invariably
He does
call
long ago.
difficulties
he has chosen.
p. 130) seems
XII can also
such
Conference
it would
have
would
in spite
to the dictionary
for the success
In this connection,
that
increase
of
in turn to a ring, the British
indexer
likely
in figuring
nomenclature,
an ambiguous
accounts
the CA.
logic
chain,
and that
to make a pre-
If considerations
is attached
over the acyclic
emphasizing
would
In general
function
and combina-
having
of nomenclatural
and more than
by reference
chemists
at different
to a chain
no difficulty
“systematic”
to try deciphering
names.
priority
substituent
have
of decipherment.
chemist
arrive
absurdities
is attached
it is worth
will
between
is given
permutations
when
as a substituent, while
of the cyclic
generally
the
of different
one may encounter
If the principal
group
radical
a multitude
chemists
then
which
this discussion,
one
that
confusion.
functional
the cyclic
with
the difficulties
wonder
system
complexity
chemicals,
imagine
mass
cyclic
In closing
prepared
are introduced,
principal
would treat
tion
can well
It is no small
where
then
can be
that
the distraught
one of the rules
the trouble
strictly
and in his
to ask
on structural
who are used
another
dia-
to working
Chemical
Society’s
heat-
Frankland
thought
names
to use
formulae.”
(Arm-
prophetic.
nomenclature.
by the organic
chemist.
It covers
Each TV@
- 62 -
is shown
tional
by indicating
Following
group.
specified
value
In this
experiment,
sist
an R group, the conventional
the
generic
name,the
of R and/or
R,‘one
can quickly
particular
of the homologous
one, two, three,
This
can
complete
etc.
be used
list
attention
series
carbon
meth,
atoms:
in applying
of the morphemes
TABLE
the
algorithm
used
commonly
determine
given
eth, prop,
Name
used
but, pent,
of chemical
where
the R values
i.e.
value
for each
morpheme
is shown
of molecular
in Table
func-
For any
to expect.
would
where
R equals
is shown.
formulas.
VI on pages
A more
30-31.
Morpheme
Value
R-CH3
&an
R=CH2
alkenes
RrCH
alkynes
Yne
DB2
R-OH
alcohols
01
01
R-SH
mercaptans
thiol
Sl
R-
radicals
vl
(+ )
R-O-R’
ethers
DBO
DB1
Ol
. .
sulfides
thlo
s1
sulfoxides
sulfinyl
S1+O 1
R-SO?-R’
sulfones
sulfonyl
SlfO2
R-&O
aldehydes
al
Ol+D131
R-CH=S
thioaldehydes
thial
SlfDB
R-C( R ‘)=O
ketones
one
Ol+DBl
R-C(R‘)=S
thioketones
thione
Sl+DBl
R-COOM
carboxylic
RCSOH
thio
RCOO R ’
salts
R-COX
acid
RCONH2
R-S-R’
R-SO-R
’
con-
NOMENCLATURE
ane
es
name
and act,
for the calculation
OF I.U.P.A.C.
is listed.
hept,
hex,
the calculational
to the appropriate
morpheme
the sort
to compounds
in the experiment
XII. SUMMARY
Generic
Structure
most
was
Finally,
symbol for radical attached
acids
acids
& esters
1
oic acid
02+DBl
d
thioic
Sl+Ol+DBl
acid
oate
02+DBl
oyl halide
Ol+DBl+X
amides
amide
Ol+DBl+yl
R-CN
nitriles
ni trile
DB,+Nl
d
R-NO2
nitro
nitro
02+DBl+N1
R-NO
nitroso
nitroso
Ol+DBl+Nl
RON02
nitrates
nitrate
03+D131+N1
halides
derivatives
1
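The calculational values of Table XII lend themselves to a small worked example. The Python sketch below transcribes only those values that are legible above (DB denotes the double-bond count) and derives the hydrogen count from the ordinary valence relation H = 2C + 2 + N - 2·DB for an acyclic compound without halogens. That relation and the function name are assumptions of this sketch, offered as an illustration and not as the routine actually coded for the experiment.

```python
from collections import Counter

# Calculational values transcribed from Table XII (DB = double-bond count).
TABLE_XII_VALUES = {
    "ane": Counter(),
    "ene": Counter(DB=1),
    "yne": Counter(DB=2),
    "ol": Counter(O=1),
    "thiol": Counter(S=1),
    "al": Counter(O=1, DB=1),
    "one": Counter(O=1, DB=1),
    "oic acid": Counter(O=2, DB=1),
    "nitrile": Counter(N=1, DB=2),
    "nitro": Counter(N=1, O=2, DB=1),
    "nitrate": Counter(N=1, O=3, DB=1),
}


def molecular_formula(carbons, morphemes):
    """Formula of an acyclic, halogen-free compound from its carbon count
    and the calculational values of its remaining morphemes."""
    counts = Counter(C=carbons)
    for morpheme in morphemes:
        counts.update(TABLE_XII_VALUES[morpheme])
    double_bonds = counts.pop("DB", 0)
    # Assumed valence relation: H = 2C + 2 + N - 2*DB.
    counts["H"] = 2 * carbons + 2 + counts["N"] - 2 * double_bonds
    # Carbon first, hydrogen second, remaining elements alphabetically.
    order = ["C", "H"] + sorted(set(counts) - {"C", "H"})
    return "".join(
        atom + (str(counts[atom]) if counts[atom] > 1 else "")
        for atom in order
        if counts[atom] > 0
    )


print(molecular_formula(3, ["nitrate"]))        # propyl nitrate
print(molecular_formula(1, ["nitro"] * 4))      # tetranitromethane
print(molecular_formula(3, ["ane", "thiol"]))   # 1-propanethiol
```

The three calls correspond to propylnitrate, tetranitromethane and 1-propanethiol from Table XI, so the results can be compared with the formulas listed there.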
BIBLIOGRAPHY
Armstrong,
H. E.:
Contributions
clature
L. : Language.
Bloomfield,
CA:
Naming
& Indexing
Abstracts,
1957.
Cahn,
R, S.:
Cahn,
R. S, & Cross,
to an International
of Cycloids.
Proc.
Chemical
An Introduction
L.
Compounds
to Chemical
Fourth
Crane,
E. J.:
Annual
Report,
CA Today
Council
The
Nomen-
Society
Columbus:
gutterworths,
Authors.
Chemical
1959,
London:
The
Chemical
1960.
on Library
The Production
-
Abstracts.
London:
for Chemical
Society,
CLR:
of Nomenclature.
1892,127.
by Chemical
Nomenclature.
Handbook
C,:
Sot.,
IIolt & Co., 1933.
New York:
of
System
Chem.
Resources.
of Chemical
Washington:
Abstracts.
The Council,
Washington:
Amer.
1961.
Chem.
Socb,
1958.
Dyson,
G.M. : A New Notation
mans,
Frome,
J ,:
Semi-Automatic
Indexing
Washington:
Garfield,
E,:
E, :
News,
E.:
E. :
Preliminary
of Printed
Report
Forms
Garfield,
E.:
Preparation
Garfield,
E. : Citation
Garfield,
E,:
Breaking
Garfield,
E,:
A Unified
Harris,
2. S.:
Harris,
2,s.:
Indexes
lO:l,
Index
National
New York:
Long-
and Development
Report
No. 17.
in facilitating
documentation.
Chem.
Index
Chemicus
1961,33.
Index
to Science,
Academy
Indexing
of Information
Am. l)ocumentation,
Lists
Equipment
Medical
5:7,
by Automatic
- A Manual
Project,
1953.
by use o f the 10 1
1954*
Punched-Card
Techniques,
Molecular
120:1039,1954.
Am. Documentation,
by Machines,
6:68,1955,
Sci enc e, 122: 108,1955.
Barrier,
Proc.
,I. Pat.
Intl.
of Sciences,
Coding
Project,
Formula
Linguistics.
Transformations
Analysis
Science,
for Science,
in Structural
PunchedXard
University
1954.
Indexes
Literature
Issue:
Linguistic
Mechanical
Citations,
the Subject
In formation,
Hopkins
Machine.
of Printed
Methods
by Automatic
o f S u b j e c t Heading
E, : The Steroid
Garfie Id, E,:
Research
of machines
Johns
on the
for Literature
ington:
Indexes
Punched-Card
Preparation
E.:
Garfield,
Compounds.
1959.
the use
Baltimore:
J. Documentation,
Garfield,
Office,
concerning
Preparation
Statistical
Garfield,
for Organic
30:5232,1952.
of Procedures.
Garfield,
System
and Encoding.
I-J. S. Patent
Communication
Eng.
Garfield,
and Enumeration
1949,
Conf.
39:583,
on Scientific
1957.
Information,
Vol.
1. Wash-
1959,
Chem.
Index,
Chicago:
for Information
Vo 1. 2. Washington:
Off. Sot.,
National
Literature,
Index
Chemicus,
Univ.
Retrieval,
Academy
12(3):6,1960.
First
of Chicago
Proc.
Cumulative
Press,
Intl. Conf.
of Sciences,
1959.
Index
1951.
on Scientific
BIBLIOGRAPHY
Harris,
Hiz,
Z. S.:
H., Joshi,
& Discourse
A, K., Kaufman,
Analysis.
Pennsylvania,
Harris,
Z. S.:
Hiz,
H., et al.,
Field,
W. A.,
dexing
I.U.P.A.C.:
Opler,
and
A,,
A.M.:
E., Whittock,
Final Reports.
Rules
for Nomenclature
Display
N.:
Definitive
A. M.:
Patterson,
.I.
Baltimore:
&I., Capell,
Le
A,:
M. D.:
Univ. of
Chemistry,
Welch
Univ.,
Medical
1951,
1953,
J. Am. Chem. Soc.,
Formulas
as Digital
Library, 1955.
82:5545,1960.
Computer
Output,
of the Nomenclature
of Organic
American
Chemical
The Ring
Index
Washington:
Geneve
pour
Society,
- A List
American
la
Reforme
1957.
of Ring
Chemical
Systems
Society,
used
1960.
de la Nomenclature
Chimique,
27:485,1892.
Recognition
The Molecular
S. V,:
on the Reform
D. F,:
de
Univ.
55:3905,1933.
2nd Edition,
Nat.,
Philadelphia:
l
Hopkins
Structural
Washington:
International
Phys.
J. : Character
Rabinow,
Soffer,
Sci.
Transformations
10:59,1958.
Soc.,
L. T. and Walker,
Chemistry.
Congres
Arch.
Words.
Johns
of Organic
of the Commission
J. Am. Chem.
Words about
in Organic
. Pictet,
L.:
Philadelphia:
Projects.
1959-61
J. and Larkey,
of Chemical
Report
Chemistry,
Patterson,
Center.
Analysis
of Linguistics,
Am. Documentation,
Patterson,
C,, and Gleitman,
of the Computing
and Discourse
Department
Project
Baird,
Report
Transformations
)I,, Garfield,
Definitive
B,, Chomsky,
Annual
1960.
of Pennsylvania,
Himwich,
(continued)
Machines.
Formula
Washington:
generalized
Rabinow
in terms
of cyclic
Engineering
elements
Co,,
1961.
of structure.
Science,
127:880,1958.
Stock,
C. C.:
A Method of Coding
Academy
Terentiev,
A. P.,
Kost,
Tsukerman,
F,:
Ueber
for Correlation
Moscow:
die
A. M. and Potapov,
Akademiya
Beschlusse
m&en
Congresses
Gesell,
26:1595,1892.
des
zur
A, M, & Terentiev,
Nauk SSSR,
internationalen,
Regelung
A. P.:
der
M,: Printing
entific
Wiswesser,
W, J.:
A Line
V. M.:
Chemical
Nomenclature
on a Common
Chemical
Chemical
Vol.
Vol.
Washington:
Notation.
nomenclatur,
National
Organicheskikh
Ber.
Translation.
Language
I, New York:
Structures
II.
Nomenklatura
in Genf vom 19 bis 22. April
Standards
Information.
Formula
Washington:
1955.
chemischen
Translation.
Waldo, W. H. and de Backer,
and Classification.
1950.
A. N,, Tsukerman,
Soedinenii.
Tiemann,
Chemicals
of Sciences,
Electronically.
National
New York:
Thos.
1892 Versamd. Deut.
Proc.
Intl.
for Machine
Chem.
Conf.
for
Searching
&
Interscience
Press,
1961.
Proc.
Conf.
on Sci-
Academy
Crowell,
Intl.
of Sciences,
1954.
1959,
AUTHOR INDEX
Angell, T. . . . 18
Kost,
A. N, .
Armstrong,
H. E.
.3,61,B
Larkey,
Baird,
a
.
e
.
.
e
.
O’Connor,
.
.
e
.
e
a E,B
e
l
e
e
e 9,25,59,B
N.
Bloomfield,
Cahn,
.
L,
m
R, S.
Capell,
L. T.
Chomsky,
Opler,
Patterson,
.
l
l
m
e
.
.
m
l
7,10,B
93
e
Rose,
.
.
e
.
.
18
.
.
e
.
.
L
e 11,12,B
A. M.
. 11,B
.
.
a
.
.
e
.
. 9
a
.
m
l
.
.
.
21
28
. B
.
Soffer,
M. D.
.
a
a
.
l
e
33,40,B
C. C.
e
o
l
a
o
l
a
e B
A. P.
a
.
.
.
.
.
93
.
.
.
a
.
.
1,B
A. M.
0
0
b
0 9,13,19,B
.
.
e
.
.
e
Q 8
Stock,
l
a
Q 7
Terentiev,
.
Frome,
J.
5,11,15,24,25,B
Waldo,
.
.
l
b
.
Whittock,
B
Wiswesser,
.
a
D
.
l
a
.
B
Zuckermann.
Tiemann,
.
F.
Tsukerman,
Whorf,
24,25,B
.
.
.
.8,B
.
l,B
.
.
l
l
.
.
B
.
m
.
l
l
.
.
10,B
l
T. E, R.
e
4,12 ,B
e
2,10,B
Singer,
.
B,
.
N,
a
Kaufman,
.
Rubin,
.
A, K. e
J.
A,
, .
Joshi,
8
J.
l
l
.
Rabinow,
a
o
.
J. M.
. B
D,
H.
a
A.
Frear,
Hiz,
.
Potapov,
J.
W, A.
.
A.
H. .
Himwich,
e
.
.
Z. S.
.
S. V.
Field,
Harris,
B
Pictet,
.
L.
o
e HI,18
Dyson, G. M.
Gleitman,
e
l
m
.
o
e
e
L. C.
e
.
.
Cross,
e
.
.
E
.
.
C.
Crane,
11 ,B
e
.
.
.
e
.
.
e
11,B
J.
e .
.
6
. 5,12,B
W. H.
B,
.
W. J.
a
.
l
(see
Tsukerman)
8
15
SUBJECT INDEX
Acyclic chemistry
Algorithm for translating chem. names . . . 17
Class
Cleavage
definition
29
in
.
summary
q
l
.
a
a
.
30
of
chem.
Classification
Alphabetic
chem.
Alphabetic
list
indexes,
simplification
.
25
26
22
b
b
b
56
b
b
b
56
of chemicals
5
16
Classified
a
of co-occurrences
.
nomenclature
list
of co-occurrences
chem.
nomenclature
23
Colons
in
b
Ambiguity
in chem.
.
nomenclature
indicated
by linguistic
.
.
.
.
18
Commas
28
Communication
m
18
.
e
match principle
of parens .
b
.
a
.
e
.
a
.
.
35
. 24,25
. 30,38
in ring systems
longest
relation
.
analysis
resolution
.
of
.
l
in chem. nomenclature
Indexing
versus
Conclusions
b
nomenclature
.
.
.
a
9,10
Contradictions
Analytical
chemistry.
.
e
.
6
.
7
Co-occurrences
Analytical
function
.
.
.
9
Biochemistry,
names
need
.
in
Bonding
Bracketing
.
.
.
b
25,26,40,52
analysis
b
routine
b
b
b
longest-shortest
Chemical
b
Abstracts
24
b
.
b
24
b
b
b
b 9
b
@
b
b
British
designing
b
b
b
b
b
b
.
b
b
b
b
b
Soviet
b
b
b
b
to
structb
hgb
hgb
b
b
b
b
.
b
Diagrams
Ctrb b
7
Diagrams,
b
b
b9,10
b
b
14
b
b
16
b
1
b
b
b
b
1
b
b
b
b
3
b
l
b
b
9
b
b
b
b
b
b
b
28
29
b
b
b
b
53
Examples
b
12
First
b
b
b
b
b
b
20
b
b
b
52
b
b
b
b
b
1
b
b
23
b
b
21
b
b
b
b
b
21
b
b
b
b
b
22
b
b
b
b
10
b
b
b
b
37
51
.
.
b
b
b
17
b
b
b
b
b
8
System
-c
by machine
match
18,21
12,13
Diagrams
routine
#
b
b
Complementary
tables
b
b
Structural
Distribution,
b
b1,14,15
goals
b
drawn
Dictionary
b
b
processing
Decimal
6
b
in
studies
b
b
b
.
the experiment
Elision
End-of-Name
Chemico-Linguistics
Ciphers
Dewey
0
background
b
of
b
b
Designing
b
b
b
b
Debugging
b
uses
b
official
V&Ii2
ofb
co-occurrences
3
b
b
nomenclature
See Structural
.
b
cyclics,
b
b
b
list
character
Current
e
nomenclature
it for machine
goals of
historical
of
b
nomenclature
American
due
38
b
lists
Copywriter
8
Chemical-Biological-Coordination
Chemical
alphabetic
. 13,14
6
names
nomenclature
Calculation
Chain,
.
in chemical
in syntactic
British
.
b
in nomenclature
@
a
morphemes
b
in systematic
lack of
for trivial
.
b
distribution
Complementary
American
in indexing
b
Environments
b
37
b
b
b
.
20
of
b
b
b
b
b
b
l
b
21
b
b
@
b
b
b
.
b
.
57
sentinel
37
b
b
in which
replacing
b
morphs
forms
in
occur
b
b
b
b
19
b
20
of algorithm
b
b
l
b
b
b
b
b
33,34
Examples of algorithm
Second
Third
Fourth
Fifth
l
l
.
.
.
.
.
.
e
b
.
a
(continued)
Informant
.34
34
34
in linguistic
Information
requirements
Intuition
in linguistics
diagrams
for computer
calculation
routine
dictionary
general
Formula
program
l
ambiguity
.
indexes
Fortuitous
Free
e
look-up
Pent-Ott
.
.
e
a
e
.
.
0
a
l
+
0
a
.
.
e
b
.
e 42
. 47
.
.
.
. 6
variation
Frequently
. 27
used
Functional
chemical
.
group
.
principal
Generation
.
.
.
.
a 19,20
.
e
.
.
.
n
a
.
e
.
. 2558
e e 59
.
.
e
a
a
. 12,13
.
. 11
e
.
.
programs
for
formulas
structural
diagrams
types
morphemes
e
molecular
Generic
20,21
.
.
and words
b 45
. 20
0
of
. 48
.
e
l
in table
Geneva
nomenclature
Generic
searches
Grammatical
.
.
. 62
analysis,
Linguistic
forms
chain
Longest
match,
principle
categories
Machine
History
52,53
in linguistics
Human translation,
Hyphens
. 30
.
.
32,39
.
0
Ignorability
.
of locants
problems
requirements
Indexing,
formula
tube
a
manipulative
analytical
aspects.
19
. 1
algorithm
calculation
IBM 718 display
Indexer’s
0
of nomenclature
Hydrogen
19
.
.
.
.
.
30
.
u
.
.
.
.
.
. 23
. 56
.
.
.
8
of
.
.
5
Manipulative
Meaning,
.
.
57
.
.
.
.
. 12
e
.
.
.
l
.
a
a
.
e
.
.
e
36
e
8
.
15
I
9
3,25,60
.
.
.
.
35
.
.
.
.
.
.
8,9
.
.
. 9
.
.
.
.
20
of indexing
.
.
.
.
.
.
41
19
to, in chemico-linguistics.
Mechanical
analysis
Merck Index
of texts
.
.
*
,
.
.
.
.
10
l
.
51
formula
generalized
expression
calculation
Morphemes,
putative
of chemicals
Morphology
.
0
b
b
0
*
b
.
.
0
e
0
0
0
0
0
e
0
0
.
.
.
Names,
types
Nesting
in chemical
.
e
of chemical
Nomenclature,
see
Non-systematic
names
chemical
names
systems
.
in chemical
in chemical
32
* 7
13
33
20
21
for 17
by searching
of nomenclature
Morphs
Notation
for
list.
retrieval
e
b
b
for
chemistry
Soffer's equation
Numerals
.
and
of
.
a
.
assignment
Numbering
versus
.
l
19
.’
aspects
referential
resorting
environments
ambiguity
indexing
primary
Heuristics
.
and their
of
5
.
objective
.
,
.
system
.
Longest
machine
7
l
Linguistic
in analytical
1,19
.
l
of Congress
Molecular
of chemicals
summarized
Library
of
19
.
.C. nomenclature
summary
combinations
table
of morphemes
.
.
. 11
devices
I.U.P.A
Flow
of chemists
37
Inventory
Facsimile
.
research.
.
.
.
. 3
.
. 19,20,21
.
.
.
58
.
.
.
24
nomenclature
a
.
.
a
.
13
e
.
.
a
e
, 4
names
.
l
n
60
.
56
nomenclature
nomenclature
a
0
Oral communication
.
.
.
Order
.
0
. 57
.
.
.
. 25
of
.
.
.
. 57
.
.
.
. 25,57
Official
of substituents
Parent
structure
definition
Parentheses
Pattern
recognition
*
devices
l
3
Simulating
1
Soffer’s
18
and phonology
.
*
. 19
Positional
variance
.
.
a
0 21
and suffixes
used
Putative
Retrieval
rlrng
.
in nomenclature
.
. 56
Substitutive
names
m
Searching,
Selective
l
see
names
.
.
.
.
.
.
.
8
.
.
.
.
a
.
.
of chemicals
10,18
.
a
B
e
a
b 12
.
m 20,21
.
.
# 58
.
0 17
ambiguities
.
. 18
e
.
e 50
.
1
resolving
.
.
,.
l
.
relationship
.
. .
e
e
e 17
to nomenclature
device.
.
.
.
10,11
Sentinel
for end-of-name
.
.
.
a
.
Shortcuts
in linguistics
.
.
e
.
19,20
e
b
.
.
Shortest
chain
e
.
l
Synonyms,
Syntactic
analysis,
Syntactic
categories
Syntax,
rapid
.
. 14,15,16
e
a
.
.
e .
.
coding
Unsaturation
. 6
.
11
.
.
to
l
28
29
.
.
,
.
58
a
a
e
.
l
58
interchangeably
of
16
26
*
24
25
a
e
.
in
e
l
.
0
14
.
.
.
16
b
14
two types
names
394
14
putative
.
l
l
.
e
l
e
. 26,53
. 13,14
names
Typographical
.
.
.
see
Transformations
.
. 60,61
morphemes,
names
. 6
,
Tentative
morphemes,
.
as transformations
changes
trivial
.
. 9
.
.
Trade
e
e
e
chemistry
Univac
3
.
Teaching
Trivial
37
,
definition
names,
versus
33
.
used
chemical
27
.
.
and prefixes
.
e
or Prefixes
Systematic
by
analysis
reading
e 58
of
names
method
Suffixes
morpheme
.
formula.
nomenclature
nomenclature
names
systems,
Sampling
linguistics
in chemical
of chemical
Subtractive
meaning
morpheme
n*
.
to
.8,9
machine,
names
Replacement
Structural
value
value
.
.
e
assumed
0
machines
Referential
always
from names
by machine
morphemes,
Recognition,
chemical
.
diagrams
Substituents
Radicofunctional
Reading
nomenclature
e 16
l
indexes
Punctuation
nomenclature
Steroid
.
for molecular
.
interchangeably
Printed
equation
Soviet
e
computer.
Structural
Phonemes
Prefixes
(continued)
problems
.
.
12
.
18
. 26,59