ethnic and religious encoding system (eres)

ERES
LIST SERVICE DIRECT, INC.'S
ETHNIC AND RELIGIOUS
ENCODING SYSTEM
(ERES)
List Service Direct, Inc. is a full-range list services firm specializing in ethnic and religious target
marketing. With over twenty-five years of experience and advanced expertise, our proprietary
ethnic encoding system identifies 175 different ethnic and religious categories and 80 languages, all
of which can be coded and appended to any database within as little as 24 hours.
ETHNIC AND RELIGIOUS ENCODING SYSTEM (ERES)
Humans are social animals. Individuals form groups; groups form cultures, and cultures evolve into
civilizations. Names have created a unique means of identifying and categorizing individuals.
List Service Direct, Inc.’s Ethnic and Religious Encoding System (ERES) utilizes this historical
concept as one component of its process. The process has created a rule- and exception-based expert
learning program that incorporates the idea that each group has last and first names that are
unique to that group. By applying specific criteria in a specific order, the ethnic, religious, and
minority identity of the individual can be ascertained. The accuracy of this identification is further
enhanced by applying a geographical analysis, based on census tract data, of the name within the
ethnic, religious or minority group.
ERES is NOT A SURNAME BASED SYSTEM. Rather, it is a new process that allows the
marketer or researcher to select over 190 ethnic, religious, and minority groups from any list. It
analyzes both an individual’s first and last name and applies, in a specific order, ethno-linguistic and
geocentric rules to both the surname prefix and suffix, identifying the specific ethnic, religious,
and minority status of individuals, even an individual with a multi-ethnic surname.
The ethnic encoding system consists of a set of interfacing computer programs and 17 data files,
some of which are:
A unique first name file by ethnicity
A non-unique surname file by ethnicity
A series of two to five character prefix rules by ethnicity
A series of two to five character suffix rules by ethnicity
A series of codes to identify the ethnic, religious and minority status of an individual
A geocentric reference table
A series of computer programs that analyze an individual’s names using the system’s data.
In addition, ERES is exclusive in its ability to recognize hyphenated and misspelled names, which
the system correctly codes based on the prefix and suffix rules. Hyphenated names are captured
using ERES’ first name and surname tables in conjunction with the prefix and suffix rules that
apply to them.
One of the building blocks of the system is its ability to trace the onomastic origins of a name
across more than 1,190 variables, which include Ethnic Heritage Descriptors, Ethnic Locational
Identifiers, and Ethnic Life Form & Individual Trait Descriptors.
Ethnic Heritage Descriptors:
An ethnic heritage descriptor is a link to the parentage of the individual. Each ethnicity and
language has a different way of expressing this within the first or the last name. ERES uses these
descriptors (either suffixes or prefixes appended to first or last name) to accurately identify
particular names unique to particular ethnic groups. Below are several examples that illustrate
the way ethnic heritage descriptors may be used.
In Finnish, the suffix “NEN” means “offspring of”.
In Welsh, the original prefix “AP” (since shortened to “P” when combined with a first
name) means “offspring of”. Thus, PROBERT is Welsh for “offspring of Robert”.
The suffix “UCCI” means “descendant of” in Italian, while the Turks use the suffix
“BASHI” to mean “father of”.
Prefixes play an important role in identifying some Irish names. “Grandson of” is
implied in the prefix “O”, while the prefix “MC” means “son of”. To designate
“Uncle”, the Burmese use the prefix “U”.
Ethnic Locational Identifiers:
During the Dark and Middle Ages it was essential that an individual could be traced to his country
of origin or his geographic location within a country. This information immediately identified the
individual as friend or foe. One method that was adopted was to add an identifier to his name in the
form of a prefix or suffix. Geographic locators are important to the ethnic identifier process as well.
In addition to the suffixes and prefixes and the rules derived therein, ERES incorporates actual U.S.
geographic coordinates to determine ethnic, religious and minority group clusters enhancing the
accuracy of our system. Below are some examples of ethnic locational identifiers and the popular
name “myths” they refute.
In Finnish, the surname suffixes “OLA”, “YLA”, and “KOSKI” mean
upper, lower, and middle respectively. “KOSKI” refutes the popular notion that all
names ending in SKI are Polish, and “OLA” proves that not all names ending in a
vowel are Italian.
The French use the prefixes “DU”, “DE”, “DELAS”, and “DES” to designate “from”,
while the Romanians use “AN-U” and “EANU” to convey the same meaning.
Italian names ending in “DDA” and “DDO” denote Sardinian heritage.
Ethnic locational identifiers help ERES to correctly determine the ethnic origin of all groups,
including Italian and Polish (see the Finnish example above). Other systems currently in use do not
have this ability and are far less accurate.
Ethnic Life Form & Individual Trait Descriptors:
Early naming conventions were also created as a means of classifying individuals by their physical
attributes and likeness to animals (sometimes not flattering). These life form and trait descriptors
offer an unusual method of identifying individuals by ethnic group.
The Italians provide us with many examples of both life form and individual
trait descriptors: “FUZZO” (“curly”), “MANCINI” (“left handed”), “LAGO” (“tall”), and “FASANO”
(“pheasant”). Less flattering are “BOCCACIO” (“ugly mouth”), “IZZO” (“snail”), and
“MUSSOLINI” (“gnats”).
ERES uses them in combination with the ethnic heritage descriptors and ethnic locational identifiers to
help build its rule- and exception-based system. These rules enable ERES to correctly identify the
ethnicity of names that might be eliminated or inaccurately identified by surname systems and
other programs, and to assign them to their correct ethnic group.
Religious Affiliation: ERES has a code that determines religious affiliation. The process cannot,
however, distinguish denominations, sects and the like within individual religions. For example, it
cannot accurately determine who is Baptist or Calvinist within the Protestant group. Nor can it
select Hasidic Jewish groups from the Jewish population at large. Religious affiliations are
determined by geographic locators and ethnic group identifiers.
ETHNIC CLOSE UP: African-American
ERES goes well beyond knowing which areas have high concentrations of African-Americans. It
identifies African-based African-American names with its unique first name and surname tables.
Individuals identified in this manner may reside anywhere in the United States, not just in
African-American clusters. In addition, our system identifies African-Americans with non-African-based
but unique first names anywhere in the United States. Sheneka Brinter living in Fort Lee, NJ is
African American. So is Amarta Azubuike.
As a further safeguard, ERES looks within the African-American clusters and eliminates all
non-black ethnicities, qualifying only those individuals with commonly borrowed ethnic names and
certain Islamic names. ERES continually refines the selection criteria to ensure that the name
identified as African-American will be African-American; not just a “could be” but an “is”.
ETHNIC CLOSE UP: Hispanic
ERES identifies Hispanic individuals by unique last and first names using rules and exceptions that
apply. Geographic mapping confirms the locations of this population and will identify Hispanics in
non-Hispanic areas. There are many multi-ethnic names (e.g., Delgado) that could be Hispanic
but could also be Italian. ERES can separate a multi-ethnic name into its proper component
ethnicities by using first names where possible. The remainder are stored in the multi-ethnic
uncoded class until they are verified as being Hispanic using first name indicators. Our system’s
unique first name table identifies Hispanic women who marry individuals with non-Hispanic
surnames. Quite often, Hispanics marrying Hispanics leads to hyphenated names. ERES identifies
these and some misspelled names with its ethno-linguistic rules. Further, ERES can accurately
identify and differentiate Mexican, Puerto Rican, Cuban, Portuguese, and Brazilian – a List Service
Direct exclusive.
ETHNIC CLOSE UP: Japanese
Almost all Japanese names are combinations of descriptive components. Although the Japanese
surname is the easiest Asian surname to distinguish (especially since Japanese surnames were not
influenced by the Chinese), most surname-based systems have included all Asians in a category known as
“Oriental”. Therefore, the Koreans, who not only share certain surnames with the Chinese but often
introduce Chinese qualifiers to their names, are mixed with the Japanese and other Asian ethnic
groups. ERES has separate and unique prefix and suffix rules and exceptions for all ethnic groups,
including those representing the continent of Asia. We also have extensive Japanese surname
and first name tables. These features allow us to identify Japanese individuals in traditionally
non-Japanese areas, such as Mark Tanaka and Junko Takahashi in Omaha, Nebraska, and also to identify
Japanese women who have non-Japanese surnames.
ERES views each ASIAN ethnic group as a separate and identifiable selection by creating rules and
exceptions based on the study of the history and development of surnames and first names within the
culture of each country. This allows the user of our system to target and select a particular ethnic
group, such as Japanese.
ERES UPDATE
With the availability of the latest Census information, the ERES geo-coding protocol was updated to
contain the new geographic tables, which include African-American, American Indian, and
Hispanic Origin data for every census tract in the United States, down to the block group level.
From the latest Census data we have incorporated Ethnic Origin and Hispanic Origin data for every
Census Tract Block Group in the United States.
# of unique Block Groups: 214,412
# of unique Tracts: 65,994
# of Block Groups with an African American Percentage greater than 90%: 10,569
# of Block Groups with an African American Percentage greater than 80%: 14,719
# of Block Groups with an African American Percentage greater than 70%: 18,336
# of Block Groups with a Hispanic Origin Percentage greater than 90%: 5,755
# of Block Groups with a Hispanic Origin Percentage greater than 80%: 8,346
# of Block Groups with a Hispanic Origin Percentage greater than 70%: 11,372
# of Block Groups with an American Indian Percentage greater than 90%: 390
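The counts above are nested: every block group above 90% is also above 80% and 70%, which is why the figures grow as the threshold drops. A minimal Python sketch of that tally follows. It is not the ERES implementation; the function name `count_above` and the sample percentages are made up for illustration.

```python
def count_above(percentages, thresholds=(90, 80, 70)):
    """Count how many block-group percentages exceed each threshold.

    `percentages` is any iterable of numbers in the range 0-100.
    Returns a dict mapping each threshold to its count.
    """
    values = list(percentages)
    return {t: sum(1 for p in values if p > t) for t in thresholds}


# Hypothetical block-group percentages, for illustration only.
sample = [95.0, 85.5, 72.0, 40.0, 91.2]
counts = count_above(sample)
print(counts)
```

Because the comparison is strictly "greater than", a block group at exactly 90% would be counted under the 80% and 70% thresholds but not under the 90% threshold.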
This data allows us to identify the smallest possible pockets of a given ethnicity. This pinpoint
accuracy improves ERES’ ability to find Ethnic pockets in non-ethnic zip codes. We use geographic
data as a component when determining African-American, American Indian, and Spanish Speaking
households.
The updated geo table improves ERES’ accuracy and identifies additional African American and
Spanish Speaking Households, by locating and recognizing these households outside of high
African American and Spanish Speaking zip codes.
For example, Thomas Smith living in zip code 11361 can now be correctly identified as African
American because he lives in a census block that is predominantly African American. The zip code
11361 is not identified as a high population African American zip code, but the census tract block
group that Thomas Smith lives in is identified as a high population African American census tract
block group.
This new version of the Ethnic System utilizes population percentages of each Ethnic
Group/Hispanic Origin for every census tract in the United States down to the block group level.
The latest census data has been applied to every census tract in the United States.
Completeness and Accuracy:
In order to correctly determine the accuracy of our data and the algorithms contained within, we
contracted with a national market research company to conduct a telephone study. Sample size was
determined by the research company to ensure that the resulting data would be at the 95% level of
confidence. We set up quotas by major ethnicity (Hispanic, African American, Asian and "Other")
in an effort to make sure each was properly represented in the study. A total of 1,566 telephone
interviews were conducted. A telephone methodology was chosen for its ability to reach a larger
number of respondents more quickly and at less cost than a mail study.
The sample for the study was pulled from LSDI’s national database using a random nth selection.
Each piece of sample was assigned a sample number. This number was used after the data was
tabulated to cross match the data from each individual completed interview with the data for that
record contained in our database. In other words, if a respondent indicated that they are Hispanic in
the survey, we would look at that respondent's data record in our database to see if the data matched
as a way of checking accuracy. This extra step was done in addition to the standard data tabulations
that were completed for the study. Our findings indicated that different ethnicities produced
different levels of cooperation and accuracy. Please see the chart below, which reflects cooperation
and accuracy by major ethnicity.
ETHNICITY           COOPERATION   ACCURACY
HISPANIC            48%           94%
ASIAN               39%           86%
AFRICAN AMERICAN    47%           90%
OTHER               46%           92%
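The cross-match step described above, in which each completed interview is compared with the database record carrying the same sample number, can be sketched as follows. This is an illustrative Python example only. The function name, the dictionary inputs keyed by sample number, and the sample values are assumptions, and grouping results by the ethnicity the respondent reported is also an assumption, since the study description does not say which side defines the groups.

```python
from collections import defaultdict


def accuracy_by_group(survey, database):
    """Compare self-reported ethnicity with the coded ethnicity.

    survey:   {sample_number: ethnicity reported by the respondent}
    database: {sample_number: ethnicity stored on the database record}
    Returns {reported ethnicity: fraction of records where the two agree}.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for sample_number, reported in survey.items():
        coded = database.get(sample_number)
        if coded is None:
            continue  # no matching database record for this interview
        totals[reported] += 1
        if coded == reported:
            hits[reported] += 1
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical interviews and records, for illustration only.
survey = {1: "Hispanic", 2: "Hispanic", 3: "Asian", 4: "Other"}
database = {1: "Hispanic", 2: "Other", 3: "Asian", 4: "Other"}
print(accuracy_by_group(survey, database))
```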
SOFTWARE LICENSING:
Site licenses are available and include:
The software package that performs a detailed analysis of each name on either a residential
or business database.
A coding structure that is added to your file that ascertains the ethnic, religious, and minority
status of each individual, as well as a code to define the likelihood that the individual thinks
in and speaks their native language.
Documentation for both system implementation and usage.
Assistance with system implementation and usage training, both for data processors and
marketers.
Membership in ERES Users Support.
System Delivers                                       List Service Direct Inc.
System is rule based                                  Yes
Rules include first name                              Yes
Number of first names                                 70,121
Rules include surname                                 Yes
Number of surnames                                    469,731
Rules include suffixes                                Yes
Rules include prefixes                                Yes
Latest census data                                    Yes
Use Zip + 4 prior to encoding                         Yes
Encodes on either residential or business files       Yes
Can identify ethnicities in any U.S. geo location     Yes
Accuracy levels                                       96%
Encoding rates                                        82% to 95%
Deliverability rates                                  94%
Number of ethnicities                                 163
Number of religions                                   12
Number of language groups                             80
Can reach specific groups within Asian population     Yes
Examples of Asian ethnicities                         Chinese, Korean, Thai, Vietnamese, Filipino, Japanese
Differentiates within Hispanic ethnicities            Yes
Examples of other ethnicities                         Greek, African American
Types of access to ethnic coding                      Rental, append, license
Ongoing support                                       Dedicated customer service & technology service team.
Descriptions and Explanation of Usage
On the following pages are summary level counts by ethnicity, which depict the actual record
counts that ERES stores and utilizes when analyzing an individual’s full name and address.
For each ethnicity there are columns for the number of onomastic rules that apply to that ethnicity,
the number of unique first names applicable to that ethnicity, and the number of surnames stored for
that ethnicity.
ONOMASTIC RULES (Prefix & Suffix Rules)
There are 1,192 onomastic rules currently implemented in our ethnic system. Each rule reaching
implementation level was hypothesized and tested to ensure validity. Implemented rules apply to
the examination of the prefix and suffix of a surname. When an individual’s ethnicity cannot be
determined by looking at the whole name, its component parts, the prefix and suffix, are analyzed
and matched against the rule files in a specific order. The order is governed by length of argument,
i.e., five-character suffixes are searched before four-character, then three-character, and so on.
Misspelled names, hyphenated names, and names new to this country are a few of the cases in which
the whole name cannot be matched. Our onomastic rules allow our process to outperform
surname-based systems in all three of these cases.
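The longest-first search order described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the ERES rule engine: the function name `match_suffix` and the tiny rule table are hypothetical, the suffix labels are taken from the examples given earlier in this document, and the “SKI” entry is an invented rule included only to show why five-character rules must be tried before three-character ones. In the process described here, suffix rules are only one component alongside first names, prefixes, and geography.

```python
# Toy rule table. Suffix labels follow the examples in this document;
# the "SKI" rule is hypothetical and exists only to demonstrate ordering.
SUFFIX_RULES = {
    "KOSKI": "Finnish",
    "BASHI": "Turkish",
    "UCCI": "Italian",
    "EANU": "Romanian",
    "NEN": "Finnish",
    "DDA": "Italian",
    "DDO": "Italian",
    "SKI": "Polish",  # hypothetical rule
}


def match_suffix(surname, rules=SUFFIX_RULES, max_len=5, min_len=2):
    """Return (suffix, label) for the longest matching suffix rule, or None.

    Suffix lengths are tried from max_len down to min_len, so a
    five-character rule always wins over a shorter one.
    """
    name = surname.strip().upper()
    for length in range(min(max_len, len(name)), min_len - 1, -1):
        suffix = name[-length:]
        if suffix in rules:
            return suffix, rules[suffix]
    return None


for surname in ("Jarvikoski", "Kowalski", "Smith"):
    print(surname, match_suffix(surname))
```

With this ordering, a surname ending in “KOSKI” is matched by the five-character rule before the shorter “SKI” rule is ever consulted, which is the behavior the length-of-argument ordering is meant to guarantee.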
UNIQUE FIRST NAME FILE
ERES currently recognizes 70,121 unique first names that can be linked to a specific ethnicity.
While there are no absolutes, the chance that a person with the first name “Fumihiko” is other than
Japanese is statistically insignificant. Likewise, a person with the Igbo first name “Ogochukwu” is
statistically unlikely to be other than African or African American, even if their last name
is “Smith”.
SURNAME FILE
ERES currently has 469,731 surnames on its surname file. Where surnames are useful due to
numerous variations in prefix and suffix spellings, such as in Italian, there are a correspondingly
large number of surnames compiled for that ethnicity. There are over 18,000 on file for Italian.
Where a large proportion of individuals can be determined by either unique first names or
onomastic rules, fewer names are needed. For instance, in Japanese there are only about 2,500
surnames on file, but there are 182 onomastic rules and another 500-plus unique first names.
Ethnic, Religious, Minority (Group Classification) and Language Speaking
Codes:
ERES appends the following codes to every record.
Ethnic Code                         2 bytes
Religion Code                       1 byte
Language Code                       2 bytes
Minority Grouping                   1 byte
Hispanic Country of Origin          2 bytes
African American Confidence Code    1 byte
Assimilation Code                   1 byte
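Taken together, the seven codes occupy 10 bytes per record. The following Python sketch shows one way such a fixed-width block could be split back into named fields. It is an assumption-laden illustration: the field names, the function `parse_codes`, the sample value, and in particular the field order (taken here to be the order listed above) are not confirmed by this document.

```python
# Field widths come from the list above; the order and names are assumptions.
LAYOUT = [
    ("ethnic_code", 2),
    ("religion_code", 1),
    ("language_code", 2),
    ("minority_group", 1),
    ("hispanic_country", 2),
    ("african_american_confidence", 1),
    ("assimilation_code", 1),
]
RECORD_WIDTH = sum(width for _, width in LAYOUT)  # 10 bytes in total


def parse_codes(block):
    """Split a fixed-width code block into a dict of field values.

    Blank (space-filled) fields are returned as empty strings.
    """
    if len(block) != RECORD_WIDTH:
        raise ValueError(f"expected {RECORD_WIDTH} characters, got {len(block)}")
    fields = {}
    position = 0
    for name, width in LAYOUT:
        fields[name] = block[position:position + width].strip()
        position += width
    return fields


# Made-up sample block: ethnic T7, language I3, group M, assimilation A,
# with the remaining fields left blank.
sample = "T7" + " " + "I3" + "M" + "  " + " " + "A"
print(parse_codes(sample))
```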
Religious Identity Code:
RELIGION                  RELIGION CODE
Buddhist                  B
Catholic                  C
Greek Orthodox            G
Hindu                     H
Islamic                   I
Jewish                    J
Sikh                      K
Lutheran                  L
Mormon                    M
Eastern Orthodox          O
Protestant                P
Shinto                    S
Not Known or Unmatched
Minority Classification Code:
GROUPS                                GROUP CODE
All African American Ethnic Groups    F
Hispanic                              Y
Far Eastern                           O
Southeast Asian                       A
Central & Southwest Asian             C
Mediterranean                         M
Native American                       N
Scandinavian                          S
Polynesian                            P
Middle Eastern                        I
Jewish                                J
Western European                      W
Eastern European                      E
Other                                 T
Uncoded (No group)
Hispanic Country of Origin Codes:
Country               LSDI Country Code
Argentina             HA
Bolivia               HB
Brazil                HZ
Chile                 HQ
Colombia              HJ
Costa Rica            HR
Cuba                  HC
Dominican Republic    HD
Ecuador               HL
El Salvador           HE
Guatemala             HG
Honduras              HH
Mexico                HM
Nicaragua             HN
Panama                HK
Paraguay              HY
Peru                  HX
Puerto Rico           HP
Spain                 HS
Uruguay               HU
Venezuela             HV
Unknown
Language Speaking Codes:
LANGUAGE          CODE
Afrikaans         A1
Albanian          A2
Amharic           A3
Arabic            A4
Armenian          A5
Ashanti           A6
Azeri             A7
Bantu             B1
Basque            B2
Bengali           B3
Bulgarian         B4
Burmese           B5
Chinese           C1
Comorian          C2
Czech             C3
Danish            D1
Dutch             D2
Dzongha           D3
English           E1
Estonian          E2
Farsi             F1
Finnish           F2
French            F3
Georgian          G1
German            G2
Ga                G3
Greek             G4
Hausa             H1
Hebrew            H2
Hindi             H3
Hungarian         H4
Icelandic         I1
Indonesian        I2
Italian           I3
Japanese          J1
Kazakh            K1
Khmer             K2
Kirghiz           K3
Korean            K4
Laotian           L1
Latvian           L2
Lithuanian        L3
Macedonian        M1
Malagasy          M2
Malay             M3
Moldavian         M4
Mongolian         M5
Nepali            N1
Norwegian         N2
Oromo             O1
Pashto            P1
Polish            P2
Portuguese        P3
Romanian          R1
Russian           R2
Samoan            S1
Serbo-Croatian    S2
Sinhalese         S3
Slovakian         S4
Slovenian         S5
Somali            S6
Sotho             S7
Spanish           S8
Swahili           S9
Swazi             SA
Swedish           SB
Tagalog           T1
Tajik             T2
Thai              T3
Tibetan           T4
Tongan            T5
Turkish           T6
Turkmeni          T7
Tswana            T8
Unknown
Urdu              U1
Uzbeki            U2
Vietnamese        V1
Xhosa             X1
Zulu              Z1
Ethnic Codes:
ETHNIC DESC                 ETHNIC CODE   GROUP CODE
Afghani                     C1            A
African American 1          M0            F
African American 2          WL            F
Albanian                    U0            E
Aleut                       R1            N
Algerian                    D0            I
Angolan                     M1            F
Arab                        D1            I
Armenian                    U1            C
Ashanti                     M2            F
Australian                  RP            T
Austrian                    U2            W
Azerb                       U3            C
Bahrain                     D2            I
Bangladesh                  C2            A
Basotho                     M3            F
Basque                      T2            Y
Belgian                     T1            W
Benin                       M4            F
Bhutanese                   M5            A
Bosnian                     U4            E
Botswanian                  WM            F
Bulgarian                   U5            E
Burkina Faso                M6            F
Burundi                     M7            F
Byelorussian/Belorussian    U6            E
Cameroon                    M8            F
Caribbean African American  WP            F
Cent Afric Rep              M9            F
Chad                        MA            F
Chechnian                   U7            C
Chinese                     R3            O
Comoros                     MB            F
Congo                       MC            F
Croatian                    U8            E
Czech                       U9            E
Danish                      N1            S
Djibouti                    WE            F
Dutch                       N2            W
Egyptian                    D3            I
English                     T3            W
Equat Guinea                MD            F
Estonian                    UA            E
Ethiopian                   ME            F
Fiji                        R4            P
Filipino/Philippines        RE            P
Finnish                     N3            S
French                      T4            W
Gabon                       MF            F
Gambia                      MG            F
Georgian                    UB            C
German                      T5            W
Ghana                       MH            F
Greek                       D4            M
Guinea-Bissau               MJ            F
Guinean                     WF            F
Guyana                      MK            T
Hausa                       WN            F
Hawaiian                    R5            P
Hispanic                    T9            Y
Hungarian                   UC            E
Icelandic                   N4            S
Indian                      C3            A
Indonesian                  R6            O
Iraqi                       D5            I
Irish                       T6            W
Italian                     T7            M
Ivory Coast                 ML            F
Japanese                    R7            O
Jewish                      D7            J
Kazakh                      UD            C
Kenya                       MM            F
Ethnic Codes (Cont.):
ETHNIC DESC                 ETHNIC CODE   GROUP CODE
Khmer/Cambodia/Kampuchea    R8            O
Kirghiz                     UE            C
Korean                      R9            O
Kurdish                     D6            W
Kuwaiti                     D8            I
Kyrgyzstani                 UF            C
Laotian                     RA            O
Latvian                     UG            E
Lesotho                     MN            F
Liberian                    MO            F
Libyan                      D9            I
Liechtenstein               TE            W
Lithuanian                  UH            E
Luxembourgian               TF            W
Macedonian                  DE            E
Madagascar                  MP            F
Malawi                      MQ            F
Malay                       RB            O
Maldivian                   RJ            T
Mali                        MR            F
Maltese                     DS            M
Manx                        TJ            W
Mauritania                  WG            F
Moldavian                   UI            E
Mongolian                   RC            O
Moroccan                    DF            I
Mozambique                  MU            F
Multi-Ethnic                ZZ            Z
Myanmar                     R2            O
Namibian                    MS            F
Native American             KS            N
Nauruan                     RK            P
Nepal                       C6            A
New Zealand                 RM            T
Niger                       WH            F
Nigerian                    MT            F
Norwegian                   N5            S
Other Asian                 RD            A
Pakistani                   C4            A
Papua New Guinea            MV            P
Persian                     DH            I
Pili                        RS            P
Polish                      UJ            E
Portuguese                  T8            Y
Qatar                       DG            I
Romanian                    UK            E
Ruandan                     MW            F
Russian                     UL            E
Saudi                       DJ            I
Scottish                    N6            W
Senegalese                  MX            F
Serbian                     UM            E
Seychelles                  WJ            F
Sierra Leone                MY            F
Slovakian                   UN            E
Slovenian                   UP            E
Somalia                     MZ            F
South African               W0            F
Sri Lankan                  C5            A
Sudanese                    W2            F
Surinam                     W1            T
Swahili                     WS            F
Swaziland                   W3            F
Swedish                     N7            S
Swiss                       TH            W
Syrian                      DK            I
Tajik                       UR            C
Tajikistan                  UQ            C
Tanzanian                   W4            F
Telugan                     C7            A
Ethnic Codes (Cont.):
ETHNIC DESC                 ETHNIC CODE   GROUP CODE
Thai                        RF            O
Tibetan                     RG            O
Togo                        W5            F
Tonga                       W6            P
Tunisian                    DL            I
Turkish                     DM            W
Turkmenistan                UT            C
Ugandan                     W7            F
Ukrainian                   UV            E
Unknown                     00            Z
Uzbekistani                 UW            C
Vanuatuan                   RQ            P
Vietnamese                  RH            O
Welsh                       N8            W
Western Samoa               WK            P
Xhosa                       W8            F
Yemeni                      DN            I
Zaire                       W9            F
Zambian                     WA            F
Zimbabwe                    WB            F
Zulu                        WC            F
Assimilation Codes:
Assimilation is the process whereby an individual new to the country adopts the customs and
attitudes of the prevailing culture. Within each level of assimilation the individual's spending
habits, socioeconomic status, language and lifestyle preferences differ. ERES also applies
Assimilation codes to Hispanic and many other minority groups.
ASSIMILATION DESCRIPTION                CODE
Assimilated - English Speaking          A
Bilingual - English Primary             B
Bilingual - Native Language Primary     C
Unassimilated - Native Language Only    D