Electronic analysis and organization of the Syro-Turkic
inscriptions of China and Central Asia
Margherita Farina
Dipartimento di Scienze storiche del Mondo Antico
Università di Pisa
A Database of the Syriac and Syro-Turkic
Inscriptions from Central Asia and China
Prof. Pier Giorgio Borbone
Dipartimento di Scienze Storiche del Mondo Antico
Università di Pisa
Scholarship for collecting and digitizing epigraphic
material and for compiling concordances
History and distribution of the corpus
• About 700 inscriptions
• Syriac, Turkic, Armenian
• Central Asia and China (Kazakhstan, Kyrgyzstan; in China:
Xinjiang, Inner Mongolia, Quanzhou, Yangzhou)
• 12th-14th cent. A.D. (1426-1690 Seleucid Era)
• Christian (Nestorian?) and Muslim
Typology
Languages, contents, structure
• Funerary texts
• Monolingual (Syriac/Turkic) / Multilingual
• Variable length
• Variable competence in Syriac
“In the year [Seleucid era], Turkic year [animal name, in Syriac or Turkic]
This is the grave of [personal name] + status
Eulogy”
Chwolson 1897: 21 (= 231)
In the year 1596
this is the grave [of] the priest Abraham
disciple of the priest Bar Yuhannan
(…)
May our Lord forgive his sins.
ܒܫܢܬ ܐܬܩܨܘ
ܗܢܘ ܩܒܪܗ ܐܒܪܗܡ ܣܥܘܪܐ ܩܫܝܫܐ
ܐܣܟܘܠܝܐ ܒܪ ܝܘܚܢܢ ܩܫܝܫܐ
ܗܘܐ ܥܢܕܢܗ ܒܝܘܡ ܬܪܝܢ ܒܫܒܐ
ܪܡܝܢ ܥܕܬܐ ܡܪܢ ܢܚܣܐ ܚܡܘܗ /sic ܚܘܒܗ /C97 ܚܛܗܐ /C97
ܐܡܝܢ
Chwolson 1897: 52 (= 266)
In the year 1616
Turkic year of the snake
this is the grave [of] Sabrîšûʻ
archdeacon …
ܒܫܢܬ ܐܬܪܝܘ
ܗܘܐ ܫܢܬ ܛܘܪܟܐܝܬ ܚܘܝܐ
ܗܢܘ ܩܒܪܗ ܣܒܪܝܫܘܥ ܐܪܟܝܕܝܩܘܢ
ܣܒܐ ܒܪܝܟܐ ܟܗܢܐ
ܓܡܝܪܐ ܣܓܝ
ܥܠܡ ܥܠ ܝܨܝܦܘܬܗ ܕܥܕܬܐ
Sources and editions
It is difficult to trace the inscriptions and to establish
correspondences between various editions
• Corpora edited by Chwolson, Sluckij, Kokovcov et al.
(1886-1897)
• Djumagulov (1967-1987)
• Borbone, Dickens, Klein (2000-2009)
• Geng Shimin, Samuel Lieu, Niu Ruji, Wu Wenliang
(1996-2009)
About 40 publications, not always easy to access
Aim: organize all this material into a database
Obelix
Borbone & Mandracci 1987:
• MS-DOS operating systems > modern DOS interfaces;
• Concordances to the Peshitta of the OT;
Borbone, P. G. & Mandracci, F. (1989).
“An other way to analyze Syriac texts. A simple powerful tool to draw up Syriac
computer aided concordances.” Proceedings of the II Conference Bible and Computer,
Jerusalem, 9-13 June 1988. Paris-Genève: Champion-Slatkine.
Borbone, P. G. & Jenner, K. D. (1997). Peshitta. The Old Testament in Syriac; Part V:
Concordance to the Old Testament in Syriac. Leiden: Brill.
What does Obelix do?
• Parse every word and link it to its lemma
• Separate particles and suffixes from the words (for Syriac)
• Manage textual variants
• Give multilingual translations of the lemmata
• Cite lemmata in context (embedded in their sentence)
• Sort concordances alphabetically and generate all kinds of word
lists
• Sort the corpus roughly by grammatical category (N, V, A)
Step 1: text input
• conversion of the transcriptions into a plain
text file;
• unambiguous and consistent transliteration code;
• encoding of variants and additional information.
@0231
1 b$nt 'tqSw hnw qbrh 'brhm s`wr' q$y$' 'skwly' br ywHnn q$y$' hw' `ndnh
bywm tryn b$b' rmyn `dt' mrn nHs' Hmwh/sic Hwbh/C97 HTh'/C97 'myn;
@0266
1 b$nt 'tryw hw' $nt Twrk'yt Hwy' hnw qbrh sbry$w` 'rkydyqwn sb' bryk' khn'
gmyr' sgy `lm `l ySypwth d`dt';
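The input format shown above can be read back mechanically. The following is a minimal sketch, not the Obelix implementation: it assumes that each inscription starts with `@` plus its number, that the leading `1` is a line number, and that `;` closes the text. The letter table is read off the paired transliteration/Syriac examples in these slides; the function names are hypothetical.

```python
import re

# Letter codes read off the paired transliteration/Syriac examples in the slides.
# Codes with no Syriac counterpart shown there (e.g. 'x', 'z') are left unmapped.
TRANSLIT = {
    "'": "ܐ", "b": "ܒ", "g": "ܓ", "d": "ܕ", "h": "ܗ", "w": "ܘ",
    "H": "ܚ", "T": "ܛ", "y": "ܝ", "k": "ܟ", "l": "ܠ", "m": "ܡ",
    "n": "ܢ", "s": "ܣ", "`": "ܥ", "p": "ܦ", "S": "ܨ", "q": "ܩ",
    "r": "ܪ", "$": "ܫ", "t": "ܬ",
}

SAMPLE = """@0231
1 b$nt 'tqSw hnw qbrh 'brhm s`wr' q$y$' 'skwly' br ywHnn q$y$' hw' `ndnh
bywm tryn b$b' rmyn `dt' mrn nHs' Hmwh/sic Hwbh/C97 HTh'/C97 'myn;
@0266
1 b$nt 'tryw hw' $nt Twrk'yt Hwy' hnw qbrh sbry$w` 'rkydyqwn sb' bryk' khn'
gmyr' sgy `lm `l ySypwth d`dt';
"""


def parse_records(text):
    """Return {inscription number: [token, ...]} for every '@id ... ;' block."""
    records = {}
    for ident, body in re.findall(r"@(\d+)\s+(.*?);", text, flags=re.S):
        tokens = body.split()
        # Assumption: a leading integer is a line number, not part of the text.
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]
        records[ident] = tokens
    return records


def to_syriac(token):
    """Transliterate one token into Syriac script, ignoring any '/tag' suffix."""
    form = token.split("/")[0]
    return "".join(TRANSLIT.get(ch, ch) for ch in form)


records = parse_records(SAMPLE)
print(" ".join(to_syriac(t) for t in records["0266"][:2]))
```

Keeping the file in plain ASCII and converting to Syriac script only for display means the same text can feed both the lemmatizer and the printed edition.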
Step 2: lemmatization. Every word is linked to
• a lemma in the dictionary
• a translation
• a grammatical category
@0231
b$nt = $nt'.N& b.P
'tqSw = 'tqSw.A
hnw = hnw.A| hn'.A| hw.A
qbrh = qbr'.N
'brhm = 'brhm.N
s`wr' = s`wr'.N
q$y$' = q$y$'.N
'skwly' = 'skwly'.N
br = br'.N
ywHnn = ywHnn.N
q$y$' = q$y$'.N
hw' = hw'.V
`ndnh = `nd.V
bywm = ywm'.N& b.P
tryn_b$b' = tryn_b$b'.N
rmyn = rm'.N
`dt' = `dt'.N
mrn = mr'.N
nHs' = Hs'.V
Hmwh = Hmw'.N/ sic
Hwbh = Hwb'.N/ C97
HTh' = HTh'.N/ C97
'myn = 'myn.A
;
@0266
b$nt = $nt'.N& b.P
'tryw = 'tryw.A
hw' = hw'.V
$nt = $nt'.N
Twrk'yt = Twrk'yt.A
Hwy' = Hwy'.A
hnw = hnw.A| hn'.A| hw.A
qbrh = qbr'.N
sbry$w` = sbry$w`.N
'rkydyqwn = 'rkydyqwn.N
sb' = sb'.N
bryk' = brk.V
khn' = khn'.N
gmyr' = gmr.V
sgy = sgy.N
`lm = `lm.V
`l = `l.A
ySypwth = ySypwt'.N
d`dt' = `dt'.N& d.P
;
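Each line of the listing above pairs an attested form with its analysis. A minimal sketch of how such a line could be split into its parts follows. The reading of the separators is an assumption drawn from the examples only: `&` joins a word and its prefixed particle, `|` introduces further lemmata listed for the same form, and `/` introduces a reading tag such as `sic` or `C97`. `parse_entry` is a hypothetical helper, not an Obelix function.

```python
def parse_entry(line):
    """Split 'form = lemma.CAT& lemma.CAT| lemma.CAT/ tag' into its components."""
    form, _, analysis = line.partition(" = ")
    analysis, _, tag = analysis.partition("/")

    def pair(chunk):
        # The category is whatever follows the last dot: N, V, A or P.
        lemma, _, category = chunk.strip().rpartition(".")
        return lemma, category

    first, *others = analysis.split("|")
    return {
        "form": form.strip(),
        "parts": [pair(chunk) for chunk in first.split("&")],
        "alternatives": [pair(chunk) for chunk in others],
        "tag": tag.strip() or None,
    }


for line in ["b$nt = $nt'.N& b.P", "hnw = hnw.A| hn'.A| hw.A", "Hmwh = Hmw'.N/ sic"]:
    print(parse_entry(line))
```

Storing the particle as a separate `(lemma, category)` pair is what lets a concordance list `b$nt` under the noun `$nt'` rather than under the letter b.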
Step 3: generation of concordances
Concordances
8 *=Slyb'.N/ S_cross;
@0127 1 b$nt 'lp_$tm''_'rb`yn_t$`' hdy qbrh
Slyb'=*;
@0175 1 hd' hy qbrh Slyb'=* xw$TnS pSyn 'yl;
@0180 1 whd' hy qbrh Slyb'=* Tylt'/sic;
@0317 1 b$nt 'lp_$tm'_tltyn_$t' $nt twr' hd' hy
qbrh Slyb'=* `lymt';
@0383 1 b$nt 'trn $nt Tb$x'n/und hdy qbrh `lymt'
Slyb'=* myt mwtn';
@0451 1 $nyt/nau Hwy' hdy hny/nau qbrh Slyb'=* TlyT'/sic;
@0612 1 b$nt 'lp_$tm'_`sryn_tmny' hw' $nt Hwy' Twrk'yt yyl'n hd' hy
qbrh Slyb'=* mhymnt';
@0704 1 mngw_Tngry kwySynT' mngk' x'x'n yrlykmz pyznyng 'wySwn $hr'
T'pynyp 'lky$ xlyp 'wrwx 'wrwxwmyz_x' pwy'n pyrswn Typ Slyb'=*
T'mx pyrTymyz mry qtwlyq'_x' pw T'mx'ny kwyz_k'S 'ryp mry
Hsy'_l'r rbn_l'r 'rk'kwn_l'r mry qtwlyq'_Tyn swyzsyz 'Tyxsyz
klm'swn_l'r pw T'mx'lyx pytygsyz 'wyz kwngwlS' klglg rbn_l'r
'rk'kwn_l'r y'byz m'x' s'xynyp k'lyz'ryn Typ yrlx'Tymyz;
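The concordance above can be reproduced in outline from the tokenized texts and a form-to-lemma table. This is only a sketch under stated assumptions: the leading number of the header is taken to be the number of occurrences (it matches the eight lines listed for Slyb'), the translation part of the header is left out, and the line number is fixed to 1 as in every example shown. The function and variable names are hypothetical.

```python
def concordance(records, lemma_of, lemma):
    """List every inscription containing `lemma`, marking the keyword with '=*'."""
    lines = []
    for ident in sorted(records):
        tokens = records[ident]
        for i, token in enumerate(tokens):
            form, sep, tag = token.partition("/")
            if lemma_of.get(form) != lemma:
                continue
            # The marker goes between the form and any reading tag, e.g. 'wT=*/sl.
            marked = tokens[:i] + [form + "=*" + sep + tag] + tokens[i + 1:]
            # Line number fixed to 1, as in every example shown in the slides.
            lines.append(f"@{ident} 1 {' '.join(marked)};")
    return [f"{len(lines)} *={lemma};"] + lines


records = {
    "0175": "hd' hy qbrh Slyb' xw$TnS pSyn 'yl".split(),
    "0180": "whd' hy qbrh Slyb' Tylt'/sic".split(),
    "0266": "b$nt 'tryw hw' $nt Twrk'yt Hwy' hnw qbrh sbry$w`".split(),
}
lemma_of = {"Slyb'": "Slyb'.N", "qbrh": "qbr'.N"}

for line in concordance(records, lemma_of, "Slyb'.N"):
    print(line)
```

Because the lookup goes through the lemma rather than the surface form, asking for `qbr'.N` instead would return all three inscriptions even though the attested form is `qbrh`.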
Concordances
• Word / string
• Grammatical category (N, V, A)
• Translation
• Year
• Office / position etc.
Sort inscriptions by date 1
Sort inscriptions by date 2
18 *='wd.A/ T_bue;
@0003 1 b$nt 'tqsh hn' Twrk'yt TwrkS'/sl T'w$x/sl 'wd=* ()'rdy/sl hd'
hy qbrh dmrym k'Twn xw$T'nS;
@0052 1 b$nt 'lp_$tm'_tryn_`sr 'wd=* hnw qbrh ywHnn 'skwly';
@0109 1 b$nt 'lp_$tm''_'rb`yn_wHm$ hw' $nt twr' Trk'yt 'wd=* hnw qbrh
'ly' mhymn';
@0116 1 b$nt 'trmH Twrk'yt 'wd=* hdhy qbrh Tbyt' mhym(n)t';
@0117 1 b$nt 'lp_$tm''_'rb`yn_tmny' hw' $nt twr' Twrk'yt 'wd=* hd hy
qbrh kwTwr_T'ryn mhymnt';
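Sorting by date presupposes that the year following `b$nt` can be turned into a number. A minimal sketch for years written as letter numerals (such as 'tqSw = 1596 and 'tryw = 1616 in the examples above) is given here. It uses the standard numerical values of the Syriac letters and treats a leading alaph as 1000. Years spelled out in words (such as 'lp_$tm'_tryn_`sr) are not handled, and the function names are hypothetical.

```python
# Standard Syriac letter values, keyed by the transliteration codes used in the slides.
# Assumption: 'z' stands for zayn; it is not paired with Syriac script in the examples.
VALUES = {
    "'": 1, "b": 2, "g": 3, "d": 4, "h": 5, "w": 6, "z": 7, "H": 8, "T": 9,
    "y": 10, "k": 20, "l": 30, "m": 40, "n": 50, "s": 60, "`": 70, "p": 80, "S": 90,
    "q": 100, "r": 200, "$": 300, "t": 400,
}


def seleucid_year(code):
    """Decode a letter-numeral year; a leading alaph before other letters counts as 1000."""
    if len(code) > 1 and code.startswith("'"):
        thousands, rest = 1000, code[1:]
    else:
        thousands, rest = 0, code
    return thousands + sum(VALUES[ch] for ch in rest)


def ad_range(seleucid):
    """A Seleucid year begins in October, so it straddles two years A.D."""
    return seleucid - 312, seleucid - 311


years = {"0266": "'tryw", "0231": "'tqSw", "0383": "'trn"}
by_date = sorted(years, key=lambda ident: seleucid_year(years[ident]))
for ident in by_date:
    se = seleucid_year(years[ident])
    print(ident, se, ad_range(se))
```

Applied to the limits of the corpus, the same conversion gives 1426 and 1690 of the Seleucid era as roughly A.D. 1114 and 1379, in line with the 12th-14th century dating.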
Orthographic variants and different readings
A. Lemmatize orthographic variants and
mistakes
B. Codify different readings (when autoptic
examination is impossible)
Lemmatize orthographic variants and mistakes
18 *='wd.A/ T_bue;
@0003 1 b$nt 'tqsh hn' Twrk'yt TwrkS'/sl T'w$x/sl 'wd=* ()'rdy/sl hd'
hy qbrh dmrym k'Twn xw$T'nS;
@0052 1 b$nt 'lp_$tm'_tryn_`sr 'wd=* hnw qbrh ywHnn 'skwly';
(…)
7.2 *='wT.A/ T_bue;
@0033 1 b$nt 'lp_$tm'' yy'/sl 'wT=* 'rdy gywrgys q$y$' ry$_`dt' sgy
nSyH';
@0048 1 b$nt 'lp_w$tm''_wHd`sr hw' `wqbr' sSx'n sySx'n/sl yyl yyly/sl
'rdy 'rdnyng kb kwn 'wT=*/sl 'rdy/sl 'wldy hnw qbrh $m`wn Tly'
br Swm' 'skwly' y'T pwlswn tnr'/und tnSr'/und tSr'/und nyng
dw$md/und yw$md/und;
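The two headers above share the translation code T_bue, which is what keeps the spellings 'wd and 'wT together. One possible way to collect such variant spellings is to group the concordance headers by that code, as sketched here. The header pattern is inferred from the examples only (number, `*=`, lemma, category, `/`, translation code), and `group_by_gloss` is a hypothetical helper.

```python
import re
from collections import defaultdict

# Header shape inferred from the examples, e.g. "7.2 *='wT.A/ T_bue;".
HEADER = re.compile(
    r"^(?P<num>[\d.]+) \*=(?P<lemma>.+)\.(?P<cat>[A-Z])/ (?P<gloss>\S+);$"
)


def group_by_gloss(headers):
    """Map each translation code to the lemmata (spellings) filed under it."""
    groups = defaultdict(list)
    for header in headers:
        match = HEADER.match(header.strip())
        if match:
            groups[match["gloss"]].append(match["lemma"])
    return dict(groups)


headers = [
    "18 *='wd.A/ T_bue;",
    "7.2 *='wT.A/ T_bue;",
    "8 *=Slyb'.N/ S_cross;",
    "47.1 *='r.V/ T_to_be;",
]
print(group_by_gloss(headers))
```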
Codify different readings
@0231
1 b$nt 'tqSw hnw qbrh 'brhm s`wr' q$y$' 'skwly' br ywHnn q$y$'
hw' `ndnh bywm tryn b$b' rmyn `dt' mrn nHs' Hmwh/sic
Hwbh/C97 HTh'/C97 'myn;
@0231
ܒܫܢܬ ܐܬܩܨܘ ܗܢܘ ܩܒܪܗ ܐܒܪܗܡ ܣܥܘܪܐ ܩܫܝܫܐ ܐܣܟܘܠܝܐ ܒܪ ܝܘܚܢܢ ܩܫܝܫܐ ܗܘܐ
ܥܢܕܢܗ ܒܝܘܡ ܬܪܝܢ ܒܫܒܐ ܪܡܝܢ ܥܕܬܐ ܡܪܢ ܢܚܣܐ ܚܡܘܗ /sic ܚܘܒܗ /C97 ܚܛܗܐ /C97
ܐܡܝܢ
Hmwh = Hmw'.N/ sic
Hwbh = Hwb'.N/ C97
HTh' = HTh'.N/ C97
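The coded readings can be pulled out of a text by collecting every token that carries a `/tag` suffix. The sketch below shows this on the end of inscription 231. What each tag stands for is not spelled out in the slides, so the code only groups forms by tag and does not interpret them (C97 presumably points to Chwolson 1897). `tagged_readings` is a hypothetical helper.

```python
def tagged_readings(tokens):
    """Group the forms that carry a '/tag' suffix by their tag, in text order."""
    readings = {}
    for token in tokens:
        form, sep, tag = token.partition("/")
        if sep:  # only tokens that actually contain a slash
            readings.setdefault(tag, []).append(form)
    return readings


tokens = "mrn nHs' Hmwh/sic Hwbh/C97 HTh'/C97 'myn".split()
print(tagged_readings(tokens))
```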
Limits: parsing Turkic
47.1 *='r.V/ T_to_be;
@0003 1 b$nt 'tqsh hn' Twrk'yt TwrkS'/sl T'w$x/sl 'wd ()'rdy=*/sl hd'
hy qbrh dmrym k'Twn xw$T'nS;
@0033 1 b$nt 'lp yy'/sl 'wT 'rdy=* gywrgys q$y$' ry$_`dt' sgy nSyH';
(…)
@0701 1 b$m 'b' wbr' wrw' dqwd$' lylmn mqdwny' p'lyxlyx pylypws x'n
'wxly 'lxsndrws 'ylyx x'n s' ky$y yyl myng 'lTy ywz yygrmy
TwyrT yylynT' Twyrq s' xy$y 'wwd yyl 'wnwnS 'y 'lTy y'ngyT'
Twyz y'rynT' prq'mS' xw$T'(S) my$H'nyng y'rlyxy pwyTwrdy
'wyzwTy 'w$Tym' xd' 'r=* r y'd pwlzwn 'mn;
Limits: parsing Turkic
1 *=Tyn.A/ T_ending_(ablative);
@0704 1 mngw_Tngry kwySynT' mngk' x'x'n yrlykmz pyznyng 'wySwn $hr'
T'pynyp 'lky$ xlyp 'wrwx 'wrwxwmyz_x' pwy'n pyrswn Typ Slyb'
T'mx pyrTymyz mry qtwlyq'_x' pw T'mx'ny kwyz_k'S 'ryp mry
Hsy'_l'r rbn_l'r 'rk'kwn_l'r mry qtwlyq'_Tyn=* swyzsyz 'Tyxsyz
klm'swn_l'r pw T'mx'lyx pytygsyz 'wyz kwngwlS' klglg rbn_l'r
'rk'kwn_l'r y'byz m'x' s'xynyp k'lyz'ryn Typ yrlx'Tymyz;