ComputerProgramsirt Biomedicitte6 (197O 23*38
O North-Holland
l'ublishing
Company
FINDINGCONDENSED
DESCRIPTIONS
FOR MULTI-DIMENSIONAL
DATA
NannyWERMUTH,TheoWEHNERand HerberrGdNNER
Institut fi)r Medizinische Statistik und Dokumentotion der Johannes Gutenberg.IJniversirat,
65 Mainz, Langenbeckstr. 1, I'-ederalRepublic o!'Gerntany
W c d e s c r i b et w o p r o g r a l n st h a t m a y b e u s c d t o f i n d c o n d e n s e dd c s c r i p t i o n st b r d a t r a v a i l a b l ci n r c o n t i n g e n c y
table or in a covariance matrix in the case that these data follow a multinomial or a multivariate normal distributron.
respectively. The prograrns perform a stepwise rnodel searr:hamong multiplicative models by cornputing anpropriate
l i k e l i h o o d - r a t i ot e s t s t a t i s t i c s .
Model searchprocedure
rnultiplicativc models
contingcncy tablc analysis
analysis of correlation matrices
l. lntroduction
r ( . y lx 2 x t x 4 x 5 ) =
The interrelationsamongseveralvariablesare usually difficult to understandunlessthey canbe described
in a condensed
manner.If thereexistsa joint probability distribution function, then a fairly cornpact
quantificationof suchinterrelations
is available.
Condenseddescriptions
forjoint distributionsare ofTered
by multiplicativemodels[6,7]. They area subclass
of
log-linearmodels[1,2], and a subclass
of covariance
selectionmodels[4] in the casethat the joint distribution is a muitinorninalor a multivariatenormal distribution,respectively.
Eventhoughmultiplicative
modelsfor multinomial distributionshavealready
beenemployedasdata-analytic
tooisin extensive
medicalstudies[3,5], a more widespread
usewill be
likely only after the modelsearchprocedures[7] have
becomewell known, and after the corresponding
cornputer programsare easilyavailable.This papergives
descriptionof COVSELand TASEL, two Fortran IV
programsfor the ntodel searchamongmultiplicative
models.
2. Methods
A multiplicativemodel stateshow a joint distribution may be factoredinto marginaldistributions.Supposethat the joint distiibution of five variablescan
be factoredas
f (x1x2 x3) f (x1xa) f(x1 x5 )
-..
(i,l aGil
then the notation for this multiplicativemodel is
123114115.
Thus,the notationfor a multiplicative
modelshowsthosesubgroups
of variables
that producethe interrelations
amongall variables.
Furthermore, the notationtellswhich variablepairsareconditionallyindep,endent.
They areall thosepairsthat
do not belongjointlyto one of the subgroups
listed
in the notation.In model 123114115,
for instance,
the pairs(2,4), (2,5),(3,4),(3,5), (4,5) areall conditionally independentgiventhe remainingthree variables.Theseindependencies
imply a relativelysimple
patternof association
for all variables:givenvariable
1. thejoint variable23 andthe variables
4 and 5 are
no longerinterrelated,they areindependentof each
other.
Theseinterpretations
may seemratherabstractbut
they giveuseful and practicalguidelinesfor the data
analysis.
Supposethat model 123I 14l15 fits a set of
datawell. Thenwe know that the variablegroups123,
14 and l5 shouldbe analyzedin more detailbecause
they containthe more importantinterrelations.
In addition, a way to presentthe datais suggested.
The
materialmay be displayedin groupsthat arehomogeneouswith respectto variable1. Then,interrelations
areeasyto grasp,sincethereis only one association
left, the one for pair (2,3).
Likelihood-ratio
testsmay be usedto evaluatethe
N. llermutlt et al., Multi-demensional data description searclt
fit betweenthe maxinrum-likelihood
estimates
for a
givennrodeland the otrserved
data.Let nir,...,i, und
,irir,...,;, denotethe observedand the estimatetr
cell
countstbr a multinomialdistribution,and let l,9land
of the observed
and the estilXl be the determinants
rnatedcovariance
matrix tor a multivariatenormal
distribution,andn be the samplesize.Than the chisquareteststatisticsare [2,4,71
f it,
, 1 n i , . . . . i.',",a n d
lrl
It l_'"
2ln
12)
i r , . . . i.p L n i, . . . . i. , )
/:,\ nl2
2h {++l
\ lJt/
(3)
respectively.
To computethesetest statisticsin the caseof multiplicativemodels,the maximum likelihood estimates
neednot be evaluated
explicitly.lnstead,fbr each
model the abovestatisticmay be derivedas the sum
of teststatisticsfor certainzeropartialassociations
(z.p.a.).As an example,we partitionthe test statistic
in (3) for model 123ll4l I 5 . Let D r234s,DtZt, D tc
denotethe determinants
of the observed
correlation
matrix for aii five variablesand of the submatrices
(1 ,2,3)and (1,4), only. Thenwe
containingvariables
c a nw r i t e ( 3 ) f o r m o d e i1 2 3 / 1 4 l 1 5a s
tt
f D t z t D M D t s l D I D-rll ' l 2
-2lnl--"
;
L
u12345
Gl
J
This statisticequals,for instance,the sum of the following five test statistics
Dnts Dlja5lDljl-nl2
-- )zl nl|nf - '-1 - - ^ : - - J
Drzzqs
L
rrt2,
t
fDt2jDlj5lDljl
/
\
fDr3aDlj5lD,;,f-rtzy
|
,
Dnqs
+l-2lni
\LDnts)/ ^
--l
'
*t_rln
- "t .'-L
. ( ,' (4rP') "'')
.
*(-,'l%];,-l!.1-','')
(s)
The first of thesestatisticsevaluates
the condrtionalindependence
of pair (2,4) in the joint distribu.
tion of all five variables(1 ,2,3,4,5).The secondstatistic evaluates
the conditionalindependence
of pair
(2,5) in the marginaldistributionof the variables
(1 ,2,3,5).Anotherinterpretationfbr this type of a
sequence
of test statistics
hasbeendemonstrated
(in
They
represent
that
order)testsfor zer<r
[6,7].
partiaiassociation:
of pair (2,4) givenmoCel12345;
o f p a i r ( 2 , 5 )g i v e nm o d e l1 2 3 5 / 1 3 4 5 ; opf a i r( 4 , 5 )
givenmodel 12311345;ofpair (3,5) givenmodel
12311341135;of pair (3,4) givenmodel 1231134115.
The fact that eachlikelihood-ratiotest statisticfor
a multiplicativemodel may be partitioned into a sequenceof easiiycon.rputable
test statisticsfor z.p.a.
(asin our example)is the basisfor the model search
procedureperformedin COVSELand TASEL.
3. Descriptionnf plograms
The programsand flow chartsare organizedin
threeparts,(A) the setupfor a givenset of data,(B)
the first selectionstep and preparationsfor the second
selectionstep,and (C) the performanceof a! of the
remainingselectionstepsin one largecycle.The first
selectionstepneednot be treateddifferently than the
other steps.We decidedto leaveit separate
though,to
savecomputingtime and becausewe wanted to use the
resultsof the first selectionstepin other data-analysis
programs,aswell.
A
The input cardsconsistof five different types of
cards.
1: Beginningcard,AAAA in the first four columnsof
the first input card;
2: a headingcard;
3: informationfor the datacards(for CCVSELthe
(NVAR) and the numberof
numberof variables
(NOBS),for TASEL the numberof
observations
per variableon (ICARD ");
categories
5: the main data(for COVSELthe lower triangular
correlationmatrix, for TASEL the celi countsul
e a c hc e l lo f t h e c o n t i n g e n ctya b l e ) .
In the casethat an error in the input cardsis enareprintedand the next
countered,errormessages
*A
durrrn.ry"1" in the second column made this card compatible with L. (]oodman's liCTA-program (Universit-vof
Chicago).
N. Wermttth et al., Multi-demensional data description search
C OVSEL
C L E A R C O U NT E R S
READlst INDUf-fAPlAMAtNACATES
BE6INNIN6
OF A 'ET OF DATA
R , E A D AP
NPDI N TH E A D I NG ,
THE 2nd TNPUT-CARD
READ
ANDPRJNT
NUMEER
Of VARIABIf
R),NUMBER
O FO 8 5E R V A T I O N S
WO85), THE 3 rd t NPUT- CA RD
P R / N TE R R O R
ME55AGF
R E A DF O R M A TF O RD A T A M A T R I X
T H E4 I h I N P U T - C A R D
AND PRINT.
MATI( LOWER
TPIAN6UT
MTAMATAIX),THENEXTNVAR
INPUT
5 UB R O U T I N E
R E A D M2
P R / N TE R R O R
MESSAGE
TE DETERM
INANT(DET) A N D
(MAT)OFMATI
iAIIVE INVEPSE
T:1t-10
FORM ALL NVAR (NVAR-I)/2
VARIABLI PAIRS
5UEROUTINE:
Q5w
P RINT ERROR
ME55A6E
25
z6
\'. lllcnnutlt (.t al., Multi-detnensional data dest'riptittrt seur[lt
COVSEL
INDEXCOM
B I NA T I O N F O R
F I R S TS E L E C T I O NS T E P
5 ELECT VARIABLE PAIR,EY
P O 5 t T I O (Nt N D E ) W t T HS M A L L EX52I
COMPUTE PROBABILITI ES A ND
7 t ' so r r t n s fi f t E c r / o N s r E P
P N N {I ' 5 0 r T f f
SUBROUT/NE
CHILOW
5UBROUI/NE
DRUCKl
S E L E CEfD p A t R
COMPUTE AND PRINT
x2 ro rtsr Ftr oFTHE
S f I E C T F DM O D E L
(,(.0€PTNOENT
r t 0 Nc H t q
J
S I O R f / N D T \ C O , N8 l/ N A T / C N
o F T H ES t L E C T E D P A r R
5USROUT'NEI
U5PMR
F A C T O R I N D EA CO M BI N A T / O N
O F T H ES E L E C T E DP A I R
5UBROUIINE;
PERM
'IORI BESULT
OFPERMIN IZA ANDINT
5USROUT/NE5
U5PZ, USPN
PRlNI- NOIA]-ION FOR TflE
1 s I S E L E C T E DM O D E L
MARK THE SELECTED
PAIR
(v3(tNDE)=8)
PAIRSTOBE
MARKALLVARIAELE
lN 2nd 5ELECT|0NlTE
EXCLUDED
( v 3 ( . )= e )
BINATIONFOR
FIND INDEXCOM
tN 2nd ST
EACH PAIRINCLUDED
SUBROUTINE:
T E5 7 ' N
- D f P f N D E NI
N. lttermuth et al., Multi-dimensional
data tlescription search
C OV5 E L
c)MPuft x2'5r0 rc,T FORADD|qONA
Z.PA.FOREACHPAIRINCLUDEDIN
T H I S S T L E C T I O NP A IR
5 U 8 R O U TN
IE
QSW
PAIRBYFINDINi
SELECTVARIABLE
P)StrtoN (tNDE)0FSMALIES|
y'z
5 U B ROUTINE
PRt NT X'5 OFfHt s 5 ELEcTtoN sTE
SUBROUTINE
D RU C K l
c H lL o w
LUL- UCtsC
FUNCTION,
CH I
P R I N TT H E S E L E C T E DP A I R
coMPUfE
ANDPRtNf72 rO rEsT
FITOFTHESELECTED
MODEL
S T O R EI N D E X C OBMI N A T I O N
O F T H ES E L E C T E DP A ; R
5 U8 R O U N N E .
USPMR
F A C T O RI N D EX C O M8 I N A T I O N
O F T H E S E L E C I E DP A I R
SUEROUTINE
PERM
STARE
I N P U TA N D R E S U L O
T FP E R M
IN IZA AND /N E
MARK THE 5 EL TCTEDPAIR
(V!(III'T)
,8) AND NULLIFV TXCLUS
IONS
( V J H . 9 ) A F P R E V I O U SS T E P
PPINT
NOTANON
FOR MUTUAL
CAN CEL AIL I N DEXTOMEINATIONS
I N I ZA THAT ARI CONTNNTD
IN IN E
A N DP i l N T N O T A T I O N F O R T H E
S E L E C T T DM O D E L
M A R K A L I V A R I A E L E P A I R ST O E E
EXCLUDEDIN THENEXTSELECNON
srEp (v3(.)=9)
INDEXCOM
8I NATI ON S IN I Z A
FOR EACHPAIR INCIUDEDIN
T H EN E X TS E L E C T I OSNT E P
SUEROUNNT.
Z A THLNEU
cttARwoRKtpActs
27
28
N. llermuth et al., Multi-tlimensional data description searclt
TASEL
CLEAR COUNTERS
P E A D 1 5 1I N P U T - C A R D
A A A A " I N DI C A T T SB E 6 l N N / N O
OF A SET OF DATA
PRiNIFiNJT
OF
CHARACTERS
WRON6 CARD
R E A DA N D P R / N T :
HEADTNGTHE2nd INPUT-CARD
( I C A R DN
) UMBER
O FC A I E 6 0 R I E S
VARIABLE,THE 3rd I NPUI-CARD
( NVAR),
OFVARIAELES
NUMBER
N U M B EO
RFC E L L , { N I E L U F RIACMA R D
PRINTTRROR
M ESSA6E
P R I N IE R R O R
MESSAGE
READFORMATFOPTHf.DATATABLE,
T H E4 I h I N P U T- C A R D
PEADffABITHE
DATATABLE, LASI
C A R D SO F T H I Ss E TO F D A T A
.t,
ffii
(rci-!,
IvArloNi rl;,rI.l :0iJulgL1l
FORM ALL NVAR. N V AR-l')/ 2
-
N. Wermuth et al., Multi-dimensional
TASEL
F O R M I N D E X C O M BN
I A T I O NF O P
TI-18 FIRSTSELECflON STEP
C?MPUTEy''5 T0 TESTZ.P.A.
V A R I A B L EP A I R
OF EACH
S UB RO U T I N
E:
c H l c R , c t - it T A
PRINTV,'5
OF FIRSTSELTCTION
STEP
SUBROUTINE:
DRUCK2
STTETT
VARIABLE
PAIR8Y FINDIN6
P05I Tt0N(tNDt) t{IH Ht6Ht'r pR?
8A8tUl
S U8 R O U T I N E :
PRHI6H
MARK THE5TLECTE
DPAIR(V3(INDO.8)
P R I N T H ES E L E C T E P
DA I R
PRINr 12 TO TESTFtT 0F THE
S E L E C T EM
DODEL
SIORE'NDEX COM BI NAT I ON
DA I R
O F T H E S E L E C T EP
SUBROUT/NE:
USPMR
FACTOR I N DEXCO
M 8 I NATI ON
O F T H ES E L E C T E D
PAIR,
5U BROUTIN E:
PERM
S T O R ER E S U L TA F P E R M
IN IZA AND /NE
P R / N T N O T A T I O NF O R T H E
l s t S E L E C T E DM O D E L
;
5 UBPOUT|NfS
U s P Z ,U S P N
DC-DEPENDENT
OVERPRiNT
MARK ALL VARIABLE PAIRS
T O B E E X C L U D E Dl N 2 n d
S E L E C T T sOrN
E P ( V 3 ( . ). 9 )
FI N D I I'IDEXCAM BI NATI ON 5
F O R E A C H P , A I RI N .
C L U D E Dl N 2 n d 5 T E P
5 UBROUTINF:
I€5 T,N
data description search
29
30
N. Wermuthet al., Multi-dimensionaldata description search
TASEL
C0MPUTEX''5 TO TEST FORADDt T I O I ' I A LZ . P A . F O R E A C H P A I R
INCLUDEDIN THISSELTOIONSTEP
5 UB R 0 U T N
t E,
ITA
CHICR,CH
PRINTy''5 0FTHlt SELE(\?N'TEP
SUBROUTINE:
DRUCK2
V A R I A B L EP A I R B Y
SELECT
(INDI) WITH
FINDIN6POSITION
H I G H E S TP R O E A B I L I T Y
5U EROUTINE:
P R HI 6 H
(VJ(tNDt).8)
THEstLECr.ED
PAtR
PRII'IT THE SELECTED PAIR
co M PUTE AN D PRtNT x'z T0 TESI
FITOf THESELECTED
MODEL
STORE IN DE XCOM BI NAT I ON
OF THE SELECTED PAIR
SUPROUTINE:
USPMR
FACTOR I N DEXCOM8 INATI ON
O F T H E s E L E C T E DP A I R
SUBROUTINE:
PERM
rOREINPUTAND
RESULTOF
P E R MI N I Z A A N D I N E
SUEROUTINES:
(V3U.9) 0F
NULLTFY
EXCLUSIONS
P R E V I O U SS T E P
FOR MUTUAL
CANCELALLINDIXTOMBINATIONS
! N IZA THATARECONTAINED
INE,
I I D P R I N TN O T A T I O NF O R T H E
S E L E C T E DM O D E L
M A R K A L LV A R I A B L EP A I R ST
N THENEXT
B E E X C L U D EI D
( v 3 ( . )" 9 )
SELECTt0N
STEp
FINDtNDEXC)l"lBtNATt?NS
tNIZA
THATBI'LON6TOf ACHPA!I?II'1.
II\ TIIENiXT5ILTCTION
ILUDSD
5f EP
5 UB R O U T I N E :
ZAEHLNEU
WJfTOFDATA
N. Wermuth et al., Multi-dintensional data description searclt
AAAA card,that is the next setof data,is being
for.
searched
B a n dC .
For a setof datawith NVAR variables,
thereare
NVAR (NVAR-1)/2variablepairsand the samenumber of selectionsteps.In eachstepone variablepair is
At the same
selectedto havezeropartialassociation.
time the notationfor the corresponding
selectedmultiplicativemodelis printed,aswell asits goodness-ofllt statistic(for COVSEL (3), for TASEL (a)). The
decisionon the fit or on the lack of fit of a selected
modelis left to the user(compare[7] and the sample
run). The goodness-of-fit
statisticsaresumsof chisquarestatisticsfor z.p.a.
We needsomenotation to describethe actuaicomputatiorrof a chi-square
statisticfor z.p.a.[7] at any
selectionstep.kt (i,7)denotethe indicesof a variable pair and let (tl() be an index combinationin
which K containsindicesof someor all other variables.Then,for COVSELwe compute
- l l n l l -t ',- '
;
viivzl
(6)
|
,tt yll _)
whererii, rii arethoseelementsin the inversecorre(ijK) that correspondto the
lation matrix of variables
pair (i,i). For TASEL,we compute
?
\a
/L. tt,,r ln ni,r
tlh
- (j
ri.g ln tt,.r
tK
5as\
+ /Jn iKIntt.6- 4/,r..xlnn..s
]K
(1)
K
whereni16 denotecell countsin the (marginal)table
of variables(ryK) and
t
\-\
n..K = /J ili. K = L/ iliiK
tl
.
The pair (i,7) is actually selectedwhen it is the most
likely pair to havez.p.a.(amongall thosepairsincludedin that selectionstep).After the selectionof
pair (i,7),the variables
(tlK) will no longerbe investigatedjointly but only in subgroups.
The index combination QlK) will be factoredas
0K) (iK)
(t,()*ff.
(8)
This kind of factoringgivesthe index combinaofvariables,and it leadsto the
tionsfor subgroups
3l
multiplicativemodels.Atier
notationfor the selected
the first seiectionstep,the index combinationon the
left hand sidein (8) and in the denominatorarestored
in lNE, thoseof the numeratorarestoredin lZA. After canceilingtermscontainedin INE aswell asin
IZA, the contentof IZA givesthe modelnotation:
the contentof INE tellswhich variablepairscannot
their
be includedin the next selectionstep(because
z.p.a.would not leadto a multiplicativemodel).
The vector V3 storesinformation on the variable
pairs.If, for instancepair (1,2) is not yet selectedto
h a v ez . p . a .t,h e nV 3 ( 1 ) = 0 , i i p a i r ( 1 , 2 )i s t e m p u rarily excludedin a selectionstep,then V3(t) = !
V3 (1) = 8. Aiand after pair (1,2) hasbeenselected,
ter a1lof the NVAR' (NVAR-l)/2 selectionsteps,all
elementsof V3 equal8 and workspaces
areclearedfor
t h e n e x t s e to f d a t a .
4. Sampleruns
As samplerunswe displaythe two setsof data,
that havebeenusedprevlously[7] to describethe
selectionalgorithmin detail.
For the five indicatorson the maturity of a newborn inlant (COVSEL)the variablepairs(4,5) and
(2,5) areselectedin the first two selectionsteps.Thus,
after the secondstepthe selectedrnodeiis rnodel
12311135,
the corresponding
likelihoodratio statistic
(3) hasa valueof 2.78 on 2 degrees
of freedom(d.f.).
to a fairly high probabilityof
Sincethis corresponds
0 . 2 4 8w e c a nr e g a r dm o d e l1 2 3 4 / 1 3 5a sa w e l l - f i t t i n g
model.But, the z.p.a.of pair (2,4) in the third select i o n s t e p( a n dh e n c em o d e i1 3 5 / 1 2 3 i)s n o t c o m p a t i of 21.00 on I degree
ble with the data: a chi-square
of freedomindicateswith p = 0.000 that pair (2,4) is
extremelyunlikelyto havez.p.a.;similarly,the test
with a valueof 23.19
statisticfor model 13511341123
on 3 d.f. andp = 0.000 showsthe lack of fit of this
model.For all of the followingselectionstepsit is assumed,that (2,4) actualiyhasz.p.a.,therelbrenone
of.the followingntodelsshouldbe judgedasbeing
acceptable.
For the four symptomsobservedon psychiatnc
patients(TASEL) the variablepairs(2,3),(3,4) and
(1 ,2) areselectedin that orderin the flrst threeseleccomputedto testtheir z.p.a.,
tion steps.The statistics,
a r ea l l s m a i l :3 . 3 9o n 4 d , . t . ; 4 . 9 9o n 2 d . f . ; a n d5 . 4 9
N. l4)ermuthet al., Multi-dimensionaldata descriptionsearch
2=SO].-T]]I TY r
IaVALIOIIYI
3:STAB I.LI TY I
flulfaER oE CATEGoRIESFoR tHf
mPur-rABLf*
_ 15
q ,,_12,,
_30-,_*
q:ACUTE
OEPRESSTOIIL
rr vARl.-AA\es:lL 2t
L3
22
__ 1l__
't
2r._ rr
zl___[L 2l _
,ra_
__
-._=_..16
l\lUuBER_OLCL
\lur{BER0F oEsERvATtohiS
; 362:
F O RT H E S U B T A B L E
I.r3
L23q
1r 4
L23t+
2t3
123{
5.3953
5r 4
123r+
7.6ti01
--'
-*,,12-.8695
--tr-..,--,.--lL-01L9
tt
55.00r+3
-
010000
0.4qq3
_,_r+___-.,
t{
0.1057
P-A
SELEC.TEO
lfr 2 r 3
NOTAIION FOR THE STL€CTEO I4OOELL
/ L34
lzt+
PAIR
,
112
INDLXCOI4BIIIATION
FOR THE.SUBTABLI
Lzt+
113
l3t+
2rT
Lzt+
3r{
154
CHI.SOUARE
O-
5. q859
2
0.064rr
13.57+5,--L-'---.l)gQ{LLl
rr.990$
2
0.0825
S E L E C T E DP A I t i 3 r 4
cHIo-suM=
8.38356
oF-sut4=
6
P-sUt4=
0.21132
T O I A I I O N F O R T H E S F L E C T E DI I O D E L L
LzI+/ L3
PAIR
L t2
INDLXCOI'IBINATION
FOR THE SUtsTABLE
12t{
lr5
15
10.0235
1
010000
1r4
12{
30.795r+
2
0
2r4
L24
19.7331
2
0.0000
S E L E C T E OP A I R 1 r 2
13.86958
cHIo-SuM=
oF-StrM=
CHI-SOUARE
DF
5. rr859
2
0.05{4
I
P-SUu=
P
0.08S23
N. llermuth
et al., Multi-dimensional data desuiption
search
JJ
T v O T A T I 0 NF o R T H L S E L f C T E O l . O o E L L
1 3 / 2 t +/ L t +
PAIR
1r3
INDLXCOIUEINATIOI.I
FOR THL SUBTABLE
13
1t.{
2rT
P
OF
ChI-SOUARE
10r 0255
1
0.0 0! 0_
1'{
28.0325
I
0.0000
2t1
L6.9692
1
0.0000
S E L E C T E DP A I R 2 r 4
30,8387ri
cHto-sur/i =
DF-SUM =
9
P-SU|4 =
0.00052
it0TATI0N FOR THt SELtCTt0 i400ELL
13/t4/2
PAIR
1r 3
INOLXCOIV,BINATION
FOR THE SUB1ABLE
13
1rT
ltf
CHI-SOUARE
10.0235
I
0.0000
28.c325
I
0r0000
S E L E C T E 0P A I R l r t { -
c H I Q - s u M=
58,87125
o F - S U U=
OF
10
P - S U H=
0.00000
^ T O T A T I O NF O R T H E S T L F C T E OF 4 O O E L L
L3/2/q
PAIR
1r3
INOEXCOFBINATION
FOR THE SUBTABLE
15
S E L E C T E DP A I h 1 r 3
58.89q75
CHIS.SUM:
O F - S U r 4=
A I O T A T I O NF O R T H E S E L E C T E OI I O O E L L
l/2/3/tt /
STOP
CHI.SOUARE
DE.
1
10.0235
11
P - S U f r=
0.0000
0.00000
N. llennuth
J+
I-GESTATIONT 2:HLA!
et al., Multi-dintensional data description search
3 ; } I E I 6 H T T I { = L E N G T H T5 = C O N S I R U C T E O
INDICATOR
CtRC.r
! ^ I U I 4 B EO
RF VARIABLES
:
\UI{BER OF OBSFRVATINNS
=21+E
5
INPUT lr{ATRIXR
1.000000
6.{31t{gg
0.51{600
0..r89100
0.{11200
1.0OOOOo
c.626300
cJr{.6600
0.260r{00
{E6ATIVE INVEf.SEOF R
-1.5311t+3
0.241026
0.ABle5A
0.27231q
0.36295r{
1,000000
o.78300o.
0.39A600
1.0trotio0-__'-_._
1-000000
0.3{J300
_,
-1.706053
j-Aqr+82a
0.19775q
- 0.038718
_5.189959
1.615987
0 . 3 03 7 1 7
-2,677929
0.042906
r 1.273125
CHI-SCUARE
DF
l23t{5
55.51855
1
1r3
123r+5
r+0.38187
I
1r{
le5q5
r+5.13{2{
1 --_,.,_..__0..L
1r5
123r+5
L73.02012
1
0.00000
2r3
125t+5
313.54523
1
0.00000
2ttt
l23t{5. .
2r5
123+5
3,{
123t+5
5r 5
123rr5
t tr 5
123q5
PAIR
INDLXCOI,:tsINATION
FOR SUBT'.A.TRIX
1r2
S E L E C T E OP A I R :
CHIO-SUM=
_
_
2Lt25932
1.707r+3
1206. t+2542
56.81802
1.33567
I +T 5
1.33567
OF-SUi4=
1
P-SUM=
__
_
. 0.0n000
o.0oo00
1 -_-,-__,..-0_.-0IqO0___
1
0.19152
1
0.00000
t-
0-O000L_
I
0.2q?80
0.24?8
I!OTAIION FOR THE SELFCTED I.!ODEL
123q
/L235
OgIR
INDLXCOIiBINATION
FOR SUEIV.ATRIX
1r{
123ri
52.96311
1r5
1235
180.8q899
2 14
1?3r+
21r 0044t+
2r5
1235
1.t+5255
5r{
123tt
L262.0t+29t+
3r5
1235
112. $3555
S I L E C T E DP A I R : 2 I 5
CHIO-SUM=
2.?8823
OF-SUU=
CHI.SOUARE
2
P-SUII:,
DF
L
0.2481
O,228L2
N. llernttth et al., Multi-demensionaldata descriptionsearch
! O T A T I O N FORTHE SELICTIO I'IIOOEL
125q11J5
C}II-SoUAFE
PAIR
I . N O E X C O } 4 B T N N L I O-N
FOR SUBPATRIX
1-t2
12-tt{ ---
5q.33{3f
1r.l
123.+
52.96511
lr5
155
180,r1011
2rJ
12.5-'+
, --
.tr15' 50,0ofi
2r{
1234
5r4
123{
lr5
-L35-
P
..-Df
o.oo000
21.004qq
L262 ) 0t+29t+
-
-
--
S E L E C T EPOA I R i 2 r {
CHIO-SlllL=.-
DF-:SUX =
22.7925]1
P:SUll -=-,
I
^ , t O T A T I O NF O R f H E S E L f C T E 0 | 4 O O E L
L35/L3t11L23
PAIL
1r2
I
F O RS U B P I A T R I X
66J0066-
,,.
, _ l,zl
65,.+0959
lrT
l3q
i;.
:,35
180.r1011
2-.3
123
?89 r 05817
Srlf
1t4
a737.53105
5rS
135
136.30890
CHIO-SU.I4?
89-2n2O6
-
r+
oFrSUl,l ;
P:Slllr :
l.O-OtoOO
0JL0II0
-
- { 9 T A T I O NF O RT H E S F L F
L35/t23/34
c Hr - s 0 u A R E
I NDEXCOI,lgINAT
ION
fOR SUEIIATRIX
PAIR
AA^rAnaa
l-6
r.lq
180..f1011
2.5
123
789-0681L
3r{
.l{-
3ra
-
131
23'f 8.26333
-
ul6J08!lo
DEP
t
0.0n000
35
N. Wennuthet al., Multi-demensionaldata descriptionsearch
36
S E L E C I E OP A I R i 1 r 2
CHIQ.SUH
L
-0-€:5llll:
NOTATION FUR THE SELECTED itOOEL
la5/e.t/23
P4IR
-
IIIDEYCotrBlhlaTIoIl-.*----,
F O R S I J E M A T IRX
1r3
tnl
Lr J
135
2r5
23
3r{
3q
5r5
LJs
S E L E C T E OP A I h : 5 t 5
CHl0-SUt1=
292.29162
-. _ Ct{I
_
cr,lJ110].
1&0.q1011
1231.54027
-
DF-SU!;
2j{8.26333
5
P*StrM:
0.0000
IIOTAIION FOR THE S€L€CT€IL }TO!E].
3 +/ 2 3 / L 5 / L 3
PAIR
INOLXCOPiB:iIATIOAL
FOR SU-BUATTIX
I rJ
13
7 6 0 .? 7 0 q 9
l rb
15
b 54. 06960
2t3
23
a231-5rr027
5r$
39
25q8.25333
S E L E C T E OP A I K : 1 r 5
75043A122
CHlO:SUil;
CHI.SOUARE
OF:.S,Lll!,.= 7
P:S141 =
DF
1
P
0 .0o0! 0
0.0000
\ICTATION FOR THE SFLFCTFO MOOEL
5L+/23/L3/5
Pllt{
INOTXCOPTBINITIoN
ICR SUEP'AIBII
lr3
15
760-z70q9
Zti
25
1231.5r{027
I rt
t'{
2 - r r 8{ . 2 5 3 3 3
S E L E C T E OP A I R : 1 r 3
1511.15a21
cHl0-SlJtvl =
CHI-SOUARE
OF:lilJllt =
\OTATION FOR THE SFLfCTEIJ XOOEL
3q / 2t/ L.1_'
6
P-S{.lIt =
0F
1-,
-,.,0
P
,.0.0oQ00
N. llernuth et al., Multi-demensic>nal
data descriptionsearch
PArB
lNDLXCQvrBlNa,TlOlL
FOF SUBI,IATRIX
2t3
L3
I r$
3i{
.
et]l:LAU]jE
___
113r.54027
0F
1
0.00000
e3t+8.25333
-
s t L E C T E DP A I i i : 2 r J -
cHlo-sun=
27ta2.67L98
oF-sul4=
9
P-SU|4=
o.00oo
r v o r a r r o r r " r o nr H E s r L E c T [ D M o o E L
5t+/I/2/5
PAIR
-
INDEXCOP]BINATION
F O R S U B I | ] A T IRX
L+
3i+
S E L E C T E DP A I R : 5 T ' +
5090.9I5U.
CHIS-SUM =
^rorlr io-r- rpr-rlrr serrrrfa-
CHI.SOUARE
OF
1
oF-SUH = l0
P-SUfi =
P
0.0000_q
0.000,0
iropEL -
1,12f3/ 1l:1
r!ttr(ltrLt!-!!_!t_!!!lrlrntrIttttr||t|||rn||||'nr
rr[,ll-mtrxBllnltr{intrtrt!iiinntrttrittrll{illretrolf,rcntilllil
37
38
N. llermuth
et al., Multi-dimensional data descri)tion search
on 2 d.f.; all with p > 0.05. They indicatetherefore
of z.p.a.areacceptable
for these
that the assumptions
threevariablepairs.This result is confirmedby the
overalltert lor the selectedmodel:p > 0.05 for the
chi-square
i4,41 on 8 d.f. The largetest statisticsfor
z.p.a.in the followingselectionstepstel1that no simpler multiplicativemodel fits thesedata well.
After a modelhasbeenselected,
it is instructiveto
look at the expectedvaluesirnpliedby this mode1.
Moreprecisely,one wishesto computethe maximumlikelihoodestimates
for correlations
in the caseof a
givencovariance
selectionmodeland maxirnumlikelihood estirnates
of cel1countsin the caseof a given
log-linearmodel. Fortran programsare availablein
b o t h s i t u a t i o n s[ 8 , 2 ] .
5. Hardwareand softwarespecifications
The programshavebeenwritten in FortranIV tbr
the CDC 3300 computerat the ComputationalCenter
of the Universityof Mainz.Locatiolrsin the program
whereCDC-dependent
functionsand tbrmatsareused
havebeenidentitledassuchin the flowcharts.Storagerequirements
are2l ,650wordsfor COVSELand
3 l , 7 5 0 w o r d sf o r T A S E L .
6. Mode of availabilityof the programs
Readersinterestedin obtainingthe programsare
invitedto contactthe first author.The text of the
programoutput is availablein Germanand in English.
Acknowledgements
We thank O. Pietschmannfor drawingthe flowchartsand H. Biancofor typing the manuscript.
References
[ 1 ] M . W . B i r c h , M a x i m u m l i k c l i h o o d i n t h r e e - w a yo o n t i n g e n c y t a b l e s .J , R o y . S t a t i s t . S o c B 2 5 ( 1 9 6 3 ) 2 2 0 - 2 3 3 .
[ 2 ] Y . M . M . B i s h o p , S . F i e n b e r ga n d P . f l o l l a n d , D i s c r e t eM u l t i v a r i a t e A n a l y s i s : T h e o r y a n d P r a c t i c e .( M . l . T . P r e s s .
B o s t o n ,B o s t o n , 1 9 7 5 ) .
[ 3 ] J . P . B u n k e r s e t a l . ( e d s ) ,T h e N a t i o n a l H a l o t h a n e S t u d y
( N a t l . I n s t . o 1 ' H e a l t h ,B e t i e s d a , 1 9 6 9 ) .
[ 4 ] A . P . D e m p s t e r ,C o v a r i a n c es e l e c t i o n ,B i o m e t r i o s 2 8
( t 9 ' 7 2 )1 5 7* t 7 5 .
f d Kindes[ 5 ] S . K o l l e r e t a l . , S c h w a n g e r s c h a f t s v e r l au n
entrvicklung (in preparation).
[ 6 ] N . W e r m u t h , A n a l o g i e sb e t w e e n m u l t i p l i c a t i v e m o d e l s
in contingency tables and covariance selection. piometr i c s 3 2 ( t o a p p e a rM a r c h 1 9 7 6 ) .
[ 7 ] N . W e r m u t h , N l o d e l s e a r c ha m o n g m u l t i l r l i c a t i v em o d e l s ,
B i o m e t r i c s 3 2 ( t o a p p c a rJ u n e 1 9 7 6 ) .
[ 8 ] N . W e r m u t h , I r . S c h e i d t ,F i t t i n g a c o v a r i a n o es e l e c t i o n
m o d e l t o a m a t r i x , A p p l i e d S t a t i s t i c s( s u b m i t t e d 1 9 7 5 a s
Algorithrn 172).
© Copyright 2026 Paperzz