Chapter 3: Three Strategies: From Contextual Analysis to Multilevel Modeling

Petr Soukup

3.1 Introduction

When working with data, researchers are often confronted with problems that arise when too little attention is paid to the specific context of respondents and only data on individuals are used. Conversely, aggregate data for certain kinds of units (regions, schools, electoral districts, etc.) are sometimes used, and the results of the analyses are incorrectly interpreted as applying to the behaviour of individuals rather than to the analysed aggregates. The objective of this chapter is to highlight the importance of context in data analysis, to show what happens when context is overlooked, and to present three basic strategies for working with data where taking contextual phenomena into account is warranted.

3.1.1 Variable Types as the Basis for Data Analysis

Before describing the data analysis strategies discussed in this chapter, it is first necessary to describe the types of variables that can be used in data analyses. This subject was most rigorously addressed by Lazarsfeld and Menzel (1965), and the contextual analysis that emerged from it was used by Lazarsfeld and his colleagues in several studies, the most important of which was The Academic Mind (Lazarsfeld and Thielens 1958). Here we will try to explain the “Lazarsfeldian” typology and the significance it holds for contextual data analysis.[1]

Lazarsfeld and Menzel defined three types of variables at the collective level and four at the individual level. We shall start with the variables at the individual level, even though Lazarsfeld and Menzel’s approach proceeded in the opposite direction:

1) absolute
2) relational
3) comparative
4) contextual

Absolute characteristics apply to an individual (a member of a collective)[2] in isolation and without reference to other members of the collective.
Typical examples would be a person’s marks at school,[3] their height and weight, or, if we are looking at territorial regions as collectives, even the size of the community they live in.

Relational characteristics are based on the relationship of one member of a collective to the other members of that collective. Typical examples of these variables are found in sociometry. They may refer to the number of positive choices expressed by other members, or the number of times a person contacts relatives in a month, and so on. What is significant is that these characteristics can only be measured if we have access to other members of the collective and we know about the relationship of the individual to them or vice versa.

The final traditional category of variables is that of comparative variables. In many ways it is similar to absolute and relational variables, which raises the question of whether it needs to be defined as a separate category at all. This type of variable determines where an individual ranks within a collective based on properties measured using absolute and relational variables. It can be used, for example, to rank children in school by how well they do in math (or how well they do in general) or by weight (with the help of absolute characteristics), or to rank individuals by their popularity in a group (with the help of relational characteristics). In statistical terms these examples merely transfer an initially absolute or relational variable onto a different scale, and it is questionable whether they have anything to contribute on their own. A somewhat better example is Lazarsfeld and Menzel’s (1965: 432) variable recording the birth order of an individual in a given family, which can then be used to answer questions such as whether parents devote better care to first-born children than to other children, or, on the contrary, whether the youngest children in a family receive better care.
Contextual variables are a type of variable that is measured not at the individual level but at the collective level. Contextual variables are defined as follows: “Contextual properties are actually the characteristics of collectives applied to their members.” (Lazarsfeld and Menzel 1965: 433) The properties of a collective (see below for the types of variables pertaining to collectives) are thus applied to the individual members of that collective. An example of this could be information about whether a student attends a private or a public secondary school, whether a person is male or female, or a person’s religious affiliation. Clearly, every individual belongs to a number of contexts at once (see the section on Problems with Contextual Variables and How to Measure Them). Below we will try to show how variables originate at the collective level that can then be used as contextual variables.

The following types of variables are relevant to collectives:

1) analytical
2) structural
3) global

Analytical variables are created by aggregating the individual values of a variable for all members of a collective. An analytical variable may therefore be the average marks of students in a particular class, the average height of people in the Czech Republic, or the average size of neighbourhoods in the Czech Republic. It is also possible to aggregate comparative variables and determine, for example, the proportion of people in an entire population who are the oldest child in their family.

Structural variables are likewise created by aggregation, in this case of relational variables measured for individual members of a collective, so that the result represents the entire collective. They can be used to establish the average popularity of an individual in a group or the average number of contacts that people have with relatives in the Czech Republic.
The final type of variable at the collective level is not an aggregate variable but a direct measurement at the collective level: global variables. It is possible to note whether a community does or does not have a water purification plant, but that fact does not pertain to any single citizen. Lazarsfeld and Menzel (1965: 428) cite whether or not a nation uses money as its currency as a global variable that serves as an anthropological characteristic of a nation.

The typology described above is not entirely original. Similar typologies were put forth by Cattell and by Kendall (Lazarsfeld and Rosenberg 1955), who used different terms and proposed distinguishing between five types of variables (Kendall, who refers to them as types I-V) or three (Cattell, who refers to population, structural, and syntality variables). It should also be noted that no typology can be regarded as dogma, and that each of the typologies mentioned here is intended as an aid for analysing data. Contextual variables are especially important, and knowledge of them forces researchers to think about the context that influences the members of a collective. Some of the most common contexts that can be useful in data analysis are presented in Table 1.

Table 1. Contexts common in data analysis (collectives and their members)

Example | Collective member (level 1) | Collective (level 2) | Higher-order collective (level 3)
--- | --- | --- | ---
Schools | Student | Class | School
Neighbours | Individual | Neighbourhood |
Religion | Individual | Religious community | Church Region
Professional association | Individual | Local association of a professional organisation | National professional association
Interest group | Individual | Local association of an interest group | National interest group
Regions | Individual | District | Region
International context | Individual | Country |
Institution other than a school | Patient | e.g. Hospital | Region (hospital incorporator)
Institution other than a school | Member of a party | Local association | Regional or national party organisation

All the examples cited above show where members of a collective rank within that collective or within higher-order collectives. Models developed according to this concept are called hierarchical models (Bryk and Raudenbush 2002; in Czech, see Soukup 2006). However, they are only a subset of the models that take the importance of context into account, the so-called multilevel models. For terminological reasons, it is customary to refer to members of collectives as first-order units and to collectives as second- (higher-) order units; collectives of an even higher order are then third-order units, and so on. I use this multilevel terminology in the following text in place of collectives and members of collectives.

It is often necessary to work with more than two levels in an analysis, and the context and the typology of variables are often relative. While for students the context is the class, for classes it is the school that the class belongs to.[4] Lazarsfeld and Menzel also importantly noted that we should only be interested in context when we have at hand individuals from different contexts (from different groups influencing the individual in different ways). The relativity of the contextual concept is illustrated in Diagram 1, inspired by Hox (2002).

Diagram 1. Types of variables and how they are derived from other levels of analysis

(Level 1)        (Level 2)        (Level 3)
Absolute    A→   Analytical  A→   Analytical
Relational  A→   Structural  A→   Structural
Comparative A→   Analytical  A→   Analytical
Contextual  ←D   Global      A→   Analytical
                 Contextual  ←D

Notes:
A→ means that the variable on the right originates from the variable on the left through aggregation.
←D means that the variable on the right is used at a lower level by means of the opposite process (disaggregation).
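The two operations marked A→ and ←D in Diagram 1 can be illustrated with a minimal Python sketch. The student records, class labels, and marks below are invented for illustration, and the function names `aggregate` and `disaggregate` are hypothetical; the sketch only shows how an absolute level-1 variable becomes an analytical level-2 variable and is then attached back to individuals as a contextual variable.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical level-1 records: (student, class, mark). All values are invented.
students = [
    ("s1", "A", 1.0),
    ("s2", "A", 3.0),
    ("s3", "B", 2.0),
    ("s4", "B", 4.0),
    ("s5", "B", 3.0),
]


def aggregate(records):
    """A->: turn an absolute level-1 variable into an analytical level-2 variable."""
    by_group = defaultdict(list)
    for _, group, value in records:
        by_group[group].append(value)
    return {group: mean(values) for group, values in by_group.items()}


def disaggregate(records, group_values):
    """<-D: attach the level-2 value to each member as a contextual variable."""
    return [
        (unit, group, value, group_values[group])
        for unit, group, value in records
    ]


class_means = aggregate(students)                    # analytical variable per class
with_context = disaggregate(students, class_means)   # contextual variable per student
```

Each tuple in `with_context` then carries both the student's own mark and the class average, which is the form of data that contextual and multilevel analyses work with.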
To reiterate, the contextual variable can take the form of an analytical, structural, or global variable, and the purpose is to use this variable, measured at a higher level (or aggregated), for analysis at a lower level. This specific quality is expressed graphically by the division of the rows containing contextual variables in Diagram 1.

3.1.2 Problems with Contextual Variables and How to Measure Them

Defining the level (context) or group to which individuals belong is not always straightforward. Iversen (1991: 4-6) has noted the following problems in particular:

1) the fuzzy boundaries of a group,
2) multiplicity of membership and the overlapping of groups,
3) the mobility of individuals between groups.

Iversen describes the first phenomenon in connection with data collection. It is often difficult to determine whether or not a person belongs to the group that allegedly influences them. A simplified view of membership[5] in a group founded on dichotomous variables (0: does not belong to the group, 1: belongs to the group) can be the source of errors in many analyses. It would clearly be more accurate if membership could be measured as a continuous variable (or at least an ordinal variable) that would express the degree of participation in the group and the duration of membership. While this suggestion sounds logical, in practice it is often difficult or even impossible to apply. It is useful to try to apply this approach whenever possible, but, particularly for paradigmatic reasons, dichotomous variables continue to dominate in practice for measuring membership in groups.

An individual usually belongs to more than one group that has an influence on them, which means that individuals are situated in multiple contextual frameworks. Total institutions, in the sense of the term given by Goffman (1965), are one exception.
The effect that different groups have on an individual varies, but an individual is often substantially influenced by a number of groups at once, and to overlook that fact in data analysis would lead to erroneous results. When the groups are hierarchically ordered the problem is relatively simple from an analytical perspective (a student attends just one class and that class is part of one school; an individual lives in a certain district and that district belongs to a single and precisely defined region). However, in practice membership in different groups overlaps, which means that an individual belongs to a number of different groups that are not hierarchically related or ordered: neighbourhoods, interest groups, employment, religious communities, etc. These analytical problems are consequently more complex, and modelling them is still in the early stages of development (see the section More problems that can be solved in multilevel analysis).

The final difficulty connected with contextual variables is the inter-group mobility of an individual. For example, a child changes schools when their parents change residence; a person may convert from one religion to another; when a person moves to a different community they acquire new neighbours. When these changes occur, the question arises of which of the groups a person has belonged to has had more of an impact on them: the current group, the previous group, or both? A similar problem arises in the case of membership in multiple groups. Researchers usually ignore the problem and assign the individual to the group they belong to at the time of data collection or to the group they belonged to at the time the analysis applies to. Of course, this approach leaves room for errors. Therefore, it is good to have an idea of how much mobility there is within the groups under observation in order to gauge the size of the bias in such analyses.
Having outlined the typology of variables and the problem of measuring contexts (groups), we will now look at the possible approaches to data analysis using contextual variables. Three basic strategies of data analysis warrant attention here.

3.2 Three Strategies for Working with Contextual Data

1) Classic single-level analysis (regression, ANOVA, ANCOVA, etc.) at the individual level
2) Analysis at the higher-order level
3) Multilevel analysis working with data on individuals and contextual data

The first strategy simply involves ignoring contextual variables (or the context as a whole) and working only with variables at the individual level. This can produce very biased results, although in some cases the strategy may suffice. A classic example is linear regression analysis of variables that apply to individuals.

The second strategy uses only data for higher-order units (aggregate data). The results are then interpreted as applying to individuals, that is, to lower-level units. This approach can also produce very biased results. Using aggregate data to draw conclusions about lower-order units is known as ecological inference.

The third strategy uses data at the level of individuals, but also uses contextual information that applies to higher-order units. Consequently, it is able to avoid the errors that occur with the first and second strategies, and in many cases it is a more appropriate strategy than either of them. This strategy is embodied in Lazarsfeld’s contextual analysis and its more recent variant, multilevel analysis.

Below, each of the three strategies is examined in detail and the advantages and shortcomings of each are discussed. Examples demonstrating the use of each strategy are presented for better illustration. Lazarsfeld was very familiar with the first and second strategies, and it was on the basis of them and of the typology of variables (see above) that he proposed and began using the third, contextual strategy.
3.3 Working with Data on Individuals (Strategy 1)

The first strategy is widely used when only variables that apply to individuals are being analysed. The strategy is very simple, and statistics offers standard procedures for it, among them regression analysis (linear, logistic, nonlinear), analysis of variance (or t-tests for two groups), analysis of covariance, discriminant analysis, structural equation modelling, and so on. To provide a more detailed illustration we will describe the use of regression analysis as a classic example of the first strategy.

3.3.1 Regression Analysis and the Interpretation of Parameters

Regression analysis is essentially designed to model the relationship between one dependent variable and one or more independent variables and to express this relationship as an equation. The equation can be used to estimate the value of the dependent variable when the values of the independent variables are known, but in sociology it is more often used to describe the variables that have an effect on the dependent variable.

Classic linear regression is the most common method. Therefore, we will describe here the conditions that must be met for “ordinary least squares” (Gauss 1805) or maximum likelihood to be regarded as estimation techniques that yield BLUE (Best Linear Unbiased Estimator) estimates. The conditions for classic linear regression are (Draper and Smith 1998; Fox 1997):

1) A continuous dependent variable; the independent variables can be continuous or dichotomous (but never ordinal or nominal).
2) Noncorrelation of the independent variables (the opposite of multicollinearity).
3) The independence of individual observations (fully random selection).
4) Homoscedasticity and noncorrelation of the random components (the opposite of heteroscedasticity and autocorrelation).
3.3.2 Methods of Estimating Parameters

“Ordinary least squares” (OLS) is the most commonly used technique in regression analysis: a curve of the chosen form is fitted to the observed values so that the sum of the squared deviations of the observed values from the regression curve is as small as possible. The following is the regression model expressed in mathematical terms:

$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + e$ (3.1)

The value of the dependent variable $y$ is therefore the sum of the constant ($b_0$), the products of the regression coefficients ($b_1$, $b_2$, $b_3$, etc.) and the relevant values of the independent variables ($x_1$, $x_2$, $x_3$, etc.), and the error term (the residual $e$). Estimation by least squares uses the vector of the values of the dependent variable $y$ and the matrix of independent variables $X$. The vector of regression coefficients can be estimated using the following formula:

$b = (X^T X)^{-1} X^T y$ (3.2)

When the above requirements are not satisfied, the following situations may arise as a result:

a) The formula cannot be used and the parameters cannot be estimated (this applies when one independent variable is an exact linear function of one or more other independent variables, i.e. perfect multicollinearity).
b) The parameters can be estimated, but the estimates are imprecise or their standard errors are unreliable (owing to weak multicollinearity, heteroscedasticity, autocorrelation, or the dependence of individual observations). In this case the fact that a requirement of classic linear regression analysis is not met means that inferences about the effect of an independent variable on the dependent variable may be wrong; for example, it may erroneously be impossible to demonstrate a real effect.

For heteroscedasticity and autocorrelation, special techniques of generalised regression have been developed (Hebák and Hustopecký 1987: 248-269) that address this situation. The case of strong multicollinearity can be addressed using ridge regression.
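Formula (3.2) and situation a) can be traced with a short sketch in plain Python. This is an illustration under stated assumptions rather than how SPSS or any other package computes regression: the helper names (`transpose`, `matmul`, `solve`, `ols`) and the data are invented, and the system $(X^T X) b = X^T y$ is solved by elimination instead of forming the inverse explicitly, which gives the same $b$.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, rhs):
    """Solve a @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            # Situation a): X'X cannot be inverted.
            raise ValueError("X'X is singular: perfect multicollinearity")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(x_rows, y):
    """Return b from (X'X) b = X'y; a leading column of ones yields b0."""
    X = [[1.0] + list(row) for row in x_rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return solve(XtX, Xty)


# Invented data generated exactly from y = 2 + 3*x1 - 1*x2 (no error term).
x_rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [2 + 3 * x1 - 1 * x2 for x1, x2 in x_rows]
b = ols(x_rows, y)  # approximately [2, 3, -1]
```

If the second predictor is an exact multiple of the first, for example `ols([(1, 2), (2, 4), (3, 6)], [1, 2, 3])`, the sketch raises `ValueError`, which corresponds to the case where the formula cannot be used.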
There is no simple solution in classic regression for dealing with dependent observations, and multilevel data analysis is the only possible approach to this situation (see the discussion of Strategy 3 below). It is of course possible to test whether the regression conditions have been met, and anyone interested in tests for multicollinearity, heteroscedasticity, and autocorrelation should look at texts on regression analysis (Jobson 1991; Draper and Smith 1998; Fox 1997; Hebák 2005a). I wrote about testing for the presence of dependent observations in a separate article (Soukup 2006). Let us quickly review the logic of determining the effect of independent variables on a dependent variable using the simple example below (Example 1).

Example 1. Dependence of income on years of education and on sex

Table 2. Regression in SPSS: dependence of income on years of education and on sex

Variable | B (unstandardised) | Std. Error | Beta (standardised) | t | Sig. | 95% CI for B, Lower Bound | 95% CI for B, Upper Bound
--- | --- | --- | --- | --- | --- | --- | ---
(Constant) | 736.058 | 885.345 | | .831 | .406 | -1001.888 | 2474.003
SEX | 3196.042 | 394.692 | .265 | 8.098 | .000 | 2421.257 | 3970.828
NO. OF YEARS SPENT AT SCHOOL | 577.766 | 63.281 | .299 | 9.130 | .000 | 453.545 | 701.987

Dependent Variable: MAIN PERSONAL NET INCOME
Source: Data ISSP 1999, n=782.

From the values in the table the equation reads:

$\text{Income} = 736 + 578 \cdot \text{years of education} + 3196 \cdot \text{sex (male)}$[6] (3.3)

On the basis of the Sig. or t column it is possible to infer that both independent variables have a statistically significant effect on the income variable. The parameters are interpreted as follows: a person with one more year of education has, on average, 578 CZK more income when sex is held constant (ceteris paribus). Men have, on average, 3196 CZK more income than women when the effect of education is taken into account.[7]
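The interpretation of equation (3.3) can be checked with a small Python sketch. The coefficients are the rounded values from the equation; the coding of sex as 1 for men and 0 for women is an assumption based on the label "sex (male)", and the function name `predicted_income` is hypothetical.

```python
# Rounded coefficients from equation (3.3).
B0, B_EDU, B_SEX = 736, 578, 3196


def predicted_income(years_of_education, male):
    """Predicted net income in CZK; sex is assumed to be coded 1 = male, 0 = female."""
    return B0 + B_EDU * years_of_education + B_SEX * (1 if male else 0)


man_12 = predicted_income(12, male=True)
woman_12 = predicted_income(12, male=False)
woman_13 = predicted_income(13, male=False)

sex_gap = man_12 - woman_12           # equals B_SEX at any level of education
education_step = woman_13 - woman_12  # equals B_EDU for either sex
```

Because the model has no interaction term, the gap between men and women is the same at every level of education, and one more year of education adds the same amount for both sexes.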
3.3.3 The Consequences of Ignoring Context

When we leave out the context of gender, that is, whether the individual is male or female, we obtain a single regression equation (3.4) for men and women together.

$\text{Income} = 2343 + 589 \cdot \text{years of education}$ (3.4)

Equation 3.4 shows that the constant has changed substantially and the value of the regression coefficient of the education variable has increased slightly. Estimates made on the basis of the equation without the sex variable will be too low for men and too high for women. The effect of education in equation 3.4 is slightly overestimated. Figure 1 depicts this situation graphically.

Figure 1. Dependence of an individual’s income on sex and education
[Line chart: x-axis Years of Education (0-25), y-axis Income in CZK (0-18000); lines M+F, M, F]

The blue and yellow lines in Figure 1 are derived from equation 3.3 and indicate the difference between men’s and women’s incomes; the purple line is the combined line for men and women and shows what the estimates look like when we leave aside the contextual variable. We should add that in regression we usually use the sex variable the way it is presented in equation 3.3, and therefore at first glance the context seems to be taken into account. However, the problem is more complicated, because it is not just the constant but also the values of the other regression coefficients that differ between the equations for men and women. Table 3 presents the regression estimates when we calculate the regression equation separately for men and women.

Table 3. Dependence of an individual’s income on education, for men and women separately

SEX | Variable | B (unstandardised) | Std. Error | Beta (standardised) | t | Sig. | 95% CI for B, Lower Bound | 95% CI for B, Upper Bound
--- | --- | --- | --- | --- | --- | --- | --- | ---
Woman | (Constant) | 2913.902 | 837.915 | | 3.478 | .001 | 1265.919 | 4561.885
Woman | NO. OF YEARS SPENT AT SCHOOL | 412.835 | 61.692 | .337 | 6.692 | .000 | 291.501 | 534.168
Man | (Constant) | 1955.543 | 1430.603 | | 1.367 | .172 | -856.339 | 4767.424
Man | NO. OF YEARS SPENT AT SCHOOL | 725.913 | 104.553 | .318 | 6.943 | .000 | 520.412 | 931.414

Dependent Variable: MAIN PERSONAL NET INCOME
Source: Data ISSP 1999, n=782

From Table 3 it is evident that the constants of the equations for men and women are different (that is, the regression lines start at different points) and that the effect of education on income also differs between men and women. Education brings men greater income increases than women (approx. 300 CZK more per year of education). The regression lines therefore differ both in where they start and in their slope, as demonstrated in Figure 2.

Figure 2. Dependence of an individual’s income on education, for men and women separately
[Line chart: x-axis Years of education (0-25), y-axis Income in CZK (0-20000); lines M, F]

Clearly, if we want to select the right model for our data it is important not just to include contextual variables as additional independent variables but also to include the interaction between the contextual variable and the variable measured at the individual level. Instead of equations 3.3 and 3.4, equation 3.5 would be the most suitable equation.

$\text{Income} = \text{constant} + b_1 \cdot \text{years of education} + b_2 \cdot \text{sex} + b_3 \cdot \text{sex} \cdot \text{years of education}$ (3.5)

Clearly, when context is added the use of classic regression analysis becomes more complicated. In reality multiple groups are regularly observed at the same time, which complicates the problem further. It must also be remembered that individuals in a certain context often cannot be regarded as mutually independent, and therefore the regression estimates from equation 3.5 are likely to be inaccurate. There is then the possibility of using techniques of multilevel analysis (Strategy 3, discussed below).
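The link between the separate equations in Table 3 and the interaction model (3.5) can be made explicit with a short sketch. It relies on the standard identity that fitting a model with a full set of interactions reproduces the point estimates of the separate group regressions; the coding of sex as 1 for men and 0 for women and the variable names are assumptions made for illustration. Standard errors are not reproduced by this arithmetic.

```python
# Point estimates from Table 3 (separate regressions for women and men).
women = {"constant": 2913.902, "slope": 412.835}
men = {"constant": 1955.543, "slope": 725.913}

# Coefficients of equation (3.5), assuming sex is coded 1 = male, 0 = female.
constant = women["constant"]             # intercept of the reference group (women)
b1 = women["slope"]                      # effect of education for women
b2 = men["constant"] - women["constant"] # shift in the intercept for men
b3 = men["slope"] - women["slope"]       # extra effect of education for men


def income(years, male):
    """Predicted income from the interaction model (3.5)."""
    sex = 1 if male else 0
    return constant + b1 * years + b2 * sex + b3 * sex * years


# Years of education at which the two fitted lines cross.
crossover_years = -b2 / b3
```

With these values `b3` is roughly 313 CZK per year, the difference in slopes noted in the text, and the two lines cross at roughly three years of education, below the range where most respondents fall.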
Before looking at these techniques and an example of their use, we will first focus on the situation where we ignore, or do not have, data at the first level and therefore use aggregate contextual data.

3.4 Working with Data at the Aggregate (Contextual) Level (Strategy 2)

In practice, situations arise where there are no data on individuals available, only aggregate data for higher-order units. This typically happens in research on voting behaviour. For example, if we are trying to determine whether men or women vote more often, the following are possible ways of finding an answer:

1) By learning, perhaps from census data (Population Census), whether people voted in the previous election. This method should be relatively accurate, but it has some shortcomings: a) censuses are held only once every ten years; b) the date of the census may be remote from the date of the most recent elections, so people may not remember whether they voted or may confuse individual elections; c) people may be untruthful about whether they voted; and d) conducting a census is demanding financially and in terms of time, and the results become available around two years after the survey, which is not very timely. This way of finding an answer to the question must be rejected.

2) By taking a sample of people from regular surveys on voting preferences and finding out whether they voted in the previous elections. There are some weaknesses to this method, too. Even in a randomly selected sample, b) people may not remember whether they voted or may confuse individual elections, and c) they may lie about (not) having participated. d) The financial costs of this kind of survey also play a role. A specific feature of sampling is that e) the results calculated from a sample (e.g. the share of male compared to female voters) are affected by sampling error[8] (and often also by systematic errors not connected with sampling).[9]

3) By using data on individual electoral districts that are available from the state statistical office. It is possible to determine how many potential male and female voters there are in each electoral district, and it is equally possible to learn what the voter turnout was in each electoral district.

The third approach works only with aggregate data, but these data cover all individuals, and that is its advantage. In the literature this approach is called ecological inference. Errors can also occur when ecological inference is used to reconstruct individual behaviour. This is known as Robinson’s problem, named after Robinson (1950), who pointed out that an analysis of aggregate data may lead to conclusions about the relationships between variables that are the very opposite of those reached when individual data are analysed. Simpson’s Paradox similarly refers to the difference between trends produced by data for lower-order units and the trends produced when those data are aggregated (Anděl 2003, 2005; Hendl 2004). These failures connected with inference from aggregate data are collectively referred to as ecological fallacies.

Statisticians have tried from the outset to propose a solution to Robinson’s problem; such solutions include Duncan and Davis’s method of bounds (1953) and Goodman’s regression for inferences from aggregate data (Goodman 1953). For a long time these two solutions were the only possible ways of inferring individual behaviour from aggregate data. We should add that in the case of the method of bounds the estimates are quite imprecise (the results of this method are interval estimates, so, for example, a finding on voting behaviour may be that 30%-50% of men voted and the share of women who voted was somewhere between 35% and 70%).
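The logic of the method of bounds can be sketched in a few lines of Python. The sketch is an illustration of the general idea for a single district, not a reproduction of Duncan and Davis’s own presentation; the function name `turnout_bounds` is hypothetical, and the input values (55% men among eligible voters, 35% overall turnout) are the marginals used in Example 2.

```python
def turnout_bounds(group_share, turnout):
    """Deterministic bounds on the turnout rate within one group of a district.

    group_share: fraction of eligible voters who belong to the group (0 < share <= 1)
    turnout:     fraction of all eligible voters who voted
    """
    other_share = 1.0 - group_share
    # Lowest rate: every member of the other group voted before anyone in this group.
    lower = max(0.0, (turnout - other_share) / group_share)
    # Highest rate: all voters come from this group, capped at 100%.
    upper = min(1.0, turnout / group_share)
    return lower, upper


men_low, men_high = turnout_bounds(0.55, 0.35)
women_low, women_high = turnout_bounds(0.45, 0.35)
```

With these marginals the bounds are wide (men from 0 to about 64%, women from 0 to about 78%), which illustrates why the method on its own is often uninformative; the bounds narrow only when one group dominates the district or turnout is very high or very low.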
Goodman’s regression is a useful method only when behavioural patterns in individual groups (contexts) are similar, and that is not usually the case. From the perspective of the sociology of science, it is interesting that the solutions to Robinson’s problem remained the same from the 1950s to the 1990s. In 1997 King proposed a new solution (King 1997). King’s method, cockily named EI (ecological inference), combines the approaches of the method of bounds and Goodman’s regression with the use of a truncated multivariate normal distribution. King even developed software that can be used to make estimates from aggregate data, and the software is also called EI. King’s solution has been criticised in the literature for its limits (again, it applies only under certain conditions). It is likely that no general solution to Robinson’s problem applicable in every situation will ever be developed, but the existing ones are used mainly in political science and are often very useful. An example of the initial information and the possible solution is presented in the 2x2 tables in Example 2.

Example 2. Inferring the voting behaviour of men and women[10]

EI Problem

The information we have on individual electoral districts is as follows:

Table 4. The share of voters and non-voters in the selected electoral district

Gender/Voting | Voted | Did not vote | Total
--- | --- | --- | ---
Men | ?? | ?? | 55%
Women | ?? | ?? | 45%
Total | 35% | 65% |

Note: In the table, row percentages are in the rows and column percentages in the columns.

From the aggregate data we can tell how many people took part in the elections in each electoral district and also what share of the electorate is made up of men and of women. However, we do not know (see the question marks in Table 4) whether it is men or women who vote more often; data on individuals are lacking.

EI Solution

We would like to obtain the following results for the Czech Republic as a whole and for electoral districts.

Table 5. Share of men and women who participated in the elections

Gender/Voting | Voted | Did not vote | Total
--- | --- | --- | ---
Men | 30% | 100-30=70% | 55%
Women | 40% | 100-40=60% | 45%
Total | 35% | 65% |

Note: In the table, row percentages are in the rows and column percentages in the columns.

The results inform us that 30% of Czech men voted and that for women the figure was ten percentage points higher (i.e. 40% of women voted).

Estimates inferred from aggregate data are always affected by error, because we lose some information on the analysed units when data are aggregated. The degree of error varies in each application (a detailed discussion of various ways of modelling and estimating error can be found in King, Tanner and Rosen (2004)). It is best to use the strategy of analysing aggregate data only when data on individuals (or on lower-level units) are not available. When we do have data on individuals and on the context it is important to use both of these sources of information. The most appropriate methodological strategy is then contextual analysis or its more elaborate variant, multilevel analysis, which will be discussed in the following section.

3.5 Working with Data at the Level of Individuals and the Contextual Level (Strategy 3)

The description of the first strategy, which works only with individual data, also cautioned that this approach can produce some problems. Let us briefly review them here:

a) Ignoring the context results in bad (distorted) estimates of the regression parameters (see Figure 1).
b) Ignoring the dependence of individual observations within a particular context leads to incorrect standard errors of the regression parameters (the greater the dependence between units, the greater the distortion, and the result may be an incorrect inference about the statistical significance of individual variables in the models).
Inferences made just on the basis of contextual data (Strategy 2) are also accompanied by problems:
c) By aggregating data we lose some of the information, and the estimates of the models’ parameters are inaccurate.

In relation to the first strategy we tried to solve the problem described in point a) by introducing contextual variables and their interactions with individual variables. The problem in point b), of course, cannot be solved in the same way. Moreover, other problems arise that require the first strategy to be abandoned. One basic problem tends to be the following:
d) The groups (contexts) containing varying relationships are randomly selected from a population, and not all of them are included in the research.

One example could be a random selection of several schools, in which the students are then also randomly selected. If we want to generalise the conclusions to all the schools in the Czech Republic and discover the factors that make the schools different, classic regression analysis or analysis of variance will not suffice. These analyses only detect differences between the groups included in the research (or, more commonly in the case of analysis of variance, in the experiment). The random error in the regression equation makes it possible to generalise findings to all individuals in the given population but not to all groups in the given population. For that kind of generalisation, random errors at higher levels of analysis must be taken into account. The procedures that make these steps possible are collectively referred to as multilevel modelling or multilevel analysis. Several other equivalent terms are also used: random-coefficient modelling, hierarchical modelling, mixed-effects modelling, and covariance components models. The most appropriate term in connection with the examples cited for the first strategy of data analysis is random-coefficient modelling.
This term derives from the idea that regression equations (note 11) are calculated in the individual groups (contexts) using variables measured at the first level. Each equation includes a random error at the first level. The values of the parameters of these equations (i.e. constants and slopes) are then entered into a regression at the second level as dependent variables (note 12). The independent variables at this level are contextual variables. There will be as many second-level equations as there are estimated parameters in the first-level regression equations. A random error can be added to each second-level equation, thus enabling generalisation to all contexts (groups), that is, even to those not included in the research (note 13). These random errors at higher levels make the coefficients random, which is where random-coefficient modelling gets its name.

3.5.1 Lazarsfeld’s Version of Contextual Analysis

Before we embark on a description of the more complex models of contextual analysis and multilevel analysis that use the latest statistical findings, let us take a look at one of the first researchers to use contextual analysis and to give it a name: Paul Felix Lazarsfeld. Lazarsfeld derived the concept of contextual analysis from the contextual variables that are typical for it (see the typology of variables on p. XXXX). Lazarsfeld’s attempts at contextual analysis can be found in his publications from the 1950s (Lazarsfeld and Rosenberg 1955; Lazarsfeld and Katz 1955), and around the same time other researchers were also taking the first steps in contextual analysis (e.g. Marsh and Coleman 1956). A classic example of Lazarsfeld’s contextual analysis is the study and book Academic Mind (Lazarsfeld and Thielens, Jr. 1958). The main problem addressed in the publication is the deteriorating situation in which teachers at universities in the United States found themselves after the Second World War.
One problem was that teachers were being accused of left-wing extremism and other excesses. Understandably, the teachers consequently felt uncomfortable and threatened. Lazarsfeld and Thielens did not view this problem as a single phenomenon but instead inquired into whether the number of such accusations and the feelings of threat were the same at different schools and especially at different types of schools. They distinguished between universities according to size, ownership (private, public), and the quality of education (low, middle-to-low, middle-to-high, and high). These types of schools then served as the basic context for examining the atmosphere during lessons, the number of recorded accusations, the attitudes of individual universities, and the sense of threat on the part of teachers. One of the findings in Lazarsfeld and Thielens’s study is expressed in Figure 3 (note 14).

Figure 3. Number of accusations and sense of threat by quality of school
[Bar chart. Vertical axis: % (0 to 70). Horizontal axis: quality of the school (low, mid-to-low, mid-to-high, high). Series: accusations; sense of threat.]

Figure 3 clearly shows that at higher-quality schools the percentage of teachers accused of anti-American behaviour was much higher than at lower-quality schools (63% at high-quality schools and only 19% at lower-quality universities). The main reason for this is the creative, open approach of teachers at quality universities, and the greater openness of such approaches to attack. However, the administrations at high-quality universities also protected their teachers more effectively against such (often unsubstantiated) accusations. This piece of data on the schools represents a typical context for individual data relating to recorded events, feelings, and attitudes. The protection that better-quality universities were able to provide their teachers is reflected in the second piece of data, the individual statements of teachers about how threatened they feel.
It is on the basis of contextual information that this piece of data can be meaningfully interpreted. In absolute figures the sense of threat is lower at lower-quality schools (16-18%) than at higher-quality universities (33-35%). In relative terms, however, at the lower-quality schools the sense of threat corresponds to 84% (16 out of 19) and 75% (18 out of 24) of all the accusations made. At mid-to-high and high-quality universities it is relatively lower (just 66%, i.e. 33 out of 50, and 55%, i.e. 35 out of 63). For teachers from the best-quality universities the free-thinking environment at the schools was a frequent source of such accusations, but the strong administrations at these schools managed to overturn the accusations in almost half the cases. At the poorest-quality schools teachers were accused less often, but in 84% of cases the administration did not quash the accusations. The context was important both to explain the free-thinking behaviour of teachers at better-quality universities and to assess the relative perceived threat of attack among teachers, whether those under the effective protection of an administration at a higher-quality university or those with ineffective protection at a lower-quality university. Readers can find a detailed and sociologically relevant explanation of these differences in the study by Lazarsfeld and Thielens, Jr. (1958). Here we have focused mainly on the methodologically relevant comparison of different types of school (i.e. of different contexts). We should recall that in Lazarsfeld’s era sociologists did not have computers to use, and the simple expression of contextual analyses in the form of percentages of accusations and threats was the only practical solution possible.
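The relative figures quoted above are simple ratios of the two percentages shown in Figure 3, and they can be reproduced with a short calculation. The sketch below uses only the percentages quoted in the text; the dictionary layout and the function name relative_threat are introduced here purely for illustration.

```python
# Percentages quoted in the text for Figure 3, by quality of school:
# (share of teachers accused, share of teachers feeling threatened).
figures = {
    "low": (19, 16),
    "mid-to-low": (24, 18),
    "mid-to-high": (50, 33),
    "high": (63, 35),
}


def relative_threat(accused_pct, threatened_pct):
    """Sense of threat expressed as a percentage of the accusation rate."""
    return 100.0 * threatened_pct / accused_pct


# One ratio per type of school (i.e. per context).
ratios = {quality: relative_threat(a, t) for quality, (a, t) in figures.items()}

for quality, ratio in ratios.items():
    print(f"{quality:12s} {ratio:5.1f}%")
```

The ratios fall steadily as the quality of the school rises, which is the pattern the interpretation relies on. The last ratio evaluates to roughly 55.6%, which the text quotes as 55%.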
Methodologically, the typological approach is also important: Lazarsfeld divides the contexts (individual schools) into different types and tries to explain the differences between these types, mainly with the aid of variables describing the types of schools (but also, of course, with the aid of analytical variables). This approach is still used today in contextual or multilevel analysis (see below).

3.5.2 Contextual Analysis and a Comparison with Multilevel Analysis

Following the example of Lazarsfeld’s contextual approach, we will now look at the statistical models that evolved out of that idea. Two simple methods that can be ranked among contextual analyses are the model of absolute effects and the model of relative effects (Iversen 1991). Both are regression models that contain interactions between contextual variables and variables for individuals, in the spirit described in the section on the first strategy of data analysis. These models do not include random error at the higher levels, only at the first level of analysis. Estimates can be obtained from these models in two ways:
1) By constructing interaction variables for individuals (Method 1), or
2) By means of a two-step procedure: a model based on variables measured at the first level is first estimated for each context, and these estimates are then used at the second level in the role of dependent variables (Method 2).

Both methods only approximate the results of multilevel analysis, but in many cases that is enough. We will demonstrate these methods and their results in an example and compare them with the method of multilevel modelling.

Example 3. How the math skills of US students are related to status in different schools (state and church), taking into account the average status at schools (absolute contextual analysis)

Method 1) Data for individuals with interaction variables

Table 6.
Estimated regression parameters at the level of individuals using interaction

              Unstandardised Coefficients   Stand. Coefficients                     95% Confidence Interval for B   Collinearity Statistics
              B         Std. Error          Beta                  t         Sig.    Lower Bound   Upper Bound       Tolerance   VIF
(Constant)    12.102    .107                                      113.210   .000    11.893        12.312
CSES           2.936    .155                 .282                  18.906   .000     2.631         3.240            .516        1.937
MEANSES        5.170    .191                 .311                  27.091   .000     4.796         5.544            .872        1.146
SECTOR         1.271    .158                 .092                   8.049   .000      .961         1.580            .872        1.146
SES*MEANSES    1.044    .300                 .040                   3.481   .001      .456         1.632            .874        1.145
SES*SECT      -1.642    .240                -.104                  -6.836   .000    -2.113        -1.171            .493        2.030

Source: HSB data (USA, 1980), n=7185 students, 160 schools

The dependent variable was math performance, measured using a standard test. The performance of individual students ranged from -3 to 25 points; the average performance was 12.7 points and the standard deviation was 6.9 points. A status variable (SES) was selected as the independent variable to explain the differences between students’ math skills (note 15). At the level of schools (context) the selected variables were the average status in the school, MEANSES (note 16) (i.e. an analytical variable in Lazarsfeld and Menzel’s typology), and the type of school, SECTOR (0=state and 1=church, i.e. a global variable). For the purpose of the analysis it was necessary to create two new variables: the product of the status variable and the type of school, and the product of the status variable and the average status in the school (see the last two rows in the table). The conclusions from the model are these: an individual’s status has an effect on their performance in math (with each unit increase in status, skills are on average 2.9 points higher). Student skills are better at church schools by 1.3 points, and student skills are also better at schools with higher average status (with each unit increase in average status, skills are on average 5.2 points higher).
The last two rows of the table are also important: they tell us that at church schools the effect of status on math skills is lower than at state schools (see the negative value of the coefficient in row SES*SECT). At schools with a higher average status the effect of an individual’s status is higher than at schools with a lower average status (see row SES*MEANSES). Method 2) has two steps: first the model is estimated on the basis of the status variable measured at the first level, and then these estimates are used to model the constants and the regression coefficients at individual schools with the aid of contextual variables. This gradually gives us one hundred and sixty-two tables (note 17), the first one hundred and sixty of which are used to estimate the effect of status on students’ math skills at individual schools, and the last two (see Tables 7 and 8) to express the effect of contextual variables on the value of the constant and the regression coefficient at individual schools. The effect of status on skills is determined as the average of the slopes at individual schools, and its value equals 2.2. This is lower than the value obtained using Method 1, and also lower than the value obtained using the multilevel modelling method (see below). Based on the table modelling the school constants we can reiterate the findings already presented for the previous method: higher-status schools and church schools have students with better average math skills (the coefficient values 5.39 and 1.21 are almost identical to the values obtained in Method 1). It remains to compare the interaction coefficients. The ses*sector interaction is almost the same in both methods (-1.642 according to Method 1, and -1.56 according to Method 2). However, the ses*meanses interaction differs more (1.04 for Method 1, and 0.86 for Method 2).
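The two steps of Method 2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the computation behind Tables 7 and 8: the data are hypothetical toy values (not the HSB data), only one contextual variable (sector) is used instead of two, and the helper simple_ols and all variable names are introduced here for illustration.

```python
def simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


# Hypothetical toy data (NOT the HSB data): four schools, four students each.
CSES = [-1.5, -0.5, 0.5, 1.5]      # centred student status
RESID = [1.0, -1.0, -1.0, 1.0]     # residuals that sum to zero and are
                                   # uncorrelated with CSES, so each school's
                                   # fitted line equals its generating line
SCHOOLS = [                        # (sector, intercept, slope); 1 = church
    (0, 12.0, 2.9),
    (0, 11.0, 3.1),
    (1, 13.0, 1.2),
    (1, 13.4, 1.6),
]

# Step 1: one regression of math score on CSES per school.
sectors, intercepts, slopes = [], [], []
for sector, a, b in SCHOOLS:
    scores = [a + b * x + e for x, e in zip(CSES, RESID)]
    a_hat, b_hat = simple_ols(CSES, scores)
    sectors.append(sector)
    intercepts.append(a_hat)
    slopes.append(b_hat)

# Step 2: the school-level estimates become dependent variables
# explained by the contextual variable.
const_state, const_church_gap = simple_ols(sectors, intercepts)
slope_state, slope_church_gap = simple_ols(sectors, slopes)

print(f"constant: state {const_state:.2f}, church gap {const_church_gap:.2f}")
print(f"slope:    state {slope_state:.2f}, church gap {slope_church_gap:.2f}")
```

In this toy data the second step returns a status slope of about 3.0 for state schools and a gap of about -1.6 for church schools, which plays the same role as the ses*sector row in the second-step table. Note that the second step works with only as many observations as there are schools.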
The general interpretation agrees, but the effect of average status is underestimated in Method 2 (see the comparison with the multilevel modelling method below).

Table 7. Estimated effect of contextual variables on the constant

              Unstandardised Coefficients   Standardised Coefficients
              B         Std. Error          Beta                        t        Sig.
(Constant)    12.104    .202                                            59.871   .000
Meanses        5.393    .381                .716                        14.147   .000
Sector         1.213    .317                .194                         3.824   .000

Dependent variable: regression constant, n=160 schools
Source: HSB data (USA, 1980).

Table 8. Estimated effect of contextual variables on the regression coefficient

              Unstandardised Coefficients   Standardised Coefficients
              B         Std. Error          Beta                        t        Sig.
(Constant)     2.883    .160                                            18.067   .000
Ses*meanses     .861    .301                .219                         2.861   .005
Ses*sector    -1.558    .250                                            -6.224   .000

Dependent variable: regression coefficient, n=160 schools
Source: HSB data (USA, 1980).

Having examined the methods of contextual analysis using classic regression, we will now apply multilevel analysis to the data (see Tables 9 and 10), compare the results, and reveal the differences between these approaches. The model with random coefficients is formulated as follows:

$$Y_{ij} = \gamma_{00} + \gamma_{01}\,\mathrm{CSES}_{ij} + \gamma_{10}\,\mathrm{SECTOR}_{j} + \gamma_{20}\,\mathrm{MEANSES}_{j} + u_{0j} + \gamma_{11}\,\mathrm{SECTOR}_{j}\,\mathrm{CSES}_{ij} + \gamma_{21}\,\mathrm{MEANSES}_{j}\,\mathrm{CSES}_{ij} + u_{1j}\,\mathrm{CSES}_{ij} + e_{ij} \qquad (3.6)$$

Unlike classic regression models, this model contains the random errors u0j and u1j at the school (second) level. The model’s parameters are represented by the Greek symbol γ instead of the letter b, and the subscript ij denotes individual i in school j.

Table 9. Estimated fixed effects in the multilevel model from Example 3

Parameter        Equation symbol   Estimate     Std. Error   df        t        Sig.
Intercept        γ00               12.113585    .198807      159.893   60.931   .000
Sector           γ10                1.216672    .306385      149.600    3.971   .000
Meanses          γ20                5.339118    .369299      150.970   14.457   .000
Cses             γ01                2.938763    .155093      139.313   18.948   .000
Sector * cses    γ11               -1.642583    .239791      143.353   -6.850   .000
Cses * meanses   γ21                1.038871    .298901      160.562    3.476   .001

Source: HSB data (USA, 1980), n=7185 students, 160 schools.

Table 10. Estimated covariance components in the multilevel model from Example 3

Parameter                                        Estimate     Std. Error   Wald Z   Sig.
Residual                                         36.721129    .626133      58.648   .000
Intercept + cses [subject = school]   UN (1,1)    2.381859    .371748       6.407   .000
                                      UN (2,1)     .192603    .204524        .942   .346
                                      UN (2,2)     .101380    .213812        .474   .635

Source: HSB data (USA, 1980), n=7185 students, 160 schools.

The value of the interaction SECTOR*CSES (-1.64) (note 18) indicates that for church schools the average value of the regression slope is 1.3 (=2.94-1.64); that is, in church schools each increase in status by one unit (note 19) is associated with math skills that are on average 1.3 points higher (at state schools the average value of the regression slope is 2.94 (note 20)). Now we will interpret the second interaction (CSES*MEANSES=1.04) (note 21). At schools where the average status of students is one unit higher (note 22), the regression slope is 3.98 (=2.94+1.04). Therefore, multilevel analysis also confirmed that at schools with higher average status there is a stronger connection between the status of students and their math skills, and vice versa. Unlike in contextual models, the introduction of random errors at various levels in multilevel modelling allows us to estimate the variances and covariances of these random components (Table 10). The first row of the table shows the estimated variance of the random component at the first (student) level (36.7). The test of the null hypothesis that this variance component is zero (note 23) (see the Sig. column) tells us that there are differences in the math skills of students that the model is unable to help us explain. The second row of the table indicates that there are differences in the average math skills of different schools, and that these differences have not yet been satisfactorily explained with the use of contextual variables.
Conversely, the last row in Table 10 (Sig. = 0.635) indicates that the contextual variables adequately explain the differences in the slopes of the regression equations between schools. Let us compare the findings from the contextual models and the multilevel model. In the case at hand the findings are comparable (both the values of the parameters and their estimated standard errors). Method 2 of the contextual analysis differs slightly in some respects. The reason for this is that it works with a smaller number of cases in the second of its two steps (160 schools). Sometimes the differences in the standard errors of the estimates are bigger, but in our case there is a large enough number of individuals in the data and therefore the differences are minimal. One difference between the contextual analysis methods and multilevel analysis must, however, be highlighted again. The findings from the multilevel analysis can be generalised to all the schools in the United States (on the condition that the schools in our analysis were randomly selected), while with the findings from the contextual analysis no such generalisation is possible.

3.5.3 More Than Two Levels of Analysis

Having reviewed the basic ideas behind contextual analysis and multilevel modelling, we should add that the above example of a two-level model (student-school) can be expanded even further. It is generally possible to model more than just two levels. When three levels are modelled, the changes in the variables at the second level are explained with the aid of variables at the third level, and the variables at the third level can of course also be used to explain changes in the variables at the first level. The coefficients of variables at the second as well as the first level can be allowed to vary randomly, and random error can be introduced at both the second and third levels.
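The verbal description of a three-level model can be written out as a worked formulation. The following is only a sketch under assumed notation that does not appear in the original text: students i are nested in classes j within schools k, X is a hypothetical first-level variable, Z is a hypothetical third-level variable, and u and v denote random errors at the second and third levels.

```latex
% Level 1 (students within classes)
Y_{ijk} = b_{0jk} + b_{1jk} X_{ijk} + e_{ijk}

% Level 2 (classes within schools): the level-1 coefficients vary between classes
b_{0jk} = c_{0k} + u_{0jk}, \qquad b_{1jk} = c_{1k} + u_{1jk}

% Level 3 (schools): the level-2 coefficients are explained by a school variable Z_k
c_{0k} = \gamma_{0} + \gamma_{1} Z_{k} + v_{0k}, \qquad
c_{1k} = \gamma_{2} + \gamma_{3} Z_{k} + v_{1k}

% Combined model after substitution
Y_{ijk} = \gamma_{0} + \gamma_{1} Z_{k} + \gamma_{2} X_{ijk} + \gamma_{3} Z_{k} X_{ijk}
        + (v_{0k} + u_{0jk}) + (v_{1k} + u_{1jk}) X_{ijk} + e_{ijk}
```

Every additional level adds its own set of random errors, each with variances and covariances to be estimated, so the number of parameters grows quickly.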
Naturally, as the model becomes more complex the robustness of the solution decreases and the required sample size increases, because we need an adequate number of observations (units) at each level in order to obtain stable estimates. Therefore, for pragmatic reasons two-level models prevail in practice, and three-level models can be used in special studies with samples in the many thousands. Models with more than three levels, even though they are theoretically possible (and software for working with such models even exists), are not used.

3.5.4 More Problems That Can Be Solved in Multilevel Analysis

3.5.4.1 Nonlinear models
All the applications in this text are based on the use of continuous dependent variables and a linear association between dependent and independent variables. This, however, in no way exhausts the possible applications of multilevel analysis. The logical extension of logistic regression, once random errors are introduced at higher levels of analysis, is multilevel logistic regression, which can be used when a dependent variable is dichotomous, ordinal, or nominal (with multiple categories) (Raudenbush and Bryk 2002).

3.5.4.2 Growth models
In growth models (Raudenbush and Bryk 2002) we have measurements of a certain property at multiple points in time for the same individuals. We can regard the measurements of the property at the various points in time as the first level (like the students in a given school) and the individuals as the second level (like the schools). Growth models produce findings about whether the given property is on the increase or on the decrease, and whether individuals have varying growth curves. The differences between the trajectories of individuals can be explained by means of their individual characteristics, which in these analyses have the character of contextual variables.
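A linear growth model of this kind can be written in the same form as equation (3.6). The formulation below is a sketch under assumed notation not used in the original text: t indexes measurement occasions, i indexes individuals, T is the time of measurement, and Z is a hypothetical characteristic of the individual; the indices of the γ parameters follow the convention of equation (3.6).

```latex
% Level 1 (measurement occasions within individuals)
Y_{ti} = b_{0i} + b_{1i} T_{ti} + e_{ti}

% Level 2 (individuals): starting level and growth rate differ between individuals
b_{0i} = \gamma_{00} + \gamma_{10} Z_{i} + u_{0i}, \qquad
b_{1i} = \gamma_{01} + \gamma_{11} Z_{i} + u_{1i}

% Combined model after substitution
Y_{ti} = \gamma_{00} + \gamma_{01} T_{ti} + \gamma_{10} Z_{i} + \gamma_{11} Z_{i} T_{ti}
       + u_{0i} + u_{1i} T_{ti} + e_{ti}
```

In this sketch the sign of γ01 indicates whether the property is increasing or decreasing on average, the variance of u1i indicates whether individuals have varying growth curves, and γ11 expresses how far the characteristic Z accounts for the differences between trajectories.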
3.5.4.3 Meta-analysis
One very interesting possibility is the use of multilevel modelling in meta-analysis (Raudenbush and Bryk 2002). This procedure is based on the following logic: the first level is formed by the data from individual studies, and the second level (the context) is formed by the studies themselves. Individuals do not figure in the context of the group they belong to but in the context of the study in which they were respondents (or the experiment they took part in, etc.). The basic objectives of using multilevel modelling for meta-analysis are:
1. finding a common (“average”) outcome of all of the published studies, and
2. revealing the sources of differences between studies.
The advantage of multilevel modelling is that it can be applied without knowledge of the data from individual studies (although such knowledge is certainly an advantage). One complication in meta-analyses (from a strict multilevel perspective) arises when individual studies ignore context. There is also a more general problem connected with meta-analyses: scientists tend to publish only (or mainly) outcomes that are statistically significant, and this leads to the risk of publication bias in the outcome of a meta-analysis.

3.5.4.4 Cross-classified models
Simple multilevel models assume that contexts are hierarchically structured (see Table 1). A student (level 1) attends a class (level 2) within a given school (level 3). However, social reality is much more complex and offers situations where an individual is situated in various contexts (groups) that are not hierarchically structured. Consequently, the individual does not belong to just a single group with an effect on them, but to many groups at once, and the effects of these groups intersect or cross one another.
Cross-classified models take this fact into account (Snijders and Bosker 1999; Hox 2002), and in many ways they represent a better model of reality than classic hierarchical models. This better approximation of reality comes at a price, however, as these models are formally much more complex and as a result the necessary software is often lacking.

3.5.4.5 Covariance structure analysis
This form of analysis is an extension of the classic structural models that work with latent and manifest variables. The multilevel approach can be used in confirmatory factor analysis (testing the measurement model) and path analysis, or to combine the two (i.e. into a full structural model) (Hox 2002). The multilevel approach makes it possible to model the difference between the covariance matrices of individual groups by breaking down the covariance matrix into intra-group and inter-group matrices (this is mathematically analogous to the analysis of variance, except that the analysis is carried out multidimensionally on the matrices rather than on the sums of squares).

This inventory of the possible applications of multilevel modelling is not exhaustive. Psychologists also use a multilevel version of IRT models, general methods of analysis known as Latent Models are also being elaborated, and so on. But these methods must be the subject of separate monographs and articles.

3.6 Conclusion – What Lazarsfeld and Menzel Already Knew

This chapter opened with a look at a typology of variables and the contextual analysis connected with it, and reviewed three possible strategies of data analysis. Lazarsfeld and Menzel were very familiar with the first two strategies:
1) Working with data on individuals (regression, analysis of variance, etc.)
2) Working with aggregate data (along with Robinson’s problem and early solutions to it in the form of the method of bounds and Goodman’s regression).
Lazarsfeld and his colleagues designed and introduced a third strategy:
3) Analysing data on individuals and contextual variables at once (contextual analysis).

However, in Lazarsfeld’s time the mathematical formulation of this model was not yet adequately developed, and none of today’s computer technology was available. Clearly for this reason, Lazarsfeld and his team had to limit themselves to recommending the use of contextual variables when comparing individual contexts using descriptive statistics. One could surmise that, were Lazarsfeld alive today, he would be one of the primary advocates of the use of multilevel models. In my opinion he would favour the kind of rational recommendations contained in the three strategies discussed in this chapter. If we have data on individuals and the context plays no major role, then it makes sense to use data on individuals and submit them to standard methods of analysis (regression, analysis of variance, etc.). Conversely, if we have just aggregate data, it makes sense to infer from the data using the available algorithms. In many cases it makes sense to use data on individuals and contexts at the same time. In such cases the right approach is to use contextual analysis or its more modern version, multilevel modelling. For now, these represent the last stop in the evolutionary trajectory that Lazarsfeld and his team staked out fifty years ago.

References

Anděl, J. (2003): Statistické metody. Praha, Matfyzpress.
Anděl, J. (2005): Základy matematické statistiky. Praha, Matfyzpress.
Draper, N. R., H. Smith. (1998): Applied regression analysis. Wiley.
Duncan, O. D., B. Davis. (1953): An alternative to ecological correlation. American Sociological Review, 18, pp. 665-666.
Fox, J. (1997): Applied regression analysis, linear models, and related methods. London, Sage.
Goffman, E. (1965): The characteristics of total institutions. In: Etzioni, A. (ed.). Complex Organizations. New York, Holt, Rinehart & Winston, pp. 312-340.
Goldstein, H. (2003): Multilevel Statistical Models (Third Edition). London, Edward Arnold.
Goodman, L. A. (1953): Ecological regressions and behavior of individuals. American Sociological Review, 18, pp. 663-664.
Hamplová, D. (2005): Základní principy víceúrovňových modelů. SDA info 7 (2), pp. 1-2.
Hebák, P. et al. (2005a): Vícerozměrné statistické metody (díl 2). Praha, Informatorium.
Hebák, P., J. Hustopecký. (1987): Vícerozměrné statistické metody s aplikacemi. Praha, Státní nakladatelství technické literatury.
Hebák, P. et al. (2005b): Vícerozměrné statistické metody (díl 3). Praha, Informatorium.
Hendl, J. (2004): Přehled statistických metod zpracování dat: analýza a metaanalýza dat. Praha, Portál.
Hox, J. (2002): Applied Multilevel Analysis: Techniques and Applications. Erlbaum Associates.
Iversen, G. (1991): Contextual analysis. London, Sage.
Jobson, J. D. (1991): Applied multivariate data analysis. Vol. 1, Regression and experimental design. Berlin, Springer-Verlag.
King, G. (1997): A solution to the ecological inference problem. Princeton, Princeton University Press.
King, G., M. A. Tanner, O. Rosen (eds.). (2004): Ecological inference: new methodological strategies. Cambridge (NY), Cambridge University Press.
Kreft, I. G., J. de Leeuw. (1998): Introducing multilevel modeling. London, Sage.
Lazarsfeld, P. F., E. Katz. (1955): Personal Influence. The Part Played by People in the Flow of Mass Communications. New York, The Free Press.
Lazarsfeld, P. F., H. Menzel. (1965): On the Relation between Individual and Collective Properties. In: Etzioni, A. (ed.). Complex Organizations. New York, Holt, Rinehart & Winston, pp. 422-440.
Lazarsfeld, P. F., M. Rosenberg (eds.). (1955): The language of social research. The Free Press of Glencoe, Illinois.
Lazarsfeld, P. F., W. Thielens, Jr. (1958): Academic mind (Social scientists in a time of crisis). The Free Press of Glencoe, Illinois.
Longford, N. T. (1993): Random Coefficient Models. Oxford, Oxford University Press.
Luke, D. (2004): Multilevel modeling. London, Sage.
Marsh, P. C., A. L. Coleman. (1956): Group influences and agricultural innovations: Some tentative findings and hypotheses. American Journal of Sociology, 61, pp. 588-594.
Raudenbush, S. W., A. S. Bryk. (2002): Hierarchical Linear Models, 2nd edition. London, Sage.
Robinson, W. S. (1950): Ecological correlations and the behaviour of individuals. American Sociological Review, 15, pp. 351-357.
Snijders, T. A. B., R. J. Bosker. (1999): Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling. London, Sage.
Soukup, P. (2006): Proč užívat hierarchické lineární modely. Sociologický časopis, 42, No. 5, pp. 987-1012.
Soukup, P., L. Rabušic. (2007): Několik poznámek k jedné obsesi českých sociálních věd – statistické významnosti. Sociologický časopis, 43, No. 2, pp. 379-396.

Notes to Chapter 3

1) I have presented a simplified account of this typology along the lines of Hox’s overview (2002) elsewhere (Soukup 2006: 988-989).
2) An individual is just a metaphorical expression of a phenomenon, and the phrase “member of a collective” also does not automatically refer to a real individual, as will be explained below. These are relative expressions of a particular entity (metaphorically, its members) belonging to a more general entity (metaphorically, a collective).
3) Frequent use is made in this text of examples from the field of sociology of education because the author’s professional specialisation is in this field of sociology, but also because these examples are particularly illustrative. Better examples could certainly be proposed, but given that the contextual analyses use data from the sphere of education the author has opted for this approach.
4) The situation is even more complicated because a school is also obviously a context for students. Nevertheless, if we obtain relevant information about the class it is right to use this information first and only then additionally use information about the school.
5) Stricto sensu, this cited terminology only applies to groups with membership, but the conclusions apply to all types of groups, and therefore the term membership should be understood metaphorically.
6) The variable was coded as 0=woman, 1=man.
7) It is better to use interval estimates (see the last two columns of Table 2) rather than interpreting these point estimates of regression parameters.
8) In random sampling (Soukup and Rabušic 2007) this error is quantifiable and the results can therefore be predicted with a certain degree of likelihood (probability).
9) Sometimes voters are polled as they leave polling stations (exit polls). These could be a reliable indicator and they tend not to reveal the errors cited here. The problem is that these surveys focus on voters and only report indirectly on non-voters.
10) The data in this case are not real and are only intended to illustrate how the method works.
11) The use of these equations or the graphic expressions of these equations makes it possible to infer whether or not it makes sense to carry out a contextual (multilevel) analysis. If the curves or equations differ, the answer is yes.
12) The number of observations will therefore be equal to the number of units at the second level (the number of contexts) in which first-level units can be found.
13) The procedure to calculate the solution for a multilevel model is somewhat complicated. Unlike with classic regression models and contextual models, it is not possible to find the solution immediately (in one step); instead it is reached through gradual iterative steps that move through the various levels of analysis.
14) I would like to thank Dr. Jeřábek for drawing attention to this classic case and for recommending its inclusion in this text.
15) The variable was centred, and its values ranged between -4 and +3.
16) This variable ranged between -1.2 and 0.8.
17) For practical reasons we cannot present the 160 tables for all the schools, but it is possible to send them to anyone interested upon request.
18) The value is similar to the value obtained in Methods 1 and 2 in contextual analysis.
19) All other conditions being equal.
20) This information comes from the coding of the variable SECTOR (1=church schools); the values of the parameters for the state schools are drawn from the parameter for the variable CSES.
21) The value almost corresponds to the estimate obtained using Method 1 in contextual analysis.
22) All other conditions being equal.
23) For more on the tests and interpretations of these models see Soukup (2006) and the literature cited there.