Final meeting of the research group on
"Mixture and Latent Variable Models for Causal Inference and Analysis of Socio-Economic Data"

Book of Abstracts

Department of Statistical Sciences
University of Bologna
February 1-2, 2017

Dynamic latent class models for cross-sectional data*

Brian Francis, Lancaster University, UK
Valmira Hoti, Lancaster University, UK

Abstract

Using a latent class approach, this talk addresses the problem of assessing change over time in cross-sectional surveys. Traditional methods of analysis assume that the class-specific item probabilities (the class profiles) are constant over time, with at most the proportion of each class changing over time. However, this assumption may not hold, since social change will undoubtedly be present. We develop a range of models and apply them to human value items from seven sweeps of the European Social Survey for the UK. Various models of profile change are considered. The models are linked to the ideas of measurement invariance, and this link will be discussed. The models can be extended to allow for non-linearity in temporal trends. The conclusion is that human values are indeed changing over time, with change focused on one or two specific items.
* Presented at the final meeting of the FIRB (“Futuro in ricerca” 2012) project “Mixture and latent variable models for causal-inference and analysis of socio-economic data”, Bologna (IT), February 1-2, 2017

Mixture growth modeling of households’ investment in risky financial assets

David Aristei, University of Perugia
Silvia Bacci, University of Perugia
Francesco Bartolucci, University of Perugia
Silvia Pandolfi, University of Perugia

Abstract

In this work we study the dynamics of households’ portfolio choices over the life cycle and we analyze the factors affecting both financial market participation and the amount invested in risky financial assets. To this end, we propose a bivariate mixture latent growth model that allows for heterogeneous growth trajectories by assuming the existence of unobservable clusters (or latent classes) of households with similar behaviors in terms of portfolio choices. We also investigate the effect of time-constant and time-varying covariates on the response variables at the household level. The approach is illustrated by the analysis of an unbalanced panel dataset of Italian households over the 1998-2014 period. On the basis of this dataset, we identify three latent groups characterized by heterogeneous investment behaviors over the life cycle in terms of both asset market participation and conditional share.
Fixed-effects estimation of binary short panel data models with predetermined covariates

Francesco Bartolucci, University of Perugia
Claudia Pigini, Marche Polytechnic University

Abstract

Strict exogeneity of covariates other than the lagged dependent variable, conditional on unobserved heterogeneity, is often required for consistent estimation of binary panel data models. This assumption is likely to be violated in practice because of feedback effects from the past of the outcome variable on the present value of covariates, and no general solution is yet available. We propose a novel model formulation that takes into account feedback effects without specifying a parametric model for the predetermined explanatory variables. We further propose estimating the model parameters with a recent fixed-effects approach based on pseudo conditional inference, thereby taking care of the correlation between individual permanent unobserved heterogeneity and the model’s covariates as well. Our results hold for short panels with a large number of cross-section units, a case of great interest in microeconomic applications.
Advances in mixture models for rating data

Maria Iannario, Department of Political Sciences, University of Naples Federico II
Domenico Piccolo, Department of Political Sciences, University of Naples Federico II
Rosaria Simone, Department of Political Sciences, University of Naples Federico II

Abstract

The talk gives an overview of recent developments in finite mixture models for ordinal data based on the specification of an uncertainty component: CUB models. The original rationale has been adapted to build a more comprehensive framework that is designed to accommodate classical proposals as well as to be sensitive to different sources of nuisance, such as shelter effect, response styles and overdispersion. The resulting class of mixture models has proved to be particularly fruitful in several applications, ranging from job satisfaction to sensometric experiments, and also for determining the effect of the chosen ordinal scale on uncertainty. In conclusion, directions for future methodological research are outlined, focusing on a more flexible specification of the uncertainty component and, most importantly, on multi-item analysis to model subjective heterogeneity.
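As an illustration of the uncertainty component mentioned above, the following is a minimal sketch of the probability mass function of the basic CUB model in its usual parameterization, that is, a mixture of a shifted binomial (feeling) and a discrete uniform (uncertainty) distribution over the ratings 1, ..., m. The function name `cub_pmf` and its argument names are chosen here for illustration only; the extensions discussed in the talk (shelter effect, overdispersion, covariates) are not covered.

```python
from math import comb


def cub_pmf(m, pi, xi):
    """Return [P(R=1), ..., P(R=m)] for a basic CUB model (illustrative sketch).

    m  : number of ordered categories
    pi : weight of the shifted binomial (feeling) component, in [0, 1];
         1 - pi is the weight of the discrete uniform (uncertainty) component
    xi : shifted binomial parameter, in [0, 1]
    """
    if m < 2 or not 0.0 <= pi <= 1.0 or not 0.0 <= xi <= 1.0:
        raise ValueError("need m >= 2 and pi, xi in [0, 1]")
    probs = []
    for r in range(1, m + 1):
        # Shifted binomial probability of rating r on the support 1..m.
        shifted_binomial = comb(m - 1, r - 1) * (1 - xi) ** (r - 1) * xi ** (m - r)
        # Mixture with a discrete uniform over the m categories.
        probs.append(pi * shifted_binomial + (1 - pi) / m)
    return probs
```

With `pi = 1` the uncertainty component vanishes and the distribution is a pure shifted binomial; with `pi = 0` every rating has probability `1/m`.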
Stochastic block models for social network data: inferential developments

Francesco Bartolucci, University of Perugia
Maria Francesca Marino, University of Perugia
Silvia Pandolfi, University of Perugia

Abstract

Stochastic blockmodels have attracted growing interest in the social network literature in recent decades. They provide a tool for discovering communities and identifying clusters of individuals characterised by similar social behaviours. In particular, they assume that units belong to one of k distinct blocks, which are defined by a discrete latent variable. The probability of observing a connection between two individuals depends only on the corresponding block memberships. In this framework, full maximum likelihood estimates are not achievable due to the intractability of the likelihood function. A number of approximate solutions are available. Here, we propose a new and more efficient approximate method for estimating the model parameters, which shows great potential. The proposal is illustrated via simulations and an application to a benchmark dataset in the social network literature.

Generalized linear latent variable models for joint modeling in ecology

Sara Taskinen, University of Jyväskylä
Jenni Niku, University of Jyväskylä
Francis K.C. Hui, The Australian National University
David I.
Warton, The University of New South Wales

Abstract

In many ecological studies, counts or biomass of interacting species are collected from several sites. Such data are often very sparse, high-dimensional and include highly correlated responses, and the main aim of the statistical analysis is to understand relationships among such multiple, correlated responses. In this talk we show how generalized linear latent variable models (GLLVMs) can be used to analyse data common in ecological studies. By extending the standard generalized linear modelling framework to include latent variables, we can account for any covariation between species not accounted for by the predictors, such as species interactions and correlations driven by missing covariates. Fast and efficient maximum likelihood based algorithms for fitting the models will be discussed, and simulations are used to study the finite-sample properties of the resulting estimates. It is shown that the variational approximation method in particular performs well in the case of GLLVMs. The method will be applied to two ecological datasets.

Equating test scores with covariates

Marie Wiberg, Department of Statistics, USBE, Umeå University, Sweden
Valentina Sansivieri, Department of Statistics, University of Bologna, Italy

Abstract

When different test forms are given to different test takers, equating is used to ensure that the same conclusions are drawn regardless of the test form they have been given. To compare test forms, typically common test takers, equivalent test takers, or common items are used in the single group, equivalent group (EG) or non-equivalent groups with anchor test (NEAT) designs.
There are, however, situations in which we do not have access to common test takers or common items and the groups are non-equivalent. To improve the equating in such situations, one can use information from the test takers’ covariates through the non-equivalent groups with covariates (NEC) design. In this presentation, different approaches with the NEC design are presented, with a focus on a new method in which item response theory observed-score equating is used. In particular, the focus is on the situation in which some of the items have differential item functioning. The newly proposed method is compared with the results from equating with either an EG design or a NEAT design. The results show that the standard errors are lower when the information from the covariates is used than when it is not used, as in the EG or the NEAT designs.

Posterior predictive model checks for IRT models

Mariagiulia Matteucci, Department of Statistical Sciences, University of Bologna
Stefania Mignani, Department of Statistical Sciences, University of Bologna

Abstract

Within the framework of item response theory (IRT) models, the issue of model fit assessment is crucial. To overcome the limitations of classical methods, which are affected by the problem of sparse data, Bayesian predictive assessment was recently introduced, where the reference distribution is built empirically. The purpose of this study is to investigate the effectiveness of posterior predictive model checking (PPMC) in detecting model fit when multidimensional data are analysed with a unidimensional approach.
A simulation study is conducted by using discrepancy measures based on association or correlation among item pairs, and the use of relative entropy (RE) based on these measures is investigated. The results show that the number of extreme posterior predictive p-values (PPP-values) is the most useful tool, while the RE should be used to identify potential misfit due to specific items. An application to real data is presented.

Generalized linear latent variable models for the analysis of cognitive functioning over time

Silvia Bianconcini, Silvia Cagnone, Department of Statistical Sciences, University of Bologna

Abstract

Dimensions of cognitive functioning are potentially important but often neglected determinants of central economic outcomes. The Health and Retirement Study and the Asset and Health Dynamic study (HRS/AHEAD) aim at examining the impact of cognitive performance and decline on key domains of interest (e.g., health and daily functioning, economic and health decision making). In this paper, HRS/AHEAD data are analyzed using latent variable models that allow us to identify common factors of the cognitive items and analyze their dynamics over time. The estimation of these models is cumbersome when the items are of different types, as in our case, since the integration of the likelihood function is not analytically feasible. To overcome this problem, we propose a new integration method, called the Dimension Reduction Method, that provides parameter estimates as accurate as those of commonly applied techniques, but without sharing the same computational complexity.
Multilevel models with stochastic volatility for repeated cross-sections

Silvia Cagnone, Department of Statistical Sciences, University of Bologna (IT)
Simone Giannerini, Department of Statistical Sciences, University of Bologna (IT)
Lucia Modugno, Department of Statistical Sciences, University of Bologna, Bologna (IT)

Abstract

In this work we introduce a multilevel specification with stochastic volatility for repeated cross-sectional data. Modelling the time dynamics in repeated cross sections requires a suitable adaptation of the multilevel framework, in which the individuals/items are modelled at the first level whereas the time component appears at the second level. We perform maximum likelihood estimation by means of a nonlinear state space approach combined with Gauss-Legendre quadrature methods to approximate the likelihood function. We apply the model to the first database of tribal art items sold in the most important auction houses worldwide. The model makes it possible to account properly for the observed heteroscedastic and autocorrelated volatility and has superior forecasting performance. It also provides valuable information on market trends and on the predictability of prices that can be used by art market stakeholders.

Methods for integrated analysis of datasets: prediction and interpretation

Jeanine J.
Houwing-Duistermaat, University of Leeds, UK
Hae Won Uh, Leiden University Medical Center, NL
Mar Rodriguez Girondo, Leiden University Medical Center, NL

Abstract

Nowadays several correlated datasets are available in epidemiological studies. For example, omics datasets are measured to identify biomarkers for diseases. Interpretation of the obtained models is, however, hampered by high dimensionality and correlation within and across datasets. For dimension reduction, I will present two types of integrated methods, namely PLS methods, which decompose each dataset into a common and a noise component, and network methods, which identify correlation structures within a dataset. The obtained variables will be used to predict an outcome. To deal with the clustered variables from the network methods, grouped versions of regularized regression techniques will be applied. Cross-validation will be applied to avoid overfitting and to estimate the tuning parameters. The methods will be compared via simulations and will be illustrated by data from an epidemiological study on lifestyle.

Latent Space Model for Multidimensional Networks

Silvia D’Angelo, La Sapienza, Università di Roma
Marco Alfò, La Sapienza, Università di Roma
Thomas Brendan Murphy, University College Dublin

Abstract

A multidimensional network (multiplex) is a collection of networks for which the node set is constant while the edge sets may vary. Such a structure can be due either to a phenomenon changing over time or to the observation of multiple characteristics over a group of units. We present a latent space approach to model binary multiplex data.
The probability of having a linked dyad in a network is modelled as a function of its propensity to be connected and of the distance between its nodes in an unobserved space. A common latent space for the whole multiplex makes it possible to study the interconnections between networks. The distances are rescaled by a network-specific coefficient, summarizing the association among networks. We adopt a hierarchical Bayesian approach and use MCMC inference to estimate the parameters. An application to real data will be presented.

A robust fuzzy clustering method for non-precise data based on a trimming approach

Maria Brigida Ferraro, Sapienza University of Rome
Ana Belén Ramos-Guajardo, University of Oviedo

Abstract

In many practical situations the observed data are affected by imprecision and cannot be expressed in terms of single values. A useful approach consists in managing them by fuzzy sets, in particular, LR fuzzy data. Most of the existing algorithms for clustering fuzzy data produce only clusters with spherical shape because the homogeneity of the clusters is expressed in terms of Euclidean distance. In order to avoid the implicit assumption of spherical clusters, we introduce a generalized distance for fuzzy data. This type of data is characterized by a complex structure and, for this reason, different kinds of contamination exist in this context. There are different proposals of robust methods for clustering fuzzy data. We suggest using a trimming approach. The adequacy of our proposal is checked by means of simulation and real-case studies.
Segmentation of sea current fields by cylindrical hidden Markov models: a composite likelihood approach

Francesco Lagona, University of Roma Tre

Abstract

A hidden Markov random field is proposed for the analysis of spatial cylindrical data, i.e. bivariate spatial series of angles and intensities. The model is based on a mixture of cylindrical densities, whose parameters vary across space according to a latent Markov field. It makes it possible to segment the data into a finite number of latent classes, simultaneously accounting for spatial autocorrelation, circular-linear correlation, multimodality and skewness. Due to the numerical intractability of the likelihood function, estimation of the parameters is based on composite likelihood methods, by introducing a computationally efficient EM algorithm that iteratively alternates the maximization of a weighted composite likelihood function with the updating of the weights. In a case study of sea circulation in the Gulf of Naples, these methods segment the data according to meaningful latent classes that represent specific conditions of the sea surface.
Conditional average treatment effect: an application related to the partner union quality and divorce on the child’s psychological wellbeing

Anna Garriga, Pompeu Fabra University
Fulvia Pennoni, University of Milano-Bicocca
Isabella Romeo, Istituto di ricerca Mario Negri

Abstract

We test the hypothesis that divorce may be a positive experience for children with parents in high-distress unions, while the dissolution of low-distress unions may have negative effects. We use the first three waves of the Millennium Cohort Study (MCS), a longitudinal and representative British survey, to explore the effect of parental divorce and parental relationship quality on several children’s outcomes at age five. By using the augmented inverse propensity weighted estimator, we show that the dissolution of high-quality parental unions has the most harmful effects on children, especially on conduct problems. Among children whose parents have the highest-quality relationships, those who experience parental separation/divorce have more conduct problems than children with a stable family. Our findings indicate that early childhood programs and interventions should particularly target children whose parents report a high-quality relationship before parental separation, since they are especially at risk.
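For readers unfamiliar with the estimator named above, the following is a minimal sketch of the textbook augmented inverse propensity weighted (AIPW) estimator of an average treatment effect. It is not the authors' implementation: the function name `aipw_ate` is hypothetical, and the propensity scores `e_hat` and outcome-model predictions `mu1_hat`, `mu0_hat` are assumed to have been fitted beforehand by any suitable method.

```python
from math import sqrt
from statistics import mean, stdev


def aipw_ate(y, t, e_hat, mu1_hat, mu0_hat):
    """Textbook AIPW estimate of the average treatment effect (sketch).

    y        : observed outcomes
    t        : binary treatment indicators (1 = treated, 0 = control)
    e_hat    : estimated propensity scores P(T = 1 | X), strictly in (0, 1)
    mu1_hat  : predicted outcomes under treatment, E[Y | T = 1, X]
    mu0_hat  : predicted outcomes under control,   E[Y | T = 0, X]
    Returns (estimate, standard error based on the per-unit scores).
    """
    scores = []
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, mu1_hat, mu0_hat):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Outcome-model prediction plus an inverse-probability-weighted residual.
        psi1 = m1 + ti * (yi - m1) / ei
        psi0 = m0 + (1 - ti) * (yi - m0) / (1 - ei)
        scores.append(psi1 - psi0)
    estimate = mean(scores)
    std_error = stdev(scores) / sqrt(len(scores))
    return estimate, std_error
```

A conditional effect of the kind discussed in the abstract could be obtained, under the same assumptions, by applying the function separately within each stratum of the conditioning variable (for example, each level of relationship quality).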
A multivariate multilevel model to analyze educational achievement in Reading, Mathematics and Science in Italy

Leonardo Grilli, University of Florence
Fulvia Pennoni, University of Milano-Bicocca
Carla Rampichini, University of Florence
Isabella Romeo, Istituto di ricerca Mario Negri

Abstract

We illustrate a multivariate multilevel analysis in the complex setting of large-scale assessment surveys, dealing with plausible values and accounting for the survey design. We analyse the Italian sample of the TIMSS & PIRLS 2011 Combined International Database on fourth grade students. We jointly consider educational achievement in Reading, Mathematics and Science; thus we test for differential associations of the covariates with the three response variables, and we estimate the residual correlations among pairs of responses within and between classes. Multilevel modelling allows us to disentangle student and contextual factors affecting achievement. We also account for territorial differences in wealth by means of an index from an external data source. The model residuals point out classes with high or low performance. As educational achievement is measured by plausible values, the estimates are obtained through multiple imputation formulas.
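The multiple imputation formulas referred to above are commonly taken to be Rubin's combining rules; the sketch below assumes this is the case. It combines one parameter estimate and its sampling variance obtained from each set of plausible values. The function name `combine_plausible_values` is hypothetical, and the per-set estimates are assumed to come from fitting the same model once per plausible value.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Combine per-plausible-value results with Rubin's rules (sketch).

    estimates          : one point estimate per plausible value
    sampling_variances : the matching squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = mean(estimates)            # average of the m estimates
    within = mean(sampling_variances)   # average sampling variance
    between = variance(estimates)       # variance across plausible values (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, total
```

The between-imputation term inflates the variance to reflect the measurement uncertainty carried by the plausible values, which would be ignored if a single plausible value or their average were analysed as if it were observed.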
A comparison between two statistical models to analyse and predict individual changes over time

Fulvia Pennoni, University of Milano-Bicocca
Isabella Romeo, Istituto di ricerca Mario Negri

Abstract

The latent Markov model and the growth mixture model for longitudinal data are compared when the ordinal nature of the response variable is of interest. The latent Markov model is based on time-varying latent variables to explain the observable behavior of the individuals. It is proposed in a semiparametric formulation as the latent process has a discrete distribution and is characterized by a Markov structure. The growth mixture model is based on a latent categorical variable that accounts for the unobserved heterogeneity in the observed trajectories and on a mixture of Gaussian random variables to account for the variability in the growth factors. We refer to a real data example on self-reported health status to illustrate their peculiarities and differences.

Multilevel cluster-weighted models for the evaluation of hospitals

Paolo Berta, University of Milano-Bicocca
Salvatore Ingrassia, University of Catania
Antonio Punzo, University of Catania
Giorgio Vittadini, University of Milano-Bicocca

Abstract

In recent years, increasing attention has been directed toward problems inherent to quality control in healthcare services.
In particular, it is necessary to measure effectiveness with respect to improving healthcare outcomes of diagnostic procedures or specific treatment episodes. The performance of hospitals is usually evaluated by multilevel models and other methods for risk adjustment. However, these approaches are not suitable for data with large unobserved heterogeneity. A potentially large source of unobserved heterogeneity comes from the variation of the regression coefficients between groups of individuals sharing similar but unobserved characteristics. To overcome such drawbacks, we propose the multilevel cluster-weighted model, a new mixture model approach for handling hierarchical data.

Structural Equation Models in a Redundancy Analysis Framework with observed covariates

Pietro Giorgio Lovaglio, Dept. of Statistics and Quantitative Methods, University of Bicocca-Milan
Giorgio Vittadini, Dept. of Statistics and Quantitative Methods, University of Bicocca-Milan

Abstract

A method to specify and fit structural equation models in the Redundancy Analysis framework, based on so-called Extended Redundancy Analysis (ERA), has recently been proposed in the literature. In this approach, the relationships between the observed exogenous variables and the observed endogenous variables are moderated by the presence of unobservable composites, estimated as linear combinations of exogenous variables. However, in the presence of direct effects linking exogenous and endogenous variables, or concomitant indicators, the composite scores are estimated by ignoring the presence of the specified direct effects.
To fit structural equation models, we propose a new specification and estimation method, called Generalized Redundancy Analysis (GRA), allowing us to specify and fit a variety of relationships among composites, endogenous variables and external covariates. The proposed methodology extends the Extended Redundancy Analysis method, using a more suitable specification and estimation algorithm, by allowing for covariates that affect endogenous indicators indirectly through the composites and/or directly. To illustrate the advantages of GRA over ERA, we present a simulation study with small samples. Moreover, we propose an application aimed at estimating the impact of formal human capital on the initial earnings of graduates of an Italian university, utilizing a structural model consistent with well-established economic theory.

Dealing with unobserved heterogeneity in return to education

Michele Battisti, University of Palermo
Salvatore Ingrassia, University of Catania
Angelo Mazza, University of Catania
Antonio Punzo, University of Catania

Abstract

The Mincer earnings function is a regression model that explains earnings as a convenient function of schooling and experience. The model has been examined on many datasets and it is one of the most widely used models in empirical economics. In many circumstances, however, due to unobserved heterogeneity, a single Mincer regression is inadequate. Moreover, whatever (concomitant) information is available about the nature of such heterogeneity should be incorporated in an appropriate manner.
Motivated by these considerations, we propose a mixture of Mincer models with concomitant variables: it simultaneously provides a flexible generalization of the Mincer model, a breakdown of the population into several homogeneous subpopulations, and an explanation of the unobserved heterogeneity. The proposal is motivated and illustrated via an application to data provided by the Bank of Italy’s Survey of Household Income and Wealth in 2012.

Clustered Heterogeneity: Fixed Effects, Random Effects and Mixtures

Gerhard Tutz, Ludwig-Maximilians-Universität München

Abstract

A classical approach to explicitly model heterogeneity is to use random effects models, which assume that heterogeneity can be described by distributional assumptions. A drawback of random effects models is that inference may depend on the assumed mixing distribution and, more seriously, the model assumes that the random effects and the observed covariates are independent. In the fixed effects models considered here it is assumed that there are clusters of units that share the same effect on the response. The objective is to identify the clusters and estimate the effects of covariates on the response. We consider two strategies. The first uses regularization techniques. The proposed tailored penalization methods make it possible to identify the clusters of units. The second strategy is based on recursive partitioning (or tree-based) methods. It is especially useful if the number of units is very large.
Bayesian Methods for Principal Stratification and Causal Mediation Analysis with Multiple Intermediates: Evaluating Power Plant Regulatory Interventions

Corwin Zigler, Harvard T.H. Chan School of Public Health

Abstract

This talk outlines methods to evaluate the extent to which the effect of a power plant regulatory intervention on air pollution is mediated through effects on power plant emissions. Power plants emit various compounds that contribute to ambient pollution, necessitating new methods to accommodate multiple intermediating factors that are measured contemporaneously. We leverage two related frameworks for causal inference in the presence of mediating variables: principal stratification and causal mediation analysis. Both approaches are anchored to the exact same model for the observed data, which we specify with flexible Bayesian nonparametric techniques. The principal stratification and causal mediation analyses are interpreted in tandem to provide the first empirical investigation of the presumed causal pathways that motivate a variety of air quality regulatory policies.

Evaluation of student performance through a multidimensional finite mixture IRT model

Silvia Bacci, Francesco Bartolucci, Department of Economics, University of Perugia
Leonardo Grilli, Carla Rampichini, Department of Statistics, Computer Science, Applications “G.
Parenti”, University of Florence

Abstract

In the Italian academic system, a student can enroll for an exam immediately after the end of the teaching period or can postpone it; in this second case the exam result is missing. We propose an approach for the in itinere evaluation of a student’s performance that also accounts for non-attempted exams. The approach is based on an Item Response Theory model that includes two discrete latent variables representing student performance and priority in selecting the exams to take. We explicitly account for non-ignorable missing observations, as the indicators of attempted exams also contribute to measuring the performance (within-item multidimensionality). The model, which is fitted by an EM algorithm, allows for individual covariates in its structural part. The analysis is carried out on freshmen enrolled in academic year 2011/2012 in two degree programmes of the School of Economics of the University of Florence.

Estimating Treatment and Spillover Effects in Observational Social Network Data using Generalized Propensity Scores

Laura Forastiere, University of Florence
Fabrizia Mealli, University of Florence
Edo Airoldi, Harvard University

Abstract

In the presence of interference, potential outcomes have to be defined based on both the individual treatment and the vector of treatments received by the units with which the individual interacts (e.g., friends or neighbors). In observational studies, a further issue is that, when the treatment of individuals spills over to neighboring individuals along an underlying social network, the usually invoked unconfoundedness assumption involves both the individual and the neighborhood treatment.
Therefore, proper covariate-adjustment methods are needed to make this kind of assumption more plausible. Formal theory for designing and analyzing observational studies with units organized in a network is still in its infancy. For this purpose, we define a generalized propensity score that balances individual and neighborhood covariates across units under different levels of such a bivariate treatment.

Causal inference for multiple binary non-independent outcomes

Monia Lupparelli, University of Bologna
Alessandra Mattei, University of Florence

Abstract

We focus on drawing causal inference on multiple non-independent binary outcomes using the potential outcome approach. We define causal effects of treatment on joint outcomes introducing the notion of product outcomes, and we provide a decomposition of the causal effect on product outcomes into intrinsic and extrinsic causal effects, which, respectively, provide information on treatment effect on the intrinsic (product) structure of the product outcomes and on the outcomes’ dependence structure. We propose a log-mean linear regression approach for modeling potential outcomes, such that all the causal estimands of interest can be easily derived from the model parameters. The method is applied in two randomized studies concerning (i) the effect of presurgery administration of oral morphine on pain intensity after surgery; and (ii) the effect of honey on nocturnal cough and sleep quality for coughing children.