Final meeting of the research group on
"Mixture and Latent Variable Models for Causal Inference
and Analysis of Socio-Economic Data"
Book of Abstracts
Department of Statistical Sciences
University of Bologna
February 1-2, 2017
Dynamic latent class models for cross-sectional data*
Brian Francis
Lancaster University, UK
[email protected]
Valmira Hoti
Lancaster University, UK
[email protected]
Abstract
Using a latent class approach, this talk addresses the problem of assessing change over time
in cross-sectional surveys. Traditional methods of analysis assume that the class-specific
item probabilities (the class profiles) are time-constant, while allowing the proportion of
each class to change over time. However, this assumption may not hold, since social change will
undoubtedly be present. We develop a range of models and apply them to human value items
from seven sweeps of the European Social Survey for the UK. Various models of profile change
are considered. The models are linked to the ideas of measurement invariance, and this link
will be discussed. Extensions to the model can be made, allowing for non-linearity in temporal
trends. The conclusion is that human values are indeed changing over time, with change focused
on one or two specific items.
* Presented at the final meeting of the FIRB (“Futuro in ricerca” 2012) project “Mixture and latent variable
models for causal-inference and analysis of socio-economic data”, Bologna (IT), February 1-2, 2017
Mixture growth modeling of households’ investment in risky
financial assets
David Aristei
University of Perugia
[email protected]
Silvia Bacci
University of Perugia
[email protected]
Francesco Bartolucci
University of Perugia
[email protected]
Silvia Pandolfi
University of Perugia
[email protected]
Abstract
In this work we study the dynamics of households’ portfolio choices over the life cycle and we
analyze the factors affecting both financial market participation and the amount invested
in risky financial assets. To this aim, we propose a bivariate mixture latent growth model
which allows for the inclusion of heterogeneous growth trajectories by assuming the existence
of unobservable clusters (or latent classes) of households with similar behaviors in terms of
portfolio choices. We also investigate the effect of time-constant and time-varying covariates
on the response variables at household level. The approach is illustrated by the analysis of an
unbalanced panel dataset of Italian households over the 1998-2014 period. On the basis of this
dataset, we identify three latent groups characterized by heterogeneous investment behaviors
over the life cycle in terms of both asset market participation and conditional share.
Fixed-effects estimation of binary short panel data models with
predetermined covariates
Francesco Bartolucci
University of Perugia
[email protected]
Claudia Pigini
Marche Polytechnic University
[email protected]
Abstract
Strict exogeneity of covariates other than the lagged dependent variable, conditional on unobserved heterogeneity, is often required for consistent estimation of binary panel data models.
This assumption is likely to be violated in practice because of feedback effects from the past
of the outcome variable on the present value of covariates, and no general solution is yet available. We propose a novel model formulation that takes into account feedback effects without
specifying a parametric model for the predetermined explanatory variables. We further propose
estimating the model parameters with a recent fixed-effects approach based on pseudo conditional inference, thereby taking care of the correlation between individual permanent unobserved
heterogeneity and the model’s covariates as well. Our results hold for short panels with a large
number of cross-section units, a case of great interest in microeconomic applications.
Advances in mixture models for rating data
Maria Iannario
Department of Political Sciences
University of Naples Federico II
[email protected]
Domenico Piccolo
Department of Political Sciences
University of Naples Federico II
[email protected]
Rosaria Simone
Department of Political Sciences
University of Naples Federico II
[email protected]
Abstract
The talk gives an overview of recent developments in finite mixture models for ordinal data grounded on the specification of an uncertainty component: CUB models. The original rationale has
been adapted to build a more comprehensive framework that is designed to meet classical proposals as well as to be sensitive to different sources of nuisance, such as shelter effect, response
styles and overdispersion. The resulting class of mixture models has proved to be particularly
fruitful for several applications, ranging from job satisfaction to sensometric experiments, and
also for determining the effect on uncertainty of the chosen ordinal scale. In conclusion, future
methodological research is outlined: the focus is on specifying a more flexible
shape for the uncertainty component and, most importantly, on multi-item analysis to model
subjective heterogeneity.
Stochastic block models for social network data: inferential
developments
Francesco Bartolucci
University of Perugia
[email protected]
Maria Francesca Marino
University of Perugia
[email protected]
Silvia Pandolfi
University of Perugia
[email protected]
Abstract
Stochastic blockmodels have attracted growing interest in the social network
literature over the last decades. They provide a tool for discovering communities and identifying clusters of individuals characterised by similar social behaviours. In particular, they assume that units belong
to one of k distinct blocks, which are defined by a discrete latent variable. The probability of
observing a connection between two individuals depends only on the corresponding block memberships. In this framework, full maximum likelihood estimates are not achievable due to the
intractability of the likelihood function. A number of approximate solutions are available. Here,
we propose a new and more efficient approximate method for estimating model parameters
which shows great potential. The proposal is illustrated via simulations and the application
to a benchmark dataset in the social network literature.
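As a minimal illustration of the generative assumption described in this abstract (and not of the estimation method proposed in the talk), the following Python sketch simulates an undirected binary network from a stochastic block model. The function name, the block proportions and the connection probabilities are hypothetical choices made only for this example.

```python
import numpy as np

def simulate_sbm(n, block_probs, conn_probs, seed=0):
    """Simulate an undirected binary network from a stochastic block model.

    n           : number of units (nodes)
    block_probs : length-k vector of block membership probabilities
    conn_probs  : k x k symmetric matrix; entry (g, h) is the probability of
                  a tie between a unit in block g and a unit in block h
    """
    rng = np.random.default_rng(seed)
    conn_probs = np.asarray(conn_probs, dtype=float)
    # Discrete latent variable: block membership of each unit
    z = rng.choice(len(block_probs), size=n, p=block_probs)
    # The tie probability of each pair depends only on the two memberships
    p = conn_probs[z[:, None], z[None, :]]
    # Draw the strict upper triangle, then mirror it (no self-ties)
    upper = np.triu(rng.random((n, n)) < p, k=1)
    y = (upper | upper.T).astype(int)
    return z, y

# Hypothetical example: two blocks, dense within and sparse between
z, y = simulate_sbm(200, [0.5, 0.5], [[0.30, 0.02], [0.02, 0.30]])
```

The returned membership vector `z` is what an estimation method would treat as unobserved; only the adjacency matrix `y` would be available in practice.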
Generalized linear latent variable models for joint modeling in
ecology
Sara Taskinen
University of Jyväskylä
[email protected]
Jenni Niku
University of Jyväskylä
[email protected]
Francis K.C. Hui
The Australian National University
[email protected]
David I. Warton
The University of New South Wales
[email protected]
Abstract
In many ecological studies, counts or biomass of interacting species are collected from several
sites. Such data are often very sparse, high-dimensional and include highly correlated responses,
and the main aim of the statistical analysis is to understand relationships among such multiple,
correlated responses. In this talk we show how generalized linear latent variable models can be
used to analyse data common in ecological studies. By extending the standard generalized linear
modelling framework to include latent variables, we can account for any covariation between
species not accounted for by the predictors, species interactions and correlations driven by
missing covariates. Fast and efficient maximum likelihood based algorithms for fitting the models
will be discussed and simulations are used to study the finite-sample properties of the resulting
estimates. In particular, the variational approximation method is shown to perform well for
generalized linear latent variable models (GLLVMs). The method will be applied to two ecological datasets.
Equating test scores with covariates
Marie Wiberg
Department of Statistics, USBE, Umeå University, Sweden
[email protected]
Valentina Sansivieri
Department of Statistics, University of Bologna, Italy
[email protected]
Abstract
When different test forms are given to different test takers, equating is used to ensure that the
same conclusions are drawn regardless of the test forms they have been given. To compare test
forms, typically common test takers, or equivalent test takers, or common items are used in the
single group, equivalent group (EG) or non-equivalent groups with anchor test (NEAT) designs.
There are however situations when we do not have access to common test takers or common
items and the groups are non-equivalent. To improve the equating in such situations one can use
information from the test takers' covariates through the non-equivalent groups with covariates
(NEC) design. In this presentation, different approaches with the NEC design are presented, with a
focus on a new method where item response theory observed-score equating is used. In particular,
the focus is on the situation when some of the items have differential item functioning. The
newly proposed method is compared with the results from equating with either an EG design
or a NEAT design. The results show that the standard errors are lower when the information
from the covariates is used than when it is not used, as in the EG or the NEAT
designs.
Posterior predictive model checks for IRT models
Mariagiulia Matteucci
Department of Statistical Sciences
University of Bologna
[email protected]
Stefania Mignani
Department of Statistical Sciences
University of Bologna
[email protected]
Abstract
Within the framework of item response theory (IRT) models, the issue of model fit assessment
is crucial. To overcome the limitations of classical methods which are affected by the problem
of sparse data, Bayesian predictive assessment was recently introduced, where the reference
distribution is built empirically. The purpose of this study is to investigate the effectiveness
of posterior predictive model checking (PPMC) in detecting model fit when multidimensional
data are analysed with a unidimensional approach. A simulation study is conducted by using
discrepancy measures based on association or correlation among item pairs, and the use of
relative entropy (RE) based on these measures is investigated. The results show that the amount
of extreme posterior predictive p-values (PPP-values) is the most useful tool while the RE should
be used to identify potential misfit due to specific items. An application to real data is presented.
Generalized linear latent variable models for the analysis of
cognitive functioning over time
Silvia Bianconcini, Silvia Cagnone
Department of Statistical Sciences
University of Bologna
[email protected], [email protected]
Abstract
Dimensions of cognitive functioning are potentially important but often neglected determinants
of the central economic outcomes. The Health and Retirement Study and the Asset and Health
Dynamic study (HRS/AHEAD) aim at examining the impact of cognitive performance and
decline on key domains of interest (e.g., health and daily functioning, economic and health
decision making). In this paper, HRS/AHEAD data are analyzed using latent variable models
that make it possible to identify common factors of the cognitive items and analyze their dynamics over
time. The estimation of these models is cumbersome when the items are of different types, as in
our case, since the integration of the likelihood function is not analytically feasible. To overcome
this problem, we propose a new integration method, called Dimension Reduction Method, that
provides parameter estimates as accurate as commonly applied techniques, but without sharing
the same computational complexity.
Multilevel models with stochastic volatility for repeated
cross-sections
Silvia Cagnone
Department of Statistical Sciences, University of Bologna (IT)
[email protected]
Simone Giannerini
Department of Statistical Sciences, University of Bologna (IT)
[email protected]
Lucia Modugno
Department of Statistical Sciences, University of Bologna, Bologna (IT)
[email protected]
Abstract
In this work we introduce a multilevel specification with stochastic volatility for repeated cross-sectional data. Modelling the time dynamics in repeated cross sections requires a suitable
adaptation of the multilevel framework where the individuals/items are modelled at the first
level whereas the time component appears at the second level. We perform maximum likelihood estimation by means of a nonlinear state space approach combined with Gauss-Legendre
quadrature methods to approximate the likelihood function. We apply the model to the first
database of tribal art items sold in the most important auction houses worldwide. The model
properly accounts for the observed heteroscedastic and autocorrelated volatility and
has superior forecasting performance. Also, it provides valuable information on market trends
and on predictability of prices that can be used by art markets stakeholders.
Methods for integrated analysis of datasets: prediction and
interpretation
Jeanine J. Houwing-Duistermaat
University of Leeds, UK
[email protected]
Hae Won Uh
Leiden University Medical Center, NL
[email protected]
Mar Rodriguez Girondo
Leiden University Medical Center, NL
M.Rodriguez [email protected]
Abstract
Nowadays several correlated datasets are available in epidemiological studies. For example
omics datasets are measured to identify biomarkers for diseases. Interpretation of obtained
models is however hampered by high dimensionality and correlation within and across datasets.
For dimension reduction, I will present two types of integrated methods, namely PLS methods,
which decompose each dataset into a common and a noise component, and network methods, which
identify correlation structures within a dataset. The obtained variables will be used to predict
an outcome. To deal with the clustered variables from the network methods, grouped versions of
regularized regression techniques will be applied. Cross-validation will be applied to avoid
overfitting and to estimate the tuning parameters. The methods will be compared via simulations
and will be illustrated by data from an epidemiological study on life style.
Latent Space Model for Multidimensional Networks
Silvia D’Angelo
La Sapienza, Università di Roma
[email protected]
Marco Alfò
La Sapienza, Università di Roma
[email protected]
Thomas Brendan Murphy
University College Dublin
[email protected]
Abstract
A multidimensional network (multiplex) is a collection of networks for which the node set is
constant while the edge sets may vary. Such a structure can be due either to a phenomenon
changing over time or to the observation of multiple characteristics over a group of units.
We present a latent space approach to model binary multiplex data. The probability of having
a linked dyad in a network is modelled as a function of its propensity to be connected and
of the distance between its nodes in an unobserved space. A common latent space for the
whole multiplex makes it possible to study the interconnections between networks. The distances are
rescaled by a network-specific coefficient, summarizing the association among networks. We
adopt a hierarchical Bayesian approach and use MCMC inference to estimate the parameters.
An application on real data will be presented.
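The model structure described in this abstract can be sketched as follows. This is only an illustrative Python example under assumptions that the abstract does not state: a logistic link, Euclidean distance in the latent space, and hypothetical names (`tie_probabilities`, `alpha`, `beta`) for the network-specific propensity and distance coefficient. It does not reproduce the authors' hierarchical Bayesian estimation.

```python
import numpy as np

def tie_probabilities(positions, alpha, beta):
    """Tie probabilities for each network of a binary multiplex.

    positions : (n, d) latent coordinates shared by all networks
    alpha     : (K,) network-specific propensities to connect (intercepts)
    beta      : (K,) network-specific coefficients rescaling the distances
    Returns an array of shape (K, n, n).
    """
    positions = np.asarray(positions, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Pairwise Euclidean distances in the common latent space
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Assumed logistic link: closer nodes are more likely to be linked
    eta = alpha[:, None, None] - beta[:, None, None] * dist[None, :, :]
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical example: three nodes in two dimensions, two networks
pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
probs = tie_probabilities(pos, alpha=[0.0, 1.0], beta=[1.0, 0.5])
```

Because the latent positions are shared, differences between the networks in this sketch come only from the network-specific intercept and distance coefficient.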
A robust fuzzy clustering method for non-precise data based on a
trimming approach
Maria Brigida Ferraro
Sapienza University of Rome
[email protected]
Ana Belén Ramos-Guajardo
University of Oviedo
[email protected]
Abstract
In many practical situations the observed data are affected by imprecision and cannot be expressed in terms of single values. A useful approach consists in managing them by fuzzy sets,
in particular, LR fuzzy data. Most of the existing algorithms for clustering fuzzy data produce
only clusters with spherical shape because the homogeneity of the clusters is expressed in terms
of Euclidean distance. In order to relax the implicit hypothesis of spherical clusters, we
introduce a generalized distance for fuzzy data. This type of data is characterized by a complex
structure and, for this reason, there exist different kinds of contamination in this context. There
are different proposals of robust methods for clustering fuzzy data. We suggest using a trimming approach. The adequacy of our proposal is checked by means of simulation and real-case
studies.
Segmentation of sea current fields by cylindrical hidden Markov
models: a composite likelihood approach
Francesco Lagona
University of Roma Tre
[email protected]
Abstract
A hidden Markov random field is proposed for the analysis of spatial cylindrical data, i.e. bivariate spatial series of angles and intensities. The model is based on a mixture of cylindrical
densities, whose parameters vary across space according to a latent Markov field. It makes it possible to
segment the data into a finite number of latent classes, simultaneously accounting for spatial
autocorrelation, circular-linear correlation, multimodality and skewness. Due to the numerical
intractability of the likelihood function, estimation of the parameters is based on composite
likelihood methods, by introducing a computationally efficient EM algorithm that iteratively
alternates the maximization of a weighted composite likelihood function with weights updating.
In a case study of sea circulation in the Gulf of Naples, these methods segment the data
according to meaningful latent classes that represent specific conditions of the sea surface.
Conditional average treatment effect: an application related to
the partner union quality and divorce on the child’s psychological
wellbeing
Anna Garriga
Pompeu Fabra University
[email protected]
Fulvia Pennoni
University of Milano-Bicocca
[email protected]
Isabella Romeo
Istituto di ricerca Mario Negri
[email protected]
Abstract
We test the hypothesis that divorce may be a positive experience for children with parents in
high-distress unions, while the dissolution of low-distress unions may have negative effects. We
use the first three waves of the Millennium Cohort Study (MCS), a longitudinal and representative British survey, to explore the effects of parental divorce and parent relationship quality on several
children’s outcomes at age five. By using the augmented inverse propensity weighted estimator
we show that the dissolution of high-quality parental unions has the most harmful effects on children, especially on conduct problems. Among children whose parents have the highest-quality
relationships, those who experience parental separation/divorce have more conduct problems
than children with a stable family. Our findings indicate that early childhood programs and
interventions should particularly target children whose parents report a high-quality relationship
before parental separation since they are especially at risk.
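For readers unfamiliar with the augmented inverse propensity weighted (AIPW) estimator named in this abstract, a minimal Python sketch of its standard form for an average treatment effect is given below. It assumes that propensity scores and outcome-model predictions have already been estimated elsewhere; the function and argument names are hypothetical, and the sketch is not the authors' implementation. Applying it within subgroups (for example, by level of relationship quality) would give subgroup-specific estimates.

```python
import numpy as np

def aipw_ate(y, t, e_hat, mu1_hat, mu0_hat):
    """Augmented inverse propensity weighted estimate of the average treatment effect.

    y        : observed outcomes
    t        : binary treatment indicator (1 = treated, 0 = control)
    e_hat    : estimated propensity scores P(T = 1 | X)
    mu1_hat  : predicted outcomes under treatment, E[Y | T = 1, X]
    mu0_hat  : predicted outcomes under control,   E[Y | T = 0, X]
    """
    y, t, e_hat, mu1_hat, mu0_hat = (
        np.asarray(a, dtype=float) for a in (y, t, e_hat, mu1_hat, mu0_hat)
    )
    # Outcome-model prediction plus an inverse-probability-weighted residual correction
    psi1 = mu1_hat + t * (y - mu1_hat) / e_hat
    psi0 = mu0_hat + (1 - t) * (y - mu0_hat) / (1 - e_hat)
    return float(np.mean(psi1 - psi0))
```

The estimator combines an outcome model with a propensity model and remains consistent if either one of the two is correctly specified, which is why it is often described as doubly robust.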
A multivariate multilevel model to analyze educational
achievement in Reading, Mathematics and Science in Italy
Leonardo Grilli
University of Florence
[email protected]
Fulvia Pennoni
University of Milano-Bicocca
[email protected]
Carla Rampichini
University of Florence
[email protected]
Isabella Romeo
Istituto di ricerca Mario Negri
[email protected]
Abstract
We illustrate a multivariate multilevel analysis in a complex setting of large-scale assessment
surveys, dealing with plausible values and accounting for the survey design. We analyse the
Italian sample of the TIMSS&PIRLS 2011 Combined International Database on fourth grade students. We jointly consider educational achievement in Reading, Mathematics and Science, so that
we can test for differential associations of the covariates with the three response variables and estimate the residual correlations among pairs of responses within and between classes. Multilevel
modelling allows us to disentangle student and contextual factors affecting achievement. We also
account for territorial differences in wealth by means of an index from an external data source.
The model residuals point out classes with high or low performance. As educational achievement is measured by plausible values, the estimates are obtained through multiple imputation
formulas.
A comparison between two statistical models to analyse and
predict individual changes over time
Fulvia Pennoni
University of Milano-Bicocca
[email protected]
Isabella Romeo
Istituto di ricerca Mario Negri
[email protected]
Abstract
The latent Markov model and the growth mixture model for longitudinal data are compared
when the ordinal nature of the response variable is of interest. The latent Markov model is
based on time-varying latent variables to explain the observable behavior of the individuals. It is
proposed in a semiparametric formulation as the latent process has a discrete distribution and is
characterized by a Markov structure. The growth mixture model is based on a latent categorical
variable that accounts for the unobserved heterogeneity in the observed trajectories and on a
mixture of Gaussian random variables to account for the variability in the growth factors. We
refer to a real data example on self-reported health status to illustrate their peculiarities and
differences.
Multilevel cluster-weighted models for the evaluation of hospitals
Paolo Berta
University of Milano-Bicocca
[email protected]
Salvatore Ingrassia
University of Catania
[email protected]
Antonio Punzo
University of Catania
[email protected]
Giorgio Vittadini
University of Milano-Bicocca
[email protected]
Abstract
In recent years, increasing attention has been directed toward problems inherent to quality control in healthcare services. In particular, it is necessary to measure effectiveness with respect
to improving healthcare outcomes of diagnostic procedures or specific treatment episodes. The
performance of hospitals is usually evaluated by multilevel models and other methods for risk
adjustment. However, these approaches are not suitable for data with large unobserved heterogeneity. A potentially large source of unobserved heterogeneity comes from the variation of the
regression coefficients between groups of individuals sharing similar but unobserved characteristics. To overcome such drawbacks, we propose the multilevel cluster-weighted model, a new
mixture model approach for handling hierarchical data.
Structural Equation Models in a Redundancy Analysis
Framework with observed covariates
Pietro Giorgio Lovaglio
Dept. of statistics and quantitative methods, University of Bicocca-Milan
[email protected]
Giorgio Vittadini
Dept. of statistics and quantitative methods, University of Bicocca-Milan
[email protected]
Abstract
A recent method to specify and fit structural equation modelling in the Redundancy Analysis
framework based on so-called Extended Redundancy Analysis has been proposed in the literature. In this approach, the relationships between the observed exogenous variables and the
observed endogenous variables are moderated by the presence of unobservable composites, estimated as linear combinations of exogenous variables. However, in the presence of direct effects
linking exogenous and endogenous variables, or concomitant indicators, the composite scores
are estimated by ignoring the presence of the specified direct effects. To fit structural-equation
models, we propose a new specification and estimation method, called Generalized Redundancy
Analysis, allowing us to specify and fit a variety of relationships among composites, endogenous
variables and external covariates. The proposed methodology extends the Extended Redundancy
Analysis method, using a more suitable specification and estimation algorithm, by allowing for
covariates that affect endogenous indicators indirectly through the composites and/or directly.
To illustrate the advantages of GRA over ERA we present a simulation study with small samples.
Moreover, we propose an application aimed at estimating the impact of formal human capital on
the initial earnings of graduates of an Italian university, utilizing a structural model consistent
with well-established economic theory.
Dealing with unobserved heterogeneity in return to education
Michele Battisti
University of Palermo
[email protected]
Salvatore Ingrassia
University of Catania
[email protected]
Angelo Mazza
University of Catania
[email protected]
Antonio Punzo
University of Catania
[email protected]
Abstract
The Mincer earnings function is a regression model that explains earnings as a convenient function of schooling and experience. The model has been examined on many datasets and it is
one of the most widely used models in empirical economics. In many circumstances, however,
due to unobserved heterogeneity, a single Mincer’s regression is inadequate. Moreover, whatever
(concomitant) information is available about the nature of such a heterogeneity should be incorporated in an appropriate manner. Motivated by these considerations, we propose a mixture of
Mincer’s models with concomitant variables: it simultaneously provides a flexible generalization
of the Mincer model, a breakdown of the population into several homogeneous subpopulations,
and an explanation of the unobserved heterogeneity. The proposal is motivated and illustrated
via an application to data provided by the Bank of Italy’s Survey of Household Income and
Wealth in 2012.
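To make the baseline concrete, the sketch below fits a single Mincer earnings function by ordinary least squares in Python, assuming the usual specification in which log earnings are linear in years of schooling and quadratic in experience. The function name and variable names are hypothetical. The mixture with concomitant variables proposed in the abstract would, loosely speaking, combine several such regressions, one per latent subpopulation; that extension is not shown here.

```python
import numpy as np

def fit_mincer(log_earnings, schooling, experience):
    """Ordinary least squares fit of a single Mincer earnings function:
    log(earnings) = b0 + b1 * schooling + b2 * experience + b3 * experience**2.
    """
    s = np.asarray(schooling, dtype=float)
    x = np.asarray(experience, dtype=float)
    # Design matrix: intercept, schooling, experience, experience squared
    design = np.column_stack([np.ones_like(s), s, x, x ** 2])
    coef, *_ = np.linalg.lstsq(design, np.asarray(log_earnings, dtype=float), rcond=None)
    return coef  # [b0, b1, b2, b3]
```

In this specification the coefficient `b1` is commonly read as the approximate proportional return to one additional year of schooling.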
Clustered Heterogeneity:
Fixed Effects, Random Effects and Mixtures
Gerhard Tutz
Ludwig-Maximilians-Universität München
[email protected]
Abstract
A classical approach to explicitly model heterogeneity is to use random effects models, which
assume that heterogeneity can be described by distributional assumptions. A drawback of
random effects models is that inference may depend on the assumed mixing distribution and,
more seriously, the model assumes that the random effects and the observed covariates are
independent. In the fixed effects models considered here it is assumed that there are clusters of
units that share the same effect on the response. The objective is to identify the clusters
and estimate the effects of covariates on the response. We consider two strategies. The first
uses regularization techniques. The proposed tailored penalization methods make it possible to identify the
clusters of units. The second strategy is based on recursive partitioning (or tree-based) methods.
It is useful especially if the number of units is very large.
Bayesian Methods for Principal Stratification and Causal
Mediation Analysis with Multiple Intermediates: Evaluating
Power Plant Regulatory Interventions
Corwin Zigler
Harvard T.H. Chan School of Public Health
[email protected]
Abstract
This talk outlines methods to evaluate the extent to which the effect of a power plant regulatory
intervention on air pollution is mediated through effects on power plant emissions. Power plants
emit various compounds that contribute to ambient pollution, necessitating new methods to accommodate multiple intermediating factors that are measured contemporaneously. We leverage
two related frameworks for causal inference in the presence of mediating variables: principal
stratification and causal mediation analysis. Both approaches are anchored to the exact same
model for the observed data, which we specify with flexible Bayesian nonparametric techniques.
The principal stratification and causal mediation analyses are interpreted in tandem to provide
the first empirical investigation of the presumed causal pathways that motivate a variety of air
quality regulatory policies.
Evaluation of student performance through a multidimensional
finite mixture IRT model
Silvia Bacci, Francesco Bartolucci
Department of Economics
University of Perugia
[email protected], [email protected]
Leonardo Grilli, Carla Rampichini
Department of Statistics, Computer Science, Applications “G. Parenti”
University of Florence
[email protected], [email protected]
Abstract
In the Italian academic system, a student can enroll for an exam immediately after the end
of the teaching period or can postpone it; in this second case the exam result is missing. We
propose an approach for the in itinere (ongoing) evaluation of a student's performance that also accounts for
non-attempted exams. The approach is based on an Item Response Theory model that includes
two discrete latent variables representing student performance and priority in selecting the exams to take. We explicitly account for non-ignorable missing observations as the indicators of
attempted exams also contribute to measuring the performance (within-item multidimensionality).
The model, which is fitted by an EM algorithm, allows for individual covariates in its structural
part. The analysis is carried out on freshmen enrolled in the academic year 2011/2012 in two degree
programmes of the School of Economics of the University of Florence.
Estimating Treatment and Spillover Effects in Observational
Social Network Data using Generalized Propensity Scores∗
Laura Forastiere
University of Florence
[email protected]
Fabrizia Mealli
University of Florence
[email protected]
Edo Airoldi
Harvard University
[email protected]
Abstract
In the presence of interference, potential outcomes have to be defined based on both the individual treatment and the vector of treatments received by the units interacting with the individual (e.g.,
friends or neighbors). In observational studies, a further issue is that when the treatment of
individuals spills over to neighboring individuals along an underlying social network, the usually invoked unconfoundedness assumption involves both the individual and the neighborhood
treatment. Therefore, proper covariate-adjustment methods are needed to make this kind of
assumption more plausible. Formal theory for designing and analyzing observational studies
with units organized in a network is still in its infancy. For this purpose, we define a generalized propensity score that balances individual and neighborhood covariates across units under
different levels of such a bivariate treatment.
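
As a minimal sketch of the bivariate treatment mentioned in this abstract, the following Python snippet pairs each unit's own treatment with a summary of the treatments received by its neighbors. Summarizing the neighbors' treatment vector by the proportion of treated neighbors, the toy network, and the names `neighborhood_treatment` and `bivariate_treatment` are illustrative assumptions, not the authors' specification.

```python
from typing import Dict, List, Tuple


def neighborhood_treatment(
    treatment: Dict[str, int], neighbors: Dict[str, List[str]]
) -> Dict[str, float]:
    """Proportion of treated neighbors for each unit (0.0 if a unit has none)."""
    result = {}
    for unit in treatment:
        nbrs = neighbors.get(unit, [])
        # Assumed summary of the neighbors' treatment vector: share treated.
        result[unit] = sum(treatment[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
    return result


def bivariate_treatment(
    treatment: Dict[str, int], neighbors: Dict[str, List[str]]
) -> Dict[str, Tuple[int, float]]:
    """Pair each unit's individual treatment with its neighborhood treatment."""
    g = neighborhood_treatment(treatment, neighbors)
    return {unit: (treatment[unit], g[unit]) for unit in treatment}


# Hypothetical toy network: a - b - c, with d isolated.
treatment = {"a": 1, "b": 0, "c": 1, "d": 0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
exposure = bivariate_treatment(treatment, neighbors)
```

Under this assumed summary, a generalized propensity score would model the joint probability of each (individual, neighborhood) treatment pair given individual and neighborhood covariates.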
Causal inference for multiple binary non-independent outcomes∗
Monia Lupparelli
University of Bologna
[email protected]
Alessandra Mattei
University of Florence
[email protected]
Abstract
We focus on drawing causal inference on multiple non-independent binary outcomes using the
potential outcome approach. We define causal effects of treatment on joint outcomes introducing
the notion of product outcomes, and we provide a decomposition of the causal effect on product
outcomes into intrinsic and extrinsic causal effects, which, respectively, provide information
on the treatment effect on the intrinsic (product) structure of the product outcomes and on the
outcomes’ dependence structure. We propose a log-mean linear regression approach for modeling
potential outcomes, such that all the causal estimands of interest can be easily derived from model
parameters. The method is applied in two randomized studies concerning (i) the effect of pre-surgery administration of oral morphine on pain intensity after surgery; and (ii) the effect of
honey on nocturnal cough and sleep quality for coughing children.
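
To illustrate the idea of a product outcome, the sketch below forms the product of two binary outcomes for each unit and compares its mean across randomized arms. The records are hypothetical, and the simple difference in means is an assumption made for illustration only; it is not the log-mean linear regression approach proposed in the abstract, and it does not reproduce the intrinsic/extrinsic decomposition.

```python
from typing import List, Tuple


def product_outcome(y1: int, y2: int) -> int:
    """Product of two binary outcomes: 1 only when both outcomes equal 1."""
    return y1 * y2


def mean_product_by_arm(data: List[Tuple[int, int, int]]) -> Tuple[float, float]:
    """Return (mean product outcome among treated, among controls).

    Each record is (treatment, y1, y2) with all entries in {0, 1}.
    """
    treated = [product_outcome(y1, y2) for t, y1, y2 in data if t == 1]
    control = [product_outcome(y1, y2) for t, y1, y2 in data if t == 0]
    return sum(treated) / len(treated), sum(control) / len(control)


# Hypothetical records: (treatment, outcome 1, outcome 2).
records = [
    (1, 1, 1), (1, 1, 0), (1, 1, 1), (1, 0, 1),
    (0, 1, 0), (0, 0, 0), (0, 1, 1), (0, 0, 1),
]
p_treated, p_control = mean_product_by_arm(records)
# Difference in the probability that both outcomes occur, treated vs. control.
effect_on_joint = p_treated - p_control
```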