Correct Specification in Bayesian Hierarchical Models

Jayaram Sethuraman
Department of Statistics
Florida State University and University of South Carolina
[email protected]
October 21, 2008

Summary

Specification of Probability Distributions
Specification of Bayesian Models
Usual Specification of Hierarchical Bayesian Models
Correct Specification of Hierarchical Bayesian Models

Specification of Probability Distributions - I

Data X: it can be real data, multivariate data, continuous data, etc.
Notation for the distribution of X: L(X).
When X is real, this distribution can be specified by a distribution function.
Bivariate data (X, Y) (X and Y can be multivariate): its distribution is correctly specified by a marginal and an appropriate conditional: L(Y) and L(X | Y).
Incorrect way to specify the distribution of (X, Y): L(Y) and L(Y | X), that is, a marginal and an inappropriate conditional.
This problem has been studied extensively and is a whole special topic by itself.
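
A minimal sketch of the point above, on a toy discrete case. The two laws for X (x_law_a, x_law_b) and all function names are hypothetical, chosen only for illustration. Both joints are built correctly from L(Y) and L(X | Y); they nevertheless share the same L(Y) and the same L(Y | X), so that pair cannot identify the joint.

```python
from fractions import Fraction as F

def joint_from(p_y, p_x_given_y):
    """Joint P(X=x, Y=y) = P(Y=y) * P(X=x | Y=y), keyed by (x, y)."""
    return {(x, y): p_y[y] * p_x_given_y[y][x]
            for y in p_y for x in p_x_given_y[y]}

def marginal_y(joint):
    """Marginal L(Y) recovered from the joint."""
    out = {}
    for (x, y), p in joint.items():
        out[y] = out.get(y, 0) + p
    return out

def y_given_x(joint):
    """Conditional L(Y | X) recovered from the joint, keyed by x and then y."""
    p_x = {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0) + p
    return {x: {y: joint[(x, y)] / p_x[x] for (x2, y) in joint if x2 == x}
            for x in p_x}

# L(Y): a fair coin.
p_y = {0: F(1, 2), 1: F(1, 2)}

# Two hypothetical choices of L(X | Y); in both, X does not depend on Y.
x_law_a = {0: F(4, 5), 1: F(1, 5)}
x_law_b = {0: F(3, 10), 1: F(7, 10)}
joint_a = joint_from(p_y, {0: x_law_a, 1: x_law_a})
joint_b = joint_from(p_y, {0: x_law_b, 1: x_law_b})

same_marginal = marginal_y(joint_a) == marginal_y(joint_b)    # same L(Y)
same_conditional = y_given_x(joint_a) == y_given_x(joint_b)   # same L(Y | X)
same_joint = joint_a == joint_b                               # different joints
```

Exact fractions are used so that the equality checks are not affected by floating-point rounding.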
Specification of Probability Distributions - III

The distribution of (X, Y, Z) is fully described by L(Z), L(Y | Z) and L(X | Y, Z), and not, for instance, by L(Y), L(Y | Z) and L(X | Z) or other inappropriate conditional or marginal distributions.

Specification of Bayesian Models

Data Y is modeled by a distribution depending on a parameter θ:
L(Y | θ) ∼ p(y | θ),
where p(y | θ) is the probability density function (pdf) of Y given θ.
Let the prior distribution of θ be given by L(θ) ∼ q(θ).
Then the joint distribution is given by p(y | θ) q(θ), and all Bayesian analyses begin from this point.
For instance, the posterior distribution of θ given the data y is
L(θ | Y = y) ∝ p(y | θ) q(θ).
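
As an illustration of the relation L(θ | Y = y) ∝ p(y | θ) q(θ), here is a small sketch that normalizes the product on a grid of θ values. The binomial data model, the uniform prior, and the numbers (7 successes in 10 trials) are assumptions made for the example and are not part of the slides.

```python
from math import comb

def grid_posterior(y, n, grid, prior):
    """Posterior weights on a grid: proportional to p(y | theta) * q(theta)."""
    unnormalized = [comb(n, y) * t**y * (1 - t)**(n - y) * q
                    for t, q in zip(grid, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Grid of theta values in [0, 1] and a uniform prior q(theta) on that grid.
grid = [i / 1000 for i in range(1001)]
prior = [1 / len(grid)] * len(grid)

# Hypothetical data: y = 7 successes out of n = 10 trials.
post = grid_posterior(7, 10, grid, prior)
post_mean = sum(t * w for t, w in zip(grid, post))
```

With a uniform prior the exact posterior is Beta(y + 1, n - y + 1), so post_mean should be close to (y + 1) / (n + 2) = 8/12.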
Usual Specification of Hierarchical Bayesian Models - I

In a simple but typical Bayes hierarchical model, one says that in addition to the data Y and parameter θ, there is also a hyperparameter δ. (Do not forget L(Y | θ) = p(y | θ).)
Furthermore, the distribution of θ given δ is L(θ | δ) = q*(θ | δ) and the distribution of δ is L(δ) = r(δ). ♦

Usual Specification of Hierarchical Bayesian Models - II

♦ We also immediately write down the joint distribution as
L(Y, θ, δ) = p(y | θ) q*(θ | δ) r(δ)
and say that the posterior distributions are
L(θ | Y, δ) ∝ p(y | θ) q*(θ | δ) r(δ) ∝ p(y | θ) q*(θ | δ)
and
L(δ | Y, θ) ∝ p(y | θ) q*(θ | δ) r(δ) ∝ q*(θ | δ) r(δ).
This is completely wrong.
Usual Specification of Hierarchical Bayesian Models - III

What went wrong?
Incomplete or incorrect specification of the joint distribution of Y, θ, δ. So, the claims about the posterior distributions are incorrect.

Correct Specification of Hierarchical Bayesian Models - I

It is fine to go on and assume, as before, the following about the parameter θ and the hyperparameter δ:
L(θ | δ) = q*(θ | δ) and L(δ) = r(δ).
This will give the joint distribution of the parameter and the hyperparameter.
This should be tied up with a model for the distribution of the data Y, namely one should specify L(Y | θ, δ) = p*(y | θ, δ).
In other words, all hyperparameters introduced (usually at the end, and at will, and with abandon) should be tied up to the model describing the data to produce a joint distribution.
Are there any published papers that do not do this?

Correct Specification of Hierarchical Bayesian Models - II

One way out of this quandary is to introduce the joint distribution of the parameter and hyperparameter as before, as q*(θ | δ) r(δ), and to define the model for the data as
L(Y | θ, δ) = L(Y | θ) ∼ p**(y | θ),
requiring it to depend only on the parameter θ and not on the hyperparameter δ.
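
To see why L(Y | θ), L(θ | δ) and L(δ) alone do not pin down the joint distribution, here is a toy sketch with binary Y, θ and δ. All numbers and function names are hypothetical. Both joints below are built from the full chain L(δ), L(θ | δ), L(Y | θ, δ); they agree on L(δ), L(θ | δ) and L(Y | θ), yet only the first one has L(Y | θ, δ) = L(Y | θ), and only for it does L(δ | Y, θ) ∝ q*(θ | δ) r(δ) hold.

```python
from fractions import Fraction as F
from itertools import product

HALF = F(1, 2)

def build_joint(p_y1):
    """Joint over (y, theta, delta) = r(delta) * q*(theta | delta) * p*(y | theta, delta).

    r and q* are both fair coin flips here (an assumption for the example);
    p_y1[(theta, delta)] is P(Y = 1 | theta, delta).
    """
    joint = {}
    for y, theta, delta in product((0, 1), repeat=3):
        p = p_y1[(theta, delta)]
        joint[(y, theta, delta)] = HALF * HALF * (p if y == 1 else 1 - p)
    return joint

def prob(joint, event):
    """Total probability of the outcomes (y, theta, delta) for which event is true."""
    return sum(p for key, p in joint.items() if event(*key))

def y1_given_theta(joint, theta):
    """P(Y = 1 | theta), with delta summed out."""
    return (prob(joint, lambda y, t, d: y == 1 and t == theta)
            / prob(joint, lambda y, t, d: t == theta))

def delta1_given_y1_theta(joint, theta):
    """P(delta = 1 | Y = 1, theta)."""
    return (prob(joint, lambda y, t, d: d == 1 and y == 1 and t == theta)
            / prob(joint, lambda y, t, d: y == 1 and t == theta))

pairs = list(product((0, 1), repeat=2))
# Joint A: Y depends on neither theta nor delta.
joint_a = build_joint({(t, d): HALF for t, d in pairs})
# Joint B: Y depends on delta directly, even when theta is given.
joint_b = build_joint({(t, d): (F(9, 10) if d == 1 else F(1, 10)) for t, d in pairs})

# Same L(Y | theta) in both joints ...
same_y_given_theta = all(
    y1_given_theta(joint_a, t) == y1_given_theta(joint_b, t) for t in (0, 1))
# ... but different L(delta | Y, theta).
post_a = delta1_given_y1_theta(joint_a, 0)
post_b = delta1_given_y1_theta(joint_b, 0)
```

Here q*(θ | δ) r(δ) normalizes to 1/2 at δ = 1, which matches post_a but not post_b.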
Correct Specification of Hierarchical Bayesian Models - III

In that case, that is, when the data model depends only on θ and not on δ, the joint distribution of the quantities involved becomes
p**(y | θ) q*(θ | δ) r(δ)
and one can obtain the full conditional distributions of θ and δ to perform MCMC:
L(θ | Y, δ) ∝ p**(y | θ) q*(θ | δ) r(δ) ∝ p**(y | θ) q*(θ | δ)
and
L(δ | Y, θ) ∝ p**(y | θ) q*(θ | δ) r(δ) ∝ q*(θ | δ) r(δ).
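
The two full conditionals above can be turned into a Gibbs sampler. The sketch below assumes a specific normal model that is not in the slides: p**(y | θ) = N(θ, 1), q*(θ | δ) = N(δ, 1) and r(δ) = N(0, 1), with a single observation y. Under these assumptions, completing the square gives L(θ | Y, δ) = N((y + δ)/2, 1/2) and L(δ | Y, θ) = N(θ/2, 1/2). The function name gibbs and its arguments are hypothetical.

```python
import random

def gibbs(y, n_iter=20000, burn_in=1000, seed=0):
    """Gibbs sampler for the assumed normal hierarchy; returns (theta, delta) draws."""
    rng = random.Random(seed)
    sd = 0.5 ** 0.5          # both full conditionals have variance 1/2
    theta, delta = 0.0, 0.0  # arbitrary starting values
    draws = []
    for i in range(n_iter + burn_in):
        # theta | y, delta is proportional to p**(y | theta) q*(theta | delta)
        theta = rng.gauss((y + delta) / 2, sd)
        # delta | y, theta is proportional to q*(theta | delta) r(delta); y drops out
        delta = rng.gauss(theta / 2, sd)
        if i >= burn_in:
            draws.append((theta, delta))
    return draws

# Hypothetical single observation.
draws = gibbs(3.0)
theta_mean = sum(t for t, _ in draws) / len(draws)
delta_mean = sum(d for _, d in draws) / len(draws)
```

For this assumed model the posterior means are available in closed form, E(θ | y) = 2y/3 and E(δ | y) = y/3, so with y = 3 the sample means should be near 2 and 1.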