Partition Based Nonparametric Priors
Partition based priors to analyze data arising from failure models and censoring models

Jayaram Sethuraman
Department of Statistics
Florida State University and University of South Carolina
[email protected]
October 21, 2008

Summary
- Failure and repair models
- Censoring models
- Partition based (PB) priors, introduced by Sethuraman and Hollander (2008), turn out to be the natural priors for analyzing both these kinds of data.

Failure and repair models - I
Here X1, X2, ... are dependent random variables whose distribution depends on a pm P, and we want to estimate P. For any set A, let P_A denote the restriction of P to A. The general repair model postulates that there are some environmental random variables Y1, Y2, ..., and

  X1 | P ~ P
  X2 | (X1, Y1, P) ~ P_{A1}
  X3 | (X1, X2, Y1, Y2, P) ~ P_{A2}
  ...

where An is a set that depends on X1, ..., Xn, Y1, ..., Yn for n = 1, 2, ....

Failure and repair models - III
For example, if A_{X1} = (X1, ∞), the distribution of X2 is that obtained by minimal repair, and if A_{X1} = (0, ∞), the distribution of X2 is P (new component).
Fix a number p in (0, 1). Let the environmental variable Y1 ~ Uniform[0, 1], independent of X1. Let

  A_{X1} = (0, ∞)   if Y1 ≤ p,
  A_{X1} = (X1, ∞)  if Y1 > p.

This is the Brown-Proschan model of randomized minimal repair. Many other repair models fall under this general repair model. Repair models can also be viewed as search models.

Failure and repair models - IV
A Bayesian will put a prior distribution on P. He will also assume that the distribution of Yn given X1, ..., Xn, Y1, ..., Yn−1 is independent of P, n = 1, 2, .... This leads to a nice simplification: we can assume that Y1, Y2, ... are constants.
Censoring Models - I
Here X1, X2, ... are dependent random variables (the actual censored and uncensored observations) whose distribution depends on a pm P, and we want to estimate P. The general censoring model postulates that there are potential observations X1*, X2*, ..., which are i.i.d. P, and censoring (environmental) variables Y1, Y2, .... However, only Z1 = min(X1*, Y1), Z2 = min(X2*, Y2), ... are observed. Also observed are

  X1 | (Y1, P)          = X1*  if X1* ∈ A1^c,  and = A1  if X1* ∈ A1,
  X2 | (X1, Y1, Y2, P)  = X2*  if X2* ∈ A2^c,  and = A2  if X2* ∈ A2,
  ...

where A1 = (Z1, ∞), A2 = (Z2, ∞), ....

Censoring Models - II
For left censoring, we define Zi = max(Xi*, Yi) and take Ai = [0, Zi) in the model above. Similarly, by taking An to be a set depending on X1, ..., Xn−1, Y1, ..., Yn for n = 1, 2, ... in the model above, we can simultaneously allow for all kinds of censoring and also allow for dependence on previous X* observations. This general censoring model can also be used in search models. As before, a Bayesian will put a prior distribution on P. He will also assume that the distribution of Yn given X1, ..., Xn, Y1, ..., Yn−1 is independent of P, n = 1, 2, .... This leads to the nice simplification: we can assume that Y1, Y2, ... are constants.
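The right-censoring scheme Zi = min(Xi*, Yi) can be sketched as follows; the exponential lifetimes and censoring times (hypothetical rates lam_x and lam_y) are illustrative choices, not part of the model.

```python
import math
import random

def right_censored_sample(n, lam_x=0.12, lam_y=0.05, seed=0):
    """Generate n right-censored observations: potential lifetimes
    X* ~ Exp(lam_x) i.i.d. P, censoring times Y ~ Exp(lam_y); we observe
    Z = min(X*, Y) and whether X* was censored (i.e. X* in A = (Z, inf))."""
    random.seed(seed)
    data = []
    for _ in range(n):
        x_star = -math.log(1.0 - random.random()) / lam_x
        y = -math.log(1.0 - random.random()) / lam_y
        data.append((min(x_star, y), x_star > y))   # (Z, censored?)
    return data
```

For an uncensored point the value X* = Z is known exactly; for a censored point we only learn the set A = (Z, ∞), which is exactly the information the PB posterior theorems below consume.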
Partition Based Prior Distributions
The models described earlier for dependent data X1, X2, ... lead us, in a natural fashion, to partition based (PB) priors. These were introduced in Sethuraman and Hollander (2008). Let B = (B1, ..., Bm) be a finite partition of X. Let P(B) = (P(B1), ..., P(Bm)) be the partition probability vector, and let (P_{B1}, ..., P_{Bm}) be the vector of restricted pm's. There is a one-to-one correspondence between pm's P and pairs (P(B), (P_{B1}, ..., P_{Bm})), since

  P(A) = Σ_{i=1}^m P(Bi) P_{Bi}(A).

The pair (P(B), (P_{B1}, ..., P_{Bm})) is just another way to understand a pm P.

Partition based Priors
Suppose that P(B), P_{B1}, ..., P_{Bm} are independent, with P(B) having a pdf proportional to h(y) and P_{Bi} having a distribution Gi, i = 1, ..., m. Then we say that P has a partition based (PB) prior distribution H(B, h, G), where G = G1 × · · · × Gm. The Gi's are probability measures on the space of probability measures, and there are m of them. Hopefully, we will use standard nonparametric priors for the Gi, i = 1, ..., m, so that we also know the corresponding posterior distributions Gi^X.
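On a finite space the decomposition of P into (P(B), (P_{B1}, ..., P_{Bm})) can be made completely concrete. The sketch below samples from a toy PB prior in which P(B) and the restricted pm's are independent Dirichlet vectors; the partition and parameter values are hypothetical choices for illustration only.

```python
import random

def dirichlet(alphas):
    """Sample a Dirichlet vector via normalized Gamma draws."""
    g = [random.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def sample_pb_prior(blocks, weight_alphas, within_alphas, seed=0):
    """Sketch of a PB prior on a finite space: the partition probability
    vector P(B) and the restricted pm's P_{B_i} are drawn independently,
    then reassembled into a single pm P via
    P({x}) = P(B_i) * P_{B_i}({x}) for x in B_i.
    `blocks` lists the points of each B_i (hypothetical toy setup)."""
    random.seed(seed)
    w = dirichlet(weight_alphas)                 # P(B) = (P(B_1),...,P(B_m))
    P = {}
    for wi, block, alphas in zip(w, blocks, within_alphas):
        q = dirichlet(alphas)                    # restricted pm P_{B_i}
        for point, qi in zip(block, q):
            P[point] = wi * qi
    return P
```

Reassembling always yields a genuine pm: the block weights sum to one and each restricted pm sums to one within its block.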
PB Dirichlet Priors
Let α be a finite non-zero measure on X and take Gi to be D_{α_{Bi}}, the Dirichlet prior with parameter α restricted to Bi, for i = 1, ..., m. In this case, we call the PB prior H(B, h, G) a partition based (PB) Dirichlet prior and denote it by D(B, h, α). If B* is a subpartition of B, then a PB Dirichlet prior on B is also a PB Dirichlet prior on B*. A PB Dirichlet prior D(B, h, α) is the Dirichlet prior Dα if and only if

  h(y) ∝ y1^{α(B1)−1} · · · ym^{α(Bm)−1}.

Posterior distributions in the standard nonparametric Bayesian problem
For any nonparametric prior G, let G^X be the posterior distribution in the standard nonparametric Bayesian problem. That is, the unknown pm P has prior distribution G, the data X given P has distribution P, and G^X is simply notation for the posterior distribution of P given X.

Posterior distributions of PB priors with repair data - 1
Consider the problem where P has the PB prior H(B, h, G) and the data X given P has distribution P_A. Suppose that A = ∪_{i=1}^r Bi, and define yA = Σ_{i=1}^r yi. Let R be the random index such that X ∈ B_R; note that R ∈ {1, ..., r}.

Posterior distributions of PB priors with repair data - 2
Theorem. The posterior distribution of P given X is H(B, h^X, G^X), where h^X(y) ∝ h(y) · yR/yA and G^X = G1 × · · · × G_{R−1} × G_R^X × G_{R+1} × · · · × Gm.

Posterior distributions of PB priors with repair data - 3
Suppose that P has the PB Dirichlet distribution D(B, h, α). Given P, let the data X have distribution P_A where, as before, A = ∪_{i=1}^r Bi. Then the posterior distribution of P given X is D(B, h^X, α + δX). The condition A = ∪_{i=1}^r Bi is not required when the prior is a PB Dirichlet prior; the finite partition B can be enlarged to a new partition B* by including A;
the posterior distribution will still be a PB Dirichlet distribution on the partition B*. Even more: the partition B can be enlarged with the random restriction sets based on all of the data, and a blind application of the method above will still give the correct answer. See Sethuraman and Hollander (2008).

Illustration: a Dirichlet Prior as a PB prior
As an illustration, we will write a Dirichlet prior as a partition based prior on a partition determined by the data and compute the posterior; we will see that we still obtain the correct answer. Suppose that P ~ Dα and X | P ~ P. Let A = (X−, X+) and let α(A) > 0. Let B = {A, A^c}. Then rewrite Dα = H(B, h, Dα) with

  h(y) ∝ y1^{α(A)−1} y2^{α(A^c)−1},

and a blind application of the main theorem shows that the posterior is H(B, h^X, Dα+δX), with

  h^X(y) ∝ y1^{α(A)} y2^{α(A^c)−1},

which luckily reduces to the correct answer, namely Dα+δX.
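Rewriting Dα as the PB prior H(B, h, Dα) rests on the standard Dirichlet decomposition property: the block weights and the restricted pm's of a Dirichlet process are independent, with the matching h. This can be checked by simulation. The sketch below is a Monte Carlo comparison on a toy 3-point space with B = ({1}, {2, 3}); the parameters (2, 3, 5) are arbitrary.

```python
import random

def dirichlet(alphas):
    """Sample a Dirichlet vector via normalized Gamma draws."""
    g = [random.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def compare_constructions(n=200_000, seed=0):
    """Monte Carlo check (toy 3-point space, B = ({1},{2,3})) that the PB
    construction with h proportional to y1^(a1-1) * y2^(a2+a3-1) and
    restricted pm P_{B2} ~ Dir(a2, a3) reproduces the plain Dir(a1, a2, a3):
    compare the component means under both constructions."""
    random.seed(seed)
    a1, a2, a3 = 2.0, 3.0, 5.0
    direct = [0.0, 0.0, 0.0]
    pb = [0.0, 0.0, 0.0]
    for _ in range(n):
        p = dirichlet([a1, a2, a3])          # direct Dirichlet draw
        y = dirichlet([a1, a2 + a3])         # P(B) drawn with the matching h
        q = dirichlet([a2, a3])              # restricted pm on B2, independent
        sample = [y[0], y[1] * q[0], y[1] * q[1]]
        for i in range(3):
            direct[i] += p[i] / n
            pb[i] += sample[i] / n
    return direct, pb
```

Both estimates should be close to the exact Dirichlet means (0.2, 0.3, 0.5).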
Bayes and Frequentist Estimates of the Survival Function
[Figure: Bayes and W-S estimates of the survival probability, plotted against failure age (hours), for the AC data set (small) of Whittaker-Samaniego.]

PB Priors and Censored Data - 1
1. Suppose that, given P, the potential data X1, X2, ... are i.i.d. P.
2. The potential data may be censored, i.e. there are sets A1, A2, ..., and we know the value of Xi if Xi ∈ Ai^c; otherwise we say that Xi is censored and we only know that Xi ∈ Ai.
3. Here the set An can depend on (X1, Y1, ..., Xn−1, Yn−1, Yn), where Y1, Y2, ... are the censoring variables.
4. The distribution of Yn given X1, ..., Xn, Y1, ..., Yn−1 is independent of P, n = 1, 2, ... (the usual assumption in Bayesian methods, giving nice simplifications: we can assume that Y1, Y2, ... are constants).

PB Priors and Censored Data - 2
Let P have prior H(B, h, G) and suppose that the censoring set A satisfies A = ∪_{i=1}^r Bi; define yA = Σ_{i=1}^r yi. Let us consider just one potential observation X and assume that it is censored by A. (The uncensored case is the standard nonparametric Bayesian problem.)
PB Priors and Censored Data - 3
Theorem. The posterior distribution of P, given that the potential observation X is censored, is H(B, h(y) · yA, G). (We do not need independence among P_{B1}, ..., P_{Bm}.)

Theorem. If there are censoring sets A1, ..., An, each of which is a sum of sets in the partition B, then the posterior distribution of P, given that all the data X1, ..., Xn are censored, is H(B, h(y) · yA1 · · · yAn, G).

PB Dirichlet Priors and Censored Data
Simpler results can be stated for PB Dirichlet priors and Dirichlet priors: we do not need the condition that the censoring sets are sums of sets in the partition B. In particular, suppose that P has a Dirichlet prior Dα. Let A1, ..., An be the censoring sets for the potential data X1, ..., Xn, which are i.i.d. P. Let B = (B1, ..., Bm) be the partition based on A1, ..., An, and let yAi = Σ_{j: Bj ⊂ Ai} yj, i = 1, ..., n.

Theorem. The posterior distribution of P, given that all the data X1, ..., Xn are censored, is D(B, h(y) · yA1 · · · yAn, α), where h(y) ∝ y1^{α(B1)−1} · · · ym^{α(Bm)−1}; this is a PB Dirichlet distribution.
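For a single censored observation with a Dirichlet prior, the theorem says the posterior of the weight vector y has density proportional to h(y) · yA, where h is the Dirichlet density. Its mean then follows in closed form from standard Dirichlet moment identities (E[y_i y_j] = α_i α_j / (α_0(α_0+1)) for i ≠ j, E[y_i^2] = α_i(α_i+1) / (α_0(α_0+1))). The sketch below checks the resulting closed form against importance sampling; the parameter values are hypothetical.

```python
import random

def dirichlet(alphas):
    """Sample a Dirichlet vector via normalized Gamma draws."""
    g = [random.gammavariate(a, 1.0) for a in alphas]
    s = sum(g)
    return [x / s for x in g]

def censored_posterior_means(alphas, r, n=200_000, seed=0):
    """One observation censored into A = B_1 u ... u B_r: the posterior of
    y = P(B) has density proportional to h(y) * yA.  Estimate E[y_j] by
    importance sampling from Dir(alphas) with weight yA."""
    random.seed(seed)
    num = [0.0] * len(alphas)
    den = 0.0
    for _ in range(n):
        y = dirichlet(alphas)
        w = sum(y[:r])                       # yA = y_1 + ... + y_r
        den += w
        for j in range(len(alphas)):
            num[j] += w * y[j]
    return [v / den for v in num]

def censored_posterior_means_exact(alphas, r):
    """Closed form E[y_j * yA] / E[yA] from the Dirichlet moments above:
    a_j (a(A)+1) / ((a0+1) a(A)) for j <= r, and a_j / (a0+1) otherwise."""
    a0 = sum(alphas)
    aA = sum(alphas[:r])
    return [a * (aA + 1.0) / ((a0 + 1.0) * aA) if j < r else a / (a0 + 1.0)
            for j, a in enumerate(alphas)]
```

Intuitively, the censored observation shifts one unit of posterior mass into A, spread across B_1, ..., B_r in proportion to their prior weights.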
Sethuraman Partition Based Nonparametric Priors Dirichlet Priors The moral of this talk is that if you start even from a Dirichlet prior, and the data consists of general repair or general censored data, then the posterior will be a PB Dirichlet distribution. J. Sethuraman Partition Based Nonparametric Priors Susarla-Van Ryzin Example -1 Susarla and Van Ruzin (1976) considered right censoring J. Sethuraman Partition Based Nonparametric Priors Susarla-Van Ryzin Example -1 Susarla and Van Ruzin (1976) considered right censoring (i.e. the censoring set Ai = [ai , ∞) among random variables on [0, ∞).). They also used a Dirichlet prior for P and obtained the expectation of the df F (x) under the posterior distribution. J. Sethuraman Partition Based Nonparametric Priors Susarla-Van Ryzin Example -1 Susarla and Van Ruzin (1976) considered right censoring (i.e. the censoring set Ai = [ai , ∞) among random variables on [0, ∞).). They also used a Dirichlet prior for P and obtained the expectation of the df F (x) under the posterior distribution. Their example has been revisited by many authors. J. Sethuraman Partition Based Nonparametric Priors Susarla-Van Ryzin Example -1 Susarla and Van Ruzin (1976) considered right censoring (i.e. the censoring set Ai = [ai , ∞) among random variables on [0, ∞).). They also used a Dirichlet prior for P and obtained the expectation of the df F (x) under the posterior distribution. Their example has been revisited by many authors. We will show how their example works with the use of PB priors. J. Sethuraman Partition Based Nonparametric Priors Susarla-Van Ryzin Example -2 Their data set is 0.8, 1.0+, 2.7+, 3.1, 5.4, 7.0+, 9.2, 12.1+, where a+ denotes that the censoring set was [a, ∞) and that potential observation was censored. J. 
They used a Dirichlet prior for P with parameter α, which was 8 times the exponential distribution with failure rate 0.12. The posterior distribution given the uncensored observations is Dirichlet with parameter α* = α + δ_{0.8} + δ_{3.1} + δ_{5.4} + δ_{9.2}. We can take this to be the prior distribution and say that the remaining data 1.0+, 2.7+, 7.0+, 12.1+ are all censored.

Susarla-Van Ryzin Example - 3

The partition formed by the censoring sets is B = (B_1, ..., B_5) = ([0, a_1), [a_1, a_2), ..., [a_4, ∞)) = ([0, 1.0), [1.0, 2.7), [2.7, 7.0), [7.0, 12.1), [12.1, ∞)).
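For concreteness, the cell masses α*(B_j) of this partition can be computed directly, since α* is 8 times an Exp(0.12) measure plus unit point masses at the uncensored observations 0.8, 3.1, 5.4, 9.2. A minimal sketch (variable names are illustrative):

```python
import math

RATE = 0.12          # failure rate of the exponential base measure
MASS = 8.0           # total prior mass: alpha = 8 * Exp(0.12)
UNCENSORED = [0.8, 3.1, 5.4, 9.2]            # each adds a point mass delta_x
CUTS = [0.0, 1.0, 2.7, 7.0, 12.1, math.inf]  # endpoints of B_1, ..., B_5

def exp_cdf(x):
    return 1.0 if math.isinf(x) else 1.0 - math.exp(-RATE * x)

def alpha_star(lo, hi):
    """alpha*([lo, hi)) = exponential mass of [lo, hi) plus uncensored points in it."""
    base = MASS * (exp_cdf(hi) - exp_cdf(lo))
    points = sum(1 for x in UNCENSORED if lo <= x < hi)
    return base + points

params = [alpha_star(CUTS[j], CUTS[j + 1]) for j in range(5)]
print([round(p, 4) for p in params])
# total mass is 8 (prior) + 4 (uncensored observations) = 12
print(round(sum(params), 6))
```

These five numbers are the parameters of the finite dimensional Dirichlet distribution of (P(B_1), ..., P(B_5)) under α*.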
The posterior distribution given all the data is therefore D(B, h(y) Y_2 Y_3 Y_4 Y_5, α*), where Y_j = y_j + ··· + y_5, j = 2, ..., 5, are the tail sums of the y's and h(y) ∝ Π_{j=1}^{5} y_j^{α*(B_j)−1}.

Susarla-Van Ryzin Example - 4

The expectation of P(B_j) under this posterior is the expectation of y_j under the finite dimensional distribution with pdf proportional to h(y) Y_2 Y_3 Y_4 Y_5. The transformation y_j = z_j Π_{r=1}^{j−1} (1 − z_r), j = 1, ..., 4, makes z_1, ..., z_4 into independent Beta variables. This makes the calculation of the expected value of P(B_j) very easy.

The expected value of P([a, ∞)) can be computed just as easily for any a ∈ B_i, since E(P([a, ∞))) = E(y_i) E(P_{B_i}([a, a_i))) + E(Y_{i+1}) (the weights (y_1, ..., y_5) are independent of the restricted measures P_{B_i}, which gives the product form), and E(P_{B_i}([a, a_i))) = α*([a, a_i)) / α*([a_{i−1}, a_i)).
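The Beta-variable transformation is easy to check numerically for plain Dirichlet weights: if (y_1, ..., y_5) is Dirichlet(a_1, ..., a_5), then z_j = y_j / (y_j + ··· + y_5) is Beta(a_j, a_{j+1} + ··· + a_5) and the z_j are independent. A Monte Carlo sketch with illustrative parameter values:

```python
import random

random.seed(0)
ALPHA = [1.9, 1.0, 4.3, 2.6, 2.2]   # illustrative Dirichlet parameters
N = 100_000

def rdirichlet(alpha):
    """Draw a Dirichlet vector as normalized independent Gamma variables."""
    g = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    return [x / s for x in g]

# Accumulate the stick-breaking coordinates z_j = y_j / (y_j + ... + y_5).
z_sums = [0.0] * 4
for _ in range(N):
    y = rdirichlet(ALPHA)
    tail = 1.0
    for j in range(4):
        z_sums[j] += y[j] / tail
        tail -= y[j]

# Compare Monte Carlo means with the Beta(a_j, a_{j+1} + ... + a_5) means.
for j in range(4):
    beta_mean = ALPHA[j] / sum(ALPHA[j:])
    print(j, round(beta_mean, 4), round(z_sums[j] / N, 4))
```

Under the posterior the extra factor Y_2 Y_3 Y_4 Y_5 equals Π_{r=1}^{4} (1 − z_r)^{5−r}, so it only shifts the second parameter of each Beta variable; this is why the expected values remain easy to compute.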
Susarla-Van Ryzin Example, Extended - 1

Two-way censoring. Suppose that we write the data from the example of Susarla-Van Ryzin as 0.8−, 1.0+, 2.7+, 3.1−, 5.4−, 7.0+, 9.2−, 12.1+, where a− indicates that a potential observation was censored to the interval [0, a) (i.e. left censored). The data generate a partition B = (B_1, ..., B_9) of 9 intervals.

Susarla-Van Ryzin Example, Extended - 2

If we use a Dirichlet prior D_α, it can also be viewed as a PB Dirichlet prior on this partition. The prior distribution of P(B) = (P(B_1), ..., P(B_9)) is the finite dimensional Dirichlet with pdf proportional to h(y) = Π_{i=1}^{9} y_i^{α(B_i)−1}. The posterior distribution of P is the PB Dirichlet distribution D(B, h(y) Z_1 Y_3 Y_4 Z_5 Z_5 Y_7 Z_8 Y_9, α), where Z_j = Σ_{i=1}^{j} y_i and Y_j = Σ_{i=j}^{9} y_i. This also illustrates how to handle any kind of censoring in an arbitrary space.

Susarla-Van Ryzin Example, Extended - 3

A closed form expression for E(y_j) under the posterior distribution is not available. However, it is just

  E_h(y_j Z_1 Y_3 Y_4 Z_5 Z_5 Y_7 Z_8 Y_9) / E_h(Z_1 Y_3 Y_4 Z_5 Z_5 Y_7 Z_8 Y_9),

where E_h denotes expectation under the finite dimensional Dirichlet distribution D(α(B_1), ..., α(B_9)). We can generate samples (y_1^r, ..., y_9^r), r = 1, ..., N, from h(y) by using independent Gamma random variables and approximate the above as

  Σ_{r=1}^{N} y_j^r Z_1^r Y_3^r Y_4^r Z_5^r Z_5^r Y_7^r Z_8^r Y_9^r / Σ_{r=1}^{N} Z_1^r Y_3^r Y_4^r Z_5^r Z_5^r Y_7^r Z_8^r Y_9^r.
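This approximation is simple to implement: draw (y_1, ..., y_9) as normalized independent Gammas with shapes α(B_j), form the weight Z_1 Y_3 Y_4 Z_5 Z_5 Y_7 Z_8 Y_9 for each draw, and take the ratio of the weighted averages. A sketch, assuming (as in the example) that the base measure α is 8 times an Exp(0.12) distribution and that the partition cut points come from the two-way censored data; the weight indices are taken verbatim from the display above:

```python
import math
import random

random.seed(1)
RATE, MASS, N = 0.12, 8.0, 50_000
CUTS = [0.0, 0.8, 1.0, 2.7, 3.1, 5.4, 7.0, 9.2, 12.1, math.inf]  # B_1..B_9

def cdf(x):
    return 1.0 if math.isinf(x) else 1.0 - math.exp(-RATE * x)

alpha = [MASS * (cdf(CUTS[i + 1]) - cdf(CUTS[i])) for i in range(9)]

num = [0.0] * 9   # numerators  E_h(y_j * weight)
den = 0.0         # denominator E_h(weight)
for _ in range(N):
    g = [random.gammavariate(a, 1.0) for a in alpha]
    s = sum(g)
    y = [x / s for x in g]
    Z = [sum(y[:j]) for j in range(10)]          # Z_j = y_1 + ... + y_j
    Y = [1.0 - Z[j - 1] for j in range(1, 10)]   # Y_j = y_j + ... + y_9, at Y[j-1]
    w = Z[1] * Y[2] * Y[3] * Z[5] * Z[5] * Y[6] * Z[8] * Y[8]
    den += w
    for j in range(9):
        num[j] += y[j] * w

post_mean = [n / den for n in num]   # approximate posterior E(y_j)
print([round(m, 3) for m in post_mean])
```

The cumulative sums of the estimated E(y_j) then approximate the posterior mean of F at the cut points.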
Susarla-Van Ryzin Example, Extended - 4

This method is justified by the Law of Large Numbers, and good rates of convergence are well known. It does not use imputation or MCMC methods, which add another level of randomness to the final answer. This method will handle any kind of censoring and is applicable to distributions on multidimensional spaces. Of course, we can estimate many features other than just the distribution function, e.g. the median, quantiles, etc.

Susarla-Van Ryzin Example, Extended - 5

We use our method to recalculate the estimates of F(t) based on the Susarla-Van Ryzin data. When the transformation to independent Beta variables is used, we get the same results as Susarla and Van Ryzin. However, this depends on the fact that all the censorings were right censorings.
We also use our LLN method to compute estimates of F(t). A comparison of the results is given below (SSS = Doss's successive substitution sampling; KME = the Kaplan-Meier estimate):

F̂(t)
t      0.80    1.00+   2.70+   3.10    5.40    7.00+   9.20    12.10+
Exact  0.1083  0.1190  0.2071  0.3006  0.4719  0.5256  0.6823  0.7501
LLN    0.1082  0.1189  0.2068  0.3000  0.4708  0.5244  0.6810  0.7487
SSS    0.1083  0.1190  0.2084  0.3011  0.4706  0.5261  0.6802  0.7476
KME    0.1250  0.1450  0.1250  0.3000  0.4750  0.4750  0.7375  0.7375

Susarla-Van Ryzin Example with Two-way Censoring

We take the data from Susarla-Van Ryzin and change the uncensored observations to be left censored. We use the same prior distribution for P and obtain estimates of F(t) using the LLN approximation.

F̂(t)
t    0.80−   1.00+   2.70+   3.10−   5.40−   7.00+   9.20−   12.10+
LLN  0.1737  0.1937  0.3625  0.3968  0.5274  0.5990  0.6796  0.7721

Graphical Comparisons

[Figure: Estimates of the df from censored (all types) data, based on PB priors. The plot shows F(t) against t (0 to 15), with curves for the prior df, right censoring, and two-way censoring.]

Some References

Ferguson, T. (1973). A Bayesian Analysis of Some Nonparametric Problems. Ann. Statist. 1, 209–230.
Ferguson, T. and Phadia, E. (1979). Bayesian Nonparametric Estimation Based on Censored Data. Ann. Statist. 7, 163–186.
Doksum, K. (1974). Tailfree and Neutral Random Probabilities and their Posterior Distributions. Ann. Probab. 2, 183–201.
Doss, H. (1994). Bayesian Nonparametric Estimation for Incomplete Data via Successive Substitution Sampling. Ann. Statist. 22, 1763–1786.
Susarla, V. and Van Ryzin, J. (1976). Nonparametric Bayesian Estimation of Survival Curves from Incomplete Observations. J. Amer. Statist. Assoc. 71, 897–902.
Sethuraman, J. and Hollander, M. (2008). Nonparametric Bayes Estimation in Repair Models. J. Statist. Plan. Inf., to appear.