Derived Variables with Repeated Measurements

Andrew W. Roddam
University of Oxford, Wellcome Trust Centre for the Epidemiology of Infectious Disease
South Parks Road
Oxford, OX1 3PS, UK

Introduction

This paper considers the extension of a technique discussed in Cox & Wermuth (1992) and Wermuth & Cox (1995) to the case of a multivariate response vector $Y_t = (Y_1, \ldots, Y_q)$ which is observed at a series of time points $t = 1, \ldots, n$. The aim of the original paper was to seek a set of linear transformations, $Y^*$, of a multivariate response vector $Y = (Y_1, \ldots, Y_q)$, such that in the multiple regression of $Y_s^*$ on a set of explanatory variables $X$, only the coefficient of $X_s$, $s = 1, \ldots, q$, was non-zero. In the current setting we have repeated measurements of the multivariate response vector $Y_t$, and the aim is to find a set of linear transformations of $Y_t$ which is constant over the observed time points.

This paper proposes a methodology which estimates a q-dimensional vector of linear transformations $Y_t^*$ of the original response vector $Y_t$, such that at each time point $Y_t^* = A Y_t$, $t = 1, \ldots, n$. For the purposes of exposition, we restrict attention to the case where $n$ is either 2 or 3. We will also consider the natural extension to the case where, in addition to the repeated measurements, we also have a set of explanatory variables $X$.

Results

In the case of two repeated measurements there is in essence only one possibility for the evolution of the set of linear relationships: we require $Y_{n,s}^*$ to depend only on $Y_{n-1,s}^*$, for $s = 1, \ldots, q$. This can be achieved by setting the matrix of regression coefficients of $Y_n^*$ on $Y_{n-1}^*$ to be diagonal and equal to $D$, say. It can be shown that this is equivalent to solving $A B_{n,n-1} = D A$ for $A$, where $B_{n,n-1}$ is the matrix of regression coefficients of $Y_n$ on $Y_{n-1}$. Thus the required solution is that the rows of $A$ are the left eigenvectors of $B_{n,n-1}$, with the eigenvalues being the elements of $D$.
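The eigenvector solution for two repeated measurements can be sketched numerically. The following is a minimal illustration only, under stated assumptions: the data are simulated, the regression coefficients are estimated by ordinary least squares (the paper does not specify an estimator), and the function name `derived_variable_transform` and all variable names are hypothetical rather than taken from the paper.

```python
import numpy as np

def derived_variable_transform(Y_prev, Y_curr):
    """Return (A, D, B) with A @ B = D @ A, where B holds the
    least-squares coefficients of Y_curr regressed on Y_prev.
    Hypothetical helper: rows of the arrays are subjects, columns
    are the q response components."""
    # Centre both occasions so that no intercept term is needed.
    Yp = Y_prev - Y_prev.mean(axis=0)
    Yc = Y_curr - Y_curr.mean(axis=0)
    # lstsq solves Yp @ X = Yc; in column-vector form y_curr = B y_prev,
    # so B is the transpose of X.
    X, *_ = np.linalg.lstsq(Yp, Yc, rcond=None)
    B = X.T
    # Left eigenvectors of B are right eigenvectors of B.T.
    eigvals, V = np.linalg.eig(B.T)
    A = V.T               # rows of A are left eigenvectors of B
    D = np.diag(eigvals)  # diagonal matrix of eigenvalues
    return A, D, B

# Simulated example (assumed values, for illustration only).
rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.5], [-0.3, 1.0]])
D_true = np.diag([0.9, 0.5])
B_true = np.linalg.inv(A_true) @ D_true @ A_true
Y_prev = rng.normal(size=(2000, 2))
Y_curr = Y_prev @ B_true.T + 0.1 * rng.normal(size=(2000, 2))

A, D, B = derived_variable_transform(Y_prev, Y_curr)

# Regress the derived variables A Y_n on A Y_{n-1}; the coefficient
# matrix should be diagonal up to rounding error.
Zp = (Y_prev - Y_prev.mean(axis=0)) @ A.T
Zc = (Y_curr - Y_curr.mean(axis=0)) @ A.T
X_star, *_ = np.linalg.lstsq(Zp, Zc, rcond=None)
print(np.round(X_star.T, 3))
# A complex-valued D would signal the complex-eigenvalue case.
print(np.iscomplexobj(D))
```

Eigenvectors are determined only up to scale, so the rows of `A` returned here are normalised to unit length by NumPy; any rescaling of a row gives an equally valid derived variable.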
If there is a set of explanatory variables $X$, on which arbitrary dependence is defined, then we simply compute $B_{n,n-1}$ relative to the regression on $X$. It is possible that a computed eigenvalue will be complex or zero. In the former case, this would imply that the joint dependencies could not be represented by any simple linear relationship, whereas a zero eigenvalue would imply that a reduced number of derived variables would be sufficient to define the joint dependence relationships.

In the case of three repeated measurements there are a number of different, interesting possibilities for the evolution of the set of linear relationships. We could assume the Markov property, i.e. $Y_{t,s}^*$ depends only on $Y_{t-1,s}^*$, and it can be shown that in this case we require the matrices of regression coefficients $B_{Y_n, Y_{n-1}}$ and $B_{Y_{n-1}, Y_{n-2}}$ to have the same left eigenvectors. Alternatively, if we no longer assume the Markov property, we could require that $Y_{n,s}^*$ depends on both $Y_{n-1,s}^*$ and $Y_{n-2,s}^*$. This can be shown to be equivalent to solving $A B_{n,n-1 \cdot n-2} = D^{-1} A$ for $A$, where $D$ is some diagonal matrix. To ensure that we retain the desired properties, we need to check that this solution for $A$ satisfies $B_{Y_n^*, Z_{n-1}} = (D_1 \mid D_2)$, where $D_1, D_2$ are diagonal matrices and $Z_{n-1} = (A Y_{n-1}, A Y_{n-2})^T$. Some techniques will be explored in order to assess the appropriateness of these two structures for a given data set.

Application

The techniques discussed in this paper will be illustrated on a data set of childhood growth collected between 1972-77. Measurements of the children's height and weight were taken every 6 months from birth until age 5, and in addition there were a number of background measurements, including maternal weight gain, smoking, and maternal height, all of which are known to affect the height and weight of children.
We will consider whether it is reasonable to assume that the set of joint relationships between height and weight is constant throughout the first five years of life, or whether it is more plausible to consider two different sets of joint relationships: one set until the child is approximately 1 year old and one set thereafter. We will also illustrate the cases of two and three repeated measurements, and consider whether, in the case of three repeated measurements, it would be appropriate to assume the Markov property.

REFERENCES

Cox, D.R. and Wermuth, N. (1992). On the calculation of derived variables in the analysis of multivariate responses. Journal of Multivariate Analysis 42, 162-170.

Wermuth, N. and Cox, D.R. (1995). Derived variables calculated from similar joint responses: some characteristics and examples. Computational Statistics & Data Analysis 19, 223-234.

RÉSUMÉ

This paper concerns the development of a technique presented by Cox and Wermuth (1992). We consider the case of a multivariate response vector $Y_t = (Y_1, \ldots, Y_q)$ observed at times $t = 1, \ldots, n$. The aim is to estimate a q-dimensional vector which represents the original response vector $Y_t$ through a linear combination of the $Y_t$ such that $Y_t^* = A Y_t$, $t = 1, \ldots, n$. The example illustrated in this paper is limited to the cases where $n$ equals 2 or 3. We will also present the natural extension of this problem in which, in addition to data repeated over time, there is a set of explanatory variables $X$. A data set on the growth of children between birth and age 5 will allow us to illustrate this technique with a concrete example.