APPENDIX B
LONGITUDINAL DATA ANALYSIS
Appendix B: Notation and Taylor Series
The following is a generic review of Taylor series and the type of notation we use in certain parts of
the course.
REAL-VALUED FUNCTIONS: Let h(x, α) be a real-valued function of a vector x (which is irrelevant
to the developments here) and a (r × 1) vector α = (α1 , ... , αr )T . We write
∂/∂α1 h(x, α)
..
hα (x, α) = ∂/∂α h(x, α) =
(r × 1).
.
∂/∂αr h(x, α)
(B.1)
The vector hα (x, α) of partial derivatives of h(x, α) with respect to the elements of α is referred to as
the gradient (vector). Of course, hαT (x, α) denotes its transpose, a (1 × r ) vector.
We can extend this notation to second partial derivatives with respect to the elements of α. This is
done by writing the (r × r ) symmetric matrix of second partial derivatives of h(x, α) as follows.
∂ 2 /∂α12 h(x, α) ∂ 2 /∂α1 ∂α2 h(x, α) · · ·
T
hαα (x, α) = ∂/∂α∂α h(x, α) =
∂ 2 /∂α22 h(x, α)
···
..
.
∂ 2 /∂α1 ∂αr h(x, α)
∂ 2 /∂α2 ∂αr h(x, α)
..
.
∂ 2 /∂αr2 h(x, α)
(B.2)
This is often called the Hessian.
More generally, suppose that h(x, α, δ) is a real-valued function of two vectors α (r × 1) and δ (s × 1).
Then define
∂ 2 /∂α1 ∂δ1 h ∂ 2 /∂α1 ∂δ2 h · · ·
∂ 2 /∂α1 ∂δs h
2
∂ /∂α2 ∂δ1 h ∂ 2 /∂α2 ∂δ2 h · · ·
T
hαδ (x, α, δ) = ∂/∂α∂δ h(x, α, δ) =
..
..
..
.
.
.
∂ 2 /∂αr ∂δ1 h
···
···
∂ 2 /∂α2 ∂δs h
,
..
.
∂ 2 /∂αr ∂δs h
(B.3)
where we have written h = h(x, α, δ) for brevity. This is a (r × s) matrix; it follows that, by reducing the
roles of α and δ, hδα is a (s × r ) matrix, and is the transpose of hαδ in (B.3).
APPENDIX B
LONGITUDINAL DATA ANALYSIS
VECTOR-VALUED FUNCTIONS: Now suppose that h(x, δ) is a vector-valued function of dimension
n of a vector x and parameter δ (s × 1); i.e.
h1 (x, δ)
..
h(x, δ) =
.
.
hn (x, δ)
Thus, the vector-valued function h has real-valued component functions h1 , ... , hn .
We will write
∂/∂δ1 h1 (x, δ) · · ·
..
..
hδ (x, δ) = ∂/∂δ T h(x, δ) =
.
.
∂/∂δ1 hn (x, δ) · · ·
∂/∂δs h1 (x, δ)
..
,
.
∂/∂δs hn (x, δ)
a (n × s) matrix. In particular, note that if we have a real-valued function h(x, α, δ) for α (r × 1) and δ
(s × 1), then
∂/∂α1 h(x, α, δ)
h (x, α, δ)
α1
..
..
hα (x, α, δ) =
=
.
.
∂/∂αr h(x, α, δ)
hαr (x, α, δ)
(r × 1).
Thus, hα (x, α, δ) is a vector-valued function. Applying the above definition of the partial derivative of
a vector-valued function with respect to a vector parameter to hα (x, α, δ), we may conclude that the
result of differentiating hα (x, α, δ) with respect to δ is the (r × s) matrix hαδ (x, α, δ) defined in (B.3);
that is,
∂/∂δ T hα (x, α, δ) = ∂/∂α∂δ T h(x, α, δ).
We now consider various forms of Taylor’s theorem.
UNIVARIATE TAYLOR’S THEOREM: Assume h(α) has (k − 1) continuous derivatives in [a, b] and
finite k th derivative in (a, b), where α is univariate and h is a real-valued function. Let α0 ∈ [a, b]. For
each α ∈ [a, b], α 6= α0 , there exists α∗ interior to the interval joining α0 and α such that
h(α) = h(α0 ) +
k−1
X
`=1
1
1 `
{∂ /∂α` h(α)}α=α0 (α − α0 )` + {∂ k /∂αk h(α)}α=α∗ (α − α0 )k .
`!
k!
Taylor’s theorem is used heavily in making large sample arguments in statistics. In such arguments,
we usually deal with vector-valued functions of vector-values parameters, for which a multivariate
version of Taylor’s theorem is required.
APPENDIX B
LONGITUDINAL DATA ANALYSIS
MULTIVARIATE TAYLOR’S THEOREM: First consider a real-valued function h(α), where α is (r ×1).
We l state the theorem for general k, but write the form of the representation of h(α) for k = 2 only,
as things get messy pretty quickly. One should be able to deduce the form for larger k by analogy to
the univariate version.
Assume that h(α) on Rr has continuous partial derivatives of order k at each point in an open set
S ⊂ Rr . Let α0 ∈ S. For each α = α0 such that the line segment joining α and α0 lies in S, there
exists α∗ in the interior of this line segment such that, in the case k = 2,
h(α) = h(α0 )+
r
r X
r
X
X
{∂/∂α` h(α)}α=α0 (α` −α0,` )+(1/2)
{∂ 2 /∂α` ∂αt h(α)}α=α∗ (α` −α0,` )(αt −α0,t ).
`=1 t=1
`=1
Using the shorthand notation above, we can write this more succinctly as
h(α) = h(α0 ) + hαT (α0 )(α − α0 ) + (1/2)(α − α0 )T hαα (α∗ )(α − α0 ).
Here, we have used a further shorthand
hα (α0 ) = {∂/∂αh(α)}α=α0 .
We use similar notation for hαα (α) and other expressions.
If h is a function of two vectors α (r × 1) and δ (s × 1), we can define the single, “stacked” vector
(αT , δ T )T (r + s × 1) and apply the theorem to obtain a representation of h(α, δ) about some value
(αT0 , δ T0 )T . This is handy when we want to maintain the distinction between two arguments of a
function and treat them separately. Using the above, it is easy to show that, for k = 2, we can write
h(α, δ) = h(α0 , δ 0 ) + {hαT (α0 , δ 0 )(α − α0 ) + hδT (α0 , δ 0 )(δ − δ 0 )}
+(1/2){(α − α0 )T hαα (α∗ , δ ∗ )(α − α0 ) + 2(α − α0 )T hαδ (α∗ , δ ∗ )(δ − δ 0 ) +
(δ − δ 0 )T hδδ (α∗ , δ ∗ )(δ − δ 0 )}.
It is often necessary to apply Taylor’s theorem to vector-valued functions. This can appear complicated, but it is mainly a matter of notation. Suppose h(α) = [h1 (α), ... , hv (α)]T (v × 1), where again
α is (r × 1).
APPENDIX B
LONGITUDINAL DATA ANALYSIS
If we apply the multivariate Taylor theorem to each component of h (each of which is a real-valued
function), it can be shown that the expansion of h(α) about α = α0 for k = 2 can be written compactly
as
T (α )
h1α
0
..
h(α) = h(α0 ) +
(α − α0 ) + (1/2){I v ⊗ (α − α0 )T }H ∗αα (α − α0 ),
.
hvTα (α0 )
where H ∗αα is the (rv × r ) matrix consisting of the (r × r ) matrices h1αα (α∗ ),. . . , hv αα (α∗ ) stacked
vertically; I v is a (v × v ) identity matrix, and ⊗ denotes Kronecker product.
Finally, if h(α, δ) (v × 1) is a vector-valued function depending on α (r × 1) and δ (s × 1), and we wish
to separate explicitly the terms involving α and δ, the above extends in the obvious way; for k = 1,
we have, using obvious notation,
T (α , δ )
h1α
hT (α , δ )
∗ ∗
1δ ∗ ∗
..
..
h(α, δ) = h(α0 , δ 0 ) +
(α − α0 ) +
.
.
hvTα (α∗ , δ ∗ )
hvTδ (α∗ , δ ∗ )
This can, of course, be written more compactly as
T (α , δ )
hT (α , δ ) h1δ
∗ ∗
1α ∗ ∗
..
..
h(α, δ) = h(α0 , δ 0 ) +
.
.
hvTα (α∗ , δ ∗ ) hvTδ (α∗ , δ ∗ )
(δ − δ 0 ).
α−α
0
.
δ − δ0
© Copyright 2026 Paperzz