On Non-Response Adjustment via Calibration
August 2005
Michail Sverchkov, Alan H. Dorfman, Lawrence R. Ernst,
Thomas G. Moerhle, Steven P. Paben and Chester H. Ponikowski
Bureau of Labor Statistics
Introduction

We consider estimation of finite population totals in the presence of non-response, assuming that non-responses arise randomly within response classes. In Section 1 we compare two regression estimators: one based on probability weights adjusted for non-response, the other on unadjusted weights. We show that when the auxiliary variables used for non-response adjustment are included in the estimators, the two differ only very slightly. In this case the non-response adjustment step can be omitted from the estimation process without loss of generality (from Result 5 of Deville and Särndal 1992 it follows that the same remains true for a wide class of calibration estimators). At the end of Section 1 we suggest a general way of testing whether regression estimators based on adjusted and unadjusted weights are significantly different. In Section 2 we consider a multivariate analogue of a "regression through the origin" estimator, and show that the "adjusted" and "unadjusted" estimators coincide in this case. In Section 3 we consider the important practical case in which the auxiliary variables are stratum indicators, and show that in this case all the previous regression estimators coincide. In Section 4 we consider calibration estimators under restrictions on the weights, and show that if there exists even one set of weights satisfying the calibration equations and the restrictions, then the regression-through-the-origin estimator does not depend on the restrictions.

1. Linear Regression Estimator

Let $S^* = \cup_{k=1}^K S_k^*$ with $S_k^* \cap S_{k'}^* = \emptyset$ when $k \ne k'$, let $s^*$ be a subset of $S^*$, and put $s_k^* = s^* \cap S_k^*$. Let $(y_i, \tilde x_i = (x_{1i},\dots,x_{Ki})^T, d_i)_{i \in s^*}$ be such that $x_{ki} = 0$ if $i \notin s_k^*$, for every $k$. Let $c_1,\dots,c_K$ be some constants. Denote $x_i = (1, \tilde x_i^T)^T = (1, x_{1i},\dots,x_{Ki})^T$ and

$$d_i^* = d_i \Big[ \sum_{k=1}^K c_k 1_{i \in s_k^*} \Big], \qquad (1)$$

where $1_A = 1$ if $A$ is true and $0$ otherwise.

Let $t_x$ and $\tilde t_x$ be respectively $(K+1)$-dimensional and $K$-dimensional vectors of constants, corresponding to $x_i$ and $\tilde x_i$ respectively.
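As a concrete illustration of (1): one common way the constants $c_k$ arise is a weighting-class non-response adjustment, in which respondent weights in class $k$ are inflated so that they reproduce the full-sample weight total of the class; $c_k$ is then the reciprocal of that inflation factor. The sketch below is a minimal numerical illustration with entirely hypothetical data (three classes, made-up weights and response indicators), not a prescription for production use:

```python
import numpy as np

# Hypothetical sample: 3 response classes, original design weights d*
# (inverse inclusion probabilities), and response indicators.
cls = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
d_star = np.array([10., 12., 8., 10., 6., 9., 9., 6., 14., 10., 10., 14.])
resp = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0], dtype=bool)

# Weighting-class adjustment: inflate respondent weights so that they
# reproduce the full-sample weight total of their class.
g = np.array([d_star[cls == k].sum() / d_star[(cls == k) & resp].sum()
              for k in range(3)])
d = g[cls] * d_star        # adjusted weights d_i (only respondents are used)
c = 1.0 / g                # then d_i* = c_k * d_i on class k, as in (1)

for k in range(3):
    r = (cls == k) & resp
    # adjusted respondent total equals the full class total of d*
    print(k, d[r].sum(), d_star[cls == k].sum())
```

By construction, the adjusted weights of the respondents in each class sum to the class total of the original weights, and the original weights are recovered as $d_i^* = c_k d_i$.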
Consider

$$\hat t_{y,reg}(d) = \sum_{i \in s^*} v_i y_i,$$

where $v$ minimizes $\sum_{i \in s^*} (v_i - d_i)^2 / d_i$ subject to

$$\sum_{i \in s^*} v_i x_i = t_x, \qquad L < v_i / d_i < U,$$

which in the case $L = -\infty$ and $U = \infty$ is the Linear Regression Estimator (see Deville and Särndal 1992). Until Section 4, we assume that $L = -\infty$ and $U = \infty$.

It can be shown (see Deville and Särndal 1992 and Valliant et al. 2000) that

$$\hat t_{y,reg}(d) = \hat t_y(d) + [t_x - \hat t_x(d)]^T B(d),$$

where $t_x$ is a vector of constants, $\hat t_z(d) = \sum_{i \in s^*} d_i z_i$, and $B(d)$ is a solution "$A$" of the Weighted Least Squares (WLS) Normal Equations:

$$\sum_{i \in s^*} d_i (y_i - x_i^T A) = 0,$$
$$\sum_{i \in s^*} d_i (y_i - x_i^T A)\, x_{ki} = 0, \qquad k = 1,\dots,K.$$

In particular, if $\sum_{i \in s^*} d_i x_i x_i^T$ is nonsingular, then

$$B(d) = \Big[\sum_{i \in s^*} d_i x_i x_i^T\Big]^{-1} \sum_{i \in s^*} d_i x_i y_i.$$

COMMENT 1. $S^*$ and $s^*$ can be considered as a set indicating a population and a set of finally selected units (sample), respectively; $\{S_k^*\}$ and $\{s_k^*\}$ represent non-response adjustment groups in the population and in the sample; $y_i$ and the $x_i$'s represent values of the target and auxiliary variables; the $d_i$'s are weights adjusted for non-response; and $t_x$ is a vector of population totals of the auxiliary variables with first coordinate equal to $N$, the number of population units (in particular, if $X_i$ is a variable used for non-response adjustment with known "totals by group", $T_{X(k)} = \sum_{i \in S_k^*} X_i$, then $x_{ki} = X_i$ if $i \in S_k^*$ and $0$ otherwise). Then $\hat t_{y,reg}$ is a linear regression estimator of the population total $t_y$. Assume that the $d_i^*$ are the original inverse inclusion probabilities. Then in the case of non-response, $1/d_i^*$ can differ from the ultimate probability of inclusion in the sample, and thus, to get consistent estimates based on these probabilities, a non-response adjustment is made. Usually, when the adjustment is based on auxiliary variables, the adjusted weights and the original inverse-probability weights are connected by (1) for some $c_k$'s. In this section we try to estimate the difference of two regression estimators, one of which is based on the adjusted weights and the other on the original unadjusted weights; i.e., we would like to estimate $\hat t_{y,reg}(d) - \hat t_{y,reg}(d^*)$.

First we note the following very simple lemma.

Lemma 1. Let $A$ satisfy

$$\sum_{i \in s^*} d_i (y_i - x_i^T A) = 0 \qquad (2)$$

and $A^*$ satisfy

$$\sum_{i \in s^*} d_i^* (y_i - x_i^T A^*) = 0. \qquad (3)$$

Then for any vector $t$,

$$\Big[\sum_{i \in s^*} d_i y_i + \big(t - \sum_{i \in s^*} d_i x_i\big)^T A\Big] - \Big[\sum_{i \in s^*} d_i^* y_i + \big(t - \sum_{i \in s^*} d_i^* x_i\big)^T A^*\Big] = t^T (A - A^*). \qquad (4)$$

This follows immediately from substitution of (2) and (3) into (4).

Corollary 1.

$$\hat t_{y,reg}(d) - \hat t_{y,reg}(d^*) = t_x^T \, [B(d) - B(d^*)].$$

This follows from the fact that (2) and (3) are the first of the WLS Normal Equations for $B(d)$ and $B(d^*)$, respectively.
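Corollary 1 is an exact algebraic identity, so it is easy to check numerically. The following sketch uses entirely hypothetical data and totals: it builds weights related by (1), computes $B(d)$ and $B(d^*)$ by weighted least squares, and confirms that the difference of the two regression estimators equals $t_x^T[B(d) - B(d^*)]$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 10, 2
grp = np.repeat([0, 1], 5)                              # adjustment group of each unit
x_tilde = np.zeros((n, K))
x_tilde[np.arange(n), grp] = rng.uniform(1, 3, size=n)  # x_ki = X_i on s_k*, 0 elsewhere
X = np.column_stack([np.ones(n), x_tilde])              # rows are x_i = (1, x~_i^T)
y = rng.normal(10, 2, size=n)
d = rng.uniform(5, 15, size=n)                          # adjusted weights d_i
c = np.array([1.3, 0.8])
d_star = d * c[grp]                                     # unadjusted weights, eq. (1)
t_x = np.array([100.0, 120.0, 130.0])                   # hypothetical population totals

def B(w):
    # WLS solution of the Normal Equations
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

def t_reg(w):
    # regression estimator: t^_y(w) + [t_x - t^_x(w)]^T B(w)
    return w @ y + (t_x - X.T @ w) @ B(w)

lhs = t_reg(d) - t_reg(d_star)
rhs = t_x @ (B(d) - B(d_star))
print(abs(lhs - rhs))                                   # numerically zero
```

The identity holds exactly (up to floating-point rounding) regardless of how the weights, data, or totals are chosen, since it uses only the first Normal Equation.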
Lemma 2. Let $N$ denote the size of the finite population, and let the following conditions be satisfied as $N \to \infty$:

i) $\lim N^{-1} t_y$ and $\lim N^{-1} t_x$ exist.

ii) $N^{-1}(t_y - \hat t_y(d)) \to 0$ and $N^{-1}(t_x - \hat t_x(d)) \to 0$ in design probability.

Then $\hat t_{y,reg}(d^*) - \hat t_{y,reg}(d) \to 0$ in design probability if and only if

$$N^{-1} \sum_{k=1}^K \sum_{i \in s_k^*} d_i (c_k - 1)(y_i - x_i^T B(d)) \to 0 \quad \text{in design probability.} \qquad (5)$$

(Proof below.) In particular, under i) and ii), (5) is equivalent to

$$N^{-1} \sum_{k=1}^K \sum_{i \in S_k^*} (c_k - 1)(y_i - x_i^T B(d)) \to 0. \qquad (5')$$

Corollary 2. Under i) and ii), $\hat t_{y,reg}(d^*) - \hat t_{y,reg}(d) \to 0$ in design probability if

$$N^{-1} \sum_{i \in s_k^*} d_i (y_i - x_i^T B(d)) \to 0 \qquad (6)$$

in design probability for all $k = 1,\dots,K$, or, equivalently,

$$N^{-1} \sum_{i \in S_k^*} (y_i - x_i^T B(d)) \to 0 \qquad (6')$$

for all $k = 1,\dots,K$.

Note that (6) and (6') do not use the $c_k$'s, and (6) refers to a sample and therefore can be tested.

COMMENT 2. If we consider $y_i$ and $1_{i \in S_k^*}$ as random variables generated by some superpopulation (model) distribution, then for some vector $b$, (5') converges in design probability to

$$E_\xi\Big[(y_i - x_i^T b) \sum_{k=1}^K (c_k - 1)\, 1_{i \in S_k^*}\Big] = 0 \qquad (5M)$$

and (6) converges to

$$E_\xi\big[(y_i - x_i^T b) \mid i \in S_k^*\big] = 0, \qquad (6M)$$

where $E_\xi$ means expectation with respect to the model distribution. A usual basic model assumption in linear regression theory is $E_\xi[(y_i - x_i^T b) \mid x_i] = 0$, which is in general stronger than (6M), which is in turn stronger than (5M).

Proof of Lemma 2. It follows from Corollary 1 that we have to estimate the difference between $B(d)$ and $B(d^*)$. Consider the WLS Normal Equations for $B(d)$ and $B(d^*)$:

$$\sum_{i \in s^*} d_i (y_i - x_i^T A) = 0, \qquad \sum_{i \in s^*} d_i (y_i - x_i^T A)\, x_{ki} = 0, \quad k = 1,\dots,K, \qquad (7)$$

$$\sum_{i \in s^*} d_i^* (y_i - x_i^T A) = 0, \qquad \sum_{i \in s^*} d_i^* (y_i - x_i^T A)\, x_{ki} = 0, \quad k = 1,\dots,K. \qquad (7')$$

Recall that $x_{ki} = 0$ if $i \notin s_k^*$, and that $d_i^* = c_k d_i$ for $i \in s_k^*$. Therefore, for any $k$,

$$\sum_{i \in s^*} d_i^* (y_i - x_i^T A)\, x_{ki} = c_k \sum_{i \in s_k^*} d_i (y_i - x_i^T A)\, x_{ki},$$

so that

$$0 = \sum_{i \in s_k^*} d_i (y_i - x_i^T A)\, x_{ki} \iff 0 = \sum_{i \in s_k^*} d_i^* (y_i - x_i^T A)\, x_{ki}.$$

Therefore (7') is equivalent to

$$\sum_{i \in s^*} d_i^* (y_i - x_i^T A) = 0, \qquad \sum_{i \in s^*} d_i (y_i - x_i^T A)\, x_{ki} = 0, \quad k = 1,\dots,K.$$

Now, under i) and ii), (7) is asymptotically equivalent to

$$N^{-1} \sum_{i \in S^*} (y_i - x_i^T A) = 0, \qquad N^{-1} \sum_{i \in S^*} (y_i - x_i^T A)\, x_{ki} = 0, \quad k = 1,\dots,K, \qquad (8)$$

and (7') to

$$N^{-1} \sum_{i \in S^*} (y_i - x_i^T A)\Big[\sum_{k=1}^K c_k 1_{i \in S_k^*}\Big] = 0, \qquad N^{-1} \sum_{i \in S^*} (y_i - x_i^T A)\, x_{ki} = 0, \quad k = 1,\dots,K, \qquad (8')$$

where the summation is now over the population. Then (8) and (8') coincide asymptotically iff

$$N^{-1} \sum_{k=1}^K \sum_{i \in S_k^*} (y_i - x_i^T A)(c_k - 1) \to 0,$$

proving the lemma.
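Since condition (6) involves only sample quantities, it can be computed directly as a diagnostic. The following sketch, with simulated data under a hypothetical working model, computes the groupwise weighted residual sums of (6); note that their sum over all groups is always exactly zero, because the intercept equation in (7) forces $\sum_{i \in s^*} d_i (y_i - x_i^T B(d)) = 0$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, N = 200, 3, 5000                      # sample size, groups, population size
grp = rng.integers(0, K, size=n)
x_tilde = np.zeros((n, K))
x_tilde[np.arange(n), grp] = rng.uniform(1, 4, size=n)
X = np.column_stack([np.ones(n), x_tilde])
d = rng.uniform(20, 30, size=n)
y = X @ np.array([2.0, 1.0, 1.5, 0.5]) + rng.normal(0, 1, size=n)

# B(d): WLS fit with the adjusted weights
Bd = np.linalg.solve(X.T @ (d[:, None] * X), X.T @ (d * y))
resid = y - X @ Bd

# Testable diagnostic (6): N^{-1} * sum_{i in s_k*} d_i (y_i - x_i^T B(d))
diag = np.array([(d * resid)[grp == k].sum() / N for k in range(K)])
print(diag)          # each entry small under the simulated model
print(diag.sum())    # exactly zero up to rounding (intercept Normal Equation)
```

In practice the magnitudes of the individual entries, not their sum, are what bear on Corollary 2.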
Remark 1. Suppose (1), i), and ii) do not hold. We have in any case

$$\sum_{i \in s^*} d_i^* (y_i - x_i^T B(d^*))\, x_{ki} = \sum_{i \in s^*} d_i (\tilde y_i - x_i^T B(d^*))\, x_{ki}, \quad \text{where } \tilde y_i = \frac{d_i^*}{d_i}\big(y_i - x_i^T B(d^*)\big) + x_i^T B(d^*),$$

which implies

$$B(d^*) = \Big[\sum_{i \in s^*} d_i x_i x_i^T\Big]^{-1} \sum_{i \in s^*} d_i x_i \tilde y_i, \qquad B(d) = \Big[\sum_{i \in s^*} d_i x_i x_i^T\Big]^{-1} \sum_{i \in s^*} d_i x_i y_i,$$

and thus

$$\hat t_{y,reg}(d) - \hat t_{y,reg}(d^*) = t_x^T \Big[\sum_{i \in s^*} d_i x_i x_i^T\Big]^{-1} \sum_{i \in s^*} (d_i - d_i^*)\, x_i \big(y_i - x_i^T B(d^*)\big). \qquad (9)$$

Remark 2. Based on (9), one can test whether calibrated estimators based on $d^*$-weights and on $d$-weights are significantly different (here $w_i$ denotes the unadjusted weights, the $d_i^*$ above). Under the assumption that $N^{-1} t_x$ and $n\big[\sum_{i \in s^*} d_i x_i x_i^T\big]^{-1}$ are bounded, (9) implies that $N^{-1}[\hat t_{y,reg}(d) - \hat t_{y,reg}(w)]$ is statistically insignificant if

$$H_0: \; E_S\big[(d_i - w_i)\, x_i \big(y_i - x_i^T B(w)\big)\big] = 0,$$

where $E_S$ denotes expectation over the sample distribution, $\Pr\{\cdot \mid i \in S\}$. $H_0$ can be tested, for example, by the following simple procedure: put $\hat e_{ik} = (y_i - x_i^T \hat B(w))\, x_{ki}$, $k = 0,\dots,K$, and $u_i = d_i - w_i$, and estimate the coefficients of ordinary linear regression without intercept for the $K+1$ linear models $e_{ik} = C_k u_i + \varepsilon_i$ (for example, using PROC REG in SAS). If for all $k$ the slope coefficients $C_k$ are insignificant, then $H_0$ is accepted.

2. Regression Through The Origin

In this section we consider a particular case of the regression estimator (regression without intercept) for which

$$\hat t_{y,reg}(d) - \hat t_{y,reg}(d^*) = 0.$$

Lemma 3. Let $\tilde\beta$ be a solution of the following system of equations:

$$\sum_{i \in s_k^*} d_i (y_i - \tilde x_i^T \tilde\beta) = 0, \quad k = 1,\dots,K, \qquad (10)$$

and let $\tilde\beta^*$ be a solution of the following system of equations:

$$\sum_{i \in s_k^*} d_i^* (y_i - \tilde x_i^T \tilde\beta^*) = 0, \quad k = 1,\dots,K. \qquad (10')$$

Then $\tilde\beta^*$ is a solution of (10) and $\tilde\beta$ is a solution of (10').

Proof. For $k = 1,\dots,K$,

$$0 = \sum_{i \in s_k^*} d_i (y_i - \tilde x_i^T \tilde\beta) \;\Rightarrow\; c_k \sum_{i \in s_k^*} d_i (y_i - \tilde x_i^T \tilde\beta) = \sum_{i \in s_k^*} d_i^* (y_i - \tilde x_i^T \tilde\beta) = 0,$$

and symmetrically for $\tilde\beta^*$, proving the lemma.
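The testing procedure of Remark 2 is straightforward to implement outside SAS as well. The sketch below uses simulated data and hypothetical weights: it forms the residual products $\hat e_{ik}$, regresses each on $u_i = d_i - w_i$ through the origin, and reports the slope t-statistics; large $|t|$ values would argue against $H_0$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, K = 300, 2
grp = rng.integers(0, K, size=n)
x_tilde = np.zeros((n, K))
x_tilde[np.arange(n), grp] = rng.uniform(1, 3, size=n)
X = np.column_stack([np.ones(n), x_tilde])       # x_0i = 1
w = rng.uniform(10, 20, size=n)                  # unadjusted weights
d = w * np.array([1.2, 0.9])[grp]                # adjusted weights
y = X @ np.array([5.0, 1.0, 2.0]) + rng.normal(0, 1, size=n)

# B(w): WLS fit with the unadjusted weights; residuals feed the test
Bw = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
resid = y - X @ Bw
u = d - w

# For each k = 0, ..., K regress e_ik = resid * x_ki on u_i without
# intercept and compute the t-statistic of the slope C_k.
t_stats = []
for k in range(K + 1):
    e = resid * X[:, k]
    C = (u @ e) / (u @ u)                        # OLS slope through the origin
    s2 = ((e - C * u) @ (e - C * u)) / (n - 1)   # residual variance
    t_stats.append(C / np.sqrt(s2 / (u @ u)))
print(np.round(t_stats, 2))
```

This mirrors the "K+1 no-intercept regressions" recipe of Remark 2; in a real application one would compare the t-statistics against the usual critical values.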
Now consider

$$\hat t^{\,0}_{y,reg}(d) = \hat t_y(d) + [\tilde t_x - \hat{\tilde t}_x(d)]^T \tilde\beta(d), \qquad (11)$$

where $\hat{\tilde t}_x(d) = \sum_{i \in s^*} d_i \tilde x_i$,

$$\tilde\beta(d) = \Big[\sum_{i \in s^*} d_i z_i \tilde x_i^T\Big]^{-1} \sum_{i \in s^*} d_i z_i y_i, \qquad z_i = \big(1_{i \in s_1^*},\dots,1_{i \in s_K^*}\big)^T.$$

Note that the matrix $\sum_{i \in s^*} d_i z_i \tilde x_i^T$ is diagonal, and thus $[\sum_{i \in s^*} d_i z_i \tilde x_i^T]^{-1}$ is also a diagonal matrix, with $k$-th diagonal element equal to $[\sum_{i \in s_k^*} d_i x_{ki}]^{-1}$, and so

$$\tilde\beta = \Bigg( \frac{\sum_{i \in s_1^*} d_i y_i}{\sum_{i \in s_1^*} d_i x_{1i}}, \;\dots,\; \frac{\sum_{i \in s_K^*} d_i y_i}{\sum_{i \in s_K^*} d_i x_{Ki}} \Bigg)^T.$$

Therefore $\tilde\beta(d)$ is a solution of (10). Thus terms cancel out in (11), so that

$$\hat t^{\,0}_{y,reg}(d) = \tilde t_x^T \tilde\beta(d) = \sum_{k=1}^K t_{xk}\, \frac{\sum_{i \in s_k^*} d_i y_i}{\sum_{i \in s_k^*} d_i x_{ki}}, \qquad (12)$$

which implies, from Lemma 3,

$$\hat t^{\,0}_{y,reg}(d) = \hat t^{\,0}_{y,reg}(d^*).$$

Remark 3. Note that if the $x_{ki}$ are nonnegative numbers and $x_{ki} > 0$ for $i \in s_k^*$, then

$$\tilde\beta(d) = \Big[\sum_{i \in s^*} \frac{d_i}{\sigma_i^2}\, \tilde x_i \tilde x_i^T\Big]^{-1} \sum_{i \in s^*} \frac{d_i}{\sigma_i^2}\, \tilde x_i y_i, \quad \text{where } \sigma_i^2 = \sigma^2 \sum_{k=1}^K x_{ki}.$$

In particular, $\hat t^{\,0}_{y,reg}$ is a ratio estimator if $K = 1$.

3. Calibration On Known Totals

In the practically important case in which the auxiliary variables are strata indicators, the Linear Regression estimators with and without intercept coincide, and thus $\hat t_{y,reg}(d) = \hat t_{y,reg}(d^*)$. To see this, let $x_{ki} = 1_{i \in S_k^*}$. Then $t_x = (t_{x0},\dots,t_{xK})^T$, where $t_{xk} = \#\{\text{units in the set } S_k^*\}$ is the $k$-th total for $k = 1,\dots,K$, and $t_{x0} = t_{x1} + \dots + t_{xK}$ (the number of units in the population); additionally, $\sum_{k=1}^K x_{ki} = 1$ for all $i$ in the population. Consider $\hat t^{\,0}_{y,reg}(d) = \hat t_y(d) + [\tilde t_x - \hat{\tilde t}_x(d)]^T \tilde\beta(d)$, where any $B(d)$ is a solution to

$$B(d) = \arg\inf_A \sum_{i \in s^*} d_i (y_i - x_i^T A)^2. \qquad (13)$$

It can be shown that $[t_x - \hat t_x(d)]^T B(d)$ is unique (compare Valliant et al. 2000, Chapter 7). Using $\sum_{k=1}^K x_{ki} = 1$ for all $i$ and denoting $\tilde A = (a_1 + a_0,\dots,a_K + a_0)^T$, we can write $x_i^T A = \tilde x_i^T \tilde A$, and thus (13) is equivalent to

$$\tilde\beta(d) = \arg\inf_b \sum_{i \in s^*} d_i (y_i - \tilde x_i^T b)^2,$$

which implies $\hat t_{y,reg}(d) = \hat t^{\,0}_{y,reg}(d)$. Then, from Lemma 3,

$$\hat t_{y,reg}(d) = \hat t^{\,0}_{y,reg}(d) = \hat t^{\,0}_{y,reg}(d^*) = \hat t_{y,reg}(d^*).$$
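The form (12) also makes the invariance $\hat t^{\,0}_{y,reg}(d) = \hat t^{\,0}_{y,reg}(d^*)$ of Lemma 3 transparent: within each group the factor $c_k$ appears in both the numerator and the denominator of the benchmark factor and cancels. A quick numerical check with entirely hypothetical data and group totals:

```python
import numpy as np

rng = np.random.default_rng(4)
n, K = 30, 3
grp = np.arange(n) % K                   # 10 units per adjustment group
x = rng.uniform(1, 5, size=n)            # x_ki = x_i on its own group, 0 elsewhere
y = 2 * x + rng.normal(0, 0.5, size=n)
d = rng.uniform(8, 12, size=n)           # adjusted weights
c = np.array([1.4, 0.7, 1.1])
d_star = d * c[grp]                      # unadjusted weights, eq. (1)
t_xk = np.array([120.0, 150.0, 90.0])    # hypothetical known group totals of x

def t0_reg(wt):
    # eq. (12): sum_k t_xk * (sum_{s_k*} w_i y_i) / (sum_{s_k*} w_i x_i)
    return sum(t_xk[k] * (wt * y)[grp == k].sum() / (wt * x)[grp == k].sum()
               for k in range(K))

print(t0_reg(d), t0_reg(d_star))         # identical: the c_k cancel
```

The two printed values agree to machine precision, whatever the $c_k$'s are.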
4. Bounds For Adjusted Weights

Let us return to the general expression for the calibration estimator (see the beginning of Section 1), $\hat t_{y,reg}(d) = \sum_{i \in s^*} v_i y_i$ and $\hat t^{\,0}_{y,reg}(d) = \sum_{i \in s^*} v_i^0 y_i$. One of the requirements which we often have to follow in practice is a bound on the ratio between the final weight, $v$ or $v^0$, and the initial (frame sample) weight, $d^*$ in our case; that is,

$$L < v_i / d_i^* < U. \qquad (14)$$

Multiplying (14) by $d_i^* \tilde x_i$, one gets $L d_i^* \tilde x_i < v_i \tilde x_i < U d_i^* \tilde x_i$, which implies (by summing over $s^*$) $L \hat{\tilde t}_x < \tilde t_x < U \hat{\tilde t}_x$ (componentwise). Thus

$$L < \frac{\sum_{i \in S_k^*} x_{ki}}{\sum_{i \in s_k^*} d_i^* x_{ki}} < U, \quad k = 1,\dots,K. \qquad (15)$$

On the other hand, suppose (15) holds. Comparing this to (12), one can note that the central part of (15) is the benchmark factor, that is, the multiplier of $d_i$ used to get the calibration weights $v_i$. Thus a set of weights satisfying (14) exists if and only if (15) holds. Therefore the following statement is correct.

Lemma 4. If there exists a set of weights satisfying

$$\sum_{i \in s^*} v_i \tilde x_i = \tilde t_x, \qquad L < v_i / d_i^* < U,$$

then $\hat t^{\,0}_{y,reg}(d)$ does not depend on $L$ and $U$. In particular, if the auxiliary variables are strata indicators, i.e., $x_{ki} = 1_{i \in S_k^*}$, then the same remains true for $\hat t_{y,reg}(d^*)$.

Remark 4. From Lemma 4 it follows that, in the case of calibration on known totals, the only way to meet the restrictions (14) is to collapse the cells $s_k^*$ so that (15) is satisfied.

References

Deville, J. C., and Särndal, C. E. (1992), "Calibration Estimators in Survey Sampling," Journal of the American Statistical Association, 87(418), 376-382.

Valliant, R., Dorfman, A. H., and Royall, R. M. (2000), Finite Population Sampling and Inference: A Prediction Approach, Wiley, New York.
The opinions expressed in this paper are those of the authors and do not necessarily represent the policies of the Bureau of Labor Statistics.