Supporting Information File 1. Derivation of the main B-SHADE equations.
As above, let $y_i$ be the number of disease cases reported by hospital $i$, and let $Y$ of Eq. (1) be the
observed number of cases reported by all $N$ hospitals in the area during a time unit (say, a week); $n$
denotes the number of sentinel hospitals. One can estimate $Y$ by the weighted sum of the sentinel
hospital cases, $y(w)$, i.e., Eq. (2). The estimate $y(w)$ satisfies two conditions: (a) it is an unbiased
estimate of the observed total population cases $Y$, and (b) it minimizes the mean squared estimation
error (MSEE),
$$\sigma^2_{y(w)} = E\,(y(w) - Y)^2 .$$
The first condition implies that $E(y(w)) = E\sum_{i=1}^{n} w_i y_i = EY$, or $\sum_{i=1}^{n} w_i\,Ey_i / EY = 1$, which
leads to Eq. (3). Concerning the second condition, the MSEE is given by:
$$\sigma^2_{y(w)} = E\,(y(w) - Y)^2 = E\big((y(w) - Y) - E(y(w) - Y)\big)^2 = C(y(w), y(w)) - 2C(y(w), Y) + C(Y, Y),$$
i.e. Eq. (4). The 1st term on the right-hand side of Eq. (4) is
$$C(y(w), y(w)) = E\,(y(w) - Ey(w))^2 = E\Big(\sum_{i=1}^{n} w_i y_i - E\Big(\sum_{i=1}^{n} w_i y_i\Big)\Big)^2 = E\Big(\sum_{i=1}^{n} w_i (y_i - Ey_i)\Big)^2,$$
or
$$C(y(w), y(w)) = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j). \tag{A1}$$
The 2nd term on the right-hand side of Eq. (4) is
$$\begin{aligned}
2C(y(w), Y) &= 2E\,(y(w) - Ey(w))(Y - EY) \\
&= 2\big(E\,Y y(w) - Ey(w)\,EY\big) \\
&= 2\sum_{i=1}^{n} w_i \big(E\,y_i Y - Ey_i\,EY\big) = 2\sum_{i=1}^{n} w_i \sum_{j=1}^{N} \big(E\,y_i y_j - Ey_i\,Ey_j\big),
\end{aligned}$$
or
$$2C(y(w), Y) = 2\sum_{i=1}^{N}\sum_{j=1}^{n} w_j\, C(y_i, y_j). \tag{A2}$$

And the 3rd term is
$$C(Y, Y) = E\,(Y - EY)^2 = E\Big(\sum_{i=1}^{N} (y_i - Ey_i)\Big)^2 = \sum_{i=1}^{N}\sum_{j=1}^{N} E\,(y_i - Ey_i)(y_j - Ey_j),$$
or
$$C(Y, Y) = \sum_{i=1}^{N}\sum_{j=1}^{N} C(y_i, y_j). \tag{A3}$$

By substituting Eqs. (A1)-(A3) into Eq. (4) one finds,
$$\sigma^2_{y(w)} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j) - 2\sum_{i=1}^{N}\sum_{j=1}^{n} w_j\, C(y_i, y_j) + \sum_{i=1}^{N}\sum_{j=1}^{N} C(y_i, y_j). \tag{A4}$$
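The three-term form of Eq. (A4) can be checked numerically. The sketch below is only an illustration: the covariance matrix `C`, the weights `w`, and the sizes `N = 3`, `n = 2` are hypothetical values, not data from the study. It compares the right-hand side of Eq. (A4) with the variance of the error $y(w) - Y$ written as a single linear combination $\sum_i a_i y_i$, where $a_i = w_i - 1$ for sentinel hospitals and $a_i = -1$ otherwise. The two agree for any weights; they equal the MSEE only when the unbiasedness condition also holds.

```python
from fractions import Fraction

# Hypothetical symmetric covariance matrix, C[i][j] = C(y_i, y_j), for N = 3
# hospitals; the first n = 2 are treated as sentinels. Values are illustrative.
C = [[4, 1, 1],
     [1, 3, 1],
     [1, 1, 2]]
N, n = 3, 2
w = [Fraction(6, 5), Fraction(9, 10)]  # arbitrary weights, exact rationals


def msee_a4(C, w, n, N):
    """Right-hand side of Eq. (A4): (A1) - (A2) + (A3)."""
    term1 = sum(w[i] * w[j] * C[i][j] for i in range(n) for j in range(n))
    term2 = sum(w[j] * C[i][j] for i in range(N) for j in range(n))
    term3 = sum(C[i][j] for i in range(N) for j in range(N))
    return term1 - 2 * term2 + term3


def error_variance(C, w, n, N):
    """Variance of y(w) - Y = sum_i a_i y_i, computed as a quadratic form."""
    a = [w[i] - 1 for i in range(n)] + [Fraction(-1)] * (N - n)
    return sum(a[i] * a[j] * C[i][j] for i in range(N) for j in range(N))


lhs = msee_a4(C, w, n, N)
rhs = error_variance(C, w, n, N)
print(lhs == rhs)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance.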
To minimize Eq. (A4) subject to the unbiasedness condition of Eq. (3) is a standard
constrained optimization problem [15] that leads to the minimization of the quantity
$$L_{y(w)} = \sigma^2_{y(w)} + 2\mu\Big(\sum_{i=1}^{n} w_i b_i - 1\Big),$$
where $\mu$ is a Lagrange multiplier. Next, the partial derivatives of $L_{y(w)}$ with respect to $w_i$
and $\mu$ are set equal to zero. The condition $\partial L_{y(w)} / \partial \mu = 0$ gives the unbiasedness
condition of Eq. (3). Furthermore, $\partial L_{y(w)} / \partial w_i = 0$, or
$$\frac{\partial}{\partial w_i}\Big(E\,(y(w) - Y)^2 + 2\mu\Big(\sum_{k=1}^{n} w_k b_k - 1\Big)\Big) = 0,$$
or
$$2E\Big(\Big(y(w) - \sum_{j=1}^{N} y_j\Big)\, y_i\Big) + 2\mu b_i = 0,$$
or
$$2\sum_{j=1}^{n} w_j\, C(y_i, y_j) - 2\sum_{j=1}^{N} C(y_i, y_j) + 2\mu b_i = 0,$$
or
$$\sum_{j=1}^{n} w_j\, C(y_i, y_j) + \mu b_i = \sum_{j=1}^{N} C(y_i, y_j) \tag{A5}$$
for all $i = 1, \ldots, n$. Writing Eqs. (A5) and (3) in a matrix form yields Eq. (5). Eq. (5) is formally
similar to the Block Kriging equations [12]; it focuses on the estimation of the total number of
disease cases in an area, includes the additional coefficients $b_i$ to handle the bias of the
sample (sentinel hospital records), and is expressed in a discrete form suited to
countable hospital distributions.
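To make the linear system concrete, the following sketch assembles Eqs. (A5) and (3) as a bordered system in the unknowns $(w_1, \ldots, w_n, \mu)$ and solves it exactly. Everything numeric here is hypothetical: the covariance matrix `C`, the ratios `b` (taken as $b_i = Ey_i / EY$, which is what the unbiasedness step suggests), and the helper names `solve` and `bshade_weights` are illustrative assumptions, not the authors' implementation.

```python
from fractions import Fraction


def solve(A, rhs):
    """Solve A x = rhs by Gauss-Jordan elimination in exact rationals."""
    m = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    for col in range(m):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(m):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]


def bshade_weights(C, b, n):
    """Solve Eqs. (A5) and (3) for the weights w_1..w_n and the multiplier mu."""
    N = len(C)
    # Right-hand side of (A5): sum over all N hospitals of C(y_i, y_j).
    d = [sum(C[i][j] for j in range(N)) for i in range(n)]
    # Rows i = 1..n encode (A5); the last row encodes sum_i w_i b_i = 1.
    A = [[C[i][j] for j in range(n)] + [b[i]] for i in range(n)]
    A.append(list(b[:n]) + [0])
    sol = solve(A, d + [1])
    return sol[:n], sol[n]


# Hypothetical inputs: N = 3 hospitals, the first n = 2 are sentinels.
C = [[4, 1, 1],
     [1, 3, 1],
     [1, 1, 2]]
b = [Fraction(1, 2), Fraction(3, 10)]  # assumed ratios E y_i / E Y
n = 2
w, mu = bshade_weights(C, b, n)
print(w, mu)
```

The bordered layout places the sentinel covariance block and the vector of $b_i$ in one matrix, so the weights and the multiplier come out of a single solve.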
In light of Eqs. (3), (5), the 2nd term on the right-hand side of Eq. (A4) can be written as
$$2\sum_{i=1}^{N}\sum_{j=1}^{n} w_j\, C(y_i, y_j) = 2\sum_{i=1}^{n} w_i \Big[\sum_{j=1}^{n} w_j\, C(y_i, y_j) + \mu b_i\Big] = 2\sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j) + 2\mu,$$
in which case Eq. (A4) can be written as
$$\sigma^2_{y(w)} = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j) - 2\sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j) - 2\mu + \sum_{i=1}^{N}\sum_{j=1}^{N} C(y_i, y_j),$$
or
$$\sigma^2_{y(w)} = \sum_{i=1}^{N}\sum_{j=1}^{N} C(y_i, y_j) - \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j\, C(y_i, y_j) - 2\mu,$$
which is Eq. (6).
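As a quick consistency check on the last step, one can work through the smallest case by hand. The choice of a single sentinel hospital out of two ($n = 1$, $N = 2$) and the shorthand $C_{ij} = C(y_i, y_j)$ are assumptions made here for illustration only.

```latex
% Special case n = 1, N = 2, with C_{ij} = C(y_i, y_j) and C_{12} = C_{21}.
% Eq. (3):  w_1 b_1 = 1, so w_1 = 1 / b_1.
% Eq. (A5): w_1 C_{11} + \mu b_1 = C_{11} + C_{12}, so
\[
  \mu = \frac{C_{11} + C_{12} - w_1 C_{11}}{b_1}
      = w_1 (C_{11} + C_{12}) - w_1^2 C_{11}.
\]
% Substituting into Eq. (6):
\[
  \sigma^2_{y(w)}
  = (C_{11} + 2C_{12} + C_{22}) - w_1^2 C_{11} - 2\mu
  = w_1^2 C_{11} - 2 w_1 (C_{11} + C_{12}) + (C_{11} + 2C_{12} + C_{22}),
\]
% which is Eq. (A4) evaluated for n = 1, N = 2.
```

In this case the weight is fixed entirely by the unbiasedness condition, and the multiplier only carries the cross-covariance information into the error variance.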