A NOTE ON REGIONS OF GIVEN PROBABILITY OF THE SKEW-NORMAL DISTRIBUTION

A. Azzalini
Department of Statistical Sciences
University of Padua, Italy
e-mail: [email protected]

January 2000, revision June 2001

1 INTRODUCTION AND SUMMARY

One of the nice mathematical properties of the multivariate normal distribution is the availability of a simple method for constructing regions of assigned probability $p$ and minimum geometric measure. This feature is useful in a number of theoretical and practical problems; the construction of tolerance regions is an example of the latter type. If $Z$ is a $d$-dimensional random variable with distribution $N_d(0, \Omega)$, where $\Omega$ represents a covariance matrix, it is well known that the appropriate region is given by

$$R_N = \{x : x^\top \Omega^{-1} x \le c_p\} \qquad (1)$$

where $c_p$ is the $p$-th quantile of the $\chi^2_d$ distribution.

The present note addresses the same problem as before, namely the construction of a region with given probability $p$ and minimum volume, when the assumption on the distribution of $Z$ is replaced by that of skew-normality. By this term, we mean that the density function of $Z$ at $x$ ($x \in \mathbb{R}^d$) is

$$f(x) = 2\,\phi_d(x; \Omega)\,\Phi(\alpha^\top x)$$

where $\phi_d(x; \Omega)$ denotes the $N_d(0, \Omega)$ density function at $x$, $\Phi$ is the $N(0, 1)$ distribution function and $\alpha$ is a vector of shape parameters. The above expression of the skew-normal density refers in fact to the special case in which the location parameter is the null vector and $\Omega$ is a correlation matrix. Since the stated problem is location and scale equivariant, this assumption does not involve any loss of generality, and it simplifies the notation. For a systematic treatment of the skew-normal distribution, see Azzalini & Dalla Valle (1996); further results are given by Azzalini & Capitanio (1999).
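As an illustration of the two ingredients introduced above, the following Python sketch evaluates the logarithm of the skew-normal density and tests membership of the normal-theory region (1). It is only a sketch under stated assumptions: NumPy and SciPy are used here for convenience (they are not the software used in the note), and the function names `normal_quadratic_form`, `sn_logpdf` and `in_normal_region` are hypothetical.

```python
import numpy as np
from scipy.stats import chi2, norm


def normal_quadratic_form(x, Omega):
    """Row-wise quadratic form x' Omega^{-1} x for points stored as rows of x."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    return np.einsum("ij,ij->i", x, np.linalg.solve(Omega, x.T).T)


def sn_logpdf(x, Omega, alpha):
    """log f(x) with f(x) = 2 phi_d(x; Omega) Phi(alpha' x), zero location."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    d = x.shape[1]
    _, logdet = np.linalg.slogdet(Omega)
    log_phi = -0.5 * (d * np.log(2 * np.pi) + logdet
                      + normal_quadratic_form(x, Omega))
    return np.log(2.0) + log_phi + norm.logcdf(x @ np.asarray(alpha, dtype=float))


def in_normal_region(x, Omega, p):
    """Membership of region (1): x' Omega^{-1} x <= c_p, c_p a chi-square quantile."""
    c_p = chi2.ppf(p, df=np.shape(Omega)[0])
    return normal_quadratic_form(x, Omega) <= c_p
```

At the origin the factor $2\,\Phi(0)$ equals 1, so `sn_logpdf` returns the same value as the normal log-density there, which gives a quick sanity check of the sketch.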
Since the quadratic form $Z^\top \Omega^{-1} Z$ has a $\chi^2_d$ distribution also in the case of a skew-normal variate, (1) is still a region of exact probability $p$, but it does not have minimum volume, because it does not correspond to the set of points with highest values of the density function. Clearly, the appropriate set, $R_{SN}$ say, is of the form

$$R_{SN} = \{x : f(x) \ge f_0\},$$

for a suitable value $f_0$, which depends on $p$, $\Omega$ and $\alpha$, such that the condition $P\{R_{SN}\} = p$ holds.

An exact solution of this problem does not seem feasible, and one must look for an approximate one. The main part of this note deals with the case $d = 2$, which is the most relevant one after the basic case $d = 1$, and describes a solution which has been found to be satisfactory for practical purposes. The same sort of approximation used for $d = 2$ has been considered for a few other values of $d$, and it turned out to work well also in these other cases.

2 THE BIVARIATE CASE

Since the quadratic form $x^\top \Omega^{-1} x$ in (1) for the normal case can be re-written as

$$-2 \log \phi_d(x; \Omega) - d \log(2\pi) - \log |\Omega|,$$

it is quite natural to consider the analogous expression for the skew-normal case, simply replacing the density $\phi_d(x; \Omega)$ by $f(x)$. This idea leads us to consider the region

$$\{x : 2 \log f(x) \ge -c_p - d \log(2\pi) - \log |\Omega|\} \qquad (2)$$

as a candidate solution to our problem.

We have examined the empirical performance of this simple rule in a set of simulation experiments, starting with the case $d = 2$. Since $\Omega$ is a correlation matrix, $|\Omega| = 1 - \omega^2$, where $\omega$ is the off-diagonal element of the matrix. On recalling that $c_p = -2 \log(1 - p)$ for $d = 2$, the inequality in (2) leads to

$$2 \log f(x) \ge 2 \log(1 - p) - 2 \log(2\pi) - \log(1 - \omega^2). \qquad (3)$$

In these simulation experiments, various combinations of the parameters $\alpha_1$, $\alpha_2$, $\omega$ have been selected.
For each choice of the parameters, $10^6$ replicated samples have been generated from the given distribution, and the rule (2) has been applied to a set of $p$ values, namely

$$p = (0.99, 0.975, 0.95, 0.90, 0.80, 0.70, 0.50, 0.30, 0.20, 0.10, 0.05, 0.025, 0.01),$$

obtaining a corresponding vector of observed relative frequencies, $\tilde p$ say. The pseudo-random variates have been generated with the aid of software provided by Azzalini (1998).

Figure 1 summarizes the main features for one of these experiments, with $\alpha_1 = 2$, $\alpha_2 = 6$, $\omega = -0.5$. The circles indicate the points $(c_p, c_{\tilde p})$, where $c_{\tilde p}$ denotes the quantile function of the $\chi^2_2$ distribution evaluated at $\tilde p$. Ideally, these points should lie on the dashed line which corresponds to the identity function. This is not the case here, but it is apparent that the points are very well aligned along some line, and this line is almost perfectly parallel to the identity line. In the specific example displayed here, the slope of the line fitted to the points was 0.9966 and the sample correlation was 0.999996. The meaning of the crosses shown on the plot will be explained shortly.

[Figure 1: Actual versus nominal values of the probability $p$, transformed to the quantile scale, in the case when the parameters of the distribution are $\alpha_1 = 2$, $\alpha_2 = 6$, $\omega = -0.5$. Axes: c(p) (horizontal), c(p^) (vertical).]

The features mentioned above are not specific to the case to which Figure 1 refers. Many other simulation experiments have been performed and the same features remained constant: the points were invariably well aligned along a line, and the slope of the fitted line was extremely close to 1. In other words, the relationship

$$c_{\tilde p} = h + c_p \qquad (4)$$

summarized the empirical data remarkably accurately in all cases considered. The only ingredient depending on the parameters of the distribution was the displacement amount, $h$. Generally $h > 0$; it was 0 only when $\alpha_1 = \alpha_2 = 0$, which corresponds to the normal distribution.
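The simulation experiment described above can be sketched in Python as follows. This is a minimal sketch under assumptions, not the code used for the note: the original variates were generated with the S-plus library sn, whereas here they are drawn through the conditioning (sign-flip) representation of the skew-normal distribution; the sample size defaults to 200,000 rather than $10^6$ to keep the run short; and the names `sample_sn`, `q_sn` and `displacement` are hypothetical. The sketch uses the fact that rule (2) is equivalent to $x^\top \Omega^{-1} x - 2 \log\{2\,\Phi(\alpha^\top x)\} \le c_p$.

```python
import numpy as np
from scipy.stats import chi2, norm

P_GRID = np.array([0.99, 0.975, 0.95, 0.90, 0.80, 0.70, 0.50,
                   0.30, 0.20, 0.10, 0.05, 0.025, 0.01])


def sample_sn(n, Omega, alpha, rng):
    """Draw n skew-normal variates: take (X0, X) jointly normal with
    cov(X0, X) = delta, and return X if X0 > 0, otherwise -X."""
    d = len(alpha)
    delta = Omega @ alpha / np.sqrt(1.0 + alpha @ Omega @ alpha)
    cov = np.block([[np.ones((1, 1)), delta[None, :]],
                    [delta[:, None], Omega]])
    y = rng.multivariate_normal(np.zeros(d + 1), cov, size=n)
    sign = np.where(y[:, 0] > 0, 1.0, -1.0)
    return y[:, 1:] * sign[:, None]


def q_sn(x, Omega, alpha):
    """-2 log f(x) - d log(2 pi) - log|Omega|, the left side of rule (2)
    after rearrangement; it reduces to x' Omega^{-1} x when alpha = 0."""
    quad = np.einsum("ij,ij->i", x, np.linalg.solve(Omega, x.T).T)
    return quad - 2.0 * (np.log(2.0) + norm.logcdf(x @ alpha))


def displacement(Omega, alpha, n=200_000, seed=0):
    """Return (slope of c_ptilde on c_p, mean of c_ptilde - c_p)."""
    Omega = np.asarray(Omega, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    rng = np.random.default_rng(seed)
    q = q_sn(sample_sn(n, Omega, alpha, rng), Omega, alpha)
    c_p = chi2.ppf(P_GRID, df=len(alpha))
    p_tilde = np.array([(q <= c).mean() for c in c_p])  # observed frequencies
    c_tilde = chi2.ppf(p_tilde, df=len(alpha))          # back to quantile scale
    slope, _ = np.polyfit(c_p, c_tilde, 1)
    return float(slope), float(np.mean(c_tilde - c_p))


if __name__ == "__main__":
    Omega = [[1.0, -0.5], [-0.5, 1.0]]
    print(displacement(Omega, [2.0, 6.0]))
```

The second returned value estimates $h$ in (4) by averaging the differences $c_{\tilde p} - c_p$, which is one simple choice when the slope is taken to be 1; a fitted intercept would be an alternative.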
The next step was then to interpolate $h$ as a function of $\alpha_1$, $\alpha_2$, $\omega$. It turned out that $h$ is a monotone function of

$$\alpha^* = (\alpha^\top \Omega \alpha)^{1/2},$$

a quantity which has already emerged as a summary measure of skewness of this distribution (Azzalini & Capitanio, 1999). This fact is illustrated by Figure 2, which shows the points $(\alpha^*, h)$ for all simulation experiments. For instance, the example described earlier corresponds to the point (5.29, 1.111).

To interpolate the points of Figure 2, we notice that the relationship between $\alpha^*$ and $\{\log(e^{h/2} - 1)\}^{-1}$ is close to proportionality with ratio $-0.6478$; see Figure 3 for an illustration. This fact translates into the interpolatory function

$$h = 2 \log\{1 + \exp(-b/\alpha^*)\} \qquad (5)$$

where $b = 1.544$. This expression produces the continuous line plotted in Figure 2, with a satisfactory interpolation of the observed points.

In practical terms, the above discussion leads to the following simple modification of the initial procedure; namely, (3) is replaced by

$$2 \log f(x) \ge 2 \log(1 - p) - 2 \log(2\pi) - \log(1 - \omega^2) + 2 \log\{1 + \exp(-b/\alpha^*)\}. \qquad (6)$$

If we denote by $\hat p$ the estimated actual probabilities obtained by (6), the crosses in Figure 1 indicate the points $(c_p, c_{\hat p})$. All new points are essentially on the identity line. Notice that the type of axes scale used in Figure 1 emphasizes the behaviour for large values of $p$, which is the most relevant case in practice. Table 1 gives numerical details for the same case of Figure 1, comparing the nominal values of $p$ with the actual probabilities $\hat p$ on the natural scale, rather than the quantile scale.
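The interpolating function (5) and the corrected rule (6) are simple enough to write down directly. The Python sketch below is illustrative only; NumPy and SciPy are an assumption of this sketch, and the names `alpha_star`, `h_interp` and `in_region_6` are hypothetical. The value of $h$ at $\alpha^* = 0$ is set to 0, the limit of (5), which agrees with the normal case.

```python
import numpy as np
from scipy.stats import norm

B_D2 = 1.544  # value of b reported in the text for d = 2


def alpha_star(Omega, alpha):
    """Summary skewness measure (alpha' Omega alpha)^{1/2}."""
    alpha = np.asarray(alpha, dtype=float)
    return float(np.sqrt(alpha @ np.asarray(Omega, dtype=float) @ alpha))


def h_interp(a_star, b=B_D2):
    """Interpolating function (5); returns the limit 0 when a_star == 0."""
    if a_star == 0:
        return 0.0
    return float(2.0 * np.log1p(np.exp(-b / a_star)))


def in_region_6(x, omega, alpha, p, b=B_D2):
    """Membership of the corrected bivariate region (6) for rows of x."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    alpha = np.asarray(alpha, dtype=float)
    Omega = np.array([[1.0, omega], [omega, 1.0]])
    det = 1.0 - omega ** 2
    quad = (x[:, 0] ** 2 - 2 * omega * x[:, 0] * x[:, 1] + x[:, 1] ** 2) / det
    # 2 log f(x) with f(x) = 2 phi_2(x; Omega) Phi(alpha' x)
    two_log_f = (2 * np.log(2.0) - 2 * np.log(2 * np.pi) - np.log(det)
                 - quad + 2 * norm.logcdf(x @ alpha))
    rhs = (2 * np.log(1 - p) - 2 * np.log(2 * np.pi) - np.log(det)
           + h_interp(alpha_star(Omega, alpha), b))
    return two_log_f >= rhs


if __name__ == "__main__":
    Omega = [[1.0, -0.5], [-0.5, 1.0]]
    a = alpha_star(Omega, [2.0, 6.0])
    print(round(a, 2), round(h_interp(a), 3))
```

For $\alpha_1 = 2$, $\alpha_2 = 6$, $\omega = -0.5$ one has $\alpha^\top \Omega \alpha = 4 + 36 - 12 = 28$, so $\alpha^* \approx 5.29$, and (5) gives $h \approx 1.116$, close to the observed value 1.111 quoted for this example.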
[Figure 2: Observed values of $h$ plotted versus $\alpha^*$ and interpolating function. Panel title: Relationship between h and alpha* (case d=2). Axes: alpha* (horizontal), h (vertical).]

[Figure 3: Observed values of $1/\{\log(e^{h/2} - 1)\}$ plotted versus $\alpha^*$ and interpolating line. Panel title: Relationship between h and alpha* (case d=2). Axes: alpha* (horizontal), 1/log(exp(h/2)−1) (vertical).]

Table 1: Nominal and actual values of the coverage probability in the case $\alpha_1 = 2$, $\alpha_2 = 6$, $\omega = -0.5$

$p$        0.01   0.025  0.05   0.1    0.2    0.3    0.5    0.7    0.8    0.9    0.95   0.975  0.99
$\hat p$   0.043  0.056  0.077  0.121  0.212  0.306  0.500  0.698  0.797  0.898  0.949  0.974  0.990

The agreement between $p$ and $\hat p$ is fully satisfactory for moderate and large $p$. Only if $p$ is close to 0 does one observe appreciable differences; however, this discrepancy is not of much practical relevance, because usually $p$ is larger than 1/2, and often substantially larger.

The behaviour shown in Figure 1 and in Table 1 is not specific to the parameter values considered there. Almost identical results have been observed in all cases considered.

3 OTHER NUMBERS OF DIMENSIONS

The basic criterion (2) can be adopted also for other values of $d$, provided $c_p$ now refers to the appropriate $\chi^2_d$ distribution. Some additional work has been done for the cases $d = 1$, $d = 3$ and $d = 4$.

The case $d = 1$ is a special one, since the required region is an interval and it is not difficult to obtain an exact numerical solution, by finding two points, $x_1$ and $x_2$ say, such that $f(x_1) = f(x_2)$ and $P\{x_1 < Z < x_2\} = p$. The numerical computations can be accomplished easily with the aid of the software tools mentioned earlier (Azzalini, 1998) to compute the integral of the distribution function. Although the availability of a numerically exact solution removes the need for additional rules like (6), we can still proceed as in the case $d = 2$, to examine whether a similar behaviour is present.
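The exact numerical solution for $d = 1$ mentioned above can be sketched as follows. This is an illustrative Python sketch, not the S-plus procedure of the note: it assumes SciPy's `skewnorm` distribution (whose density is $2\,\phi(x)\,\Phi(ax)$) and root-finding with `brentq`, and the function name `shortest_interval`, the search bounds $\pm 10$ and the grid used to locate the mode are all choices made here for illustration.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import skewnorm


def shortest_interval(alpha, p, lo=-10.0, hi=10.0):
    """Find (x1, x2) with f(x1) = f(x2) and P(x1 < Z < x2) = p, d = 1."""
    dist = skewnorm(alpha)
    # Locate the mode on a fine grid; the density is unimodal.
    grid = np.linspace(lo, hi, 20001)
    mode = grid[np.argmax(dist.pdf(grid))]
    f_mode = dist.pdf(mode)

    def endpoints(f0):
        # Points on either side of the mode where the density equals f0.
        x1 = brentq(lambda x: dist.pdf(x) - f0, lo, mode)
        x2 = brentq(lambda x: dist.pdf(x) - f0, mode, hi)
        return x1, x2

    def coverage_gap(f0):
        x1, x2 = endpoints(f0)
        return dist.cdf(x2) - dist.cdf(x1) - p

    # Coverage is about 1 for a tiny density level and about 0 near the mode,
    # so the level f0 giving coverage p is bracketed between the two.
    f0 = brentq(coverage_gap, 1e-12, f_mode * (1.0 - 1e-9))
    return endpoints(f0)


if __name__ == "__main__":
    print(shortest_interval(5.0, 0.90))
```

When `alpha` is 0 the density is standard normal and the sketch recovers the familiar symmetric interval, which provides a simple check.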
It turned out that the patterns observed with $d = 2$ were still there for all values of $d$ which have been examined. Not only were the points aligned similarly to those of Figure 1, but an interpolatory formula of type (5) also worked well, by suitably changing the value of $b$, with a corresponding interpolation similar to the one observed in Figure 2.

The final conclusion is then as follows: by modifying the region (2) similarly to (6), we obtain the approximation

$$R_{SN} \approx \{x : 2 \log f(x) \ge -c_p - d \log(2\pi) - \log |\Omega| + 2 \log[1 + \exp(-b/\alpha^*)]\} \qquad (7)$$

to the required region; here $b = 1.854, 1.544, 1.498, 1.396$ when $d$ varies from 1 to 4, respectively. Over the range of cases considered, (7) works well in practice to obtain regions of given probability $p$, provided this is not close to 0.

Clearly, it would be welcome to have some theoretical understanding of the reasons why the proposed formula works so nicely numerically. However, even in its present form, the result can still be of interest, at least for practical usage.

ACKNOWLEDGMENTS

I would like to thank an anonymous referee for insightful remarks which led to substantial improvement of the paper. This research was supported partly by ‘Consiglio Nazionale delle Ricerche’ (grant No. 98.01532.CT10) and partly by MURST (grant PRIN 2000), Italy.

REFERENCES

Azzalini, A. (1998). The library sn for S-plus. Available on the WWW at URL: http://azzalini.stat.unipd.it/SN

Azzalini, A. & Capitanio, A. (1999). Statistical applications of the multivariate skew-normal distribution. J. Roy. Statist. Soc. B 61, 579–602.

Azzalini, A. & Dalla Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83, 715–26.