ESTIMATING PRICE RELATIVES FOR THE U.S. CONSUMER PRICE INDEX WHEN PRICED ITEMS ARE RESTRATIFIED Mary Lee Fuxa, U.S. Bureau of Labor Statistics 2 Massachusetts Ave., N.E., Room 3655, Washington, D.C. 20212 Any opinions expressed in this paper are those of the author and do not constitute policy of the Bureau of Labor Statistics. Key Words: Item Strata, Price Relative, Quote-Level Weights 1. Introduction Approximately every 10 years the fixed market basket of commodities and services supporting the Consumer Price Index is updated in a process referred to as a revision of the index. Samples of specific items in the fixed market basket are selected within a newly designed, stratified sample of geographic areas on a continuing rotation basis. The prices of the specific items are quoted every month by representatives of the Bureau of Labor Statistics. Commodities and services have 70% relative importance by weight in the index, and shelter has 30%. For the January 1998 revised index, the stratification of the items in the fixed market basket for commodities and services was extensively redefined. For the 36 geographic areas that were not in the sample for the December 1997 index, new samples of specific items were selected for the 183 item strata. For the 51 geographic areas continuing in the sample: 24% of the item strata were designated for new sampling; 33% did not change in definition; 43% changed but were not designated for new sampling due to budgetary constraints. This paper documents the adjustment in the monthly estimation of the index that is necessary for the 43% item strata until new samples are selected and rotated in for the 51 geographic areas. 2. Method of Sampling and Formulation of the Index and its Components The geographic areas are referred to as the primary sampling units (PSU’s). The PSU’s were selected to represent 38 index areas. A PSU is classified by the region it is in as Northeast, South, Midwest, or West. It is also classified by its population size as selfrepresenting or non-self-representing. A selfrepresenting PSU is large enough to be its own index area. An example is Tampa, Florida, and its surrounding area. For each region, there is an index area composed of non-self-representing PSU’s. The index area is divided into geographic strata, and a single PSU is selected from each geographic stratum. An example of a PSU selected to represent the index area for the South is Ocala, Florida, and its surrounding area. Indexes are computed at the item-stratum, index-area level each month. The indexes are weighted and aggregated to produce the all-items U.S. index. The sample of priced items for a combination of an item stratum and index area is composed of replicate samples. A self-representing PSU has 2 or 4 replicate samples selected for it. A non-self-representing PSU is one of the replicate samples for its index area, or a component of a replicate sample. Indexes are computed at the replicate level each month for the purpose of estimating index variance. For the first stage of sampling for an item stratum within a PSU, or replicate of the PSU, the item stratum is divided into item categories referred to as entry-level items (ELI’s). An example is the item stratum Cakes, Cupcakes, and Cookies. It is composed of the ELI Cakes and Cupcakes (Excluding Frozen) and the ELI Cookies. One or more ELI’s are selected with probability proportional to the expenditure they represent. At each stage of sampling, multiple selections of a sample unit are possible. A single selection of an ELI is referred to as an ELI hit. Each ELI has a corresponding sampling frame for outlets in the PSU that sell the ELI. A bakery in the food court of a shopping mall is an example of an outlet that sells the ELI Cookies. More than one ELI may share the same sampling frame. In the second stage of sampling, two or more outlets from a frame are selected with probability proportional to the expenditure they represent. A single selection of an outlet is referred to as an outlet hit. The outlets selected from a frame are matched to the selected ELI’s that correspond to the frame. For a given combination of an ELI hit and an outlet hit, a unique item within the outlet is selected with probability proportional to the expenditure it represents. An example of a unique item for the ELI Cookies is a The expected value of the numerator (or denominator) in formula 2 equals the total updated expenditure for the item stratum in the PSU during month t (or t-1), where updating reflects the change in price of items purchased but not the change in quantity of items purchased. 1-lb. bag of chewy-style chocolate-chip cookies with walnuts, of a particular brand name. A single selection of a unique item is referred to as a quote. For each quote, the price of the unique item is quoted each month. An index I t , 0 , reflecting price change from month 0 to the current month t , is calculated at the item-stratum replicate level by multiplying the index for the previous month I (t −1), 0 by the estimate of price change between 3. Bridging of Samples for the 1998 Revised Index and Adjustments in Price-Relative Estimation For the 51 PSU’s continuing in the sample for the January 1998 revised index, the sample of quotes were mapped to codes for the newly defined item strata and ELI’s. Sample sizes were tested at the national level against the optimal sample sizes determined for the 1998 sample design. A sample for an item stratum was defined as insufficient if the ratio of actual size to optimal size was less than .50. t and t-1 . This one-month price change is referred to as the price relative Rt , (t −1) . (1) I t , 0 = I (t −1) , 0 Rt , (t −1) = t ∏R k =1 k , ( k −1) Based on this test, 24% of the item strata had insufficient samples and were designated for new sample selection. For the 76% item strata that had sufficient samples mapping to them, a process referred to as quote-level bridging was developed. The process involves: (1) coding of the quotes to retain their association with both the old and the new ELI definitions; (2) adjusting the quote-level weight for pt For a replicate of a PSU that is self-representing, the estimator for the price relative Rt , (t −1) is: (2) pt 1 w ∑ ∑ n n P a p ( ELI hit =1) ( quote =1) a c ,i , j c c,i , j ELI hit, quote n n c 1 wa pt −1 ∑ ∑ n n P p a c,i , j ( ELI hit =1) ( quote =1) c c,i , j ELI hit, quote n nc Where: n = sample size for ELI’s in the item stratum nc = sample size for outlets in the and pt −1 shown in formula 2. Samples were bridged for 33% of the item strata where there was no change in the item-stratum or ELI definitions. For estimation, no adjustment of the quotelevel weights is necessary for these item strata. Samples were also bridged for 43% of the item strata where there were changes in definition for the itemstratum and/or ELI’s. Since the quotes in a bridged sample were not selected based on sampling stages determined by the new definitions, the expected sample proportion associated with a quote for a sampling stage, given the new definition, would not equal the original probability of selection. Thus, the expected value of the numerator in formula 2 would not equal the updated total expenditure for the item stratum in the PSU during month t . frame corresponding to ELI c Pc ,i , j = probability that ELI c and outlet i and unique item j are selected given the item stratum and PSU p wa t p a c ,i , j The following formulation explains these concepts more explicitly. Formula 3 is the basis for the current computation of the numerator of Rt , (t −1) . It can be = measurement of expenditure in month a for unique item j in outlet i for ELI c which has been updated by the ratio of the price in month t to the price in month a shown that formula 3 is equal to the numerator of formula 2. 2 It is a random variable, such that: (3) rc ∑ ∑ n (rc nc ) Pc,i , j (c =1) ( quote =1) N rc nc wa p t p a c ,i , j quote N c N c ,i ∑∑ r i =1 j =1 c ,i , j = rc nc Where: Where: N = number of ELI’s in the item stratum rc = number of sample hits for ELI c . It is a random variable, such that: Nc = number of outlets in the frame corresponding to ELI c N c ,i = number of unique items N ∑r c=1 Note: c = n in outlet (rc nc ) * rc nc , the factor is used which has a value from 0 to rc nc . It is computed each month, The probability of selection associated with a quote is: Where: c = i = j = event that ELI c event that outlet (4) monthly computation of the factor (rc nc ) is based on quotes coded for the old ELI, not the new, and the expected sample proportion for the unique item equals the probability of selection for the unique item given the old ELI. Also, the factor rc / n does not change and is based on the old ELI. Thus, estimation is no different for a bridged sample than a newly selected sample, but the supporting sampling structure and coding system are different. * is selected i is selected event that unique item selected j is The expected sample proportion for ELI c is: r E c = Pr (c ) n Levels 2, 3, and 4 involve item strata where the old ELI’s in one or more old item strata may be split apart and/or recombined into new ELI’s. The newly defined ELI’s and the unchanged ELI’s are grouped into new item strata. An example is based on 2 old item strata: (1) White Bread, composed of the old ELI White Bread; and (2) Other Breads, Rolls, Biscuits, and Muffins, composed of 2 old ELI’s: (a) Bread Other Than White and (b) Rolls, Biscuits, Muffins (Excluding Frozen). A new item stratum Bread is composed of the new ELI Bread, and a second new item stratum Fresh Biscuits, Rolls, and Muffins is composed of the new ELI Fresh Biscuits, Rolls, and Muffins. For estimation, a poststratification method is used. Substrata within a new (5) The expected sample proportion for a unique item given ELI c is selected is: (6) rc,i , j | c = Pr (i | c) Pr (j | c, i) E rc nc Where: c Level 1, referred to as the 1987 method, involves item strata where the old ELI’s within an item stratum may be split apart and/or recombined into new ELI’s. However, the definition of the item stratum does not change. An example is the item stratum Rice, Pasta, and Cornmeal. The old ELI Rice and the old ELI Macaroni, Similar Products, and Cornmeal were combined into the new ELI Rice, Pasta, and Cornmeal. For estimation, a simple technique is used that was developed for the January 1987 revised index. The and its value depends on the number of usable quotes for that month and other contingencies (such as the duplication of certain quotes to compensate for the subsampling and dropping of quotes). Pc,i,j = Pr (c) Pr (i | c) Pr (j | c, i) ELI The adjustment of the quote-level weights for the 43% item strata is broken down into four levels of complexity. The bridged sample for an item stratum and PSU, or replicate of the PSU, is assigned to the lowest level of complexity that it qualifies for. In practice, an adjustment is made in formula 3. Instead of i for rc ,i , j = number of quotes for unique item j in outlet i for ELI c . 3 item stratum are defined and, for the numerator (or denominator) of the price relative, the estimate of total updated expenditure in month t (or t-1) is the sum of the substratum estimates of updated expenditure in month t (or t-1). For the new item stratum Bread, the first substratum is White Bread and the second substratum is Bread Other Than White. Formula 7 shows how formula 3 is adjusted for the poststratification method. For level 2, referred to as stage-1 poststratification, a substratum is defined as the intersection of the old ELI and the new item stratum. In formula 7, each intersection c is its own substratum. For a bridged sample to qualify for estimation at level 2, each of the substrata for the new item stratum must be represented by at least one bridged quote. The probability of selection for intersection c given substratum s is equal to 1.00 and is incorporated into Ps ,c ,i , j . The sample proportion for intersection c given substratum s, rs ,c / n s , is equal to 1.00 . If the outlet sample for the (7) rs ,c ∑ ∑∑ * s =1 c =1 qt =1 n s (m s ,c ) Ps ,c ,i , j N N s ms , c Where: N w p t a p a s , c , i , j qt old ELI is split apart when quotes for the old ELI are bridged to the new item stratum, then the probability of selection for an outlet given intersection c is adjusted and the probability is incorporated into Ps ,c ,i , j . ( Finally, the monthly computation of the factor m s ,c = number of substrata s = intersection of the old ELI and the new item stratum Ns = number of intersections c in substratum ns rs ,c For level 3, referred to as stage-2 poststratification, a substratum is defined as the intersection of the old item stratum and the new item stratum. For a bridged sample to qualify for estimation at level 3, each of the substrata for the new item stratum must be represented by at least one bridged quote. The method of estimation is the same as it is for level 2, with the following exceptions: (1) A substratum s may contain more than one intersection c . (2) The probability of selection for intersection c is conditioned on substratum s and may be less than 1.00 . (3) The sample proportion for intersection c, rs , c / n s , is conditioned on s = number of old-ELI sample hits bridged to substratum s = number of sample hits for intersection c, Ns ∑r c =1 s ,c such that: = ns substratum s and may be less than 1.00 . Note: If intersection c is smaller than the old ELI, then rs ,c For level 4, referred to as stage-2 poststratification with adjustment, a substratum is defined the same as it is for level 3. However, at least one substratum is not represented by at least one bridged quote. The method of estimation is the same as it is for level 3 with the addition of an adjustment factor in the quote-level weight to account for missing substrata. This factor equals the total expenditure at time b for the new item stratum divided by the sum of the expenditures at time b for the substrata having a bridged sample. (Time b is between month a, the sample-rotation reference month for the PSU, and January 1998.) The adjustment factor is important when estimating a higher level price relative where the summation in the numerator (or denominator) crosses over replicates and/or geographic strata. may equal 0 . Otherwise, it will equal the original number of sample hits for the old ELI. m s ,c = number of quotes bridged to intersection Ps ,c ,i , j = * is based on quotes coded for intersection c, and, as a result, the sample proportion for a unique item is conditioned on intersection c . in the new item stratum c ) c c and outlet i and unique item j are selected given substratum s probability that intersection 4 The number for replicates C and D is less than the number of samples shown in Table 1 since: (1) A and B replicate samples were not newly selected for 3 item strata and, as a result, control and test relatives could not be paired; (2) only nonimputed price relatives are used. An 8-month price relative is defined to be imputed if no quotes in its sample are usable in any month from January 1998 through August 1998. The adjustment of the quote-level weights for the 43% item strata is being implemented in 2 stages. Implementation of stage 1 was completed before the production of the January 1998 revised index. In stage 1, the level-1 adjustments were made for the qualifying bridged samples and the level-2 adjustments were made for the samples qualifying for levels 2, 3, and 4. In stage 2, the level-3 and level-4 adjustments will be made. Plot 1 shows no notable difference in the mean 8-month price relative between the levels of adjustment or between the replicates. Likewise, no notable divergence of the set of test means from the set of control means is indicated. The divergence that appears at level 3 is fairly negligible relative to the standard deviations at level 3 in Plot 2, which is examined below. These results could indicate that estimates from the reweighted bridged samples are as adequate as the estimates from the newly selected samples. However, such a conclusion is premature given the limited time frame and geographic representation of the analysis. 4. Empirical Findings An analysis of the bridged samples and adjustment of quote-level weights was conducted for the PSU consisting of New York City. It is self-representing and has 4 replicate samples selected for each item stratum. For 180 of the 183 item strata, 2 replicate samples (A and B) were newly selected based on the new item-stratum and ELI definitions. The third and fourth replicate samples (C and D) were bridged. For the remaining 3 item strata, all 4 replicate samples were bridged. Plot 2 shows the standard deviation of the 8-month price relatives for each level of adjustment for each replicate A, B, C, and D. At level 0, a divergence of the set of test standard deviations from the set of control standard deviations is not obvious. If there is a divergence, it could indicate a sample-rotation effect. At level 2, a notable divergence does exist. This divergence could indicate that the stage-1 poststratification method results in an estimate with lower variance when each substratum for a new item stratum is represented by at least one bridged quote. However, the divergence may fade over time. Also, it may not occur in other geographic areas, or it may depend on the particular groups of items in the analysis, or it may be due to a bridging effect other than reweighting. The analysis is based on the stage-1 implementation of the adjustment of the quote-level weights. For stage 1, the bridged samples assigned to the 3rd and 4th levels of adjustment are combined and coded as level 3. For these bridged samples, the level-2 adjustments were made with no adjustment factor added to account for missing substrata. At the time of the analysis, 8-month price relatives, reflecting price change from December 1997 to August 1998, were available. Price relatives estimated from the replicate A and B newly selected samples (month a = May 1995) are compared against price relatives estimated from the replicate C and D bridged samples (month a = May 1994). The term control relative is defined as an A or B price relative, and the term test relative is defined as a C or D price relative. A control relative for an item stratum is coded the same level of adjustment as its paired test relative for the same item stratum, where relatives for replicates A and C are paired and relatives for replicates B and D are paired. 5. Conclusion The poststratification method described in this paper provides a flexible tool when the reclassification of sampled items is necessary. Further study of both the immediate and long-term effects on the consumer price index and its component indexes is warranted. Table 1 shows how the C and D replicate samples break down for the 183 item strata. The average number of active quotes per sample is 4.9, where active is defined as a combination of an ELI and outlet that was still existing in January 1998. Plot 1 shows the mean 8-month price relative for each level of adjustment for each replicate A, B, C, and D. The number of price relatives coded for each level of adjustment is listed by replicate under Plots 1 and 2. 5 Table 1 6. References Type of C or D Replicate Sample Newly Selected Bridged Level of Adjustment 0 No Change 1 1987 Method 2 Stage-1 Poststrat. 3 Stage-1 Poststrat. No Adjustment Factor Designated as Bridged but No Quotes to Bridge All Bureau of Labor Statistics. BLS Handbook of Methods, September 1992, pp. 190-191. Casady, Robert J., Alan H. Dorfman, and Richard L. Valliant. “Sample Design and Estimation for the CPI,” BLS Note, July 2, 1996. Lane, Walter F. “Changing the Item Structure of the Consumer Price Index,” Monthly Labor Review, December 1996, pp. 18-25. Plot 1: MEAN 8-MONTH RELATIVE FOR REPLICATES A, B, C, D BY LEVEL OF ADJUSTMENT * Plot 2: STANDARD DEVIATION FOR REPLICATES A, B, C, D BY LEVEL OF ADJUSTMENT * Number of Relatives: A B C D * 46 43 46 43 13 13 13 13 13 15 13 15 23 27 23 27 178 52 56 100 A control relative is coded the same level of adjustment as its paired test relative, where relatives for replicates A and C are paired and relatives for replicates B and D are paired. 6 Number of Samples & Proportion 90 .24 118 32 55 57 .32 .09 .15 .16 14 366 .04 1.00
© Copyright 2026 Paperzz