[CANCER RESEARCH 30, 1969—1973, July 1970] Space-Time Clustering of Childhood Leukemia in San Francisco' Melville R. Klauber and Piero Mustacchi University of Utah College ofMedicine, Salt Lake Cyty, Utah 84112, and University ofCalifornia School of Medicine, San Francisco,California94122 SUMMARY berland and Durham clustering. but found no significant space-time Mantel's regression approach was used to test for space-time clustering for time of diagnosis and address of 149 leukemia cases in children under the age of 15 years diagnosed in San Francisco during the 20-year period 1946 to 1965. The cases were partitioned into consecutive time intervals, and the sum of the distances between pairs of cases falling within intervals was compared to its expectation, assuming a random alloca clustering for a case series which partially overlaps the series presented in this communication. A deficit of leukemia cases for all ages in the period 1949 to 1955 was discovered in those San Francisco census tracts where there had been an absence ofleukemia cases for the period 1946 to 1948. tion of space points to time points. Five different interval sizes were used: 0.5, 1, 2, 4, and 12 months. The 2-month intervals MATERIALS AND METHODS For San Francisco, Mustacchi (8) found a form of spatial were chosen in advance for significance testing purposes at the San Francisco resident cases of leukemia in children under age 15 years diagnosed during the 20-year period January 8, and 2 to 14 years. Neither of the age groupings showed statis 1946 to January 7, 1966, were identified through the co tically significant clustering for 2-month intervals. The only operation of the San Francisco Department of Public Health; analysis of this same type yielding a p value less than 0.05 was for ages 2 to 14 years and with 12-month intervals (p 0.024). San Francisco Medical Society ; Stanford University Medical A comparison of “within― time interval average distance School ; and administrative, medical, and record room staffs of every San Francisco hospital. Clinical charts and death certifi between cases to “between― time interval average distance cates were searched for the diagnosis of leukemia. indicated that “clustering― in the series was weak. In addition, Mantel's approach (6) was used in the present study. Two Knox' s approach was used for descriptive purposes. The clus distance measures were computed, X a spatial measure and Y a tering appeared to be the result of a slight excess of pairs temporal measure, for each of the n(n—l)/2 unordered pairs within relatively large time and/or large space distances. of cases. Let I and j index any pair of distinct cases. Statis Data are presented indicating that a single analysis of a case tically significant space-time clustering is evidenced by a large series with a long time span can be subject to artifactual departure (in the appropriate direction) of the statistic space-time clustering occurring in the population at risk. 5% level. Separate analyses were performed for ages 0 to 14 @ w = 0.5 Z = @‘ X,@ Y@1 INTRODUCTION Since the report of an apparent cluster of childhood leu kemia by Heath and Hasterlik (3) in 1963, a number of objec tive statistical methods have been proposed and applied to try from its expectation based on the assumption of random matching of the space and time coordinates. The spatial measure chosen was the distance in feet between the cases which was plotted to the nearest 100 feet. The temporal to determine whetheror not thereis significant space-time measure was defined as Y@1 1 if Cases i and / were diagnosed clustering of this disease. Knox (5) used a 2 X 2 contingency within the same time interval and Y,1 0 otherwise. table approach for the set of all spatial and temporal distances w is merely the sum of spatial distances between all pairs of (by residenceand time of onset)for leukemiacaseswith onset cases that were both diagnosed within the same time interval. before age 6 in Northumberland and Durham (1951 to 1960). A significant excess of cases less than 1 km apart in residence and less than 60 days apart in time of onset was found. The Five different time intervals were investigated: 0.5, 1, 2, 4, and 12 months. The 2-month interval was the predetermined value chosen for significance testing purposes at the 5% level. same method was applied to data for leukemia cases with Each of the 5 sizes of intervals consisted of 1 or more consecu onset under age 15 years for Portland, Ore., and its environs. A tive 0.5-month intervals: January 8 to January 22, January 23 significant excess was not found for cases less than 1 km and to February 7, February 8 to February 22, etc. This unusual 60 days apart, but interaction was best shown for distances arrangement of intervals was chosen in order to make the less than 4 km and 250 days apart (7). Barton et al. (2) applied intervals approximately the same length and at the same time the Barton-David approach (1) to Knox's data for Northum to try to reduce the likelihood of misclassification. Each study year, starting at January 8, was divided into 24 0.5-month 1 This investigation was supported by USPHS Grants CA 05924 and 9751 and the Laurence and Alice Anspacher Myers Fund. Received May 20, 1968; accepted March 20, 1970. intervals, 7 of 16 days, 16 of 15 days, and 1 of 13 days (14 days in leap years). In the case series, the day of the month was 1 for 37% of the cases and 15 for 32% of the cases. By JULY 1970 Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research. I969 Melville R. Klauber and Piero Mustacchi “centering―the 0.5-month intervals extensive discussion of the method and its relationship on the 1st and 15th days to a of themonth,it washopedto reducemisclassification errors number of other approaches to the space-time clustering prob that might arise if the division had been made closer to the 1st and 15th days of the month. Approximate p values were determin@4@y treating the standardized distributed statistic (W —Exp W) /v'Var lem. It was necessary to reduce space-time clustering due to varying population growth rates in different neighborhoods during the 20-year period. The 1950 and 1960 census data for San Francisco persons ages 0 to 14 years indicated that the rate of population growth varied considerably from census tract to census tract; 35% of the tracts had 25% or more W as a normally random variable. The reader is referred to Ref. 6 for the formulas usedto compute Exp Wand Var W and for Table 1 Normal approximation p values for Mantel's approach for leukemia under age 15 years in San Francisco (1946 to 1965) by age group and size oftime intervals deviation at0.5 Age mo.0—14 group (yr)p separate mo.4 2—140.71 0.850.31 0.460.32 0.260.17 mo.2 mo.1 mo.4 may have acquired mo.12 intervals (15) (38) (64) (128) (385) 17,720 Between 17,520 17,540 17,570 17,500 (1,407) intervals19,420 (1,664)16,740 (1,754)16,810 (1,728)16,820 (1,777)17,360 coordinates of the residences were determined to based on the January 8, 1946 to January 7, 1951; in parentheses. the disease as a result of a congenital con dition. A supplementary analysis with reciprocal spatial and temporal measures was performed to determine whether the use of time intervals and a distance measure which does not give extra weight to short distances obscured the degree of clustering. Two further analyses are presented for descriptive rather than significance testing purposes. For each time interval size used in the Mantel approach, a comparison was made of “within― time interval average distance between cases to the “between― time interval average distance. Knox's approach (5) which compares the number of pairs of cases less than a given spatial distance and less than a given temporal distance to expectation the nearest 100 feet, and the averagedistances were rounded to the nearest 10 feet. Pooled averagesare given from the 4 separate 5-year analyses. b No. of pairs shown in 1960 The data were analyzed separately for the full 15-year age group and ages 2 through 14 years. It was thought that clus tering might be enhanced by eliminating the children diagnosed during the first 2 years of life, since these children (611) (103) (194) (25)b (55) 17,650 17,670 17,670 17,700 17,740 (2,123)2—14Within (2,540)17,360 intervals18,630 (2,631)17,060 (2,709)17,080 (2,679)17,270 spatial population (@;W - @; Exp w)/-@J@; Var W intervals Between a The 5-year periods: 0.180.19 0.024 Table 2 A verage distance in feet between residences ofSan Francisco childhood leukemia cases, within and between time intervals, by age group and length oftime intervals (ft)at:(yr)averageda0.5 Age groupPairsDistance mo.0—14Within expected January 8, 1951 to January 7, 1956, etc. Summary (normal) p values were obtained with the statistic mo.12 mo.2 mo.1 from assumption that each tract had the same growth rate during the previous 10 years for ages 0 to 14 years as the whole city (9). In order to reduce the effect of this artifactual clustering in the population at risk, the case series was analyzed in 4 was tried for the 16 combinations obtainable for the spatial distances 0.25, 0.5, 1, and 2 miles with the tern poral distances 30, 60, l20, and 365 days. Knox's approach provides additional descriptive information over the average distance approach in that it shows the excess of pairs within Table 3 Observed/expected number ofpairs ofSan Francisco childhood leukemia cases diagnosed January 6, 1946, to January 7, 1966, less than indicated separation in space (address) and time (data ofdiagnosis) by age group Space 121)at:<30 149)at:Ages2—14 yr(n distance days<0.252/0.6 days<60 (miles)Ages0—14yr(n days<120 days<365 days<30 (6)4/2.9 (8)14/8.4 (18)1/0.3 days<120 days<365 (2)1/0.8 (2)2/1.7 (4)9/5.1 (8)5/5.9 days<60 (11)<0.52/2.1 (4)a3/1.3 (30)<l4/5.8 (4)7/4.4 (14)10/9.5 (20)31/27.7 (48)1/1.2 (2)4/2.6 (70)<220/19.4 (8)12/12.1 (22)26/25.9 (42)79/75.7 (98)3/3.5 (6)8/7.4 (15)17/16.6 (30)50/49.6 (36)46/40.4 (68)100/86.6 (102)264/252.7 (139)12/16.6 (22)32/24.6 (49)68/55.3 (76)177/163.3 (10)19/17.3 (110) a No. of individuals involved in pairs observed shown in parentheses. 1970 CANCER RESEARCH VOL. 30 Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research. Space-Time Clustering of Childhood Leukemia Table 4 Leukemia cases under age 15 years diagnosed with addresses in San Francisco by date ofdiagnosis, coordinates (V = No. ofIOO fret east oforigin, W No. oflOO feet north oforigin), age in years, and sex @ Coordinates Coordinates Coordinates Coordinates Coordinates Year, _________________ Year, _________________ Year, _________________ Year, _________________ Year, __________________ mo./day V W Age Sex mo./day V W Age mo./day mo./day mo./day Sex V W Age Sex V W Age Sex V W Age Sex 194619584/1691332M6/15725210F12/1286195OF1/133331013F7/1265542F5/1187260OM7/1662356M12/1911984M1/10179724M7/13482392M7/1483285M8/11933054F1 4/1 275 2 M 78 1947157617M10/20 3/1163619F2/13 5/1198 216103 1962842149M2/211053234M1/12682111F4/158819812F6/17312776M2/13303150Flo/l3232433F1/51003734F4/152822490M7/12292792M5/11081924F11/153652013M3/1 19513483122M1955 1243 8M F9/15 164249 2821210M F7/69/15208 7/1552 105282 3/20131 38714 1F F5/15 18571 1911 6F M5/288/1330 M194810/153503162M8/5843193M12/42773185M8/155534713F1/15137925F8/212033297M9/152002949F1/202431983F19529/13642503F195 125280 2773 12M 138 155 2 F 10/1 293 204 5/3 6 M 6/7 241 183 13 F 11/15231 5/2422030189 1541 3MF7/15 19491571049M1953 1/151852144M4/15 292286 1873 5M M3/297/1095 153312 M1/182691120M2/726117312F5/261223280F12/1923997M10/1135930214F3/12013450M3/13451783M6/1522715714M12/15322102M10/153362172F3/15 31114 5M 1/12 9/28 92 287 6 F 9/20 135 74 2 F 127251276 3M F7/15 12/15222 1/15116 352367 2853 1M F9/15 F1/1223022M2/12481842M4/11833291M1/151132164M2/73542970M19574/15992146F19654/1673294M2/152991460M6/153042098F4/212671380M12/28150903M4/12 19612583223F7/15 10/263283562572894 2M 19501813363F1954 various distances in both dimensions. Knox's approach was not used for statistical significance testing; however, Knox's statistic has the form of a Mantel Z-statistic and may be tested by Mantel's approach. RESULTS For the predetermined time interval of 2 months for ages 0 to 14 years a nonsignificant p value of 0.32 was obtained; for ages 2 to 14 years the p value was reduced only slightly (0.26). Table 1 shows that there appears to be a trend of greater space-time clustering with larger time intervals, especially for the 2- to 14-year age groups. For 12-month intervals, the p values are 0.19 for ages 0 to 14 years and 0.024 for ages 2 to JULY 14 years. Evidently, removing cases diagnosed during the first 2 years of life did enhance the space-time clustering con siderably when 12-month intervals were used. The average distance in feet between residences of the cases for those pairs within the same time intervals and those pairs not in the same time intervals were computed for each 5-year period, the 5 interval lengths, and the 2 age groupings. The 40 differences in averages were so small that only the pooled averages are shown in Table 2; the largest difference in averages among the 4 5-year periods was for 1956 to 1960 with 12-month intervals, where the “withintime intervals― average was 15,740 feet and the “betweentime intervals―average was 17,7 10 feet. In Table 2, the pooled averages were obtained merely by summing the distances between the indicated pairs from the individual 1970 Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research. 1971 Melville R. Klauber and Piero Mustacchi 5-year analyses and dividing by the number of pairs. In no case 62,500 and K@ = 60); for ages 2 to 14 years, the range did the “withinintervals―pooled average distance exceed the of p values was 0.020 to 0.49 with 7 of the 25 values less than 0.05 (p = 0.020 occurred with K5 62,500 and “betweenintervals―average by as much as 1,000 feet. 120). Apparently, no serious division of “clusters― The obse rved and expected pairs less than various K@ for the age group 2 to 14 years when the space and time distances apart are shown in Table 3. occurred These indicate that the smaller “within― than “between―interval time measure was used. For ages 0 to 14 years, average distance for intervals of 1 month or longer in the minimum p value obtained with the reciprocal trans Table 2 were not due to excessive numbers of cases oc formation was 0.058 compared to 0.17 with the interval distance measure; but with the former measure, 5 times curring very close to each other. It appears that there are slightly more than expected pairs less than 0.25 mile as many forms of the test were tried as with the apart, close to the expected number less than 0.5 and 1 latter, and none of all 30 results were significant at the mile apart, and more than expected less than 2 miles 5% level. The 6 of the 7 “significant― p values were probably apart. One of the largest observed to expected ratios obtained, 9/5.1 for the age group 2 to 14 years, for little affected by the time measurement errors mentioned pairs less than 0.25 mile and 365 days apart, involved above, since they involved the use of additive time con only 11 cases. One case was in 3 pairs, 5 cases were in stants, K@ of 60 (days) or more. 2 pairs, and the remaining 5 cases appeared in only 1 pair. The data in Table 3 are taken from the 11,026 DISCUSSION pairs obtainable from the 149 cases ages 0 to 14 and the 7,260 pairs from the 121 cases ages 2 to 14 for the The problem of assessing the statistical significance of full 20-year period and hence may be affected by space a number of space-time clustering studies of a rare dis time clustering in the population at risk. For descriptive ease is hard because the replication of studies under purposes, it was thought desirable to identify the extent of excess pairs of “close― cases for the whole period. Table 4 provides a complete listing of cases in chrono logical order of diagnosis and shows the coordinates of the home address, sex, and age at time of diagnoses. Further analysis of the present data clearly indicated the necessity of dividing the 20-year period in order to reduce the effect of space-time clustering of the popula similar conditions is difficult, if not impossible. Because of differences in geography and population density, results obtained in 1 area cannot be definitively con firmed or refuted by a study in another area. If one wishes to repeat a study in the same area, since the disease is rare, a great deal of time is required and, with the passage of time, important changes in population density and distribution can occur. This makes it desirable to try a variety of space and time measures in order to tion at risk. When the analysis was repeated with the whole 20-year series as a single unit, all 10 p values were elucidate the nature of the clustering, if there be any. reduced from what they were for the summary statistic When a number of measures are used for the same data, from the four 5-year analyses. The largest absolute and the probability values obtained become nominal. Further proportional reductions were 0.19 to 0.13 (ages 0 to 14 interpretation of them is clouded by the statistical de years, I 2-month intervals) and 0.024 to 0.015 (ages 2 to 14 years, 12-month intervals), respectively. By using the time measure which identifies whether or not pairs of cases lie in the same time interval, the problem of errors in time measurements was mitigated, but this ran the risk of putting children whose address and time of diagnosis were close into separate intervals. For determination of whether or not the separation of cases into time intervals and the use of untransformed spatial distances did in fact obscure clustering, the anal ysis was repeated in the same way, except with different space and time measures. The reciprocal measure sug gested by Mantel (5) was used for both measures: xii = l/(K@ +D11) andY,1 l/(Kt+ T@1), where K@ and K@ are constants and D11 and T@, are the distance in feet and time separation in days, respectively, between Cases i and j. The 25 combinations of constants obtainable from the values K5 100, 500, 2,500, 12,500, 62,500 and K@ = 15, 30, 60, 120, 365 were tried for both age groups, 0 to 14 years and 2 to 14 years. The range of p values obtained for ages 0 to 14 years was 0.058 to 0.49 (p 0.058 occurred with K5 1972 pendence of one such result upon another. In the present study, the problem of interpreting the p values is academic, since even if the space-time clustering is real, it is weak. The residence of the child in a relatively compact highly urbanized city, such as San Francisco, may not be an appropriate spatial location for studies of space-time clustering. A child in San Francisco may easily come into contact with large numbers of persons who live through out the city, but a child living in a suburban or rural community is likely to be in contact with far fewer per sons. Likewise, an environmental agent present in a small area can potentially higher percentage of a dense population dense population of the same size. which may be affect a much than of a less Is this to say that investigation of space-time clustering of leukemia cases as a means of elucidating the etiology of the disease is bankrupt? continued use of the No, what is needed general approach, but is some with more meaningful observations. An unsuccessful attempt at this was a study of hospital of birth clustering by Klauber (4). Other attempts should be made with the use of observations of times and places consistent hypothesis of a given type of leukemogenic with a given exposure. CANCER RESEARCH VOL. 30 Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research. Space-Time ACKNOWLEDGMENTS Clustering of Childhood Leukemia 2. Barton, D. E., David, F. N., and Merrington, M. A Criterion for Testing Contagion in Time and Space. Ann. Human Genet., Lon We are grateful to Nathan Mantel of the National Cancer In stitute for allowing us the use of a prepublication copy of Ref. 6 and for a number of suggestions that led to improvement in this communication, especially the suggestion to split the time period and to observe the effect of eliminating cases under the age of 2 years. However, we bear the responsibility for any er rors. We are indebted to Richard Walster for computer pro gramming and the University of Utah Computer Center for time on the UNIVAC 1180 computer. REFERENCES 1. Barton, D. E., and David, F. N. Randomization Bases for Multi variate Tests I. The Bivariate Case, Randomness of N Points in a Plane. BulLInst. Intern. Statist.,39: 455—467,1962. JULY don,29: 97—102,1965. 3. Heath, C. W., and Hasterlik, R. J. Leukemia among Children in a Suburban Community. Am. J. Med., 34: 796—812,1963. 4. Klauber, M. R. A Study of Clustering of Childhood Leukemia by Hospital of Birth. Cancer Res., 28: 1790—1792, 1968. 5. Knox, G. Epidemiology of Children Leukemia in Northumberland and Durham. Brit. J. Prevent. Social Med., 18: 17—24,1964. 6. Mantel, N. The Detection of DiseaseClustering and a Generalized RegressionApproach. Cancer Res., 27: 209—220,1967. 7. Meighan, S. S., and Knox, G. Leukemia in Childhood: Epidemi ology in Oregon. Cancer, 18: 8 11—814,1965. 8. Mustacchi, P. Some Intra-City Variations of Leukemia Incidence in San Francisco. Cancer, 18: 362—368,1965. 9. U. S. Census of Housing: 1950 and 1960 Series HC(3) City Blocks. San Francisco, Calif. : U. S. Dept. of Commerce, Bureau of the Census. 1970 Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research. 1973 Space-Time Clustering of Childhood Leukemia in San Francisco Melville R. Klauber and Piero Mustacchi Cancer Res 1970;30:1969-1973. Updated version E-mail alerts Reprints and Subscriptions Permissions Access the most recent version of this article at: http://cancerres.aacrjournals.org/content/30/7/1969 Sign up to receive free email-alerts related to this article or journal. To order reprints of this article or to subscribe to the journal, contact the AACR Publications Department at [email protected]. To request permission to re-use all or part of this article, contact the AACR Publications Department at [email protected]. Downloaded from cancerres.aacrjournals.org on June 16, 2017. © 1970 American Association for Cancer Research.
© Copyright 2026 Paperzz