Neural Networks 16 (2003) 405–410
www.elsevier.com/locate/neunet

2003 Special Issue

Using self-organizing maps to identify potential halo white dwarfs

Enrique García-Berro a,*,1, Santiago Torres a, Jordi Isern b,1

a Departament de Física Aplicada, Universitat Politècnica de Catalunya, Jordi Girona Salgado S/N, Mòdul B-4, Campus Nord, 08034 Barcelona, Spain
b Institut de Ciències de l’Espai, C.S.I.C., Edifici Nexus, Gran Capità 2-4, 08034 Barcelona, Spain
* Corresponding author. Tel.: +34-93-401-6898; fax: +34-93-401-6090. E-mail addresses: [email protected] (E. García-Berro), [email protected] (S. Torres), [email protected] (J. Isern).
1 Institut d’Estudis Espacials de Catalunya, Spain.

Abstract

We present the results of an unsupervised classification of the disk and halo white dwarf populations in the solar neighborhood. The classification is done by merging the results of detailed Monte Carlo (MC) simulations, which reproduce very well the characteristics of the white dwarf populations in the solar neighborhood, with a catalogue of real stars. The resulting composite catalogue is analyzed using a competitive learning algorithm, namely the so-called self-organizing map. The MC simulated stars are used as tracers and help in identifying the resulting clusters. The results of such a strategy turn out to be quite satisfactory, suggesting that this approach can provide a useful framework for analyzing large databases of white dwarfs with well-determined kinematical, spatial and photometric properties once they become available in the next decade. Moreover, the results are of astrophysical interest as well, since a straightforward interpretation of several recent astronomical observations (the detected microlensing events in the direction of the Magellanic Clouds, the possible detection of high proper motion white dwarfs in the Hubble Deep Field and the discovery of high velocity white dwarfs in the solar neighborhood) suggests that a fraction of the baryonic dark matter component of our galaxy could be in the form of old and dim halo white dwarfs.

© 2003 Elsevier Science Ltd. All rights reserved.

PACS: 95.35.+d; 95.75.Pq; 95.80.+p; 97.10.Yp; 97.20.Rp; 98.35.Gi; 98.35.Ln

Keywords: Stars; White dwarfs; Dark matter; Mathematical procedures and computer techniques; Catalogues

1. Introduction

White dwarfs are the most common end-point of stellar evolution. Moreover, white dwarfs are well-studied objects. In fact, the relative simplicity of their constitutive physics allows us to obtain very detailed evolutionary models (Salaris, García-Berro, Hernanz, Isern, & Saumon, 2000, and references therein). Although these evolutionary models can be extremely sophisticated, the evolution of white dwarfs is essentially a cooling process (see, for instance, the recent review of Fontaine, Brassard, and Bergeron, 2001) during which the degenerate and almost isothermal core releases gravothermal energy, which is evacuated through the partially degenerate atmosphere, whereas hydrostatic equilibrium is maintained mostly by the pressure of the nearly degenerate electrons. This atmosphere, in turn, controls the rate at which the energy is radiated away. Additionally, white dwarfs have very long evolutionary time scales, which are comparable to the age of our galaxy. For these reasons, white dwarfs provide us with an invaluable tracer of the early evolution of our galaxy and, consequently, allow us to explore how our galaxy, and other galaxies, formed and evolved (both chemically and kinematically).

Given their intrinsic faintness, white dwarfs are difficult to detect at large distances and, thus, the currently available surveys reach modest distances, at most 300 pc. In fact, a large fraction of white dwarfs have been found in proper motion surveys. Thus, the vast majority of known white dwarfs belong to the solar neighborhood.
Whether these white dwarfs are members of the known galactic disk populations (namely the thin and the thick disk) or are halo members is a crucial issue. There is now a widespread consensus that the distribution of faint white dwarfs in the solar neighborhood is in good agreement with the expectations of the standard old thick disk population, with a space density of about 0.005 pc$^{-3}$ and possibly a small fraction of halo white dwarfs present in the sample.

0893-6080/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved. doi:10.1016/S0893-6080(03)00010-8

The observational situation has improved dramatically in the last few years with the advent of the Hubble Space Telescope and large ground-based telescopes. For instance, faint white dwarfs have already been detected in several open and globular galactic clusters, and there is some evidence that the galactic halo white dwarf population has already been detected, although this particular topic is still the subject of considerable controversy. To be more specific, halo white dwarfs have attracted continuous interest for almost a decade from both the theoretical (Isern, García-Berro, Hernanz, Mochkovitch, & Torres, 1998; Mochkovitch, García-Berro, Hernanz, Isern, & Panis, 1990; Tamanaha, Silk, Wood, & Winget, 1990) and the observational points of view. From the observational side it is important to stress the major effort of Liebert, Dahn, and Monet (1989), who studied a high proper motion sample and derived from it the very first (although severely incomplete) halo white dwarf luminosity function. Later Flynn, Gould, and Bahcall (1996) and Méndez, Minnitti, De Marchi, Baker, and Couch (1996) studied the white dwarf content of the Hubble Deep Field.
Moreover, the MACHO team reported the discovery of microlenses towards the Large Magellanic Cloud and claimed that about 20% of the dark matter in the Galaxy could be in the form of white dwarfs (Alcock et al., 1997). However, recent analyses (Alcock, 2000) have shown that the fraction of dark matter in the form of white dwarfs is smaller, of the order of $\leq 10\%$, in good agreement with the theoretical expectations of Isern et al. (1998). More recently, Ibata, Irwin, Bienaymé, Scholz, and Guibert (2000) have reported the discovery of two extremely cool white dwarfs in the solar neighborhood with very high proper motion, making them very likely observational counterparts of a putative ancient halo white dwarf population. Other faint white dwarfs with extremely large proper motions have also been discovered recently (Hambly, Smartt, & Hodgkin, 1997; Hambly et al., 1999; Hodgkin et al., 2000) making use of stacked photographic plates. Increasing attention has been paid to this topic since the very recent discovery (Oppenheimer, Hambly, Digby, Hodgkin, & Saumon, 2001) of 38 new, nearby and old white dwarfs with large space velocities. Whether these white dwarfs belong to the thick disk or to the halo is still the subject of a strong debate.

In summary, although there is evidence of a possible detection of the halo white dwarf population, given the small number of halo white dwarfs it is difficult to ascertain whether a small sample of these objects remains hidden in the current catalogs and how this putative population could be identified. In this paper we describe how to identify the population of halo white dwarfs in the existing white dwarf catalogues.

2. Method and results

With the advent of large astronomical databases, the need for efficient techniques to improve automatic classification strategies has led to a considerable number of new developments in the field. Among these techniques the most promising ones are based on artificial intelligence algorithms.
Neural networks have been used successfully in several fields such as pattern recognition, financial analysis and biology (see Kohonen, 1990, for an excellent review), and in astronomy. For instance, Bazell and Peng (1998) used these techniques to automatically discriminate stars from galaxies, Naim, Lahav, Sodre, and Storrie-Lombardi (1995) used them to classify galaxies according to their morphology, Serra-Ricart, Aparicio, Garrido, and Gaitan (1996) found the fraction of binaries in star clusters, and Hernández-Pajares and Floris (1994) used such techniques to classify populations in the Hipparcos Input Catalogue. The common characteristic of all the existing neural network classification techniques is the existence of a learning process, very much in the same manner as human experts classify manually. Generally speaking there are two different approaches: the supervised and the unsupervised learning methods. The main advantage of the latter class of methods is that they require minimum manipulation of the input data and, thus, the results are supposedly more reliable. Their leading exponent is the so-called Kohonen self-organizing map (SOM).

Although we refer the reader to the specific literature (Kohonen, 1997), we summarize here the basic features of the Kohonen SOM. The basic principle of this technique is to map a multi-dimensional input space $(S)$ into a two-dimensional space $(L)$. In fact, the SOM is the result of a vector quantization algorithm that places a number of reference vectors of a high-dimensional input space into a two-dimensional lattice in an ordered fashion. When local order relationships are defined between the reference vectors, the relative values of the latter are made to depend on each other as if their neighboring values would lie along an ‘elastic surface’. This surface is defined as a non-linear regression of the reference vectors through the data points.
A mapping from a high-dimensional space onto a two-dimensional lattice of points is thereby also defined. Such a mapping can be used to visualize metric ordering relationships of the input samples. Thus, neighboring groups in $L$ have similar properties. The mapping is obtained as an unsupervised learning process. This process may be used to find clusters in the input data, and to classify individual objects within these clusters.

The catalog of McCook and Sion (1999) is a compilation of the observational data of 2249 spectroscopically identified white dwarfs. In order to classify the stellar populations presumably present in this catalog, a set of variables describing their properties should be adopted. It should be noted that the larger the set of variables adopted, the smaller the number of objects that will have determinations for all the variables. Conversely, if the number of variables in the set is small we could be disregarding valuable information. We have adopted a minimal set in order to be able to analyze the largest possible number of objects in the catalog. The variables adopted in this study are: the absolute visual magnitude $M_V$; the proper motion $\mu$; the galactic coordinates $(l, b)$; the parallax $\pi$; and a color index, $B - V$. This reduces considerably the number of objects with all the determinations, but allows a secure classification. We have found it very convenient to use the reduced proper motion, defined as

$$H = M_V - 5 \log \pi + 5 \log \mu,$$

instead of $\mu$ itself, because the resulting groups are easier to visualize.

The first step, prior to any attempt to classify the above-mentioned catalog, is to determine whether there exists any linear relationship among the set of variables that we have chosen. In this regard we have performed a principal component analysis of the set of white dwarfs which have observational determinations of all the necessary data.
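As a worked illustration of the reduced proper motion defined above, the following minimal Python sketch evaluates $H$ from $M_V$, $\pi$ and $\mu$. The function name `reduced_proper_motion` is hypothetical, and the input values are those listed for LHS 56 in Table 1; it is meant only as an illustration of the formula, not as the code actually used for the analysis.

```python
import math


def reduced_proper_motion(abs_mag_v, parallax_arcsec, mu_arcsec_yr):
    """Return H = M_V - 5 log10(pi) + 5 log10(mu).

    parallax_arcsec is the parallax in arcsec and mu_arcsec_yr is the
    proper motion in arcsec per year (hypothetical argument names).
    """
    if parallax_arcsec <= 0 or mu_arcsec_yr <= 0:
        raise ValueError("parallax and proper motion must be positive")
    return (abs_mag_v
            - 5.0 * math.log10(parallax_arcsec)
            + 5.0 * math.log10(mu_arcsec_yr))


# Values listed for LHS 56 in Table 1:
# M_V = 13.51, mu = 3.599 arcsec/yr, pi = 0.069 arcsec.
h_lhs56 = reduced_proper_motion(13.51, 0.069, 3.599)
print(f"H(LHS 56) = {h_lhs56:.2f}")
```

Since $M_V = m + 5 \log \pi + 5$ for an apparent magnitude $m$, this expression coincides with the more familiar form $H = m + 5 \log \mu + 5$.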
We have not found any zero eigenvalue, meaning that all the chosen variables are linearly independent. We have also found that all the eigenvalues are significant and, hence, none of these variables can be disregarded.

The statistical classification of an observational database usually ends up with the detection of groups in the input space that require an ‘a posteriori’ analysis. Since we are interested in detecting different stellar populations, simultaneously with the clustering process we mix into the input data a synthetic population of tracer stars that allows us to label the groups detected by the classification algorithm as halo, disk or intermediate population. The results of the classification procedure are not sensitive to the fine details of these synthetic populations. These synthetic tracer stars have been produced using a Monte Carlo (MC) simulator. The description of the MC simulation of the disk population can be found in García-Berro, Torres, Isern, and Burkert (1999). A comprehensive discussion of the results of the MC simulation of the halo population will be published elsewhere. However, for the sake of completeness, a brief summary of the inputs is given here. We have adopted a standard, Salpeter-like, initial mass function (Salpeter, 1961). The halo was assumed to have formed 14 Gyr ago in an intense burst of star formation lasting 1 Gyr. The sensitivity of our results to the exact value of these two parameters is very small. The stars are randomly distributed in a sphere of radius 200 pc centered on the Sun according to a density profile given by the expression

$$\rho(r) \propto \frac{a^2 + R_\odot^2}{a^2 + r^2},$$

$r$ being the galactocentric radius, $a \approx 5$ kpc and $R_\odot = 8.5$ kpc.
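One way to draw tracer positions from such a density profile is rejection sampling, sketched below in Python. This is only an illustrative sketch under stated assumptions, not the MC simulator described in the text: the function names are hypothetical, the Sun is placed at $(R_\odot, 0, 0)$ in a galactocentric frame, $a$ is taken as exactly 5 kpc, and the profile is normalized so that $\rho(R_\odot) = 1$.

```python
import math
import random

A_KPC = 5.0       # core radius a (the text quotes a of about 5 kpc)
R_SUN_KPC = 8.5   # galactocentric distance of the Sun
R_MAX_KPC = 0.2   # radius of the local sphere (200 pc)


def halo_density(r_kpc):
    """Density profile, normalized (by assumption) so that rho(R_sun) = 1."""
    return (A_KPC**2 + R_SUN_KPC**2) / (A_KPC**2 + r_kpc**2)


def sample_tracer_positions(n, seed=0):
    """Draw n heliocentric positions (kpc) inside the local sphere.

    Candidates are drawn uniformly in the sphere and accepted with
    probability rho(r) / rho_max (plain rejection sampling).
    """
    rng = random.Random(seed)
    # The profile decreases with r, so its maximum inside the sphere is
    # reached at the point closest to the galactic center.
    rho_max = halo_density(R_SUN_KPC - R_MAX_KPC)
    stars = []
    while len(stars) < n:
        x, y, z = (rng.uniform(-R_MAX_KPC, R_MAX_KPC) for _ in range(3))
        if x * x + y * y + z * z > R_MAX_KPC**2:
            continue  # outside the 200 pc sphere
        # Galactocentric radius, with the Sun assumed at (R_sun, 0, 0).
        r = math.sqrt((x + R_SUN_KPC)**2 + y * y + z * z)
        if rng.random() * rho_max <= halo_density(r):
            stars.append((x, y, z))
    return stars


positions = sample_tracer_positions(1000)
```

Within 200 pc of the Sun the profile changes by only a few per cent, so nearly every candidate inside the sphere is accepted and the resulting distribution is close to uniform.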
The velocities of the tracer stars were randomly drawn according to normal distributions for both the radial and the tangential components, with velocity dispersions as given in Marković and Sommer-Larsen (1997); the adopted rotation velocity $V_c$ is 220 km s$^{-1}$. The values of the velocity dispersions depend on the galactocentric coordinates, but inside the above-mentioned sphere of 200 pc radius their values are roughly the same, $\sigma_r \simeq \sigma_t \simeq V_c/\sqrt{2} \sim 155$ km s$^{-1}$. The adopted density of halo white dwarfs was $\log(n) \simeq -5.39$ pc$^{-3}$ $M_{\rm bol}^{-1}$ at $\log(L/L_\odot) \simeq -4$, in accordance with the results of Liebert et al. (1989). The remaining inputs were the same as those adopted in García-Berro et al. (1999). In order to reproduce accurately the properties of the real catalog, both MC simulations were required to meet the following additional set of criteria: $\delta \geq 0^\circ$; $8.5 \leq M_V \leq 16.5$; $\mu \leq 4.1''$ yr$^{-1}$; and $0.006'' \leq \pi \leq 0.376''$, which are derived from the subset of 232 white dwarfs which have all the determinations.

An added value of the above-described procedure of mixing tracer and real stars is that in this way we can check the accuracy of the classification algorithm and the quality of the MC simulations. The simulated samples mimic fairly well the observational sample, as can be seen in Fig. 1, where the results of the MC simulations for the disk and the halo are compared with the observational sample in the reduced proper motion-color diagram. As can be seen in this diagram, the two simulated samples are easily distinguished. Similar diagrams can be produced for each pair of variables, and the results of the MC simulations compare equally well with the real data.

We then ran the public domain neural network software SOM_PAK (available at http://www.cis.hut.fi/nnrc/som_pak/) on a catalog constructed as described. We have used a Gaussian kernel. We have used a linearly decreasing learning rate $\alpha(t) = A(1 - t/B)$, where $A$ and $B$ are constants.
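The combination of a Gaussian kernel and a linearly decreasing learning rate can be sketched in a few lines of Python. This is a minimal illustrative sketch and not SOM_PAK's implementation: the function names are hypothetical, the kernel width `sigma` is kept fixed during a pass (a simplification), the initial weights are drawn uniformly in $[0, 1)$, and the input vectors are assumed to be already scaled to comparable ranges. The default constants are those quoted in the text for the first pass on a $5 \times 5$ grid.

```python
import math
import random


def learning_rate(t, A, B):
    """Linearly decreasing learning rate alpha(t) = A (1 - t/B), floored at 0."""
    return max(0.0, A * (1.0 - t / B))


def best_matching_unit(weights, x):
    """Grid node whose reference vector is closest (Euclidean) to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows=5, cols=5, A=0.5, B=1000, sigma=3.0, seed=0):
    """One training pass of B iterations with a fixed-width Gaussian kernel."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Reference vectors, one per grid node, randomly initialized.
    weights = {(i, j): [rng.random() for _ in range(dim)]
               for i in range(rows) for j in range(cols)}
    for t in range(B):
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        alpha = learning_rate(t, A, B)
        for node, w in weights.items():
            # Gaussian neighborhood measured on the grid, not in input space.
            d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            h = math.exp(-d2 / (2.0 * sigma ** 2))
            for k in range(dim):
                w[k] += alpha * h * (x[k] - w[k])
    return weights
```

A second, finer pass would call the same loop again starting from the trained weights, with the smaller learning rate and narrower kernel described in the text; that refinement is omitted here to keep the sketch short.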
In a first step the learning rate was chosen to be high ($A = 0.5$, $B = 10^3$), whereas in subsequent runs we used a slow learning rate ($A = 0.02$, $B = 10^6$). For its part, the width of the kernel was chosen to be three nodes for the first run, and in subsequent runs we updated only one neighbor node. Finally, in the first run $10^3$ iterations were used, whereas in subsequent runs $10^6$ iterations were required. The number of nodes was chosen in such a way that the error function was minimized. Nevertheless, it is worth mentioning here that a compromise between the number of nodes and the number of white dwarfs in the observational catalog had to be reached. If the number of nodes was too large there were too few white dwarfs in the resulting groups and, hence, the interpretation turned out to be difficult. Conversely, if the number of nodes was too small the confusion error became too large. However, the halo white dwarfs identified using the neural network were the same for reasonable choices of the grid and, thus, the identifications can be considered as safe. We also checked different geometries of the grid, and the results turned out to be independent of the considered geometry.

Fig. 1. Reduced proper motion-color diagram for the MC simulations of the disk (solid dots) and the halo (solid squares) and of the observational sub-sample (open triangles).

The SOM of the input catalog, after three passes over the entire sample and with a grid of $5 \times 5$ nodes, is shown in Fig. 2. The groups have been assigned either to halo (‘H’) or disk (‘D’) populations if the percentage of tracer stars of one of the populations was larger than 70%. In the groups labeled as ‘I’ (intermediate population) neither the halo nor the disk tracers were in excess of this recognition percentage. As can be seen in Fig. 2, all the halo groups are close neighbors and, furthermore, the intermediate population groups are surrounded by halo and disk groups. A good measure of the overall quality of the classification scheme can be obtained by checking how many of the synthetic stars are misclassified. This results in the following confusion matrix

$$C = \begin{pmatrix} 0.98 & 0.03 \\ 0.02 & 0.97 \end{pmatrix},$$

where the matrix element $C_{11}$ indicates the fraction of disk tracers classified in disk groups, $C_{21}$ is the fraction of disk tracers misclassified in halo groups, and so on. This matrix is very close to the identity matrix, and thus the classification seems to be secure. More confidence in this classification comes from the fact that the vast majority of old disk white dwarfs in the sample of Liebert, Dahn, and Monet (1988) are in the groups (0,2) and (2,1), which are labeled as intermediate population. Moreover, LHS 56, LHS 147 and LHS 291 belong to the group (1,0), which clearly is a halo group, and LHS 2984 belongs to the group (0,0); all these objects were already identified as halo members by Liebert et al. (1989), and were used to build their halo white dwarf luminosity function. The only object of the sample of Liebert et al. (1989) that is misclassified is LHS 282, which is classified in the group (0,2), which is intermediate population. All this evidence points in the same direction: the classification is correct. The percentages of halo tracers in the groups labeled as halo can be found for each of the halo groups in Fig. 2. Since all of them are larger than 80%, all these groups can be securely labeled as halo.

Fig. 2. SOM of the sample of white dwarfs, see text for details. The group (0,0) is located in the lower left corner of the diagram and the group (4,4) is located in the upper right corner. As a rule of thumb $H$ increases from right to left and $M_V$ decreases downwards in the diagram.

The so-called Sammon map of our groups is shown in Fig. 3. As can be seen there, the distances between the disk nodes (labeled as ‘D’) are very small and, thus, these groups have similar characteristics. The same occurs for the halo groups (labeled as ‘H’) and thus this set of groups is homogeneous as well. However, the distances between this last set of groups and the former set are much larger than the distances inside both sets of groups. Moreover, the groups of intermediate population also lie at considerable distances from both sets of groups. All these facts allow us to conclude that the classification scheme is well defined. However, for the sake of reliability, we have only identified as halo candidates those white dwarfs belonging to groups which do not have a disk neighbor, namely (0,0), (0,1), (1,0) and (2,0). One interesting property of these white dwarfs is that all of them have $M_V \leq 14$ and only four have proper motions in excess of $1.0''$ yr$^{-1}$, the average being $\langle\mu\rangle = 0.87''$ yr$^{-1}$. However, most of them have $\pi \leq 0.03''$ and are clustered around $\pi \sim 0.01''$, leading to tangential velocities in excess of 200 km s$^{-1}$ for 11 of our candidates. Only one candidate has a tangential velocity smaller than 100 km s$^{-1}$. Therefore the detected population is intrinsically bright and distant. The halo white dwarf candidates detected here can be found in Table 1.

Fig. 3. Sammon map of the SOM of Fig. 2.

Table 1
Halo white dwarf candidates identified using the neural network algorithm, along with their corresponding group and properties

Name         Group   M_V     μ (″ yr⁻¹)   π (″)    B − V    Sp. type
LHS 2984*    (0,0)   11.62   0.930        0.015     0.03    DA
LHS 3007     (0,0)   13.06   0.636        0.028     0.29    DA
G 028-027    (0,0)   12.41   0.281        0.003     0.03    DQ
G 098-018    (0,0)   11.81   0.426        0.003     0.38    DA
G 138-056    (0,0)   13.34   0.692        0.006     0.37    DA
G 184-012    (0,0)   13.18   0.427        0.017     0.26    DC
LP 640-069   (0,0)   12.75   0.284        0.009     0.29    DA
LHS 56*      (0,1)   13.51   3.599        0.069     0.36    DA
LHS 147*     (0,1)   13.64   2.474        0.016     0.40    DC
LHS 151      (0,1)   13.46   1.142        0.053     0.33    DA
LHS 291*     (0,1)   13.39   1.765        0.012     0.11    DQ
LHS 529      (0,1)   13.94   1.281        0.046     0.64    DA
LHS 1927     (1,0)   11.41   0.661        0.009     0.11    DA
G 038-004    (1,0)   12.31   0.428        0.010     0.17    DA
LHS 3146     (2,0)   11.88   0.579        0.024     0.17    DA
G 021-015    (2,0)   11.54   0.390        0.015     0.05    DA
G 035-026    (2,0)   11.12   0.335        0.007    −0.14    DA
G 128-072    (2,0)   12.53   0.457        0.025     0.21    DA
G 271-106    (2,0)   11.77   0.396        0.014     0.18    DA
GR 363       (2,0)   11.39   0.133        0.003    −0.03    DA

The stars already identified in Liebert et al. (1989) are marked with an asterisk.

3. Summary and conclusions

We have shown that an artificial intelligence algorithm is able to classify the catalog of spectroscopically identified white dwarfs and ultimately detect several potential halo white dwarfs. Some of these white dwarfs were already proposed as halo objects by Liebert et al. (1989). We have found as well that our halo candidates are bright and distant, and that most of them have large tangential velocities. The final answer to whether or not old white dwarfs are significant contributors to the baryonic dark matter will come from surveys which hopefully will be able to identify several high velocity and very cool white dwarfs. Such searches, the best example being the Sloan Digital Sky Survey (Harris et al., 2001), are now underway. Most of these surveys are based on proper motion and color selection criteria, and will provide us with a fairly large amount of observational data which should be studied carefully. Moreover, future astrometric missions, like GAIA, will yield a huge number of white dwarfs (Figueras et al., 1999). Processing all the data in order to identify the halo white dwarf population will, undoubtedly, require the use of complex classification techniques. Among these, neural networks, and in particular the SOM method, seem to provide excellent results and, hence, have a very promising future.

Acknowledgements

Part of this work was supported by the Spanish DGES project PB98-1183-C03-02, the MCYT grants ESP98-1348, AYA2000-1785 and HA2000-0038, and by the CIRIT.

References

Alcock, C. (2000). Science, 287, 74.
Alcock, C., Allsman, R. A., Alves, D., Axelrod, T. S., Becker, A. C., Bennett, D. P., Cook, K. H., Freeman, K. C., Griest, K., Guern, J., Lehner, M. J., Marshall, S. L., Peterson, B. A., Pratt, M. R., Quinn, P. J., Rodgers, A. W., Stubbs, C. W., Sutherland, W., & Welch, D. L. (1997). The Astrophysical Journal, 486, 697.
Bazell, D., & Peng, Y. (1998). The Astrophysical Journal Supplement Series, 116, 47.
Figueras, F., García-Berro, E., Torra, J., Jordi, C., Luri, X., Torres, S., & Chen, B. (1999). Balt. Astron., 8, 291.
Flynn, C., Gould, A., & Bahcall, J. N. (1996). The Astrophysical Journal, 466, L55.
Fontaine, G., Brassard, P., & Bergeron, P. (2001). Publications of the Astronomical Society of the Pacific, 113, 409.
García-Berro, E., Torres, S., Isern, J., & Burkert, A. (1999). Monthly Notices of the Royal Astronomical Society, 302, 173.
Hambly, N. C., Smartt, S. J., & Hodgkin, S. T. (1997). The Astrophysical Journal, 489, L157.
Hambly, N. C., Smartt, S. J., Hodgkin, S. T., Jameson, R. F., Kemp, S. N., Rolleston, W. R. J., & Steele, I. A. (1999). Monthly Notices of the Royal Astronomical Society, 309, L33.
Harris, H. C., Hansen, B. M. S., Liebert, J., Vanden Berk, D. E., Anderson, S. F., Knapp, G. R., Fan, X., Margon, B., Munn, J. A., Nichol, R. C., Pier, J. R., Schneider, D. P., Smith, J. A., Winget, D. E., York, D. G., Anderson, J. E., Jr., Brinkmann, J., Burles, S., Chen, B., Connolly, A. J., Csabai, I., Frieman, J. A., Gunn, J. E., Hennessy, G. S., Hindsley, R. B., Ivezic, Z., Kent, S., Lamb, D. Q., Lupton, R. H., Newberg, H. J., Schlegel, D. J., Smee, S., Strauss, M. A., Thakar, A. R., Uomoto, A., & Yanny, B. (2001). The Astrophysical Journal, 549, L109.
Hernández-Pajares, M., & Floris, M. (1994). Monthly Notices of the Royal Astronomical Society, 268, 444.
Hodgkin, S. T., Oppenheimer, B. R., Hambly, N. C., Jameson, R. F., Smartt, S. J., & Steele, I. A. (2000). Nature, 403, 57.
Ibata, R., Irwin, M., Bienaymé, O., Scholz, R., & Guibert, J. (2000). The Astrophysical Journal, 532, L41.
Isern, J., García-Berro, E., Hernanz, M., Mochkovitch, R., & Torres, S. (1998). The Astrophysical Journal, 503, 239.
Kohonen, T. (1990). Proceedings of the IEEE, 78(9), 1464.
Kohonen, T. (1997). Self-organizing maps (Vol. 30). Springer series in information sciences. Berlin: Springer.
Liebert, J., Dahn, C. C., & Monet, D. G. (1988). The Astrophysical Journal, 332, 891.
Liebert, J., Dahn, C. C., & Monet, D. G. (1989). In G. Wegner (Ed.), White dwarfs (p. 15). Berlin: Springer.
Marković, D., & Sommer-Larsen, J. (1997). Monthly Notices of the Royal Astronomical Society, 288, 733.
McCook, G. P., & Sion, E. M. (1999). The Astrophysical Journal Supplement Series, 194, 520.
Méndez, R. A., Minnitti, D., De Marchi, G., Baker, A., & Couch, W. J. (1996). Monthly Notices of the Royal Astronomical Society, 283, 666.
Mochkovitch, R., García-Berro, E., Hernanz, M., Isern, J., & Panis, J. F. (1990). Astronomy & Astrophysics, 233, 456.
Naim, A., Lahav, O., Sodre, L., & Storrie-Lombardi, M. C. (1995). Monthly Notices of the Royal Astronomical Society, 275, 567.
Oppenheimer, B. R., Hambly, N. C., Digby, A. P., Hodgkin, S. T., & Saumon, D. (2001). Science, 292, 698.
Salaris, M., García-Berro, E., Hernanz, M., Isern, J., & Saumon, D. (2000). The Astrophysical Journal, 544, 1036.
Salpeter, E. E. (1961). The Astrophysical Journal, 134, 669.
Serra-Ricart, M., Aparicio, A., Garrido, L., & Gaitan, V. (1996). The Astrophysical Journal, 462, 221.
Tamanaha, C. M., Silk, J., Wood, M. A., & Winget, D. E. (1990). The Astrophysical Journal, 358, 164.