Artificial neural network for the determination of Hubble Space Telescope aberration from stellar images

Todd K. Barrett and David G. Sandler

An artificial-neural-network method, first developed for the measurement and control of atmospheric phase distortion using stellar images, was used to estimate the optical aberration of the Hubble Space Telescope. A total of 26 estimates of distortion was obtained from 23 stellar images acquired at several secondary-mirror axial positions. The results were expressed as coefficients of eight orthogonal Zernike polynomials: focus through third-order spherical. For all modes other than spherical the measured aberration was small. The average spherical aberration of the estimates was -0.299 µm rms, which is in good agreement with predictions obtained when iterative phase-retrieval algorithms were used.

Key words: Artificial neural network, Hubble Space Telescope, phase retrieval.

1. Introduction

Recently investigators in several fields have found new applications for artificial neural networks. The nonlinear nature of these networks along with their inherent flexibility and adaptability makes them good candidates for problems that are not easily solved with more conventional algorithms. During the past two years an artificial neural network has been developed to estimate the distortion of an optical wave front directly from the intensity information in focused images. Specifically this method uses two point spread functions (PSF's) obtained near the best focus of a telescope to determine the total phase aberration of the system.
The majority of work has been directed toward the development of a neural network that is suitable for real-time compensation of atmospheric distortion in large astronomical telescopes, both for monolithic telescopes¹,² and for array telescopes.³,⁴ Recently, the first results using the neural network for real-time control of an array telescope have been reported for the Multiple Mirror Telescope.⁵ We report on a neural network designed to recover the static aberration in the Hubble Space Telescope (HST) primary mirror. The neural-network method was useful to NASA's Jet Propulsion Laboratory for its Hubble Aberration Recovery Project (HARP) because it provided independent confirmation of the HST aberration by using a method that is quite different from the iterative phase-retrieval methods used by other investigators involved in the project.⁶,⁷

The authors are with Thermo Electron Technologies, 9550 Distribution Avenue, San Diego, California 92121. Received 15 April 1992. 0003-6935/93/101720-08$05.00/0. © 1993 Optical Society of America. Applied Optics, Vol. 32, No. 10, 1 April 1993, p. 1720.

Iterative methods⁸⁻¹⁰ have the advantage of devoting detailed numerical effort to any particular image, and in principle the resolution in either image or Fourier space is unlimited. Sometimes, however, iterative methods lead to pathological results when they are applied to data from real optical systems because of their inability to model camera noise, jitter, and other sources of image distortion correctly. The neural network is complementary in that it spends little time considering a single image. Instead the network concentrates on learning a general rule or mapping for a given set of physical parameters, and therefore the performance of the network tends to depend less on the details of a particular image and more on the repeatable comprehensive features of the data.
In addition, because the neural network is trained to learn general input-output mappings, one could use networks to predict some of the gross optical system parameters of the HST by using PSF data. Examples of parameters that might be estimated include secondary-mirror defocus, component misalignment, and jitter. Also, because of the speed at which the neural-network algorithm can be executed, the method could be valuable in the future as a diagnostic tool for the HST by providing the capability for on-line beam-quality monitoring from stellar images before science runs.

2. Neural Networks for the Recovery of Phase

As their experience with imaging systems increases, optical scientists develop the ability to recognize the presence of certain optical aberrations by the characteristic shapes and features that the phase distortion induces in the PSF. This process is analogous to that used in the neural-network method, in which a neural network is trained to correlate features in PSF data with aberrations in the pupil plane. The ability to learn a mapping or correlation from features in observed data to a quantitative description of the data is a property of neural networks that has been exploited for many applications and suggests the use of this method to determine aberrations across the pupil. The human ability is qualitative at best and often fails when the phase spectrum contains many aberrations that diffract collectively to form the image. The neural network, however, can be trained to recognize complicated aberrations even in the presence of high-spatial-frequency distortion that scintillates the image.

The single-image phase-retrieval problem is nonlinear.
Assuming uniform illumination of the aperture and ignoring all secondary effects occurring because of diffraction between optical elements of the system, we may express the angular PSF at the focal plane of a telescope as

$$ I(\boldsymbol{\theta}) \propto \left| \int w(\mathbf{r}) \exp[i\phi(\mathbf{r})] \exp\!\left(\frac{2\pi i}{\lambda}\,\boldsymbol{\theta}\cdot\mathbf{r}\right) d\mathbf{r} \right|^{2}, \tag{1} $$

where φ(r) is the phase distortion of the system projected onto the entrance pupil, r is a position vector in the plane of the pupil, λ is the wavelength, w(r) is the pupil function, and the integral is done over all r for which the pupil function is nonzero. The exponential in Eq. (1) accounts for the nonlinearity of phase recovery. A neural network is an input-output information processor composed of parallel layers of elements or neurons, loosely modeled on biological neurons, which possess local memory and are capable of elementary arithmetic.¹¹,¹² It is the nonlinearity of the individual neurons that allows the neural network to learn nonlinear input-output mappings, and in this respect the neural network is a good choice for the inherently nonlinear phase-retrieval problem.

Notice from Eq. (1) that if an aperture that has an even pupil function [w(r) = w(-r)] is uniformly illuminated by a point source, the PSF intensity is the same for the two phase distortions φ(r) and φ'(r) if φ'(r) = -φ(-r). This ambiguity in the mapping from image data to phase distortion can lead to difficulties for the neural-network method, which uses PSF data from a single stellar point source as input. Strictly speaking few telescopes have pupil functions that are exactly even, and therefore in theory the ambiguity is no problem. Support structures for secondary mirrors and other optics usually break the symmetry of the aperture. This is true for the HST as it is for most ground-based telescopes. However, these structures are usually small and are designed so that diffraction will minimize their effects on the PSF of the telescope.
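The sign ambiguity can be checked numerically with a discrete version of Eq. (1). The sketch below is only an illustration under stated assumptions: the 16-sample grid, the pupil radius, the particular mix of astigmatism-like and coma-like terms, and the helper names (`grid`, `dft2`, `psf`, `twin`) are all hypothetical choices, and a naive discrete Fourier transform stands in for the propagation integral. It compares the PSF of a phase φ(r) with that of φ'(r) = -φ(-r) for an unobstructed (even) circular pupil, and then repeats the comparison after adding the same even defocus term to both phases.

```python
import cmath
import math

N = 16    # grid size, kept tiny so that a naive DFT stays fast (assumption)
R = 5.0   # pupil radius in grid samples (assumption)

def coord(i):
    """Signed coordinate of DFT index i, so that -r corresponds to (-i) % N."""
    return i if i < N // 2 else i - N

def grid(fn):
    """Sample fn(x, y) on the N x N grid, with x and y in units of R."""
    return [[fn(coord(ix) / R, coord(iy) / R) for ix in range(N)] for iy in range(N)]

def dft2(field):
    """Naive 2-D discrete Fourier transform: rows first, then columns."""
    tw = [[cmath.exp(-2j * math.pi * u * x / N) for x in range(N)] for u in range(N)]
    rows = [[sum(row[x] * tw[u][x] for x in range(N)) for u in range(N)] for row in field]
    return [[sum(rows[y][u] * tw[v][y] for y in range(N)) for u in range(N)] for v in range(N)]

def psf(pupil, phase):
    """Discrete analogue of Eq. (1): |FT{w(r) exp[i phi(r)]}|^2."""
    field = [[pupil[y][x] * cmath.exp(1j * phase[y][x]) for x in range(N)] for y in range(N)]
    return [[abs(c) ** 2 for c in row] for row in dft2(field)]

def twin(phase):
    """The ambiguous partner phi'(r) = -phi(-r)."""
    return [[-phase[(-y) % N][(-x) % N] for x in range(N)] for y in range(N)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def max_gap(a, b):
    """Largest absolute pixel difference between two images."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

pupil = grid(lambda x, y: 1.0 if x * x + y * y <= 1.0 else 0.0)   # even pupil
phi = grid(lambda x, y: 0.8 * (x * x - y * y) + 0.3 * x * y + 0.5 * x * (x * x + y * y))
defocus = grid(lambda x, y: 1.5 * (x * x + y * y))                # even, known offset
phi_twin = twin(phi)

# Same image plane: the two PSFs agree to rounding error.
same_plane_gap = max_gap(psf(pupil, phi), psf(pupil, phi_twin))
# Same known defocus added to both: the two PSFs now differ.
defocused_gap = max_gap(psf(pupil, add(phi, defocus)), psf(pupil, add(phi_twin, defocus)))
print(same_plane_gap, defocused_gap)
```

In this toy setting a single in-focus image cannot separate the two phases, whereas an image taken with a known defocus can, which is consistent with the use of defocused image planes described in the text.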
Therefore reliance on features induced by a noneven pupil function to determine the correct solution can lead to pathological behavior by the network when data obtained near the best focus of the telescope are used, since the network must be trained to distinguish between two solutions that are quite different but that correspond to nearly identical input data.

In our work with astronomical telescopes we have used a more practical method of resolving the near ambiguity described above. We have trained the network with the data from two simultaneous images obtained at different defocused image planes on opposite sides of the best focus of the telescope. The added information in the second image plane leads to a unique solution for the phase, which permits, for example, the sign of the defocus to be determined.¹³,¹⁴

The neural-network method of phase determination has several advantages for certain applications. Real-time operation (at rates in excess of 1 kHz) is possible because the computational architecture of the network is relatively simple and may be implemented in parallel. As mentioned above it is not iterative and does not dither parameters; rather the network learns an appropriate intensity-to-phase mapping. If the method is used as an on-line monitor of beam quality, the optics required for the neural network may be simpler than for most other wave-front-measuring devices such as Hartmann sensors, since the focal-plane camera needed to acquire the input data for the network may be located near the entrance pupil of the telescope, which eliminates the need for a separate coudé path. It is also well suited for imaging applications since it operates directly on the quantity of primary interest, namely, the PSF of the system. Finally, the approach is adaptable, since the network itself may be retrained or adapted to changing conditions.

The neural network may be trained to estimate phase aberration in terms of a variety of representations.
For example, we have generated networks that determine average phase and wave-front slopes over small subapertures of the entrance pupil, or alternatively we have trained networks that determine the phase aberration with respect to orthogonal functions defined over the entrance pupil. These representations may be suitable for use in a real-time atmospheric compensation system where a deformable mirror is used to control the wave front over subapertures, but for the HST phase-determination problem (where inspection of the data suggests mainly low-frequency distortion) the representation in terms of the orthogonal basis functions is simple since most low-order optical aberrations may be described with only a few of these orthogonal functions. Therefore we chose to train our neural network to estimate phase distortion in terms of a finite number of the well-known Zernike polynomials.¹⁵ In this configuration each output node of the network produced a coefficient Z_i so that the phase aberration of the HST projected onto the entrance pupil could be accurately approximated by

$$ \phi(\mathbf{r}) = \sum_{i=1}^{N} Z_i P_i(\mathbf{r}), \tag{2} $$

where the P_i(r) is a radial Zernike polynomial orthogonal over the entrance pupil of the HST.¹⁶

The neural network does not require a high resolution to recover low-spatial-frequency distortion. For example, we have shown¹ that we can recover 18 Zernike modes from turbulence-degraded PSF data that are 20 times the diffraction limit by using pixels with 3 times the diffraction-limited angular resolution. This reduced pixel resolution may be achieved optically or by electronically binning adjacent pixels of a more highly resolved image. The use of pixels that are larger than the diffraction limit increases the signal-to-noise ratio of stellar data. It also reduces the number of inputs to the neural network, allowing the size and computational complexity of the network to be reduced. This shortens the training time required by the network and increases the throughput of the system.

3. Neural-Network Architecture

Several different neural-network architectures exist and are differentiated by the manner in which the processing nodes are interconnected, the arithmetic performed by each node, and the method of training the network. The phase-recovery network used in the characterization of the HST aberration employed three layers of neurons called perceptrons. Its architecture is shown schematically in Fig. 1. The first layer takes the pixelized intensity (camera data) from the two focal planes and passes these values to the second layer. Each kth input node is connected to the jth node in the second or hidden layer by a connection weight W_jk, which multiplies the input intensity I_k. Similarly, the hidden nodes are connected to output nodes by a second set of weights V_ij.

Fig. 1. Schematic representation of the two-input-plane neural-network architecture used for phase retrieval of the HST aberration. [Diagram labels: output neurons; connection weights V_ij; hidden layer of neurons; connection weights W; input neurons; data from first focal-plane image; data from second focal-plane image.]

Each neuron in the hidden layer performs the sum

$$ s_j = \theta_j + \sum_k W_{jk} I_k \tag{3} $$

and passes the result through a nonlinear sigmoid function:

$$ f(u) = \frac{1}{1 + \exp(-u)}. \tag{4} $$

The output of each hidden node or neuron, o_j = f(s_j), is then multiplied by V_ij and passed to the ith output node. Finally the output nodes compute the sum

$$ Z_i = \sum_j V_{ij} o_j, \tag{5} $$

where Z_i is the output of the ith output node and represents the network's estimate of the distortion corresponding to the Zernike mode associated with that node.
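The forward pass of Eqs. (3)-(5) is compact enough to write out in full. The sketch below is a toy illustration, not the authors' implementation: the layer sizes (6 inputs, 4 hidden nodes, 2 outputs) are far smaller than the networks in the paper, the random initialization, the learning rates, and the function names are assumptions, and the additive term in Eq. (3) is treated as a bias that is held fixed. The `train_step` function applies one steepest-descent update of the standard backpropagation form for a squared error, which is the kind of training the text goes on to describe.

```python
import math
import random

def sigmoid(u):
    """Eq. (4): f(u) = 1 / (1 + exp(-u))."""
    return 1.0 / (1.0 + math.exp(-u))

def init_network(n_in, n_hidden, n_out, seed=0):
    """Small random weights; this initialization scheme is an assumption."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    theta = [0.0] * n_hidden   # additive term of Eq. (3), kept fixed here
    V = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
    return W, theta, V

def forward(I, W, theta, V):
    """Eqs. (3)-(5): hidden sums s_j, hidden outputs o_j, linear outputs Z_i."""
    s = [theta[j] + sum(w * x for w, x in zip(W[j], I)) for j in range(len(W))]
    o = [sigmoid(sj) for sj in s]
    Z = [sum(v * oj for v, oj in zip(V[i], o)) for i in range(len(V))]
    return o, Z

def train_step(I, target, W, theta, V, alpha=0.05, alpha_prime=0.05):
    """One steepest-descent update for one pattern; returns the squared
    error measured before the weights were changed."""
    o, Z = forward(I, W, theta, V)
    delta_out = [t - z for t, z in zip(target, Z)]           # target minus output
    delta_hid = [o[j] * (1.0 - o[j]) *
                 sum(delta_out[i] * V[i][j] for i in range(len(V)))
                 for j in range(len(W))]                     # uses V before its update
    for i in range(len(V)):
        for j in range(len(W)):
            V[i][j] += alpha * delta_out[i] * o[j]
    for j in range(len(W)):
        for k in range(len(I)):
            W[j][k] += alpha_prime * delta_hid[j] * I[k]
    return sum(d * d for d in delta_out)

# Toy usage: one normalized "image" of six pixels and two target coefficients.
W, theta, V = init_network(n_in=6, n_hidden=4, n_out=2)
pattern = [1.0, 0.6, 0.2, 0.9, 0.4, 0.1]
target = [0.05, -0.30]
errors = [train_step(pattern, target, W, theta, V) for _ in range(200)]
print(errors[0], errors[-1])
```

Repeating the update on a single pattern only shows that the error falls; a usable network would be trained on many independent patterns so that it learns a general mapping rather than memorizing one example.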
We trained the network by adjusting the connection weights W_jk and V_ij using the externally supervised backpropagation algorithm described by Rumelhart et al.¹⁷ This technique consists of a fixed-step-size steepest-descent gradient-search method that uses training data to minimize a function representing the error between the network's outputs and the correct solution. First, the training data are generated from many realizations of randomly chosen aberrations that fall within a range loosely corresponding with the conditions anticipated for the actual application of the network. Then these training data are repeatedly presented to the network, and the connection weights are adjusted to minimize the squared error between the network's outputs and the known solutions. The training data consist of a set of individual input images and the Zernike coefficients describing the aberration used to generate the images (i.e., the known solution). In principle the training data can be generated either numerically or optically, but for most applications, including the HST, it is more practical to use the former method.

In the backpropagation algorithm the network weights are adjusted after each training iteration. An iteration consists of passing a single training image through the network and computing the gradient of the error function with respect to the connection weights given the present training image. The error function utilized in this study was simply the squared error between the network's outputs and the known Zernike coefficients:

$$ E = \sum_i (\hat{Z}_i - Z_i)^2, \tag{6} $$

where E is the squared error, and $\hat{Z}_i$ are the Zernike coefficients corresponding to the aberration used to create the training data. Using Eqs. (3)-(5), we may derive the following training relations:

$$ W_{jk}(t+1) = W_{jk}(t) + \alpha' \delta_j I_k, \tag{7} $$

$$ V_{ij}(t+1) = V_{ij}(t) + \alpha\, \delta_i\, o_j(t), \tag{8} $$

where (t) and (t + 1) indicate the training iteration number, α and α' are learning rates that are prescribed by the user, and

$$ \delta_j = o_j(t)[1 - o_j(t)] \sum_i \delta_i V_{ij}, \tag{9} $$

$$ \delta_i = \hat{Z}_i - Z_i. \tag{10} $$

The learning rates determine how many training iterations are required for convergence of the network and how accurately the minimum of the error function is determined. In general small learning rates take longer to converge but produce a network with a smaller residual error.¹⁸ Experience with neural networks trained to recover phase aberration created by the turbulent atmosphere has shown that a neural network will typically converge to a small residual error after a few hundred thousand training iterations. However, for the present application where an extremely high degree of accuracy was desired, approximately 10⁶ training iterations were used. Since the generation of training data requires the computation of a two-dimensional FFT and is therefore quite time-consuming, the generation of 10⁶ sets of training data is prohibitive. Instead a smaller number of training patterns are generated and passed through the network several times. There is a limit, however, to the minimum size of the training set. Care must be taken to include enough independent realizations of distortion to ensure that the network learns a general functional input-output mapping for the phase-retrieval problem as opposed to memorizing the mapping for only a few patterns. By testing the neural network with data not in the original training set, we have found that 4000-6000 individual input patterns are all that are required to ensure that no memorization occurs and that a general mapping is learned. For this size training set the process of generating training data and training the network could be accomplished in a 24-h period on a PC-compatible 386 computer with a digital signal processing board containing a single Texas Instruments TMS320C30 digital signal processing chip.

4. Results for the HST

The major purpose of HARP was to produce estimates of the HST primary aberration with as much accuracy and confidence as possible.
To achieve this goal we tested the neural network on simulated stellar images produced by the Space Telescope Science Institute (STScI), and we applied it to as many HST stellar images as possible. Testing the neural network against the STScI simulations permitted the accuracy and applicability of the method to be evaluated with data for which the exact answer was known. Obtaining as many estimates of the HST aberration as possible helped raise the confidence of the results and defined over what range of operating conditions any correction of the HST aberration would be effective. Another assessment of the accuracy and consistency of the neural-network results was obtained by testing each network on 500 simulated images that we generated ourselves and by computing the rms residual error for each Zernike mode. These test images were similar to the data used to train the network and consisted of pixelized intensity data generated from aberrations with random mixtures of distortions, but they were not actually included in the training data.

All the neural networks applied to the HST images were trained on simulated stellar images computed with a simple one-step propagation code based on Eq. (1). These images were generated on a 128 × 128 grid with a resolution that is equal to the resolution of the HST images.

All HST images applied to the neural network were obtained with a single camera, called the planetary camera 6 (PC6), which is part of the wide field/planetary camera (WF/PC). This instrument consists of several cameras that are appropriate for different scientific applications. The WF/PC contains several narrow-band filters, which may be placed before the optical detectors. All stellar images used in this work were narrow band and were obtained through one of these filters. The resolution of PC6 is 0.043 arcsec, and its optical detector consists of an 800 × 800-pixel CCD array.
We used a simple pupil mask, which was equal to the diameter of the HST aperture and which modeled the primary-mirror mounting pads and the secondary spiders as obscurations in the pupil plane. Diffraction from optics other than the primary mirror was ignored, as were the effects of telescope misalignment and jitter. We generated a diverse set of training data by using a different random mixture of Zernike coefficients for the phase distortion in each image. The average magnitude and the variance of the Zernike aberrations contained in the training set were chosen so that they would include any reasonable value of HST aberration.

We preprocessed both the simulated stellar images and the real HST stellar images before application to the neural network. This preprocessing consisted of choosing a region of interest centered around the centroid of the stellar image, subtracting any background pedestal, binning adjacent pixels to the resolution required by the network, and normalizing the pixelized intensity pattern to its brightest pixel. The exact dimensions of the region of interest and the resolution of the input pixels for the neural network were determined empirically and varied with the wavelength of the filter used and with the position of the HST secondary mirror. In general the region of interest was chosen so that it was just big enough to include the stellar image. For example, the region of interest for images obtained with an 889-nm filter and a secondary position that corresponded to dz = 333 µm was 96 × 96 pixels (or 4.13 × 4.13 arcsec), while for a 487-nm filter and a secondary position of dz = 90 µm, a 48 × 48-pixel (or 2.06 × 2.06-arcsec) region of interest was used, where the quantity dz corresponds to the axial secondary-mirror position relative to an arbitrary origin determined before the HST was put in orbit.
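The four preprocessing steps listed above (region of interest about the centroid, pedestal subtraction, binning, normalization to the brightest pixel) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the clamping of the window to the frame, the use of a single constant pedestal, and the 12 × 12 toy frame are all hypothetical, and the region size and bin factor are left as parameters because the paper reports that they were chosen empirically.

```python
import math

def centroid(image):
    """Intensity-weighted centroid (row, column) of a 2-D list of numbers."""
    total = sum(sum(row) for row in image)
    cy = sum(y * sum(row) for y, row in enumerate(image)) / total
    cx = sum(x * v for row in image for x, v in enumerate(row)) / total
    return cy, cx

def crop_start(center, size, limit):
    """First index of a `size`-wide window centred on `center`, clamped to the frame."""
    start = int(math.floor(center - size / 2.0 + 0.5))
    return max(0, min(start, limit - size))

def preprocess(image, roi, bin_factor, pedestal):
    """Crop about the centroid, subtract the pedestal, average bin_factor x
    bin_factor blocks, and normalize to the brightest binned pixel."""
    cy, cx = centroid(image)
    y0 = crop_start(cy, roi, len(image))
    x0 = crop_start(cx, roi, len(image[0]))
    window = [[image[y][x] - pedestal for x in range(x0, x0 + roi)]
              for y in range(y0, y0 + roi)]
    n = roi // bin_factor
    area = float(bin_factor * bin_factor)
    binned = [[sum(window[by * bin_factor + dy][bx * bin_factor + dx]
                   for dy in range(bin_factor) for dx in range(bin_factor)) / area
               for bx in range(n)]
              for by in range(n)]
    peak = max(max(row) for row in binned)
    return [[v / peak for v in row] for row in binned]

# Toy frame: constant pedestal of 2 counts with a 4 x 4 "star" of 10 counts.
frame = [[2.0] * 12 for _ in range(12)]
for y in range(4, 8):
    for x in range(4, 8):
        frame[y][x] = 10.0

binned = preprocess(frame, roi=8, bin_factor=2, pedestal=2.0)
for row in binned:
    print(row)

# Angular size of a 96-pixel region at the PC6 scale of 0.043 arcsec per pixel.
roi_arcsec = 96 * 0.043
```

The last line reproduces the region-of-interest size quoted in the text: 96 pixels at 0.043 arcsec per pixel is about 4.13 arcsec.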
Three of the simulated stellar images produced by STScI were used to test the neural-network method. The ambiguity of the phase-to-image mapping was overcome by using two 16 × 16 input images. The results are presented in Table 1 along with the Zernike coefficients used by STScI to generate the images. Two estimates of the aberration were computed by using two pairings of the three images. The results labeled A were obtained by input images simulating secondary-mirror positions that are equivalent to moving the HST secondary mirror +0 and -170 µm from the point of best focus. The equivalent secondary-mirror positions for the results labeled B were +0 and +170 µm. Since the neural network does not require highly resolved images, and the STScI simulated images were produced with a resolution that is greater than diffraction limited, adjacent pixels were averaged to obtain square input pixels with dimensions of 0.258 arcsec for the images corresponding to secondary positions of ±170 µm and 0.172 arcsec for the image at the secondary position of 0 µm. Coefficients for a total of eight Zernike modes were estimated (focus through third-order spherical). The network's estimate of focus is not included in the table since it is dependent on the secondary-mirror position and does not represent an actual aberration in the HST. Sixty-four hidden elements were used in this network; the exact number of nodes in the hidden layer does not critically influence performance and is generally determined empirically. The small residual error of the network's estimates on the 500 test images indicates that the weights have converged. Also notice that the agreement between the neural network's estimates and the correct Zernike coefficients is quite good. For all modes except spherical the aberrations are small, and the average value for spherical is within 0.0005 µm rms of the value used in creating the images.
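The closing statement about the spherical term can be checked directly from the Table 1 entries. The short sketch below is only a worked arithmetic check; the tuple layout and variable names are hypothetical, and the numbers are the Table 1 values as read from the source (mode, estimate A, estimate B, value used by STScI, all in µm rms).

```python
# (mode, estimate A, estimate B, value used by STScI), in µm rms
table1 = [
    (5, 0.003, 0.008, 0.001),
    (6, -0.014, 0.024, 0.007),
    (7, 0.030, 0.008, -0.012),
    (8, -0.024, -0.003, -0.004),
    (9, 0.007, 0.011, 0.0),
    (10, -0.006, 0.013, 0.0),
    (11, -0.242, -0.231, -0.236),
]

# Spherical aberration is mode 11, the last row.
_, sph_a, sph_b, sph_true = table1[-1]
mean_spherical = (sph_a + sph_b) / 2.0
spherical_error = abs(mean_spherical - sph_true)

# Largest single-mode deviation from the correct value for each pairing.
worst_a = max(abs(est_a - truth) for _, est_a, _, truth in table1)
worst_b = max(abs(est_b - truth) for _, _, est_b, truth in table1)

print(mean_spherical, spherical_error, worst_a, worst_b)
```

The mean of the two spherical estimates is -0.2365 µm rms, which differs from the STScI value of -0.236 µm rms by 0.0005 µm rms, as stated in the text.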
Table 1. Neural-Network Results for Two Pairs (A and B) of STScI Simulated Images(a)

Zernike coefficient (µm rms distortion across aperture)

Mode    A         B         Rms Error on Test Patterns    Correct Values
5        0.003     0.008    0.01                            0.001
6       -0.014     0.024    0.01                            0.007
7        0.030     0.008    0.01                           -0.012
8       -0.024    -0.003    0.02                           -0.004
9        0.007     0.011    0.008                           0.0
10      -0.006     0.013    0.01                            0.0
11      -0.242    -0.231    0.01                           -0.236

(a) Mode 5 is 45° astigmatism, 6 is 0° astigmatism, 7 is x coma, 8 is y coma, 9 is x clover, 10 is y clover, and 11 is spherical.

After testing the neural-network method on the STScI simulated images, we applied it to real stellar images acquired by the WF/PC. Initially we resolved the ambiguity of the phase-to-intensity mapping by using two input planes as described in Section 2. The neural-network parameters were similar to those used to estimate the phase of the STScI simulations (i.e., two 16 × 16 input images, 64 hidden elements, and eight Zernike coefficients). Six PC6 images were used to produce three independent evaluations of the HST primary-mirror aberration. As did the networks used on the STScI images, the networks trained for use on the real WF/PC images performed quite well on the 500 test images produced with the single-step propagation.

However, during the course of HARP some questions were raised concerning the repeatability and accuracy of the HST secondary-mirror axial position, and this can potentially impact the neural network's accuracy. An error in the secondary-mirror position degrades the accuracy of the network, since we trained the network assuming that the two input images are located at planes that are displaced a fixed and known amount from the best focus. This rather stringent requirement may be relaxed by training the network to recognize focal aberrations.
In this case the position of the secondary relative to the best focus of the system need be determined only within a constant offset, because the network will now measure any constant offset in the secondary-mirror position as an aberration corresponding to the focus Zernike polynomial. But even in this configuration we must still train the neural network assuming an accurate knowledge of the difference in secondary-mirror positions between the two required input images. Therefore the distance moved by the secondary between the acquisition of the two neural-network images must be repeatable and precise. From discussions with other investigators in HARP we felt that, while this may not be a problem, it would be better to develop a method of phase recovery that did not rely on exact knowledge of the secondary position.

As mentioned in Section 1 the neural-network method was originally conceived for use in a real-time adaptive optics system designed to compensate for the distortion caused by atmospheric turbulence. For this application the aberration at the pupil has a rapidly varying random shape with a spatial power spectrum determined by the atmospheric seeing conditions. Also the peak-to-peak distortion of the wave front may be several micrometers. This means that the neural network used for a ground-based adaptive optics system must be able to estimate phase aberrations that are large when compared with the HST aberration and that may be much different from one realization to the next. In these conditions the near-ambiguity inherent in the phase-image relationship is a problem, and the neural-network method requires two planes to estimate the phase correctly. For the determination of the HST aberration the phase distortion is relatively small compared with that which can be expected from the turbulent atmosphere, and it is also static in time. In this case the ambiguity may be resolved in another manner. It is possible to use the original two-input-plane network to obtain an initial estimate of the phase distortion that is sufficiently accurate to define a range of Zernike coefficients that is large enough to include all reasonable values of the HST distortion but within which no ambiguity is encountered. New networks can then be trained that use only one image plane as input and yet converge without ambiguity because the training data are constrained by the initial estimate of the two-input-plane network.

Figure 2 and Table 2 illustrate the performance of a typical single-input-plane neural network. Figure 2 shows the input data for the network, a simulated image corresponding to the network's estimate of phase distortion, and a simulated interferogram of the predicted distortion. Table 2 lists the Zernike coefficients for seven of the modes computed by the network and the rms residual error on the 500 test patterns. Preprocessing was essentially identical to that described above for the two-input-plane network. All single-input-plane results were obtained with a 16 × 16 square array of intensity pixels, 32 hidden nodes, and eight output nodes. As with the two-input-plane method adjacent pixels were averaged to obtain the resolution appropriate for the network. The input image in Fig. 2 has 0.344-arcsec pixels that correspond to binning 8 × 8 PC6 pixels.

Fig. 2. Example of a single-input-plane network applied to an HST stellar image: dz = +333 µm, taken with a narrow-band filter with a center wavelength of λ = 889 nm. Top: simulated interferogram of the predicted HST aberration. Middle: simulated stellar image calculated with a one-step diffraction propagation algorithm by using the aberration predicted by the neural network. Bottom: HST image binned for input into the neural network.

Fig. 3. Summary of the neural-network estimates of HST spherical aberration as a function of secondary-mirror spacing dz: □, single-input-plane estimates; △, two-input-plane estimates. [Plot: spherical aberration from -0.4 to -0.2 versus dz from -200 to 500 µm, with the average of the single-plane estimates marked.]

Table 2. Example of Neural-Network Results on an HST Stellar Image When a Single Input Plane Is Used

Mode    Coefficient (µm rms distortion across aperture)    Rms Error on 500 Test Patterns
5        0.013                                               0.008
6       -0.052                                               0.008
7       -0.035                                               0.012
8        0.017                                               0.011
9       -0.047                                               0.007
10       0.036                                               0.007
11      -0.289                                               0.008

Table 3. Neural-Network Predictions of HST Spherical Aberration

Spherical Aberration    dz      Rms Error on     λ       Conic
(µm rms distortion      (µm)    Test Patterns    (nm)    Constant
across aperture)

Single-Input-Plane Results
-0.300                  -90     0.005            631     -1.0155
-0.295                  -90     0.005            631     -1.0153
-0.309                  -90     0.005            631     -1.0159
-0.307                  -90     0.005            631     -1.0158
-0.284                  -90     0.005            631     -1.0148
-0.307                  -90     0.005            631     -1.0158
-0.313                  -90     0.005            631     -1.0161
-0.313                  -90     0.005            631     -1.0161
-0.296                  -90     0.004            889     -1.0153
-0.291                  -90     0.004            889     -1.0151
-0.299                  333     0.01             889     -1.0155
-0.289                  333     0.01             889     -1.0150
-0.281                  333     0.01             889     -1.0146
-0.273                  333     0.01             889     -1.0142
-0.287                  333     0.01             889     -1.0149
-0.306                   10     0.006            487     -1.0158
-0.306                  -10     0.006            487     -1.0158
-0.327                   -5     0.006            487     -1.0168
-0.332                    0     0.006            487     -1.0170
-0.328                    5     0.006            487     -1.0168
-0.272                  250     0.02             487     -1.0142
-0.281                  250     0.02             889     -1.0146
-0.281                  250     0.02             889     -1.0146

Two-Input-Plane Results
-0.259                  -       0.02             487     -1.0136
-0.283                  -       0.01             889     -1.0147
-0.297                  -       0.01             889     -1.0151

A total of 26 estimates of the HST aberration was obtained. Three were made by application of a two-input-plane network to pairs of HST images acquired at different secondary-mirror spacings. Twenty-three estimates were made with single-input-plane neural networks. In Fig.
3 and Table 3 the estimates of the spherical aberration are summarized for all two-input and single-input results. We obtained the results by imaging through narrow-band optical filters with center wavelengths indicated in the table. In all cases the Zernike coefficients for modes other than spherical aberration were small and were nearly zero mean when averaged over all 26 estimates. As mentioned above the two-input-plane results are viewed with less confidence because of the possible inaccuracies arising from errors in HST secondary-mirror placement. The average spherical aberration predicted by the two-input-plane network was a -0.279-µm-rms wave-front error, and the average spherical aberration predicted by the single-input-plane networks was -0.299 µm rms. The secondary-mirror position corresponding to zero Zernike defocus may be determined from the average focal aberration predicted by the single-input-plane neural network and was found to be dz = 110 µm. This value is similar to that calculated from the results generated by the other phase-retrieval methods used in HARP.⁶

The neural network was used to obtain estimates of spherical aberration at some secondary-mirror positions (dz = -10, -5, 0, 5, 10) not examined by other participants in HARP. These positions were nearer the best focus of the HST, and therefore the stellar images acquired were smaller and contained fewer features. This may impact the accuracy of the estimates since the neural-network method relies on image features to recover phase. Notice that the results for these points were at the edge of the predictions of spherical aberrations. Dropping them from the calculations of average spherical coefficient yields a value of -0.291 µm rms or a conic constant of -1.0151, which is closer to the values obtained by other groups. However, the networks trained for these defocus points did no worse on the test images than did those networks trained to operate on more defocused images. This suggests that the larger spherical aberration predicted at these points may be attributable to the fidelity of the training data, which may deteriorate near the focus of the system (because of off-axis or diffraction effects), or to a real dependence of HST spherical aberration on the secondary-mirror position.

5. Conclusions

The neural-network method of recovering phase distortions provided estimates of the HST primary-mirror aberration that were well converged and self-consistent, in the sense that the network predicted phase within only a small residual error when it was tested on trial data not in the training set. The network also performed well on the STScI simulated images for which an exact answer was known; it estimated the spherical aberration to within 0.0005 µm rms. On real HST stellar images the single-input-plane neural network predicted a spherical aberration of -0.299 µm rms, which corresponds to a conic constant of -1.01546 ± 0.0008; when the two-input-plane results are included a conic constant of -1.01534 ± 0.0008 was predicted. This agrees well with the spherical aberration predicted by other investigators. For PC6 the average spherical aberration computed by all authors⁷ was -0.2855 µm rms or a conic constant of -1.0148 ± 0.0005. Table 3 provides a list of all the neural-network estimates. As can be seen the results were quite consistent, which leads to great confidence in both the neural-network estimates and the average estimates of HARP as a whole.

However, the data suggest some comments. The network trained and tested quite well on our simulated images, but the accuracy of the neural network and other phase-retrieval methods cannot be better than the simulations used to produce the results. This is evident from the fact that each individual network trained to an expected accuracy of approximately ±0.01 µm rms, but the scatter in the neural-network estimates was larger than the 2σ that might be expected. One source of this large variance in the measurements may have been the simple one-step numerical propagation used to create the training data for the neural network, which does not model multiple-surface diffraction. Other factors that limited the ability to simulate the HST exactly were off-axis effects caused by small misalignments between the HST and WF/PC optics and telescope jitter. The small residual error that resulted when the network was applied to the simulated trial data indicates that the neural-network method is capable of providing great accuracy and that improving the fidelity of the training data should improve the consistency of the estimated distortion. If the optical system cannot be modeled with the desired precision, it may be possible to train the neural network to ignore effects that cannot be predicted accurately. This might be accomplished by using training data in which the parameters that cannot be well characterized are varied throughout a range that includes all reasonable values. The training algorithm then tries to develop weights that will pick out the quantities of interest (in this case the spherical aberration) regardless of the values of the other parameters.

This work was supported by the Jet Propulsion Laboratory, California Institute of Technology, Pasadena, Calif., and NASA. The authors thank Robert R. Korechoff and Art H. Vaughan of the Jet Propulsion Laboratory for inviting us to be members of the HARP panel and for many helpful suggestions. We acknowledge many interesting discussions with other members of the HARP team. We also thank Robert Q. Fugate of the Air Force Phillips Laboratory, Albuquerque, N.M., who funded our study and collaborated with us on the use of the neural network to control turbulent aberrations in an astronomical telescope.

References

1. D. G. Sandler, T. K. Barrett, D. A. Palmer, R. Q. Fugate, and W. J. Wild, "Use of a neural network to control an adaptive optics system for an astronomical telescope," Nature (London) 351, 300-302 (1991).
2. D. G. Sandler, T. K. Barrett, and R. Q. Fugate, "Recovery of atmospheric phase distortion from stellar images using an artificial neural network," in Active and Adaptive Optical Components, M. A. Ealey, ed., Proc. Soc. Photo-Opt. Instrum. Eng. 1543, 491-499 (1992).
3. J. R. P. Angel, P. Wizinowich, M. Lloyd-Hart, and D. Sandler, "Adaptive optics for array telescopes using neural-network techniques," Nature (London) 348, 221-224 (1990).
4. P. L. Wizinowich, M. Lloyd-Hart, B. A. McLeod, D. Colucci, R. G. Dekany, D. Wittmann, J. R. Angel, D. W. McCarthy Jr., W. G. Hulburd, and D. G. Sandler, "Neural network adaptive optics for the multiple-mirror telescope," in Active and Adaptive Optical Components, M. A. Ealey, ed., Proc. Soc. Photo-Opt. Instrum. Eng. 1542, 148-158 (1992).
5. M. Lloyd-Hart, P. Wizinowich, B. McLeod, D. Wittman, D. Colucci, R. Dekany, D. McCarthy, J. R. P. Angel, and D. Sandler, "First results of an on-line adaptive optics system, with atmospheric wavefront sensing by an artificial neural network," Astrophys. J. Lett. 390, L41-L44 (1992).
6. S. Brewer, S. Ebstein, J. Fienup, R. Gonsalves, M. Levine, M. Litvak, R. Lyon, P. Miller, C. Roddier, F. Roddier, and M. Shao, Notes from HARP Workshops, 15-16 November (1990) and 5-6 February (1991); unpublished, obtainable from the authors of this paper.
7. R. Korechoff, "Image inversion analysis," in Wide-Field and Planetary Camera: Special Review Panel, document JPL D-8181 (Jet Propulsion Laboratory, Pasadena, Calif., March 1991).
8. J. R. Fienup, T. R. Crimmins, and W. Holsztynski, "Reconstruction of the support of an object from the support of its autocorrelation," J. Opt. Soc. Am. 72, 610-624 (1982).
9. J. R. Fienup, "Phase retrieval algorithms: a comparison," Appl. Opt. 21, 2758-2769 (1982).
10. J. R.
Fienup, "Reconstruction of a complex-valuedobject from the modulus of its Fourier transform using a support constraint," J. Opt. Soc. Am. A 4, 118-123 (1987). 11. D. 0. Hebb, The Organization of Behavior (Wiley, New York, 1949). 12. F. Rosenblatt, Principles of Neurodynamics (Spartan, Washington, D.C., 1962). 13. R. A. Gonsalves, "Phase retrieval and diversity in adaptive optics," Opt. Eng. 21, 829-832 (1982). 14. R. G. Paxman and J. R. Fienup, "Optical misalignment sensing and image reconstruction using phase diversity," J. Opt. Soc. Am. 5,914-923 (1988). 15. M. Born and E. Wolf, Principles of Optics (Pergamon, New York, 1970), pp. 464-466. 16. C. Burrows, Hubble Space Telescope Optical TelescopeAssembly Handbook (Space Telescope Science Institute, Baltimore, Md., 1990), Version 1.0, pp. 38-40. 17. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Parallel Distributed Processing: Explorations in the Microstructure of Cognition (MIT, Cambridge, Mass., 1986), Vol. 1, pp. 318-362. 18. P. D. Wasserman, Neural Computing: Theory and Practice (Van Nostrand Reinhold, New York, 1989), pp. 43-60. 1 April 1993 / Vol. 32, No. 10 / APPLIED OPTICS 1727
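
Illustrative note on the training-data suggestion in the Conclusions. The Conclusions propose training on data in which poorly characterized parameters are varied over all reasonable values, so that the network learns to extract the spherical aberration regardless of them. The following is a minimal sketch of how such a training set might be generated. It is not the procedure used for the HST networks. It assumes a one-step Fourier propagation from an annular pupil, standard circular Zernike terms (not annular ones), a Gaussian blur as a stand-in for jitter, and purely hypothetical values for the wavelength, obscuration ratio, grid size, and parameter ranges. The function names `pupil_grid`, `psf`, `jitter_blur`, and `make_training_set` are introduced here for illustration only.

```python
import numpy as np


def pupil_grid(n=64, obscuration=0.33):
    """Polar coordinates and annular mask; the unit-radius pupil spans half the grid."""
    coords = np.linspace(-2.0, 2.0, n)  # zero padding around the pupil
    x, y = np.meshgrid(coords, coords)
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    mask = (rho <= 1.0) & (rho >= obscuration)  # assumed obscuration ratio
    return rho, theta, mask


def psf(defocus_um, coma_um, spherical_um, wavelength_um, rho, theta, mask):
    """One-step propagation: PSF = |FFT(pupil * exp(i * phase))|^2, normalized to unit sum."""
    wavefront = (
        defocus_um * np.sqrt(3.0) * (2 * rho**2 - 1)
        + coma_um * np.sqrt(8.0) * (3 * rho**3 - 2 * rho) * np.cos(theta)
        + spherical_um * np.sqrt(5.0) * (6 * rho**4 - 6 * rho**2 + 1)
    )
    field = mask * np.exp(2j * np.pi * wavefront / wavelength_um)
    image = np.abs(np.fft.fftshift(np.fft.fft2(field))) ** 2
    return image / image.sum()


def jitter_blur(image, sigma_px):
    """Gaussian blur applied in the Fourier domain (a crude jitter model)."""
    f = np.fft.fftfreq(image.shape[0])
    fx, fy = np.meshgrid(f, f)
    transfer = np.exp(-2.0 * np.pi**2 * sigma_px**2 * (fx**2 + fy**2))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * transfer))


def make_training_set(n_samples, rng, wavelength_um=0.5, n=64):
    """Images paired with the spherical coefficient; nuisance parameters are randomized."""
    rho, theta, mask = pupil_grid(n)
    images = np.empty((n_samples, n, n))
    targets = np.empty(n_samples)
    for k in range(n_samples):
        spherical = rng.uniform(-0.4, -0.2)  # quantity of interest (um rms), assumed range
        defocus = rng.uniform(-0.5, 0.5)     # nuisance: focus uncertainty, assumed range
        coma = rng.uniform(-0.05, 0.05)      # nuisance: off-axis effects, assumed range
        sigma = rng.uniform(0.0, 1.0)        # nuisance: jitter in pixels, assumed range
        clean = psf(defocus, coma, spherical, wavelength_um, rho, theta, mask)
        images[k] = jitter_blur(clean, sigma)
        targets[k] = spherical
    return images, targets
```

Calling `make_training_set(1000, np.random.default_rng(0))` would return an array of simulated images and the matching spherical coefficients. Only the spherical coefficient is used as the training target, so a network fitted to these pairs is pushed toward weights that are insensitive to the randomized defocus, coma, and jitter values.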