Artificial neural network for the determination of Hubble Space

Artificial neural network for the determination of
Hubble Space Telescope aberration from stellar
images
Todd K. Barrett and David G. Sandier
An artificial-neural-network method, first developed for the measurement and control of atmospheric
phase distortion, using stellar images, was used to estimate the optical aberration of the Hubble Space
Telescope. A total of 26 estimates of distortion was obtained from 23 stellar images acquired at several
secondary-mirror axial positions. The results were expressed as coefficientsof eight orthogonal Zernike
polynomials: focus through third-order spherical. For all modes other than spherical the measured
aberration was small. The average spherical aberration of the estimates was -0.299 pumrms, which is in
good agreement with predictions obtained when iterative phase-retrieval algorithms were used.
Key words: Artifical neural network, Hubble Space Telescope,phase retrieval.
1.
Introduction
Recently investigators in several fields have found
new applications for artificial neural networks. The
nonlinear nature of these networks along with their
inherent flexibility and adaptability makes them good
candidates for problems that are not easily solved
with more conventional algorithms. During the past
two years an artificial neural network has been
developed to estimate the distortion of an optical
wave front directly from the intensity information in
focused images. Specifically this method uses two
point spread functions (PSF 's) obtained near the best
focus of a telescope to determine the total phase
aberration of the system. The majority of work has
been directed toward the development of a neural
network that is suitable for real-time compensation
of atmospheric distortion in large astronomical telescopes, both for monolithic telescopes1' 2 and for array
telescopes.3 4 Recently, the first results using the
neural network for real-time control of an array
telescope have been reported for the Multiple Mirror
Telescope.5 We report on a neural network designed
to recover the static aberration in the Hubble Space
Telescope (HST) primary mirror.
The neural-network method was usful to NASA's
Jet Propulsion Laboratory for its Hubble Aberration
The authors are with Thermo Electron Technologies, 9550
Distribution Avenue, San Diego, California, 92121.
Received 15 April 1992.
0003-6935/93/101720-08$05.00/0.
i 1993 Optical Society of America.
1720
APPLIED OPTICS / Vol. 32, No. 10 / 1 April 1993
RecoveryProject (HARP)because it provided independent confirmation of the HST aberration by using a
method that is quite different from the iterative
phase-retrieval methods used by other investigators
involved in the project.6 7 Iterative methods81 0 have
the advantage of devoting detailed numerical effort to
any particular image, and in principle the resolution
in either image or Fourier space is unlimited.
Sometimes, however, iterative methods lead to pathological results when they are applied to data from real
optical systems because of their inability to model
camera noise, jitter, and other sources of image
distortion correctly. The neural network is complementary in that it spends little time considering a
single image. Instead the network concentrates on
learning a general rule or mapping for a given set of
physical parameters, and therefore the performance
of the network tends to depend less on the details of a
particular image and more on the repeatable comprehensive features of the data. In addition, because
the neural network is trained to learn general inputoutput mappings, one could use networks to predict
some of the gross optical system parameters of the
HST by using PSF data. Examples of parameters
that might be estimated include secondary-mirror
defocus, component misalignment, and jitter. Also,
because of the speed at which the neural-network
algorithm can be executed, the method could be
valuable in the future as a diagnostic tool for the HST
by providing the capability for on-line beam-quality
monitoring from stellar images before science runs.
2.
Neural Networks for the Recovery of Phase
As their experience with imaging systems increases,
optical scientists develop the ability to recognize the
presence of certain optical aberrations by the characteristic shapes and features that the phase distortion
induces in the PSF. This process is analogous to
that used in the neural-network method-in which a
neural network is trained to correlate features in PSF
data with aberrations in the pupil plane. The ability
to learn a mapping or correlation from features in
observed data to a quantitative description of the data
is a property of neural networks that has been
exploited for many applications and suggests the use
of this method to determine aberrations across the
pupil. The human ability is qualitative at best and
often fails when the phase spectrum contains many
aberrations that diffract collectivelyto form the image.
The neural network, however, can be trained to
recognize complicated aberrations even in the presence of high-spatial-frequency distortion that scintillates the image.
The single-image phase-retrieval problem is nonlinear. Assuming uniform illumination of the aperture
and ignoring all secondary effects occurring because
of diffraction between optical elements of the system,
we may express the angular PSF at the focal plane of
a telescope as
I(0) xC|w(r)exp[i4)(r)]exp(
Ar 0 rdrS
(1)
where (r) is the phase distortion of the system
projected onto the entrance pupil, r is a position
vector in the plane of the pupil, Xis the wavelength,
w(r) is the pupil function, and the integral is done
over all r for which the pupil function is nonzero.
The exponential in Eq. (1)accounts for the nonlinearity of phase recovery. A neural network is an inputoutput information processor composed of parallel
layers of elements or neurons, loosely modeled on
biological neurons, which possess local memory and
are capable of elementary arithmetic."' 2 It is the
nonlinearity of the individual neurons that allows the
neural network to learn nonlinear input-output mappings, and in this respect the neural network is a good
choice for the inherently nonlinear phase-retrieval
problem.
Notice from Eq. (1) that if an aperture that has an
even pupil function [w(r) = w(-r)] is uniformly illumi-
nated by a point source, the PSF intensity is the same
for the two phase distortions +(r) and +'(r) if ¢'(r) =
This ambiguity in the mapping from image
-4(-r).
data to phase distortion can lead to difficulties for the
neural-network method, which uses PSF data from a
single stellar point source as input. Strictly speaking few Author queried telescopes have pupil functions that are exactly even, and therefore in theory
the ambiguity is no problem. Support structures for
secondary mirrors and other optics usually break the
symmetry of the aperture.
This is true for the HST as it is for most ground-
based telescopes. However, these structures are usually small and are designed so that diffraction will
minimize their effects on the PSF of the telescope.
Therefore reliance on features induced by a noneven
pupil function to determine the correct solution can
lead to pathological behavior by the network when
data obtained near the best focus of the telescope are
used, since the network must be trained to distinguish between two solutions that are quite different
but that correspond to nearly identical input data.
In our work with astronomical telescopes we have
used a more practical method of resolving the near
ambiguity described above. We have trained the
network with the data from two simultaneous images
obtained at different defocused image planes on opposite sides of the best focus of the telescope. The
added information in the second image plane leads to
a unique solution for the phase, which permits, for
example, the sign of the defocus to be determined.13"14
The neural-network method of phase determination has several advantages for certain applications.
Real-time operation (at rates in excess of 1 kHz) is
possible because the computational architecture of
the network is relatively simple and may be implemented in parallel. As mentioned above it is not
iterative and does not dither parameters; rather the
network learns an appropriate intensity-to-phase mapping. If the method is used as an on-line monitor of
beam quality, the optics required for the neural
network may be simpler than for most other wavefront measuring devices such as Hartmann sensors,
since the focal-plane camera needed to acquire the
input data for the network may be located near the
entrance pupil of the telescope, which eliminates the
need for a separate coud6 path. It is also well suited
for imaging applications since it operates directly on
the quantity of primary interest, namely, the PSF of
the system. Finally, the approach is adaptable, since
the network itself may be retrained or adapted to
changing conditions.
The neural network may be trained to estimate
phase aberration in terms of a variety of representations. For example, we have generated networks
that determine average phase and wave-front slopes
over small subapertures of the entrance pupil, or
alternatively we have trained networks that determine the phase aberration with respect to orthogonal
functions defined over the entrance pupil. These
representations may be suitable for use in a real-time
atmospheric compensation system where a deformable mirror is used to control the wave front over
subapertures, but for the HST phase-determination
problem (where inspection of the data suggests mainly
low-frequency distortion) the representation in terms
of the orthogonal basis functions is simple since most
low-order optical aberrations may be described with
only a few of these orthogonal functions. Therefore
we chose to train our neural network to estimate
phase distortion in terms of a finite number of the
well-known Zernike polynomials.15 In this configuration each output node of the network produced a
1 April 1993 / Vol. 32, No. 10 / APPLIED OPTICS
1721
coefficient Zi so that the phase aberration of the HST
projected onto the entrance pupil could be accurately
approximated by
Each neuron in the hidden layer performs the sum
S =
+
E
WjkIk
k
N
+k(r)=
ZiPi(r),
(2)
where the Pi(r) is a radial Zernike polynomial orthogonal over the entrance pupil of the HST.16
The neural network does not require a high resolution to recover low-spatial-frequency distortion. For
example, we have shown' that we can recover 18
Zernike modes from turbulence-degraded PSF data
that are 20 times the diffraction limit by using
pixels with 3 times the diffraction-limited angular
resolution. This reduced pixel resolution may be
achieved optically or by electronically binning adjacent pixels of a more highly resolved image. The use
of pixels that are larger than the diffraction limit
increases the signal-to-noise ratio of stellar data.
It also reduces the number of inputs to the neural
network, allowing the size and computational complexity of the network to be reduced. This shortens the
training time required by the network and increases
the throughput of the system.
3.
Neural-Network Architecture
Several different neural-network architectures exist
and are differentiated by the manner in which the
processing nodes are interconnected, the arithmetic
is performed by each node, and the method of training
the network. The phase-recovery network used in
the characterization of the HST aberration employed
three layers of neurons called perceptrons. Its architecture is shown schematically in Fig. 1. The first
layer takes the pixelized intensity (camera data) from
the two focal planes and passes these values to the
second layer. Each of the k th input nodes is connected to thejth node in the second or hidden layer by
a connection weight Wk, which multiplies the input
intensity Ik. Similarly, the hidden nodes are connected to output nodes by a second set of weights Vij.
OUTPUT
NEURONS
CONNECTION
WEIGHTS
VI)
HIDDEN
LAYER
OFNEURONS
CONNECTION
WEIGHTS
W
I
00
. . . .
DATAFROM
FIRSTFOCAL
PLANEIMAGE
Ib0
. . . .- |INPUT
NEURONS
DATA FROM
SECOND FOCAL
PLANE IMAGE
Fig. 1. Schematic representation of the two-input-plane neuralnetwork architecture used for phaso rotrieval of the HST aborration.
1722
APPLIED OPTICS / Vol. 32, No. 10 / 1 April 1993
(3)
and passes the result through a nonlinear sigmoid
function:
1
f
I)
exp(-u)(4
(4)
The output of each hidden node or neuron, oj = f (sj),
is then multiplied by Vijand passed to the ith output
node. Finally the output nodes compute the sum
zi =
I Vyoj,
i
(5)
where Zi is the output of the ith output node and
represents the network's estimate of the distortion
corresponding to the Zernike mode associated with
that node.
We trained the network by adjusting the connection weights Wk and Vij using the externally supervised backpropagation algorithm described by Rumelhart et al.17 This technique consists of a fixed-stepsize steepest-descent gradient-search method that
uses training data to minimize a function represent-
ing the error between the network's outputs and the
correct solution. First, the training data are generated from many realizations of randomly chosen
aberrations that fall within a range looselycorresponding with the conditions anticipated for the actual
application of the network. Then these training
data are repeatedly presented to the network, and the
connection weights are adjusted to minimize the
squared error betweenthe network's outputs and the
known solutions. The training data consist of a set
of individual input images and the Zernike coefficients describing the aberration used to generate the
images (i.e., the known solution). In principle the
training data can be geneated either numerically or
optically, but for most applications, including the
HST, it is more practical to use the former method.
In the backpropagation algorithm the network
weights are adjusted after each training iteration.
An iteration consists of passing a single training
image through the network and computing the gradient of the error function with respect to the connection weights given the present training image. The
error function utilized in this study was simply the
squared error between the network's outputs and the
known Zernike coefficients:
E=
( - Z)2
(6)
where E is the squared error, and Zi are the Zernike
coefficients corresponding to the aberration used to
create the training data. Using Eqs. (3)-(5), we may
derive the followingtraining relations:
Wjk(t + 1) = Wjk(t) + U'8j Ik,
(7)
V. (t + 1) = Vjj(t) + aioj(t),
(8)
where (t) and (t + 1) indicate the training iteration
number, a and u.' are learning rates that are prescribed by the user, and
ai
= 0i(t)[ - o(t)]
8i=
-Zi.
bivii,
(9)
(10)
The learning rates determine how many training
iterations are required for convergence of the network and how accurately the minimum of the error
function is determined. In general small learning
rates take longer to converge but produce a network
with a smaller residual error.' 8
Experience with neural networks trained to recover phase aberration created by the turbulent
atmosphere has shown that a neural network will
typically converge to a small residual error after a few
hundred thousand training iterations. However, for
the present application where an extremely high
degree of accuracy was desired, approximately 106
training iterations were used. Since the generation
of training data requires the computation of a twodimensional FFT and is therefore quite time-consuming, the generation of 106 sets of training data is
prohibitive. Instead a smaller number of training
patterns are generated and passed through the network several times. There is a limit, however, to the
minimum size of the training set. Care must be
taken to include enough independent realizations of
distortion to insure that the network learns a general
functional input-output mapping for the phaseretrieval problem as opposed to memorizing the
mapping for only a few patterns. By testing the
neural network with data not in the original training
set, we have found that 4000-6000 individual input
patterns are all that are required to insure that no
memorization occurs and that a general mapping is
learned. For this size training set the process of
generating training data and training the network
could be accomplished in a 24-h period on a PC
compatible 386 computer with a digital signal processing board containing a single Texas Instruments
TMSC302OC30digital signal processing chip.
4.
Results for the HST
The major purpose of HARP was to produce estimates of the HST primary aberration with as much
accuracy and confidence as possible. To achieve this
goal we tested the neural network on simulated
stellar images produced by the Space Telescope Science Institute (STScI), and we applied it to as many
HST stellar images as possible. Testing the neural
network against the STScI simulations permitted the
accuracy and applicability of the method to be evaluated with data for which the exact answer was known.
Obtaining as many estimates of the HST aberration
as possible helped raise the confidence of the results
and defined over what range of operating conditions
any correction of the HST aberration would be effective. Another assessment of the accuracy and consistency of the neural-network results was obtained by
tesing each network on 500 simulated images that we
generated ourselves and by computing the rms residual error for each Zernike mode. These test images
were similar to the data used to train the network and
consisted of pixelized intensity data generated from
aberrations with random mixtures of distortions, but
they were not actually included in the training data.
All the neural networks applied to the HST images
were trained on simulated stellar images computed
with a simple one-step propagation code based on Eq.
(1). These images were generated on a 128 x 128
grid with a resolution that is equal to the resolution of
the HST images. All HST images applied to the
neural network were obtained with a single cameracalled the planetary camera 6 (PC6)-which is part of
the wide field/planetary camera (14F/PC). This instrument consists of several cameras that are appropriate for different scientific applications. The
WF/PC contains several narrow-band filters, which
may be placed before the optical detectors. All stellar images used in this work were narrow band and
were obtained through one of these filters. The
resolution of PC6 is 0.043 arcsec, and its optical
detector consists of an 800 x 800-pixel CCD array.
We used a simple pupil mask, which was equal to the
diameter of the HST aperture and which modeled the
primary-mirror mounting pads and the secondary
spiders as obscurations in the pupil plane. Diffraction from optics other than the primary mirror was
ignored, as were the effects of telescope misalignment
and jitter. We generated a diverse set of training
data by using a different random mixture of Zernike
coefficients for the phase distortion in each image.
The average magnitude and the variance of the
Zernike aberrations contained in the training set
were chosen so that they would include any reasonable value of HST aberration.
We preprocessed both the simulated stellar images
and the real HST stellar images before application to
the neural network. This preprocessing consisted of
choosing a region of interest centered around the
centroid of the stellar image, subtracting off any
background pedestal, binning adjacent pixels to the
resolution required by the network, and normalizing
the pixelized intensity pattern to its brightest pixel.
The exact dimensions of the region of interest and the
resolution of the input pixels for the neural network
were determined empirically and varied with the
wavelength of the filter used and with the position of
the HST secondary mirror. In general the region of
interest was chosen so that it was just big enough to
include the stellar image. For example, the region of
interest for images obtained with an 889-nm filter
and a secondary position that corresponded to dz =
333 plm was 96 x 96 pixels (or 4.13 x 4.13 arcsec),
while for a 487-nm filter and a secondary position of
dz = 90 pm, a 48 x 48-pixel (or 2.06 x 2.06-arcsec)
region of interest was used, where the quantity dz
1 April 1993 / Vol. 32, No. 10 / APPLIED OPTICS
1723
corresponds to the axial secondary-mirror position
relative to an arbitrary origin determined before the
HST was put in orbit.
Three of the simulated stellar images produced by
STScI were used to test the neural-network method.
The ambiguity of the phase-to-image mapping was
overcome by using two 16 x 16 input images. The
results are presented in Table 1 along with the
Zernike coefficients used by STScI to generate the
images. Two estimates of the aberration were computed by using two pairings of the three images.
The results labeled A were obtained by input images
simulating secondary-mirror positions that are equivalent to moving the HST secondary mirror +0 and
-170 l.im from the point of best focus. The equivalent secondary-mirror positions for the results labeled B were +0 and +170 pim. Since the neural
network does not require highly resolved images, and
the STScI simulated images were produced with a
resolution that is greater than diffraction limited,
adjacent pixels were averaged to obtain square input
pixels with dimensions of 0.258 arcsec for the images
corresponding to secondary positions of ± 170 pimand
0.172 arcsec for the image at the secondary position of
0 [im.
Coefficients for a total of eight Zernike modes
were estimated (focus through third order spherical).
The network's estimate of focus is not included in the
table since it is dependent on the secondary-mirror
position and does not represent an actual aberration
in the HST. Sixty-four hidden elements were used
in this network; the exact number of nodes in the
hidden layer does not critically influence performance
and is generally determined empirically. The small
residual error of the network's estimates on the 500
test images indicate that the weights have converged.
Also notice that the agreement between the neural
network's estimates and the correct Zernikes coefficient are quite good. For all modes except spherical
the aberrations are small, and the average value for
spherical is within 0.0005 jim rms of the value used in
creating the images.
After testing the neural network method on the
STScI simulated images, we applied it to real stellar
images acquired by the WF/PC. Initially we reTable1. Neural-Network
Resultsfor TwoPairs-A and B-of STScl
SimulatedImagesa
Zernike Coefficient
(pumrms Distortion
Across Aperture)
Mode
Rms Error
on Test
Correct Values
([lm rms
Distortion
A
B
Patterns
Across Aperture)
5
6
7
0.003
-0.014
0.030
0.008
0.024
0.008
0.01
0.01
0.01
0.001
0.007
-0.012
8
-0.024
-0.003
0.02
-0.004
9
10
0.007
-0.006
0.011
0.013
0.008
0.01
0.0
0.0
11
-0.242
-0.231
0.01
-0.236
aMode 5 is 45o astigmatism, 6 is 00 astigmatism, 7 isx coma, 8 isy
coma, 9 isx clover, 10 isy clover, and 11 is spherical.
1724
APPLIED OPTICS / Vol. 32, No. 10 / 1 April 1993
solved the ambiguity of the phase-to-intensity maping by using two input planes as described in Section
2. The neural-network parameters were similar to
those used to estimate the phase of the STScI simulations (i.e., two 16 x 16 input images, 64 hidden
elements, and eight Zernike coefficients). Six PC6
images were used to produce three independent evaluations of the HST primary mirror aberration.
As did the networks used on the STScI images, the
networks trained for use on the real WE/PC images
performed quite well on the 500 test images produced
with the single-step propagation. However, during
the course of HARP some questions were raised
concerning the repeatability and accuracy of the HST
secondary-mirror axial position, and this can potentially impact the neural network's accuracy. An
error in the secondary-mirror position degrades the
accuracy of the network, since we trained the network assuming that the two input images are located
at planes that are displaced a fixed and known
amount from the best focus. This rather stringent
requirement may be relaxed by training the network
to recognize focal aberrations. In this case the position of the secondary relative to the best focus of the
system need be determined only within a constant
offset, because the network will now measure any
constant offset in the secondary-mirror position as an
aberration corresponding to the focus Zernike polynomial. But even in this configuration we must still
train the neural network assuming an accurate knowledge of the difference in secondary-mirror positions
between the two required input images. Therefore
the distance moved by the secondary between the
acquisition of the two neural-network images must be
repeatable and precise. From discussions with other
investigators in HARP we felt that, while this may
not be a problem, it would be better to develop a
method of phase recovery that did not rely on exact
knowledge of the secondary position.
As mentioned in Section 1 the neural-network
method was originally conceived for use in a real-time
adaptive optics system designed to compensate for
the distortion caused by atmospheric turbulence.
For this application the aberration at the pupil has a
rapidly varying random shape with a spatial power
spectrum determined by the atmospheric seeing conditions. Also the peak-to-peak distortion of the wave
front may be several micrometers. This means that
the neural network used for a ground-based adaptive
optics system must be able to estimate the phase
aberrations that are large when compared with the
HST aberration and which may be much different
from one realization to the next. In these conditions
the near-ambiguity inherent in the phase-image relationship is a problem, and the neural-network method
requires two planes to estimate the phase correctly.
For the determination of the HST aberration the
phase distortion is relatively small compared with
that which can be expected from the turbulent atmosphere and it is also static in time. In this case the
ambiguity may be resolved in another manner. It is
-0.2
E
A
z
a
J
LO
0r
8
A
-0.3 -
U
0a.)
co
AVERAGEOF
SINGLE-PLANE
ESTIMATES
-0.4
-200
.
I
-100
.
.
.
0
.
100
.
.
200
.
I
300
.
.
.
400
500
dz Wn
Fig. 3. Summary of the neural-network estimates of HST spherical aberration as a function of secondary-mirror spacing dz: El,
single-input-plane estimates; A, two-input-plane estimates.
Fig. 2. Example of a single-input-plane network applied to a HST
stellar image: dz = +333, taken with a narrow-band filter with a
center frequency of X = 889 nm.
Top:
of the predicted HST aberration.
Middle: simulated stellar im-
simulated interferogram
age calculated with a one-step diffraction propagation algorithm by
using the aberration predicted by the neural network. Bottom:
HST image binned for input into the neural network.
possible to use the original two-input-plane network
to obtain an initial estimate of the phase distortion
that is sufficiently accurate to define a range of
Zernike coefficients that are large enough to include
all reasonable values of the HST distortion but within
which no ambiguity is encountered. New networks
can then be trained that use only one image plane as
input and yet converge without ambiguity because
the training data are constrained by the initial estimate of the two-input-plane network.
Figure 2 and Table 2 illustrate the performance of a
typical single-input-plane neural network. Figure 2
shows the input data for the network, a simulated
image corresponding to the network's estimate of
phase distortion, and a simulated interferogram of
the predicted distortion. Table 2 lists the Zernike
coefficients for seven of the modes computed by the
network and the rms residual error on the 500
training patterns. Preprocessing was essentially
identical to that described above for the two-inputTable2. Exampleof Neural-Network
Resultsonan HSTStellarImage
Whena SingleInputPlaneis Used
Coefficient
Mode
(pm rms distortion
across aperture)
Rms Error on
500 Test Patterns
5
6
7
8
9
10
11
0.013
-0.052
-0.035
0.017
-0.047
0.036
-0.289
0.008
0.008
0.012
0.011
0.007
0.007
0.008
plane network. All single-input-plane results were
obtained with a 16 x 16 square array of intensity
pixels, 32 hidden nodes, and eight output nodes. As
with the two-input-plane method adjacent pixels were
averaged to obtain the resolution appropriate for the
network. The input image in Fig. 2 has 0.344-arcsec
pixels that correspond to binning 8 x 8 PC6 pixels.
Table3. Neural-Network
Predictions
of HSTSphericalAberration
Spherical Aberration
(pumrms distortion
across aperture)
dz
Rms Error on
(pum) Test Patterns
Single-Input-Plane Results
-0.300
-90
-0.295
-90
-0.309
-90
X
Conic
(nm) Constant
0.005
0.005
0.005
631
631
631
-1.0155
-1.0153
-1.0159
-0.307
-0.284
-90
-90
0.005
0.005
631
631
-1.0158
-1.0148
-0.307
-0.313
-0.313
-90
-90
-90
0.005
0.005
0.005
631
631
631
-1.0158
-1.0161
-1.0161
-0.296
-90
0.004
889
-1.0153
-0.291
-90
0.004
889
-1.0151
-0.299
-0.289
-0.281
-0.273
-0.287
-0.306
-0.306
333
333
333
333
333
10
-10
0.01
0.01
0.01
0.01
0.01
0.006
0.006
889
889
889
889
889
487
487
-1.0155
-1.0150
-1.0146
-1.0142
-1.0149
-1.0158
-1.0158
-0.327
-5
0.006
487
-1.0168
-0.332
-0.328
-0.272
-0.281
-0.281
0
5
250
250
250
0.006
0.006
0.02
0.02
0.02
487
487
487
889
889
-1.0170
-1.0168
-1.0142
-1.0146
-1.0146
0.02
0.01
0.01
487
889
889
-1.0136
-1.0147
-1.0151
Two Input-Plane Results
-0.259
-0.283
-0.297
-
1 April 1993 / Vol. 32, No. 10 / APPLIED OPTICS
1725
A total of 26 estimates of the HST aberration was
obtained. Three were made by application of a twoinput-plane network to pairs of HST images acquired
at different secondary-mirror spacings. Twentythree estimates were made with single-input-plane
neural networks. In Fig. 3 and Table 3 the estimates
of the spherical aberration are summarized for all
two-input and single-input results. We obtained the
results by imaging through narrow-band optical filters with center frequencies indicated in the table.
In all cases the Zernike coefficients for modes other
than spherical aberration were small and were nearly
zero mean when averaged over all 26 estimates. As
mentioned above the two-input-plane results are
viewed with less confidence because of the possible
inaccuracies arising from errors in HST secondarymirror placement. The average spherical aberration
predicted by the two-input-plane network was a
-0.279-jim-rms wave-front error, and the average
spherical aberration predicted by the single-inputplane networks was - 0.299,jm rms. The secondarymirror position corresponding to zero Zernike defocus may be determined from the average focal
aberration predicted by the single-input-plane neural
network and was found to be dz = 110 jim. This
value is similar to that calculated from the results
generated by the other phase-retrieval methods used
approximately +0.01 jim rms, but the scatter in the
neural-network estimates was larger than the 2or
that might be expected. One source of this large
variance in the measurements may have been the
simple one-step numerical propagation used to create
the training data for the neural network, which does
not model multiple-surface diffraction. Other factors that limited the ability to simulate the HST
exactly were off-axis effects caused by small misalignments between the HST and WF/PC optics and
telescope jitter. The small residual error that resulted when the network was applied to the simulated
in HARP. 6
The neural network was used to obtain estimates of
spherical aberration at some secondary mirror positions (dz = -10, -5, 0, 5, 10) not examined by other
participants in HARP. These positions were nearer
the best focus of the HST, and therefore the stellar
images acquired were smaller and contained fewer
features. This may impact the accuracy of the estimates since the neural-network method relies on
image features to recover phase. Notice that the
results for these points were at the edge of the
predictions of spherical aberrations. Dropping them
from the calculations of average spherical coefficient
yields a value of -0.291 m rms or a conic constant of
-1.0151, which is closer to the values obtained by
other groups. However, the networks trained for
these defocus points did no worse on the test images
than did those networks trained to operate on more
defocused images. This suggests that the larger
spherical aberration predicted at these points may be
5.
Conclusions
The neural-network method of recovering phase distortions provided estimates of the HST primarymirror aberration that were well converged and selfconsistent, in the sense that the network predicted
phase within only a small residual error when it was
tested on trial data not in the training set. The
network also performed well on the STScI simulated
images for which an exact answer was known; it
estimated the spherical aberration to within 0.0005
jim rms. On real HST stellar images the singleinput-plane neural network predicted a spherical
aberration of -0.299 m rms, which corresponds to a
conic constant of -1.01546
0.0008; when the
two-input-plane results are included a conic constant
of -1.01534 ± 0.0008 was predicted. This agrees
well with the spherical aberration predicted by other
investigators. For PC6 the average spherical aberration computed by all authors 7 was -0.2855 jim rms
or a conic constant of -1.0148 ± 0.0005. Table 3
provides a list of all the neural-network estimates.
As can be seen the results were quite consistent,
which leads to great confidence in both the neuralnetwork estimates and the average estimates of HARP
as a whole. However, the data suggest some comments.
The network trained and tested quite well on our
simulated images, but the accuracy of the neural
network and other phase-retrieval methods cannot be
better than the simulations used to produce the
results. This is evident from the fact that each
individual network trained to an expected accuracy of
1726
APPLIED OPTICS
/
Vol. 32, No. 10
/
1 April 1993
trial data indicatesthat the neural-network method
is capable of providing great accuracy and that improving the fidelity of the training data should improve
the consistency of the estimated distortion. If the
optical system cannot be modeled with the desired
precision, it may be possible to train the neural
network to ignore effects that cannot be predicted
accurately. This might be accomplished by using
training data in which the parameters that cannot be
well characterized are varied throughout a range that
includes all reasonable values. The training algorithm then tries to develop weights that will pick out
the quantities of interest (in this case the spherical
aberration) regardless of the values of the other
parameters.
attributable to the fidelityof the training data that
may deteriorate near the focus of the system (because
of off-axis or diffraction effects) or to a real dependence of HST spherical aberration on the secondarymirror position.
This work was supported by the Jet Propulsion
Laboratory, California Institute of Technology, Pasadena, Calif., and NASA. The authors thank Robert
R. Korechoff and Art H. Vaughan of the Jet Propulsion Laboratory for inviting us to be members of the
HARP panel and for many helpful suggestions. We
acknowledge many interesting discussions with other
members of the HARP team. We also thank Robert
Q. Fugate of the Air Force Phillips Laboratory,
Albuquerque, N.M., who funded our study and collab-
orated with us on the use of the neural network to
control turbulent aberrations in an astronomical
telescope.
1991).
References
1. D. G. Sandler, T. K. Barrett, D. A. Palmer, R. Q. Fugate, and
W. J. Wild, "Use of a neural network to control an adaptive
optics system for an astronomical telescope," Nature (London)
351, 300-302 (1991).
2. D. G. Sandler, T. K. Barrett, and R. Q. Fugate, "Recovery of
atmospheric phase distortion from stellar images using an
artificial neural network," in Active and Adaptive Optical
Components, M. A. Ealey, ed., Proc. Soc. Photo-Opt. Instrum.
Eng. 1543, 491-499 (1992).
3. J. R. P. Angel, P. Wizinowich, M. Lloyd-Hart, and D. Sandler,
"Adaptive optics for array telescopes using neural-network
techniques," Nature (London)348, 221-224 (1990).
4. P. L. Wizinowich, M. Lloyd-Hart, B. A. McLeod, D. Colucci,
R. G. Dekany, D. Wittmann, J. R. Angel, D. W. McCarthy Jr.,
W. G. Hulburd, and D. G. Sandler, "Neural network adaptive
optics for the multiple-mirror telescope," in Active and Adaptive Optical Components,
7. R. Korechoff, "Image inversion analysis," in Wide-Field and
Planetary Camera: Special Review Panel, document JPL
D-8181 (Jet Propulsion Laboratory, Pasadena, Calif., March
M. A. Ealey, ed., Proc. Soc. Photo-
Opt. Instrum. Eng. 1542, 148-158 (1992).
5. M. Lloyd-Hart, P. Wizinowich, B. McLeod, D. Wittman, D.
Colucci, R. Dekany, D. McCarthy, J. R. P. Angel, and D.
Sandler, "First results of an on-line adaptive optics system,
with atmospheric wavefront sensing by an artificial neural
network," Astrophys. J. Lett. 390, L41-L44 (1992).
6. S. Brewer, S. Ebstein, J. Fienup, R. Gonsalves, M. Levine, M.
Litvak, R. Lyon, P. Miller, C. Roddier, F. Roddier, and M.
Shao, Notes from HARP Workshops, 15-16 November (1990)
and 5-6 February (1991); unpublished, obtainable from the
authors of this paper.
8. J. R. Fienup, T. R. Crimmins, and W.Holsztynski, "Reconstruction of the support of an object from the support of its
autocorrelation,"
J. Opt. Soc. Am. 72, 610-624 (1982).
9. J. R. Fienup, "Phase retrieval algorithms: a comparison,"
Appl. Opt. 21, 2758-2769 (1982).
10. J. R. Fienup, "Reconstruction of a complex-valuedobject from
the modulus of its Fourier transform using a support
constraint," J. Opt. Soc. Am. A 4, 118-123 (1987).
11. D. 0. Hebb, The Organization of Behavior (Wiley, New York,
1949).
12. F. Rosenblatt, Principles of Neurodynamics
(Spartan, Washington, D.C., 1962).
13. R. A. Gonsalves, "Phase retrieval and diversity in adaptive
optics," Opt. Eng. 21, 829-832 (1982).
14. R. G. Paxman and J. R. Fienup, "Optical misalignment
sensing and image reconstruction using phase diversity," J.
Opt. Soc. Am. 5,914-923
(1988).
15. M. Born and E. Wolf, Principles of Optics (Pergamon, New
York, 1970), pp. 464-466.
16. C. Burrows, Hubble Space Telescope Optical TelescopeAssembly Handbook (Space Telescope Science Institute, Baltimore,
Md., 1990), Version 1.0, pp. 38-40.
17. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Parallel
Distributed Processing: Explorations in the Microstructure
of Cognition (MIT, Cambridge,
Mass.,
1986), Vol. 1, pp.
318-362.
18. P. D. Wasserman, Neural Computing: Theory and Practice
(Van Nostrand Reinhold, New York, 1989), pp. 43-60.
1 April 1993 / Vol. 32, No. 10 / APPLIED OPTICS
1727