Meteorite Discovery using Unmanned Aerial Vehicles
and Convolutional Neural Networks
Amar Shah†,1, Christopher Watkins†,2, Robert Citron†,3, Sravanthi Sinha†,4 and Peter Jenniskens†,5
Abstract— Recovering meteorites that fall on Earth’s surface
using observed fireballs is paramount to learning the composition and internal strength of Near Earth Objects. This is in
turn crucial for defending against the airburst risk of small
asteroid impacts. Of some 800 recorded meteorite falls, only
19 have been recovered. Globally, only about 10 meteorite falls
are recovered each year; meteorites are exceedingly difficult
to find as most are small, 1-4 cm in size. Search areas are
frustratingly large, typically 10 x 4 km in size. A typical
human search requires a hundred hours to find a single meteorite.
Commercial unmanned aerial vehicle hardware and image
classification software have both advanced to the point where
autonomous meteorite finding by aerial survey is technically
feasible. In this work, we provide the first attempt towards such
automation. Our key contributions are in the collection of an
image dataset of fresh fallen meteorites on a variety of terrains,
and the development of a deep learning image classification
algorithm. We tested our hardware and software solution in
Creston, California, which experienced a meteorite shower in
October 2015.
I. INTRODUCTION
In June 2013, NASA announced a Grand Challenge to find
all asteroid threats to human populations and to know what
to do about them [1]. The increased fear of, and attention
to, asteroid strikes undoubtedly stemmed from the unforeseen
meteor that struck the Chelyabinsk region of Russia in
February 2013. In this instance, a meteoroid roughly 20 metres
in diameter entered the Earth's atmosphere with kinetic energy
approximately estimated as equivalent to that of 500
kilotons of TNT. The meteor travelled at 12 miles per second,
causing shockwaves strong enough to knock people down
and shatter thousands of windows, injuring 1,600 people and
causing vast damage to infrastructure [2], [3].
Understanding the composition of asteroids is critical to
calculating their potential threat to Earth and to formulating
an effective proactive deflection strategy. Nevertheless, little
is known about the exact composition of the roughly 40
asteroid families and additional sub-families that populate
the asteroid belt and Near-Earth Asteroids (NEAs).
† This work was supported by NASA Frontier Development Lab,
http://www.frontierdevelopmentlab.org/
1 Amar Shah is with Machine Learning Group, Department of Engineering, University of Cambridge, [email protected]
2 Christopher Watkins is with Commonwealth Scientific and Industrial Research Organisation, Scientific Computing, Clayton, Australia,
[email protected]
3 Robert Citron is with Earth & Planetary Science, University of
California, Berkeley, CA, USA, [email protected]
4 Sravanthi Sinha is with Holberton School of Software Engineering, San
Francisco, CA, USA, [email protected]
5 Peter Jenniskens is with The SETI Institute, Carl Sagan Center,
Mountain View, CA, USA, [email protected]
Fig. 1: Fragment of the Chelyabinsk meteorite discovered at
Lake Chebarkul in February 2013.
Apart from prohibitively costly sample acquisition missions in outer space, one of the few methods of determining
asteroid composition is by studying freshly fallen meteorites and linking them to specific asteroid families. This
is achieved by using the fireball trajectory observed as the
meteor enters the Earth’s atmosphere to compute the orbit of
the meteoroid prior to atmospheric entry [4]. The observed
trajectory combined with the lightcurve, deceleration profile,
and location of recovered meteorites on the ground also
divulge useful information about how particular types of
meteoroids disrupt upon atmospheric entry. We may further infer at what
altitude asteroids of this material type would fragment and
transfer kinetic energy to the atmosphere, causing damaging
shockwaves.
Since their inception, fireball camera networks have
recorded approximately 800 trajectories of meteoroids significant enough to leave meteorites scattered on the ground,
of which only 19 have been recovered. The meteorites are
exceptionally hard to find; the meteoroids enter at cosmic
speeds of 12-30 km per second and break into fragments,
most of which are only 1-4 cm in size. The small fragments
are quickly stopped by the atmosphere and fall early, while
the larger fragments keep going and can end up many
kilometres away. Only for large falls does Doppler weather
radar pick up the falling rocks and help direct meteorite
search. In most other cases, when a meteoroid is only a few
kg in mass initially, fragments end up scattering so thinly that
they are impossible to find. The problem has been described
as that of finding a “needle in a haystack” on the front cover
of Nature 458 [5].
In this work we develop what, to the best of our knowledge, is the first attempt to automate the process of finding small meteorites with unmanned aerial vehicles (UAVs)
and image processing software. Our main contributions are
collecting an image dataset of meteorites in various terrains
similar to where they are typically found, and a convolutional
neural network model trained to classify, with very high
accuracy, whether an image patch contains a meteorite. We tested
our algorithm by taking our UAV to Creston, California,
where a meteorite fall occurred in October 2015 [6]. After
obtaining permission to fly our UAV over private land, we
placed meteorites across the field and collected aerial images.
Our algorithm achieved over 99% accuracy in classifying
patches of land as having or not having meteorites in them.
We proceed to describe the UAVs available to us and our
initial approaches to the classification problem without much
training data. Poor results led us to collect a dataset on which
we trained convolutional neural networks with extensive data
augmentation. Finally we discuss our field trip experimental
results and what may be developed further.
II. UNMANNED AERIAL VEHICLE
Unmanned Aerial Vehicles (UAVs) are the next frontier
when it comes to autonomous sensing. With the ability to
survey large areas without human intervention, these devices
offer an accessible platform for automated object detection.
We experimented with two types of UAVs: a bespoke radio controlled drone from UVIONIX and a commercially
available autonomously controlled drone from 3DR, the Solo.
Both drones are quadcopter designs with on board imaging
capability (still photos and video).
Very quickly we learned that automation is crucial to
deploying the drone as a hunting aid. While the radio
controlled model could be flown seamlessly by a professional
pilot at short range, higher wind levels and large distances
between drone and controller made manual flight control
very difficult. We therefore favoured the programmatic approach to flight path planning offered by the 3DR model,
which provided finer control over the survey area under
a range of environmental conditions, with sufficient flexibility
in adapting parameters such as flight speed, height and
photo-taking frequency.
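To make the flight planning concrete, the following is a minimal sketch (not our actual mission scripts) of how waypoints for a back-and-forth "lawnmower" survey sweep might be generated. Coordinates are local metres; converting them to GPS waypoints for the 3DR Solo is omitted, and the parameter values are illustrative.

    # Sketch: waypoints for a lawnmower sweep over a rectangular survey area.
    def lawnmower_waypoints(width_m, length_m, swath_m):
        """Back-and-forth sweep covering width_m x length_m, passes swath_m apart."""
        waypoints, x, direction = [], 0.0, 1
        while x <= width_m:
            y0, y1 = (0.0, length_m) if direction > 0 else (length_m, 0.0)
            waypoints += [(x, y0), (x, y1)]   # fly one full pass along y
            x += swath_m                      # shift over by one image footprint
            direction *= -1                   # reverse direction for the next pass
        return waypoints

    # Example: a 40 m x 40 m survey (~1,600 m^2, roughly one battery's coverage
    # at 3 m/s) with assumed ~3 m-wide image footprints.
    path = lawnmower_waypoints(40.0, 40.0, 3.0)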
Unsurprisingly, a fundamental requirement was the ability
to capture crisp, blur free images from the UAV. The 3DR
Solo was able to overcome this challenge with the addition
of an active gimbal element, which provided smooth and
seamless recording, unaffected by minor fluctuations or
swaying in the UAV’s flight. A GoPro HERO 4 was attached
to the bottom of the 3DR Solo, capturing an image every
second while flying horizontally at a speed of 3 metres per
second. The GoPro was able to take still shots at a resolution
of 3264 × 4928, and we aimed to capture meteorites at a
minimum resolution of 30 × 30 pixels, close to the minimum
required for the image algorithms to detect them.
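A rough back-of-the-envelope calculation, assuming a nominal 94° horizontal field of view for the GoPro (the exact optics depend on capture mode), shows why this implies the low flight heights discussed next:

    # Sketch (not from the paper): maximum flight height for a meteorite to
    # span at least 30x30 pixels. The field of view value is an assumption.
    import math

    image_px = 3264        # pixels across the narrow image dimension
    fov_deg = 94.0         # assumed horizontal field of view in degrees
    meteorite_cm = 3.0     # typical fragment size, 1-4 cm
    min_px = 30            # minimum pixels needed for detection

    px_per_cm = min_px / meteorite_cm              # required ground resolution
    ground_width_cm = image_px / px_per_cm         # ground footprint across image
    height_m = ground_width_cm / 100 / (2 * math.tan(math.radians(fov_deg / 2)))
    print(f"max flight height ~ {height_m:.1f} m")  # ~1.5 m, matching 1.5-2 m below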
The UAV was therefore best flown at a constant height
of 1.5-2 m from the ground. In hilly landscapes, a
LiDAR or sonar sensor would be required on the UAV to maintain
a constant elevation above the surface. Unfortunately, at the
time we carried out this work, the firmware available for the
3DR Solo did not allow automated control of the altitude
when using the LiDAR for height sensing; however, it is
anticipated that this feature will be made available within
the coming months.
The batteries available to us for the 3DR Solo lasted
about 15 minutes in flight while taking pictures regularly.
With this kind of payload, the Solo can survey approximately
1,000-2,000 m² per battery charge. The typical desired
survey area would be on the order of km², suggesting
that many optimizations to flight planning and
battery life would be required.
The original appeal of the UVIONIX drone was that it
had an on board Nvidia Jetson TX1 graphics processing
unit (GPU), potentially allowing on board image processing
and decision making. However, as the autonomous flight
planning software was not available to us on this drone,
we settled for post-processing all images taken by the
3DR Solo on a laptop containing a GPU. As UAV
hardware advances, we feel confident that on board flight
planning and decision making will become possible;
we subsequently pursued developing an online classification
algorithm in this work as a proof of concept.
III. INITIAL APPROACHES
Meteorites are typically distinguishable from ordinary
rocks as they contain higher proportions of dense metals and
are sometimes magnetic. From a purely visual standpoint,
freshly fallen meteorites often have a shiny black crust,
which may be exposed in places as in Figure 1.
Whilst one may find many images of meteorites on the
internet, a very large number do not have the properties of
common freshly fallen meteorites, and the images are not
representative of how meteorites may be found on the ground
in practice. Therefore, with respect to our end goal of being
able to detect meteorites from aerial images, we lacked a
useful training set of image data. For this reason, our first
attempts involved trying to hand craft features for image
classification. We collected a very small number of sample
images from a bird's-eye view of meteorites in grassy and
dry terrain for experimentation.
An initial approach was simply to scan an image, and pick
out regions of clusters of dark pixels, since we expected to
look for black crusts. Another method we tried involved a
simple anomaly detector, as meteorites are foreign objects in
a terrain. Each pixel can be represented as a 3-dim vector,
representing the levels of red, green and blue. For each
pixel location, $u = (u_x, u_y)$, we compute its $N$ nearest
neighbours, $\{u'_n\}_{n=1}^{N}$. Let $d(u, u')$ represent the Euclidean
distance between pixels $u$ and $u'$. Then $\frac{1}{N} \sum_{n=1}^{N} d(u, u'_n)$,
the average distance to $u$'s nearest neighbours, gives a
measure of how anomalous $u$ is.
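The following is a minimal sketch of such an anomaly detector. The choice of reference set (a random subsample of the image's own pixels) is our assumption, as is comparing pixels purely by Euclidean distance in RGB space.

    # Sketch of the colour-based anomaly detector described above.
    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def anomaly_map(image, n_neighbors=10, sample_size=5000, seed=0):
        """image: (H, W, 3) uint8 array. Returns an (H, W) map of anomaly scores."""
        h, w, _ = image.shape
        pixels = image.reshape(-1, 3).astype(np.float64)
        # Use a random subsample as the reference set to keep the search tractable.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(pixels), size=min(sample_size, len(pixels)), replace=False)
        nn = NearestNeighbors(n_neighbors=n_neighbors).fit(pixels[idx])
        dists, _ = nn.kneighbors(pixels)         # distances to N nearest reference pixels
        return dists.mean(axis=1).reshape(h, w)  # average distance = anomaly score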
However, this method can fail in typical scenarios. Whilst
the anomaly detector does well in the grass example, in
dry grass the algorithm is fooled by daisy flowers in the
midst of the terrain and fails to notice the meteorites.
Such unreliable performance was common among all of
the approaches we attempted using hand-crafted features,
including dark color detection and contrast-based features.
The only way that seemed sensible to proceed was to collect
more training data from a range of terrains using physical
meteorites, with the intention of introducing a machine
learning based approach to the image classification.

Fig. 2: Example training image and illustration of how we
created individual patch data points.
IV. CONVOLUTIONAL NEURAL NETWORKS
In order to overcome the problems of hand picking features, we decided to apply deep learning algorithms to the
task of discriminating between images of patches of land
with and without meteorites. In particular, we chose to use
convolutional neural networks [7], which have achieved state
of the art performance in a variety of computer vision based
tasks [8]. Neural network classifiers are trained to learn the
features most relevant to the task at hand jointly with the output
predictions, relieving the user from engineering
useful features by hand. In particular, convolutional neural
networks are well suited to image data and compose several
simple operations in sequence: convolutions, pooling and nonlinearities.
In this section we discuss how we artificially increase the
size of our dataset, leverage the large amount of general
image data available for training models, and compress information into a smaller model for fast inference.
We used the Caffe [13], Keras [14] and Theano [15] software packages to implement and experiment with
convolutional neural network models.
A. Data Collection
We had 8 meteorites in our possession in the size range
we are interested in (1-4 cm diameter). Using fairly high
resolution smartphones, we took bird's-eye-view photos
of meteorites placed in a range of relevant terrain in the
California bay area, from a height of roughly 1.8 metres off
the ground, somewhat representative of the height at which
the UAV would hover.
Since our aim was to detect meteorites in a photo and we
knew the approximate size of the object we were looking
for, we constructed multiple data points from each image
by partitioning it into a grid of patches, as is illustrated in
Figure 2. Patches are then grouped into those with meteorites
in them (roughly 320 patches), and those without. Note that
a very small fraction of patches have meteorites in them,
leading to a highly imbalanced dataset. We shall discuss this
problem and our remedy in more detail in the next section.
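A sketch of this partitioning and labelling step is given below. The 64 × 64 patch size matches the scan size used in our field tests, and the bounding-box labelling helper is hypothetical.

    # Sketch: partition an image into a grid of patches and label each one.
    import numpy as np

    def image_to_patches(image, patch=64):
        """image: (H, W, 3) array. Returns a list of ((row, col), patch_array)."""
        h, w, _ = image.shape
        patches = []
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                patches.append(((i, j), image[i:i + patch, j:j + patch]))
        return patches

    def label_patches(patches, meteorite_boxes, patch=64):
        """meteorite_boxes: list of (top, left, bottom, right) pixel boxes.
        A patch is positive if it overlaps any meteorite bounding box."""
        labels = []
        for (i, j), _ in patches:
            positive = any(t < i + patch and b > i and l < j + patch and r > j
                           for (t, l, b, r) in meteorite_boxes)
            labels.append(int(positive))
        return labels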
Most of the images from the internet found by a Google
search were not useful; however, a small number (roughly
280) were photos of meteorites found in relevant terrains
with no interference. Finally, in order to include rare meteorite examples in our training set, we found images of
rare meteorites through Google and eBay searches, and used
Photoshop to artificially place them in terrains of interest
to us (35 patches).
Our total training set therefore consisted of roughly 635
patches with meteorites in them and many more without
meteorites.
B. Data Augmentation
As seen in Figure 2, cutting up images into patches results
in a very imbalanced dataset, in the sense that most of our
patches end up empty. The problem with such a dataset is
that a machine learning algorithm would be able to achieve
incredibly high accuracy by simply predicting every image to
not contain a meteorite, something we would want to avoid.
In order to artificially increase the size of our “positive”
dataset, we apply random transformations of the meteorite
patch images, which we include in our training data. The
data augmenting techniques we apply are:
• Rotations. We randomly rotated images by a multiple
of 90 degrees.
• Reflections. We randomly reflected the pixels in a
vertical plane.
• Resolution. We randomly downsampled and subsequently upsampled image patches.
• Brightness. We randomly increased or decreased the
brightness of the image.
• Saturation. We randomly decided to saturate each color
channel to improve the contrast within images.
Figure 3 displays examples of each data augmenting
operation. For each of the roughly 635 meteorite patches, we applied
50 random augmentation operations from the list
above, giving approximately 32,000 patches of meteorite
images; a sketch of these operations follows below. One may view these augmentations
as a way to discourage the model from relying on features like
brightness, orientation or resolution in order to discriminate
between patches with and without meteorites.
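The sketch below illustrates the five operations, assuming patches are (H, W, 3) uint8 arrays. The parameter ranges are illustrative guesses rather than the exact magnitudes we used.

    # Sketch of the five augmentation operations listed above.
    import numpy as np

    def augment(patch, rng):
        patch = np.rot90(patch, k=rng.integers(0, 4))  # rotation by a multiple of 90 degrees
        if rng.random() < 0.5:
            patch = patch[:, ::-1]                     # reflection in a vertical plane
        if rng.random() < 0.5:                         # resolution: down- then upsample
            h, w, _ = patch.shape
            small = patch[::2, ::2]
            patch = np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)[:h, :w]
        scale = rng.uniform(0.7, 1.3)                  # brightness jitter
        patch = np.clip(patch.astype(np.float64) * scale, 0, 255)
        if rng.random() < 0.5:                         # saturate channels to boost contrast
            patch = np.clip((patch - 128.0) * 1.5 + 128.0, 0, 255)
        return patch.astype(np.uint8)

    rng = np.random.default_rng(0)
    example = rng.integers(0, 256, size=(64, 64, 3)).astype(np.uint8)
    augmented = [augment(example, rng) for _ in range(50)]  # 50 random augmentations per patch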
C. Pre-training with a Large Image Dataset
The internet hosts a huge number of images and we
may use these to assist a model to learn general features
of images, before focusing on the specific features useful
for our specific task of meteorite detection. In particular, a
dataset of roughly 15 million images called ImageNet [9] has
been extensively used for image classification competitions
over the last few years. Researchers have developed highly
accurate convolutional neural network architectures to train
on large datasets e.g. GoogLeNet [10] and AlexNet [11].
Fig. 3: Examples of data augmentation operations we apply to artificially increase our “positive” dataset: (a) original patch, (b) rotation, (c) reflection, (d) low resolution, (e) brightness, (f) saturation.

For our task of classifying patches of land with meteorites,
we chose existing architectures such as GoogLeNet, with
parameters initialized to those which were the result of
training on ImageNet. We subsequently train the network
with a low learning rate, to adapt the features slightly in a
way more conducive to the meteorite detection task. This
approach means that from initialization, the network would
already be familiar with objects such as grass, rocks, hay,
etc., and would not have to learn their features anew.
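We carried out this fine-tuning with GoogLeNet and AlexNet in Caffe. Purely as an illustration, the following Keras sketch shows the same idea using an ImageNet-pretrained stand-in backbone; the backbone choice, input size and learning rate are assumptions.

    # Sketch: fine-tune an ImageNet-pretrained network at a low learning rate.
    from tensorflow import keras

    base = keras.applications.InceptionV3(weights="imagenet", include_top=False,
                                          pooling="avg", input_shape=(139, 139, 3))
    outputs = keras.layers.Dense(2, activation="softmax")(base.output)
    model = keras.Model(base.input, outputs)

    # Low learning rate so the ImageNet features are only adapted slightly.
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # model.fit(train_patches, train_labels, validation_split=0.15, epochs=10)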
We independently trained 5 convolutional neural network
architectures: 3 GoogLeNet architectures and 2 AlexNet architectures, using a random subset of 85% of the labelled
data for training and the remainder for validation. The
validation accuracies of the 5 models were 99.5%, 99.6%,
99.8%, 99.8% and 99.9%. We create one predictive model
by averaging the outputs of the 5 neural networks,
which achieves a validation accuracy of 99.9%. This method
of model averaging helps to remove individual biases that
each model may have, improving the final accuracy.
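Concretely, model averaging amounts to averaging the models' output probabilities, as in the following sketch (where models is an assumed list of trained Keras models):

    # Sketch: ensemble prediction by averaging the five models' probabilities.
    import numpy as np

    def ensemble_predict(models, patches):
        probs = np.mean([m.predict(patches, verbose=0) for m in models], axis=0)
        return probs.argmax(axis=1)   # 1 = meteorite, 0 = background (assumed coding)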
One issue with using large, powerful architectures is their
memory consumption and inference speed. We eventually
would like the ability to process images in real time on a
graphics processing unit mounted on a UAV. Until hardware
is available to achieve this, we have been processing images
taken by the UAV on a laptop after flight. We would like
this to be as fast as possible at the search site.
The GoogleNet and AlexNet architectures consist of
roughly 6 million and 61 million parameters respectively,
and take on average 3.60 and 1.24 milliseconds respectively
to process 1 patch on a Nvidia Tesla K20 GPU, which had
similar processing speed to the Nvidia GTX 980 available
to us on the field trip. Therefore, the final model which
averages 5 models takes an average of 13.28 milliseconds per
patch. However, in a 20 minute flight where the UAV takes a
picture every second, we would collect 1,200 images. Each
image would consist of roughly 100,000 patches. Hence, data
collected from a 20 minute flight would require roughly 25
minutes of processing for our algorithm, let alone the extra
time required to display the output and for the user to parse
this output. On a typical field trip, there would be a window
of several hours in which a scientist would like to examine
vast areas of land, so limiting computational processing
time is very useful.
We therefore constructed a smaller convolutional neural
network model with the following architecture:
• convolution (32 5 × 5 kernels, stride 2, ReLU),
• convolution (32 3 × 3 kernels, stride 1, ReLU),
• max pooling (3 × 3 kernel, stride 2),
• convolutional layer (64 3 × 3 kernels, stride 2, ReLU),
• convolutional layer (64 3 × 3 kernels, stride 1, ReLU),
• max pooling (3 × 3 kernel, stride 2),
• convolutional layer (64 3 × 3 kernels, stride 1, ReLU),
• convolutional layer (64 3 × 3 kernels, stride 1, ReLU),
• max pooling (2 × 2 kernel, stride 2),
• dropout (probability 0.33),
• dense (output size 64, ReLU),
• dropout (probability 0.33),
• dense (output size 2),
• softmax.
This is a vanilla convolutional architecture with roughly
270,000 parameters, where inference now takes 0.44 milliseconds per patch on average, offering a 30 times speedup
over our bag of bigger models. However, when trained
on our dataset of roughly 64,000 patches, we achieve a
validation accuracy of 98.5%. Whilst this sounds like a high
performing algorithm, recall that a single image taken by the
UAV consists of roughly 100,000 patches, i.e. roughly 1,500
patches would be incorrectly labelled, which is unacceptably
high in practice.
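For illustration, the following Keras sketch follows the layer list above. The padding scheme and the 64 × 64 input size (the patch size used in our field tests) are assumptions, so the exact parameter count may differ slightly from the figure quoted above.

    # Sketch of the small student network following the architecture list above.
    from tensorflow import keras
    from tensorflow.keras import layers

    student = keras.Sequential([
        keras.Input(shape=(64, 64, 3)),
        layers.Conv2D(32, 5, strides=2, padding="same", activation="relu"),
        layers.Conv2D(32, 3, strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
        layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(3, strides=2),
        layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),
        layers.Conv2D(64, 3, strides=1, padding="same", activation="relu"),
        layers.MaxPooling2D(2, strides=2),
        layers.Flatten(),
        layers.Dropout(0.33),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.33),
        layers.Dense(2, activation="softmax"),  # dense(2) followed by softmax
    ])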
In the next subsection we describe our approach to improving the accuracy of this architecture by leveraging
the previously trained models.
D. Model Distillation
Whilst the bag of models we trained first is too large and
slow for use in practice, it contains useful knowledge which can
be distilled into the smaller model to boost performance.
In particular, we chose to use the dark knowledge model
distillation framework [16], which attempts to encourage the
penultimate layer of the small model, or the student, to be
similar to that of the larger model, or the teacher. We refer
to the 5 larger models as the teacher models and the smaller
simpler model as the student model.
Concretely, for a specific single training patch, let $v'_i$ be
the 2-dimensional penultimate-layer output of the $i$th teacher
model, i.e. the input to the softmax layer. Since the softmax
operation is invariant to additive constants, we assume
$v'_{i,1} + v'_{i,2} = 0$. We average the representations of the teachers to
give $v = \frac{1}{5} \sum_{i=1}^{5} v'_i$.
Now let u be the penultimate layer output of the student
model for this specific training patch, with label y = 1 if it
contains a meteorite and y = 0 if it does not. The typical loss
function that is optimized for binary classification problems
is the binary cross entropy (bce), given by

$$\ell_{\mathrm{bce}} = y \log \frac{e^{u_1}}{e^{u_1} + e^{u_2}} + (1 - y) \log \frac{e^{u_2}}{e^{u_1} + e^{u_2}}. \qquad (1)$$

Fig. 4: Creston, California field trip looking for placed
meteorites with a UAV.
However, in order to distill information from the teacher
models to the student, we may attempt to promote the student
model to learn a similar penultimate layer representation to
the teacher’s penultimate layer. [16] suggest the following
loss to achieve this
$$\ell_{\mathrm{distill}} = \frac{e^{v_1/T}}{e^{v_1/T} + e^{v_2/T}} \log \frac{e^{u_1/T}}{e^{u_1/T} + e^{u_2/T}} + \frac{e^{v_2/T}}{e^{v_1/T} + e^{v_2/T}} \log \frac{e^{u_2/T}}{e^{u_1/T} + e^{u_2/T}}, \qquad (2)$$

for a temperature parameter $T$. Higher values of $T$ lead
to a softer probability distribution over the two classes.
In practice we optimize a combination of the binary cross
entropy and distillation losses, given by

$$\ell_{\mathrm{final}} = \alpha \ell_{\mathrm{bce}} + (1 - \alpha) \ell_{\mathrm{distill}}, \qquad (3)$$
for some α ∈ [0, 1]. We applied this technique on the
dataset we created previously, and found the best settings of
T = 10, α = 0.66 by grid search over validation accuracy.
Training the smaller convolutional neural network using the
bigger, more accurate models as teachers using the loss in
(3), we achieved a validation accuracy of 99.4%, a significant
improvement over the network with no model distillation.
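For illustration, the combined objective in (1)-(3) may be sketched as follows. Note that the Keras losses below are negative log-likelihoods, i.e. they minimize the negatives of (1) and (2); teacher logits are assumed to be precomputed for each training patch.

    # Sketch of the combined distillation objective, equations (1)-(3).
    import tensorflow as tf

    T, alpha = 10.0, 0.66   # temperature and mixing weight found by grid search

    def final_loss(y_true, student_logits, teacher_logits):
        # Binary cross entropy against the hard labels, eq. (1).
        hard = tf.keras.losses.sparse_categorical_crossentropy(
            y_true, student_logits, from_logits=True)
        # Cross entropy between softened teacher and student distributions, eq. (2).
        soft_targets = tf.nn.softmax(teacher_logits / T)
        soft = tf.keras.losses.categorical_crossentropy(
            soft_targets, student_logits / T, from_logits=True)
        return alpha * hard + (1.0 - alpha) * soft   # eq. (3)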
V. FIELD TRIP RESULTS

To test our image collection and processing pipeline, we
visited Creston, California, the site of a meteorite fall on
October 24, 2015 (Figure 4). Given the vastness of the
plausible fall region, it was very unlikely that we would find new
meteorites in the time we had available. We therefore placed
meteorites in our possession on the field and surveyed the area
with the UAV to obtain test data for assessing the performance
of the algorithm we had developed.

Figure 5 is an example of an image taken by the GoPro
on the 3DR Solo after being processed by the bag of large
models. The original image had resolution 3264 × 4928,
which we downsampled by a factor of 2 in each dimension. Subsequently we scanned the image with a patch of
size 64 × 64 and stride 16, running each patch through
the classification algorithm; this means a single
image consisted of roughly 16,000 patches (a sketch of this scanning
procedure is given below). When the output of the
algorithm was higher than 0.75, a red box was drawn around
the patch, and when the output was between 0.5 and 0.75, a
white box was drawn around the patch. The image contained
exactly 2 meteorite pieces, both of which were found, and
rather remarkably, only 2 of the roughly 16,000 patches were incorrectly labelled as
containing meteorites! This illustrates
the potential of convolutional neural network architectures in
the task of meteorite detection.

Whilst this was a promising finding, most images we tested
on typically had 40-80 incorrectly labelled patches when we
used the slower bag of models, or 40-120 when we used the
smaller model (still below 1% test error). An example
is shown in Figure 6; such error rates would be impractical for a final
version of the software. However, in tests
on photos we collected on grassy terrain at Moffett Field,
California, we found that errors were largely reduced when
the resolution of the meteorites was increased, i.e. when
the UAV flew at a lower height. Once LiDAR software is
available, we may be able to exploit this finding.

Fig. 5: Sample output from the bag of convolutional neural network classification algorithms, where the input is an image
taken by the UAV. Red boxes are drawn over patches with output 0.75 or higher; white boxes are drawn over patches with
output between 0.5 and 0.75. The orange flags in the image mark where the 2 meteorites were placed, both of which were detected.
On the right, we zoom in 200% into a large patch of the image to illustrate the difficulty of the problem. Only 2 boxes are
incorrectly labelled out of about 16,000.

Fig. 6: Sample output from the distilled convolutional neural
network model. There are 42 incorrectly labelled patches,
and 2 of the 3 meteorites are found.
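For completeness, a sketch of the scan-and-flag procedure described above; model is an assumed trained classifier, and in practice patches would be batched for GPU throughput rather than predicted one at a time.

    # Sketch: slide a 64x64 window with stride 16 over the downsampled image
    # and collect red/white box positions depending on meteorite probability.
    import numpy as np

    def scan_image(image, model, patch=64, stride=16):
        h, w, _ = image.shape
        red_boxes, white_boxes = [], []
        for i in range(0, h - patch + 1, stride):
            for j in range(0, w - patch + 1, stride):
                window = image[i:i + patch, j:j + patch][None]      # batch of one
                p = float(model.predict(window, verbose=0)[0, 1])   # meteorite probability
                if p > 0.75:
                    red_boxes.append((i, j))
                elif p > 0.5:
                    white_boxes.append((i, j))
        return red_boxes, white_boxes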
VI. DISCUSSION
In this work, we have made a first attempt at automating
the task of detecting small meteorites in large fields using
unmanned aerial vehicles and image data. A primary first
step towards this goal was collecting a labelled dataset on
which to train an image classification algorithm. We leveraged the
vast amounts of image data available for scientific research,
together with previously successful convolutional neural
network architectures and parameters, as a starting point for
training models to discriminate between patches with and
without meteorites. Finally, in order to speed up inference
and limit the memory usage of our algorithm, we implemented
a method to distill the knowledge of the bigger
models into a single, lean model, achieving high accuracy.
We believe the primary limitation in achieving even higher
accuracy is in the lack of data. Over time, as more images of
different types of meteorites in various terrains are collected,
the algorithms should improve in performance. Whilst data
is limited, we may pursue other avenues to reduce the
false positive rate, active learning for example. In an active
learning setting, the algorithm may ask the user to label a set
of judiciously chosen test patches after image collection, in
order to maximally increase the algorithm’s performance in
expectation. Such an approach would also be helpful when
the UAV is being used in a slighlty unfamiliar terrain where
the algorithm has not seen much training data. Realistically,
a practical algorithm would require a test accuracy of 99.9%,
where an image would have of the order of 10 errors. This
seems possible, as was demonstrated by the bag of larger
models we trained.
With regards to the hardware, there are also improvements
that could be made to the UAV. First and foremost, LiDAR
software would allow the UAV to maintain a constant height
in its hovering fairly easily. Secondly, if the UAV had a
fast GPU, we could further pursue the idea of on board
processing. It would perhaps be possible to control the UAV
based on the algorithmic output; if the algorithm believes
there is a meteorite below but is not very certain, it may
reduce its height from the ground and reassess the situation
with higher resolution, much in the way a human would.
REFERENCES
[1] NASA Announces Asteroid Grand Challenge. https://www.nasa.gov/mission_pages/asteroids/initiative/asteroid_grand_challenge.html, 2013.
[2] O. P. Popova, P. Jenniskens, V. Emelyanenko, A. Kartashova, E.
Biryukov and others. Chelyabinsk airburst, damage assessment, meteorite recovery, and characterization. Science, 342:1069-1073, 2013.
[3] D. Yeomans and P. Chodas. Additional Details on the Large Fireball Event over Russia on Feb. 15, 2013. NASA/JPL Near-Earth Object Program Office, http://neo.jpl.nasa.gov/news/fireball_130301.html, 2013.
[4] P. Jenniskens, A. Rubin and Q. Z. Yin. Fall, recovery and characterization of the Novato L6 chondrite breccia. Meteoritics & Planetary
Science, 49(8), 2014.
[5] P. Jenniskens, M. H. Shaddad, D. Numan, S. Elsir, A. M. Kudoda
and others. The impact and recovery of asteroid 2008 TC3. Nature,
458:485-488, 2009.
[6] Event 2635, Creston, CA. Meteor shower. American Meteor Society,
2015.
[7] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11):2278-2324, 1998.
[8] J. Gu, Z. Wang, J. Kuen, L. Ma, A. Shahroudy and others. Recent Advances in Convolutional Neural Networks. arXiv preprint
arXiv:1512.07108, 2015.
[9] J. Deng, W. Dong, R. Socher, L. J. Li, K. Li and F. F. Li. Imagenet:
A large-scale hierarchical image database. CVPR, 2009.
[10] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed and others. Going
Deeper With Convolutions. CVPR, 2015.
[11] A. Krizhevsky, I. Sutskever and G. E. Hinton. Imagenet classification
with deep convolutional neural networks. NIPS, 2012.
[12] G. Hinton, O. Vinyals and J. Dean. Distilling the Knowledge in a Neural Network. arXiv preprint arXiv:1503.02531, 2015.
[13] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long and others.
Caffe: Convolutional Architecture for Fast Feature Embedding. arXiv
preprint arXiv:1408.5093, 2014.
[14] F. Chollet. Keras. GitHub repository, https://github.com/fchollet/keras, 2015.
[15] Theano Development Team. Theano: A Python framework for
fast computation of mathematical expressions. arXiv preprint
arXiv:1605.02688, 2016.
[16] G. Hinton, O. Vinyals and J. Dean. Distilling the Knowledge in a Neural Network. arXiv preprint arXiv:1503.02531, 2015.