Exercise #7b - Classification

• 7.3 Unsupervised classification
• 7.4 Evaluating your training areas
Objective: To learn tools you can use to analyze and improve your training set for a
supervised classification, and thereby improve the result of the supervised classification.
After this, you will run an unsupervised classification. At the end, you will compare the
results that you got from each of these classifications.
The data you need:
You will use the data that you used in Lab 7a. These files include the satellite data file called
tm070717.img (along with its Pyramid Layer file tm070717.rrd), your signature file (sig1.sig), and
your supervised classification Super1.img.
7.3 Unsupervised classification
Choose the Classifier button on the Main Menu Icon Panel, then Unsupervised
Classification. As the Input Raster File, use tm070717.img. Give the output Cluster Layer
the filename Unsup1.img and the Output Signature the name Unsup1.sig. Choose the Initialize
from Statistics option (meaning the initial clusters are generated from the image statistics rather
than from existing signature means), and then choose 10 as the number of classes (normally you
would use more classes, but for the sake of time we will use just 10). Then change the
number of iterations to 10. The dialog should now look like the one below.
Press OK to start the Unsupervised classification.
Open the file Unsup1.img. Now open the Signature Editor, and open the signature file
Unsup1.sig. These are the 10 "clusters" that were created by ISODATA. You can read about
how the ISODATA algorithm works under Imagine's help for Unsupervised
Classification, or in the textbook. The Unsup1.img file shows which cluster each pixel belongs to.
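
To get a feel for what the algorithm is doing, the sketch below (in Python with NumPy, which is not part of Imagine and is only for illustration) clusters pixels in a k-means fashion, which is the core idea of ISODATA; real ISODATA also splits and merges clusters between iterations, which is omitted here. The image array is random data standing in for tm070717.img.

import numpy as np

def simple_isodata(pixels, n_classes=10, n_iterations=10):
    """Reduced sketch of ISODATA-style clustering (essentially k-means).

    pixels: (n_pixels, n_bands) array of digital numbers.
    Real ISODATA also splits and merges clusters between iterations.
    """
    # "Initialize from Statistics": spread the initial cluster means
    # along mean +/- one standard deviation of the whole image.
    mean, std = pixels.mean(axis=0), pixels.std(axis=0)
    offsets = np.linspace(-1.0, 1.0, n_classes)[:, None]
    centers = mean + offsets * std

    for _ in range(n_iterations):
        # Assign every pixel to the nearest cluster mean (Euclidean distance).
        dist = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each cluster mean from its member pixels.
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
    return labels, centers

# Random data standing in for the bands of tm070717.img.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(200, 200, 6)).astype(float)
labels, centers = simple_isodata(image.reshape(-1, 6))
cluster_layer = labels.reshape(200, 200)   # analogous to Unsup1.img
print(cluster_layer.shape, centers.shape)

The cluster layer contains only arbitrary cluster numbers; just as in Imagine, it is up to you to decide afterwards which information class each cluster represents.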
Assigning class names to the clusters
In unsupervised classification, you first want to identify what these clusters are
and give them information class names. In this part of the exercise, you will generally try to
identify the same classes as you had for the supervised classification, but a little more simply,
so a suggestion is to use the class names Water, Bare Soil, Coniferous Forest, Broadleaved
Forest, and Clouds (although you are free to name them differently if you want).
There are several ways to decide how to name these clusters. Two of them are
described here: the Image Alarm and the Inquire Cursor.
1) Using Image Alarm
Perhaps the "easiest" way is to use the Image Alarm. Make sure you have the Unsup1.sig file
open in the Signature Editor, the Unsup1.img file open in one viewer, and that viewer geolinked
to a second viewer with the tm070717.img file zoomed out so you can see much of the image.
In the Signature Editor, under Edit/Image Association, make sure that the associated image is
tm070717.img.
Select the first cluster in the Unsup1.sig file, change the color to something different from
the colors already in the satellite image (such as orange), and run the Image Alarm. I think
you can see what this class is, so change the color to something appropriate (try the same as
what you used in the supervised classes), and change the name of the signature (where it says
Class 1) to Water 1 (I am pretty sure this cluster should be Water for all of you; if not, label it
whatever it is).
Highlight the next cluster, and do an image alarm. Remember you can use Swipe
or Flicker with the Alarm Mask. You can also just hit the OK button again in the Signature
Alarm dialog, and the alarm will flicker off and back on again.
You can also zoom in and out and look at certain areas. To identify what this class is you can
use a combination of sources - the stand maps, the aerial photos and also some visual
interpretation now that you are so familiar with what some of these different land cover types
look like in satellite imagery.
Identify what you think this second cluster is and change the Signature Name and color.
Tip: The Alarm Mask should change with every cluster. But if for some reason the masks
accumulate in your viewer (more than one alarm mask shows at a time and they just build up),
you might need to remove the Alarm Mask. Go into the Viewer, then View/Arrange
Layers, delete the Alarm Mask, and hit Apply. You do not have to save the changes
to the Alarm Mask.
2) Using Inquire Cursor
Another easy way to identify clusters is to use the Inquire Cursor in your Unsup1.img and
look at areas whose land cover you already know. The Inquire Cursor information shows you the
class name (e.g., Class 3). Now you can go back into the Signature Editor and change the
Signature Name and the color.
Continue to assign names to the rest of the 10 clusters, using either or both of these
methods. Try to arrange the 10 clusters into generally the same classes suggested above
(Water, Bare Soil, Coniferous Forest, Broadleaved Forest, and Clouds), although you are
free to name them differently if you want.
Problem clusters and evaluation
You may run into clusters that highlight several different classes in the image, and you might
feel these are wrong. If you find one of these, label its Signature Name as Problem 1 and give it
the color gray.
If you have time, you could later evaluate these clusters with the tools mentioned earlier and
see why they might be problems. For the Feature Space, you can use the same file you
generated earlier (tmfs_4_3.fsp.img).
But for now, simply label them as Problem (if you have problem clusters) and go on to
identify the other clusters.
Save your work
Every so often, save your signature file! Save it under the same name (unsup1.sig).
Assigning final class names to the Unsupervised Classification file
Now, in the viewer where you have the Unsup1.img file open, go to Raster/Attributes. This
brings up the Raster Attribute Editor. This is where you can assign the new class names and
colors to the result of the unsupervised classification. Again, use approximately the same
kind of color scheme you used for the supervised classification.
Class 0 you can make black; this is not a class within the data we are looking at. Next, under
Row 1, is your Class #1 from the Signature Editor. Change its color to blue, and it
changes in the Unsup1.img image. Also change the Class Name; now you can just
call it by its information class name if you want (or you can keep Forest 1, Forest 2, Forest
3, etc. if you prefer).
Press the save button in the Raster Attribute Editor once you are done.
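
Conceptually, what the Raster Attribute Editor is doing here is recoding each cluster number to an information class name and a display color. The sketch below shows that lookup-table idea in Python; the particular cluster-to-class mapping and colors are hypothetical (yours will depend on your own labeling), and the cluster layer is random data standing in for Unsup1.img.

import numpy as np

# Hypothetical mapping from cluster number (the pixel value in Unsup1.img)
# to information class name; yours will depend on your own labeling.
class_names = {
    0: "Unclassified", 1: "Water", 2: "Water", 3: "Coniferous Forest",
    4: "Coniferous Forest", 5: "Broadleaved Forest", 6: "Broadleaved Forest",
    7: "Bare Soil", 8: "Bare Soil", 9: "Clouds", 10: "Problem 1",
}
# Display color (R, G, B) for each information class.
colors = {
    "Unclassified": (0, 0, 0), "Water": (0, 0, 255),
    "Coniferous Forest": (0, 100, 0), "Broadleaved Forest": (0, 200, 0),
    "Bare Soil": (160, 82, 45), "Clouds": (255, 255, 255),
    "Problem 1": (128, 128, 128),
}

# Random cluster layer (values 0-10) standing in for Unsup1.img.
cluster_layer = np.random.default_rng(0).integers(0, 11, size=(200, 200))

# Build an RGB picture by looking up each cluster's class and color.
lut = np.array([colors[class_names[c]] for c in range(11)], dtype=np.uint8)
rgb = lut[cluster_layer]            # shape (200, 200, 3)
print(rgb.shape, rgb.dtype)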
Now you should be able to compare the results of the two classifications you have done, the
supervised (Super1.img) and the unsupervised (Unsup1.img). Compare them visually.
As the assignment for Lab #7, send in the two final signature files (for both the
supervised and unsupervised classifications).
The classifications that result from these signature files do not have to be perfect,
but they should show that you have understood the concept and should not look
terribly unreasonable (such as a supervised file with only three classes, all
labeled incorrectly).
7.4 Evaluating your training areas
In Exercise 7a you created a training set with training areas for several
information classes. In the middle of the exercise, you looked into one way of evaluating your
training areas, which was the Image Alarm.
At the end of the exercise you ran a Supervised Classification, using a Maximum Likelihood
Classifier.
You should have looked at the result of your classification (Super1.img). This
classification probably needs some refining. Maybe some of the training areas you made were too
general spectrally and assigned too many pixels to the wrong class in the resulting
classification.
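
As a reminder of what the Maximum Likelihood Classifier does with those training areas, the sketch below assigns each pixel to the class whose multivariate Gaussian (estimated from that class's training pixels) gives the highest log-likelihood. It uses NumPy with synthetic data and illustrates only the decision rule, not Imagine's own implementation.

import numpy as np

def max_likelihood_classify(pixels, signatures):
    """Sketch of Gaussian maximum likelihood classification.

    pixels: (n_pixels, n_bands) digital numbers.
    signatures: dict of class name -> (n_samples, n_bands) training pixels.
    Returns the index of the most likely class for every pixel.
    """
    names = list(signatures)
    log_likes = []
    for name in names:
        train = signatures[name]
        mean = train.mean(axis=0)
        cov = np.cov(train, rowvar=False)
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        diff = pixels - mean
        # Gaussian log-likelihood up to a constant: -0.5*(log|C| + d' C^-1 d)
        mahal = np.einsum("ij,jk,ik->i", diff, inv, diff)
        log_likes.append(-0.5 * (logdet + mahal))
    return np.argmax(np.stack(log_likes, axis=1), axis=1), names

# Synthetic two-class example standing in for real training signatures.
rng = np.random.default_rng(0)
sigs = {"Water": rng.normal(30, 5, (100, 6)),
        "Coniferous Forest": rng.normal(80, 10, (100, 6))}
labels, names = max_likelihood_classify(rng.normal(75, 15, (1000, 6)), sigs)
print(names, np.bincount(labels))

A training area that is too general spectrally shows up here as a class with a very broad covariance, which then "captures" pixels that really belong to other classes.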
The main goal of deleting or adding training areas is to get spectrally separable (i.e., different)
representative digital numbers for each information class, in order to get a better
classification. Refer again to the illustration below, which shows what is meant by spectrally
separable:
Imagine's tools for training set evaluation
So, you want to identify problem training areas, and perhaps delete them from your training
set.
There are some other tools in the Signature Editor for looking at and evaluating the
training areas. In this exercise, you will just look at these tools and learn that they exist and
how to interpret them. You can try to make your supervised classification as good as possible;
however, completing a truly thorough supervised classification could take more time than we
have in the lab. It is more important to learn the tools you could use to do this.
Open the file tm070717.img in the TM band combination 4, 3, 2 (RGB) in a
Viewer. Also open your signature file (sig1.sig) that you created in 7a.
The Signature Mean Plot is one way to look at your training areas. Select one of your
signatures, and then select the Display Mean Plot Window button.
This brings up what looks like the Spectral Profile window, except that it shows the
mean digital number, per band, of the pixels within the selected training area.
The default mode when this opens is to show one training area at a time. But you can also
look at the spectral mean values of several training areas at once. First you have to go into
your Signature Editor, and select a few of the training sets you want to compare. Then go
back into the Signature Mean Plot window, and click on the Multiple Signature Mode.
This information can begin to show you whether you have training areas that you have
labeled with different information classes yet have very similar spectral means. However, this
may be a bit hard to see in the tool, since the DNs sometimes do not differ by very
much. Also, with this tool you are looking only at the means of each training area; it is good to
look at the range of your training signatures as well.
You can close the Signature Mean Plot window.
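
If you want to reproduce this kind of mean plot outside Imagine, the sketch below plots the per-band mean DN of a few signatures with Matplotlib. The signature values are made up for illustration, and the band numbering is simplified to 1 through 6.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical training pixels per signature: name -> (n_samples, n_bands).
rng = np.random.default_rng(0)
signatures = {
    "Water 1": rng.normal([20, 15, 10, 5, 3, 2], 2, (50, 6)),
    "Coniferous Forest": rng.normal([25, 20, 18, 60, 35, 15], 3, (50, 6)),
    "Broadleaved Forest": rng.normal([28, 24, 20, 90, 45, 18], 3, (50, 6)),
}

bands = np.arange(1, 7)   # the six reflective bands, numbered simply 1-6 here
for name, pixels in signatures.items():
    means = pixels.mean(axis=0)
    plt.plot(bands, means, marker="o", label=name)

plt.xlabel("TM band")
plt.ylabel("Mean digital number")
plt.title("Multiple-signature mean plot (sketch)")
plt.legend()
plt.show()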
Feature Space
To look at the ranges of the training sets, you can look at the training areas in "Feature
Space". Feature Space is a two-dimensional plot of the pixel values in the image; it is also
called a Scatter Plot.
In the Signature Editor, choose Feature/Create/Feature Space Layers. This brings
up a dialog. As the Input Raster Layer, give the file tm070717.img. Now choose the option
Reverse Axes. Enter tmfs as the Output Root Name (press the Return key afterwards), and a
list of Feature Space Layer Output File Names appears.
All of these output files represent different two-band combinations that will be displayed as
scatter plots. In this dialog you can see the band used for the X axis and the band used for the
Y axis of each scatter plot. One of the more common ways to look at a scatter plot is with TM
Band 4 on the X axis and TM Band 3 on the Y axis; this is why we chose the Reverse Axes
option.
Choose OK. All of these feature space files are then created.
Open a new Viewer and open the file tmfs_4_3.fsp.img. Zoom in to just the scatter plot
image (like below). This file represents the DNs for TM Band 4 on the X axis and the DNs for
TM Band 3 on the Y axis. The different colors represent the number of pixels, ranging from
pink (fewer pixels) up to red (more pixels). This band combination is one of the more common
scatter plots to look at. Another band combination you might want to look at is TM Band 5 vs.
TM Band 4.
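
The feature space image is essentially a two-dimensional histogram of DN pairs. The sketch below builds one with Matplotlib's hist2d; the band arrays are random data standing in for TM Bands 4 and 3 of tm070717.img, which you would normally read from the file first.

import numpy as np
import matplotlib.pyplot as plt

# Random data standing in for TM Band 4 and TM Band 3 of tm070717.img.
rng = np.random.default_rng(0)
band4 = rng.normal(80, 25, 100_000).clip(0, 255)
band3 = rng.normal(40, 15, 100_000).clip(0, 255)

# 2-D histogram of DN pairs = the feature space / scatter plot image.
plt.hist2d(band4, band3, bins=256, range=[[0, 255], [0, 255]], cmap="hot")
plt.colorbar(label="Number of pixels")
plt.xlabel("TM Band 4 DN")
plt.ylabel("TM Band 3 DN")
plt.title("Feature space, Band 4 vs. Band 3 (sketch)")
plt.show()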
Now, back in the Signature Editor, select one training area from your sig1.sig file.
Then go to Feature/Objects in the Signature Editor. This brings up the Signature Objects
dialog. Choose the Viewer number (in the case above it is Viewer #2), and then check the
boxes Plot Ellipses and Label. Then press OK.
You will see that the ellipse for your training area has appeared in the Feature Space image.
The ellipse represents the range of pixels in your image that this training area is
similar to, and that might be assigned to this class when the classification is run.
If you cannot see the ellipse well, you can change the color of that signature (white works well),
then go back to the Signature Objects window and press OK again. Now you can see it well. If
you are going to change colors here, you should save this signature file under a different name
now (sig1_fs.sig), so that you can go back to your old colors and signature file in sig1.sig if
you want to.
Now select several signatures you want to compare, go back into Signature Objects, and
press OK. If you selected all your signatures, you might have something that looks like the
graphic below, which is from another training set. You can zoom in and look more closely at which
classes overlap, or which encompass too many other classes (such as the Bare5 class
below).
This is one of the tools you can use to identify potential problems with your
training areas. You can look at several of your training sets and see whether there are overlapping
training sets that belong to different information classes (such as Bare Soil overlapping with
Coniferous Forest). You can also see where information classes fall in the scatter plot
(Water classes toward the lower left, for example, when you display TM Band 4 vs. TM Band
3). You can close the Feature Space window and the Signature Objects dialog now, if you
want.
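
The ellipses Imagine draws come from each signature's mean and covariance in the two plotted bands. The sketch below shows that idea: it derives an ellipse from the eigenvalues and eigenvectors of a signature's 2x2 covariance matrix and draws it over the scatter of training pixels. The training pixels here are synthetic, and the two-standard-deviation scaling is an assumption for illustration (Imagine's own scaling may differ).

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse

def signature_ellipse(train_pixels, n_std=2.0, **kwargs):
    """Ellipse for one training area in a two-band feature space.

    train_pixels: (n_samples, 2) array of (Band 4 DN, Band 3 DN).
    The ellipse axes come from the eigenvalues/eigenvectors of the
    2x2 covariance matrix, scaled to n_std standard deviations.
    """
    mean = train_pixels.mean(axis=0)
    cov = np.cov(train_pixels, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)               # ascending eigenvalues
    angle = np.degrees(np.arctan2(vecs[1, 1], vecs[0, 1]))
    width = 2 * n_std * np.sqrt(vals[1])           # major axis
    height = 2 * n_std * np.sqrt(vals[0])          # minor axis
    return Ellipse(mean, width, height, angle=angle, fill=False, **kwargs)

# Synthetic training pixels for two signatures in (Band 4, Band 3) space.
rng = np.random.default_rng(0)
water = rng.multivariate_normal([20, 15], [[9, 3], [3, 6]], 200)
forest = rng.multivariate_normal([90, 30], [[60, -10], [-10, 25]], 200)

fig, ax = plt.subplots()
ax.set_facecolor("black")
for pixels in (water, forest):
    ax.scatter(pixels[:, 0], pixels[:, 1], s=2)
    ax.add_patch(signature_ellipse(pixels, edgecolor="white"))
ax.set_xlabel("TM Band 4 DN")
ax.set_ylabel("TM Band 3 DN")
plt.show()

Two ellipses that overlap heavily while belonging to different information classes are exactly the kind of problem training areas this tool is meant to reveal.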
Transformed Divergence
Another tool to help you evaluate your training sets is a measure of statistical
separation between your training areas. One of these measures is called Transformed
Divergence. To look at this, in the Signature Editor make sure you have no training areas
selected. Then go to Evaluate/Separability. You get the following dialog, in which you should
choose the options shown in the box below.
Press OK. This brings up a cell array that is a matrix of the Transformed Divergence values
for every pair of training areas you have. Very generally, a Transformed Divergence value of
2000 tells you that two training sets are spectrally separable, while a value of 1000 or less
indicates that the classes are spectrally similar. You can read more about Transformed
Divergence in Lillesand and Kiefer.
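
For reference, the sketch below computes Transformed Divergence directly from two signatures' mean vectors and covariance matrices, using the standard formulation found in remote sensing texts: the pairwise divergence D is rescaled as TD = 2000 * (1 - exp(-D/8)), so values saturate at 2000. The two signatures here are synthetic.

import numpy as np

def transformed_divergence(mean_i, cov_i, mean_j, cov_j):
    """Transformed Divergence between two class signatures.

    Standard formulation: divergence D rescaled to a 0-2000 range, where
    2000 ~ fully separable and 1000 or less ~ spectrally similar.
    """
    inv_i, inv_j = np.linalg.inv(cov_i), np.linalg.inv(cov_j)
    dm = (mean_i - mean_j).reshape(-1, 1)
    d = 0.5 * np.trace((cov_i - cov_j) @ (inv_j - inv_i)) \
        + 0.5 * np.trace((inv_i + inv_j) @ dm @ dm.T)
    return 2000.0 * (1.0 - np.exp(-d / 8.0))

# Two hypothetical signatures (training pixels per class).
rng = np.random.default_rng(0)
water = rng.normal(20, 3, (80, 6))
soil = rng.normal(70, 8, (80, 6))
td = transformed_divergence(water.mean(0), np.cov(water, rowvar=False),
                            soil.mean(0), np.cov(soil, rowvar=False))
print(f"Transformed divergence: {td:.0f}")   # close to 2000 = well separated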
In the example above, values below 1500 mean that the Water 2 and Water 3 signatures, and
also the Water 2 and Water 4 signatures, are not very separable. You could expect that, since
these are in the same information class. Of course, this measure is more useful for checking
whether there are classes you would not want mixed up (there were none in this particular
example above, but there may or may not be some in your own).
These are a few ways of deciding how to refine your training set. Using these tools, you
could decide to remove signatures from your training set and run the supervised classification
again.
Using any of the tools mentioned above, remove any training sets you feel are really
not good for the supervised classification. (If you remove several, you may have to replace
them with new signatures in the same information class.) Save your new training set as Sig2.sig
and run your supervised classification again, naming the result super2.img. Look at the result
and compare it to your Super1.img.
Smoothing and Accuracy Assessment
If you have finished what you would consider a good supervised classification, you
might still notice that the result looks "pixelly", with stray pixels scattered throughout the
image. Often a "smoothing" or "generalization" is performed to get rid of this effect. We
won't do this in this lab, although the tools to do it are available in Imagine. Instead,
smoothing and filters will be covered in a class lecture.
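
For the curious, the usual smoothing is a majority (modal) filter: each pixel is replaced by the most common class in its neighbourhood, which removes isolated stray pixels. A minimal sketch using SciPy is shown below on a random stand-in for a classified image; it is only to illustrate the idea, not something you need to run for this lab.

import numpy as np
from scipy.ndimage import generic_filter

def majority(window):
    """Most common class value in the flattened 3x3 window."""
    return np.bincount(window.astype(int)).argmax()

# Random classified image standing in for super2.img (classes 0-5).
classified = np.random.default_rng(0).integers(0, 6, size=(100, 100))

# 3x3 majority filter: each pixel gets the most common class among
# itself and its 8 neighbours.
smoothed = generic_filter(classified, majority, size=3, mode="nearest")
print(classified[:3, :3], smoothed[:3, :3], sep="\n\n")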
Of course, the "real" evaluation of the classification is done after you believe a satisfactory
classification has been achieved, by using another, independent source of data to do an
Accuracy Assessment. You should already have heard about accuracy assessment in the
lectures, and you can read about it in the textbook. We won't do an accuracy assessment on the
classification in this lab, although the tools to do this also exist in Imagine.
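
For reference, an accuracy assessment boils down to building an error (confusion) matrix between independently checked reference points and the classified image, and summarizing it with overall accuracy and the kappa coefficient. The sketch below does this with NumPy on made-up labels.

import numpy as np

def confusion_matrix(reference, classified, n_classes):
    """Error matrix: rows = reference (ground truth), columns = classified."""
    m = np.zeros((n_classes, n_classes), dtype=int)
    for r, c in zip(reference, classified):
        m[r, c] += 1
    return m

# Hypothetical class labels at a handful of independently checked points.
reference  = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3])
classified = np.array([0, 1, 1, 1, 2, 2, 2, 3, 3, 0])

m = confusion_matrix(reference, classified, n_classes=4)
overall_accuracy = np.trace(m) / m.sum()

# Kappa compares the observed agreement with the agreement expected by chance.
chance = (m.sum(axis=0) * m.sum(axis=1)).sum() / m.sum() ** 2
kappa = (overall_accuracy - chance) / (1 - chance)

print(m)
print(f"Overall accuracy: {overall_accuracy:.2f}, kappa: {kappa:.2f}")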