Detecting Hidden Information in Images: A Comparative Study

Yanming Di, Huan Liu, Avinash Ramineni, and Arunabha Sen
Department of Computer Science and Engineering
Arizona State University, Tempe, AZ 85287
{yanming.di, huan.liu, avinash.ramineni, arunabha.sen}@asu.edu
Abstract
During the process of information hiding in a cover image, LSB-based steganographic techniques like JSteg change the statistical properties of the cover image. Accordingly, such information hiding techniques are vulnerable to statistical attack. An understanding of steganalysis methods and their effects can help in designing methods and algorithms that preserve data privacy. In this paper, we compare several steganalysis methods for attacking LSB-based steganographic techniques: logistic regression, the tree-based method C4.5, and the popular tool Stegdetect. Experimental results show that the first two methods,
especially the logistic regression method, are able to detect
hidden information with high accuracy. We also study the
relationship between the number of attributes (the frequencies of quantized DCT coefficients) and the performance of
a classifier.
1. Introduction
The last few years have seen a significant rise in interest among computer security researchers in the science
of steganography. Steganography [8, 9] is the art of hiding
and transmitting data through apparently innocuous carriers in an effort to conceal the existence of the secret data.
The term steganography in Greek means covered writing
whereas cryptography means secret writing. Steganography is different from cryptography in that while the goal of
a cryptographic system is to conceal the content of the messages, the goal of information hiding or steganography is
to conceal their existence. Steganography in essence camouflages a message, concealing the very fact that a message is being carried. Steganography thus provides a plausible deniability for secret communication that cryptography does not. Covert information is not necessarily secure, and secure information is not necessarily covert. The goal of cryptography is the secure transfer of a secret message, whereas the goal of steganography is to make the transfer of a secret message undetectable. An understanding of steganalysis methods and their effects can help in designing methods and algorithms that preserve data privacy. (Footnote: Yanming Di can also be reached at yanming [email protected].)
With the increase of the digital content (and distribution
of multimedia data) on the Internet, steganography has become a topic of growing interest. A number of programs
for embedding hidden messages in images and audio files
are available [18]. Most of these steganographic methods
modify the redundant bits in the cover medium (carrier)
to hide the secret messages. Redundant bits are those bits
that can be modified without degrading the quality of the
cover medium. Replacing these redundant bits with message bits creates a stego medium. The modification of the redundant bits can change the statistical properties of the cover medium. As a result, statistical analysis may reveal the
presence of hidden content. Detecting the steganographic
content is called steganalysis [7]. It is defined as the art and
science of breaking the security of steganographic systems.
As the goal of steganography is to conceal the existence of
a secret message, a successful attack on a steganographic
system consists of detecting that a certain file contains hidden information in it. Detection of steganographic modifications in an image can be made possible by testing its statistical properties. If the statistical properties deviate from
a given norm it can be identified as a stego image.
In this paper we present two new methods for steganalysis and use them to attack the steganographic tool JSteg. There have been many attempts at breaking JSteg, but their accuracy has not been high. The two novel methods of breaking JSteg presented here can further be extended to other tools like JPhide and OutGuess [18].
The rest of the paper is organized as follows. Section 2 introduces the JSteg steganographic method, the idea of the statistical attack, and some related work. Section 3 presents
two new methods for attacking LSB-based steganographic
methods. We also discuss how the knowledge of the distribution of the DCT coefficients can help in building our
models. Section 4 reports results of our experiments of
comparing the two new methods with the popular program
Stegdetect [16]. Section 5 shows that careful model selection is needed to achieve high accuracy when using data
mining and statistical methods for steganalysis. Section 6
concludes the paper.
2. Histogram Analysis

The JPEG image format [15] uses a discrete cosine transform (DCT) to transform each 8x8 block of source image pixels into 64 DCT coefficients. The DCT coefficients F(u, v) of an 8x8 block of image pixels f(x, y) are given by

F(u, v) = (1/4) C(u) C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos[(2x + 1)u\pi / 16] \cos[(2y + 1)v\pi / 16],

where C(u) = 1/\sqrt{2} for u = 0, C(u) = 1 for u > 0, and u, v = 0, 1, ..., 7.

The coefficients are then quantized using a 64-element quantization table Q by the operation F^Q(u, v) = round(F(u, v) / Q(u, v)). The least-significant bits (LSBs) of the quantized DCT coefficients can be used to embed hidden messages.

JSteg hides data in JPEG image files by changing the LSBs of the quantized DCT coefficients: it replaces the LSBs of the quantized DCT coefficients with secret message bits. Other similar steganography methods include JPHide and OutGuess 0.13. Steganographic techniques like JSteg that modify the LSBs can be detected by analyzing the frequencies of the quantized DCT coefficient values. Modifying the least significant bit transforms one value into another that differs only by 1. These pairs of values are called PoVs in [19]. They introduced a powerful statistical attack that can be applied to any steganographic technique in which a fixed set of PoVs are flipped into each other during the process of embedding message bits. The insight of their statistical attack is based on the observation that if the message bits are equally distributed, modifying the least significant bits will reduce the difference in frequency between the two values of each PoV, which would otherwise be unequal with very high probability. This equalization can be detected by appropriate statistical tests.

Based on this idea, Provos and Honeyman [16] carried out an extensive analysis of JPEG images using the steganalytic software Stegdetect. Stegdetect is based on a Chi-square test. It is able to detect messages hidden in a JPEG image using steganography software such as JSteg, JPHide or OutGuess 0.13. However, the Chi-square test used in [16] does not produce results with very high accuracy. In this paper, we will demonstrate that the histogram information can be used more efficiently by using logistic regression or a decision tree method, C4.5 [17].

Similar work in this area is reported by Berg et al. [1], Farid [4] and Zhang and Ping [22]. Berg et al. [1] used statistical learning methods in steganalysis; the attributes used in their learning procedure are unconditional entropy, conditional entropies and transition probabilities. Farid [4] built higher-order statistical models of images using a type of wavelet decomposition. Zhang and Ping [22] proposed an attack on the JSteg method based on a different Chi-square test. [5] is a very good survey on the state of the art of steganalysis.

3. Classification Methods

Distinguishing between images with and without hidden data can naturally be viewed as a classification problem. We refer to the two classes as stego and normal. We use a set of images as training data to construct classifiers. When a classification algorithm is run on a data set (stego and normal images), it finds a boundary between the two classes and creates a model. Given a new set of images, the learned model can be used to predict the class to which each image belongs. Here we propose to apply two classification methods to detecting hidden messages in JPEG images: the logistic regression method and the tree-based method C4.5. We discuss next the attributes to be used in the statistical models.
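To make the LSB embedding and the PoV chi-square idea of Section 2 concrete, here is a minimal pure-Python sketch. It is our own illustration under simplifying assumptions (synthetic Gaussian "coefficients", full-capacity embedding, function names ours); it is not the authors' or Stegdetect's actual code.

```python
import random
from collections import Counter

def embed_lsb(coeffs, message_bits):
    """JSteg-like embedding sketch: replace the LSBs of usable quantized
    coefficients (every value except 0 and 1) with message bits."""
    out = list(coeffs)
    bits = iter(message_bits)
    for i, c in enumerate(out):
        if c in (0, 1):
            continue  # JSteg leaves the values 0 and 1 untouched
        try:
            out[i] = (c & ~1) | next(bits)
        except StopIteration:
            break  # message exhausted
    return out

def pov_chi_square(coeffs):
    """Chi-square statistic over pairs of values (2k, 2k+1).  LSB embedding
    equalizes the two counts in each pair, so the statistic shrinks."""
    h = Counter(coeffs)
    stat = 0.0
    for k in range(-8, 8):
        if (2 * k, 2 * k + 1) == (0, 1):
            continue  # pair unused by JSteg
        a, b = h[2 * k], h[2 * k + 1]
        expected = (a + b) / 2.0
        if expected > 0:
            stat += (a - expected) ** 2 / expected + (b - expected) ** 2 / expected
    return stat

rng = random.Random(0)
cover = [round(rng.gauss(0, 4)) for _ in range(20000)]  # synthetic "coefficients"
stego = embed_lsb(cover, (rng.randint(0, 1) for _ in range(20000)))
# the PoV statistic drops sharply after embedding an equally distributed message
```

Running the sketch shows the equalization effect the attack exploits: the chi-square statistic computed on `stego` is far smaller than on `cover`.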
3.1. Predicting Variables

As discussed earlier, LSB-based steganography techniques such as JSteg insert information into images by replacing the LSBs of the quantized DCT coefficients with the secret text. This changes the frequencies of the quantized DCT coefficient values. Therefore, the frequencies of quantized DCT coefficient values are natural candidates for predicting variables to be used in the statistical models. However, for JPEG images, the DCT coefficients can have a wide range of values. Using the frequencies of all these values to build a model is not practical. More importantly, using more variables than needed may introduce several problems. For a regression model, it may lead to ill-conditioned matrices and unstable estimates of the model parameters. In general, adding attributes that are not really important to a statistical model is like adding noise to the model, thereby degrading its performance. Therefore careful model selection is very important.

Research shows that for normal images, the distribution of the DCT coefficients can be approximately modeled as a Laplacian or a generalized Gaussian distribution [3, 14, 10]. These models of the distribution of the DCT coefficients suggest that the frequencies of DCT coefficients with small values are more unevenly distributed: the difference between the frequencies of two adjacent small coefficient values is generally greater than the difference between the frequencies of two adjacent large values. So the DCT coefficient values with small magnitude are more sensitive to information inserted using a JSteg-like method. (See Figure 2 in [16] for histograms of the DCT coefficients before and after messages are inserted into a JPEG image using a JSteg-like method.) We use h_v to denote the frequency of the DCT coefficients with value v, i.e., the number of DCT coefficients that take the value v. In this study, we use the six central frequencies h_v, for the coefficient values v closest to zero, to build our models. We do not use the frequencies of the values 0 and 1, since the JSteg method does not modify DCT coefficients with these values. Our experiments demonstrate (results presented in Section 5) that the use of more variables does not improve the performance of the models.

Figure 1. The frequencies of the quantized DCT coefficients having values -1 and -2. Black dots are for the normal JPEG images; symbols 1, 2 and 3 are for stego images with 20, 100 and 200 bytes of hidden messages respectively.

3.2. Logistic Regression

Logistic models are widely used in statistical applications where binary responses (two classes) occur. In our case, an image is either stego or normal. We can assume that the probability p(x) of an image being stego is a function of some image characteristics x (a vector). For example, x can be the frequencies of the quantized DCT coefficient values. In the case of two classes, the logistic model has a very simple form. The probability function is modeled by

log[ p(x) / (1 - p(x)) ] = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + ... + \beta_k x_k,

where the x_i's are the components of the vector x. The model parameters \beta_i are usually estimated by maximum likelihood. Note that the logit function log[p / (1 - p)] is a monotone function, and its inverse function e^z / (1 + e^z) guarantees that the values of p(x) lie between 0 and 1. More details on logistic regression can be found in [6].

We use the S-PLUS [2] function glm [12] to estimate the probability function p(x). As mentioned in the previous subsection, we use the six central frequencies h_v as predicting variables. Adding more variables increases the complexity of the model but does not necessarily guarantee higher accuracy. In fact, as we show in Section 5, in some instances accuracy can actually deteriorate.

We include only linear terms of the variables in the model, based on the following argument. Take the pair h_{-1} and h_{-2} for example. A plot of these pairs is shown in Figure 1. The plot suggests that in a normal JPEG image, the frequency of the DCT coefficients having the value -1 is almost always higher than the frequency of those having the value -2. This is true because the distribution of each individual DCT coefficient tends to have a mode in the center (see [3, 14, 10] for modeling of the DCT coefficients). If we have a very large image with a large number of DCT coefficients, this trend should also hold for other pairs of coefficient values. For small images, however, this trend may occasionally not be seen. As the images in our experiments were rather small, we chose the six central frequencies as predicting variables in the statistical models, because the trend holds in this case. This important piece of extra information is not utilized by any known steganalysis method. An implication of this observation is that linear logistic models should work well in detecting JSteg-like steganographic techniques. The above discussion also suggests that the methods presented in this paper should work better for larger images.

Compared with a Chi-square test, logistic regression is more refined. Logistic regression has many advantages:

1. It is fast. Logistic regression is implemented efficiently in almost all professional statistical packages, such as S-PLUS and SAS.

2. It is easy to interpret. We can derive a closed-form expression for the probability function p(x).

3. It is flexible. We can adjust one simple parameter in the model to meet different accuracy needs.
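The maximum-likelihood fit of the linear logistic model can be sketched in a few lines of pure Python. This is our own one-variable illustration using gradient ascent on the log-likelihood; it is not the S-PLUS glm fit used in the paper, and the toy data are invented.

```python
import math

def sigmoid(z):
    """Inverse logit: maps any real z into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """Maximum-likelihood fit of p(x) = sigmoid(b0 + b1 * x) by gradient
    ascent on the log-likelihood (one predicting variable for brevity)."""
    b0 = b1 = 0.0
    n = float(len(xs))
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # gradient term of the log-likelihood
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# toy data: class 1 ("stego") tends to have the larger feature value
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

Thresholding the fitted p(x) at 0.5 (the "simple parameter" mentioned in advantage 3 above, which can be moved to trade false positives against false negatives) then yields the stego/normal decision.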
3.3. Tree-based Method

Tree-based methods can also be used. We fit a tree-based model, C4.5 [17], to the training data. C4.5 builds classification models called decision trees
from the training data. Each internal node in the tree specifies a binary test on a single attribute, using thresholds on numeric attributes. If, as a result of the tests conducted at the internal nodes, an image ends up in a leaf node where the majority of images are stego, it is classified as a stego image; otherwise it is classified as a normal image.
The tree is constructed by the following procedure:
1. Choose the best attribute for splitting the data into 2
groups at the root node.
2. Determine a splitting point by maximizing some specified criterion (say, information gain).
3. Recursively carry out the first two steps until information gained by the process cannot be improved any further.
Information gained by splits can be used as the criterion
for determining the attributes and the splitting points. Once
the tree is constructed it can be used for classifying the test
data.
Tree-based methods can be more flexible than logistic regression: they make fewer assumptions about the data, so they can be generalized to other situations more easily. One disadvantage of the tree-based method is that the decision regions for classification are constrained to be hyper-rectangles, with boundaries parallel to the input-variable axes. As in the case of the logistic model, the training data here also consist of the central frequencies of the quantized DCT coefficient values of the images as attributes, or predicting variables. We use the data mining tool WEKA [20] to run the C4.5 algorithm on the training data.
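The split-selection step of the tree-construction procedure above can be sketched as follows. This is a minimal pure-Python illustration of choosing a threshold on one numeric attribute by information gain; it is not the actual C4.5/WEKA implementation, which additionally handles gain ratio, pruning, and multiple attributes.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(xs, ys):
    """Return (threshold, gain): the binary split x <= t on a single numeric
    attribute that maximizes information gain over the labels ys."""
    base = entropy(ys)
    best_t, best_gain = None, -1.0
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # degenerate split: one side empty
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        if base - remainder > best_gain:
            best_t, best_gain = t, base - remainder
    return best_t, best_gain

# e.g. a frequency attribute that separates normal from stego images cleanly
t, gain = best_split([1, 2, 3, 10, 11, 12],
                     ["normal", "normal", "normal", "stego", "stego", "stego"])
```

On this toy attribute the best threshold is 3, with the maximum possible gain of 1 bit, because the split separates the two classes perfectly.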
4. Experiments and Results
We have a data set of 180 normal JPEG images, downloaded from the Internet. All the images
have been cropped to the same size. We used the JSteg method to insert 20 bytes, 100 bytes, and 200 bytes of text messages into the images. According to the author of JSteg, the maximum size of the message that can be inserted into a cover image is approximately a fixed fraction of the size of the image file; for some of our image files, this limit is only 200 bytes. The secret messages used in our experiments for insertion in cover images were taken from Gutenberg's Etext of Shakespeare's First Folio.

Figure 2. Boxplots of the error rates. The left three bars are error rates for experiment 1, the middle three are for experiment 2, and the right three are for experiment 3. Smaller values are better.
We use 10-fold cross-validation to compare the three steganalysis methods: logistic regression, the tree-based method C4.5, and the Stegdetect method. In each of our experiments, we take the original 180 JPEG images and one group of 180 images with embedded text messages. 200, 100 and 20 bytes of text were embedded in the cover images in our experiments 1, 2 and 3 respectively. The 10-fold cross-validation results are summarized in Tables 1, 2, 3, and Figure 2.
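The 10-fold validation scheme used here can be sketched generically in pure Python. The "classifier" below is a placeholder threshold rule on a single feature, invented for the sketch; the paper's actual models are logistic regression, C4.5 and Stegdetect.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, fit, predict, k=10):
    """Mean error rate over k folds: train on k-1 folds, test on the held-out fold."""
    folds = k_fold_indices(len(xs), k)
    rates = []
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([xs[j] for j in train], [ys[j] for j in train])
        test = folds[i]
        wrong = sum(predict(model, xs[j]) != ys[j] for j in test)
        rates.append(wrong / len(test))
    return sum(rates) / k

# placeholder classifier: threshold a single feature at the training mean
fit = lambda X, Y: sum(X) / len(X)
predict = lambda threshold, x: int(x > threshold)
xs = list(range(10)) + list(range(100, 110))  # two well-separated groups
ys = [0] * 10 + [1] * 10
err = cross_validate(xs, ys, fit, predict, k=10)
```

With the groups this well separated, every held-out fold is classified correctly, so the mean error rate is zero; the per-fold error rates in Tables 1-3 are obtained in the same train-on-nine, test-on-one fashion.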
From the results, we can see that the logistic regression method performs better than the other two methods in all three experiments. The tree-based method C4.5 performs better than the Stegdetect method in experiment 1, and the performance of the latter two methods is similar in experiment 2, while in experiment 3 both methods fail to perform better than random guessing.
The performance of the logistic regression method is noteworthy. When 20-byte messages are embedded in the images, the mean error rate for this method is 0.394. This implies that even when only 20 bytes of message are embedded, the logistic regression method is able to perform better than random guessing, whereas the other two methods perform no better than random guessing. In the tree-based method, the boundaries of the decision regions are constrained to be parallel to the input-variable axes. However, as may be observed in Figure 1, the true boundary between the normal and the stego images, in terms of the attributes, is not parallel to the input-variable axes. For this reason the performance of the tree-based method is not as good as that of logistic regression.

Our chosen methods do not rely on knowledge of the locations where the information is hidden. As such, they can be effectively utilized to break similar LSB-based methods that use random bit selection, e.g., OutGuess 0.13.

Table 1. Error rates for the logistic regression, the tree-based method C4.5 and the Stegdetect method in experiment 1 (in the stego images, 200 bytes of text message is embedded using JSteg). The mean and standard deviation of the error rates are shown at the bottom of the table.

run      logistic   tree       stegdetect
1        0.000000   0.055556   0.194444
2        0.000000   0.055556   0.305556
3        0.027778   0.083333   0.111111
4        0.000000   0.055556   0.250000
5        0.000000   0.000000   0.111111
6        0.000000   0.027778   0.194444
7        0.027778   0.194444   0.166667
8        0.000000   0.027778   0.222222
9        0.000000   0.027778   0.111111
10       0.000000   0.027778   0.277778
mean     0.005556   0.055556   0.194444
stdev    0.011712   0.053990   0.070516

Table 2. Error rates for the logistic regression, the tree-based method C4.5 and the Stegdetect method in experiment 2 (in the stego images, 100 bytes of text message is embedded using JSteg). The mean and standard deviation of the error rates for each method are shown at the bottom of the table.

run      logistic   tree       stegdetect
1        0.055556   0.227778   0.194444
2        0.083333   0.202222   0.250000
3        0.111111   0.202222   0.166667
4        0.055556   0.138889   0.250000
5        0.055556   0.222222   0.194444
6        0.000000   0.194444   0.138889
7        0.083333   0.444444   0.250000
8        0.055556   0.166667   0.222222
9        0.083333   0.305556   0.250000
10       0.000000   0.083333   0.333333
mean     0.058333   0.218778   0.225000
stdev    0.035741   0.098394   0.054700

Table 3. Error rates for the logistic regression, the tree-based method C4.5 and the Stegdetect method in experiment 3 (in the stego images, 20 bytes of text message is embedded using JSteg). The mean and standard deviation of the error rates for each method are shown at the bottom of the table.

run      logistic   tree       stegdetect
1        0.388889   0.527778   0.555556
2        0.305556   0.555556   0.527778
3        0.444444   0.555556   0.472222
4        0.500000   0.611111   0.416667
5        0.388889   0.527778   0.444444
6        0.388889   0.583333   0.305556
7        0.333333   0.416667   0.638889
8        0.416667   0.583333   0.555556
9        0.444444   0.507778   0.500000
10       0.333333   0.611111   0.555556
mean     0.394444   0.548000   0.497222
stdev    0.059720   0.057986   0.093008
5. On the number of predicting variables

We indicated in Section 3 that the use of excessive variables may not lead to better results. We illustrate this phenomenon with results from our experiments, in which a varying number of predicting variables (2, 4, 6, ..., 20) were used instead of just the six central frequencies. The estimated error rates for the logistic regression method using 2, 4, 6, ..., 20 variables (10-fold cross-validation) are summarized in Figure 3. The figure indicates that using only the two most central frequencies, it may not be possible to capture all the information in the data. Increasing the number of variables to four or six improves the accuracy. However, the use of more than six variables does not improve the accuracy but causes slightly larger variance in the estimated error rates. We observe similar trends with the tree-based methods. In our experiments, therefore, we use a model with six predicting variables. More sophisticated feature selection algorithms can be found in the literature [6, 11, 13, 21]. We intend to explore whether applying these feature selection algorithms can lead to further performance improvement.

Figure 3. Comparison of logistic models with different numbers of predicting variables (horizontal axis: number of variables used, 2 to 20; vertical axis: error rates). The sizes of the hidden messages in the stego images are 20, 100 and 200 bytes in panels (a), (b) and (c) respectively.
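Our reading of the variable-selection rule (take the frequencies h_v, i.e. the counts of quantized DCT coefficients with value v, for the values v closest to zero, skipping v = 0 and v = 1, which JSteg never modifies) can be sketched as a small helper. This is illustrative only; the choice of six variables is the paper's empirical finding, and the toy histogram is invented.

```python
def central_frequencies(histogram, n=6):
    """Pick the n frequencies h_v whose coefficient values v are closest to
    zero, skipping v = 0 and v = 1 (values JSteg never modifies).
    `histogram` maps a quantized DCT value v to its count h_v."""
    usable = sorted((v for v in histogram if v not in (0, 1)),
                    key=lambda v: (abs(v), v))  # deterministic tie-break
    return [histogram[v] for v in usable[:n]]

# toy histogram, peaked at zero as a Laplacian-like model suggests
hist = {v: 100 - abs(v) for v in range(-5, 6)}
features = central_frequencies(hist, 6)
```

Growing `n` in this helper reproduces the sweep of Figure 3: each extra pair of variables adds frequencies of larger-magnitude values, which the models above show carry little additional signal.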
6. Conclusion
LSB-based steganographic techniques like JSteg change the statistical properties of the cover image when they embed a secret message in the image. Accordingly, such methods are vulnerable to statistical attack. Previous methods such as Stegdetect are based on the Chi-square test, and the accuracy of Stegdetect can be improved: when the size of the hidden message is small, it performs no better than random
guess. In this paper we have proposed two new steganalysis
methods based on the logistic regression and the tree-based
method C4.5 for attacking LSB-based steganographic techniques.
We conducted experiments to evaluate the performance
of the two data mining techniques and compared them with
the performance of the well known method Stegdetect. The
experiments demonstrated that the performance of the logistic-regression-based technique is very impressive: when a large amount of information is hidden, it detects the hidden messages with very high accuracy, and even when the amount of hidden information is very small, it performs better than random guess.
The tree-based method C4.5 outperforms Stegdetect in the
experiment where a relatively large amount of information
is hidden. However, it does not perform well when the
amount of hidden information is small. We suggest that
one reason for C4.5 not performing as well as the logistic
regression is that it tends to produce boundaries that are parallel to the input variable axes, which in this case may not
be appropriate.
We also pointed out that the number of attributes used in classification is related to a classifier's performance. Many present steganalysis methods do not treat this as a serious problem and hence tend to use all related attributes. However, using more variables than needed does not necessarily lead to good performance and may even significantly degrade the performance of a statistical learning model. When selecting the predicting variables for our model, we take the distribution of the DCT coefficients into consideration.
An understanding of steganalysis methods and their effects can help in designing methods and algorithms that preserve data privacy. Our experiments were carried out to break methods like JSteg that are employed to hide information. Our methods do not rely on the placement of the hidden information; therefore they can be used without any modification against LSB-based steganographic techniques that use random bit selection.
7. Acknowledgements
The authors would like to thank Sidi Goutam and Amit
Mandvikar for their help in this project. The authors also
wish to thank the reviewers for their helpful comments in
the preparation of this manuscript.
References
[1] G. Berg, I. Davidson, M.-Y. Duan, and G. Paul. Searching for hidden messages: automatic detection of steganography. In 15th AAAI Innovative Applications of Artificial Intelligence (IAAI) Conference, 2003.
[2] J. M. Chambers and T. Hastie, editors. Statistical models in
S. London: Chapman & Hall, 1991.
[3] R. J. Clarke. Transform Coding of Images. London: Academic Press, 1985.
[4] H. Farid. Detecting hidden messages using higher-order statistical models. In International Conference on Image Processing (ICIP), Rochester, NY, 2002, 2002.
[5] J. Fridrich and M. Goljan. Practical steganalysis of digital images - state of the art. In Proc. SPIE Photonics West, Vol. 4675, Electronic Imaging 2002, Security and Watermarking of Multimedia Contents, San Jose, California, January 2002, pages 1-13.
[6] T. Hastie, R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer-Verlag, 2001.
[7] N. F. Johnson and S. Jajodia. Steganalysis of images created using current steganography software. In D. Aucsmith, editor, Information Hiding: Second International Workshop, volume 1525 of Lecture Notes in Computer Science, pages 273-289. Springer-Verlag, Berlin, Germany, 1998.
[8] D. Kahn. The Codebreakers - The Story of Secret Writing. Scribner, New York, NY, USA, 1996.
[9] D. Kahn. The history of steganography. In R. J. Anderson, editor, Information Hiding, First International Workshop, volume 1174 of Lecture Notes in Computer Science, pages 1-5. Springer-Verlag, Berlin, Germany, 1996.
[10] E. Y. Lam and J. W. Goodman. A mathematical analysis of the DCT coefficient distributions for images. IEEE Transactions on Image Processing, 9(10):1661-1666, 2000.
[11] H. Liu and H. Motoda. Feature Selection for Knowledge Discovery and Data Mining. Boston: Kluwer Academic Publishers, 1998.
[12] P. McCullagh and J. A. Nelder. Generalized Linear Models (second edition). London: Chapman & Hall, 1989.
[13] A. Miller. Subset Selection in Regression. Chapman & Hall/CRC, 2nd edition, 2002.
[14] F. Müller. Distribution shape of two-dimensional DCT coefficients of natural images. Electronics Letters, 29(22):1935-1936, 1993.
[15] W. B. Pennebaker and J. L. Mitchell. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York, NY, USA, 1993.
[16] N. Provos and P. Honeyman. Detecting steganographic content on the Internet. Technical report, CITI, 2001.
[17] J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
[18] Steganography software. www.stegoarchive.com, 1997-2003.
[19] A. Westfeld and A. Pfitzmann. Attacks on steganographic systems, 1999.
[20] I. Witten and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.
[21] L. Yu and H. Liu. Feature selection for high-dimensional data: a fast correlation-based filter solution. In T. Fawcett and N. Mishra, editors, Proceedings of the 20th International Conference on Machine Learning (ICML-03), pages 856-863, Washington, D.C., 2003. Morgan Kaufmann.
[22] T. Zhang and X. Ping. A fast and effective steganalytic technique against JSteg-like algorithms. In ACM Symposium on Applied Computing, Melbourne, Florida, USA, March 2003.