Food Control 16 (2005) 339–347
www.elsevier.com/locate/foodcont
Characterization of virgin olive oils according to its triglycerides
and sterols composition by chemometric methods
T. Galeano Diaz a,*, I. Durán Merás a, J. Sánchez Casas b, M.F. Alexandre Franco a
a Department of Analytical Chemistry, Faculty of Sciences, University of Extremadura, E-06071 Badajoz, Spain
b Institute of Agricultural Technology of Junta de Extremadura, Ctra. San Vicente, s/n, Finca Santa Engracia, E-06071 Badajoz, Spain
Received 19 October 2003; received in revised form 16 March 2004; accepted 19 March 2004
Abstract
Principal component analysis (PCA) and soft independent modelling of class analogy (SIMCA) were applied
to the contents of the various triglycerides, sterols, or both, to explore their capacity to typify a variety of
olive oil belonging to a Spanish designation of origin. This study demonstrates that the oils obtained from a
specific type of olive (“Manzanilla Cacereña” from the north of Caceres (Extremadura, Spain)) can be
characterized according to their chemical composition. The best results were obtained with the triglyceride
contents. The plots of the PCs showed that PC1 is related to the category variable “variety” and PC2 to
“maturity”. SIMCA was employed to assign unknown samples to one of two groups or classes, depending on
the “variety” of olives, for which independent PCA models were built. Coomans' plot showed that the
different olive oils cluster in different groups and that each group can be clearly distinguished.
© 2004 Elsevier Ltd. All rights reserved.
Keywords: Olive oils; Classification; PCA; SIMCA; Triglycerides; Sterols
* Corresponding author. Tel.: +34-2428-9300; fax: +34-2428-9375.
E-mail address: [email protected] (T.G. Diaz).
0956-7135/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
doi:10.1016/j.foodcont.2004.03.014
1. Introduction
All food products, such as the olive oils in this case, are
complex chemical objects that we perceive and evaluate in a global way. It is often
important to have a reliable identification and classification of olive oils according to olive variety and
geographic origin, and for this they usually must be
evaluated from a multivariate point of view, which is
the approach taken by food chemometrics (Forina, Lanteri, &
Armanino, 1987). Several studies have been carried out
to correlate the chemical composition of olive oil with
geographic origin (Aparicio, Albi, Lanzon, & Navas,
1987; Ferreiro & Aparicio, 1992; Fiorino & Nizzi, 1991;
Gigliotti, Daghetta, & Sidoli, 1993; Leardi & Paganuzzi,
1987; Tsimidou & Karakostas, 1993), and chemometric
methods have been applied to several chemical components for the classification of olive oils. Frequently,
the contents of the different chemical components are
determined by chromatographic methods, and several of
the chromatographic procedures employed for the
characterization of vegetable oils, as well as for their
authentication, have recently been reviewed (Aparicio &
Aparicio-Ruiz, 2000). Besides the methods cited in this
review, we can mention other methods based on the
contents of fatty acids (Spangenberg, Macko, & Hunziker, 1998; Stefanoudaki, Kotsifaki, & Koutsaftakis,
1999); sterols (Paganuzzi, 1985); fatty acids and sterols
(Forina, Armanino, Lanteri, Calcagno, & Tiscornia,
1983); fatty acids, fatty alcohols and triterpenes (Bianchi, Giansante, Shaw, & Kell, 2001); sterols, triterpenic
alcohols and hydrocarbons (Aparicio, Ferreiro, Cert, &
Lanzon, 1990); triglycerides (Damiani et al., 1997;
Favretto et al., 1999); and fatty acids and triglycerides
(Tsimidou, Macrae, & Wilson, 1987). More recently,
techniques other than chromatographic ones
have also been used. Thus, olive oils
of different origin have been classified using near-IR spectral
data in combination with an artificial neural network or
with logistic regression (Bertran et al., 2000); the
high-field 1H NMR spectroscopic data corresponding to
minor constituents, subjected to principal component
analysis or to cluster analysis (Mannina, Patumi, Proietti, Bassi, & Segre, 2001; Sacchi et al., 1998); or the data
corresponding to aroma compounds obtained by means
of sensors constituting an electronic nose (Capone et al.,
2000; Guadarrama, Rodríguez-Mendez, Sanz, Ríos, &
de Saja, 2001; Pardo, Sberveglieri, Gardini, & Dalcanale, 2000).
Chemometric techniques have been defined as the
utilization of mathematical and statistical methods for
handling, interpreting and predicting chemical data
(Bertsch, Mayfield, & Thomason, 1981). Multivariate techniques that
associate an n-dimensional bounded region with each category have an advantage
over those that only construct a separator between the different categories: they allow hidden
information to be extracted, and they characterize the redundant
information that often completely masks the
relevant information contained in the chemical composition (Forina & Lanteri, 1984).
Thus, one of the main goals is to use multivariate methods to enable the visualization of data when
more than three variables have been measured. The
mathematical basis is the representation and manipulation of data by vectors and matrices (Kellner, Mermet,
Otto, & Widmer, 1998).
These methods are aimed at projecting the original
data set from a high-dimensional space onto a line, a
plane, or a 3D coordinate system. Principal component analysis (PCA) finds an alternative set of axes
about which a data set may be represented, and it is
designed to provide the best possible view of the variability
in the independent variables of a multivariate data set.
We can try to relate this variability to fundamental
variables such as origin or variety. Samples whose coordinates on these new axes are similar probably belong
to classes or groups with similar values of
these fundamental variables. Therefore, when a preliminary exploratory analysis of the data reveals clustering
related to the values of some variables, new
PCA models can be constructed for these groups and the
capacity of the measured data to assign new samples to
these classes can be examined. One of the supervised
pattern recognition methods based on the description of
individual categories by means of independent PCA
mathematical models is soft independent modelling of
class analogy (SIMCA) (Wold et al., 1984; Wold et al.,
1983).
The object of this study is to explore the capacity
of different chemical parameters, normally determined
in olive oil analysis, to differentiate and classify olive oil
samples of different origin, in order to confirm the
authenticity of “Manzanilla Cacereña” olive oils.
PCA and SIMCA have been applied to the
contents of the various triglycerides, the various sterols, or both. Other variables
usually measured (acidity, peroxide index, colour,
fatty acids, stability, etc.) have also been examined. The
oils obtained from “Manzanilla Cacereña” olives of
the north of Caceres (Extremadura, Spain) are of very
good quality and are protected by a “Denominación
de Origen” (designation of origin).
2. Experimental
2.1. Samples
A total of 80 samples of extra virgin olive oil were obtained in the harvesting
periods 1999–2000 and 2000–2001: 44 “Manzanilla Cacereña” oils (“Cacereñas”) from oil mills (almazaras)
of the north of Caceres, in the region of Extremadura,
and 36 other mono-varietal oils (“non-Cacereñas”).
Sampling was carried out from the
beginning of November to the end of January at three
sampling dates (different stages of maturity).
2.2. Reagents
Anhydrous sodium sulphate; potassium hydroxide;
acetone, acetonitrile, chloroform, ethanol, ether, and
hexane, from Panreac. Pyridine, hexamethyldisilazane,
chloromethylsilane, phenolphthalein, and silica gel 60
plates (20 × 20 cm), from Merck. β-Sitosterol from Sigma,
uvaol and stigmasterol from ICN Biomedicals, and 0.45
µm nylon filters from Agilent.
All other reagents were of analytical grade and were
purchased from Merck or Aldrich.
2.3. Analysis of triglycerides, sterols, and other variables
The analysis of triglycerides was performed according
to the official chromatographic method of the EC, no.
2472/97 (Official Journal of the European Communities L
341, 12.12.1997, p. 25). The apparatus was a Hewlett
Packard model 1100 HPLC instrument consisting of a
degasser, quaternary pump, manual six-way injection
valve, refractometer detector, and the Chemstation software package for instrument control, data acquisition,
and data analysis. A Lichrosorb FP 18 (4.6 × 0.25 mm)
analytical column was used.
The analysis of sterols was performed according to
the official method of the EC, no. 2568/91 (Official Journal
of the European Communities L 248, 5.9.1991, p. 1).
The apparatus was a Hewlett Packard model
6890 gas chromatograph, equipped with a flame ionization detector (FID), an HP-5 (Crosslinked 5% PH ME
Siloxane) capillary column (30 m × 0.25 mm × 0.25 µm)
and a 6890 Agilent automatic injector.
The determination of acidity, peroxide index,
colour parameters (L*, a*, b*, C*ab,
<H(a*)), polyphenols, and stability of the olive oils was performed according to the official methods of the EC.
2.4. Chemometric analysis
The supervised pattern recognition method soft
independent modelling of class analogy (SIMCA), based
on the description of individual categories by means of
independent principal component analysis (PCA)
mathematical models, was used to classify
the olive oil samples as belonging to one of the two
classes “Manzanilla Cacereña” or “Non-Manzanilla Cacereña”.
The Unscrambler software package (version 6.0, CAMO, Trondheim, Norway)
was used for the application of both PCA and
SIMCA, as well as for the preliminary exploratory
analysis of the data. Different groups of variables were
measured: parameters of general character (acidity, peroxide index, colour), fatty acids, triglycerides,
and sterols; they were used separately or in combination.
3. Results and discussion
Measuring several variables in a number of
samples gives rise to large data tables that usually contain a large amount of information, too complex to be
easily interpreted, so that part of this
information can remain hidden. PCA is a commonly used
unsupervised multivariate technique that
helps us to find in what respect one sample differs from
another. The principle of PCA is to find the linear
combinations of the initial variables that contribute most to making the samples different from each other.
These combinations are called principal components
(PCs). They are computed iteratively, in such a way that
the first PC is the one that carries most information (or,
in statistical terms, most explained variance). The second PC then carries the maximum share of the
residual information.
Therefore, PCA finds an alternative set of coordinate
axes, the PCs, about which the data set may be represented. The
PCs are orthogonal to each other and are ranked so
that each one carries more information than any of the
following ones. In a first step of PCA, the number of
principal components is estimated by several criteria:
the percentage of explained variance, the eigenvalue-one
criterion, the scree test, and cross-validation.
Each component of a PCA model is characterized by
three complementary sets of attributes: variances, which
are error measures; loadings, which describe the data structure
in terms of variable correlations; and scores, which describe the
properties, differences or similarities of the samples.
When the principal component scores are plotted they
may reveal natural patterns and clustering in the samples.
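The iterative extraction of components described above can be sketched in a few lines of Python. This is a minimal illustration using the NIPALS scheme, assuming column-centred data and an invented data matrix; the function name `nipals_pca` and all numerical values are hypothetical, and it is not meant to reproduce the software used in this work.

```python
import math

def nipals_pca(X, n_components, tol=1e-12, max_iter=1000):
    """PCA by the iterative NIPALS scheme on column-centred data.

    Returns (means, scores, loadings, explained), where explained[k] is the
    fraction of the total variance carried by component k.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]  # residual matrix
    total_ss = sum(v * v for row in E for v in row)
    scores, loadings, explained = [], [], []
    for _ in range(n_components):
        # Start from the residual column with the largest sum of squares.
        j0 = max(range(p), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
            norm = math.sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]
            t_new = [sum(E[i][j] * load[j] for j in range(p)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # Deflate: remove what this component explains, keep the residual.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        scores.append(t)
        loadings.append(load)
        explained.append(sum(v * v for v in t) / total_ss)
    return means, scores, loadings, explained

# Hypothetical contents (%) of three triglycerides in six oils, for illustration only.
X = [[48.0, 9.0, 4.0], [47.0, 9.5, 4.2], [49.0, 8.6, 3.9],
     [41.0, 12.0, 6.1], [40.0, 12.5, 6.3], [42.0, 11.8, 5.8]]
means, scores, loadings, explained = nipals_pca(X, n_components=2)
print([round(100 * e, 1) for e in explained])  # percentage of variance per PC
```

Each pass extracts one component from the residual left by the previous ones, which is why the first PC carries the largest share of variance and the loading vectors come out mutually orthogonal.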
In order to find an operative classification rule for
discriminating the samples, supervised-learning pattern
recognition techniques, such as SIMCA, must be applied.
Soft independent modelling of class analogy (SIMCA) is
an extremely informative technique, widely used in
chemometrics (Wold et al., 1984, 1983), to which
improvements have been introduced (Forina, Drava, &
Leardi, 1997). It is based on the evaluation of the
principal components of each category, the setting up of
a critical distance with probabilistic meaning, and the
calculation of the distance of each object from the model
of each category. This implies that we accept, a priori,
that the data have a geometric and probabilistic
structure. Unknown samples are then compared to
the class models and assigned to classes according to
their analogy to the training samples. There are two
steps in classification: (1) modelling: build one separate PCA model for each class; (2) classifying new samples: fit
each sample to each model and decide whether the
sample belongs to the corresponding class.
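The two steps can be illustrated with a deliberately simplified Python sketch. It makes several assumptions that are not in the text: each class model has a single principal component, the membership decision uses a fixed cutoff of 2.0 on the ratio of the sample residual to the class residual instead of a critical value from an F-test, leverage is ignored, and the training data are invented. The names `fit_class_model`, `residual_ss` and `classify` are hypothetical.

```python
import math

def fit_class_model(X, n_iter=200):
    """Step 1 (modelling): one-component PCA model of a class: (mean, loading, s0)."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - mean[j] for j in range(p)] for row in X]
    # Power iteration for the first loading vector, started on the
    # variable with the largest spread.
    j0 = max(range(p), key=lambda j: sum(e[j] ** 2 for e in E))
    v = [1.0 if j == j0 else 0.0 for j in range(p)]
    for _ in range(n_iter):
        t = [sum(e[j] * v[j] for j in range(p)) for e in E]
        w = [sum(E[i][j] * t[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    rss = sum(residual_ss(row, mean, v) for row in X)
    s0 = math.sqrt(rss / ((n - 2) * (p - 1)))  # pooled residual SD for one component
    return mean, v, s0

def residual_ss(x, mean, v):
    """Sum of squared residuals of x after projection on the class model."""
    e = [x[j] - mean[j] for j in range(len(x))]
    t = sum(a * b for a, b in zip(e, v))
    return sum((e[j] - t * v[j]) ** 2 for j in range(len(x)))

def classify(x, models, cutoff=2.0):
    """Step 2 (classification): names of all classes whose model fits x."""
    fitted = []
    for name, (mean, v, s0) in models.items():
        s_i = math.sqrt(residual_ss(x, mean, v) / (len(x) - 1))
        if s_i / s0 < cutoff:  # assumed fixed cutoff, not an F-test
            fitted.append(name)
    return fitted

# Hypothetical training data (three variables per oil), for illustration only.
mc_train = [[48.0, 9.0, 4.0], [47.0, 9.5, 4.2], [49.0, 8.6, 3.9],
            [48.5, 8.9, 4.1], [47.5, 9.3, 4.0]]
nmc_train = [[41.0, 12.0, 6.1], [40.0, 12.5, 6.3], [42.0, 11.8, 5.8],
             [39.0, 13.1, 6.6], [43.0, 11.2, 5.6]]
models = {"MC": fit_class_model(mc_train), "NMC": fit_class_model(nmc_train)}
for sample in ([48.5, 8.84, 3.99], [41.5, 11.9, 5.95], [45.0, 15.0, 8.0]):
    print(sample, classify(sample, models))
```

Because every class is modelled independently, a sample may fit one class, several classes, or none, which is the range of outcomes a SIMCA classification can report.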
The main results obtained from a SIMCA analysis are of two kinds.
Variable results are the modelling power of a variable in
a model, [1 − (variable residual variance/variable total
variance)^(1/2)], and the discrimination power. Sample
results are Si (the square root of the residual variance of the
sample), which is a measure of the distance of a sample
to a modelled class and is compared with the overall
variation of the class (called S0), this comparison being the basis of the
statistical criterion to decide whether a new sample can
be classified as a member of the class or not; and Hi,
the leverage, which expresses how different the sample is
from the other class members. Using
graphical plots, Si vs. Hi or Si vs. Si (Coomans' plot),
the samples can easily be classified.
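For reference, these quantities can be written out explicitly. The expressions below follow the usual formulation of SIMCA and are given as an assumption, since the text does not state the exact degrees of freedom used by the software. Here n is the number of training samples in the class, p the number of variables, A the number of principal components of the class model, and e_ij the residual of variable j for sample i after fitting to the model.

```latex
\begin{align}
S_i^2 &= \frac{1}{p-A}\sum_{j=1}^{p} e_{ij}^2, &
S_0^2 &= \frac{1}{(n-A-1)(p-A)}\sum_{i=1}^{n}\sum_{j=1}^{p} e_{ij}^2, \\
\mathrm{MP}_j &= 1-\left(\frac{s_{j,\mathrm{res}}^2}{s_{j,\mathrm{tot}}^2}\right)^{1/2}, &
F_i &= \frac{S_i^2}{S_0^2}.
\end{align}
```

A modelling power MP_j close to 1 means that variable j is well described by the class model, and a new sample is accepted as a class member when F_i stays below the critical value of the F distribution at the chosen significance level.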
3.1. Results obtained from values of triglycerides
The mean values and the confidence intervals of the
different triglycerides of the samples are shown in Table
1, together with the results for the other analyzed
parameters. The results obtained for the PCA of these data
are the following.
The number of principal components was decided by cross-validation, and three
principal components are enough to explain 98.9%
of the data variance.
The interpretation of the results of a principal component analysis is usually carried out by visualization of
the component scores and loadings. In Fig. 1 the loading vectors for the first three components are plotted,
and in Fig. 2 the score vectors for the first three
components. The notation used for the triglycerides refers to the acids present in
their structure: O = oleic acid; P = palmitic acid;
S = stearic acid; L = linoleic acid; Ln = linolenic
acid.
Table 1
Mean values and standard deviations of the measured variables obtained for Manzanilla Cacereña (MC)
and Non-Manzanilla Cacereña (NMC) olive oils

Variable                    Mean (MC)  Mean (NMC)    SD (MC)   SD (NMC)
Acidity                            90          21         89         20
Index of peroxide                  76         358        288        400
K270                               16          14          2          7
ΔK                                  1           3          1          5
K232                              180         172         23         47
Polyphenols                     10754       27637       4906      10638
L*                               8663        8958        636        857
a*                              −1287       −1178        201        568
b*                               6080        4730       1746       2621
C*ab                             6219        4877       1741       2676
<H(a*)                          10250       10250        222       1320
Stability                        5844        8111       1501       4873
Palmitic acid                    1204        1179        107        159
Palmitoleic acid                   92          71         12         22
Margaric acid                      42          70         15         47
Stearic acid                      198         351         40         59
Oleic acid                       7903        7393        242        564
Linoleic acid                     449         837        190        493
Linolenic acid                     70          69          5          9
Arachidic acid                     36          48          4          8
Gadoleic acid                      27          21          3          3
Behenic acid                       13          14          1          2
Lignoceric acid                    61          60          8          1
LOL + OLnO                        252         370         80        213
PLL                                56          52          6         10
LOO                               890        1213        199        498
PLO                               388         595        108        296
OOO                              4843        4135        367        857
SLO, POO                         2574        2352        212        308
POP                               328         300         34         95
PPP                                77          50         20         12
SOO                               393         626         50        108
SLS, POS                           92         134         21         40
Cholesterol                        14          20          9         23
Brassicasterol                     12          22          7         26
Campesterol                       267         334         17         90
Stigmasterol                      147          70         79         54
Δ-7-Campesterol                    60          55         32         45
Clerosterol                        60          79         15          9
β-Sitosterol                     7997        8246        224        427
Δ-5-Avenasterol                  1288         980        213        457
Δ-5,24-Stigmastadienol             73          79         22         28
Δ-7-Stigmasterol                   40          57         22         49
Δ-7-Avenasterol                    48          47         13         17
Total β-sitosterol               9373        9354        118        192
A loading plot for the PC1–PC2 and PC1–PC3 planes
(Fig. 1) reveals that the variable
OOO is inversely correlated with PLO and LOO,
and that the three variables contribute their variance to PC1. The
variable that contributes most to the first component is
PLO, since the other two also contribute to PC2 and
PC3. It can also be observed in the figure that SLO,
POO and SOO are inversely correlated.
Fig. 1. Loadings plots obtained from the PCA of the triglyceride composition data, in the PC1–PC2 and PC1–PC3 planes.
In the same PC1–PC2 and PC1–PC3 planes, a score
plot (Fig. 2) reveals that “Manzanilla Cacereña” samples have positive scores for the first component;
therefore they have values above the mean
for those variables whose loadings on this
PC are large and positive, that is to say
values above the mean for the OOO variable, while they
have values below the mean for PLO and LOO.
In the PC1–PC3 plane it can also be observed that
samples of “Manzanilla Cacereña” oils have negative
scores for the third component and so values below
the mean for SOO. Therefore, the first
and third components seem to be related to the category
variable “variety”.
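The link between the sign of a score and the sign of the loadings can be made concrete with a short Python example. The score of a sample on a PC is the sum, over variables, of the deviation from the mean multiplied by the loading. All numbers below are invented for illustration: only the signs of the loadings (positive for OOO, negative for LOO and PLO) mimic the pattern described in the text, and the function name `pc_score` is hypothetical.

```python
def pc_score(sample, means, loadings):
    """Score of one sample on one PC: sum of (value - mean) * loading."""
    return sum((sample[k] - means[k]) * loadings[k] for k in loadings)

# Hypothetical PC1 loadings: OOO positive, LOO and PLO negative.
pc1_loadings = {"OOO": 0.80, "LOO": -0.45, "PLO": -0.40}
# Hypothetical overall means and two samples (% of total triglycerides).
means = {"OOO": 45.0, "LOO": 10.5, "PLO": 5.0}
high_ooo = {"OOO": 48.5, "LOO": 9.0, "PLO": 4.0}
low_ooo = {"OOO": 41.0, "LOO": 12.0, "PLO": 6.0}

score_high = pc_score(high_ooo, means, pc1_loadings)
score_low = pc_score(low_ooo, means, pc1_loadings)
print(score_high > 0, score_low < 0)  # prints: True True
```

A sample that is above the mean in the positively loaded variable and below the mean in the negatively loaded ones accumulates only positive terms, so its score on that component is positive.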
If we now draw the same plots but take into
account the state of maturity of the olive oil samples
(oil from green olives, oil from semi-black
olives and oil from black olives), a differentiated
distribution of the samples is observed only
along the second component, and so this component
seems to be related to the category variable “state of
maturity”.
Fig. 2. Scores plots obtained from the PCA of the triglyceride composition data, in the PC1–PC2 and
PC1–PC3 planes, for “Manzanilla Cacereña” and “Non-Manzanilla Cacereña” olive oils.
Therefore, PCA shows that predictor variables exist, so that
classification methods such as SIMCA can be applied.
SIMCA was used to determine which variables best
model and discriminate between the classes or
categories established according to the variety of olives.
Two categories were predefined: class 1, including 33
samples of “Manzanilla Cacereña” olive oils (MC), and
class 2, with 24 samples of “Non-Manzanilla Cacereña”
(NMC) olive oils; in addition, a group of randomly selected samples
containing samples of both groups was set aside to be used
in the classification step. In the first step, PCA is used
to model each class, and two reduced models with three
significant principal components, obtained by cross-validation, were employed, with 99.0% of explained
variance for the MC model and 99.3% for the NMC model.
The variables that influence each PC are similar in both cases to those already
mentioned for the PCA model constructed for the
whole group of samples.
Before using the models to predict the membership of a group of samples in
one of the classes, we evaluated the specificity of these models. In our case the
distance between the models is 7.24, which indicates that
the models are sufficiently distant from each other.
We have found that OOO, LOO, PLO, and
SLO + POO are the variables with the highest
modelling power in both classes of olive oils, according
to the variety. As for the discrimination power,
PPP, SOO and OOO are the most important
variables for the differentiation of the classes of olive
oils.
Once each class has been modelled, and since the classes are
sufficiently separated, new samples can be assigned to each
class. To do so, new values of all the variables are calculated
for each new sample, using the scores and the
loadings of each class, and these are compared with the
measured values. The residuals are combined in Si.
When plotting Si/S0 against Hi, we
found that, at a significance level of 5%, among the 11
samples that belong to the MC group only two are
erroneously assigned to the NMC group. However, of
the 12 samples of oils from other types of olives, only 4
are correctly assigned to this group, the rest remaining
unassigned, which can be explained by the fact
that this group is much less homogeneous, since it is
made up of oils that are all mono-varietal but from different
types of olives.
The results of the classification can also be easily seen
on a Coomans' plot (Fig. 3), which shows
simultaneously the distances of the new samples to the
two classes. As shown in the figure, one “Manzanilla
Cacereña” olive oil sample is classified in the NMC
class, another in both classes, and another is not classified in any class.
The rest (about 73%) are correctly assigned. Most of
the “Non-Manzanilla Cacereña” oils are classified
as not belonging to either class.
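The figure of about 73% quoted in this section for the “Manzanilla Cacereña” test samples on the Coomans' plot follows directly from the counts given in the text. A minimal check in Python, where the dictionary keys are illustrative labels rather than terms from the study:

```python
# Outcome of the 11 "Manzanilla Cacereña" test samples on the Coomans' plot,
# as counted in the text; the dictionary keys are illustrative labels.
mc_total = 11
mc_outcomes = {"assigned_to_NMC": 1, "assigned_to_both": 1, "unassigned": 1}
mc_correct = mc_total - sum(mc_outcomes.values())
mc_rate = 100.0 * mc_correct / mc_total
print(f"{mc_correct} of {mc_total} correctly assigned ({mc_rate:.1f}%)")
# prints: 8 of 11 correctly assigned (72.7%)
```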
Fig. 3. Coomans' plot corresponding to the classification of new samples in the models obtained from the
triglyceride data, for “Manzanilla Cacereña” and “Non-Manzanilla Cacereña” olive oils.
3.2. Results obtained from values of sterols
The mean values and the confidence intervals of the
different sterols for the samples are shown in Table 1.
The results obtained from the PCA for sterols
are the following: three principal components are enough to explain 98.9% of the data variance. A loading
plot in the PC1–PC2 and PC1–PC3 planes (Fig. 4) reveals that the β-sitosterol and Δ-5-avenasterol variables
contribute their variance to PC1. Total β-sitosterol contributes its
variance to PC2, stigmasterol to PC3,
and campesterol to PC2 and PC3.
Furthermore, from the score plots in the same planes
(Fig. 5), it seems that the third component is related
to the variety variable. If we now draw the same
plots but take into account the state of maturity of the
olive oil samples, no differentiated
distribution of the samples is observed along
any of the three components, and
therefore these components do not seem to be related
to the state of maturity variable.
When SIMCA was applied, the
modelling power analysis showed that β-sitosterol, Δ-5-avenasterol and total β-sitosterol are the most important variables to characterize the olive oils according to
the variety. Regarding the discrimination power,
campesterol is the most important variable to characterize the “Non-Manzanilla Cacereña” olive oils and
stigmasterol is the most important variable to characterize the “Manzanilla Cacereña” olive oils.
As shown in the Coomans' plot of Fig. 6, four
“Manzanilla Cacereña” olive oil samples are classified in
their class, while the rest can belong to both classes. However,
most of the “Non-Manzanilla Cacereña” oils are classified in their group.
Fig. 4. Loadings plots obtained from the PCA of the sterol composition data, in the PC1–PC2 and PC1–PC3 planes.
Fig. 5. Scores plots obtained from the PCA of the sterol composition data, in the PC1–PC2 and PC1–PC3
planes, for “Manzanilla Cacereña” and “Non-Manzanilla Cacereña” olive oils.
Fig. 6. Coomans' plot corresponding to the classification of new samples in the models obtained from the
sterol data, for “Manzanilla Cacereña” and “Non-Manzanilla Cacereña” olive oils.
Since the results obtained in the classification
according to the sterol content are different from, but
complementary to, those obtained in the classification
according to the triglyceride content, a new classification analysis was carried out using both types of variables
jointly.
3.3. Results obtained from values of triglycerides and
sterols
In this case, five principal components are enough to explain 97.3% of the data variance. PC1, PC3, and PC5 are related to
triglyceride variables: OOO and LOO (first component); SLO, POO (third component); and SOO (fifth
component). PC2 and PC4 are related to
sterol variables: β-sitosterol and Δ-5-avenasterol
(second component); total β-sitosterol (fourth component).
Plotting the score vectors for the
different principal components obtained, and distinguishing the oil samples according to the
variety of the olive or according to the degree of maturity,
a differentiated distribution of the samples
according to variety is observed
along the first component, and therefore this component seems to be related to the variety.
This was to be expected, since the triglyceride contents contribute to this
PC. The state of maturity
variable is related only to the third component.
For the SIMCA classification, two categories or
classes were predefined by means of independent
mathematical models: one for “Manzanilla Cacereña”
olive oils (MC) and another for the other, “Non-Manzanilla Cacereña”, olive oil samples (NMC).
PCA yields a large number of
principal components for both classes: six PCs were employed, with
95.8% of explained variance for the MC model and 98.1%
for the NMC model. The
variables with the most decisive influence on the first PC of
the MC class are similar to those mentioned in the
previous section, although stigmasterol
plays a larger role in PC5. In the NMC class there are larger
differences: OOO, LOO and Δ-5-avenasterol are the
variables that influence PC1; SLO + POO and
OOO, PC2; LOO, β-sitosterol and Δ-5-avenasterol,
PC3; total β-sitosterol and campesterol, PC4;
and SOO, PC5. In this case sterol variables participate in all the PCs and are more important variables for
defining the model. The distance between the two models is
6.76, and SOO and OOO are the variables with the
greatest discrimination power between the models.
The results of the SIMCA classification are very similar
to those obtained with the sterol data alone and,
regarding the classification of new samples of
“Manzanilla Cacereña” as belonging to this class, they
are worse than those obtained with the triglyceride data. We
can conclude that the results of the triglyceride analysis
are the most conclusive for classifying these oils, although
the sterols could contribute to defining other, different
classes.
3.4. Results obtained from values of triglycerides, sterols
and a group of selected variables
Lastly, the samples were analysed
using a group of selected variables that seem to contain
the largest proportion of information. Different groups
of variables were measured: parameters of general
character (acidity, peroxide index, colour), fatty
acids, triglycerides and sterols. In each case, the variables
that contributed most to the first principal component
were selected. All the triglyceride variables
were retained because they provided the
best results in the classification of the “Manzanilla
Cacereña” samples. Among the fatty acids, oleic and
linoleic acids were removed since they are
strongly correlated with the triglycerides OOO and
PLO, respectively. Among the variables that we have
called general, L*, a* and b* were selected;
among the fatty acids, palmitic and stearic acids; and among
the sterols, campesterol, stigmasterol, β-sitosterol, Δ-5-avenasterol and total β-sitosterol. We thus
obtained a group of 22 variables with which to build two new
models, one for “Manzanilla Cacereña” oils and another
for “Non-Manzanilla Cacereña” oils. The classification
was carried out with the same group of randomly selected samples used in the previous studies.
The results show a slight improvement over the
classification analysis carried out with triglycerides as regards the classification of “Non-Manzanilla Cacereña”
oils but, on the other hand, fewer
“Manzanilla Cacereña” oil samples are classified as
such. In short, the inclusion of other variables does
not seem to improve the results obtained with the triglyceride content regarding the classification of oils as
coming from “Manzanilla Cacereña” olives.
4. Conclusions
In the present work, PCA and SIMCA were
used to characterize and classify 80 different olive oils
according to their origin. PCA was used mainly to
reduce dimensionality and to allow a preliminary
evaluation of category similarity. Cross-validation was
used to decide how many principal components should
be retained in order to summarize the original data
effectively.
Triglyceride composition data combined with
SIMCA gave the best results for the classification of
“Manzanilla Cacereña” olive oil samples, although in
this case most of the samples from other varieties were
classified as not belonging to either class. However, with the results
of the sterol analysis, most of the “Non-Manzanilla
Cacereña” olive oil samples were classified as such. With
the remaining groups of variables examined (acidity,
peroxide index, colour, fatty acids), and even when all
of the measured parameters were used, fewer
samples were classified correctly.
With the triglyceride data, a
larger number of “Manzanilla Cacereña”
olive oil samples are classified as belonging only to this
group, and we can conclude that these results are the
most conclusive for classifying “Manzanilla Cacereña”
olive oils, although the sterols could contribute to defining
other, different classes. Comparison of these results
with those obtained in similar studies of
olive oils from other countries, or even other Spanish regions, is
difficult, because the size of the data sets used, the sources of
differences (category variables) and their distribution over time are different. We can, however, highlight that the analytical procedure for triglycerides is
easy in comparison with that for fatty acids or sterols,
which are among the parameters most frequently used in
classification and which are usually analysed by GC,
requiring prior sample preparation
and derivatization of the analytes.
Acknowledgements
The authors are grateful to the Ministerio de Ciencia y
Tecnología (1FD1997–0517–C03–02) and the Junta de
Extremadura (Project 2PR03A073) for financial
support.
References
Aparicio, R., Albi, T., Lanzon, A., & Navas, M. A. (1987). SEXIA, un
sistema experto para la identificación de aceites: base de datos de
zonas olivareras. Grasas y Aceites, 38(1), 9–14.
Aparicio, R., & Aparicio-Ruiz, R. (2000). Authentication of vegetable
oils by chromatographic techniques. Journal of Chromatography A,
881, 93–104.
Aparicio, R., Ferreiro, L., Cert, A., & Lanzon, A. (1990). Caracterización
de aceites de oliva vírgenes andaluces. Grasas y Aceites,
41(1), 23–39.
Bertran, E., Blanco, M., Coello, J., Iturriaga, H., Maspoch, S., &
Montoliu, I. (2000). Near-infra-red spectrometry and pattern
recognition as screening methods for the authentication of virgin
olive oils of very close geographical origins. Journal of Near
Infrared Spectroscopy, 8(1), 45–52.
Bertsch, M., Mayfield, H. T., & Thomason, M. M. (1981). Proceedings
of the fourth international symposium on capillary chromatography,
Hindelang, FRG (p. 313). Heidelberg, Germany: H€
uthing.
Bianchi, G., Giansante, L., Shaw, A., & Kell, D. B. (2001).
Chemometric criteria for characterisation of Italian protected
denomination of origin (DOP) olive oils from their metabolic
profiles. European Journal of Lipid Science and Technology, 103(3),
141–150.
Capone, S., Siciliano, P., Quaranta, F., Rella, R., Epifani, M., &
Vasanelli, L. (2000). Analysis of vapours and foods by means of an
electronic nose based on a sol–gel metal oxide sensors array.
Sensors and Actuators B, B69(3), 230–235.
Damiani, P., Cossignani, L., Simonetti, M. S., Campisi, B., Favretto,
L., & Favretto, L. G. (1997). Stereospecific analysis of the
triacylglycerol fraction and linear discriminant analysis in a
climatic differentiation of Umbrian extra-virgin olive oils. Journal
of Chromatography A, 758, 109–116.
Favretto, L. G., Campisi, B., Favretto, L., Simonetti, M. S.,
Cossignani, L., & Damiani, P. (1999). Cross-validation in linear
discriminant analysis of triacylglycerol structural data from Istrian
olive oils. Journal of AOAC International, 82(6), 1489–1494.
Ferreiro, L., & Aparicio, R. (1992). Influencia de la altitud en la
composición química de los aceites de oliva vírgenes de Andalucía.
Ecuaciones matemáticas de clasificación. Grasas y Aceites, 43(3),
149–156.
Fiorino, P., & Nizzi, F. (1991). The spread of olive farming. Olivae, 44,
9.
Forina, M., Armanino, C., Lanteri, S., Calcagno, C., & Tiscornia, E.
(1983). Valutazione delle caratteristiche chimiche dell'olio di oliva
in funzione dell'annata di produzione mediante metodi di classificazione multivariati. Rivista Italiana delle Sostanze Grasse, LX, 607–613.
Forina, M., Drava, G., & Leardi, R. (1997). Chemometrics in
transparencies. University of Genova, Genova, Italy.
Forina, M., & Lanteri, S. (1984). In B. R. Kowalski (Ed.), Chemometrics, mathematics and statistics in chemistry (p. 305). Dordrecht:
Reidel.
Forina, M., Lanteri, S., & Armanino, C. (1987). In M. J. S. Dewar, J.
D. Dunitz, K. Hafner, E. Heilbronner, S. Ito, J.-M. Lehn, K.
Niedenzu, K. N. Raymond, C. W. Rees, F. Vögtle, & G. Wittig
(Eds.), Topics in current chemistry (pp. 91–144). Berlin: Springer-Verlag.
Gigliotti, C., Daghetta, A., & Sidoli, A. (1993). Indagine conoscitiva
sul contenuto trigliceridico di oli extra vergini di oliva di varia
provenienza. Rivista Italiana delle Sostanze Grasse, LXX, 483–489.
Guadarrama, A., Rodríguez-Méndez, M. L., Sanz, C., Ríos, J. L., &
de Saja, J. A. (2001). Electronic nose based on conducting polymers
for the quality control of the olive oil aroma. Discrimination of
quality, variety of olive and geographic origin. Analytica Chimica
Acta, 432, 283–292.
Kellner, R., Mermet, J.-M., Otto, M., & Widmer, H. M. (Eds.). (1998).
Analytical chemistry. Weinheim: Wiley-VCH.
Leardi, R., & Paganuzzi, V. (1987). Caratterizzazione dell’origine di oli
di oliva extravergini mediante metodi chemiometrici applicati alla
frazione sterolica. Rivista Italiana delle Sostanze Grasse, LXIV,
131–136.
Mannina, L., Patumi, M., Proietti, N., Bassi, D., & Segre, A. L. (2001).
Geographical characterization of Italian extra virgin olive oils
using high-field ¹H NMR spectroscopy. Journal of Agricultural and
Food Chemistry, 49(6), 2687–2696.
Paganuzzi, V. (1985). Influenza dell’origine e dello estato di conservazione sulla composizione sterolica degli oli d’oliva non etrattati.
III. Oli di presione di origine Grecia. Rivista Italiana delle Sostanze
Grasse, 62, 399.
Pardo, M., Sberveglieri, G., Gardini, S., & Dalcanale, E. (2000). A
hierarchical classification scheme for an electronic nose. Sensors
and Actuators B, 69(3), 359–365.
Sacchi, R., Mannina, L., Fiordiponti, P., Barone, P., Paolillo, L.,
Patumi, M., & Segre, A. (1998). Characterization of Italian extra
virgin olive oils using hydrogen-1 NMR spectroscopy. Journal of
Agricultural and Food Chemistry, 46(10), 3947–3951.
Spangenberg, J. E., Macko, S. A., & Hunziker, J. (1998). Characterization of olive oil by carbon isotope analysis of individual fatty
acids: Implications for authentication. Journal of Agricultural and
Food Chemistry, 46(10), 4179–4184.
Stefanoudaki, E., Kotsifaki, F., & Koutsaftakis, A. (1999). Classification of virgin olive oils of the two major Cretan cultivars based
on their fatty acid composition. Journal of the American Oil
Chemists' Society, 76(5), 623–626.
Tsimidou, M., & Karakostas, K. X. (1993). Geographical classification
of Greek virgin olive oil by non-parametric multivariate evaluation
of fatty acid composition. Journal of the Science of Food and
Agriculture, 62, 253–257.
Tsimidou, M., Macrae, M., & Wilson, I. (1987). Authentication of
virgin olive oils using principal component analysis of triglyceride
and fatty acid profiles. Part 1. Classification of Greek olive oils.
Food Chemistry, 25, 227–239.
Wold, S., Albano, C., Dunn, W. J., Edlund, U., Esbensen, K., Geladi,
P., Hellberg, S., Johansson, E., Lindberg, W., & Sjöström, M.
(1984). In B. R. Kowalski (Ed.), Chemometrics, mathematics and
statistics in chemistry (pp. 17–96). Dordrecht: Reidel.
Wold, S., Albano, C., Dunn, W. J., Esbensen, K., Hellberg, S.,
Johansson, E., & Sjöström, M. (1983). In H. Martens, & H.
Russwurm (Eds.), Food research and data analysis (pp. 147–188).
Barking: Applied Science.