Food Control 16 (2005) 339–347
www.elsevier.com/locate/foodcont

Characterization of virgin olive oils according to its triglycerides and sterols composition by chemometric methods

T. Galeano Diaz a,*, I. Durán Merás a, J. Sánchez Casas b, M.F. Alexandre Franco a

a Department of Analytical Chemistry, Faculty of Sciences, University of Extremadura, E-06071 Badajoz, Spain
b Institute of Agricultural Technology of Junta de Extremadura, Ctra. San Vicente, s/n, Finca Santa Engracia, E-06071 Badajoz, Spain

Received 19 October 2003; received in revised form 16 March 2004; accepted 19 March 2004

Abstract

Principal component analysis (PCA) and soft independent modelling of class analogy (SIMCA) were applied to the contents of the various triglycerides, the various sterols, or both, to explore their capacity for typifying a variety of olive oil belonging to a Spanish denomination of origin. This study has demonstrated that it is possible to characterize the oils obtained from a specific type of olive (“Manzanilla Cacereña”, from the North of Caceres (Extremadura, Spain)) according to their chemical composition. The best results were obtained with the triglyceride contents. The plots of the PCs showed that PC1 is related to the category variable “variety” and PC2 to “maturity”. SIMCA was employed to assign unknown samples to one of two groups or classes, depending on the “variety” of olives, for which independent PCA models were built. Coomans' plot showed that the different olive oils are clustered in different groups and that each group could be distinguished clearly.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Olive oils; Classification; PCA; SIMCA; Triglycerides; Sterols

1. Introduction

All food products, such as the olive oils in this case, are complex chemical objects which we perceive and evaluate in a global way.
Occasionally it is important to have a reliable identification and classification of olive oils according to the olive variety and the geographic origin, and to do this they usually must be evaluated from a multivariate point of view, which is what food chemometrics provides (Forina, Lanteri, & Armanino, 1987). Several studies have been carried out to correlate the chemical composition of olive oil with geographic origin (Aparicio, Albi, Lanzon, & Navas, 1987; Ferreiro & Aparicio, 1992; Fiorino & Nizzi, 1991; Gigliotti, Daghetta, & Sidoli, 1993; Leardi & Paganuzzi, 1987; Tsimidou & Karakostas, 1993), and chemometric methods have been applied to several chemical components for the classification of olive oils. Frequently the contents of the different chemical components are determined by chromatographic methods, and several of the chromatographic procedures employed for the characterization of vegetable oils, as well as for their authentication, have recently been reviewed (Aparicio & Aparicio-Ruiz, 2000). Besides the methods cited in this review, we can mention other methods based on the contents of fatty acids (Spangenberg, Macko, & Hunziker, 1998; Stefanoudaki, Kotsifaki, & Koutsaftakis, 1999); sterols (Paganuzzi, 1985); fatty acids and sterols (Forina, Armanino, Lanteri, Calcagno, & Tiscornia, 1983); fatty acids, fatty alcohols and triterpenes (Bianchi, Giansante, Shaw, & Kell, 2001); sterols, triterpenic alcohols and hydrocarbons (Aparicio, Ferreiro, Cert, & Lanzon, 1990); triglycerides (Damiani et al., 1997; Favretto et al., 1999); and fatty acids and triglycerides (Tsimidou, Macrae, & Wilson, 1987).

* Corresponding author. Tel.: +34-2428-9300; fax: +34-2428-9375. E-mail address: [email protected] (T.G. Diaz).
0956-7135/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.foodcont.2004.03.014

More recently, techniques other than chromatographic ones have also been used.
In this way, classification of olive oils of different origin has been achieved using near-IR spectral data in combination with an artificial neural network or with logistic regression (Bertran et al., 2000); high-field ¹H NMR spectroscopic data corresponding to minor constituents, subjected to principal component analysis or to cluster analysis (Mannina, Patumi, Proietti, Bassi, & Segre, 2001; Sacchi et al., 1998); or data corresponding to aroma compounds obtained by means of sensors constituting an electronic nose (Capone et al., 2000; Guadarrama, Rodríguez-Méndez, Sanz, Ríos, & de Saja, 2001; Pardo, Sberveglieri, Gardini, & Dalcanale, 2000).

Chemometric techniques have been defined as the use of mathematical and statistical methods for handling, interpreting and predicting chemical data (Bertsch, Mayfield, & Thomason, 1981). Multivariate techniques that associate an n-dimensional bounded region with each category have an advantage over those that only construct a separator between the categories, as they allow the extraction of hidden information and they characterize a redundant information source which often completely masks the relevant information contained in the chemical composition (Forina & Lanteri, 1984). Thus, one of the main goals is to use multivariate methods to enable the visualization of data when more than three variables have been measured. The mathematical basis is the representation and manipulation of data by vectors and matrices (Kellner, Mermet, Otto, & Widmer, 1998). These methods aim at projecting the original data set from a high-dimensional space onto a line, a plane, or a 3D coordinate system. Principal component analysis (PCA) finds an alternative set of axes about which a data set may be represented, and it is designed to provide the best possible view of the variability in the independent variables of a multivariate data set.
We can try to relate this variability to fundamental variables such as origin or variety. Samples whose coordinates on these new axes are similar probably belong to classes or groups with similar values of these fundamental variables. Therefore, when a preliminary exploratory analysis reveals clustering of the data related to the values of some variables, new PCA models can be constructed for these groups, and the capacity of the measured data to assign new samples to these classes can be examined. One of the supervised pattern recognition methods based on the description of individual categories by means of independent PCA models is soft independent modelling of class analogy (SIMCA) (Wold et al., 1984; Wold et al., 1983).

The object of this study is to explore the possibilities of different chemical parameters, normally determined in olive oil analysis, to differentiate and classify olive oil samples of different origin, in order to confirm the authenticity of the “Manzanilla Cacereña” olive oils. PCA and SIMCA have been employed with the data corresponding to the contents of the various triglycerides, the various sterols, or both. Other variables usually measured (acidity, peroxide index, colour, fatty acids, stability, etc.) have also been examined. The oils obtained from “Manzanilla Cacereña” olives of the North of Caceres (Extremadura, Spain) are of very good quality and are protected by a “Denominación de Origen” (Origin Denomination).

2. Experimental

2.1. Samples

A total of 80 samples of extra virgin olive oil, including “Manzanilla Cacereña” oils from almazaras (oil mills) of the North of Caceres in the region of Extremadura and other mono-varietal oils, 44 of them “Cacereñas” and 36 “non-Cacereñas”, were obtained in the harvesting periods 1999–2000 and 2000–2001.
Sampling was carried out from the beginning of November to the end of January at three sampling dates (different stages of maturity).

2.2. Reagents

Anhydrous sodium sulphate; potassium hydroxide; acetone, acetonitrile, chloroform, ethanol, ether, and hexane, from Panreac. Pyridine, hexamethyldisilazane, chloromethylsilane, phenolphthalein, and 60 20 × 20 cm silica gel plates, from Merck. β-Sitosterol from Sigma, uvaol and stigmasterol from ICN Biomedicals, and 0.45 µm nylon filters from Agilent. All other reagents were of analytical grade and were purchased from Merck or Aldrich.

2.3. Analysis of triglycerides, sterols, and other variables

The analysis of triglycerides was performed according to the official chromatographic method of the EC no. 2472/97 (Diario Oficial de las Comunidades Europeas L 341, 12.12.1997, p. 25). The apparatus was a Hewlett Packard model 1100 HPLC instrument consisting of a degasser, quaternary pump, manual six-way injection valve, refractometer detector, and the Chemstation software package for instrument control, data acquisition, and data analysis. A Lichrosorb FP 18 (4.6 × 0.25 mm) analytical column was used.

The analysis of sterols was performed according to the official method of the EC no. 2568/91 (Diario Oficial de las Comunidades Europeas L 248, 5.9.1991, p. 1). The apparatus was a Hewlett Packard model 6890 gas chromatograph, equipped with a flame ionization detector (FID), an HP-5 (Crosslinked 5% PH ME Siloxane) capillary column (30 m × 0.25 mm × 0.25 µm), and a 6890 Agilent automatic injector.

The determination of acidity, peroxide index, colour parameters (L*, a*, b*, C*ab, ∠H(a*)), polyphenols, and stability in the olive oils was performed according to the official methods of the EC.

2.4. Chemometric analysis

The supervised pattern recognition method soft independent modelling of class analogy (SIMCA), based on the description of individual categories by means of independent principal component analysis (PCA) models, was used to classify the olive oil samples as belonging to one of the two classes “Manzanilla Cacereña” or “Non-Manzanilla Cacereña”. The UNSCRAMBLER software package (version 6.0, CAMO, Trondheim, Norway) was used for the application of both PCA and SIMCA as well as for the preliminary exploratory analysis of the data. Different groups of variables were measured (parameters of general character (acidity, peroxide index, colour), fatty acids, triglycerides, and sterols) and were used separately or in combination.

3. Results and discussion

The measurement of several variables in a number of samples gives rise to large data tables that usually contain a large amount of information, too complex to be easily interpreted, so that part of this information can remain hidden. PCA is a commonly used unsupervised multivariate technique that helps us to find in what respect one sample differs from another. The principle of PCA is to find the linear combinations of the initial variables that contribute most to making the samples different from each other. These combinations are called principal components (PCs). They are computed iteratively, in such a way that the first PC is the one that carries the most information (or, in statistical terms, the most explained variance). The second PC then carries the maximum share of the residual information. PCA therefore finds an alternative set of coordinate axes, the PCs, about which the data set may be represented. The PCs are orthogonal to each other and they are ranked so that each one carries more information than any of the following ones.
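The iterative computation of the PCs described above can be illustrated with a minimal sketch of the NIPALS procedure, written here with the Python standard library only. This is an illustration under stated assumptions, not the routine used by the software package named in the text: the function name `nipals_pca`, its parameters and the four-sample toy data set are all hypothetical.

```python
from math import sqrt

def nipals_pca(X, n_components, tol=1e-12, max_iter=1000):
    """Return (scores, loadings, explained) for X after mean-centring.

    Components are extracted one at a time: each one is found iteratively
    and then subtracted (deflated) from the residual matrix E.
    """
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    E = [[row[j] - means[j] for j in range(p)] for row in X]
    total_ss = sum(v * v for row in E for v in row)
    scores, loadings, explained = [], [], []
    for _ in range(n_components):
        # Start from the residual column with the largest sum of squares.
        j0 = max(range(p), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
            norm = sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]          # unit-length loading vector
            t_new = [sum(E[i][j] * load[j] for j in range(p)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # Deflate: remove the variation explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        scores.append(t)
        loadings.append(load)
        explained.append(sum(v * v for v in t) / total_ss)
    return scores, loadings, explained

# Hypothetical toy data: most of the spread lies along the (1, 1) direction.
X = [[-2.0, -2.0], [2.0, 2.0], [-1.0, 1.0], [1.0, -1.0]]
scores, loadings, explained = nipals_pca(X, 2)
print([round(e, 3) for e in explained])
```

In this toy case the first component carries four fifths of the total variance and the second the remainder, and the two loading vectors are orthogonal, which mirrors the ranking and orthogonality properties stated in the text.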
In a first step of PCA the number of principal components is estimated by several criteria: the percentage of explained variance, the eigenvalue-one criterion, the scree test, and cross-validation. Each component of a PCA model is characterized by three complementary sets of attributes: variances, which are error measures; loadings, which describe the data structure in terms of variable correlations; and scores, which describe the properties, differences or similarities of the samples. When the principal component scores are plotted they may reveal natural patterns and clustering in the samples.

In order to find an operative classification rule for discriminating the samples, supervised-learning pattern recognition techniques, such as SIMCA, must be applied. Soft independent modelling of class analogy (SIMCA) is an extremely informative technique, widely used in chemometrics (Wold et al., 1984, 1983), to which improvements have been introduced (Forina, Drava, & Leardi, 1997). It is based on the evaluation of the principal components of each category, the setting up of a critical distance with probabilistic meaning, and the calculation of the distance of each object from the model of each category. This implies that we accept, a priori, that the data have a geometric and probabilistic structure. Unknown samples are then compared with the class models and assigned to classes according to their analogy to the training samples. There are two steps in classification: (1) modelling, in which one separate PCA model is built for each class; and (2) classifying new samples, in which each sample is fitted to each model and it is decided whether the sample belongs to the corresponding class.
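The two steps just listed can be sketched in a few lines of standard-library Python. This is a simplified illustration, not the implementation of any particular software package. It assumes the classical residual-distance form of SIMCA, in which the sample distance is the root of the residual sum of squares divided by (p − A) and the class residual uses (n − A − 1)(p − A) degrees of freedom, and it takes the critical F value as a user-supplied parameter. All function names, the class labels and the toy data are hypothetical.

```python
from math import sqrt

def _pca_loadings(E, n_components, n_iter=200):
    """Leading unit loadings of the centred matrix E (NIPALS-style iteration)."""
    n, p = len(E), len(E[0])
    E = [row[:] for row in E]
    loadings = []
    for _ in range(n_components):
        j0 = max(range(p), key=lambda j: sum(r[j] ** 2 for r in E))
        t = [r[j0] for r in E]
        for _ in range(n_iter):
            load = [sum(E[i][j] * t[i] for i in range(n)) for j in range(p)]
            norm = sqrt(sum(v * v for v in load))
            load = [v / norm for v in load]
            t = [sum(r[j] * load[j] for j in range(p)) for r in E]
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        loadings.append(load)
    return loadings

def _residual_ss(x, model):
    """Sum of squared residuals of x after projection onto the class model."""
    d = [xj - mj for xj, mj in zip(x, model["mean"])]
    for load in model["loadings"]:
        score = sum(dj * lj for dj, lj in zip(d, load))
        d = [dj - score * lj for dj, lj in zip(d, load)]
    return sum(v * v for v in d)

def fit_class_model(X, n_components):
    """Step 1 (modelling): one separate PCA model per class."""
    n, p = len(X), len(X[0])
    mean = [sum(r[j] for r in X) / n for j in range(p)]
    E = [[r[j] - mean[j] for j in range(p)] for r in X]
    model = {"mean": mean, "loadings": _pca_loadings(E, n_components),
             "p_minus_a": p - n_components}
    dof = (n - n_components - 1) * (p - n_components)
    model["s0"] = sqrt(sum(_residual_ss(x, model) for x in X) / dof)
    return model

def sample_distance(x, model):
    """Residual distance of one sample to a class model."""
    return sqrt(_residual_ss(x, model) / model["p_minus_a"])

def classify(x, models, f_crit):
    """Step 2 (classifying): names of all classes whose model accepts x."""
    return [name for name, m in models.items()
            if (sample_distance(x, m) / m["s0"]) ** 2 <= f_crit]

# Hypothetical toy classes in three variables, one component per class.
class_a = [[0, 0.1, 0], [1, -0.1, 0.1], [2, 0.1, -0.1],
           [3, -0.1, 0], [4, 0, 0.1], [5, 0.1, -0.1]]
class_b = [[0.1, 0, 5], [-0.1, 1, 5.1], [0.1, 2, 4.9],
           [0, 3, 5], [-0.1, 4, 5.1], [0.1, 5, 4.9]]
models = {"A": fit_class_model(class_a, 1), "B": fit_class_model(class_b, 1)}
F_CRIT = 4.46  # approximate 5% critical value of F with 2 and 8 degrees of freedom
print(classify([2.5, 0.05, 0.0], models, F_CRIT))
print(classify([10.0, 10.0, 10.0], models, F_CRIT))
```

Because each class is tested independently, a sample can be accepted by one class, by several classes, or by none, which is what makes the method "soft".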
The relevant results obtained from a SIMCA analysis are of two kinds. Variable results are the modelling power of one variable in one model, [1 − (variable residual variance/variable total variance)^(1/2)], and the discrimination power. Sample results are Si (the square root of the residual variance of the sample), which is a measure of the distance of a sample to a modelled class and is compared with the overall variation of the class (called S0), this being the basis of the statistical criterion for deciding whether a new sample can be classified as a member of the class or not; and Hi, the leverage, which expresses how different the sample can be considered from the other class members. Using graphical plots, Si vs. Hi or Si vs. Si (Coomans' plot), the samples can easily be classified.

3.1. Results obtained from values of triglycerides

The mean values and the confidence intervals of the different triglycerides of the samples are shown in Table 1, together with the results for the other analyzed parameters. The results of the PCA of these data are the following. The number of principal components was decided by cross-validation, and three principal components are enough to explain 98.9% of the data variance. The interpretation of the results of a principal component analysis is usually carried out by visualization of the component scores and loadings. In Fig. 1 the loading vectors for the first three components are plotted, and in Fig. 2 the score vectors for the first three components. The notation used for the triglycerides refers to the acids present in their structure: O = oleic acid; P = palmitic acid; S = stearic acid; L = linoleic acid; and Ln = linolenic acid.
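As a small aid for reading the triglyceride codes used from here on (OOO, PLO, OLnO and so on), the notation above can be expanded programmatically. The sketch below is purely illustrative; the names `FATTY_ACIDS` and `decode_triglyceride` are hypothetical.

```python
# Letter codes as defined in the text.
FATTY_ACIDS = {
    "O": "oleic",
    "P": "palmitic",
    "S": "stearic",
    "L": "linoleic",
    "Ln": "linolenic",
}

def decode_triglyceride(code):
    """Expand a code such as 'OLnO' into the list of its fatty acids."""
    acids = []
    i = 0
    while i < len(code):
        # The two-letter symbol "Ln" must be tried before the single "L".
        symbol = "Ln" if code[i:i + 2] == "Ln" else code[i]
        acids.append(FATTY_ACIDS[symbol])
        i += len(symbol)
    return acids

print(decode_triglyceride("OLnO"))
print(decode_triglyceride("PLO"))
```

An unknown letter raises a `KeyError`, which is convenient here because the notation only defines five symbols.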
Table 1
Mean values of measured variables obtained for Manzanilla Cacereña (MC) and Non-Manzanilla Cacereña (NMC) olive oils (mean and standard deviation)

Variable                    MC mean    MC SD    NMC mean    NMC SD
Acidity                          90       89          21        20
Index of peroxide                76      288         358       400
K270                             16        2          14         7
ΔK                                1        1           3         5
K232                            180       23         172        47
Polyphenols                   10754     4906       27637     10638
L*                             8663      636        8958       857
a*                            −1287      201       −1178       568
b*                             6080     1746        4730      2621
C*ab                           6219     1741        4877      2676
∠H(a*)                        10250      222       10250      1320
Stability                      5844     1501        8111      4873
Palmitic acid                  1204      107        1179       159
Palmitoleic acid                 92       12          71        22
Margaric acid                    42       15          70        47
Stearic acid                    198       40         351        59
Oleic acid                     7903      242        7393       564
Linoleic acid                   449      190         837       493
Linolenic acid                   70        5          69         9
Arachidic acid                   36        4          48         8
Gadoleic acid                    27        3          21         3
Behenic acid                     13        1          14         2
Lignoceric acid                  61        8          60         1
LOL + OLnO                      252       80         370       213
PLL                              56        6          52        10
LOO                             890      199        1213       498
PLO                             388      108         595       296
OOO                            4843      367        4135       857
SLO, POO                       2574      212        2352       308
POP                             328       34         300        95
PPP                              77       20          50        12
SOO                             393       50         626       108
SLS, POS                         92       21         134        40
Cholesterol                      14        9          20        23
Brassicasterol                   12        7          22        26
Campesterol                     267       17         334        90
Stigmasterol                    147       79          70        54
Δ-7-Campesterol                  60       32          55        45
Clerosterol                      60       15          79         9
β-Sitosterol                   7997      224        8246       427
Δ-5-Avenasterol                1288      213         980       457
Δ-5,24-Stigmastadienol           73       22          79        28
Δ-7-Stigmasterol                 40       22          57        49
Δ-7-Avenasterol                  48       13          47        17
Total β-sitosterol             9373      118        9354       192

A loading plot for the planes PC1–PC2 and PC1–PC3 (Fig. 1) reveals that the variable OOO is inversely correlated with PLO and LOO, and that the three variables give their variance to PC1. The variable that contributes most to the first component is PLO, since the other two also contribute to PC2 and PC3. It can also be observed in the figure that SLO, POO and SOO are inversely correlated.

Fig. 1. Loadings plots obtained from the PCA of the triglyceride composition data, in the PC1–PC2 and PC1–PC3 planes.

In the same planes, PC1–PC2 and PC1–PC3, a score plot (Fig. 2) reveals that the “Manzanilla Cacereña” samples have positive scores for the first component, and therefore they have values above the mean for those variables whose loadings on this PC are large and positive, that is to say, values above the mean for the OOO variable, while they have values below the mean for PLO and LOO. In the PC1–PC3 plane it can also be observed that the “Manzanilla Cacereña” samples have negative scores for the third component and so values below the mean for SOO. Therefore, the first and the third components seem to be related to the category variable “variety”. If we now draw the same plots but take into account the state of maturity of the olive oil samples (oil from green olives, oil from semi-black olives and oil from black olives), a differentiated distribution of the samples is observed only along the second component, and so this component seems to be related to the category variable “state of maturity”.

Fig. 2. Scores plots obtained from the PCA of the triglyceride composition data, in the PC1–PC2 and PC1–PC3 planes (( ) “Manzanilla Cacereña” olive oils, ( ) “Non-Manzanilla Cacereña” olive oils).

Therefore, PCA reveals the existence of predictor variables, so that classification methods such as SIMCA can be applied. SIMCA was used to determine which variables best model and discriminate between the classes or categories established according to the variety of olives. Two categories were predefined: class 1, including 33 samples of “Manzanilla Cacereña” olive oils (MC), and class 2, with 24 samples of “Non-Manzanilla Cacereña” (NMC) olive oils; in addition, a group of randomly selected samples containing samples of both groups was set aside to be used in the classification step. In the first step, PCA is used to model each class, and two reduced models with 3 significant principal components, obtained by cross-validation, were employed, with 99.0% of explained variance for the MC model and 99.3% for the NMC model. The variables that influence each PC are similar in both cases to those already mentioned for the PCA model constructed for the whole group of samples.

Before using the models to predict the membership of a group of samples in one of the classes, we evaluated the specificity of these models. In our case the distance between the models is 7.24, which indicates that the models are sufficiently distant from each other. We found that OOO, LOO, PLO, and SLO + POO are the variables with the highest modelling power in both classes of olive oils, according to the variety. Regarding the discrimination power, we found that PPP, SOO and OOO are the most important variables for differentiating the classes of olive oils.

Once each class has been modelled, and since the classes are sufficiently separated, new samples can be assigned to each class. To do so, new values of all the variables are calculated for each new sample, using the scores and the loadings of each class, and these are compared with the measured values. The residuals are combined in Si. When plotting Si/S0 against Hi, we found that, at a significance level of 5%, among the 11 samples that belong to the MC group, only two are erroneously assigned to the NMC group. However, of the 12 samples of oils from other types of olives, only 4 are correctly assigned to this group, the rest remaining unassigned, which can be explained by the fact that this group is much less homogeneous, since it is made up of oils that are all mono-varietal but from different types of olives. The results of the classification can also be easily seen in a Coomans' plot (Fig. 3). This shows simultaneously the distances of the new samples to the two classes and, as shown in the figure, one “Manzanilla Cacereña” olive oil sample is classified in the NMC class, another in both classes, and another is not classified in any class. The rest (about 73%) are correctly assigned. Most of the “Non-Manzanilla Cacereña” oils are classified as not belonging to either class.

Fig. 3. Coomans' plot corresponding to the classification of new samples in the models obtained from the triglyceride data (( ) “Manzanilla Cacereña” olive oils, ( ) “Non-Manzanilla Cacereña” olive oils).

3.2. Results obtained from values of sterols

The mean values and the confidence intervals of the different sterols for the samples are shown in Table 1. The results of the PCA for the sterols are the following. Three principal components are enough to explain 98.9% of the data variance. A loading plot in the planes PC1–PC2 and PC1–PC3 (Fig. 4) reveals that the β-sitosterol and Δ-5-avenasterol variables give their variance to PC1. Total β-sitosterol gives its variance to PC2, stigmasterol to PC3, and campesterol to PC2 and PC3. Furthermore, from the score plots in the same planes (Fig. 5), it seems that the third component is related to the variety variable. If we now draw the same plots but take into account the state of maturity of the olive oil samples, no differentiated distribution of the samples is observed when they are projected onto the three components, and therefore these components do not seem to be related to the state of maturity variable.
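The variable and model diagnostics quoted in these SIMCA results (modelling power, discrimination power and distance between models) can be written out as a short sketch. The modelling power follows the expression given in the text. The article does not state formulas for the other two quantities, so the ratio of cross-fit to own-fit residual variances used below is an assumption: it is one common convention, and some texts subtract 1 from the squared ratio. All function names and the numerical inputs are hypothetical.

```python
from math import sqrt

def modelling_power(residual_var, total_var):
    """1 - sqrt(residual variance / total variance) for one variable in one model."""
    return 1.0 - sqrt(residual_var / total_var)

def discrimination_power(cross_rq, cross_qr, own_r, own_q):
    """Assumed convention, for one variable.

    cross_rq: residual variance when class r samples are fitted to model q
    cross_qr: residual variance when class q samples are fitted to model r
    own_r, own_q: residual variance of each class fitted to its own model
    """
    return sqrt((cross_rq + cross_qr) / (own_r + own_q))

def model_distance(cross_rq, cross_qr, own_r, own_q):
    """Same ratio pooled over all variables; each argument is a list of
    per-variable residual variances."""
    return sqrt((sum(cross_rq) + sum(cross_qr)) / (sum(own_r) + sum(own_q)))

# Hypothetical residual variances for illustration only.
mp = modelling_power(0.04, 1.0)
dp = discrimination_power(9.0, 16.0, 1.0, 1.0)
dist = model_distance([9.0, 0.5], [16.0, 0.5], [1.0, 0.5], [1.0, 0.5])
print(round(mp, 3), round(dp, 3), round(dist, 3))
```

Under this convention a variable that a model describes well has a modelling power close to 1, and a model compared with itself has a distance of 1, so values well above 1 indicate separated classes.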
When SIMCA was applied, the modelling power analysis showed that β-sitosterol, Δ-5-avenasterol and total β-sitosterol are the most important variables for characterizing the olive oils according to the variety; regarding the discrimination power, campesterol is the most important variable for characterizing the “Non-Manzanilla Cacereña” olive oils and stigmasterol the most important for the “Manzanilla Cacereña” olive oils. As shown in the Coomans' plot of Fig. 6, four “Manzanilla Cacereña” olive oil samples are classified in their class, while the rest can belong to both classes. However, most of the “Non-Manzanilla Cacereña” oils are classified in their group.

Fig. 4. Loadings plots obtained from the PCA of the sterol composition data, in the PC1–PC2 and PC1–PC3 planes.

Fig. 5. Scores plots obtained from the PCA of the sterol composition data, in the PC1–PC2 and PC1–PC3 planes (( ) “Manzanilla Cacereña” olive oils, ( ) “Non-Manzanilla Cacereña” olive oils).

Fig. 6. Coomans' plot corresponding to the classification of new samples in the models obtained from the sterol data (( ) “Manzanilla Cacereña” olive oils, ( ) “Non-Manzanilla Cacereña” olive oils).

Since the results obtained in the classification according to the sterol content are different from, but complementary to, those obtained according to the triglyceride content, a new classification analysis was made using both types of variables jointly.

3.3. Results obtained from values of triglycerides and sterols

In this case, five principal components are enough to explain 97.3% of the data variance. PC1, PC3, and PC5 are related to triglyceride variables: OOO and LOO (first component); SLO, POO (third component); and SOO (fifth component).
PC2 and PC4 are related to sterol variables: β-sitosterol and Δ-5-avenasterol (second component) and total β-sitosterol (fourth component). When the score plots for the different principal components are drawn, distinguishing among the oil samples according to the variety of the olive or according to the degree of maturity, a differentiated distribution of the samples according to variety is observed along the first component, and therefore this component seems to be related to the variety. This was to be expected, since the triglyceride contents contribute to this PC. The state of maturity variable is related only to the third component.

For the SIMCA classification, two categories or classes were predefined by means of independent mathematical models: one for “Manzanilla Cacereña” olive oils (MC) and another for the other olive oil samples, “Non-Manzanilla Cacereña” (NMC). PCA gives a large number of principal components for both classes: 6 PCs were employed, with 95.8% of explained variance for the MC model and 98.1% for the NMC model. The variables with the most decisive influence on the first PC of the MC class are similar to those mentioned in the previous section, although in PC5 stigmasterol plays a bigger role. In the NMC class there are bigger differences: OOO, LOO and Δ-5-avenasterol are the variables that influence PC1; SLO + POO and OOO, PC2; LOO, β-sitosterol and Δ-5-avenasterol, PC3; total β-sitosterol and campesterol, PC4; and SOO, PC5. In this case the sterol variables participate in all the PCs, being more important variables in defining the model. The distance between the two models is 6.76, and SOO and OOO are the variables with the greatest discrimination power between the models.
The results of the SIMCA classification are very similar to those obtained with the sterol data alone and, regarding the classification of new “Manzanilla Cacereña” samples as belonging to this class, they are worse than those obtained with the triglyceride data. We can conclude that the results of the triglyceride analysis are the most conclusive for classifying these oils, although the sterols could contribute to defining other, different classes.

3.4. Results obtained from values of triglycerides, sterols and a group of selected variables

Lastly, the samples were analysed using a group of selected variables that seem to contain the largest proportion of the information. Different groups of variables were measured: parameters of general character (acidity, peroxide index, colour), fatty acids, triglycerides and sterols. In each case, the variables that contributed most to the first principal component were selected. All the triglyceride variables were retained because they provided the best results in the classification of the “Manzanilla Cacereña” samples. In the case of the fatty acids, oleic and linoleic acids were removed since they are strongly correlated with different triglycerides, OOO and PLO, respectively. Among the variables that we have called general, L*, a* and b* were selected; among the fatty acids, palmitic and stearic acids; and among the sterols, campesterol, stigmasterol, β-sitosterol, Δ-5-avenasterol and total β-sitosterol. We thus obtained a group of 22 variables with which to build two models again, one for “Manzanilla Cacereña” oils and another for “Non-Manzanilla Cacereña” oils. The classification was made with the same group of randomly selected samples used in the previous studies.
The results show a slight improvement, relative to the classification analysis made with the triglycerides, in the classification of “Non-Manzanilla Cacereña” oils but, on the other hand, the number of “Manzanilla Cacereña” oil samples classified as such is smaller. In short, the inclusion of other variables does not seem to improve the results obtained with the triglyceride content as regards the classification of oils as coming from “Manzanilla Cacereña” olives.

4. Conclusions

In the present work, PCA and SIMCA have been used to characterize or classify 80 different olive oils according to their origin. PCA is used mainly to achieve a reduction of dimensionality and to allow a primary evaluation of category similarity. Cross-validation was used to decide how many principal components should be retained in order to summarize the original data effectively. The triglyceride composition data combined with SIMCA gave better results for the classification of “Manzanilla Cacereña” olive oil samples, although in this case most of the samples from other varieties were classified as not belonging to any class. However, with the results of the sterol analysis, most of the “Non-Manzanilla Cacereña” olive oil samples were classified as such. With the remaining groups of variables examined (acidity, peroxide index, colour, fatty acids), and even when all of the measured parameters were used, the number of samples classified correctly is lower. With the triglyceride data a larger number of “Manzanilla Cacereña” olive oil samples are classified as belonging only to this group, and we can conclude that these results are the most conclusive for classifying the “Manzanilla Cacereña” olive oils, although the sterols could contribute to defining other, different classes.
The comparison of these results with those obtained in other similar studies on olive oils from other countries, or even other Spanish regions, is difficult, as the size of the data sets used, the sources of differences (category variables) and their distribution over time are different. We can, however, highlight that the analytical procedure for triglycerides is easy in comparison with that for fatty acids or sterols, which are other parameters frequently used in classification and which are usually analyzed by GC, requiring previous stages of sample preparation and derivatization of the analytes.

Acknowledgements

The authors are grateful to the Ministerio de Ciencia y Tecnología (1FD1997–0517–C03–02) and the Junta de Extremadura (Proyect 2PR03A073) for financial support.

References

Aparicio, R., Albi, T., Lanzon, A., & Navas, M. A. (1987). SEXIA, un sistema experto para la identificación de aceites: base de datos de zonas olivareras. Grasas y Aceites, 38(1), 9–14. Aparicio, R., & Aparicio-Ruiz, R. (2000). Authentication of vegetables oils by chromatographic techniques. Journal of Chromatography A, 881, 93–104. Aparicio, R., Ferreiro, L., Cert, A., & Lanzon, A. (1990). Caracterización de aceites de oliva vírgenes andaluces. Grasas y Aceites, 41(1), 23–39. Bertran, E., Blanco, M., Coello, J., Iturriaga, H., Maspoch, S., & Montoliu, I. (2000). Near-infra-red spectrometry and pattern recognition as screening methods for the authentication of virgin olive oils of very close geographical origins. Journal of Near Infrared Spectroscopy, 8(1), 45–52. Bertsch, M., Mayfield, H. T., & Thomason, M. M. (1981). Proceedings of the fourth international symposium on capillary chromatography, Hindelang, FRG (p. 313). Heidelberg, Germany: Hüthing. Bianchi, G., Giansante, L., Shaw, A., & Kell, D. B. (2001). Chemometric criteria for characterisation of Italian protected denomination of origin (DOP) olive oils from their metabolic profiles.
European Journal of Lipid Science Technology, 103(3), 141–150. Capone, S., Siciliano, P., Quaranta, F., Rella, R., Epifani, M., & Vasanelli, L. (2000). Analysis of vapours and foods by means of an electronic nose based on a sol–gel metal oxide sensors array. Sensors and Actuators B, B69(3), 230–235. Damiani, P., Cossignani, L., Simonetti, M. S., Campisi, B., Favretto, L., & Favretto, L. G. (1997). Stereospecific analysis of the triacylglycerol fraction and linear discriminant analysis in a climatic differentiation of Umbrian extra-virgin olive oils. Journal of Chromatography A, 758, 109–116. Favretto, L. G., Campisi, B., Favretto, L., Simonetti, M. S., Cossignani, L., & Damiani, P. (1999). Cross-validation in linear discriminant analysis of triacylglycerol structural data from Istrian olive oils. Journal of AOAC International, 82(6), 1489–1494. Ferreiro, L., & Aparicio, R. (1992). Influencia de la altitud en la composición química de los aceites de oliva vírgenes de Andalucía. Ecuaciones matemáticas de clasificación. Grasas y Aceites, 43(3), 149–156. Fiorino, P., & Nizzi, F. (1991). The spread of olive farming. Olivae, 44, 9. Forina, M., Armanino, C., Lanteri, S., Calcagno, C., & Tiscornia, E. (1983). Valutazione delle caractteristiche chimiche dell'olio di oliva in funzione dell'annata di produzione mediante metodi di classificazione multivariati. Rivista Italiane delle Sostanza Grasse, LX, 607–613. Forina, M., Drava, G., & Leardi, R. (1997). Chemometrics in transparencies. University of Genova, Genova, Italy. Forina, M., & Lanteri, S. (1984). In B. R. Kowalski (Ed.), Chemometrics, mathematics and statistics in chemistry (p. 305). Dordrecht: Reidel. Forina, M., Lanteri, S., & Armanino, C. (1987). In M. J. S. Dewar, J. D. Dunitz, K. Hafner, E. Heilbronner, S. Ito, J.-M. Lehn, K. Niedenzu, K. N. Raymond, C. W. Rees, F. Wögtle, & G. Witting (Eds.), Topics in current chemistry (pp. 91–144). Berlin: Springer-Verlag. Gigliotti, C., Daghetta, A., & Sidoli, A.
(1993). Indagine conoscitiva sul contenuto triglyceridico di oli extra vergini di oliva di varia provenienza. Rivista Italiane delle Sostanza Grasse, LXX, 483–489. Guadarrama, A., Rodrıguez-Mendez, M. L., Sanz, C., Rıos, J. L., & de Saja, J. A. (2001). Electronic nose based on conducting polymers for the quality control of the olive oil aroma. Discrimination of quality, variety of olive and geographic origin. Analytica Chimica Acta, 432, 283–292. Kellner, R., Mermet, J.-M., Otto, M., & Widmer, H. M. (Eds.). (1998). Analytical chemistry. Weinheim: Wiley-VCH. Leardi, R., & Paganuzzi, V. (1987). Caratterizzazione dell’origine di oli di oliva extravergini mediante metodi chemiometrici applicati alla frazione sterolica. Rivista Italiane delle Sostanza Grasse, LXIV, 131–136. Mannina, L., Patumi, M., Proietti, N., Bassi, D., & Segre, A. L. (2001). Geographical characterization of Italian extra virgin olive oils using high-field 1 H NMR spectroscopy. Journal of Agricultural and Food Chemistry, 49(6), 2687–2696. Paganuzzi, V. (1985). Influenza dell’origine e dello estato di conservazione sulla composizione sterolica degli oli d’oliva non etrattati. III. Oli di presione di origine Grecia. Rivista Italiane delle Sostanza Grasse, 62, 399. Pardo, M., Sberveglieri, G., Gardini, S., & Dalcanale, E. (2000). A hierarchical classification scheme for an electronic nose. Sensors and Actuators B, 69(3), 359–365. Sacchi, R., Mannina, L., Fiordiponti, P., Barone, P., Paolillo, L., Patumi, M., & Segre, A. (1998). Characterization of Italian extra virgin olive oils using hydrogen-1 NMR spectroscopy. Journal of Agricultural and Food Chemistry, 46(10), 3947–3951. Spangenberg, J. E., Macko, S. A., & Hunziker, J. (1998). Characterization of olive oil by carbon isotope analysis of individual fatty acids: Implications for authentication. Journal of Agricultural and Food Chemistry, 46(10), 4179–4184. Stefanoudaki, E., Kotsifaki, F., & Koutsaftakis, A. (1999). 
Classification of virgin olive oils of the two major Cretan cultivars based on their fatty acid composition. Journal of the American Oil Chemists Society, 76(5), 623–626. T.G. Diaz et al. / Food Control 16 (2005) 339–347 Tsimidou, M., & Karakostas, K. X. (1993). Geographical classification of Greek virgin olive oil by non-parametric multivariate evaluation of fatty acid composition. Journal of the Science of Food and Agriculture, 62, 253–257. Tsimidou, M., Macrae, M., & Wilson, I. (1987). Authentication of virgin olive oils using principal component analysis of triglyceride and fatty acid profiles. Part 1. Classification of Greek olive oils. Food Chemistry, 25, 227–239. 347 Wold, S., Albano, C., Dunn, W. J., Edlunk, U., Esbensen, K., Geladi, P., Helberg, S., Johansson, E., Lindbergand, W., & Sjostrom, M. (1984). In B. R. Kowalski (Ed.), Chemometrics, mathematics and statistics in chemistry (pp. 17–96). Dordrecht: Reidel. Wold, S., Albano, C., Dunn, W. J., Esbense, K., Hellberg, S., Johansson, E., & Sjostrom, M. (1983). In H. Martens, & H. Russwurm (Eds.), Food research and data analysis (pp. 147–188). Barking: Applied Science.