Investigation of the Fuzzy System for the Assessment of Cadastre Operators’ Work

Dariusz Król1, Grzegorz Kukla1, Tadeusz Lasota2, and Bogdan Trawiński1

1 Institute of Applied Informatics, Wroclaw University of Technology, Poland
[email protected] grzegorz [email protected] [email protected]
2 Faculty of Environmental Engineering and Geodesy, Agricultural University of Wroclaw, Poland
[email protected]

1 Introduction

Cadastre systems are mission-critical systems designed for the registration of parcels, buildings and apartments as well as their owners and users. These systems have complex data structures and sophisticated data processing procedures. They are built in client-server architecture for LANs as well as in Web technology for use in intranets and extranets. There are over 400 information centres located in district local self-governments as well as in the municipalities of bigger towns. Managers of information centres often complain that they have no adequate tools for assessing the work of cadastre system operators.

A fuzzy model for the assessment of operators’ work in a cadastre information system was proposed in [6]. Following the centre managers’ suggestions, four input criteria were designed: productivity, complexity, time and quality. Productivity is expressed by the number of changes input into the cadastre database within a given period; complexity is the mean number of database objects modified per change; time is the average time of inputting one change; and quality is the percentage of changes that required no corrections.

The results of the investigation of the fuzzy model are discussed in the present paper. The tests were carried out using real data taken from one cadastre centre. The data comprised all change records input into the cadastre database during one year, from October 2004 to September 2005.
In general, numerous methods are used to determine the optimal parameters of fuzzy models [7], including neuro-fuzzy systems [1], genetic algorithms [4] and fuzzy clustering techniques [3]. Fuzzy models are also evaluated using specific analyses such as interpretability [9], sensitivity [8] or regression [2]. In our approach, descriptive statistics, correlation, multiple regression and the distance between rankings have been used in the analysis of the test results.

2 The Structure of the Fuzzy System

2.1 General description of the system

The Mamdani fuzzy model with Larsen’s implication proposed in [6] constitutes the basis of the fuzzy system, which is intended to rationalize the management of information centres, to improve the organization of work and to determine the wages of part-time employees. The general architecture of the fuzzy system is shown in Fig. 1. It comprises five main modules: operators’ work statistics, fuzzification, inference, defuzzification and visualization.

Fig. 1. Architecture of the fuzzy system for the assessment of operators’ work

Triangular and trapezoidal membership functions have been defined for each input criterion, i.e. productivity (P), complexity (C), time (T) and quality (Q), as well as for the output assessment. The statistics module provides the initial parameters of the model and the values of the input criteria. The final assessment is obtained by first calculating the average values of the P, C, Q and T criteria from the change records saved in the cadastre database for all operators over a long period of time, e.g. a year or half a year. These average values serve as the 100% reference values against which the percentage achieved by a given operator within an assessment period is calculated.
The domain for Q, P and T has been set from 0% to 200%; that is, if an operator achieves a result better than 200% of the mean during the assessment period, the result is trimmed to 200%. Data for the quality variable are used directly, because this criterion is expressed as a percentage. Standard deviations, calculated for each criterion separately, determine the initial width of the bases of the triangles and trapezoids. The domain for the output is an arbitrary assessment scale from 0 to 200, with 0 being the lowest and 200 the highest mark.

2.2 Characteristics of the inputs and the output of the system

The first step of the data analysis was to examine significant relations between the input criteria. Work statistics for a period of 12 months for 10 operators were calculated and the values of the input criteria were obtained. Criteria with zero values, for months in which a given operator did not work, and criteria with values trimmed to 200, for months in which a given operator achieved results above 200 percent of the average, were discarded, so the correlation matrix was finally calculated for 97 quadruplets of input values (see Table 1). The correlation coefficients between T and C as well as between T and P turned out to be significant. This result led to the decision to remove the Time input variable from the fuzzy model.

Table 1. Correlation matrix for input criteria

              Complexity   Productivity   Quality   Time
Complexity       1.0
Productivity    −0.04         1.0
Quality          0.05         0.11          1.0
Time            −0.45         0.27          0.01     1.00

Fig. 2. Examples of membership functions of input and output variables

Three main models of input and output variables have been programmed and named 3x5, 5x7 and 7x9. In the 3x5 model, 3 fuzzy sets define each input criterion and the output is defined by 5 fuzzy sets. In the 5x7 and 7x9 models there are 5 and 7 fuzzy sets for each input and 7 and 9 fuzzy sets for the output, respectively.
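As an illustration of the preprocessing described in Sections 2.1 and 2.2, the following minimal Python sketch expresses a raw criterion value as a percentage of the reference average, trims it at 200, discards monthly records containing zero or trimmed values, and computes a Pearson correlation coefficient. The function names are hypothetical and the sketch is not the statistics module of the system; it only restates the steps under the assumption that every criterion in a record is held on the same percentage scale.

```python
from math import sqrt


def to_percent_of_average(value, average, cap=200.0):
    """Express a raw value as a percentage of the reference average, trimmed at cap."""
    if average <= 0:
        raise ValueError("reference average must be positive")
    return min(100.0 * value / average, cap)


def keep_record(record, cap=200.0):
    """Keep a monthly record only if no criterion is zero or trimmed to the cap."""
    return all(0.0 < v < cap for v in record)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)
```

For example, an operator who input 30 changes in a month when the reference average is 20 changes obtains a productivity of 150, whereas 50 changes would be trimmed to 200 and the record would then be excluded from the correlation study.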
Examples of fuzzy membership functions used to define the C, P and Q criteria and the output are shown in Fig. 2, where EL, VL, L, BM, M, AM, H, VH and EH denote Extremely Low, Very Low, Low, Below Medium, Medium, Above Medium, High, Very High and Extremely High respectively. Delta (∆) is a parameter which determines the width of the base of a triangular membership function. The initial value of ∆ is set to the standard deviation (σ), calculated for each criterion separately. During the tests the value of ∆ was varied from 1.0σ to 0.2σ.

2.3 Rule base and inference process

The rule base for each model contains simple IF-THEN rules in which the condition consists of only two input variables combined by the AND operator and the conclusion consists of one variable. An example of a rule is as follows: IF Complexity is low AND Productivity is medium THEN Assessment is low.

Fig. 3. Representation of rule base in matrix form for 3x5 and 5x7 models

Thus the rules for one pair of input criteria can be given in the form of a matrix, as shown in Fig. 3 and 4. Three matrices, one for each pair of input criteria, i.e. (C,P), (C,Q) and (P,Q), have been designed, and each comprises 9, 25 and 49 rules for the 3x5, 5x7 and 7x9 models respectively. In order to express the strength of the rules belonging to a particular combination, rule weights can be assigned to each combination as multipliers of the rule conditions in the aggregation step of the inference module, for example w(C,P) = 0.60, w(C,Q) = 0.20 and w(P,Q) = 0.20. In order to ensure that each input value and each rule has an impact on the final assessment, the following operators have been used: PROD for the aggregation of rule conditions, PROD for the activation of rule conclusions and ASUM for the accumulation of output membership functions, where PROD means the algebraic product and ASUM denotes the algebraic sum [5]. In the defuzzification step the centre of gravity method has been used.
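The inference chain described above (PROD aggregation with rule weights, PROD activation, ASUM accumulation and centre of gravity defuzzification) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the system: the membership function corners, the rule matrix (output set index equal to the sum of the two input set indices) and the use of the same three input sets for every criterion are hypothetical stand-ins for the contents of Fig. 2 and Fig. 3, and ASUM is taken to be a + b − ab.

```python
# Hypothetical fuzzy sets on the 0..200 domain as (a, b, c, d) trapezoid corners;
# a triangle has b == c. These values are illustrative, not taken from Fig. 2.
INPUT_SETS = [(0, 0, 50, 100), (50, 100, 100, 150), (100, 150, 200, 200)]  # L, M, H
OUTPUT_SETS = [(0, 0, 0, 50), (0, 50, 50, 100), (50, 100, 100, 150),
               (100, 150, 150, 200), (150, 200, 200, 200)]  # VL, L, M, H, VH


def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], equals 1 on [b, c] and falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def assess(inputs, pair_weights, step=1.0):
    """Return a crisp assessment in 0..200 for inputs such as {"C": 120, "P": 90, "Q": 100}."""
    zs = [k * step for k in range(int(200 / step) + 1)]
    accumulated = [0.0] * len(zs)
    for (first, second), weight in pair_weights.items():
        for i, set_a in enumerate(INPUT_SETS):
            for j, set_b in enumerate(INPUT_SETS):
                # PROD aggregation of the two conditions, multiplied by the rule weight.
                firing = (weight * trapezoid(inputs[first], *set_a)
                          * trapezoid(inputs[second], *set_b))
                if firing == 0.0:
                    continue
                # Hypothetical rule matrix: output set index = i + j.
                conclusion = OUTPUT_SETS[i + j]
                for k, z in enumerate(zs):
                    # PROD activation (Larsen): scale the conclusion by the firing strength.
                    activated = firing * trapezoid(z, *conclusion)
                    # ASUM accumulation: algebraic sum a + b - a*b.
                    accumulated[k] = accumulated[k] + activated - accumulated[k] * activated
    total = sum(accumulated)
    if total == 0.0:
        raise ValueError("no rule fired")
    # Centre of gravity over the sampled output domain.
    return sum(z * m for z, m in zip(zs, accumulated)) / total
```

With the example weights {("C", "P"): 0.6, ("C", "Q"): 0.2, ("P", "Q"): 0.2} and all three inputs equal to 100, only the Medium-Medium rules fire in this sketch, so the accumulated output is symmetric around 100 and the assessment is approximately 100.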
Fig. 4. Representation of rule base in matrix form for 7x9 models

3 Plan of the Investigation

The experiment has been carried out using the cadastre database taken from one information centre and the change records added to the database by 10 operators during one year, from October 2004 to September 2005. The fuzzy system has been treated as a black box; that is, only the input values of Complexity, Productivity and Quality and the corresponding output assessments have been taken into account in the study. Multiple regression, descriptive statistics and the distance between rankings have been used in the analysis of the test results.

In order to examine how the output assessments change for different parameters of the system, 180 variants of the fuzzy model have been constructed by a simulation program and tested. The variants covered all possible combinations of the three basic 3x5, 5x7 and 7x9 models, five values of the ∆ parameter determining the widths of the triangle bases, three sets of rules and four sets of rule weights. Each variant of the model has been coded according to the method shown in Table 2, where (1), (2), (3) and (4) next to a code caption indicate the position of a digit in the code. For example, 7413 denotes the 7x9 model with ∆=0.4σ using the rule sets (C,P) with a weight of 0.8, (C,Q) with a weight of 0.4 and (P,Q) with a weight of 0.1. In turn, 3134 denotes the 3x5 model with ∆=1.0σ using the rule matrices (C,P) with a weight of 0.6 and (P,Q) with a weight of 0.2.

The purpose of the experiment was to examine how the input values influence the output of the system, how well the assessments produced by the system differentiate the results of operators’ work, and how close the system assessments are to the manager’s subjective judgments.

Table 2. Coding method of variants tested

C(1)  Model    C(2)  Delta    C(3)  Rule sets               C(4)  Weights
3     3x5      1     1.0σ     1     (C,P), (C,Q), (P,Q)     1     1.0, 1.0, 1.0
5     5x7      2     0.8σ     2     (C,P), (C,Q)            2     0.9, 0.6, 0.3
7     7x9      3     0.6σ     3     (C,P), (P,Q)            3     0.8, 0.4, 0.1
               4     0.4σ                                   4     0.6, 0.2, 0.2
               5     0.2σ

4 Results of the Investigation

Data for the analysis of descriptive statistics and multiple linear regression have been prepared in the same way as the data used in the correlation study. Input criteria with zero values, for months in which a given operator did not work, and criteria with values trimmed to 200, for months in which a given operator achieved results above 200 percent of the average, were discarded. However, the comparison of the assessments produced by the fuzzy system with the subjective judgments of the information centre manager has been conducted using statistical data on the changes added by operators to the cadastre system during September 2005.

4.1 Multiple Linear Regression Analysis

Multiple linear regression with no intercept has been calculated for all 180 models. In each case the coefficient R was greater than 0.9 (minimum 0.935, maximum 0.997), the F-value ranged between 219.473 and 4760.762, and the p-value was very close to zero. This indicates that the input criteria are strongly related to the output assessments. The analysis of the β coefficients has revealed that the p-value for the βQ coefficient of the Quality variable was greater than 0.05 in 37 cases, i.e. 20.6%. Moreover, the value of the βQ coefficient was negative in 156 cases, which may be interpreted as operators achieving better complexity or productivity at the cost of lower quality. The results of the regression analysis of 9 selected models for which all βC, βP and βQ coefficients were significant are shown in Table 3.
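A regression without an intercept, as used in this section, can be illustrated with the following minimal Python sketch. It solves the normal equations directly and computes R from uncentred sums of squares, which is a common convention for models through the origin and is assumed here; the function names are hypothetical, the coefficients are raw least-squares estimates (the text does not state whether the reported β values are raw or standardized), and significance tests are omitted.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - partial) / m[r][r]
    return x


def fit_no_intercept(rows, y):
    """Least-squares fit of y = bC*C + bP*P + bQ*Q without a constant term.

    rows holds one (C, P, Q) tuple per observation; returns (coefficients, R).
    """
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    sse = sum((t - f) ** 2 for t, f in zip(y, fitted))
    sst = sum(t ** 2 for t in y)  # uncentred total sum of squares (assumption)
    multiple_r = sqrt(max(0.0, 1.0 - sse / sst))
    return coef, multiple_r
```

For data generated exactly from y = 0.5C + 0.4P − 0.2Q the function recovers these coefficients up to rounding, with R close to 1.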
4.2 Analysis of descriptive statistics

The general question in the analysis of descriptive statistics was how well the output generated by the system differentiates the results of operators’ work. Two measures have therefore been considered: the variability coefficient, i.e. the standard deviation divided by the mean, and the range, i.e. the difference between the maximum and minimum assessments. It may be expected that the more differentiated the results provided by the fuzzy system, the better it will assist managers in assessing their workers. The variability coefficient calculated for the 180 models took values between 0.283 and 0.773, and the range took values between 128 and 182. The values of the variability coefficient presented in Fig. 5 are greater for the 7x9 models than for the 3x5 and 5x7 models.

Table 3. Results of multiple linear regression analysis for 9 selected models

Model type   Model code   Multiple R   F-value     βC      βP      βQ
3x5          3111         0.984        933.819     0.326   0.411    0.201
3x5          3223         0.981        807.087     0.626   0.439   −0.200
3x5          3434         0.969        475.597     0.538   0.724   −0.453
5x7          5131         0.993        2264.987    0.256   0.543    0.289
5x7          5313         0.984        973.042     0.617   0.447   −0.137
5x7          5422         0.971        507.758     0.706   0.402   −0.285
7x9          7312         0.965        426.105     0.555   0.516   −0.220
7x9          7424         0.956        331.688     0.759   0.627   −0.613
7x9          7531         0.948        275.480     0.392   0.791   −0.498

Fig. 5. Values of variability coefficient for 180 models tested

Similar conclusions can be drawn from an examination of the fuzzy system output surface. The plots generated by the Matlab Surface Viewer have shown that the 7x9 model yields more distinguishable assessments than the 3x5 model (see Fig. 6). Fig. 7a presents the range values for different ∆ sizes for corresponding 3x5, 5x7 and 7x9 models (the same rule sets (C,P), (C,Q), (P,Q) and the same weight variant 0.6, 0.2, 0.2), where 1, 2, 3, 4, 5 on the X axis denote ∆ equal to 1.0σ, 0.8σ, 0.6σ, 0.4σ and 0.2σ respectively.
Fig. 7b presents the range values for different rule weight variants for corresponding 3x5, 5x7 and 7x9 models (the same rule sets (C,P), (C,Q), (P,Q) and the same ∆=0.6σ), where 1, 2, 3, 4 on the X axis denote the weight variants (1.0, 1.0, 1.0), (0.9, 0.6, 0.3), (0.8, 0.4, 0.1) and (0.6, 0.2, 0.2) respectively. Both Fig. 7a and Fig. 7b clearly show that the 7x9 models provide more distinguishable results than the other models.

Fig. 6. Assessment surface versus C and P criteria for 3x5 and 7x9 models

Fig. 7. Value of a range for a) different variants of ∆ and for b) different variants of rule weights

4.3 Comparison of the assessments assigned by the system and by a centre manager

In order to evaluate how the output of the system is related to a centre manager’s judgment, one information centre manager was asked to appraise his operators’ work in September 2005. He was not informed how the fuzzy system worked and he did not see the results of the statistics module, so his judgments were entirely subjective. The manager was able to give only relatively rough assessments expressed as percentages: 150%, 120%, 120%, 120%, 110%, 100%, 80%, 80% and 70% for successive operators. It can easily be seen that the manager had difficulties in differentiating individual operators. Nevertheless, where assessments were equal he was asked to rank the operators, so we were able to compare the ranking determined by the manager with those produced by the fuzzy system. The tenth operator was not classified by the manager, who stated that this operator fulfilled different tasks and added changes to the cadastre database only sporadically; this operator was therefore assigned the last position in the manager’s ranking. We used the following measure of the distance between the two rankings:

$$D_{Rank} = \sum_{i=1}^{10} |r_{mi} - r_{si}| \tag{1}$$

where $r_{mi}$ denotes the position of the i-th operator in the manager’s ranking and $r_{si}$ the position of the i-th operator in the ranking produced by the fuzzy system. The DRank measure was calculated for each of the 180 models tested and its value was between 18 and 26. The rank positions of individual operators produced by the system for three selected 3x5, 5x7 and 7x9 models and the positions assigned by the centre manager are presented in Fig. 8.

Fig. 8. Rank positions assigned to individual operators by the manager and the system

It can be seen that the manager and the system agreed on the best and the worst operators. However, the manager clearly underestimated operator c and operator d. This may have been caused by the manager’s subjective approach; for example, when assessing operator d’s work at 70%, he stated that the operator was admittedly a very experienced person but tended to work slowly. It is also possible that there are other criteria of operators’ work assessment, perhaps even immeasurable ones, which the manager took into consideration.

5 Conclusions and Future Works

The fuzzy system for the multi-criteria assessment of information system operators’ work has been evaluated using real data taken from one cadastre centre. Input data generated by the statistical module have been processed using 180 automatically created variants of fuzzy models. The variants covered all possible combinations of the three basic 3x5, 5x7 and 7x9 models, five values of the ∆ parameter determining the widths of the triangle bases, three sets of rules and four sets of rule weights. Multiple linear regression, descriptive statistics, correlation and the distance between operator rankings have been used in the analysis of the test results. The experiment allowed us to investigate the properties of the fuzzy system.
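The variant coding of Table 2 and the count of 180 combinations can be checked with a short Python sketch. The dictionaries restate Table 2; the names are hypothetical, and the way weights are attached to rule sets is an assumption: each weight triple is taken to refer to the pairs (C,P), (C,Q), (P,Q) in that order, with the weight of an unused pair simply dropped, which agrees with the two worked examples 7413 and 3134 in Section 3 but is not confirmed for the other variants.

```python
from itertools import product

# Digit meanings restated from Table 2.
MODELS = {"3": "3x5", "5": "5x7", "7": "7x9"}
DELTAS = {"1": 1.0, "2": 0.8, "3": 0.6, "4": 0.4, "5": 0.2}  # multiples of sigma
RULE_SETS = {
    "1": (("C", "P"), ("C", "Q"), ("P", "Q")),
    "2": (("C", "P"), ("C", "Q")),
    "3": (("C", "P"), ("P", "Q")),
}
WEIGHTS = {
    "1": (1.0, 1.0, 1.0),
    "2": (0.9, 0.6, 0.3),
    "3": (0.8, 0.4, 0.1),
    "4": (0.6, 0.2, 0.2),
}
# Assumed order in which a weight triple refers to the input pairs.
PAIR_ORDER = (("C", "P"), ("C", "Q"), ("P", "Q"))


def decode(code):
    """Translate a four-digit variant code such as "7413" into its parameters."""
    if len(code) != 4:
        raise ValueError("a variant code has exactly four digits")
    model, delta, rules, weights = code
    weight_by_pair = dict(zip(PAIR_ORDER, WEIGHTS[weights]))
    return {
        "model": MODELS[model],
        "delta_sigma": DELTAS[delta],
        "rule_weights": {pair: weight_by_pair[pair] for pair in RULE_SETS[rules]},
    }


# All combinations of model, delta, rule set and weight variant.
ALL_CODES = ["".join(digits) for digits in product(MODELS, DELTAS, RULE_SETS, WEIGHTS)]
```

The list ALL_CODES contains 3 × 5 × 3 × 4 = 180 codes, matching the number of variants tested, and decode("3134") yields the 3x5 model with ∆=1.0σ and the weights 0.6 for (C,P) and 0.2 for (P,Q).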
In 79% of the models all input variables influenced the output significantly. The assessments generated by the models differed in the value of the variability coefficient and the range. The 7x9 models ensured better differentiation of the results. It is not possible to determine definitively which model is optimal; nevertheless, the study demonstrated the usefulness of the model.

It is planned to carry out further evaluation experiments with the active participation of the centre managers. This time the centre managers will be instructed how the fuzzy system operates and will be made familiar with the statistics of operators’ work within a given time. Moreover, they will be able to determine the weights of the rules in order to adjust the system to their preferences.

References

1. Ajith A (2001) Neuro-Fuzzy Systems: State-of-the-Art Modelling Techniques. In: Proceedings of the 6th International Conference on Neural Networks 269–276
2. Cheung W, Pitcher T, Pauly D (2004) A Fuzzy Logic Expert System for Estimating the Intrinsic Extinction Vulnerabilities of Seamount Fishes to Fishing. Fisheries Centre Research Reports 12(5):33–50
3. Gomez A, Delgado M, Vila M (1999) About the use of fuzzy clustering techniques for fuzzy model identification. Fuzzy Sets and Systems 106(2):179–188
4. Herrera F (2005) Genetic Fuzzy Systems: Status, Critical Considerations and Future Directions. Journal of Computational Intelligence Research 1(1):59–67
5. IEC 1131 - Programmable Controllers (1997) Part 7 - Fuzzy Control Programming. Committee Draft CD 1.0 (Rel. 19 Jan 97)
6. Król D, Kukla G S, Lasota T, Trawiński B (2006) Fuzzy Model for the Assessment of Operators’ Work in a Cadastre Information System (to be published)
7. Piegat A (2003) Fuzzy Modelling and Control (in Polish). Akademicka Oficyna Wydawnicza EXIT Warszawa
8. Saez D, Cipriano A (2001) A new method for structure identification of fuzzy models and its application to a combined cycle power plant. Engineering Intelligent Systems 9(2):101–107
9. Xing Zong-Yi, Jia Li-Min, Zhang Yong, Hu Wei-Li, Qin Yong (2005) A Case Study of Data-driven Interpretable Fuzzy Modeling. Acta Automatica Sinica 31(6):815–824