PATHMOX: A PLS-PM Segmentation Algorithm

Gastón Sánchez, Tomàs Aluja
Laboratory of Information Analysis and Modelling (LIAM), Universitat Politècnica de Catalunya
e-mail: [email protected], [email protected]

Summary: One of the main issues within path modeling techniques, especially in business and marketing applications, concerns the identification of different segments in the model population. The approach proposed by the authors consists of building a tree of path models with a decision-tree-like structure by means of the PATHMOX (Path Modeling Segmentation Tree) algorithm. This algorithm is specifically designed for situations in which prior information in the form of external variables (such as socio-demographic variables) is available. Inner models are compared using an extension of a test for the equality of two regression models, and outer models are compared by means of a Ryan-Joiner correlation test.

Keywords: PLS-PM segmentation, equality of regression models, Ryan-Joiner test

1 Path Modeling and Segmentation

Within marketing and business management research, PLS-PM has been applied successfully in studies concerning the measurement of intangibles such as customer and employee perceptions (e.g. satisfaction, motivation, loyalty). In this type of study it is of interest to identify groups of individuals with similar behavior, that is, to identify customer or employee segments. This segmentation task is crucial for managers, who can use it to improve their decision making and increase the organization's profitability.

Different proposals have been developed for tackling this problem: finite mixture PLS was proposed by Hahn et al (2002) and later extended in Ringle et al (2005); Squillacciotti (2005) extends PLS Typological Regression to perform PLS path modeling classification.

On many occasions, external information (information outside the model) is available regarding individuals' characteristics, such as socio-demographic variables (e.g. age, gender, level of studies, etc.). In these cases the identification of segments has to take into account not only the structure of the model but also the available external information.

However, there is one main problematic issue, common to all path modeling segmentation approaches: given two segments and their corresponding models, how should they be compared? This may require the definition of a measure of distance among models, which is not an easy task because every path model consists of two models, the inner model and the outer model, which are hard to compare jointly with other (inner-outer) models.

2 PATHMOX Algorithm

In this work a different approach to path modeling segmentation is proposed: the PATHMOX¹ (Path Modeling Segmentation Tree) algorithm. The idea is to build a tree of path models with a decision-tree-like structure, with a model for a different segment in each of its nodes. The identification of segments takes into account not only the available prior information, in the form of external variables (such as socio-demographic variables), but also the structural relationships between variables. That is, segments that differ at the level of construct relationships can be identified using external information. This is very useful for management executives, who often require such variables (gender, age, level of studies, etc.) to direct decision making and allocate company resources to increase the organization's profits. Also, because a binary segmentation tree is produced, the segments are clearly identified and easily described.

So far there is no agreement about which criterion should be used to compare path models. We suggest that this comparison be performed at two levels: first at the level of the inner models, and then at the level of the outer models.
For the inner models, the comparison, which serves to identify different segments, should be based on the path coefficients, because they express the causal structural relationships. Once the segments are identified, the outer models are compared in order to answer the following question: should the latent variables in the children nodes remain the same as in the parent node?

The algorithm starts with the estimation of the global PLS path model (over all the individuals) at the root node. Then, with the help of the explanatory external variables, all possible binary splits of the latent variable scores are produced and local models for each partition are calculated. Among all the possible splits, the best one is selected by means of a test for comparing inner models. The applied test is an extension of the test for the equality of two regression models in Lebart et al (1985). In addition, outer models are compared using a Ryan-Joiner correlation test to decide whether the latent variables in the children nodes remain the same as in the parent node. The stop criterion considers the number of individuals in a node and the significance level of the best split.

PATHMOX Algorithm
Step 1: Start with the global PLS path model at the root node
Step 2: Find the best split: test for equality of coefficients of the inner model
Step 3: If (stop criteria = false) then
            For each child node
                Evaluate the outer models equivalence by means of Ryan-Joiner test
                If (distinct outer models = true) then
                    Re-estimate the PLS path model in the child node
            Return to Step 2
        Else
            Stop algorithm
Path Modeling Segmentation Tree Algorithm

¹ The term MOX in PATHMOX refers to Moxexeloa, a word in Nahuatl (the Aztec language) that means "divide into groups".

The splitting process is as follows.
Consider the modalities of an external explicative variable. For every possible two-way split of these modalities, the set of latent variable scores is divided into two groups, 1 and 2, of sizes $n_1$ and $n_2$ respectively, and the inner model for each group is estimated; that is, every partition produces two potential models for the children nodes. The inner models of the children nodes are compared with the inner model of the parent node by means of an extension of a test for the equality of regression models based on an F statistic. The test is based on the path coefficients and assumes that the residual terms $\varepsilon$ are normally distributed. Under the null hypothesis $H_0$ all the coefficients are assumed to be identical; under the alternative hypothesis $H_1$ the coefficients of the two models are considered to be different:

$$\eta_1 = \xi_1 \beta_1 + \varepsilon_1, \qquad \eta_2 = \xi_2 \beta_2 + \varepsilon_2$$

where $\eta_i$ is a column vector of all endogenous latent variables; $\xi_i$ is the matrix of the explicative latent variables related to each endogenous LV; $\beta_i$ is a column vector of all path coefficients; and $\varepsilon_i$ is the residual vector, assumed to be normally distributed.

Under the null hypothesis $H_0$ all coefficients are equal, $\beta_1 = \beta_2 = \beta$:

$$\eta_1 = \xi_1 \beta + \varepsilon_1, \qquad \eta_2 = \xi_2 \beta + \varepsilon_2$$

The models under both hypotheses can be expressed in matrix notation as follows.

Under $H_0$:

$$\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix} \beta + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}$$

Under $H_1$:

$$\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = \begin{bmatrix} \xi_1 & 0 \\ 0 & \xi_2 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix}$$

Calculating the sums of squared errors $SSE_0$ and $SSE_1$ under each hypothesis, the test statistic is an F statistic with $p^*$ and $(n^* - 2p^*)$ degrees of freedom:

$$F = \frac{(SSE_0 - SSE_1)/p^*}{SSE_1/(n^* - 2p^*)} = \frac{n^* - 2p^*}{p^*} \cdot \frac{SSE_0 - SSE_1}{SSE_1}$$

where $n^* = N J$; $N = n_1 + n_2$ (number of elements in the model); $J$ is the number of endogenous LVs; $p^* = \sum_j p_j$; and $p_j$ is the number of explicative LVs for the $j$-th endogenous LV, $j = 1, \dots, J$.

The partition with the most significant p-value is considered as a candidate for the best split.
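The F statistic above can be sketched in plain Python for the simplest case of a single endogenous latent variable ($J = 1$, so that $n^* = N$ and $p^*$ is the number of explicative LVs). This is only an illustration under stated assumptions: the function names (`solve`, `sse`, `f_statistic`) and the toy data are hypothetical, the regressions are fitted by ordinary least squares without an intercept on already-computed LV scores, and the lookup of the p-value in the F distribution is left out.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[piv] = m[piv], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def sse(xi, eta):
    """Sum of squared errors of the no-intercept fit eta = xi * beta."""
    p = len(xi[0])
    xtx = [[sum(r[i] * r[j] for r in xi) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(xi, eta)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((y - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, y in zip(xi, eta))


def f_statistic(xi1, eta1, xi2, eta2):
    """F statistic for H0: beta_1 = beta_2, with J = 1 endogenous LV."""
    p_star = len(xi1[0])                   # number of explicative LVs
    n_star = len(eta1) + len(eta2)         # N * J with J = 1
    sse0 = sse(xi1 + xi2, eta1 + eta2)     # pooled model (H0)
    sse1 = sse(xi1, eta1) + sse(xi2, eta2) # separate models (H1)
    return (n_star - 2 * p_star) / p_star * (sse0 - sse1) / sse1


if __name__ == "__main__":
    # Toy scores: the two groups have roughly opposite path coefficients.
    xi = [[1.0], [2.0], [3.0], [4.0]]
    group1 = [1.1, 1.9, 3.1, 3.9]
    group2 = [-0.9, -2.1, -2.9, -4.1]
    print(f_statistic(xi, group1, xi, group2))
```

A large value of the statistic indicates that fitting separate coefficients in the two groups reduces the error substantially, which is the situation the splitting step looks for; two groups with identical data give a value of zero.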
This process is applied to each external explicative variable, and the partition with the minimum p-value among all the candidates is selected as the optimal split.

Once a child node (segment) is identified, the next step consists of testing the equivalence of the child's outer model with the parent's outer model. This is done in order to verify whether the estimated latent variables remain the same as in the parent node, or whether they should be re-estimated in the child node. To compare the outer models, we focus on the correlations between the LVs in the parent node and the LVs in the child node. We assume that if the outer model in a child node is very similar to the outer model in the parent node, the correlations between the LVs in the parent node and the LVs in the child node should be high (close to unity). To assess how high the correlations are, the Ryan-Joiner correlation test (Ryan & Joiner, 1976) is used. The Ryan-Joiner test is an objective way of judging the normal probability plots used for testing the normality of a set of data. In other words, this test is used to measure the straightness of a probability plot. By using a Ryan-Joiner test we do not intend to perform any normality test; instead we use it as a tool for assessing how close to unity the correlations between the LVs in the parent node and the LVs in the child node are. It may be argued that the test is being misused in the way it is applied in the PATHMOX algorithm; however, we use it as a first (although primitive) tool for outer models comparison.

Finally, the stop rule evaluates two conditions: (1) a fixed number of individuals in a node, and/or (2) the p-value significance level. The first condition is used to avoid small segments, which are not useful in practice. The second criterion avoids the identification of segments with low significance levels.
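The remaining pieces of the splitting step (enumerating the two-way splits of the modalities, checking how close to unity a parent-child correlation is, and applying the stop rule) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the default values `min_size=100` and `alpha=0.05`, and the `critical` threshold are all hypothetical. In particular, the critical value that the paper takes from the Ryan-Joiner test is not computed here but supplied by the caller, and every external variable is treated as nominal, so that a variable with k modalities yields 2^(k-1) - 1 splits.

```python
from itertools import combinations
from math import sqrt


def binary_splits(modalities):
    """Yield every two-way split (left, right) of a sequence of modalities.

    The first modality is always kept on the left so that mirrored
    splits are not produced twice; the right group is never empty.
    """
    first, rest = modalities[0], modalities[1:]
    for r in range(len(rest)):  # r = number of other modalities joining `first`
        for extra in combinations(rest, r):
            left = (first,) + extra
            right = tuple(m for m in rest if m not in extra)
            yield left, right


def pearson(u, v):
    """Pearson correlation between two equally long score vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def needs_reestimation(parent_scores, child_scores, critical):
    """True when the correlation falls below the supplied critical value.

    `critical` is an assumed input standing in for the Ryan-Joiner
    critical value; it is not derived in this sketch.
    """
    return pearson(parent_scores, child_scores) < critical


def should_stop(n_left, n_right, p_value, min_size=100, alpha=0.05):
    """Stop rule: a child node is too small or the split is not significant."""
    return min(n_left, n_right) < min_size or p_value > alpha


if __name__ == "__main__":
    for left, right in binary_splits(["A", "B", "C"]):
        print(left, "|", right)
```

For ordered variables such as age or seniority one would probably restrict the enumeration to cuts that respect the ordering; that variant is not shown.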
3 Job Motivation and Satisfaction

For many decades, the importance of intangibles has been recognized in the business management literature. But it is only in recent years that more attention has been paid to the development of methodologies for measuring and reporting them (Eskildsen et al, 2004a). Some examples are the American Customer Satisfaction Index (ACSI) and the European Customer Satisfaction Index (ECSI). Measuring the levels of employee perceptions such as satisfaction, motivation, commitment, or intention to leave the job is an important task because of their implications for job-related aspects such as productivity, absenteeism, competitiveness, etc.

Usually, a motivated worker is assumed to perform better and consequently to make a better contribution to the business. Another aspect closely related to motivation is satisfaction, and it is important to know how satisfied, or dissatisfied, an employee is, because a frustrated or passive worker could have serious intentions to leave the job. Turnover can be a serious problem for many businesses because the skills and experience that employees must have are hard to acquire and require years of education and training. Even if employees' knowledge and skills are not important, companies must face a decrease in productivity when their labor force is reduced because of turnover.

The analysis of employee perceptions has a long tradition among psychologists, sociologists and human resources researchers, as part of a field of study known as Organizational Psychology. However, the application of causal models aimed at developing standardized methods for measuring such perceptions is relatively new.

3.1 Causal Model

The proposed causal model is an adaptation of the models presented in Känd and Rekor (2005) and Eskildsen et al (2004b). Other similar models are found in Gaertner (1999), Currivan (1999), and Kim (1999).
One of the main differences between the present model and those on which it is based is that it considers not only a satisfaction construct but also a motivation construct. The reason for considering satisfaction and motivation separately is based on their definitions. It is assumed that a positive emotional state causes someone to perform efficiently on the job; in other words, it is assumed that satisfaction causes motivation.

The theoretical framework for the causal model comprises Herzberg's two factor theory and expectancy theory. Herzberg's theory states that persons have two classes of needs: (1) hygiene needs and (2) motivation needs (Furnham, 2001). The first type of needs is influenced by the physical and psychological conditions in which people work. Factors related to hygiene needs are immediate supervision, work conditions, workload, salary, corporate policies, managerial practices, and personal relationships, among others. Motivation needs are related to autonomy, achievement, responsibility, recognition, feedback, enrichment and promotional chances.

Expectancy theory, on the other hand, assumes that employees enter organizations with a set of beliefs about their workplace divided into three categories: expectancy, instrumentality and valence. Expectancy is the belief that personal effort will lead to efficient performance. Instrumentality is the belief that good performance will be rewarded. Valence is the perceived value of the expected rewards. Thus, motivation is seen as a multiplicative process of expectancy, instrumentality and valence. Motivation will be achieved with high levels of valence, instrumentality and expectancy.

The model includes eight constructs, of which five are exogenous and three are endogenous. The exogenous constructs can be divided into four main groups of work environment characteristics (hygiene factors) and one construct related to the motivational factors. These are the following:

1. Conditions of work: Perceptions of the workplace conditions and facilities
2. Salary: Remuneration of work performed in the organization
3. Leadership: Degree of consideration expressed from an employee in a subordinate position
4. Image: Degree to which an employee feels identified with the organization
5. Empowerment: Perceptions of autonomy, initiative, responsibility, recognition

The three endogenous latent variables are satisfaction, motivation and loyalty. The concept of loyalty is a complex one because it involves the employee's propensity to stay in the organization and the degree to which he or she is committed to it. It is assumed that loyalty is the output of satisfaction and motivation.

The proposed causal model shown in Figure 1 focuses on the causal relationships between job satisfaction, job motivation, and loyalty. All three endogenous variables are assumed to be influenced by the exogenous constructs. In addition, satisfaction affects motivation, and both of them affect loyalty. By allowing all the exogenous latent variables to be related to the three endogenous constructs, the aim is to cover different dimensions of job satisfaction and motivation and to allow comparison of their impacts on the three endogenous variables.

[Figure 1 Path diagram for the Causal Model. Exogenous constructs: Empowerment, Image, Salary, Work Conditions, Leadership. Endogenous constructs: Satisfaction, Motivation, Loyalty.]

3.2 Data and Results

The PATHMOX algorithm is applied to data collected in a satisfaction and motivation survey of employees working in a Spanish banking entity. The data contain 41 variables observed on 8020 employees and grouped in 8 sets corresponding to the 8 latent variables: Empowerment, Image, Salary, Work Conditions, Leadership, Satisfaction, Motivation, and Loyalty. In addition to the 41 manifest variables, five external explanatory variables regarding socio-demographic aspects are considered: gender, age, job level, seniority, and sector differences.
For comparison purposes, Table 1 gives the estimates of the path coefficients (inner model) for the global model and for the 6 models of the identified segments. The resulting final segments are: (1) female managers, (2) male managers, (3) senior assistant managers, (4) medium-junior assistant managers, (5) employees in sector A, and (6) employees in sector B. Figure 2 illustrates the segmentation tree obtained. The number of employees forming each segment is shown inside each node, and the segments in the final nodes are numbered from 1 to 6. Also, every split is characterized by its corresponding explanatory partition. The segmentation tree shows a first division according to job level into managers and other employees. Managers are further split according to gender. The other employees are divided into assistant managers and the rest of the workers. Assistant managers are partitioned according to their seniority (before 1975, after 1975), and finally the rest of the workers are segmented according to sector (A or B).
Table 1 Inner model results for global model and final segments

SATISFACTION       Global    Seg 1     Seg 2     Seg 3     Seg 4     Seg 5     Seg 6
R2                 0.568     0.505     0.533     0.571     0.572     0.6356    0.6263
Empowerment        0.4972    0.3883    0.4467    0.4865    0.4931    0.5161    0.5698
Work Conditions    0.1263    0.2354    0.1587    0.0609    0.1568    0.1486    0.1331
Leadership         0.0822    0.1176    0.0624   -0.0119    0.0784    0.1238    0.1149
Image              0.1494    0.2155    0.2045    0.2023    0.1528    0.1315    0.0634
Salary             0.1432    0.0244    0.1137    0.2241    0.1135    0.1272    0.0869

MOTIVATION         Global    Seg 1     Seg 2     Seg 3     Seg 4     Seg 5     Seg 6
R2                 0.47      0.3904    0.376     0.51      0.4263    0.5538    0.5412
Empowerment        0.1536    0.1686    0.1463    0.2869    0.1313    0.1491    0.1366
Work Conditions    0.1456   -0.1062    0.0751   -0.0144    0.1592    0.1936    0.1243
Leadership         0.1036    0.2317    0.1118   -0.0438    0.1332    0.0771    0.1002
Image              0.1025   -0.0218    0.096     0.1395    0.0713    0.1588    0.0815
Salary            -0.0649   -0.0485   -0.0644    0.0254   -0.0238   -0.1315    0.0397
Satisfaction       0.3964    0.4344    0.3741    0.4141    0.3663    0.4095    0.428

LOYALTY            Global    Seg 1     Seg 2     Seg 3     Seg 4     Seg 5     Seg 6
R2                 0.56      0.5338    0.526     0.568     0.5087    0.5777    0.6
Empowerment        0.1457    0.3213    0.1334    0.1219    0.1795    0.1573    0.1873
Work Conditions    0.0004    0.0156   -0.0296    0.0143   -0.024     0.0448    0.0225
Leadership        -0.0405    0.0938    0.0274   -0.0222   -0.0091   -0.0703   -0.0854
Image              0.2109    0.2459    0.1899    0.0614    0.2339    0.2271    0.1825
Salary             0.1405   -0.0867    0.1513    0.1659    0.1039    0.1394    0.1405
Satisfaction       0.2879    0.1502    0.2863    0.3194    0.2795    0.2082    0.2455
Motivation         0.2302    0.2036    0.2182    0.2603    0.1744    0.2803    0.2748

From Table 1 one can see some differences between the global model and the identified segments according to the three endogenous constructs. Regarding Satisfaction, female managers give much importance to Work Conditions and Image; male managers are satisfied mainly by the Image; Salary influences senior assistant managers; Leadership is important for employees in Sectors A and B.
With respect to Motivation, female managers are motivated mainly by good Leadership and Image; senior assistant managers consider Empowerment and Salary important factors, while the other assistant managers consider Leadership; workers in Sector A relate Motivation to Work Conditions and Image. Finally, with respect to Loyalty, female managers are influenced by Empowerment, Work Conditions and Leadership; senior assistant managers give importance to Work Conditions, Satisfaction and Motivation; other assistant managers are influenced by Empowerment; and the rest of the workers consider Work Conditions and Motivation important for being loyal.

[Figure 2 Segmentation Tree. Number of employees in each node:
Root node: 8020
    Managers: 2650, split by gender into Women (segment 1) and Men (segment 2), nodes of 272 and 2378 employees
    Other employees: 5370
        Assistant Managers: 2102
            Sen <= 75 (segment 3): 174
            Sen > 75 (segment 4): 1928
        Rest of employees: 3268
            Sector A (segment 5): 1609
            Sector B (segment 6): 1659]

4 Conclusions

Future research on path modeling segmentation with the PATHMOX algorithm will focus mainly on two problematic issues, which are also common to other approaches to path modeling segmentation. They concern (1) the comparison among different path models, and (2) whether structural models remain the same for all segments.

Specifically, the model comparison criteria applied within the PATHMOX algorithm have two problems. On the one hand, the F-test used for inner model comparison assumes that the residual terms ε are normally distributed, which may not hold in practice. On the other hand, the Ryan-Joiner correlation test for normal probability plots is arguably misused when applied to the comparison of outer models. Thus, other comparison methods that do not rely on normal distributional assumptions should be developed.

With respect to the second problematic issue, all path modeling segmentation techniques assume that the inner model remains the same for all segments. However, further analysis might consider the possibility that segments also differ in the structural model itself.

References

Currivan D. B. (1999) The Causal Order of Job Satisfaction and Organizational Commitment in Models of Employee Turnover, Human Resource Management Review, 9 (4), 495-524.
Eskildsen J. K., Kristensen K., Westlund A. (2004a) Work motivation and job satisfaction in the Nordic countries, Employee Relations, 26 (10), 122-136.
Eskildsen J. K., Kristensen K., Westlund A. (2004b) Measuring employee assets: The Nordic Employee Index, Business Process Management, 10 (5), 537-550.
Furnham A. (2001) Psicología Organizacional: el comportamiento del individuo en las organizaciones. Oxford University Press, Mexico. (Trans. from: The Psychology of behaviour at work: the individual in the organization)
Gaertner S. (1999) Structural Determinants of Job Satisfaction and Organizational Commitment in Turnover Models, Human Resource Management Review, 9 (4), 479-493.
Hahn C., Johnson M., Herrmann A., Huber F. (2002) Capturing Customer Heterogeneity using a Finite Mixture PLS Approach, Schmalenbach Business Review, 54, 243-269.
Känd M., Rekor M. (2005) Perceived Involvement in Decision Making and Job Satisfaction: The Evidence from a Job Satisfaction Survey among Nurses in Estonia, SSE Riga Working Papers, 6, Stockholm School of Economics in Riga. http://www.sseriga.edu.lv/library/working_papers/FT_2005_6.pdf
Kim S. (1999) Behavioral Commitment Among the Automobile Workers in South Korea, Human Resource Management Review, 9 (4), 419-451.
Lebart L., Morineau A., Fénelon J. P. (1985) Tratamiento estadístico de datos, Marcombo, Barcelona, Spain.
Ringle C. M., Wende S., Will A. (2005) Customer Segmentation with FIMIX-PLS, in: Proceedings of the PLS'05 International Symposium, T. Aluja, J. Casanovas, V. Esposito, A. Morineau, M. Tenenhaus (Eds.), SPAD Test&go, 507-514.
Ryan T. A., Joiner B. L. (1976) Normal Probability Plots and Tests for Normality, Technical Report, Statistics Department, The Pennsylvania State University, USA.
Squillacciotti S. (2005) Prediction oriented classification in PLS Path Modelling, in: Proceedings of the PLS'05 International Symposium, T. Aluja, J. Casanovas, V. Esposito, A. Morineau, M. Tenenhaus (Eds.), SPAD Test&go, 499-506.