
Soft modeling – PLS PM
Determining the parameters is not statistical inference, despite the use of the term "estimation" (Wold, 1980)
Partial OLS – sequential; orthogonal projection of the data onto a multi-construct linear space
OLS regressions: minimum-distance property of orthogonal projections
1 reflective block = PCA
2 formative blocks – canonical correlation
Minimizes the sum of squared prediction errors for each measurement equation separately, and determines the construct values iteratively so as to
maximize the correlation coefficients between them
Constructs are correlated with one another
Asymmetry of the equations – causal interpretation – recursiveness – identification: no problem
Distribution-free – no statistical inference – variables with a small number of distinct values
Wold's algorithm for estimating models with formative constructs is monotonically convergent (proof: Hanafi, 2007)
Algorithms for estimating complex models are not always convergent, nor always stable with respect to the starting values
Proce
"Under regularity conditions", for simple models (1-2 blocks), the parameters obtained by PLS and by ML are similar
PLS regression
preserves the asymmetry of the relationship between predictors and dependent variables, whereas these
other techniques treat them symmetrically
Wold, H. (1980). Model Construction and Evaluation When Theoretical Knowledge Is Scarce. Theory and
Application of Partial Least Squares. In J. Kmenta & J. B. Ramsey (Eds.), Evaluation of Econometric
Models (pp. 47–74). Academic Press.
The structural equations (2) are estimated by individual multiple regressions where the latent variables ξj are
replaced by their estimations Yj.
Each LV is estimated as a weighted aggregate of its indicators. The weights of the indicators in each aggregate are determined
by the weight relations of the various blocks.
The estimation proceeds in three stages.
1. First, an iterative procedure estimates the weights and the LVs
2. Second, the LVs estimated in the first stage provide regressors for estimating the other unknown coefficients of the
model by OLS (Ordinary Least Squares) regressions
3. In the third stage the location parameters are estimated.
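The three stages can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not Wold's exact algorithm: it assumes Mode A outer weights, the centroid inner weighting scheme, indicators standardized before iteration, and a connected path diagram. The names `pls_pm_sketch`, `links`, `paths` and the simulated data are hypothetical.

```python
import numpy as np

def standardize(a):
    """Center to mean 0 and scale to unit (population) variance, column-wise."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def pls_pm_sketch(blocks, links, paths, tol=1e-10, max_iter=1000):
    """Three-stage PLS path model sketch (Mode A, centroid scheme: assumptions).

    blocks : list of (n, p_j) arrays of raw indicators, one per latent variable
    links  : (J, J) symmetric 0/1 array marking which LVs are adjacent
    paths  : dict {endogenous LV index: [indices of its predictor LVs]}
    """
    Z = [standardize(b) for b in blocks]
    n = Z[0].shape[0]

    # Stage 1: alternate outer and inner approximations until the weights settle.
    w = [np.ones(z.shape[1]) for z in Z]
    for _ in range(max_iter):
        Y = np.column_stack([standardize(z @ wj) for z, wj in zip(Z, w)])
        signs = np.sign(np.corrcoef(Y, rowvar=False)) * links
        inner = standardize(Y @ signs)             # signed sum of adjacent LVs
        w_new = [z.T @ inner[:, j] / n for j, z in enumerate(Z)]  # Mode A
        change = max(np.abs(a - b).max() for a, b in zip(w, w_new))
        w = w_new
        if change < tol:
            break
    # Rescale the weights so that every LV score has unit variance.
    w = [wj / (z @ wj).std() for z, wj in zip(Z, w)]
    Y = np.column_stack([z @ wj for z, wj in zip(Z, w)])

    # Stage 2: OLS with the estimated LVs as regressors.
    loadings = [(b - b.mean(axis=0)).T @ Y[:, j] / n for j, b in enumerate(blocks)]
    betas = {j: np.linalg.lstsq(Y[:, p], Y[:, j], rcond=None)[0]
             for j, p in paths.items()}

    # Stage 3: location parameters on the raw measurement scale.
    lv_means = np.array([(wj / b.std(axis=0)) @ b.mean(axis=0)
                         for wj, b in zip(w, blocks)])
    outer_loc = [b.mean(axis=0) - lam * m
                 for b, lam, m in zip(blocks, loadings, lv_means)]
    inner_loc = {j: lv_means[j] - betas[j] @ lv_means[p]
                 for j, p in paths.items()}
    return {"scores": Y, "weights": w, "loadings": loadings, "betas": betas,
            "lv_means": lv_means, "outer_loc": outer_loc, "inner_loc": inner_loc}

# Hypothetical two-block example: block 0 predicts block 1.
rng = np.random.default_rng(0)
n = 300
xi = rng.normal(size=n)
eta = 0.7 * xi + 0.5 * rng.normal(size=n)
X0 = 5.0 + np.outer(xi, [1.0, 0.8, 0.6]) + 0.5 * rng.normal(size=(n, 3))
X1 = -2.0 + np.outer(eta, [0.9, 0.7]) + 0.5 * rng.normal(size=(n, 2))
links = np.array([[0, 1], [1, 0]])
result = pls_pm_sketch([X0, X1], links, paths={1: [0]})
print("path coefficient:", result["betas"][1])
```

Each step solves only a small least-squares problem given the current values of everything else, which is the sense in which the procedure is "partial".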
With respect to the testing and evaluation of the model, classical methods such as confidence intervals and goodness of fit,
which are based on distributional properties of the observables, are not available because of the scarcity of theoretical
knowledge. PLS modeling instead uses LS (Least Squares)-oriented but distribution-free methods. This parting of the ways is
technical rather than real, for ML aims for optimal accuracy but PLS for consistency. Under regular conditions ML and PLS
estimates are co-consistent, so that there is no substantial difference between the two sets of estimates.
In the basic design of PLS path models with LVs, the block structure is assumed to be linear:
$$x_{jk} = \pi_{jk0} + \pi_{jk}\,\xi_j + \nu_{jk} \qquad (4)$$
where $\xi_j$ is the LV of the jth block, $\pi_{jk}$ is the loading of the kth indicator $x_{jk}$ in the jth block, the $\pi_{jk0}$ are location parameters and
the $\nu_{jk}$ are residuals in the block structure.
In each relation in (4), the systematic part is assumed to be the conditional expectation of the indicator when the latent
variable is given, that is,
$$E(x_{jk} \mid \xi_j) = \pi_{jk0} + \pi_{jk}\,\xi_j \qquad (5)$$
The "predictor" assumption in (5) implies that each residual in each equation has a conditional expectation of zero and is
uncorrelated with the latent variable occurring in that equation. Since both $\pi_{jk}$ and $\xi_j$ are unknown in the block structure
in (4), some standardization of scale is necessary in order to avoid ambiguity. The choice of standardization does not affect
the substantive results of the statistical analysis. In PLS path models all LVs are standardized so as to have unit variance:
$$\operatorname{var}(\xi_j) = 1$$
It is a fundamental principle in PLS path modeling that the information between the blocks and the ensuing
causal-predictive inference is conveyed through the latent variables. Accordingly, it is assumed that the latent variables
are, in general, intercorrelated, say,
$$r(\xi_i, \xi_j) = \rho_{ij}$$
whereas the residuals of any block are assumed to be uncorrelated with the residuals of other blocks and with all latent
variables:
$$r(\nu_{ik}, \nu_{jl}) = 0 \quad (i \neq j), \qquad r(\nu_{ik}, \xi_j) = 0$$
In the basic model design it is assumed, furthermore, that the residuals are mutually uncorrelated within blocks:
$$r(\nu_{jk}, \nu_{jl}) = 0 \quad (k \neq l)$$
Inner relations
The LVs of a PLS path model are related by a path of "inner" relations, which are linear and form a causal chain system. Using
subscripts $j_1, \ldots, j_H$ for the endogenous variables, say H in number, the inner relations are denoted by
$$\xi_{j_i} = \beta_{j_i 0} + \beta_{(i)}\,\xi_{(i)} + \varepsilon_{j_i}, \qquad i = 1, \ldots, H \qquad (10)$$
and are specified by the conditional expectations
$$E(\xi_{j_i} \mid \xi_{(i)}) = \beta_{j_i 0} + \beta_{(i)}\,\xi_{(i)}$$
where $\xi_{(i)}$ denotes the column vector of the latent variables that appear as regressors with nonzero coefficients in the ith
equation as shown in (10) and $\beta_{(i)}$ is the row vector of the corresponding coefficients.
To begin, the first principal component is numerically equivalent, up to a scale factor, to a one-block PLS
model estimated using the mode A procedure.
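The equivalence can be checked numerically. The sketch below assumes standardized indicators and a single block, so that the Mode A inner estimate is just the block's own LV score; the iteration then reduces to the power method on the correlation matrix. The function names and the simulated data are hypothetical.

```python
import numpy as np

def one_block_mode_a(X, tol=1e-12, max_iter=10_000):
    """One-block Mode A iteration: weights are covariances of indicators with the LV."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n = Z.shape[0]
    w = np.ones(Z.shape[1])
    for _ in range(max_iter):
        y = Z @ w
        y /= y.std()                 # keep the LV score at unit variance
        w_new = Z.T @ y / n          # simple regressions of indicators on the LV
        done = np.abs(w_new - w).max() < tol
        w = w_new
        if done:
            break
    y = Z @ w
    return y / y.std(), w

def first_principal_component(X):
    """Scores on the first principal component of the standardized data (via SVD)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * s[0]

# Hypothetical data: four indicators driven by one common factor.
rng = np.random.default_rng(42)
factor = rng.normal(size=200)
X = np.outer(factor, [1.0, 0.9, 0.8, 0.7]) + 0.5 * rng.normal(size=(200, 4))

lv, weights = one_block_mode_a(X)
pc1 = first_principal_component(X)
# The two score vectors agree up to sign and scale.
print("absolute correlation:", abs(np.corrcoef(lv, pc1)[0, 1]))
```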
The two-block PLS path model estimated using the mode B procedure gives the first canonical correlation coefficient as
the estimated correlation between the two latent variables.
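A small numerical check of this statement, as a sketch: with two blocks, each Mode B step is a multiple regression of the other block's LV score on the block's own indicators. The sequential update order, the function names, and the simulated data are assumptions made here for illustration. The reference value is the largest canonical correlation computed from QR factors of the centered blocks.

```python
import numpy as np

def two_block_mode_b(X1, X2, tol=1e-12, max_iter=10_000):
    """Alternate multiple regressions between two blocks (Mode B)."""
    Z1 = X1 - X1.mean(axis=0)
    Z2 = X2 - X2.mean(axis=0)
    y2 = Z2[:, 0] / Z2[:, 0].std()          # arbitrary starting score
    r_old = 0.0
    for _ in range(max_iter):
        w1 = np.linalg.lstsq(Z1, y2, rcond=None)[0]   # regress y2 on block 1
        y1 = Z1 @ w1
        y1 /= y1.std()
        w2 = np.linalg.lstsq(Z2, y1, rcond=None)[0]   # regress y1 on block 2
        y2 = Z2 @ w2
        y2 /= y2.std()
        r = float(np.mean(y1 * y2))         # correlation of standardized scores
        if abs(r - r_old) < tol:
            break
        r_old = r
    return r, y1, y2

def first_canonical_correlation(X1, X2):
    """Largest canonical correlation via QR of the centered blocks."""
    Q1, _ = np.linalg.qr(X1 - X1.mean(axis=0))
    Q2, _ = np.linalg.qr(X2 - X2.mean(axis=0))
    return float(np.linalg.svd(Q1.T @ Q2, compute_uv=False)[0])

# Hypothetical data: two blocks sharing one underlying signal.
rng = np.random.default_rng(7)
signal = rng.normal(size=250)
X1 = np.outer(signal, [1.0, 0.5, 0.2]) + rng.normal(size=(250, 3))
X2 = np.outer(signal, [0.8, 0.6]) + rng.normal(size=(250, 2))

r_pls, y1, y2 = two_block_mode_b(X1, X2)
r_cca = first_canonical_correlation(X1, X2)
print("PLS Mode B correlation:", r_pls)
print("first canonical correlation:", r_cca)
```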
For one- and two-block models the PLS estimation procedure converges almost certainly. For PLS models
with three or more blocks, convergence of the estimation procedure has never been a problem in applications to
real-world models and data.
The second stage of the PLS algorithm uses the LVs obtained in the first stage to estimate the parameters of the
structural relations of the model. This is performed by minimizing the residual variances in the block structure,
the inner relations, and the causal-predictive relations SELY with reestimation. Thus the PLS procedure is
partial LS in the sense that each step of the procedure minimizes a residual variance with respect to a subset of
the free parameters, given proxy or final estimates for the other parameters. In the limit the PLS estimation
procedure is coherent in the sense that all the residual variances are minimized jointly. However, the PLS
procedure remains partial in the sense that no total residual variance or other overall criterion is set up for
optimization.
As stated at the outset PLS modeling is primarily designed for causal predictive analysis of problems with high
complexity, but low information. Being distribution-free, PLS estimation imposes no restrictions on the format
of the data. The available data may be time series or cross sections. The observations on the indicators may be
quantitative measurements, ordinal ranks, or records of occurrence, nonoccurrence, or of low versus high levels
of the indicator.
Wold, H. On the consistency of least squares regression. Sankhya, Series A, 1963, 25(2), 211-215.
Wold, H. Nonlinear estimation by iterative least squares procedures. In F. N. David (Ed.), Research papers in
statistics, Festschrift for J. Neyman. New York: Wiley, 1966. Pp. 411-444.
Wold, H. Causal flows with latent variables: Partings of the ways in the light of NIPALS modeling. European
Economic Review. Special Issue in Honour of Ragnar Frisch, 1974, 5(1), 67-86.
Wold, H. Path models with one or two latent variables. In H. M. Blalock Jr., F. M. Borodkin, R. Boudon, & V.
Capecchi (Eds.), Quantitative sociology. New York: Seminar Press, 1975. Pp. 307-357. (a)
Wold, H. Soft modelling by latent variables: The non-linear iterative partial least squares (NIPALS) approach.
In J. Gani (Ed.), Perspectives in Probability and Statistics, Papers in Honour of M. S. Bartlett.
London: Academic Press, 1975. Pp. 117-142. (b)
Wold, H. From hard to soft modeling. In H. Wold (Ed.), Group report: Modeling in complex situations with
soft information. Toronto: Third world congress of econometrics, August 21-26, 1975. Research
Report 1975:4, University Institute of Statistics, Uppsala, 1975. Chapter 1. (c)
Wold, H. (Ed.) Group report: Modeling in complex situations with soft information. Toronto: Third world
congress of econometrics, August 21-26, 1975. Research Report 1975:4, University Institute of
Statistics, Uppsala, 1975. (d)
Wold, H. On the transition from pattern recognition to model building. In R. Henn and O. Moeschlin (Eds.),
Mathematical economics and game theory: Essays in honor of Oskar Morgenstern. Berlin: Springer,
1977. Pp. 536-549. (a)
Wold, H. Open path models with latent variables. In H. Albach, E. Helmstedter and R. Henn (Eds.),
Quantitative Wirtschaftsforschung. Wilhelm Krelle zum 60. Geburtstag. Tübingen: Mohr, 1977. (b)
Wold, H. Ways and means of interdisciplinary studies. In Transactions of the Sixth International Conference on
the Unity of the Sciences, San Francisco, 1977. New York: The International Cultural Foundation,
1978. Pp. 1071-1095. (a)
Wold, H. Factors influencing the outcome of economic sanctions: Tentative analysis by PLS (Partial Least
Squares) modeling. University of Haifa: International Workshop of Conflict Resolution, June 19-24,
1978. (b)
Wold, H. (Ed.). The fix-point approach to interdependent systems. Amsterdam: North-Holland Publ., 1980. (a)
Wold, H. Soft modeling: The basic design and some extensions. In K. G. Jöreskog & H. Wold (Eds.),
Proceedings of the conference "Systems under indirect observation. Causality, structure, prediction,"
October 18-20, 1979, Cartigny near Geneva. Amsterdam: North-Holland Publ., 1980, in press. (b)
Wold, S. Cross-validatory estimation of the number of components in factor and principal components models.
Technometrics, 1978, 20, 397-405.
Wold, S., et al. Four levels of pattern recognition. Analytica Chimica Acta, 1978, 103, 429-443.
Wold, S., et al. The indirect observation of molecular chemical systems. In K. G. Jöreskog & H. Wold (Eds.),
Proceedings of the conference "Systems under indirect observation. Causality, structure, prediction,"
October 18-20, 1979, Cartigny near Geneva. Amsterdam: North-Holland Publ., 1980, in press.
Partial Least Squares Regression Versus Other Methods
Smail Mahdi
Lovric_2010_InternationalEncyclopediaOfStatisticalScience.pdf
It is a multivariate technique that generalizes and combines ideas from principal component analysis (PCA) and ordinary least squares (OLS)
regression methods. It is designed to not only confront situations of correlated predictors but also relatively small samples and even the
situation where the number of dependent variables exceeds the number of cases. The original idea came in the work of the Swedish statistician
Herman Wold and became popular in computational chemistry and sensory evaluation.
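To make the combination of PCA-like components and least-squares fitting concrete, here is a minimal single-response PLS regression sketch using NIPALS-style deflation. It is an illustration under assumptions, not the encyclopedia entry's own algorithm: the function name `pls1_fit` and the simulated data are hypothetical, and the example uses more predictors than cases, where the OLS normal equations are singular.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS regression via successive deflation (sketch)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # component scores
        p = Xk.T @ t / (t @ t)        # X loadings (OLS of each column on t)
        c = yk @ t / (t @ t)          # OLS of y on the component
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients on the original predictors
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Hypothetical wide data set: 10 cases, 30 predictors.
rng = np.random.default_rng(0)
n, p = 10, 30
X_wide = rng.normal(size=(n, p))
y_wide = X_wide[:, 0] - X_wide[:, 1] + 0.1 * rng.normal(size=n)

coef, intercept = pls1_fit(X_wide, y_wide, n_components=2)
residual = y_wide - (X_wide @ coef + intercept)
print("training residual sum of squares:", residual @ residual)
```

With as many components as predictors and a full-rank predictor matrix, this procedure reproduces the OLS coefficients; with fewer components it acts as a regularized fit.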
Ciavolino & Al-Nasser (2009)
Fornell
Michael Haenlein, Andreas M. Kaplan
This results in a set of theoretical equations (Equation 3), representing nonobservational hypotheses and
theoretical definitions, and measurement equations (Equations 1 and 2), representing correspondence rules (Bagozzi &
Phillips, 1982). The theoretical equations are then also referred to as the structural model, whereas the measurement
equations build the measurement model, and both combined can be subsumed by the term structural equation model.
As Bagozzi (1984) emphasized, there are three different types of unobservable variables: (a) variables that are
unobservable in principle (e.g., theoretical terms); (b) variables that are unobservable in principle but either imply
empirical concepts or can be inferred from observations (e.g., attitudes, which might be reflected in evaluations); and (c)
unobservable variables that are defined in terms of observables. Because none of these types can be measured
directly, the researcher needs to measure indicators instead, which cover different facets of the unobservable
variable. In general, indicators can be split into two groups: (a) reflective indicators that depend on the construct and (b)
formative ones (also known as cause measures) that cause the formation of or changes in an unobservable variable
(Bollen & Lennox, 1991).
Whereas reflective indicators should have a high correlation (as they are all dependent on the same unobservable
variable), formative indicators of the same construct can have positive, negative, or zero correlation with one another
(Hulland, 1999), which means that a change in one indicator does not necessarily imply a similar directional change in
others (Chin, 1998a).
In contrast to covariance-based SEM, which estimates first model parameters and then case values (i.e., estimated values
for each latent variable in each data set) by regressing them onto the set of all indicators (Dijkstra, 1983), PLS starts by
calculating case values. For this purpose, the “unobservable variables are estimated as exact linear combinations of their
empirical indicators” (Fornell & Bookstein, 1982, p. 441), and PLS treats these estimated proxies as perfect substitutes for
the latent variables (Dijkstra, 1983). The weights used to determine these case values are estimated so that the resulting
case values capture most of the variance of the independent variables that is useful for predicting the dependent variable
(Garthwaite, 1994). This is based on the implicit assumption that all measured variance of the variables in the model is
useful variance that should be explained (Chin, Marcolin, & Newsted, 1996).
Fornell, Bookstein, 1982
Tenenhaus 2005
Vinzi, 2007
Predictor Specification is the only condition imposed in PLSPM to assure desirable
estimation properties in least squares modeling.
It avoids the assumptions that the observations are jointly ruled by a specified multivariate
distribution and are independently distributed (classical i.i.d. assumptions).
The stochastic concept of causality is expressed in terms of conditional expectations
(for any regression equation in the model):
yk = E(yk|yk’, xt) + ek
which implies:
- non-reversibility of the variables’ role;
- conditional expectation is the systematic part of the relationship;
- conditional expectation is a linear function of explanatory variables;
- expectation of residuals is 0 (centred);
- residuals and explanatory variables are uncorrelated;
- consistent estimates “at large”;
- minimum variance predictions.
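Several items in the list follow directly from writing the systematic part as a conditional expectation. A short derivation, using generic symbols chosen here for readability (y for the dependent variable, x for the explanatory variables, e for the residual, g for any competing predictor; these are not the source's notation):

```latex
\begin{align*}
e &= y - E(y \mid x) \\
E(e \mid x) &= E(y \mid x) - E(y \mid x) = 0 \\
E(e) &= E\big(E(e \mid x)\big) = 0 \\
\operatorname{cov}(x, e) &= E(x\,e) - E(x)\,E(e) = E\big(x\,E(e \mid x)\big) = 0 \\
E\big[(y - g(x))^2\big] &= E(e^2) + E\big[(E(y \mid x) - g(x))^2\big] \;\ge\; E(e^2)
\end{align*}
```

The third and fourth lines give centred residuals that are uncorrelated with the explanatory variables. The last line, where the cross term vanishes because E(e | x) = 0, shows that no other function of x predicts y with a smaller mean squared error.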
The residual covariance structure is not restricted
Differently from covariance structure models, where the residual covariance
matrix is minimized when reproducing the observed covariances by means of the
implied moments, PLS aims at minimizing:
– the trace (sum of diagonal elements: variances) of Ψ;
– and, with reflective specification, also of θε;
– and, with formative specification, also of θδ.
• Reflective (Effects) indicators:
– constructs give rise to observed variables (unique cause → unidimensional)
– typical of classical test theory and factor analysis models
– aim at accounting for observed variances or covariances
– Mode A estimation: LV is a principal component of its MV’s (minimizes outer residual
variances) under the constraint of being the best neighbor of its adjacent LV’s (minimizes
inner residual variances) → higher AVE (mean communality)
– OLS simple regressions → not affected by multicollinearity
• Formative (Causal) indicators:
– constructs are combinations of observed indicators (multidimensional)
– not designed to account for observed variables
– aim at minimizing residuals in structural relationships (explanation of unobserved
variance) → higher R²
– Mode B estimation: LV is the best predictand of its MV’s under the constraint of minimizing
the trace of the residual variances in the structural model → lower AVE
– OLS multiple regression → affected by multicollinearity
– In case of high multicollinearity: use loadings instead of weights; use PLS regression
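The contrast between the two modes can be illustrated on a single block. The sketch below assumes standardized indicators and a given inner estimate `z` of the LV; the block, with two nearly collinear indicators, is simulated and purely hypothetical.

```python
import numpy as np

def standardize(a):
    """Center to mean 0 and scale to unit (population) variance, column-wise."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def mode_a_weights(Z, z):
    """Mode A: one simple regression per indicator (indicator on inner estimate)."""
    return Z.T @ z / len(z)

def mode_b_weights(Z, z):
    """Mode B: one multiple regression of the inner estimate on the whole block."""
    return np.linalg.lstsq(Z, z, rcond=None)[0]

rng = np.random.default_rng(1)
n = 200
common = rng.normal(size=n)
# Hypothetical block: the first two indicators are nearly collinear.
x1 = common + 0.5 * rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
x3 = common + 0.5 * rng.normal(size=n)
Z = standardize(np.column_stack([x1, x2, x3]))
z = standardize(common + 0.5 * rng.normal(size=n))   # stand-in inner estimate

w_a = mode_a_weights(Z, z)
w_b = mode_b_weights(Z, z)
corr_a = np.corrcoef(Z @ w_a, z)[0, 1]
corr_b = np.corrcoef(Z @ w_b, z)[0, 1]
print("Mode A weights:", np.round(w_a, 3), "corr with z:", round(corr_a, 3))
print("Mode B weights:", np.round(w_b, 3), "corr with z:", round(corr_b, 3))
print("condition number of the block:", round(np.linalg.cond(Z), 1))
```

Mode A weights are just the indicator-LV correlations, so each one is computed without reference to the other indicators. Mode B always attains at least as high a correlation with `z`, but its weights on near-collinear indicators are typically unstable, which is the practical reason for the fallback named in the last bullet.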