Multivariate Statistical Methods for Engineering and Management
Factor Analysis
Laboratory Guide III - R & SPSS
This dataset refers to national track records for men, recorded in the 2005 World
Championships in Athletics (see Johnson and Wichern, 2007).
1. Make a preliminary analysis of the data and discuss what you have learned from
this analysis.
2. Analyze the data using factor analysis, considering the variables in their original scale.
3. How many factors would you extract from the data set? Explain the reasons
behind your decision.
4. How much of the information from the original set is accounted for by these
factors?
5. How would you interpret the common factors? Try several types of rotation and
see if it makes a difference in your interpretation of the data.
6. Repeat the analysis based on the standardized variables. Do the results change?
What analysis do you recommend?
7. Analyze the data using principal component analysis. Which method would you
select to analyze this dataset?
Variable description:
X1  100 meters (seconds)
X2  200 meters (seconds)
X3  400 meters (seconds)
X4  800 meters (minutes)
X5  1500 meters (minutes)
X6  5000 meters (minutes)
X7  10000 meters (minutes)
X8  Marathon (minutes)
To answer these questions using R, you may need the following R commands:
#
### National Track Records_Men
#
lx<-read.table("National Track Records_Men.txt",header=TRUE)
track<-lx[,2:9]
rownames(track)<-lx[,1]
colnames(track)<-colnames(lx[2:9])
#
### Exploratory analysis
#
library(car)
par(mfrow=c(3,3))
# qq.plot() was renamed qqPlot() in later versions of car
for (j in 1:8) {
qqPlot(track[,j],ylab=colnames(track)[j]);title(colnames(track)[j])
}
par(mfrow=c(1,1))
boxplot(track)
boxplot(scale(track,scale=FALSE))
round(cor(track),digits=3)
pairs(track)
apply(track,2,summary)
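Questions 2 and 6 contrast the original and the standardized scales. Because the sprints are timed in seconds and the longer races in minutes, the variances can differ by orders of magnitude, and an analysis based on the covariance matrix tends to be driven by the variable with the largest variance. The sketch below illustrates this check on simulated stand-in data: `demo`, `means` and `sds` are hypothetical, and the numbers are invented for illustration rather than taken from the track file. With the real data, `track` would take the place of `demo`.

```r
# Simulated stand-in for the track data (NOT the real records).
# means and sds are invented magnitudes: sprints in seconds, long races in minutes.
set.seed(1)
n <- 54
g <- rnorm(n)                          # hypothetical overall level of each country
means <- c(10.3, 20.8, 46.5, 1.8, 3.7, 13.6, 28.5, 135)
sds   <- c(0.3, 0.6, 1.5, 0.06, 0.15, 0.8, 1.8, 9)
demo <- sapply(1:8, function(j) means[j] + sds[j] * (0.9 * g + 0.45 * rnorm(n)))
colnames(demo) <- paste0("X", 1:8)

# Variances in the original scale and their share of the total variance
vars <- apply(demo, 2, var)
round(vars, 3)
round(vars / sum(vars), 3)

# After standardization every variable has variance 1
round(apply(scale(demo), 2, var), 3)
```

If one variable holds most of the total variance in the original scale, the first factor extracted from the covariance matrix will mostly reproduce that variable, which is one argument for working with the correlation matrix.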
#
### Elimination of "CookIslands" and "Samoa", since they appear to be outliers
#
o<-order(track[,8])
lx<-track[o,]
track<-lx[1:52,]
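The commands above keep the 52 rows with the fastest marathon records, which removes the two outliers only if they are the two slowest rows of a 54-row table. A possible alternative is to drop them by name. The sketch below assumes the row labels are spelled exactly "CookIslands" and "Samoa", as in the comment above; `demo`, `drop_rows`, `kept` and the other country labels are hypothetical, and the values are simulated.

```r
# Hypothetical stand-in: simulated values with country-like row names.
set.seed(2)
demo <- as.data.frame(matrix(rnorm(6 * 3), nrow = 6, ncol = 3))
rownames(demo) <- c("Portugal", "Kenya", "CookIslands", "Japan", "Samoa", "Brazil")

# Drop the rows by label instead of by position
drop_rows <- c("CookIslands", "Samoa")
kept <- demo[!(rownames(demo) %in% drop_rows), ]
rownames(kept)
```

Selecting by name does not depend on the sort order or on the number of rows in the file, but it fails silently if the labels are spelled differently, so it is worth checking `nrow(kept)` afterwards.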
#
### PCFA
#
e<-eigen(cor(track))
# Principal component loadings: sqrt(eigenvalue) * eigenvector, for all components
loadings<-e$vectors%*%diag(sqrt(e$values))
rownames(loadings)<-colnames(track)
round(loadings[,1:3],3)
m<-2
# Communalities of all eight variables for the m-factor solution
com<-rowSums(loadings[,1:m]^2)
round(com,3)
# Communalities and specific variances
round(cbind(com,diag(cor(track))-com),3)
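The eigenvalues of the correlation matrix also bear on questions 3 and 4. A minimal sketch, assuming the usual rules of thumb (eigenvalues above 1, scree plot, cumulative proportion of variance): the data are simulated from two hypothetical latent factors `f1` and `f2`, and `demo`, `ev` and `prop` are illustrative names. With the real data, `cor(track)` would take the place of `cor(demo)`.

```r
# Simulated stand-in data with two hypothetical latent factors (NOT the real records)
set.seed(3)
n <- 52
f1 <- rnorm(n); f2 <- rnorm(n)
demo <- cbind(
  sapply(1:4, function(j) 0.9 * f1 + 0.3 * f2 + rnorm(n, sd = 0.4)),
  sapply(1:4, function(j) 0.3 * f1 + 0.9 * f2 + rnorm(n, sd = 0.4))
)
colnames(demo) <- paste0("X", 1:8)

ev <- eigen(cor(demo))$values
prop <- ev / sum(ev)          # the eigenvalues of a correlation matrix sum to p = 8

# Eigenvalue, proportion and cumulative proportion of total variance
round(cbind(eigenvalue = ev, proportion = prop, cumulative = cumsum(prop)), 3)

# Number of eigenvalues above 1 (Kaiser rule)
sum(ev > 1)

# Scree plot with the reference line at 1
plot(ev, type = "b", xlab = "Component", ylab = "Eigenvalue")
abline(h = 1, lty = 2)
```

The cumulative column for the first m components is the share of the total standardized variance accounted for by an m-factor principal component solution.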
#
### Factor Analysis - Principal Axes Factor Analysis
#
library(psych)
### TWO factors, rotation=None, original data (covariance matrix)
lxPAcov<-fa(track,nfactors=2,rotate="none",scores="regression",covar=TRUE,fm="pa")
print(lxPAcov,cut=.0,digits=3)
# With covar=TRUE the solution is on the variance scale, so take the communalities from $communality
print(round(cbind(lxPAcov$loadings,lxPAcov$communality),digits=3))
### TWO factors, rotation=None, standardized data (correlation matrix)
# factor.pa() is deprecated in psych; fa(...,fm="pa") gives the principal axes solution
lxPA<-fa(scale(track),nfactors=2,rotate="none",scores="regression",fm="pa")
print(lxPA,cut=.0,digits=3)
print(round(cbind(lxPA$loadings,1-lxPA$uniquenesses),digits=3))
### TWO factors, rotation=Varimax
lxPA<-fa(track,nfactors=2,rotate="varimax",scores="regression",fm="pa")
print(lxPA,cut=.0,digits=3)
print(round(cbind(lxPA$loadings,1-lxPA$uniquenesses),digits=3))
plot(lxPA$scores,lwd=2)
abline(h=0,v=0,col="green",lwd=2)
#identify(lxPA$scores,labels=rownames(track),plot=TRUE,cex=0.8)
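Question 5 asks for several types of rotation. The sketch below compares an orthogonal rotation (varimax) with an oblique one (promax) using the functions in the stats package, applied to a two-factor principal component loading matrix. The data are simulated from two hypothetical latent factors, and `demo`, `L`, `vm` and `pm` are illustrative names, not objects from the script above.

```r
# Simulated stand-in data with two hypothetical latent factors (NOT the real records)
set.seed(3)
n <- 52
f1 <- rnorm(n); f2 <- rnorm(n)
demo <- cbind(
  sapply(1:4, function(j) 0.9 * f1 + 0.3 * f2 + rnorm(n, sd = 0.4)),
  sapply(1:4, function(j) 0.3 * f1 + 0.9 * f2 + rnorm(n, sd = 0.4))
)
colnames(demo) <- paste0("X", 1:8)

# Unrotated two-factor loadings from the principal component solution
e <- eigen(cor(demo))
L <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))
dimnames(L) <- list(colnames(demo), c("F1", "F2"))

vm <- varimax(L)   # orthogonal rotation
pm <- promax(L)    # oblique rotation

round(L, 3)
round(unclass(vm$loadings), 3)
round(unclass(pm$loadings), 3)

# An orthogonal rotation leaves the communalities unchanged
all.equal(rowSums(L^2), rowSums(unclass(vm$loadings)^2))
```

Rotation changes how the explained variance is split between the factors, not how much is explained in total, so the choice between rotations rests on which loading pattern is easier to interpret.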
#
### Factor Analysis - Maximum Likelihood
#
### One factor, no rotation
lxML<-factanal(scale(as.matrix(track)),factors = 1,
  scores = "regression", rotation = "none")
print(lxML,cutoff=0.0)
print(round(lxML$loadings,digits=3),cutoff=0.0)
print(round(cbind(lxML$loadings,1-lxML$uniquenesses),digits=3),cutoff=0.0)
### TWO factors, Rotation=None
lxMLn<-factanal(scale(track),factors = 2,
  scores = "regression", rotation = "none")
print(round(lxMLn$loadings,digits=3),cutoff=0.0)
print(round(cbind(lxMLn$loadings,1-lxMLn$uniquenesses),digits=3),cutoff=0.0)
### TWO factors, Rotation=Varimax
lxMLv<-factanal(scale(track),factors = 2,
  scores = "regression", rotation = "varimax")
print(round(lxMLv$loadings,digits=3),cutoff=0.0)
print(round(cbind(lxMLv$loadings,1-lxMLv$uniquenesses),digits=3),cutoff=0.0)
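Question 7 asks for a principal component analysis, which the commands above do not cover directly. A minimal sketch using `prcomp` from the stats package on standardized variables: the data are simulated from two hypothetical latent factors, and `demo` and `pc` are illustrative names. With the real data, `track` would take the place of `demo`.

```r
# Simulated stand-in data with two hypothetical latent factors (NOT the real records)
set.seed(3)
n <- 52
f1 <- rnorm(n); f2 <- rnorm(n)
demo <- cbind(
  sapply(1:4, function(j) 0.9 * f1 + 0.3 * f2 + rnorm(n, sd = 0.4)),
  sapply(1:4, function(j) 0.3 * f1 + 0.9 * f2 + rnorm(n, sd = 0.4))
)
colnames(demo) <- paste0("X", 1:8)

# PCA on standardized variables (equivalent to using the correlation matrix)
pc <- prcomp(demo, scale. = TRUE)
summary(pc)                        # standard deviations and proportions of variance
round(pc$rotation[, 1:2], 3)       # coefficients of the first two components

# The component variances are the eigenvalues of the correlation matrix
all.equal(pc$sdev^2, eigen(cor(demo))$values)

# Coefficients rescaled by the standard deviations, comparable to factor loadings
round(pc$rotation[, 1:2] %*% diag(pc$sdev[1:2]), 3)

# Scores on the first two components
plot(pc$x[, 1:2])
abline(h = 0, v = 0, col = "green")
```

The rescaled coefficients match the principal component factor loadings up to the sign of each column, which makes the comparison with the factor analysis solutions more direct.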
To answer these questions using SPSS, you may need the following SPSS commands:
FACTOR
/VARIABLES @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/MISSING LISTWISE
/ANALYSIS @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/PRINT EXTRACTION
/PLOT ROTATION
/CRITERIA FACTORS(2) ITERATE(25)
/EXTRACTION PC
/ROTATION NOROTATE
/METHOD=CORRELATION.
FACTOR
/VARIABLES @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/MISSING LISTWISE
/ANALYSIS @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/PRINT EXTRACTION ROTATION
/PLOT ROTATION
/CRITERIA FACTORS(2) ITERATE(25)
/EXTRACTION PC
/CRITERIA ITERATE(25)
/ROTATION VARIMAX
/METHOD=CORRELATION.
FACTOR
/VARIABLES @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/MISSING LISTWISE
/ANALYSIS @100ms @200ms @400ms @800mmin @1500mmin @5000mmin @10000mmin Marathonmin
/PRINT EXTRACTION ROTATION
/PLOT ROTATION
/CRITERIA FACTORS(2) ITERATE(25)
/EXTRACTION PAF
/CRITERIA ITERATE(25)
/ROTATION VARIMAX
/METHOD=CORRELATION.