We are supposed to apply factor analysis to the same data set to which we applied principal component analysis. To get a basic idea of the correlation structure of the data, we can reuse the correlation matrix from the 6th assignment; we also add a second version showing the pairwise correlations:
Now the factor analysis:
##
## Call:
## factanal(x = uscrime[, 2:9], factors = 3, rotation = "varimax")
##
## Uniquenesses:
## popu 1985    murder      rape   robbery   assault  burglary   larceny autotheft
##      0.52      0.19      0.31      0.20      0.09      0.15      0.14      0.41
##
## Loadings:
##           Factor1 Factor2 Factor3
## rape
## burglary  0.80
## larceny   0.91
## murder            0.89
## assault           0.87
## popu 1985                 0.64
## robbery                   0.79
## autotheft
##
## Factor1 Factor2 Factor3
## SS loadings 2.26 2.02 1.70
## Proportion Var 0.28 0.25 0.21
## Cumulative Var 0.28 0.54 0.75
##
## Test of the hypothesis that 3 factors are sufficient.
## The chi square statistic is 4.45 on 7 degrees of freedom.
## The p-value is 0.727
The p-value of 0.727 gives no reason to reject the hypothesis that 3 factors are sufficient. Let us try only 2 factors.
##
## Call:
## factanal(x = uscrime[, 2:9], factors = 2, rotation = "varimax")
##
## Uniquenesses:
## popu 1985    murder      rape   robbery   assault  burglary   larceny autotheft
##      0.74      0.24      0.34      0.50      0.06      0.09      0.28      0.47
##
## Loadings:
##           Factor1 Factor2
## rape
## robbery
## burglary  0.91
## larceny   0.84
## autotheft 0.72
## murder            0.87
## assault           0.92
## popu 1985
##
## Factor1 Factor2
## SS loadings 2.96 2.32
## Proportion Var 0.37 0.29
## Cumulative Var 0.37 0.66
##
## Test of the hypothesis that 2 factors are sufficient.
## The chi square statistic is 23.43 on 13 degrees of freedom.
## The p-value is 0.0368
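The two sufficiency tests above can be re-checked by hand. The following is a minimal sketch: it assumes the usual degrees-of-freedom formula for a k-factor model on p variables, ((p - k)^2 - (p + k)) / 2, and plugs in the rounded chi-square statistics from the printed output, so the p-values are only reproduced approximately. The helper name `fa_df` is introduced here for illustration only.

```r
# Degrees of freedom of the sufficiency test for a k-factor model on p variables
# (fa_df is a hypothetical helper, not part of the original analysis)
fa_df <- function(p, k) ((p - k)^2 - (p + k)) / 2

p <- 8                 # number of variables in uscrime[, 2:9]
df3 <- fa_df(p, 3)     # 3-factor model
df2 <- fa_df(p, 2)     # 2-factor model

# Upper-tail chi-square probabilities for the (rounded) statistics printed above
p3 <- pchisq(4.45,  df = df3, lower.tail = FALSE)
p2 <- pchisq(23.43, df = df2, lower.tail = FALSE)

print(c(df3 = df3, df2 = df2))
print(round(c(p3 = p3, p2 = p2), 3))
```

The degrees of freedom agree with the 7 and 13 reported by `factanal`, and the two p-values fall on opposite sides of 0.05.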
With a p-value of 0.0368, the hypothesis that 2 factors are sufficient is rejected at the usual 5% level, so two factors are not enough. For model building we would therefore use 3 factors. In the 3-factor solution, burglary and larceny correlate strongly with the 1st factor, murder and assault with the 2nd factor, and popu 1985 and robbery with the 3rd factor. For rape and autotheft the printed output shows no loadings, because small loadings are suppressed there, so we present the complete loadings in the following table:
Variable | Factor1 | Factor2 | Factor3 |
---|---|---|---|
popu 1985 | 0.1413397 | 0.2248102 | 0.6388286 |
murder | 0.0042920 | 0.8893298 | 0.1299604 |
rape | 0.5642399 | 0.5447061 | 0.2776778 |
robbery | 0.3214524 | 0.2704073 | 0.7898995 |
assault | 0.2402066 | 0.8664179 | 0.3155509 |
burglary | 0.7999613 | 0.2366180 | 0.3983232 |
larceny | 0.9115873 | 0.0562551 | 0.1510362 |
autotheft | 0.5420003 | 0.0332821 | 0.5457015 |
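As a consistency check, the uniquenesses and the "SS loadings" line of the 3-factor output can be recomputed from this table. The sketch below retypes the loadings by hand into a matrix (the name `L` is an assumption of this example) and uses the fact that, for standardised variables, the uniqueness is one minus the sum of squared loadings of a variable, while the SS loadings are the column sums of squared loadings.

```r
# Loadings of the 3-factor model, copied from the table above
L <- matrix(c(
  0.1413397, 0.2248102, 0.6388286,
  0.0042920, 0.8893298, 0.1299604,
  0.5642399, 0.5447061, 0.2776778,
  0.3214524, 0.2704073, 0.7898995,
  0.2402066, 0.8664179, 0.3155509,
  0.7999613, 0.2366180, 0.3983232,
  0.9115873, 0.0562551, 0.1510362,
  0.5420003, 0.0332821, 0.5457015
), ncol = 3, byrow = TRUE,
dimnames = list(
  c("popu 1985", "murder", "rape", "robbery",
    "assault", "burglary", "larceny", "autotheft"),
  c("Factor1", "Factor2", "Factor3")
))

communality <- rowSums(L^2)       # variance of each variable explained by the factors
uniqueness  <- 1 - communality    # remaining, variable-specific variance
ss_loadings <- colSums(L^2)       # "SS loadings" per factor
prop_var    <- ss_loadings / nrow(L)  # "Proportion Var" per factor

print(round(uniqueness, 2))
print(round(rbind(ss_loadings, prop_var, cum_var = cumsum(prop_var)), 2))
```

Up to rounding, these values should agree with the Uniquenesses and SS loadings lines printed by `factanal` for the 3-factor model.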
For rape and autotheft, no single loading is as strong as for the remaining variables: rape loads moderately on Factor1 (0.56) and Factor2 (0.54), and autotheft on Factor1 (0.54) and Factor3 (0.55). Depending on what we want to model, we should therefore include them in the model on their own.
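The same reading can be made mechanical. The sketch below finds, for each variable, its largest absolute loading and the factor it belongs to, and flags the variables whose largest loading stays below a threshold. The threshold of 0.6 is an assumption of this example, chosen because it separates the loadings that were printed from those that were not; the actual cutoff used for the printed output is not stated. The matrix `L` is again retyped from the table.

```r
# Loadings of the 3-factor model, copied from the table above
L <- matrix(c(
  0.1413397, 0.2248102, 0.6388286,
  0.0042920, 0.8893298, 0.1299604,
  0.5642399, 0.5447061, 0.2776778,
  0.3214524, 0.2704073, 0.7898995,
  0.2402066, 0.8664179, 0.3155509,
  0.7999613, 0.2366180, 0.3983232,
  0.9115873, 0.0562551, 0.1510362,
  0.5420003, 0.0332821, 0.5457015
), ncol = 3, byrow = TRUE,
dimnames = list(
  c("popu 1985", "murder", "rape", "robbery",
    "assault", "burglary", "larceny", "autotheft"),
  c("Factor1", "Factor2", "Factor3")
))

threshold <- 0.6  # assumed cutoff, not taken from the original analysis

strongest <- apply(abs(L), 1, max)                       # largest |loading| per variable
dominant  <- colnames(L)[apply(abs(L), 1, which.max)]    # factor carrying that loading
weak      <- names(strongest)[strongest < threshold]     # no loading reaches the threshold

print(data.frame(dominant, strongest = round(strongest, 2)))
print(weak)
```

Under this assumed threshold only rape and autotheft are flagged, which matches the two blank rows in the printed loadings.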
The correlation of the variables with each factor can also be visualised as follows: