17.7.1.2 Interpreting Results of Principal Component Analysis

Principal Component Analysis Report Sheet

Descriptive Statistics

The descriptive statistics table indicates whether any variables have missing values and shows how many cases are actually used in the analysis.

If only a few values are missing and they all fall in a single variable, it often makes sense to delete each affected row of data entirely. This is known as listwise exclusion. If values are missing in two or more variables, it is typically best to employ pairwise exclusion, which computes each correlation from all cases that have values for both variables in the pair.
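The difference between the two exclusion modes can be illustrated with a minimal sketch. The data, the helper names (`listwise`, `pairwise`, `pearson`), and the use of `None` as the missing-value marker are all assumptions made for illustration; they are not part of the report sheet.

```python
from math import sqrt

# Hypothetical data: rows are cases, None marks a missing value.
rows = [
    (1.0, 2.0, 3.0),
    (2.0, None, 5.0),
    (3.0, 6.0, None),
    (4.0, 8.0, 9.0),
    (5.0, 10.0, 12.0),
]

def listwise(rows):
    """Keep only the cases with no missing value in any variable."""
    return [r for r in rows if None not in r]

def pairwise(rows, i, j):
    """Keep the cases where both variable i and variable j are present."""
    return [(r[i], r[j]) for r in rows if r[i] is not None and r[j] is not None]

def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)

n_listwise = len(listwise(rows))            # cases shared by every variable
n_pairwise_01 = len(pairwise(rows, 0, 1))   # cases usable for variables 0 and 1
n_pairwise_02 = len(pairwise(rows, 0, 2))   # cases usable for variables 0 and 2
r01 = pearson(pairwise(rows, 0, 1))         # correlation from the pairwise cases
```

In this made-up data set, listwise exclusion discards two of the five cases, while each pairwise correlation discards only one.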

Inspecting the means and standard deviations (SDs) can reveal differences in location and spread between the variables. Take note when the means and SDs differ greatly, as this may indicate that the variables are measured on different scales. In that case, use the correlation matrix for the analysis.
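A small sketch shows why the scale of measurement matters. The two variables and their units are hypothetical, and `zscores` and `covariance` are illustrative helpers. When one variable has a much larger SD, it carries almost all of the total variance in the covariance matrix, whereas standardizing each variable first (which is what analyzing the correlation matrix amounts to) puts them on an equal footing.

```python
from statistics import mean, stdev

# Hypothetical measurements: height in metres, weight in grams.
height_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weight_g = [55000.0, 64000.0, 70000.0, 79000.0, 90000.0]

def zscores(values):
    """Standardize to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def covariance(x, y):
    """Sample covariance (n - 1 denominator)."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

# A very large SD ratio signals that the scales differ.
sd_ratio = stdev(weight_g) / stdev(height_m)

# Share of the total variance carried by weight in the covariance matrix.
var_h = covariance(height_m, height_m)
var_w = covariance(weight_g, weight_g)
weight_share = var_w / (var_h + var_w)

# The covariance of the standardized variables is the correlation coefficient.
r = covariance(zscores(height_m), zscores(weight_g))
```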

Correlation Matrix

This table shows the pairwise correlations between the variables. PCA aims to produce a small set of uncorrelated principal components from a larger set of related original variables. In general, correlations that are larger in absolute value are more useful, and you should consider excluding from the analysis any variable that correlates only weakly with all the others.
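One way to screen a correlation matrix for such variables is sketched below. The data, the helper names, and in particular the cutoff of 0.3 are assumptions chosen for illustration, not values prescribed by the analysis.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Nested dict: corr[a][b] is the correlation of variables a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

def weakly_related(corr, threshold=0.3):
    """Variables whose largest absolute correlation with any other
    variable is below the (assumed) threshold."""
    weak = []
    for a, row in corr.items():
        strongest = max(abs(r) for b, r in row.items() if b != a)
        if strongest < threshold:
            weak.append(a)
    return weak

# Hypothetical data: x1 and x2 move together, x3 is unrelated to both.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.1, 3.9, 6.2, 8.1, 9.8, 12.1],
    "x3": [1.0, -1.0, 0.0, 0.0, -1.0, 1.0],
}

corr = correlation_matrix(columns)
weak = weakly_related(corr)   # candidates to consider excluding
```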

Eigenvalues of the Correlation/Covariance Matrix

Eigenvalue: The eigenvalues of the correlation/covariance matrix. They partition the total variation among the principal components, each eigenvalue being the variance accounted for by its component.
Proportion: The proportion of the total variance explained by each component.
Cumulative: The cumulative proportion of the variance accounted for by the current and all preceding principal components. If the first i components together retain over 90% of the original variance, it is usually recommended to retain i components.
Note: If Covariance Matrix is selected from the Analyze radio box in the dialog, the results of Bartlett's Test, which tests whether the eigenvalues along each principal component are equal, are shown in three additional columns of the table.
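The Proportion and Cumulative columns follow directly from the eigenvalues, as the sketch below illustrates. The eigenvalues are made up, and `components_needed` is a hypothetical helper that applies the 90% guideline mentioned above.

```python
from itertools import accumulate

# Hypothetical eigenvalues, in descending order.
eigenvalues = [2.5, 1.0, 0.3, 0.2]

total = sum(eigenvalues)
proportion = [ev / total for ev in eigenvalues]   # share of variance per component
cumulative = list(accumulate(proportion))         # running total of the shares

def components_needed(cumulative, target=0.90):
    """Smallest number of leading components whose cumulative
    proportion reaches the target."""
    for i, c in enumerate(cumulative, start=1):
        if c >= target:
            return i
    return len(cumulative)

n_components = components_needed(cumulative)
```

Here the first two components explain 87.5% of the variance, so a third is needed to pass the 90% mark.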

Extracted Eigenvectors

The principal components are defined as linear combinations of the original variables X_1, ..., X_m. The Extracted Eigenvectors table provides the coefficients for the equation below.

Y_k = C_{k1}X_1 + C_{k2}X_2 + ... + C_{km}X_m (1)


where

  • Y_k is the k-th principal component
  • C_{k1}, ..., C_{km} are the coefficients given in the table for the k-th component
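Equation (1) can be applied to a single observation as in the following sketch. The coefficient values are hypothetical, and storing one row per component in `C` is an assumption made for readability; the table in the report may be laid out differently. The observation is also assumed to be already centred (and scaled, if the correlation matrix was analyzed).

```python
# Hypothetical eigenvector coefficients: one row per component k,
# one entry per original variable X_1..X_m (here m = 2).
C = [
    [0.6, 0.8],    # C_11, C_12: coefficients of the first component
    [-0.8, 0.6],   # C_21, C_22: coefficients of the second component
]

def component_scores(x, coefficients):
    """Apply Y_k = C_k1*X_1 + ... + C_km*X_m for every component k."""
    return [sum(c * v for c, v in zip(row, x)) for row in coefficients]

# One observation, assumed to be centred/standardized beforehand.
x = [1.0, 2.0]
y = component_scores(x, C)   # [Y_1, Y_2] for this observation
```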

Scree Plot

The scree plot is a useful visual aid for determining an appropriate number of principal components. It graphs each eigenvalue against its component number. To determine the appropriate number of components, look for an "elbow" in the plot: the number of components is taken to be the point beyond which the remaining eigenvalues are relatively small and all about the same size.
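Reading the elbow is a visual judgment, but the description above can be turned into a rough numerical check. The sketch below is only one possible heuristic: the function name, the tolerance of 10% of the largest eigenvalue, and the choice to count only the components before the flat tail are all assumptions, and the eigenvalues are made up.

```python
def flat_tail_start(eigenvalues, tolerance=0.1):
    """Return how many leading components come before the flat tail.

    The tail counts as "flat" when the spread of the remaining
    eigenvalues (largest minus smallest) is at most
    tolerance * the largest eigenvalue overall.
    """
    limit = tolerance * eigenvalues[0]
    for k in range(len(eigenvalues)):
        tail = eigenvalues[k:]
        if max(tail) - min(tail) <= limit:
            return k
    return len(eigenvalues)  # not reached for a non-empty list

# Hypothetical eigenvalues, in descending order.
eigenvalues = [3.0, 1.2, 0.3, 0.25, 0.25]
n_components = flat_tail_start(eigenvalues)
```

With these values the last three eigenvalues are small and nearly equal, so the heuristic suggests keeping the first two components.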

Loading Plot

The loading plot shows the relationship between the original variables and the subspace dimensions. It is used for interpreting relationships among variables.

Scores Plot

The scores plot is a projection of the data onto the principal component subspace. It is used for interpreting relationships among observations.

BiPlot

The biplot shows both the loadings and the scores for two selected components in a single plot.

Score Data

The worksheet provides the principal component scores for each observation.