A Robust Principal Component Biplot using ROBPCA



Principal component analysis, Robust, Biplot


The principal component biplot (PCA biplot) is a graphical tool for simultaneously visualising the scores and loadings of the principal components obtained from classical principal component analysis. The plot is widely used in areas such as plant breeding, genetics, manufacturing and agriculture. Unfortunately, least-squares principal component analysis is not robust to outliers in the data set, and hence neither is the principal component biplot. Extreme observations may unduly influence the first few principal components and distort the actual structure of the biplot. Consequently, inference based on this plot will be misleading when the data contain outliers. This paper introduces a robust principal component biplot based on the ROBPCA method proposed by Hubert, Rousseeuw and Vanden Branden (2005). In the resulting biplot, the length of the vector representing a variable is approximately proportional to its robust standard deviation, while the cosine of the angle between two variable vectors is approximately equal to their robust correlation.
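
The length and angle properties of the variable vectors can be illustrated with a minimal Python sketch, assuming NumPy is available. It is not an implementation of ROBPCA: the matrix `S` is a made-up stand-in for a robust scatter estimate, and the helper `variable_markers` is a hypothetical name introduced only for this illustration. The sketch builds the variable markers from the eigendecomposition of `S` and checks that, when all components are kept, vector lengths equal the standard deviations and cosines equal the correlations implied by `S`. With only two components retained, as in a plotted biplot, these relations hold approximately.

```python
import numpy as np

def variable_markers(cov, k=2):
    """Return the p x k variable markers G = V_k * sqrt(Lambda_k).

    `cov` is any symmetric scatter matrix; in a robust biplot it would
    be a robust estimate (assumption: supplied by the caller).
    """
    cov = np.asarray(cov, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]        # indices of the k largest
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)      # scale each eigenvector column

# Hypothetical 3 x 3 scatter matrix (illustrative values, not from the paper)
S = np.array([[4.0, 2.4, 0.6],
              [2.4, 9.0, 1.5],
              [0.6, 1.5, 1.0]])

# Full rank (k = 3): G @ G.T reproduces S up to rounding
G = variable_markers(S, k=3)
lengths = np.linalg.norm(G, axis=1)                   # vector lengths
cosines = (G @ G.T) / np.outer(lengths, lengths)      # cosines of angles

std_dev = np.sqrt(np.diag(S))
corr = S / np.outer(std_dev, std_dev)

# Rank 2, as drawn in a planar biplot: lengths are approximations from below
G2 = variable_markers(S, k=2)
lengths2 = np.linalg.norm(G2, axis=1)

print("lengths (k=3):", np.round(lengths, 4))
print("lengths (k=2):", np.round(lengths2, 4))
print("cosines (k=3):\n", np.round(cosines, 4))
```

The planar approximation is better the larger the share of total variability carried by the first two components, which is why the abstract states the two properties as approximate.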