Use of principal components analysis (PCA) on estuarine sediment datasets: the effect of data pre-treatment

Environ Pollut. 2009 Aug-Sep;157(8-9):2275-81. doi: 10.1016/j.envpol.2009.03.033. Epub 2009 May 1.

Abstract

Principal components analysis (PCA) is a multivariate statistical technique capable of discerning patterns in large environmental datasets. Although PCA is widely used, the literature is inconsistent about how data should be pre-treated before the analysis. This research examines the influence of commonly reported data pre-treatment methods on PCA outputs, and hence on data interpretation, using a typical environmental dataset comprising sediment geochemical data from an estuary in SE England. The study demonstrated that applying the routinely used log (x + 1) transformation skewed the data and masked important trends. Removing outlying samples and correcting for the influence of grain size had the most significant effect on PCA outputs and data interpretation. Reducing the influence of grain size using granulometric normalisation meant that other factors affecting metal variability, including mineralogy, anthropogenic sources and distance along the salinity transect, could be identified and interpreted more clearly.
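The comparison described above can be illustrated with a minimal sketch. It is not the authors' workflow: the data are synthetic, every function and variable name is hypothetical, the logarithm base (10) is an assumption because the abstract does not state one, and granulometric normalisation is shown in one simple form (dividing each concentration by the sample's fine-grained fraction), which may differ from the method used in the paper. PCA is run on the correlation matrix via a singular value decomposition of the standardised data.

```python
import numpy as np


def pca_explained_variance(X):
    """Explained-variance ratios from a PCA of the column-standardised matrix X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values of Z are proportional to the component variances.
    s = np.linalg.svd(Z, compute_uv=False)
    var = s ** 2 / (Z.shape[0] - 1)
    return var / var.sum()


def log1p_transform(X):
    """log(x + 1) transformation; base 10 is an assumption."""
    return np.log10(X + 1.0)


def granulometric_normalise(X, fines_fraction):
    """Divide each sample's concentrations by its fine-grained fraction (assumed form)."""
    return X / fines_fraction[:, None]


# Synthetic data only: 60 hypothetical samples, 3 hypothetical metals whose
# concentrations scale with the fine fraction plus multiplicative noise.
rng = np.random.default_rng(0)
n_samples = 60
fines = rng.uniform(0.2, 0.9, size=n_samples)
base = np.array([50.0, 20.0, 120.0])
metals = fines[:, None] * base * rng.lognormal(0.0, 0.2, size=(n_samples, 3))

treatments = {
    "raw": metals,
    "log(x + 1)": log1p_transform(metals),
    "grain-size normalised": granulometric_normalise(metals, fines),
}
for label, data in treatments.items():
    print(label, np.round(pca_explained_variance(data), 3))
```

Comparing the printed ratios across treatments shows how much of the shared variance in this toy dataset is attributable to the grain-size term; real sediment data would also need the outlier screening mentioned in the abstract, which is omitted here.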

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • England
  • Environmental Monitoring / methods*
  • Fresh Water / chemistry
  • Geologic Sediments / chemistry*
  • Principal Component Analysis*
  • Seawater / chemistry
  • Water Pollutants, Chemical / analysis
  • Water Pollution, Chemical / statistics & numerical data*

Substances

  • Water Pollutants, Chemical