Cluster analysis and principal components analysis (PCA) are exploratory tools that look for structure within a data set (Baxter 1994, 2001); in the present context they can identify chemical groupings within the data. Those groupings can then be examined for their potential significance in terms of the origin of the raw material, the technology of production or some other factor. At a practical level, multivariate techniques should not normally be applied to small data sets, that is, those with fewer than 20 samples. Of the many methods of cluster analysis, the average linkage (or between-groups linkage) method is the most commonly used. Cluster analysis is often carried out in conjunction with PCA; the principal clusters in the dendrogram can then be superimposed on the plot of the first and second (or other) principal components. The importance of PCA lies in the way it quantifies the proportion of the overall variance in composition that each principal component accounts for and, furthermore, identifies which combination of elements dominates each principal component.
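As an illustration of how the two techniques can be combined, the following is a minimal sketch in Python, not the procedure used for the figures discussed here. The compositions are synthetic and the element list, group means and the choice to standardise each element before analysis are all assumptions made for the example. It clusters the samples by average linkage, then reports the proportion of variance carried by each principal component and the elements with the largest loadings.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)

# Hypothetical data: 30 samples x 5 elements drawn from two synthetic groups.
elements = ["Ca", "Cr", "Y", "Fe", "Al"]
sd = [0.3, 5.0, 1.0, 0.5, 1.0]
group_a = rng.normal(loc=[5.0, 100.0, 20.0, 6.0, 15.0], scale=sd, size=(15, 5))
group_b = rng.normal(loc=[9.0, 140.0, 30.0, 6.0, 15.0], scale=sd, size=(15, 5))
X = np.vstack([group_a, group_b])

# Standardise each element (assumption: equal weight for every element).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Average (between-groups) linkage on Euclidean distances, cut into at most 2 clusters.
tree = linkage(Z, method="average", metric="euclidean")
clusters = fcluster(tree, t=2, criterion="maxclust")

# PCA via singular value decomposition of the standardised matrix.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * s                       # sample scores on each component
explained = s**2 / np.sum(s**2)      # proportion of total variance per component
loadings = Vt                        # row i holds the element loadings of PC(i+1)

for i in range(2):
    order = np.argsort(-np.abs(loadings[i]))[:3]
    top = ", ".join(elements[j] for j in order)
    print(f"PC{i + 1}: {explained[i]:.1%} of variance; largest loadings: {top}")

# Cluster labels can be superimposed on the PC1/PC2 scores, e.g. as plot colours.
for label in np.unique(clusters):
    members = scores[clusters == label, :2]
    print(f"cluster {label}: {len(members)} samples, mean PC1/PC2 = {members.mean(axis=0)}")
```

The `clusters` array plays the role of the principal clusters read off the dendrogram, and `scores[:, 0]` and `scores[:, 1]` are the coordinates that would be plotted as PC1 against PC2.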
Fig. A is a typical PC plot, showing the classification of ICP-ES compositions of ceramic material from site CN. Two groupings, A and B, separate along PC2, which is dominated by the Ca, Cr and Y contents. The plot also reveals some outliers (CN690, 1270, 1280, 2012 and 1172), most of which have high scores on PC1. The variance accounted for by each of the two PCs, and the main elements loading on each, are indicated. If necessary, PCA could be repeated with those outliers excluded.
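Repeating PCA without the outliers can be sketched as follows. This is only an illustration under stated assumptions: the data and sample identifiers are synthetic, the helper `pca` is hypothetical, and the rule used to flag outliers (a robust z-score on PC1 above 3.5) is an arbitrary choice for the example. In practice outliers are usually identified by inspecting the plot, as above.

```python
import numpy as np

def pca(X):
    """Hypothetical helper: PCA on standardised data via SVD.

    Returns sample scores, proportion of variance per component, and loadings.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return U * s, s**2 / np.sum(s**2), Vt

rng = np.random.default_rng(42)

# Synthetic data: 43 samples x 4 elements; the last 3 are made deliberately extreme.
ids = np.array([f"S{i:02d}" for i in range(1, 44)])   # hypothetical sample IDs
X = rng.normal(loc=[5.0, 100.0, 20.0, 6.0], scale=[0.5, 8.0, 2.0, 0.5], size=(43, 4))
X[-3:, :2] += [5.0, 80.0]

# First pass: PCA on all samples.
scores, explained, _ = pca(X)

# Flag samples whose PC1 score lies far from the median (assumed rule, see lead-in).
pc1 = scores[:, 0]
mad = np.median(np.abs(pc1 - np.median(pc1)))
robust_z = np.abs(pc1 - np.median(pc1)) / (1.4826 * mad)
is_outlier = robust_z > 3.5
print("flagged as outliers:", ", ".join(ids[is_outlier]))

# Second pass: repeat PCA with the flagged samples excluded.
scores_kept, explained_kept, _ = pca(X[~is_outlier])
print("PC1 share of variance, all samples:     ", round(float(explained[0]), 3))
print("PC1 share of variance, outliers removed:", round(float(explained_kept[0]), 3))
```

Because the data are standardised again after removal, the second set of components describes the structure among the remaining samples rather than the contrast between them and the outliers.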
Turning to discriminant analysis, Fig. B shows four chemically defined groups, each in this case representing a likely production site of medieval White Gritty ware. Ceres, Colstoun, Kelso Abbey and Elgin are very well differentiated along discriminant function 1, but Colstoun can only be differentiated from Kelso Abbey with respect to discriminant function 2. If we have composition data for pottery samples from another findspot and have reason to believe that the pottery could have come from one or other of these four centres, discriminant analysis can assign each sample, on the basis of its composition, to one of those centres with a given probability. However, its success depends critically on how confident we can be that those four centres are indeed the candidate centres of production.
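The assignment step can be illustrated with a short sketch using linear discriminant analysis from scikit-learn. Everything specific here is an assumption made for the example: the four reference groups are synthetic and carry placeholder names rather than the sites above, only three elements are used, and linear (rather than some other form of) discriminant analysis is chosen for simplicity.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)

# Hypothetical reference groups: 4 candidate centres, 20 samples each, 3 elements.
centres = ["centre_A", "centre_B", "centre_C", "centre_D"]
means = np.array([
    [2.0, 60.0, 15.0],
    [4.0, 60.0, 25.0],
    [6.0, 90.0, 15.0],
    [8.0, 90.0, 25.0],
])
sd = np.array([0.2, 3.0, 1.0])
X_ref = np.vstack([rng.normal(m, sd, size=(20, 3)) for m in means])
y_ref = np.repeat(centres, 20)

# Fit the model and project the reference samples onto the first two
# discriminant functions (the axes of a plot such as Fig. B).
lda = LinearDiscriminantAnalysis(n_components=2)
df_scores = lda.fit(X_ref, y_ref).transform(X_ref)

# Hypothetical samples from another findspot, placed close to each centre's mean.
X_new = means + 0.5 * sd

# Assign each new sample to a centre, with posterior probabilities per centre.
assigned = lda.predict(X_new)
probs = lda.predict_proba(X_new)
for label, p in zip(assigned, probs):
    detail = ", ".join(f"{c}={v:.3f}" for c, v in zip(lda.classes_, p))
    print(f"assigned to {label} ({detail})")
```

Note that the probabilities for each sample sum to 1 across the candidate centres only. A sample made at a centre that is not represented in the reference groups will still be assigned to one of the four, which is why confidence in the list of candidate centres matters so much.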