Rivista Italiana di Paleontologia e Stratigrafia, volume 102, no. 3, pp. 367-374, December 1996

COMPARISON BETWEEN CLADISTIC AND PHENETIC METHODS IN LARGER FORAMINIFERA ANALYSIS

FABRIZIO MAIA

Key-words: Miogypsinidae, Phenetic analysis, Cladistic analysis.

Riassunto (Summary). This work deals with the numerical analysis of the Miogypsinidae, larger Foraminifera occurring in the terrigenous and bioclastic facies that alternate with pelitic facies in the Turin Hill and in the Monferrato, in the Upper Oligocene - Lower Miocene interval. In agreement with the literature, the study of the Miogypsinidae is based on the evaluation of the biometric parameters of the embryonic and nepionic apparatus. The data obtained from the measurements are treated statistically and then compared with the data for the taxonomic units recognized in the literature. Some of the specific attributions obtained in this way appear unreliable: this means that, in several cases, traditional statistical processing proves ineffective. In order to overcome this problem and to find new, non-subjective methods for assigning new populations to their species, some phenetic and cladistic techniques are examined. The advantages and difficulties of applying principal component analysis, cluster analysis, discriminant analysis and phylogenetic analysis using parsimony are therefore compared. Discriminant analysis seems to provide the most interesting results.

Abstract. The analysis is focused on Miogypsinidae, larger Foraminifera characterizing the terrigenous and bioclastic facies that alternate with pelitic facies in the Turin Hill and in the Monferrato area (NW Italy) from the Late Oligocene to the Early Miocene. According to the literature, the study of Miogypsinidae is based on the biometry of characters of the embryonic and nepionic apparatus. Measurements are processed statistically and compared with the values for the taxonomic units recognized in the literature. Some of the specific determinations obtained in this way are ambiguous: this means that, in many cases, the traditional statistical treatment is ineffective. To overcome this problem and to find a new method for assigning new populations to their species in a non-subjective way, an attempt is made to use both phenetic and cladistic systems. This work compares the advantages and the difficulties of using techniques such as principal components analysis, cluster analysis, discriminant analysis and phylogenetic analysis using parsimony. Discriminant analysis seems to provide the best results.

Introduction.

Miogypsinidae are benthic, polythalamous, non-sessile larger Foraminifera belonging to the superfamily Orbitoidacea. They originate in the Late Oligocene and disappear in the Burdigalian. Each specimen is sectioned and studied in the equatorial plane, then the data for the specimens of a sample are pooled to obtain an average for the whole population, following Drooger (1952) and subsequent papers.

The attribution of a population to a species is based on the biometry of the embryonic apparatus, using parameters of the juvenarium (protoconch and deuteroconch) and of the nepionic protoconchal spirals (Fig. 1). These parameters are: the length of the main spiral (X); the angle γ between the medio-embryonic line and the frontal-apical line; the symmetry of the nepiont (V); the protoconchal diameter (D1); the deuteroconchal diameter (D2); the ratio D2/D1; the distance between the center of the protoconch and the apical margin (ε); and the ratio ε/D1.

[Fig. 1, label in the drawing: V = 200α/β]

Fig. 1 - Schematic drawing showing the measures of the internal features in embryonic-nepionic stages of Miogypsina.
Meaning of the symbols: FA: frontal-apical line; ME: medio-embryonic line; α: angle made by the shortest spiral around the protoconch; β: angle made by both spirals around the protoconch; γ: angle between ME and FA lines; D1: diameter of the protoconch; ε: distance between the center of the protoconch and the apical margin; X: number of chambers of the principal spiral around the protoconch; V: degree of symmetry of the nepiont.

Dipartimento di Scienze della Terra - Via Accademia delle Scienze, 5 - 10123 Torino

Species                    Samples
M. gunteri                 Superga 1; Civera 7 bis; Monferrato MC18; Monferrato MC12; Monferrato MC2
M. gunteri-tani            Superga 33 bis; Baldissero 28; Baldissero 34; Monferrato MU1; Monferrato PC2E; Monferrato PC2C
M. tani                    Monferrato RO; Monferrato VM; Monferrato SM3; Monferrato SM2; Monferrato SM1; Monferrato SB1; Monferrato FB2; Monferrato FA2; Monferrato FA1
M. globulina               Superga 93; Civera 10; Civera 13; Civera 16; Monferrato CA1b
M. globulina-intermedia    Superga 84; Civera 29; Civera 36; Civera 961; Baldissero 68; Monferrato M10; Monferrato MB
M. socini                  Superga 43; Bric Palouch 3a; Rivodora 3F; Rivodora 4; Monferrato VDMU
M. socini-burdigalensis    Superga 52; Baldissero 48
M. burdigalensis           Baldissero 41b; Vergnana 1; Bric Palouch 3b
M. negrii                  Superga 72; Monferrato CA1a; Monferrato RS

X: 10.44 10.43 9.40 9.90 9.23 9.16 9.63 9.32 8.52 9.18 8.50 ó.oY 8.07 8.50 7.00 7.97 7.91 8.40 6.95 6.17 6.42 o.ou 5.88 5.40 5.78 a.47 8.92 ó.oJ 8.43 10.00 9.21 7.09 5.16
γ: -101.06 -109.00 -49.00 -78.00 43.00 -46.79 43.34 -49.16 -24.00 -27.00 -34.00 -19.00 { 8.00 -13.00 -20.00 17.00 -30.00 -15.00 44.00 38.00 16.48 27.00 19.17 15.30 18.45 29.69 34.50 30.50 25.50 27.80 35.71 -10.10 42.90 -17.63 -38.00 -63.00 -52.20
V: 4.83 4.69 1.E9 4.76 7.22 5.04 6.90 3-70 8.60 8.40 f1.00 12.10 6.50 5.60 10.60 35.47 27.22 24.51 23.79 24.20 44.13 48.19 40.28 52.14 44.43 42.50 40.60 19.87 14.43 18.23 17.30 28.18 13.64 30.99 29.83 38.61 68.10 75.90 55.30
D1: 182 151 155 160 181 189 172 155 161 177 178 163 190 171 | /o 168 163 137 150 182 toz 141 164 185 189 184 172 168 172 141 157 too 174 188 183
ε: 312 zc9 255 254 257 308 241 251 267 253 257 269 268 261 261 270 351 213 232 241 262 273 250 292 311 312 309 312 309 320 301
ε/D1: 1.74 1.75 1.67 1.70 1.60 1.64 1.61 1.81 1.48 1.57 1.58 1.50 1.57 1.52 1.62 1.50 1.56 1.68 2.10 1.58 1.70 1.55 1.49 1.71 1.75 lao I oo 1.70 1.58 1.68 1.72 1.81 1.88 2.09 1.71 1.91 2.06 1.70

During the evolution of this group, the changes in these parameters make it possible to reconstruct a hypothetical phylogenesis.
Up to now, the determination of a population of Miogypsinidae has been based on the comparison of its parameters with the values for the taxonomic units recognised in the literature. Student's t test was used to compare the specific determinations thus obtained with all the populations assigned to these forms by other workers, and hence to confirm our species attribution.

The aim of this work is to find more reliable identification methods for assigning a new population to a predetermined taxon, as objectively as possible and taking advantage of all the measurements collected. Such methods have been identified in the realm of phenetic analysis (Barrai, 1984; Camussi et al., 1986; Dunn & Everitt, 1982; Elliott, 1977) and cladistic analysis (Forey et al., 1992).

The input data for the analyses are the mean values of the populations studied in Piedmont (Tab. 1) and those of the populations reported in the literature (Tab. 2). For each analysis we used one data set or the other, depending on the measurements available. The species considered are those from M. gunteri to M. globulina-intermedia along the main branch of the phyletic tree, and those from M. socini to M. negrii along the lateral branch.

Tab. 1 - Mean values of counts and measurements on internal characteristics of Miogypsina populations studied in Piedmont.

We have chosen to use the mean values rather than individual values, because the objects of our study are the populations rather than the individuals that compose them. The mean values were standardized before the phenetic analyses: standardization consists in subtracting the mean of each variable from each value and dividing the difference by the standard deviation.

A real incentive to undertake this work has come from the development of computer packages suitable for many types of numerical analysis.
Among them, we have selected NTSYS-pc (phenetic analysis), STATGRAPHICS (statistics) and PAUP (cladistic analysis).

As regards phenetic analysis, in this work we describe principal component analysis, cluster analysis and discriminant analysis. As regards cladistic analysis, we discuss the method of maximum parsimony, used to infer the phylogenetic tree of Miogypsinidae from their characters. The principle of similarity is introduced first.

Species                    References
M. gunteri                 Drooger, 1954(a); Ferrero, 1965
M. gunteri-tani            Drooger, 1952; Drooger et al., 1955
M. tani                    Drooger, 1952; Drooger et al., 1955; Delicati & Schiavinotto, 1985; Vilizzi, 1991
M. tani-globulina          Drooger, 1952; Raju, 1974; De Mulder, 1975; Fermont & Troelstra, 1983
M. globulina               Drooger, 1952; Drooger, 1954(a); Drooger, 1954(b); Drooger et al., 1955; Ujiié & Oshima, 1969; Matsumaru, 1971; Raju, 1974; De Mulder, 1975; Schüttenhelm, 1976; Schiavinotto, 1979; Delicati & Schiavinotto, 1985
M. globulina-intermedia    Drooger, 1952; Drooger, 1954(a); Drooger et al., 1955; De Mulder, 1975; Schüttenhelm, 1976; Wildemborg, 1991
M. intermedia              Drooger et al., 1955; De Mulder, 1975; Schüttenhelm, 1976; Schiavinotto, 1985(a); Schiavinotto, 1985(b); Wildemborg, 1991
M. socini                  Drooger, 1954(a); Vervloet, 1966; Schüttenhelm, 1976; De Bock, 1977; Schiavinotto, 1979; Delicati & Schiavinotto, 1985
M. socini-burdigalensis    Schüttenhelm, 1976
M. burdigalensis           Schüttenhelm, 1976; Schiavinotto, 1979
M. burdigalensis-negrii    Schiavinotto, 1979
M. negrii                  Schiavinotto, 1979

Samples: 1l 1 11a 22a',34t Mor222 12 2b; 1 3-1 5; 1 5a; 1 6-1 8; M1 -6; 26" PMT16; PMTT OT2; OT4 18 G1 437 A179; DM363 80PC02 19-22 5;6;10;18 240b 1 ; 2: 5; 2Oa; 24-25; 35-36; 4344; M7 Shuk.A CH; HO KR36-B; G1401-B; G1401 AB; G1406-8; G1 40648; G1 421 -B; Jag.W.B DM114; DM116; DM117 sM-281 87 181 1268 11 57 Al 1 1 o | 1 31 | 1 35 I 1 66 | 251 I 248 I 46e I 1 40 I 1 se | 347 TLS76 PMT-6/80 24;25 7', 8;9 22a';29i 3 4194; DM106; DM608 sM-2214811258 JTs1 24; JT5063 2'l DM684; DM107; DM140 sM-2O51237144o AC5 Ca82 Jî -7980151 1 7 151 1 2 151 07 17 97 8 17977 1797 61 7 97 5 ls1 02 | s09 8 l5o9 4 17 97 4 17 1 7 7 I 5077 12 13 326-3 sM482 | 483 1234 I 267 I 447 /266 | 1 1 6 | 1 24 | 287 I 444 1283 11 881 1 87 l't 84 12261227 1298 M13 TLSl 07 PMT3; PMT2 sM-233/373 sM4781479 TLS-42/39 TLS-1 0s/3s TLS1 1 O

Similarity.

The principle of similarity is essential to deal with some problems related to principal component analysis and cluster analysis.
Similarity is the resemblance or affinity among taxonomic units, based on their characters; in other words, it is their phenetic relationship. The complement of the similarity of taxonomic units is their dissimilarity, or phenetic distance, measured by coefficients whose mathematical properties make them particularly suitable for phenetic analysis.

Among the available measures of dissimilarity, in this work we used the average taxonomic distance. Its expression is similar to that of the Euclidean distance, which derives from Pythagoras' theorem.

Tab. 2 - Populations of Miogypsina reported in the literature and considered in this work. See Ferrero et al. (1992, 1994) for references that are not cited at the end of this work.

Principal Component Analysis.

Principal Component Analysis (PCA) is used to find new variables that are linear combinations of the original measures and describe the sample without the redundancy of information arising from correlation among the original measures. The new variables have to be uncorrelated, so that it is possible to choose those showing the greatest variance. The first few new variables, which account for most of the variation of the sample, are the principal components. The results of this transformation can be better explained by calculating the correlations between the original measures and the new variables: this gives a loading matrix in which the absolute value and the sign of the correlations show the connections between measures and principal components.

A geometrical interpretation of PCA is possible if we imagine finding the axis (or dimension) that best expresses the variation of the characters, that is the axis which maximizes the variance of the projections of the values onto itself. This axis is given by the line that minimizes the sum of squares of the distances between the values and itself. If there are p characters, the first principal component is the best-fitting straight line in the p-dimensional space. A very useful visualization of the results of PCA is a scatterplot of the first component score of each taxonomic unit against the second. How well this scatterplot describes the configuration in the original p-dimensional space may be measured by the proportion of the variance in the data accounted for by the first two principal components.

Principal component    Eigenvalue    Percentage of variance    Cumulative percent
1                      2.71035       67.76                      67.76
2                      1.08278       27.07                      94.83
3                      0.15513        3.88                      98.71
4                      0.05173        1.29                     100.00

Loading matrix
Parameter    Comp. 1    Comp. 2    Comp. 3
X             0.974     -0.115     -0.080
γ            -0.962      0.110      0.210
V            -0.914     -0.267     -0.303
ε/D1          0.027     -0.993      0.114

Tab. 3 - Results of principal components analysis. The eigenvalues (latent roots) are proportional to the variance accounted for by each of the first four components. The component loadings (latent vectors) for the first three principal components are shown in the loading matrix, in which it appears that the first component loads heavily on X, γ and V, while the second loads heavily on ε/D1.

The analysis was carried out both on the data of the populations reported in the literature and on the data of the populations studied in Piedmont. In both cases the first two principal components account for more than 90% (cumulative) of the total variance.

Only the results concerning the populations studied in Piedmont are shown (Tab. 3). The loading matrix shows the respective significance of each of the original measures (X, γ, V and ε/D1) in making the components. In the scatterplot based on the first two principal components (Fig. 2), the groups of populations belonging to each species appear well distinct.
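The steps described above (standardization, correlation among the measures, extraction of uncorrelated components, loadings as correlations between measures and components) can be sketched in a few lines of Python. This is only an illustrative sketch and not the NTSYS-pc procedure used in this work: the function names, the power-iteration solver and the small data matrix are assumptions introduced for the example, and the six rows are hypothetical values, not Miogypsina measurements.

```python
import math

def standardize(rows):
    """Subtract each variable's mean and divide by its standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_eigenpair(m, iterations=1000):
    """Power iteration: largest eigenvalue and unit eigenvector of a symmetric matrix."""
    p = len(m)
    v = [1.0 / (i + 1) for i in range(p)]  # arbitrary, deliberately asymmetric start
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    value = sum(v[i] * m[i][j] * v[j] for i in range(p) for j in range(p))
    return value, v

def principal_components(rows, k=2):
    """First k components: eigenvalue, percentage of variance, loadings, scores."""
    z = standardize(rows)
    work = correlation_matrix(z)
    p = len(work)
    components = []
    for _ in range(k):
        value, vec = leading_eigenpair(work)
        components.append({
            "eigenvalue": value,
            "percent": 100.0 * value / p,  # the trace of a correlation matrix is p
            "loadings": [math.sqrt(value) * x for x in vec],
            "scores": [sum(zr[j] * vec[j] for j in range(p)) for zr in z],
        })
        # Deflation: remove the component just found before extracting the next one.
        work = [[work[i][j] - value * vec[i] * vec[j] for j in range(p)]
                for i in range(p)]
    return components

# Hypothetical data: 6 populations x 3 measures (not values from Tab. 1).
data = [
    [1.0, -1.1, 1.0],
    [2.0, -1.9, -1.0],
    [3.0, -3.2, 0.0],
    [4.0, -3.8, 0.0],
    [5.0, -5.1, -1.0],
    [6.0, -5.9, 1.0],
]
for number, pc in enumerate(principal_components(data), start=1):
    print(number, round(pc["eigenvalue"], 3), round(pc["percent"], 2))
```

Dividing each eigenvalue by the number of measures gives the percentage of variance, which is how the percentages of Tab. 3 relate to its eigenvalues (for instance 2.71035 / 4 gives 67.76%).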
To verify how well this two-dimensional mapping preserves the original distances among populations (measured with the average taxonomic distance, discussed previously), we can use the Minimum Spanning Tree (MST) of the distance matrix. A spanning tree is a set of straight-line segments joining pairs of points such that no closed loops occur, each point is touched by at least one line and the tree provides a continuous link between any pair of points. If a weight is assigned to each segment, then the length of the tree is defined as the sum of these weights. The MST is the spanning tree of minimum length. This tool helps to detect local distortions in the diagram resulting from the ordination technique, such as pairs of populations which look close together in the two-dimensional representation but are actually far apart if other dimensions are taken into account. For example, in the scatterplot of the PCA applied to the data of Piedmont (Fig. 2), populations SUP52 and BAL48, belonging to M. socini-burdigalensis, appear to be well separated from the other groups. But if the MST is superimposed on the plot (Fig. 3), we see that the two populations are "closer" to RIV3F (belonging to M. socini) than to each other. This means that the projection of the populations onto the first two principal component axes has not preserved, in the case of M. socini-burdigalensis, the original structure of the phenetic distances.

Fig. 2 - Plot of the first two principal component scores for populations studied in Piedmont. Axes: Component 1 and Component 2. Symbols distinguish M. gunteri, M. gunteri-tani, M. tani, M. globulina, M. globulina-intermedia, M. socini, M. socini-burdigalensis and M. burdigalensis.

Fig. 3 - Minimum spanning tree superimposed on the scatterplot of the first two principal component scores for populations studied in Piedmont. Symbols as in Fig. 2.

Fig. 4 - Dendrogram showing the results of group-average clustering applied to the populations studied in Piedmont. Cophenetic correlation coefficient: 0.75.

Cluster analysis.

A cluster can be defined as a maximally connected set.
For the analysis of Miogypsinidae we have selected agglomerative hierarchical clustering techniques, which proceed by a series of successive fusions of the taxonomic units into groups. The two populations separated by the smallest distance (measured with the average taxonomic distance, discussed previously) are grouped together, then the distances between this two-member cluster and each of the remaining populations are calculated. The process continues, with the number of groups being reduced by one at each stage, until all the populations are grouped into a single cluster. A useful means to display the results is a diagram called a "dendrogram".

Fig. 5 - Dendrogram showing the results of single-linkage clustering applied to the populations studied in Piedmont. Cophenetic correlation coefficient: 0.64.

Fig. 6 - Dendrogram showing the results of complete-linkage clustering applied to the populations studied in Piedmont. Cophenetic correlation coefficient: 0.77.

Discriminant function    Eigenvalue    Percentage of variance    Cumulative percent
1                        12.812        86.55                      86.55
2                         1.654        11.17                      97.72
3                         0.337         2.28                     100.00

Functions derived    Wilks' lambda    Chi-square    Degrees of freedom    Significance level
0                    0.020            686.94        33                    0
1                    0.282            223.53        20                    0
2                    0.748             51.29         9                    0

Tab. 4 - Canonical discriminant functions. The eigenvalues are proportional to the variance accounted for by each function. The significance level of the three functions is tested with the Wilks' Λ statistic and the χ² statistic, which show a high degree of significance (probability: 0).

The methods available differ in the algorithm they use to calculate the distance between two clusters. In group-average clustering (Fig. 4), the distance between two clusters is the average of the distances between all pairs of populations that are made up of one population from each group.
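The fusion procedure and the group-average rule described above can be sketched as follows. This is a minimal illustration and not the NTSYS-pc implementation used here: the function names and the four two-character "populations" are hypothetical, and the average taxonomic distance is written in its usual form (the square root of the mean squared difference over the characters), which is assumed to be the coefficient intended in this work.

```python
import math

def average_taxonomic_distance(a, b):
    """Square root of the mean squared difference over the characters."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def group_average_clustering(points):
    """Agglomerative clustering; returns the fusions as (members, members, distance)."""
    n = len(points)
    # Distances between all pairs of populations, computed once.
    d = {(i, j): average_taxonomic_distance(points[i], points[j])
         for i in range(n) for j in range(i + 1, n)}

    def between(c1, c2):
        # Group average: mean distance over all pairs with one member per cluster.
        pairs = [d[(min(i, j), max(i, j))] for i in c1 for j in c2]
        return sum(pairs) / len(pairs)

    clusters = [[i] for i in range(n)]
    fusions = []
    while len(clusters) > 1:
        # Find the two closest clusters and fuse them.
        dist, a, b = min((between(clusters[a], clusters[b]), a, b)
                         for a in range(len(clusters))
                         for b in range(a + 1, len(clusters)))
        fusions.append((clusters[a], clusters[b], dist))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return fusions

# Hypothetical standardized values for four populations and two characters.
populations = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 7.0]]
for members_a, members_b, distance in group_average_clustering(populations):
    print(members_a, members_b, round(distance, 3))
```

Each fusion reduces the number of groups by one, and the recorded distances are the levels at which the corresponding branches would join in a dendrogram.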
In single-linkage clustering (or nearest neighbour method) (Fig. 5), the distance between two clusters is that of their most similar pair of populations. In complete-linkage clustering (or furthest neighbour method) (Fig. 6), the distance between two clusters is that of their least similar (or most dissimilar) pair of populations.

To evaluate whether the original relationships among the populations (described by their measured distances) fit well with the hierarchical structure imposed on the data by clustering, we can use the cophenetic correlation coefficient. It is the correlation between the original distance matrix and the cophenetic matrix, which consists of the set of similarities produced by clustering. A value above 0.8 is sufficient to give evidence of a good fit between dendrogram and distance matrix.

The cluster analysis of the populations of Miogypsinidae studied in Piedmont, applying the complete-linkage method, produced the most meaningful grouping and the highest cophenetic correlation coefficient (Fig. 6). The resulting classification of populations is not the same as that produced by applying only taxonomic criteria, but it provides important information. For example, the dendrogram shows that M. globulina and M. globulina-intermedia are well distinct from all the other species: this suggests an evolutionary trend that isolates these two taxonomic units from all the others. Populations belonging to the phyletic branch M. socini - M. burdigalensis show characters so different that it appears very difficult to group them in a single cluster. Finally, M. gunteri is well distinct from the other species.

Fig. 7 - Plot of population means relative to the first two canonical discriminant functions. Data from literature and from Piedmont. Symbols distinguish M. gunteri, M. gunteri-tani, M. tani, M. tani-globulina, M. globulina, M. globulina-intermedia, M. intermedia, M. socini, M. socini-burdigalensis, M. burdigalensis, and M. burdigalensis-negrii together with M. negrii.

Discriminant analysis.

Discriminant analysis assumes that significant differences exist among the mean vectors of the species to which the populations belong. In certain respects it is similar to PCA, but PCA seeks transformed axes that account for most of the global variation of the data, while in discriminant analysis the transformed axes are chosen to separate the mean vectors of the groups. When more than two groups have to be discriminated, the method is called canonical variate analysis and consists in seeking one or more new variables that are linear functions of the original variables.

Actual group (rows) against predicted group (columns, numbered as the rows), in percent:

Actual group                    1    2    3    4    5    6    7    8    9   10   11   12
 1 M. gunteri                 100    0    0    0    0    0    0    0    0    0    0    0
 2 M. gunteri-tani              0   78    0    0    0    0    0   11   11    0    0    0
 3 M. tani                      0    4   89    7    0    0    0    0    0    0    0    0
 4 M. tani-globulina            0   20    0   60    0    0    0   20    0    0    0    0
 5 M. globulina                 0    0    0   10   65   16    0    0    0    4    5    0
 6 M. globulina-intermedia      0    0    0    0    0   74    0    0    0    0    4   22
 7 M. intermedia                0    0    0    0    0   17   70    0    0    0    0   13
 8 M. socini                    0   17    3    0    0    0    0   55   17    7    0    0
 9 M. socini-burdigalensis      0   25    0    0    0    0    0   50   25    0    0    0
10 M. burdigalensis             0    0    0    0   20    0    0    0    0   80    0    0
11 M. burdigalensis-negrii      0    0    0    0    0    0    0    0    0    0  100    0
12 M. negrii                    0    0    0    0    0    0    0    0    0    0    0  100

Tab. 5 - Classification results for species in discriminant analysis. The actual groups are classified correctly when they correspond to predicted groups with a high percentage.

Species                      X coeff.    γ coeff.    V coeff.
M. gunteri                   39.283      0.134       0.682
M. gunteri-tani              38.447      0.490       0.571
M. tani                      35.411      0.574       0.456
M. tani-globulina            33.942      0.683       0.509
M. globulina                 éz-1J I     0.765       0.970
M. globulina-intermedia      31.097      0.730       1.336
M. intermedia                30.982      0.702       1.738
M. socini                    39.398      U.3YJ       0.793
M. socini-burdigalensis      40.101      0.545       0.934
M. burdigalensis             32.102      0.511       1.182
M. burdigalensis-negrii      28.125      0.564       1.326
M. negrii                    29.489      0.691       1.485

Constant: 200.51 165.37 123.27 124.73 129.54 148.71 172.82 183.46 127.20 108.66 126.52

Tab. 6 - Coefficients obtained by discriminant analysis, useful for classifying new populations. The last column contains a constant in each function.
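The way coefficients such as those of Tab. 6 are typically applied can be sketched as follows: each species has a linear classification function (a constant plus one coefficient per character), and a new population goes to the species whose function gives the highest value. The species labels, coefficients and measurements below are hypothetical placeholders rather than the values of Tab. 6, and the helper name `classify` is an assumption of this example.

```python
def classify(measurements, functions):
    """Return the best-scoring species together with all the scores.

    functions maps a species name to (coefficients, constant); the score is
    constant + sum(coefficient * measurement) over the characters.
    """
    scores = {
        species: constant + sum(c * x for c, x in zip(coefficients, measurements))
        for species, (coefficients, constant) in functions.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical coefficients for two species and two characters.
functions = {
    "species A": ([2.0, 0.1], -5.0),
    "species B": ([1.0, 0.5], -1.0),
}
best, scores = classify([3.0, 2.0], functions)
print(best, scores)
```

The same evaluation is easy to reproduce in a spreadsheet, with one row per species and one column per coefficient.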
The coefficients of these functions (discriminant functions) are calculated so as to maximize the between-groups variance and covariance matrix relative to the within-groups variance and covariance matrix. The first canonical variate axis is required to lie in the direction of greatest variability between the means of the different species; the second axis is chosen to be orthogonal to the first and inclined in the direction of the next greatest variability, and so on.

Discriminant analysis was performed on the data (parameters X, γ and V) of the populations of Miogypsinidae studied in Piedmont and those reported in the literature, and three significant discriminant functions were obtained (Tab. 4). The scatterplot built on the first two functions (which summarize most of the discrimination) is useful for displaying the distinction between groups of populations (Fig. 7). The table of classification results for species (Tab. 5) shows the actual and predicted groups: it appears that in many cases the populations are not classified correctly. But the most important result is the set of coefficients for use in classifying new populations (Tab. 6). A new population is classified by evaluating, for each species, the function built from the coefficients of each character, and assigning the population to the species with the highest function value. We have put these coefficients into a spreadsheet that automatically assigns a new population to the species with the highest probability.

Cladistic analysis.

The aim of cladistic analysis is to construct phylogenies by studying the phylogenetic relationships between species, that is to construct evolutionary trees by considering the transformations of morphological characters during evolution. To apply this approach to the study of Miogypsinidae, we first had to deal with the problem of coding the measured parameters.
In fact, mathematical algorithms for the analysis of phylogenetic data require alphanumeric codes that represent character states, but the populations of larger Foraminifera are studied by measuring quantitative, continuous parameters. To avoid deriving discrete codes from quantitative data in an arbitrary fashion, we assigned character states to the parameters of the different taxa using the statistically significant differences between species (or homogeneous groups of species) resulting from the Analysis of Variance (ANOVA).

Because the results of the ANOVA applied to the populations of Miogypsinidae suggest that the variations of the parameters between species are highly significant, we used a multiple range test based on confidence intervals to separate these species into homogeneous groups for each parameter (Tab. 7).

Then we applied a maximum parsimony method, a technique that searches for the phyletic tree minimizing the amount of evolution needed to explain the available data. As this method requires a prespecified set of constraints upon permissible character changes, we chose to treat parameters X, γ and V as ordered characters (in a progressive series of character states, a transformation from one state to another that would skip an intermediate state is not allowed), and parameters D1, D2/D1, ε and ε/D1 as unordered characters (any state of the character is capable of transforming directly to any other state).

The cladogram obtained by applying the analysis to the populations of Miogypsinidae studied in Piedmont (Fig. 8) is better regarded as a pattern of the similarities between species than as a hierarchical statement about genealogical relationships. This branching diagram nevertheless shows interesting affinities with the hypothetical phylogenesis of Miogypsinidae proposed in the literature, but it also shows unexpected deviations along the M. socini - M. burdigalensis branch.
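The difference between ordered and unordered characters can be made concrete by counting the minimum number of state changes that one character requires on a given tree. The sketch below uses a Sankoff-style dynamic programming pass on a fixed rooted binary tree; it only scores a tree and does not search for the most parsimonious one, and it is not the PAUP procedure used in this work. The tree shape, taxon labels and character states are hypothetical.

```python
def step_cost(a, b, ordered):
    """Steps needed to change state a into state b."""
    # Ordered: every intermediate state must be passed through (|a - b| steps).
    # Unordered: any state changes directly into any other (1 step).
    return abs(a - b) if ordered else int(a != b)

def min_changes(tree, tip_states, n_states, ordered):
    """Minimum number of steps for one character on a fixed rooted binary tree.

    tree is a nested tuple of taxon names, e.g. (("A", "B"), ("C", "D")).
    """
    def costs(node):
        if isinstance(node, str):  # a tip: only its observed state is allowed
            return [0 if s == tip_states[node] else float("inf")
                    for s in range(n_states)]
        left, right = (costs(child) for child in node)
        return [
            min(step_cost(s, t, ordered) + left[t] for t in range(n_states))
            + min(step_cost(s, t, ordered) + right[t] for t in range(n_states))
            for s in range(n_states)
        ]
    return min(costs(tree))

def tree_length(tree, characters):
    """Sum of the minimum steps over all characters (states, n_states, ordered)."""
    return sum(min_changes(tree, states, n_states, ordered)
               for states, n_states, ordered in characters)

# Hypothetical example: four taxa, one character with states 0, 1, 2.
tree = (("A", "B"), ("C", "D"))
states = {"A": 0, "B": 0, "C": 2, "D": 2}
print(min_changes(tree, states, 3, ordered=False))  # 1
print(min_changes(tree, states, 3, ordered=True))   # 2
```

The same distribution of states costs one step when the character is unordered and two steps when it is ordered, because the change from state 0 to state 2 has to pass through state 1. The length of a tree is the sum of such counts over all characters.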
Tab. 7 - Multiple range test applied to the populations studied in Piedmont. The averages of the measurements and the character states (groups) are reported for each parameter. M. socini-burdigalensis and M. negrii are excluded because of missing data.

Species (row order): M. gunteri; M. gunteri-tani; M. tani; M. globulina; M. globulina-intermedia; M. socini; M. burdigalensis

Parameter    Averages                                       Groups
X            9.90 9.17 8.16 o.5u 5.84 8.71 7.39             0 1 ó 5 6
γ            -77.8 -39.0 20.2 27.5 -JJ.J -3.U               0 1 ó ó 1 2
V            2.94 5.43 9.16 27.83 45.04 16.90 32.83         0 0 0 1 ó
D1           60.3 72.8 70.5 57.3 72.1 69.4 77.3             1 e 3 0 J 2 4
D2/D1        t. tc 1.11 1.14 1.22 1.16 1.11                 0 1 0 J 0 1
ε            267.8 264.0 266.4 289.8 312.8 301.1            'ì 1 0 1 2 3
ε/D1         1.69 1.63 LJ/ 1.71 1.71 1.87 1.70              ,] 1 0 3 ó 2

Fig. 8 - Cladogram of species studied in Piedmont. Numbers in squares: characters that change unambiguously on the branch. Tree length (number of character changes): 31.

Discussion.

From a practical point of view, if we compare the results of the different phenetic methods applied to the study of Miogypsinidae, the most effective tool is discriminant analysis. In fact, this technique provides the statistics needed to assign a new population to a species in a non-arbitrary manner. PCA and cluster analysis cannot be seen as tools for producing a formal classification, but only for data exploration.

As regards cladistic analysis, the preliminary results suggest that the research should be pursued further, especially by improving the coding of characters and the selection of constraints upon permissible character changes.

A characteristic shared by cluster analysis and cladistic analysis is the large number of algorithms available. Sometimes it appears difficult to choose the best method for studying larger Foraminifera.
On the other hand, the possibility of preparing a variety of classifications using different techniques stimulates a global exploration of the various algorithms available, once the data are collected and coded for numerical use. This is all the more valid as computer packages for the numerical manipulation of the data become increasingly available.

Since the data base used for this research was intentionally restricted to a predetermined number of species of Miogypsinidae, it is evident that more significant results could be reached by including other species in the analysis. Moreover, it would be interesting to extend the research to larger Foraminifera other than Miogypsinidae. A study on the Lepidocyclinidae of the Piedmont Basin is in progress: we hope this research will help to verify whether the problems of determination are linked to the type of organism or to the palaeontological classification criteria.

Acknowledgements.

The research was supported by the M.U.R.S.T. 60% grants assigned to P. Clari and E. Ferrero. The author wishes to thank E. Ferrero for his assistance throughout the study. Thanks are also due to M. Delpero for his advice and helpful discussion about cladistic analysis.

REFERENCES

Barrai I. (1984) - Metodi di regressione e classificazione in biometria. V. of 170 pp., Edagricole, Bologna.

Camussi A., Möller F., Ottaviano E. & Sari Gorla M. (1986) - Metodi statistici per la sperimentazione biologica. V. of 500 pp., Zanichelli, Bologna.

Drooger C.W. (1952) - Study of American Miogypsinidae. V. of 80 pp., Acad. Thesis, Univ. of Utrecht.

Dunn G. & Everitt B.S. (1982) - An Introduction to Mathematical Taxonomy. V. of 152 pp., Cambridge University Press, Cambridge.

Elliott J.M. (1977) - Statistical Analysis of Samples of Benthic Invertebrates. V. of $7 pp., Freshwater Biological Ass., Scientific Publication, 25, Ambleside.

Fermont W.J.J. & Troelstra S.R. (1983) - Early Miocene larger Foraminifera from Cruiser-Hyeres Seamount Complex (eastern North Atlantic). Proc. Kon. Ned. Akad. Wetensch., ser. B, v. 86 (3), pp. 243-253, Amsterdam.

Ferrero E., Maia F. & Tonon M. (1992) - The evolutionary pattern of the Miogypsinidae in eastern Monferrato (NW Italy). Paleontologia i Evolució, v. 24-25, pp. 209-217, Barcelona.

Ferrero E., Maia F. & Tonon M. (1994) - Le Miogypsine del Monferrato: aspetti morfologici e tassonomici. Boll. Soc. Paleont. Ital., v. 33 (3), pp. 345-368, Modena.

Forey P.L., Humphries C.J., Kitching I.J., Scotland R.W., Siebert D.J. & Williams D.M. (1992) - Cladistics. V. of 191 pp., Oxford University Press, Oxford.

Schiavinotto F. (1985) - Different evolutionary stages in the Miogypsinidae from Sardinia. Boll. Soc. Paleont. Ital., v. Zl Q), pp. 381-394, Modena.

Vilizzi L. (1991) - Studio biometrico e biostratigrafico sui Miogypsinidi e sui Lepidocyclinidi delle facies carbonatiche mioceniche del Monferrato orientale. Tesi inedita, Univ. Torino.

Received April 12, 1996; accepted October 3, 1996