Abstract
Data in germplasm collections contain a mixture of data types: binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. A combined resemblance matrix has an advantage over data-type-specific proximity measures: it accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are combined as a weighted average, and the resulting combined resemblance matrix is then used for classification and ordination.
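The construction described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the proximity measures used, so this sketch assumes a simple mismatch proportion for binary descriptors and a range-scaled mean absolute difference for ordered multistate and quantitative descriptors. All function names, the toy data and the unequal weights are hypothetical.

```python
import numpy as np

def mismatch_dissimilarity(x):
    """Proportion of binary descriptors on which two accessions differ (assumed measure)."""
    x = np.asarray(x)
    return (x[:, None, :] != x[None, :, :]).mean(axis=2)

def manhattan_dissimilarity(x):
    """Mean absolute difference after scaling each descriptor by its range (assumed measure)."""
    x = np.asarray(x, dtype=float)
    span = x.max(axis=0) - x.min(axis=0)
    span[span == 0] = 1.0  # avoid division by zero for constant descriptors
    z = (x - x.min(axis=0)) / span
    return np.abs(z[:, None, :] - z[None, :, :]).mean(axis=2)

def range_standardise(d):
    """Rescale a dissimilarity matrix to [0, 1] using the range of its off-diagonal entries."""
    d = np.asarray(d, dtype=float)
    off_diagonal = d[~np.eye(d.shape[0], dtype=bool)]
    lo, hi = off_diagonal.min(), off_diagonal.max()
    if hi == lo:
        return np.zeros_like(d)
    out = (d - lo) / (hi - lo)
    np.fill_diagonal(out, 0.0)  # an accession is at distance zero from itself
    return out

def combined_resemblance(matrices, weights):
    """Weighted average of range-standardised proximity matrices."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return sum(wi * range_standardise(m) for wi, m in zip(w, matrices))

# Toy data: 4 accessions, descriptors already partitioned by data type (hypothetical values).
binary = np.array([[1, 0, 1, 1], [1, 0, 0, 1], [0, 1, 0, 0], [0, 1, 1, 0]])
ordinal = np.array([[1, 3], [2, 3], [5, 1], [4, 1]])
quantitative = np.array([[10.0, 2.5], [11.0, 2.7], [20.0, 5.0], [19.0, 4.6]])

parts = [
    mismatch_dissimilarity(binary),
    manhattan_dissimilarity(ordinal),
    manhattan_dissimilarity(quantitative),
]

equal = combined_resemblance(parts, weights=[1, 1, 1])    # equal weighting of data types
unequal = combined_resemblance(parts, weights=[4, 5, 7])  # hypothetical unequal weighting
```

In this sketch the unequal weights are simply proportional to the number of descriptors of each type in the toy partition. Other weighting schemes could be substituted without changing the rest of the code.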
Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings—equal and unequal weighting of data types to obtain a combined resemblance matrix—was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis.
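As an illustration of the ordination step only, the following minimal sketch computes principal co-ordinates from a dissimilarity matrix by double-centring the squared dissimilarities and taking an eigendecomposition. It is a generic textbook formulation, not the software used in the study. The function name and the toy matrix are hypothetical. A combined resemblance matrix need not be Euclidean, so negative eigenvalues are clipped to zero here.

```python
import numpy as np

def principal_coordinates(d, n_axes=2):
    """Principal co-ordinate analysis of a symmetric dissimilarity matrix."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain an inner-product matrix.
    b = -0.5 * centring @ (d ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(b)
    # Keep the axes with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_axes]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale each eigenvector by the square root of its (non-negative) eigenvalue.
    coords = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return coords, eigvals

# Toy check: Euclidean distances among four points on a 3 x 4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2))

coords, eigvals = principal_coordinates(d, n_axes=2)
recovered = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2))
```

For Euclidean input such as this toy matrix, the distances among the recovered co-ordinates reproduce the original dissimilarities. The same dissimilarity matrix could also be passed to a hierarchical clustering routine, which is what makes the two techniques complementary.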
Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them. This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.
Cite this article
Harch, B.D., Basford, K.E., DeLacy, I.H. et al. Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data. Genet Resour Crop Evol 43, 363–376 (1996). https://doi.org/10.1007/BF00132957
DOI: https://doi.org/10.1007/BF00132957