Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data

  • Regular Research Papers
  • Genetic Resources and Crop Evolution

Abstract

Data in germplasm collections contain a mixture of data types: binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to a particular data type. A combined resemblance matrix has an advantage over data-type-specific proximity measures: it accommodates the different data types without converting them all to a single type. Descriptors are partitioned by data type and an appropriate proximity measure is applied to each partition. The separate proximity matrices are range standardised and then combined as a weighted average, and the resulting combined resemblance matrix is used for classification and ordination.
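The procedure described above (partition the descriptors, compute one proximity matrix per data type, range standardise each matrix, then form a weighted average) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the proximity measure used for each data type, so a simple mismatch proportion is assumed for binary descriptors and a range-scaled Manhattan distance for ordered multistate and quantitative descriptors. All function names and the toy data are hypothetical.

```python
import numpy as np

def simple_mismatch(X):
    """Proportion of binary descriptors on which two accessions differ (assumed measure)."""
    X = np.asarray(X, dtype=float)
    return np.abs(X[:, None, :] - X[None, :, :]).mean(axis=2)

def range_scaled_manhattan(X):
    """Mean absolute difference after scaling each descriptor to [0, 1] (assumed measure)."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0  # a constant descriptor contributes zero distance
    Z = (X - X.min(axis=0)) / span
    return np.abs(Z[:, None, :] - Z[None, :, :]).mean(axis=2)

def range_standardise(D):
    """Rescale the off-diagonal entries of a dissimilarity matrix to [0, 1]."""
    off = ~np.eye(D.shape[0], dtype=bool)
    lo, hi = D[off].min(), D[off].max()
    out = np.zeros_like(D)
    if hi > lo:
        out[off] = (D[off] - lo) / (hi - lo)
    return out

def combined_resemblance(matrices, weights):
    """Weighted average of range-standardised proximity matrices."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to one
    return sum(wi * range_standardise(D) for wi, D in zip(w, matrices))

# Toy data for four hypothetical accessions, partitioned by data type.
binary = [[1, 0, 1, 1], [1, 0, 0, 1], [0, 1, 0, 0], [0, 1, 1, 0]]
ordered = [[1, 2, 3, 1, 2], [1, 3, 3, 1, 2], [4, 5, 1, 3, 5], [3, 5, 2, 3, 4]]
quantitative = np.random.default_rng(0).normal(size=(4, 7))

parts = [
    simple_mismatch(binary),
    range_scaled_manhattan(ordered),
    range_scaled_manhattan(quantitative),
]

# Equal weighting of the three data types.
D_equal = combined_resemblance(parts, weights=[1, 1, 1])

# One possible unequal weighting, proportional to the number of descriptors
# of each type (an assumption; the abstract does not state the weights used).
D_unequal = combined_resemblance(parts, weights=[4, 5, 7])
```

Standardising each matrix before averaging keeps any one data type from dominating the combined matrix simply because its proximity measure has a wider numerical range; the weights then control the relative influence of each data type explicitly.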

Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings (equal and unequal weighting of data types when forming the combined resemblance matrix) was investigated using principal co-ordinate analysis (ordination) and hierarchical cluster analysis.
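As a rough illustration of the two pattern analysis steps named above, the sketch below applies principal co-ordinate analysis and hierarchical clustering to a single dissimilarity matrix. It is a hedged example, not the analysis reported in the paper: the abstract does not state the clustering strategy, so average linkage is assumed, and the six two-dimensional points stand in for a real combined resemblance matrix. The function name `pcoa` and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist, squareform

def pcoa(D, k=2):
    """Principal co-ordinate analysis of a symmetric dissimilarity matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred squared dissimilarities
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]           # largest eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    scale = np.sqrt(np.clip(vals[:k], 0.0, None))  # ignore negative eigenvalues
    return vecs[:, :k] * scale, vals

# Toy stand-in for a combined resemblance matrix: two well-separated groups.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
condensed = pdist(points)        # condensed (upper-triangle) distances
D = squareform(condensed)        # full symmetric matrix

# Ordination: co-ordinates of each accession on the first two principal axes.
coords, eigenvalues = pcoa(D, k=2)

# Classification: average-linkage hierarchical clustering (assumed strategy),
# cut to give two groups.
tree = linkage(condensed, method="average")
labels = fcluster(tree, t=2, criterion="maxclust")
```

Plotting `coords` with points coloured by `labels` is one way to see the complementary nature of the two techniques: the ordination shows how accessions are spread, and the classification shows where the group boundaries fall in that space.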

Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them. This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.

Cite this article

Harch, B.D., Basford, K.E., DeLacy, I.H. et al. Mixed data types and the use of pattern analysis on the Australian groundnut germplasm data. Genet Resour Crop Evol 43, 363–376 (1996). https://doi.org/10.1007/BF00132957

  • DOI: https://doi.org/10.1007/BF00132957
