
Logical characterization of groups of data: a comparative study

Applied Intelligence

Abstract

This paper presents an approach for characterizing groups of data represented by Boolean vectors. The purpose is to find a minimal set of attributes that makes it possible to distinguish data from different groups. In this work, we precisely define the multiple characterization problem and the algorithms that can be used to solve its different variants. Our data characterization approach can be related to Logical Analysis of Data, and we therefore propose a comparison between the two methodologies. The paper also studies precisely how the properties of the computed solutions relate to the topological properties of the instances. Experiments are conducted on real biological data.
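As an illustration only, the idea of a minimal set of distinguishing attributes can be sketched in a few lines of Python. This is not the authors' algorithm: it assumes that a set of attributes "distinguishes" the groups when every pair of observations taken from different groups differs on at least one selected attribute, and it uses a plain exhaustive search that is only practical for tiny instances. The function names `constraint_rows` and `minimal_characterization`, and the toy data, are hypothetical.

```python
from itertools import combinations


def constraint_rows(data, groups):
    """Build one 0/1 row per pair of observations from different groups.

    Entry j of a row is 1 when the two observations differ on attribute j.
    (Assumed formulation, chosen for illustration.)
    """
    rows = []
    for i, k in combinations(range(len(data)), 2):
        if groups[i] != groups[k]:
            rows.append([int(a != b) for a, b in zip(data[i], data[k])])
    return rows


def minimal_characterization(data, groups):
    """Return a smallest tuple of attribute indices that separates every
    pair of observations from different groups, or None if no such set
    exists (two identical observations placed in different groups).

    Exhaustive search by increasing subset size: exponential in the
    number of attributes, so suitable for toy examples only.
    """
    rows = constraint_rows(data, groups)
    n_attrs = len(data[0])
    for size in range(n_attrs + 1):
        for subset in combinations(range(n_attrs), size):
            # Every constraint row must be covered by a selected attribute.
            if all(any(row[j] for j in subset) for row in rows):
                return subset
    return None


# Hypothetical toy instance: four Boolean observations in two groups.
data = [
    (1, 0, 1, 0),
    (1, 1, 1, 0),
    (0, 0, 1, 1),
    (0, 1, 0, 1),
]
groups = ["g1", "g1", "g2", "g2"]
print(minimal_characterization(data, groups))
```

In this toy instance the first attribute alone takes the value 1 on every observation of g1 and 0 on every observation of g2, so a single attribute is enough. Realistic instances would call for a dedicated set-cover or integer-programming solver rather than enumeration.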



Notes

  1. Note that, for simplicity, we present the full array that contains the data matrix (the data matrix thus corresponds only to the Boolean part of the array).

  2. Recall that Θ is a set of pairs of observations, defined in Section 4.1, used to index the rows of the constraint matrix C.

  3. http://www.info.univ-angers.fr/gh/Idas/Ccd/ce_f.php.


Author information


Corresponding author

Correspondence to Arthur Chambon.


About this article


Cite this article

Chambon, A., Boureau, T., Lardeux, F. et al. Logical characterization of groups of data: a comparative study. Appl Intell 48, 2284–2303 (2018). https://doi.org/10.1007/s10489-017-1080-3


  • DOI: https://doi.org/10.1007/s10489-017-1080-3
