Measuring the Implications of the D-Basis in Analysis of Data in Biomedical Studies

  • Kira Adaricheva
  • J. B. Nation
  • Gordon Okimoto
  • Vyacheslav Adarichev
  • Adina Amanbekkyzy
  • Shuchismita Sarkar
  • Alibek Sailanbayev
  • Nazar Seidalin
  • Kenneth Alibek
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9113)

Abstract

We introduce the parameter of relevance of one attribute of a binary table to another attribute of the same table, computed with respect to an implicational basis of a closure system associated with the table. This parameter allows all attributes to be ranked by their relevance to a fixed target attribute and, as a consequence, reveals the implications of the basis that are most relevant to that attribute. As an application of this new metric, we test the algorithm for D-basis extraction presented in Adaricheva and Nation [1] on biomedical data related to the survival groups of patients with particular types of cancer. Each test case requires a specialized approach to converting the real-valued data into binary data, and careful analysis of the transformed data in a multi-disciplinary environment of cross-field collaboration.
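
To make the objects named in the abstract concrete, the following minimal sketch shows one way real-valued measurements might be turned into a binary table, and how an implication X -> b can be checked against that table together with its support. Everything here is an illustrative assumption rather than the authors' method: the median cut in `binarize`, the attribute names `g1`, `g2` and `long_survival`, the toy data, and the convention that support counts the rows containing both the premise and the conclusion. The sketch does not compute the relevance parameter or the D-basis itself.

```python
from statistics import median


def binarize(rows, names):
    """Turn real-valued columns into binary attributes.

    Assumed rule (not from the paper): an attribute is present in a row
    when its value is strictly above the column median.
    """
    cuts = [median(col) for col in zip(*rows)]
    return [
        {name for name, value, cut in zip(names, row, cuts) if value > cut}
        for row in rows
    ]


def holds(table, premise, conclusion):
    """X -> b holds when every row containing all of X also contains b."""
    return all(conclusion in row for row in table if premise <= row)


def support(table, premise, conclusion):
    """Count the rows containing every attribute of X together with b."""
    return sum(1 for row in table if premise <= row and conclusion in row)


# Hypothetical measurements: two expression levels and a survival time.
names = ["g1", "g2", "long_survival"]
rows = [
    (5.0, 2.0, 9.0),
    (6.0, 3.0, 8.0),
    (1.0, 4.0, 2.0),
    (2.0, 1.0, 1.0),
]
table = binarize(rows, names)

print(holds(table, {"g1"}, "long_survival"))    # True
print(support(table, {"g1"}, "long_survival"))  # 2
print(holds(table, {"g2"}, "long_survival"))    # False
```

A single global cut per column is only the simplest choice; as the abstract notes, each data set called for its own conversion to binary form.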

Keywords

Binary table, Galois lattice, Implicational basis, D-basis, Support, Relevance, Gene expression, Survival, Response to treatment, Immune markers, Blood biochemistry, Infection

Notes

Acknowledgements

The C++ code for D-basis extraction from a binary table, used for testing in this project, was written by Joshua Blumenkopf and Toviah Moldvin, undergraduate students of Yeshiva College in New York. Takeaki Uno, of the National Institute of Informatics in Tokyo, gave us permission to call his hypergraph dualization subroutine from within our code. Our colleagues Ulrich Norbisrath and Mark Sterling, of the Computer Science Department of the School of Science and Technology of NU, helped with tuning and debugging the code, and we are also grateful to Rustam Bekishev and Anel Nurtay for assistance in the project. The first author is grateful to the bio-informatics group of the University of Hawaii Cancer Center for the welcoming atmosphere and fruitful collaboration during her visit in June 2014, supported by Nazarbayev University grant N 13/42. The second author expresses his gratitude for support of his visits to Nazarbayev University in May–June 2013 and May 2014, which were partly funded by NU grant N 13/42 and grant N 0112PK02175 of Medical Holding of Astana. Tom Wenska, Ashkan Zeinalzadeh and Jenna Maligro contributed to the research and discussion in Honolulu.

References

  1. Adaricheva, K., Nation, J.B.: Discovery of the \(D\)-basis in binary tables based on hypergraph dualization. Theoretical Computer Science (submitted)
  2. Adaricheva, K., Nation, J.B., Rand, R.: Ordered direct implicational basis of a finite closure system. Disc. Appl. Math. 161, 707–723 (2013)
  3. Adarichev, V.A., Vermes, C., Hanyecz, A., Mikecz, K., Bremer, E.G., Glant, T.T.: Gene expression profiling in murine autoimmune arthritis during the initiation and progression of joint inflammation. Arthritis Res. Ther. 7, 196–207 (2005)
  4. Adarichev, V.A., Vermes, C., Hanyecz, A., Ludanyi, K., Tunyogi-Csapó, M., Mikecz, K., Glant, T.T.: Antigen-induced differential gene expression in lymphocytes and gene expression profile in synovium prior to the onset of arthritis. Autoimmunity 39, 663–673 (2006)
  5. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.I.: Fast discovery of association rules. In: Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. (eds.) Advances in Knowledge Discovery and Data Mining, pp. 307–328. AAAI Press, Menlo Park (1996)
  6. Babin, M.A., Kuznetsov, S.O.: Computing premises of a minimal cover of functional dependencies is intractable. Disc. Appl. Math. 161, 742–749 (2013)
  7. Balcázar, J.L.: Redundancy, deduction schemes, and minimum-size bases for association rules. Log. Meth. Comput. Sci. 6(2:3), 1–33 (2010)
  8. Bertet, K., Monjardet, B.: The multiple facets of the canonical direct unit implicational basis. Theor. Comput. Sci. 411, 2155–2166 (2010)
  9. Boros, E., Elbassioni, K., Gurvich, V., Khachiyan, L.: Generating dual-bounded hypergraphs. Optim. Methods Softw. 17, 749–781 (2002)
  10. Distel, F., Sertkaya, B.: On the complexity of enumerating the pseudo-intents. Disc. Appl. Math. 159, 450–466 (2011)
  11. Fredman, M., Khachiyan, L.: On the complexity of dualization of monotone disjunctive normal forms. J. Algorithms 21, 618–628 (1996)
  12. Guigues, J.L., Duquenne, V.: Familles minimales d’implications informatives résultant d’un tableau de données binaires. Math. Sci. Hum. 95, 5–18 (1986)
  13. Kaplan, E.L., Meier, P.: Nonparametric estimation from incomplete observations. J. Amer. Statist. Assn. 53(282), 457–481 (1958)
  14. Kryszkiewicz, M.: Concise representations of association rules. In: Hand, D.J., Adams, N.M., Bolton, R.J. (eds.) Pattern Detection and Discovery. LNCS (LNAI), vol. 2447, p. 92. Springer, Heidelberg (2002)
  15. Murakami, K., Uno, T.: Efficient algorithms for dualizing large scale hypergraphs. Disc. Appl. Math. 170, 83–94 (2014)
  16. Ryssel, U., Distel, F., Borchmann, D.: Fast algorithms for implication bases and attribute exploration using proper premises. Ann. Math. Art. Intell. 70, 25–53 (2014)
  17. Spearman, C.: The proof and measurement of association between two things. Amer. J. Psychol. 15, 72–101 (1904)
  18. The Cancer Genome Atlas Research Network: The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013)
  19. R Core Team: R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria (2013). URL http://www.R-project.org/

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Kira Adaricheva (1, 2)
  • J. B. Nation (3)
  • Gordon Okimoto (4)
  • Vyacheslav Adarichev (1)
  • Adina Amanbekkyzy (1)
  • Shuchismita Sarkar (1)
  • Alibek Sailanbayev (1)
  • Nazar Seidalin (5)
  • Kenneth Alibek (6)

  1. School of Science and Technology, Nazarbayev University, Astana, Kazakhstan
  2. Yeshiva University, New York, USA
  3. University of Hawaii, Honolulu, USA
  4. University of Hawaii Cancer Center, Honolulu, USA
  5. Medical Holding, Astana, Kazakhstan
  6. Graduate School of Medicine, Nazarbayev University, Astana, Kazakhstan
