Gaussian Logic for Predictive Classification

  • Ondřej Kuželka
  • Andrea Szabóová
  • Matěj Holec
  • Filip Železný
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6912)

Abstract

We describe a statistical relational learning framework called Gaussian Logic that works efficiently with combinations of relational and numerical data. The framework assumes that, for a fixed relational structure, the numerical data can be modelled by a multivariate normal distribution. We demonstrate how the Gaussian Logic framework can be applied to predictive classification problems. In experiments, we first apply the framework to the prediction of the DNA-binding propensity of proteins. Next, we show how the framework can be used to find motifs describing highly correlated gene groups in gene-expression data; these motifs are then used in a set-level-based classification method.
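The modelling assumption above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the relational structure has already been fixed and each example has been reduced to a numerical vector of fixed length. The function names (`fit_gaussian`, `log_density`, `classify`) and the `ridge` regularisation term are hypothetical choices made for this illustration only. One multivariate normal distribution is fitted per class, and a new vector is assigned to the class with the highest log-density.

```python
import numpy as np

def fit_gaussian(samples, ridge=1e-6):
    """Estimate the mean vector and covariance matrix from rows of samples.

    The ridge term is an assumption of this sketch; it keeps the
    covariance matrix invertible when few samples are available.
    """
    x = np.asarray(samples, dtype=float)
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + ridge * np.eye(x.shape[1])
    return mean, cov

def log_density(x, mean, cov):
    """Log-density of a multivariate normal distribution at point x."""
    d = np.asarray(x, dtype=float) - mean
    k = mean.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return -0.5 * (k * np.log(2 * np.pi) + logdet + maha)

def classify(x, models):
    """Return the label whose fitted Gaussian gives x the highest log-density."""
    return max(models, key=lambda label: log_density(x, *models[label]))
```

Here `models` is a dictionary mapping each class label to the `(mean, cov)` pair returned by `fit_gaussian` on that class's training vectors. Equal class priors are assumed; unequal priors would add a log-prior term to each score.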

Keywords

Statistical Relational Learning, Proteomics, Gene Expression

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ondřej Kuželka
    • 1
  • Andrea Szabóová
    • 1
  • Matěj Holec
    • 1
  • Filip Železný
    • 1
  1. Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
