Overview of the INEX 2009 XML Mining Track: Clustering and Classification of XML Documents

  • Richi Nayak
  • Christopher M. De Vries
  • Sangeetha Kutty
  • Shlomo Geva
  • Ludovic Denoyer
  • Patrick Gallinari
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6203)

Abstract

This report explains the objectives, datasets and evaluation criteria of both the clustering and classification tasks set in the INEX 2009 XML Mining track. The report also describes the approaches and results obtained by the different participants.

Keywords

XML document mining; INEX; Wikipedia; Structure and content; Clustering; Classification

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Richi Nayak (1)
  • Christopher M. De Vries (1)
  • Sangeetha Kutty (1)
  • Shlomo Geva (1)
  • Ludovic Denoyer (2)
  • Patrick Gallinari (2)
  1. Faculty of Science and Technology, Queensland University of Technology, Brisbane, Australia
  2. University Pierre et Marie Curie, Paris, France