Learning to Order: A Relational Approach

  • Conference paper
Mining Complex Data (MCD 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4944)

Abstract

In some applications it is necessary to sort a set of elements according to an order relationship which is not known a priori. In these cases, a training set of ordered elements is often available, from which the order relationship can be automatically learned. In this work, it is assumed that the correct succession of elements in a training sequence (or chain) is given, so that it is possible to induce the definition of two predicates, first/1 and succ/2, which are then used to establish an ordering relationship. A peculiarity of this work is the relational representation of training data which allows various relationships between ordered elements to be expressed in addition to the ordering relationship. Therefore, an ILP learning algorithm is applied to induce the definitions of the two predicates. Two methods are reported for the identification of either single chains or multiple chains on new objects. They have been applied to the problem of learning the reading order of layout components extracted from document images. Experimental results show the effectiveness of the proposed solution.
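As a rough illustration of how definitions of first/1 and succ/2 could be used to order new objects, the following minimal sketch rebuilds a single chain by starting at the element that satisfies first and repeatedly following succ. It is not the authors' method: the abstract does not describe the two chain-identification procedures, and the function `build_chain`, the callables `first` and `succ`, and the toy layout components are all hypothetical names introduced here. In the paper the predicates are induced by an ILP learner; here they are simply hand-written lookups.

```python
from typing import Callable, Hashable, Iterable, List


def build_chain(
    items: Iterable[Hashable],
    first: Callable[[Hashable], bool],
    succ: Callable[[Hashable, Hashable], bool],
) -> List[Hashable]:
    """Greedily rebuild one chain from a first/1-like and a succ/2-like predicate.

    Assumption (not from the paper): exactly one item satisfies `first`,
    and ties between several possible successors are broken by input order.
    """
    pool = list(items)
    starts = [x for x in pool if first(x)]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting element")

    chain = [starts[0]]
    remaining = [x for x in pool if x != starts[0]]

    # Follow succ from the current tail until no unused successor is left.
    while remaining:
        candidates = [y for y in remaining if succ(chain[-1], y)]
        if not candidates:
            break  # the chain ends here; leftover items stay unordered
        chain.append(candidates[0])
        remaining.remove(candidates[0])
    return chain


# Toy stand-ins for learned predicates over layout components (hypothetical data).
FIRST_FACTS = {"title"}
SUCC_FACTS = {("title", "abstract"), ("abstract", "body")}

reading_order = build_chain(
    ["body", "title", "abstract"],
    first=lambda x: x in FIRST_FACTS,
    succ=lambda a, b: (a, b) in SUCC_FACTS,
)
print(reading_order)  # ['title', 'abstract', 'body']
```

The greedy walk stops as soon as no successor is found, so items that the predicates do not connect are simply left out of the chain; handling several chains on the same set of objects would need a different strategy.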

Editor information

Zbigniew W. Raś, Shusaku Tsumoto, Djamel Zighed

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Malerba, D., Ceci, M. (2008). Learning to Order: A Relational Approach. In: Raś, Z.W., Tsumoto, S., Zighed, D. (eds) Mining Complex Data. MCD 2007. Lecture Notes in Computer Science, vol 4944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68416-9_17

  • DOI: https://doi.org/10.1007/978-3-540-68416-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68415-2

  • Online ISBN: 978-3-540-68416-9

  • eBook Packages: Computer Science, Computer Science (R0)
