Finding Total and Partial Orders from Data for Seriation

  • Heikki Mannila
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5255)

Abstract

Ordering and ranking items of different types (observations, web pages, etc.) are important tasks in various applications, such as query processing and scientific data mining. We consider different problems of inferring total or partial orders from data, with special emphasis on applications to the seriation problem in paleontology. Seriation can be viewed as the task of ordering rows of a 0-1 matrix so that certain conditions hold. We review different approaches to this task, including spectral ordering methods, techniques for finding partial orders, and probabilistic models using MCMC methods.

Joint work with Antti Ukkonen, Aris Gionis, Mikael Fortelius, Kai Puolamäki, and Jukka Jernvall.
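To make the matrix view of seriation concrete, here is a minimal sketch of one spectral ordering approach, not necessarily the exact procedure discussed in the talk. It assumes that row similarity is measured by the number of shared ones (the product of the matrix with its transpose) and orders rows by the Fiedler vector of the resulting graph Laplacian. The names `spectral_order` and `has_consecutive_ones` are hypothetical helpers introduced for illustration, and the "certain conditions" are taken here to be the consecutive ones property in each column.

```python
import numpy as np


def spectral_order(A):
    """Return a row ordering of a 0-1 matrix from the Fiedler vector.

    Assumption: similarity between rows is the number of columns in which
    both rows contain a one, i.e. S = A A^T.
    """
    A = np.asarray(A, dtype=float)
    S = A @ A.T                              # row-by-row co-occurrence counts
    # Graph Laplacian; the diagonal of S cancels out in D - S.
    L = np.diag(S.sum(axis=1)) - S
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(L)
    fiedler = eigvecs[:, 1]                  # eigenvector of 2nd smallest eigenvalue
    return np.argsort(fiedler)


def has_consecutive_ones(A):
    """Check that the ones in every column form one contiguous block of rows."""
    A = np.asarray(A)
    for col in A.T:
        ones = np.flatnonzero(col)
        if ones.size and ones[-1] - ones[0] + 1 != ones.size:
            return False
    return True


if __name__ == "__main__":
    # Toy data: rows could be sites, columns taxa; each taxon spans two
    # adjacent sites in the (unknown) true order.
    A = np.array([[1, 0, 0, 0],
                  [1, 1, 0, 0],
                  [0, 1, 1, 0],
                  [0, 0, 1, 1],
                  [0, 0, 0, 1]])
    shuffled = A[[2, 0, 4, 1, 3]]            # hide the true order
    order = spectral_order(shuffled)
    print(shuffled[order])
    print(has_consecutive_ones(shuffled[order]))
```

The sign of an eigenvector is arbitrary, so the recovered order is only determined up to reversal. On noisy real data the consecutive ones property generally cannot be satisfied exactly, and the spectral order is best read as a heuristic starting point.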

Copyright information

© Springer Berlin Heidelberg 2008

Authors and Affiliations

  • Heikki Mannila: HIIT, Helsinki University of Technology and University of Helsinki, Finland
