Online Lattice-Based Abstraction of User Groups

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10438)

Abstract

User data is becoming increasingly available in various domains, from the social Web to location check-ins and smartphone usage traces. Due to the sparsity and impurity of user data, we propose to analyze labeled groups of users instead of individuals, e.g., “countryside teachers who watch Woody Allen movies.” When chosen appropriately, labeled groups provide quick and useful insights on user data. However, analysis of user groups is often non-trivial due to their huge volume. In this paper, we introduce AugMan, a framework for the efficient summarization of user groups via abstraction. Our framework performs a dynamic, data-driven, and lossless abstraction that helps analysts obtain high-quality insights on user data without being overwhelmed. Our experiments show that AugMan offers representative and informative abstractions in a scalable fashion.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. The Ohio State University, Columbus, USA
  2. CNRS, Paris, France