Online Lattice-Based Abstraction of User Groups

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10438)

Abstract

User data is becoming increasingly available in various domains, from the social Web to location check-ins and smartphone usage traces. Due to the sparsity and impurity of user data, we propose to analyze labeled groups of users instead of individuals, e.g., “countryside teachers who watch Woody Allen movies.” When chosen appropriately, labeled groups provide quick and useful insights on user data. However, analyzing user groups is often non-trivial due to their huge volume. In this paper, we introduce AugMan, a framework for the efficient summarization of user groups via abstraction. Our framework performs a dynamic, data-driven, and lossless abstraction that helps analysts obtain high-quality insights on user data without being overwhelmed. Our experiments show that AugMan offers representative and informative abstractions in a scalable fashion.

Notes

  1. The dataset is publicly available at https://goo.gl/ZQ6doV.

Author information

Corresponding author

Correspondence to Behrooz Omidvar-Tehrani.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Omidvar-Tehrani, B., Amer-Yahia, S. (2017). Online Lattice-Based Abstraction of User Groups. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10438. Springer, Cham. https://doi.org/10.1007/978-3-319-64468-4_7

  • DOI: https://doi.org/10.1007/978-3-319-64468-4_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64467-7

  • Online ISBN: 978-3-319-64468-4

  • eBook Packages: Computer Science, Computer Science (R0)
