Faceted Browsing over Social Media

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7678)


The popularity of social media as a medium for sharing information has made extracting information of interest a challenge. In this work we provide a system that returns social media posts covering various aspects of a searched concept. We present a faceted model for navigating social media that provides a consistent, usable and domain-agnostic method for extracting information. We present a set of domain-independent facets and empirically demonstrate the feasibility of mapping social media content to them. Next, we show how to map social media sites, living documents that change periodically, to topics that capture the semantics expressed in them. This mapping is represented as a graph and used to compute the facets of interest. With it, we learn a profile of the content creator, map content to semantic concepts for easy navigation, and detect similarity among sites, either to suggest similar pages or to identify pages that express different views.


Social Medium, Latent Dirichlet Allocation, Inverse Document Frequency, Word Cloud, Topic Extraction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. EMC India COE, Bangalore, India
  2. IBM Research Lab, New Delhi, India
  3. Computer Science & Engg, SCIDSE, Arizona State University, USA
