Resolving Name Conflicts for Mobile Apps in Twitter Posts

  • Sangaralingam Kajanan
  • Ahmed Shafeeq Bin Mohd Shariff
  • Kaushik Dutta
  • Anindya Datta
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 389)


The Twitter platform has emerged as a leading medium for social commentary, where users remark upon all kinds of entities, events and occurrences. As a result, organizations are starting to mine Twitter posts to unearth the knowledge encoded in such commentary. Mobile applications, commonly known as mobile apps, are the fastest growing consumer product segment in the history of human merchandising, with over 600,000 apps on the Apple platform and over 350,000 on Android. A particularly interesting issue is evaluating the popularity of specific mobile apps by analyzing the social conversation about them. Twitter posts related to apps are an important segment of this conversation and have been a main area of research for us. In this respect, one particularly important problem arises from a name conflict between mobile app names and the names that are used to refer to the mobile apps in Twitter posts. In this paper, we present a strategy to reliably extract Twitter posts that are related to specific apps by discovering the contextual clues that enable effective filtering of irrelevant Twitter posts. While our application is in the important space of mobile apps, our techniques are completely general and may be applied to any entity class. We have evaluated our approach against a popular Bayesian classifier and a commercial solution, and have demonstrated that our approach is significantly more accurate than both. These results, as well as other theoretical and practical implications, are discussed.
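To make the idea of contextual clues concrete, the following is a minimal Python sketch, not the method evaluated in the paper. It assumes a single-word app name ("Pages", chosen only for illustration) and a hand-picked set of context terms; the names CONTEXT_TERMS, tokenize and is_relevant, and the min_hits threshold, are all hypothetical. A post is kept only if it mentions the app name and also shares at least one term with the context set.

```python
import re

# Hypothetical context terms for an app named "Pages". How such terms are
# discovered is the subject of the paper; here they are simply hand-picked.
CONTEXT_TERMS = {"app", "ipad", "iphone", "ios", "document", "download", "apple"}


def tokenize(text):
    """Lowercase a post and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_relevant(post, app_name, context_terms, min_hits=1):
    """Keep a post only if it mentions the (single-word) app name and
    shares at least min_hits distinct tokens with the context terms."""
    tokens = tokenize(post)
    if app_name.lower() not in tokens:
        return False
    hits = sum(1 for t in set(tokens) if t in context_terms)
    return hits >= min_hits


posts = [
    "Just wrote my report in Pages on the iPad, great app",
    "I read 50 pages of my novel last night",
]
relevant = [p for p in posts if is_relevant(p, "Pages", CONTEXT_TERMS)]
```

Both example posts contain the word "pages", but only the first also contains context terms ("ipad", "app"), so only the first is kept. A fixed keyword set like this is a deliberately simple stand-in for whatever clues a real system would learn.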


Keywords: Affinity, Microblogs, Twitter, Mobile Apps, Filter



Copyright information

© IFIP International Federation for Information Processing 2012

Authors and Affiliations

  • Sangaralingam Kajanan (1)
  • Ahmed Shafeeq Bin Mohd Shariff (1)
  • Kaushik Dutta (1)
  • Anindya Datta (1)
  1. School of Computing, National University of Singapore, Singapore
