Abstract
Twitter has emerged as a leading medium for social commentary, where users remark on all kinds of entities, events and occurrences. As a result, organizations are starting to mine Twitter posts to unearth the knowledge encoded in such commentary. Mobile applications, commonly known as mobile apps, are the fastest growing consumer product segment in the history of human merchandizing, with over 600,000 apps on the Apple platform and over 350,000 on Android. A particularly interesting problem is to evaluate the popularity of specific mobile apps by analyzing the social conversation about them. Twitter posts related to apps are an important segment of this conversation and have been a main area of our research. In this respect, one particularly important problem arises from the conflict between mobile app names and the names used to refer to those apps in Twitter posts. In this paper, we present a strategy to reliably extract Twitter posts that are related to specific apps by discovering contextual clues that enable effective filtering of irrelevant posts. While our application is in the important space of mobile apps, our techniques are completely general and may be applied to any entity class. We evaluated our approach against a popular Bayesian classifier and a commercial solution, and demonstrated that it is significantly more accurate than both. These results, as well as other theoretical and practical implications, are discussed.
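To make the name-conflict problem concrete, the following is a minimal sketch of context-based filtering. It is not the authors' method, which the abstract does not specify. The clue list, the function names, and the `min_hits` threshold are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical context clues. In practice these would be discovered from data;
# this hand-written set is an assumption made for the sake of the example.
CONTEXT_CLUES = {"app", "download", "install", "iphone", "ipad",
                 "android", "level", "game", "play"}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_app_related(tweet, app_name, clues=CONTEXT_CLUES, min_hits=1):
    """Return True if the tweet mentions the app name and also contains
    at least `min_hits` context clues outside the name itself.

    Simplification: the name check only requires every word of the app
    name to occur somewhere in the tweet, not necessarily adjacently.
    """
    tokens = tokenize(tweet)
    name_tokens = tokenize(app_name)
    if not name_tokens <= tokens:
        return False  # the app name is not mentioned at all
    hits = len((tokens - name_tokens) & clues)
    return hits >= min_hits


tweets = [
    "Just hit level 10 in Angry Birds on my iPhone",
    "Angry birds attacked my picnic today",
]
relevant = [t for t in tweets if is_app_related(t, "Angry Birds")]
```

Here the first tweet is kept because it contains the clues "level" and "iphone", while the second is dropped because it mentions the name with no supporting context. A fixed keyword list of this kind is only a baseline for illustrating the idea of contextual filtering.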
Copyright information
© 2012 IFIP International Federation for Information Processing
About this paper
Cite this paper
Kajanan, S., Bin Mohd Shariff, A.S., Dutta, K., Datta, A. (2012). Resolving Name Conflicts for Mobile Apps in Twitter Posts. In: Bhattacherjee, A., Fitzgerald, B. (eds) Shaping the Future of ICT Research. Methods and Approaches. IFIP Advances in Information and Communication Technology, vol 389. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35142-6_1
DOI: https://doi.org/10.1007/978-3-642-35142-6_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35141-9
Online ISBN: 978-3-642-35142-6
eBook Packages: Computer Science (R0)