Abstract
We study the problem of predicting the places a user is likely to visit from their past tweets. What people write on their microblogs reflects their intents and desires across most of their everyday interests. Taking this as strong evidence, we hypothesize that a person's tweets can also serve as strong indicator signals for predicting their places of visit. In this paper, we propose a novel approach for predicting the place of visit within a given geospatial range, considering the past tweets and the time of visit. These predictions can be used to generate place recommendations or to target promotions. In this approach, we analyze the use of various features that can be extracted from historical tweets, for example, personality traits estimated from the past tweets and the actual words mentioned in the tweets. We performed extensive empirical experiments on real data derived from the Twitter timelines of 4600 persons, with multi-label classification as the predictive model. The proposed approach outperforms four baselines, with accuracy reaching 90% for the top five predictions. Based on our experimental study, we derive general guidelines for building the prediction model in terms of the type of features extracted from historical tweets, the window size of historical tweets, and the optimal radius of query around the place of visit at a given time.
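To make the "top five predictions" figure concrete, the following is a minimal sketch of one common way to score ranked multi-label predictions: a visit counts as a hit if any truly visited place label appears among the k highest-scored labels. The abstract does not define the accuracy measure, so this definition, the function name `top_k_accuracy`, and the example labels and scores are all assumptions made for illustration, not the authors' evaluation code.

```python
from typing import Dict, List, Set


def top_k_accuracy(scores: List[Dict[str, float]],
                   visited: List[Set[str]],
                   k: int = 5) -> float:
    """Fraction of visits for which at least one truly visited place
    label appears among the k highest-scored labels (assumed metric)."""
    if len(scores) != len(visited):
        raise ValueError("scores and visited must have the same length")
    if not scores:
        return 0.0
    hits = 0
    for label_scores, true_labels in zip(scores, visited):
        # Rank candidate labels by descending score and keep the first k.
        ranked = sorted(label_scores, key=label_scores.get, reverse=True)
        if true_labels & set(ranked[:k]):
            hits += 1
    return hits / len(scores)


# Hypothetical scores for two visits; labels and values are made up.
example_scores = [
    {"cafe": 0.9, "gym": 0.4, "museum": 0.1},
    {"cafe": 0.2, "gym": 0.3, "museum": 0.8},
]
example_visited = [{"gym"}, {"cafe"}]
print(top_k_accuracy(example_scores, example_visited, k=2))
```

With these made-up inputs, raising k from 2 to 3 turns the second visit into a hit, which illustrates why accuracy figures for ranked predictions are reported together with the cutoff k.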
Notes
A user timeline is the sequence of past tweets posted by the user on Twitter.
Additional information
This work was done during the first author's internship at IBM Watson Labs, Bangalore, from May 2014 to May 2015.
About this article
Cite this article
Chauhan, A., Kummamuru, K. & Toshniwal, D. Prediction of places of visit using tweets. Knowl Inf Syst 50, 145–166 (2017). https://doi.org/10.1007/s10115-016-0936-x
DOI: https://doi.org/10.1007/s10115-016-0936-x