Customer segmentation using online platforms: isolating behavioral and demographic segments for persona creation via aggregated user data

Abstract

We propose a novel approach for isolating customer segments from online customer data for products distributed via online social media platforms. We use non-negative matrix factorization first to identify behavioral customer segments and then to identify demographic customer segments. We then link the two sets of segments to present integrated, holistic customer segments, also known as personas. Behavioral segments are generated from customer interactions with online content. Demographic segments are generated from the gender, age, and location of these customers. In addition to evaluating our approach, we demonstrate its practicality via a system that uses these customer segments to automatically generate personas, which are fictional but accurate representations of each integrated behavioral and demographic segment. Results show that this approach can accurately identify both behavioral and demographic customer segments from actual online customer data, from which we can generate personas representing real groups of people.
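As a rough illustration of the factorization step described in the abstract, the sketch below applies non-negative matrix factorization with multiplicative updates to a small made-up interaction matrix. Everything in it is an assumption for illustration only: the matrix layout (rows as demographic groups, columns as content items), the group labels, the view counts, the choice of two patterns, and the rule for linking a pattern to a group are hypothetical and are not taken from the paper's data or implementation.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=500, seed=0, eps=1e-9):
    """Approximate non-negative v (n x m) as w (n x rank) times h (rank x m).

    Uses multiplicative updates for the squared reconstruction error.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


def squared_error(v, w, h):
    """Sum of squared differences between v and its reconstruction w * h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))


# Hypothetical toy data: view counts per demographic group (rows)
# for four content items (columns). Labels and numbers are invented.
groups = ["F 18-24 US", "F 25-34 US", "M 25-34 QA", "M 35-44 QA"]
items = ["video A", "video B", "video C", "video D"]
v = [
    [90, 80, 5, 0],
    [70, 95, 0, 10],
    [5, 0, 85, 60],
    [0, 8, 75, 90],
]

w, h = nmf(v, rank=2)

# Each row of h is a behavioral pattern over content items; each column of w
# says how strongly each demographic group expresses that pattern.
for k in range(len(h)):
    top_item = max(range(len(items)), key=lambda j: h[k][j])
    top_group = max(range(len(groups)), key=lambda i: w[i][k])
    print(f"pattern {k}: top item = {items[top_item]}, top group = {groups[top_group]}")
```

In this sketch, linking a behavioral pattern to the demographic group with the largest weight is only one simple way to join the two views; the factor scales are not unique, so weights are best compared within a single pattern rather than across patterns.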

Notes

  1. https://www.youtube.com/channel/UCV3Nm3T-XAgVhKH9jT0ViRg.

  2. https://developers.google.com/youtube/analytics/.

Acknowledgements

We thank the many journalists at Al Jazeera News Media Network for their collaboration in this research.

Author information

Corresponding author

Correspondence to Jisun An.

About this article

Cite this article

An, J., Kwak, H., Jung, Sg. et al. Customer segmentation using online platforms: isolating behavioral and demographic segments for persona creation via aggregated user data. Soc. Netw. Anal. Min. 8, 54 (2018). https://doi.org/10.1007/s13278-018-0531-0

  • DOI: https://doi.org/10.1007/s13278-018-0531-0

Keywords

  • Web analytics
  • Social computing
  • Personas
  • Marketing
  • System design
  • Customer segmentation