How Data Mining is Used in Social Media. Key Performance Indicators’ Impact on Image Post Data Characteristics for Maximum User Engagement

Gkikas, Dimitris C.; Theodoridis, Prokopis K.

doi:10.1007/978-3-031-51038-0_50

Part of the book series: Springer Proceedings in Business and Economics (SPBE)

Included in the following conference series:

The International Conference on Strategic Innovative Marketing and Tourism

Abstract

Digital marketing strategy has become increasingly popular as a means of increasing social media users’ engagement, brand awareness, and revenues. The aim of this study is to measure the text characteristics of organic photo posts, namely text readability, number of hashtags, and number of characters. Using data mining classification models, the study examines whether these characteristics affect organic post user engagement for two metrics: lifetime post engaged users, and lifetime people who have liked a page and engaged with a post. Data, content characteristics, and performance metrics were extracted from a retail business page on a social media platform. The text characteristics (the post text readability score, the number of characters, and the number of hashtags) are the independent variables. Post performance was measured by seven performance metrics, which serve as the dependent variables. Finally, user engagement was calculated, and the classification of post performance was represented using decision tree (DT) graphs. The findings reveal how the content characteristics of post texts affect performance metrics, helping marketers to better shape their organic social media strategies, the company to increase impressions, reach, and revenues, and customers to comprehend the post message and engage with the brand.

1 Introduction

The rapid expansion of the internet has prompted companies to capitalize on social media marketing as a means to enhance profitability and brand recognition, affording marketers the opportunity to derive marketing insights from customer purchasing patterns [1]. Furthermore, it has been estimated that the COVID-19 pandemic accelerated the shift to e-commerce by approximately five years, with e-commerce growing by nearly 20% during 2020 [2]. A substantial body of research, as revealed through a comprehensive literature review, examines various metrics related to social media marketing, including impressions, reach, volume of specific keywords and sentiments, social and organic referrals, conversion rates, click-through rates, and user intentions to purchase. These analyses often employ statistical analysis, predictive analytics, text mining, association rules, and clustering techniques [3]. Empirical evidence suggests that user behavior on social media is influenced by multiple factors, such as the nature of the post, the industry sector, content type and meaning, and posting time and day [4]. Moreover, text readability is considered a significant factor affecting user engagement. Scholars contend that writing skills play a fundamental role in conveying thoughts effectively, surpassing the efficacy of oral expression [5]. To assess text readability, the Flesch-Kincaid readability test is used, which employs a formula to indicate the ease of reading material on a 100-point scale, with higher scores indicating better readability [6, 7]. The aim of this research is to examine one aspect of social media posts, namely how the characteristics of image posts affect user engagement, and thereby to give decision makers and marketers the opportunity to mitigate risk in their decisions.
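As a rough illustration of the readability formula mentioned above, the sketch below computes the Flesch Reading Ease score, the 100-point variant of the Flesch-Kincaid tests, as 206.835 − 1.015 × (words per sentence) − 84.6 × (syllables per word). The function names and the vowel-group syllable counter are assumptions made for this example. The syllable heuristic is crude and English-only, so its scores will differ from those of dedicated readability tools.

```python
import re


def count_syllables(word: str) -> int:
    """Estimate syllables as the number of vowel groups (crude, English-only)."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Discount a trailing silent "e" (as in "cake"), but keep "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Return the Flesch Reading Ease score; higher means easier to read."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Two invented sample texts: short common words versus long abstract ones.
easy = flesch_reading_ease("The cat sat on the mat.")
hard = flesch_reading_ease(
    "Comprehensive institutional considerations necessitate extraordinary deliberation."
)
print(round(easy, 1), round(hard, 1))
```

The formula itself is not clipped, so very simple text can score above 100 and very dense text below 0.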

2 Literature Review

Regarding readability, a recent study highlighted its impact on the popularity of social media messages for information sharing. The researchers found that users’ intention to engage with content could be determined by the initial vocabulary words of a post, rather than solely by the average length of 16-word phrases [1]. Other research revealed that “photo” and “link” posts generated less engagement than “status” and “video” posts. Informative posts received more “likes”, while competition posts had the lowest number of “likes” [3]. A similar study showed that “photo” posts were the most preferred by users. Additionally, posts made on workdays drove higher comment engagement rates, whereas posts published during peak times had the opposite effect [4]. One study applied regression to cosmetics data and revealed that five Facebook attributes, namely “lifetime post total reach,” “lifetime post total impressions,” “lifetime post consumers,” “lifetime post impressions by people who have liked your page,” and “lifetime people who have liked your page and engaged with your post,” significantly influenced the total interactions of a post, encompassing “comments,” “likes,” and “shares” [8]. Another study found that Facebook engagement metrics such as “comments,” “likes,” and “shares,” along with the type of content, the month, and the day of publication, had an impact on the “lifetime total organic reach” and “total page likes” of a company's page [9]. A similar study conducted a sensitivity analysis and identified content type, particularly “status” posts, as having a considerable impact on user engagement, surpassing the influence of other post types. The date of publication was also found to affect engagement [10].
Another study reported that “photo” content generated the highest level of user engagement compared to other post types such as “status,” “link,” or “video” [11]. A recent study found that “photo” posts were more engaging than “text” posts. Furthermore, engagement rates decreased when users encountered content related to discounts, contests, or offers [12]. Facebook has decreased impressions and reach for organic posts [13]. The authors have therefore focused on examining the impact of the text of organic “photo” posts on user engagement and on highlighting the most important research questions. Considering that posts containing 1–80 characters reportedly gain 86% more engagement, they aim to investigate the level of engagement elicited by the texts of “photo” posts, as assessed by their readability score [14]. Lastly, when users like a page on Facebook, they tend to receive more updates or posts from that page in their news feed. The current study seeks to discern the differences in the number of impressions and reach between total engaged users and those who liked a page and subsequently engaged with a post [15].

3 Research Methodology

3.1 Data Description

The data used in this study were extracted from the Facebook Page Insights of a women's fashion retail store based in Greece, which operates through both physical and online stores. The Facebook page has 1800 followers, with 1756 “total page likes,” 3690 “total posts likes,” 473 “average organic post reach,” 5985 “average paid post reach,” and 23 “average post reactions.” These figures summarize the reactions to the brand's business page posts. The data points were collected during the COVID-19 pandemic, from April 30, 2020, to October 25, 2020. A larger dataset would potentially increase, and a smaller one decrease, the accuracy of the final DT classification.

3.2 Performance Metrics

Facebook metrics can be obtained either through direct exports from Facebook Manager or by performing calculations based on various types of information, including identification, content, categorization, and performance [10]. The character number attribute is classified on the scale “0–80”, “81–160”, “161–240”, “241–320”, “321–400”, “401–480”, and “481–560”, while hashtags are assigned to separate classes based on their number of occurrences [14]. “Reach” denotes the total number of people who see a post, “Impressions” refers to the number of times the post is loaded on the news feed, regardless of whether it was viewed or clicked, and “Engagement” represents user actions on the post [14]. The Flesch–Kincaid Readability score (FKRS) assigns a text to one of the following readability ranges: less than 30 (very difficult to read), 30–49 (difficult to read), 50–59 (fairly difficult to read), 60–69 (easily understood), 70–79 (fairly easy to read), 80–89 (easy to read), and 90 or more (very easy to read) [16, 17]. WebFX was used to calculate the FKRS. Table 1 presents the identified performance metrics [18].
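The class assignments described above could be sketched as follows. This is an illustration only: the function names are hypothetical, the sixth character class is assumed to start at 401, hashtags are assumed to be a “#” followed by word characters, and the readability ranges are treated as half-open intervals so that fractional scores also receive a label.

```python
import re

# Upper bound of the widest character class named in the text.
MAX_CHARS = 560
CLASS_WIDTH = 80


def char_count_class(text: str) -> str:
    """Map a post text to its character-count class, e.g. "81-160"."""
    n = len(text)
    if n > MAX_CHARS:
        raise ValueError(f"text longer than {MAX_CHARS} characters has no class")
    if n <= CLASS_WIDTH:
        return "0-80"
    # Round the length up to the next multiple of the class width.
    upper = ((n - 1) // CLASS_WIDTH + 1) * CLASS_WIDTH
    return f"{upper - CLASS_WIDTH + 1}-{upper}"


def hashtag_count(text: str) -> int:
    """Count hashtags, taken here to be '#' followed by word characters."""
    return len(re.findall(r"#\w+", text))


# (exclusive upper bound, label) pairs for the readability ranges.
FKRS_BANDS = [
    (30, "very difficult"),
    (50, "difficult"),
    (60, "fairly difficult"),
    (70, "easily understood"),
    (80, "fairly easy"),
    (90, "easy"),
]


def fkrs_band(score: float) -> str:
    """Return the readability label for a score; 90 or more is "very easy"."""
    for upper, label in FKRS_BANDS:
        if score < upper:
            return label
    return "very easy"


# Invented sample post, used only to show the three attributes together.
post = "New arrivals in store now! #summer #sale"
print(char_count_class(post), hashtag_count(post), fkrs_band(45))
```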

Table 1 Performance metrics

Table 2 shows the post text attribute types, their value ranges, and their occurrences.

Table 2 Features labels and occurrences

3.3 Decision Trees Classification

The DT algorithm reads data, separates classes, and assigns values to each class following sequences of “if-then-else” rules. Datasets consist of attributes, each of which may have properties and multiple instances. DTs comprise nodes representing dataset attributes and branches representing attribute values. The initial node at the top represents a super-class, while the leaves represent sub-classes. To evaluate DT performance, datasets are divided into training, validation, and testing sets. The algorithm is trained on the largest set of examples (the training set) and generates a hypothesis; the percentage of correctly classified examples in the validation set is then calculated. The process is repeated while varying the size of the training set. The testing set is used to validate the outcome with entirely new data, and tree pruning techniques are applied to prevent overfitting [21, 22].
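The paragraph above does not state how a DT chooses the attribute value on which to branch. The J48 classifier named in the results section is WEKA's implementation of C4.5, which ranks candidate splits by gain ratio, so the following is a minimal sketch of that criterion for a single numeric threshold. The function names and the toy values are invented for illustration and are not taken from the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of splitting a numeric attribute at `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    # Information gain: entropy before the split minus weighted entropy after it.
    info_gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    # Split information penalizes very uneven partitions.
    split_info = -(w_left * log2(w_left) + w_right * log2(w_right))
    return info_gain / split_info


# Invented toy data: one numeric metric and a readability class per post.
metric = [100, 200, 300, 400, 500, 600]
classes = ["Very Difficult"] * 3 + ["Difficult"] * 3

for candidate in (100, 300, 500):
    print(candidate, round(gain_ratio(metric, classes, candidate), 3))
```

In this toy case the threshold that separates the two classes cleanly receives the highest gain ratio, which is the split a C4.5-style learner would prefer.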

4 Results and Discussion

Figures 1 and 2 demonstrate how readability affects Facebook performance metrics. As shown in Table 3, the DT for “lifetime post engaged users” (LPEU) has a classification accuracy of 71.9%, and the DT for “lifetime people who have liked the page and engaged with the post” (LPLPEP) has a classification accuracy of 74.1%. The pruned J48 DT classifier of WEKA 3 was used with sixfold cross-validation. Both DTs show that image post texts with low readability scores tend to achieve higher values of “lifetime total reach”, “lifetime total impressions”, “lifetime post engaged users”, and “total post user engagement” than post texts that are easier to read and comprehend. It appears that the Facebook algorithm purposely increases reach and impressions towards users as reading ease tends to increase. The “lifetime people who have liked your page and engaged with your post” metric exhibited a notably higher engagement rate than “lifetime post engaged users”. Individuals who have expressed their interest by liking or following a page are more likely to receive post updates in their news feed than those who have not. A correlation between the “impression” performance metrics and the performance of “photo” post text is also revealed (see Figs. 1 and 2). When “lifetime post total impressions” exceeds 723 and “lifetime post impressions by people who have liked the page” exceeds 452, the trees classify the text as “Difficult” rather than “Very Difficult” in 2 out of 3 instances where “impression” metrics are involved in the classification process. Figure 3 shows how the number of characters in the post text affects the key performance indicators, with a classification accuracy of 73%.
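The sixfold cross-validation behind these accuracy figures can be sketched roughly as follows. This is not the WEKA procedure itself: the function names are hypothetical, the folds are assigned round-robin without the shuffling and stratification a tool such as WEKA typically applies, and a majority-class baseline stands in for the J48 learner. The labels are invented toy values.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; fold i tests every k-th sample from i."""
    for fold in range(k):
        test_idx = [i for i in range(n_samples) if i % k == fold]
        train_idx = [i for i in range(n_samples) if i % k != fold]
        yield train_idx, test_idx


def cross_validated_accuracy(features, labels, k, fit, predict):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        correct += sum(predict(model, features[i]) == labels[i] for i in test_idx)
    return correct / len(labels)


# Stand-in learner (an assumption): always predict the most common training label.
def fit_majority(train_features, train_labels):
    return Counter(train_labels).most_common(1)[0][0]


def predict_majority(model, feature):
    return model


# Invented toy data: 12 posts, 9 of one class and 3 of another.
toy_features = list(range(12))
toy_labels = ["Very Difficult"] * 9 + ["Difficult"] * 3

accuracy = cross_validated_accuracy(
    toy_features, toy_labels, 6, fit_majority, predict_majority
)
print(f"pooled accuracy over 6 folds: {accuracy:.3f}")
```

Every instance is tested exactly once, by a model that never saw it during training, which is what makes the pooled accuracy an estimate of performance on new data.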

Fig. 1 Tree chart of FKRS classified instances for LPEU induced with DTs. LPEU branches to less than 41, LPEU, and greater than 41, very difficult. LPEU branches to less than 35, LPEU, and greater than 35, LPTR. Each has further branches.

Fig. 2 Tree diagram of FKRS classified instances for LPLPEP induced with DTs. LPIPLP has branches for less than or equal to 463, LPIPLP, and greater than 463, very difficult. The LPIPLP branches to less than or equal to 452, very difficult, and greater than 452, difficult.
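Read as “if-then-else” rules, the branches described for Fig. 2 could be written as the short function below. This is a sketch based only on that description: the function name is hypothetical, and LPIPLP is read here as “lifetime post impressions by people who have liked your page”, which is an assumption about the abbreviation.

```python
def classify_fkrs_fig2(lpiplp: float) -> str:
    """Return the readability class reached by following the Fig. 2 branches."""
    if lpiplp <= 463:
        # Second test on the same metric, as in the figure description.
        if lpiplp <= 452:
            return "Very Difficult"
        return "Difficult"
    return "Very Difficult"


# Invented impression counts, one for each of the three leaves.
for impressions in (400, 460, 500):
    print(impressions, classify_fkrs_fig2(impressions))
```

Under this reading, only posts whose impressions fall in the narrow band above 452 and up to 463 reach the “Difficult” leaf.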

Table 3 Decision trees classification summary for LPEU and LPLPEP

Fig. 3 Tree chart of character number classified instances for LPIPLP induced with DTs. LPIPLP mainly branches to less than or equal to 620, and greater than 620 with LPRPLP. LPRPLP branches to less than or equal to 738, LPLPEP, and less than 738, LPTR. Each has further branches.

5 Conclusion and Contribution

The objective of this study is to assist brands in enhancing user engagement for organic posts within the fashion domain. This research provides decision-makers with pertinent information to optimize their social media photo post texts, thereby maximizing user engagement and profits, despite the limited size of the dataset. Employing DT classification on fashion “photo” post data, the research reveals that the readability score of “photo” post text does not significantly influence “lifetime post engaged users” and overall performance metrics, although there is a noteworthy tendency for it to affect “impression” performance metrics. The instances with the largest numbers of “lifetime post engaged users” and “lifetime post total reach” are associated with “Difficult” and “Very Difficult” texts, indicating that post text readability might not be a substantial performance factor. It is crucial to acknowledge the limitations in the size and origin of the data and in the social media platforms included; the current dataset remains relatively small compared to other research attempts, indicating the need for further work. While aiming to provide suggestions, the authors emphasize the importance of acknowledging human nature and potential indifference, and refrain from drawing rules, as doing so would be deemed unethical.

References

1. Pancer E, Chandler V, Poole M, Noseworthy TJ (2019) How readability shapes social media engagement. Journal of consumer psychology, 29, 262–270. https://doi.org/10.1002/jcpy.1073
2. World Economic Forum. https://www.weforum.org/agenda/2020/08/covid19-pandemic-social-shift-ecommerce-report.
3. Cvijikj PI, Spiegler ED, Michahelles F (2011) The effect of post type, category and posting day on user interaction level on Facebook. IEEE Third International Conference on Privacy, Security, Risk and Trust and IEEE Third International Conference on Social Computing, pp. 810–813. Boston, MA. https://doi.org/10.1109/PASSAT/SocialCom.2011.21
4. Cvijikj PI, Michahelles F (2013) Online engagement factors on Facebook brand pages. Social network analysis and mining 3, 843–861. https://doi.org/10.1007/s13278-013-0098-8
5. Applebee AN (1984) Writing and reasoning. Review of educational research, 54(4), 577–596
6. Kincaid J, Fishburne RP, Rogers RL, Chissom BS (1975) Derivation of new readability formulas (automated readability index, fog count and Flesch reading ease formula) for Navy enlisted personnel
7. Dubay WH (2004) The principles of readability. Costa Mesa, CA
8. Mittal R (2020) Identification of salient attributes in social network: A data mining approach. In: Batra, U., Roy, N., Panda, B. (eds) Data Science And Analytics. Redset 2019. Communications In Computer and Information Science, vol 1230. Springer, Singapore. https://doi.org/10.1007/978-981-15-5830-6_16
9. Huang JP, Sembirin G, Riorini SV, Wang PC (2018) Leveraging social media metrics in improving social media performances through organic reach: A data mining approach. Review of economic and business studies 11, 33–48
10. Meedanphai S, Jayasurya T, Swapna (2023) Factors affecting consumers buying decision behavior via online media. World Journal of Advanced Research and Reviews, 18(3), 1253–1259. https://doi.org/10.30574/wjarr.2023.18.3.1012
11. Fachmi M, Setiawan IP, Hidayat A (2019) Analysis of factors affecting consumer purchase decision at online shops. https://doi.org/10.17605/OSF.IO/WV7MU
12. Davidaviciene V, Davidavičius S, Tamosiuniene R (2019) B2C marketing communication in social media: Fashion industry specifics. In: 2019 International Conference on Creative Business for Smart And Sustainable Growth, Crebus, pp. 1–4. Sandanski, Bulgaria. https://doi.org/10.1109/CREBUS.2019.8840067
13. Bernazzani S (2023) The decline of organic Facebook reach & how to adjust to the algorithm. https://blog.hubspot.com/marketing/facebook-organic-reach-declining.
14. Hootsuite. https://blog.hootsuite.com/ideal-social-media-post-length/#Facebook.
15. Gleam. https://gleam.io/blog/facebook-posts/.
16. Wikipedia. https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests.
17. Eleyan D, Othman A, Eleyan A (2020) Enhancing software comments readability using Flesch reading ease score. Information, 11(9), 430. https://doi.org/10.3390/info11090430
18. Social Media Examiner. https://www.socialmediaexaminer.com/facebook-page-metrics.
19. WebFX. https://www.webfx.com/tools/read-able/, last accessed 2023/07/01.
20. Weka. https://www.cs.waikato.ac.nz/ml/weka/downloading.htm, last accessed 2023/06/25.
21. Mitchell T (1997) Machine learning. McGraw Hill, New York
22. Kohavi R (1995) The power of decision tables. In Lavrac, N., Wrobel, S. (eds). Machine Learning, Ecml-95, Lecture Notes in Computer Science (Lecture Notes in Artificial Intelligence), vol. 912, pp. 174–189. https://doi.org/10.1007/3-540-59286-5_57

Author information

Authors and Affiliations

University of Patras, Gr 26504, Mesolonghi, Greece
Dimitris C. Gkikas
Hellenic Open University, Patras Campus, 18 Aristotelous Str., 26335, Patras, Greece
Prokopis K. Theodoridis

Corresponding author

Correspondence to Dimitris C. Gkikas.

Editor information

Editors and Affiliations

University of West Attica, Athens, Greece
Androniki Kavoura
University of the Azores, Ponta Delgada, Portugal
Teresa Borges-Tiago
University of the Azores, Ponta Delgada, Portugal
Flavio Tiago

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Gkikas, D.C., Theodoridis, P.K. (2024). How Data Mining is Used in Social Media. Key Performance Indicators’ Impact on Image Post Data Characteristics for Maximum User Engagement. In: Kavoura, A., Borges-Tiago, T., Tiago, F. (eds) Strategic Innovative Marketing and Tourism. ICSIMAT 2023. Springer Proceedings in Business and Economics. Springer, Cham. https://doi.org/10.1007/978-3-031-51038-0_50

DOI: https://doi.org/10.1007/978-3-031-51038-0_50
Published: 01 June 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-51037-3
Online ISBN: 978-3-031-51038-0
eBook Packages: Business and Management, Business and Management (R0)
