1 Introduction

The proliferation of social media platforms has led to continuous changes in how entrepreneurs carry out their day-to-day activities (Fan et al., 2021; Olanrewaju et al., 2020). Entrepreneurs use social media for diverse purposes and may expect different outcomes (Olanrewaju et al., 2020). Indeed, on social media, entrepreneurs can collect various types of data about customers’ needs and market potential, communicate with their existing and potential customers in new ways through messages, and build relationships with relevant stakeholders.

The most recent study by Olanrewaju et al. (2020) conducted a systematic literature review in the domain of social media and entrepreneurship. The results suggest that research in this domain is remarkably new and fragmented. Moreover, the reviewed literature on social media usage in entrepreneurship covers four areas: marketing, information search, business networking, and crowdfunding. Specifically, the marketing field is the most developed regarding artificial intelligence (AI) issues and discussions (e.g. AI personalised recommendations) (Loureiro et al. 2021). Despite that, less attention has been devoted to social media marketing in combination with machine learning and/or AI, and only a few empirical studies exist. For instance, Capatina et al. (2020) explored the perceptions of 150 marketing experts from three countries (Italy, France, Romania) on three single antecedents (i.e. audience, image, and sentiment analysis) regarding AI-based software for social media marketing. Yet empirical research can unlock the full potential of social media and digital records for entrepreneurship research (Obschonka and Audretsch, 2020; Kosinski et al. 2013). Nevertheless, the causal relationship between company/brand content and customer engagement on social media has not been explored, particularly from a machine learning perspective.

As social media usage is increasing among both businesses and customers, successful social media implementation initiatives are a priority for businesses. Previous studies offered frameworks that explain the adoption and use of social media by entrepreneurs, covering two perspectives: customer-oriented adoption and entrepreneur-oriented adoption (Olanrewaju et al., 2020). The customer-oriented adoption framework treats customer engagement as the focus of social media use (ibid.). The entrepreneur-oriented adoption framework centres on how to implement social media within the business (Olanrewaju et al., 2020). The current research seeks to contribute to theory and practice in the domain of social media implementation by brands/companies from a customer-centric perspective.

The concept of customer engagement behaviour (CEB) has been widely analysed in academic literature (Beckers et al. 2017; Hollebeek and Andreassen 2018; Yang et al. 2016). Following the most recent suggestions by Harmeling et al. (2017) and Obilo et al. (2020), this research uses the behavioural manifestation of CEB, which is defined as ‘the customer’s behavioural manifestations toward a brand or firm, beyond purchase, resulting from motivational drivers’ (van Doorn et al. 2010, p. 253).

CEBs on social media can be encouraged with various features of company/brand posts. Hence, previous studies of CEBs on Facebook have investigated several features of company messages, namely, content, emotional characteristics, and media types (e.g. video, image, and links). For instance, Leung et al. (2017) investigated four media types in hotel brands’ posts (i.e. video, image, link, word) and six types of post content (i.e. promotion, product, reward, brand, information, and involvement). More recently, social media platforms have added 3-D and carousel images, live videos, and interactive polls. Therefore, a more granular level of analysis, including content analysis of text and images, is needed.

Regarding CEB, the latest developments of Facebook support five consumer emoji reactions: love, ha ha, wow, sad, and angry. Meanwhile, previous studies have analysed the impact of different features of posts on only three customer responses (i.e. likes, comments, and shares) on Facebook (Labrecque and Swani 2017; Leung et al. 2017). The full spectrum of emotional reactions was not included. Moreover, all these customer emoji reactions might act as a catalyst for other customers’ behaviours as well.

Companies’ messages and CEBs can be tracked and analysed through text-based sentiment analysis, which offers a more granular level of analysis. Indeed, recent academic studies have combined lexicon-based (automatic) and machine learning-based approaches to sentiment analysis of customer comments from 83 Facebook brand pages (Dhaoui et al. 2017). Despite several attempts to apply text-based sentiment analysis and machine learning approaches, the prediction of post popularity on Facebook has attracted limited attention in academic literature. Meanwhile, a machine learning approach provides a more nuanced and robust understanding of the practices of a company’s/brand’s messages on social media platforms and might enhance the field’s methodological development. Therefore, the fundamental question remains regarding how to predict CEBs on Facebook based on the features of a company’s posts (e.g. content types, media types, emotional cues). To address this research question, this research seeks to predict CEBs (likes, shares, comments, and emoji reactions) on Facebook based on the features of company/brand posts. To this end, a Random Forest (RF) method was applied.

This chapter includes a review of the relevant literature on CEBs on social media platforms and, thus, integrates a behavioural approach to CEBs. Notably, this chapter seeks to advance the academic discussion about the power of machine learning for predicting CEBs on Facebook. Machine learning advances social media research (Khan and Chang, 2019) and enables entrepreneurs to build a personal or company brand on social media. Therefore, this research contributes to the growing body of literature on the features of a company’s posts and CEBs on social media. Regarding the various features of brand posts, this research is based on two widely used theories: uses and gratifications, and media richness. Furthermore, from a theoretical perspective, the current research expands existing views of content types by distinguishing between single content and blended content types and providing empirical evidence for their effect on CEBs. Hence, the research contributes to the uses and gratifications theory by proposing a list of various content types of brand posts that satisfy users’/customers’ specific needs based on their behavioural responses regarding the number of likes, shares, comments, and emotional reactions on Facebook.

Using a machine learning perspective, the chapter offers a novel research approach to the social media marketing literature and several contributions for both academics and practitioners. Firstly, this research investigates the relationship between various features of company messages and CEBs and, thus, provides a prediction model of customer responses (e.g. likes, comments, emoji reactions) to various features of company messages on Facebook. Hence, it offers a deeper understanding of which post features are the most effective for successful CEBs on Facebook. Secondly, the enhanced list of company post features enables social media practitioners to rethink and improve their current social media marketing strategies. Finally, the proposed prediction model can serve as a foundation that can be developed further with additional components and is suitable for AI-enabled business applications on social media.

The remainder of this chapter is organised as follows: firstly, it provides the theoretical background encompassing the features of brand posts, the conceptualisation of CEB on Facebook, and the development of a conceptual framework; secondly, it presents the methodology; and thirdly, it provides the results. Finally, the conclusions and discussion are provided.

2 Theoretical Background

Traditionally, companies seek to capture customers’ attention and stimulate them to react to their content on social media platforms. These customer actions might encourage other customers to respond, and the message can reach a huge audience organically without any additional costs (e.g. paid post nature). Thus, company/brand sales posts may provide the best deals and immediately attract customers to buy their products/services from their websites or order products through private messages. Moreover, after the post-purchase phase, a happy customer can express his or her opinion about the product/service (e.g. rate products/services with stars on the Facebook page), write a positive message to a company/brand privately, or even create content and tag the company’s/brand’s page on social media platforms. As a result, both sides – either company/brand or customer – can initiate this communication.

How to implement social media within the business is the central concern of the entrepreneur-oriented adoption framework (Olanrewaju et al. 2020). The implementation of social media can involve several actions such as setting up a company/brand page on social media, creating and constantly developing strategies for social media activities (e.g. product/service brand awareness, sales), and publishing relevant content. Indeed, the latter action requires consistent and persistent support on social media platforms and efforts to discover real value for customers.

The effective social media implementation can lead to tangible and intangible benefits for the business. For instance, Aulet (2013) has highlighted that if the company can focus primarily on creating demand, then various web-based techniques such as e-mail, inbound marketing, telemarketing, and social media marketing help lessen the need for direct salespeople. Moreover, companies enhance their own performance if they have an active presence on social media (Kumar et al. 2016; Tafesse and Wien 2018; Yoon et al. 2018). Another great benefit for companies is the extensive analytics about customers, which are not possible through the human channel (ibid.). Meanwhile, the company’s/brand’s social media implementation starts within a clear social media strategy, platform, and selected features of business/brands posts that keep the customer engaged.

2.1 Features of Brand Posts: Content Types, Media Types, and Time Frame

CEB might be influenced by various features of brand posts on Facebook. Accordingly, a large body of research has investigated the relationship between users’ usage of media content and motivation on social media, and there are several dominant perspectives. The uses and gratifications approach by Katz (1959) explains customers’ social media use motivations (Li et al. 2021). The ‘use’ element follows the assumption that a media message cannot influence users who have no ‘use’ for it (Katz, 1959). Indeed, the ‘use’ approach aligns with the user’s values, interests, and associations, and the social roles that have a greater influence on them than without it (ibid.). At the same time, ‘gratification’ holds that media users need to achieve gratification.

Social media as a medium should satisfy user gratifications similar to those that traditional mass media does (Pujadas-Hostench et al. 2019). Social media empowers users to consume different content and to socialise with each other, which, in turn, influences their behaviour. Therefore, social media users can have diverse gratifications (e.g. entertainment, information). Within the context of social media, previous studies classified brand content into two groups: informative and entertainment. These two groups correspond to two consumer motivations: information and entertainment, respectively. While information motivation has four sub-motivations, namely expertise, surveillance, pre-buying information, and inspiration (Muntinga et al. 2011), information content might also include a remuneration type of content (i.e. special rewards). Additionally, the supporting literature that applied this theory includes empirical studies by Muntinga et al. (2011), Dolan et al. (2016), Annamalai et al. (2021), and Mishra (2021).

Meanwhile, a brand post can be accompanied with a diverse media type and encompass information with various degrees of media richness (e.g. photo). For instance, video and photo posts are considered richer than text posts. Thus, media might differ in its capacity to possess rich information which can be explained by the theory of media richness (Daft and Lengel, 1986). Ishii et al. (2019) believe that the media richness theory will remain as ‘the landmark foundation of studies on continuously evolving communication technology and media use behaviour’ (p.129). Indeed, the theory is widely used by social media researchers describing the media type of brand posts, which are commonly defined as the vividness of posts (see Cvijikj and Michahelles 2013; Luarn et al. 2015; Annamalai et al. 2021). Meanwhile, brand posts with videos are engaging and immersive and present a high level of vividness compared to images with a low level of vividness.

The academic literature provides a classification of features of brand posts that involves three major categories: time frame, content types, and media types. The last two broad categories of post features can be divided into subcategories. For instance, brand posts can cover eight content types: informational, entertainment, social, promotions, social responsibility initiatives, user-generated content and reposts (e.g. influencer), educational, and job offer(s). User-generated content cannot be classified as a post created by the company itself, but the company/brand can post and/or repost it on social media platforms. Additionally, all these content types can be blended and constitute a single post with mixed content types. In a similar vein, media types of brand posts may involve images, video, and links. Moreover, the text of a brand post can be accompanied by various emotional cues (e.g. emoji, emoticons). The discussion about single content and mixed content types is provided below.

2.1.1 Time Frame of Brand Posts

Theoretically, the time frame (i.e. publishing time) represents the day of the week and the time of day (Cvijikj and Michahelles 2013; Sabate et al. 2014). Posts can currently be published at an exact time either ‘manually’ or ‘automatically’ by using special scheduling platforms (e.g. Later). Indeed, the appropriate time for publishing is expected to create better possibilities for organic reach. For instance, late evening is a good choice for companies/brands to attract the attention of young parents when the children are sleeping.

2.1.2 Content Types of Brand Posts

Informational content posts involve information about the company, brand, products/services, or other information related to marketing activities (De Vries et al. 2012; Luarn et al. 2015). For instance, a clothing brand can post an informational post about new collections and provide detailed information about the colours, materials, etc.

On the other hand, entertainment content contains fun content or entertains viewers. Indeed, the content is not related to the brand or a particular product or service but enables users to enjoy themselves, have fun, and escape from routine (Gutiérrez-Cillán et al. 2017; Luarn et al. 2015).

Remuneration posts involve various benefits, including economic incentives and rewards (Aydin, 2020). These brand posts can encourage customers to take action towards a buying decision (Tafesse and Wien 2017). For instance, a sales promotion post can involve special promotional offers (e.g. price discounts, 70% off), promotion codes, and competitions/quizzes ‘share and win’.

Social brand posts contain various questions or statements to encourage interactions with users, provide them with the opportunity to react to a post, and facilitate the interaction further (Luarn et al. 2015). For instance, a brand can publish a post about their employee of the month, and fans of the brand page can express their surprise emotion or even write a greeting message in the comments section.

Social responsibility initiatives. A Corporate Social Responsibility (CSR) brand post is understood as a ‘communication that is designed and distributed by the company itself about its CSR programs’ (Khan et al. 2016, p. 699, based on Morsing 2006). Programmes of social responsibility involve energy consumption, carbon footprints, sustainable consumption, and others. An example of this type of post is as follows: ‘[...] Thank you Ronald McDonald House Charities of Southwest Florida for keeping her family together!’ (Khan et al. 2016, p. 701).

User-generated content and reposts. Voorveld (2019) notes that brand communication with customers on social media can blur the lines between brand content and other content. The other content that companies can repost can be named ‘user-generated content’, which is regarded as a post created by social media platform users.

Nevertheless, the user-generated posts can cover diverse types of content, including informational, social, and entertainment, which might be related to a company/brand or not related to a company/brand. The only distinction here is that the content is not created by the company/brand. Moreover, the content can be sponsored by a company/brand, but social media influencers can create a post (Vaiciukynaite 2019). Specifically, social media influencers can generate posts with original and authentic content (ibid.), while brands/companies can repost these posts on social media platforms.

User-generated content can thus be created not only by social media users but also by social media influencers (a company/brand-sponsored post), and companies/brands might reshare their content. Social media influencers can be either micro (i.e. smaller reach) or macro (i.e. bigger reach) and might impact user responses differently (Voorveld 2019). Reshared content should credit the original source with ‘@username’.

Educational posts describe posts that educate and inform customers (Tafesse and Wien 2018). For instance, food-brand posts can involve posts on how to prepare a particular dish or how to cook properly (e.g. how to prevent vitamin and mineral loss when cooking vegetables). These posts can entail information that enables customers to gain new information and skills. It is important to note that these brand-generated posts are related to a company’s/brand’s products or services.

Job offers – a job advertisement is generated by a company or brand to inform potential job seekers about job possibilities. Facebook (2020) for business suggests that brand pages can reach their fans and get more information about their candidates quickly for free. Moreover, job offers can be designed creatively and may stimulate potential candidates to answer some questions or stimulate their curiosity to open a company/brand link.

The mixed content types. Typically, previous research has provided classifications of post content that entail only a single content type. Importantly, according to Tafesse and Wien’s (2017) findings, brand posts can contain multiple messages in a single post and can therefore have multiple types of post content (ibid.). Indeed, a brand can design longer posts that entail two distinct parts of the text. For instance, the first part of a post text can contain information about new products/services, while the second part of the text might involve remuneration content. Therefore, the company/brand can expect a more significant reach among users as the company/brand informs them about its product/service and stimulates them to act accordingly (e.g. ‘share and win’).

2.1.3 Emotional Cues of Brand Posts

All these types of text content can be complemented with emotional cues, e.g. emoji. More specifically, company textual messages can be associated with non-verbal emotional cues (e.g. emoji, emoticons) and verbal cues (i.e. words). An emoticon is a typographical symbol built from text characters, such as an emoticon with the tongue sticking out (‘:P’). On the other hand, emojis are graphic symbols that can include representations of facial emotional expressions, abstract concepts, and also plants, animals, gestures or body parts, and other objects (Rodrigues et al. 2018; Troiano and Nante 2018).

Luangrath et al. (2017) have classified non-verbal cues into four categories: (1) words are accompanied by special characters or text styles with caps, (2) non-standard language words, (3) words that do not fit grammatically within a sentence, and (4) posts that include visual emoji. Hence, a verbal message can be accompanied by diverse non-verbal cues. Moreover, the most recent study by Das et al. (2019) has investigated advertisements accompanied by emoji and indicated that the presence of emoji can encourage a higher positive effect for customers that leads to higher purchase intention.

2.1.4 Media Types of Brand Posts

The types of brand content posts can be accompanied by various types of media, including videos, images, and links. All these media types can contribute to different levels of vividness in the posts. For instance, an image/photo represents a low level of vividness because it contains pictorial content (Luarn et al. 2015). In contrast, video is considered to have a higher level of vividness (Antoniadis et al. 2019), for instance, YouTube videos. A medium level of vividness is for links to websites/news sites or blogs (Luarn et al. 2015). In many cases, links include company links or other sources on the Internet. For instance, a company may provide a brand post with expert views from external sources or use a link with more detailed information about a product/service. Interestingly, posts with hyperlinks are the most common on institutions’ Facebook pages (Chauhan and Pillai 2013).

Concerning images, there are many different types, such as product images, images of humans with products, consumption contexts, nature backgrounds, etc. For instance, Berg et al. (2015) noted that images with human models have facial expressions and can be found in advertisements, on packages, etc. Notably, a previous study revealed the importance of facial expressions for an effective brand post in terms of CEB on Instagram (Rietveld et al. 2020). Indeed, photos of human models can be published on social media platforms as well.

2.2 Customer Engagement Behaviour on Facebook: Definition and Conceptualisation

A company can have their business page on Facebook and initiate interactions with its existing or new customers through their posts. Customers might be motivated to express their engagement behaviours towards diverse types of company’s posts. As a result, the company can develop and build relationships with their customers, and in turn, customer engagement can have a positive effect on a company’s performance (Kumar and Pansari 2016; Yoon et al. 2018). Indeed, company posts can act as a trigger for customers’ attention and, thus, motivate them to express responses to posts.

Active customer participation on social media can be defined as ‘customer engagement’ or ‘customer engagement behaviour’. These terms have been widely analysed by academics and practitioners, but there is still no general agreement about their definition and conceptualisation. Consequently, academics use diverse terms for ‘customer engagement’. For instance, some authors use terms such as ‘social media engagement’ (Tafesse and Wien 2017), ‘customer engagement’ (Harmeling et al. 2017), ‘social media behaviour’ (Dimitriu and Guesalaga 2017), ‘(customer) engagement’ (Chaffey 2007; Marsden 2017), ‘customer engagement behaviour’ (van Doorn et al. 2010), ‘customer brand engagement behaviour’ (Leckie et al. 2018), and ‘firm-initiated customer engagement behaviour’ (Beckers et al. 2017). Moreover, previous studies have conceptualised customer engagement (or customer engagement behaviour) differently as either a psychological state or behavioural manifestation beyond purchase, resulting from customer motivational drivers (Beckers et al. 2017; Harmeling et al. 2017; Hollebeek and Andreassen 2018; Hollebeek et al. 2014; van Doorn et al. 2010).

Recently, there is an increasing trend towards using a behavioural approach (Rietveld et al. 2020; Beckers et al. 2017; Barger et al. 2016; Carlson et al. 2018a, b; Yoon et al. 2018). Consistent with Rietveld et al. (2020), this research assumes a behavioural approach for understanding customer engagement on social media. Therefore, customer engagement behaviour is defined as ‘the customer’s behavioural manifestations toward a brand or firm, beyond purchase, resulting from motivational drivers’ (van Doorn et al. 2010, p. 253). Similarly, customer engagement is ‘a customer’s voluntary resource contribution to a firm’s marketing function, going beyond financial patronage’ (Harmeling et al. 2017, p. 316). Consistent with Obilo et al. (2020), customer engagement is made up solely of behaviours, and this research applies a behavioural approach, which is widely used in previous academic and practical studies (Ferrer-Rosell et al. 2020; Luarn et al. 2015).

On Facebook, CEB might involve a range of reaction functionalities such as likes, shares, and emoji (emotional) reactions. Importantly, these reaction features can be extended through platform updates. For instance, Facebook enables users to express animated and diverse emoji reactions to posts; for example, the user can press a ‘love’ button. Recently, due to COVID-19, Facebook launched a new emoji, the ‘care reaction’ (a heart being hugged) (Hayes 2020). On Facebook, emotional reactions include love (beating heart), ha ha (laughing face), wow (surprised face), sad (crying face), and angry (red/angry/pouting face) (Emojipedia 2020).

In summary, and consistent with Yoon et al. (2018), our research is focused on active customer actions because their engagement behaviour (i.e. liking) exposure could also influence other customers’ behaviour. Hence, this current research denotes active customer actions on Facebook, including eight behavioural responses: likes, comments, shares, love, ha ha, wow, sad, and angry expressions.

3 Conceptual Framework Development

The proposed model of CEB on Facebook is based on various features of brand posts and organised according to the stimulus-organism-response (S-O-R) paradigm (Mehrabian and Russell, 1974). The paradigm holds that environmental stimuli (S) lead to an emotional reaction (O), which, in turn, influences customers’ behavioural responses (R) (Carlson et al. 2018a, b). Importantly, the framework has been widely applied in studies of online consumer behaviour (Eroglu et al. 2003; Manganari et al. 2009). Meanwhile, the most recent studies have applied the full S-O-R paradigm (Carlson et al. 2018a, b; Triantafillidou and Siomkos, 2018; Schreiner et al. 2021) or a part of the S-O-R framework to CEB in the context of social media platforms (see Mishra, 2021).

Based on the S-O-R paradigm, the stimulus (S) denotes various features of a brand’s/company’s posts, while the response (R) denotes CEBs on Facebook; the developed model is shown in Fig. 1. The features of the brand posts are explained based on the theories of uses and gratifications and media richness.

Fig. 1 Conceptual framework of CEB on Facebook based on features of brand posts. The illustration shows two boxes labelled stimulus and response: the left box gives the features of brand posts, and the right box gives the CEB on Facebook.

Based on the literature review, the features of brand posts entail three broad categories: content types, media types, and time frame. All these features of brand posts can act as a catalyst for customer engagement behaviour (CEB) on Facebook. The CEB covers likes, comments, shares, and emotional reactions. Specifically, following its latest developments, Facebook supports five distinct consumer emoji reactions: love, ha ha, wow, sad, and angry. Therefore, the current research integrates the full spectrum of emotional responses. The conceptual framework explains that the stimulus (features of brand posts) can act as input features for a mathematical model that predicts the CEB response, i.e. the output variables (likes, comments, shares, emotional reactions).
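To make the stimulus side of the framework concrete, the following minimal sketch shows one possible way to turn the features of a brand post into a numeric input vector. The category labels follow the content and media types discussed in Section 2.1, but the class `BrandPost`, the function `encode`, and the one-hot encoding scheme are illustrative assumptions rather than the encoding used in this research.

```python
from dataclasses import dataclass

# Category labels taken from the literature review; the identifiers are illustrative.
CONTENT_TYPES = [
    "informational", "entertainment", "social", "remuneration",
    "social_responsibility", "user_generated", "educational", "job_offer",
]
MEDIA_TYPES = ["image", "video", "link"]


@dataclass
class BrandPost:
    """Hypothetical container for the stimulus (S) features of one post."""
    content_types: set   # one element = single content, several = blended content
    media_type: str      # one of MEDIA_TYPES
    weekday: int         # 0 = Monday ... 6 = Sunday
    hour: int            # publishing hour, 0-23


def encode(post: BrandPost) -> list:
    """One-hot encode content and media types and append the time frame."""
    content = [1 if c in post.content_types else 0 for c in CONTENT_TYPES]
    media = [1 if m == post.media_type else 0 for m in MEDIA_TYPES]
    return content + media + [post.weekday, post.hour]


# A blended post: informational text followed by a 'share and win' offer.
post = BrandPost({"informational", "remuneration"}, "video", weekday=4, hour=20)
print(encode(post))  # [1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 4, 20]
```

Representing the content types as a set allows a single post to carry several content types at once, which matches the notion of mixed content types.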

4 Methodology

The analysis of methodological approaches in CEB research has revealed that qualitative and conceptual approaches are the most used. Meanwhile, using a mathematical modelling approach might achieve a more nuanced and robust understanding of the company/brand communication practices on social media platforms. Therefore, the current research has chosen an empirical approach to model CEB on Facebook as a function of the features of brand posts, based on the stimulus (S) and response (R) framework (see the conceptual framework in Fig. 1). For this purpose, various types of companies/brands on Facebook were used, covering diverse market contexts, including business to business (B2B) and business to customer (B2C). Consistent with Tien and Aynsley (2019), both markets were involved. The posts were gathered manually from official companies’ Facebook pages. Companies’/brands’ pages were selected if they published posts regularly and/or at least once a week on average (Abitbol et al. 2019). Following Tafesse and Wien (2017), posts covered a four-week period (1–31 June 2018) and were analysed further by using a hand-coded content analysis. Two coders who were not related to this research were involved in the coding process.

Following previous studies (see Table 1) and conceptual framework development of stimulus and response (see Fig. 1), this research considered the main categories of post features, such as content type, media type, and time frame. All these categories have subcategories under the specific features, for instance, content types. Features of posts were divided into media type (e.g. video, image) and content type (e.g. informational, social), which are explained below. The selected list of companies included a diverse range of industries based on Tafesse and Wien (2017) and was later refined. In sum, all variables were coded at the single post level (Abitbol et al. 2019).

Table 1 The coding categories of features of brand posts on Facebook

Additionally, adapted from Rietveld et al. (2020), only brand pages with a minimum of 100 posts on the Facebook platform were retained, which enables a fair comparison between brand accounts. As a result, three brands were removed from the list.

4.1 Coding Variables

4.1.1 Independent Variables

Based on the coding categories of the features of brand posts on Facebook (see Table 1), content types, media types, and time frame were captured.

4.1.2 Dependent Variable: Customer Engagement Behaviour

Consistent with previous studies (Barger et al. 2016), this research operationalises CEB as a set of measurable customer actions on a company’s Facebook page, such as customer response to a company/brand message: likes, comments, shares, and emotional reactions (i.e., love, ha ha, wow, angry, sad) (see Table 2).

Table 2 The coding categories of CEB on Facebook

Once all posts were analysed, the collected dataset had to be labelled to perform the classification task. The current research seeks to measure CEB by predicting how popular a post is in metrics from the raw data: number of likes, comments, shares, and emotional responses.

These different types of customer social actions can be categorised into different levels of engagement. For instance, a liking behaviour indicates less value compared with a commenting or sharing behaviour and receives a lower score (Peters et al. 2013). Indeed, customer comments require more effort and engagement from consumers (Yoon et al. 2018).

Adapted from the BuzzRank interaction rate formula for social media by Peters et al. (2013), a social media score metric was developed. This social media score was calculated using the following formula (1):

$$ {S}_p={\mathrm{likes}}_p+{\mathrm{comments}}_p\times 2+{\mathrm{shares}}_p\times 3 $$
(1)

where $S_p$ is the social media score of post p, $\mathrm{likes}_p$ is the number of likes of post p, $\mathrm{comments}_p$ is the number of comments on post p, and $\mathrm{shares}_p$ is the number of shares of post p. After the target metrics for CEB were calculated, the data were labelled based on these metrics. Two classes were formed: the first class indicated unpopular brand posts, and the second indicated popular brand posts. It is important to note that the popularity of brand posts was computed for each metric separately (e.g. likes).
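As an illustration of formula (1), the following minimal Python sketch computes the social media score for a single post. The `Post` class and its field names are hypothetical and introduced only for this example; the weights 1, 2, and 3 are taken from formula (1).

```python
from dataclasses import dataclass


@dataclass
class Post:
    """Raw engagement counts for one brand post (hypothetical field names)."""
    likes: int
    comments: int
    shares: int


def social_score(post: Post) -> int:
    """Formula (1): likes + 2 * comments + 3 * shares."""
    return post.likes + 2 * post.comments + 3 * post.shares


# Made-up counts, used only to show the arithmetic.
example = Post(likes=120, comments=15, shares=4)
print(social_score(example))  # 120 + 30 + 12 = 162
```

In this made-up example, four shares contribute as much to the score as twelve likes, which reflects the higher weight given to more effortful actions.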

Concerning customer likes, all brand posts with fewer likes than the mean number of likes in the dataset were marked as ‘unpopular posts’. In contrast, all brand posts with more likes than the mean were assigned to the ‘popular posts’ class. The same process was performed for all CEB metrics: social score, likes, comments, shares, and each of the five emotional reactions (e.g. love). Therefore, nine classification tasks were formulated, one for each CEB metric.
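The mean-based labelling rule can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the example counts are hypothetical, and the treatment of a post whose count equals the mean exactly is an assumption, since the text only covers the strictly lower and strictly higher cases.

```python
from statistics import mean


def label_by_mean(values):
    """Label each post as 'popular' or 'unpopular' for one CEB metric.

    Assumption: a post exactly at the mean is labelled 'unpopular';
    the text only specifies the strictly-below and strictly-above cases.
    """
    threshold = mean(values)
    return ["popular" if v > threshold else "unpopular" for v in values]


# Made-up like counts for four posts; their mean is 50.
likes = [10, 20, 30, 140]
print(label_by_mean(likes))
# ['unpopular', 'unpopular', 'unpopular', 'popular']
```

Because the mean is sensitive to a few very popular posts, the two classes need not be of equal size, as the example shows. Applying the same function to each metric in turn would yield one label vector per classification task.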

4.2 Prediction Method and Model

Many machine learning methods are capable of dealing with classification tasks. Moreover, several machine learning models can be built, including naive Bayes, k-nearest neighbour, logistic regression, decision tree, and Random Forest (RF) (Eluri et al. 2021). For this research purpose, the RF method was selected. Specifically, the RF model was successfully used by previous researchers in the social media domain (Hajhmida and Oueslati, 2021; Huang et al. 2018).

The RF algorithm is built on decision trees. The main idea of RF is to build many decision trees, which then vote to assign a class to the given input. In this paper, the inputs are a post’s features, and the binary classes reflect predicted post popularity (popular or unpopular post). Moreover, some degree of randomisation is used when picking the feature for a node split: not every feature is considered at every node of a decision tree. This is done to lower the risk of overfitting the model. When a decision tree is generated, the attribute on which each node (starting with the root node) is split is chosen according to a splitting measure such as the Gini index or information gain.
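The two ingredients described above, voting across trees and random selection of candidate features at a node, can be sketched in a few lines of Python. This is a simplified illustration rather than a full RF implementation: the function names, the feature names, and the choice of two candidate features per split are assumptions made for the example.

```python
import random
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class predicted by most trees (ties go to the class seen first)."""
    return Counter(tree_predictions).most_common(1)[0][0]


def candidate_features(all_features, k, rng):
    """Pick k distinct features at random to be considered at one node split."""
    return rng.sample(all_features, k)


# Hypothetical predictions of five trees for one post.
votes = ["popular", "unpopular", "popular", "popular", "unpopular"]
print(majority_vote(votes))  # popular

# Hypothetical feature names; k=2 is an arbitrary choice for illustration.
features = ["day", "time_of_day", "content_type", "media_type"]
subset = candidate_features(features, k=2, rng=random.Random(0))
print(subset)  # two of the four feature names
```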

The Gini index is based on the probability of a randomly picked data point being classified incorrectly. For n classes, this index ranges from 0 to 1 − 1/n, where zero means that all data points belong to the same class and the maximum (0.5 for the two classes used here) means that data points are distributed evenly across classes. The Gini index can be calculated using the following formula (Bramer, 2007) (2):

$$ G=1-{\sum}_{i=1}^n{\left({p}_i\right)}^2 $$
(2)

where G is the Gini index, $p_i$ is the probability of a data point being classified as class i, and n is the number of classes. Given the Gini index, it is possible to calculate feature importance in the model. For each decision tree in the RF, a node’s importance can be calculated using Formula (3):

$$ {Ni}_k={w}_k{G}_k-{w}_{left(k)}{G}_{left(k)}-{w}_{right(k)}{G}_{right(k)} $$
(3)

where $Ni_k$ is the importance of node k, $w_k$ is the weighted number of samples reaching node k, $G_k$ is the Gini index of node k, and left(k) and right(k) indicate the child nodes resulting from the split of node k in the decision tree. Finally, the importance value for each feature can be calculated by Formula (4):

$$ {Fi}_i=\frac{\sum_{k:\ \mathrm{node}\ k\ \mathrm{splits}\ \mathrm{on}\ \mathrm{feature}\ i}{Ni}_k}{\sum_{k=1}^N{Ni}_k} $$
(4)

where $Fi_i$ is the feature importance of feature i, $Ni_k$ is the node importance of node k, and N is the number of nodes. This value can then be normalised by dividing it by the sum of all feature importance values.
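Formulas (2) to (4) can be traced on a small hand-built tree. The sketch below is illustrative only: the tree, the class counts, and the function names are hypothetical, and the weight $w_k$ is assumed to be the share of all samples that reach node k. Any constant rescaling of the weights cancels out in formula (4).

```python
def gini(class_counts):
    """Formula (2): G = 1 - sum(p_i ** 2) for the class counts at one node."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def node_importance(parent, left, right, n_total):
    """Formula (3), assuming w_k is the share of all samples reaching node k."""
    def weighted(counts):
        return sum(counts) / n_total * gini(counts)
    return weighted(parent) - weighted(left) - weighted(right)


def feature_importance(splits):
    """Formula (4): each feature's share of the summed node importances.

    `splits` holds one (feature_name, node_importance) pair per split node.
    """
    total = sum(ni for _, ni in splits)
    scores = {}
    for feature, ni in splits:
        scores[feature] = scores.get(feature, 0.0) + ni
    return {feature: ni / total for feature, ni in scores.items()}


# Hypothetical tree on 10 posts; counts are (unpopular, popular).
root = node_importance((5, 5), (4, 1), (1, 4), n_total=10)   # split on "day"
child = node_importance((4, 1), (4, 0), (0, 1), n_total=10)  # split on "content_type"

print(gini((5, 5)))  # 0.5, two evenly mixed classes
importances = feature_importance([("day", root), ("content_type", child)])
print(importances)   # "day" receives the larger share
```

In this toy tree the root split on the hypothetical "day" feature reduces impurity by about 0.18 and the second split by about 0.16, so the two features receive roughly 53% and 47% of the total importance. In a forest, such per-tree values would typically be averaged over all trees.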

The prediction model of CEB on Facebook is based on brand posts’ features, such as time frame, content type, and media type. This model is illustrated in Fig. 2.

Fig. 2 Prediction model of CEB on Facebook based on various features of brand posts. The flow diagram proceeds from left to right: gathering social media posts; analysing and measuring engagement to derive CEB and post properties; and classifying posts into popular and unpopular categories.

To sum up, this paper follows the standard process of data analysis for creating machine learning models. First, the initial database of companies’ posts was gathered; then, each post was analysed in terms of post properties and customer engagement. These extracted properties were used to train nine models, which are capable of predicting post popularity in terms of the calculated social score, likes, comments, shares, and each of the five emotional reactions.

Finally, to ensure model correctness, a validation procedure and an evaluation parameter were selected. The widely used tenfold cross-validation method was chosen to validate the model. The dataset is split into ten separate folds; nine of them are used for training and one for testing, and the procedure is repeated until each fold has served as the test set. This ensures that data samples from the training set do not spill over to the testing set and reduces the effect of randomness. For model evaluation and comparison of prediction accuracy, the area under the curve (AUC) parameter was selected. The AUC parameter measures how good the model is at predicting the correct class: popular or unpopular brand post.
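The validation procedure can be sketched with the Python standard library alone. The sketch below is a simplified illustration under stated assumptions: the function names are hypothetical, the folds are formed by plain shuffling (whether the original folds were stratified by class is not stated), and the area under the ROC curve is computed by pairwise comparison of positive and negative examples. The sample size of 1109 is the number of posts in the dataset.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for popular, 0 for unpopular; scores: predicted score for class 1.
    A tie between a positive and a negative score counts as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


splits = list(k_fold_indices(n_samples=1109, k=10))
print(len(splits))                               # 10
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In the small AUC example, three of the four popular/unpopular pairs are ranked correctly, which gives 0.75. A value of 0.5 would correspond to random ranking and 1.0 to perfect separation of the two classes.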

5 Results

5.1 Descriptive Results

The descriptive results are discussed based on three types of the company’s post features: time frame, content, and media types. In total, 1109 posts were analysed from the official brand/company’s pages on Facebook.

5.1.1 Time Frame

The results show that the largest number of posts was published on Friday (21.3%; 236) and Thursday (19.3%; 214), while the smallest number of posts was published on Sunday (6.5%; 72). Indeed, the companies posted messages mostly during working days (85.6% of all posts; 949). The time of day was classified into three groups: morning [from 6 a.m. to 10 a.m.], day [from 10 a.m. to 2 p.m.], and evening [from 2 p.m. to 12 a.m.]. The largest share of the posts was published in the evening (45.3%; 502), whereas the lowest number of posts was published in the daytime (23.5%; 261).
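The time-of-day grouping and the reported percentages can be reproduced with a short Python sketch. The function names are hypothetical, and two points are assumptions rather than statements from the coding scheme: the group boundaries are treated as half-open intervals, and the evening group is taken to run until midnight. Hours before 6 a.m. are not covered by the three groups, so the sketch returns `None` for them.

```python
def time_of_day(hour):
    """Map an hour (0-23) to one of the three time-of-day groups.

    Assumptions: boundaries are half-open and the evening group runs up to
    midnight. Hours before 06:00 are not covered by the groups, so None is returned.
    """
    if 6 <= hour < 10:
        return "morning"
    if 10 <= hour < 14:
        return "day"
    if 14 <= hour < 24:
        return "evening"
    return None


def share(count, total):
    """Percentage of posts, rounded to one decimal place as in the text."""
    return round(100 * count / total, 1)


print(time_of_day(9), time_of_day(13), time_of_day(21))  # morning day evening
print(share(502, 1109))  # 45.3 (evening posts)
print(share(236, 1109))  # 21.3 (Friday posts)
```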

5.1.2 Content Types

Almost half of brand posts (45.1%; 500) involved the informational content type, followed by promotional (35.3%; 391), social (15.8%; 175), social responsibility (3.1%; 34), and entertainment (0.8%; 9) content types.

5.1.3 Media Types

Four types of media were included: image, video, link, and other. The latter included unlisted types such as a graphical image. Most posts used an image format (70.6%; 783), followed by videos (15.4%; 171) and links (12.7%; 141).

5.2 Random Forest and Accuracy of Trained CEB Prediction Models

The Random Forest algorithm was used for CEB prediction model training. Importantly, a few insights from the trained models can be observed by analysing the feature importance measured by the Gini index. Based on the results, the content type and the time frame (e.g. day, time of a day) were the strongest predictors for post popularity calculated by the social score (see Table 3).

Table 3 The calculated Gini index for the features’ importance and accuracy of trained CEB prediction models

The results reported in Table 3 indicate that the day of posting shows a high importance value (> 0.25 or 0.20–0.25) for the social score, likes, comments, shares, and emotional reactions, including love, wow, ha ha, and angry. In a similar vein, content type shows a high importance value (> 0.25 or 0.20–0.25) for likes, shares, love, ha ha, and sad expressions. Interestingly, post text with emoji was of some importance for customer comments and sad expressions on Facebook (see Table 3; the importance value is between 0.10 and 0.15). Notably, video length could be associated with sad expressions (0.20–0.25). The lowest importance values were found for media subtypes such as human emotions and emoji stickers in a photo, followed by images accompanied by a logo and human faces/bodies.

These models were evaluated based on prediction accuracy using tenfold cross-validation. Among the models for social score, likes, comments, and shares, the strongest were those for comments and shares on the company’s posts (see Table 3). Indeed, these models predicted whether a brand post would be popular in terms of shares with 80.3% accuracy and in terms of comments with 84.0% accuracy. It is important to note that the models for customer likes (68.4%) and the computed social score (72.3%) showed lower accuracy values. Results of the predictive models for emotional reactions are provided in Table 3 (see accuracy values).

6 Conclusions and Discussion

The descriptive results highlighted that the majority of brand posts were published during working days and fewer on weekends. Moreover, the largest number of posts was published in the evening. The findings show that the primary content types of brand posts on Facebook are informational and remuneration. These results are aligned partially with the first studies on brands’ posts on Facebook (see Luarn et al. 2015). Importantly, this research also indicates mixed content types such as informational and promotional, followed by social and remuneration. Our findings also support Tafesse and Wien’s (2017) findings that brands do post blended content types.

Concerning media types of the company’s posts, the dominant media type was an image. This result is in line with the study by Sabate et al. (2014), which indicates that accompanying a brand post with images plays a key role in the post’s popularity. Images can contain different features, such as human faces with emotions (e.g. happiness, surprise, neutral), and emoji stickers. Importantly, the least popular media type of posts among companies/brands was links.

Our findings provide evidence to suggest that both the time frame and content types of brand posts matter for the prediction of CEB on Facebook. Indeed, our research results are aligned partly with global trends reported by Hootsuite, a global leader in social media management (see more: Tien and Aynsley 2019; the second quarter in 2018). For instance, the best time to post on Facebook is between 9 a.m. and 2 p.m. on Tuesday, Wednesday, or Thursday for B2B brands, while for B2C brands the best time is 12 p.m. on Monday, Tuesday, or Wednesday (ibid.). Hence, publishing brand posts on weekends is not recommended; this result is aligned with previous research by Cvijikj and Michahelles (2013).

The content types of brand posts are also associated with customer likes, shares, love, ha ha, and sad expressions. These results suggest that brands should pay more attention to various content types, such as informational, social, remuneration, and social responsibility. Notably, the findings support insights by De Vries et al. (2012) that different drivers of posts influence the number of likes and comments on Facebook. The current results also support findings by Annamalai et al. (2021) that the content types of posts shared by sport clubs on social media have varying influences. Interestingly, the results show that the text of a brand post accompanied by emoji can act as a catalyst for customer comment responses and for sad expressions on Facebook. It is important to note that brands should avoid posts that encourage customers to express negative emotions.

Analysis of our trained machine learning prediction models is also in line with previous findings. The importance of images for CEB, over brand posts accompanied by videos (a high level of interactivity), was indicated by Luarn et al. (2015) and Sabate et al. (2014). However, these results might be biased because the number of images posted by brands was higher than the number of posts accompanied by videos. Moreover, the research did not distinguish brands based on company size and their social media budget for media types of posts, or the type of market (e.g. B2C, B2B). In summary, a post’s time frame, use of an image, content type, and use of emojis were important features for the prediction model and the generated Random Forest decision trees. Thus, it is useful to collect and include these features when dealing with CEB prediction.

To the authors’ best knowledge, there is no previous research that explores features of a brand’s posts on CEB using a granular level of analysis. The current research results extract new features that can be added to existing classifications of brand posts, especially job offer content, influencer reposts, mixed content types (i.e. informational and promotional), emoji within the text, images with emoji, humans and/or emotional expressions (e.g. happiness, surprise, neutral), and logos and video.

7 Limitations

This research has several limitations. First, the analysis is exploratory and relies on the Random Forest method. Moreover, the dataset captures emotional expressions at an early stage on Facebook. In general, CEBs (i.e. likes, comments, shares, emoji reactions) towards many top web publishers were increasing over time, measured from 1 January to 10 March in each year from 2017 until 2019 (Owen 2019). Therefore, future studies should replicate the analysis using the most recent data from companies’ pages on Facebook.

Second, the data was collected from one single social media platform (i.e. Facebook) and country (i.e. Lithuania). Further studies should replicate the analysis using datasets from different social media platforms and countries, providing a deeper understanding of CEBs towards different features of company messages.

Third, CEBs can differ across diverse types of brands (i.e. B2B, B2C) and the nature of their messages (i.e. organic, paid messages). For example, B2C brand messages perform best at noon on Monday, Tuesday, and Wednesday (Cooper 2020). A paid message can reach a wider audience and might generate diverse types of CEBs on Facebook. Unfortunately, the current research could not collect data about post reach, which indicates the number of users who saw a post (Barnhart 2020). Moreover, the ratio of engagement to reach can reveal more about users’ willingness to engage with a brand post. For instance, a high ratio can be an indicator that a post might involve content relevant to the brand’s audience. While company and post nature are outside the scope of this research, future research can involve these aspects in the analysis.

A further limitation that must be highlighted concerns the conceptualisation of CEB on Facebook. The current research does not adopt the view of CEB that entails both active and passive participation on Facebook. Hence, future research studies might involve additional metrics, such as the total number of people reached through the message, which capture passive customer participation as well. The last limitation is due to the constant updates by social media platforms, especially Facebook, which is updating its features and functionality continuously. Therefore, the paper presents an area for future research that has both theoretical and practical value.

In conclusion, this research responds to the call for research on timing and frequency features of brand posts (Rietveld et al. 2020) and seeks to provide a more granular level of analysis of post features on CEB on Facebook. The current research provides a novel approach in this area, and future research can enhance our findings.