DFGR: Diversity and Fairness Awareness of Group Recommendation in an Event-based Social Network

An event-based social network is a new type of social network that combines online and offline networks, and one of its important problems is recommending suitable activities to users. However, current research seldom considers balancing the accuracy, diversity and fairness of group activity recommendations. To solve this problem, we propose a group activity recommendation approach with fairness and diversity awareness. First, we calculate activity similarity based on context and construct an activity similarity graph. We define the weighted coverage on the similarity graph as a submodular function and transform the problem of fair and diverse group activity recommendation into maximizing the weighted coverage on the similarity graph while considering accuracy, fairness, and diversity. Second, we employ a greedy algorithm to find an approximate solution that maximizes the weighted coverage with a guaranteed approximation ratio. Finally, we conduct experiments on two real datasets and demonstrate the superiority of our method over existing approaches. Specifically, among diversity-based recommendation algorithms, our method achieves a 0.02% increase in recall rate; among fairness-based recommendation algorithms, our method outperforms the latest approach by 0.05% on the overall metrics. These results highlight the effectiveness of our method in achieving a better balance among accuracy, fairness, and diversity.


Introduction
In recent years, event-based social networks (EBSNs) have become increasingly popular. Users can not only find other users with similar interests and discuss topics of interest by adding online friends or joining online groups, but also create and publish offline activities [22]. In addition, users who participate in offline activities can communicate face-to-face in the real world through sites such as Meetup, Plancast, and Douban. Meetup is the largest local group network in the world; it collects and registers people's interests and addresses, identifies potential groups, and gathers them together. As of February 2021, the number of Meetup users had reached a staggering 49 million, with over 230,000 event organizers; on average, 15,000 offline events are held every day. Note that "activity" and "event" are used interchangeably in this paper.
In addition, the diversity of recommendation results has always been one of the crucial indicators in recommendation system evaluation and is of great importance for improving user satisfaction [19,32,43,55]. However, for activity recommendation in EBSNs, existing research mainly uses a variety of context information to resolve the cold-start activity recommendation problem, thereby improving the accuracy of recommendation. Among the context factors, the sponsor and content information of an event are very helpful for improving the accuracy of event recommendation, because users mainly consider whether the content of an activity is interesting when selecting activities. In addition, we found that if a user has participated in an event held by a certain sponsor, they are likely to participate in events held by that sponsor in the future. Therefore, although existing research has improved the accuracy of activity recommendation by using the sponsor and content factors, it tends to recommend activities with similar content, resulting in a lack of diversity in the recommendation results. As a result, it is essential to consider the diversity of group recommendations.
However, group recommendation also needs to consider whether the recommendation results meet the needs of every user in the group. If the recommended activities interest only some users and do not include activities that the other users are interested in, then the recommendation results lack fairness. Most existing research on activity recommendation does not consider the fairness of recommendations, while some existing research on group recommendation does not consider the diversity of recommendations; even when fairness is considered, it lacks evaluation indicators for fairness. Therefore, it is necessary, meaningful and valuable to combine diversity and fairness for group event recommendation.
In order to solve the above problems, we study diversity and fairness awareness in group recommendation (DFGR), aiming to improve the accuracy, diversity and fairness of group activity recommendation. First, we define context-based activity similarity and construct an activity similarity matrix and an activity similarity graph; we then define the weighted coverage on the activity similarity graph as a monotone submodular function, which measures the accuracy, diversity and fairness between groups and activity sets. Second, we transform group activity recommendation with fairness and diversity awareness into weighted coverage maximization on the activity similarity graph, and propose a greedy algorithm to seek an approximately optimal solution. Finally, we develop a fairness evaluation indicator and a comprehensive evaluation indicator covering accuracy, diversity and fairness. The contributions of this paper are summarized as follows:
- We propose a group activity recommendation problem with diversity and fairness awareness, and formalize it as a weighted coverage maximization problem on the activity similarity graph.
- We develop a method based on weighted coverage to balance recommendation accuracy, diversity and fairness, and compare three greedy algorithms for efficiently finding an approximately optimal solution.
- We propose a new evaluation indicator to measure the fairness of group recommendation, and conduct effective comparison experiments on real datasets.

Related Works
In recent years, there have been some studies on the diversity and fairness of group activity recommendation. We divide the related work into three categories: event recommendation, diversity event recommendation, and fairness event recommendation.

Event Recommendation
In recent years, there has been a great deal of research on event recommendation [22,35,36,54,56]. Event recommendation in EBSNs (event-based social networks) is often defined on a heterogeneous network [22] or heterogeneous graph [35,54,56]. Liu et al. [22] first proposed the concept of the EBSN and defined it as a heterogeneous network, where nodes represent users and two types of edges represent online and offline interactions between users. Liu et al. [22] also identified two main problems in EBSNs: community discovery and event recommendation. Event recommendation, in particular, can effectively alleviate the problem of information overload for users in EBSNs and has received extensive attention in recent years [25,35,36,49,51,56]. Traditional event recommendation is usually formalized via utility functions defined on the user space and event space [3]. Given a target user, the goal of recommendation is to find events from a candidate event set that maximize the utility function. Event recommendation problems can generally be divided into two categories: rating prediction [1] and top-N recommendation. In EBSNs, users' feedback on events is implicit (i.e., attending or not attending) rather than explicit ratings (e.g., 1-5 stars). Therefore, event recommendation in EBSNs is typically a top-N recommendation problem, which involves generating a recommendation list of N events for a group of users. Common methods for event recommendation include collaborative filtering-based recommendation [39,50], content-based recommendation [4], and hybrid recommendation [7], among others.
However, collaborative filtering-based recommendation methods cannot alleviate the cold-start problem, which is relevant to group event recommendation in DFGR. Therefore, DFGR cannot rely solely on collaborative filtering-based recommendation methods and needs to incorporate additional contextual information (e.g., time, location, and content) to address the cold-start problem, in the manner of context-aware recommendation methods. Additionally, unlike personalized event recommendation, DFGR also considers group preferences derived from users' individual preferences.
In addition, several relevant recent papers cover various aspects of event recommendation and provide further insights and perspectives on event-based social networks [20,26,27,31]. For example, [31] introduces a personalized event recommender called SoCaST, which takes into account the influence of a user's geographical location, category, social connections, and time on event recommendations; it models personalized event recommendation using adaptive kernel density estimation. Reference [27] primarily addresses a real-time event recommendation problem called 3T-IEC. The authors propose a three-tiered Internet of Things (IoT)-Edge-Cloud framework for real-time context-aware event recommendation, utilizing IoT devices to capture a user's current location, road traffic, and weather conditions.
Reference [20] extracts features and makes full use of the information in EBSNs, including spatial, temporal, semantic, social, and historical features; it transforms the recommendation problem into a classification problem and extends ELM as the classifier in the model. Reference [26] considers event recommendation using IoT data and collaborative filtering: prediction scores are calculated from IoT-based parameters and from collaborative filtering based on social influencers, and the activity with the highest prediction score is recommended to the user. These studies primarily focus on multiple factors for real-time event recommendation and do not address diversification or fairness in event recommendations. The existing research on group recommendation mainly focuses on the accuracy of group recommendation, ignoring other important indicators of group recommendation (such as diversity and fairness).

Diversity Event Recommendation
In recent years, among the many metrics besides relevance in event recommendation, "diversity" has been extensively studied [48]. It not only helps cater to various short-term and long-term user needs but also contributes to increasing the exposure of activities.
There have been some studies on the diversity and fairness of group activity recommendation, such as [11,17,24,41]. Reference [24] mainly studied the diversity and group fairness of subgraph query results; the authors formalized a dual-criteria optimization problem and found a set of Pareto-optimal queries. Reference [2] divided the diversity of recommendation systems into two categories: individual diversity and overall diversity. Individual diversity measures the similarity between items in a single recommendation list: the smaller the similarity between items in the list, the higher the individual diversity. Overall diversity measures the similarity between items across the recommendation lists of different users. He et al. [15] considered the general diversified ranking problem, that is, generating diversified recommendation lists given arbitrary relevance and similarity functions between items.
Reference [53] proposes an enhanced graph recommendation method with heterogeneous auxiliary information called EGR-HA. It mainly focuses on the knowledge representation of auxiliary information and node updates based on graph neural networks, enabling personalized recommendations through deep networks. Pessemier et al. [34] conducted an experimental evaluation and comparison of the accuracy, diversity, coverage and surprise of various group recommendation methods, but did not propose new methods to improve the performance of group recommendation on these indicators. Lin et al. [21] studied the fairness of group recommendation and used Pareto efficiency to improve the accuracy and fairness of group recommendation at the same time.

Fairness Event Recommendation
Fairness in event recommendation refers to ensuring that the recommendation results are generated based on unbiased and equitable principles. From the user's perspective, the recommendation results should not be influenced by factors such as race, gender, age, or other sensitive attributes.
In particular, Parambath et al. [33] proposed a coverage-based method to improve the diversity of recommendations; they defined coverage on an item similarity graph and improved the accuracy and diversity of the recommendation results. Chen et al. [8] proposed an efficient greedy algorithm for maximum a posteriori inference of determinantal point processes and used it to generate relevant and diversified recommendation results. Hamedani et al. [14] studied the recommendation of long-tail items and used a multi-objective simulated annealing algorithm to improve the accuracy and diversity of the recommendation results so as to recommend more long-tail items.
In addition, Serbos et al. [42] studied the package fairness of group recommendation, considering two different types of fairness: proportionality and envy-freeness. Proportionality measures whether each user in the group can find enough interesting items in the recommended item set, while envy-freeness measures whether each user in the group finds more interesting items in the recommended set than users in other groups. In contrast, our work focuses on balancing the accuracy, diversity and fairness of group recommendation. We define the weighted coverage on the activity similarity graph as a submodular function and exploit the diminishing-marginal-utility property of submodular functions to balance accuracy, diversity and fairness.
Although the fairness of group recommendation has received some research attention, existing research only improves recommendation accuracy while considering the fairness of group recommendation [21,42]; it does not evaluate the fairness of the recommendations themselves. We believe it is unreasonable to use accuracy indicators to evaluate fairness, because accuracy indicators can only measure whether group members are interested in the recommended activities; they cannot measure whether the number of activities of interest to each group member in the recommended list is equal. Since there is no quantitative indicator for evaluating the fairness of group recommendation, we propose a new indicator to evaluate the fairness of a recommendation list S to a group g. Finally, we add a comparison table to pinpoint how our work differs from and improves upon existing research, as shown in Table 1.

Problem Definition
Given a group g, the user set U_g of group g, the activity set E_u in which each user u has participated, and a candidate activity set E_c, DFGR's goal is to find a set E_r ⊂ E_c containing K candidate activities and recommend it to group g, such that E_r includes activities that each group member u ∈ U_g is interested in (fairness), and the similarity between activities in E_r is as small as possible (diversity).

Activity Similarity Calculation
Given an activity set E = {e_1, e_2, ..., e_n} in EBSNs, we construct an activity similarity matrix M = (M_ij), i, j = 1...n. An element M_ij of the similarity matrix M represents the similarity between activities e_i and e_j. The similarity can be obtained by calculating the cosine similarity between columns i and j of the user-activity interaction matrix X = (X_ij), i = 1...m, j = 1...n. If user u_i ∈ U gives positive feedback on activity e_j, i.e., user u_i has participated in activity e_j, then X_ij = 1; otherwise X_ij = 0.
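As a minimal sketch of this interaction-based similarity (not the paper's implementation; the helper name and toy matrix are illustrative), the columns of a binary interaction matrix can be compared with cosine similarity as follows:

```python
import numpy as np

def activity_similarity(X):
    """Cosine similarity between activity columns of a binary
    user-activity interaction matrix X (m users x n activities).
    Columns with no feedback (cold-start activities) end up with
    zero similarity to everything, illustrating the problem below."""
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for all-zero columns
    Xn = X / norms                   # normalize each column
    M = Xn.T @ Xn                    # M[i, j] = cosine(column i, column j)
    np.fill_diagonal(M, 0.0)         # ignore self-similarity
    return M

# toy example: 3 users, 3 activities; activity 2 has no feedback (cold start)
X = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 0]], dtype=float)
M = activity_similarity(X)
```

Note how activity 2, with an all-zero column, is equally (zero-)similar to every other activity, which is exactly the cold-start failure described next.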
However, in the cold-start activity recommendation problem, activities in the test set have not received any user feedback, which means that some columns of the user-activity interaction matrix X are all 0. This results in the same similarity between all candidate activities and historical activities when generating recommendations, so the cold-start problem cannot be resolved.
In order to solve this problem, we propose a context-based activity similarity calculation method. Specifically, we describe activity e by its time t_e, venue v_e, sponsor o_e and word set W_e, and represent activity e as the feature vector p_e = (t_e, v_e, o_e, w_e). Here, t_e ∈ R^{7×24} is a 7×24-dimensional one-hot time feature vector: if activity e is held in the j-th hour of the i-th day of the week, the (i, j)-th element of t_e is 1 and all other elements are 0. v_e and o_e are the one-hot venue and sponsor feature vectors of activity e, respectively. w_e is the content feature vector of activity e. Each word describing the activity content is mapped into a low-dimensional space as a low-dimensional vector through the word2vec word embedding technique [28]. We compute the content feature vector w_e of the activity from the low-dimensional word vectors as follows:

w_e = (1/|W_e|) Σ_{w∈W_e} z_w

where z_w is the low-dimensional vector of word w obtained through word2vec. The similarity between activities e_i and e_j is then computed as the cosine similarity

M_ij = (p_i · p_j) / (||p_i|| ||p_j||)

where p_i and p_j are the feature vectors of activities e_i and e_j, respectively. We regard (E, M) as a weighted undirected graph: each activity e ∈ E is a node, and M_ij is the weight of the edge between nodes i and j. We call this graph the activity similarity graph.
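The feature construction above can be sketched as follows (a simplified illustration, not the paper's code; the tiny embedding table and all function names are hypothetical, and the word vectors are averaged as in the reconstructed formula):

```python
import numpy as np

def one_hot(index, size):
    v = np.zeros(size)
    v[index] = 1.0
    return v

def content_vector(words, embeddings):
    """w_e: average of the word2vec-style embeddings z_w of the activity's words."""
    vecs = [embeddings[w] for w in words if w in embeddings]
    if not vecs:
        return np.zeros(next(iter(embeddings.values())).shape)
    return np.mean(vecs, axis=0)

def feature_vector(day, hour, venue_id, n_venues, sponsor_id, n_sponsors,
                   words, embeddings):
    """p_e: concatenation of one-hot time (7x24), venue, sponsor and content vectors."""
    t = one_hot(day * 24 + hour, 7 * 24)   # one-hot over the 168 week-hours
    v = one_hot(venue_id, n_venues)
    o = one_hot(sponsor_id, n_sponsors)
    w = content_vector(words, embeddings)
    return np.concatenate([t, v, o, w])

def cosine(p, q):
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

# two activities at the same time/venue/sponsor with overlapping content
emb = {"music": np.array([1.0, 0.0]), "jazz": np.array([0.0, 1.0])}
p1 = feature_vector(0, 9, 0, 3, 1, 2, ["music", "jazz"], emb)
p2 = feature_vector(0, 9, 0, 3, 1, 2, ["music"], emb)
sim = cosine(p1, p2)
```

Because every activity has context features even without user feedback, this similarity is well-defined for cold-start activities.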

Weighted Coverage Modeling
In order to improve the diversity and fairness of group activity recommendation, we use the coverage on the activity similarity graph to account for accuracy, diversity and fairness at the same time. Specifically, consider an activity subset S ⊂ E and an activity e_i ∉ S that does not belong to this subset. We define the coverage of the activity subset S on the activity node e_i as follows:

cov(e_i, S) = f( Σ_{e_j∈S} M_ij )

where f is an invertible, non-decreasing concave function defined on the positive reals, called a saturation function. Therefore, the coverage cov(e_i, S) is a monotone submodular function of the set S [33].
The defining characteristic of a submodular function is diminishing marginal returns: the benefit of adding a new element to a smaller set is greater than the benefit of adding it to a larger set. For the submodular function cov(e_i, S), the marginal gain in the coverage of the activity set S on activity e_i decreases as the number of activities in S grows. If S is the set of activities to be recommended and coverage is defined as a submodular function, then the coverage benefit contributed by activities similar to already recommended and historical activities decreases as the number of recommended activities increases, which improves the diversity of the recommendations.
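This diminishing-returns behavior can be seen in a few lines (a toy sketch, assuming the concave saturation f(x) = x^r that the experiments later test; the similarity matrix is illustrative):

```python
import numpy as np

def coverage(i, S, M, r=0.5):
    """cov(e_i, S) = f(sum of similarities between e_i and activities in S),
    with the concave saturation f(x) = x**r giving diminishing marginal gains."""
    return sum(M[i][j] for j in S) ** r

# toy activity similarity graph
M = np.array([[0.0, 0.6, 0.6],
              [0.6, 0.0, 0.3],
              [0.6, 0.3, 0.0]])

gain1 = coverage(0, [1], M) - coverage(0, [], M)      # gain of the first similar activity
gain2 = coverage(0, [1, 2], M) - coverage(0, [1], M)  # a second, equally similar one adds less
```

Even though activities 1 and 2 are equally similar to activity 0, the second one contributes a smaller gain, which is what pushes the selection toward dissimilar (diverse) activities.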
Let the activity set that user u ∈ U_g has participated in be E_u, and define the activity set of group g as E_g = ∪_{u∈U_g} E_u. For an activity subset S ⊂ E_cand of the candidate activity set E_cand, the coverage of the activity subset S on the group activity set E_g is defined as follows:

cov(E_g, S) = Σ_{e∈E_g} cov(e, S)

Since cov(e, S) is a submodular function, cov(E_g, S) is also a submodular function. Because E_g includes activities that all members of the group have participated in, the submodularity of cov(E_g, S) makes it possible to select activities similar to those that each group member has attended. Therefore, DFGR can be transformed into the problem of maximizing the coverage cov(E_g, S). However, different users have different levels of activeness, which means that some group members participate in more activities than others; maximizing cov(E_g, S) would bias the recommendation results toward the preferences of the more active users in the group, thus reducing the fairness of the group recommendation. To solve this problem, the weighted coverage of the activity subset S on the group activity set E_g is defined as follows:

cov_w(E_g, S) = Σ_{e∈E_g} w_e · cov(e, S), with w_e = r_e / Σ_{e∈E_g} r_e (6)

where w_e represents the weight of activity e, and U_e represents the set of users participating in activity e. The purpose of defining the activity weight according to Eq. 6 is to reduce the weight of activities attended by highly active group members and to increase the weight of activities attended by less active members, avoiding the unfairness caused by direct coverage maximization. Finally, DFGR is transformed into finding a candidate activity subset S of size K that maximizes the weighted coverage cov_w(E_g, S) for the target group g:

S* = argmax_{S⊆E_cand, |S|=K} cov_w(E_g, S) (8)

The weighted coverage maximization problem in Eq. 8 is NP-hard [30]. There are many methods for finding approximately optimal solutions to NP-hard problems, such as evolutionary algorithms [12] and greedy algorithms [29,30]. Although recent results show that a multi-objective evolutionary algorithm can reach a (1 − 1/e) approximation ratio in polynomial time O(N^2 (log N + K)) for the maximization of monotone submodular functions [12], where N is the problem size, the greedy algorithm reaches the same approximation ratio with the lower time complexity O(NK). Therefore, we use greedy Algorithm 1 to find an approximate solution maximizing the objective function. Theorem 3.1 guarantees that the approximate solution obtained by greedy Algorithm 1 has approximation ratio (1 − 1/e).

Theorem 3.1 For the monotone submodular function cov_w(E_g, S), let S* denote the optimal solution of the objective function in Eq. 8, and let S denote the approximate solution obtained by greedy Algorithm 1. The following inequality holds [30]:

cov_w(E_g, S) ≥ (1 − 1/e) · cov_w(E_g, S*)

Experimental Evaluation

In order to reduce the complexity of the training model, only the classification and labels of activities are used in the experiments. Since the dataset does not contain any group information, we evaluate the group activity recommendation methods by generating simulated groups. Before that, data sparsity was reduced through preprocessing. For the Douban local data, only activity data between September and December 2016 are used; for the Meetup data, only activity data from January 2013 to December 2014 are used. In addition, for the activity data of each city, users who participated in fewer than 10 activities are filtered out. Table 1 shows the basic statistics of the datasets.
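A plain greedy selection over the weighted coverage objective can be sketched as follows (a simplified illustration, not the paper's Algorithm 1; the toy similarity graph, weights and function names are all hypothetical):

```python
import numpy as np

def cov(i, S, M, r=0.5):
    """Saturated coverage of the selected set S on activity node i, f(x) = x**r."""
    return sum(M[i][j] for j in S) ** r

def greedy_dfgr(E_g, weights, candidates, M, K, r=0.5):
    """Greedy (1 - 1/e)-approximation for max_{|S|=K} cov_w(E_g, S),
    where cov_w(E_g, S) = sum over e in E_g of weights[e] * cov(e, S, M, r)."""
    S = []
    def objective(sel):
        return sum(weights[e] * cov(e, sel, M, r) for e in E_g)
    for _ in range(K):
        base = objective(S)
        best, best_gain = None, -1.0
        for c in candidates:
            if c in S:
                continue
            gain = objective(S + [c]) - base   # marginal gain of candidate c
            if gain > best_gain:
                best, best_gain = c, gain
        S.append(best)
    return S

# toy instance: activities 0-1 are the group's history, 2-4 are candidates;
# candidate 2 matches only member 0's history, 3 only member 1's, 4 matches both
M = np.array([[0.0, 0.1, 0.9, 0.1, 0.5],
              [0.1, 0.0, 0.1, 0.9, 0.5],
              [0.9, 0.1, 0.0, 0.1, 0.2],
              [0.1, 0.9, 0.1, 0.0, 0.2],
              [0.5, 0.5, 0.2, 0.2, 0.0]])
S = greedy_dfgr(E_g=[0, 1], weights={0: 0.5, 1: 0.5}, candidates=[2, 3, 4], M=M, K=2)
```

On this toy instance the first greedy pick is candidate 4, the one that covers both members' histories, hinting at how the submodular objective favors balanced (fair) selections.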

Evaluation Method
In order to simulate a real group activity recommendation scenario, we first sort all activities in the dataset by time, and select the earliest 80% of the activities and their user participation data as the training set, with the remaining 20% of the activities and their user participation data as the test set. Because the activities in the test set are held later than those in the training set, and there is no intersection between the activities in the two sets, our experiment simulates a realistic group activity recommendation scenario: recommending future activities for a group based on historical activity data.
The purpose of the experiments is to evaluate the accuracy, diversity and fairness of group activity recommendation methods. To evaluate recommendation accuracy, normalized discounted cumulative gain (NDCG) and the recall rate (Recall) are used as accuracy indicators; both have been widely used for evaluating activity recommendation [16,56]. For each group g with test set T_g and top-k recommendation list S_k, Recall and NDCG are defined as follows:

Recall@k = |S_k ∩ T_g| / |T_g|, DCG@k = Σ_{i=1}^{k} rel_i / log2(i + 1), NDCG@k = DCG@k / IDCG@k

where k is the length of the recommendation list, and rel_i is the relevance of the i-th activity in the recommendation list to the group: if the i-th activity appears in the group's test set, rel_i = 1; otherwise rel_i = 0. IDCG@k denotes the maximum DCG@k over all possible recommendation lists of length k. The higher the Recall@k and NDCG@k, the more accurate the recommendation results. We take the average recall rate and NDCG over all groups as the accuracy evaluation indicators. We do not use the precision indicator, because precision is defined as the proportion of relevant activities in the recommendation list and some unknown positive samples are lost [10,23]. Since the fairness of group recommendation is usually evaluated with accuracy indicators [21], NDCG and Recall are used in the experiments to measure both the accuracy and the fairness of group recommendation.
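The two accuracy metrics can be computed straightforwardly (a minimal sketch with binary relevance, matching the reconstructed definitions above; the function names and toy lists are illustrative):

```python
import math

def recall_at_k(recommended, test_set, k):
    """Recall@k = fraction of the group's test activities hit in the top-k list."""
    hits = sum(1 for e in recommended[:k] if e in test_set)
    return hits / len(test_set)

def ndcg_at_k(recommended, test_set, k):
    """NDCG@k = DCG@k / IDCG@k with binary relevance and log2(i+1) discount."""
    dcg = sum(1.0 / math.log2(i + 2)            # position i (0-based) -> log2(i+2)
              for i, e in enumerate(recommended[:k]) if e in test_set)
    ideal_hits = min(len(test_set), k)          # best case: all hits ranked first
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

rec = ["e3", "e7", "e1", "e9"]
test = {"e1", "e7"}
r = recall_at_k(rec, test, 3)   # both test activities appear in the top 3
n = ndcg_at_k(rec, test, 3)     # penalized because the top-1 slot is a miss
```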
In order to evaluate the diversity of group activity recommendations, we use normalized topic coverage (NTC) [9,38]. NTC measures the coverage of the topics or types of the recommended items and uses min-max normalization to map the coverage into the interval [0,1]:

NTC(S) = (cov_g(S) − mincov_g) / (maxcov_g − mincov_g)

where cov_g(S) refers to the proportion of activity types in the recommended activity set S among all activity types, i.e., the topic coverage, and mincov_g and maxcov_g denote the minimum and maximum possible topic coverage, respectively. NTC is a diversity evaluation indicator for traditional recommendation systems, but it cannot evaluate the fairness of group recommendation: high recommendation diversity does not imply high fairness.
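A small sketch of this metric (assuming each activity carries a set of topic labels; the topic table and function names are illustrative, not from the paper):

```python
def topic_coverage(recommended, activity_topics, all_topics):
    """cov_g(S): fraction of all topic types covered by the recommended activities."""
    covered = {t for e in recommended for t in activity_topics[e]}
    return len(covered) / len(all_topics)

def ntc(cov_s, min_cov, max_cov):
    """Min-max normalize topic coverage into [0, 1]."""
    return (cov_s - min_cov) / (max_cov - min_cov) if max_cov > min_cov else 0.0

topics = {"e1": {"music"}, "e2": {"music", "sport"}, "e3": {"tech"}}
all_t = {"music", "sport", "tech"}
cov_s = topic_coverage(["e1", "e3"], topics, all_t)   # covers music and tech
score = ntc(cov_s, min_cov=1 / 3, max_cov=1.0)        # any single pick covers >= 1 topic
```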
Since there is no quantitative indicator for evaluating the fairness of group recommendation, we propose a new indicator to evaluate the fairness of a recommendation list S to a group g, defined in terms of T_u, the set of activities that user u participates in in the test set. A higher value of the fairness indicator indicates higher fairness. The fairness indicator we define can be used not only to evaluate the fairness of group activity recommendation, but also to evaluate the fairness of general group recommendation. In the experiments, the average fairness over all groups is used as the final fairness evaluation indicator.
Inspired by the F index, we define comprehensive evaluation indicators: one jointly evaluating precision and diversity, and one jointly evaluating precision, diversity and fairness. In the field of information retrieval, the F index [9] is defined as the harmonic mean of precision and recall. Similarly, the precision, diversity and fairness of group activity recommendation can be evaluated through an F index. Because the recall rate and NTC share the value range [0,1], the indicator jointly evaluating accuracy and diversity is defined as the harmonic mean of the recall rate and NTC:

F-Recall-NTC@k = 2 · Recall@k · NTC@k / (Recall@k + NTC@k)

The indicator jointly evaluating precision, diversity and fairness is defined as the harmonic mean of the recall rate, NTC and fairness:

F-Recall-NTC-Fairness@k = 3 / (1/Recall@k + 1/NTC@k + 1/Fairness@k)

The harmonic weights in both comprehensive indicators are set to 1.
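The two harmonic-mean indicators above can be sketched directly (a minimal illustration with equal harmonic weights, as stated; function names are illustrative):

```python
def f_recall_ntc(recall, ntc):
    """Harmonic mean of Recall and NTC (both in [0, 1]), F-index style."""
    return 2 * recall * ntc / (recall + ntc) if recall + ntc > 0 else 0.0

def f_recall_ntc_fairness(recall, ntc, fairness):
    """Harmonic mean of Recall, NTC and Fairness, all harmonic weights set to 1."""
    if min(recall, ntc, fairness) == 0:
        return 0.0
    return 3.0 / (1.0 / recall + 1.0 / ntc + 1.0 / fairness)

f2 = f_recall_ntc(0.5, 0.5)                  # equal inputs give the same value back
f3 = f_recall_ntc_fairness(0.5, 0.5, 0.5)
f_skewed = f_recall_ntc(1.0, 0.25)           # dominated by the weaker indicator
```

The harmonic mean is dominated by the weakest component, so a method must do reasonably well on every indicator to score high, which is why these composites are used for the overall comparison.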

Comparison Method
In the experiments, we compare our method with the following recommendation methods: 1. CBPF [56] uses Bayesian Poisson factorization as the basic unit to model social relations, venues, sponsors and content factors; the basic units are then jointly modeled through collective matrix factorization.
2. GLFM [16] is a two-way latent factor model that generates activity recommendations for a single user.At the same time, the method also models time, place, popularity and social factors.
3. Cover [33] is a diversified recommendation method based on coverage considering the accuracy and diversity of the recommendation results.
4. MoFIR [13] introduces a practical multi-objective reinforcement learning approach to balance relevance and fairness in recommender systems.
5. GFAR [18] proposes a definition of fairness that "balances" the relevance among group members in a rank-sensitive way.
6. FPGR [40] presents a method for efficiently enumerating fair group recommendations.
7. MTAF [46] explores the application of group fairness, a concept in machine learning fairness, in a multi-task scenario.
In order to evaluate the influence of the saturation function in the weighted coverage on the effectiveness of DFGR recommendation, both nonlinear and linear saturation functions are tested. For the nonlinear saturation functions, f(x) = x^r is used with r ∈ {0.1, 0.5, 0.8, 1}; when r = 1, the saturation function is linear. In order to study the influence of different context factors on the recommendation results, four DFGR variants are implemented: DFGR-V1, DFGR-V2, DFGR-V3 and DFGR-V4. DFGR-V1 calculates the similarity between activities according to the activity-sponsor relationship matrix, DFGR-V2 according to the venue and sponsor, DFGR-V3 according to the time, venue and sponsor, and DFGR-V4 according to the time, venue, sponsor and content of the event.

Experimental Result
This section compares the effectiveness of DFGR and the comparison methods in recommending group activities on the Douban city dataset. Figure 2 reports the six evaluation indicators for the comparison methods and our proposed method on the Douban dataset.
Figure 2a shows that the recall rate of the GFAR method is higher than that of the other methods. Cover has a low recall rate because it mainly targets recommendation for individual users, and its recommendation generation process does not calculate the similarity between users and activities.
Similar results are observed in Fig. 2b. In addition, the recall rate and NDCG of DFGR and Cover are lower than those of MTAF. Among the diversity-oriented recommendation algorithms, DFGR has the highest recall rate and NDCG.
Figure 2c shows that the NTC indicators of the Cover and DFGR diversity recommendation methods are considerably higher than those of the other methods. The NTC indicator of GFAR is higher than that of MoFIR, and the NTC indicator of MoFIR is higher than that of CBPF-AVG, meaning that DFGR, Cover and MoFIR can effectively improve the diversity of the recommendations. In addition, we observe that the diversity-oriented recommendation algorithms in the experiment improve the diversity of recommendations while reducing recommendation accuracy to a certain extent. This shows that it is difficult to improve the accuracy and diversity of group activity recommendation at the same time; we can only seek a balance between them.
Figure 2d shows the fairness indicator of the group activity recommendations. It can be observed that FPGR has the highest fairness, followed by our DFGR method. This shows that DFGR can effectively improve the fairness of group activity recommendation.
Figure 2e shows the comprehensive evaluation indicator on accuracy and diversity. It can be observed that when the number of recommended items K ≥ 10, the F-Recall-NTC indicator of the DFGR method is higher than that of the other methods, indicating that DFGR achieves a good balance between recommendation accuracy and diversity. When the number of recommended items is small (K < 10), the F-Recall-NTC indicators of FPGR and MTAF are low. This is mainly because FPGR and MTAF both use greedy algorithms to optimize their objective functions, selecting the activities that maximize the objective function from the candidate set one by one and adding them to the recommendation set; the submodular nature of the objective function only takes effect when the recommendation set is large. Therefore, for a small recommendation list (K < 10), the F-Recall-NTC indicator of the results of FPGR and MTAF is low. Figure 2f shows the harmonic mean F-Recall-NTC-Fairness of the three evaluation indicators: recall rate, NTC and fairness. It can be observed that DFGR is substantially superior to the other methods on this comprehensive indicator, which indicates that our method finds a better balance among accuracy, diversity and fairness.
Table 3 shows the effectiveness of three different greedy algorithms for maximizing the weighted coverage on the Douban dataset, with the recommended number of activities fixed at 20. It can be observed that the recommended results of the lazy-greedy algorithm are identical to those of the original greedy algorithm, and its other indicators are higher than those of s-greedy except for the NTC indicator. The reason is that lazy-greedy uses delayed updating to improve efficiency, and the approximation guarantee of its recommended results is the same as that of the original greedy algorithm, while s-greedy uses random sampling to improve efficiency at the cost of some loss in approximation quality. Specifically, the approximation ratio of the s-greedy algorithm is (1 − 1/e − ε), which is lower than the (1 − 1/e) ratio of the original greedy algorithm.
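The s-greedy trade-off can be sketched as code. The following follows the standard stochastic-greedy scheme (evaluate only a random sample of candidates at each step, with sample size (n/K)·ln(1/ε)); the function names and the generic marginal-gain callback `gain` are assumptions, not necessarily the paper's exact variant:

```python
import math
import random

def stochastic_greedy(candidates, gain, K, eps=0.1, seed=0):
    """Stochastic greedy (s-greedy): at each step evaluate only a random
    sample of size about (n / K) * ln(1 / eps), which yields a
    (1 - 1/e - eps) approximation in expectation for monotone
    submodular objectives, instead of the full (1 - 1/e) of plain greedy."""
    rng = random.Random(seed)
    remaining = list(candidates)
    sample_size = max(1, int(len(remaining) / K * math.log(1 / eps)))
    S = []
    for _ in range(min(K, len(remaining))):
        # Only the sampled candidates are evaluated, not the whole set.
        sample = rng.sample(remaining, min(sample_size, len(remaining)))
        best = max(sample, key=lambda e: gain(S, e))
        S.append(best)
        remaining.remove(best)
    return S
```

As eps → 0 the sample covers the whole candidate set and the procedure reduces to the plain greedy algorithm, recovering the (1 − 1/e) guarantee at full cost.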

Effect of Different Contexts
This subsection discusses the influence of different context factors in DFGR on recommendation accuracy and diversity. Figure 3a-c show that the recall rate of DFGR-V2 is higher than that of DFGR-V3 and DFGR-V4, which indicates that adding the time and content factors does not effectively improve the recommendation accuracy. The recall rate of DFGR-V2 is also higher than that of DFGR-V1, which indicates that considering both the host and venue factors is more effective than considering the host factor alone.
Figure 3b shows that the diversity of DFGR-V4 recommendations is considerably lower than that of the other three variants, indicating that introducing the content factor reduces the diversity of recommendations. The main reason is that, after considering the content factor, DFGR is more inclined to recommend activities similar to those the group has already participated in, while the diversity indicator NTC measures the content similarity between recommended activities according to activity type; hence the content factor reduces the recommended diversity. The diversity of DFGR-V1 recommendations is higher than that of the other three variants, indicating that diversity is best when only the host factor is considered.
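To make the role of the context factors concrete, the following is a hypothetical sketch of how per-context similarities (host, venue, time, content) could be combined into one activity similarity. The exact-match, exponential-decay, and Jaccard forms, and the weighting scheme, are illustrative assumptions rather than the paper's definitions; the DFGR variants correspond to which weights are non-zero (V1: host only; V2: host + venue; V3: + time; V4: + content):

```python
import math

def jaccard(x, y):
    """Jaccard similarity between two tag collections."""
    x, y = set(x), set(y)
    return len(x & y) / len(x | y) if x | y else 0.0

def activity_similarity(a, b, weights):
    """Weighted combination of per-context similarities between two activities.
    `a` and `b` are dicts with keys host, venue, start (seconds), tags."""
    sims = {
        "host":    1.0 if a["host"] == b["host"] else 0.0,
        "venue":   1.0 if a["venue"] == b["venue"] else 0.0,
        "time":    math.exp(-abs(a["start"] - b["start"]) / 86400.0),
        "content": jaccard(a["tags"], b["tags"]),
    }
    used = {k: w for k, w in weights.items() if w > 0}
    return sum(w * sims[k] for k, w in used.items()) / sum(used.values())
```

Under this sketch, adding the content term pulls the similarity of same-type activities up, which is consistent with the observation that the content factor makes recommendations concentrate on activities resembling the group's history.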

Effect of Saturation Function
This section studies, through experiments, the influence of the parameter r of the saturation function f(x) = x^r in DFGR on recommendation effectiveness. Figure 3d shows the indicators of DFGR when the parameter r is 0.1, 0.5, 0.8 and 1. It can be observed that when r = 0.5, the recall rate, NTC and comprehensive evaluation indicator of DFGR reach their maximum. When r = 0.1, the NDCG indicator reaches its maximum. For different r values, the value of F-Recall-NTC-Fairness changes little.
In addition, when r = 1, the NTC indicator and the comprehensive indicator reach their minimum. The reason is that when r = 1, the saturation function becomes linear, so the weighted coverage loses its submodular property and cannot provide a diminishing marginal utility effect for similar activities, which reduces the diversity and fairness of recommendations. Moreover, as the r value increases, the NDCG indicator gradually decreases, which indicates that a linear saturation function also reduces recommendation accuracy.
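The effect of r can be illustrated directly: for r < 1 the marginal gain f(x + 1) − f(x) shrinks as x grows (diminishing returns, the source of submodularity), while for r = 1 it is constant. A small sketch:

```python
def marginal_gains(r, xs):
    """Marginal gain f(x + 1) - f(x) of the saturation function f(x) = x**r."""
    return [(x + 1) ** r - x ** r for x in xs]

gains_concave = marginal_gains(0.5, range(4))  # strictly decreasing
gains_linear = marginal_gains(1.0, range(4))   # constant: no diminishing returns
```

With a concave f, the objective rewards the first activity that covers a region of the similarity graph much more than further similar ones, which is exactly the mechanism that pushes the recommendation set toward diverse and fair coverage.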

Effect of Group Sizes
This section shows the impact of group size on the accuracy and diversity of group activity recommendation. Figure 4a-c show the Recall, NTC and F-Recall-NTC indicators of the group activity recommendation methods for groups of different sizes in the Douban dataset.
In Fig. 4a, it can be observed that DFGR and MoFIR show no substantial difference in recall rate across groups of different sizes, and both are higher than the other methods. This is because MoFIR uses a multi-objective reinforcement learning strategy that can learn a single parameter over all possible preference spaces to represent the recommendation results of the optimization strategy. Moreover, as the group size increases, the recall rate of group activity recommendation shows no uniform regularity.
Figure 4b shows that for groups of different sizes, the diversity indicator NTC of DFGR is considerably higher than that of the other benchmark methods. There is no substantial difference between the MTAF and MoFIR methods in the NTC indicator. The recommendation diversity of the GFAR and FPGR methods decreases as group size increases, while that of the Cover method does not change considerably.
Figure 4c shows that for groups of different sizes, the F-Recall-NTC indicators of DFGR are substantially higher than those of the other comparison methods, indicating that DFGR has a relatively good effect in balancing accuracy and diversity. Figure 4d shows that DFGR has the highest score among all methods, which indicates that the DFGR method can effectively balance accuracy, diversity and fairness in group recommendations.

Fig. 4 The result on the size of group

Conclusion
We study the diversity and fairness of group activity recommendation on EBSN platforms and propose a coverage-based group activity recommendation method. The weighted coverage on the activity similarity graph is defined as a submodular function, and the accuracy, diversity and fairness of recommendations are considered simultaneously by exploiting the diminishing marginal utility effect of the submodular function. In addition, we propose a new evaluation indicator to measure the fairness of group recommendation. Experimental results on real datasets show that the proposed method can effectively find a balance between recommendation accuracy, fairness and diversity, and is superior to existing group activity recommendation methods on the comprehensive evaluation indicator combining the three metrics.

Fig. 1 A flow diagram of DFGR

Datasets
We use real datasets collected from Douban and Meetup. The Meetup dataset was obtained by Ma et al. [25] using the Meetup REST API and contains activity data held between January 2010 and April 2014. The experiments use the activity data of Chicago and Phoenix from the Meetup dataset, and the activity data of Beijing and Shanghai from Douban's same-city events channel. For each activity in the dataset, the host, participants, text content, venue and start time are collected. The text content of an activity includes its title, category, tags and introduction.

Fig. 2
Fig. 2 Result on group activity recommendation

Table 1
Comparison of our work with existing research

Algorithm 1 :
Greedy Algorithm for Maximizing Weighted Coverage
Input: Candidate activity set E_cand, related activity set E_g of group g, activity similarity matrix M, recommended number of activities K
Output: Recommended activity set S
Initialize activity set S = ∅;
while |S| < K do
  Traverse the candidate set E_cand to find the candidate activity e that maximizes the objective function (8);
  S = S ∪ {e};
  E_cand = E_cand \ {e};
end
The diminishing marginal utility makes it less beneficial for activities similar to those that some users have participated in (those in E_g) to join the recommended activity set S, since their marginal coverage gain is reduced, and instead favors activities of interest to different users in the group, improving the coverage cov(E_g, S).
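Algorithm 1 can be sketched as runnable code. Since the objective function (8) is not reproduced here, we assume for illustration a saturated weighted coverage cov(E_g, S) = Σ_{e∈E_g} f(Σ_{s∈S} M[e][s]) with f(x) = x^0.5; the function names and this objective are assumptions, not the paper's exact formulation:

```python
def coverage(E_g, S, M, r=0.5):
    """Saturated weighted coverage: each relevant activity e in E_g
    contributes f(sum of its similarities to the recommended set S),
    with the concave saturation f(x) = x**r making the objective submodular."""
    return sum(sum(M[e][s] for s in S) ** r for e in E_g)

def greedy_recommend(E_cand, E_g, M, K, r=0.5):
    """Algorithm 1: repeatedly add the candidate with the largest marginal
    coverage gain; for a monotone submodular objective this is a
    (1 - 1/e)-approximation of the optimum."""
    S, cand = [], list(E_cand)
    while len(S) < K and cand:
        best = max(cand, key=lambda e: coverage(E_g, S + [e], M, r))
        S.append(best)
        cand.remove(best)
    return S
```

Because the saturation is concave, a candidate similar to activities already in S adds little to the coverage, so the greedy selection naturally spreads the recommendations over the interests of different group members.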

Table 2
Statistics of the datasets

Table 3
Compare three different greedy algorithms

Fig. 3
Result on different context factors and saturation function