EvoRecSys: Evolutionary framework for health and well-being recommender systems

In recent years, recommender systems have been employed in domains like e-commerce, tourism, and multimedia streaming, where personalising users’ experience based on their interactions is a fundamental aspect to consider. Recent recommender system developments have also focused on well-being, yet existing solutions have been designed considering a single well-being aspect in isolation, such as a healthy diet or an active lifestyle. This research introduces EvoRecSys, a novel recommendation framework that proposes evolutionary algorithms as the main recommendation engine, thereby modelling the problem of generating personalised well-being recommendations as a multi-objective optimisation problem. EvoRecSys captures the interrelation between multiple aspects of well-being by constructing configurable recommendations in the form of bundled items with dynamic properties. The user’s preferences and a predefined well-being goal are jointly considered. By instantiating the framework into an implemented model, we illustrate the use of a genetic algorithm as the recommendation engine. Finally, this implementation has been deployed as a Web application in order to conduct a user study.


Introduction
The Internet enables ubiquitous access to a vast array of online products and services. However, while this offers users the benefit of greater choice, finding a preferred product or service when presented with seemingly endless options requires significant exploration time. To attenuate this problem of information overload, recommender systems (RS) were introduced to supply a user with targeted results (i.e. recommendations) based on that user's individual preferences and the preferences of other users with similar characteristics (Aggarwal 2006).
While there is a considerable literature on recommender systems, across a variety of domains, the effort focused on health and well-being recommendation is comparatively scarce. The vast majority of these works focus on a single aspect of health, such as exercise (Berndsen et al. 2017; Pilloni et al. 2017; Reimer et al. 2016) or healthy food intake (Achananuparp and Weber 2016; Akkoyunlu et al. 2017; Schäfer 2016), in isolation. Another limitation of previous work is the lack of flexibility to recommend "tailored" items. In some domains, such as retail (Wu et al. 2019) or entertainment (Gómez-Uribe and Hunt 2015), this is not an issue because recommendations (i.e. products or movies) are intrinsically static, with unchanging properties. By contrast, recommendations in a personalised well-being domain (meal ingredients and serving sizes; duration and intensity of exercise) should be configurable.
We argue that the interrelationship between daily meals and exercise plays a fundamental role in general well-being and the adoption of healthy living habits. Therefore, it is crucial to consider this interrelationship in personalised well-being recommendation. Our present study constitutes, to the best of our knowledge, one of the first research efforts in this direction. We also produce recommendations that are dynamically tailored to users' preferences, such that we capture not only what to recommend, but also how much. To achieve this, we employ a genetic algorithm (GA). While GAs have been used in recommender systems before (e.g. Caldeira et al. 2018; Lv et al. 2015), this technique has not been used to realise the potential of producing highly tailored recommendations, which are necessary in domains such as well-being.
To overcome the aforesaid limitations in recommender systems for well-being, we propose a novel approach:
1. We first introduce EvoRecSys, a conceptual recommendation framework based on an evolutionary multi-objective optimisation problem, where constraints are modelled upon users' preferences, their physical condition, and their well-being goals. The underlying evolutionary algorithm explores the search space of all possible combinations of recommendable items, which take the form of meal-exercise bundles. These bundles capture the interrelationship between eating and exercising in the domain of personal well-being. Output solutions (i.e. items to recommend) balance what the user likes with what the user needs in order to achieve her/his well-being goals. For example, if a user wants to lose weight, then the recommended food items will not only meet user preferences, but also keep within strict calorie limits, while the recommended exercise will take into account the calories burned, so that the bundle as a whole supports weight loss. At the same time, general health guidelines, such as controlling the amount of saturated fats, sugars, etc., are observed.
2. To demonstrate the suitability of the EvoRecSys framework, we instantiate it as a model for general well-being, with four possible well-being goals as the personalised constraints for the user, a well-defined quantity of items to be recommended, and a specific evolutionary implementation.
3. Our instantiated model incorporates principles from collaborative filtering recommender systems. Using a similarity metric, our model identifies users that are similar to the target user in terms of preferences, physical condition, and well-being goal. We show that integrating principles from collaborative filtering helps deal with situations where user interaction information is incomplete, and enhances recommendation diversity, which is fundamental to motivating users during their well-being journey.
Our implemented model is evaluated and validated in two respects: (1) we measure the algorithmic performance and optimality of the underlying evolutionary approach for different parameter settings and by benchmarking against several baseline implementations, demonstrating the model's ability to produce efficient and optimal recommendations that are semantically meaningful; (2) we also conduct a user study with more than 200 participants and demonstrate that personalised recommendations are positively perceived under four criteria: health, diversity, serendipity, and attractiveness. An additional A/B test shows users' tendency to prefer recommendations produced by the proposed EvoRecSys implementation over those of a CF-based implementation.
EvoRecSys advocates the use of genetic algorithms (GAs) as the core evolutionary technique to optimise recommendations. Although this paper illustrates a model instantiated upon EvoRecSys, different models emulating our conceptual framework can be built by defining the necessary input data and objectives to optimise, thereby adapting it to the intended users and aims of the model in question. Furthermore, as opposed to recommending static immutable items, which conventional RS approaches usually deal with, EvoRecSys enables the generation of dynamic recommendations. In summary, this study provides the first effort in establishing a conceptual framework for producing configurable recommendations that incentivise users' well-being using evolutionary computing as the core of the recommendation engine.
This paper is structured as follows. Section 2 describes related research on recommender systems for food and exercise, along with studies that use GAs in the recommendation process. Section 3 explains the architecture and key elements that compose the general EvoRecSys framework. Section 4 presents a concrete implementation of EvoRecSys for health and well-being recommendations. Section 5 analyses algorithmic performance. A user study is then presented in Sect. 6. Finally, Sect. 7 concludes.

Related work
There exist a number of recent research studies on RS for health and well-being. However, unlike our proposed approach, existing studies tend to consider food and physical activity recommendations in isolation, rather than as a combined bundle. Section 2.1 outlines relevant work focused on food recommendation. Section 2.2 describes research related to physical activity recommender systems. Section 2.3 shows studies that incorporate a genetic algorithm as a complementary technique or an extra step during the recommendation process. Finally, Sect. 2.4 presents studies on recommender systems whose output contains bundles of more than one recommendable item. Our contribution to the literature is summarised in Sect. 2.5.

Food recommender systems
In the scope of personalised food recommendation, some approaches have focused on food substitution. For instance, Achananuparp and Weber (2016) hypothesised that food items consumed in the same context can be seamlessly replaced by each other (e.g. a tuna sandwich can be substituted for a ham sandwich if both are consumed with a salad), thereby allowing for greater diversity in daily meals. The "substitutability" between two food items was measured using cosine similarity, and two vector representations for those food items were explored: a positive pointwise mutual information matrix (PPMI matrix) and singular value decomposition (SVD), with the latter obtaining the best performance. This method produces the top-10 food substitute candidates for each food item. The food data used in this work came from the food consumption diaries of 9896 users of the web platform MyFitnessPal (MFP).
Akkoyunlu et al. (2017) argued that it is possible to recommend healthy food substitutes that match user preferences within the same context, where context is defined as the set of other food items that are consumed with the target food. For example, in the meal {tea, bread, juice}, the context of tea is {bread, juice}. Using the French database INCA 2, which contains food diaries of 2624 adults, the recommender model generates a graph, with nodes representing meals in the database. Under this design, substitutable nodes (those belonging to the same dietary context) are adjacent and form a fully connected sub-graph, or clique. Nodes are considered highly substitutable if consumed in similar contexts, and less substitutable if consumed together.
Caldeira et al. (2018) suggest that meal recipes can be recommended by considering their nutritional value, harmony of ingredients, and the availability of the ingredients. This research uses the Non-dominated Sorting Genetic Algorithm II (NSGA-II), introduced by Deb et al. (2002), which is an evolutionary algorithm with the following features: (i) elitism, to preserve the best solutions of the current population in the next generation; (ii) crowding distance techniques, to provide diversity in solutions; and (iii) non-dominated sorting techniques, to maintain an archive of Pareto-optimal solutions. Using this algorithm, a list of suggested meal recipes is found by considering the number of portions, quantity of ingredients, and tastiness. Their approach also makes it possible to specify a food style, such as vegetarian or vegan. The set of recipes used in this study was collected from the Brazilian website TudoGostoso.
Recently, Musto et al. (2020) proposed a knowledge-based strategy that incorporates "holistic" user profile information in a popularity-driven recipe recommender algorithm. Profiles include user data such as demographics, age, gender, and weight, as well as food requirements, physical activity level, and body mass index, in order to re-rank popularity-based recommendations so that user-related health factors are considered. This solution is different from state-of-the-art food RS in that it handles knowledge about users' physical health and behavioural characteristics rather than considering diet preferences alone. However, we note that while physical activity data are used as input to the model, only food recipes are recommended (unlike the approach that we propose in this paper, where we recommend a bundle containing interrelated food items for meals and physical activities).
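The non-dominated sorting used by NSGA-II, mentioned above in relation to Caldeira et al. (2018), rests on the notion of Pareto dominance. The following is a minimal sketch of that notion only, not of NSGA-II itself or of the cited implementation; the two objectives and the recipe scores are hypothetical, and both objectives are assumed to be minimised.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b: no worse on every objective, strictly
    better on at least one (all objectives are minimised in this sketch)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_front(points: List[Sequence[float]]) -> List[int]:
    """Indices of the points that are not dominated by any other point."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical recipes scored as (nutritional error, ingredient unavailability).
recipes = [(0.2, 0.9), (0.4, 0.4), (0.5, 0.5), (0.9, 0.1)]
front = non_dominated_front(recipes)  # (0.5, 0.5) is dominated by (0.4, 0.4)
```

The quadratic scan is adequate for illustration; NSGA-II additionally ranks the remaining points into successive fronts and uses crowding distance to break ties.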

Recommender systems for physical activity
In terms of physical activity recommender systems, there are several studies whose objective is to change user behaviour towards a healthy physical lifestyle. For example, Reimer et al. (2016) advocated for users to change their habits in a tailored manner in order to reach exercise goals. The framework proposed in this research motivates a user through "nudges" (Thaler and Sunstein 2008). There are various types of nudges, such as suggestions, praise, and rewards. The nudges accepted by the user are used to create a personalised profile that will encourage the user to reach the goal. Furthermore, this framework utilises a collaborative filtering technique to generate recommendations focused on the goals. Users' socio-demographic data and their past behaviour are used to build the feature vector that represents each user. Similarity between two users is calculated using cosine similarity.
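As a brief illustration of the similarity measure mentioned above, the sketch below computes cosine similarity between two user feature vectors. The feature values are hypothetical and are not taken from Reimer et al. (2016); returning 0.0 for an all-zero vector is a convention chosen here.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention for this sketch: no direction, no similarity
    return dot / (norm_u * norm_v)


# Hypothetical user vectors, e.g. (scaled age, weekly workouts, accepted nudges).
user_a = [0.3, 4.0, 7.0]
user_b = [0.5, 3.0, 6.0]
similarity = cosine_similarity(user_a, user_b)
```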
Some research tackles the domain of sports. For instance, Pilloni et al. (2017) argue that it is possible to predict when a user is going to abandon an exercise routine based on their previous behaviour and thus prevent it. The proposed model uses a machine learning algorithm as the core of the recommendation process. The previous user behaviour is used as a training vector, which has 34 features including covered distance, workout duration, and rest time. Once the algorithm is trained, it is able to predict whether a user is going to abandon the routine. If so, a recommendation encouraging the user to continue the routine is triggered. Otherwise, the system predicts the user will not abandon the routine. The study tested four classification algorithms: (i) random forest, (ii) AdaBoost, (iii) extra trees, and (iv) multi-layer perceptron, with random forest obtaining the best performance. Data used for the analysis were taken from the u4fit platform.
Following the same path, Berndsen et al. (2017) showed that amateur runners (those without advanced training or access to a coach) can improve performance using elite runners' behaviour as a target to follow. Two models were trained to predict marathon times using users' finishing times at various distances: K-nearest neighbours (KNN) and extreme gradient boosting (XGB), the latter of which was shown to have higher performance. The predicted times are the basis of the recommendations, which are produced by collaborative filtering. For instance, if a runner finishes a 10 km race in 63 min, while an elite runner takes 46 min, the recommendation would be, "you have to train a little bit more". The dataset employed in this work was taken from diverse websites where athletes are allowed to declare their race times, such as the website RunnersWorld. Additionally, the work explored ways to best present recommendations to runners in order to nudge their training behaviours.

Evolutionary algorithms in recommender systems
Evolutionary computation techniques have been used in many domains, including, to name a few, Computer Science (Dutta et al. 2020; Gunasegaran and Cheah 2019), Geology (Rezaei and Asadizadeh 2020), Biology (Guo et al. 2020), and Chemistry (Buchely et al. 2020). In the area of RS, genetic algorithms (GAs), one of the primary evolutionary computing techniques, have rarely been applied to date.
One of the most typical applications of RS is in e-commerce. For example, Lv et al. (2015) proposed a framework to help the standard recommendation techniques (collaborative filtering and content-based) improve the quality of recommendations through a traditional GA and a class-based ontology, which is built by considering an item that the user is interested in. For instance, if the user is interested in a book, the class would be Book and some of its attributes would be title and publication date. The workflow of this framework consists of: (i) retrieving the items in the user's shopping cart from the log file of an e-commerce site; (ii) mapping each attribute as a class to build the ontology; (iii) using the GA to optimise the different feature weights in the set of items; (iv) using the coefficients calculated in the previous step to cluster items; and (v) recommending the cluster nearest to the product that the user was interested in during the last visit. To evaluate the framework, the MovieLens dataset was used, showing better performance than standard collaborative filtering and content-based techniques.
Hassan and Hamada (2018) have also used a GA as an optimisation step within a multi-criteria RS to compare well-known RS methods (collaborative filtering and content-based) and GA-based methods. In the study, three variations of GA were used: (i) a standard GA, in which a population of candidate solutions (called individuals) to an optimisation problem is evolved towards better solutions, and each candidate solution has a set of properties which can be mutated and altered by genetic operators with a fixed probability of occurrence (Whitley 1994); (ii) an adaptive GA, in which population information in each generation is used to adjust the probability of both mutation and crossover in order to maintain population diversity and sustain the convergence capacity (Srinivas and Patnaik 1994); and (iii) a multi-heuristic GA, in which the principal features of two or more heuristic approaches are combined to form a single algorithm for enhancing performance and preventing premature convergence during the search process. Crossover and mutation rates are initially set high, and then reduced slowly over time. Results demonstrated that GA-based approaches could outperform collaborative filtering and content-based RS methods when tested using data from the Yahoo! Movie website.
Karabadji et al. (2018) presented another multi-criteria based study and found that a GA can suggest a suitable set of neighbours in a collaborative filtering RS. The research demonstrated that a GA (i) alleviates the parameter-setting problems related to selecting the N most similar neighbours and (ii) guarantees diversity by selecting groups of individuals that are different. In this way, both high similarity and high diversity are achieved. The model was tested using a dataset containing 239 ratings by 100 customers from 17 Algerian insurance companies, and the MovieLens dataset.
Finally, Cui et al. (2017) argued that, at the cost of a certain degree of precision, it is possible to improve the metrics of diversity and novelty by adding a multi-objective GA during the recommendation process. The GA represents each individual as a 1-D integer array, where each locus in the genotype represents an item that can only appear once in the recommendation list. While the mutation operator has a standard functionality, the crossover operator is designed to preserve a user's habits such that if an item appears frequently in the user's recommendation lists, the probability of preserving it unchanged during the crossover process will increase. For evaluation, two objective functions are used: (i) accuracy and (ii) diversity. The authors validate the performance of their GA in combination with a set of traditional recommendation algorithms. Results demonstrate that the combination of their GA and the recommendation algorithms can achieve a good balance between precision and diversity.

Bundling in recommender systems
In recommender systems, an output recommendation may contain multiple items, which we call a bundle. For instance, Rapti et al. (2014) introduced an agent-based approach for generating personalised product bundles for enterprise networks. The core process includes identifying complementary associations between products and building bundles according to customer preferences. Furthermore, the approach is able to adapt itself if the environment changes: customer profile modifications, product availability, and rule and constraint diversity. The authors provide an example under an e-Furniture context, employing an agent-based system in a network of manufacturing enterprises.
Bundling is also used in other recommender system domains such as the telecommunication industry. For example, Dragone et al. (2018) present a system whose outputs are combined services (mobile connectivity, broadband allocation, TV on demand, etc.) and electronic device plans (smartphones, tablets, TVs), selected according to the customer's needs. The system adopts the constructive preference elicitation framework, which allows bundle offers to be modelled as a defined set of variables and constraints. By using constraint optimisation, the system generates high-utility recommendations. Furthermore, an empirical validation study involving 134 participants is presented. Results show that the outputs of the system were considered more satisfactory than those obtained with standard techniques used in the market.
Zanker et al. (2010) applied bundling to the tourism domain, with recommendations containing bundle collections of accommodation, activities, and restaurants. This system uses a constraint satisfaction problem (CSP) solver, which invokes numerous recommender systems to propose a ranked list of items for each product category based on the user model and available community knowledge. The authors evaluate their system using an example scenario consisting of 5 product classes with 30 different product properties and 23 representative constraints. The evaluation only focuses on computation time and does not consider the quality of the final recommendations. Results demonstrate that the system is able to generate bundles within a time period that is acceptable for typical e-commerce situations.

Contribution
While bundle recommendations have been explored in a number of domains, we present the first research that considers recommendation bundling in the health and well-being domain. We focus on linking two aspects that have previously been studied separately: (i) nourishment (see Sect. 2.1) and (ii) physical activities (see Sect. 2.2). We combine physical activity and diet as this aligns with the weight loss/management goal of the user and is designed to illustrate how lifestyle bundles can be incorporated to provide more holistic advice. Elliot and Hamlin (2018) have presented evidence that people like to treat healthy lifestyle in a collective manner when making efforts to change their behaviour and improve health (also see Johns et al. 2014).
Also, while other recommender systems have included a GA, often the role of the GA is limited to a single step inside the recommendation process (see Sect. 2.3). By contrast, we introduce a novel approach for generating recommendations entirely based on a GA. Our approach also enables us to combine items with a high level of granularity, such that the attributes of each recommendable item can be optimised throughout the evolutionary process. This provides an opportunity to offer highly tailored recommendation items based on user preferences.
Furthermore, as previously stated in Sect. 2.3, GAs have been successfully used to improve traditional RS techniques such as collaborative filtering (Hassan and Hamada 2018; Karabadji et al. 2018) and matrix factorisation (Kilani et al. 2018). Thus, the presented research represents an effort to continue the integration path of GAs in the RS domain.

EvoRecSys: general description
This section introduces EvoRecSys (Evolutionary Recommender System), a conceptual framework that reformulates the recommendation problem as a multi-objective optimisation problem in which solutions to the problem are modelled by configurable items or groups of items to recommend to the end user. Under this approach, it is plausible to build recommendations focused on (i) reaching a specific well-being goal and (ii) meeting the user's preferences.
The remainder of this section describes the main elements of our proposed approach for personalised well-being. Section 3.1 shows the general framework architecture, the workflow model and a general description of the data sources that could be used by the framework during the evolutionary-recommendation process. Section 3.2 defines essential concepts related to the framework. Finally, Sect. 3.3 describes the key features of the core element of this framework: a genetic algorithm.

Architecture and workflow
EvoRecSys receives the user input (physical characteristics, well-being goals, and food and exercise preferences) and recommends meal and physical activity (PA) bundles that are tailored to the user through an evolutionary optimisation process. The architecture and workflow are presented in Fig. 1.
The main inputs of EvoRecSys are divided into four elements of user-related information:
(i) Physical status and exercising habits. Data related to age, gender and body measurements of the user, as well as their frequency of exercising.
(ii) Food category preferences. How much the user likes certain ingredients, predetermined types of food, etc. To measure this, a numerical scale can be used, for example a 5-point Likert scale. Due to the versatility of this element, it is possible to implement it in diverse ways. For instance, it might focus on specific dietary requirements, such as those of vegetarians, vegans, or people with certain allergies.
(iii) Preferences on types of physical activity. Information that describes how much the user likes certain types of PA. As in the previous element, it can be implemented focusing on a predetermined set of physical activities, for example water activities, or sports where a ball is used. To measure the user preferences, a minimum-maximum-based numerical scale can be used, similar to the previous element.
(iv) Well-being goal. A goal chosen by the user from a set of predefined goals focused on a specific well-being aspect to be improved. The goals can be set for handling either general circumstances (losing weight, maintaining weight, etc.) or specific ones (controlling chronic diseases), depending on the specific aims of the model implemented upon this general framework. Without loss of generality, our instantiated model in Sect. 4 focuses on general-purpose recommendations aligned with various well-being goals.
During the evolutionary recommendation process, the framework interacts with a data source that contains, at least, food data and PA data. Previous user preferences are also needed in models that gather users' interactions and feedback over time. Due to the decoupled architecture of the framework, the data source can be any readily accessible database (e.g. MyFitnessPal, u4fit, or Kaggle) as long as it contains the necessary data to perform the evolutionary recommendation. The framework outputs a list of K recommendations, which are tailored to the user preferences and the chosen well-being goal. Next, we describe the proposed evolutionary process and its core components.
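To make the four input elements described above more concrete, the following is one possible way to represent them in code. It is a sketch under stated assumptions rather than the framework's actual data model: the class and field names (UserProfile, height_cm, exercise_days_per_week, and so on) are hypothetical, only two example goals are listed, and a 1 to 5 scale is assumed for both preference elements.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import chain
from typing import Dict


class Goal(Enum):
    """Example well-being goals; a concrete model would define its own set."""
    LOSE_WEIGHT = "lose weight"
    MAINTAIN_WEIGHT = "maintain weight"


@dataclass
class UserProfile:
    # (i) physical status and exercising habits (field names are hypothetical)
    age: int
    gender: str
    height_cm: float
    weight_kg: float
    exercise_days_per_week: int
    # (ii) food category preferences and (iii) PA type preferences, rated 1-5
    food_prefs: Dict[str, int]
    pa_prefs: Dict[str, int]
    # (iv) well-being goal
    goal: Goal

    def __post_init__(self) -> None:
        # Reject ratings outside the assumed 5-point scale.
        for name, rating in chain(self.food_prefs.items(), self.pa_prefs.items()):
            if not 1 <= rating <= 5:
                raise ValueError(f"rating for {name!r} must be in 1..5, got {rating}")


profile = UserProfile(
    age=30, gender="female", height_cm=165.0, weight_kg=60.0,
    exercise_days_per_week=3,
    food_prefs={"vegetables": 5, "dairy": 2},
    pa_prefs={"water activities": 4, "ball sports": 1},
    goal=Goal.LOSE_WEIGHT,
)
```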

Basic definitions
Genetic algorithms (GAs) are an optimisation technique inspired by natural selection. They operate by "evolving" a population of individuals, each representing a possible solution in the problem domain. Initially, the population is of poor quality and widely dispersed across the search space. Over time, the population gradually converges towards regions of the search space with better solutions, which in our case are those closer to the user's preferences and their health needs.
In the scope of this study, and based on existing coding schemes to represent individuals (Goldberg 1989), we encode individuals as meal-PA bundles, containing food attributes and PA features. A meal is defined as a set of N food items. Additionally, a PA item defines a physical exercise that could contain attributes such as the type of PA, duration, intensity level etc. This is illustrated in Fig. 2. Additionally, it is feasible to define semantic rules that guarantee coherent meal structures (an example will be shown in the concrete implementation presented in Sect. 4).
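The encoding described above can be sketched as a small set of data classes. This is an illustrative assumption about how an individual might be represented, not the authors' implementation: the attribute names, the toy food and activity lists, and the candidate serving sizes and durations are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class FoodItem:
    name: str
    serving_g: int  # configurable property: how much to eat


@dataclass
class PAItem:
    name: str
    duration_min: int  # configurable property: how long to exercise
    intensity: str


@dataclass
class Bundle:
    meal: List[FoodItem]  # a meal is a set of N food items
    activity: PAItem


@dataclass
class Individual:
    bundles: List[Bundle]  # K meal-PA bundles form one candidate solution


# Hypothetical toy catalogue standing in for a real food/PA data source.
FOODS = ["rice", "chicken", "apple", "salad", "yoghurt"]
ACTIVITIES = ["walking", "cycling", "swimming"]


def random_bundle(rng: random.Random, n_items: int) -> Bundle:
    """Build one bundle with n_items distinct foods and one activity."""
    meal = [FoodItem(f, rng.choice([50, 100, 150, 200]))
            for f in rng.sample(FOODS, n_items)]
    activity = PAItem(rng.choice(ACTIVITIES),
                      rng.choice([15, 30, 45, 60]),
                      rng.choice(["low", "moderate", "high"]))
    return Bundle(meal, activity)


def random_individual(rng: random.Random, k: int, n_items: int) -> Individual:
    """Random initial individual with k bundles."""
    return Individual([random_bundle(rng, n_items) for _ in range(k)])


individual = random_individual(random.Random(0), k=3, n_items=2)
```

Keeping serving size and duration as fields of each item is what allows the evolutionary process to adjust how much is recommended, and not only what.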
Depending on the design decisions made to build a model upon this general framework architecture, the output can have different structures. For instance, the typical output is the best evaluated individual after the evolutionary process with K bundles. Another possible output would consider the top N individuals after the evolutionary process under the assumption that each individual would have one bundle.

The evolutionary process
Here, we describe the main steps of the genetic algorithm that drives evolution in the EvoRecSys architecture. First, a population of individual "recommendations" is initialised at random across the solution space. Each individual created must: (i) match user preferences; (ii) be within the intrinsic boundaries of the inputs defined; and (iii) have an interrelationship between food and PA items.
To determine the performance of individual recommendations, a key element of a GA is the evaluation function or fitness function. It is inspired by the principle of natural selection whereby the individuals best adapted to a certain environment have more opportunities to survive and, hence, to transmit their genetic information to the next generation of individuals (Goldberg 1989). In order to provide suitable and consistent recommendations that meet (i) the user's preferences and (ii) her/his well-being goals, we define the fitness function upon a set of restrictions or objectives to optimise. Let $R = \{\mu_1, \mu_2, \ldots, \mu_M\}$ represent the set of all possible restrictions to consider, with $M \geq 1$. A fitness function $FF_i$ associated with a user $u_i \in U$ with a goal $G_i$ is defined based on $R$ and her/his individual food-PA preferences $\Psi_{u_i}$. It assesses the degree to which a recommended bundle simultaneously meets $G_i$ and the user preferences:
$$FF_i = \varphi\big(f(\mu_{i,1}), \ldots, f(\mu_{i,M}), \Psi_{u_i}\big)$$
where $f(\mu_{i,j})$ is an aptitude function describing the degree to which restriction $\mu_{i,j}$ ($j = 1, \ldots, M$) is satisfied by the individual; $\Psi_{u_i}$ is a function that measures how much the preferences of $u_i$ are met; and $\varphi$ is a combination function, e.g. an averaging or aggregation operator (Beliakov et al. 2007). For example, $\Psi_{u_i}$ could be a distance function between a representation of the user preferences and the properties of recommendable bundles in an individual. Figure 3 illustrates the definition and application of a fitness function $FF_i$.
The next evolutionary step is the selection of "parent" individuals to reproduce. This step focuses on choosing the fittest individuals (i.e. those with the lowest aptitude values). There are numerous selection methods, including proportional selection, rank-based selection, tournament selection, disruptive selection, and elitism (for further reading, see Jong et al. 1997). Here, we use tournament selection; however, EvoRecSys enables any selection method to be used.
We then produce the offspring population from the selected parents. To enable exploitation of good genetic combinations that have produced high fitness in parents, offspring should be similar to parents. At the same time, to explore the solution space, we need to introduce some novelty in the offspring population. To achieve these two aims, we use the genetic operators crossover and mutation, respectively (e.g. see Goldberg 1989). The crossover operator randomly takes two individuals of the new population and combines a part of each to create two new ones, with the aim of further exploring a specific (and sometimes promising) part of the search space. Under the EvoRecSys framework, it is feasible to implement this genetic operator in different ways and at different granularity levels. Regarding meals, we suggest recombining food items among individuals' bundles, rather than complete meals. Regarding physical activities, the suggestion is similar: recombining them among individuals' bundles.
The mutation operator acts on one individual, such that one element of the genotype (its representation in GA terms) is modified. This operator therefore explores the local region of the search space. In the context of EvoRecSys, a variety of approaches to mutation exist. We suggest, nevertheless, mutating only food items within meals and PA items within bundles of an individual. Due to the flexible architecture of bundles, both food items and PA items can be mutated regardless of the stochastic process implemented for this genetic operator.
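The steps above (fitness evaluation, tournament selection, crossover, and mutation) can be sketched as follows. This is a minimal illustration under several assumptions that are not taken from the paper: fitness is treated as a penalty to minimise, the combination function is a plain mean, a single restriction (net calories per bundle against a target) stands in for the full set of restrictions, and the calorie tables, ratings, and candidate serving sizes are invented toy values.

```python
import copy
import random
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical toy data: kcal per 100 g of food and kcal burned per minute.
KCAL_PER_100G = {"rice": 130, "chicken": 165, "apple": 52, "salad": 20}
KCAL_PER_MIN = {"walking": 4, "cycling": 8, "swimming": 10}

Bundle = Dict[str, list]   # {"meal": [[food, grams], ...], "pa": [name, minutes]}
Individual = List[Bundle]


def net_kcal(bundle: Bundle) -> float:
    eaten = sum(KCAL_PER_100G[f] * g / 100 for f, g in bundle["meal"])
    return eaten - KCAL_PER_MIN[bundle["pa"][0]] * bundle["pa"][1]


def fitness(ind: Individual, target_kcal: float, prefs: Dict[str, int]) -> float:
    """Mean of a goal penalty and a preference penalty, both in [0, 1];
    lower is better (an assumption of this sketch)."""
    goal_penalty = mean(min(1.0, abs(net_kcal(b) - target_kcal) / target_kcal)
                        for b in ind)
    ratings = [prefs[f] for b in ind for f, _ in b["meal"]]
    ratings += [prefs[b["pa"][0]] for b in ind]
    pref_penalty = 1 - (mean(ratings) - 1) / 4  # 5-point rating -> 0 (liked) .. 1
    return mean([goal_penalty, pref_penalty])


def tournament(pop: List[Individual], scores: List[float], k: int,
               rng: random.Random) -> Individual:
    """Pick k random contenders and return the one with the lowest score."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[min(contenders, key=lambda i: scores[i])]


def crossover(a: Individual, b: Individual,
              rng: random.Random) -> Tuple[Individual, Individual]:
    """Swap a single food item between the same bundle slot of two parents."""
    c1, c2 = copy.deepcopy(a), copy.deepcopy(b)
    i = rng.randrange(min(len(c1), len(c2)))
    j = rng.randrange(min(len(c1[i]["meal"]), len(c2[i]["meal"])))
    c1[i]["meal"][j], c2[i]["meal"][j] = c2[i]["meal"][j], c1[i]["meal"][j]
    return c1, c2


def mutate(ind: Individual, rng: random.Random) -> Individual:
    """Replace one food item or the PA item of one randomly chosen bundle."""
    child = copy.deepcopy(ind)
    bundle = rng.choice(child)
    if rng.random() < 0.5:
        j = rng.randrange(len(bundle["meal"]))
        bundle["meal"][j] = [rng.choice(list(KCAL_PER_100G)),
                             rng.choice([50, 100, 150, 200])]
    else:
        bundle["pa"] = [rng.choice(list(KCAL_PER_MIN)), rng.choice([15, 30, 45, 60])]
    return child


prefs = {"rice": 4, "chicken": 5, "apple": 3, "salad": 2,
         "walking": 4, "cycling": 3, "swimming": 1}
p1 = [{"meal": [["rice", 150], ["chicken", 100]], "pa": ["walking", 30]}]
p2 = [{"meal": [["apple", 100], ["salad", 200]], "pa": ["cycling", 45]}]
rng = random.Random(42)
scores = [fitness(p, 400, prefs) for p in (p1, p2)]
parent = tournament([p1, p2], scores, k=2, rng=rng)
child1, child2 = crossover(p1, p2, rng)
mutant = mutate(child1, rng)
```

Swapping individual food items rather than whole meals follows the granularity suggested in the text; a fuller implementation would also need semantic rules, for instance to avoid the same food appearing twice in a meal after crossover or mutation.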
Finally, genetic algorithms have a number of other parameters to be considered, including population size (i.e. number of individuals), number of generations (or evolutionary iterations), crossover probability, and mutation probability. Holland (1975) states that crossover is the most reliable operator for exploring the search space, whereas mutation complements crossover. Therefore, crossover should have a considerably high probability of taking place and mutation a comparatively small one. The number of generations and the population size are directly proportional to the problem size (Jong et al. 1997). In other words, the more elements individuals represent, the bigger the population size and the number of generations are for the sake of

EvoRecSys implementation for personalised well-being
We introduce a concrete proof-of-concept implementation of EvoRecSys in the domain of well-being and preventative health. Here, we do not consider the more difficult problem of accommodating clinical conditions, so the target user profiles exclude people with chronic diseases (diabetes, hypertension, allergies, etc.) and kinetic limitations (paralysed or amputated limbs). We also do not consider different traditions or cultural backgrounds. However, in future, the framework can be easily extended to encompass these more general cases. Section 4.1 presents the architectural considerations and the data used. Section 4.2 describes the specific design choices made for the GA. Finally, Sect. 4.3 describes the integration of a nearest neighbourhood-based mechanism, inspired by collaborative filtering, which is used in the mutation operator.

Architecture, inputs, and data source
We implement an evolutionary model following the description in Sect. 3.1. The inputs are: physical status and exercising habits (see Table 1), and well-being goal: (1) losing weight, (2) maintaining weight, (3) gaining weight, and (4) building muscle mass. Data on 166 food items allocated in 14 food types and 50 physical activity items (PAs) allocated in 8 types are taken from Health Canada (2008) and Arizona State University (2011), respectively (see Table 2), using the following preference categories (expressed using a 5-point numerical scale): Previous users' data were also collected from an initial survey of 145 users, where each provided their physical status along with their food and activity preferences. This dataset is only used for finding users with similar preferences during the collaborative filtering stage (see Sect. 4.3). The output of this evolutionary model is the best evaluated individual, comprising K meal-PA bundle recommendations.

Evolutionary specifications of the implemented model
This subsection describes the specific design choices made for the GA components in the model implemented, based on the general framework guidelines introduced in Sects. 3.2 and 3.3. The principal element of EvoRecSys is a genetic algorithm (see Algorithm 1), which we describe below.

Creation of individuals
In this model, a meal contains four food items ($N = 4$). To ensure semantic consistency of portions, and to allow for finding diverse and non-repetitive recommendations, each meal contains two food types: (i) a single main food item and (ii) three side food items. Without loss of generality, we consider that each individual contains three bundles ($K = 3$), as shown in Fig. 4. A relevant parameter that influences the interrelationship between the meal and the PA being jointly recommended is the number of calories that the user should consume per day. To calculate this target value, we use the Harris and Benedict equation, owing to the proven trustworthiness of its predictions in existing health-related literature (Lee and Kim 2012). The equation yields the Basal Metabolic Rate (BMR) from the user's weight (in kilograms), height (in centimetres), and age (in years). Total energy expenditure (TEE) is then obtained by multiplying BMR by the physical activity level (PAL) (Shetty et al. 1996):

$$\mathrm{TEE} = \mathrm{BMR} \times \mathrm{PAL} \qquad (4)$$

Table 3 shows the possible values for PAL according to the activity level, which every user of our model is asked to report. Finally, the calculated TEE value is tailored according to the chosen goal. In this implementation, the set of available goals is: (i) losing weight; (ii) maintaining weight; (iii) gaining weight; and (iv) gaining muscle mass. Under the evidenced assumption that a kilogram of body fat contains 7717.75 kilocalories (Wishnofsky 1958), it is feasible to estimate the number of required intake calories for some of the well-being goals previously defined. For instance, if a user reduces their daily intake by 551.26 kilocalories relative to the TEE, the deficit after 7 days amounts to about 3859 kilocalories, and the user would lose approximately 500 g of weight.
The resulting output represents the maximum number of kilocalories that a meal should have, and it is used to calculate the suggested time that should be spent on the PA, considering the chosen well-being goal. In other words, during the stochastic creation of the population, this value helps to determine the segments of the search space that fulfil the user's PA preferences, her/his nourishment requirements, and her/his goal, excluding those that do not.
Remark 1 Although meal and exercise activities are both generated stochastically, they have a linking parameter in common, namely the tailored number of intake calories associated with the target user.
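The calorie calculation described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implemented model: the BMR coefficients are those of the original Harris-Benedict equation (the exact variant used in the model is not restated here), the PAL multipliers are hypothetical stand-ins for the values in Table 3, and treating "gaining weight" as a symmetric surplus is also an assumption. All function names are illustrative.

```python
# Sketch only: coefficients, PAL values and function names are assumptions.
KCAL_PER_KG_FAT = 7717.75  # value quoted in the text (Wishnofsky 1958)

# Hypothetical PAL multipliers; the values in Table 3 may differ.
PAL = {"sedentary": 1.2, "light": 1.375, "moderate": 1.55, "active": 1.725}


def bmr_harris_benedict(weight_kg, height_cm, age_yr, sex):
    """BMR in kcal/day using the original Harris-Benedict coefficients (assumed variant)."""
    if sex == "m":
        return 66.473 + 13.7516 * weight_kg + 5.0033 * height_cm - 6.755 * age_yr
    return 655.0955 + 9.5634 * weight_kg + 1.8496 * height_cm - 4.6756 * age_yr


def tee(bmr, activity_level):
    """Total energy expenditure: BMR multiplied by the physical activity level."""
    return bmr * PAL[activity_level]


def daily_deficit_for_weekly_loss(kg_per_week):
    """Daily kcal change needed to lose (or gain) kg_per_week of body fat."""
    return kg_per_week * KCAL_PER_KG_FAT / 7


def target_intake(tee_kcal, goal, kg_per_week=0.5):
    """Tailor the TEE to the goal; the symmetric surplus for 'gain' is an assumption."""
    delta = daily_deficit_for_weekly_loss(kg_per_week)
    if goal == "lose":
        return tee_kcal - delta
    if goal == "gain":
        return tee_kcal + delta
    return tee_kcal  # 'maintain' (and, in this sketch, any other goal)


if __name__ == "__main__":
    bmr = bmr_harris_benedict(70, 175, 30, "m")
    print(round(target_intake(tee(bmr, "moderate"), "lose"), 1))
```

With a target loss of 0.5 kg per week, `daily_deficit_for_weekly_loss` returns roughly 551 kcal, which matches the worked figure in the text.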

Evaluation of individuals
This model implements three specific restrictions that compose the fitness function to evaluate individuals (see Table 4). Let $R = \{\mu_{hf}, \mu_{PA}, \mu_{cd}\}$ be the set of all restrictions to consider. A fitness function $FF_i$ associated with a user $u_i \in U$ with a goal $G_i$ is defined based on $R$ and her/his individual food-exercising preferences $\Psi_{u_i}$. It assesses the degree to which a recommended bundle simultaneously meets $G_i$ and the user preferences. In this fitness function, $\phi$ is the arithmetic mean averaging operator. The three restrictions are described as follows:
- The Healthy Food Restriction follows the England Government Dietary Recommendations (England 2016). Based on this, the restriction evaluates independently the amount of proteins, carbohydrates, sugar, fibre, fat, saturated fat, and salt in a meal.
- The Exercising Restriction evaluates the degree of matching between the recommended time in the PA item and the average time that the user spends exercising. This helps, for instance, to ensure that a PA for a given user is neither too mild, nor too ambitious or intense for her/him. We use the MET as the reference value to ensure that the meal-PA combination aligns with the user's intended well-being goal.
- The Consistency and Diversity Restriction evaluates food item diversity from two perspectives: firstly, it evaluates the diversity in a single meal (among the food items that make up the meal) and secondly, it evaluates the diversity among meals within an individual. It also evaluates the diversity among exercising items within an individual. Moreover, this restriction evaluates, in terms of serving size,
Finally, in this EvoRecSys instance, $\Psi_{u_i}$ has been implemented as an additional restriction whose purpose is to evaluate how likeable both the recommended meals and the recommended PAs are, based on the user preferences. Thus, once all four restrictions are employed, the aptitude of an individual is the arithmetic mean of the four restriction values:

$$FF_i = \phi(\mu_{hf}, \mu_{PA}, \mu_{cd}, \Psi_{u_i}) = \frac{\mu_{hf} + \mu_{PA} + \mu_{cd} + \Psi_{u_i}}{4}$$

Remark 2
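The aptitude calculation in this subsection can be illustrated with a short sketch. It assumes that each restriction returns a penalty in [0, 1] where lower is better, and that the four values are combined by an unweighted arithmetic mean; the restriction values and the toy preference penalty below are hypothetical, not the functions used in the implemented model.

```python
from statistics import fmean


def preference_penalty(ratings, lo=1, hi=5):
    """Toy stand-in for the preference term: 0 when every recommended item
    type is rated `hi` by the user, 1 when every one is rated `lo`."""
    return fmean((hi - r) / (hi - lo) for r in ratings)


def fitness(healthy_food, exercising, consistency_diversity, preference):
    """Aptitude of one individual: arithmetic mean of the four restriction
    penalties (lower is better in this sketch)."""
    return fmean([healthy_food, exercising, consistency_diversity, preference])


if __name__ == "__main__":
    # Hypothetical penalties for one individual.
    psi = preference_penalty([5, 3])
    print(fitness(0.2, 0.4, 0.1, psi))
```

Because the mean is unweighted, each restriction contributes equally; a weighted aggregation operator could be substituted for `fmean` without changing the rest of the evolutionary loop.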

Selection
We use tournament selection (Zhang and Kim 2000), which works as follows. First, a pair of individuals is randomly sampled from the population, with replacement. The aptitude values of the two individuals are compared, and the individual with the best aptitude (the lower value in this implementation) is selected and added (as an "offspring") to the new population. The process repeats $N$ times (where $N$ here denotes the population size), until a new offspring population is formed with size equal to the parent population. Tournament size $T$ directly controls selection pressure in the population. With a tournament of size $T = 2$, we expect the best member of the parent population to have, on average, approximately $T = 2$ offspring in the new population; the median member to have, on average, approximately $T/2 = 1$ offspring; and the worst member (the one with the highest aptitude value) to have almost no offspring, since under sampling with replacement it can only win a tournament in which it is drawn twice. We also include elitism, such that the best individual of each generation is reproduced (without modification) into the new offspring population. This ensures that good solutions are not lost during the reproduction process.
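The selection step can be sketched as below. This is an illustrative version only: the function name is hypothetical, and letting the elite individual occupy one of the $N$ slots (rather than being added on top of $N$ tournament winners) is an assumption.

```python
import random


def tournament_select(population, aptitude, rng, tournament_size=2):
    """Build an offspring population of the same size as `population`.

    Lower aptitude is better. The best individual is copied first (elitism);
    the remaining slots are filled by tournament winners, with contenders
    sampled uniformly with replacement.
    """
    elite = min(population, key=aptitude)
    offspring = [elite]
    while len(offspring) < len(population):
        contenders = [rng.choice(population) for _ in range(tournament_size)]
        offspring.append(min(contenders, key=aptitude))
    return offspring


if __name__ == "__main__":
    rng = random.Random(0)
    parents = [0.90, 0.50, 0.30, 0.70, 0.25, 0.60]  # aptitude values as toy individuals
    print(tournament_select(parents, aptitude=lambda x: x, rng=rng))
```

Raising `tournament_size` increases selection pressure, since the winner of a larger tournament is more likely to be one of the fittest individuals.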

Genetic operators: crossover and mutation
Crossover and mutation follow a stochastic process and occur at the element level of each bundle. During crossover and mutation, each element is selected using the following method: (i) each bundle is selected with probability $p_0 = 0.9$; then, (ii) one element within the bundle {Main, Side, PA} is selected with probability {0.2, 0.6, 0.2}; (iii) if Side is selected, each sub-element (each side meal) is selected with probability 0.5, so that multiple sides may be selected. The crossover operator is used to recombine genetic code (the items) between two "parent" individuals, A and B (see example in Fig. 5). An offspring (i.e. a child) is created as a copy of parent A; then, applying the selection method described above to parent B, each selected element (or sub-element) of parent B is inserted into the child in place of the corresponding one. In the example shown in Fig. 5, the child is a copy of Parent A, with two side meals ("beans" and "broccoli") copied from parent B.
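The element selection method and the crossover operator described above can be sketched as follows. The data layout (an individual as a list of bundles, each a dictionary with keys `main`, `sides`, and `pa`) and all names are assumptions made for illustration; only the probabilities are taken from the text.

```python
import copy
import random

P_BUNDLE = 0.9                                             # p_0 in the text
ELEMENT_WEIGHTS = {"main": 0.2, "sides": 0.6, "pa": 0.2}   # {Main, Side, PA}
P_SIDE = 0.5                                               # per side meal


def select_elements(individual, rng):
    """Return (bundle_index, element, side_index_or_None) picks."""
    picks = []
    for b_idx, bundle in enumerate(individual):
        if rng.random() >= P_BUNDLE:
            continue  # this bundle is not selected
        element = rng.choices(list(ELEMENT_WEIGHTS),
                              weights=list(ELEMENT_WEIGHTS.values()))[0]
        if element == "sides":
            # Each side is picked independently, so several may be selected.
            picks += [(b_idx, "sides", s)
                      for s in range(len(bundle["sides"]))
                      if rng.random() < P_SIDE]
        else:
            picks.append((b_idx, element, None))
    return picks


def crossover(parent_a, parent_b, rng):
    """Child = copy of parent A with the selected elements taken from parent B."""
    child = copy.deepcopy(parent_a)
    for b_idx, element, s_idx in select_elements(parent_b, rng):
        if s_idx is None:
            child[b_idx][element] = copy.deepcopy(parent_b[b_idx][element])
        else:
            child[b_idx]["sides"][s_idx] = parent_b[b_idx]["sides"][s_idx]
    return child


def make_individual(tag, k=3):
    """Toy individual with k bundles, used only for demonstration."""
    return [{"main": f"{tag}-main{i}",
             "sides": [f"{tag}-side{i}{j}" for j in range(3)],
             "pa": (f"{tag}-pa{i}", 30)} for i in range(k)]


if __name__ == "__main__":
    print(crossover(make_individual("A"), make_individual("B"), random.Random(1)))
```

Working at the element level means that the child always keeps the bundle structure of parent A (one main, three sides, one PA), so no repair step is needed after recombination.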
The mutation operator is directed by collaborative filtering (detailed fully in Sect. 4.3). First, "similar" neighbours are discovered using collaborative filtering over user preferences; then, when a mutation occurs, the element or sub-element selected is replaced by the corresponding element in the neighbouring user. Figure 6 shows an example of mutation in bundle number 2, for the element PA, which is replaced by the exercise activity "Yoga, 59 minutes", taken directly from a neighbour with similar preferences.
Remark 3 Using collaborative filtering within the mutation operator is non-standard. This novel contribution is designed to heuristically navigate through the population search space, guided by neighbours' preferences.
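A minimal sketch of the neighbour-directed mutation is given below. It assumes that each nearest neighbour is represented by an individual with the same bundle layout (a list of dictionaries with keys `main`, `sides`, and `pa`) and that the donor neighbour is chosen uniformly at random; both are illustrative assumptions, as are all names.

```python
import copy
import random


def cf_mutate(individual, neighbour_individuals, bundle_idx, element, rng,
              side_idx=None):
    """Replace one element (or one side) of `individual` with the
    corresponding element of a randomly chosen nearest neighbour."""
    donor = rng.choice(neighbour_individuals)
    mutant = copy.deepcopy(individual)
    if side_idx is None:
        mutant[bundle_idx][element] = copy.deepcopy(donor[bundle_idx][element])
    else:
        mutant[bundle_idx]["sides"][side_idx] = donor[bundle_idx]["sides"][side_idx]
    return mutant


if __name__ == "__main__":
    bundle = {"main": "chicken", "sides": ["rice", "peas", "corn"], "pa": ("Running", 40)}
    user = [dict(bundle), dict(bundle), dict(bundle)]
    neighbour = copy.deepcopy(user)
    neighbour[1]["pa"] = ("Yoga", 59)
    # Mutate the PA element of bundle number 2 (index 1), as in the Fig. 6 example.
    print(cf_mutate(user, [neighbour], 1, "pa", random.Random(0))[1]["pa"])
```

Compared with the standard operator, which draws the replacement uniformly from the whole item database, the replacement here comes from a user with similar preferences, status, and goal.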

Directing evolution using nearest-neighbour collaborative filtering
In a collaborative filtering RS, items are typically recommended to a given user based on the preferences of users similar to her/him (Alhijawi et al. 2016; Karabadji et al. 2018). In essence, if $u_a$ and $u_b$ are similar users, and $u_b$ has positively rated or liked an item $x_j$ not yet seen by $u_a$, then $x_j$ is likely to be recommended to $u_a$. Accordingly, our proposed model incorporates a strategy inspired by collaborative filtering in the core GA that identifies users similar to the target user. Due to the multi-objective nature of our evolutionary approach, we consider a holistic notion of similarity among users that considers not only their taste towards food and PA, but also their physical characteristics (weight, height, age, gender) and their selected well-being goal. For example, two users who have very similar food preferences but exhibit different physical characteristics and opposing goals, e.g. losing weight versus gaining weight, are unlikely to be considered similar.
To reflect this holistic view, we quantify similarity $sim(u_a, u_b)$ between users $u_a, u_b \in U$ using: (i) food preferences; (ii) PA preferences; (iii) physical status; and (iv) well-being goal. Let $FT = \{ft_1, ft_2, \ldots\}$ be a non-empty finite set of food types and let $p^{ft}_a = [p^{ft_1}_a \; p^{ft_2}_a \; \ldots \; p^{ft_{|FT|}}_a]$ be a vector describing $u_a$'s preferences towards food types. The food-based similarity between $u_a$ and $u_b$ is computed from $d(p^{ft}_a, p^{ft}_b)$, with $d(\cdot, \cdot)$ a normalised distance metric between two vectors, e.g. Euclidean distance. Let $AT = \{at_1, at_2, \ldots\}$ be a non-empty finite set of PA types. Accordingly, let $p^{at}_a = [p^{at_1}_a \; p^{at_2}_a \; \ldots \; p^{at_{|AT|}}_a]$ be a vector describing $u_a$'s preferences towards such PA types. The PA-based similarity between $u_a$ and $u_b$ is computed analogously from $d(p^{at}_a, p^{at}_b)$. The user's physical status is modelled after the attributes employed to calculate the standardised calorie expenditure function: height, weight, age, and gender. Formally, we have $Status_a$ = [weight (kg), height (cm), age (yr), gender (m/f)]. The rationale is that two users with similar physical status and activity levels will have a similar calorie expenditure rate, and therefore, the interrelationship between their food intake and exercise requirements in the recommended bundle should be similar. The similarity between two users' physical status is accordingly calculated from their TEE values [Eq. (4)]. An aggregation function then combines the three similarities on food preferences, PA preferences, and physical status into one, with $W$ a weighting vector for adjusting the relative importance of food preference, PA preference, and physical status.
Finally, the selected well-being goal is used to apply a "rewarding effect" on the aggregated similarity if the two users share the same well-being goal, thereby making users with a common goal more likely to be nearest neighbours of each other. Intuitively, since $0 \le sim(u_a, u_b) \le 1$, we have $\sqrt{sim(u_a, u_b)} \ge sim(u_a, u_b)$. A simple k-nearest neighbour strategy is then applied to identify the $k$ most similar users to $u_a$ based on $sim(u_a, u_b)$. Information about the preferences and needs of these neighbours is used to direct the mutation operator of our evolutionary process, leading to more diverse and meaningful personalised recommendations by further exploring the search space.
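The similarity computation and neighbour search can be sketched as follows. Several details are assumptions made for illustration: each partial similarity is taken as one minus a normalised distance, the physical-status similarity compares TEE values over a hypothetical range `tee_range`, the weights are equal, and the reward for a shared goal is applied as a square root (consistent with the inequality stated above). The dictionary keys and function names are likewise hypothetical.

```python
import math


def normalised_euclidean(p, q, lo=1, hi=5):
    """Euclidean distance scaled to [0, 1] for rating vectors on a lo..hi scale."""
    max_dist = math.sqrt(len(p)) * (hi - lo)
    return math.dist(p, q) / max_dist


def similarity(user_a, user_b, weights=(1 / 3, 1 / 3, 1 / 3), tee_range=2000.0):
    """Holistic similarity in [0, 1]; all modelling choices here are assumptions."""
    sim_food = 1 - normalised_euclidean(user_a["food"], user_b["food"])
    sim_pa = 1 - normalised_euclidean(user_a["pa"], user_b["pa"])
    sim_status = 1 - min(abs(user_a["tee"] - user_b["tee"]) / tee_range, 1.0)
    sim = sum(w * s for w, s in zip(weights, (sim_food, sim_pa, sim_status)))
    if user_a["goal"] == user_b["goal"]:
        sim = math.sqrt(sim)  # reward: sqrt(x) >= x for x in [0, 1]
    return sim


def k_nearest(target, others, k):
    """Return the k users most similar to `target`."""
    return sorted(others, key=lambda u: similarity(target, u), reverse=True)[:k]


if __name__ == "__main__":
    a = {"food": [5, 3], "pa": [4, 2], "tee": 2200.0, "goal": "lose"}
    b = {"food": [4, 3], "pa": [4, 2], "tee": 2200.0, "goal": "lose"}
    c = {"food": [4, 3], "pa": [4, 2], "tee": 2200.0, "goal": "gain"}
    print([u["goal"] for u in k_nearest(a, [c, b], k=1)])
```

In this toy example, users `b` and `c` have identical preferences and status, so the shared goal alone decides which of them is the nearest neighbour of `a`.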

GA Performance analysis
Here, we analyse, optimise, and benchmark the performance of the genetic algorithm used in EvoRecSys.

Finding suitable aptitude and semantic coherency of recommendations
An essential aspect of the proposed EvoRecSys implementation is to have a comprehensive understanding of the fitness value, which indicates the quality and semantic coherency of the recommendations. To demonstrate how to interpret a fitness value, we use an example based on a vegetarian user with a daily intake of 1925 calories, which yields 642 calories per meal. This example user spends 43 min per exercise session and has "losing weight" as their well-being goal. Table 5 shows examples of fitness values (from worst to best aptitude) for bundles tailored to this example user. Using Table 5, we consider 0.2480 as an acceptable fitness threshold for assuring semantic coherency in the output recommendations. Table 5 also shows that small variations in fitness values may yield considerable changes in the quality of the output recommendations. This signals that the fitness function is sensitive and nonlinear.

Finding optimal parameters for the genetic algorithm
Since this implementation of EvoRecSys has been deployed as a Web application hosted on a domestic-use machine (see Sect. 6), the evolutionary process must execute in real time. We therefore conducted experiments to find optimal GA parameter values that can consistently reach the required aptitude threshold. We ran 50 repeated evolutionary trials over the following parameter space: Table 6. Figure 7 presents the mean performance of the best individuals across 50 trials (i.e. the mean system performance, with the shaded region showing the 95% confidence interval). The horizontal dotted line represents the fitness threshold we require for semantic coherency (see Sect. 5.1). Although we can be confident that EvoRecSys will reach the desired aptitude threshold after 60 s, the system continues to improve and does not plateau until 80 s. Therefore, we consider 60 s as the minimum computational time required to build coherent recommendations (on the given hardware); but to ensure the best possible recommendations, we use a run time of 80 s when building recommendations during the user study (Sect. 6). We believe the performance improvement is worth the additional 20 s that each user must wait. Moreover, since more powerful hardware would reduce run times, we focus on producing the best quality recommendations rather than minimising wait time.
Remark 5 All GA trials were conducted using a standard domestic-use machine. Deploying the model on high-performance hardware would significantly reduce run times.

Benchmarking
Here, we benchmark the performance of our proposed algorithm. In particular, since our use of collaborative filtering in the mutation operator is novel, we are interested in quantifying the benefit that this process brings. To achieve this, we compare four approaches, each containing different components of the model.
1. EvoRecSys-naïve: The initial population is configured at random. Crossover is enabled, and the mutation operator works under the "standard" approach, such that a randomly selected item is replaced by another item of the same class (i.e. a side replaces a side; a main replaces a main) that is randomly selected from the database (i.e. collaborative filtering is not used in the mutation operator).
2. EvoRecSys-no-crossover: In this baseline approach, the crossover operator is disabled. Thus, the evolutionary task relies exclusively on the mutation operator, guided by the nearest-neighbour collaborative filtering strategy described in Sect. 4.3 and illustrated in Fig. 6. In addition, individuals are built considering the value calculated by the process described in Sect. 4.2.1.
3. EvoRecSys-standard: Both genetic operators are enabled. However, the mutation operator works under the standard approach, as described in approach (1). Furthermore, individuals are created by the procedure described in Sect. 4.2.1.
4. EvoRecSys-full: The implementation proposed in this paper. This approach has no modifications; it includes crossover and the collaborative-filtering-based mutation operator, as described in previous sections.
We conducted 50 trials on each approach. Figure 8 presents the mean performance of the best individual (the lowest aptitude value) across all trials, with the 95% confidence interval presented as shading. While there are relatively small differences in best aptitude between the approaches, these differences translate into significant differences in coherency of recommendation (see Remark 4). We see that the "full" system (4), which includes both crossover and mutation directed by collaborative filtering, produces the lowest aptitude values (see Remark 2), which indicates that it performs best. In particular, (4) significantly outperforms the "standard" approach (3) (paired t-test, p < 0.0001), indicating that directing mutation using collaborative filtering is beneficial. Approach (3) also significantly outperforms approaches (1) and (2) (paired t-test, p < 0.001). The "naïve" approach (1) appears to tend towards better performance values than the "no crossover" approach (2); however, this difference is not significant (paired t-test; p > 0.05). Approach (1) starts poorly because of the randomised configuration of the initial population, with a mean aptitude of 0.4337 at generation 0. However, the addition of the crossover operator enables approach (1) to quickly catch up with and then slightly overtake approach (2). In summary, these results show that both crossover and mutation with CF are necessary for the system to perform best, and that the full system is the only configuration that consistently reaches the desired aptitude threshold.

User study
Based on the previous experimental evaluation to determine an optimal configuration of our EvoRecSys Web implementation, we deployed it to conduct a cohort study with users who volunteered to interact with the system. The study provides additional insight into the system performance from the subjective perspective of end users, analysing their response to recommendations.
It is important to note that the Web front-end used for both user studies was designed to provide the GA with a default value in cases where the user, whether deliberately or accidentally, skipped a question related to food or physical activity preferences. We set a default value of 3, which corresponds to the neutral preference on a 5-point numerical scale. Thus, recommendations are generated even if the user skips all preference-related questions.

Subjective analysis of EvoRecSys recommendations
Volunteers were invited to conduct a series of interactions with the system across three steps, lasting approximately 10 min:
i. Providing explicit rating information about food/PA preferences, physical status, exercising habits, and well-being goal (see Fig. 9a, b).
ii. Receiving a list of three bundle recommendations and evaluating their satisfaction with each one (see Fig. 9c).
iii. Assessing overall perception of the recommendations received based on four criteria: diversity, serendipity, appeal, and healthiness.
A total of 205 users completed the three tasks. The geographical distribution of these users by country is shown in Table 7, and Table 8 summarises their demographic and physical characteristics along with their exercising habits and well-being goal.
For each of the three meal-PA recommendations received, users were asked to rate the suggested meal and exercise (see Fig. 9c) using a 5-point Likert scale, and to optionally mark one or more of the recommended combinations as favourite. Figure 10 shows, on the left, the average user satisfaction with the individual meals and PAs suggested, alongside their standard deviation. The last two bars show the average results across all three meals (resp. PAs). The plot on the right-hand side of Fig. 10 shows the overall distribution of 1-to-5 ratings given by users to the recommendations. The results show, in general, a prevalence of positive ratings over negative ones, particularly for meals, which obtain slightly better values than PAs in terms of both average ratings and rating distribution. The deviations around the average value are consistent between the meals and PAs recommended, suggesting a similar level of consensus among users for both aspects. Finally, in order to assess the perception of the recommendations received "as a whole", users were asked for their subjective opinion regarding four quality criteria, again using a 5-point scale with values ranging between 1 and 5: (i) diversity, where higher ratings mean more diverse and less repetitive recommendations; (ii) serendipity, with higher ratings meaning more serendipitous and less expected recommendations; (iii) attractiveness, with higher ratings meaning more appealing recommendations that suit their preferences; and (iv) health, with higher ratings indicating that recommendations are perceived as healthier. These final questions were optional, hence not all users answered all four of them. Figure 11 summarises the feedback collected for the four questions as a rating distribution (left) and the average score per question/criterion (right). We believe these are promising results for various reasons.
Firstly, all four rating distributions are moderately skewed towards higher ratings, indicating that the average ratings obtained are representative: negative feedback is in the minority, and opinions are not polarised around the two extremes of the rating scale. Health is the most positively assessed criterion by most users, with a significantly higher average rating (4.15) than the other three. It is also the only criterion for which the majority of users gave the highest rating, and hence, the proposed model succeeds in delivering meal-PA bundles perceived by users as healthy. Most users reported recommendations as appealing (4) or very appealing (5), which is also an encouraging result in terms of balancing healthy recommendations with adaptability to the user preferences. Diversity and serendipity show, on average, results slightly closer to the neutral value (3), although the majority of ratings are still distributed across the {3, 4, 5} rating interval. This suggests that while the collaborative filtering approach integrated in the GA helps produce diverse and serendipitous recommendations, there is still room to improve these aspects in future versions of the model or in new ones, motivating a more thorough exploration of the GA components, its fitness function, and other RS techniques that could be integrated in EvoRecSys.

Challenge study: EvoRecSys vs. collaborative filtering
Following the first study, 44 volunteers accepted an invitation to take part in a follow-up study to subjectively compare recommendations generated by EvoRecSys with recommendations generated by the second baseline system used in Sect. 5.3, which can be understood as collaborative filtering (CF) only. Users begin by entering their preferences (see Fig. 9), and are then shown 5 pairs of "blind" recommendations (e.g. see Fig. 12), one generated by EvoRecSys and one generated by CF alone. For each pair, the user is then challenged to select their preferred recommendation, without being told how the recommendations are generated. The ordering (A or B) within each pair is randomly shuffled between EvoRecSys and CF to ensure that there is no selection bias based on the ordering of the options shown.
For this stage of the user study, EvoRecSys was configured to recommend one bundle, using parameters popSize = 150, maxGen = 100, probCross = 0.6 and probMut = 0.1. For the CF-recommendation, we initially created a population of individuals. The best individual in this initial population (generation 0) is then taken, and the CF-based mutation operator is applied (see Sect. 4.2.4). In this way, CF-recommendations are built using collaborative filtering, but without evolutionary optimisation.
In total, we conducted 220 pairwise challenges (n = 44 × 5 = 220). The recommendation option generated by EvoRecSys was preferred 124 times, while the recommendation generated by CF was preferred 96 times. If we consider the null hypothesis that recommendations generated by each system are equally likely to be selected by users, then we can test this hypothesis by using a binomial distribution with p = 0.5 (probability of each option being selected at random), n = 220 (number of repeated trials), and x = 124 (number of times that EvoRecSys option is selected). We get probability P(X ≥ x) = 0.034. Therefore, results suggest that EvoRecSys recommendations are preferred by users and we are able to reject the null hypothesis at the 0.05 significance level.
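The one-sided binomial test described above can be reproduced with a few lines of code. The sketch below computes the exact upper-tail probability from the binomial probability mass function; the function name is illustrative.

```python
from math import comb


def binomial_upper_tail(x, n, p=0.5):
    """Exact P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


if __name__ == "__main__":
    # 124 of 220 pairwise challenges were won by EvoRecSys.
    p_value = binomial_upper_tail(124, 220)
    print(f"P(X >= 124) = {p_value:.3f}")
```

The test is one-sided because the alternative hypothesis is that EvoRecSys recommendations are preferred more often than chance; a two-sided test would roughly double the probability.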

Discussion and lessons learnt
Our efforts to reformulate the recommendation problem as a multi-objective optimisation problem driven by a GA can be summarised as successful in the light of the experimental results. In general terms, recommendations have been positively rated by the majority of users who participated in the study. Furthermore, after careful experimental tuning of the model parameters, the model achieved recommendations that are tailored, consistent with integrated knowledge and domain guidelines, diverse, and acceptable. All of these are fundamental requirements according to the extant RS foundations (Aggarwal 2006). On the other hand, although we showed an implemented model founded on specific design decisions, it must be noted that EvoRecSys deserves further exploration of other recommender principles and user preference/interaction aspects left outside the scope of this work. This, together with the results of our study, suggests that the EvoRecSys framework and its conceptual architecture should be subject to further study by the research community, thereby opening new pathways of research within the field of recommender systems for health and/or based on evolutionary computing.
Although the proposed framework and model have reported favourable results, they constitute to the best of our knowledge the first research efforts for health RS in this direction. Consequently, several challenges and areas for improvement have been identified during the framework design, model development, and experimental studies. The most relevant such directions are:

1. Complementary datasets and interpretable recommendations: One of the proven strengths of EvoRecSys is the ability of its GA to construct configurable recommendations that accurately adapt to the users' needs and preferences, personalising fine-grained aspects such as serving sizes in meals and PA duration. However, these recommendations (specifically the suggested meals based on food items) may sometimes be less interpretable than, for example, recommending a recipe (Musto et al. 2020). For this reason, an immediate aspect deserving study is how to incorporate datasets that facilitate more meaningful food recommendations such as recipes, ready meals from a supermarket, regional food, or specific groceries. An interesting question to study here is the effect of bridging precise and highly optimised meals generated by EvoRecSys with static but more understandable recipes/products (e.g. from third-party datasets) that are similar. This would also help develop bespoke models focused on particular demographic sectors.
2. Highly configurable and diverse components: Experiments on the GA parameter settings have demonstrated the importance of semantic coherency criteria to guarantee higher aptitude and quality in recommendations. Due to the nature of the techniques used at the core of the EvoRecSys framework, it is possible to flexibly define the architecture of the recommender engine. Based on this feature, more semantic rules such as compatibility among ingredients can be implemented in order to ensure coherency and diversity from the deepest level (food items within a bundle) to the highest level (food items between bundles). Furthermore, it is possible to add or remove food item categories. For instance, desserts could be incorporated for building a more robust recommendation for two- or three-course meals.
3. Implicit dynamic data acquisition: The modelling of users' preferences could be improved in order to acquire a more accurate insight into their preferences and habits. The model implemented in this study relies on preferences explicitly provided by the user during their initial interaction with the system. However, more reliable recommendations could be built: (i) by acquiring new forms of data dynamically and over time, e.g. via daily feedback of physical activity logs, and (ii) by discovering how these recommendations may align with the preferences and personal needs stated by the user, if a mechanism that learns from the user feedback and her/his evolving behaviour towards recommendations is incorporated. In line with this research direction, we also consider it equally important to define more objective evaluation metrics and criteria for experimentally validating the models developed, especially in terms of quantifying the extent to which recommendations align with stated and/or implicitly modelled users' preferences.
Another relevant aspect to consider is that all experiments were performed on standard commodity hardware. Thus, the thresholds arranged in Sects. 5.1 and 5.2 were partly dependent on the computational power of the equipment available. A dedicated server with high-performance hardware would enable us to consider much lower efficiency thresholds and therefore more reliable recommendations. Moreover, fast responses in real time and parallel handling of multiple user requests would be possible. Nevertheless, the experiments conducted provide a suitable methodological approach to follow for the experimental configuration and validation of models built upon the EvoRecSys conceptual framework.

Concluding remarks and future work
In this research, we have introduced EvoRecSys, a novel conceptual framework for recommendations in health-related domains, entirely based on the premise that it is possible to strike a balance between three main dimensions: (i) what the user prefers, (ii) what the user needs, and (iii) what the user sets as a goal. The framework is characterised by an evolutionary algorithmic approach that establishes the balance of these three components through a multi-objective optimisation problem. A distinctive feature of the framework is its ability to build highly configurable items, in the form of meal-exercise bundles, to be recommended instead of immutable items, which allows more tailored and reliable recommendations for the user. The proposed framework architecture is defined so that it can be flexibly instantiated into different model implementations for recommendation across different application areas of health and well-being. In this paper, we presented an implementation of EvoRecSys as a general-purpose meal and physical activity recommender to help achieve well-being goals. However, the proposed conceptual framework guidelines may also help when building models for people with special needs, such as patients with chronic diseases, professional athletes, or people whose cultural background only allows them to eat specific food. In all cases, when a model is built, we encourage a validation process by health experts to consider the well-being goals and recommendations provided by the system; in particular, users should seek medical assistance to reach goals when there is an illness present. This study has delivered a first proof of concept where GAs are exploited as the core technique of a recommender system instead of being a complementary part of a recommender engine driven by other currently used techniques. As a consequence of the promising results obtained in this research, future work directions have been outlined.
For instance, the inclusion of a dietary and exercise diary for the user is one of the main developments that will help to improve the robustness of this framework by providing better insight into users' behaviour.
Additionally, a more flexible graphical interface will improve the user experience. For instance, users could be allowed to create new bundles by combining any of the recommended meals with any of the PAs built by the system. Regarding the interpretation of recommendations (see Sect. 7), it can be difficult for users to intuitively compare food portions to the nearest gram and exercise to the nearest minute. Therefore, showing users an average food portion or a valid portion interval (e.g. to the nearest 50 g) and presenting valid time ranges for exercise (e.g. to the nearest 5 min) will improve the user experience.
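The coarser presentation suggested above can be illustrated with a short sketch. It is not part of the implemented system: the function names `round_to_step`, `display_portion` and `display_duration` are hypothetical, and the 50 g and 5 min steps are simply the example values mentioned in the text.

```python
import math


def round_to_step(value: float, step: float) -> float:
    """Round value to the nearest multiple of step (halves are rounded up)."""
    return math.floor(value / step + 0.5) * step


def display_portion(grams: float) -> int:
    """Portion shown to the user, to the nearest 50 g (example step from the text)."""
    return int(round_to_step(grams, 50))


def display_duration(minutes: float) -> int:
    """Exercise time shown to the user, to the nearest 5 min (example step from the text)."""
    return int(round_to_step(minutes, 5))
```

Under these assumptions, a meal optimised to 137.4 g would be displayed as 150 g, and an activity of 42.6 min as 45 min.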
On a last note, a native mobile application would provide more freedom in terms of implementing a friendlier user interface, possibly linked to wearable devices for seamless capturing of data. For instance, heart rate and number of steps per day would allow us to learn more about the physical status and habits of the user, and therefore the recommendations of physical activities would be more closely aligned with the user's physical activity in real time.

Appendix: Algorithms
To enable replication, we present the EvoRecSys algorithms in full. Algorithm 2 presents the general workflow of EvoRecSys. First, we calculate the TEE value using Eq. (4) and the user's physical data obtained through the web app. The TEE value and the fitness goal chosen by the user are passed as parameters to the function described in Algorithm 3, which calculates the number of tailored calories (see Sect. 4.1).

Algorithm 2 EvoRecSys Algorithm
In order to lose approximately 500 g weekly, which is the minimum recommended without harm (Williamson et al. 1992), the allocated value for calories is 551.26 (see example in Sect. 4.2.1). The same value is used if the user wants to gain weight. On the other hand, the allocated value for muscle gain is 0.15 because, on average, a 15% excess in calorie intake is needed for a non-athlete (Garthe et al. 2013). The returned value, tailoredCal, is then used by the genetic algorithm.
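The calorie-tailoring step can be sketched as follows. Since Algorithm 3 is not reproduced in this text, this is only one plausible reading of the description above: the function name `tailor_calories`, the goal labels, and the assumption that the allocated value is subtracted from the TEE for weight loss, added for weight gain, and that the muscle goal scales the TEE by 15% are all illustrative assumptions.

```python
CALORIE_OFFSET = 551.26  # allocated calorie value quoted in the text
MUSCLE_SURPLUS = 0.15    # 15% excess intake quoted in the text


def tailor_calories(tee: float, goal: str) -> float:
    """Return a calorie target from the TEE and a goal label.

    The goal labels are hypothetical; other goals the framework may
    support are not covered by this sketch.
    """
    if goal == "lose_weight":
        return tee - CALORIE_OFFSET          # assumed: deficit applied to TEE
    if goal == "gain_weight":
        return tee + CALORIE_OFFSET          # assumed: same value as a surplus
    if goal == "gain_muscle":
        return tee * (1.0 + MUSCLE_SURPLUS)  # assumed: 15% above TEE
    raise ValueError(f"unsupported goal: {goal}")
```

For a TEE of 2000 kcal, this sketch would yield targets of roughly 1448.74, 2551.26 and 2300 kcal for the three goals, respectively.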
The step before the genetic algorithm starts consists of retrieving the food and PA data from the database. In this implementation, items whose preference value is 0 (the user neither eats the food item nor performs the physical activity) are not retrieved (see Sect. 4.1). In Algorithm 2, Line 7, the population of size popSize is initialised, with each individual (containing B meal/PA bundles) created using the procedure described in Algorithm 4. The process of building a meal is described in Algorithm 5. The algorithm explores the search space not only in terms of food items, but also in terms of portions. The range [40, 60] is used to allocate a random percentage of the total number of calories to the main food item (see Line 3). The function tailorFood (see Lines 6 and 12) creates a new food item by interpolating the food item data received as the first parameter (see Sect. 4.1) using the number of calories received as the second parameter. Finally, randomValues (see Line 8) generates a list of three random numbers whose sum is the remaining calories.
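The meal-building step described above can be sketched as follows. Algorithm 5 is not reproduced in this text, so the sketch rests on several assumptions: "interpolating" is read as linear scaling of a reference portion, the three random values are assigned to three side items, and the names `FoodItem`, `tailor_food`, `random_values` and `create_meal` are illustrative.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class FoodItem:
    name: str
    grams: float     # size of the reference portion
    calories: float  # kcal in the reference portion


def tailor_food(item: FoodItem, target_cal: float) -> FoodItem:
    """Scale the reference portion linearly so that it provides target_cal."""
    factor = target_cal / item.calories
    return FoodItem(item.name, item.grams * factor, target_cal)


def random_values(total: float, n: int, rng: random.Random) -> List[float]:
    """Split total into n positive random parts whose sum is total."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    scale = total / sum(weights)
    return [w * scale for w in weights]


def create_meal(mains: List[FoodItem], sides: List[FoodItem],
                meal_cal: float, rng: random.Random) -> List[FoodItem]:
    """Build one meal: a main item plus three side items (assumed layout)."""
    main_share = rng.uniform(40, 60) / 100.0       # 40-60% of calories to the main item
    main = tailor_food(rng.choice(mains), meal_cal * main_share)
    side_cals = random_values(meal_cal - main.calories, 3, rng)
    chosen = rng.sample(sides, 3)                  # needs at least three side items
    return [main] + [tailor_food(s, c) for s, c in zip(chosen, side_cals)]
```

Because every item is scaled to an exact calorie share, the calories of the returned items add up to the requested meal total (up to floating-point error), while the portions in grams vary with each random draw.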
Algorithm 6 presents the process of building a PA. In order to preserve either the excess or the deficit of calories needed to achieve the chosen well-being goal, the TEE value is used rather than the value of tailoredCal. Similarly to the food tailoring function (see Algorithm 5, Lines 6 and 12), the function tailorPA (see Line 7) interpolates the PA data received as the first parameter, using the recommended time received as the second parameter, to build a new PA item.
Algorithm 6 Create PA
1: procedure CREATEPA(PA_items, TEE, weight, goal)
2:     metRef = 60    ▷ MET reference value for an hour
3:     randomPA ← random(PA_items)
4:     metValue ← gets(randomPA)
5:     burnedCal = metValue * weight    ▷ Burned calories per hour
6:     recommendedTime = (TEE * metRef)/burnedCal
7:     newPA ← tailorPA(randomPA, recommendedTime)
8:     return newPA
9: end procedure
Regarding the evaluation of individuals (see Algorithm 2, Line 8), Algorithm 7 describes the procedure that takes place. Each individual is evaluated under four approaches (see Sect. 4.2.2). Each approach (or restriction) provides a numerical value in the range [0.0, 1.0]. Once all evaluations are finished, the arithmetic mean of the outputs of the approaches is calculated [see Eq. (6)] and finally assigned to the individual.
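The evaluation step can be illustrated with a minimal sketch. The four approaches themselves are defined in Sect. 4.2.2 and are not reproduced here, so they are represented by arbitrary callables; the function name `evaluate` and the range check are illustrative assumptions.

```python
from typing import Callable, Sequence


def evaluate(individual: object,
             approaches: Sequence[Callable[[object], float]]) -> float:
    """Return the arithmetic mean of the scores given by each approach."""
    scores = [approach(individual) for approach in approaches]
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each approach must return a score in [0.0, 1.0]")
    return sum(scores) / len(scores)
```

Because each score lies in [0.0, 1.0], the mean also lies in that range, so individuals can be compared directly on a common scale.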
The tournament execution (see Algorithm 2, Line 11) is fully described in Sect. 4.2.3. Concerning the crossover operator (see Algorithm 2, Line 12), Algorithm 8 describes the procedure. Two random individuals from the population are taken to recombine their items with probability crossProb. This task is repeated popSize times.
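The outer loop of the crossover operator can be sketched as follows. Algorithm 8 is not reproduced in this text, so the sketch makes two assumptions: the recombination itself is delegated to a caller-supplied function `combine`, and only the children produced are collected (what happens to a pair that is not recombined is not stated here). The names `crossover`, `cross_prob` and `combine` are illustrative.

```python
import random
from typing import Callable, List, TypeVar

T = TypeVar("T")


def crossover(population: List[T], cross_prob: float,
              combine: Callable[[T, T], T], rng: random.Random) -> List[T]:
    """Repeat popSize times: pick two random individuals and recombine
    them with probability cross_prob. Returns the children produced."""
    offspring: List[T] = []
    for _ in range(len(population)):
        p1, p2 = rng.sample(population, 2)  # two distinct random parents
        if rng.random() < cross_prob:
            offspring.append(combine(p1, p2))
    return offspring
```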
The core procedure of the crossover operator is called combineItems (see Algorithm 8, Line 6). Algorithm 9 presents how this procedure works. Like the function tailorFood (see Algorithm 5, Lines 6 and 12), the function mixFood (see Lines 14 and 21) builds a new food item by interpolating the food data received as the second parameter using the number of calories contained in the food data received as the first parameter. The PA combination process (see Line 31) only inserts the PA of the second parent (P2) into the new bundle, conserving the meal of the first parent (P1).
Finally, the mutation operator (see Algorithm 2, Line 13) is described in Algorithm 10. An individual from the population is taken to swap its items with probability mutProb. This task is repeated popSize times. Furthermore, the CF task takes place (see Sect. 4.2.4) and its output, which consists of both food items and PA items, is available (see Line 2).
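The two recombination helpers described for the crossover operator can be sketched as follows. Algorithm 9 is not reproduced in this text, so this is only an illustration: "interpolating" is read as linear scaling, and the types `Food` and `Bundle` as well as the function names `mix_food` and `combine_pa` are assumptions.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Food:
    name: str
    grams: float
    calories: float


@dataclass(frozen=True)
class Bundle:
    meal: Tuple[Food, ...]
    pa: str  # the PA is kept as a plain label in this sketch


def mix_food(first: Food, second: Food) -> Food:
    """Take the food data of `second`, rescaled to the calories of `first`."""
    factor = first.calories / second.calories
    return Food(second.name, second.grams * factor, first.calories)


def combine_pa(p1: Bundle, p2: Bundle) -> Bundle:
    """Keep the meal of the first parent and insert the PA of the second."""
    return Bundle(meal=p1.meal, pa=p2.pa)
```

In this reading, the calorie budget of the first parent is preserved in both cases, so a child bundle changes which items are recommended without changing the calories assigned to the meal.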
The principal procedure in the mutation operator is called swapItems (see Line 6). Algorithm 11 presents the behaviour of this procedure. The function swapMain (see Line 10) replaces the current main item of the individual with a random main item taken from the set of similar items, interpolating the data of the similar food item using the calorie intake of the current item. In the same manner, swapSide (see Line 15) replaces a side food item. The function swapPA (see Line 21) replaces the current PA with a random PA taken from the similar items, interpolating the data of the similar item using the calorie intake of the current item.
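The mutation step can be sketched in the same spirit. Algorithms 10 and 11 are not reproduced in this text, so the sketch covers only the main-item swap and rests on assumptions: the set of similar items is passed in directly (in the paper it comes from the CF task), "interpolating" is read as linear scaling, and the names `Food`, `Individual`, `swap_main` and `mutate` are illustrative.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Sequence


@dataclass(frozen=True)
class Food:
    name: str
    grams: float
    calories: float


@dataclass(frozen=True)
class Individual:
    main: Food
    pa: str


def swap_main(ind: Individual, similar: Sequence[Food],
              rng: random.Random) -> Individual:
    """Replace the main item with a random similar item, rescaled to the
    calories of the current main item."""
    candidate = rng.choice(similar)
    factor = ind.main.calories / candidate.calories
    new_main = Food(candidate.name, candidate.grams * factor, ind.main.calories)
    return replace(ind, main=new_main)


def mutate(population: List[Individual], mut_prob: float,
           similar: Sequence[Food], rng: random.Random) -> List[Individual]:
    """Repeat popSize times: pick a random individual and swap its main
    item with probability mut_prob. The input list is left unchanged."""
    result = list(population)
    for _ in range(len(result)):
        idx = rng.randrange(len(result))
        if rng.random() < mut_prob:
            result[idx] = swap_main(result[idx], similar, rng)
    return result
```

The side-item and PA swaps would follow the same pattern, each drawing a replacement from its own set of similar items.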