Introduction

With the development of the tourism industry, more people plan to travel by air. As the price of flight tickets usually tends to increase before takeoff, customers often schedule their trips in advance to lower costs. Good timing in booking can bring travelers considerable savings. However, the tourism market is easily affected by global crises. Uğur and Akbıyık (2020) show that travelers have a high probability of canceling or delaying their trips when they hear news of a crisis, especially when the world is experiencing a pandemic. COVID-19 has shown that a pandemic has a much larger destructive impact on the travel and tourism industry than other crises (Sadreddini et al. 2021). Due to the widespread impact of the COVID-19 pandemic, the World Travel & Tourism Council (WTTC) estimated a daily loss of one million jobs in the travel industry. Furthermore, research indicates that the potential loss of tourism GDP in 2020 could be as high as US $2.1 trillion (Škare et al. 2021). In this situation, customers are likely to reschedule their trips during the COVID-19 period (as shown in Fig. 1), and changing or cancelling pre-booked, low-cost tickets might turn customers’ savings into a loss.

Fig. 1

The refund and change rate. The full line represents the change rate and the dotted line indicates the refund rate

Motivated by this, some airlines and online travel agencies (OTAs) offer customers a cancellation protection service (CPS) (Sadreddini 2020), with which customers can change or cancel their flight tickets, or even the whole trip, without high expenses. During the first wave of the pandemic, without CPS, most airlines' refund and compensation policies were widely criticized, and the coverage and duration of the refund policy for a canceled trip often fell short of expectations. CPS benefits customers, especially early birds. However, with the recurrent pandemic and fast-changing regional policies, the high usage rate of CPS puts increasing pressure on service providers.

To help the service providers, many researchers are working on adjusting the CPS or insurance fee based on airlines and origin–destination pairs. Lukas et al. (2019) use the standard deviation, a generalized linear model, and a support vector machine to calculate the premium, which is included in the ticket price and helps the airline improve its service and cover the cost of compensation. Stefani et al. (2019) propose an insurance premium model for flight delay or cancellation and find that the calculation works better when the airline and the departure city of the flight are considered. However, these methods have two shortcomings. On the one hand, the customer base of OTAs is more diversified, which allows OTAs to adopt more flexible and personalized sales strategies, but the methods above consider only flight information. On the other hand, the calculation assumes that the extra fee is included in the ticket price, and the resulting price fluctuation would confuse customers.

A different perspective for calculating an optimal CPS fee is based on customer risk groups. Customers are clustered or classified using their purchasing and browsing behavior data, an approach that has proven critical for successful marketing and customer relationship management (Chen et al. 2017; Hanafi 2020). For example, the PurTreeClust clustering algorithm (Chen et al. 2016) was proposed for large-scale transaction data, and Hsu et al. (2012) propose a segmentation algorithm to identify similarities between clients. Sadreddini et al. (2021) propose an adaptive calculation method that uses real-world, transaction-based customer behavior data to minimize the CPS fee for customers with low or zero cancellation rates and maximize it for customers with high cancellation rates. Their algorithm can learn the cancellation rates from customers' historical data, but it also results in different fees for different customers. All the methods mentioned above use price discrimination to increase the profit from CPS, which is likely to harm the ordering experience. Therefore, we propose a personalized recommendation method that increases the package attach rate by offering each customer a suitable service package.

In our work, we propose the all-in-one (AIO) service package, which can be considered an upgraded CPS since it offers compensation for flight delays, ticket changes, and refunds. To improve the customer’s ordering experience, we design the dynamic recommendation engine (DRE). It predicts the preference probability for the different service packages rather than directly calculating a fee for them. For customers with low interest in the AIO package, the DRE recommends other types of service packages that meet their needs. In this way, each customer receives a personalized service package recommendation, while the same package always has the same price. The DRE uses historical and real-time data, user and flight information, and a machine learning-based model to make the prediction. The experiments show that the DRE not only raises the package attach rate without interrupting the flight booking process, but also helps customers cut costs when they face flight cancellations or changes.

The rest of the paper is organized as follows. The “Background and motivation” section presents the background and the motivation for designing the DRE. The “Scene recommendation” section introduces scene recommendation and the related data analysis. The “Personalized recommendation” section introduces personalized recommendation, including the data analysis, the design of the model, and the experiment results. A conclusion and a discussion of further work are given in the “Discussion” section.

Background and motivation

In this section, we introduce the service package, whose main function is to provide delay compensation, ticket change services, and cancellation services. We then explain the motivation for building a DRE.

The service package

A service package is a kind of ancillary product that consists of a primary service and secondary bonuses. The primary service contains delay compensation, a ticket change service, a cancellation service, or a combination of them. Travelers can receive compensation when their flight is delayed or when they voluntarily change or refund their tickets. The secondary bonuses are various kinds of vouchers, which make the service package more attractive and help improve the crossover rate between the flight business unit and other business units.

At present, we propose four types of service packages. The Delay Compensation Package is the basic package and offers only delay compensation as its primary service. The Cancellation Guarantee Package and the Change Protect Package are advanced versions of the Delay Compensation Package, whose primary service combines delay compensation with the ticket refund service or the ticket change service, respectively. The Premium Package (AIO) includes all of these services, which makes it the most versatile package of the four. Naturally, the price of a package rises as its primary service becomes more comprehensive.
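The composition of the four packages can be written down as a small lookup table. The following is a minimal sketch, not the platform's actual data model: the service identifiers and the helper `packages_covering` are hypothetical names introduced here for illustration.

```python
# Primary services of each package, as described in the text.
# The string identifiers are illustrative assumptions.
PACKAGES = {
    "Delay Compensation Package": frozenset({"delay_compensation"}),
    "Cancellation Guarantee Package": frozenset({"delay_compensation", "ticket_refund"}),
    "Change Protect Package": frozenset({"delay_compensation", "ticket_change"}),
    "Premium Package (AIO)": frozenset(
        {"delay_compensation", "ticket_refund", "ticket_change"}
    ),
}


def packages_covering(needed):
    """Return the packages whose primary service covers every needed service,
    ordered from the fewest to the most included services."""
    needed = frozenset(needed)
    matches = [name for name, services in PACKAGES.items() if needed <= services]
    return sorted(matches, key=lambda name: len(PACKAGES[name]))


if __name__ == "__main__":
    # A traveler who only cares about refunds has two candidate packages.
    print(packages_covering({"ticket_refund"}))
```

Ordering the candidates by the number of included services mirrors the remark that the price rises as the primary service grows, so the first match is the least comprehensive package that still covers the need.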

Experiments setup

The experiment uses A/B testing, a common and useful research methodology for understanding user engagement with, and satisfaction from, a new feature or product (Xu et al. 2015). The experiment was conducted by dividing the participants into two groups, each of which was recommended one specific type of service package.

We choose five indicators to compare different service packages:

  (1) Package attach rate is the proportion of passengers who buy service packages. The larger the value, the more willing customers are to buy the service packages.

  (2) Conversion rate is the percentage of users on the website who complete a booking out of the total number of visitors. This value shows the influence of the service package on the main flight booking procedure.

  (3) Income per ticket is the total flight ticket income divided by the number of tickets.

  (4) Income per client is the total flight ticket income divided by the number of clients who enter the flight fare selection page.

  (5) P value is the probability that an observed difference could have occurred just by random chance. The lower the value, the greater the statistical significance of the observed difference.
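The indicators above can be computed directly from raw group counts. The sketch below is illustrative only: the field names and all numbers are made up, the same visitor count is used for both the conversion rate and the income per client as a simplification, and the two-sided two-proportion z-test is an assumed choice, since the text does not state which significance test was used.

```python
import math
from dataclasses import dataclass


@dataclass
class GroupStats:
    """Raw counts for one A/B group (hypothetical field names)."""
    visitors: int         # clients who entered the flight fare selection page
    bookings: int         # visitors who completed a booking
    tickets: int          # flight tickets sold (one per passenger)
    packages: int         # passengers who bought a service package
    ticket_income: float  # total flight ticket income


def indicators(g: GroupStats) -> dict:
    """Compute indicators (1) to (4) for one group."""
    return {
        "attach_rate": g.packages / g.tickets,
        "conversion_rate": g.bookings / g.visitors,
        "income_per_ticket": g.ticket_income / g.tickets,
        "income_per_client": g.ticket_income / g.visitors,
    }


def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p value for the difference between two proportions
    x1/n1 and x2/n2, using the pooled two-proportion z-test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Made-up numbers, only to show how the functions are called.
    control = GroupStats(10_000, 1_000, 1_200, 240, 600_000.0)
    treatment = GroupStats(10_000, 990, 1_190, 262, 598_000.0)
    print(indicators(control))
    print(indicators(treatment))
    print(two_proportion_p_value(treatment.packages, treatment.tickets,
                                 control.packages, control.tickets))
```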

Experiments results and analysis

Before the COVID-19 pandemic, only the Delay Compensation Package was available for sale. When the pandemic began, we introduced the Cancellation Guarantee Package and the Change Protect Package to enhance our travel protection services and offer our customers a carefree journey. Later, we introduced the Premium Package (AIO).

When a new product is first introduced, we set up an experiment that divides customers into control and treatment groups following the principles of randomization and double-blind control. In this case, half of the participants were recommended the AIO package and the other half were recommended the Delay Compensation Package.

The goal of the experiment is to test whether the AIO package has a higher attach rate than the Delay Compensation Package, whether it has a negative impact on the conversion rate, and, most importantly, whether it increases our profit. After a period of testing, the results showed that the attach rate increased significantly, by nearly 2%, and that the profit rose by nearly 5%, owing to the attractiveness of the AIO package and its higher price. However, the conversion rate decreased by more than 1%, which means that the AIO package has a negative influence on ticket sales if we recommend it to all customers without any strategy.

Not every customer needs the AIO product, so we must recommend it where it truly meets customers’ needs. The best solution is to select the customers who are more willing to buy flight tickets with the AIO package, or the specific scenarios in which customers require refund or change services. Building a DRE is therefore an urgent task.

Scene recommendation

In this section, we conduct some analysis using the experiment data and online data from an OTA platform, to find the appropriate service packages for different scenes.

Refund and change services analysis

By analyzing the online data on the OTA platform from different perspectives, we determined that the flight refund rate and change rate of some scenes are higher than those of others.

To begin with, the ticket change rate of big airlines is normally higher than that of small airlines, but their ticket refund rate is lower. For example, the five airlines with the highest change rate include the top three airlines by sales, whereas the five airlines with the highest refund rate are all small airlines. One reason is that big airlines carry more business passengers, who might adjust their trips according to the needs of work. In addition, big airlines normally offer multiple alternative flights from which customers can choose. We therefore investigate the relationship between the refund rate and the proportion of business passengers: the lower the proportion of business passengers on a flight, the higher the refund rate. When the proportion of business passengers is higher than 50%, the refund rate is very likely to be lower than 20% (Table 1).

Table 1 The refund and change rate of multiple airlines

In addition, the refund and change rates differ among direct flights, stopover flights, and connecting flights. The change (re-booking) rate of connecting flights is the highest, around 1.28 times that of stopover flights, which have the lowest change rate of the three types. The refund rate of connecting flights is also the highest, around 1.29 times that of direct flights, while the refund rate of stopover flights is around 1.13 times that of direct flights. Connecting flights usually have more than one segment, and a change to one segment is likely to affect the flight status of the other segments, which makes the refund and change rates of connecting flights higher than those of the other two types.

The passenger load factor is one of the key indicators that affect the decision-making of airlines. For example, it is an important factor in determining whether to cancel a flight, and our analysis shows that a low passenger load factor usually leads to flight cancellations. Unlike the airlines, we can acquire not only the historical passenger load factor of one specific flight, but also that of different air routes and cities, which covers flights from different airlines. The real-time passenger load factor, however, is difficult for OTAs to calculate because tickets are sold through multiple channels. Since it is strongly associated with the historical passenger load factor, search data, presale data, date-related holiday data and so on, we predict it using advanced algorithms.

The analysis of different routes and departure days also shows that the refund rate is affected by the pandemic. Compared with 2019, before the pandemic, the refund rate increased by around 20% in 2020. In 2021, there is a clear link between the refund rate and the outbreaks of the pandemic. The flight change rate in 2020 was slightly lower than that in 2019, since the flight cancellation rate in China increased significantly during the COVID-19 period.

In addition, we calculate the number of advance booking days and find that the change and refund rates rise as the number of advance booking days increases. The change rate is highest when customers book their trips more than one month before departure. Furthermore, the refund and change rates are closely connected with the price of flight tickets. Inexpensive tickets are usually more likely to be changed or refunded because they are often booked far in advance (Table 2).

Table 2 The refund and change rate of different advance booking days and price ranges

Conversion rate analysis

Based on the analysis above, we design scene judgment conditions for scenario recommendation, which involve the departure airport, the destination airport, the airline, the number of advance booking days, and the price range of the flight tickets. We choose the following five scenario selection conditions:

  (1) The flight is operated by a big airline or by one of several small airlines with a high ticket refund rate.

  (2) The flight is a stopover flight, a connecting flight, or a high-priced direct flight.

  (3) The flight’s predicted passenger load factor is lower than 50%.

  (4) The departure and arrival airports of the flight are in our selected cities.

  (5) The flight is booked more than 3 days in advance.

If all conditions are satisfied, we will recommend the AIO service package; otherwise, the Delay compensation package will be recommended.
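The rule above can be sketched as a single predicate over a flight search. This is a minimal illustration under stated assumptions: the text does not list the selected airlines or cities, nor the price above which a direct flight counts as high-priced, so `SELECTED_AIRLINES`, `SELECTED_CITIES`, `HIGH_PRICE_THRESHOLD`, and the `FlightSearch` fields are hypothetical placeholders.

```python
from dataclasses import dataclass

# Hypothetical placeholders; the actual selections are not given in the text.
SELECTED_AIRLINES = {"AIRLINE_A", "AIRLINE_B"}  # big airlines + high-refund small ones
SELECTED_CITIES = {"CITY_X", "CITY_Y"}
HIGH_PRICE_THRESHOLD = 1000.0                   # assumed cut-off for "high-priced"


@dataclass
class FlightSearch:
    airline: str
    flight_type: str              # "direct", "stopover", or "connecting"
    price: float
    predicted_load_factor: float  # fraction in [0, 1]
    origin_city: str
    destination_city: str
    advance_booking_days: int


def recommend_by_scene(f: FlightSearch) -> str:
    """Return the AIO package only if all five scene conditions hold."""
    conditions = (
        # (1) airline is in the selected set
        f.airline in SELECTED_AIRLINES,
        # (2) stopover, connecting, or high-priced direct flight
        f.flight_type in ("stopover", "connecting")
        or (f.flight_type == "direct" and f.price >= HIGH_PRICE_THRESHOLD),
        # (3) predicted passenger load factor below 50%
        f.predicted_load_factor < 0.5,
        # (4) both airports are in the selected cities
        f.origin_city in SELECTED_CITIES and f.destination_city in SELECTED_CITIES,
        # (5) booked more than 3 days in advance
        f.advance_booking_days > 3,
    )
    return "AIO" if all(conditions) else "Delay Compensation"


if __name__ == "__main__":
    search = FlightSearch("AIRLINE_A", "connecting", 800.0, 0.4, "CITY_X", "CITY_Y", 10)
    print(recommend_by_scene(search))
```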

We use an A/B testing experiment to test the effect of scene recommendation, in the same way that we compared the newly proposed service package with the older packages. For the selected scenes, we recommend the AIO package; for the rest of the flights, we recommend the Delay Compensation Package. The results improved markedly and the indicators moved in a positive direction overall: the conversion rate did not decrease significantly, the package attach rate increased by 1.80%, and the income per ticket and the income per client increased by 1.57% and 1.71%, respectively.

Personalized recommendation

The scenario recommendation performs well, but the selected scenarios cover only 34% of the ticket quantity. To increase the exposure rate and let the AIO package assist more clients, flight searches that are filtered out by the scene recommendation should be processed by another method, namely personalized recommendation. It should be emphasized that, to protect users’ privacy, if a user chooses to turn off personalized services, we use only the scenario recommendation.
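The way the two recommenders are combined, including the privacy fallback, can be sketched as a small dispatch function. This is an illustrative sketch rather than the production logic: the function name, its parameters, and the package labels are hypothetical, and choosing the package with the highest predicted preference probability is an assumption about how the model output is used.

```python
from typing import Callable, Dict


def dre_recommend(
    scene_matches: bool,
    personalization_enabled: bool,
    predict_probs: Callable[[], Dict[str, float]],
) -> str:
    """Combine scene and personalized recommendation.

    scene_matches: whether the flight search satisfies all scene conditions.
    personalization_enabled: False if the user turned off personalized services.
    predict_probs: hypothetical callable returning a preference probability
        per service package for this user and flight search.
    """
    if scene_matches:
        # Selected scenes always get the AIO package.
        return "AIO"
    if not personalization_enabled:
        # Privacy opt-out: only the scenario rule applies, and its default
        # for non-selected scenes is the Delay Compensation Package.
        return "Delay Compensation"
    # Searches filtered out by the scene rule go to the personalized model.
    probs = predict_probs()
    return max(probs, key=probs.get)


if __name__ == "__main__":
    fake_model = lambda: {
        "Delay Compensation": 0.2,
        "Cancellation Guarantee": 0.5,
        "Change Protect": 0.2,
        "AIO": 0.1,
    }
    print(dre_recommend(False, True, fake_model))
```

Passing the model as a callable means it is never invoked for users who opted out, which keeps the privacy rule easy to verify.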

Data analysis

From the experiments in the “Scene recommendation” section, we discovered several aspects that distinguish clients with different preferences. First, we look for differences among customers through their profiles, including gender, age group, VIP grade, etc. Male customers usually have a higher package attach rate than female customers, and customers older than 45 and those between 13 and 17 are more likely to buy an AIO package than other age groups. The higher the VIP grade, the higher the attach rate of AIO packages. Apart from the 13- to 17-year-old customers, we believe that the proportion of business passengers in the groups mentioned above is relatively high. These passengers might modify their itineraries more frequently because of work constraints, and their travel plans are more ad hoc and uncertain.

Secondly, some differences can be found in customers' historical booking behaviors. New customers who have never purchased a flight ticket on this OTA platform tend to buy more AIO packages than regular customers, since they are unfamiliar with the refund and change procedures and prefer to avoid any inconvenience. Besides, customers with a higher average number of ancillary products in their historical flight bookings have a higher attach rate for the AIO service package. Price-sensitive clients, in contrast, are unlikely to order the AIO service package because of its extra cost.

In addition, previous usage of the AIO package is a clear indicator, so we designed a new rule to divide users into groups: depending on whether a user has used the AIO package before, we recommend either the AIO package or the Delay Compensation Package first. However, this rule cannot be applied to customers who have never purchased an AIO package or a flight ticket through the platform being analyzed (Table 3).

Table 3 The order ratio and package attachment rate of multiple customers’ features

Model design

To further estimate the preferences and needs of customers, we designed a machine learning model to classify the customers. We use XGBoost (Chen and Guestrin 2016), a well-known decision-tree-based ensemble algorithm built on a gradient boosting framework. XGBoost is an additive boosted-tree system: the prediction is given by the sum of a set of regression trees {fk}, each of which is fitted to approximate the residual left by the previous iterations. When minimizing the loss function, XGBoost expands it into a second-order Taylor polynomial. Compared with the gradient boosting decision tree (GBDT), which uses only the negative gradient as a substitute for the residual, XGBoost approximates the loss more accurately.

$${\widehat{y}}^{(K)}=\sum_{k=1}^{K}{f}_{k}(x)={\widehat{y}}^{(K-1)}+{f}_{K}(x)$$
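For reference, the second-order approximation mentioned above is usually written as follows, in the notation of Chen and Guestrin (2016). The symbols $l$ (per-sample loss), ${g}_{i}$, ${h}_{i}$, $n$ (number of training samples) and $\Omega$ (regularization term on the new tree) are not used elsewhere in this paper and are introduced here only for illustration:

$${\mathcal{L}}^{(K)}\approx \sum_{i=1}^{n}\left[l\left({y}_{i},{\widehat{y}}_{i}^{(K-1)}\right)+{g}_{i}{f}_{K}({x}_{i})+\frac{1}{2}{h}_{i}{f}_{K}^{2}({x}_{i})\right]+\Omega ({f}_{K}),\quad {g}_{i}=\frac{\partial l\left({y}_{i},{\widehat{y}}_{i}^{(K-1)}\right)}{\partial {\widehat{y}}_{i}^{(K-1)}},\quad {h}_{i}=\frac{{\partial }^{2}l\left({y}_{i},{\widehat{y}}_{i}^{(K-1)}\right)}{\partial {\left({\widehat{y}}_{i}^{(K-1)}\right)}^{2}}$$

A first-order method such as GBDT fits the new tree using only ${g}_{i}$, whereas the ${h}_{i}$ term adds curvature information about the loss.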

In this model, we first consider the customer's booking history and behaviors, such as the package attach rate of the customer's flight orders over the last year and the price sensitivity level, which we found useful in the data analysis. As a "lookalike" method, the model works as a classifier that groups customers and finds similar clients. For each group, we recommend one of the four service packages (the Delay Compensation Package, the Cancellation Guarantee Package, the Change Protect Package, or the AIO package).

To get closer to the actual flight search scenario, and because customers may behave differently when ordering different flights, we add real-time information to the model, such as the chosen flight's takeoff time, origin–destination pair, airline, and ticket price, which we also found crucial in the scene recommendation. In this way, the DRE can make an accurate, personalized recommendation for every person and every flight search. Since there are thousands of origin–destination pairs and hundreds of airlines, converting these two features into one-hot encodings would make the feature space very high-dimensional. Therefore, we build a simple embedding network as a dimensionality reduction technique that also introduces prior knowledge into the XGBoost model. The structure of the final model is shown in Fig. 2.
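The dimensionality argument can be made concrete with a small sketch. Everything here is an illustrative assumption: the category counts, the embedding size of 8, and the helper names are hypothetical, and the vectors are randomly initialized stand-ins, whereas in the model described above they would be learned by the embedding network before being passed to XGBoost.

```python
import random

N_ROUTES = 3000    # assumed number of origin-destination pairs
N_AIRLINES = 200   # assumed number of airlines
EMBEDDING_DIM = 8  # assumed embedding size


def build_embedding(categories, dim, seed=0):
    """Map each category to a dense vector of length `dim`.

    The values are random placeholders; a trained embedding network
    would supply learned vectors instead.
    """
    rng = random.Random(seed)
    return {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in categories}


def encode(route, airline, numeric_features, route_emb, airline_emb):
    """Concatenate the two embeddings with the numeric features."""
    return route_emb[route] + airline_emb[airline] + list(numeric_features)


route_emb = build_embedding(range(N_ROUTES), EMBEDDING_DIM, seed=1)
airline_emb = build_embedding(range(N_AIRLINES), EMBEDDING_DIM, seed=2)

# Two numeric features as an example: ticket price and advance booking days.
features = encode(42, 7, [850.0, 14], route_emb, airline_emb)

one_hot_width = N_ROUTES + N_AIRLINES + 2
embedded_width = len(features)

if __name__ == "__main__":
    print("one-hot feature width:", one_hot_width)
    print("embedded feature width:", embedded_width)
```

Under these assumed sizes, each flight search is described by a few dozen dense values rather than thousands of mostly zero columns.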

Fig. 2

The architecture of final model

Experiment results

In offline testing, the overall accuracy of the model reaches around 80%, and the accuracy for the AIO package is around 86%. As our analysis indicated, the most important features of the model are the airline, the airport of the departure city, and the price sensitivity level of the customer.

In the A/B testing experiment, we compare the DRE that has both scene and personalized recommendations with the DRE that has only the scene recommendation. After a 3-week experiment, the conversion rate increased by 0.06%, the package attach rate decreased by 0.07%, the recommendation coverage of page views increased from 34% to 70%, and the experiment also found a statistically significant increase in profit. Although the package attach rate decreased slightly, the P value is high enough to indicate that the difference is not significant. The higher conversion rate shows that, when scenario and personalized recommendations are combined, clients are more likely to finish the flight booking process.

Discussion

In this paper, we first propose a new service package called the AIO package. The AIO package includes multiple flight ticket services, which can help customers when they need to change or cancel their itinerary. We used an A/B testing experiment to compare the proposed AIO package with other service packages and found that recommending one package to everyone can harm some customers' ordering experience. Therefore, we designed a recommendation engine, DRE, which integrates scene and personalized recommendations (Fig. 3).

Fig. 3

The schematic of DRE. The steps in the upper dotted box are the scene recommendation, and the bottom dotted box shows the steps of personalized recommendation

Before designing the DRE model, we conducted data analysis to select useful features that help make the recommendation efficient and accurate. The analysis covers scenario features (such as the airline, the departure airport, and the advance booking days) and personal features (such as the price sensitivity level and the usage of different flight services). The model classifies customers into several groups, and for each group we recommend a specific type of service package. As the experiments show, our proposed DRE can predict customers' preferences and actual needs without affecting the flight booking process.

Compared with airlines, online travel agencies (OTAs) have multiple advantages. OTAs offer multi-airline flight itineraries, so clients can easily compare diversified products while spending little time searching for flights. For example, an airline offers a limited set of air routes, but OTAs can offer new routes through connecting flights that combine products from different airlines. OTAs not only provide exclusive connecting flights to satisfy customers' needs, but also offer supplementary products to reduce risk, for example, flight delay insurance for the first flight segment.

With many more tourism resources and ticket sales, OTAs have a diverse customer base; by analyzing it, services can be improved and customers’ preferences can be learned. In particular, as the OTA with the highest market share in China, we have customers whose behaviors are more representative of the Chinese tourism market. In addition, we have customers' historical behaviors across multiple products, including flight tickets, hotel reservations, car rentals, and so on. Thanks to this variety of data, user browsing history, and marketing tools, and with the help of artificial intelligence and big data technologies, we are more aware than ever of what our customers are searching for, their past purchases, and their preferred destinations. By analyzing the features and information of different flights, we can also compare the services and customer satisfaction of different airlines, which could help them improve their products over time. When dealing with special events, OTAs usually have more resources to make prompt adjustments, maximizing the chances of on-time travel. With our DRE in particular, customers’ demands can be satisfied quickly and with little loss.

In the context of the pandemic, by recommending appropriate service products to customers, OTAs can increase revenue by boosting the package attach rate of flight tickets and improving customers' ordering experience. OTAs can not only provide users with travel protection for refund or change services on air tickets, but also make travel more convenient for clients in multiple respects. As an OTA, we have confidence in the future development of our products, and there is still plenty of room to improve the AIO package and the DRE. On the one hand, the AIO package could include not only flight ticket-related services but also vacation-related products such as hotels and activity tickets. On the other hand, as the combination of services or products becomes more complex, the DRE will need to make a more detailed classification, which requires a more advanced model and strategies.