1 Introduction

Appointment systems are widely used in the service industries to match customers’ demand and a service provider’s capacity, e.g., in the health care sector, gastronomy, or leisure industry. “An appointment system has completely altered our lives; it has brought order out of chaos and we can never cease to wonder how we endured our old ways so long” (Cardew 1967). First, common practices started with pen and paper and were later replaced by computer-based appointment systems, which are nowadays connected to the internet with increasing pace (see Zhao et al. 2017 for an overview of web-based medical appointment systems). Well-known platforms to make an appointment online are, e.g., www.opentable.com (restaurants), www.zocdoc.com (doctors), and www.opencare.com (dentists). Online appointments seem to be an emerging trend (see U.S. Government Accountability Office 2017) and are especially beneficial in times of a pandemic like COVID-19Footnote 1 (resp. Sars-CoV-2). In times of social distancing, well matched supply and demand is important to avoid unnecessary gathering of people. Online appointment systems can decrease the number of customers lining up in a long queue not knowing whether they will be treated after a reasonable waiting time, especially after a reopening (e.g., after a pandemic lockdown). The pandemic situation has even pushed virtual health care forward. “This crisis has forced us to change how we deliver health care more in 20 days than we had in 20 years” (Dr. Robert McLean in Span 2020) and Dr. Meeta Shah conjectures “kind of a turning point for virtual health care” (Dr. Meeta Shah in Abelson 2020). In any case (virtual or physical service), offering the booking possibility online comes with the decision which appointments to offer in detail. Some providers ask potential customers about their time preferences before offering appointment times. 
Irrespective of possible time preferences, the following question is raised: Should the provider solely offer one appointment time, e.g., the one that is closest to the preference (if known), several times to choose from, or even all available appointments?

Nowadays, customers looking for an unknown service provider (e.g., for a new dentist or a restaurant) often read ratings on portals like www.tripadvisor.com or www.yelp.com. Shukla et al. (2020) show by investigating clickstream data on online word-of-mouth that how many doctors are rated can influence customers’ choice behavior. Observing the service provider’s free capacity may also be part of the opinion formation (observational learning from previous customer choices). The consequence may be a negative or a positive effect. On the one hand, a small offer set may be negatively associated with e.g., longer waiting times, leading to a turned-away customer. On the other hand, a small offer set may be positively associated with popularity, leading to an increased interest in the provider. We see this behavior (positive association) as an analogy to the offline empty restaurant syndrome, where herd behavior may lead to a queue if there are sufficiently informed customers who know that the quality of the service is high (see Becker 1991; Kremer and Debo 2016; Teraji 2003). The general concept is that uninformed customers follow the behavior of (seemingly) informed customers. One queue gets longer while the competitors possibly stay with a lack of demand (as long as the queue is not oversized). In terms of an online appointment system, customers may infer the quality of the service when observing the booking status, i.e., the number of offered time slots and the number of slots that are not offered (or booked). In that vein, a small set of offered appointments in the online world can be put on a level with a queue in the offline world.

Kaluza et al. (2023) investigate whether this empty restaurant effect is also present in the online world. They find in different settings that the number of offered appointments can influence the customers’ choice behavior. We replicate these findings and set up a survey that puts respondents in a stylized situation to book an appointment with a dentist. Different from shortages of specific physicians (in several regions), some dentists face a performance pressure, leading to many overtreatments to increase revenues (Heath et al. 2020), which can harm the perceived quality and the customers’ satisfaction and may turn away customers. In line with Kaluza et al. (2023), we find in the survey that customers are less likely to choose a dentist if the number of offered time slots is relatively high. As an example, one of our survey’s participants states: “I chose by seeing the number of free appointments as a sign of how good the dentist is. So I preferred the one, that looks more preferred by other people”.

Based on our observation, we model an online appointment system as a Markov decision process, following Gupta and Wang (2008) and Zhang and Cooper (2005). The service provider aims at maximizing expected profits by controlling the number of time slots offered to a customer. A discrete choice model is embedded to cover the customers’ preferences (see Train 2009 for an overview). We assume that the customer’s utility function includes the number of offered time slots, as we observe in the survey that customers infer inferior quality of the service if too many time slots are offered. We analyze the conditions under which the visibility of free time slots should be actively managed by the service provider, i.e., offering only a specific subset of the available appointments.

As the state space of the Markov decision process grows exponentially with the number of differently preferred time slots (slot types), we analyze three decision rules and test them against the optimal solution. We find in a numerical study that the decision rules’ performances mainly depend on two factors, i.e., how strong the quality inference affects choice behavior and how strong the preference for a time slot varies throughout the day. The decision rule that performs best in all instances is solving the combinatorial problem myopically by finding the offer set that minimizes the likelihood that a customer leaves the system without booking at any given point in time. We show that this rule is optimal in case the customers have no preferences concerning the timing of the appointment throughout the day. In other cases, the performance of the rule is close to optimal, while we also identify simpler rules than the myopic with reasonable performances in specific scenarios.

In sum, our main contribution lies in considering a new aspect of customers’ choice behavior in online appointment systems. Including the possibility of offering only a subset of the available slots may improve the utilization and, thus, the expected profit of a service provider when customers infer quality from the booking status as in our survey. We propose rules and provide managerial insights on which policy seems most appropriate depending on the strength of the quality inference and the heterogeneity of time slot preferences.

The rest of this paper is structured as follows. In Sect. 2, we briefly review the relevant theoretical literature regarding related appointment systems and behavioral literature about the empty restaurant syndrome. In Sect. 3, we introduce the design and results of our survey on online appointments to investigate the empty restaurant syndrome in an online environment. In Sect. 4, we provide details about our theoretical model and explain generalizable insights in Sect. 6. Section 7 gives an overview of our numerical study, introduces our developed decision rules, and shows the performance of our model and the decision rules. In Sect. 8, we critically discuss our approach and point out relevant limitations. Finally, we conclude our paper in Sect. 8.

2 Related Literature

In our paper, we study how time slots should be made available when customers infer quality from the number of available time slots in appointment systems. Revenue management, as an instrument for the allocation of service capacity among customers, has been extensively studied in the literature, especially airline revenue management (see McGill and van Ryzin 1999 for an overview). We focus on a setting in which the service provider cannot differentiate the time slots by prices or other characteristics (contrary to, e.g., the standard two-fare-class revenue management problem in Belobaba 1989). See, e.g., DeCroix et al. (2021) for the consideration of dynamic personalized pricing when service variability reduces a provider’s revenue.

Our setup is close to the literature on revenue management with parallel flights, i.e., where multiple flights are scheduled between the same origin and destination in a sufficiently narrow time frame (Zhang and Cooper 2005). From a modelling perspective, this planning problem is similar to a service provider offering multiple appointment slots within a short period (Gupta and Wang 2008). In particular, our planning situation is a special case of the model presented in Zhang and Cooper (2005) without customer segmentation, where a specific flight corresponds to an appointment slot type (e.g., a time slot in the morning or the evening) and the number of seats on a specific flight corresponds to the number of offered slots per slot type (e.g., four available slots in the morning). The airline/service provider must decide how many seats/appointments to offer from the available capacity. In revenue management with parallel flights, offering fewer seats than available may be beneficial for later arriving customers who book in a differently priced booking class. In our scenario, offering fewer appointments than available may be beneficial if this stimulates demand.

Different decision levels are studied regarding appointment systems (see Ahmadi-Javid et al. 2017 for the strategic, tactical and operational levels). The strategic level primarily focuses on decisions such as how many servers to consider, the access policy, and how to deal with walk-ins in general. On the tactical level, the appointment design is to be determined, such as the interval, time window, and block size (Cayirli and Veral 2003), whereas the operational level, which we primarily focus on, refers to the individual customer level with fixed capacities. Patients are then either accepted and allocated to a day and time or rejected.

Approaches for multiple service providers in one appointment system exist (Gupta and Wang 2008), whereas we focus on an appointment system for a single service provider for simplicity (and further show how to include competing providers in Appendix D). Liu et al. (2019) consider sequential offering, enabling interaction between the provider and the potential customer. This process is especially beneficial for telephone-based appointment scheduling. We focus on an online appointment system and exclude the possibility of iteratively offering different sets to one potential customer. Most of the literature considers intraday planning (e.g., Gupta and Wang 2008; Talluri and van Ryzin 2004a), as we do. That is, we consider and plan one workday separately from others. Only a few authors work on interday planning to consider several days of a week (e.g., Feldman et al. 2014; Wiesche et al. 2017; Zacharias et al. 2020). Regarding the analytical methods on the operational level, some approaches for appointment systems make use of queueing theory (see, e.g., Green 2006 and Zhou et al. 2021), while others, including our approach, use (stochastic) dynamic programming, like Gerchak et al. (1996), Green et al. (2006), Gupta and Wang (2008), Feldman et al. (2014), and Truong (2015).

Concerning customer choices, discrete choice approaches (Train 2009) received great attention in the literature. Random utility models help consider individual preferences among alternatives. A considerable range of models has been developed and can be used for different contexts. Independent demand models, for example, are often found in airline revenue management studies (see Talluri and van Ryzin 2004b). Mackert (2019) considers dynamic slot management for profit-maximization in the context of attended home delivery using a general attraction model for customer choice behavior. We use a multinomial logit model (Ben-Akiva and Lerman 1985; McFadden 1974) and take into account the size of the offer set (i.e., the number of offered appointments) in the utility function. Mushtaque and Pazour (2020) focus on the consideration set theory and study multinomial logit cardinality effect models to compare the benefits and costs of offering a specific subset on an entertainment subscription platform. In contrast to our approach, numerous customers can purchase the same service and customers are overwhelmed by too much information, wherefore a personalized subset is recommended. Customers do not infer quality from the offer set.

Several publications exist in which heterogeneous customer preferences are examined or modeled (e.g., Hole 2008; Liu et al. 2018, 2019), whereas we assume a homogenous customer group who requests service in advance. As demand is endogenous, our approach fits into the literature on choice-based optimization (see Haase and Müller 2013). Hole (2011) similarly models the decision on attributes in the choice endogenously, while other approaches (see for example Gupta and Wang 2008; Liu et al. 2019) consider demand as exogenous.

Liu et al. (2019) also allow offering fewer slots than available. However, they block full slot types for later arriving customers, as heterogeneous customers are interested in specific slot types (binary choice model), and find several instances for which offering all slots is optimal. In contrast, we consider homogenous customers (requesting service during the booking horizon) and allow blocking single slots of one slot type (set of slots that are equally interesting to the customers) to increase the customers’ interest in the provider. Gupta and Wang (2008) introduce a booking limit which indicates the optimal number of requests to be accepted, in other words, slots to be booked before the day of service. Thus, it is never optimal to reject a request if the booking limit is not reached. However, it may be worthwhile to limit the offer set.

Kaluza et al. (2023) show in their surveys that the size of the offer set of appointments as well as star ratings and travel times can have an impact on customers’ choice behavior. Liu et al. (2019) find in a survey that patients have heterogeneous time-window preferences for an appointment with their primary care doctor. In our survey and in line with Kaluza et al. (2023), we control these heterogeneous preferences by predefining the preferred appointment for the respondents and focus on the number of offered slots instead. We expect the customer’s interest to decrease with a greater choice set in an online appointment system. This expectation follows from the theory of observational learning, according to which customers adopt the behavior of previous customers: a small offer set plays the role of a queue, and we expect observing it to have a positive effect on the interest in the service provider. For further information on observational learning, see the basic literature by Banerjee (1992) and Bikhchandani et al. (1992). The effect behind our expectation is also known as the empty restaurant syndrome, but it is rarely studied in the literature. A queue may be associated with quality by uninformed consumers (potential customers who do not know the provider) because they presume informed customers in the queue demanding service and knowing the provider (Kremer and Debo 2016). Thus, absent demand may evoke unpleasant associations (e.g., low quality of service). A service provider who offers many appointments at once may experience less demand if the effect is present. Therefore, it may be beneficial to not offer all available slots. Veeraraghavan and Debo (2009), for example, include information about service quality and queue length in their model on customers’ choice behavior. They further investigate the impact of waiting cost in queues on the customers’ behavior and find that customers exhibit herd behavior as long as they do not want to minimize ex-post regret (Veeraraghavan and Debo 2011).
Debo et al. (2012) consider a queueing system and assume that potential customers decide whether to buy the product after observing the queue. They conclude that a queue can be a signal of high quality. Özer and Zheng (2016) include the perceived probability that a product is available, which may impact the purchase behavior.

Empirical research finds the empty restaurant syndrome in offline scenarios. Koo and Fishbach (2010), for example, experimentally show that the presence and the length of a queue behind a person increase the value of a product. Giebelhausen et al. (2011) find in an experiment that waiting time can indicate quality and positively influences the purchase intention and experienced satisfaction. Kremer and Debo (2016) show in laboratory experiments that waiting times positively affect the uninformed consumers’ purchase intention if informed consumers are present. In contrast, DeVries et al. (2018) find a negative impact of waiting time on the long-term customer behavior and revenue by analyzing the data collected from an Indian restaurant. Jin et al. (2015) observe both positive and negative effects for the choice between several locations and find that observational learning depends on the congestion level. In our survey, we examine whether the empty restaurant syndrome may also occur in an online appointment system, as first investigated by Kaluza et al. (2023). In the following, we also refer to it as the quality inference effect.

3 Survey

3.1 Design

We distributed the survey (Software: LimeSurvey) via a platform for students of an undergraduate course at the University of Hamburg. 248 undergraduates completed the survey in January 2020 (14 January until 23 January).Footnote 2 As an incentive, subjects received bonus points for the exam if they completed the survey.

Our survey is organized into three parts, in line with Kaluza et al. (2023). The first part includes warm-up questions regarding the subjects’ internet usage. The second and main part consists of three scenarios in which the subjects choose between two service providers (see an exemplary scenario in Fig. 8 in Appendix A). In each scenario, we ask participants to imagine the need for a yearly routine appointment at the dentist without having any toothaches, whereby the preferred time slot is at 8 a.m. With a complete appointment system of 32 slots (15-minute cycle from 8 a.m. to 4 p.m., known to the subjects), each dentist offers either a small (two slots), a moderate (eight slots) or a big set (32 slots) of appointments. The smaller offer set is always a subset of the bigger offer set and the preferred time slot (8 a.m.) is always offered by both providers. The dates and time until the day of service are excluded.

We conducted six treatments in a between-subject design (see Table 1). In each treatment, we test three scenarios in a within-subject design. As an example, in treatment T1a, subjects first choose between a dentist with 2 available slots and another dentist with 8 available slots, and in the second-choice situation, between 2 and 32 slots, and so on. To check if the position of the choice option on the screen affects choice behavior, we have a second treatment T1b that is almost identical to treatment T1a but reverses the order of free slots. That is, in treatment T1a, the first choice is between 2 and 8, and in treatment T1b it is 8 vs. 2. Furthermore, we randomize in each treatment and each scenario the order of answers in the choice list (provider A and provider B).

Table 1 Treatment summary

In the third and last part of the survey, we ask open questions to understand the subjects’ decisions in the preceding scenarios. We integrated several attention checks making sure that subjects answer conscientiously. Note that we do not exclude those participants in our analysis who did not provide the correct answers. Only two subjects “failed” all attention checks. We did not find any critical contradictions when excluding those two subjects in our analysis.Footnote 3

Our hypothesis follows the idea of the empty restaurant syndrome, i.e., a greater offer set may be associated with low quality and vice versa.

Hypothesis:

A provider with a smaller offer set is preferred to a provider with a greater offer set.

3.2 Survey Results

We first present the main socio-demographics of our dataset. 46% of our respondents are female, 54% are male. The self-reported ages range from 18 to 76, while 95% are between 19 and 30. See Fig. 1 for the distribution of the respondents’ age. 90% of the respondents are from Germany, the remaining 10% are from Ukraine (3 entries), Afghanistan (3 entries) and 16 other countries from all over the world.

Fig. 1 Distribution of the respondents’ age

We next analyze the choice frequencies in the first scenario of each treatment (see Table 2). We consider the (a) and (b) treatments jointly since we observe no significant order effects (Fisher’s exact test, \(\alpha > 0.05\)).

Table 2 Descriptive statistics of the first choice

Having to choose between the moderate and the big offer set (T3), 75% choose the option with fewer available slots. A narrow majority chooses the option with fewer available slots when choosing between the small and the big offer set (T2). Slightly less than one-half chooses the small offer set compared to the moderate offer set (T1). Choice by chance (i.e., no effect of the number of offered slots) would predict a 50/50 split, which we could not reject in T1 and T2 (binomial test, \(\alpha > 0.1\)), but which is clearly rejected in T3 (\(\alpha < 0.001\)). The share of respondents choosing the provider with eight slots differs highly significantly depending on the given alternative (Fisher’s exact test, \(\alpha < 0.01\)).
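The binomial test against a 50/50 split can be sketched as follows. This is a minimal illustration in plain Python, not the analysis code behind the reported results; the function name is ours, and the count of 53 out of 71 respondents is a hypothetical value chosen only to be roughly consistent with the 75% reported for T3 (the actual counts are given in Table 2).

```python
from math import comb

def binomial_two_sided_p(successes: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are at most as likely as the observed one under H0 (share = p0)."""
    pmf = [comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(n + 1)]
    cutoff = pmf[successes] * (1 + 1e-7)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

# Hypothetical counts (assumption): 53 of 71 respondents pick the smaller set.
p_value = binomial_two_sided_p(53, 71)
print(f"two-sided p-value: {p_value:.2e}")
```

With counts of this magnitude, the 50/50 hypothesis would be rejected at any conventional level, whereas a nearly even split (e.g., 36 of 71) would not.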

3.3 Choice Motives

A too-big offer set leads to less demand in treatment T3. This effect is in line with our hypothesis. However, the choice behavior of preferring a smaller offer set can be explained by different motives besides our considered quality inference effect, e.g., by choice overload or by it simply being easier to find the predefined preferred appointment at 8 a.m. if fewer slots are displayed. Therefore, we have a closer look at how the respondents explained their choices in an open question after the three choices (“What aspect(s) did you consider when deciding which of the providers to choose in the scenarios?”). We let two independent raters code the open question on a binary scale, i.e., whether a motive is mentioned by a respondent or not (yes: 1, no: 0). Besides the frequency of the mentioned motives, Cohen’s kappa coefficient (Byrt et al. 1993) is stated for the inter-rater reliability. We identify five explanations in the data set: quality, choice overload, flexibility, scarcity and less waiting time.Footnote 4
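For two raters and a binary code, Cohen’s kappa compares the observed agreement with the agreement expected by chance. A minimal sketch follows; the function name and the two rating vectors are hypothetical and not taken from the survey data.

```python
def cohens_kappa(rater_a: list[int], rater_b: list[int]) -> float:
    """Cohen's kappa for two raters coding the same items as 0 or 1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items with identical codes.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal share of 1-codes.
    share_a = sum(rater_a) / n
    share_b = sum(rater_b) / n
    p_chance = share_a * share_b + (1 - share_a) * (1 - share_b)
    if p_chance == 1.0:
        return float("nan")  # kappa is undefined if chance agreement is 1
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical codes for eight answers (1 = motive mentioned, 0 = not).
rater_1 = [1, 1, 0, 0, 1, 0, 1, 0]
rater_2 = [1, 1, 0, 0, 1, 0, 0, 1]
print(cohens_kappa(rater_1, rater_2))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance, which is why rarely mentioned motives can show a poor kappa despite few disagreements.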

In Table 3, we show the raters’ mean percentage (geometric mean) of respondents in T3 (8 vs. 32, significant results) who name the respective motives and the Cohen’s kappa for the inter-rater reliability. For the sake of completeness, we provide the frequency of the motives of all respondents in Table 6 in Appendix B. However, those results do not deviate significantly.

Table 3 Frequency of the motives for the choice behavior in T3 (71 respondents)

The predominant explanation of the behavior is the quality motive, which is mentioned by 43% of the respondents in treatment T3 and is in line with our hypothesis, followed by the flexibility motive (38%). These findings are in line with Kaluza et al. (2023). Following the quality motive, customers expect the service provider with a smaller offer set to be more in demand and thus more popular, which leads to the perception that it may be the better dentist. “Less appointments may mean that the dentist is more popular among the clients and so better.” (respond. 225). Further, a too-big offer set seems to be suspicious due to absent demand. “[H]ow many appointment choices were available because I think that too many options show that the doctor doesn’t have many patients, so maybe he’s not that good in what he’s doing” (respond. 110). Summarizing, a smaller offer set is mostly associated with higher quality, while a too-big offer set is associated with lower quality. Customers thus rather choose a service provider with a smaller offer set.

An alternative explanation to the quality motive for a preferred smaller offer set is choice overload (see Eppler and Mengis 2004 and Scheibehenne et al. 2010 for a literature review). Only 5% give an overload-related motive. “It should be clearly structured and not too overloaded” (respond. 283). This motive is considered explicitly even though it has only single entries and a poor Cohen’s kappa.

An explanation for opting for the larger offer set is “flexibility”, which is given by 38% of the respondents in T3. This motive is also found by Rubin et al. (2006) as a driver. They investigate waiting times and choice of time and doctor in a discrete choice experiment and find that, e.g., employees are willing to wait longer in order to get their preferred time. Even with a predetermined preference of the 8 a.m. appointment in our survey, customers like to have the flexibility and greater availability. “The more available possibilities[,] the more freedom I have […]” (respond. 245). Further, some subjects considered the possibility that their preferred 8 a.m. appointment would not take place. They imagined being late or being rescheduled by the provider and preferred flexibility for the postponed appointment. “[…] possibilities to change to a[n] appointment a bit later in case that the appointment at 8 am can[]not take place […]” (respond. 138). Summarizing, a greater offer set gives more flexibility and may thus be positively associated. However, we want to mention that some subjects only mentioned “availability”, which is not always a clear motive for a preferred greater offer set. By simply stating “availability”, it is not clear whether the respondent prefers more options or also infers quality from a smaller availability. Further note that the open question is given for all three choices, even though we focus on the first choice of treatment T3 (8 vs. 32 slots). Thus, the explanation could also be a reason for a different choice behavior (of the second or third choice, e.g., 2 vs. 32) and does not need to be contradictory.

Two less mentioned but clearer motives are scarcity (10%), see for example Denier (2008), and less waiting time (9%), see for example Rubin et al. (2006). If a service provider offers too few slots, potential customers expect the provider to be in a rush and/or the waiting room to be crowded, which leads to a longer wait time. “[…] I don’t want to visit a dentist, who is super stressed […]” (respond. 10). Potential customers may have negative associations with a scarce offer set (i.e., very few offered slots) and may rather choose another provider with more available slots. We included the two motives in our analysis as alternatives to “flexibility” despite the lower frequency and the poorer Cohen’s kappa for scarcity.

The remaining motives (24%) are single entries (e.g., safety, breaks). Note that the open question gives the possibility to mention several aspects. Thus, it sums up to more than 100%.

Summarizing, the open-ended question indicates that customers mainly infer quality from a relatively small offer set. However, a too-small offer set may also evoke disutility because customers (a) are also looking for some flexibility in their choice or (b) anticipate longer waiting times. In line with Kaluza et al. (2023), we denote the latter as the scarcity effect. We will not further consider this behavioral effect in our study, since eliminating scarcity would require adding capacity to the service provider, which is not within the scope of our analysis.

3.4 Strength of the Quality Inference Effect

We next estimate the effect of the number of displayed slots using a multinomial logit model in R (version 4.0.2) with the utility function specified in Eq. 1. The number of offered slots is denoted by o. We use a relative formulation in which \(\frac{o-a}{b-a}\), \(b> a> 0\), becomes 1 if all slots are offered (upper bound b) and 0 if the number of offered slots is equal to the lower bound a, which is to be specified. For simplicity, we model a linear quality inference effect, i.e., we assume a negative and linear impact of the number of offered appointments on customers’ utility.Footnote 5 We set the alternative specific constant to zero, \(asc=0\), as no differences between the alternatives are given in our survey, apart from the number of slots that are offered. We thus cannot find any alternative specific anchor that requires the integration of an \(asc\neq 0\). The upper bound is \(b=32\) and the lower bound \(a=2\) (the lowest offer set we consider in our survey, thus \(b\geq o\geq a\) in our setting). The \(\hat{\beta }\)-coefficient measures the strength of the effect relative to asc and the error term ϵ.

$$U=asc-\hat{\beta }\cdot \frac{ o-a}{b-a}+\epsilon$$
(1)

We present our analysis regarding the first decision of each subject (248 observations) to account for the dependencies of the within-subject variations. We receive similar results when running the analysis on the whole data set, see Tables 7 and 8 in Appendix C.

The overall estimated \(\hat{\beta }^{\mathrm{all}}\)-value is 0.6007 (σ = 0.18, t‑ratio = −3.34), indicating a slight but significant negative tendency of the utility for an increasing offer set at the 95% confidence level, which confirms our hypothesis on an aggregated level. The more slots are offered, the less attractive the provider gets.

We also estimate \(\hat{\beta }^{i}\) for the three treatments separately, \(i\in \left\{T1,T2,T3\right\}\). Contrary to our hypothesis, we estimate \(\hat{\beta }^{T1}=-0.5382\) (\(\sigma =1.0385\), t-ratio \(=0.52\)), indicating a non-significant increase of interest in a provider with an increasing offer set (2 vs. 8). Note that we model a negative impact of the size of the offer set on the interest in the provider, see Eq. 1. A negative \(\hat{\beta }\)-estimate thus implies a positive impact of the size of the offer set on the utility. For T2 (2 vs. 32), we get a non-significant estimate of \(\hat{\beta }^{T2}=0.2877\) (\(\sigma =0.2205\), t-ratio \(=-1.3\)). For T3 (8 vs. 32), we estimate a significant effect of \(\hat{\beta }^{T3}=1.35\) (\(\sigma =0.341\), t-ratio \(=-3.96\)).
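To see what these estimates imply, Eq. 1 can be plugged into a binary logit choice probability between two providers. The sketch below assumes \(asc=0\), \(a=2\) and \(b=32\) as stated above, together with the standard logit assumption of i.i.d. extreme-value error terms; the function names are ours, and the code is an illustration rather than the estimation routine used in R.

```python
from math import exp

def relative_offer(o: int, a: int = 2, b: int = 32) -> float:
    """Relative offer-set size (o - a) / (b - a): 0 at the lower bound, 1 at the upper."""
    return (o - a) / (b - a)

def deterministic_utility(o: int, beta: float, asc: float = 0.0,
                          a: int = 2, b: int = 32) -> float:
    """Deterministic part of Eq. 1: asc - beta * (o - a) / (b - a)."""
    return asc - beta * relative_offer(o, a, b)

def choice_probability(o_first: int, o_second: int, beta: float) -> float:
    """Binary logit probability of choosing the first of two providers."""
    u_first = deterministic_utility(o_first, beta)
    u_second = deterministic_utility(o_second, beta)
    return 1.0 / (1.0 + exp(u_second - u_first))

# Treatment-specific estimates reported in the text.
print(choice_probability(8, 32, 1.35))     # T3: 8 vs. 32 slots
print(choice_probability(2, 32, 0.2877))   # T2: 2 vs. 32 slots
print(choice_probability(2, 8, -0.5382))   # T1: 2 vs. 8 slots
```

Under these assumptions, the T3 estimate implies that roughly three out of four respondents choose the provider with eight slots over the one with 32, close to the share reported in Sect. 3.2. The T2 estimate implies a narrow majority for the two-slot provider, and the negative T1 estimate implies a share slightly below one-half.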

In sum, we find evidence that customers infer quality from the booking status. We assume that in realistic scenarios, customers go through a search process when looking for a service provider. Within such a process, customers observe one offer set after another of different providers and decide each time whether to book an appointment with that provider or search for an alternative. The process ends with an appointment request. In our stochastic dynamic program, we consider one of those service providers and include the positive externalities of the search process. We further assume that the scarcity problem is out of the service provider’s control and focus on those situations where \(\beta > 0\) by setting the lower bound a such that the quality-effect clearly exceeds the scarcity-effect. Due to the scarcity effect, this quality inference effect only affects choice behavior if the number of offered slots is sufficiently large. We next introduce our stochastic dynamic program with the integrated discrete choice model that accounts for customers’ quality inferences from the booking status and solely focuses on one provider (disregarding competitors).

4 Model Formulation

We focus on an appointment system for one workday of a single service provider. In Appendix D, we further show how we derive the customers’ choice behavior regarding the considered provider when several competing providers are observed (e.g., two providers in our survey, see Sect. 3). We assume that the service provider faces uncertain customer demand from a homogenous group of customers that requests slots in advance during the booking horizon. We assume that customers have not booked the provider’s service yet and are thus not familiar with the provider’s quality. Note that Gupta and Wang (2008) consider a second customer group that places requests on the day of service, so-called same-day requests. We assume that quality inference has no effect for these customers and do not discuss this group further. We formulate our problem as a discrete-time, finite-horizon Markov decision process, following Gupta and Wang (2008), Liu et al. (2019), and Zhang and Cooper (2005).

Markov Decision Process Formulation

The service provider has a fixed capacity of κ slots on the workday. As customers may prefer some appointment times to others, we consider N different slot types, indexed by \(n=1,\ldots ,N\). Appointments from slot type \(n=1\) are the most preferred, followed by appointments from slot type \(n=2\), and so on. The fixed capacity of slot type n is denoted by \(\kappa _{n}\), with \(\kappa =\sum _{n\in N}\kappa _{n}\). Each slot belongs to one unique slot type n and can be booked for at most one customer. The booking status of the workday is denoted by \(\vec{s}=\left(s_{1},\ldots ,s_{N}\right)\), whereby \(s_{n}\in \left\{0{,}1,\ldots ,\kappa _{n}\right\}\) states how many slots of type n are available. The offered appointments are denoted by \(\vec{o}=\left(o_{1},\ldots ,o_{N}\right)\), whereby \(o_{n}\in \left\{0{,}1,\ldots ,s_{n}\right\}\) states how many slots of type n are offered. It follows that \(o_{n}\leq s_{n}\), i.e., only available slots can be offered. In the following, we refer to slots that are available but not offered as blocked slots.

The booking horizon starts with opening the slots for requests, ends with the beginning of the planned workday, and is divided into τ discrete time periods, with \(t=1,\ldots ,\tau\). Time is counted backwards; thus, we denote the planned workday by \(t=0\). In each period \(t\geq 1\), the service provider must decide which of the yet unbooked slots to offer. Denying requests cannot be optimal in our framework. At most, one request occurs per period and only a slot of type n with \(o_{n}> 0\) can be requested. The service provider aims at an optimal capacity utilization via controlling the appointment system’s booking status to maximize her expected profit. Thus, in each period the service provider must decide how many slots to offer in the subsequent period.

Figure 2 illustrates an extract of the sequence of events for periods t and t−1 with two slot types, i.e., \(N=2\). Following the bold path from one state to another one observes state \(\vec{s}=\left(1{,}2\right)\), i.e., one available slot of type one (\(s_{1}=1)\) and two available slots of type two (\(s_{2}=2)\). The service provider blocks one slot of type 2 and it follows \(s_{1}=o_{1}=1\) and \(s_{2}> o_{2}=1.\) Then, demand for the slot of type 2 is realized. That slot is booked for the customer, which leads to the state \(\vec{s}=\left(1{,}1\right)\) in the next period that is closer to the day of service.

Fig. 2 Sequence of events with \(N=2\). R1/R2 denotes a request of a slot type 1/2, and “No” denotes no request

Planning Requests and Booking Status

In each period \(t\geq 1\), there is at most one potential customer with an independent arrival rate \(0< \alpha _{t}\leq 1\). The customers’ demand is random and follows a multinomial logit choice rule provided that a customer arrives in the respective period. \(P_{n}\left(\vec{o}\right)\) is the conditional probability that a slot of type n is requested given offer set \(\vec{o}\). \(P_{0}\left(\vec{o}\right)\) denotes the probability that a customer does not request any of the offered slots (“no choice”). \(P_{n}\left(\vec{o}\right)=0\), if no slot of type n is offered. For this specification, \(\sum _{n\in N}P_{n}\left(\vec{o}\right)+P_{0}\left(\vec{o}\right)=1\) holds true.

Equation (2a) states the request probability \(P_{n}\left(\vec{o}\right)\) for a slot of type n given the offer set \(\vec{o}\), assuming that the unobserved attributes of slot type n, denoted by ϵn, with the utility \(U_{n}=w_{n}-\beta \cdot \left(\frac{ o-a}{b-a}\right)^{+}+\epsilon _{n}\), follow a Gumbel distribution (Train 2009).

The weight of slot type n, \(w_{n}\), captures its mean utility, also known as the alternative-specific constant. We assume the weights to be exogenous. A survey similar to Liu et al. (2019) could help gather further information on time preferences. Similar to our survey with one slot type, we assume that the size of the offer set \(o={\sum }_{i=1}^{N}o_{i}\) has a negative linear impact on the choice probability of all offered slots. Thus, the observable attribute of each slot of type n consists of the weight \(w_{n}\) and the size of the offer set (number of offered slots) in a relative formulation, multiplied by the parameter β, which indicates the strength of the effect.

As each slot of type n has the utility \(U_{n}\), we multiply each exponentiated utility of slot type n by the number of offered slots \(o_{n}\).

$$P_{n}\left(\vec{o}\right)=\begin{cases} \frac{o_{n}\cdot e^{{w_{n}}-\beta \cdot {\left(\frac{ o-a}{b-a}\right)^{+}}}}{\sum _{m\in N}o_{m}\cdot e^{{w_{m}}-\beta \cdot {\left(\frac{ o-a}{b-a}\right)^{+}}}+e^{{w_{0}}}}\text{ if }o_{n}> 0\\ \qquad 0\qquad \text{ otherwise} \end{cases}$$
(2a)

We finally simplify (2a) by capturing the quality inference effect in the no-choice option, instead of subtracting it from each slot type option, see Eq. 2b.

$$P_{n}\left(\vec{o}\right)=\begin{cases} \frac{o_{n}\cdot e^{{w_{n}}}}{\sum _{m\in N}o_{m}\cdot e^{{w_{m}}}+e^{{w_{0}}+\beta \cdot {\left(\frac{ o-a}{b-a}\right)^{+}}}}\text{ if }o_{n}> 0\\ \qquad 0\qquad \text{ otherwise} \end{cases}$$
(2b)
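As a minimal sketch of Eqs. 2a and 2b, the following Python functions evaluate the request probabilities for a given offer vector, so that the equivalence of both formulations can be checked numerically. The function names and the example values are illustrative assumptions and not taken from the study's MATLAB code; in particular, the upper bound b = 6 is an assumption, since its value is not specified in this section.

```python
import math


def choice_probabilities(offer, weights, beta, a, b, w0=0.0):
    """Eq. 2b: return ([P_1, ..., P_N], P_0) for an offer vector."""
    size = sum(offer)  # o, the total number of offered slots
    # Quality inference term beta * ((o - a) / (b - a))^+, added to the no-choice utility.
    penalty = beta * max((size - a) / (b - a), 0.0)
    attract = [o_n * math.exp(w_n) for o_n, w_n in zip(offer, weights)]
    no_choice = math.exp(w0 + penalty)
    denom = sum(attract) + no_choice
    return [x / denom for x in attract], no_choice / denom


def choice_probabilities_2a(offer, weights, beta, a, b, w0=0.0):
    """Eq. 2a: same probabilities, with the term subtracted from each slot utility."""
    size = sum(offer)
    penalty = beta * max((size - a) / (b - a), 0.0)
    attract = [o_n * math.exp(w_n - penalty) for o_n, w_n in zip(offer, weights)]
    denom = sum(attract) + math.exp(w0)
    return [x / denom for x in attract], math.exp(w0) / denom


# Hypothetical example: one preferred slot (w = 2) and two less preferred slots (w = 1).
p, p0 = choice_probabilities((1, 2), (2.0, 1.0), beta=1.35, a=1.0, b=6.0)
print(p, p0)
```

Slot types with \(o_{n}=0\) receive probability zero automatically because their attraction term vanishes, which matches the "otherwise" case of both equations.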

A customer’s booking of a time slot gives a revenue of r. We assume that there is no further interaction between the customer and the service provider (e.g., finding a suitable time slot verbally or having multiple requests).

Let \(v_{t}\left(\vec{s}\right)\) be the maximum expected profit from t onwards given the booking status \(\vec{s}\). The problem is solved recursively by maximizing expected profit stated in \(v_{t}\left(\vec{s}\right)\) and \(v_{0}\left(\vec{s}\right)=0\) for the recursion start.

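$$v_{t}\left(\vec{s}\right)=\max _{\vec{o}\leq \vec{s}}\left[\alpha _{t}\cdot \left(\sum _{n\in N}P_{n}\left(\vec{o}\right)\cdot \left(r+v_{t-1}\left(\vec{s}-\vec{e}_{n}\right)\right)+P_{0}\left(\vec{o}\right)\cdot v_{t-1}\left(\vec{s}\right)\right)+\left(1-\alpha _{t}\right)\cdot v_{t-1}\left(\vec{s}\right)\right]$$
(3)

Vector \(\vec{e}_{n}=\left(e_{1},\ldots ,e_{N}\right)\) tracks a booked slot of type n: component n is one while all other components are zero. With probability \(\alpha _{t}\cdot P_{n}\left(\vec{o}\right)\), a slot of type n is booked. Otherwise, i.e., if a customer arrives but requests no slot or if no customer arrives, the booking status remains unchanged. The probabilities of these events sum to one.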
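The recursion can be sketched as a small backward induction in Python. The sketch rests on an interpretation of Eq. 3: in each period it takes the expectation over three mutually exclusive events (a booking of slot type n, an arrival without a request, and no arrival), and it assumes a constant arrival rate. It enumerates all offer vectors, which is only practical for small instances. The name `solve_sdp`, its arguments, and the example values (including the upper bound b) are hypothetical.

```python
import math
from functools import lru_cache
from itertools import product


def solve_sdp(capacity, weights, beta, a, b, alpha, tau, r=1.0, w0=0.0):
    """Return (v_tau(capacity), policy) with policy[(t, s)] = best offer vector."""
    n_types = len(capacity)
    policy = {}

    def probs(offer):
        # Request probabilities as in Eq. 2b.
        penalty = beta * max((sum(offer) - a) / (b - a), 0.0)
        attract = [o * math.exp(w) for o, w in zip(offer, weights)]
        no_choice = math.exp(w0 + penalty)
        denom = sum(attract) + no_choice
        return [x / denom for x in attract], no_choice / denom

    @lru_cache(maxsize=None)
    def v(t, s):
        if t == 0:  # day of service: no further revenue
            return 0.0
        best_value, best_offer = -math.inf, None
        stay = v(t - 1, s)  # value if the booking status does not change
        for offer in product(*(range(s_n + 1) for s_n in s)):
            p, p0 = probs(offer)
            booked = sum(
                p[n] * (r + v(t - 1, s[:n] + (s[n] - 1,) + s[n + 1:]))
                for n in range(n_types) if offer[n] > 0
            )
            value = alpha * (booked + p0 * stay) + (1 - alpha) * stay
            if value > best_value + 1e-12:  # keep the first maximizer on ties
                best_value, best_offer = value, offer
        policy[(t, s)] = best_offer
        return best_value

    return v(tau, tuple(capacity)), policy


# Hypothetical instance: 3 preferred and 2 less preferred slots, 10 periods.
value, policy = solve_sdp((3, 2), (2.0, 1.0), beta=1.35, a=1.25, b=5.0, alpha=0.5, tau=10)
print(value, policy[(10, (3, 2))])
```

Memoizing on the pair (t, s) keeps the number of evaluated states at \(\tau \cdot \prod _{n}\left(\kappa _{n}+1\right)\), but the inner enumeration of offer vectors still grows quickly with the number of slot types.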

5 Generalizable Insights

5.1 Policy

We found no general optimal static rule for the blocking action. The optimal policy on whether and which slots per slot type to block is state-dependent. For the special case of a single slot type, we can prove the following (see Appendix F).

Theorem 1:

It is optimal to minimize the no-choice probability in each period for an appointment system with one single slot type.

In a nutshell, we show that the problem is reduced to a single dimensional dynamic program in each period.
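For a single slot type, the decision described in Theorem 1 can be illustrated with a few lines of Python: the provider only chooses how many of the s available slots to offer, and picks the number that minimizes the no-choice probability from Eq. 2b. The function names and the parameter values (a = 1, b = 6, w = 1) are assumptions for illustration only.

```python
import math


def no_choice_probability(o, w, beta, a, b, w0=0.0):
    """P_0 from Eq. 2b when o identical slots with weight w are offered."""
    penalty = beta * max((o - a) / (b - a), 0.0)
    no_choice = math.exp(w0 + penalty)
    return no_choice / (o * math.exp(w) + no_choice)


def best_offer_size(s, w, beta, a, b, w0=0.0):
    """Number of slots in 1..s that minimizes P_0 (0 if nothing is available)."""
    if s == 0:
        return 0
    return min(range(1, s + 1), key=lambda o: no_choice_probability(o, w, beta, a, b, w0))


# Hypothetical example with six available slots and increasing effect strength.
for beta in (0.0, 1.0, 6.0):
    print(beta, best_offer_size(6, 1.0, beta, a=1.0, b=6.0))
```

In this example, all slots are offered without a quality inference effect, one slot is blocked for a moderate effect, and only a single slot is offered for a very strong effect.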

5.2 Necessary Condition for Blocking Being an Optimal Action

We next show that blocking is beneficial if the impact of the quality inference effect is beyond a threshold value (i.e., if the effect is sufficiently strong). Consider two available slots of different slot types and two remaining periods t and t−1. We define \(\Updelta V\left(\vec{s}\right)\) as the difference of the overall expected profit with the blocking of one slot in period t compared to the overall expected profit without blocking at all, see Eq. 4. If \(\Updelta V\left(\vec{s}\right)> 0\), it is beneficial to block at least one slot temporarily. We show in Appendix E that \(\Updelta V\left(\vec{s}\right)> 0\) always holds if β exceeds threshold value \(\tilde{\beta }\).

$$\Updelta V\left(\vec{s}\right)=V\left(\vec{s}\mid \vec{s}=\left(1{,}1\right),o_{1}=0,o_{2}=1\right)-V\left(\vec{s}\mid \vec{s}=\left(1{,}1\right),o_{1}=o_{2}=1\right)$$
(4)

In Appendix E, we analyze the properties of \(\Updelta V\left(\vec{s}\right)\) analytically for \(\alpha _{1}=\alpha _{2}=1\). \(\Updelta V\left(\vec{s}\right)\leq 0\) holds if quality is not inferred from the booking status, i.e., \(\beta =0\). Intuitively, without a quality inference effect, blocking increases the cumulated no-choice probability in both periods, leading to lower expected profits. Since \(\Updelta V\left(\vec{s}\right)> 0\) for \(\beta \rightarrow +\infty\), it follows that blocking is beneficial if β is sufficiently large \(\left(\beta > \tilde{\beta }\right)\).

In case more slots are available (\({\sum }_{i=1}^{N}s_{i}> 2\)), we refer to our stochastic dynamic program in Sect. 4. We further note that the booking system will reach \({\sum }_{i=1}^{N}s_{i}=2\) in expectation if sufficient periods are left and enough potential customers request slots.

6 Numerical Study

6.1 Setup

We first analyze in which scenarios blocking is beneficial when making optimal decisions (see Sect. 6.2). Since the state space of the Markov decision process grows exponentially in the number of slot types, we further test three decision rules in Sect. 6.3.

We let the overall capacity vary from three to six (\(\kappa =3,\ldots ,6\)), with a fixed number of two slot types. The time horizon is kept proportional to the system size, \(\hat{\tau }\in \left\{2\kappa ,3\kappa \right\}\). Without loss of generality, we assume a revenue of \(r=1\). We vary the coefficient of the quality inference effect, \(\beta \in \{0,0.5,1,1.5,3,6\}\). Besides the value of zero, which means that there is no impact, we test a value for a very small impact (0.5), a small and a moderate value around the value of the survey (1 and 1.5), a relatively high impact (3), and a very high impact (6). The constant arrival rate of a potential customer is set to rather small (\(\alpha =20{\%}\)), moderate (\(\alpha =50{\%}\)), and rather high (\(\alpha =80{\%}\)) levels.

Our survey results show that customers only infer quality from the booking status if the number of offered slots exceeds a certain value a (i.e., the lower bound in Eq. 2b). We set the lower bound relative to the number of available slots while imposing a minimum of one, i.e., \(a=\max \left\{0.25\cdot \kappa ;1\right\}\). Further, we consider two slot types. Slot type \(n=1\) contains all preferred slots (in the following \(w_{h}\)), whereas slot type \(n=2\) contains the less preferred slots (in the following \(w_{l}\)). We consider the levels \(w_{h}\in \{1{,}2,5\}\), while keeping \(w_{l}=1\). This results in instances where slots are homogenous to the customer (\(w_{h}=w_{l}\)), in other words, where we have one single slot type, and instances with heterogeneous slot preferences (\(w_{h}> w_{l}\)). Note that \(w_{0}\) (no-choice) is normalized to 0. Besides varying the overall capacity and the weights, we build different ratios of the capacities of the two slot types. The assignment of the slots to the two slot types follows a typical time-of-day preference distribution, in which appointments in the morning, at noon, and after work are preferred to appointments in the mid-morning and in the afternoon. Accordingly, we get \(\kappa _{1}=3\) and \(\kappa _{2}=2\) for a system with \(\kappa =5\) appointments. The capacities per slot type of all considered system sizes are given in Table 4.

Table 4 Capacity of high-weighted slots (\(w_{h}\)) vs. low-weighted slots (\(w_{l}\)), in absolute values

We explored a total of 936 instances in our full factorial design. The numerical investigation took place on a workstation equipped with an x64-based Intel processor, 256 GB RAM, and a 3.70 GHz CPU boasting ten cores. Coding for this study was executed in MATLAB R2022a, making use of the parallel-computing add-on from MathWorks. Access to the research data, inclusive of the codes, is available through the provided DOI https://doi.org/10.25592/uhhfdm.14387.
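The size of the full factorial design can be reproduced with a short enumeration. The sketch below is a reading of the setup, not the authors' script: it assumes that β runs over the 13 values 0, 0.5, ..., 6 (the grid named in the sensitivity analysis), because this grid, combined with four capacities, two horizon factors, three arrival rates, and three weight levels, yields 936 instances. The dictionary keys are hypothetical names.

```python
from itertools import product

kappas = [3, 4, 5, 6]                  # overall capacity
horizon_factors = [2, 3]               # tau = 2 * kappa or 3 * kappa
betas = [0.5 * k for k in range(13)]   # assumed grid 0, 0.5, ..., 6
alphas = [0.2, 0.5, 0.8]               # constant arrival rates
w_highs = [1, 2, 5]                    # weight of preferred slots, w_l = 1

instances = [
    {
        "kappa": kappa,
        "tau": factor * kappa,
        "beta": beta,
        "alpha": alpha,
        "w_h": w_h,
        "w_l": 1,
        "a": max(0.25 * kappa, 1.0),   # lower bound as defined in the setup
    }
    for kappa, factor, beta, alpha, w_h in product(
        kappas, horizon_factors, betas, alphas, w_highs
    )
]

print(len(instances))
```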

6.2 Results: Optimal Dynamic Policy

Table 5 reports the most interesting extract of the percentage improvement when the optimal blocking strategy is followed. We find that blocking becomes more beneficial with a greater impact of the quality inference effect. Blocking is also more beneficial when customers are homogenous in their slot weights.

Table 5 Percentage improvement (\(+\Updelta \mathrm{{\%}})\) of our SDP compared to the baseline without blocking with varying quality effect (β) by slot weight (\(w_{h}\)) and arrival rate (α)

Unsurprisingly, blocking is more beneficial the stronger the quality inference effect β. Finally, we observe that blocking is particularly beneficial with relatively low arrival rates or an increasing penalty for denying a request. The number of slots (3 to 6 slots) has only a minor impact, with average performance losses of not blocking ranging between 2% (6 slots) and 6% (3 slots). Similarly, because the number of slots and the time horizon are coupled in our numerical study, we observe a rather minor impact of the latter, with performance differences ranging between 2% (\(\tau =18)\) and 7% (\(\tau =6)\). Figure 3 shows on the left-hand side a strictly convex relationship between the strength of the quality inference effect and the performance differences when averaging over all instances. The right-hand side of Fig. 3 shows that the benefits of blocking cannot be explained by an improvement in the profit from the optimal blocking strategy. Instead, the no-blocking strategy becomes increasingly inappropriate and leads to a considerable decline in the expected profit. Intuitively, if the quality inference effect becomes stronger, the customers’ likelihood of not choosing an appointment (no-choice) increases. Blocking slots is an effective countermeasure. It keeps expected profits at a steady level by also keeping the no-choice probability at a steady level. Overall, we observe an s-shaped effect on expected profits of blocking slots vs. no blocking (left side, solid line).

Fig. 3 Convex relationship between quality inference effect β and the relative benefit of blocking (a) and absolute expected profit differences (b)

We conclude that with a noticeable quality inference effect (\(\beta > \hat{\beta }\)) and without blocking, the willingness of uninformed potential customers to visit the service provider tends to be low if the schedule is (almost) empty. With the lack of demand, uninformed potential customers in later periods also face an empty schedule, again leading to absent demand. If informed customers do not book a slot, the schedule stays empty. A possible countermeasure against the phenomenon is the blocking of specific slots to trigger demand.

To give applicable decision-support and to cope with the curse of dimensionality with an increasing number of slot types, we developed decision rules.

6.3 Decision Rules

6.3.1 Definition

We present three rules and test their performance in a numerical study. For the comparison, we use the same setup as described in Sect. 6.1.

We propose two rules that are simple to implement, while the third is more complex. The two simple decision rules both rely on one assumption: When blocking a slot, the multinomial logit framework adds the choice probability of this slot proportionally to the remaining options (“a rising tide lifts all boats”, see Kennedy 1963). The more (less) preferred the blocked slots, the higher (lower) the portion added to remaining options, and, most importantly, to the outside-option (no-choice option). Thus, if a preferred slot is blocked, the no-choice probability increases more than if blocking a less preferred slot. Therefore, the rules focus on the blocking of less preferred slots rather than of more interesting slots (in case of several slot types).

(1) Rlow blocks only less preferred slots, i.e., those with the lower slot weight (\(w_{l}\)). All preferred slots (with the higher slot weight) are always offered to the potential customer. If at least one preferred slot is yet unbooked in period t, all less preferred slots are blocked. As soon as all preferred slots are booked, the less preferred slots are unblocked successively, one at a time.

(2) Rone offers one slot at a time. With different slot types, the slot weights are decisive for the order. First, the preferred slots are successively offered. When all preferred slots are booked, the less preferred slots are offered, again one at a time. Note that we consider offering one slot at a time because \(a=\max \left\{0.25;1\right\}=1\) in all instances. A higher lower bound would increase the number of offered slots proportionally.

(3) Rmyopic is the more complex decision rule. In each period, it myopically checks all possible blocking options for the current booking status and chooses the one with the lowest probability for the outside option. For homogenous slot types, this is the optimal policy (see Theorem 1).
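The three rules can be written as functions that map a booking status to an offer vector. The following Python sketch is one possible reading of the verbal definitions for two slot types, with the preferred type listed first. The function names and the parameter values in the example (in particular the upper bound b) are hypothetical.

```python
import math
from itertools import product


def no_choice_probability(offer, weights, beta, a, b, w0=0.0):
    """P_0 from Eq. 2b for an offer vector."""
    penalty = beta * max((sum(offer) - a) / (b - a), 0.0)
    attract = sum(o * math.exp(w) for o, w in zip(offer, weights))
    no_choice = math.exp(w0 + penalty)
    return no_choice / (attract + no_choice)


def rule_low(s):
    """Rlow for s = (preferred, less preferred): block all less preferred slots
    while a preferred slot is unbooked, then release them one at a time."""
    if s[0] > 0:
        return (s[0], 0)
    return (0, min(1, s[1]))


def rule_one(s):
    """Rone: offer exactly one slot, taken from the most preferred type available."""
    offer = [0] * len(s)
    for n, s_n in enumerate(s):
        if s_n > 0:
            offer[n] = 1
            break
    return tuple(offer)


def rule_myopic(s, weights, beta, a, b, w0=0.0):
    """Rmyopic: among all offer vectors o <= s, pick one with minimal P_0."""
    candidates = product(*(range(s_n + 1) for s_n in s))
    return min(candidates, key=lambda o: no_choice_probability(o, weights, beta, a, b, w0))


# Hypothetical booking status: two preferred and three less preferred slots left.
s = (2, 3)
print(rule_low(s), rule_one(s), rule_myopic(s, (5.0, 1.0), beta=6.0, a=1.0, b=5.0))
```

Rlow and Rone need only the booking status, whereas Rmyopic additionally requires estimates of the slot weights and of β, which is the sense in which it is the more complex rule.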

6.3.2 Results: Decision Rules

Figure 4 compares the performances of the control strategies with increasing difference of the slot weights (from left to right). First of all, it is important to note that simple rules like Rlow and Rone may be worse than not blocking at all for low β-values. Only the myopic rule performs well over all instances. In line with our analysis of the optimal rule in Sect. 6.2, we observe that blocking has the greatest influence when slot weight differences are small, i.e., the highest deviations captured in Fig. 4 shift to the right with increasing slot weight differences. With a high slot weight difference (\(w_{h}=5\)), all three decision rules show only minor deviations from the optimum.

Fig. 4 Performance of decision rules by slot weight \(w_{h}\in \{1{,}2,5\}\)

Myopic Rule:

We observe that the myopic rule performs very well, with an overall percentage difference of \(< 0.01{\%}\) to the optimum. It finds the optimal strategy for homogenous slots (\(w_{h}=w_{l}=1\), see Theorem 1) and for the extreme cases \(\beta =0\) and \(\beta =6\). In some instances, with unequal slot weights, the myopic rule does not detect the optimal (time-dependent) strategy. Intuitively, the trade-off for blocking a slot is (a) decreasing the no-choice probability in a given period vs. (b) decreasing the likelihood that a potentially blocked slot is offered during the remaining time. The myopic rule disregards the latter part and tends to block slots too early when there are sufficient periods left until the day of service. In Appendix G, we present an example with two slot types and a system size of three slots for which the myopic rule does not find the optimal solution. In general, the myopic rule starts blocking (less preferred) slots from the very beginning if this minimizes the no-choice probability myopically. However, by blocking a less preferred slot, we eliminate the potential occasion to book that slot by chance and face a relatively high no-choice probability when having the less preferred slot in the offer set during the remaining periods. This reasoning carries only a little weight close to the day of service (because there are only a few periods left with relatively high no-choice probabilities). It neither holds if slots are weighted equally (because there are no relatively high or low no-choice probabilities in later periods, see also proof of Theorem 1).

Rone:

In settings with a very strong effect, offering one slot at a time while blocking the remaining unbooked slots also reaches the optimal expected profit, as does the myopic rule. Note that for differently weighted slots, this means that the more preferred slots should be offered successively first, followed by the less preferred slots. It should also be noted that Rone does not perform well for low and moderate β-values.

Rlow:

For intermediate levels of β-values, it is worth considering the decision rule that blocks the less preferred slots because of its easy implementation, even if the myopic rule is dominant. At some point, it is better to block slots instead of offering all unbooked slots. A closer look at Fig. 4 reveals comparably good performance for a β range of 1.5 to 2.

Figure 5 summarizes our recommendations for action. We consider scenarios in which the slots are homogenously interesting to customers (low slot weight differences) and in which some slots are highly preferred to others (high slot weight differences). We further vary the impact of the quality inference effect on the customer choice. Whenever decision rules lead to the same result or marginally different results, we choose the rule that is easier to implement. Alternative strategies that are easier to implement but have a slightly lower performance are added in brackets.

Fig. 5 Action recommendation regarding the strength of the effect (QIE) and the difference of the slot preferences

For a very small impact of the effect (\(\beta \leq 0.5\)), blocking cannot be recommended in general, regardless of the slot weight difference. For a small impact of the effect (\(\beta =1\)), as in our survey, the myopic rule outperforms no-blocking. At intermediate levels in the β range of 1.5 to 2, Rlow can be considered as an alternative to the myopic rule. Only if the quality inference effect is very strong should Rone be considered as an alternative.

We finally note that a decision rule that primarily blocks prioritized slots (reversed order in Rone regarding the order of slot types that are to be offered) is dominated in our numerical study by a strategy that first blocks less preferred slots. Intuitively, the main purpose of blocking slots is to trigger demand. Yet, if preferred slots are blocked, the no-choice probability increases more than if a less preferred slot is blocked.

6.3.3 Sensitivity Analysis

It is likely that decision makers in practice need to form a directional belief (low, intermediate, or high quality inference effect) when making the fundamental decision to block or not to block slots. To this end, it appears important to assess the downside of overestimating or underestimating the strength of the effect.

To investigate the influence of an imprecisely estimated quality inference effect, we evaluate the blocking decisions as follows. We introduce \(\beta _{a}\) (a: actual size of effect) and \(\beta _{e}\) (e: estimated size of effect). Our previous analysis is nested with \(\beta =\beta _{e}=\beta _{a}\). Underestimation is formalized by \(\beta _{e}< \beta _{a}\) and overestimation by \(\beta _{e}> \beta _{a}\). We assume that the service provider makes the blocking decision that would be optimal in period t given state s if the quality effect were \(\beta _{e}\) and denote this decision as \(\vec{o}_{\left(t,s\middle| \beta _{e}\right)}\). We denote the corresponding expected profits for an actual quality inference effect of \(\beta _{a}\) by \(V_{\tau ,{\beta _{e}}}(s| \beta _{a})\). We calculate the performance loss caused by a difference between the estimated and the actual quality inference effect by \(\Updelta \left[{\%}\right]=\left(V_{\tau ,{\beta _{e}}}\left(s|\beta _{a}\right)-V_{\tau ,{\beta _{a}}}\left(s|\beta _{a}\right)\right)/V_{\tau ,{\beta _{a}}}\left(s|\beta _{a}\right)\). We again test values \(\beta _{e},\beta _{a}\in \left\{0{,}0.5{,}1,\ldots ,6\right\}.\)
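The evaluation can be sketched in Python in two passes: first, the offers are planned as if the effect were the estimated value; second, these offers are evaluated under the actual value. The sketch assumes a constant arrival rate and the same three-event reading of the recursion as in Sect. 4 (booking, arrival without request, no arrival). All function names and example values are hypothetical.

```python
import math
from functools import lru_cache
from itertools import product


def value_of_plan(capacity, weights, beta_plan, beta_true, a, b, alpha, tau, r=1.0, w0=0.0):
    """Expected profit when offers are optimized for beta_plan but demand follows beta_true."""

    def period_value(offer, t, s, value_fn, beta):
        # One-period expectation with choice probabilities from Eq. 2b under beta.
        penalty = beta * max((sum(offer) - a) / (b - a), 0.0)
        attract = [o * math.exp(w) for o, w in zip(offer, weights)]
        no_choice = math.exp(w0 + penalty)
        denom = sum(attract) + no_choice
        stay = value_fn(t - 1, s)
        booked = sum(
            attract[n] / denom * (r + value_fn(t - 1, s[:n] + (s[n] - 1,) + s[n + 1:]))
            for n in range(len(s)) if offer[n] > 0
        )
        return alpha * (booked + no_choice / denom * stay) + (1 - alpha) * stay

    @lru_cache(maxsize=None)
    def plan(t, s):
        # (believed value, offer) of the decision that is optimal if the effect were beta_plan.
        if t == 0:
            return 0.0, None
        best = (-math.inf, None)
        for offer in product(*(range(s_n + 1) for s_n in s)):
            val = period_value(offer, t, s, lambda u, z: plan(u, z)[0], beta_plan)
            if val > best[0] + 1e-12:
                best = (val, offer)
        return best

    @lru_cache(maxsize=None)
    def realized(t, s):
        # Profit actually earned by the planned offers when the true effect is beta_true.
        if t == 0:
            return 0.0
        return period_value(plan(t, s)[1], t, s, realized, beta_true)

    return realized(tau, tuple(capacity))


def misestimation_loss(capacity, weights, beta_e, beta_a, a, b, alpha, tau):
    """Relative loss of planning with beta_e when the actual effect is beta_a."""
    benchmark = value_of_plan(capacity, weights, beta_a, beta_a, a, b, alpha, tau)
    actual = value_of_plan(capacity, weights, beta_e, beta_a, a, b, alpha, tau)
    return (actual - benchmark) / benchmark


# Hypothetical instance: underestimation (0 vs. 6) and overestimation (6 vs. 0).
print(misestimation_loss((2, 1), (1.0, 1.0), 0.0, 6.0, a=1.0, b=3.0, alpha=0.5, tau=6))
print(misestimation_loss((2, 1), (1.0, 1.0), 6.0, 0.0, a=1.0, b=3.0, alpha=0.5, tau=6))
```

By construction, the loss is never positive, since the benchmark plans with the actual effect.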

On an aggregated level over all instances, we observe that the consequences of an underestimation are with an average of \(\overline{\Updelta }_{{\beta _{a}}> {\beta _{e}}}=-10.14{\%}\) much more severe than an overestimation with \(\overline{\Updelta }_{{\beta _{a}}< {\beta _{e}}}=-1.38{\%}\). Figure 6 illustrates the increasing marginal losses resulting from estimation errors, particularly pronounced when there is an underestimation of the quality inference effect. We note, though, with \(\beta _{a}=1.35\) estimated from our survey data and an assumed relevant range of \(\beta _{a}-\beta _{e}\in \left[-1.5;1.5\right]\), the performance losses exhibit a more symmetric development.

Fig. 6 Performance effects of over- and underestimation of quality inference effect

Finally, we analyze the risk profile associated with the binary decision of whether to block or not. Figure 7 plots the performance difference between optimal blocking and not blocking at all. Negative values in the plot signify that blocking leads to a performance loss relative to not blocking at all.

Fig. 7 Performance difference “Blocking” vs. “No Blocking”

We observe that with a non-existent or very weak quality inference effect (\(\beta _{a}\leq 1)\), the blocking strategy appears risky, and the exposure is high compared to the potential gains. For a \(\beta _{a}\) level as estimated from our survey data, the potential gains and losses appear balanced, while for strong quality inference effects, there is no downside risk.

7 Discussion and Limitations

Our survey results replicate the results of Kaluza et al. (2023) and provide further evidence that customers infer quality from the booking status of a dentist’s online appointment system. We focus on showing that the booking status, when it is the only source of information, impacts quality perceptions and, thereby, the demand for a service. This fact gives managerial leeway to stimulate demand by offering only a subset of the available slots. Kaluza et al. (2023) investigate the quality and scarcity effects in more detail. They find a stronger quality inference effect of the booking status on demand for non-standardized services such as hairdressing or medical treatment. In turn, for rather standardized services such as applying for a new identity card or obtaining certified copies, the quality effect is not present. Kaluza et al. (2023) also find indications of a scarcity effect, i.e., it might be more attractive to offer more slots than available in highly utilized systems. While the service provider typically has limited leeway to increase capacity on an operational level, it might be worth considering how the exact design of the appointment system’s user interface (e.g., greying out booked slots vs. not showing them at all) impacts behavior.

In our conceptual study, we examine the opportunities and threats associated with actively managing appointment schedules. We make several key assumptions: (a) we focus exclusively on the multinomial logit (MNL) model, (b) we concentrate on a linear effect regarding the number of available appointments and their impact on choice probabilities, (c) we disregard additional factors that impact the booking of a service (e.g., star-ratings, distance, etc.), and (d) we consider only time-invariant behavioral effects.

  a. Our MNL modeling approach assumes a homogeneous group of uninformed customers, unaware of the provider’s service quality. Other discrete choice models, such as a mixed logit model for heterogeneous customer groups, are left for future research. In this context, whether customer subgroups can be characterized and identified by their browsing behavior or profile information in booking software presents an interesting research avenue.

    With an MNL, we assume that the independence of irrelevant alternatives (IIA) holds, i.e., the ratio of the choice probabilities between two alternatives (slots) is independent of other slots being offered. In our setup, when an appointment of a slot type with low attractiveness is eliminated from the patient’s choice set, the IIA assumption leads to an increase of the cumulated choice probabilities for slot types with higher attractiveness and the outside option. The increased choice probability for the outside option is de-biased by the quality inference effect (see Eq. 2b). The cumulated choice probabilities for slot types with higher attractiveness may be overestimated. However, since each slot type provides identical revenues, we believe that our numerical analysis concerning expected profits is robust to other choice model formulations.

  b. Concerning the functional relationship, non-linear effects as shown in Kaluza et al. (2023) can be easily considered in numerical studies. We expect our main directional insights concerning the strength of the quality inference effect and the heterogeneity of the attractiveness of slot types to hold under other formulations of this utility component.

  c. Other factors that impact choice probabilities, such as ratings on portals, the distance to the service provider, individual preferences regarding the day or time of service, or the sex or age of the service provider, can be incorporated into the random utility framework. Kaluza et al. (2023) show that in the presence of other quality signals, particularly star-ratings, the number of free slots appears to be of minor importance for the service choice on an aggregate level. However, the relative importance of the booking status varies strongly, and the authors conclude that actively managing the appointment offers with the goal of signaling quality is of minor importance when customers who attach a relatively high importance to the booking status cannot be targeted individually or when other quality information with a strong impact (such as a star-rating) is available.

  d. We further note that different set-ups for the slot weights and slot types may lead to different recommendations in detail, but we assume that the overall action recommendation for the blocking strategy is not impacted significantly. Moreover, we assumed both the strength of the quality inference effect and the slot weights to be exogenous and time independent. Further research might relax this assumption and provide further empirical evidence on time-dependent effects (e.g., a customer booking well in advance expects more appointment slots to be available and is less prone to infer quality than a customer who books in close temporal proximity to the day of service).

For our modeling framework, we followed the seminal study of Gupta and Wang (2008). We modeled a Markov decision process for one exclusive provider but empirically investigated the choice between two competitors. Our modeling is implicitly based on the assumption that potential customers learn from previous observations during a search process. The customer observes the offer set of one provider and decides whether to book one of the appointments or to search for another provider. In our model, we assume that customers are at a certain point within this process but abstract from the comparison between several providers. Including this search process (i.e., Eq. 6 in Appendix D) in the model and the empirical study is left for future research. In this context, it appears interesting to investigate the strength of the quality inference effect if benchmarks to other providers are missing. Similar to Gupta and Wang (2008), the Markov decision process could be extended to several providers in one system. A company with several providers, e.g., a clinic with several doctors, can still use our approach by having one system per provider. We leave the extension of our approach to include several providers for future research. Further, the arrival rate of customers during the booking horizon was held constant for simplicity, even though arrival rates that increase towards the planned workday appear more realistic and can be further analyzed in numerical studies. We also did not consider no-shows, cancellations, and delays. Instead, we assumed that a customer who requests a slot arrives in time for the service. Considering no-shows and cancellations may lead to overbooking: more slot requests may be accepted because the service provider does not expect all accepted customers to show up. This also means that more slots than available may need to be offered. Delays, in contrast, may slow service provision and thus cause overtime on the day of service, which would reduce or even eliminate the aforementioned effects. Zacharias and Pinedo (2014), for example, include no-shows in their overbooking model for appointment planning, and Kong et al. (2020) consider time-dependent no-shows in their distributionally robust model. See, for example, Hall (2006) for the consideration of delays.

Finally, the close relationship to the revenue management literature with parallel flights (see Zhang and Cooper 2005) presents an interesting body of knowledge that might also inform the management of appointment systems. In particular, the myopic rule leaves room for investigation. We found that the myopic view is optimal when there is a single slot type (homogeneous slots). However, we also found some critical instances for which it is time-dependent whether slots should be blocked or not, regardless of whether blocking minimizes the no-choice probability of the respective period. As long as enough periods are left until the day of the appointment, blocking slots is not recommended. This is no longer in line with the myopic view, even though the myopic rule comes close to the optimal expected profit. Adapting the approximation schemes presented in Zhang and Cooper (2005, Sect. 7) appears to be an interesting research direction.

It is important to note that service providers do not have the right to deceive customers online in, e.g., Germany by pretending scarce supply (§ 5 I UWG in conjunction with § 3 UWG; see Footnote 6), and similar laws might be in place in other jurisdictions. However, this law does not oblige service providers to announce all conceivable appointments at each point in time. Still, the instrument of blocking time slots should be used carefully so as not to evoke negative long-term consequences by pretending to be more heavily booked than actual demand warrants. Offering a subset of the available appointments to a customer should only be a preselection of time slots. Asking for time preferences could be a helpful instrument to offer an interesting subset instead of deceiving customers by pretending to be well booked.

In this regard, the long-term consequences of offering only a subset of the available appointments are not considered in our approach. Customers who book a seemingly popular service provider but are disappointed by an underutilized system (as might be the case, e.g., in restaurants, because this impacts the atmosphere in the location) might share their negative experience, with negative long-term consequences. However, if the utilization of the system and the actual service are unrelated (e.g., the quality of the physician’s treatment regardless of the utilization), negative long-term consequences are not expected.

8 Conclusion

We analyze an online appointment system as a Markov decision process that aims at maximizing the service provider’s expected profit. In a survey, we identify that an extensive set of available appointments leads to significantly less demand because customers infer a lower quality of the service (observational learning from previous customer choices). We capture this quality inference effect in a discrete choice model and provide quantitative decision support and qualitative insights on which time slots should be offered during the booking horizon.

We analyzed the benefits of blocking and releasing time slots in a numerical study. Intuitively, the benefits are larger the more strongly customers react to underutilized service providers. We present three decision rules, since solving the problem to optimality (in realistic dimensions) is computationally expensive. The myopic rule, which chases the minimum no-choice probability in each period, performs very well and is optimal for equally weighted slots. Simpler decision rules perform reasonably well if the quality inference effect is sufficiently strong.
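To make the idea of the myopic rule concrete, the following minimal sketch enumerates all offer sets for one period and picks the one with the lowest no-choice probability. It is an illustration under stated assumptions, not the choice model estimated in this study: the multinomial-logit-style form, in which the attractiveness of not booking grows with the number of offered slots, as well as the names `no_choice_probability`, `myopic_offer_set`, `v0`, and `beta`, are hypothetical.

```python
from itertools import combinations
from math import exp


def no_choice_probability(offered_weights, v0=1.0, beta=0.3):
    """No-choice probability for one offer set.

    ASSUMPTION (illustrative only): an MNL-style form in which the
    outside option's weight v0 * exp(beta * |S|) grows with the number
    of offered slots |S|, mimicking a quality inference effect.
    """
    if not offered_weights:
        return 1.0  # nothing offered, so the customer cannot book
    outside = v0 * exp(beta * len(offered_weights))
    return outside / (outside + sum(offered_weights))


def myopic_offer_set(weights, v0=1.0, beta=0.3):
    """Return the offer set (and its no-choice probability) that
    minimizes the no-choice probability in the current period.

    `weights` maps each still-available slot to its (hypothetical)
    slot weight. Brute-force enumeration: suitable for small examples.
    """
    slots = sorted(weights)
    best_set, best_p0 = (), 1.0
    for k in range(1, len(slots) + 1):
        for subset in combinations(slots, k):
            p0 = no_choice_probability(
                [weights[s] for s in subset], v0, beta
            )
            if p0 < best_p0:  # strict: ties keep the first set found
                best_set, best_p0 = subset, p0
    return best_set, best_p0


# Hypothetical example with three equally weighted slots.
available = {"09:00": 1.0, "10:00": 1.0, "11:00": 1.0}
weak_effect = myopic_offer_set(available, beta=0.0)
strong_effect = myopic_offer_set(available, beta=1.0)
```

Under this assumed form, a weak effect (`beta=0.0`) favors offering all slots, whereas a strong effect (`beta=1.0`) favors offering a single slot, which mirrors the intuition that blocking pays off only if customers react strongly to high availability.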

Finally, we highlight that our conceptual study aims at showing the high-level effects of signaling quality via the booking status and/or neglecting poor quality signals from high appointment slot availability. We further note that the empirical evidence so far suggests a rather minor aggregate effect of the booking status on demand in hypothetical choice situations. The actual quality inference effect estimated from our stylized survey indicates a rather low economic relevance. Considering other quality signals (such as star ratings) further diminishes the relevance of the booking status, as shown in Kaluza et al. (2023). On the other hand, our sensitivity analysis highlights relatively higher performance losses when underestimating the quality inference effect (e.g., assuming there is no effect although there is one) than when overestimating it. It is therefore important to note that offering fewer appointment slots to signal service quality should only be considered if management expects a high relevance of the booking status for their uninformed customers. The implementation should be carefully evaluated, e.g., with A/B testing.
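As a rough illustration of such an evaluation, the following sketch compares the booking rates of two variants (e.g., all slots offered versus a blocked subset) with a pooled two-proportion z-test. The choice of test, the function name `two_proportion_z_test`, and the counts in the example are assumptions for illustration and are not taken from this study.

```python
from math import erf, sqrt


def two_proportion_z_test(bookings_a, visitors_a, bookings_b, visitors_b):
    """Pooled two-proportion z-test for the difference in booking rates.

    Returns the z statistic (variant B minus variant A) and the
    two-sided p-value under the normal approximation.
    """
    rate_a = bookings_a / visitors_a
    rate_b = bookings_b / visitors_b
    pooled = (bookings_a + bookings_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF expressed via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical counts: variant A offers all slots, variant B blocks some.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

The normal approximation is only reasonable for sufficiently large samples; for small booking counts, an exact test may be more appropriate.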