Who’s calling? The effect of phone calls and personal interaction on tax compliance

Most tax agencies use letters as the method of communicating with taxpayers. Still, other technologies exist that could be more effective. This paper reports the results of a field experiment conducted by the National Tax Agency of Colombia (DIAN), using phone calls to reduce tax delinquencies. DIAN randomly assigned 34,000 tax debtors to a phone call operation using a fixed script to communicate existing debts and invite taxpayers to a meeting at the local tax agency office. Phone calls were very effective to increase collection of unpaid taxes. Conditional on the phone call being made, the effect on the treatment is about 25 percentage points higher than the control group (about a fivefold increase). We also find suggestive evidence that the personal interaction seems to be an important channel for explaining taxpayers’ behavior. Faced with a tax agent, taxpayers tend to commit to attending the meeting and paying the tax owed. However, many taxpayers who commit do not make payment effective. The findings complement a nascent literature that shows that there are plenty of gains from innovating in the communication strategy. They also indicate that personal interactions are important, but they have to be paired with easy-to-follow and immediate actions. Paying taxes is easier said than done.

taxpayer to attend a private meeting at the tax agency to resolve any doubts or raise any questions they thought necessary. This iterative process, which was aimed at increasing the impact of the intervention, provides additional insights into the role of reminders and plan-making interventions (Rogers et al. 2015).
In 2014, the agency randomized the phone calls intervention in a campaign to increase tax collection from outstanding liabilities. Around 34,000 entities (firms or individuals) with debt on already self-declared but unpaid taxes were assigned randomly either to a phone call treatment or a control group. The content of the message is constant for everybody in the treatment group. During the phone call, an agent delivered a scripted message that contained personalized information about the taxpayer, including the amount and origin of the debt. The agent offered the taxpayer the possibility of setting an appointment at the local office of the tax agency. The objective of the invitation was to increase personal connections and trust in the tax agency (by showing that it was open to discuss any discrepancies and evacuate any concerns the taxpayer might have). As a result of the call, it was expected that the taxpayer would provide a verbal payment agreement or make an appointment to visit the local office. If the taxpayer attended the meeting, the agent asked the taxpayer for a payment agreement. Because of the way the intervention was designed, we can evaluate well the overall effect of the phone calls (and subsequent meetings for those who attended) on payment behavior. Because invitations to the meetings were not randomized, we can only present suggestive evidence about the role of the subsequent steps and the personalized nature of the interaction.
The results show that calling delinquent taxpayers is an effective technology for increasing payment. Being assigned to the treatment increased the probability of making a payment by almost 6 percentage points (doubling the probability in the control group or "doing nothing"). In terms of economic results, on average, the agency recovered about USD$722 PPP per person from the treatment group versus USD$450 PPP from the control group during the campaign.[2] Because there was substantial one-sided noncompliance with the treatment (many people in the treatment group were not contacted), the effect of an effective phone call is 26 percentage points higher than the control group (a fivefold increase). It is important to note that this coefficient falls in between the estimates for the impersonal methods (letter, email) and the personal method (personal visits) in Ortega and Scartascini (2020), as would have been expected. Interestingly, we also find that the payment probability declines with the size of debt [as in Perez-Truglia and Troiano (2018), and Gillitzer and Sinning (2018)], differs according to the type of tax [as in Ortega and Scartascini (2020)], and is lower for firms than for individuals. In this last aspect, phone calls seem to behave more like the impersonal methods (letter, email) than like the personal visits (Ortega and Scartascini 2020).
Because we have information about the several moments in time in which agents and taxpayers interacted, we can look in detail at the role of personal interactions.
Basically, personal interactions seem to affect decision making through "promises" of compliance more than actual compliance. That is, more than 90% of those who received the phone call made an appointment to attend the meeting but less than 70% of them actually showed up. Of those who showed up, about 50% promised to make a payment but only 53% of them actually paid. Therefore, while personal interactions seem to have a large motivational effect for promising to act, actual compliance falls quite substantially when there is a delay between the promise and the moment taxpayers have to actually act. This may show social desirability bias in terms of the responses to the calls or the personal interaction, or it may show procrastination and forgetfulness. We cannot distinguish between them with the data we have available. Still, the pattern provides valuable insights for policy makers. Even with the gap between promises and payments, the effect of the intervention is quite large and much larger than the effect found when using less personalized methods. Even though promises and plan-making are not foolproof, the suggestive evidence indicates that the impact for tax compliance could be substantial (and even larger than in other areas).[3]
This paper adds to the literature on tax compliance by showing that phone calls can be a very effective method for increasing tax compliance. Phone calls have rarely been used by tax agencies but they have been extensively studied in the context of political canvassing. Phone calls made from commercial call centers or robotic calls are not effective at mobilizing voters, while volunteer-made phone calls and interactive scripts are more likely to draw attention from the voter (Arceneaux 2007; Gerber and Green 2001; Imai 2005; Nickerson 2006; Shaw et al. 2012; Ha and Karlan 2009; Ramirez 2005; Nickerson 2007). Similarly, Mann and Klofstad (2015) show that phone banks specialized in political campaigns are more effective than commercial phone banks at mobilizing voters. In line with this literature, our study shows that high-quality and personalized phone calls can deliver strong results, in this case in terms of tax compliance.
The study complements a nascent literature that shows that personalized methods of communication and interaction seem to be more effective than impersonal methods. In the voter mobilization literature, Green and Gerber (2008) report significantly different effects between visits, letters, emails, and text messages. Personal canvassing is the most effective tool for getting people to vote. In the charity fund-raising literature, DellaVigna et al. (2012) show that variations in the campaign methods can help to pin down social pressure as a motivation behind charity donations. At one extreme of personal interaction, Castillo et al. (2015) find that friends are very effective at generating larger donations. In loan repayment for microcredit, Karlan et al. (2012) found that SMS messages are effective if they include reminders about personal relationships with the local bank officer. In the tax compliance literature, results are lining up in the same direction. Ortega and Scartascini (2020) found significant differences across communication methods (letter, email, and personal visit), and Doerrenberg and Schmitz (2015) found differences between visits and letters sent to small firms. In every case, more personal methods were more effective. Our study adds to the existing evidence by exploiting a highly personalized interaction and showing that it affects both what people commit to do (which takes place at the moment of the interaction) and actual behavior (which takes place later and separately from the interaction).
The fact that verbal commitments are substantially higher than actual actions provides insights for several strands of literature. First, it casts an additional layer of skepticism on studies that use survey methods to assess the potential effect of deterrence and tax morale interventions. In this intervention, promises to pay are about two times higher than actual payment. Second, it also adds evidence to the literature on deadlines, reminders, plan-making, and procrastination (Nickerson and Rogers 2010; Milkman et al. 2011, 2012; Vervloet et al. 2012; Rogers et al. 2015; Karlan et al. 2016). In this intervention, taxpayers declared how much they had to pay, they were reminded they had not paid, and they promised they would do it by a certain date (plan-making). Evidence is mostly suggestive given the way the intervention was carried out, but results seem to be very promising. As such, reminders, commitments, and plan-making may help increase tax payments.
The study also provides policy recommendations regarding the value of the interaction, the potential role of phone calls as part of the communication methods portfolio, and the potential effect of reducing the gap between the timing of verbal commitments and actual behavior. The evidence should spark policy discussions regarding which processes may work better if automated and made impersonal, and which ones should not. This paper is organized as follows. Section 2 describes the experiment. Section 3 presents the results and Sect. 4 concludes.

The experiment
With the objective of increasing tax collection and evaluating the effectiveness of phone calls and personalized interactions with the taxpayer, the National Tax Agency of Colombia (DIAN) agreed in March 2014 to randomly assign the method used to contact a sample of 34,783 taxpayers with due liabilities (self-declared but unpaid taxes) on the income tax, wealth tax, and value added tax (VAT), the penalties and interest on those liabilities, and other minor taxes for the period 2009-2014. The amount owed by these taxpayers was equivalent to about 2.5% of the total tax collection from these sources in the 2014 fiscal year.[4] The agency randomized them into a treatment and a control group. Randomization was carried out within strata defined by debt-size terciles. The treatment group was assigned to receive a phone call, while the control group did not receive any notification. Phone calls were made by a professional phone call bank. The individual making the call had to follow a detailed script, which is structured as follows: first, the caller makes contact with the legal representative of the firm or with the individual. Second, she proceeds to remind the taxpayer about the outstanding debts with DIAN, though the specific amount was not mentioned. The caller mentions possible legal and financial sanctions. Third, the caller attempts to schedule an appointment for the taxpayer at the DIAN office, where the taxpayer is offered the possibility to clarify current account delinquencies, resolve any disputes, and arrange a payment. Alternatively, the taxpayer commits to pay by a certain date. At the end of the call, the agent thanks the taxpayer for her time and mentions the campaign slogan "Colombia, a commitment we can't evade."[5] The contents of the script attempt to affect the deterrence and moral suasion channels of compliance.
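The stratified assignment described above can be illustrated with a minimal sketch. It assumes that the treatment share (24,870 of 34,783) was applied equally within each debt tercile, which the text does not state; the function name `assign_within_terciles` and the simulated debts are hypothetical and are not the agency's procedure or data.

```python
import random

def assign_within_terciles(debts, treat_share, seed=0):
    """Return a list of 0/1 treatment flags, randomized within debt terciles."""
    rng = random.Random(seed)
    n = len(debts)
    # Indices ordered from the smallest to the largest debt.
    order = sorted(range(n), key=lambda i: debts[i])
    assignment = [0] * n
    for k in range(3):
        # k-th tercile of the debt distribution (the stratum).
        stratum = order[k * n // 3:(k + 1) * n // 3]
        rng.shuffle(stratum)
        n_treated = round(treat_share * len(stratum))
        for i in stratum[:n_treated]:
            assignment[i] = 1
    return assignment

# Hypothetical debts, drawn only to exercise the function.
debt_rng = random.Random(1)
debts = [debt_rng.lognormvariate(3, 1) for _ in range(900)]
flags = assign_within_terciles(debts, treat_share=24870 / 34783)
```

Because the shuffle happens inside each tercile, the treated share is the same in every stratum by construction, which is why the groups end up balanced on the amount of tax owed.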
Importantly, the script proceeds as a conversation rather than a rigid text, with multiple interactions between the agent and the taxpayer in order to foster personal interaction. Moreover, the invitation to attend the tax agency further emphasizes personal interaction. The call produces two main outcomes: an appointment at the local agency office or a verbal payment commitment. The payment commitment should increase the probability that the taxpayer actually pays, since it combines a reminder with plan-making (Nickerson and Rogers 2010; Milkman et al. 2011, 2012; Vervloet et al. 2012; Rogers et al. 2015; Karlan et al. 2016). In the data set, we can track whether people attended the meeting, the promises they made, and their actual payment behavior. Phone calls were made between April 24 and May 10, 2014. The initial distribution comprised 24,870 subjects in the treatment group and 9913 in the control group, as shown in Table 1. However, the tax agency decided to stop the intervention once 12,853 calls had been made (a little over half of the total originally planned).[6] Out of the universe of taxpayers called, 5267 were contacted (21% of the total assigned to treatment and about 40% of the calls made). We have information about the results of each call. The contact rate (share of effective contacts over the number of calls made) is similar to contact rates from specialized phone banks in recent GOTV interventions (Mann and Klofstad 2015). After being reached by phone, 4957 made an appointment at the local office and 3427 attended this meeting (see Fig. 1).
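The shares quoted in this paragraph follow directly from the reported counts. The short calculation below only restates that arithmetic; the variable names are illustrative.

```python
# Counts reported in the text.
assigned_treatment = 24870
calls_made = 12853
contacted = 5267
appointments = 4957
attended = 3427

def share(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole

calls_share = share(calls_made, assigned_treatment)            # a little over half
contact_rate_assigned = share(contacted, assigned_treatment)   # about 21%
contact_rate_called = share(contacted, calls_made)             # about 41%
appointment_rate = share(appointments, contacted)              # about 94%
attendance_rate = share(attended, appointments)                # about 69%
```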
We present summary statistics in column 1 of Table 2, for the control group. Tax delinquents in our sample have an average tax liability of COL$34 million (≈USD$10k). Most of these debtors owe only one tax payment that has been due for 3 years. Sixty-two percent of delinquents are firms, and the most common sources are debts originating in the income tax and VAT. Table 1 Panel B column 1 shows the payment behavior of the sample of tax delinquents participating in this intervention. In the absence of any campaign, only 5.4% of the individuals with tax delinquencies at the beginning of the year would have made any payment at the end of the year. This is equivalent to recovering 3.2% of the total outstanding debt.
Fig. 1 Agreements, attendance, and compliance. Notes: This figure shows the behavior of those who received the phone calls. The second column shows the share that agreed to the meeting. The third column shows the share that attended the meeting. The fourth column shows the share of people who agreed to pay. The last column shows the payment behavior of taxpayers that followed each one of the paths. % is computed for each stage
This intervention follows a stratified randomized design (terciles of initial debt). Columns 2, 3, and 4 of Table 2 present randomization tests for observable covariates. The groups are balanced in the amount of tax owed since this was the variable used to construct the randomization strata. Groups are also balanced according to the type of taxpayer (firm or individual). There are some imbalances on debt age, type of tax, and the number of debts. Differences are rather small. Moreover, these imbalances (taxpayers in the treatment group have on average debts that are 3.22 years old versus 3.02 years in the control group, and a higher number of missed payments, 1.37 vs 1.32) work against finding any results. As has been shown in the literature, these groups tend to be less affected by any treatment. We include all the variables in Table 2 as controls in the empirical analysis, but it does not affect the results.
Table 2 Notes: Columns (1) and (2) present means and standard deviations (in square brackets) for the covariates by treatment group. Column (3) shows the differences in means computed by OLS regression that includes stratification and district dummies. Standard errors are robust (in parentheses). Column (4) displays the p value of the t test that the differences in means between the treatment group and the control group are zero

Empirical specification
We estimate the following equation:
\[ Y_i = \alpha + \beta T_i + X_i'\gamma + B_i + D_i + \varepsilon_i \tag{1} \]
where Y is the compliance outcome of taxpayer i, T is a dummy to indicate the assignment to treatment, X is a vector of control variables, B the blocks or strata, and D are the district fixed effects; β is the coefficient of interest and ε is the error term. The district variable indicates the geographic jurisdiction in which the taxpayer is registered. The set of covariates includes all the pre-intervention observable characteristics: Liabilities denotes the total outstanding debt; Number of debts denotes the number of tax obligations per taxpayer; Debt origin is a set of dummy variables that indicate the source of the tax liabilities for wealth, income, VAT, and other liabilities; Taxpayer type denotes whether the taxpayer is a firm or an individual. Finally, we also control for DIAN district-level fixed effects. These district-level dummies should control for differences in the local economic conditions, socioeconomic structure, and differences in the Tax Agency local offices. We use several dependent variables to measure compliance. Paid is a dummy that takes value one if the taxpayer made any payment toward the liabilities by the time the endline was collected. Full payment is a dummy that takes value one if the taxpayer paid off in full the liabilities reported in the message. Total payment is the amount (in logs) paid by the taxpayer after the experiment. Payment share is the share of liabilities paid off by the taxpayer.
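A regression of this kind (an outcome on a treatment-assignment dummy, controls, strata dummies, and district dummies) can be sketched as follows. This is a minimal illustration on simulated data, not the authors' code: the helper names `dummies` and `itt_estimate`, the number of districts, and the simulated payment probabilities are all assumptions, and no standard errors are computed.

```python
import numpy as np

def dummies(codes):
    """Indicator columns for integer codes, dropping the first category."""
    levels = np.unique(codes)
    return (codes[:, None] == levels[None, 1:]).astype(float)

def itt_estimate(y, treat, controls, strata, district):
    """OLS coefficient on the treatment-assignment dummy."""
    n = len(y)
    X = np.column_stack([np.ones(n), treat, controls,
                         dummies(strata), dummies(district)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]  # position 1 is the treatment dummy

# Simulated (hypothetical) data, loosely shaped like the experiment.
rng = np.random.default_rng(0)
n = 30000
treat = (rng.random(n) < 0.715).astype(float)
strata = rng.integers(0, 3, n)        # debt terciles
district = rng.integers(0, 5, n)      # assumed number of districts
controls = rng.normal(size=(n, 2))    # stand-ins for the covariates
p_pay = 0.054 + 0.058 * treat         # made-up payment probabilities
paid = (rng.random(n) < p_pay).astype(float)

beta_hat = itt_estimate(paid, treat, controls, strata, district)
```

With a binary outcome such as Paid, this is a linear probability model, so `beta_hat` reads directly as a change in the probability of payment.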

Effectiveness of the phone calls campaign
We first note that the mere presence of the phone calls campaign delivered important collection outcomes to the agency, according to the raw results presented in Panel B of Table 1. In the absence of any campaign, about 5.4% of the individuals with tax delinquencies at the start of the intervention would have made any payment 2 months later. Compared to this baseline scenario, about 11 percent of the taxpayers assigned to the intervention made any payment, payments were 50 percent higher, and the agency recovered 6% of the total debt on average. When there is actual contact with the taxpayer, these results are even higher: of the 5267 individuals that were effectively contacted, 33% made a payment, payments were about three times the average payment in the control group, and the agency recovered close to 18% of the outstanding debt.
In Table 3, columns 1 and 2, we present the ITT estimates of the effect of the phone calls on the tax payment outcomes (each cell corresponds to an independent regression; each row indicates the dependent variable used). The first column shows the results when no controls are added, while column 2 shows the results including the full set of controls. For those assigned to the intervention, the probability of making any payment increases by 5.8 percentage points with respect to the control group (column 2). The probability of making a full payment increases by 1.7 percentage points. The share of the outstanding debt that is paid increases by 3 percentage points. The amount paid increases by more than 70%. Results vary little across columns (specifications with and without controls). This first set of results shows that the probability of a payment doubles with the phone calls compared to the status quo.
Those results provide an incomplete view of the implications of the program for the tax agency because the number of taxpayers actually treated was much lower than the number assigned to treatment. There are two sources of one-sided noncompliance: the sudden stop in the operation and the low contact rate among those who were called. Therefore, to evaluate the effectiveness of the program for the tax agency, it is relevant to restrict the analysis to the calls actually made and the calls actually received. As such, we run the regressions using 2SLS with assignment to a phone call as the instrument. LATE results are presented in Table 3. In columns 3 and 4, the instrumented variable is the attempted call. The impact of the phone call campaign on this set of individuals is now higher (as expected). The probability of making any payment by those the agency attempted to contact now increases by about 11 percentage points, which is more than twice the payment rate under doing nothing. The average amount collected is about one and a half times higher. This is the effectiveness of the phone call campaign for the tax agency taking into account that there will be taxpayers that do not answer the phone.
Table 3 Tax payment outcomes: impact of the phone calls campaign. Notes: N = 34,783. Column 1 shows the estimated coefficient from an OLS regression using random assignment as the independent variable, controlling for stratification strata. Column 2 includes all covariates as in Table 2. Columns 3 and 5 show the estimated coefficient from the second stage of a 2SLS regression using random assignment as the instrument for attempted calls or effective calls. Columns 4 and 6 include controls for all covariates. Controls included: Initial Liabilities (in logs), Debt age (years), Number of debts, Taxpayer type (Firm), Debt origin dummies, District dummies. Robust standard errors in parentheses
However, it is also relevant to know what the effectiveness is once a taxpayer can be reached to evaluate the actual effect of the call, and to evaluate whether it would be worth it to spend resources updating the taxpayer database. Because only 40% of those called were actually reached, the effect on the group who had the opportunity to talk to the agent should be larger. The effect of the intervention on the effectively treated (effective call) is included in columns 5 and 6. The probability of payment by those who were actually contacted (call was answered) is now almost 27 percentage points higher than in the control group. These results show that phone calls can be an effective way to get taxpayers to pay what they owe. These results are in line with those in Ortega and Scartascini (2020) who use a very similar setting. The effect of the phone calls lies in between more impersonal methods, such as a letter or an email, and more personalized and targeted methods, such as a personal visit.
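The relationship between the ITT and the 2SLS estimates can be checked with a back-of-the-envelope calculation. With a binary instrument and one-sided noncompliance (nobody in the control group is called), the 2SLS estimate without covariates equals the ITT effect divided by the share of the assigned group that actually received the treatment. The sketch below applies that ratio to the figures reported in the text; it ignores the controls and strata used in Table 3, so it is an approximation rather than a replication.

```python
# Figures reported in the text.
itt_paid = 0.058              # ITT effect on the probability of any payment
assigned_treatment = 24870
calls_attempted = 12853
calls_answered = 5267

# First stage: share of the assigned group that was actually reached.
share_attempted = calls_attempted / assigned_treatment
share_answered = calls_answered / assigned_treatment

# Ratio (Wald) estimator: ITT effect scaled by the first stage.
late_attempted = itt_paid / share_attempted   # effect per attempted call
late_answered = itt_paid / share_answered     # effect per answered call
```

The two ratios come out at roughly 11 and 27 percentage points, in line with the 2SLS estimates described for columns 3 to 6.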
Of course, our estimates are affected by a temporal dimension: the time span between the intervention and data collection. Therefore, we cannot estimate what the impact would have been if we had collected data at a later date. On the one hand, we may be underestimating the magnitude of the effect if treated taxpayers are reacting with a delay (e.g., they want to comply but it takes some time to collect the money to do it). On the other, we may be overestimating the overall collection effect if control taxpayers would have paid anyway in the longer run (we would still be capturing financial gains for the agency from collecting the money earlier rather than later). There are a few considerations that make us believe more in the first rather than the second alternative. First, the average age of the debt in our database is 3 years. Second, Ortega and Scartascini (2020) show that the share of taxpayers that pay off their tax delinquencies if not contacted by the tax agency is approximately 5% over a period of approximately 6 months, which is similar to the compliance for the control group in this article. Additionally, unpaid debts are an important problem for DIAN. Every year more than half of the outstanding tax debt is written off. Consequently, it would be reasonable to assume that if endline data collection had taken place at a later date the effects could have been higher rather than lower.[7]
[7] For robustness of this analysis, in Fig. 4 we present the estimates of the survival function using standard non-parametric estimation. We use detailed data at the tax bill level for both treatment and control groups. The payment date denotes the failure event. During the period of analysis, we can observe that the bills assigned to the phone call were paid faster and were more likely to be paid at the end of the period (as is the case in our overall analysis). The estimated survival function is statistically different between treatment and control groups, and the payment rate of the control group is low.


Heterogeneous effects
We look at heterogeneous effects across the several observable characteristics of the taxpayers. In each case, we run regressions that include an interaction term between the treatment variable and the covariate of interest. Figure 2 shows the results (point estimates and the 95% confidence interval) for the variable "paid" (OLS regressions), and they are reported in Tables 7 and 8 along with the IV results. Results are qualitatively the same for the OLS and 2SLS regressions. First, according to Fig. 2, female taxpayers seem to react more to the treatment (even though differences are not statistically significant). Second, phone calls seem to be less effective for firms than for individuals. Third, VAT and income tax debtors seem to react more readily than debtors of the wealth tax. Fourth, the probability of payment is higher the lower the level of debt is. Finally, there is some evidence of negative effects of the phone calls for the so-called chronic debtors: individuals with an outstanding debt of more than 7 years (Fig. 3). All of these results are in line with the literature and the results in Ortega and Scartascini (2020). It is more difficult to reach and generate payment from a firm than from an individual; phone calls may be particularly inconvenient for reaching the person deciding tax payments in the case of firms. Lower compliance with the wealth tax compared to the income tax and the VAT could be expected. The first one taxes an asset, in many cases illiquid, while the other two tax a flow of revenues.[8] Accumulating higher debts is not random. These people may have more financial difficulties than those who have not, or they may have lower priors regarding the ability of the tax agency to effectively enforce payment. Therefore, they respond less.
Fig. 2 Notes: [...] Table 2. Gender figure only includes individual taxpayers (not firms). All regressions include stratification controls and district fixed effects. 95% confidence intervals
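The interaction regressions described in this section can be illustrated with a small sketch: the outcome is regressed on the treatment dummy, the covariate of interest, and their product, so the coefficient on the product measures how the treatment effect changes with the covariate. The data are simulated and the effect sizes are made up; the function name `interaction_fit` is hypothetical, and the strata and district controls used in the paper are omitted for brevity.

```python
import numpy as np

def interaction_fit(y, treat, covariate):
    """OLS of y on treatment, a covariate, and their product.

    Returns (treatment effect when covariate == 0,
             change in the treatment effect per unit of the covariate).
    """
    X = np.column_stack([np.ones(len(y)), treat, covariate, treat * covariate])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1], coef[3]

# Simulated (hypothetical) data: the call works less well for firms.
rng = np.random.default_rng(0)
n = 40000
treat = (rng.random(n) < 0.7).astype(float)
firm = (rng.random(n) < 0.62).astype(float)
p_pay = 0.05 + 0.08 * treat - 0.04 * treat * firm   # made-up probabilities
paid = (rng.random(n) < p_pay).astype(float)

effect_individuals, firm_gap = interaction_fit(paid, treat, firm)
effect_firms = effect_individuals + firm_gap
```

A negative `firm_gap` corresponds to the pattern reported here, where phone calls are less effective for firms than for individuals.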
The results indicate that the tax authority should be very vigilant to contact taxpayers when the probability of payment is higher (e.g., letting the amount of debt grow increases the chances that taxpayers may not be able to pay because of financial constraints, and it may also reduce taxpayers' beliefs about the enforcement capacity of the government). Additionally, phone calls may not be that effective for reaching firms' decision-makers. Finally, because the bases of taxation are different across taxes, the tax agency may want to think about how to differentiate its campaigns. For example, payment plans may well be needed to increase compliance with the wealth tax for people with cash flow restrictions.
Fig. 3 Heterogeneous effects by debt characteristics before intervention. Notes: N = 34,783. Each figure shows the coefficients corresponding to an OLS regression using random assignment to phone calls interacted with all covariates in Table 2. Second axis shows the distribution for the debt variable. All regressions include stratification controls and district fixed effects. 95% confidence intervals

Effects of meeting in local tax office after phone call
So far we have shown the overall effects of the phone call campaign. However, as we mentioned before, there are two additional factors that can be exploited. First, during the phone call, the taxpayer was offered the option of setting an appointment at the local office to settle any disputes, resolve any questions, and reach a payment commitment. We have information on the taxpayers who scheduled the appointment and actually attended the meeting. The idea behind this was that offering the meeting would increase personal connections and motivate higher payments. Of course, offering the meeting was risky. Because attending the meeting has costs for the taxpayer and taxpayers may form biased expectations about their ability to convince the tax authority to forgive what they owe, it could increase resentment among those who are told to pay. Hence, it may lower the incentives to pay. If taxpayer resentment rises, and given that the agency is incurring additional costs by hosting the meetings, the tax authority may risk lowering the efficiency of the campaign.
Second, during the several interactions between the agents and taxpayers, taxpayers were not only reminded about their debt but also asked to commit to a payment (plan-making). We have records of those commitments as well as actual behavior.
A cursory look at the data in Fig. 1 provides a first view of the effect of the appointments. Basically, most people agreed to the meeting (94%), but only 69% of them attended the meeting. Of those who attended the meeting, almost half committed to pay, but only about half of them actually did it. Among those who did not agree to attend the meeting, about half committed to pay. Still, only 39% of them did it. Figure 1 shows two things. First, personal interactions matter. In every case, a large share of the taxpayers agreed with the proposal presented by the agent to attend the meeting or pay what they owe (97%). Also, about 50% of those who attended also agreed to pay. About 40 to 50% of those who committed to pay did it. Second, while taxpayers initially committed to pay when asked by the agent, their actual compliance was lower but still much higher than in most interventions. As such, there are two important lessons that can be drawn. One of them is that "paying taxes is easier said than done." The other is that commitments and plan-making strategies seem to work, and maybe more in tax compliance than in other policy areas where they have been used so far.
Table 4 shows regression results that look at whether attending the meeting had any effect on payment behavior. The idea is to isolate the effect of this last stage of personal interaction from the role of the phone call itself. In this analysis, we use three different samples: those who received the call (column 3), those who received the call and agreed to the meeting (column 5), and a sample that includes those plus a subgroup for which there was some type of contact even if the conversation could not take place (column 1). These are the taxpayers who faced an actual choice (answering the phone, agreeing to the meeting, attending the meeting).
OLS regressions show that attending the meeting has an effect on payment between 13 and 16 percentage points higher than not attending the meeting. The cursory view and the OLS regressions do not take into account that there could be some selection into attending the meeting. It is not clear what the selection bias would be though. On the one hand, we could be overestimating the impact of the meeting if it were the case that those who were planning to pay were also more likely to attend the meeting. That case could be plausible if those who attend the meeting were those with better views of the work of the tax agency. Still, if people were thinking to pay anyway, it would not make too much sense to attend the meeting given the costs of time and travel it implies. For example, about 25% of those who did not agree to the meeting paid, and 22% of those who agreed to attend the meeting but did not attend also paid. Therefore, there is a sizable share of taxpayers that understand that participating in the meeting may not be worth the effort if they have to pay anyway.
On the other hand, it could be the case that those who think that they could avoid paying the tax because they believe the tax agency made a mistake are the ones with more incentive to attend the meeting. For example, we know that among those who attended the meeting 22% had some inconsistencies between what they were asked to pay and what they had to pay, about 5% planned to compensate their debt with other credits, and about 14% arranged a payment plan. In this case, attendance could be underestimating the potential effect that meetings have on motivating taxpayers to pay because many taxpayers who attended did not have to pay.
Table 4 Tax payment according to attendance to meeting
For 2SLS, the instrument is distance in km to the tax administration office where the taxpayer is registered. Columns 1, 3 and 5 present results from an OLS estimation, with payment as outcome variable and attending the meeting as independent variable, controlling for all variables in Table 2. Columns 2, 4 and 6 present results from 2SLS estimation using distance in km as instrument for attending the meeting. Sample "Contact Started" refers to the individuals with effective contact plus phone calls with partial contact results like "call again" and "meeting not scheduled." Sample "Effective Contact" refers to phone calls where the script is completed and the result includes either "schedule a meeting," "report intentions to pay" or present other results. Sample "Agreed to Meeting" refers to individuals that decide to schedule a meeting in the local tax office during phone call. Controls include liabilities in tercile, debt age (years), number of debts, taxpayer type (firm), debt origin dummies, and district fixed effects. Robust standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001
While dealing with endogeneity has plenty of complications because we do not have that much information on the taxpayers and their priors, we could use the distance from the home or the firm to the closest local tax office as an exogenous instrument for the decision to attend the meeting. Distance should be a good instrument in the context of this intervention that deals with tax delinquencies instead of tax declarations or compliance. There is no reason to expect that a taxpayer would choose a location to live based on having declared a tax obligation but not paying it, but there are plenty of reasons why those who live farther away would not attend regardless of their decision to pay or not.
Additionally, by including district fixed effects, we are controlling for possible urban and economic characteristics related to payment behavior and office location. Therefore, within districts, the location of the tax office is in principle exogenous to the location of the taxpayer, at least prior to the decision of tax compliance and meeting attendance. In some regressions, we have also included other variables with or without the pure distance variable: for example, the relative distance (which weights the individual taxpayer's distance to the DIAN office compared to the distance of all other taxpayers in the district), dummies for distances longer than certain thresholds, distance squared, and others. Results do not change.
In the analysis presented in Table 4, we examine the effect of attending the meeting, instrumented by distance, on actual payment. Table 6 presents the results for the first stage, where distance has a statistically significant effect. The 2SLS coefficients are positive and slightly smaller than in the OLS regression. However, they are not statistically significant (see Table 4). Consequently, while there is some indication that the meeting might have been successful, it is difficult to establish this from the data we have. In particular, we lack highly relevant information whose absence could be biasing the results, including (i) taxpayers' priors regarding what they could get out of the tax agency meeting; (ii) errors in the tax declaration, prepayment of part of the debt, and/or inconsistencies for the overall population.
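The comparison between OLS and 2SLS described above can be illustrated with a minimal sketch on simulated data. Nothing here uses DIAN records: the variable names (`distance_km`, `intent`, `attend`, `pay`), the coefficients, and the assumption that an unobserved intention to pay raises both attendance and payment are hypothetical choices made only to show the mechanics. Payment is modeled as a continuous index for simplicity, and the only control is a constant rather than the covariates in Table 2.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, d, z, controls):
    """Coefficient on d when d is instrumented by z (controls are exogenous)."""
    # First stage: project the endogenous regressor on instrument + controls.
    first_stage_X = np.column_stack([z, controls])
    d_hat = first_stage_X @ ols(d, first_stage_X)
    # Second stage: regress the outcome on the fitted values + controls.
    second_stage_X = np.column_stack([d_hat, controls])
    return ols(y, second_stage_X)[0]

rng = np.random.default_rng(0)
n = 200_000
true_effect = 0.10  # assumed effect of attending on the payment index

# Hypothetical data-generating process (not the paper's data).
distance_km = rng.uniform(0, 30, n)   # instrument: distance to the office
intent = rng.normal(size=n)           # unobserved intention to pay
attend = (1.0 - 0.05 * distance_km + intent + rng.normal(size=n) > 0).astype(float)
pay = 0.2 + true_effect * attend + 0.1 * intent + rng.normal(scale=0.3, size=n)

const = np.ones((n, 1))
ols_estimate = ols(pay, np.column_stack([attend, const]))[0]
iv_estimate = two_stage_least_squares(pay, attend, distance_km, const)

print(f"OLS estimate:  {ols_estimate:.3f}")
print(f"2SLS estimate: {iv_estimate:.3f}")
```

In this simulated setting, selection on the unobserved intention to pay pushes the OLS coefficient above the assumed effect, while the distance instrument recovers something close to it. With a real sample the 2SLS estimate is typically much noisier than OLS, which is consistent with the loss of statistical significance reported in Table 4.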

Conclusion
We find that phone call campaigns that stress personalized interactions are effective at increasing the collection of tax debt. Phone calls were made by a professional phone call bank using a high-quality script, and agents were encouraged to make the call personal and to invite taxpayers to a meeting. At the meeting, again, agents were encouraged to maximize personal interaction. Conditional on the call, the tax agency collected three and a half times the amount it would have collected otherwise. This was highly cost-effective for the tax agency, as each attempted call resulted in $470 in recovered debt (and almost $4000 per contacted taxpayer).
Compared to previous estimates in the literature, the effect of phone calls on the probability of payment lies between the effect of impersonal methods (letter and email) and that of more personal methods such as visits to the taxpayer. The government's choice of communication technology is not trivial. The method of communication conveys a signal to the taxpayer about the enforcement capacity of the State. Also, different communication technologies provide different levels of personal interaction with the authorities. Phone calls seem to be an effective way to increase both the perception of deterrence efforts and personalized interactions.
Phone call campaigns, as is the case with other similar campaigns, have some limitations. First, if databases are not up to date, the contact rate could be low (less than 50% in this case). Second, chronic debtors may find ways to avoid being contacted by phone as technology progresses. Third, phone calls may be less effective with certain populations, such as firms. The best strategy might be to combine cost-effective interventions aimed at the majority of taxpayers with a limited number of phone calls and personal visits to a subset selected according to collection potential.
In addition to the call, the tax agency also offered taxpayers the opportunity to schedule a meeting to discuss their situation, which increased the personalized nature of the intervention. Because this offer was not randomized, we can only make limited inferences. Still, the responses from the taxpayers to the interaction with the agency present suggestive evidence of the personalized channel. Taxpayers tend to react very positively to the requests made by the agency. A large share of them scheduled the meeting and a large percentage also committed to pay what they owed. Promises were not followed by actions at the same rate. Tax compliance is easier said than done, which may confirm the existence of social desirability biases in the agent-taxpayer interaction and procrastination in the payment of taxes. This evidence indicates that tax agencies should make it very simple for taxpayers to follow through with their promises immediately, in order to take full advantage of the role that personal interactions have. It also provides evidence that reminders and explicit commitments work even if they are not foolproof.
This study provides hints about the relevance of phone calls as a method of communication, the role of personal interaction, and the role of reminders and commitments for increasing tax compliance. The evidence should spark a discussion about which processes can be automated and made impersonal, and which ones should not, as technology progresses. Still, there is plenty that this study does not address explicitly that should be the focus of future interventions. In particular, it would be important to separate the effect of the phone call, the meeting, reminders, and the plan-making exercise to understand the value added by each one of them.
Adjusted R-squared 0.147 0.075
Table 6 Attendance to meeting after phone calls: first stage
Columns present results from the first stage of 2SLS estimation using distance in km as instrument for attendance to meeting. Sample "Contact Started" refers to the individuals with effective contact plus phone calls with partial contact results like "call again" and "meeting not scheduled." Sample "Effective Contact" refers to phone calls where the script is completed and either "schedule a meeting," "reports intention to pay" or present other results. Sample "Agreed to Meeting" refers to individuals that decide to schedule a meeting in the local tax office during phone call. Controls include liabilities in tercile, debt age (years), number of debts, taxpayer type (firm), debt origin dummies, and district fixed effects. Distance refers to distance in km to the tax administration office where the taxpayer is registered. Robust standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001
Outcome: attendance to meeting. Samples: "Contact Started", "Effective Contact", "Agreed to Meeting".

Table 9 Heterogeneous impact of phone calls: type of tax
N = 34,783. Columns 1-3 show the coefficients corresponding to an OLS regression using random assignment to phone calls interacted with all covariates in Table 2. Columns 4-6 show the coefficients corresponding to a 2SLS regression using random assignment as an instrument for attempted phone calls, interacted with all covariates in Table 2. All columns include stratification controls and district controls. Robust standard errors in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001