1 Introduction

Outbound automated calls have shown potential for delivering voice-based messages at scale, making them a key alternative for simple information delivery services such as reminders [3, 8] and adherence support [2, 13]. They offer cost savings [11], convenience, simplicity, and privacy [13]. Moreover, their prerequisites are low: a basic feature phone suffices, and no dedicated training or education is required [17].

The adoption of outbound automated calls has increased [15] due to increased mobile phone penetration in rural India [5]. Prior research indicates the need for appropriate call timing [17], duration [10, 17], suitable content, and a familiar voice persona to increase the effectiveness of automated calls [16]. Although considerable research has been conducted in developed regions and urban settings [12, 16], little research has studied these factors and their influence in rural settings, especially among users in low-resource settings.

In this paper, we present the results of a user study conducted among 40 early-stage (in their first month of treatment) TB-positive patients undergoing Directly Observed Treatment Short course (DOTS). These patients come from remote regions of 2 districts of Assam, a northeastern state of India. We sent one outbound automated call to each patient to increase awareness of various early-stage TB care topics. The automated calls were followed by a telephonic interview and a post-call questionnaire to study the preferred (i) contents, (ii) time of the call, (iii) call duration, and (iv) gender of voice. We also studied learnability among the participants one week after the automated call. This study is part of the ongoing field trials of “Swasthyaa” [4], an end-to-end ICT ecosystem that aims to reduce initial and retreatment TB defaulters through real-time tracking and monitoring of patients’ health progress.

2 Related Work

Numerous studies [13, 17, 22, 23] utilizing the advantages of outbound automated calls have been conceptualized and conducted in developing regions. Automated calls have been used to support HIV treatment, mainly for adherence reminders, and to collect adherence data [21]. Joshi et al. [17] used automated calls to support the treatment of HIV/AIDS patients in India. The study found that patients preferred receiving daily calls of shorter duration as opposed to a long call at a preferred time. Helzer et al. [10] used outbound automated calls consumption and randomized the duration of phone calls received by participants. Over the 3 months of the study, the duration of the phone call was reduced to a mean of approximately 2 min due to the diminishing motivation of participants. Other studies have used automated calls for motivation and improved quality of life in the contexts of asthma [18], physical activity [19], and coronary syndrome [20]. However, few studies have explored best practices in terms of duration, content, gender of voice, and time of an outbound automated call, especially among users in low-resource settings.

3 Introduction to Swasthyaa

Swasthyaa [4] is an Information Communication Technology (ICT) based solution that aims to reduce initial and retreatment TB defaulters. It complements the TB care efforts initiated by the Government of Assam (a northeastern state in India) through real-time tracking and monitoring of the health progress of each presumptive and positive TB patient enrolled in local health centers. It consists of a mobile application, a web application, an Interactive Voice Response (IVR) system, and a newly designed TB blister packet. The web and mobile applications are designed for health administrators to track the health progress of each patient, whereas the IVR and blister packets allow patients to register their DOTS intake regularly. In addition to tracking and monitoring health progress, it sends periodic automated calls explaining TB care methods, government initiatives, food habits, and family care, along with motivational messages, to increase adherence among the patients.

4 Outbound Automated Call Description

A simulated outbound automated voice call lasting 3 min 48 s was recorded in the Assamese language. The call was delivered in the voice of a hypothetical female persona, Dr. Heena Saikia. A female persona presented as a health professional was chosen to increase confidence in and acceptance of the shared contents. Heena Saikia is a common Assamese name and was chosen to increase familiarity among the users. The call presented information on government initiatives such as free DOTS and Nikshay Poshan Yojana (Rs. 500 per month to buy healthy food), the importance of TB adherence and DOTS course completion, possible side effects of medications, nutrition, home hygiene, methods of cough and sputum disposal, and the harmful effects of tobacco and alcohol. Listening to the call did not require any input from the user. Government health professionals developed and approved the health contents before they were delivered to patients.

5 Study Objectives

The objective of this study was to learn about preferences for automated calls among users in low-resource settings. We believe that incorporating these preferences will increase the acceptance, adoption, and effectiveness of automated calls aimed at increasing adherence in rural areas. We studied the following preferences: (i) the duration of the automated call, (ii) the preferred time of day at which automated calls should be made, (iii) the preference for a male or female voice, and (iv) the preferred platform for information delivery, i.e., a voice call or SMS. We also studied the learnability of the contents to understand the effectiveness of an automated call as a platform for raising awareness.

6 Methodology

6.1 Participants

We selected 40 participants (M = 32, F = 8) from Swasthyaa’s database who were in their first month of DOTS treatment. They belonged to 5 rural blocks in the Darrang and Kamrup districts of Assam. The monthly income of the participants ranged from Rs. 500–10,000 (US$7–US$143) (mean = US$49, SD = US$45). The monthly income of 11 participants was below Rs. 1,000 (US$14), whereas 22 and 7 participants earned Rs. 1,000–5,000 (US$14–72) and Rs. 5,000–10,000 (US$72–144) each month, respectively. Of the 40 participants, 14 were daily wage workers (e.g., farm laborers, rag pickers), 3 were retired, 5 were students, 2 were homemakers, 5 were small business owners, and 11 had regular incomes. The age of the participants ranged from 18–60 years (mean = 34.15, SD = 13.08). The age, monthly income, location, and occupation of participants were already available in Swasthyaa’s database, as we collect this information during the patient registration process. 30 participants owned feature phones, 17 of whom shared the phone among family members. They used the mobile phone for making and receiving calls. No other advanced usage (including the use of SMS) was observed among them. 10 participants owned low-cost smartphones, which they used to make and receive calls and to send messages on WhatsApp. Participants had limited knowledge and understanding of TB as a disease and were only aware of the free DOTS program. They did not know about other topics such as food habits, side effects, or other government benefits. Information on the participants’ existing knowledge was obtained from the Senior Treatment Supervisor (STS), a government official who directly interacts with patients and is responsible for DOTS adherence and completion.

6.2 Procedure

We recorded the calls using a Philips DVT6010 audio recorder for this study. We manually called the participants between 1900 and 2100 h. The time of the call was guided by the hypothesis that participants would be at leisure and unengaged during this period. Moreover, the chosen timing also served as an end-of-day reminder for patients who may have missed their dose. Once the participant picked up the call, the recording was played aloud on a speaker placed next to the mobile phone’s microphone to mimic the effect of an automated call. If a patient did not pick up the call, they were called again up to three times on consecutive days. No patient was called more than once a day. Following the automated call, we called the patients within an hour. A structured telephonic interview was conducted to learn about the preferred time of the call, duration, voice preference, and platform for information delivery (i.e., SMS or automated call). Participants were asked to provide their choices without any specific options given by the moderator. However, they were probed further if they provided vague answers. For example, if a participant answered morning as the preferred time for an automated call, we asked them to provide a specific time (e.g., 0800, 0900 h, etc.). We also conducted unstructured interviews to identify qualitative findings such as the motivations behind preferred choices, the reasons for listening to the automated call, and the preferred contents on TB care. The interviews lasted 20–25 min per person. A female moderator conducted the interviews in Hindi or Assamese, depending on the participant’s preference. All responses were noted and recorded for further analysis.

Each participant was called again after one week to study content learnability. A telephonic questionnaire consisting of 6 questions was used for this purpose. The 6 questions were: (i) how long does the DOTS course take to cure TB? (ii) when should you stop taking TB medication? (iii) what are the consequences of not taking medicine for the complete DOTS course? (iv) what should you buy with the money provided via the Nikshay Poshan Yojna? (v) if you face side effects such as nausea or vomiting due to the medication, what should you do? (vi) which of the following options is an incorrect way of disposing of sputum during TB? We provided 3 options verbally to each participant for every question. We took prior consent from each participant before recording the call.
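
The re-call rule in the procedure above (at most one call per day, with re-attempts on consecutive days) can be sketched as a small scheduling helper. This is only an illustrative sketch, not the tooling used in the study, where calls were placed manually. The function names, the start date, and the reading of “up to three times” as three re-attempts after the first call are all assumptions.

```python
from datetime import date, timedelta
from typing import List, Optional

# Assumption: "up to three times" means three re-attempts after the first call.
MAX_RETRIES = 3


def attempt_dates(first_call: date, max_retries: int = MAX_RETRIES) -> List[date]:
    """First call date plus one re-attempt per consecutive day (never two calls on one day)."""
    return [first_call + timedelta(days=offset) for offset in range(max_retries + 1)]


def next_attempt(schedule: List[date], unanswered_so_far: int) -> Optional[date]:
    """Date of the next attempt, or None once every scheduled attempt went unanswered."""
    if unanswered_so_far < len(schedule):
        return schedule[unanswered_so_far]
    return None


# Hypothetical start date, for illustration only.
schedule = attempt_dates(date(2019, 3, 4))
print(next_attempt(schedule, unanswered_so_far=1))
```

Keeping the schedule as an explicit list of dates makes the one-call-per-day constraint easy to check, since each date appears exactly once.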

6.3 Data Collection Methods

We collected the data by noting down the answers in a notebook while conducting telephonic interviews. We also recorded the call to revisit the findings. The data to measure learnability was collected by writing down the correct choice given by the participants. We further gave scores (1 - correct answer, 0 - incorrect answer) to analyze the content learnability.
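
The scoring rule (1 for a correct answer, 0 for an incorrect one, summed over the 6 questions) can be illustrated with a minimal sketch. The option labels and the answer key below are hypothetical placeholders; the actual options and correct answers are not listed in this paper.

```python
from typing import Sequence


def learnability_score(responses: Sequence[str], answer_key: Sequence[str]) -> int:
    """Count correct answers: each match scores 1, each mismatch scores 0."""
    if len(responses) != len(answer_key):
        raise ValueError("one response is required per question")
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


# Hypothetical key and responses for the 6 questions (3 options each: a, b, c).
answer_key = ["a", "b", "c", "a", "b", "c"]
responses = ["a", "b", "a", "a", "c", "c"]
print(learnability_score(responses, answer_key))  # 4 of 6 correct
```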

7 Results

7.1 Motivation to Listen to Automated Calls

Out of 40 participants, 27 listened to the complete call (3 min 48 s), whereas 13 did not. Of these 13 participants, 3 listened for between 90 s and 180 s, and 10 listened for less than 90 s. The participants’ ability to listen to the call was hindered by poor network connectivity, which 9 participants cited as the reason for disconnecting the call. One participant stated, “I wanted to listen to the entire message, but I could hear only pieces of information, so I disconnected the call.” Boredom and irrelevant content were each cited by 2 participants as the reason for disconnecting. Figure 1(a) shows the reasons for disconnecting the call before completion.

Fig. 1. (a) Reasons for disconnecting the call before completion, (b) reasons for listening to the complete call, and (c) preferred duration of the automated call

Qualitative findings revealed 4 major factors responsible for listening to the complete call. The foremost was the fear of missing out on any health information that might improve their deteriorating health, as reported by 12 of the 27 participants. For instance, a participant stated, “I want to recover as soon as possible. I don’t want to lose on any new information that will worsen my health. Hence, I listened to ensure I consume each information that helps me recover quickly”.

The lack of other means of finding relevant information was another prominent reason for listening to the complete call. 7 participants reported that they did not know of any other source of information regarding TB treatment. No health administrator had provided them with information on TB care during DOTS, and they were not aware of any technological means (e.g., internet browsing) of finding new information by themselves. One participant remarked, “This is the first time anyone in my family has been found TB positive, and we are unaware what to do and what to expect. We do not know where to look for information, or whom to ask.”

6 participants who used a shared mobile phone revealed that a family member persuaded them to listen to the call. A participant, who is a young adult and first TB positive patient in his family said, “I did not feel like listening to the call, but my father insisted that it will be helpful for me, and instructed me to listen to the complete call.” Another participant reported, “I was resting in bed when my wife handed me the phone and asked me to listen to the call.” Similarly, 2 participants stated concern for family members’ future as a primary motivation to listen to the call. Another participant said, “I liked the part when the doctor said that the disease would not spread to others if I follow my DOTS regime regularly. I do not want any of my family members to have TB as it is very painful.” These findings are similar to findings presented by Cauldbeck et al. [22] indicating a positive correlation between family support and adherence. Figure 1(b) shows the 4 major motivations to listen to the complete call.

7.2 Preferred Time for Automated Call

Out of 40 participants, 32 preferred receiving automated calls in the evening between 1800–2200 h. Of these 32 participants, 25 preferred a call between 1900–2100 h, whereas 5 and 2 participants preferred 1800–1900 h and 2100–2200 h, respectively. This preference is mainly due to the availability of the phone and the convenience of attending a call after working hours. A participant stated, “I generally come home by 1900 h from work, so I prefer an evening call after it (1900 h).” A teenage participant stated, “I don’t have a mobile phone myself. My father has a mobile phone who comes around 2000 h. Hence, I can only listen to the call only after 2000 h.”

4 participants preferred the call during a lunch break between 1200–1400 h. A participant explained, “I get free from work between 1300 to 1400 h for lunch, and generally have nothing else to do. I do not mind listening to a phone call at this time.” Similarly, 4 participants preferred a call in the morning between 0900–1100 h. These participants were retired and preferred receiving a call when other family members had left for work.

7.3 Preferred Gender of Voice for Automated Call

26 participants said that they had no preference for the gender of the voice of the automated call. 10 participants said that a female voice was more appropriate, stating that it reflects compassion. A participant responded, “A female voice shows more concern and gives you the relief that you will become healthy.” 4 participants said that a male voice was more appropriate, stating that it would be easier to hear under poor network connectivity. One participant remarked, “I could not hear the female voice due to the faulty network, but maybe I would have been able to hear a man. His voice would have been clearer.”

7.4 Preferred Duration of Automated Call

9 of the 40 participants said that they preferred the duration of the call to be less than 5 min. 22 participants said that they preferred the duration of the call to be less than 10 min, stating that they would lose interest in a longer automated call. A participant said, “If the call comes while I am busy working, I will not listen to it if it is long. So I want the call to be shorter than 5 min.” The remaining participants said that they did not mind listening to a long call, as long as it provided new and relevant information. A participant stated, “I have the disease and want to get cured quickly. Hence, I don’t mind listening to a long call if it provides relevant TB information.” Figure 1(c) shows the participants’ preferred duration for an automated call.

7.5 Preference Between Automated Call and SMS

14 participants preferred an SMS over an automated call because it can be reviewed later at their convenience. A participant said, “I prefer to receive the information through text message so that I can access it at another time, in case I’m busy.” 26 participants preferred an automated call over SMS, citing discomfort in reading messages on a small screen and the inability to view and read messages as the primary reasons. This finding is similar to that reported by Joshi et al. [17], which suggests minimal use of SMS-based systems in developing countries because of low literacy. A participant said, “I have many unread messages on my phone, so I might miss out on the messages that have information about my TB.”

Participants also emphasized the need to preserve and access the contents of an automated call. One participant mentioned, “I would have liked the option to record the call. I would have recorded the call and listened to it again whenever I wanted to.”

The data showed a moderate positive correlation (R = +0.583) between years of education and preference for SMS over an automated call. Participants with more years of education preferred receiving information through SMS.

Only 1 participant preferred receiving a message instead of a voice call despite having received no schooling. She claimed, “Even though I cannot read, I can show the message to my daughter-in-law, and she can interpret it for me. The message can also be forwarded to others.”

7.6 Content Learnability

We calculated the learnability score as the number of correct answers out of 6 questions (correct answer = 1, incorrect answer = 0). The overall mean learnability score was 4.52 (SD = 1.36). The mean learnability score was 4.77 (SD = 1.39) for participants who listened to the complete call and 4 (SD = 1.15) for those who did not. We conducted an unpaired sample t-test to determine the difference in learnability between the two groups. The results indicated significantly higher learnability (p = 0.036) for participants who listened to the complete call compared to those who did not.
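
The group comparison can be retraced from the reported summary statistics alone. The sketch below computes a t statistic and its degrees of freedom using Welch's unequal-variance form; this choice is an assumption, since the paper does not state whether variances were pooled or whether the test was one- or two-tailed. The group sizes (27 and 13) are taken from Section 7.1, and the function names are illustrative.

```python
import math
from typing import Tuple


def welch_t(mean1: float, sd1: float, n1: int,
            mean2: float, sd2: float, n2: int) -> Tuple[float, float]:
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom from summary data."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1 mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2 mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def combined_mean(mean1: float, n1: int, mean2: float, n2: int) -> float:
    """Size-weighted mean of two groups, a consistency check on the overall mean."""
    return (mean1 * n1 + mean2 * n2) / (n1 + n2)


# Reported values: complete call (n = 27) versus incomplete call (n = 13).
t, df = welch_t(4.77, 1.39, 27, 4.0, 1.15, 13)
print(round(t, 2), round(df, 1))                    # roughly 1.85 and 28.3
print(round(combined_mean(4.77, 27, 4.0, 13), 2))   # 4.52, matching the overall mean
```

The p-value then follows from the t distribution with the computed degrees of freedom; its exact value depends on the tail convention, which is left open here.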

We also studied the correlation of learnability scores to age and education years. The results indicate a moderate negative correlation between learnability scores and age (R = −0.671). Older participants showed lower learnability as compared to younger participants. We also observed a moderate positive correlation (R = +0.598) of learnability scores to education years. Participants with higher education years showed increased learnability scores.
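
For reference, a correlation coefficient of this kind can be computed as sketched below, assuming the reported R values are Pearson coefficients (the paper does not name the coefficient). The ages and scores in the example are hypothetical toy values, not study data.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical toy data: scores falling with age yield a negative coefficient.
toy_ages = [20, 30, 40, 50, 60]
toy_scores = [6, 5, 5, 3, 2]
print(pearson_r(toy_ages, toy_scores) < 0)  # True
```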

7.7 Content Preferences

The majority of the participants preferred information that was immediately and easily actionable, i.e., nutrition, the effects of tobacco and alcohol, and government schemes. A participant stated, “I liked the fact that I was informed to consume specific food items which I could immediately add to my diet.” A few participants also asked for additional information on government schemes, as they found the information provided inadequate. For instance, one participant said, “It should also explain the step through which I can avail the benefits of the government scheme (not limiting to benefits of government schemes).” 3 participants could not recall the contents and hence could not give their preferences. 10 participants did not state any preference, as all the information was new and equally important to them.

8 Conclusion

We conducted a study with 40 participants to understand content learnability, preferred call time, duration, gender of voice, and the preference for an automated call over SMS for outbound automated calls. We identified 4 major motivational factors for listening to the complete call: (i) fear of missing out on important information, (ii) no alternate source of information, (iii) family members’ persuasion, and (iv) concern for family members. We also found poor network connectivity to be a prominent reason for disconnecting calls. We therefore recommend an adaptive call strategy for users residing in areas with poor connectivity, in which repeated attempts are made for calls that are disconnected midway. We recommend calls of at most 5 min, delivered between 1900–2100 h, for increased adoption, owing to the availability of the mobile phone and convenience. The call duration can be extended up to 10 min if the call is delivered at a convenient time with new and relevant contents. We also recommend automated calls rather than SMS for users in resource-constrained environments. The system should allow content preservation through easy recording of an automated call or a call-back feature so that users can refer to the information later. Actionable contents that can be applied immediately and easily in users’ daily routines are recommended to increase adoption of the disseminated contents. Although it is yet to be demonstrated through scientific studies, we believe that our findings can also be applied to other contexts, as preferences for call timing, call duration, and gender of voice are often independent of the chosen context of study.

The study had certain limitations. These include reliance on self-reported data and on interviews conducted remotely by telephone. In the future, we plan to overcome these limitations and conduct a longitudinal study to statistically validate the findings presented in this paper.