Introduction

Digital approaches to public health problems, including mobile and web applications, short message service (SMS), and tablet and other computer-assisted devices, are becoming commonplace. In low- and middle-income countries (LMICs), mobile health (mHealth) technology has been recognized as a powerful tool to improve HIV outcomes and care delivery. Globally, over 5 billion people have access to mobile phones, and coverage, acceptability of technology, and familiarity with use are expected to increase over time [1]. Global funding agencies are making strategic investments in mobile technology approaches that provide innovative solutions to these problems, including improving HIV prevention, care, and treatment [2,3,4,5]. Despite the potential for technology to propel global HIV initiatives, there is substantial variability in the quality, fidelity, and implementation of strategies using mobile technology. The World Health Organization (WHO) currently recommends use of mHealth technologies to strengthen health systems, yet also recognizes the need for guidance on optimizing use of technology and has developed policy documentation on best practices [6].

High-quality data capture is essential to advance novel HIV research, as well as monitor progress of programs designed to prevent and treat HIV. Yet, there are many obstacles to in-person data collection, including logistical constraints that exclude individuals who face challenges with accessing in-person visits, as well as social desirability bias for sensitive questions. As a result, in-person data collection may lead to less precise or incomplete ascertainment of exposures and outcomes. Simple and widespread technology, such as SMS, could be a viable option to improve collection of a variety of health and behavior data for HIV research and programs. This approach is particularly appealing when in-person data collection is difficult, including during the COVID-19 pandemic.

With the expansion of mobile phone coverage in LMICs, SMS was one of the first mHealth technologies used to support health and delivery of health care services. SMS have been particularly attractive in LMICs and have been used to complement clinical HIV prevention, care, and treatment programs. SMS use remains popular in settings where penetration of smartphones and internet access is low, and more sophisticated digital tools, such as mobile applications (“apps”), have limited reach. SMS have been used to test approaches for disseminating HIV education messages and encouraging HIV testing [7,8,9,10,11,12,13], providing counseling support to patients and triaging their care-seeking decisions [13,14,15,16,17,18,19,20], providing reminders for clinic visits and medication [11, 13, 17,18,19,20,21,22], and offering an alternative strategy for laboratories to share test results with providers [23]. Several trials and evaluations have found that SMS sent to patients improve antiretroviral therapy (ART) adherence and HIV virologic suppression [11, 13, 21, 22, 24, 25], rates of early infant testing for HIV and postpartum prevention of mother-to-child HIV transmission (PMTCT) retention [8, 26], and retention in HIV care [11, 21, 26,27,28]. However, efforts to use SMS to remotely capture data from individuals to measure HIV-related risks, behaviors, and clinical outcomes in the context of research or programmatic monitoring and evaluation are limited.

In this review, we examine potential benefits of, and challenges with, using SMS for remote HIV-related data capture. We consider SMS data capture in the context of research and health programs in LMICs, and present examples where SMS have been used for remote data capture in the fields of HIV and sexual and reproductive health. We collate lessons learned from these studies, as well as from fields outside of HIV and sexual and reproductive health, and identify areas where SMS has enormous potential to improve data collection in the context of research and program delivery, and propose guidelines for best practices.

Methods

We searched titles and abstracts indexed in PubMed through December 2019 using terms to identify mobile health, SMS, and text messaging; HIV or sexual and reproductive health; and, in combination, terms for data collection and/or surveys. We also conducted a similar search using the Google search engine. We reviewed citations with relevant topics in articles identified from the PubMed and Google searches. Articles were reviewed if they were primary research articles that included mHealth technology for data collection. While we prioritized articles from LMICs on HIV and sexual and reproductive health, we also included articles outside of this setting if they contributed information about barriers or challenges to implementing remote data collection using mHealth technology, again prioritizing data from LMICs whenever possible. Review articles of mHealth or SMS, and HIV, that included studies from LMICs were also included. As thematic areas to include in the review were identified, PubMed and Google searches were conducted again to identify additional articles with terms for these themes. Finally, we also reviewed relevant articles and guidance documents of which the research team was previously aware.

Results

Overall, 54 references were included in our review: 33 primary research articles, 18 review articles, and 3 guidance documents. The majority of references (n = 47, 87%) included HIV and sexual and reproductive health results or themes; 32 were identified through searches and 22 were previously known to the research team. Major thematic areas included timing of SMS delivery and response (n = 5), behavior change or behavior change theory (n = 4), SMS use for routine activities (n = 1), data collection on sensitive topics (n = 10), response rates (n = 7), phone access and privacy (n = 23), data quality (n = 8), generalizability (n = 2), and ecological momentary assessment (EMA, n = 1).

Overcoming Logistical Constraints to In-person Data Collection

Remote data capture using SMS technology has the potential to overcome many limitations of in-person data collection efforts for research or program monitoring. This approach offers the opportunity to provide real-time monitoring of behaviors and health outcomes, to shorten time intervals between measurements and prevent missing data at scheduled visits, and to more accurately capture dynamic behavioral constructs. One example of real-time monitoring is EMA, which permits repeated reporting of behaviors in the moment and environment in which they occur, and has been used for a variety of HIV-related research questions in the USA [29]. SMS could be used to support EMA, as well as a wider range of experiences that change over longer intervals of time. Two-way SMS can capture these changes in two ways: the SMS system can be programmed to tailor the time interval between successive SMS soliciting responses, and individuals can initiate a report of a change through SMS without being prompted. Thus, the capacity for real-time monitoring reduces recall bias associated with longer reporting intervals common in prospective research studies and between visits for clinical care, and may improve measurement precision of health behaviors and outcomes in research and programs.

Many SMS studies have customized the timing of SMS delivery based on individual preferences [20, 30,31,32], in an effort to increase the likelihood that SMS are received on a day and at a time that is convenient. Most SMS that request a response are answered immediately after receipt [33, 34], and SMS are easier to respond to, and require less time, than a phone call or voicemail. Unlike phone calls, recipients can screen SMS for the effort required to respond. Fluidity in the timing of interaction with SMS facilitates engagement by recipients at times that are convenient, increasing opportunities for participation. Responding to SMS inquiries also provides a privacy advantage over phone calls, as SMS can be answered discreetly while verbal responses to phone inquiries can be overheard and present a different level of vulnerability [35]. SMS are readily accessible for those with mobile phone access, as phones are integrated into daily life for regular communication, including banking and engaging in social media [36].
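As an illustration of the scheduling ideas above (delivery at a participant's preferred time, and a tailored interval between prompts), the following is a minimal sketch rather than a description of any system used in the cited studies. The function names and the rule of halving the interval after a reported change are assumptions made for this example only.

```python
from datetime import datetime, time, timedelta


def next_prompt(last_sent: datetime, interval_days: int, preferred: time) -> datetime:
    """Return the next send time: interval_days after the last prompt,
    at the participant's preferred clock time."""
    target_date = (last_sent + timedelta(days=interval_days)).date()
    return datetime.combine(target_date, preferred)


def adjust_interval(current_days: int, reported_change: bool, minimum_days: int = 1) -> int:
    """Hypothetical tailoring rule: halve the interval after a reported
    change (never below minimum_days), otherwise leave it unchanged."""
    if reported_change:
        return max(minimum_days, current_days // 2)
    return current_days


if __name__ == "__main__":
    last = datetime(2020, 3, 2, 9, 15)
    print(next_prompt(last, 7, time(18, 30)))  # 2020-03-09 18:30:00
    print(adjust_interval(8, reported_change=True))  # 4
```

A real deployment would also need to account for time zones and for the seasonal dips in response noted in some studies.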

SMS systems can be low cost [37], and collecting self-reported survey data by SMS can be less expensive than in-person strategies because fewer personnel are required. They save both patient and provider time otherwise spent on facility visits that do not require clinical care, evaluation, or sample collection. In times when services are disrupted and travel to facilities is unsafe or not possible due to political unrest, natural disasters, health care worker strikes, or most recently the global COVID-19 pandemic, SMS can fill a critical void in monitoring and evaluation efforts, as well as service provision.

Acceptability and Response Rates

SMS can improve generalizability of study findings, and reduce selection bias, by including individuals who are often left out of research because they are unable to access care, live in remote settings, are too sick to visit health care facilities, or lack resources to pay for transportation-related expenses and health care services required to participate in facility-based research or receive follow-up care [38]. Near-universal access to SMS technology may also improve generalizability, with higher external validity compared with data collection using smartphones in populations with low to moderate smartphone penetration [38, 39]. Ascertainment of data that can be self-reported through SMS has potential to increase retention rates if individuals are unable to navigate logistical barriers required for facility-based research, including employment, family duties, or relocation, but are still interested in and able to respond to SMS. For example, reaching marginalized populations at high risk of HIV infection is challenging in HIV prevention research, and maximizing enrollment and retention is critical to accurately measure successes and failures for those involved in prospective studies or programs. In a pre-exposure prophylaxis (PrEP) demonstration project in Kenya and Uganda conducted in 2014 that asked participants to report sexual behavior and PrEP adherence, 95% of participants said SMS were easy to use, 74% had no challenges with SMS, and 72% preferred SMS over in-person study visits [40••]. Growing familiarity and comfort with using this technology make SMS increasingly accessible.

While SMS have been described as generally simple and easy to use [38, 40••], prior studies that have used remote SMS messages and surveys have found variable response rates. In a large study of SMS for ART adherence in Kenya, response rates to two-way message inquiries were as high as 69% [41]. Another Kenyan study that captured fertility data among HIV-serodiscordant couples through a daily, short SMS survey found 78% of women completed all SMS, 7% partially completed, and 16% did not complete any surveys [42•]. They also noted seasonal variation in responses, with lower response rates over the end-of-year holidays [42•]. In Uganda, a community-based survey with multiple topics found an overall response rate of 70%, with the highest response rate for surveys on HIV/sexually transmitted diseases (STDs) (79%), followed by sexual behavior (64%), male medical circumcision (56%), and family planning (53%); notably only 37% completed demographic survey questions [43]. Median response time was 50 min and declined over time, even with SMS reminders with potential for incentives (prizes, phone credit, or free HIV testing) [43]. In contrast, a 2010 study on ART adherence among caregivers of HIV-infected children that used SMS for reporting found that despite high initial interest and participation, weekly completion of SMS surveys was low (range 17–33%); however, confusion around use of a personal identification number (PIN) for security issues was cited as a potential explanation for low completion rates [44]. Similarly, in a large, free, opt-in platform that provides information on family planning by SMS (Mobiles for Reproductive Health, m4RH), only 39.5% of registered users responded to a 5–6-item survey about their knowledge and use of contraception, and only 20.9% completed all survey questions [34]. Approaches to mitigate respondent fatigue may be warranted for studies planning on using SMS data collection, as attrition is likely to occur over time [45].
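The studies above report completion in three groups (all surveys completed, partially completed, none completed). As a minimal sketch of how such proportions could be tabulated from per-participant counts, the following uses hypothetical function names and a made-up input format of (answered, total) pairs; it is not drawn from any of the cited studies.

```python
from collections import Counter


def completion_status(answered: int, total: int) -> str:
    """Classify one participant as 'complete', 'partial', or 'none'."""
    if total <= 0:
        raise ValueError("total must be positive")
    if answered <= 0:
        return "none"
    if answered >= total:
        return "complete"
    return "partial"


def completion_rates(records):
    """records: list of (answered, total) pairs, one per participant.
    Returns the proportion of participants in each completion group."""
    if not records:
        raise ValueError("no records supplied")
    counts = Counter(completion_status(a, t) for a, t in records)
    n = len(records)
    return {group: counts[group] / n for group in ("complete", "partial", "none")}


if __name__ == "__main__":
    # Illustrative data only: four participants, five surveys each.
    print(completion_rates([(5, 5), (3, 5), (0, 5), (5, 5)]))
```

Reporting all three groups, rather than a single response rate, makes it easier to compare studies that define "response" differently.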

Few studies have directly compared performance of different telecommunication platforms for remote data capture in research studies and clinical care. In Peru and Honduras, response rates were compared using interactive voice response (IVR), computer-assisted telephone interview (CATI), and SMS in a population-based representative sample of households in a Gallup Poll commissioned by the World Bank, measuring household characteristics. SMS and IVR had similar response and retention rates (21% vs. 19% in Peru, 40% vs. 38% in Honduras, respectively) but lower rates than CATI (39% Peru, 72% Honduras) [46]. These results concur with generally low initial response rates for population-based sampling approaches (as opposed to generally higher rates for non-population-based approaches) and suggest there was some differential attrition by data collection method. However, assignment to data collection method also varied by country (randomized approach in Peru vs. multiple strategies used among households who initially participated in-person in Honduras) [46].

In some SMS studies, participation rates were high when economic incentives were offered. In the Gallup Poll survey in Peru and Honduras that directly compared attrition by mode of data collection, economic incentives led to moderate reductions in attrition; attrition was reduced by 5–10%, but costs were threefold higher for IVR and fivefold higher for CATI compared with SMS in Peru [46]. While many studies use incentives to encourage participation in SMS surveys, few studies have determined the impact in randomized studies. One study in Kenya on family planning found incentives had no effect on response rates [34].

Challenges with Implementation

Consistency in Phone Access and Privacy

While SMS have several features that make them appealing to incorporate into HIV research and programs for data collection, there are limitations to using this technology which require special considerations in LMICs to optimize utility. Despite high and expanding coverage of mobile phone access, shared phones and having multiple phones and/or SIM cards are common in some settings, creating barriers to using SMS for remote data capture. Phones may be shared within a household among couples or with other family members, and access to phones may be variable throughout the day or week [47]. Sharing phones has implications for privacy, as individuals who share phones may need to take extra precautions to delete incoming and outgoing SMS that contain sensitive information [48]. This includes data related to their HIV status, use of medication, and participation in research. Women are more likely to share phones with their partner and have more inconsistent access to phones than men. In addition, use of shared phones by health care workers has also led to problems. For example, in Zimbabwe, HIV viral load results were shared with facilities by SMS, but instructions were not provided to all health care workers who shared the phone, creating confusion about what to do with information in the SMS [15, 49]. Other studies have reported that health care workers in LMICs use their mobile devices for informal activities both related and unrelated to their jobs [50]. This additional use may increase risk of breaches in confidentiality, as well as opportunities for devices to be lost, broken, or stolen.

Several strategies have been implemented to minimize potential breaches in confidentiality and maintain privacy. SMS participants can “opt-in” to receive more detailed messages, such as overt messaging about HIV and use of medication [51]. They can also be advised to delete any messages received, or sent, that contain sensitive information. PINs or passwords to unlock SMS or screens have also been used to maintain security of sensitive information [52], but may create a barrier to participation [44].

Participation in data collection efforts using SMS can also be hampered by electric power disruptions and damaged or broken phones [53, 54]. In Burkina Faso, 65% of pregnant women living with HIV in a study that used SMS for appointment reminders had phones that were damaged and needed to be replaced, some due to extreme weather conditions [49]. While phones were provided to participants in many early SMS studies, it is increasingly less common to do so as frequency of phone ownership has increased [35] and due to concerns that phone provision may not increase participation if phones are sold or used for other purposes [55]. As a result, phone provision has not been considered a scalable or sustainable approach to digital health programs [39, 56]. However, in areas where mobile phones and SMS literacy are not ubiquitous, SMS engagement may be sub-optimal [57] and training can result in substantial start-up costs and delays [49]. In some areas, phone credit is required and creates a barrier to sending and responding to SMS [48, 58]. Reverse-billed short codes are a common approach to mitigate requirements for phone credit [14, 59,60,61,62], as is a “flashback” where the SMS participant calls the sender to trigger a call back to avoid costs incurred by participants [63]; both alleviate participant concerns about cost of responding to SMS. Inconsistent power supplies, limited network range and strength, and network outages have also been previously reported [49, 54, 64, 65]. SMS systems operated through a web-interface may also experience outages when internet service is disrupted [48, 54, 58]. The type, frequency, and duration of these issues in individual communities or countries should be considered during planning stages.

Errors Due to Phone Quality and Language

The quality of data collected through SMS is related to the type of data collected. Remote data capture using SMS can collect free-form, open response text or coded alphanumeric responses. Similar to paper or other electronic surveys, free-form text offers the most flexibility, including descriptive narrative responses within the SMS character limit, but is the most difficult data to aggregate and analyze. In contrast, coded responses can be used to select discrete answers, including dichotomous yes/no responses, but require additional questions to capture multiple answers. Discrete data capture is significantly easier to clean and analyze, and may also incur lower labor costs for data cleaning and analysis activities compared with free-form text responses. It also provides an opportunity to develop skip patterns based on prior responses. This smart logic response approach, with limited, coded numerical responses for SMS inquiries, has been tested in several studies [53, 66,67,68]. Deploying SMS on a platform with “longitudinal memory,” where follow-up SMS inquiries sent minutes, days, or weeks after a response can resume a skip pattern from a prior SMS survey, provides even more flexibility. While fewer keystrokes with coded responses provide fewer opportunities for data entry error, coded responses are less forgiving, and errors are more difficult to detect than in free-form text.
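To make the idea of coded responses with skip patterns and "longitudinal memory" concrete, the following is a minimal sketch under stated assumptions. The survey content, question identifiers, and the `SurveySession` class are hypothetical and written for this example; they do not represent the platforms used in the cited studies.

```python
# Hypothetical survey: question id -> (text, {coded reply: next question id or None}).
# A next id of None marks the end of the survey.
SURVEY = {
    "q1": ("Did you miss any doses in the past 7 days? 1=Yes 2=No", {"1": "q2", "2": "q3"}),
    "q2": ("How many doses did you miss? 1=One 2=Two or more", {"1": "q3", "2": "q3"}),
    "q3": ("Have you had any side effects? 1=Yes 2=No", {"1": None, "2": None}),
}


class SurveySession:
    """Keeps one participant's position in the survey between messages,
    so a reply arriving minutes or days later resumes the skip pattern."""

    def __init__(self, start="q1"):
        self.current = start  # id of the question awaiting a reply
        self.answers = {}     # question id -> coded reply

    def prompt(self):
        """Text of the question awaiting a reply, or None if finished."""
        return SURVEY[self.current][0] if self.current else None

    def reply(self, text):
        """Record a coded reply and advance. Returns False (and stays on
        the same question) if the survey is finished or the code is invalid."""
        if self.current is None:
            return False
        code = text.strip()
        routes = SURVEY[self.current][1]
        if code not in routes:
            return False
        self.answers[self.current] = code
        self.current = routes[code]
        return True


if __name__ == "__main__":
    session = SurveySession()
    print(session.prompt())
    session.reply("2")       # "No" skips the follow-up question q2
    print(session.prompt())  # the q3 text
```

In practice the session state would be stored per phone number in a database rather than in memory, which is what allows a survey to resume after long gaps.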

As a result, the mobile phone interface, keyboard type, screen size, and ability for respondents to accurately enter responses may also reduce quality and accuracy of data collection. If coded responses are erroneously entered and submitted by SMS, it is not possible to recall the SMS and resubmit, or otherwise “undo” an entry. Use of a second, confirmatory SMS inquiry for coded, one-character responses (rather than free-form text) is one strategy that has been used to reduce data entry errors and confirm critical information captured through SMS. However, this approach should be balanced with attention to respondent fatigue during the survey, to maintain engagement by the end user. Two-way systems have also been set up to accept multiple potential responses [69].
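The two strategies described above, accepting multiple potential responses and sending a confirmatory inquiry for critical coded answers, can be sketched as follows. This is an illustrative sketch only: the accepted reply sets (including the Swahili words "ndiyo" and "hapana") and the function names are assumptions chosen for the example.

```python
# Assumed sets of replies treated as equivalent; a real study would define
# these with local language experts.
YES = {"1", "y", "yes", "ndiyo"}
NO = {"2", "n", "no", "hapana"}


def normalize(reply):
    """Map a raw SMS reply to 'yes', 'no', or None if unrecognized."""
    token = reply.strip().lower().rstrip(".!")
    if token in YES:
        return "yes"
    if token in NO:
        return "no"
    return None


def confirmation_prompt(answer):
    """Text of the second, confirmatory SMS for a normalized answer."""
    return f"You answered {answer.upper()}. Reply 1 to confirm or 2 to change."


def confirmed(first_reply, confirm_reply):
    """Return the normalized answer only if the confirmatory reply is a yes;
    otherwise return None so the question can be asked again."""
    answer = normalize(first_reply)
    if answer is None:
        return None
    return answer if normalize(confirm_reply) == "yes" else None


if __name__ == "__main__":
    print(normalize(" Yes "))             # yes
    print(confirmation_prompt("no"))
    print(confirmed("N", "1"))            # no
```

Because each confirmation adds a message, it is probably best reserved for the few responses where an entry error would matter most, in keeping with the concern about respondent fatigue.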

In addition, in countries with multiple languages, literacy may vary by language, and considerations for handling multiple language responses are necessary [54]. Unique considerations and accommodations may be necessary if non-Latin alphabets on phones are used. If multiple languages are common, SMS can be offered in multiple language tracks if investigators or programs are prepared to interpret responses that may be in multiple languages or a hybrid of languages [54], such as “Sheng”—a combination of Swahili and English; this may include interpretation manually or by computer software. It is worth noting that while relatively advanced natural language processing software tools exist to interpret languages such as English, far fewer tools have been developed for the indigenous languages spoken in sub-Saharan Africa where the HIV epidemic is most concentrated [70]. Alternatively, SMS can be limited to only support specific languages. This approach may be warranted if the additional languages contribute relatively little data, generalizability is similar with and without the additional language, or due to logistical constraints to support multiple languages. These decisions can have implications for the 160-character count limit for SMS [37]. SMS length will vary by language, and while an initial SMS may be within the character limit, translated versions into other languages may exceed the limit, necessitating two SMS to accommodate the longer translated version.
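A simple pre-deployment check can flag translated messages that exceed the character limit. The sketch below assumes a flat 160-character limit per SMS, as described above; the function names and the placeholder texts are hypothetical. In practice, multi-part messages and non-Latin scripts generally carry fewer characters per part, so the count returned here should be treated as a lower bound.

```python
import math

SMS_LIMIT = 160  # per-message character limit assumed in this sketch


def messages_needed(text, limit=SMS_LIMIT):
    """Number of SMS required to send text under a flat per-message limit."""
    return max(1, math.ceil(len(text) / limit))


def check_translations(versions, limit=SMS_LIMIT):
    """versions: {language label: message text}.
    Returns {language label: number of SMS needed}."""
    return {lang: messages_needed(text, limit) for lang, text in versions.items()}


if __name__ == "__main__":
    # Placeholder strings stand in for real translations of one question.
    versions = {"en": "a" * 120, "sw": "b" * 175}
    print(check_translations(versions))  # {'en': 1, 'sw': 2}
```

Running such a check on every language track before launch would reveal which questions need to be shortened or split.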

SMS for Data Collection on Sensitive Topics

The validity and reliability of self-reported responses on sensitive topics, including sexual behavior, is the subject of much debate, with many hypothesizing that social desirability bias leads to underreporting of behaviors in face-to-face interviews [71,72,73]. To overcome underreporting and discrepancies between behavioral risk factors and HIV/sexually transmitted infection (STI) acquisition in LMICs, a variety of remote, self-administered, or computer-assisted surveying modes (IVR, CATI, self-administered surveys, and web surveys) [74] have been used, yet SMS surveys as a specific modality have rarely been used to collect this type of data. A 2010 systematic review of alternative interview modes for capturing information on sensitive HIV risk behaviors failed to identify any published studies comparing SMS data collection to in-person surveys [75]. It is plausible that remote SMS surveys may be more acceptable and result in higher reporting of a variety of sensitive behaviors, such as sexual activity and adherence to novel biomedical HIV prevention interventions, compared to in-person surveys, similar to findings from other computer-assisted surveys [71, 73, 75, 76]. SMS technology may also be preferable over many other modalities that perpetuate the digital divide by preferentially sampling participants with internet access, while excluding those who do not.

Studies to date suggest SMS surveys are both feasible and acceptable as a means for collecting information on sensitive HIV-related exposures and outcomes remotely. In a study assessing a mobile intervention to support safe conception among Kenyan serodiscordant couples, nearly 80% of all daily SMS surveys on fertility indicators (such as menses) and sexual activity were completed [42•]. Participants found the SMS format easy to use, and few expressed concerns about confidentiality in qualitative interviews [42•]. Another study of adherence to HIV PrEP in Kenya found that participants were willing to report sensitive behaviors via an SMS survey, with high completion rates (median 2 of 60 unanswered daily surveys); however, unanswered surveys were more common at the end of the study [52]. Nearly 50% reported unprotected sex and 70% reported at least one missed dose of PrEP [52]. Similarly, in a PrEP implementation study of Kenyan women, use of SMS to improve ascertainment of male partner HIV self-testing outcomes led to substantial increases in completeness of information [77]. In addition, women were more likely to report harm or negative partner reactions as a result of the HIV self-test by SMS compared with in-person reports, which may indicate increased willingness to report sensitive information remotely. Together, these data suggest SMS are an acceptable format to obtain sensitive HIV-related health behaviors and outcomes, and may bolster participation in research; however, evaluations directly comparing data collection modalities in the same study, population, and setting in LMICs are still needed.

Incorporating Health Behavior Theory

A prior review of health behavior theory and its application to eHealth HIV interventions and research advocated for theory to be incorporated in the design [78], and these findings are relevant for SMS approaches for data collection. While SMS may be used solely as an approach to capture data, it can simultaneously serve as an intervention designed to elicit behavior change. Elements of SMS interventions that have previously been shown to be successful for behavioral change outcomes, such as interactivity, optimized frequency, and personalization [32, 37, 79,80,81], will also be important for high engagement, low attrition, and successful data collection efforts.

Recommendations

Based on our review, we have summarized key considerations in selecting and designing an SMS system for data collection in LMICs (Table 1), and identified two broad areas that have high potential to benefit from SMS as an approach to collect HIV-related data:

  1. Surveillance. SMS could be a powerful tool to monitor HIV disease burden, health behaviors, or health outcomes. Surveillance efforts to conduct in-person assessments and manage data are labor intensive, and likely more costly than SMS. In addition, the ability to tailor frequency of monitoring to suit surveillance needs, and avoid limitations of collecting data only among individuals who seek and remain in care, makes SMS a good candidate for surveillance.

  2. Monitoring self-reported outcomes. SMS promote ascertainment of self-reported outcomes for research or monitoring and evaluation activities for those who are doing well, capturing data on individuals who access in-person care less frequently than patients who require medical evaluations. This strategy could be important in supporting differentiated care models with longer intervals between visits to the health care facility for re-dosing of ART or other medications, or to check on outcomes that can be self-reported (such as ART adherence or side effects). It can also be used to more accurately measure ART or PrEP adherence, retention in care, and mortality, potentially capturing individuals transferring care to new facilities who may otherwise be misclassified as lost to follow-up. Finally, it could be used to support tele-triage, guiding clinical decision-making about the necessity of seeking care.

Table 1 Considerations for SMS-based data collection in low- and middle-income countries

Conclusions

In conclusion, SMS have many appealing features that make them well suited to serve as a tool to collect data for HIV research and monitoring and evaluation activities, but are currently underutilized in LMICs. SMS can be successfully deployed, but logistical aspects of implementation should be carefully considered during the design stage to avoid common pitfalls, including assessments of who may be left out of an SMS-based approach. Researchers and programs should consider using SMS as a strategy to improve quality of HIV-related data, including conducting robust evaluations of the benefits and risks of this approach.