1 Background

Disease surveillance is essential for adequate detection of and response to disease outbreaks. World Health Organization (WHO) member states adopted the International Health Regulations in 2005 to help countries prevent, detect, and respond to acute public health risks that have the potential to cross borders and threaten people worldwide [1]. To strengthen national public health surveillance and response in Africa, WHO member states are implementing the Integrated Disease Surveillance and Response (IDSR) strategy for systems at the community, health facility, district and national level [2].

While progress has been made, there is still a substantial need to strengthen surveillance and response systems in Africa, especially in low-resource settings and underserved regions [3]. In particular, community-based surveillance and event-based surveillance need to be scaled up.

Covid-19 has demonstrated the importance of this. In low-resource settings in Africa, the impact of covid-19 was felt especially through the social and economic consequences of public health measures, which put enormous strain on health systems and vulnerable populations [4]. Strengthening community-based surveillance is needed to enable a more regional approach to public health measures and to reduce their social and economic impact [5]. For example, recording data about social distancing makes it possible to take local action to improve social distancing behavior or to respond to a predicted outbreak of covid-19 in a specific region.

Traditional disease surveillance uses data about specific pathogens or diseases in a target population [6]. In remote underserved communities in Africa, traditional surveillance is complicated by problems such as lack of training for staff collecting data, lack of standardization, slow paper-based data collection procedures, lack of incentive for local health staff to provide data, and gaps in data because data are collected only from the public sector.

To overcome some of these problems, syndromic and event-based surveillance are used to augment traditional approaches, especially in community-based surveillance [7, 8]. Event-based surveillance uses information from diverse internet sources to detect potential health hazards from reports and rumours [7]. Syndromic surveillance is based on non-specific health indicators, including clinical signs, symptoms, and proxy measures, which are usually collected for purposes other than surveillance and, where possible, are generated automatically to allow real-time (or near real-time) collection [8, 9].

The advantage of syndromic (symptom-based) surveillance (e.g. of fever, cough, or diarrhea) is that case definitions can be clearly defined and interpreted by those with limited training. Over time, such data can provide an estimate of the ongoing burden of disease and, against a background of good local data, can be used to detect high-risk outbreaks early [8]. This is done routinely in high-income countries for syndromes like influenza-like illness and diarrheal infections without the need for sophisticated diagnostics [6, 9]. The other type of data that can be collected through syndromic data collection approaches is behavior data, which can be used to track adherence to public health measures [5].

A great opportunity for strengthening community-based surveillance is to build on the existing network of community health workers (CHWs) in Africa [4]. These CHWs are often respected community members who volunteer to provide healthcare services to their community, including the promotion of healthy behavior, referral to the primary healthcare facility, and curative services. CHWs therefore have great potential value in both the detection of and the response to disease outbreaks at the community level.

Mobile phones are increasingly used to improve CHW performance in low- and middle-income countries [10], for example for data collection and reporting, patient-to-provider communication, patient education, CHW supervision, and monitoring and evaluation. Challenges that need to be addressed when implementing such solutions include a lack of CHW training on new mobile phone solutions, weak technical support, and poor internet connectivity.

Our research question is: Can CHWs supported by a mobile phone application effectively perform community-based syndromic disease surveillance in low-resource settings and generate pertinent symptom-based and behavioral data?

2 Methods

To investigate our research question, we made use of an existing dataset of digitally collected symptom and behavior data that was generated during CHW home visits in rural Kenya. We analyzed this data as a proof of principle that the symptoms and behavior CHWs observe can be used as a community-based health surveillance tool, and we demonstrate the relevance of the data by comparing it with other data sources.

The data was collected by AMREF Health Africa, the largest African healthcare organization, which works with a network of CHWs (referred to as community health volunteers by AMREF) who provide door-to-door community health services in underserved communities as part of the primary healthcare system. We analyzed an existing dataset that AMREF collected during outreach campaigns in which households were signed up for healthcare insurance. These campaigns did not specifically collect data for disease surveillance, but when households were signed up, a number of questions about the health of the household members were recorded. We searched the database for data relevant to either syndromic (symptom-based) surveillance or health behavior surveillance.

Data was collected using the AMREF M-Jali mobile phone app, which incorporates a mobile application for capturing data at the household level and transmitting it to a web-based database. Data collection is undertaken by the CHWs in even the most remote areas of Africa.

For this data analysis we only used anonymized information, including date of household visit, county location, and yes/no answers to the specific questions that were asked during household visits.

In total, we analyzed 1.6 million data points, with details shown in Table 1. We did not use any inclusion or exclusion criteria, and therefore the data broadly represents the target population of the healthcare insurance project that generated the data: low-income households in vulnerable remote communities in rural Kenya. The data is therefore also representative of the type of data that a CHW could generate without changing their daily activities. This is important in view of our research question, because CHW-based community health surveillance would be much less feasible if the CHWs needed to change their routines for the purpose of data collection. Table 1 shows that the data is diverse and includes a representative fraction of females and children.

Table 1 Data characteristics

Most of the AMREF data points were collected in the period January 2021–April 2021, which aligns with the 3rd covid-19 peak in Kenya. For the counties Kakamega and Meru, data was available starting in October 2020, which means that for these counties we also have data during the 2nd covid-19 peak in Kenya. Supplemental Figure S1 shows the number of observations per day for each county. Typically each household was visited only once during the data collection period, but because the data is anonymized we cannot confirm whether some households were visited more than once. Data was transmitted to a central database using the AMREF M-Jali mobile phone app.

In our analysis, we grouped data per date and county, and we calculated the proportion of people answering yes/no to specific questions. We required a minimum of 100 data points per day and excluded all days below this threshold. We did not disaggregate the data by gender or age.
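The grouping and thresholding described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the study: the record layout (`date`, `county`, `answer`), the function name `daily_proportions`, and the sample values are assumptions made for the example. The sketch applies the 100-data-point threshold per county and day, which is one possible reading of the rule.

```python
from collections import defaultdict

MIN_POINTS_PER_DAY = 100  # minimum number of data points stated in the text


def daily_proportions(records, min_points=MIN_POINTS_PER_DAY):
    """Return {(date, county): proportion of "yes" answers}.

    `records` is an iterable of (date, county, answer) tuples, where
    `answer` is True for "yes" and False for "no" (hypothetical layout).
    Groups with fewer than `min_points` answers are dropped.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [yes_count, total_count]
    for date, county, answer in records:
        group = counts[(date, county)]
        if answer:
            group[0] += 1
        group[1] += 1

    return {
        key: yes / total
        for key, (yes, total) in counts.items()
        if total >= min_points
    }


# Made-up example: 120 answers on one day (kept), 40 on the next (dropped).
sample = (
    [("2021-02-01", "Meru", True)] * 30
    + [("2021-02-01", "Meru", False)] * 90
    + [("2021-02-02", "Meru", True)] * 40
)
result = daily_proportions(sample)
```

In this made-up sample only the first day passes the threshold, so `result` holds a single entry for Meru on that day. Dropping sparse days avoids plotting proportions that are dominated by a handful of household visits.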

3 Results

We identified one question in the database that was relevant for syndromic surveillance (“Has the household member had a cough for two or more weeks”), and two questions that were relevant for health behavior surveillance (“Does the household member have a mosquito net?”, and “Does the household have handwashing facilities?”).

The detailed results are shown in supplemental figures S2–S6, which show the proportion of reported cough, use of long-lasting insecticidal nets (LLINs), and availability of household handwashing facilities for each county. We further explored these proportions separately, comparing them with other data sources to demonstrate their relevance.

3.1 Proportion cough versus covid-19 cases

A well-known symptom of covid-19 is cough, which means that a high incidence of covid-19 in a population will result in a higher proportion of the population reporting cough [5]. This was confirmed in a study by van Dijk et al. [5] in the Netherlands, who showed that the proportion of self-reported cough symptoms in a population increases during a covid-19 peak. An increased proportion of cough in a population does not necessarily mean a high covid-19 incidence, but we use this symptom in a different way: the absence of an increased proportion of cough implies there cannot be a high covid-19 incidence in the population.

We compared our results with the national covid-19 case data as reported by the Kenya Ministry of Health [11]. The ministry also provides covid-19 case reports per county, and we tried to use these to find county-specific reference data. Unfortunately, the county data was not complete for our period of interest, and in some counties the reported case numbers were very low, which may indicate that only limited testing was done. Therefore, we only make a comparison with the national covid-19 data.

Our data represents the most remote and vulnerable populations, which were visited by the AMREF CHWs. National covid-19 case reports, however, are dominated by cases from a very different population: urban residents, especially in Nairobi. Figure 1 shows that there is very poor alignment between the reported cough symptoms in these remote populations and the nationally reported covid-19 cases. For example, the lowest proportion of reported cough in Kakamega coincides with a national covid-19 peak. In Meru, reported coughing aligned with national covid-19 cases during November 2020, but there was no increase in reported coughing during the national covid-19 peak in April.

Fig. 1

Locally reported proportion of cough (blue dots) compared with national covid-19 cases (gray dashed line) per day. Top: Kakamega. Bottom: Meru

This result leads to two important observations. First, the CHW data was able to track an increase and a decrease in the reported proportion of cough. This demonstrates, as a proof of principle, that tracking cough symptoms through CHW home visits is possible. Second, the cough data collected by CHWs suggests that national covid-19 case data was not indicative of covid-19 in remote populations.
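The comparison in Fig. 1 is a visual one. As a hedged illustration only, and not an analysis performed in this study, the sketch below shows one way such alignment could be quantified: a Pearson correlation between the daily proportion of cough and national case counts on the dates both series share. The function name `pearson_on_shared_dates` and all values are hypothetical. A simple same-day correlation also ignores any delay between infection, symptoms, and case reporting, so it would be a rough indicator at best.

```python
import math


def pearson_on_shared_dates(series_a, series_b):
    """Pearson correlation of two {date: value} series on their common dates.

    Returns None when fewer than two dates overlap or when either series
    is constant on the overlap (the correlation is then undefined).
    """
    dates = sorted(set(series_a) & set(series_b))
    if len(dates) < 2:
        return None
    xs = [series_a[d] for d in dates]
    ys = [series_b[d] for d in dates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / math.sqrt(var_x * var_y)


# Made-up values: cough proportion falls while national cases rise.
cough = {"2021-03-01": 0.10, "2021-03-02": 0.08, "2021-03-03": 0.06}
cases = {
    "2021-03-01": 500,
    "2021-03-02": 700,
    "2021-03-03": 900,
    "2021-03-04": 950,  # no matching cough value, so this date is ignored
}
r = pearson_on_shared_dates(cough, cases)
```

With these invented numbers the two series move in opposite directions, so `r` is strongly negative. On real data, a value near zero or below would be consistent with the poor alignment visible in the figure.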

3.2 Proportion of LLIN use and household handwashing

Figure 2 shows the reported use of LLINs for Migori and Meru. Reported LLIN use in Migori was up to 90% around December. This agrees with the level of net ownership in Migori County, where close to 90% of households have at least one insecticide-treated net [12]. However, in the first months of 2021, reported net use dropped to values as low as 20% in February. This may be due to seasonality in mosquito abundance [13], which means that people have less need for a mosquito net in these months.

Fig. 2

Locally reported proportion of LLIN use and availability of handwashing facilities. Top: Migori. Bottom: Meru

The reported use of LLINs in Meru follows a similar pattern. During October–December the use of bed nets is high, which fits with the humid climate during these months. In January–April it becomes colder, resulting in fewer mosquitoes and less need for bed nets. Supplemental figures S1–S5 show the results for each county analyzed in this study.

The reported availability of household handwashing facilities was generally stable during the study period (see Fig. 2 and supplemental figures S2–S6), at around 80% for all counties except Kilifi, where 40% reported having household handwashing facilities. These reported values are high compared to the average value for Kenya based on demographic health survey data, which is 50% according to Endalew et al. [14]. This could indicate that the question asked by the CHWs in our study was not specific enough about whether the facilities include water and soap. Household handwashing facilities are expected to be generally stable over time, but we noticed some fluctuations in the reported handwashing facilities in Kakamega and Migori, especially during the national covid-19 peak in March–April 2021.

From these results we observe that the CHW data successfully tracked health-related behavior.

4 Discussion

We have demonstrated that CHWs can collect syndromic and behavior data that can be used for community-based syndromic disease surveillance in low-resource settings. We showed proof-of-principle with CHWs supported by a mobile phone application, who generated relevant symptom-based and behavior data. This kind of information is critical to governments for effective and efficient allocation of health interventions and resources. For example, knowledge of local covid-19 cases could have enabled the government to apply more regional covid-19 restrictions.

We also found that CHWs were able to track changes in the use of mosquito nets during the year, and they were able to identify a relatively low availability of household handwashing facilities in Kilifi compared to other counties. These are demonstrations of health behavior surveillance that can inform government policies and interventions.

Our results confirm that syndromic (symptom-based) and behavior-based surveillance can be performed by those with limited training [8], in this case by CHWs supported by a mobile phone application. Over time, such surveillance can provide an estimate of the ongoing burden of disease and, against a background of good local data, can be used to detect high-risk outbreaks early [8].

The fact that national covid-19 case data does not align with the cough symptoms reported by CHWs in remote locations suggests that the data generated by CHWs was better suited to covid-19 surveillance among these remote populations, and that without this type of surveillance early detection of outbreaks in these communities is not possible. This demonstrates that syndromic and behavior data collected by CHWs can help address the urgent need to strengthen disease surveillance in the most underserved regions [3].

A key question is how to finance and scale up the innovative approach that we have demonstrated here. The data that was used in this study was generated as an extra benefit for a temporary project, but for effective public health surveillance there needs to be financing for continuous data collection and analysis. This also relates to the question of how to incentivize the CHWs to participate in data collection.

5 Conclusions

Strengthening community-based syndromic and behavior surveillance through CHWs is a promising opportunity to strengthen national public health surveillance and response in Africa, and should be included in the Integrated Disease Surveillance and Response (IDSR) strategy. According to the WHO global technical meeting in 2018 [15], the highest priority activity to support community-based surveillance is to develop and compile case studies. Our current pilot study responds to this, but the approach should be implemented at scale to learn further lessons about its implementation and operation.

What makes our pilot different from other community disease surveillance programs is that we specifically tracked changes in the proportion of the population that reports specific symptoms or specific health behaviors. This is different from community surveillance programs that use symptom-based approaches for identifying and referring patients for treatment [16].

In this study only coughing, LLIN use and availability of household handwashing facilities were tracked, which were useful for showing proof-of-principle. By tracking other variables, it should be possible to extend the syndromic and behavior surveillance to become relevant for other health conditions. Future research could study which variables would be most relevant for a community-based surveillance system.