1 Introduction

Advances in information and communication technologies (ICTs) have transformed the way people create, retrieve, and update information. Despite the digital revolution during the “Information Age,” much of the world’s population was not directly benefiting from ICTs in the 2000s. At the Millennium Summit in 2000, the United Nations established the Millennium Development Goals (MDGs) (Jensen 2010) to combat global issues such as poverty, disease, environmental degradation, and illiteracy. Many organizations, ranging from non-profits to governments, began collaborating on a variety of projects that aimed to help achieve the MDGs. The success of data-driven decision-making during the “Information Age” led development agencies and donors to advocate for using evidence-based development to inform future interventions. However, the scarcity of digital technologies designed to operate in environments with limited infrastructure slowed international development organizations’ progress in gathering timely data for evidence-based decision-making. Additionally, the unavailability of suitable software tools for mobile data management was a limiting factor to the types of services organizations could provide in low-resource contexts.

Even though modern computing devices were transforming data collection workflows, many global development organizations continued to use paper solutions because mobile digital solutions were designed with assumptions that were problematic for remote locations. Examples of problematic design assumptions include available Internet connectivity, available resources, and the existence of infrastructure. The expansion of cellular telephone infrastructure, combined with the declining cost of mobile devices, presented organizations with an opportunity to leverage mobile devices to replace paper data collection. In 2008, the Open Data Kit (ODK) project was established to empower global development organizations working in resource-constrained contexts to build information services. ODK provides a suite of scalable data collection tools by leveraging commercially available mobile devices and cloud platforms. Using mobile devices to digitize data in the field has been shown to decrease latency, improve data accuracy, and enable richer data types to be collected than with paper, such as pictures, video, or GPS traces.

2 Development Challenge and Initial Design Decisions

A challenge for many global development organizations was finding “appropriate” technology that could operate effectively in a variety of locations where vulnerable populations require assistance. Mobile devices are among the few technologies that can operate in most locations because of their lower power requirements and multiple networking options to connect to the Internet. Even in places with abundant connectivity, infrastructure can be damaged or destroyed during natural disasters, thus demonstrating the need for technology to function in disconnected environments. The lack of suitable software tools to create mobile information services in resource-constrained contexts disrupted global humanitarian organizations’ efforts to leverage ICTs to improve services provided to beneficiaries. The ODK project started with an investigation into why information services technology was not being leveraged in international development. The problem was that many information technologies were built for high-resource locations, leading to flawed design assumptions about connectivity, available resources, technical expertise, and infrastructure constraints. An “appropriate” technology provides the necessary functionality to meet organizational requirements and is capable of operating within the constraints of the deployment context.

For technology to assist international development organizations, the technology should be usable by minimally-trained users, should be configurable by people with average computer skills (e.g., not software developers), and should operate robustly despite intermittent power and connectivity. Dr. Gaetano Borriello, the Open Data Kit project’s visionary leader, wanted to design ICTs that focused on magnifying human resources by leveraging technology to address global development problems. Dr. Borriello and his team at the University of Washington’s Department of Computer Science & Engineering (UW-CSE) perceived an opportunity to accomplish this using Google’s open-source Android operating system (OS) for mobile devices. The challenge was to create a reusable data collection system that was adaptable to any context so that international development organizations could use the technology wherever they operated. The three design decisions that shaped the ODK project were: (1) perform co-design with field organizations to test assumptions, (2) follow technology trends to leverage market forces and avoid early obsolescence, and (3) use modular design with open source and standards to encourage community.

To ensure that ODK would be an “appropriate” technology for global development, ODK was designed and implemented using an iterative design process. Researchers gathered feedback from stakeholders at multiple stages for validation and refinement. The iterative process began by listening to organizations about the problems they experienced when attempting to use technology in their deployment context. After identifying the technology issues, researchers would build a prototype and then travel to the deployment context to perform co-design with the field partners. While in situ, the researchers would work side by side with field workers to make iterative changes. These co-design stages would often last for weeks, with multiple iterative changes tested in situ.

The ODK project leveraged technology trends to avoid premature obsolescence and take advantage of global market forces. Dr. Borriello believed that by using “appropriate” cutting-edge technology to solve global development problems, it would be possible to leverage global technology trends to drive innovation and keep costs low. Therefore, the ODK project focused on building a data collection platform using commercially available mobile devices and cloud services to simplify an organization’s ability to scale its information systems. In 2008, Dr. Borriello decided to ignore many global development experts’ advice and chose Android devices as an “appropriate” cutting-edge technology for low-resource contexts. He identified a logical inconsistency in the reasoning of global development experts, who had a relentless focus on “low-cost” solutions without examining how technology becomes a “low-cost” solution. The ODK project took a different approach and chose Android because it was a new cutting-edge, open-source software platform that was encouraging various companies to manufacture a variety of mobile devices that would run the Android OS. The goal was that companies would design different device models with different form factors and prices for different local markets. The hypothesis was that global market forces would be applied to solve problems for international development organizations as companies sought to adapt Android devices to various local markets. The companies would use their resources to localize the technology with appropriate language and price points for communities. Different companies created multiple form factors that targeted different use cases, including high-cost devices that incorporated more processing power, improved cameras, and more accurate GPS systems. The variety of Android device form factors enabled global development organizations to find appropriate hardware for various interventions, contexts, and requirements. Leveraging “appropriate technology” to enable economies of scale from the global market was one of the unconventional decisions that contributed to the ODK project’s success.

Dr. Borriello believed the diversity of requirements from the variety of global development organizations necessitated a modular design to encourage a community of practice to form. The ODK project’s goal was to build an open-source community that would enable organizations to contribute both their experiences and software code to expand and refine the project. With many different world contexts, having an open and modular design was key to the ODK project’s goal of engaging people to contribute their ideas. Therefore, to encourage the creation of multiple modules, the ODK project uses a permissive, open-source software license. The modular design and open-interface standards promote the creation of modules that either complement or compete with the existing modules. Additionally, by being open and modular, the barriers to contribution are lower as developers can reuse most of the ODK frameworks’ functionality, thus saving time and resources. The ODK project uses a free and open-source software (FOSS) model to allow organizations with limited resources to leverage the project’s software with minimal or no cost. In 2008, the free and open reuse model focused on a community-building philosophy was a different operational approach for many global development organizations. Many organizations were using custom software built by companies, and when the intervention’s funding was depleted, the software was often abandoned. In contrast, the ODK project focused on creating a community that would build reusable software for many different subject domains that organizations could leverage without purchasing the software.

3 Implementation Context

Organizations working in developing regions rely on data to determine what projects to implement and to evaluate their program’s effectiveness. For a long time, the most common data collection approach was to send workers into the field to collect data on paper forms. While paper is almost universally available, inexpensive, and requires very little training for use, it comes with several drawbacks. The first drawback is that paper forms suffer from a lag between the time the data is collected and the time the data is actionable by an organization. With field workers often working in remote regions, the time between when the data is collected and when it is delivered to a centralized location to be processed can be months. Paper records can be lost or destroyed during transport, costing an organization months of progress. Once gathered, more time is needed to digitize the data so that it can be analyzed. Another issue is that paper forms can be error-prone since user input cannot be constrained. Data entry clerks must often interpret handwriting or determine the meaning and can introduce their own typographic errors. Lastly, data in paper format is hard to query, retrieve, or aggregate if not carefully organized.

As portable digital devices became more available, organizations seized the opportunity to digitize many of these data collection workflows from end to end. These digital solutions allowed organizations to address many of the paper issues. Digitally captured data is more accurate due to standardized inputs and input constraints. Digital information can be instantly transferred, copied, or backed up wirelessly. Additionally, digital data collection enables a richer set of data types to be collected than what is possible with paper. For example, a smartphone allows the collection of digital data such as pictures, video recordings, or GPS traces. Finally, it is much easier to categorize, sort, and aggregate digital data to spot trends, outliers, fake data, and problems that need immediate attention.

3.1 History

The first mobile data collection solutions were implemented on Personal Digital Assistants (PDAs) such as the Compaq iPAQ or Palm OS devices. These devices were around the same size as a modern smartphone but had minimal computational power and storage, and most lacked wireless communication. Many of the initial systems developed by organizations targeted very specific use cases such as collecting data about malaria patients, providing decision support, or supporting paramedics working in remote regions. Eventually, platforms emerged that could be used for more general data collection, such as EpiHandy and Pendragon Forms. However, many of these initial systems implemented proprietary data protocols to communicate between mobile devices and data storage servers, which resulted in tightly coupled, siloed systems. Updating seemingly small items like the data type of a field in a form often required hiring software developers to make changes to the code. As PDAs started to become obsolete, this monolithic architectural approach caused entire systems to be scrapped. Organizations found it was not worth the cost or effort to implement new data collection clients from scratch on new platforms in order to be compatible with the existing data storage components of deployed systems.

As mobile phones started to proliferate, new opportunities presented themselves for using wireless connectivity like SMS to send data. The first generation of these “dumb” phones did not allow external applications, so solutions were limited to call-and-response SMS solutions or sending of one-way unstructured data. The second generation of phones, dubbed “feature phones,” provided limited application development capabilities. One of the first data collection applications for feature phones, JavaRosa, allowed organizations to design forms using the XForms specification, eliminating the need to hire programmers to make updates to forms.

Noting the success of the iPhone, Dr. Borriello and his team at UW-CSE saw an opportunity with the pending release of Android, an open-source operating system for smartphones. The goal was to try to address many of the shortcomings noted in previous mobile data collection systems. In late 2008, coinciding with the initial release of Android, the idea for the ODK project was created in collaboration with UW-CSE, the Android team at Google, and Google.org. ODK’s design focused on taking an open and modular approach for creating information services that could be deployed to address a wide variety of issues for global development/humanitarian organizations. Android was chosen because it had multiple flexible inter-process communication methods that enabled ODK frameworks to leverage the existing apps for additional functionality (e.g., taking pictures, scanning barcodes, determining location), thus speeding development.

4 Innovation: The Open Data Kit Project

The Open Data Kit project aims to empower resource-constrained organizations to build information services in under-resourced contexts through the creation of an extensible suite of open-source tools. The ODK project supports the convergence of computing and mobility to create an information services platform that specifically targets global humanitarian organizations (Brunette et al. 2013b; Hartung et al. 2010). However, an organization’s tasks can vary widely across subject domains, locations, and cultures. To enable broad use by a variety of organizations, ODK tools are designed without prior knowledge of the application domain or deployment conditions. To empower a variety of users, the ODK project focuses on creating interfaces that minimize the technical skills needed to build mobile data applications. Its modular design and open standards enable organizations to create information systems solutions using composable software frameworks. The ODK project leverages Android mobile devices and cloud platforms to simplify scaling interventions. Android-compatible devices were chosen because their variety of form factors and price points makes them popular in economically constrained environments.

Microsoft (MS) Office is an example of a suite of tools that are designed without prior knowledge of the data or subject domain. Global development organizations often use MS Office (e.g., MS Excel, MS Word) or other productivity software to digitize data because data can be customized by staff with little programming expertise. For example, MS Excel is commonly used to build tables of data by workers with intermediate technology skills. Additionally, desktop versions of MS Office operate offline, allowing workers to perform most of their work offline and share their files when connectivity is available. While mobile devices were well suited to operate in contexts with sporadic grid power and Internet connectivity, their software applications did not provide the diverse features of conventional PC productivity software (e.g., MS Office). Often mobile apps are designed for a single purpose; thus, users often use multiple mobile apps to perform complex tasks. These focused apps often have minimal customizability to other subject domains, thus making it difficult to create custom reusable templates that are common to MS Word or MS Excel.

The ODK project’s development was guided by a few simple principles (Brunette et al. 2013b):

  • Modularity: Create composable components that can be easily mixed and matched and can be used separately or together. This allows organizations to take a “best-of-breed” approach to create information systems that meet their needs.

  • Interoperability: Encourage the use of standard file formats and data transfer protocols to support customization and connection to other tools.

  • Community: Foster the building of an open-source community that would continue to contribute experiences and code to expand and refine the software framework.

  • Realism: Deal with the realities of infrastructure and connectivity in the developing world and always support asynchronous operation and multiple modes of data transfer.

  • Rich user interfaces: Focus on minimizing user training and supporting rich data types like GPS coordinates and photos.

  • Follow technology trends: Use consumer devices to take advantage of multiple suppliers, falling device costs, and a growing pool of software developers.

4.1 ODK 1 Tool Suite

The first set of ODK tools, “ODK 1” (Hartung et al. 2010), focused on replacing paper-based data collection with enhanced digital data collection. Released in 2009, ODK 1 has been used by a variety of organizations working in different global development domains. Field workers collect data via digital surveys on mobile devices, and the survey responses are then aggregated on a server for analysis. When first released, it consisted of three primary tools: ODK Build, ODK Collect, and ODK Aggregate. These tools provide the ability to design forms (Build), collect data on mobile devices (Collect), and organize data into a persistent store where it can be analyzed (Aggregate).

To enable organizations to customize the data being collected, the questionnaires in ODK 1 are specified using the JavaRosa variant of the W3C XForms standard defined by the OpenRosa Consortium (OpenRosa 2011). The JavaRosa XForm specifies the data types, navigation logic, and input constraints. While the XForm specification helped the ODK project separate the organization’s deployment-specific configuration from a reusable domain-independent rendering framework, organizations felt it was too complicated for a user to configure the system. To address this, Build provides a graphical drag-and-drop interface for users to design a questionnaire that will automatically generate an XForm specification. To collect data, field workers use the Collect app to record data by navigating a variety of question prompts (e.g., text, pictures, numbers, location, barcodes) specified in the XForm (Fig. 23.1). By default, Collect operates disconnected from the Internet to enable use in any environment.

Fig. 23.1 ODK Collect provides multiple data input options including text, GPS, selection, and sound

Aggregate provided an easy-to-deploy server that stored and aggregated collected data to make it easy to export the data to a diverse set of data analysis and visualization tools. To export data for analysis, Aggregate provided interfaces to query and extract data in standard formats (e.g., CSV, KML) and directly integrate with various web services, allowing Aggregate to act as a store-and-forward system to other software services. Aggregate was designed as a “container-agnostic” platform to enable organizations to deploy it on an “appropriate” hosting infrastructure for the intervention or project based on the type of data to be stored and other constraints. Aggregate could be deployed to a cloud hosting service to enable an organization to take advantage of the cloud infrastructure’s highly available and scalable services. However, some organizations collect sensitive data, creating data hosting locality and security concerns. Organizations are often constrained by local policies or laws that require data to not leave the country of origin, require the prevention of hosting company employees from viewing the data, or prohibit the collection of high-risk data that contains sensitive information.
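The store-and-forward role described above can be illustrated with a minimal Python sketch. It is not Aggregate’s implementation: the `SubmissionStore` class, its method names, and the use of `OSError` to signal a failed transfer are all assumptions made for illustration. The sketch only shows the general pattern of keeping every submission, retrying deliveries that fail, and exporting the stored data as CSV.

```python
import csv
import io


class SubmissionStore:
    """Hypothetical in-memory stand-in for a server-side submission store."""

    def __init__(self):
        self.rows = []      # every submission received so far
        self.pending = []   # submissions not yet forwarded downstream

    def receive(self, submission):
        """Store a submission and queue a copy for forwarding."""
        self.rows.append(dict(submission))
        self.pending.append(dict(submission))

    def forward(self, send):
        """Try to forward queued submissions; keep any that fail for a later retry."""
        still_pending = []
        for row in self.pending:
            try:
                send(row)
            except OSError:  # assumed failure signal, e.g. no connectivity
                still_pending.append(row)
        self.pending = still_pending

    def export_csv(self):
        """Return all stored submissions as CSV text with a stable column order."""
        columns = sorted({key for row in self.rows for key in row})
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
        writer.writeheader()
        writer.writerows(self.rows)
        return buffer.getvalue()
```

In this sketch, a downstream service that is unreachable does not cause data loss: the submission stays in `pending` until a later call to `forward` succeeds, while `export_csv` always reflects everything received.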

While Build was created to simplify the creation of XForms by enabling users to graphically compose surveys, some users felt the graphical interface was too burdensome for creating lengthy or complex questionnaires. Additionally, some users found that Build was inconvenient for sharing forms across their organization, especially when reusing the majority of the form and only needing to make minor adjustments for the deployment context. To solve this problem, “pyxforms” was created by Columbia University to give users in the ODK community an option of designing their form in an MS Excel spreadsheet. The spreadsheet was then automatically converted into an XForm by “pyxforms.” After “pyxforms” grew in popularity in the community, it was transferred to the ODK project and renamed XLSForm to reflect the conversion of an XLS file into an XForm. XLSForm became the most popular method of creating XForms for ODK. The creation of XLSForm is an example of how the ODK project’s open standards and modular design enabled contribution to the project. The modular design also provides options to users, as it enables them to choose an appropriate tool for their use case or skill level. Users can choose between Build’s graphical interface and XLSForm’s Excel spreadsheet interface to design their XForm.
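The idea of turning spreadsheet rows into a form definition can be sketched in a few lines of Python. This is a simplified illustration, not the converter’s actual implementation: the column names (`type`, `name`, `label`, `constraint`), the `TYPE_MAP` table, and the namespace-free XML layout are assumptions, and a real XForm contains additional structure that is omitted here.

```python
import xml.etree.ElementTree as ET

# Hypothetical survey rows, as they might appear in a spreadsheet's survey sheet.
SURVEY_ROWS = [
    {"type": "text", "name": "farmer_name", "label": "Farmer name"},
    {"type": "integer", "name": "crop_height_cm", "label": "Crop height (cm)",
     "constraint": ". >= 0 and . <= 400"},
]

# Assumed mapping from spreadsheet question types to XForm-style data types.
TYPE_MAP = {"text": "string", "integer": "int", "decimal": "decimal"}


def rows_to_form(form_id, rows):
    """Build a simplified, namespace-free XForm-like element tree from survey rows."""
    root = ET.Element("form", id=form_id)
    model = ET.SubElement(root, "model")
    instance = ET.SubElement(model, "instance")
    data = ET.SubElement(instance, "data")
    body = ET.SubElement(root, "body")
    for row in rows:
        path = "/data/" + row["name"]
        ET.SubElement(data, row["name"])  # empty slot that will hold the answer
        bind = ET.SubElement(model, "bind", nodeset=path, type=TYPE_MAP[row["type"]])
        if row.get("constraint"):
            bind.set("constraint", row["constraint"])  # input constraint, if any
        prompt = ET.SubElement(body, "input", ref=path)
        ET.SubElement(prompt, "label").text = row["label"]
    return root
```

Calling `ET.tostring(rows_to_form("crop_visit", SURVEY_ROWS), encoding="unicode")` yields the generated XML as text. The point of the sketch is the separation of concerns: a form author edits only the rows, and the conversion step produces the machine-readable definition.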

4.2 Iterative Design with Users

Much of the success of ODK results from performing co-design in the field with the researchers working side by side with field workers who would later use the software. This interaction with field workers gave ODK researchers real-time, in situ feedback that would not have emerged or been considered in a typical western research university development environment. One of the most important types of feedback received in the co-design process concerned the cultural appropriateness of the solution to the community that would be using the mobile devices. The rest of this section gives several examples illustrating the benefits of in-the-field iterative design with actual target users.

For the first trial deployment of ODK, two graduate student researchers went to Kampala, Uganda, to work with the Grameen Foundation to collect data about several SMS information services they were piloting in the country. In this pilot, the users were a group of rural farmers who owned “dumb” cell phones and used those phones to run secondary businesses as communication services for their villages. The farmers could all read and speak in their local languages. For the project, the farmers were provided an HTC G1 touch screen smartphone in order to fill out survey information whenever any of their customers used Grameen’s SMS information services. During the training, the graduate student researchers encountered several unexpected issues. After entering the information, the farmers were given the instruction to “touch the button” to send the information. At this point, most of the room stared in confusion at the phones in their hands. Since it was the first time any of them had used a device with a touch screen, they were looking for a physical button instead of the virtual image of a button on the screen.

Later during the instruction, several farmers alerted the instructors that they felt the phones were broken because the phones would not respond to their touch. The instructors used the phone, and the phone worked as expected. Upon handing the phones back to the farmers, the same non-functional behavior was observed. It turned out that farmers often had very rough, calloused fingertips that did not interact well with the capacitive touch screens. However, the farmers could use the larger pad of their fingers to interact with the phone, which led the graduate students to significantly increase the size of all the buttons that very night.

Another example occurred when a developer who was working on ODK from Kenya emailed asking if he could swap the main color scheme in Collect from white text on a black background to black text on a white background. The reasoning was that it was easier to see black text when working in bright sunlight. Up to this point, much of the software development had occurred in Seattle, where bright sunlight is not often a problem. The default text and background colors were switched and have remained that way ever since.

4.3 Example ODK 1 Deployments

Two examples of ODK 1 deployments are:

HIV Treatment and Prevention in Kenya

One of the first large-scale deployments of ODK was by the Academic Model Providing Access to Healthcare (AMPATH) in western Kenya. Their goal was to reach two million people in AMPATH’s catchment area to test and counsel all eligible individuals for HIV, identify pregnant women not in antenatal care, identify orphaned and vulnerable children, and identify people at high risk for tuberculosis. AMPATH stored their patient information in OpenMRS, an open-source medical record system. Because of ODK’s and OpenMRS’ open and modular designs, AMPATH was able to use Collect for their data collection client. They simply implemented a module for OpenMRS that converted the XForm answers into the OpenMRS concept format. This allowed them to switch from using PDAs to using Android smartphones with very little effort. During deployment, AMPATH also ran a study comparing the deployment cost of ODK to that of a previous PDA-based system and a paper-based system. They found that the per-patient cost of deploying the systems was $0.21 for paper, $0.15 using the PDAs, and $0.13 using Collect on Android smartphones. To date, AMPATH has reached more than one million patients through home-based counseling and testing; they have over 160,000 patients in active HIV treatment, and 90% of the HIV-positive people found via the program have enrolled in their care program.
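To put the reported per-patient costs in perspective, the short Python sketch below computes the savings relative to paper. The three cost figures come from the comparison above; the function names are hypothetical, and the total-cost extrapolation assumes the per-patient cost stays constant at scale, which the study does not claim.

```python
# Per-patient deployment costs reported in the AMPATH comparison, in US cents.
COST_CENTS = {"paper": 21, "pda": 15, "collect": 13}


def savings_vs_paper(system):
    """Return (cents saved per patient, percent saved) relative to paper."""
    saved = COST_CENTS["paper"] - COST_CENTS[system]
    return saved, round(100 * saved / COST_CENTS["paper"], 1)


def total_cost_usd(system, patients):
    """Total cost in US dollars, ASSUMING the per-patient cost does not change with scale."""
    return COST_CENTS[system] * patients / 100
```

Under these figures, Collect saves 8 cents per patient compared with paper, roughly 38%. Under the constant-cost assumption, one million patients would cost $130,000 with Collect versus $210,000 with paper.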

Monitoring Forests in East Africa

The Jane Goodall Institute (JGI) has been using ODK in Uganda and Tanzania to monitor forest reserves. Local community members are selected as Village Forest Monitors to patrol selected forest reserve areas. Monitors use Collect to gather information on deforestation using smartphone features, such as GPS and images, and can send that information for real-time plotting on maps. The data is used to inform conservation decisions relating to the health of forests and habitats for chimpanzees.

5 Evaluation and User Feedback

Millions of people have used the ODK 1 tool suite to perform data collection activities in the majority of the world’s countries. Its modular design successfully enabled multiple organizations to deploy customized application-specific mobile data collection solutions. ODK 1 has been used in a diverse set of subject domains, including disaster response, public health interventions, and election monitoring. Although the ODK 1 tool suite was experiencing success as a data collection platform, the UW-CSE research team wanted to understand how well it met generic mobile information system requirements. Thus, a survey of 73 organizations was conducted in 2012 to determine the adequacy of current mobile data collection and management frameworks in meeting the needs of the organizations. The goal was to identify limitations that were preventing mobile information systems usage in the field. The respondents were from a broad range of organizations working in over 30 countries. The survey responses and field observations identified four focus areas that users thought should be improved (Brunette et al. 2013b):

  • Support data aggregation, cleansing, and analysis/visualization functions directly on the mobile device by allowing users to view and edit collected data.

  • Increase the ability to change the presentation of the applications and data so that the mobile app can be easily specialized to different situations without requiring a recompilation.

  • Expand the types of information that can be collected from sensing devices, while maintaining usability by non-IT professionals.

  • Incorporate cheaper technologies such as paper and SMS into the data collection pipeline.

ODK 1’s original focus on enabling resource-constrained organizations to replace paper data collection with enhanced mobile data collection meant that many design decisions were made to leave out unnecessary functionality in order to create a solution that was configurable by users with limited technical skills. This purposeful simplicity meant ODK 1 lacked features required for certain use cases. For example, ODK 1 focused on collecting survey data and uploading completed surveys to a server for aggregation and analysis. It did not provide functionality for distributing data back out to mobile devices for review and updating. However, feedback showed that many organizations need to access previously collected data on the device. Many organizations reported being unable to use ODK 1 for usage scenarios that rely on previously collected data. For example, organizations performing logistics management, public health interventions, and environment monitoring often have workflows that require workers to return to a location and reference previously collected data. The worker verifies whether the previously collected data is still accurate and updates the data to reflect the current state. In usage scenarios with follow-up data collections to track progress, organizations requested the ability to use data from previous surveys to complete new entries, as workers complained about having to re-enter information from the previous visit (e.g., patient demographics, refrigerator information).

After gathering feedback from users, it became clear that there were some feature deficiencies in ODK 1 for supporting data management use cases. Incorporating new tools to contribute the missing functionalities into the existing ODK suite was proposed. New tools would maintain the modular design and add new functionality. However, if too many features were added, there were concerns that a core strength of enabling users with limited technical knowledge to create mobile data collection systems could be lost. Additional features often lead to additional complexity for users as they have more choices and more controls to manage. Dr. Borriello decided that instead of jeopardizing the first tool suite’s success, a second tool suite should be made that targeted different use cases. The second parallel tool suite would maintain the ODK project’s goal of creating domain-independent tools that operate in disconnected environments but would address a different set of constraints and requirements. Organizations could then choose which of the parallel tool suites to use based on their constraints and requirements. Both tool suites would continue to target low-resource contexts that have limited availability of resources, power, and expertise. UW-CSE continued to improve and expand ODK 1 and began to explore the creation of a second tool suite to address use cases where ODK 1 functionality was insufficient.

5.1 Creation of the Second Tool Suite

ODK 1’s focus on replacing and enhancing paper-based data collection led to a unidirectional data flow design (similar to paper). Research showed that many organizations wanted bidirectional synchronization of data to enable users to update previously collected data and conduct follow-up longitudinal studies. Additionally, bidirectional synchronization of data can provide field workers with continually updated data to make better decisions in the field and support more complex workflows based on previous data inputs. The feedback from organizations about missing features led to the creation of a second tool suite called “ODK 2.” The new generation of tools would focus on bidirectional data management applications instead of unidirectional data collection (Brunette et al. 2017). ODK 2 was a response to the need of humanitarian organizations for a free, reusable platform capable of bidirectional data management. The four areas of desired improvements as well as other user research and feedback gave four design principles for ODK 2 (Brunette et al. 2013b):

  • When possible, user interface elements should be designed using a more widely understood runtime language instead of a compile-time language, thereby making it easier for individuals with limited programming experience to make customizations.

  • The basic data structures should be easily expressible in a single row, and nested structures should be avoided when data is in display, transmission, or storage states.

  • Data should be stored in a database that can be shared across devices and can be easily extractable to a variety of common data formats.

  • New sensors, data input methods, and data types should be easy to incorporate into the data collection pipeline by individuals with limited technical experience.

After the initial evaluation in 2012, UW-CSE researchers continued to perform iterative development and evaluation with global development organizations. This iterative development cycle led to the identification of additional feature requirements beyond the four basic ODK 2 design principles (Brunette et al. 2017). For example, dynamic value checking based on previous data was requested to improve data integrity. Improving data integrity was seen as a benefit of digitizing data collection; thus, organizations requested an expansion in data verification capabilities. An example use case that demonstrates the limitations of static constraint checks versus dynamic constraint checks is an agricultural longitudinal study. Generally, in these agricultural studies, workers travel to crop fields multiple times during a growing season. During these visits, the workers record the growing conditions and track the crop’s progress over time by recording the crop height. By recording the growing conditions and crop heights at different crop locations, research can be conducted to ascertain the effects of different conditions on crop yields. Organizations requested that the entered crop height value be validated to make sure the field worker entered a reasonable value, avoiding typing mistakes and other data entry errors. Unfortunately, the ODK 1 tool suite uses static formulas for validation, so the data integrity check applies the same absolute minimum and maximum crop height values at the beginning and at the end of the growing season. Organizations desired a data integrity check that would dynamically adjust over time based on the previously collected crop data. The min and max values used to verify the reasonableness of the crop height would be calculated based on previous crop height measurements taken earlier in the growing season.
As with any ODK functionality, this dynamic data integrity check needs to be performed during disconnected operation. ODK 2’s bidirectional data synchronization enables organizations to support this type of longitudinal study by enabling disconnected access to previous crop heights allowing the creation of dynamic data checks that can automatically adjust to catch data anomalies. Thus, ODK 2 expanded the ODK project’s ability to create mobile information systems by having a data collection platform (ODK 1) and a data management platform (ODK 2). The following list describes the key design goals of the ODK 2 frameworks based on research findings (Brunette et al. 2017):

  • Workflow navigation should use intuitive procedural constructs that function independently of data validation and allow for user-directed navigation of the form.

  • The presentation layer must be independent of the navigation and validation logic.

  • The presentation layer must be customizable without recompiling Android apps via HTML, JavaScript, and CSS.

  • Partial validation of collected data should be possible and the validation logic should be able to be dynamic.

  • Local storage should be robust and performant for data curation and for longitudinal survey workflows using a relational data model.

  • Multiple data collection forms should be able to modify data within a single, shared data table.

  • Foreground and background sensors should be supported for data collection.

  • Adding new sensing methods and other methods of input should be supported.

  • Disconnected operation should be assumed as data must be able to be collected, queried, and stored without a reliable Internet connection. When the Internet becomes available, the framework and cloud components should efficiently replicate data across all devices.

  • User and group permissions are needed to limit data access.

  • Cloud components should be able to fully configure the data management application remotely as well as preserve a change log of all collected data.
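
The dynamic validation goal can be illustrated with the crop height example discussed earlier. The following Python sketch is only an illustration, not ODK 2 code: the function names, the static fallback limits, and the assumed maximum growth between visits are all hypothetical, and the rule that a crop never shrinks is a simplifying assumption. It shows how bounds derived from earlier measurements stored on the device can catch an entry that a static check would accept.

```python
from typing import Sequence, Tuple

# Hypothetical static limits (cm), standing in for a fixed ODK 1-style constraint.
STATIC_MIN_CM = 0.0
STATIC_MAX_CM = 400.0
# Hypothetical assumption: a crop grows at most this much between two visits.
MAX_GROWTH_PER_VISIT_CM = 60.0


def height_bounds(previous_heights_cm: Sequence[float]) -> Tuple[float, float]:
    """Return the (min, max) plausible crop height for the next visit."""
    if not previous_heights_cm:
        # No history on the device yet: fall back to the static limits.
        return STATIC_MIN_CM, STATIC_MAX_CM
    last = previous_heights_cm[-1]
    # Simplifying assumption: the crop does not shrink and grows by a bounded amount.
    return last, min(last + MAX_GROWTH_PER_VISIT_CM, STATIC_MAX_CM)


def is_reasonable(new_height_cm: float, previous_heights_cm: Sequence[float]) -> bool:
    """Check a newly entered height against bounds derived from earlier visits."""
    low, high = height_bounds(previous_heights_cm)
    return low <= new_height_cm <= high
```

With earlier measurements of 20 cm and 45 cm, an entry of 350 cm (perhaps a mistyped 35.0) passes the static check but is rejected by the dynamic one, while 60 cm is accepted. Because the check only reads locally stored history, it can run during disconnected operation.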

Multiple pilot deployments were conducted to investigate, test, and verify the requirements gathered for ODK 2. A diverse set of field partners working on global health interventions, disaster response management, and other areas were involved in validating the suitability of the ODK 2 frameworks in resource-constrained environments. These iterative pilots identified necessary features for ODK 2 that were not fulfilled by ODK 1. Table 23.1 presents a list of case studies and the corresponding ODK 2 features that were necessary to create a data management solution for the use case.

Table 23.1 ODK 2 case study feature requirement summary (Brunette et al. 2017)

5.2 Multi-Perspective Design

One of the ODK project’s challenges was accounting for the many different “users” of the ODK frameworks. During the evaluation of the ODK project, there was a recognition that the classic roles of software where the “software developers” role creates a software application for the “users” role did not adequately explain the various people involved in configuring reusable frameworks like ODK. Since the ODK project seeks to design configurable software frameworks that enable organizations to adapt to their specific subject-domain needs, the classic two-role model did not fit properly. Instead, the “software developers” are creating frameworks for an intermediate set of persons who configure the frameworks to fulfill an organization’s data collection workflow. The specified data collection workflow defines the interface the field workers use to collect the data. This is in contrast to standard software development, when “software developers” make assumptions when designing the system that determines and limits the functionality for “end-users.” Unfortunately, when making these decisions, the “software developer” may not fully understand the implications on how future software “users” may need to adapt the software for usage in different contexts with varying connectivity, laws, data policies, budgets, etc. During the evaluation process, the research team identified multiple roles of various ODK actors that create, configure, extend, deploy, and customize mobile data management applications. Identifying the various actors was needed to ensure that the proper features and interfaces were being created to support the various roles. The establishment of roles came from a multi-perspective analysis (Brunette et al. 2015) performed to create appropriate framework abstractions. ODK 2’s interfaces and abstractions are designed to target the skill levels of the following four roles (Brunette 2020):

  • End users —Often field workers who have been deployed by an organization to perform remote tasks such as providing services to beneficiaries.

  • Deployment Architects —These are often employees of the organization who are subject-domain experts and who customize the technology to meet organizational requirements. They are often non-programmers who are responsible for adapting an ensemble of off-the-shelf software to meet an organization’s information management needs and for imposing constraints derived from deployment considerations.

  • Programmers —Skilled programmers who have been employed by organizations to customize the look, feel, or workflow of the ODK 2 frameworks to meet an organization’s deployment requirements. Example tasks include adding new sensors, data input methods, custom prompts, and custom data types.

  • ODK Framework Developers —Those who create the core reusable ODK 2 frameworks that are behind reusable abstractions targeted at the other three roles.

The focus on four distinct roles shaped the design of ODK 2’s framework abstractions (e.g., communication resources, complex workflows). It is difficult to design a single abstraction that is usable by people with a wide range of technical skills, and the four roles come with different expectations about technical skills. The ODK 2 design splits the traditional “user” role into three roles (end users, deployment architects, and programmers) to recognize that there are multiple “users” with different roles. To avoid confusion between the “software developers” who build the ODK frameworks and the “software developers” who use the ODK frameworks’ interfaces to customize an organization’s workflow, two developer roles were established: ODK framework developers and programmers. Both programmers and ODK framework developers are expected to know how to program; however, the programmer role is not expected to understand how ODK 2 frameworks are implemented, only how to use the ODK programming abstractions for customization. Both the end users and deployment architects are assumed to have no software programming skills. However, deployment architects are assumed to be computer literate so that they can enter enough information about the deployment to customize ODK frameworks using high-level abstractions.

6 Adaptation: ODK 2 and Extensibility Exploration

Based on the evaluation, researchers at UW-CSE began an exploration of extending the functionality of the ODK project by creating additional mobile frameworks. After years of research focused on creating information systems for global development organizations, UW-CSE released part of the ODK 2 tool suite in 2017 as a parallel effort to the ODK 1 tool suite. The 8 years between the initial release of ODK 1 and the initial release of ODK 2 were spent performing iterative development with multiple cycles of innovation, adaptation, and evaluation of both ODK 1 and ODK 2. Key aspects of ODK 2 are (Brunette et al. 2017): (1) a modular design that enables individual software frameworks designed for particular tasks to be used together to achieve complex workflows, (2) data and configuration management that enables data protection, storage, and sharing of data across mobile devices and cloud components, (3) a synchronization protocol that is designed to operate in challenged networking conditions, and (4) a services-based architecture that abstracts common functionalities behind a consolidated unifying services layer.

A usability problem is often created by having a single interface that targets multiple types of users. Users with basic computer skills often want something simple and can feel overwhelmed with too many options; conversely, a user with programming skills can feel frustrated when functionality is not exposed. Instead of a one-size-fits-all interface, ODK 2 frameworks were created with multiple interfaces and abstractions that can perform similar functionalities to help users with different skill levels configure their system. These multiple interfaces follow some of the ideas of design scaffolding (as demonstrated in Anderson et al. (2012); Quintana et al. (2004)) by providing an interface for users to learn about the ODK 2 frameworks without having to understand the full details of the entire system. The goal was to ramp up a user’s ODK 2 knowledge through the use of simple interfaces that are less confusing or intimidating than more advanced interfaces that provide additional functionality. For example, in ODK 2 there are multiple ways for a deployment architect to create a database table. One method of database creation completely hides the database details from a deployment architect. Based on the success of the XLSForm in ODK 1, Excel was chosen again as a method to configure the ODK 2 frameworks, although with different syntax and structures to account for the new functionality. ODK 2 will take a questionnaire designed in an XLSX workbook by a deployment architect and will automatically generate all needed database tables. This automation removes the need for a deployment architect to understand databases. Alternatively, if the deployment architect wants to customize the structure of the database, a spreadsheet can be added to the XLSX workbook to manually define the database table by supplying column names and column types. As a third option, the database table can be created programmatically by using ODK 2’s JavaScript functions.
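
To make the idea of generating a database table from a questionnaire more concrete, the following Python sketch derives a flat, single-row table from a list of questions. It is an illustration only: the question format, the type mapping, and the `create_table_sql` helper are hypothetical and do not reproduce ODK 2’s XLSX syntax or its JavaScript functions. SQLite is used here simply because it is available in the Python standard library.

```python
import sqlite3
from typing import Dict, List

# Hypothetical mapping from questionnaire prompt types to SQL column types.
PROMPT_TO_SQL = {"text": "TEXT", "integer": "INTEGER", "decimal": "REAL"}


def create_table_sql(table_id: str, questions: List[Dict[str, str]]) -> str:
    """Build a CREATE TABLE statement with one flat column per question."""
    # Reject names that are not plain identifiers before placing them in SQL.
    for ident in [table_id] + [q["name"] for q in questions]:
        if not ident.isidentifier():
            raise ValueError(f"unsafe identifier: {ident!r}")
    columns = ", ".join(
        f"{q['name']} {PROMPT_TO_SQL[q['type']]}" for q in questions
    )
    return f"CREATE TABLE {table_id} (row_id TEXT PRIMARY KEY, {columns})"


# A questionnaire as a deployment architect might describe it (hypothetical format).
questions = [
    {"name": "facility_name", "type": "text"},
    {"name": "refrigerator_count", "type": "integer"},
]

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql("cold_chain", questions))
# Read back the generated column names (second field of each table_info row).
column_names = [row[1] for row in conn.execute("PRAGMA table_info(cold_chain)")]
```

The person describing the questions never writes SQL, which mirrors the first option above; supplying column names and types explicitly, or calling such a helper from code, corresponds loosely to the second and third options.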

6.1 ODK 2 Mobile Frameworks

ODK 2 frameworks provide a diverse feature set that enables organizations to create customized data management applications. Since ODK 2 was designed to handle use cases with additional complexity, there was concern that a single app could have overwhelming complexity if it provided abstractions for every usage scenario. Therefore, to simplify customization interfaces, ODK 2 was built as multiple reusable frameworks that focus on specific areas of functionality. To give an end user the look and feel of a unified application, the multiple frameworks are designed to smoothly transition between each other. The six Android apps (shown in Fig. 23.2) each contribute configurable software frameworks that provide specific functionality to ODK 2. The frameworks isolate the deployment architect’s configurable interfaces from the reusable system components to create abstractions that are flexible enough to support varying types of workflows from different subject domains.

Fig. 23.2

This architecture diagram shows how six ODK 2 mobile frameworks work together to create what can appear to be a single customized mobile data management application (Brunette et al. 2017). The ODK 2 tool suite mobile apps are Scan, Tables, Survey, Sensors, Services, and Submit

Tables

- Tables (Brunette et al. 2013a) is a data management framework that enables users to view, update, and curate the entire data set with optional user permission enforcement. Tables is designed to render custom user workflows for use cases where previously collected data is viewed and updated (e.g., logistics management, medical care). Data can be viewed in either tabular or graphical format depending on the context, and views are customizable by modifying or extending the HTML, CSS, and JavaScript files. Since Tables uses runtime web rendering for its user interface, the user interface can be customized without recompiling the entire framework. The flexibility provided by the Tables framework greatly increases the usability of the ODK 2 tool suite, as it enables organizations to encode complex workflows and decision logic outside the question workflow of Survey. It also enables the creation of workflows for use cases where users often return to locations and need to reference multiple pieces of previously collected data.

Survey

- Survey expands the possibilities of data collection through its dynamic question-rendering and constraint-verification framework that allows for interactive, non-linear navigation that enables organizations to customize workflow complexity. Survey collects data through the use of questions similar to Collect. However, Survey uses a runtime web rendering framework for its user interface. Since runtime languages (e.g., JavaScript) are used to define the prompt widgets, rendering logic, and event handling, the entire user interface and flow of Survey can be customized by an organization. The runtime customization permits a wider variety of customization possibilities than XForm’s linear navigation flow used by ODK 1. The runtime flexibility is another example of providing multiple interfaces to deployment architects, as Survey’s XLSX definition structure provides a simpler customization method that can allow someone to ramp up their web scripting knowledge as they require more advanced features.

Submit

- Experiences from the field demonstrate that not all the data that organizations collect are considered equally valuable. Data has different priorities, and the system may need to be flexible when selecting the method of data transmission by accounting for inherent data properties, contextual data properties, and network properties (Brunette et al. 2015). Existing networking paradigms often assume that inherent data properties (e.g., size, type) are sufficient. However, data is not uniform and has both inherent and contextual properties. The inherent data properties are independent of the application domain, while the contextual data properties (e.g., priority, importance, deadline) are determined by the application usage scenario and deployment context. An important contextual data property is whether the data is public or private (e.g., private medical records vs. sports scores). Local laws and an organization’s data policies often dictate how data can be stored and transmitted depending on whether the data is public or private. Submit (Brunette et al. 2015) is an experimental communication framework built to enhance performance in disconnected environments. It allows organizations to adapt their applications to share data in various network conditions to match their data communication needs. Submit also provides a modified peer-to-peer synchronization protocol.
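
The interplay of inherent, contextual, and network properties can be sketched in a few lines of Python. This is not Submit’s API or algorithm; the `Item` fields, the `plan_transmission` function, and the policy of holding private data until a trusted link is available are hypothetical choices made only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Item:
    name: str
    size_kb: int    # inherent property
    priority: int   # contextual property: lower number means more urgent
    private: bool   # contextual property


def plan_transmission(
    items: List[Item], budget_kb: int, link_is_trusted: bool
) -> List[str]:
    """Pick items to send now: most urgent first, within the size budget."""
    chosen = []
    for item in sorted(items, key=lambda i: (i.priority, i.size_kb)):
        if item.private and not link_is_trusted:
            continue  # hypothetical policy: hold private data for a trusted link
        if item.size_kb <= budget_kb:
            chosen.append(item.name)
            budget_kb -= item.size_kb
    return chosen


items = [
    Item("site_photo", size_kb=900, priority=3, private=False),
    Item("patient_record", size_kb=5, priority=1, private=True),
    Item("stock_alert", size_kb=2, priority=1, private=False),
]
```

On a constrained, untrusted link with a 100 KB budget, only the small public alert is selected; on a trusted link with a 1000 KB budget, all three items are sent, with the urgent ones first.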

Services

- Services (Brunette et al. 2017) provides the common functionality to other ODK 2 mobile apps to maintain a modular design and avoid re-implementing core functionality within each tool, thus creating a services-based architecture. For example, Services abstracts the database access to a single-shared service interface that enforces consistent data semantics and provides data-access restrictions. Other examples of the core functionality provided by Services include the data synchronization protocol, a web server, framework preferences, and user authentication.
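
As a rough illustration of abstracting database access behind a single shared service that enforces data-access restrictions, consider the Python sketch below. The `DataService` class, its method names, the `visits` table, and the permission model are hypothetical and are not the interface that Services exposes; the sketch only shows why routing every read and write through one component makes consistent permission checks possible.

```python
import sqlite3
from typing import Dict, List, Set, Tuple


class DataService:
    """Single entry point for table access, checked against per-user permissions."""

    def __init__(self, permissions: Dict[str, Set[str]]) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE visits (site TEXT, note TEXT)")
        self._permissions = permissions  # user -> subset of {"read", "write"}

    def _require(self, user: str, action: str) -> None:
        # Every public method calls this before touching the database.
        if action not in self._permissions.get(user, set()):
            raise PermissionError(f"{user} may not {action}")

    def insert_visit(self, user: str, site: str, note: str) -> None:
        self._require(user, "write")
        with self._db:  # commits on success, rolls back on error
            self._db.execute("INSERT INTO visits VALUES (?, ?)", (site, note))

    def list_visits(self, user: str) -> List[Tuple[str, str]]:
        self._require(user, "read")
        return self._db.execute("SELECT site, note FROM visits").fetchall()


service = DataService({"worker": {"read", "write"}, "auditor": {"read"}})
service.insert_visit("worker", "Clinic A", "refrigerator working")
```

Here the auditor can list visits but an attempt to insert one raises `PermissionError`, without any of the calling tools needing to implement the check themselves.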

Sensors

- Sensors (Brunette et al. 2012) is a device-connection framework designed for Android devices. Sensors simplifies connecting and integrating external sensors into an organization’s data collection workflow by providing reusable functionality for sensor connection and processing. Sensors is a modular framework that simplifies both application and sensor-driver development by creating abstractions that separate functionality between the sensor framework, device driver, and user application.

Scan

- Many organizations still rely on paper forms, as the cost of providing a mobile device for every field worker can be problematic for organizations with limited financial resources. During the evaluation, some organizations requested that the ODK project provide the ability to use a hybrid approach that integrates paper forms with mobile devices. The idea was that field workers could use paper forms to collect data and then use a few mobile devices in the field to digitize the data. Scan is a paper digitization framework that explored how to combine paper and digital data collection (Dell et al. 2012, 2015). Scan enables organizations to digitize paper forms in disconnected environments by adding Scan-compatible data input components to their forms. Mobile devices have the computing resources necessary to locally process QR codes, check boxes, multiple-choice bubbles, and structured handwritten number boxes. Scan photographs a paper form with the Android device’s camera and uses computer vision algorithms that run on the device to split the image into snippets corresponding to the individual data components (Dell et al. 2012, 2015). Additional information can be collected using handwritten text boxes that Scan cannot digitize programmatically. These handwritten text boxes are presented to the field worker as images in an automatically generated Survey form so that the field worker can manually transcribe the handwritten text.

6.2 Example ODK 2 Deployments

Two examples of ODK 2 deployments that at the time of publication have ongoing field evaluations and iterations to improve the “data management platform” are:

Humanitarian Disaster Response

The International Federation of Red Cross and Red Crescent Societies (IFRC) partnered with UW-CSE to create an innovative humanitarian relief platform called “RC2 Relief” (Fig. 23.3). RC2 Relief leverages ODK 2 frameworks to simplify distributing relief items or providing cash assistance in response to disasters or humanitarian crises. The IFRC had previously leveraged ODK 1 to collect information without Internet connectivity during disaster responses. However, the IFRC found that ODK 1’s unidirectional data flow limited their ability to use ODK as a humanitarian information platform. The issue was that ODK 1 could not provide the field worker with updated data to perform relief tasks. This limitation created a worker efficiency issue, as field workers had to use both (1) a mobile device with ODK 1 for data entry and (2) either paper or a laptop to obtain information about relief assistance. Using ODK 2’s mobile information management frameworks, field workers can view what relief assistance should be delivered as well as the beneficiary’s complete history. Since ODK 2 is a customizable framework, it was a natural fit for the Red Cross movement, allowing the reuse of core technology and training materials between National Societies while also empowering National Societies to customize the system to their particular use cases. Since ODK 2 is a rendering framework, a volunteer’s workflow can be easily updated to adapt to the dynamic conditions of a humanitarian relief operation.

Fig. 23.3

The IFRC brochure about RC2 Relief (IFRC 2020)

Vaccine Cold Chain Inventory

Immunization programs depend on a network of vaccine refrigerators and cold rooms to keep vaccines within a narrow temperature range to maintain the vaccine’s potency. This network of temperature-controlled storage is called a vaccine cold chain. Management of the vaccine cold chain depends on having an accurate understanding of the equipment’s status at different facilities, including which vaccine refrigerators are broken and need repair. Previous attempts to use ODK 1 to perform regular inventory updates were met with resistance because field workers objected to re-entering the same information about the refrigerator during each visit. Instead, workers simply wanted to make updates to the data that they previously entered. With ODK 2’s data synchronization, remote field workers can view and update the most recent cold chain inventory data on their mobile devices (Brunette et al. 2020). Synchronizing data helps decrease work duplication between field workers, as each field worker receives the most current information whenever Internet connectivity is available.
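
The update-in-place workflow can be illustrated with a small two-way synchronization sketch in Python. This is not the ODK 2 synchronization protocol: the revision counter, the `sync` function, and the conflict rule are hypothetical simplifications chosen to show how a device can edit previously collected rows offline and reconcile them later.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]  # hypothetical row shape: {"rev": int, "data": dict}


def sync(server: Dict[str, Row], device: Dict[str, Row], dirty: Set[str]) -> List[str]:
    """Two-way sync; returns the ids of rows left in conflict."""
    conflicts: List[str] = []
    # Push: upload rows edited offline, unless the server copy has moved on.
    for row_id in sorted(dirty):
        local = device[row_id]
        remote = server.get(row_id)
        if remote is None or remote["rev"] == local["rev"]:
            server[row_id] = {"rev": local["rev"] + 1, "data": dict(local["data"])}
        else:
            conflicts.append(row_id)  # someone else updated the row first
    # Pull: refresh every non-conflicting row from the server.
    for row_id, remote in server.items():
        if row_id not in conflicts:
            device[row_id] = {"rev": remote["rev"], "data": dict(remote["data"])}
    dirty.intersection_update(conflicts)
    return conflicts


server = {"fridge-1": {"rev": 1, "data": {"status": "working"}}}
device_a = {"fridge-1": {"rev": 1, "data": {"status": "working"}}}
device_b = {"fridge-1": {"rev": 1, "data": {"status": "working"}}}

# Worker A marks the refrigerator as broken while offline, then syncs.
device_a["fridge-1"]["data"]["status"] = "broken"
conflicts_a = sync(server, device_a, {"fridge-1"})
# Worker B only pulls and now sees the updated status without re-entering data.
conflicts_b = sync(server, device_b, set())
```

In this sketch a row edited on two devices from the same starting revision is reported as a conflict on the second upload rather than silently overwritten; how conflicts are resolved in practice is outside the scope of this illustration.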

7 Challenges and Lessons

Technology designed for reusability and flexibility can act as an innovation catalyst because of its applicability to a diverse set of use cases. By providing reusable frameworks, the ODK project seeks to enable global development organizations to leverage computing technologies to innovate and improve the organization’s ability to deliver services to combat global issues such as poverty, disease, environmental degradation, and illiteracy. The ODK project focuses on creating information services that are configurable by non-programmers and deployable by resource-constrained organizations (including both financial and technical resource constraints) to address a broad variety of issues for global development organizations. ODK’s modular design allows an organization to leverage different modules to create flexible information technology solutions that can be customized to satisfy an organization’s contextual requirements. Designing a reusable system without prior knowledge of the type of data that will be stored requires a flexible design, as organizations are often constrained by local policies or laws that dictate how data can be stored and transmitted. The ODK project was designed to leverage “appropriate” cutting-edge technology to solve global development problems and to take advantage of the global technology trends that drive innovation. The project targets commercial cloud platforms and Android devices to enable organizations to take advantage of global market forces. The choice of Android smartphones and tablets in 2008 as an “appropriate” cutting-edge technology for low-resource contexts was because of Android’s flexible and open architecture. Choosing Android led the majority of people in the global development space to ridicule the ODK project, with many global development experts stating that Android devices would never be sold in Africa or Asia. 
The critics insisted that an Android device would never have a price point below $400 USD, and therefore, Android devices would only be sold in North America and Europe. At the time of publication of this textbook, Android is the dominant mobile device operating system in the world, with devices being sold below $40 USD. A key lesson from the ODK project was identifying the logical inconsistency of global development experts who were exclusively focused on “low-cost” solutions while not examining how technology becomes a “low-cost” solution through market forces. The ODK project seeks to identify and leverage “appropriate” technology that can operate in low-resource contexts to exploit future research and development by the global marketplace.

The ODK project tries to avoid leveraging technology that is not supported by the global marketplace because lacking such support there will likely be minimal momentum to lower prices. By following technology trends, the market solved problems for global development organizations by localizing the technology to users, creating different types of mobile computing devices, and creating devices with prices appropriate to the context. To avoid early obsolescence, ODK used “appropriate” cutting-edge technology to leverage advancements by commercial companies, universities, and governments. Additionally, the timeline of scaling global development interventions often does not align with information technology’s life cycles. The cheapest devices are often discontinued when newer, more capable technologies can be produced at a similar or lower cost. Unfortunately, technology obsolescence creates churn, as organizations have to find replacement technologies and often conduct new pilots to verify that the replacement technology will meet organizational requirements. Additionally, depending on how different the replacing technology is, organizations may need to produce new training material and resource mobilization strategies. Therefore, if the lowest-priced technology does not remain available for purchase through an organization’s scaling timeline, it may not have the lowest overall cost. For organizations to scale technology to multiple countries, locations, and contexts, it often takes years, thus requiring the technology to remain available for purchase for 5 to 10 years to provide ample time for the technical solution to be adapted, tested, and deployed to the various locations.

While ODK has been successful, there have been many struggles with open-source adoption and management. From the beginning, it was obvious that a few people working together at a public research university would not scale to meet the needs of supporting hundreds of organizations and their field workers deployed around the world. To mitigate the problems of scaling a small research team, an open-source model was used to build an ecosystem of technology innovation. Three approaches were used to build a community of practice: (1) modularity—to make it easy for alternative solutions to be developed in parallel, (2) community building—to establish an open-source community for users to contribute experiences and code, thereby establishing a process to expand and refine the software, and (3) open-source—to allow organizations with limited financial resources to benefit from innovations at little or no cost through a free and open-source software (FOSS) model. A permissive open-source software license was chosen to encourage commercial enterprises to build features and expand the core ODK technology. The idea was to embrace the concept of a competing ecosystem where many companies, consultants, and organizations could take FOSS technology and build products and services with it. Supporting the diversity of use cases in the field was something beyond the capability of a few people working at a public research university. The goal was to create a robust ecosystem that would give organizations choices about the type and scale of technical expertise they needed to meet the needs of a local context.

Many global development software projects, like ODK, rely on donor and grant funding for software development, expansion, and upkeep. Funding organizations often have a large influence on the type of features implemented, as donor funding often targets specific domains, such as health, causing certain use cases to be emphasized more because of the availability of funding. For a deploying organization, there are still costs associated with using free software. For example, there are costs associated with procuring and maintaining computer resources. Additionally, there are costs to configure ODK for the deployment context and to train field workers. Furthermore, technology is not static; therefore, ODK needs to be continually upgraded to handle software maintenance tasks, such as library and operating system API changes, bug fixes, and security fixes. Software maintenance can be a difficult area to fund because most funding comes with requirements for innovation with evidence-based outcomes for beneficiaries.

Much of ODK’s success came from the design principle of performing iterative development that included co-design with field organizations to test assumptions. Researchers working side by side in the deployment contexts with organizations gave researchers feedback that helped identify incorrect assumptions and expectations. Building real systems with organizations with challenging constraints helped reveal problematic design assumptions that were limiting functionality because of incorrect expectations about deployment contexts. Often small assumptions made by a developer can make a technology inappropriate for an organization’s use in a specific context.

8 Summary

Since resource-constrained environments often experience a dearth of technical personnel capable of building and customizing information systems, the ODK project was established to provide a configurable set of open-source software tools for organizations to use in resource-constrained contexts. The ODK project’s modular software frameworks target a variety of global development use cases. The ODK tools are designed to be configurable by global development organizations that have an understanding of the field conditions, the types of workflows and actions that are necessary, and which tasks are important to complete. Therefore, the configuration abstractions need to be simple enough for a non-developer to use but flexible enough to handle various constraints based on the deployment context. ODK tools are designed for flexibility because they are created without prior knowledge of the application domain, deployment conditions, or types of data that will be collected. Five decisions that led to the success of the ODK project were:

  • Maintaining functionality when operating disconnected from the Internet

  • Identifying “appropriate” technology and leveraging technology trends

  • Having partners who are local domain experts and perform iterative development in situ

  • Using a modular design with open standards

  • Establishing an open-source community of users and developers to create an ecosystem of support

After a decade, the Open Data Kit project continues to engage in innovation cycles that include requirements gathering, iterative development with field partners, evaluation, and adaptation. Academic researchers continue to design, build, and evaluate an ensemble of software frameworks that can be used together or independently to construct mobile information systems. The impacts of the ODK project have been diverse in scope and scale. There are several levels of impact, including: (1) the impact on organizational-level users and whether the tools meet organizational goals of increased efficiency, reduced cost, and more accurate data; (2) the impact on field workers and whether the tools help them to be more efficient; and (3) the impact on the quality and cost-effectiveness of the services the beneficiaries receive. Additionally, the cost savings are multi-faceted, and each needs to be measured to understand the benefits of (1) the interventions themselves, (2) savings from shared software infrastructure, (3) employee productivity, and (4) reduced costs of data acquisition for evidence-based decision-making and metrics and evaluation.

9 Discussion Questions

  1. What are five advantages and five disadvantages of creating reusable software frameworks that are designed to be used by different organizations trying to solve problems in multiple subject domains?

  2. What makes software for international development different from other software projects?

  3. What are some current technology trends that can be applied to resource-constrained contexts to improve global development outcomes?

  4. What are some advantages and disadvantages of a free and open-source software model?

  5. What challenges would you expect in developing organizational human resources capable of deploying platforms like ODK, and how would you address them?

  6. What areas of international development could benefit from applying information technology solutions?

  7. What technology trends will the ODK project likely need to adapt to over the next 5 years to remain a useful platform for data collection or data management for low-resource communities?