Using Crowd-Sourced Data to Study Public Services: Lessons from Urban India

Studies in Comparative International Development

Abstract

As cities throughout the developing world grow, they often expand more quickly than the infrastructure and service delivery networks that provide residents with basic necessities such as water and public safety. Why do some cities deliver more effective infrastructure and services in the face of rapid growth than others? Why do some households and communities secure better services than others? Answering these questions requires studying the large, politicized bureaucracies charged with providing urban services, especially the relationships between frontline workers, agency managers, and citizens in informal settlements. Researchers investigating public service delivery in cities of the Global South, however, have faced acute data scarcity when addressing these themes. The recent emergence of crowd-sourced data offers researchers new means of addressing such questions. In this paper, we draw on our own research on the politics of urban water delivery in India to highlight new types of analysis that are possible using crowd-sourced data and propose solutions to common pitfalls associated with analyzing it. These insights should be of use for researchers working on a broad range of topics in comparative politics where crowd-sourced data could provide leverage, such as protest politics, conflict processes, public opinion, and law and order.

Notes

  1. As Scott (1996, pp. 74–75) famously noted, such information asymmetries can arise from experiential knowledge (“mētis”) as well as technical skills.

  2. See World Bank (2016, Chapter 3) for a more extensive catalog of such initiatives.

  3. See also http://timesofindia.indiatimes.com/city/delhi/Delhi-budget-to-be-crowdsourced-Arvind-Kejriwal/articleshow/46362366.cms.

  4. While water meters can tabulate flow through specific pipes and connections, these are typically read manually at regular intervals and thus do not give utilities real-time information on how often and when water is delivered to particular areas.

  5. NextDrop’s revenue model involved charging utilities for information services, including real-time information on water flows.

  6. NextDrop was started by a group of U.C. Berkeley engineering graduates, among others.

  7. The emphasis on assessing the accuracy of a principal source of remotely collected data thus differs subtly from triangulation, usually defined as inference based on multiple sources of evidence, such that “diverse viewpoints cast light upon a topic” (Olsen 2004). Groundtruthing, in contrast, focuses on data validation rather than inference.

  8. This section draws upon Hyun et al. (2018) and Kumar et al. (2018).

  9. Only a small percentage of social media data is geo-referenced, because this requires obtaining user consent or extracting location information from posted messages using automated text analysis. For example, well under 1% of tweets are geo-tagged; Bryant (2010) reports 0.23% (see also DuVander 2010). On the general point of selection bias in crowd-sourced data, see Mayer-Schönberger and Cukier (2013) and Offenhuber (2017, p. 169).

  10. Van der Windt and Humphreys (2014), for example, compare conflict data sourced electronically from observers with survey data.

  11. See Hyun et al. (2018) for more detail.

  12. Lawrence (2017), for example, constructed a systematic sample of “first mover” protesters and potential protesters in Morocco. This in turn allowed her to recruit participants for a Facebook survey experiment from a network of activists. Van der Windt and Humphreys (2014) provided a set of individuals in randomly selected villages in the Democratic Republic of Congo with mobile phones and training in reporting conflict events.

  13. Our research focused on nine valvemen in one of the utility’s 32 subdivisions. They were shadowed for approximately 4 months in total.

  14. Observation of this sort can, of course, suffer from the Hawthorne effect. In our case, the danger would be that valvemen would be more likely to report as expected when observed. However, this made observations of divergence from expectations in our presence particularly informative.

  15. Hyun et al. (2018) provide this analysis.

  16. A fuller discussion of incentivizing data contributions appears below.

  17. Details in this paragraph are drawn from Kumar et al. (2018).

  18. Van der Windt and Humphreys (2014) also utilize qualitative groundtruthing, in their case to assess the accuracy of reports of conflict. The authors sent field coordinators to verify the quality of their “crowdseeded” conflict data from the Democratic Republic of Congo: coordinators observed whether or not contributors understood coding schemes and assessed the accuracy of reporting. The paper, unfortunately, does not provide detail on the types of qualitative research methods used to assess data accuracy.

  19. Many control group observations are usually needed for sufficient statistical power under such a design (Baird et al. 2015; Gerber and Green 2012, p. 260).

  20. Centers facilitating such collaborations include the Social Media and Political Participation Laboratory at New York University (http://smapp.nyu.edu/about.html), the Center for Information Technology Research in the Interest of Society at University of California at Berkeley (http://citris-uc.org/about-citris/), and the Media Cloud at Harvard and MIT (http://mediacloud.org/).

  21. While recent work suggests that citizens most often approach state officials, such as elected representatives, directly (e.g., Kruks-Wisner 2011; Bussell 2017; Kruks-Wisner 2018), our emphasis here on SLBs is distinct.
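
As a rough illustration of the groundtruthing idea in note 7 (validating one principal source of remotely collected data against direct observation, rather than drawing inferences from several sources), the sketch below compares paired records and tabulates where they agree. The function name, the boolean coding of each record, and the sample values are all hypothetical and invented for illustration; they are not taken from the study's data or code.

```python
from collections import Counter


def groundtruth_accuracy(reported, observed):
    """Compare remotely reported events with field observations of the same events.

    Both arguments are equal-length lists of booleans: True means an event
    (for example, a water delivery) was reported or observed, False means
    it was not.
    """
    if len(reported) != len(observed):
        raise ValueError("reported and observed must have the same length")
    if not reported:
        raise ValueError("need at least one paired observation")

    # Count each (reported, observed) combination once.
    cells = Counter(zip(reported, observed))
    agree = cells[(True, True)] + cells[(False, False)]
    return {
        "overall_accuracy": agree / len(reported),
        "false_reports": cells[(True, False)],   # reported but not observed
        "missed_reports": cells[(False, True)],  # observed but not reported
    }


# Hypothetical paired records, invented purely for illustration.
reported = [True, True, False, True, False, True]
observed = [True, False, False, True, True, True]
summary = groundtruth_accuracy(reported, observed)
print(summary)
```

Separating false reports from missed reports matters because the two kinds of error can point to different problems: the first suggests over-reporting, the second suggests events going unrecorded.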

References

  • Anand N. Municipal disconnect: on abject water and its urban infrastructures. Ethnography. 2012;13(4):487–509.

  • Auerbach A. Clients and communities: the political economy of party network organization and development in India’s urban slums. World Politics. 2016;68(1):111–48.

  • Bailard CS. A field experiment on the internet’s effect in an African election: savvier citizens, disaffected voters, or both? J Commun. 2012;62(2):330–44.

  • Baird S, Bohren JA, McIntosh C, Ozler B. Designing experiments to measure spillover effects, second version. 2015. Retrieved from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2619724.

  • Barberá P. The 2013 Italian parliamentary elections on Twitter. 2013.

  • Barberá P. Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data. Polit Anal. 2015;23(1):76–91.

  • Barberá P, Metzger M. A breakout role for Twitter? The role of social media in the Turkish Protests (social media and political participation lab data report). Social Media and Political Participation Lab; 2013.

  • Björkman L. Pipe politics, contested waters: embedded infrastructures of millennial Mumbai. Durham: Duke University Press; 2015.

  • Bond R, Messing S. Quantifying social media’s political space: estimating ideology from publicly revealed preferences on Facebook. Am Polit Sci Rev. 2015;109(01):62–78.

  • Boutet A, Kim H, Yoneki E. What’s in your tweets? I know who you supported in the UK 2010 general election. ICWSM; 2012.

  • Breuer A, Landman T, Farquhar D. Social media and protest mobilization: evidence from the Tunisian revolution (SSRN Scholarly Paper No. ID 2133897). Rochester, NY: Social Science Research Network. 2012. Retrieved from http://papers.ssrn.com/abstract=2133897.

  • Bryant M. Twitter geo-fail? Only 0.23% of tweets geotagged. 2010. Retrieved April 20, 2015, from http://thenextweb.com/2010/01/15/twitter-geofail-023-tweets-geotagged/.

  • Bussell J. Serving clients and constituents: experimental evidence on political responsiveness. 2017.

  • Calvo E. Anatomia Politica de Twitter en Argentina. Buenos Aires: Capital Intellectual; 2015.

  • Carlson M, Jakli L, Linos K. Rumors and refugees: how government-created information vacuums undermine effective crisis management. Int Stud Q. Forthcoming.

  • Ching A, Zegras C, Kennedy S, Mamun M. A user-flocksourced bus experiment in Dhaka: new data collection technique with smartphones. Transportation Research Record: Journal of the Transportation Research Board. 2013. Retrieved from http://web.mit.edu/czegras/www/Flocksource_JUT.pdf.

  • DuVander A. 3 reasons geocoded tweets haven’t caught on and 2 reasons not to worry. 2010. Retrieved April 20, 2015, from http://www.programmableweb.com/news/3-reasons-geocoded-tweets-havent-caught-and-2-reasons-not-to-worry/2010/02/17.

  • Estellés-Arolas E, González-Ladrón-de-Guevara F. Towards an integrated crowdsourcing definition. J Inf Sci. 2012;38(2):189–200.

  • Furtado V, Caminha C, Ayres L, Santos H. Open government and citizen participation in law enforcement via crowd mapping. IEEE Intell Syst. 2012;27(4):63–9.

  • Gerber AS, Green DP. Field experiments: design, analysis, and interpretation. W. W. Norton; 2012.

  • Grossman G, Humphreys M, Sacramone-Lutz G. “I wld like u WMP to extend electricity 2 our village”: on information technology and interest articulation. Am Polit Sci Rev. 2014;108(03):688–705.

  • Hargrave ML. Ground truthing the results of geophysical surveys. In: Johnson JK, Giardano M, Kvamme KL, Clay RB, Green TJ, Dalan RA, editors. Remote sensing in archaeology. Tuscaloosa: University of Alabama Press; 2006. p. 269–303.

  • Hyun C, Post AE, Ray I. Frontline worker compliance with transparency reforms: barriers posed by family and financial responsibilities. Governance. 2018;31:65–83.

  • Iliffe M, Sollazzo G, Morley J, Houghton R. Taarifa: improving public service provision in the developing world through a crowd-sourced location based reporting application. OSGeo J. 2014;13(1):34–40.

  • Jamal A, Keohane R, Romney D, Tingley D. Anti-Americanism or anti-interventionism in Arabic Twitter discourses. Perspect Polit. 2015;13(1):55–73.

  • Jha S, Rao V, Woolcock M. Governance in the gullies. World Dev. 2007;35(2):230–46.

  • Klopp J, Mutua J, Orwa D, Waiganjo P, White A, Williams S. Towards a standard for paratransit data: lessons from developing GTFS data for Nairobi’s Matatu System. Presented at the Transportation Research Board 93rd Annual Meeting. 2014. Retrieved from http://trid.trb.org/view.aspx?id=1289853.

  • Kruks-Wisner G. Seeking the local state: gender, caste, and the pursuit of public services in post-tsunami India. World Dev. 2011;39(7):1143–54.

  • Kruks-Wisner G. The pursuit of social welfare: citizen claim-making in rural India. World Politics. 2018;70(1):122–63. https://doi.org/10.1017/S0043887117000193.

  • Kumar T, Post AE, Ray I. Flows, leaks, and blockages in informational interventions: a field experimental study of Bangalore’s water sector. World Development. 2018;106:149–60.

  • Lawrence AK. Repression and activism among the Arab Spring’s first movers: evidence from Morocco’s February 20th movement. Br J Polit Sci. 2017;47(3):699–718.

  • Lipsky M. Street-level bureaucracy: dilemmas of the individual in public service. New York: Russell Sage Foundation; 1980.

  • Mayer-Schönberger V, Cukier K. Big data: a revolution that will transform how we live, work, and think. New York: Houghton Mifflin Harcourt; 2013.

  • Offenhuber D. Waste is information: infrastructure legibility and governance. Cambridge: The MIT Press; 2017.

  • Olsen W. Triangulation in social research: qualitative and quantitative methods can really be mixed. In: Holborn M, editor. Developments in sociology. Causeway Press; 2004.

  • Parikh T. Digital data collection for improving service delivery: a framework for decision-makers. 2015.

  • Peixoto T, Fox J. When does ICT-enabled citizen voice lead to government responsiveness? 2016 World Development Report Background Paper, (Internet for Development). 2015.

  • Post AE, Bronsoler V, Salman L. Hybrid regimes for local public goods provision: a framework for analysis. Perspect Polit. 2017;15(4):952–66.

  • Poushter J. Smartphone ownership and internet usage continues to climb in emerging economies. 2016. Retrieved June 12, 2017, from http://www.pewglobal.org/2016/02/22/smartphone-ownership-and-internet-usage-continues-to-climb-in-emerging-economies/.

  • Raza D. ‘I saw it on WhatsApp’: why people believe hoaxes on the messaging app. 2017. Retrieved June 12, 2017, from http://www.hindustantimes.com/i-saw-it-on-whatsapp-why-people-believe-hoaxes-on-the-messaging-app/story-TTAJjgLC7eL2Mb0LNxVGuJ.html.

  • Sachdev C. If you see dirty water, don’t just gripe. Talk to the cloud! [National Public Radio]. 2017. Retrieved June 9, 2017, from http://www.npr.org/sections/goatsandsoda/2017/06/07/527898124/if-you-see-dirty-water-dont-just-gripe-talk-to-the-cloud.

  • Scott JC. State simplifications: nature, space, and people. Nomos. 1996;38:42–85.

  • Starbird K, Palen L. (How) will the revolution be retweeted?: information diffusion and the 2011 Egyptian uprising. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work. New York, NY, USA: ACM; 2012. p. 7–16.

  • Story M, Congalton RG. Accuracy assessment: a user’s perspective. Photogramm Eng Remote Sens. 1986;52(3):397–9.

  • Telecom Regulatory Authority of India. The Indian telecom services performance indicator report. New Delhi, India. 2017. Retrieved from http://www.trai.gov.in/.

  • Touchton M, Wampler B. Improving social well-being through new democratic institutions. Comp Pol Stud. 2014;47(10):1442–69. https://doi.org/10.1177/0010414013512601.

  • UNICEF. U-report application revolutionizes social mobilization, empowering Ugandan youth. 2012a. Retrieved April 16, 2015, from http://www.unicef.org/infobycountry/uganda_62001.html.

  • UNICEF. TIME magazine covers UNICEF supported mTrac system in Uganda. 2012b. Retrieved April 16, 2015, from http://unicefstories.org/2012/08/16/time-magazine-covers-unicef-supported-mtrac-system-in-uganda/.

  • Vaccari C, Valeriani A, Barberá P, Bonneau R, Jost JT, Nagler J, et al. Political expression and action on social media: exploring the relationship between lower- and higher-threshold political activities among Twitter users in Italy. J Comput-Mediat Commun. 2015;20(2):221–39.

  • van den Berg C, Danilenko A. The IBNET water supply and sanitation performance blue book. Washington D.C.: The World Bank; 2011.

  • van der Windt P, Humphreys M. Crowdseeding conflict data. 2014.

  • World Bank. Digital dividends: world development report 2016. Washington D.C.: The World Bank Group; 2016.

  • Yadav A. Tension grips Rithora over objectionable WhatsApp post. 2015. Retrieved October 16, 2015, from http://timesofindia.indiatimes.com/city/bareilly/Tension-grips-Rithora-over-objectionable-WhatsApp-post/articleshow/47389539.cms.

Acknowledgments

This research was funded by a “DIL Innovate” Grant from the Development Impact Laboratory, Blum Center for Developing Economies (USAID Cooperative Agreement AID-OAA-A-13-00002, Alison Post and Isha Ray Principal Investigators), U.C. Berkeley, and a dissertation fieldwork grant from the Institute for International Studies, U.C. Berkeley. Tanu Kumar and Isha Ray, U.C. Berkeley, are co-authors of the impact evaluation project described in this paper. We thank Maria Chang for research assistance. We also thank NextDrop, the Public Affairs Foundation, and the Bangalore Water Supply and Sewerage Board (BWSSB) for their support of our research. We are grateful for comments from Thad Dunning, Agustina Giraudy, Tanu Kumar, Katerina Linos, Aila Matanock, Isha Ray, and seminar participants at U.C. Berkeley and American University.

Author information

Corresponding author

Correspondence to Alison E. Post.

About this article

Cite this article

Post, A.E., Agnihotri, A. & Hyun, C. Using Crowd-Sourced Data to Study Public Services: Lessons from Urban India. St Comp Int Dev 53, 324–342 (2018). https://doi.org/10.1007/s12116-018-9271-4

  • DOI: https://doi.org/10.1007/s12116-018-9271-4
