Data Extractions and Extractors

AI, Ethics, and Discrimination in Business

Abstract

In this chapter, I discuss practices and technologies used to extract data from what McAfee and Brynjolfsson (Harvard Business Review 90:61–67, 2012), in their seminal paper on big data, call “walking data generators.” I focus on data extractions (namely, practices aimed at maximizing the collection, storage, and processing of so-called “big data”) and data extractors (the technologies used to perform these extractions).

Notes

  1. https://www.progressive.com/answers/telematics-devices-car-insurance/.

  2. https://dihi.org/project/sepsiswatch/.

  3. https://nigms.nih.gov/education/fact-sheets/Pages/sepsis.aspx.

  4. https://www.wired.com/story/amazons-first-aid-clinics-push-injured-employees-to-keep-working/.

  5. https://www.osha.gov/incident-investigation.

  6. https://jordanbarab.com/confinedspace/.

  7. Other relevant contributions to the literature (and practice) on the dynamics between Uber drivers and the app were recently made by my colleague and friend Mareike Möhlmann, also a coauthor of the paper cited above. Examples of her contributions are Möhlmann and Henfridsson (2019), Möhlmann (2021), and Möhlmann et al. (2021).

  8. https://www.wired.com/story/corporate-surveillance-train-ai/.

  9. https://azure.microsoft.com/en-us/free/ai-services/.

  10. https://www.brookings.edu/articles/contact-tracing-apps-face-serious-adoption-obstacles/.

  11. https://www.nyclu.org/en/stop-and-frisk-data.

  12. https://supreme.justia.com/cases/federal/us/392/1/.

  13. https://laws-lois.justice.gc.ca/eng/acts/p-21/FullText.html.

  14. https://gdpr.eu.

  15. https://www.pewresearch.org/internet/fact-sheet/mobile/.

  16. https://www.research.ox.ac.uk/article/2020-04-16-digital-contact-tracing-can-slow-or-even-stop-coronavirus-transmission-and-ease-us-out-of-lockdown.

  17. https://www.talitrix.com/inside-the-walls/.

  18. https://theintercept.com/2023/09/02/smart-epants-wearable-technology/.

  19. https://www.wired.com/story/dna-drives-help-identify-missing-people-its-A-privacy-nightmare/.

  20. https://www.23andme.com/dna-health-ancestry.

  21. https://www.ancestry.com.

  22. https://www.wired.com/story/23andme-credential-stuffing-data-stolen/.

  23. https://www.healthcare.gov/glossary/affordable-care-act/.

  24. https://www.npr.org/2017/07/27/539907467/senate-careens-toward-high-drama-midnight-health-care-vote.

  25. https://www.hhs.gov/hipaa/index.html.

  26. https://www.fitbit.com/global/us/technology/stress.

  27. https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html.

  28. https://www.cbsnews.com/news/facebook-stock-price-recovers-all-134-billion-lost-in-after-cambridge-analytica-datascandal/.

  29. https://supreme.justia.com/cases/federal/us/597/19-1392/.

  30. https://www.cnn.com/2022/07/16/politics/abortion-data-what-matters/index.html.

  31. https://www.washingtonpost.com/politics/2022/06/25/dobbs-roe-black-racism-disparate-maternal-health/.

  32. https://www.nytimes.com/2021/12/14/nyregion/newark-prohibiting-feeding-homeless.html.

  33. https://www.wired.com/story/opinion-data-brokers-know-where-you-are-and-want-to-sell-that-intel/.

  34. https://www.wired.com/story/criminal-justice-transparency-law-data-brokers/.

  35. https://www.nytimes.com/2021/11/12/opinion/facebook-privacy.html.

  36. https://www.dataprotection.ie/en/individuals/know-your-rights/right-erasure-articles-17-19-gdpr.

  37. https://www.acxiom.com.

  38. https://www.corelogic.com.

  39. https://www.epsilon.com.

  40. https://www.ft.com/content/f1590694-fe68-11e8-aebf-99e208d3e521.

  41. https://www.congress.gov/event/116th-congress/house-event/LC64156/text?s=1&r=3.

  42. https://www.congress.gov/bill/117th-congress/senate-bill/1265.

  43. https://constitution.congress.gov/constitution/amendment-4/.

  44. https://rc.library.uta.edu/uta-ir/handle/10106/29572.

  45. https://medium.com/@agua.carbonica/twitter-wants-you-to-know-that-youre-still-sol-if-you-get-A-death-threat-unless-you-re-a5cce316b706.

  46. https://www.theatlantic.com/technology/archive/2023/09/books3-ai-training-meta-copyright-infringement-lawsuit/675411/.

  47. https://fingfx.thomsonreuters.com/gfx/legaldocs/lbpgolxxmpq/META%20AI%20COPYRIGHT%20LAWSUIT%20complaint.pdf.

  48. https://epic.org.

  49. https://epic.org/wp-content/uploads/2023/05/EPIC-Generative-AI-White-Paper-May2023.pdf.

  50. https://www.theguardian.com/technology/2023/aug/25/new-york-times-cnn-and-abc-block-openais-gptbot-web-crawler-from-scraping-content.

  51. https://help.nytimes.com/hc/en-us/articles/115014893428-Terms-of-Service.

  52. https://nces.ed.gov.

  53. https://www.wsj.com/articles/apple-restricts-use-of-chatgpt-joining-other-companies-wary-of-leaks-d44d7d34.

  54. https://www.theverge.com/2023/4/28/23702883/chatgpt-italy-ban-lifted-gpdp-data-protection-age-verification.

  55. https://about.fb.com/news/2021/11/update-on-use-of-face-recognition/.

  56. https://www.technologyreview.com/2023/07/20/1076539/face-recognition-massachusetts-test-police/.

  57. https://www.gao.gov/assets/gao-23-105607.pdf.

  58. https://openai.com.

  59. https://www.nytimes.com/2023/07/18/technology/openai-chatgpt-facial-recognition.html.

  60. Note that this system was created in 1965, but it was only in 199 that the system “targeted” the poor (http://etheses.lse.ac.uk/950/).

  61. https://nfsa.gov.in/portal/pds_page.

  62. https://anderson-review.ucla.edu/wp-content/uploads/2022/04/Digital-Identity-in-India-Palgrave-Handbook-of-Technological-Finance.pdf.

  63. https://uidai.gov.in/en/about-uidai/unique-identification-authority-of-india.html.

  64. https://uidai.gov.in/en/.

References

  • Barnard, C. I. (1938). The Functions of the Executive. Harvard University Press.

  • Bender, E. M., & Friedman, B. (2018). Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science. Transactions of the Association for Computational Linguistics, 6, 587–604.

  • Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610–623).

  • Beyer, M. A., & Laney, D. (2012). The Importance of Big Data: A Definition (Gartner Report, pp. 1–9).

  • Bridges, K. (2011). Reproducing Race: An Ethnography of Pregnancy as a Site of Racialization. University of California Press.

  • Crawford, K. (2021a). The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press.

  • Crawford, K. (2021b). Time to Regulate AI That Interprets Human Emotions. Nature, 592(8), 167.

  • Curchod, C., Patriotta, G., Cohen, L., & Neysen, N. (2019). Working for an Algorithm: Power Asymmetries and Agency in Online Work Settings. Administrative Science Quarterly, 63(5), 644–676.

  • Erlich, Y., Shor, T., Pe’er, I., & Carmi, S. (2018). Identity Inference of Genomic Data Using Long-Range Familial Searches. Science, 362(6415), 690–694.

  • Faraj, S., Pachidi, S., & Sayegh, K. (2018). Working and Organizing in the Age of the Learning Algorithm. Information and Organization, 28(1), 62–70.

  • Gal, U., Jensen, T. B., & Stein, M.-K. (2020). Breaking the Vicious Cycle of Algorithmic Management: A Virtue Ethics Approach to People Analytics. Information and Organization, 30(2), 1–15.

  • Huselid, M. A. (2018). The Science and Practice of Workforce Analytics: Introduction to the HRM Special Issue. Human Resource Management, 57(3), 679–684.

  • Jackson, P. (1986). Introduction to Expert Systems. osti.gov. https://www.osti.gov/biblio/5675197

  • Jhaver, S., Karpfen, Y., & Antin, J. (2018). Algorithmic Anxiety and Coping Strategies of Airbnb Hosts. Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems. ACM, paper 421.

  • Kellogg, K., Valentine, M., & Christin, A. (2020). Algorithms at Work: The New Contested Terrain of Control. Academy of Management Annals, 14(1), 366–410.

  • Lee, M. K., et al. (2015). Working with Machines: The Impact of Algorithmic and Data-Driven Management on Human Workers. Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, Seoul, South Korea, ACM.

  • Lepore, J. (2020). If Then: How the Simulmatics Corporation Invented the Future. Liveright Publishing.

  • Lewis, D. (2023). China’s Souped-up Data Privacy Laws Deter Researchers. Nature. Published on March 25, 2021. https://www.nature.com/articles/d41586-023-01638-1. Accessed on July 15, 2023.

  • Loebbecke, C., & Picot, A. (2015). Reflections on Societal and Business Model Transformation Arising from Digitization and Big Data Analytics: A Research Agenda. The Journal of Strategic Information Systems, 24(3), 149–157.

  • Lyytinen, K., & Yoo, Y. (2002). Ubiquitous Computing. Communications of the ACM, 45(12), 63–96.

  • Marabelli, M., Hansen, S., Newell, S., & Frigerio, C. (2017). The Light and Dark Side of the Black Box: Sensor-Based Technology in the Automotive Industry. Communications of the AIS, 40(16), 351–374.

  • Marabelli, M., & Markus, M. L. (2017). Researching Big Data Research: Ethical Implications for IS Scholars. Americas Conference on Information Systems (AMCIS), Boston, MA.

  • Marabelli, M., Newell, S., & Handunge, V. (2021). The Lifecycle of Algorithmic Decision-Making Systems: Organizational Choices and Ethical Challenges. Journal of Strategic Information Systems, 30, 1–15.

  • Marabelli, M., Zaza, S., Masiero, S., Li, J., & Chudoba, K. (2023). Diversity, Equity, and Inclusion in the AIS: Challenges and Opportunities of Remote Conferences. Information Systems Journal, 33(6), 1370–1395.

  • McAfee, A., & Brynjolfsson, E. (2012). Big Data: The Management Revolution. Harvard Business Review, 90(10), 61–67.

  • Mintzberg, H. (1980). Structure in 5’s: A Synthesis of the Research on Organization Design. Management Science, 26(3), 322–341.

  • Möhlmann, M. (2021). Algorithmic Nudges Don’t Have to Be Unethical. Harvard Business Review. https://hbr.org/2021/04/algorithmic-nudges-dont-have-to-be-unethical

  • Möhlmann, M., De Lima, A., Salge, C., & Marabelli, M. (2023). Algorithm Sensemaking: How Platform Workers Make Sense of Algorithmic Management. Journal of the Association for Information Systems, 24(1), 35–64.

  • Möhlmann, M., & Henfridsson, O. (2019, August 30). What People Hate About Being Managed by Algorithms, According to a Study of Uber Drivers. Harvard Business Review.

  • Möhlmann, M., Zalmanson, L., Henfridsson, O., & Gregory, R. W. (2021). Algorithmic Management of Work on Online Labor Platforms: When Matching Meets Control. MIS Quarterly, 45(4), 1999–2022.

  • Newell, S., & Marabelli, M. (2015). Strategic Opportunities (and Challenges) of Algorithmic Decision-Making: A Call for Action on the Long-Term Societal Effects of ‘Datification.’ The Journal of Strategic Information Systems, 24(1), 3–14.

  • Noble, S. U. (2018). Algorithms of Oppression. New York University Press.

  • Nunan, D., & Di Domenico, M. (2022). Value Creation in an Algorithmic World: Towards an Ethics of Dynamic Pricing. Journal of Business Research, 150, 451–460.

  • Sendak, M. P., Ratliff, W., Sarro, D., Alderton, E., Futoma, J., Gao, M., Nichols, M., Revoir, M., Yashar, F., & Miller, C. (2020). Real-World Integration of a Sepsis Deep Learning Technology into Routine Clinical Care: Implementation Study. JMIR Medical Informatics, 8(7), 1–16.

  • Seto, E., Challa, P., & Ware, P. (2021). Adoption of Covid-19 Contact Tracing Apps: A Balance Between Privacy and Effectiveness. Journal of Medical Internet Research, 23(3), e25726.

  • Sommers, S. R., & Marotta, S. A. (2014). Racial Disparities in Legal Outcomes: On Policing, Charging Decisions, and Criminal Trial Proceedings. Policy Insights from the Behavioral and Brain Sciences, 1(1), 103–111.

  • Sriraman, T. (2018). In Pursuit of Proof: A History of Identification Documents in India. Oxford University Press.

  • Taylor, F. W. (1911). The Principles of Scientific Management. Harper & Brothers Publishers.

  • Tursunbayeva, A., Di Lauro, S., & Pagliari, C. (2018). People Analytics—A Scoping Review of Conceptual Boundaries and Value Propositions. International Journal of Information Management, 43, 224–247.

  • Wolfsfeld, G., Segev, E., & Sheafer, T. (2013). Social Media and the Arab Spring: Politics Comes First. The International Journal of Press/Politics, 18(2), 115–137.

  • Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs.

Author information

Corresponding author

Correspondence to Marco Marabelli.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Marabelli, M. (2024). Data Extractions and Extractors. In: AI, Ethics, and Discrimination in Business. Palgrave Studies in Equity, Diversity, Inclusion, and Indigenization in Business. Palgrave Macmillan, Cham. https://doi.org/10.1007/978-3-031-53919-0_2
