Text Data and Mining Ethics

Before turning to the critical legal and ethical issues related to text mining, it is vital to understand (i) the importance of data management for text mining, (ii) the lifecycle of research data, (iii) the data management plan, which sets out how data security, legal, and ethical constraints will be handled, (iv) data citation, and (v) data sharing. This chapter covers these concepts together with the legal and ethical issues related to text mining (such as copyright, licenses, fair use, Creative Commons, and digital rights management), algorithm confounding, and social media research. It also presents the text mining licensing conditions of selected prominent publishers and a list of dos and don'ts to help library professionals conduct text mining efficiently.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Lamba, M., Madhusudhan, M. (2022). Text Data and Mining Ethics. In: Text Mining for Information Professionals. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85084-5

  • Online ISBN: 978-3-030-85085-2

  • eBook Packages: Computer Science, Computer Science (R0)
