Abstract
This chapter first reviews the standard data file types, along with their uses, advantages, and disadvantages. In a digital library, data without a metadata record can be incomplete and effectively unusable. The chapter therefore covers the functions, uses, components, and importance of metadata, followed by the steps to create quality metadata, the common metadata standards, the different metadata repositories, and common concerns and their solutions. The second part of the chapter focuses on the importance of including optical character recognition (OCR) for digitized data, followed by the different ways of getting data to start a text mining project: (i) online repositories, (ii) relational databases, (iii) web APIs, and (iv) web/screen scraping. Several online repositories, language corpora, and repositories with APIs available for text mining are then enumerated. Finally, the chapter covers some essential applications of APIs for librarians and the purposes for which librarians can use them in their day-to-day work.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lamba, M., Madhusudhan, M. (2022). Text Data and Where to Find Them? In: Text Mining for Information Professionals. Springer, Cham. https://doi.org/10.1007/978-3-030-85085-2_2
DOI: https://doi.org/10.1007/978-3-030-85085-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85084-5
Online ISBN: 978-3-030-85085-2
eBook Packages: Computer Science, Computer Science (R0)