Enterprise Dark Data

  • Conference paper
Data Analysis and Classification (SKAD 2020)

Abstract

The increasing amount of digital data and the declining cost of data storage have led companies to collect all the data they can, regardless of its adequacy and usability. The result is data that is increasingly diverse in structure, quality, availability and source of origin. Dark data is one type of data whose volume grows significantly as the overall volume of data expands. The scientific literature does not define the term “dark data” precisely, and its interpretation among researchers is ambiguous. This article aims to define the dark data occurring in an enterprise by identifying its essential features. It presents an overview of definitions of the term, a proposed interpretation, and a classification of company data with regard to usability, availability and quality. The concept of dark data was analysed through a review of international journals and of articles published on the Internet by Data Science practitioners. The research identified four universal features of dark datasets: unavailability, unawareness, uselessness, and costliness. Four groups of enterprise data were also distinguished on the basis of data availability and quality. The resulting classification made it possible to systematize the term “dark data”.

Notes

  1. The Digital 2020 report is a global overview of Internet users, mobile devices, social networks and e-commerce, compiled by Hootsuite and We Are Social. The statistics, published quarterly, cover global and national data (https://wearesocial.com/digital-2020. Accessed 20 Aug 2020).

  2. IDC report (https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf. Accessed 10 Sept 2020).

  3. These principles apply to the national statistical authorities as well as to the EU statistical authority (Eurostat), and constitute a set of features characterizing data quality for official statistics (https://ec.europa.eu/eurostat/web/products-catalogues/-/KS-02-18-142. Accessed 14 Sept 2020).

  4. A more extensive explanation of good-quality data can be found in (European Statistics Code of Practice 2017).

Author information

Corresponding author

Correspondence to Katarzyna Raca.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Raca, K. (2021). Enterprise Dark Data. In: Jajuga, K., Najman, K., Walesiak, M. (eds) Data Analysis and Classification. SKAD 2020. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-030-75190-6_8
