The Dicode Data Mining Services

Friesen, Natalja; Jakob, Max; Kindermann, Jörg; Maassen, Doris; Poigné, Axel; Rüping, Stefan; Trabold, Daniel

doi:10.1007/978-3-319-02612-1_5

Part of the book series: Studies in Big Data (SBD, volume 5)

Abstract

Real-world problems in society, science, or economics require human structuring, interpretation, and decision making; the limiting factor is the amount of time and effort that users can invest in the sense-making process. The Dicode data mining services aim to help at clearly defined steps of that process, where human capacity is most limited and the impact of automatic solutions is most profound. They include recommendation services to search and filter information, text mining services to search for new information and unknown relations in data, and subgroup discovery services to find and evaluate hypotheses on data. This chapter provides an overview of the data mining services developed in the context of the Dicode project. It addresses the usability of the services and indicates which big data technologies are used to deal with very large data collections.

Notes

1. Developers interested in using the statistics may want to watch Max Jakob’s talk at the Berlin Buzzwords Conference 2012, which explains the extraction of Wikipedia statistics in detail: http://vimeo.com/45123391.

Author information

Authors and Affiliations

Fraunhofer IAIS, Schloss Birlinghoven, 53754, Sankt Augustin, Germany
Natalja Friesen, Jörg Kindermann, Axel Poigné, Stefan Rüping & Daniel Trabold
Neofonie GmbH, 10115, Berlin, Germany
Max Jakob & Doris Maassen

Corresponding author

Correspondence to Natalja Friesen.

Editor information

Editors and Affiliations

University of Patras and Computer Technology Institute & Press "Diophantus", Rio Patras, Greece
Nikos Karacapilidis

About this chapter

Cite this chapter

Friesen, N. et al. (2014). The Dicode Data Mining Services. In: Karacapilidis, N. (ed.) Mastering Data-Intensive Collaboration and Decision Making. Studies in Big Data, vol 5. Springer, Cham. https://doi.org/10.1007/978-3-319-02612-1_5

DOI: https://doi.org/10.1007/978-3-319-02612-1_5
Published: 06 April 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02611-4
Online ISBN: 978-3-319-02612-1
eBook Packages: Engineering, Engineering (R0)
