Benefits of Unstructured Data for Industrial Quality Analysis

  • Christian Hänig
  • Martin Schierle
  • Daniel Trabold
Part of the Lecture Notes in Electrical Engineering book series (LNEE, volume 103)


Although Natural Language Processing (NLP) methods have attracted considerable scientific interest over the past few decades, industrial use cases are still rare. Companies traditionally held mainly structured data (if they had data warehouses at all), and NLP methods were often complicated, non-standardized, or simply too slow.





Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  • Christian Hänig (1)
  • Martin Schierle
  • Daniel Trabold
  1. Daimler AG, Research and Technology, Ulm, Germany
