Web Based Engine for Processing and Clustering of Polish Texts

  • Conference paper
Theory and Engineering of Complex Systems and Dependability (DepCoS-RELCOMEX 2015)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 365)

Abstract

The paper presents a service-oriented online engine for processing and clustering texts in Polish. Designed according to the Web-Oriented Architecture paradigm, the engine makes it possible to run a large number of language tools (such as a tagger, a named entity recognizer, and a feature extractor) and clustering tools (such as CLUTO or R) from almost any type of application, including HTML/JavaScript ones. It supports the construction of complex workflows, not only simple chains of tools. To meet high-availability requirements, the engine is deployed in a private cloud.
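
The difference between a simple chain of tools and a complex workflow can be illustrated with a small sketch. It is not the engine's actual API, which the abstract does not describe: the step names, the `run_workflow` helper, and the toy functions standing in for a tagger, a named entity recognizer, and a feature extractor are all hypothetical. The sketch only shows one common way to model a workflow, as a dependency graph in which a step may consume the output of several earlier steps.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(steps, deps, source):
    """Run every step after all of its upstream steps have finished.

    steps:  step name -> function taking a list of upstream results
    deps:   step name -> list of upstream step names
            (an empty list means the step reads `source` directly)
    source: the raw input text
    """
    results = {}
    # static_order() yields each step only after its predecessors.
    for name in TopologicalSorter(deps).static_order():
        upstream = deps.get(name, [])
        inputs = [results[d] for d in upstream] if upstream else [source]
        results[name] = steps[name](inputs)
    return results


# Hypothetical stand-ins for real language tools (assumptions, not real tools).
steps = {
    "tagger": lambda ins: ins[0].split(),
    "ner": lambda ins: [tok for tok in ins[0] if tok[:1].isupper()],
    "features": lambda ins: {"tokens": len(ins[0])},
    "cluster_input": lambda ins: {"names": ins[0], **ins[1]},
}

# "ner" and "features" both branch off "tagger" and are merged again in
# "cluster_input", which a linear chain could not express.
deps = {
    "tagger": [],
    "ner": ["tagger"],
    "features": ["tagger"],
    "cluster_input": ["ner", "features"],
}

out = run_workflow(steps, deps, "Ala ma kota w Krakowie")
print(out["cluster_input"])
```

In this model a plain chain is the special case in which every step has exactly one upstream step. Branching and merging need nothing beyond extra entries in `deps`.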

Author information

Corresponding author

Correspondence to Tomasz Walkowiak.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Walkowiak, T. (2015). Web Based Engine for Processing and Clustering of Polish Texts. In: Zamojski, W., Mazurkiewicz, J., Sugier, J., Walkowiak, T., Kacprzyk, J. (eds) Theory and Engineering of Complex Systems and Dependability. DepCoS-RELCOMEX 2015. Advances in Intelligent Systems and Computing, vol 365. Springer, Cham. https://doi.org/10.1007/978-3-319-19216-1_49

  • DOI: https://doi.org/10.1007/978-3-319-19216-1_49

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19215-4

  • Online ISBN: 978-3-319-19216-1

  • eBook Packages: Engineering, Engineering (R0)
