Searching the Web by Meaning: A Case Study of Lithuanian News Websites

  • Conference paper
Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015)

Abstract

The daily growth of unstructured textual information on the Web makes it increasingly difficult to serve users' information needs. At the same time, evolving Semantic Web technology has steered a wide body of research towards meaning-based text processing and information retrieval methods that go beyond classical keyword-driven approaches. However, most work in the field targets English as the primary language of interest. In this paper we therefore present a first attempt to process unstructured Lithuanian text at the level of ontological semantics. We introduce an ontology-based semantic search framework capable of answering structured natural-language questions in Lithuanian, discuss its language-dependent design decisions, and draw observations from the results of a recent case study carried out over a domain-specific corpus of Lithuanian web news.

Corresponding author

Correspondence to Tomas Vileiniškis.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Vileiniškis, T., Šukys, A., Butkienė, R. (2016). Searching the Web by Meaning: A Case Study of Lithuanian News Websites. In: Fred, A., Dietz, J., Aveiro, D., Liu, K., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2015. Communications in Computer and Information Science, vol 631. Springer, Cham. https://doi.org/10.1007/978-3-319-52758-1_4

  • DOI: https://doi.org/10.1007/978-3-319-52758-1_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-52757-4

  • Online ISBN: 978-3-319-52758-1

  • eBook Packages: Computer Science, Computer Science (R0)
