Discovery of Weather Forecast Web Resources Based on Ontology and Content-Driven Hierarchical Classification

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 383)


Monitoring environmental information is critical both for tracking the evolution of important environmental events and for everyday activities. In this work, we focus on the discovery of web resources that provide weather forecasts. To this end, we submit domain-specific queries to a general-purpose search engine and post-process the results with a hierarchical two-layer classification scheme. The top layer comprises two classification models: (a) the first is trained using ontology concepts as textual features; (b) the second is trained using textual features learned from a training corpus. The bottom layer is a hybrid classifier that combines the results of the top layer. We evaluate the proposed technique by discovering weather forecast websites for cities in Finland and compare the results with previous work.
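
The two-layer scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: simple term-overlap scores stand in for the trained top-layer models, and the concept list, example snippets, function names, weight, and threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens of three or more letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def overlap_score(text, vocabulary):
    """Fraction of tokens found in the vocabulary (stand-in for a trained model)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(token in vocabulary for token in tokens) / len(tokens)


def learn_terms(positive_docs, negative_docs, k=5):
    """Select up to k terms that occur more often in positive than negative docs."""
    pos = Counter(t for d in positive_docs for t in tokenize(d))
    neg = Counter(t for d in negative_docs for t in tokenize(d))
    margins = {t: pos[t] - neg[t] for t in pos}
    ranked = sorted(margins, key=lambda t: (-margins[t], t))
    return {t for t in ranked[:k] if margins[t] > 0}


def hybrid_classify(text, concepts, learned, weight=0.5, threshold=0.2):
    """Bottom layer: weighted combination of the two top-layer scores."""
    score = (weight * overlap_score(text, concepts)
             + (1 - weight) * overlap_score(text, learned))
    return score >= threshold


# Hypothetical ontology concepts (top-layer model a).
CONCEPTS = {"temperature", "wind", "humidity", "precipitation", "forecast"}

# Hypothetical training snippets used to learn features (top-layer model b).
POSITIVE = [
    "Helsinki weather forecast: temperature and wind",
    "Tampere forecast for tomorrow: rain and wind",
]
NEGATIVE = [
    "Cheap flights for Helsinki and Tampere",
    "Helsinki hotels and restaurants for tomorrow",
]
LEARNED = learn_terms(POSITIVE, NEGATIVE)

if __name__ == "__main__":
    for snippet in ["Oulu weather forecast: wind and temperature",
                    "Cheap hotels in Oulu"]:
        print(snippet, "->", hybrid_classify(snippet, CONCEPTS, LEARNED))
```

In this sketch the bottom layer is a fixed weighted average; a trained combiner over the two top-layer outputs would be a closer analogue of a hybrid classifier, at the cost of needing labelled data for the second stage.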


Keywords: environmental, weather forecast, classification, ontology





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  1. Centre for Research and Technology Hellas, Information Technologies Institute, Greece
