Multi-objective Query Optimization Using Topic Ontologies

  • Rocío L. Cecchini
  • Carlos M. Lorenzetti
  • Ana G. Maguitman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5822)


Formulating search queries based on a thematic context is a challenging problem because of the large number of combinations in which terms can be used to reflect the topic of interest. This paper presents a novel approach to learning topical queries that simultaneously satisfy multiple retrieval objectives. The proposed method uses a topic ontology to train an Evolutionary Algorithm that incrementally moves a population of queries towards these objectives. We present an analysis of different single- and multi-objective strategies, discuss their strengths and limitations, and test the most promising strategies on a large set of labeled Web pages. Our evaluations indicate that the tested strategies that apply multi-objective Evolutionary Algorithms are significantly superior to a baseline approach that attempts to generate queries directly from a topic description.
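
To make the idea of queries that "simultaneously satisfy multiple retrieval objectives" concrete, the following minimal sketch shows Pareto dominance, the comparison on which multi-objective Evolutionary Algorithms are typically built. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the objectives, so the two scores per query, the query strings, and the function names `dominates` and `pareto_front` are all hypothetical, and both objectives are assumed to be maximized.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, ...]


def dominates(a: Scores, b: Scores) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(population: Dict[str, Scores]) -> List[str]:
    """Return the queries whose score vectors no other query dominates."""
    return [
        query
        for query, scores in population.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in population.items()
            if other != query
        )
    ]


# Hypothetical queries with made-up (objective_1, objective_2) scores.
population = {
    "genetic algorithm retrieval": (0.80, 0.30),
    "query optimization ontology": (0.60, 0.55),
    "topic search": (0.40, 0.50),
    "web pages": (0.20, 0.70),
}

front = pareto_front(population)
print(front)
```

In this toy population, "topic search" is dominated by "query optimization ontology" (lower on both scores), so it is left out of the front; the other three queries each win on at least one objective and are kept as trade-off solutions. A multi-objective Evolutionary Algorithm would repeat this kind of comparison across generations while recombining and mutating the surviving queries.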


Keywords: topic ontologies; topical queries; semantic similarity; multi-objective evolutionary algorithms; query effectiveness





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Rocío L. Cecchini (1, 2)
  • Carlos M. Lorenzetti (1, 3)
  • Ana G. Maguitman (1, 3)

  1. Depto de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur, Bahía Blanca, Argentina
  2. LIDeCC - Laboratorio de Investigación y Desarrollo en Computación Científica
  3. LIDIA - Laboratorio de Investigación y Desarrollo en Inteligencia Artificial
