Phrase Pair Classification for Identifying Subtopics

  • Sujatha Das
  • Prasenjit Mitra
  • C. Lee Giles
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7224)


Automatic identification of subtopics for a given topic is desirable because it eliminates the need for manual construction of domain-specific topic hierarchies. In this paper, we use corpus statistics to design features for a classifier that identifies (subtopic, topic) links between phrase pairs. We combine these features with commonly used syntactic patterns to classify phrase pairs from datasets in Computer Science and WordNet. In addition, we show a novel application of our is-a-subtopic-of classifier to query expansion in Expert Search and compare it with pseudo-relevance feedback.


hypernym classification expert search query expansion 





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Sujatha Das (1)
  • Prasenjit Mitra (2)
  • C. Lee Giles (2)

  1. Department of Computer Science and Engineering, The Pennsylvania State University, University Park, USA
  2. School of Information Science and Technology, The Pennsylvania State University, University Park, USA
