Improving Text Search on Hybrid Data

  • Huaijie Zhu
  • Xiaochun Yang
  • Bin Wang
  • Yue Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7419)


In practice, much data is hybrid: it contains both unstructured and structured data. Most text search techniques for hybrid data focus only on the unstructured part (the text) and ignore the structured data, which can lead to a poor ranking of the search results. In this paper, we describe a new method that improves text search by using structured data. Our contributions are summarized as follows: (i) we build a uniform problem model; (ii) ours is the first approach to adopt the mutual information of feature words to quantify the relevance (similarity) between two texts; and (iii) we apply several rules that take the structured data into account to improve text search, and build our approach on them. Finally, experimental results show that the relevance function and our approach achieve high recall, top-k precision, and Mean Average Precision, together with good search performance.
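
The evaluation measures named in the abstract (recall, top-k precision, and Mean Average Precision) have standard textbook definitions. The following is a minimal sketch of those definitions, not the authors' evaluation code: the function names, the document identifiers, and the toy rankings are all assumptions made for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k ranked results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Fraction of all relevant documents that appear in the result list."""
    if not relevant:
        return 0.0
    return len(set(ranked) & relevant) / len(relevant)


def average_precision(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Mean of the precision values taken at each rank where a relevant document occurs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average of average_precision over several (ranking, relevant set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: two queries with made-up document ids.
query_a = (["d1", "d2", "d3", "d4"], {"d1", "d3"})
query_b = (["d5", "d6"], {"d6"})
```

With these toy rankings, `precision_at_k(*query_a, k=2)` counts one relevant document among the first two results, and `mean_average_precision([query_a, query_b])` averages the per-query average precision. A ranking that moves relevant documents toward the top raises both values, which is the effect an improved relevance function is meant to produce.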





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Huaijie Zhu (1)
  • Xiaochun Yang (1)
  • Bin Wang (1)
  • Yue Wang (1)
  1. College of Information Science and Engineering, Northeastern University, Liaoning, China
