Abstract

The World Wide Web is a storehouse of abundant information available in various electronic forms. Over the past two decades, improvements in the ability of computers to handle large quantities of text data have led researchers to focus on reliable and optimal retrieval of information that already exists in these huge resources. Although existing search engines and answering machines have succeeded in retrieving data related to the user query, the relevance of the retrieved text data within the huge result set is not appreciable. Bounding the range of resulting text data for a given user query, with a meaningful rank for each document, therefore stands as a major challenge. In this paper, we propose a query-based k-Nearest Neighbor method that retrieves relevant documents for a given query by finding the most appropriate boundary around the related documents available on the web, and that ranks the documents on the basis of the query rather than by customary content-based classification. The experimental results illustrate the categorization with reference to the closeness of the given query to each document.
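
As a rough illustration of ranking documents by their closeness to a query, the sketch below scores each document against the query and keeps the k nearest. It is not the authors' implementation, which the abstract does not specify: the whitespace tokenizer, the raw term-count vectors, the cosine similarity measure, and the names `tf_vector`, `cosine`, and `rank_by_query` are all assumptions made for this example.

```python
import math
from collections import Counter


def tf_vector(text):
    """Map lowercased whitespace tokens to raw term counts (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_query(query, documents, k):
    """Return the k documents nearest to the query as (index, score), best first."""
    q = tf_vector(query)
    scored = [(i, cosine(q, tf_vector(d))) for i, d in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy corpus, used only to exercise the functions above.
docs = [
    "text categorization with nearest neighbor classifiers",
    "cooking recipes for pasta and sauces",
    "ranking documents for a user query with nearest neighbor search",
]
top = rank_by_query("nearest neighbor query ranking", docs, k=2)
for index, score in top:
    print(index, round(score, 3))
```

Here the value of k bounds the result set, and the similarity score supplies the rank within it. A real system would likely replace raw counts with a weighted scheme such as TF-IDF and add stop-word removal.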

Keywords

Text Categorization; User Query; Training Document; Predefined Category; Wise Approach

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Suneetha Manne (1)
  • Sita Kumari Kotha (1)
  • S. Sameen Fatima (2)

  1. Department of IT, VRSEC, Vijayawada, India
  2. Department of CSE, Osmania University, Hyderabad, India
