Parametric Representation of Paragraphs and Their Classification

Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 27)

Abstract

Automatic paragraph classification is an important task in the field of information retrieval and digital publication. This work presents a novel approach to representing a paragraph of a document by a set of parameters extracted from it, and proposes a methodology based on a multi-layer perceptron for designing an automatic paragraph classifier. The proposed framework has been tested on large industrial data and was found to perform better than a conventional rule-based approach.
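The abstract does not list the parameters or the network configuration, so the following is only a minimal sketch of the general idea: a paragraph is reduced to a fixed-length numeric vector, which is then passed through a one-hidden-layer perceptron. The six features, the three class labels, the layer sizes, and all function names are hypothetical choices made for illustration, and the weights are random and untrained, so the output probabilities carry no meaning beyond showing the shape of the pipeline.

```python
import math
import random
import re


def paragraph_features(text):
    """Map a paragraph to a fixed-length vector (hypothetical features, not the paper's)."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_chars = max(len(text), 1)   # avoid division by zero on empty input
    n_words = max(len(words), 1)
    return [
        len(words) / 100.0,                            # scaled word count
        len(sentences) / 10.0,                         # scaled sentence count
        sum(len(w) for w in words) / n_words / 10.0,   # scaled mean word length
        sum(c.isdigit() for c in text) / n_chars,      # fraction of digit characters
        sum(c.isupper() for c in text) / n_chars,      # fraction of upper-case characters
        1.0 if text.rstrip().endswith((".", "!", "?")) else 0.0,  # ends like a sentence
    ]


def mlp_forward(x, w_hidden, w_out):
    """One tanh hidden layer followed by a softmax; each weight row ends with a bias."""
    x1 = x + [1.0]
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x1))) for row in w_hidden]
    h1 = hidden + [1.0]
    scores = [sum(w * v for w, v in zip(row, h1)) for row in w_out]
    top = max(scores)                                  # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_weights(n_rows, n_inputs, rng):
    """Untrained weights in [-0.5, 0.5]; one extra column per row for the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)] for _ in range(n_rows)]


LABELS = ["heading", "body", "reference"]              # hypothetical classes
rng = random.Random(0)
W_HIDDEN = random_weights(4, 6, rng)                   # 6 features -> 4 hidden units
W_OUT = random_weights(len(LABELS), 4, rng)            # 4 hidden units -> 3 classes

features = paragraph_features(
    "Sebastiani, F.: Machine learning in automated text categorization. "
    "ACM Computing Surveys 34, 1-47 (2002)"
)
probs = mlp_forward(features, W_HIDDEN, W_OUT)
```

In a real setting the weights would be fitted to labelled paragraphs (for example by backpropagation) rather than drawn at random, and the feature set would be chosen to suit the document collection.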

Keywords

Text Classification; Information Retrieval; Machine Intelligence; Multi Layer Perceptron

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Department of Information Technology, Heritage Institute of Technology, Kolkata, India
  2. Wipro Technologies, Kolkata, India
