An Algorithmic Formulation for Extracting Learning Concepts and Their Relatedness in eBook Texts

  • Rajesh Piryani
  • Ashraf Uddin
  • Madhavi Devaraj
  • Vivek Kumar Singh
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8284)

Abstract

In this paper, we present an algorithmic formulation that automatically extracts learning concepts and their relationships from eBook texts and generates RDF data that can serve a number of purposes. Our approach first extracts the structural parts of an eBook (such as chapters and sections) and then, through a sentence-level parsing scheme, identifies the learning concepts described in the eBook text. We have also implemented the identification and extraction of relationships between different learning concepts occurring in a section. In addition, we extract some general data about the eBooks, such as author, price, and reviews, through eBook content mining and web crawling. The learning concepts, their relationships, and the other useful information extracted from the eBooks are then programmatically transformed into machine-readable RDF data. Automating concept and relation extraction and storing the results as RDF makes our effort useful for tasks such as Information Extraction, Concept-based Search, and Machine Reading.
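
The final step described above, turning extracted concepts and relations into machine-readable RDF, can be illustrated with a minimal sketch. It assumes the extraction stage has already produced (concept, relation, concept) tuples; the namespace `http://example.org/ebook/`, the helper names `iri` and `to_ntriples`, and the sample tuples are all hypothetical and do not come from the paper, whose actual RDF vocabulary is not given here.

```python
from urllib.parse import quote

# Hypothetical namespace; the paper's actual RDF vocabulary is not stated here.
BASE = "http://example.org/ebook/"


def iri(label: str) -> str:
    """Turn a free-text label into an IRI under the hypothetical BASE namespace."""
    slug = quote(label.strip().lower().replace(" ", "_"), safe="_")
    return f"<{BASE}{slug}>"


def to_ntriples(relations):
    """Serialize (concept, relation, concept) tuples as N-Triples lines."""
    return [f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in relations]


# Illustrative input only, not data from the paper.
relations = [
    ("binary search tree", "is a", "data structure"),
    ("hash table", "uses", "hash function"),
]

for line in to_ntriples(relations):
    print(line)
```

Each printed line is one subject-predicate-object statement in N-Triples syntax, which any RDF store can load. A real pipeline would more likely map relations onto an agreed schema than mint a new predicate per relation phrase, as this sketch does for brevity.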

Keywords

  • Information Extraction
  • Machine Reading
  • RDF Schema
  • Relation Extraction
  • Semantic Annotation

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Rajesh Piryani (1)
  • Ashraf Uddin (1)
  • Madhavi Devaraj (2)
  • Vivek Kumar Singh (1)

  1. Department of Computer Science, South Asian University, New Delhi, India
  2. Department of Computer Science & Engineering, GBTU, Lucknow, India
