A Tool for Converting Different Data Representation Formats

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8891)

Abstract

Data analysis and processing has recently become one of the most active and demanding fields in both academia and industry. A large number of tools are openly available on the web, but different tools take input and return output in different data representation formats. Building a converter for a given pair of formats requires both considerable time and in-depth knowledge of the formats. Here, we discuss the CoNLL, SSF, XML and JSON data representation formats and develop a tool for converting between them. Other conversions will be included in the extended version.
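To make the kind of conversion described above concrete, here is a minimal Python sketch of one direction, CoNLL to JSON and back. It is an illustration only, not the tool presented in the paper: the function names are hypothetical, and the ten-column CoNLL-X layout (one token per tab-separated line, sentences separated by a blank line) is an assumption about the input.

```python
import json

# Column names of the CoNLL-X layout. Assumed for this sketch; the paper's
# tool may expect a different column set.
CONLL_X_FIELDS = ["id", "form", "lemma", "cpostag", "postag",
                  "feats", "head", "deprel", "phead", "pdeprel"]


def conll_to_sentences(text):
    """Parse CoNLL-style text into a list of sentences (lists of token dicts)."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split("\t")
        current.append(dict(zip(CONLL_X_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


def conll_to_json(text):
    """Serialize the parsed sentences as a JSON string."""
    return json.dumps(conll_to_sentences(text), ensure_ascii=False, indent=2)


def sentences_to_conll(sentences):
    """Write token dicts back out as tab-separated CoNLL lines."""
    blocks = []
    for sentence in sentences:
        lines = ["\t".join(token.get(field, "_") for field in CONLL_X_FIELDS)
                 for token in sentence]
        blocks.append("\n".join(lines))
    # Sentences are separated by one blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"
```

Going through an intermediate structure (a list of token dictionaries) means each format needs only a reader and a writer, rather than a dedicated converter for every pair of formats.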

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Chatterji, S., Sengupta, S., Rao, B.G., Banerjee, D. (2014). A Tool for Converting Different Data Representation Formats. In: Prasath, R., O’Reilly, P., Kathirvalavakumar, T. (eds) Mining Intelligence and Knowledge Exploration. Lecture Notes in Computer Science, vol 8891. Springer, Cham. https://doi.org/10.1007/978-3-319-13817-6_28

  • DOI: https://doi.org/10.1007/978-3-319-13817-6_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-13816-9

  • Online ISBN: 978-3-319-13817-6

  • eBook Packages: Computer Science (R0)
