The Journal of Supercomputing, Volume 73, Issue 2, pp 810–836

A dynamic and parallel approach for repetitive prime labeling of XML with MapReduce

  • Jinhyun Ahn
  • Dong-Hyuk Im
  • Taewhi Lee
  • Hong-Gee Kim
Article

DOI: 10.1007/s11227-016-1803-y

Cite this article as:
Ahn, J., Im, DH., Lee, T. et al. J Supercomput (2017) 73: 810. doi:10.1007/s11227-016-1803-y

Abstract

A massive amount of extensible markup language (XML) data from various domains is available on the Web. Answering structural queries against XML data is important, as it is the core of information retrieval systems for XML data. Labeling schemes have been suggested for rapid query processing of massive XML data. Interval-based, prefix-based, and prime number labeling schemes exist. Of these, the prime number labeling scheme has the advantage that queries can be processed by arithmetic operations. Recently, the repetitive prime number labeling scheme was proposed; by using prime numbers repetitively, this scheme produces smaller labels than conventional prime number labeling. However, no parallel algorithm for the repetitive prime number labeling scheme exists; therefore, the scheme is difficult to apply to massive XML data. In this paper, a dynamic and parallel XML labeling algorithm that works with MapReduce is proposed, specifically for the repetitive prime number labeling scheme. Two optimization techniques are devised: label assignment order adjustment, which further reduces the label size, and upper tree compression, which reduces the memory requirements during the labeling process. Experiments over real-world XML data confirmed that the techniques are more effective than previous works.
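To illustrate what "query processing by arithmetic operations" means, here is a minimal sketch of conventional (non-repetitive) prime number labeling, not the repetitive or MapReduce-based algorithm proposed in the article. It assumes each node receives a distinct prime and that a node's label is the product of the primes on its path from the root, so that an ancestor test reduces to a divisibility check. The names `primes`, `prime_labels`, and `is_ancestor` are hypothetical and chosen only for this example.

```python
from itertools import count
import xml.etree.ElementTree as ET


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def prime_labels(root):
    """Label every element with the product of the primes on its root path.

    Assumption for this sketch: each node gets its own distinct prime,
    assigned in document order.
    """
    gen = primes()
    labels = {}
    stack = [(root, 1)]
    while stack:
        node, parent_label = stack.pop()
        labels[node] = parent_label * next(gen)
        # Push children in reverse so they are popped in document order.
        stack.extend((child, labels[node]) for child in reversed(list(node)))
    return labels


def is_ancestor(labels, a, d):
    """a is a proper ancestor of d iff a's label divides d's label."""
    return a is not d and labels[d] % labels[a] == 0


doc = ET.fromstring("<a><b><c/></b><d/></a>")
labels = prime_labels(doc)
b, d = doc.find("b"), doc.find("d")
c = b.find("c")

print(is_ancestor(labels, doc, c))  # a is an ancestor of c
print(is_ancestor(labels, d, c))    # d is not an ancestor of c
```

Because every node consumes a fresh prime, labels grow quickly with document size; this is the label-size problem that, according to the abstract, the repetitive scheme addresses by reusing primes.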

Keywords

Repetitive prime labeling · MapReduce · XML · Tree

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • Jinhyun Ahn (1)
  • Dong-Hyuk Im (2)
  • Taewhi Lee (3)
  • Hong-Gee Kim (4)
  1. Biomedical Knowledge Engineering Laboratory and Dental Research Institute, Seoul National University, Seoul, Korea
  2. Department of Computer and Information Engineering, Hoseo University, Asan, Korea
  3. BigData Intelligence Research Department, Electronics and Telecommunications Research Institute, Daejon, Korea
  4. Biomedical Knowledge Engineering Laboratory, Dental Research Institute and Institute of Human-Environment Interface Biology, Seoul National University, Seoul, Korea
