Encyclopedia of Database Systems

2018 Edition | Editors: Ling Liu, M. Tamer Özsu

Hierarchical Data Summarization

  • Egemen Tanin
  • Mohammed Eunus Ali
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_536



Data management systems frequently produce summaries of a set of records over different attributes. Common examples are the number of records that fall into each of a set of ranges of an attribute, or the minimum value within each range. Hierarchical versions of data summaries can be used to access summaries efficiently at different resolutions, or when there is a direct need to investigate a hierarchy that is inherent to the data type, such as dates. A data structure or algorithm is labeled hierarchical if it uses subcomponents to systematically obtain conceptually larger components. The method of obtaining a larger component is often induced by the user’s understanding of the domain; hierarchies can also be created automatically by a set of rules embedded into the system. Thus, rules used in a data structure’s creation, e.g., B+-trees,...
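
The idea of obtaining coarser summaries from finer ones can be illustrated with the date hierarchy mentioned above. The following is a minimal sketch, not the entry's own method: it assumes records are (date, value) pairs, keeps a count and a minimum per range, and uses the hypothetical helper names `summarize_by_day` and `roll_up`. Each coarser level (month, year, overall) is derived only from the level below it, never from the raw records.

```python
from datetime import date


def summarize_by_day(records):
    """Finest level: map (year, month, day) -> (count, minimum value)."""
    summary = {}
    for day, value in records:
        key = (day.year, day.month, day.day)
        count, minimum = summary.get(key, (0, value))
        summary[key] = (count + 1, min(minimum, value))
    return summary


def roll_up(summary):
    """Build the next coarser level by dropping the last key component
    and merging the child summaries that share the resulting prefix."""
    parent = {}
    for key, (count, minimum) in summary.items():
        prefix = key[:-1]
        p_count, p_min = parent.get(prefix, (0, minimum))
        parent[prefix] = (p_count + count, min(p_min, minimum))
    return parent


# Illustrative records (made up for this sketch): (date, measured value).
records = [
    (date(2017, 1, 3), 12.5),
    (date(2017, 1, 3), 9.0),
    (date(2017, 2, 14), 20.0),
    (date(2018, 7, 1), 4.5),
]

by_day = summarize_by_day(records)   # keys: (year, month, day)
by_month = roll_up(by_day)           # keys: (year, month)
by_year = roll_up(by_month)          # keys: (year,)
overall = roll_up(by_year)           # single key: ()
```

Count and minimum are chosen here because both can be merged from child summaries without revisiting the records (counts add, minima take the smaller value); a summary such as the median would not roll up this simply.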

Recommended Reading

  1. Aboulnaga A, Aref WG. Window query processing in linear quadtrees. Distrib Parallel Databases. 2001;10(2):111–26.
  2. Ahmad Y, Nath S. Colr-tree: communication-efficient spatio-temporal indexing for a sensor data web portal. In: Proceedings of the 24th International Conference on Data Engineering; 2008. p. 784–93.
  3. Ali ME, Zhang R, Tanin E, Kulik L. A motion-aware approach to continuous retrieval of 3D objects. In: Proceedings of the 24th International Conference on Data Engineering; 2008. p. 843–52.
  4. Antoshenkov G. Query processing in DEC RDB: major issues and future challenges. IEEE Data Eng Bull. 1993;16(4):42–5.
  5. Aoki PM. Generalizing “search” in generalized search trees. In: Proceedings of the 14th International Conference on Data Engineering; 1998. p. 380–9.
  6. Bruno N, Chaudhuri S, Gravano L. STHoles: a multidimensional workload-aware histogram. SIGMOD Rec. 2001;30(2):211–22.
  7. Camerra A, Palpanas T, Shieh J, Keogh E. iSAX 2.0: indexing and mining one billion time series. In: Proceedings of the 10th IEEE International Conference on Data Mining; 2010. p. 58–67.
  8. Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters. Commun ACM. 2008;51(1):107–13.
  9. Ganesan D, Estrin D, Heidemann J. Dimensions: why do we need a new data handling architecture for sensor networks? In: Proceedings of the ACM Workshop on Hot Topics in Networks; 2002.
  10. Gao J, Guibas LJ, Hershberger J, Zhang L. Fractionally cascaded information in a sensor network. In: Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks; 2004. p. 311–9.
  11. Greenstein B, Estrin D, Govindan R, Ratnasamy S, Shenker S. DIFS: a distributed index for features in sensor networks. In: Proceedings of the IEEE Workshop on Sensor Network Protocols and Applications; 2003.
  12. Hellerstein JM, Naughton JF, Pfeffer A. Generalized search trees for database systems. In: Proceedings of the 21st International Conference on Very Large Data Bases; 1995. p. 562–73.
  13. Keogh E, Chakrabarti K, Pazzani M, Mehrotra S. Dimensionality reduction for fast similarity search in large time series databases. J Knowl Inf Syst. 2000;3(3):263–86.
  14. Kitsos I, Magoutis K, Tzitzikas Y. Scalable entity-based summarization of web search results using MapReduce. Distrib Parallel Databases. 2014;32(3):405–46.
  15. Knuth DE. Sorting and searching, the art of computer programming, vol. 3. Redwood City: Addison Wesley Publishing; 1973.
  16. Li X, Kim YJ, Govindan R, Hong W. Multi-dimensional range queries in sensor networks. In: Proceedings of the 1st International Conference on Embedded Networked Sensor Systems; 2003. p. 5–7.
  17. Madden SR, Franklin MJ, Hellerstein JM, Hong W. TinyDB: an acquisitional query processing system for sensor networks. ACM Trans Database Syst. 2005;30(1):122–73.
  18. Nath S, Gibbons PB, Seshan S, Anderson ZR. Synopsis diffusion for robust aggregation in sensor networks. In: Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems; 2004. p. 250–62.
  19. Ordonez C, Mohanam N, Garcia-Alvarado C. PCA for large data sets with parallel data summarization. Distrib Parallel Databases. 2014;32(3):377–403.
  20. Ratnasamy S, Francis P, Handley M, Karp RM, Shenker S. A scalable content-addressable network. In: Proceedings of the 2001 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication; 2001. p. 161–72.
  21. Reiss F, Garofalakis M, Hellerstein JM. Compact histograms for hierarchical identifiers. In: Proceedings of the 32nd International Conference on Very Large Data Bases; 2006. p. 870–81.
  22. Samet H, Sankaranarayanan J, Auerbach M. Indexing methods for moving object databases: games and other applications. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2013. p. 169–80.
  23. Wang J, Wu S, Gao H, Li J, Ooi BC. Indexing multi-dimensional data in a cloud system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2010. p. 591–602.
  24. Wang W, Yang J, Muntz R. STING: a statistical information grid approach to spatial data mining. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 186–95.
  25. Wu S, Jiang D, Ooi BC, Wu K-L. Efficient b-tree based indexing for cloud data processing. Proc VLDB Endowment. 2010;3(1):1207–18.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Computing and Information Systems, University of Melbourne, Melbourne, Australia
  2. Department of Computer Science and Engineering, Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh