Parsimonious temporal aggregation

Gordevičius, Juozas; Gamper, Johann; Böhlen, Michael

doi:10.1007/s00778-011-0243-9

Regular Paper
Published: 24 August 2011

Volume 21, pages 309–332 (2012)
The VLDB Journal

Juozas Gordevičius¹,
Johann Gamper² &
Michael Böhlen³

Abstract

Temporal aggregation is an important operation in temporal databases, and different variants thereof have been proposed. In this paper, we introduce a novel temporal aggregation operator, termed parsimonious temporal aggregation (PTA), that overcomes major limitations of existing approaches. PTA takes the result of instant temporal aggregation (ITA) of size n, which might be up to twice as large as the argument relation, and merges similar tuples until a given error bound (\(\epsilon\)) or size bound (c) is reached. The new operator is data-adaptive and allows the user to control the trade-off between the result size and the error introduced by merging. For the precise evaluation of PTA queries, we propose two dynamic programming–based algorithms, one for size-bounded and one for error-bounded queries, each with a worst-case complexity that is quadratic in n. We present two optimizations that take advantage of temporal gaps and different aggregation groups and achieve a linear runtime in experiments with real-world data. For the quick computation of an approximate PTA answer, we propose an efficient greedy merging strategy with a precision that is upper bounded by O(log n). We present two algorithms that implement this strategy and begin to merge as ITA tuples are produced. They require O(n log (c + β)) time and O(c + β) space, where β is the size of a read-ahead buffer and is typically very small. An empirical evaluation on real-world and synthetic data shows that PTA considerably reduces the size of the aggregation result while introducing only small errors. The greedy algorithms scale to large data sets and introduce less error than other approximation techniques.
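The size-bounded greedy merging described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define the error measure or the merge rule, so the sketch assumes that a merged tuple carries the duration-weighted average of its parts, that error is the sum of squared deviations weighted by interval length, and that only tuples of the same aggregation group with no temporal gap between them may be merged. It also rescans all pairs on every step instead of using the streaming, buffer-based approach the abstract mentions. The names `ItaTuple` and `greedy_pta` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ItaTuple:
    """One ITA result tuple (hypothetical layout): group, closed interval, value."""
    group: str
    start: int  # first time point, inclusive
    end: int    # last time point, inclusive
    value: float


def length(t):
    return t.end - t.start + 1


def adjacent(a, b):
    # Assumption: merging is allowed only within one group and across no gap.
    return a.group == b.group and a.end + 1 == b.start


def merge(a, b):
    # Assumption: the merged value is the duration-weighted average.
    la, lb = length(a), length(b)
    return ItaTuple(a.group, a.start, b.end, (la * a.value + lb * b.value) / (la + lb))


def merge_cost(a, b):
    # Increase in squared error (weighted by interval length) caused by
    # replacing two constant segments with their weighted mean.
    la, lb = length(a), length(b)
    return la * lb / (la + lb) * (a.value - b.value) ** 2


def greedy_pta(ita, c):
    """Merge the cheapest adjacent pair until at most c tuples remain
    or no mergeable pair is left. Returns (tuples, accumulated error)."""
    result = list(ita)
    total_error = 0.0
    while len(result) > c:
        best = None
        for i in range(len(result) - 1):
            if adjacent(result[i], result[i + 1]):
                cost = merge_cost(result[i], result[i + 1])
                if best is None or cost < best[0]:
                    best = (cost, i)
        if best is None:  # only gaps or group boundaries remain
            break
        cost, i = best
        result[i:i + 2] = [merge(result[i], result[i + 1])]
        total_error += cost
    return result, total_error


if __name__ == "__main__":
    ita = [
        ItaTuple("A", 1, 2, 10.0),
        ItaTuple("A", 3, 4, 12.0),
        ItaTuple("A", 5, 6, 50.0),
        ItaTuple("A", 9, 10, 51.0),  # separated from the previous tuple by a gap
    ]
    reduced, err = greedy_pta(ita, c=3)
    for t in reduced:
        print(t)
    print("introduced error:", err)
```

With a size bound of 3, the sketch merges the two most similar neighbours (values 10 and 12) and leaves the rest untouched. The tuple after the temporal gap is never merged, so the result cannot shrink below two tuples in this example regardless of the bound.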

Author information

Authors and Affiliations

Institute of Mathematics and Informatics, Vilnius University, Vilnius, Lithuania
Juozas Gordevičius
Free University of Bozen-Bolzano, Bolzano, Italy
Johann Gamper
Department of Informatics, University of Zurich, Zurich, Switzerland
Michael Böhlen

Corresponding author

Correspondence to Juozas Gordevičius.

About this article

Cite this article

Gordevičius, J., Gamper, J. & Böhlen, M. Parsimonious temporal aggregation. The VLDB Journal 21, 309–332 (2012). https://doi.org/10.1007/s00778-011-0243-9

Received: 12 November 2010
Revised: 06 June 2011
Accepted: 27 June 2011
Published: 24 August 2011
Issue Date: June 2012
DOI: https://doi.org/10.1007/s00778-011-0243-9
