Mining for Empty Rectangles in Large Data Sets

Edmonds, Jeff; Gryz, Jarek; Liang, Dongming; Miller, Renée J.

doi:10.1007/3-540-44503-X_12

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1973)

Included in the following conference series:

International Conference on Database Theory

Abstract

Many data mining approaches focus on the discovery of similar (and frequent) data values in large data sets. We present an alternative but complementary approach in which we search for empty regions in the data. We consider the problem of finding all maximal empty rectangles in large, two-dimensional data sets. We introduce a novel, scalable algorithm for finding all such rectangles. The algorithm achieves this with a single scan over a sorted data set and requires only a small, bounded amount of memory. We also describe an algorithm to find all maximal empty hyper-rectangles in a multi-dimensional space. We consider the complexity of this search problem and present new bounds on the number of maximal empty hyper-rectangles. We briefly overview experimental results obtained by applying our algorithm to a synthetic data set.
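
To make the notion of a maximal empty rectangle concrete, the following is a minimal brute-force sketch in Python. It is not the single-scan algorithm described in the abstract. It assumes, purely for illustration, that the two-dimensional data set is given as a small 0/1 grid in which 1 marks a cell containing a data point; the names `is_empty` and `maximal_empty_rectangles` are hypothetical. A rectangle is reported as maximal when it is empty and cannot be grown by one row or column on any side without leaving the grid or covering a 1.

```python
from itertools import product


def is_empty(grid, top, bottom, left, right):
    """Return True if every cell in the inclusive rectangle is 0."""
    return all(
        grid[r][c] == 0
        for r in range(top, bottom + 1)
        for c in range(left, right + 1)
    )


def maximal_empty_rectangles(grid):
    """Enumerate maximal empty rectangles as (top, bottom, left, right).

    Brute force over all rectangles; intended only for tiny grids.
    Assumption: grid is a non-empty list of equal-length rows of 0/1.
    """
    rows, cols = len(grid), len(grid[0])
    found = []
    for top, left in product(range(rows), range(cols)):
        for bottom, right in product(range(top, rows), range(left, cols)):
            if not is_empty(grid, top, bottom, left, right):
                continue
            # If the rectangle can grow by one empty row or column on
            # some side, it lies inside a larger empty rectangle.
            can_grow = (
                (top > 0
                 and is_empty(grid, top - 1, top - 1, left, right))
                or (bottom < rows - 1
                    and is_empty(grid, bottom + 1, bottom + 1, left, right))
                or (left > 0
                    and is_empty(grid, top, bottom, left - 1, left - 1))
                or (right < cols - 1
                    and is_empty(grid, top, bottom, right + 1, right + 1))
            )
            if not can_grow:
                found.append((top, bottom, left, right))
    return found


if __name__ == "__main__":
    # A 3x3 grid with a single data point in the centre.
    example = [
        [0, 0, 0],
        [0, 1, 0],
        [0, 0, 0],
    ]
    for rect in maximal_empty_rectangles(example):
        print(rect)
```

For the 3x3 example, the maximal empty rectangles are the top row, the bottom row, the left column and the right column. The one-step growth test is sufficient because any empty rectangle strictly containing another must extend it on at least one side, and the first row or column of that extension is itself empty. The cost of this enumeration grows far too quickly for large data sets, which is the setting the abstract's scalable algorithm targets.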

Author information

Authors and Affiliations

York University, Canada
Jeff Edmonds, Jarek Gryz & Dongming Liang
University of Toronto, Canada
Renée J. Miller

Editor information

Editors and Affiliations

Limburg University (LUC), 3590, Diepenbeek, Belgium
Jan Van den Bussche
Department of Computer Science and Engineering, University of California, 92093-0114, La Jolla, CA, USA
Victor Vianu

About this paper

Cite this paper

Edmonds, J., Gryz, J., Liang, D., Miller, R.J. (2001). Mining for Empty Rectangles in Large Data Sets. In: Van den Bussche, J., Vianu, V. (eds) Database Theory — ICDT 2001. ICDT 2001. Lecture Notes in Computer Science, vol 1973. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44503-X_12

DOI: https://doi.org/10.1007/3-540-44503-X_12
Published: 12 October 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41456-8
Online ISBN: 978-3-540-44503-6
eBook Packages: Springer Book Archive
