Mining for Empty Rectangles in Large Data Sets

  • Jeff Edmonds
  • Jarek Gryz
  • Dongming Liang
  • Renée J. Miller
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1973)

Abstract

Many data mining approaches focus on the discovery of similar (and frequent) data values in large data sets. We present an alternative but complementary approach in which we search for empty regions in the data. We consider the problem of finding all maximal empty rectangles in large, two-dimensional data sets. We introduce a novel, scalable algorithm for finding all such rectangles. The algorithm achieves this with a single scan over a sorted data set and requires only a small, bounded amount of memory. We also describe an algorithm to find all maximal empty hyper-rectangles in a multi-dimensional space. We consider the complexity of this search problem and present new bounds on the number of maximal empty hyper-rectangles. We briefly overview experimental results obtained by applying our algorithm to a synthetic data set.
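To make the notion of a maximal empty rectangle concrete, the following is a minimal brute-force sketch in Python. It is not the single-scan algorithm described in the abstract. It assumes the two-dimensional data set is given as a 0/1 matrix in which 1 marks a cell that contains data. The function name `maximal_empty_rectangles` and the (top, left, bottom, right) output convention are illustrative choices, not taken from the paper.

```python
def maximal_empty_rectangles(matrix):
    """Return all maximal empty rectangles of a 0/1 matrix.

    Each rectangle is a tuple (top, left, bottom, right) with inclusive
    indices. A rectangle is empty if it covers only 0-cells, and maximal
    if it cannot be grown by one row or column in any direction while
    staying empty. Brute force: for illustration on small inputs only.
    """
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0

    # prefix[i][j] = number of 1-cells in the sub-matrix matrix[0:i][0:j]
    prefix = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            prefix[i + 1][j + 1] = (matrix[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])

    def is_empty(top, left, bottom, right):
        # Rectangles that leave the matrix count as non-empty,
        # so the matrix border blocks any further extension.
        if top < 0 or left < 0 or bottom >= rows or right >= cols:
            return False
        ones = (prefix[bottom + 1][right + 1] - prefix[top][right + 1]
                - prefix[bottom + 1][left] + prefix[top][left])
        return ones == 0

    result = []
    for top in range(rows):
        for bottom in range(top, rows):
            for left in range(cols):
                for right in range(left, cols):
                    if not is_empty(top, left, bottom, right):
                        continue
                    # Skip rectangles that can still grow in some direction.
                    if (is_empty(top - 1, left, bottom, right)
                            or is_empty(top, left - 1, bottom, right)
                            or is_empty(top, left, bottom + 1, right)
                            or is_empty(top, left, bottom, right + 1)):
                        continue
                    result.append((top, left, bottom, right))
    return result


if __name__ == "__main__":
    data = [[0, 0, 1],
            [0, 0, 0],
            [1, 0, 0]]
    for rect in maximal_empty_rectangles(data):
        print(rect)
```

The sketch checks every candidate rectangle, so its cost grows quickly with the matrix size. That cost is the reason a single scan with bounded memory, as the abstract describes, matters for large data sets.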

Keywords

Association Rule, Scalable Algorithm, High Step, Empty Region, Empty Rectangle
These keywords were added by machine and not by the authors.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Jeff Edmonds (1)
  • Jarek Gryz (1)
  • Dongming Liang (1)
  • Renée J. Miller (2)
  1. York University, Canada
  2. University of Toronto
