Summary
We consider a family of general aggregation problems and prove each of its members to be NP-complete in the strong sense. These problems require that we partition a set of objects into “aggregates”. The goal is to minimize the expected cost of satisfying an anticipated collection of requests for subsets of the objects, where the cost of satisfying a request depends on both the number and the sizes of the aggregates that must be retrieved. The aggregation problems are viewed as very basic versions of important database optimization problems, including the partitioning of data items into record types, the clustering of records into physical blocks of storage, and the partitioning of a database into granules to support locking. The NP-completeness results demonstrate that such optimization problems are intractable, even when simplified to the extreme. The fact that the problems are NP-complete in the strong sense also rules out pseudopolynomial-time solutions, unless P = NP.
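To make the objective concrete, the following is a minimal sketch of one possible instance of such an aggregation problem. It is not the paper's formal definition: the cost model (a fixed `access_cost` per retrieved aggregate plus a `transfer_cost` per object in it), the rule that a request retrieves every aggregate it intersects, and all function names are assumptions made for illustration. The brute-force search enumerates every partition and is usable only on tiny inputs.

```python
def expected_cost(partition, requests, access_cost=1.0, transfer_cost=1.0):
    """Expected retrieval cost of a partition (assumed cost model).

    partition: list of frozensets, the aggregates.
    requests:  list of (frozenset_of_objects, probability) pairs.
    """
    total = 0.0
    for wanted, prob in requests:
        # Assumption: every aggregate sharing an object with the request
        # must be retrieved in full.
        retrieved = [agg for agg in partition if agg & wanted]
        total += prob * sum(access_cost + transfer_cost * len(agg)
                            for agg in retrieved)
    return total


def partitions(items):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Place `first` into each existing block in turn ...
        for i in range(len(smaller)):
            yield smaller[:i] + [[first] + smaller[i]] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


def best_partition(objects, requests, **costs):
    """Exhaustive search; the number of partitions grows as the Bell numbers."""
    candidates = ([frozenset(block) for block in p]
                  for p in partitions(list(objects)))
    best = min(candidates, key=lambda p: expected_cost(p, requests, **costs))
    return best, expected_cost(best, requests, **costs)


# Hypothetical instance: four objects, two equally likely requests.
requests = [(frozenset("ab"), 0.5), (frozenset("cd"), 0.5)]
best, cost = best_partition("abcd", requests)
```

In this toy instance the search groups each request's objects into their own aggregate. The exhaustive enumeration is used here only because the instance is tiny; the hardness results summarized above indicate that no efficient exact method should be expected in general unless P = NP.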
Helman, P. A family of NP-complete data aggregation problems. Acta Informatica 26, 485–499 (1989). https://doi.org/10.1007/BF00289148
DOI: https://doi.org/10.1007/BF00289148