Query languages for statistical databases

Statistics and Computing

Abstract

Statistical database management systems keep raw, elementary and/or aggregated data and include query languages with facilities to calculate various statistics from this data. In this article we examine statistical database query languages with respect to the criteria identified and taxonomy developed in Ozsoyoglu and Ozsoyoglu (1985b). The criteria include statistical metadata and objects, aggregation features and interface to statistical packages. The taxonomy of statistical database query languages classifies them with respect to the data model used, the type of user interface and method of implementation. Temporal databases are rich sources of data for statistical analysis. Aggregation features of temporal query languages, as well as the issues in calculating aggregates from temporal data, are also examined.

References

  • Ahn, T. H., Jo, H. J., Kim, J. H., Yoon, J. L. and Byung, J. K. (1990) Temporal summary table management and graphic interface. In Proceedings of the 5th International Conference on Statistical and Scientific Database Management, Charlotte, NC.

  • Anderson, G. A., Snider, T., Robinson, B. and Toporek, J. (1983) An integrated support system for inter-package communication and handling large volume output from statistical database analysis operation. In Proc. 2nd International Workshop Statistical Database Management, Los Altos, CA.

  • Boufares, P., Elkabbaj, Y., Joiner, G., Ounally, H. (1985) La version SM90 du SGBD relationnel PEPIN. Journées SM90, Versailles, France.

  • Brown, W. A., Navathe, S. B. and Su, S. Y. W. (1983) Complex data types and a data manipulation language for scientific and statistical databases. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Buhler, R. (1981) Data manipulation in P-STAT. In Proceedings of the First International Workshop on Statistical Database Management, Menlo Park, CA.

  • Catarci, T. and Santucci, G. (1990) GRASP: a graphical system for statistical database. In Proceedings of the 5th International Conference on Statistical and Scientific Database Management, Charlotte, NC.

  • Cattell, R. G. G. (1980) An entity-based database user interface. In Proceedings of the ACM SIGMOD Conf.

  • Chan, C. and Michalewicz, Z. (1986) A query language capable of handling incomplete information and statistics. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Chan, P. and Shoshani, A. (1980) SUBJECT: A dictionary driven system for organizing and accessing large statistical databases. In Proc. VLDB Conf.

  • Chen, M. C., McNamee, L. and Melkanoff, M. (1988) A model of summary data and its applications in statistical databases. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Chen, P. P. S. (1976) The entity relationship model: toward a unifying view of data. ACM Transactions on Database Systems, 1(1).

  • Codd, E. F. (1972) Relational completeness of database sublanguages. In Database Systems (Courant Computer Science Symposia Series, Vol. 6), Prentice-Hall, Englewood Cliffs, NJ.

  • Computer Corporation of America (1979) File Manager's Technical Reference Manual, Model 204 Database Management System. Computer Corporation of America, Cambridge, MA.

  • D'Atri, A. and Ricci, F. L. (1988) Interpretation of statistical queries to relational databases. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Denning, D. E., Nichelson, W., Sande, G. and Shoshani, A. (1983) Research topics in statistical database management. In Proceedings of the Second International Workshop on Statistical Database Management, Los Altos, CA.

  • Dintelman, S. M. and Maness, A. T. (1982) An implementation of a query language supporting path expressions. In Proc. ACM SIGMOD Conference, Orlando, FL.

  • Elmasri, R. and Wuu, G. T. J. (1990) A temporal model and query language for ER-databases. In Proceedings of the 6th International Conference on Data Engineering, Los Angeles, CA. pp. 76–83.

  • Fortunato, E., Rafanelli, N., Ricci, F. L. and Sebastio, A. (1986) An algebra for statistical data. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Fry, J. B. (1981) Data manipulation in SPSS and SPSS-X. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Gadia, S. H. and Vaishnav, J. H. (1985) A query language for a homogeneous temporal database. In Proceedings of the Symposium on PODS, Portland, OR. pp. 51–58.

  • Ghosh, S. P. (1984a) Statistical relation tables for statistical database management. Technical Report RJ 4394, IBM Research Laboratory, San Jose, CA.

  • Ghosh, S. P. (1984b) An application of statistical databases in manufacturing testing. In Proceedings of the IEEE COMPDEC Conference, Chicago, IL.

  • Halanbondrainy, H. (1983) Le système SICLA. In Proceedings of the 3rd International Symposium on Data Analysis, Versailles.

  • Hammond, R. (1981) Metadata in the RAPID DBMS. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Harlee, G. L. (1986) LABSTAT BROWSE: a search facility built for an existing database. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Heiler, S. and Bergman, R. F. (1983) SIBYL: An economist's workbench. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Hendrix, G. G. et al. (1978) Developing a natural language interface to complex data. ACM Transactions on Database Systems, 3(2) pp. 105–47.

  • Hollabaugh, L. A. and Reinwald, L. T. (1981) GPI: a statistical package/database interface. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Ikeda, H. and Kobayashi, Y. (1981) Additional facilities of a conventional DBMS to support interactive statistical analysis. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Jaeschke, G. and Schek, H. J. (1982) Remarks on the algebra of non first normal form relations. In 1st ACM SIGACT/SIGMOD PODS Conference, Los Angeles, CA. pp. 124–138.

  • Johnson, R. (1981) Modelling summary data. In Proceedings of the ACM SIGMOD Conference, Ann Arbor, MI.

  • Joiner, G., Kezouit, O., Halanbondrainy, H. (1986) Data analysis for relational databases: the PEPIN-SICLA Systems. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Karasolo, I. and Sevenson, P. (1983) An overview of CANTOR: a new system for data analysis. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Karasolo, I. and Sevenson, P. (1986) The design of CANTOR—a new system for data analysis. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Klensin, J. C. (1983) A statistical database component of data analysis and modelling system: lessons from eight years of user experience. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Klensin, J. C. and Romberg, R. M. (1988) Statistical data management requirements and the SQL standards—an evolving comparison. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Klug, A. (1981) ABE—a query language for constructing aggregates-by-example. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Klug, A. (1982) Access paths in the ABE statistical query facility. In Proceedings of the ACM SIGMOD Conference, Orlando, FL.

  • Kohji, S. and Sato, H. (1983) Statistical database research project in Japan and the CAS SDB project. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Lipski, D. (1979) The semantic issues connected with incomplete information. ACM Transactions on Database Systems, 4(3).

  • Maier, M. and Cirilli, C. (1983) SYSTEM/K: a knowledge base management system. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Malmborg, E. (1986) On the semantics of aggregated data. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg, pp. 152–58.

  • Malmborg, E. (1988) Design of user interface for an object-oriented statistical database. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Maness, A. T. and Dintelman, S. A. (1981a) Design of the genealogical information system. In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Maness, A. T. and Dintelman, S. A. (1981b) The GENISYS data definition facilities. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • McCarthy, J. I. (1982) Metadata management for large statistical databases. In Proceedings of the VLDB Conference. Mexico City.

  • McKenzie, E. and Snodgrass, R. (1987) Supporting valid time: a historical relational algebra. Technical Report, Department of Computer Science, University of North Carolina at Chapel Hill.

  • Melton, J. (1988) ISO Database language. CPH-2a, ANSI X3H288-127, ANSI X3H2 ISO/IEC JTC1/SC21/WG3, Database Languages, ISO-ANSI (Working Draft) SQL2.

  • Merrill, D., McCarthy, J., Gey, F. and Holmes, H. (1983) Distributed data management in a minicomputer network. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Olken, F. (1983) How baroque should a statistical database management system be? In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1983a) Features of SSDB. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1983b) An extension of relational algebra for summary tables. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1984a) STBE—a database query language for manipulating summary data. In Proceedings of the IEEE COMPDEC Conference, Los Angeles, CA.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1984b) SSDB—an architecture for statistical databases. In Proceedings of the 4th IJCIT Conference, Jerusalem, Israel.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1985a) A query language for statistical databases. In Query Processing in Database Systems (W. Kim, D. Reiner and D. S. Batory, eds.), Springer-Verlag, New York.

  • Ozsoyoglu, G. and Ozsoyoglu, Z. M. (1985b) Statistical database query languages. IEEE Transactions on Software Engineering, 11(10), pp. 1071–80.

  • Ozsoyoglu, G., Ozsoyoglu, Z. M. and Mata, F. (1985) A language and a physical organization technique for summary tables. In Proceedings of the ACM SIGMOD Conference, Austin, TX.

  • Ozsoyoglu, G., Ozsoyoglu, Z. M. and Matos, V. (1987) Extending relational algebra and relational calculus with set-valued attributes and aggregate functions. ACM Transactions on Database Systems, 12(4), pp. 566–592.

  • Rafanelli, M. and Ricci, F. (1990) A visual interface for browsing and manipulating statistical entities. In Proceedings of the 5th International Conference on Statistical and Scientific Database Management, Charlotte, NC.

  • Rosenthal, A. and Reiner, D. (1984) Extending the algebraic framework of query processing to handle outerjoins. In Proceedings of the VLDB Conference, Singapore.

  • SAS User Manual (1992) SAS Institute Inc, Box 8000, Cary, NC.

  • Sato, H. (1981) Handling summary information in a database: derivability. In Proceedings of the ACM SIGMOD Conference, Orlando, FL.

  • Sato, H. (1988) A data model, knowledge base and natural language processing for sharing a large statistical database. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Sato, H., Takayaki, O., Youshindu, N. and Pysouke, F. (1986) Conceptual schema for a wide-scope statistical database and its applications. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Segev, A. and Shoshani, A. (1987) Logical modelling of temporal databases. In Proceedings of the SIGMOD Conference. San Francisco, CA, pp. 454–466.

  • Shoshani, A. (1979) CABLE: a language based on the E-R model. In Proceedings of the E-R Conference, Los Angeles, CA.

  • Shoshani, A. (1982) Statistical databases: characteristics, problems and some solutions. In Proceedings of the VLDB Conference, Mexico City, pp. 208–222.

  • Shoshani, A. and Kawagoe, K. (1987) Temporal data management. In Proceedings of the VLDB Conference, Kyoto, Japan, pp. 79–88.

  • Snodgrass, R. (1987) The temporal query language TQUEL. ACM Transactions on Database Systems, 12(2), pp. 247–298.

  • Snodgrass, R., Gomez, S. and McKenzie, E. (1987) Aggregates in the temporal query language TQUEL. Technical Report, Department of Computer Science, University of North Carolina at Chapel Hill.

  • SQL/DS (1981) SQL/Data system: general information. Report GH24–5012, IBM Corporation. Department GRIT, 180 Kost Road, Mechanicsburg, PA 17055.

  • Stein, D. M. (1986) A database interface to an integrated data analysis and plotting tool. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Stephenson, G. A. (1988) Knowledge browsing-front ends to statistical database. In Proceedings of the 4th International Conference on Statistical and Scientific Database Management, Rome.

  • Stonebraker, M., Wong, E., Kreps, P. and Held, G. (1976) The design and implementation of INGRES. ACM Transactions on Database Systems, 1(3), pp. 189–222.

  • Su, S. Y. W. (1983) SAM*: A semantic association model for corporate and scientific-statistical database. Information Sciences, 29 pp. 151–199.

  • Su, S. Y. W., Navathe, S. B. and Batory, D. S. (1983) Logical and physical modeling of statistical/scientific databases. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Table Producing Language System, version 5 (1980) Bureau of Labor Statistics, Washington, DC.

  • Tansel, A. U. (1986) Adding time dimensions to relational model and extending relational algebra. Information Systems, 11(4), pp. 343–355.

  • Tansel, A. U. (1987) A statistical interface for historical relational databases. In Proceedings of the Data Engineering Conference, Los Angeles, CA.

  • Tansel, A. U. (1988) A statistical database for planning and research. Technical Report, Baruch College, CUNY.

  • Tansel, A. U. (1990a) Modelling temporal data. Journal of Information and Software Technology, 32(8).

  • Tansel, A. U. (1990b) A historical query language. Information Sciences, 32(8), 514–20.

  • Tansel, A. U. (1991) Statistical database query languages. In Statistical and Scientific Databases (ed. Z. Michalewicz), pp. 233–65. Ellis Horwood, London.

  • Tansel, A. U. (1992) Temporal relational data model. Technical Report, CIS-26-92, Baruch College, CUNY.

  • Tansel, A. U. and Arkun, M. E. (1986a) HQUEL, a query language for historical relational databases. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Tansel, A. U. and Arkun, M. E. (1986b) Aggregation operations in historical relational databases. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Tansel, A. U. and Garnett, I. (1991) Equivalence of algebra and calculus languages for nested relation. Computers and Mathematics with Applications, 23(10) 3–25.

  • Tansel, A. U., Arkun, M. E. and Ozsoyoglu, G. (1989) Time-by-example database query language. IEEE Transactions on Software Engineering, 15(4).

  • Thomas, J. J. and Hall, D. L. (1983) ALDS project: Motivation, statistical database management issues, perspectives, and directions. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Turner, M., Hammond, R. and Gotten, P. (1979) A DBMS for statistical databases. In Proceedings of the VLDB Conference, Rio de Janeiro, Brazil.

  • Weiss, S. E. and Weeks, P. L. (1983) PASTE—a tool to put application systems together easily. In Proceedings of the 2nd International Workshop on Statistical Database Management, Los Altos, CA.

  • Weiss, S. E., Weeks, P. L. and Byrd, N. J. (1981) Must we navigate through databases? In Proceedings of the 1st International Workshop on Statistical Database Management, Menlo Park, CA.

  • Whistler, D. (1986) The design of a database management system for economic time series data. In Proceedings of the 3rd International Workshop on Statistical and Scientific Database Management, Luxemburg.

  • Wong, H. K. T. and Kuo, I. (1982) GUIDE: Graphical user interface for database exploration. In Proceedings of the VLDB Conference, Mexico City.

  • Zloof, M. M. (1977) Query-by-example: a database language. IBM Systems Journal, 16(4) 324–43.

Cite this article

Tansel, A.U. Query languages for statistical databases. Stat Comput 5, 59–72 (1995). https://doi.org/10.1007/BF00140666

  • DOI: https://doi.org/10.1007/BF00140666
