Dimension Enrichment with Factual Data During the Design of Multidimensional Models: Application to Bird Biodiversity

  • Lucile Sautot
  • Sandro Bimonte
  • Ludovic Journaux
  • Bruno Faivre
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 241)

Abstract

Data warehouses (DW) and OLAP systems are technologies that allow the on-line analysis of huge volumes of data according to decision-makers’ needs. Designing a DW involves taking into account both functional requirements and data sources (mixed design methodology) [1]. For complex applications, however, existing automatic design methodologies seem inefficient. In some cases, decision-makers need to query, as a dimension, data that current automatic mixed approaches have defined as facts. Therefore, in this paper, we propose a new mixed refinement methodology suited to constellation multidimensional schemas. The proposed methodology allows decision-makers to enrich a dimension with factual data. To validate our theoretical proposals, we have implemented an enrichment tool and tested it on a real case study on bird biodiversity.

Keywords

Multidimensional design, Data warehouse, OLAP, Data mining

Notes

Acknowledgements

Data acquisition received financial support from the FEDER Loire, Etablissement Public Loire, DREAL de Bassin Centre, the Région Bourgogne (PARI, Projet Agrale 5) and the French Ministry of Agriculture. We also warmly thank Prof. John Aldo Lee, from the Catholic University of Leuven, for his help.

References

  1. Phipps, C., Davis, K.C.: Automating data warehouse conceptual schema design and evaluation. In: Proceedings of the 4th International Workshop on Design and Management of Data Warehouses (DMDW), vol. 2 (2002)
  2. Kimball, R.: The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses. Wiley, New York (1996)
  3. Romero, O., Abello, A.: A survey of multidimensional modeling methodologies. Int. J. Data Warehouse. Min. 5, 1–23 (2009)
  4. Mahboubi, H., Ralaivao, J.C., Loudcher, S., Boussaïd, O., Bentayeb, F., Darmont, J., et al.: X-WACoDa: an XML-based approach for warehousing and analyzing complex data. In: Data Warehousing Design and Advanced Engineering Applications: Methods for Complex Construction, pp. 38–54 (2009)
  5. Jensen, M.R., Holmgren, T., Pedersen, T.B.: Discovering multidimensional structure in relational data. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds.) DaWaK 2004. LNCS, vol. 3181, pp. 138–148. Springer, Heidelberg (2004)
  6. Favre, C., Bentayeb, F., Boussaid, O.: A knowledge-driven data warehouse model for analysis evolution. Frontiers Artif. Intell. Appl. 143, 271 (2006)
  7. Sautot, L., Faivre, B., Journaux, L., Molin, P.: The hierarchical agglomerative clustering with gower index: a methodology for automatic design of OLAP cube in ecological data processing context. Ecol. Inf. 26, 217–230 (2014) (in Press)
  8. Jovanovic, P., Romero, O., Simitsis, A., Abelló, A.: Ore: An iterative approach to the design and evolution of multi-dimensional schemas. In: Proceedings of the Fifteenth International Workshop on Data Warehousing and OLAP, DOLAP 2012, pp. 1–8. ACM, New York (2012)
  9. Romero, O., Abello, A.: Automatic validation of requirements to support multidimensional design. Data Knowl. Eng. 69, 917–942 (2010)
  10. Carmè, A., Mazon, J.N., Rizzi, S.: A model-driven heuristic approach for detecting multidimensional facts in relational data sources. In: Bach Pedersen, T., Mohania, M.K., Tjoa, A.M. (eds.) DAWAK 2010. LNCS, vol. 6263, pp. 13–24. Springer, Heidelberg (2010)
  11. Nguyen, T.B., Tjoa, A.M., Wagner, R.R.: An object oriented multidimensional data model for OLAP. In: Lu, H., Zhou, A. (eds.) WAIM 2000. LNCS, vol. 1846, pp. 69–82. Springer, Heidelberg (2000)
  12. Messaoud, R.B., Boussaid, O., Rabaséda, S.: A new OLAP aggregation based on the AHC technique. In: DOLAP 2004, ACM Seventh International Workshop on Data Warehousing and OLAP, pp. 65–72 (2004)
  13. Bentayeb, F.: K-means based approach for OLAP dimension updates. In: 10th International Conference on Enterprise Information Systems (ICEIS), pp. 531–534 (2008)
  14. Leonhardi, B., Mitschang, B., Pulido, R., Sieb, C., Wurst, M.: Augmenting OLAP exploration with dynamic advanced analytics. In: 13th International Conference on Extending Database Technology (EDBT 2010) (2010)
  15. Ceci, M., Cuzzocrea, A., Malerba, D.: OLAP over continuous domains via density-based hierarchical clustering. In: König, A., Dengel, A., Hinkelmann, K., Kise, K., Howlett, R.J., Jain, L.C. (eds.) KES 2011, Part II. LNCS, vol. 6882, pp. 559–570. Springer, Heidelberg (2011)
  16. Sautot, L., Bimonte, S., Journaux, L., Faivre, B.: A methodology and tool for rapid prototyping of data warehouses using data mining: application to birds biodiversity. In: Ait Ameur, Y., Bellatreche, L., Papadopoulos, G.A. (eds.) MEDI 2014. LNCS, vol. 8748, pp. 250–257. Springer, Heidelberg (2014)
  17. Arora, M., Gosain, A.: Schema evolution for data warehouse: a survey. Int. J. Comput. Appl. (0975–8887) 22, 6–14 (2011)
  18. Subotic, D., Poscic, P., Jovanovic, V.: Data warehouse schema evolution: state of the art. In: Proceedings of the Central European Conference on Information and Intelligent Systems, pp. 18–25 (2014)
  19. Legube, B., Merlet, N.: Les indicateurs biologiques de la qualité de l’eau. In: L’analyse de l’eau. 9e edn., pp. 865–962. Dunod (2009)
  20. Blondel, J., Ferry, C., Frochot, B.: Point counts with unlimited distance. In: Ralph, C.J., Scott, J.M. (eds.) Estimating Numbers of Terrestrial Birds. Studies in Avian Biology, vol. 6, pp. 414–420 (1981)
  21. I.B.C.C.: Censuring breeding bird by the I.P.A. method. Pol. Ecol. Stud. 3, 15–17 (1977)
  22. Miquel, M., Bédard, Y., Brisebois, A., Pouliot, J., Marchand, P., Brodeur, J.: Modeling multi-dimensional spatio-temporal data warehouses in a context of evolving specifications. Int. Arch. Photogrammetry Remote Sens. Spat. Inf. Sci. 34, 142–147 (2002)
  23. Lenz, H.J., Thalheim, B.: A formal framework of aggregation for the OLAP-OLTP model. J. Univ. Comput. Sci. 15, 273–303 (2009)
  24. Briand, L.C., Morasca, S., Basili, V.R.: An operational process for goal-driven definition of measures. IEEE Trans. Softw. Eng. 28, 1106–1125 (2002)

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Lucile Sautot (1)
  • Sandro Bimonte (2)
  • Ludovic Journaux (3)
  • Bruno Faivre (1)

  1. UMR Biogéosciences, Université de Bourgogne, Dijon, France
  2. IRSTEA Centre de Clermont-Ferrand, Aubière, France
  3. UMR LE2I, Université de Bourgogne, Dijon, France