Performance Improvement of Open Source Based Business Intelligence System Using Database Modeling and Outlier Detection

  • Tsatsral Amarbayasgalan
  • Meijing Li
  • Oyun-Erdene Namsrai
  • Bilguun Jargalsaikhan
  • Keun Ho Ryu
Chapter
Part of the Studies in Computational Intelligence book series (SCI, volume 830)

Abstract

Advances in technology mean that new data is generated every minute. For example, the average computer hard disk held about 10 gigabytes in 2000, whereas today Facebook alone takes in 500 terabytes of new data per day [1]. Data is growing rapidly, but much of it has little value on its own. It is therefore important to extract information that will be useful in the future from large amounts of data. Business intelligence (BI) systems make predictions that support business decisions by analyzing collected data [2]. However, the accuracy of a prediction depends on data quality. In practice, data is often of very low quality and contains many incomplete and anomalous records. A further problem is that query response slows as data size increases. In our previous work, we proposed a framework based on open-source technologies for BI systems that makes it possible to analyze big data efficiently, and applied it to a supermarket’s BI system. For that solution, we studied and successfully implemented the Hadoop data storage system, the Hive data warehouse software, the Sqoop data transfer tool, and other components. In this paper, we add an anomaly detection stage to the proposed framework; eliminating anomalies improves the information about related products that are purchased together. We also conduct an experimental study on improving the speed of time-dependent reports by applying the dimensional model to the Hive data warehouse. In the dimensional model, data is stored in the context of a single table (centralized context), whereas in the relational model the context is distributed over many tables. The experimental study shows that the dimensional model is more efficient: its query response time is at least two times faster than that of the data warehouse based on the relational model.
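
The contrast between the two layouts can be illustrated with a small sketch. The abstract does not give the actual schema, so every table and column name below (`product`, `receipt`, `receipt_line`, `sales_flat`) is hypothetical, and SQLite stands in for Hive purely so the example runs anywhere. It shows only the shape of a time-dependent report under each model, not the reported speed-up.

```python
import sqlite3


def build_db():
    """Create a toy database holding the same sales in both layouts.

    All table and column names are hypothetical illustrations.
    """
    con = sqlite3.connect(":memory:")
    con.executescript("""
        -- Relational layout: the context is spread over several tables.
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE receipt (receipt_id INTEGER PRIMARY KEY, sold_on TEXT);
        CREATE TABLE receipt_line (
            receipt_id INTEGER, product_id INTEGER, amount REAL);

        -- Dimensional-style layout: one wide table carries its own context.
        CREATE TABLE sales_flat (
            sale_year INTEGER, sale_month INTEGER,
            product_name TEXT, amount REAL);

        INSERT INTO product VALUES (1, 'milk'), (2, 'bread');
        INSERT INTO receipt VALUES
            (10, '2018-01-05'), (11, '2018-01-20'), (12, '2018-02-03');
        INSERT INTO receipt_line VALUES
            (10, 1, 2.0), (10, 2, 1.5), (11, 1, 2.0), (12, 2, 3.0);

        -- Pre-join once at load time so reports need no joins later.
        INSERT INTO sales_flat
        SELECT CAST(strftime('%Y', r.sold_on) AS INTEGER),
               CAST(strftime('%m', r.sold_on) AS INTEGER),
               p.name,
               l.amount
        FROM receipt_line AS l
        JOIN receipt AS r ON r.receipt_id = l.receipt_id
        JOIN product AS p ON p.product_id = l.product_id;
    """)
    return con


def monthly_report_relational(con):
    """Monthly totals computed by joining the normalized tables."""
    return con.execute("""
        SELECT CAST(strftime('%Y', r.sold_on) AS INTEGER) AS sale_year,
               CAST(strftime('%m', r.sold_on) AS INTEGER) AS sale_month,
               SUM(l.amount)
        FROM receipt_line AS l
        JOIN receipt AS r ON r.receipt_id = l.receipt_id
        GROUP BY sale_year, sale_month
        ORDER BY sale_year, sale_month
    """).fetchall()


def monthly_report_dimensional(con):
    """The same monthly totals read from the single wide table."""
    return con.execute("""
        SELECT sale_year, sale_month, SUM(amount)
        FROM sales_flat
        GROUP BY sale_year, sale_month
        ORDER BY sale_year, sale_month
    """).fetchall()


if __name__ == "__main__":
    con = build_db()
    print(monthly_report_relational(con))
    print(monthly_report_dimensional(con))
```

Both functions return the same rows; the dimensional version simply avoids the join and the date parsing at query time, which is the kind of work the abstract suggests becomes costly as data grows.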

Keywords

BI system, Big data, Data warehouse, Data mining, Anomaly detection

Notes

Acknowledgements

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (No. 2017R1A2B4010826), by the Private Intelligence Information Service Expansion (No. C0511-18-1001) funded by the NIPA (National IT Industry Promotion Agency), by the Business for Cooperative R&D between Industry, Academy, and Research Institute funded by the Korea Small and Medium Business Administration in 2017 (Grant No. C0541451), and also by the National Natural Science Foundation of China (61702324) in the People’s Republic of China.

References

  1. Big Data Explained: [Online]. Available: https://www.mongodb.com/BIG-DATA-EXPLAINED
  2. Chee, T., Chan, L.K., Chuah, M.H., Tan, C.S., Wong, S.F., Yeoh, W.: Business intelligence systems: state-of-the-art review and contemporary applications. In: Symposium on Progress in Information & Communication Technology, pp. 16–30, Kuala Lumpur, Malaysia (2009)
  3. Tan, P.N., Steinbach, M.: Introduction to Data Mining, 1st edn. Pearson Education Inc., Boston (2006)
  4. Shaw, M.J., Subramaniam, C., Tan, G.W., Welge, M.E.: Knowledge management and data mining for marketing. Decis. Support Syst. 31(1), 127–137 (2001)
  5. Shim, J.P., Warkentin, M., Courtney, J.F., Power, D.J., Sharda, R., Carlsson, C.: Past, present, and future of decision support technology. Decis. Support Syst. 33(2), 111–126 (2002)
  6. Lee, D.G., Ryu, K.S., Bashir, M., Bae, J.W., Ryu, K.H.: Discovering medical knowledge using association rule mining in young adults with acute myocardial infarction. J. Med. Syst. 37(2), 9896 (2013)
  7. Cho, Y.S., Kim, Y.A., Moon, S.C., Park, S.H., Ryu, K.H.: Effective purchase pattern mining with weight based on FRAT analysis for recommender in e-commerce. Lecture Notes in Electrical Engineering 330, 443–454 (2015)
  8. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. 41(3), 1–58 (2009)
  9. Amarbayasgalan, T., Jargalsaikhan, B., Ryu, K.H.: Unsupervised novelty detection using deep autoencoders with density based clustering. Appl. Sci. 8(9), 1468 (2018)
  10. Wu, X., Zhu, X., Wu, G.Q., Ding, W.: Data mining with big data. IEEE Trans. Knowl. Data Eng. 26(1), 97–107 (2014)
  11. Liu, K., Zhou, X., Feng, Y., Liu, J.: Clinical data preprocessing and case studies of POMDP for TCM treatment knowledge discovery. In: 14th International Conference on e-Health Networking, Applications and Services (Healthcom), pp. 10–14, IEEE (2012)
  12. Amarbayasgalan, T., Bukhsuren, E., Namsrai, O., Ryu, K.H.: The approach of implementing business intelligence system: possibility to analyze big data. JARDCS (2), 775–779 (2018)
  13. Alfredo, C.: Analytics over big data: exploring the convergence of data warehousing, OLAP and data-intensive cloud infrastructures. In: IEEE 37th Annual Computer Software and Applications Conference (2013)
  14. Rogers, S.: Big data is scaling BI and analytics. Inf. Manage. 21(5), 14–20 (2011)
  15. Li, D., Park, H.W., Batbaatar, E., Munkhdalai, L., Musa, I., Li, M., Ryu, K.H.: Application of a mobile chronic disease health-care system for hypertension based on big data platforms. J. Sens. (2018)

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Database Laboratory, School of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Korea
  2. College of Information Engineering, Shanghai Maritime University, Shanghai, People’s Republic of China
  3. School of Engineering and Applied Sciences, National University of Mongolia, Ulaanbaatar, Mongolia
  4. Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh City, Vietnam
  5. Department of Computer Science, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju, Korea
