Statistical modelling and parametric optimization in document fragmentation

  • R. Kalaiselvi
  • K. Kousalya
Original Article


Most business enterprises and individuals are now attracted to cloud computing because of its cost efficiency and scalability. Although cloud adoption is significant, security has become a serious concern because of multi-tenancy. Cloud service providers generally commit to ensuring data reliability and security, but these guarantees may weaken as the number of cloud customers grows rapidly. Cryptography is widely used to overcome these security issues and to protect documents uploaded to the cloud. Data security can be improved further with fragmentation, a technique in which partitions of the data are outsourced instead of the entire document. Fragmentation becomes difficult and time-consuming as the document size grows. In this paper, an efficient fragmentation process with virtualization is proposed. Generating virtual machines (VMs) makes efficient use of CPU cycles, which reduces the time taken by the fragmentation process. Four factors, namely document size, processor capacity, storage capacity and number of VMs, are considered in order to analyse their influence on the fragmentation time. Healthcare documents are fragmented, and the values measured in real time are analysed statistically. The experiments use a private OpenStack cloud running on Oracle VirtualBox. The Taguchi technique (L27 orthogonal array) is employed to find the parameter levels that are optimal for fragmentation time, while analysis of variance (ANOVA) is used to assess the contribution of each parameter to the performance of the fragmentation process. The results reveal that document size is the most dominant factor influencing the fragmentation time, followed by processor speed. Parallelizing the fragmentation process across multiple VMs reduces the time the process takes.
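The Taguchi and analysis-of-variance steps mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard smaller-the-better signal-to-noise (S/N) ratio, which suits a response such as fragmentation time, and it estimates each factor's contribution as its share of the total sum of squares of the S/N ratios, without the error-variance correction that a full ANOVA table would apply. The function names, the factor names doc_size and num_vms, and the timings in the demo are all hypothetical and invented for illustration; they are not measurements from the paper.

```python
import math
from collections import defaultdict


def sn_smaller_is_better(times):
    """Taguchi smaller-the-better S/N ratio in dB: -10 * log10(mean(y^2))."""
    mean_square = sum(t * t for t in times) / len(times)
    return -10.0 * math.log10(mean_square)


def main_effects(runs, factor):
    """Mean S/N ratio at each level of one factor."""
    by_level = defaultdict(list)
    for run in runs:
        by_level[run["levels"][factor]].append(run["sn"])
    return {level: sum(v) / len(v) for level, v in by_level.items()}


def percent_contribution(runs, factors):
    """Each factor's share (in percent) of the total sum of squares of S/N."""
    grand_mean = sum(run["sn"] for run in runs) / len(runs)
    total_ss = sum((run["sn"] - grand_mean) ** 2 for run in runs)
    result = {}
    for factor in factors:
        counts = defaultdict(int)
        for run in runs:
            counts[run["levels"][factor]] += 1
        factor_ss = sum(
            counts[level] * (mean - grand_mean) ** 2
            for level, mean in main_effects(runs, factor).items()
        )
        # Guard against a design in which every run has the same S/N ratio.
        result[factor] = 100.0 * factor_ss / total_ss if total_ss else 0.0
    return result


if __name__ == "__main__":
    # Invented timings in seconds for a tiny 2 x 2 design (NOT from the paper).
    raw = [
        ({"doc_size": 1, "num_vms": 1}, [4.0, 4.2]),
        ({"doc_size": 1, "num_vms": 2}, [2.5, 2.7]),
        ({"doc_size": 2, "num_vms": 1}, [9.0, 9.4]),
        ({"doc_size": 2, "num_vms": 2}, [5.5, 5.9]),
    ]
    runs = [{"levels": lv, "sn": sn_smaller_is_better(t)} for lv, t in raw]
    for factor in ("doc_size", "num_vms"):
        effects = main_effects(runs, factor)
        # A higher S/N ratio is better, so the optimum level maximises it.
        print(factor, "best level:", max(effects, key=effects.get))
    print(percent_contribution(runs, ["doc_size", "num_vms"]))
```

With an L27 array the same functions would apply unchanged: each of the 27 rows would supply one entry in runs, with one level per factor and the S/N ratio computed from its repeated timings.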


Keywords: Parametric optimization · Fragmentation · Taguchi · Analysis of variance




Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. Kumaraguru College of Technology, Coimbatore, India
  2. Kongu Engineering College, Erode, India
