Cluster Computing, Volume 22, Supplement 1, pp 2027–2037

A mutual refinement technique for big data retrieval using hash tag graph

  • T. Prasanth
  • M. Gunasekaran


Big data is characterized by growing volume, high velocity, and complex, heterogeneous data. Organizations that hold vast amounts of data need a new generation of analytical tools designed for large data sets. Conventional data-intensive business applications are falling behind because they cannot adequately handle large data volumes, unstructured information, low information retrieval rates, and complex algorithms. Big data is defined by the complexity of the data rather than by its size alone. To address these problems, this paper presents a mutual refinement technique for big data retrieval that improves performance. The proposed system consists of a training phase and a retrieval phase, performed in sequence. In training, the input data is first preprocessed by splitting it. Frequency and entropy features are then extracted from the preprocessed data and passed to the mutual refinement process. In the mutual refinement step, a hash tag graph is generated to train the data, which removes uncertainty from the data. In retrieval, the input query is used for similarity assessment: frequency and entropy features are extracted from the query and compared with the hash tag graph. If the feature values match, the data is retrieved from the hash tag graph and visualized. The performance of the proposed technique is assessed by comparing it with conventional approaches. The experimental results show that the proposed mutual refinement process improves system performance by removing the uncertainty in the system. This work offers a mutual refinement approach that yields better results for retrieving big data efficiently.
The proposed retrieval process performs better than the compared approaches; in future work, experiments can be carried out on larger data sets and real-time applications to evaluate the effectiveness of the proposed method.
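
The training and retrieval flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define how the features or the hash tag graph are computed, so here "frequency" is assumed to be the count of the most common token, "entropy" is assumed to be the Shannon entropy of the token distribution, and the hash tag graph is replaced by a plain lookup table keyed on the feature pair. The function names `extract_features`, `train`, and `retrieve` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def extract_features(text):
    """Return a (frequency, entropy) pair for one record.

    Assumption: frequency is the count of the most common token and
    entropy is the Shannon entropy (in bits) of the token distribution.
    """
    tokens = text.lower().split()  # preprocessing: split the record into tokens
    if not tokens:
        return (0, 0.0)
    counts = Counter(tokens)
    total = len(tokens)
    # H = sum over tokens of p * log2(1 / p), with p = count / total
    entropy = sum((c / total) * math.log2(total / c) for c in counts.values())
    return (max(counts.values()), round(entropy, 3))


def train(records):
    """Group records under their feature pair.

    Hypothetical stand-in for the hash tag graph: a lookup table with
    one entry per distinct feature pair and no edges between entries.
    """
    table = defaultdict(list)
    for record in records:
        table[extract_features(record)].append(record)
    return table


def retrieve(table, query):
    """Return the records whose feature pair matches the query's, else []."""
    return table.get(extract_features(query), [])


records = ["big data big graph", "hash tag graph", "data data data"]
table = train(records)
matches = retrieve(table, "tag graph hash")
print(matches)
```

Exact matching on rounded feature values is the simplest possible similarity test; a tolerance-based comparison would be a natural variation, but the abstract does not say which the authors use.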


Big data, Map reduce, Entropy, Frequency, Hash tag graph



Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam, India
