Enhancing Big Data Exploration with Faceted Browsing

  • Conference paper
Classification, (Big) Data Analysis and Statistical Learning

Abstract

Big data analysis now drives nearly every aspect of modern society, from manufacturing and retail, through mobile and financial services, to the life and physical sciences. The ability to keep using big data to make new connections and discoveries will help drive the breakthroughs of tomorrow. One of the most valuable means of making sense of big data, and thus making it approachable to most people, is data visualization. Data visualization can guide decision-making and convey information that is critical in any data analysis. However, to be actionable, data visualizations must offer the right amount of interactivity: they have to be well designed, easy to use, understandable, meaningful, and approachable. In this article, we present a new approach to visualizing huge amounts of data, based on a Bayesian suggestion algorithm and the widely used enterprise search platform Solr. We demonstrate how the proposed Bayesian suggestion algorithm becomes a key ingredient in a big data scenario, where a single query can generate so many results that the user becomes confused. In such a scenario, the selection of the best results, together with the result path chosen by the user through multifaceted querying and faceted navigation, can be very useful.

Notes

  1. http://lucene.apache.org/solr/resources.html

  2. http://www.openmarkov.org/users.html

  3. http://archive.ics.uci.edu/ml/datasets/Mushroom

Acknowledgements

We would like to thank Paolo Malavolta and Emanuele Charalambis, who worked on this project for their master's theses as students of the DBGroup (www.dbgroup.unimo.it) of the University of Modena and Reggio Emilia, during their period abroad hosted by the University of Michigan under the supervision of Professor H. V. Jagadish.

Author information

Corresponding author

Correspondence to Sonia Bergamaschi.

Copyright information

© 2018 Springer International Publishing AG

Cite this paper

Bergamaschi, S., Simonini, G., Zhu, S. (2018). Enhancing Big Data Exploration with Faceted Browsing. In: Mola, F., Conversano, C., Vichi, M. (eds) Classification, (Big) Data Analysis and Statistical Learning. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-55708-3_2
