Interesting Subset Discovery and Its Application on Service Processes

  • Chapter
Data Mining for Service

Part of the book series: Studies in Big Data (SBD, volume 3)

Abstract

Many real-life datasets can be viewed as a set of records, each described by a set of attributes and evaluated by a set of measures. We address the problem of automatically discovering interesting subsets of such a dataset, that is, subsets whose performance characteristics differ significantly from those of the rest of the dataset. We present an algorithm to discover such subsets. The algorithm uses a generic, domain-independent definition of interestingness and applies several heuristics to prune the search space, so that the solution scales to large datasets. We apply the algorithm to four real-world case studies and demonstrate its effectiveness in extracting insights that identify problem areas and yield improvement recommendations for a wide variety of systems.
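
The abstract does not spell out the interestingness definition or the pruning heuristics, so the following is only a minimal sketch of the general idea, not the chapter's algorithm. It assumes Welch's t statistic as a stand-in interestingness test, a minimum subset size as the only pruning rule, and hypothetical names throughout (`interesting_subsets`, `welch_t`, the `team`, `priority`, and `hours` fields, and all thresholds).

```python
from itertools import combinations
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two samples (each needs at least 2 values)."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: equal means give 0, otherwise +/- infinity.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def interesting_subsets(records, attributes, measure,
                        max_len=2, min_size=2, t_threshold=2.0):
    """Return (descriptor, subset size, t) for subsets that differ from the rest.

    A descriptor is a conjunction of attribute=value conditions. The test and
    all thresholds are illustrative assumptions, not taken from the chapter.
    """
    pairs = sorted({(a, r[a]) for r in records for a in attributes})
    results = []
    for k in range(1, max_len + 1):
        for desc in combinations(pairs, k):
            if len({a for a, _ in desc}) < k:
                continue  # at most one value per attribute in a descriptor
            inside, outside = [], []
            for r in records:
                matches = all(r[a] == v for a, v in desc)
                (inside if matches else outside).append(r[measure])
            if len(inside) < min_size or len(outside) < min_size:
                continue  # prune subsets too small to compare meaningfully
            t = welch_t(inside, outside)
            if abs(t) >= t_threshold:
                results.append((dict(desc), len(inside), t))
    # Strongest deviation from the rest of the dataset first.
    results.sort(key=lambda item: -abs(item[2]))
    return results


# Hypothetical service tickets: two attributes and one measure (resolution hours).
records = [
    {"team": "A", "priority": "high", "hours": 10},
    {"team": "A", "priority": "low", "hours": 11},
    {"team": "A", "priority": "high", "hours": 12},
    {"team": "B", "priority": "high", "hours": 2},
    {"team": "B", "priority": "low", "hours": 3},
    {"team": "B", "priority": "low", "hours": 2},
    {"team": "C", "priority": "low", "hours": 3},
    {"team": "C", "priority": "high", "hours": 2},
]

found = interesting_subsets(records, ["team", "priority"], "hours")
for descriptor, size, t in found:
    print(descriptor, size, round(t, 2))
```

In this toy data the subset `team = A` ranks first because its resolution times are far above those of the remaining records. A realistic implementation would need stronger pruning than a size threshold, since the number of candidate descriptors grows combinatorially with the number of attributes.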

Author information

Corresponding author

Correspondence to Maitreya Natu.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Natu, M., Palshikar, G.K. (2014). Interesting Subset Discovery and Its Application on Service Processes. In: Yada, K. (ed.) Data Mining for Service. Studies in Big Data, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45252-9_14

  • DOI: https://doi.org/10.1007/978-3-642-45252-9_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-45251-2

  • Online ISBN: 978-3-642-45252-9

  • eBook Packages: Engineering, Engineering (R0)
