Interesting Subset Discovery and Its Application on Service Processes

Natu, Maitreya; Palshikar, Girish Keshav

doi:10.1007/978-3-642-45252-9_14


Part of the book series: Studies in Big Data (SBD, volume 3)


Abstract

Many real-life datasets can be viewed as a set of records, each consisting of attributes that describe the record and measures that evaluate it. We address the problem of automatically discovering interesting subsets of such a dataset, that is, subsets whose performance characteristics differ significantly from those of the rest of the dataset. We present an algorithm to discover such subsets. The algorithm uses a generic, domain-independent definition of interestingness and applies several heuristics to prune the search space so that the solution scales to large datasets. We apply the algorithm to four real-world case studies and demonstrate its effectiveness in extracting insights that identify problem areas and suggest improvements for a wide variety of systems.
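To make the problem statement concrete, the following is a minimal sketch of one possible reading of it, not the authors' algorithm. It assumes that a subset is described by a conjunction of attribute = value conditions and that "significantly different" is judged by Welch's t-statistic between the measure values inside the subset and those in the rest of the dataset; the abstract specifies neither choice. The function names, the parameters `max_len`, `min_size`, and `threshold`, and the sample ticket data are all hypothetical. The search here is brute force, bounded only by descriptor length and minimum subset size, and does not reproduce the pruning heuristics mentioned above.

```python
from itertools import combinations
from math import copysign, inf, sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples, each with at least two values."""
    diff = mean(a) - mean(b)
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        # Both samples are constant: no difference, or an unbounded one.
        return 0.0 if diff == 0 else copysign(inf, diff)
    return diff / se


def interesting_subsets(records, attributes, measure,
                        max_len=2, min_size=2, threshold=2.0):
    """Return (descriptor, subset_size, t) for subsets whose measure differs
    from the rest of the data, strongest difference first.

    Assumption: |t| >= threshold stands in for "interesting".
    """
    # Every (attribute, value) condition that occurs in the data.
    pairs = sorted({(a, r[a]) for r in records for a in attributes})
    results = []
    for k in range(1, max_len + 1):
        for desc in combinations(pairs, k):
            # Skip descriptors that constrain the same attribute twice.
            if len({a for a, _ in desc}) < k:
                continue
            inside, outside = [], []
            for r in records:
                match = all(r[a] == v for a, v in desc)
                (inside if match else outside).append(r[measure])
            # Both sides need enough records for a variance estimate.
            if len(inside) < min_size or len(outside) < min_size:
                continue
            t = welch_t(inside, outside)
            if abs(t) >= threshold:
                results.append((desc, len(inside), t))
    return sorted(results, key=lambda item: -abs(item[2]))


# Hypothetical service tickets: two attributes and one measure.
records = [
    {"priority": "high", "region": "east", "hours": 30},
    {"priority": "high", "region": "east", "hours": 32},
    {"priority": "high", "region": "west", "hours": 29},
    {"priority": "low", "region": "east", "hours": 5},
    {"priority": "low", "region": "west", "hours": 6},
    {"priority": "low", "region": "west", "hours": 4},
]

for desc, size, t in interesting_subsets(records, ["priority", "region"], "hours"):
    print(desc, size, round(t, 2))
```

On this toy data the subsets defined by priority alone are flagged, while region alone is not, because resolution time varies with priority far more than with region. A descriptor space that grows combinatorially with the number of attributes is what makes pruning necessary on real datasets.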



Author information

Authors and Affiliations

Tata Research Development and Design Centre, Tata Consultancy Services Limited, Pune, MH, 411013, India
Maitreya Natu & Girish Keshav Palshikar


Corresponding author

Correspondence to Maitreya Natu.

Editor information

Editors and Affiliations

Faculty of Commerce, Kansai University, Osaka, Japan
Katsutoshi Yada


About this chapter

Cite this chapter

Natu, M., Palshikar, G.K. (2014). Interesting Subset Discovery and Its Application on Service Processes. In: Yada, K. (eds) Data Mining for Service. Studies in Big Data, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45252-9_14


DOI: https://doi.org/10.1007/978-3-642-45252-9_14
Published: 04 January 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45251-2
Online ISBN: 978-3-642-45252-9
eBook Packages: Engineering, Engineering (R0)
