Identifying Suspicious Activities in Company Networks Through Data Mining and Visualization

Landes, Dieter; Otto, Florian; Schumann, Sven; Schlottke, Frank

doi:10.1007/978-1-4471-4866-1_6

Part of the book series: Advanced Information and Knowledge Processing (AI&KP)

Abstract

Company data are a precious asset: they must remain authentic and must not be disclosed to unauthorized parties. In this contribution, we report on ongoing work that aims to support human IT security experts by pinpointing the significant alerts that actually need closer inspection. We developed an experimental tool environment to support the analysis of IT infrastructure data with data mining methods. In particular, various clustering algorithms are used to differentiate normal behavior from activities that call for intervention by IT security experts. Before being subjected to clustering, data can be pre-processed in various ways. For example, categorical values can be mapped to numerical values in a way that preserves the semantics of the data as far as possible. The resulting clusters can be inspected visually using techniques such as parallel coordinates or pixel-based techniques, e.g. circle segments or recursive patterns.

Preliminary results indicate that clustering is well suited to structuring monitoring data, and that fairly large data volumes can be clustered effectively and efficiently. Currently, the main focus is on more elaborate visualization and classification techniques.
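
The pre-processing and clustering steps named in the abstract can be illustrated with a minimal sketch. It is not the tool environment described in the chapter: the frequency-based encoding is only one possible way of mapping categorical values to numbers, the k-means routine is a plain Lloyd-style loop, and the records, the function names `frequency_encode` and `kmeans`, and the rule "the smallest cluster deserves a closer look" are all assumptions made for illustration.

```python
import math
from collections import Counter


def frequency_encode(values):
    """Map each categorical value to its relative frequency (an assumed encoding)."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def kmeans(points, centroids, iterations=20):
    """Lloyd-style k-means on tuples of floats; returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical monitoring records: (protocol, traffic volume scaled to [0, 1]).
records = [
    ("http", 0.10), ("http", 0.12), ("http", 0.11), ("dns", 0.02),
    ("http", 0.09), ("dns", 0.03), ("irc", 0.95),
]

encoded = frequency_encode([protocol for protocol, _ in records])
points = [(freq, volume) for freq, (_, volume) in zip(encoded, records)]

# Deterministic start: first and last record serve as initial centroids.
labels, centroids = kmeans(points, [points[0], points[-1]])

# Assumed heuristic: the smallest cluster is handed to a human expert.
sizes = Counter(labels)
rare_label = min(sizes, key=sizes.get)
suspicious = [r for r, label in zip(records, labels) if label == rare_label]
print(suspicious)
```

With these made-up records, the rare protocol with the unusually high volume ends up alone in its own cluster. On real monitoring data, the choice of encoding, the number of clusters, and the rule for selecting clusters to inspect would all need to be validated by security experts.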

Acknowledgements

The SecMine project is supported under grant no. 17049X10 by Bundesministerium für Bildung und Forschung (BMBF). We thank Christian Bergmann, Toni Böhnlein, Sebastian Detsch, Thomas Geus, Steffen Hammer, Johannes Henninger, Matthias Herrmann, Sebastian Jakob, Daniel Klett, Adrian Köhlein, Evelyn Krüger, Benjamin Krull, Andreas Kühntopf, Hannes Müller, Marc Pieruschek, Markus Pütz, Markus Ring, Martin Rosenbaum, Manuel Schnapp, Tobias Schmidtlein, Christopher Schramm, Elena Tereshko, Melanie Westendorf, Thomas Worch, and Bernhard Sick for their contributions.

Author information

Authors and Affiliations

Coburg University of Applied Sciences and Arts, Coburg, Germany
Dieter Landes & Florian Otto
HUK COBURG, Coburg, Germany
Sven Schumann
Applied Security, Stockstadt, Germany
Frank Schlottke

Corresponding author

Correspondence to Dieter Landes.

Editor information

Editors and Affiliations

Department of Computer Science, Georg Simon Ohm Univ. of Applied Science, Philipp-Kittler-Str. 18, Nuremberg, 90480, Germany
Peter Rausch
Department of Computer Science, Taif University, Makkah, Al-huwayah, Taif, PO888, Saudi Arabia
Alaa F. Sheta
Faculty of Technology, De Montfort University, The Gateway, Leicester, LE1 9BH, Leicestershire, United Kingdom
Aladdin Ayesh

Cite this chapter

Landes, D., Otto, F., Schumann, S., Schlottke, F. (2013). Identifying Suspicious Activities in Company Networks Through Data Mining and Visualization. In: Rausch, P., Sheta, A., Ayesh, A. (eds) Business Intelligence and Performance Management. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-4866-1_6

DOI: https://doi.org/10.1007/978-1-4471-4866-1_6
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4865-4
Online ISBN: 978-1-4471-4866-1
eBook Packages: Computer Science (R0)
