Evidential techniques in parallel Database Mining
The realisation that stored masses of data contain more information than is immediately obvious has generated great interest in the field of Database Mining over the last couple of years. Hardware for storing these masses of data has advanced rapidly with demand, as have software methodologies for storing, manipulating and reporting on the data. However, little progress has been made in methods for automatically analysing the data and extracting the knowledge stored implicitly within it. This process of “reading between the lines” is called Database Mining (DM).
Clearly, DM is a difficult process, because the methods required to discover knowledge are complex and data intensive. In this paper we explain how high performance computing can play a vital role in DM and discuss the implementation of a specific algorithm, STRIP (Strong Rule Induction in Parallel) [ANAN94b, ANAN95], developed by the authors for the discovery of strong, or “almost exact”, rules from databases. STRIP is the first algorithm to be implemented within EDM [ANAN94a], a parallel framework for Database Mining based on Evidence Theory that was also developed by the authors.
In an earlier paper we discussed the different levels of parallelism within STRIP and demonstrated them using a transputer network [ANAN95]. In this paper we discuss the implementation of STRIP on a cluster of Silicon Graphics Workstations connected using an ATM network.
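To make the notion of a strong, or “almost exact”, rule concrete, the following is a minimal illustrative sketch and not the STRIP algorithm itself. It assumes that a rule "antecedent value implies consequent value" counts as strong when its confidence meets a user-chosen threshold, and that the data is split into partitions whose counts can be gathered independently and then merged. The function names, the `min_confidence` parameter and the sample records are all hypothetical, the evidential aspects of EDM are not modelled, and a thread pool stands in for the workstation cluster.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_partition(rows, antecedent, consequent):
    """Count antecedent values and (antecedent, consequent) pairs in one partition."""
    ante_counts = Counter()
    pair_counts = Counter()
    for row in rows:
        a, c = row[antecedent], row[consequent]
        ante_counts[a] += 1
        pair_counts[(a, c)] += 1
    return ante_counts, pair_counts


def strong_rules(partitions, antecedent, consequent, min_confidence=0.9):
    """Return {(a, c): confidence} for rules a -> c that meet the threshold.

    Assumption: "almost exact" is modelled as confidence >= min_confidence.
    """
    # Each partition is counted independently (here by a thread pool).
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(
            lambda rows: count_partition(rows, antecedent, consequent),
            partitions,
        ))

    # Merge the partial counts into global totals.
    ante_total, pair_total = Counter(), Counter()
    for ante_counts, pair_counts in results:
        ante_total.update(ante_counts)
        pair_total.update(pair_counts)

    # Keep only the rules whose confidence reaches the threshold.
    return {
        (a, c): n / ante_total[a]
        for (a, c), n in pair_total.items()
        if n / ante_total[a] >= min_confidence
    }


# Hypothetical records, split into three partitions.
partitions = [
    [{"region": "north", "tariff": "A"}, {"region": "north", "tariff": "A"}],
    [{"region": "north", "tariff": "A"}, {"region": "south", "tariff": "B"}],
    [{"region": "south", "tariff": "A"}, {"region": "north", "tariff": "A"}],
]
rules = strong_rules(partitions, "region", "tariff", min_confidence=0.9)
```

With these sample records only the rule from `north` to tariff `A` survives the threshold, since `south` is split evenly between two tariffs. Because the per-partition counts are simply added together, the counting step needs no communication between workers until the final merge.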
- [ANAN94a] S.S. Anand, D.A. Bell, J.G. Hughes. A General Framework for Database Mining Based on Evidential Theory, Internal Report, Sch. of Inf. and Soft. Eng., Univ. of Ulster (Jordanstown), Nov. 1994.
- [ANAN94b] S.S. Anand, D.A. Bell, J.G. Hughes. Discovery of Strong Rules from Databases in Parallel, Internal Report, Sch. of Inf. and Soft. Eng., Univ. of Ulster (Jordanstown), Oct. 1994.
- [ANAN94c] S.S. Anand, D.A. Bell, J.G. Hughes. An Empirical Performance Study of the Ingres Search Accelerator for a Large Property Management Database System, Proc. of the 20th VLDB Conference, Sept. 1994.
- [ANAN95] S.S. Anand, C.M. Shapcott, D.A. Bell, J.G. Hughes. Data Mining in Parallel, World Occam and Transputer User Group Conference, April 1995.
- [PVM3] A. Geist, A. Beguelin, J. Dongarra, W. Jiang, R. Manchek, V. Sunderam. PVM3 User's Guide and Reference Manual.