Mining the Relationships in the form of the Predisposing Factors and Co-Incident Factors among Numerical Dynamic Attributes in Time Series Data Set by Using the Combination of Some Existing Techniques

Kooptiwoot, Suwimon; Salam, M. Abdus

doi:10.1007/1-4020-3675-2_16

Abstract

Temporal mining is a natural extension of data mining, with the added capabilities of discovering interesting patterns and inferring relationships of contextual and temporal proximity, which may also lead to possible cause-effect associations. Temporal mining covers a wide range of paradigms for knowledge modeling and discovery. A common practice is to discover frequent sequences and patterns of a single variable. In this paper we present a new algorithm that combines several existing ideas: the reference event proposed in (Bettini, Wang et al. 1998), the event detection technique proposed in (Guralnik and Srivastava 1999), the large fraction proposed in (Mannila, Toivonen et al. 1997), and the causal inference proposed in (Blum 1982). We use all of these ideas to build our new algorithm for the discovery of multivariable sequences in the form of the predisposing factors and co-incident factors of the reference event of interest. We define an event as a positive or negative direction of data change above a threshold value. From these patterns we infer predisposing and co-incident factors with respect to a reference variable. For this purpose we study the Open Source Software data collected from the SourceForge website. Out of 240+ attributes we consider only thirteen time-dependent attributes: Page-views, Download, Bugs0, Bugs1, Support0, Support1, Patches0, Patches1, Tracker0, Tracker1, Tasks0, Tasks1 and CVS. These attributes indicate the degree and patterns of activity of projects through the course of their progress. The number of downloads is a good indication of the progress of a project, so we use Download as the reference attribute. We also test our algorithm on four synthetic data sets containing up to 50% noise. The results show that our algorithm works well and tolerates noisy data.
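
The abstract gives only the outline of the method, so the following is a minimal sketch under stated assumptions, not the authors' algorithm. It labels each time step of every attribute as a positive event, a negative event, or no event according to a change threshold, then tallies what the other attributes did one step before a reference event (candidate predisposing factors) and at the same step (candidate co-incident factors). The function names, the one-step lag, the 0.5 cut-off used for the "large fraction" test, and the toy series are all hypothetical and are not taken from the paper or from the SourceForge data.

```python
from collections import Counter


def detect_events(series, threshold):
    """Label each step as +1, -1 or 0 by the change from the previous value."""
    events = [0]  # no change is defined for the first observation
    for prev, curr in zip(series, series[1:]):
        delta = curr - prev
        if delta > threshold:
            events.append(1)
        elif delta < -threshold:
            events.append(-1)
        else:
            events.append(0)
    return events


def factor_candidates(data, reference, threshold, ref_direction=1, lag=1):
    """Tally other attributes' events before and at each reference event.

    Assumption: "predisposing" means the step `lag` positions earlier and
    "co-incident" means the same step as the reference event.
    """
    events = {name: detect_events(vals, threshold) for name, vals in data.items()}
    ref_times = [
        t for t, e in enumerate(events[reference]) if e == ref_direction and t >= lag
    ]
    predisposing, coincident = {}, {}
    for name, ev in events.items():
        if name == reference:
            continue
        predisposing[name] = Counter(ev[t - lag] for t in ref_times)
        coincident[name] = Counter(ev[t] for t in ref_times)
    return ref_times, predisposing, coincident


def large_fraction(counter, min_fraction=0.5):
    """Return the dominant non-zero direction if its share exceeds min_fraction."""
    total = sum(counter.values())
    if total == 0:
        return None
    direction, count = max(
        ((d, c) for d, c in counter.items() if d != 0),
        key=lambda pair: pair[1],
        default=(None, 0),
    )
    return direction if count / total > min_fraction else None


# Invented toy series: Page-views rises one step before Download,
# while Bugs0 rises at the same step as Download.
data = {
    "Download": [10, 10, 20, 20, 30, 30, 40],
    "Page-views": [5, 15, 15, 25, 25, 35, 35],
    "Bugs0": [1, 1, 3, 3, 5, 5, 7],
}

ref_times, predisposing, coincident = factor_candidates(
    data, reference="Download", threshold=1
)
for name in predisposing:
    print(
        name,
        "predisposing:", large_fraction(predisposing[name]),
        "co-incident:", large_fraction(coincident[name]),
    )
```

In this sketch a result of 1 or -1 names the direction of change that accompanies most reference events, and None means no direction reached the cut-off. A real run would need the window length and the threshold chosen per attribute, which the abstract does not specify.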


References

Agrawal, R. and Srikant R., 1995. Mining Sequential Patterns. In Proceedings of the IEEE International Conference on Data Engineering, Taipei, Taiwan.
Bettini, C., Wang S., et al., 1998. Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences. In IEEE Transactions on Knowledge and Data Engineering 10(2).
Blum, R. L., 1982. Discovery, Confirmation and Interpretation of Causal Relationships from a Large Time-Oriented Clinical Databases: The Rx Project. Computers and Biomedical Research 15(2): 164–187.
Dasgupta, D. and Forrest S., 1995. Novelty Detection in Time Series Data using Ideas from Immunology. In Proceedings of the 5th International Conference on Intelligent Systems, Reno, Nevada.
Guralnik, V. and Srivastava J., 1999. Event Detection from Time Series Data. In KDD-99, San Diego, CA, USA.
Hirano, S., Sun X., et al., 2001. Analysis of Time-series Medical Databases Using Multiscale Structure Matching and Rough Sets-Based Clustering Technique. In IEEE International Fuzzy Systems Conference.
Hirano, S. and Tsumoto S., 2001. A Knowledge-Oriented Clustering Technique Based on Rough Sets. In 25th Annual International Computer Software and Applications Conference (COMPSAC’01), Chicago, Illinois.
Hirano, S. and Tsumoto S., 2002. Mining Similar Temporal Patterns in Long Time-Series Data and Its Application to Medicine. In IEEE: 219–226.
Kantardzic, M., 2003. Data Mining Concepts, Models, Methods, and Algorithms. USA, IEEE Press.
Keogh, E., Chu S., et al., 2001. An Online Algorithm for Segmenting Time Series. In Proceedings of IEEE International Conference on Data Mining, 2001.
Keogh, E., Lonardi S., et al., 2002. Finding Surprising Patterns in a Time Series Database in Linear Time and Space. In Proceedings of The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02), Edmonton, Alberta, Canada.
Last, M., Klein Y., et al., 2001. Knowledge Discovery in Time Series Databases. In IEEE Transactions on Systems, Man, and Cybernetics 31(1): 160–169.
Lu, H., Han J., et al., 1998. Stock Movement Prediction and N-Dimensional Inter-Transaction Association Rules. In Proc. of 1998 SIGMOD’98 Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD’98), Seattle, Washington.
Mannila, H., Toivonen H., et al., 1997. Discovery of frequent episodes in event sequences. In Data Mining and Knowledge Discovery 1(3): 258–289.
Roddick, J. F. and Spiliopoulou M., 2002. A Survey of Temporal Knowledge Discovery Paradigms and Methods. In IEEE Transactions on Knowledge and Data Engineering 14(4): 750–767.
Salam, M. A., 2001. Quasi Fuzzy Paths in Semantic Networks. In Proceedings 10th IEEE International Conference on Fuzzy Systems, Melbourne, Australia.
Tung, A., Lu H., et al., 1999. Breaking the Barrier of Transactions: Mining Inter-Transaction Association Rules. In Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining [KDD 99], San Diego, CA.
Ueda, N. and Suzuki S., 1990. A Matching Algorithm of Deformed Planar Curves Using Multiscale Convex/Concave Structures. In IEICE Transactions on Information and Systems J73-D-II(7): 992–1000.
Weiss, S. M. and Indurkhya N., 1998. Predictive Data Mining. San Francisco, California, Morgan Kaufmann Publishers, Inc.

Author information

Authors and Affiliations

School of Information Technologies, The University of Sydney, Sydney, Australia
Suwimon Kooptiwoot & M. Abdus Salam

Editor information

Editors and Affiliations

Universidade Portucalense, Porto, Portugal
Isabel Seruca
INSTICC, Setúbal, Portugal
José Cordeiro & Joaquim Filipe
Ecole Supérieure d’Electronique de L’Ouest, Angers, France
Slimane Hammoudi

About this paper

Cite this paper

Kooptiwoot, S., Salam, M.A. (2006). Mining the Relationships in the form of the Predisposing Factors and Co-Incident Factors among Numerical Dynamic Attributes in Time Series Data Set by Using the Combination of Some Existing Techniques. In: Seruca, I., Cordeiro, J., Hammoudi, S., Filipe, J. (eds) Enterprise Information Systems VI. Springer, Dordrecht. https://doi.org/10.1007/1-4020-3675-2_16

DOI: https://doi.org/10.1007/1-4020-3675-2_16
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-3674-3
Online ISBN: 978-1-4020-3675-0
eBook Packages: Computer Science, Computer Science (R0)
