
Mining the Relationships in the form of the Predisposing Factors and Co-Incident Factors among Numerical Dynamic Attributes in Time Series Data Set by Using the Combination of Some Existing Techniques

  • Conference paper
Enterprise Information Systems VI

Abstract

Temporal mining is a natural extension of data mining that adds the capability to discover interesting patterns and to infer relationships of contextual and temporal proximity, which may in turn point to possible cause-effect associations. Temporal mining covers a wide range of paradigms for knowledge modeling and discovery. A common practice is to discover frequent sequences and patterns of a single variable. In this paper we present a new algorithm that combines several existing ideas: the reference event proposed in (Bettini, Wang et al. 1998), the event detection technique proposed in (Guralnik and Srivastava 1999), the large fraction proposed in (Mannila, Toivonen et al. 1997), and the causal inference proposed in (Blum 1982). We use these ideas to build an algorithm for discovering multivariable sequences in the form of the predisposing factors and co-incident factors of a reference event of interest. We define an event as a change in the data, in either the positive or the negative direction, that exceeds a threshold value. From these patterns we infer predisposing and co-incident factors with respect to a reference variable. For this purpose we study Open Source Software data collected from the SourceForge website. Out of 240+ attributes we consider only thirteen time-dependent attributes: Page-views, Download, Bugs0, Bugs1, Support0, Support1, Patches0, Patches1, Tracker0, Tracker1, Tasks0, Tasks1 and CVS. These attributes indicate the degree and patterns of project activity over the course of a project's progress. The number of downloads is a good indication of a project's progress, so we use Download as the reference attribute. We also test our algorithm on four synthetic data sets with noise levels up to 50%. The results show that the algorithm works well and tolerates noisy data.
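The abstract does not give the algorithm's details, so the following is only a minimal sketch of the idea it describes, not the authors' implementation. It makes several assumptions that are not in the source: an event is an absolute change between consecutive samples larger than `threshold`; a predisposing factor is an event in another attribute one time step before a reference event; a co-incident factor is an event in another attribute at the same time step; and a factor is reported only if it accompanies at least `min_fraction` of the reference events (a stand-in for the "large fraction" idea). The function names and the toy data are hypothetical.

```python
from collections import Counter


def detect_events(series, threshold):
    """Label each time step as 'up', 'down', or None.

    Assumption: an event is a change between consecutive samples whose
    absolute size exceeds `threshold`.
    """
    events = [None]  # no change is defined for the first sample
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        if delta > threshold:
            events.append("up")
        elif delta < -threshold:
            events.append("down")
        else:
            events.append(None)
    return events


def mine_factors(data, reference, threshold, min_fraction=0.5):
    """Collect candidate predisposing and co-incident factors.

    Assumptions: the look-back window is exactly one time step, and a
    factor is kept when it accompanies at least `min_fraction` of the
    reference events in a given direction.
    """
    events = {name: detect_events(vals, threshold) for name, vals in data.items()}
    ref_events = events[reference]
    result = {"predisposing": {}, "co_incident": {}}
    for direction in ("up", "down"):
        times = [t for t, e in enumerate(ref_events) if e == direction]
        before, same = Counter(), Counter()
        for t in times:
            for name, ev in events.items():
                if name == reference:
                    continue
                if ev[t] is not None:  # event at the same time step
                    same[(name, ev[t])] += 1
                if t >= 1 and ev[t - 1] is not None:  # event one step earlier
                    before[(name, ev[t - 1])] += 1
        n = len(times)
        result["predisposing"][direction] = sorted(
            k for k, c in before.items() if n and c / n >= min_fraction
        )
        result["co_incident"][direction] = sorted(
            k for k, c in same.items() if n and c / n >= min_fraction
        )
    return result


# Toy data (hypothetical values, not taken from the SourceForge data set).
toy = {
    "Download": [10, 10, 20, 20, 20, 30, 30],
    "Page-views": [5, 15, 15, 15, 25, 25, 25],
    "Bugs0": [1, 1, 4, 4, 4, 7, 7],
}
factors = mine_factors(toy, reference="Download", threshold=2)
print(factors)
```

In this toy series, each rise in Page-views occurs one step before a rise in Download, and each rise in Bugs0 occurs at the same step as a rise in Download, so the sketch reports the former as a candidate predisposing factor and the latter as a candidate co-incident factor. Temporal precedence alone does not establish causation, which is why the paper draws on a separate causal inference step.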


References

  • Agrawal, R. and Srikant, R., 1995. Mining Sequential Patterns. In Proceedings of the IEEE International Conference on Data Engineering, Taipei, Taiwan.

  • Bettini, C., Wang, S., et al., 1998. Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences. In IEEE Transactions on Knowledge and Data Engineering 10(2).

  • Blum, R. L., 1982. Discovery, Confirmation and Interpretation of Causal Relationships from a Large Time-Oriented Clinical Databases: The Rx Project. Computers and Biomedical Research 15(2): 164–187.

  • Dasgupta, D. and Forrest, S., 1995. Novelty Detection in Time Series Data using Ideas from Immunology. In Proceedings of the 5th International Conference on Intelligent Systems, Reno, Nevada.

  • Guralnik, V. and Srivastava, J., 1999. Event Detection from Time Series Data. In KDD-99, San Diego, CA, USA.

  • Hirano, S., Sun, X., et al., 2001. Analysis of Time-series Medical Databases Using Multiscale Structure Matching and Rough Sets-Based Clustering Technique. In IEEE International Fuzzy Systems Conference.

  • Hirano, S. and Tsumoto, S., 2001. A Knowledge-Oriented Clustering Technique Based on Rough Sets. In 25th Annual International Computer Software and Applications Conference (COMPSAC’01), Chicago, Illinois.

  • Hirano, S. and Tsumoto, S., 2002. Mining Similar Temporal Patterns in Long Time-Series Data and Its Application to Medicine. In IEEE: 219–226.

  • Kantardzic, M., 2003. Data Mining Concepts, Models, Methods, and Algorithms. USA, IEEE Press.

  • Keogh, E., Chu, S., et al., 2001. An Online Algorithm for Segmenting Time Series. In Proceedings of IEEE International Conference on Data Mining, 2001.

  • Keogh, E., Lonardi, S., et al., 2002. Finding Surprising Patterns in a Time Series Database in Linear Time and Space. In Proceedings of The Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD’02), Edmonton, Alberta, Canada.

  • Last, M., Klein, Y., et al., 2001. Knowledge Discovery in Time Series Databases. In IEEE Transactions on Systems, Man, and Cybernetics 31(1): 160–169.

  • Lu, H., Han, J., et al., 1998. Stock Movement Prediction and N-Dimensional Inter-Transaction Association Rules. In Proc. of 1998 SIGMOD’98 Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD’98), Seattle, Washington.

  • Mannila, H., Toivonen, H., et al., 1997. Discovery of Frequent Episodes in Event Sequences. In Data Mining and Knowledge Discovery 1(3): 258–289.

  • Roddick, J. F. and Spiliopoulou, M., 2002. A Survey of Temporal Knowledge Discovery Paradigms and Methods. In IEEE Transactions on Knowledge and Data Engineering 14(4): 750–767.

  • Salam, M. A., 2001. Quasi Fuzzy Paths in Semantic Networks. In Proceedings 10th IEEE International Conference on Fuzzy Systems, Melbourne, Australia.

  • Tung, A., Lu, H., et al., 1999. Breaking the Barrier of Transactions: Mining Inter-Transaction Association Rules. In Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining (KDD 99), San Diego, CA.

  • Ueda, N. and Suzuki, S., 1990. A Matching Algorithm of Deformed Planar Curves Using Multiscale Convex/Concave Structures. In IEICE Transactions on Information and Systems J73-D-II(7): 992–1000.

  • Weiss, S. M. and Indurkhya, N., 1998. Predictive Data Mining. San Francisco, California, Morgan Kaufmann Publishers, Inc.

Copyright information

© 2006 Springer

About this paper

Cite this paper

Kooptiwoot, S., Salam, M.A. (2006). Mining the Relationships in the form of the Predisposing Factors and Co-Incident Factors among Numerical Dynamic Attributes in Time Series Data Set by Using the Combination of Some Existing Techniques. In: Seruca, I., Cordeiro, J., Hammoudi, S., Filipe, J. (eds) Enterprise Information Systems VI. Springer, Dordrecht. https://doi.org/10.1007/1-4020-3675-2_16

  • DOI: https://doi.org/10.1007/1-4020-3675-2_16

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-3674-3

  • Online ISBN: 978-1-4020-3675-0

  • eBook Packages: Computer Science, Computer Science (R0)
