Abstract
This chapter addresses the organization of categorical sequence data. We first build a typology of sequences, distinguishing, for example, between chronological sequences and sequences without time content. This typology makes it possible to identify the kind of information that the data organization should preserve. Focusing mainly on chronological sequences, we then discuss the advantages and limits of different ways of representing time-stamped event and state sequence data, and we present solutions for automatically converting between formats, e.g., between horizontal and vertical presentations, but also from state sequences into event sequences and vice versa. Special attention is paid to the handling of missing values in these conversions.
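As a rough illustration of the kinds of conversion mentioned above, the following sketch turns a horizontal state sequence (one state per time position) into a vertical list of spells and into a time-stamped event sequence. It is not the chapter's own implementation: the function names, the use of `None` as the missing-value marker, and the rule that a missing position interrupts a spell and yields no event are all assumptions made for this example.

```python
from itertools import groupby

MISSING = None  # assumed marker for a missing state (hypothetical choice)


def states_to_spells(states):
    """Collapse a horizontal state sequence into (state, start, end) spells.

    Positions are 0-based; a run of missing values becomes its own spell.
    """
    spells = []
    position = 0
    for state, run in groupby(states):
        length = len(list(run))
        spells.append((state, position, position + length - 1))
        position += length
    return spells


def spells_to_states(spells):
    """Expand (state, start, end) spells back into one state per position."""
    states = []
    for state, start, end in spells:
        states.extend([state] * (end - start + 1))
    return states


def states_to_events(states):
    """Derive (time, state_entered) events from a state sequence.

    Assumption: a missing position produces no event and resets the
    previous state, so the first observed state after a gap is recorded
    as a new event even if it equals the state before the gap.
    """
    events = []
    previous = MISSING
    for time, state in enumerate(states):
        if state is MISSING:
            previous = MISSING
            continue
        if state != previous:
            events.append((time, state))
        previous = state
    return events


example = ["S", "S", "E", "E", "E", MISSING, "E"]
spells = states_to_spells(example)
events = states_to_events(example)
```

Other treatments of gaps are equally defensible, for instance carrying the last observed state across a missing position; which one is appropriate depends on what the missing value is taken to mean.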
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Ritschard, G., Gabadinho, A., Studer, M., Müller, N.S. (2009). Converting between Various Sequence Representations. In: Ras, Z.W., Dardzinska, A. (eds) Advances in Data Management. Studies in Computational Intelligence, vol 223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02190-9_8
DOI: https://doi.org/10.1007/978-3-642-02190-9_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02189-3
Online ISBN: 978-3-642-02190-9
eBook Packages: Engineering, Engineering (R0)