Data Organization

Organizing Chaos


A preliminary step in designing and programming an algorithm is gathering data and sorting it. When you first set out to test a thesis or write code to analyze network traffic, only part of the information is readily available; some of the data is still unknown. Initial estimates are based on the first set of data files. As data is gathered, new insights and understandings arise, which may lead to changes in the processing script and the data-gathering application, such as adding a previously unlogged parameter and graphing it over time. Some changes may involve gathering data over substantially longer time periods than originally anticipated. Consequently, to keep the data files manageable, the sampling rate must be reduced, which is implemented by logging only every nth value. Another plausible scenario is parsing log files whose generating application, for example a web server, recently went through a software upgrade that altered the file format and the file naming scheme.
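As a rough illustration of reducing the sampling rate by keeping only every nth value, the following is a minimal Python sketch. The function name `downsample` and the sample data are hypothetical, chosen only for this example; they are not taken from the chapter.

```python
from itertools import islice


def downsample(values, n):
    """Return an iterator over every nth value, starting with the first.

    `downsample` is a hypothetical helper used for illustration only.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # islice works on any iterable, so the data need not fit in memory.
    return islice(values, 0, None, n)


# Illustrative data: ten consecutive readings, keeping every third one.
samples = list(range(10))
reduced = list(downsample(samples, 3))
print(reduced)  # [0, 3, 6, 9]
```

Because the helper accepts any iterable, the same call could be applied to lines read lazily from a large log file, so the full data set never has to be held in memory at once.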


Copyright information

© Shai Vaingast 2009

