MultiLog is a research tool that controls existing research and commercial logging applications and gathers, filters, and combines their output on-the-fly. It allows researchers to easily deploy multiple software logging systems to observe user behavior in short- or long-term user studies. Automatic log uploading facilitates large-scale data collection.
The system gathers log data on-the-fly: when a logger is enabled, MultiLog actively polls the corresponding log file (or listens on the specified TCP/UDP port) at an interval configurable by the researcher (set to 1 min by default) and checks for updates. If changes are detected (or new data are received on the open TCP/UDP socket), the relevant lines are extracted from the log file, formatted to MultiLog’s pre-defined format (as shown in Fig. 1), presented in the main interface, and written to an output database.
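The polling step described above can be sketched as follows. This is a minimal illustration rather than MultiLog's actual implementation: the function names `read_new_lines` and `poll`, the byte-offset bookkeeping, and the handling of truncated files are all assumptions introduced here.

```python
import os
import time


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended since `offset`."""
    size = os.path.getsize(path)
    if size < offset:
        # Assumed behaviour: the file was truncated or rotated, so start over.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only complete lines; a partial trailing line waits for the next poll.
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end


def poll(path, interval_seconds=60, handle=print):
    """Check `path` for new lines every `interval_seconds` (1 min by default)."""
    offset = 0
    while True:
        lines, offset = read_new_lines(path, offset)
        for line in lines:
            handle(line)
        time.sleep(interval_seconds)
```

Tracking a byte offset means each poll reads only the data appended since the previous one, which keeps the cost of a poll proportional to the new output rather than to the size of the whole log file.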
MultiLog is designed for two groups of users — researchers and study participants — with each having a distinct mode of operation: Researcher Mode and Deployment Mode. Researcher Mode provides the user with full control and configuration ability, while Deployment Mode is intended for user study deployments with settings controlled via a configuration file.
Researcher Mode
By default, MultiLog runs in Researcher Mode, where the user sees the full user interface, is able to add and remove loggers, can start and stop loggers, and can view the log output from all currently active loggers, as shown in Fig. 1. This mode allows researchers to experiment with logger configurations, examine the combined output from loggers, and prepare logging environments for deployment during a user study.
A key feature of MultiLog is its “plug and play” architecture, which allows the researcher to “plug in” any existing logger at any time. MultiLog will work with any existing logging application as long as the researcher can provide the executable name, the start and stop commands, the location of the continually updated log file (or the port number if the logger outputs data to a TCP/UDP socket), the position of the timestamp within this output (or the attribute/element that contains the timestamp if the log is in XML format), and an indication of which log lines should appear in the output. This architecture allows even non-technical researchers to quickly configure a series of logging systems. Once a logger is configured, the researcher can manually start and stop it through MultiLog’s user interface, which sends the appropriate signals to the relevant logger.
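The information a researcher supplies for each logger can be pictured as a small record. The sketch below is hypothetical: the class `LoggerConfig`, its field names, and the assumption that the timestamp is a delimited column identified by index are illustrative choices, not MultiLog's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggerConfig:
    name: str
    executable: str
    start_command: str
    stop_command: str
    log_file: Optional[str] = None  # path to the continually updated log file
    port: Optional[int] = None      # or the TCP/UDP port, for socket loggers
    timestamp_field: int = 0        # assumed: index of the timestamp column
    delimiter: str = "\t"           # assumed: field separator in the raw log

    def __post_init__(self):
        # A logger writes either to a file or to a socket, not both.
        if (self.log_file is None) == (self.port is None):
            raise ValueError("provide exactly one of log_file or port")


def split_timestamp(cfg, line):
    """Separate the timestamp from the rest of a raw log line."""
    parts = line.split(cfg.delimiter)
    timestamp = parts.pop(cfg.timestamp_field)
    return timestamp, cfg.delimiter.join(parts)
```

With such a record, reformatting a line from any logger into a common layout reduces to pulling out the timestamp at the configured position and treating the remainder as the data part.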
Researchers can also choose to filter incoming log lines to reduce the amount of information collected. MultiLog supports filtering via line matching to include/exclude text provided by the researcher at configuration time.
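Line-matching filters of this kind can be sketched in a few lines. The function name `keep_line` and the precedence rules (exclusions win, and an empty include list keeps everything) are assumptions made for illustration; the source does not specify how MultiLog resolves conflicts between the two lists.

```python
def keep_line(line, include=(), exclude=()):
    """Decide whether a log line passes the researcher's text filters."""
    # Assumed rule: any excluded text rejects the line outright.
    if any(text in line for text in exclude):
        return False
    # Assumed rule: if include texts are given, at least one must match.
    if include:
        return any(text in line for text in include)
    return True
```

For example, `keep_line("10:00 click OK", include=["click"], exclude=["mousemove"])` keeps the line, while the same filters reject `"10:00 mousemove 4,5"`.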
The log file polling interval is configurable by the researcher. The default of 1 min was selected following a trade-off analysis between obtaining real-time data and avoiding degradation in performance; during configuration, researchers will often wish to reduce this value so that they can immediately see the result of their changes. Data from loggers that output to a TCP/UDP socket are received and processed in real time, so the polling interval does not apply to these loggers.
Researchers can “save” the current logger setup (enabled loggers, filters, and polling interval) and generate a configuration file ready to deploy the logger in Deployment Mode.
Deployment Mode
Deployment Mode helps researchers to quickly “roll out” the application to many computers using MultiLog’s executable and an editable configuration file. In this mode, no interface is displayed and the logger runs “silently” in the user’s system tray. The configuration file provides details of each logger to be run: its name, executable location, start and stop commands, the location of the log file (or the port number if the logger outputs data to a TCP/UDP socket), the timestamp location, and filtering data. If the relevant flag is set inside this file, MultiLog reads its contents on start-up and starts the relevant loggers while remaining minimized to the user’s system tray.
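The role of the configuration file can be illustrated with a short sketch. The source does not state the file format, so JSON is assumed here, and the key names (`deployment_mode`, `polling_interval_seconds`, `loggers`) and function names are hypothetical.

```python
import json


def save_setup(path, loggers, polling_interval_seconds=60, deployment_mode=True):
    """Write the current logger setup to an editable configuration file."""
    setup = {
        "deployment_mode": deployment_mode,  # assumed name for the start-up flag
        "polling_interval_seconds": polling_interval_seconds,
        "loggers": loggers,                  # one dict of settings per logger
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(setup, f, indent=2)


def load_setup(path):
    """Read the configuration and return (setup, loggers_to_start)."""
    with open(path, encoding="utf-8") as f:
        setup = json.load(f)
    # Loggers are started automatically only when the deployment flag is set.
    to_start = setup["loggers"] if setup.get("deployment_mode") else []
    return setup, to_start
```

Keeping the flag in the same file as the logger details means a researcher can switch one installation between the two modes by editing a single value.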
Users can open the interface from the system tray icon, view logged actions, remove individual lines if they do not wish these to be uploaded, or pause logging completely. The log lines are stored locally in a database that is automatically uploaded via a secure FTP connection to a server once daily.
To reduce the privacy concerns associated with logging, MultiLog can hash the data part of a log line or detect and hash URLs. As an example, when URL hashing is enabled via the add logger wizard in Researcher Mode, the URL http://www.bbc.co.uk/news/uk/ could appear as http://www.bbc.co.uk/HGTRFDH. When enabled, MultiLog detects and hashes the path part of the URL, preventing the exact website address from appearing in the output (although identical URLs will hash to the same value). Hashing of the data part of the log line is also set up in the add logger wizard, where the researcher can specify that lines containing certain textual phrases should be hashed.
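URL hashing of this kind can be sketched as follows. The source does not name the hash function or the URL detection rule, so the truncated SHA-256 digest, the regular expression, the folding of query and fragment into the hashed part, and the function names are all assumptions.

```python
import hashlib
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed detection rule: anything starting with http:// or https://.
URL_PATTERN = re.compile(r"https?://\S+")


def hash_url(url, digest_chars=8):
    """Replace everything after the host with a short, repeatable digest."""
    parts = urlsplit(url)
    tail = parts.path
    if parts.query:
        tail += "?" + parts.query
    if parts.fragment:
        tail += "#" + parts.fragment
    if tail in ("", "/"):
        return url  # nothing beyond the host to hide
    digest = hashlib.sha256(tail.encode("utf-8")).hexdigest()[:digest_chars].upper()
    return urlunsplit((parts.scheme, parts.netloc, "/" + digest, "", ""))


def hash_urls_in_line(line):
    """Hash every URL found in a log line, leaving other text untouched."""
    return URL_PATTERN.sub(lambda m: hash_url(m.group(0)), line)
```

Because the digest is deterministic, repeat visits to the same page remain recognizable as such in the output, which matches the behaviour described above. An unsalted hash of this kind can be reversed by anyone who hashes a list of candidate URLs, so a real deployment might add a secret salt.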
Deployment
MultiLog saves log data into a local SQLite database that is uploaded to a server. After each upload, the local database is truncated to prevent large amounts of log data accumulating on the user’s computer. The researcher configures the connection by providing the address, username, and password of the remote web server. Non-technical researchers can extract data using MultiLog’s re-combination software, which combines the output for a given user into a text file.
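The store-upload-truncate cycle can be sketched with Python's built-in `sqlite3` module. The table layout, the function names, and the modelling of the upload as a caller-supplied callback are assumptions for illustration; the actual transfer mechanism is outside the scope of this sketch.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) the local log database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS log_lines ("
        "id INTEGER PRIMARY KEY, logger TEXT, timestamp TEXT, data TEXT)"
    )
    return conn


def add_line(conn, logger, timestamp, data):
    """Append one formatted log line to the local store."""
    with conn:
        conn.execute(
            "INSERT INTO log_lines (logger, timestamp, data) VALUES (?, ?, ?)",
            (logger, timestamp, data),
        )


def upload_and_truncate(conn, upload):
    """Send all stored rows via `upload`, then delete exactly those rows."""
    rows = conn.execute(
        "SELECT id, logger, timestamp, data FROM log_lines ORDER BY id"
    ).fetchall()
    if not rows:
        return 0
    upload(rows)  # if this raises, nothing is deleted and the rows are retried
    with conn:
        conn.execute("DELETE FROM log_lines WHERE id <= ?", (rows[-1][0],))
    return len(rows)
```

Deleting only after the upload callback returns means a failed transfer leaves the data in place for the next attempt, and deleting by highest uploaded id avoids discarding lines logged while the upload was in progress.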
Summary
The main features of MultiLog are: (1) two distinct modes of operation for different audiences; (2) a “plug and play” architecture allowing on-the-fly addition and removal of loggers; (3) on-the-fly gathering, combination, and display of logged data; (4) a fully featured Deployment Mode that starts up and runs silently in the user’s system tray, lets users pause logging and, where necessary, remove log data, and supports hashing to address privacy issues; and (5) secure daily upload of log files to a server.