In this section we give an introduction to the technical details of the toolbox. We explain the main data structure used throughout the toolbox, give an overview of the toolbox functions, discuss performance concerns, explain how we utilize unit testing as a means of quality assurance, and describe how the extensive documentation is created.
In order to work efficiently with the toolbox, it is necessary to understand its main data structure, dubbed Data. It is used in almost all functions of the toolbox and is, fortunately, not difficult to grasp. Before we begin, we have to explain the terminology used in NumPy, and thus throughout our toolbox, for describing n-dimensional arrays. A NumPy array is a table of elements of the same type, indexed by a tuple of non-negative integers. The dimensions of an array are sometimes called axes. For example, an array with n rows and m columns has two dimensions (axes), the first dimension having length n and the second length m. The shape of an array is a tuple indicating the length (or size) of each dimension; the length of the shape tuple is therefore the number of dimensions of the array. Assume we have an EEG recording with 1000 data points and 32 channels. We could store this data in a [time, channel] array. This array would have two dimensions and the shape (1000, 32): the time axis would have length 1000 and the channel axis length 32.
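In NumPy terms, the example above can be sketched as follows (the recording is hypothetical, filled with zeros for illustration):

```python
import numpy as np

# Hypothetical EEG recording: 1000 data points x 32 channels,
# stored as a [time, channel] array.
eeg = np.zeros((1000, 32))

print(eeg.ndim)   # 2 -- the number of dimensions (axes)
print(eeg.shape)  # (1000, 32) -- the length of each dimension
```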
For the design of the data structure it is essential to take into account that the functions must deal with many kinds of data, such as continuous multi-channel EEG recordings, epoched data of various kinds, spectrograms, spectra, feature vectors, and many more. What all those types of data have in common is that they are representable as n-dimensional data. What separates them, from a data structure point of view, is merely the number of dimensions and the different names (and meanings) of their axes. We decided to create a simple data structure which has, at its core, an n-dimensional array to store the data, plus a small set of meta information to describe the data sufficiently. Those extra attributes are: names, axes, and units. The names attribute stores the quantities or names for each dimension in the data. For example, a multi-channel spectrogram has the dimensions (time, frequency, channel); consequently, the names attribute would be an array of three strings: ['time', 'frequency', 'channel']. The order of the elements in the names attribute corresponds to the order of the dimensions in the Data object: the first element belongs to the first dimension of the data, and so on. The axes attribute describes the rows and columns of the data, like headers describe the rows and columns of a table. It is an array of arrays: the length of the axes array is equal to the number of dimensions of the data, and the lengths of the arrays inside correspond to the shape of the data. For the spectrogram, the first array would contain the times, the second the frequencies, and the third the channel names of the data. The last attribute, units, contains the (preferably) physical units of the data in axes. For the spectrogram that array would be ['ms', 'Hz', '#'] (since the channel names have no physical unit, we use the hash sign (#) to indicate that the corresponding axis contains labels).
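The spectrogram example can be made concrete with plain NumPy (the sizes below are made up; we mimic the attributes directly rather than importing the Data class, so the snippet stays self-contained):

```python
import numpy as np

# Hypothetical multi-channel spectrogram: 100 time points,
# 50 frequency bins, 3 channels.
data = np.random.randn(100, 50, 3)
axes = [np.arange(0, 1000, 10),          # times in ms (100 values)
        np.linspace(0, 49, 50),          # frequencies in Hz (50 values)
        np.array(['C3', 'Cz', 'C4'])]    # channel labels (3 values)
names = ['time', 'frequency', 'channel']
units = ['ms', 'Hz', '#']

# Consistency: one axis, name, and unit per data dimension, and
# each axis exactly as long as the corresponding dimension.
assert len(axes) == len(names) == len(units) == data.ndim
assert all(len(a) == s for a, s in zip(axes, data.shape))
```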
These three attributes are mandatory. It is tempting to add more meta information to describe the data even better, but each piece of extra metadata adds complexity to the toolbox functions, which must keep it consistent. There is thus a trade-off between completeness of information and complexity of the code. Since more (and more complex) code is harder to understand, harder to maintain, and tends to have more bugs (Lipow 1982), we decided on a small set of obligatory metadata that describes the data sufficiently and makes the toolbox pleasant to use, without the claim of providing a data structure that is completely self-explanatory on its own.
Keeping the data structure simple and easy to understand was an important design decision. The rationale behind this decision was that it must be clear what is stored in the data structure, and where, to encourage scientists not only to look at the data in different ways, but also to manipulate it at will without the data structure getting in the way. It was also clear that specific experiments have specific requirements for the information being stored. Since we cannot anticipate all future use cases of the toolbox, it was important for us to allow the data structure to be easily extended, so users can add more information to it if needed. Consequently, we designed all toolbox functions to ignore unknown attributes and, more importantly, to never remove any additional information from Data objects.
To summarize, Wyrm’s main data structure (visualized in Fig. 1), the Data class, has the following attributes: .data, which contains arbitrary, n-dimensional data; .axes, which contains the headers for the columns of the data; .names, which contains the names of the axes of .data; and .units, which contains the units for the values in .axes. The Data class has some more functionality, for example built-in consistency checking to test whether the lengths of the attributes are compatible. This data structure is intentionally generic enough to contain many kinds of data, even data the authors of this paper did not anticipate during the design. Whenever additional information is needed, it can easily be added to the Data class by means of subclassing or by simply adding it to existing Data objects, thanks to the dynamic nature of Python.
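Attaching extra information to an existing object might look as follows. This is a minimal sketch: a plain stand-in class is used instead of importing wyrm.types.Data, and the markers attribute is a hypothetical example of user-added information.

```python
# A minimal stand-in for wyrm.types.Data, holding the four
# mandatory attributes described in the text.
class Data:
    def __init__(self, data, axes, names, units):
        self.data = data
        self.axes = axes
        self.names = names
        self.units = units

dat = Data(data=[[1.0, 2.0], [3.0, 4.0]],
           axes=[[0, 10], ['Cz', 'Pz']],
           names=['time', 'channel'],
           units=['ms', '#'])

# Thanks to Python's dynamic nature, extra information can be
# attached directly; toolbox functions ignore and preserve it.
dat.markers = [(100, 'stimulus onset')]   # hypothetical extra attribute
```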
Wyrm also implements two other data structures: a ring buffer and a block buffer. Those data structures are useful in online experiments and are demonstrated in Section “Performing Online- and Simulated Online Experiments”.
Our toolbox implements dozens of functions, covering a broad range of aspects for offline analysis and online applications. The list of algorithms includes: channel selection, IIR filters, sub-sampling, spectrograms, spectra, and baseline removal for signal processing; Common Spatial Patterns (CSP) (Ramoser et al. 2000), Source Power Co-modulation (SPoC) (Dähne et al. 2014), classwise average, jumping means, and signed r²-values for feature extraction; Linear Discriminant Analysis (LDA) with and without shrinkage for machine learning (Blankertz et al. 2011); various plotting functions; and many more. Wyrm’s io module also provides a few input/output functions for foreign formats. Currently supported file formats are EEG files from Brain Products and from the Mushu signal acquisition, reading data from amplifiers supported by Mushu, and two functions specifically written to load the BCI competition data sets used in Sections “Classification of Motor Imagery in ECoG Recordings”, “ERP Component Classification in EEG Recordings”, and “Performing Online- and Simulated Online Experiments”. For a complete overview, please refer to Wyrm’s documentation (http://bbci.github.io/wyrm/).
It is worth mentioning that with scikit-learn (Pedregosa et al. 2011) a wide range of machine learning algorithms is readily at your disposal, including: cross validation, Support Vector Machines (SVM), k-Nearest Neighbours (KNN), Independent- and Principal Component Analysis (ICA, PCA), Gaussian Mixture Models (GMM), Kernel Regression, and many more. Our data format (Section “Data Structures”) is compatible with scikit-learn, and in most cases one can apply these algorithms without any data conversion step at all.
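A brief sketch of this interplay: epoched feature vectors live in a plain NumPy array (the .data attribute), which scikit-learn estimators accept directly. The toy data below is made up purely for illustration.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy feature vectors: 20 epochs x 6 features, two well-separated
# classes. In Wyrm this array would be dat.data.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(10, 6) + 2,    # class 0
               rng.randn(10, 6) - 2])   # class 1
y = np.array([0] * 10 + [1] * 10)

# The NumPy array is passed to scikit-learn without conversion.
clf = LinearDiscriminantAnalysis()
clf.fit(X, y)
print(clf.score(X, y))  # training accuracy on the toy data
```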
Almost all functions operate on the Data objects introduced in Section “Data Structures” and are responsible for keeping the data and the metadata consistent. While a few functions like square, variance, or logarithm are just convenient wrappers around the respective NumPy equivalents that accept Data objects instead of NumPy arrays, the vast majority of functions implement considerably more functionality. For example, the function select_channels requires a Data object and a list of strings as parameters. The strings can be channel names or regular expressions that are matched against the channel names in the Data object’s metadata. select_channels will not only return a copy of the Data object with all channels removed that were not part of the list, it will also make sure that the metadata containing the channel names of the returned Data object is correctly updated. This approach is less error-prone and much easier to read than doing the equivalent operations on the data and metadata separately.
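The behaviour just described can be sketched in a few lines (this is an illustrative re-implementation, not Wyrm's actual source; see Wyrm's documentation for the real signature): channel names are matched against regular expressions, and the data and its channel metadata are updated together.

```python
import re
import numpy as np

def select_channels(data, channels, regexps):
    """Keep only channels whose names match one of the regexps
    (case-insensitively), updating data and metadata in one step."""
    keep = [i for i, c in enumerate(channels)
            if any(re.fullmatch(r, c, re.IGNORECASE) for r in regexps)]
    # Return copies so the input arguments stay unmodified.
    return data[:, keep].copy(), [channels[i] for i in keep]

cnt = np.arange(12).reshape(4, 3)        # [time, channel]
sub, sub_chans = select_channels(cnt, ['C3', 'Cz', 'C4'], ['c[34]'])
print(sub_chans)   # ['C3', 'C4']
print(sub.shape)   # (4, 2)
```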
To ease the understanding of the processing functions, special attention was paid to keep syntax and semantics of the functions consistent. We also made sure that the user can rely on a set of features shared by all functions of the toolbox. For example: functions never modify their input arguments. They create a deep copy of them and return a possibly modified version of that copy if necessary. This encourages a functional style of programming which, in our opinion, is well suited when delving into the data:
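The copy-on-write convention can be illustrated as follows (square mirrors the wrapper mentioned above, but this implementation and the stand-in Data class are purely illustrative):

```python
import copy
import numpy as np

class Data:  # minimal stand-in for wyrm.types.Data
    def __init__(self, data):
        self.data = data

def square(dat):
    # Deep-copy the input and return the modified copy; the
    # caller's object is never changed.
    dat = copy.deepcopy(dat)
    dat.data = dat.data ** 2
    return dat

a = Data(np.array([1.0, -2.0, 3.0]))
b = square(a)
print(a.data)  # [ 1. -2.  3.] -- the input is untouched
print(b.data)  # [1. 4. 9.]
```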
A function never touches attributes of a Data object which are unrelated to the functionality of that function. In particular, a function never removes custom or unknown attributes:
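Because functions deep-copy whole objects, attributes they know nothing about travel along unchanged. A sketch with stand-in names (Data, subsample, and the subject attribute are illustrative, not Wyrm's actual implementations):

```python
import copy

class Data:  # minimal stand-in for wyrm.types.Data
    def __init__(self, data):
        self.data = data

def subsample(dat, factor):
    # copy.deepcopy carries along every attribute of the object,
    # including custom ones unknown to this function.
    dat = copy.deepcopy(dat)
    dat.data = dat.data[::factor]
    return dat

dat = Data(list(range(10)))
dat.subject = 'S1'            # custom attribute unknown to the toolbox
out = subsample(dat, 2)
print(out.data)               # [0, 2, 4, 6, 8]
print(out.subject)            # 'S1' -- still there
```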
If a function operates on a specific axis of a Data object (Section “Data Structures”), it adheres by default to our convention, but gives the option to change the index of the axis to operate on by means of Python’s default arguments. Those default arguments are clearly named as timeaxis, or classaxis, etc.:
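A sketch of the axis-index convention (the keyword name timeaxis follows the text; the default value of -2, matching a [..., time, channel] layout, is an assumption for this sketch):

```python
import numpy as np

def subsample(data, factor, timeaxis=-2):
    # Operate along the conventional time axis by default, but
    # allow the caller to override the axis index.
    slices = [slice(None)] * data.ndim
    slices[timeaxis] = slice(None, None, factor)
    return data[tuple(slices)].copy()

cnt = np.zeros((1000, 32))           # [time, channel]
print(subsample(cnt, 10).shape)      # (100, 32)

epo = np.zeros((5, 1000, 32))        # [class, time, channel]
print(subsample(epo, 10).shape)      # (5, 100, 32) -- same default works
```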
In Sections “Classification of Motor Imagery in ECoG Recordings”, “ERP Component Classification in EEG Recordings”, and “Performing Online- and Simulated Online Experiments”, you will find some realistic examples of the usage of our toolbox and its functions.
We realize that speed is an important factor in scientific computation, especially for online experiments, where one iteration of the main loop must not take longer than the duration of the samples being processed in that iteration. One drawback of dynamic languages like Python or Ruby is their slow execution speed compared to compiled languages like C or Java. This issue is particularly important in scientific computing, where non-trivial computations in Python can easily be two or more orders of magnitude slower than the equivalent implementations in C. The main reason for the slow execution speed is the dynamic type system: since variables in Python have no fixed type and can change at any time during the execution of the program, the Python interpreter has to check the types of the involved variables for compatibility before every single operation.
NumPy mitigates this problem by providing statically typed arrays and fast operations on them. When used properly, this allows for almost C-like execution speed in Python programs. In Wyrm all data structures use NumPy arrays internally and Wyrm’s toolbox functions use NumPy or SciPy operations on those data structures. We also carefully profiled our functions in order to find and eliminate bottlenecks in execution speed. Wyrm is thus very fast and suitable even for online experiments, as we will demonstrate in the Sections “Performing Online- and Simulated Online Experiments” and “Performance”.
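A rough illustration of why this matters: summing a million floats with a pure-Python loop versus the vectorized NumPy operation. Absolute times vary by machine; the relative gap is what counts.

```python
import timeit
import numpy as np

data = np.random.rand(1000000)

def python_sum():
    # Pure-Python loop: the interpreter re-checks types on
    # every single addition.
    total = 0.0
    for x in data:
        total += x
    return total

# Vectorized np.sum runs the loop in compiled code instead.
t_loop = timeit.timeit(python_sum, number=3)
t_numpy = timeit.timeit(lambda: np.sum(data), number=3)
print('loop: %.3fs, numpy: %.3fs' % (t_loop, t_numpy))
```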
Unit Tests and Continuous Integration
Since the correctness of its functions is crucial for a toolbox, we used unit testing to ensure all functions work as intended. The concept of unit testing is to write tests for small, individual units of code (usually single functions). These tests ensure that the tested function meets its design and is fit for use. Typically, a test will simply call the tested function with defined arguments and compare the returned result with the expected result. If both are equal the test passes, if not it fails. Well written tests are independent of each other and treat the tested method as a black box by not making any assumptions about how the function works, but only comparing the expected result with the actual one. Those tests should be organized in a way that makes it easy to run all tests at once with little effort (usually a single command). This encourages developers to run tests often. When done properly, unit tests facilitate refactoring of the code base (i.e. restructuring the code without changing its functionality), speed up development time significantly, and reduce the number of bugs.
In our toolbox each function is tested by a handful of test cases which ensure that the functions calculate the correct results, throw the expected errors if necessary, do not modify the input arguments, work with non-conventional ordering of axes, etc. The total amount of code for all tests is roughly 2–3 times bigger than the amount of code for the toolbox functions themselves, which is not unusual for software projects.
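A hypothetical test case in the style described above might look like this: it checks the computed result, and that the input argument is not modified (square is a stand-in for the corresponding toolbox function, not Wyrm's actual test code):

```python
import copy
import unittest
import numpy as np

def square(data):
    # Stand-in for the toolbox function under test.
    return copy.deepcopy(data) ** 2

class TestSquare(unittest.TestCase):
    def test_square(self):
        # The function computes the expected result.
        np.testing.assert_array_equal(square(np.array([1.0, -2.0])),
                                      np.array([1.0, 4.0]))

    def test_input_unmodified(self):
        # The input argument is left untouched.
        a = np.array([1.0, -2.0])
        square(a)
        np.testing.assert_array_equal(a, np.array([1.0, -2.0]))

# Run the test case (normally done via a single command like
# `python -m unittest` over the whole test suite).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSquare)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```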
To automate the testing even further, we use a continuous integration (CI) service in conjunction with Wyrm’s GitHub repository. Whenever a new version is pushed to GitHub, the CI service runs the unit tests with three different Python versions (2.7, 3.3, and 3.4) to verify that all tests still pass. If and only if the unit tests pass with all three Python versions does the revision count as passing; otherwise the developers get a notification via e-mail. The whole CI process is fully automated and requires no interaction.
A software toolbox would be hard to use without proper documentation. We provide documentation that consists of readable prose and extensive API documentation (http://bbci.github.io/wyrm/). The first part consists of a high level introduction to the toolbox, explaining the conventions and terminology being used, as well as tutorials on how to write your own toolbox functions. The second part, the API documentation, is generated from special comments in the source code of the toolbox, so-called docstrings (Goodger and van Rossum 2001). External documentation of software tends to get outdated as the software evolves. Therefore, having the documentation directly in the source code of the respective module, class, or method is an important means of keeping the documentation and the actual behaviour of the code consistent. Each function of the toolbox is extensively documented. Usually a function has a short summary, a detailed description of the algorithm, a list of expected inputs, return values and exceptions, as well as cross references to related functions in- or outside the toolbox, and example code demonstrating how to use the function. All this information is written within the docstring (i.e. in the actual source code), and HTML or PDF documentation for the whole toolbox can be generated with a single command. The docstrings are also used by Python’s interactive help system.
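A shortened, hypothetical example of this docstring convention (the function name, its simplified body, and the docstring text are illustrative, not copied from Wyrm's source):

```python
def select_epochs(dat, indices):
    """Select epochs from epoched data.

    Returns a new object containing only the epochs at the given
    indices; the input is not modified. (A plain list stands in
    for a Data object in this simplified sketch.)

    Parameters
    ----------
    dat : list
        the epoched data
    indices : list of int
        indices of the epochs to keep

    Returns
    -------
    list
        a new list containing only the selected epochs

    Examples
    --------
    >>> select_epochs(['e0', 'e1', 'e2'], [0, 2])
    ['e0', 'e2']
    """
    return [dat[i] for i in indices]

# Tools like Sphinx render such docstrings into HTML or PDF, and
# help(select_epochs) shows them in Python's interactive help.
```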
Python 2 versus Python 3
By the end of 2008 Python 3 was released. Python 3 was intentionally not backwards compatible with Python 2, in order to fix some longstanding design problems with Python 2. Since the porting of Python 2 software to Python 3 is not trivial for bigger projects, the adoption of Python 3 gained momentum only slowly. Although Python 2.7 is the last version of the 2.x series, it still receives backwards compatible bug fixes and enhancements. This is certainly a responsible decision by the Python developers but probably one of the reasons for the slow adoption of Python 3. As of today, most of the important packages have been ported to Python 3, but there is still a bit of a divide between the Python 2 and Python 3 packages.
We decided to support both Python versions. Wyrm is mainly developed under Python 2.7, but written in a forward-compatible way to support Python 3 as well. Our unit tests ensure that the functions provide the expected results in Python 2.7, Python 3.3, and Python 3.4.
Classification of Motor Imagery in ECoG Recordings