AUX: A scripting language for auditory signal processing and software packages for psychoacoustic experiments and education
- First Online:
- Cite this article as:
- Kwon, B.J. Behav Res (2012) 44: 361. doi:10.3758/s13428-011-0161-1
- 760 Downloads
This article introduces AUX (AUditory syntaX), a scripting syntax specifically designed to describe auditory signals and processing, to the members of the behavioral research community. The syntax is based on descriptive function names and intuitive operators suitable for researchers and students without substantial training in programming, who wish to generate and examine sound signals using a written script. In this article, the essence of AUX is discussed and practical examples of AUX scripts specifying various signals are illustrated. Additionally, two accompanying Windows-based programs and development libraries are described. AUX Viewer is a program that generates, visualizes, and plays sounds specified in AUX. AUX Viewer can also be used for class demonstrations or presentations. Another program, Psycon, allows a wide range of sound signals to be used as stimuli in common psychophysical testing paradigms, such as the adaptive procedure, the method of constant stimuli, and the method of adjustment. AUX Library is also provided, so that researchers can develop their own programs utilizing AUX. The philosophical basis of AUX is to separate signal generation from the user interface needed for experiments. AUX scripts are portable and reusable; they can be shared by other researchers, regardless of differences in actual AUX-based programs, and reused for future experiments. In short, the use of AUX can be potentially beneficial to all members of the research community—both those with programming backgrounds and those without.
Many experiments in behavioral research use audio signals for stimuli. Nowadays, signals are generated and processed predominantly in a digital form, and the presentation of stimuli is controlled by a computer, as opposed to analog equipment such as tone/noise generators, filters or mixers. While researchers appreciate the flexibility and efficiency provided by digital technology, they are still required to select or create proper programs to generate and process signals. Many researchers have adopted MATLAB (The Mathworks Inc., Natick, MA) as a programming tool, because it provides an intuitive computing environment that visualizes the processing of signals in a script-based language so that users can define and analyze arbitrary signals with relative ease. However, although the definition of a single sound is straightforward, operations among multiple signals can be cumbersome. Because all signals in MATLAB are processed as matrix or vector operations, signal lengths must be carefully adjusted before manipulation. This requirement often overshadows the benefits of using MATLAB in psychoacoustic research. In other words, conceptually simple manipulations of sound (e.g., two short tones with onset/offset smoothing windows, occurring sequentially but separated by a certain delay and embedded in noise with a longer duration) often require lengthy MATLAB expressions with accurate indices and sample numbers in the signal vectors that correspond to the actual times desired in (milli)seconds. Although the indices and sample counts constitute critical elements in MATLAB, because they specify how sample points in one signal interact with those in another signal in the discrete-time domain, they are not intrinsically relevant to conceptual components of the sound. This is due in part to the nature of MATLAB, which was originally developed to facilitate computational tasks for engineering problems, not necessarily to generate audio signals.
Therefore, an alternative scripting language—namely, AUX (AUditory syntaX)—was developed specifically for generating and processing audio signals. The purpose of this article is to introduce AUX and the related software packages AUX Viewer, Psycon, and AUX Library to researchers and educators in the behavioral sciences. These tools have been used by the author and by a small number of colleagues for research and classroom demonstration and have been polished with feedback from users over several years.
AUX is based on a paradigm of device-independent programming, where sounds are specified in a conceptual representation. This is in contrast to representing sounds directly as digital samples in conventional programming languages, such as MATLAB or C, where the rendition of sounds is dependent on the settings in the device, such as the digital sampling rate or the data type chosen for sound playback. The conceptual identity of sounds sometimes becomes unclear when they are embedded with device settings. In AUX, an abstract specification of signals suffices, without the implementation details of the device.
AUX is available as a syntax module and can be adopted by other software tools that offer desirable graphical user interfaces (GUIs), or it could be incorporated in auditory research tools available in the literature, such as Praat (Boersma & Weenink, 2010), APEX3 (Francart, van Wieringen, & Wouters, 2008), DMDX (Forster & Forster, 2003), Paradigm (López-Bascuas, Carrero Marín & Serradilla García, 1999), and Alvin (Hillenbrand & Gayvert, 2005).
The primary beneficiaries of AUX would be those who find programming in MATLAB daunting or who have less programming knowledge. The primary applications of AUX would be psychoacoustical research and education with no heavy engineering requirements, such as sophisticated control of hardware, data acquisition, or real-time signal processing. AUX was not created to fulfill any signal generation and processing needs that are impossible to fulfill with MATLAB or C. Instead, AUX provides an alternative way of generating signals by representing sounds using a conceptually simpler syntax in a device-independent manner.
In this article, the background and crux of AUX are discussed first, and the operators and rules of the syntax are explained. Subsequently, the AUX-based software packages AUX Viewer, Psycon, and AUX Library are described, to illustrate how AUX can be used in actual programs. These software packages are distributed under Academic Free License 3.0 and are available for download from the following website: http://auditorypro.com/download/aux. Detailed information about the software, including the manuals, is also available on this website for interested readers.
Generation of simple signals in AUX
Examples of built-in AUX functions for signal generation
As described above, AUX employs unique syntax rules that are different from the conventions used in other programming languages, such as MATLAB. Detailed rules and conventions are presented in the following sections.
While both terms refer to a sequence (or an array) of scalar values, in this article the former implies a sound, but the latter does not. Although this distinction (auditory vs. nonauditory) is relatively arbitrary and can be flexible, it might be useful for understanding the differences between them when composing or understanding AUX expressions.
Assignment and sole expressions
An assignment expression is used to assign a signal or vector to a variable: for instance, Open image in new window. A sole expression is the representation of signal without such assignment: for instance, Open image in new window.
Null signal/null portion
A signal is referred to as “null” if it has not been defined. As such, it has no effect during arithmetic operations. For example, in this signal, Open image in new window, the null portion is from 0 to 300 ms. Note that a null signal is different from a signal with zeros (a “silence” signal), because a zero signal can be multiplied with another signal to silence out all or part of the other signal, whereas a null signal has no effect in multiplication.
A script is one or more lines of AUX statements that represent a signal. In all examples shown below, the last statement of the AUX script refers to the signal to be generated, regardless of whether it is an assignment statement or a sole expression.
AUX user-defined function
In addition to built-in functions, AUX allows users to define their own functions, store them in a specified file path (determined by each program), and call them when necessary (see the Appendix). A user-defined function is called with input arguments to perform an intended task (such as computation or the generation of a signal), and it produces specified output arguments (such as signals or vectors), comparable to user-defined *.m functions in MATLAB.
Operators and rules in AUX
Rule 1—Arithmetic operators: + , -, *, and /
The arithmetic symbols (+, -, *, and /) operate on a point-wise basis; that is, values are added, subtracted, multiplied, or divided at each point of time. In a strictly mathematical sense, and in programming languages such as MATLAB, arithmetic operations on two vectors are meaningful only if vector dimension requirements are met; for example, in Open image in new window, Open image in new window and Open image in new window must be the same length. Arithmetic operations in AUX do not have this unwieldy constraint: Open image in new window simply means “two signals being put together.” Depending on the time windows for which Open image in new window and Open image in new window are defined, this can be addition, concatenation with or without a silent gap, or a combination of addition and concatenation of the signals. On the other hand, the “*” operation occurs only during the intervals in which both signals are defined (examples are shown in subsequent sections).
Operations between a scalar and a signal/vector apply to the entire length of the signal/vector. For example, in Open image in new window, 0.5 is a scaling coefficient, and 0.1 is a dc term.
Rule 2—Time-shift operator: >>
Open image in new window indicates the signal Open image in new window time-shifted by α milliseconds. By definition, an AUX expression without the time-shift operator is considered a signal with zero shift (i.e., Open image in new window). Therefore, all signals in AUX are represented with explicit timing information. The use of “time-shifted” signals with arithmetic operators, discussed above, is useful when arranging multiple signals in time or applying amplitude modulation to different signals.
Rule 3—Array operations with [ ]
Although the expression above is valid in AUX, it is not aligned with the principle of device-independent programming, since each sample is explicitly determined by the sampling rate. Instead, the following simpler and more descriptive representation is recommended: Open image in new window. Similarly, while it is possible to access a portion of a signal by indices, as in MATLAB, using the extraction operator ~ with time markers (see Rule 6) is recommended, because it eliminates the burden of tracking sample indices.
Rule 4—Concatenation operator: ++
Open image in new window indicates signal Open image in new window followed immediately in time by signal Open image in new window, and is equivalent to Open image in new window, where Open image in new window is a built-in AUX function that returns the duration of signal Open image in new window in milliseconds.
Rule 5—RMS scaling operator: @
In AUX, the full-scale amplitude of a signal is from 1 to −1. A signal generation function, such as Open image in new window or Open image in new window, without a scaling factor creates the signal at full scale. By definition, the RMS of a pure tone with full-scale amplitude is 0 dB. For example, Open image in new window generates the tone at full scale with an RMS of 0 dB.
Rule 6—Extraction operator: ~
Examples of built-in AUX functions for computations of signal properties or frequently used conversion
Rule 7—Playback adjustment operator: %
The % operator changes the playback rate of the signal—for instance, modulating the pitch and increasing or decreasing the duration. For example, Open image in new window is the signal with half the pitch and double the duration, and Open image in new window—or Open image in new window—is the signal with doubled pitch and half of the duration.
Rule 8—Stereo signal
Open image in new window is a stereo signal: Open image in new window on the left channel, Open image in new window on the right channel. As with the “+” operator, Open image in new window and Open image in new window do not need to have the same duration or time interval. Note that a semicolon is the delimiter between channels, and this should not be confused with the use of brackets for arrays, as used in MATLAB and specified in Rule 3. Open image in new window does not indicate a two-dimensional array or matrix as in MATLAB.
Rule 9—Conditional statements, logical operators, and loop control
To support programming needs, AUX provides conditional and looping control statements, such as Open image in new window. The following relational or logical operators are used for logical statements: Open image in new window (conditional equal), Open image in new window (conditional not equal), Open image in new window (and), and Open image in new window (or).
Rule 10—Mathematical functions
AUX also provides mathematical functions, such as Open image in new window, and the power operator ^. If the input is a vector, the output is also a vector where each element is the result of the function applied to the corresponding input element. If the input array has invalid values for the function, the corresponding output for those input values will be null. For example, negative values for the square-root function, Open image in new window, return null as output.
Rule 11—A symbol for commenting: //
The symbol // can be used to leave a comment for the user’s own reference in AUX scripts and user-defined functions.
Examples of AUX scripts
Examples of built-in AUX functions for signal processing
The last line indicates the speech target (from the .wav file) presented with band-pass-filtered noise of the same duration, with RMS levels of −20 and −30 dB, respectively, resulting in a signal-to-noise ratio of 10 dB.
For class demonstration, the following modifications can be made to generate different signals: varying signal-to-noise ratios, varying cutoff frequencies, or different filtering functions, such as low-pass (Open image in new window) or high-pass (Open image in new window) filtering.
For class demonstration, the number of harmonics and/or the distribution of the magnitudes of harmonics can be adjusted. Complex signals with a missing fundamental frequency component, or partial components with shifted frequencies, can also be generated, and their perceptual effects can be demonstrated in the classroom.
In addition to the examples above, other examples in which the sigma function could be useful include generating a Schroeder phase tone complex (Schroeder, 1970) or iterated rippled noise (Yost, 1996).
On the first line, Open image in new window is a tone glide with the frequency changing from 1000 to 1500 Hz. On the second line, Open image in new window is a silence signal for 50 ms, time-shifted by 225 ms. Then Open image in new window means the tone with an interrupted interval in the middle (from 225 to 257 ms). Open image in new window is low-pass noise to be inserted into that interrupted interval.
While many features are covered in the examples thus far, a beginning user might learn and use only the features of AUX that are required for the complexity of the signals that he or she wishes to create. For example, if one needs to generate relatively simple signals, consisting of a small number of tonal and noise components, but with specific temporal arrangements or intensity relations, one needs to be familiar with basic built-in functions and operators (such as >>, ++, or @). A wide range of signals that are covered in psychoacoustic textbooks and behavioral listening tests can be created using only these language features. Finally, since this article is intended to introduce only the fundamental features of AUX, interested readers are highly encouraged to refer to the website mentioned earlier for more in-depth information, such as user-defined functions, cell arrays, and string manipulations. Some select examples of user-defined functions are included in the Appendix.
Psycon is a Windows-based program for administering psychoacoustic experiments using multiple presentation intervals. It offers three commonly used experimental modules: adaptive procedure (Levitt, 1971), the method of constant stimuli, and the method of adjustment. A diverse range of signals can be tested in Psycon, because the signals are specified in AUX.
Structure and basic operations
Psycon supports experimental procedures in which the stimulus is presented in multiple intervals, either “standard/reference” or “odd-ball/variable,” in a random order. The subject’s task is to pick the interval with the odd-ball stimulus. Two executables are included in the program: psycon.exe and psycon_reseponse.exe. The former is for the experimenter, while the latter is for the subject. Upon presentation of the stimulus, the subject responds with a mouse click on the response screen, which can be on the same computer or on another computer connected via a network (TCP/IP). The program on the experimenter’s side, psycon.exe, administers the entire procedure: presenting the stimuli as specified in the AUX script, collecting the subject’s response, providing feedback if desired, and displaying and saving the result upon completion of the session. During the testing session, the progress of the procedure is visualized with graphs (such as the excursion of values for the adaptive procedure).
Specification of signals
for 8-Hz amplitude modulation (AM) detection with a noise carrier.
respectively. Upon beginning the testing session, AUX statements in “Variable Definitions” are executed once, and the variables are used throughout trials. The other box, “Variables per trial,” serves a similar purpose, but AUX statements in this box are updated in each trial. Therefore, the use of this box allows presentation of “frozen” noise within each trial.
The control tab of the adaptive procedure is circled in Fig. 3. The number of responses before adjusting the stimuli up or down and the direction of adjustment are specified by the experimenter. In Fig. 3, these are set as “3 down, 1 up” and “descending,” indicating that the adjustment will be made in the direction of the decrease in the value (Open image in new window). The “Initial Value” box specifies the value that Open image in new window starts with when the testing session begins, in units implied by the AUX script used for testing. In the examples above, it could be 20 (Hz) for frequency discrimination, 6 (dB) for intensity discrimination, 100 (ms) for duration discrimination, or −10 (dB) for AM detection, and so on. Likewise, the “Step sizes” boxes indicate desired step sizes in the implied unit, along with the number of reversals for the adaptive procedure. Upon completion of all reversals, the mean of the Open image in new window values at a specified number of reversal points is calculated and presented as the result of the session.
Settings in common
In all three procedures, customized instructions for the subject can be used for testing sessions. The number of intervals to present and the use of feedback can also be specified. The definitions of signals and variables can be saved into a file and retrieved later from the File menu.
Decision/adjustment of the sampling rate
Although consideration of the sampling rate extends beyond the scope of AUX scripting, a sensible decision or adjustment of the sampling rate is required in actual programs using AUX, such as Psycon. For example, in order to represent a 10000-Hz tone, an AUX user may simply write it as Open image in new window, but in order to generate this signal, the sampling rate must be set properly in Psycon (i.e., greater than 20000 Hz), which can be done in “Settings” under the File menu. In addition to the Open image in new window function, other functions in AUX require proper setting of the sampling rate.2
Because Psycon uses the sound card in the PC, it does not provide functionality for calibration of the reference signal level, but it can generate a continuous tone for calibration purposes. The user ought to measure the output from the playback device (e.g., speakers or headphones) with this tone, which corresponds to the “full-scale level” mentioned throughout this article. The user should also be aware that the output level changes if the Windows “Volume Control” switch is adjusted, which can occur not only by the user’s explicit control, but also by any other Windows programs that control the default sound card in the PC. For this reason, the use of a dedicated sound card that is not used by other Windows programs is recommended. The “Settings” control in the File menu provides a way to designate or change the sound card for a PC with multiple sound cards.
AUX Library provides a programming tool that allows users to create their own programs, just like AUX Viewer or Psycon. This is provided in the form of DLLs (dynamically linked libraries) with the calling convention of C. In addition, the AUX interface can be used in MATLAB through MEX (MATLAB executable), where a string input of AUX expressions (including variable definitions across multiple lines) is interpreted and the intended signal is generated. These programming tools are particularly useful to individuals who have invested significant effort or resources to develop programs for their own particular needs—for instance, a procedure with special adaptive staircase rules. They would only need to replace the signal generation routines with the codes calling AUX C Library or AUX MEX.
In the current release, both tools can be used only in the Windows environment (Windows 95 and later), since AUX was developed in Microsoft Visual C++. However, the core codes of AUX are not Microsoft-specific. Therefore, the library can be easily built for other platforms, such as Linux. The author welcomes modifications and redistributions of AUX Library for Linux under the terms of Academic Free License 3.0.
AUX was developed to benefit psychoacoustic researchers, educators, and students by relieving them of the burden of programming in C or MATLAB that would otherwise be necessary to generate and analyze sound signals. According to the author’s own experience, most students, regardless of their training history in programming, demonstrated command of AUX after a relatively short introduction. In recent years, MATLAB has often been taught in research methods courses covering instrumentation in the curricula of experimental psychology, behavioral neuroscience, or speech–language–hearing sciences. As mentioned earlier, MATLAB is a general tool for engineering needs. Attempts to teach MATLAB to students without programming or engineering backgrounds often yield only questionable profits, despite considerable time and effort expended. AUX might be presented as an alternative.
While the device-independent representation of sounds in AUX is beneficial, in that sounds can be specified as conceptual entities, this also delineates a limit to the application of AUX. For example, AUX is not applicable when signal handling involves real-time signal processing—that is, synthesis or modification of signals by real-time analysis. C or other programming languages would be more appropriate to handle those needs. Thus, AUX, MATLAB, and C cover different scopes of demands, and AUX is not intended to replace either MATLAB or C. The convenient features that AUX offers for generation and processing of signals could be emulated with tools developed as a MATLAB toolbox or a C library. However, to the extent that issues of engineering applications are of little or no concern to the researcher, AUX can be an appropriate tool to represent sounds. In fact, it is the author’s intention that AUX will be used in environments of other programming languages when specifying sounds, which is why AUX Library was created. For example, the AUX Viewer program has been replicated to run in the MATLAB environment using AUX MEX. While AUX Viewer for MATLAB runs the same way as the original AUX Viewer, it provides an improved screen layout and greater flexibility in viewing and analyzing signals, because it is based on MATLAB graphics.
Though AUX is capable of representing a variety of sounds, as discussed in this article, AUX and the accompanying software are still works in progress. It would be an overstatement to suggest that AUX allows any conceivable signal to be specified in a simple form, because each AUX statement is still a manifestation of semimathematical and algorithmic operations. Therefore, there might be several signals that are conceptually simple but problematic to represent using simple AUX statements without specific algorithms from the user, especially if the processing involves analyses of speech sounds (e.g., the fundamental frequency or formant frequencies). While this can be handled by user-defined functions (see the Appendix), in future releases of AUX some of these functionalities might be realized through operators or built-in functions. Debugging support in AUX is another area that needs improvement. The software currently provides only limited debugging support,3 though a debugging tool is under development as part of AUX Library.
The benefits of AUX include the portability and reusability of scripts and the separation of signal definition from the software. In the context of research software development, researchers do not need to develop software for signal generation, as this can be handled by AUX. Instead, they can focus on the user interface or testing paradigm specific to their needs. Yet, signals used by one researcher can be easily shared in the research community through AUX scripts. Even if the actual program that uses the script is not shared, other researchers can easily regenerate the signals in AUX Viewer or Psycon and examine them for subsequent studies. Thus, the widespread use of AUX could ultimately promote research productivity and collaboration among researchers.
This bracket notation also supports the use of a colon (:), the MATLAB-style expression.
In addition, users should be mindful of the effect of sampling rate when using the Open image in new window function. In actual AUX-based programs, it opens the .wav file and resamples it to the chosen sampling rate in the current program, if is not already sampled at that rate. Careful consideration of the sampling rate is needed to ensure that the desired spectral information is retained during the resampling process. From a purist’s view, the Open image in new window function is a violation of the device-independence principle, since a .wav file already includes information about the implementation (the sampling rate). This function was included in AUX as a realistic consideration, and for convenience’s sake.
Current tools for debugging AUX code are relatively crude. Users can debug their AUX scripts or user-defined functions either by displaying intermediate results of values on the screen with a Open image in new window function or by successively saving the intermediate results in the file with Open image in new window.
The support was provided by NIH/NIDCD (R03DC009061) and The Ohio State University Medical Center. All due credit must go to Jae Heung Park for his improvement of author’s original source codes with yacc/lex. The author thanks Trevor Perry for developing AUX Viewer for MATLAB and for valuable comments on earlier drafts of the manuscript, John Galvin III for useful editorial comments, and the users of Psycon of earlier versions for the feedback on software features. Finally, the author expresses appreciation to Judy Dubno for her encouragement to publish this manuscript.
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.