Focused Audification and the optimization of its parameters

We present a sonification method which we call Focused Audification (FA; previously: Augmented Audification) that expands pure audification in a flexible way. It is based on a combination of single-side-band modulation and a pitch modulation of the original data stream. With two free parameters, the sonification's frequency range can be adjusted to the human hearing range, which allows the user to zoom interactively into the data set at any scale. The parameters were adjusted by laypeople in a multimodal experiment on cardiac data. Based on these results, we suggest a procedure for parameter optimization that achieves an optimal listening range for any data set, adjusted to human speech.


Introduction
Sonification is still a young field building up a canon of methodologies, e.g. for supporting multimodal displays. Two of its standard methods are audification and auditory graphs. Audification is "a direct translation of a data waveform to the audible domain" (cited in [16], p. 186). It is often used to display one-dimensional, large data sets (with data display rates of tens of kHz). The sonic design space map orders data by their size along the x-axis and by their dimensionality along the y-axis (for a full description see de Campo [5]). Audification is therefore found at one end of the x-axis. At the other end of this axis we find auditory graphs, a special case of parameter mapping [11]. Auditory graphs classically use a discontinuous pitch-time display to show relatively few data points.
This paper suggests an expansion of audification that may even lead to a display comparable to an auditory graph. We call this method Focused Audification (FA). It builds on well-known techniques of signal processing, i.e., single-sideband modulation utilizing the Hilbert transform, and pitch modulation. Focused Audification conserves fundamental properties of pure audification, notably the compact temporal support and the translation of high frequency content of the data into transient events in the sound. At the same time, both the mean position of the frequency range and the bandwidth of the resulting sound can be controlled by free parameters, independently of the rate of the data display. Therefore, data sets can be explored interactively at various time scales and in different frequency ranges. These relationships are discussed in detail in this paper.
The proposed method has been presented in [12] and evaluated in a preliminary way with a small study on the perception of higher statistical moments in random noise data [26]. In Sect. 2 of this paper we discuss the background, i.e. basic properties of audification and auditory graphs. The paper then summarizes the signal processing algorithms of FA (Sect. 3), gives an example of the behavior of the algorithm with seismic data (Sect. 4), and evaluates it in a larger study (Sect. 5). The goal of this study with laypeople and cardiac data was to find the optimal calibration for the free parameters of FA. Based on its results, a generic method for perception-based Focused Audification of any data is proposed in Sect. 6.

Audification
Audification is one of the oldest methods of sonification research. A prominent, early study on the audification of seismic signals was conducted by Speeth et al. [23]. Subjects distinguished the sounds of earthquakes from those of atomic bombs with discrimination rates of up to 90%.
A crucial advantage of audification is the following: By conserving the time regime of the data signal, audifications of real physical processes are usually broad-band with a pronounced proportion of high frequencies during rapid transients. In the task of identifying natural sounds, e.g., the attack of musical instruments or speech signals, the transient signal portions provide important and salient features for the human ear and thus should serve as basis for pattern detection or recognition tasks in the auditory data exploration. Many authors, e.g., Dombois and Eckel [6], have argued in favor of a puristic approach to audification with as little data preprocessing as possible. This strategy should maximize the potential of the human hearing to detect yet unknown structures in the data which might be impaired by more sophisticated preprocessing.
The ideal audification signal has relevant auditory gestalts within time and frequency regimes that can be well-perceived by the human auditory system. Data sets that are problematic in audification shall be discussed with a thought experiment: let us assume a data stream with transient events that appear within a range of 1 k data points and with an (aperiodic) interval of roughly 10 k data points. This hypothetical data set is shown in Fig. 1. With a playback rate of 44.1 kHz (Fig. 1a), we find approximately four of these events per second, which is comparable to the rate of syllables per second in English spoken language 2 and thus apt for human hearing. On the other hand, each transient event lasts for approximately 22 ms and appears as a band-limited impulse with a cut-off frequency of around 50 Hz, which is far below the most sensitive frequency range of the hearing system. If the playback rate were to be raised by, e.g., a factor of 10, see Fig. 1b, the individual impulses would be transposed to a more appropriate frequency range, but at the cost of an indiscernible temporal structure of the impulse series. Concluding from this example, pure audification may suffer from a trade-off between the rhythmic structure and the displayed frequency range of individual events.
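The thought experiment can be retraced numerically. The following is a minimal sketch, not part of the original study: the function name is hypothetical, and the spectral extent of an event is estimated with the rule of thumb that an impulse of duration T carries most of its energy below roughly 1/T.

```python
def audification_tradeoff(event_len, event_interval, playback_rate):
    """Return (events per second, event duration in s, rough spectral extent in Hz).

    event_len and event_interval are given in data points,
    playback_rate in data points per second.
    """
    events_per_second = playback_rate / event_interval
    event_duration = event_len / playback_rate
    # Rule of thumb (assumption): an impulse of duration T has most of its
    # energy below about 1/T.
    rough_cutoff = 1.0 / event_duration
    return events_per_second, event_duration, rough_cutoff


# The two playback rates of the thought experiment (Fig. 1a and Fig. 1b).
for rate in (44_100, 441_000):
    eps, dur, fc = audification_tradeoff(1_000, 10_000, rate)
    print(f"{rate} Hz: {eps:.2f} events/s, {dur * 1e3:.1f} ms per event, ~{fc:.0f} Hz")
```

Raising the playback rate scales the event rate and the spectral extent by the same factor, which is exactly the trade-off described above: both quantities cannot be placed in their perceptually favourable ranges at once.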
Different concepts have been elaborated to cope with this trade-off. Worrall [28] extends the notion of audification and allows other means of data pre-processing besides filtering and data interpolation, i.e. compression and frequency shifting. This wider definition of audification still excludes the explicit synthesis of sound or the use of specific signal models (such as the one we develop in this paper). A similar approach has been explored within the CoRSAIRe project for a multisensory virtual reality environment [25]. In this project, "sonification-audification metaphors" have been developed for a specific data set (computational fluid dynamics data), notably three different ways of shifting the pitch of a signal without changing its duration. Each of these algorithms has specific artifacts that change the audio outcome, but all have the advantage of adapting the audified data set better to the human listening range. This argument meets our motivation for FA, but we use a different algorithm (leading to different artifacts). Equally favourable to FA, the data experts involved in the CoRSAIRe project reported that they found a fourth method, based on FM, more intuitive than the approaches using pitch shifting.

Auditory graphs
Just like audification, auditory graphs have belonged to the standard repertoire of sonification research since its beginnings. Obvious benefits are the straightforward analogy to visual graphs, which makes them intuitively understandable for sighted users, and their accessibility for non-sighted users. The data sets used are normally small, up to a few hundred data points, but may have several dimensions, as human audition is apt to segregate parallel data streams [11]. Focused Audification, depending on the setting of its parameters, may lead to a sonification that resembles an auditory graph in many ways (i.e., data points are mapped to pitch over time). Therefore, we briefly discuss the design of auditory graphs in the following.
Earlier research [3,15] recommended using musical pitches (e.g., MIDI notes following the Western 12-tone scale) mapped to the y-axis and time to the x-axis. The Sonification Sandbox [4] was possibly the largest effort to develop a general tool for auditory graphs in such a way. From experience with the toolbox it can be concluded that most real-world sonification applications need a more flexible software environment. Also, we consider the limitation to 12 semitones and (aesthetically unpleasant) MIDI sounds no longer state of the art. A recent development of a general-purpose tool for sonification is the sonification workstation (see [20] also for a discussion of previous attempts).
In an analysis, Flowers [8] discussed promises and pitfalls of auditory graphs. He suggested the following strategies for successful displays:
- Pitch coding of numeric value
- Exploiting temporal resolution of human audition
- Manipulating loudness changes in a pitch mapped stream to provide contextual cues and signal-critical events
- Using time to represent time
All strategies but the last one are taken into account in the design of the proposed method; the last point, using time to represent time, might be fulfilled depending on the data set (see footnote 3).

Footnote 3: Furthermore, in the case of several data sets, Flowers suggests choosing distinct timbres to minimize stream confusions and unwanted perceptual grouping and, in general, comparing sonified data sequentially rather than simultaneously.

Focused audification: the model
For explaining Focused Audification (henceforth: FA), we start with a simple audification. We assume a data set x(n) with n = 1, ..., N data points and a constant sampling frequency $f_s$. In the most direct audification, the playback rate $f_p$ equals $f_s$, i.e., $f_s$ data points are displayed per second. Rendering over a D/A converter with a reconstruction filter leads to a continuous signal x(t) with a bandwidth B between zero and $f_s/2$ Hz. If $f_s$, and therefore $f_p$, is as low as a few hundred data points per second, the resulting audification will lie in a low frequency range, where the human ear is not very sensitive.

Frequency shifting
Therefore, as a first step, we perform frequency shifting using a single-side-band (SSB) modulation. Applying a Hilbert transform $\mathcal{H}$ (see, e.g., [18]), the original audification signal x(t) becomes the complex-valued signal $x_a(t)$:

$$x_a(t) = x(t) + j\,\mathcal{H}\{x(t)\}$$

with the imaginary constant j. This analytic signal can be written using a real-valued envelope $\mathrm{env}(t) = |x_a(t)|$ modulated by a phasor with the instantaneous phase $\theta(t) = \angle[x_a(t)]$:

$$x_a(t) = \mathrm{env}(t)\, e^{j\theta(t)}$$

Performing a frequency shift by Δf and taking the real part of this signal leads to an SSB-modulated sound signal $x_{SSB}(t)$:

$$x_{SSB}(t) = \mathrm{Re}\{x_a(t)\, e^{j 2\pi \Delta f\, t}\} = \mathrm{env}(t)\cos(2\pi \Delta f\, t + \theta(t))$$

The spectrum of the analytic signal, which contains only non-negative frequencies between zero and B Hz, is shifted to the range between Δf and (Δf + B) Hz. Discarding the imaginary part re-builds a symmetric spectrum.
The frequency shift Δf is a free parameter of the method, which helps to yield a perceptually optimal frequency range of the sonification, e.g., somewhere within the range of 100 Hz and 2 kHz. If Δf = 0, there is no difference to a pure audification. A schematic illustration of the frequency shift is shown in Fig. 2.
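The frequency shift can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used for the paper: the function name `ssb_shift` is hypothetical, and the analytic signal is obtained with `scipy.signal.hilbert`, which returns x + j H{x} computed via the FFT.

```python
import numpy as np
from scipy.signal import hilbert


def ssb_shift(x, delta_f, fs):
    """Shift all frequency components of the real signal x upwards by delta_f (Hz).

    fs is the sample rate of x in Hz. The shift is a single-sideband
    modulation: the analytic signal is multiplied by a complex phasor and
    the real part is kept.
    """
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x)) / fs
    x_a = hilbert(x)  # analytic signal x + j*H{x}
    return np.real(x_a * np.exp(2j * np.pi * delta_f * t))


# Usage: a 50 Hz test tone shifted by 500 Hz.
fs = 8_000
t = np.arange(fs) / fs                      # one second of samples
x = np.sin(2 * np.pi * 50 * t)
y = ssb_shift(x, 500.0, fs)
peak_hz = np.argmax(np.abs(np.fft.rfft(y))) * fs / len(y)
print(peak_hz)
```

For this test tone the spectral peak of the output should lie near 550 Hz, i.e. at the original frequency plus Δf. With `delta_f = 0` the function returns the input unchanged, which corresponds to pure audification.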
Let us consider two scenarios: (1) Assuming a signal bandwidth B of 10 kHz, a small frequency shift of 100 Hz hardly changes the overall signal, but might make low frequency components of the signal more audible, as the spectrum is now shifted to the range between 100 Hz and 10.1 kHz. Note that in the case of large frequency shifts, aliasing eventually has to be taken into account. (2) In the second scenario, combining a strong frequency shift with a small signal bandwidth results in a very narrow-band signal, which might be problematic from a perceptual point of view. The frequency shift squeezes the original spectrum into a pitch range given by the frequency ratio (Δf + B)/Δf. For example, if the bandwidth of the primary audification signal is 100 Hz and the spectrum is shifted by Δf = 500 Hz, the resulting frequency range is 500-600 Hz. Speaking in musical terms, all frequency components of the original data stream are now concentrated within a minor third. Fluctuations of such narrow-band signals might be difficult to perceive.
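The two scenarios can be compared on a musical scale by converting the frequency ratio (Δf + B)/Δf into semitones. The helper below is a hypothetical illustration and assumes Δf > 0.

```python
import math


def pitch_range_semitones(delta_f, bandwidth):
    """Pitch range of a spectrum of the given bandwidth after a shift by delta_f.

    Both arguments are in Hz; delta_f must be positive. The result is the
    interval between the lower edge delta_f and the upper edge
    delta_f + bandwidth, expressed in equal-tempered semitones.
    """
    ratio = (delta_f + bandwidth) / delta_f
    return 12.0 * math.log2(ratio)


# Scenario (1): broadband signal, small shift.
print(round(pitch_range_semitones(100.0, 10_000.0), 1))
# Scenario (2): narrow-band signal, strong shift.
print(round(pitch_range_semitones(500.0, 100.0), 1))
```

Scenario (2) yields a little more than three semitones, i.e. roughly a minor third, whereas scenario (1) still spans more than six octaves.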

Exponential frequency modulation
Therefore, our approach is extended by modulating the frequency shift of the phasor of the analytic signal $x_a(t)$. The instantaneous frequency shift of the modulator, $f_i(t)$, encodes the numeric data values of x(t) as pitch, i.e., as an exponential function of x(t), following Flowers' recommendations:

$$f_i(t) = \Delta f \cdot 2^{c\, x(t)}$$

The freely choosable parameter c controls the magnitude of the modulation: setting c = 0 results in a constant instantaneous frequency of the frequency modulation (FM), which is then independent of the data values x(t). This results in a pure frequency shift as described in Sect. 3.1.
Setting c = 1 leads to a transposition of one octave higher and lower for signal values x(t) = ± 1. The value of c has to be carefully chosen depending on the signal amplitude and bandwidth to prevent aliasing resulting from strong FM sidebands.
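A quick plausibility check for a candidate value of c can be sketched as follows. It assumes the exponential mapping in which a data value x shifts the frequency by Δf · 2^(c·x), consistent with the octave statement above. The function names are hypothetical, and the test is deliberately crude: it only compares the highest shifted band edge with the Nyquist frequency, while strong FM sidebands may extend further.

```python
import numpy as np


def peak_shift_hz(x, delta_f, c):
    """Largest instantaneous frequency shift delta_f * 2**(c * x) over the data x."""
    x = np.asarray(x, dtype=float)
    return float(delta_f * 2.0 ** np.max(c * x))


def exceeds_nyquist(x, delta_f, c, bandwidth, fs_out):
    """Crude check: does the highest shifted band edge reach fs_out / 2?

    Assumption: the upper edge is approximated by peak shift + bandwidth;
    FM sidebands beyond this edge are not modelled.
    """
    return peak_shift_hz(x, delta_f, c) + bandwidth >= fs_out / 2.0


x = np.linspace(-1.0, 1.0, 201)              # amplitude-normalized test data
print(peak_shift_hz(x, 500.0, 1.0))          # one octave above 500 Hz
print(peak_shift_hz(x, 500.0, 3 / 12))       # a minor third above 500 Hz
print(exceeds_nyquist(x, 500.0, 1.0, 100.0, 44_100))
```

A negative result of this check does not guarantee an alias-free signal; it is only meant to rule out parameter combinations that are certain to cause problems.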
Integrating the instantaneous frequency results in the instantaneous phase $\varphi_i(t)$, which serves as a phase modulating term for the analytic signal:

$$\varphi_i(t) = 2\pi \int_0^t f_i(\tau)\, d\tau$$

The complete model of Focused Audification is thus defined by:

$$x_{FA}(t) = \mathrm{Re}\{x_a(t)\, e^{j\varphi_i(t)}\} = \mathrm{env}(t)\cos(\varphi_i(t) + \theta(t))$$

The model of FA is controlled by two freely choosable model parameters, Δf and c, that can be set according to the explorative goals of the sonification. Figure 2 illustrates the effect of the parameters with a schematic data set as compared to the absolute hearing threshold.
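The complete model can be sketched offline in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name is hypothetical, the data are assumed to be amplitude-normalized to roughly [-1, 1], the instantaneous frequency is taken as Δf · 2^(c·x), the integral is replaced by a cumulative sum, and `fs` is the rate at which the samples of x are played back (no resampling is performed).

```python
import numpy as np
from scipy.signal import hilbert


def focused_audification(x, delta_f, c, fs):
    """Frequency-shift and pitch-modulate the real signal x.

    x       : data, assumed normalized to about [-1, 1]
    delta_f : frequency shift in Hz
    c       : modulation depth (c = 1 gives +/- one octave for x = +/- 1)
    fs      : playback sample rate in Hz
    """
    x = np.asarray(x, dtype=float)
    x_a = hilbert(x)                               # analytic signal
    f_i = delta_f * 2.0 ** (c * x)                 # instantaneous frequency shift
    # Discrete integration; subtracting f_i makes the phase start at zero.
    phi_i = 2.0 * np.pi * (np.cumsum(f_i) - f_i) / fs
    return np.real(x_a * np.exp(1j * phi_i))


# Usage: slowly varying test data, shifted to 500 Hz with one octave per unit.
fs = 8_000
t = np.arange(fs) / fs
x = 0.5 * np.sin(2 * np.pi * 3 * t)
y = focused_audification(x, delta_f=500.0, c=1.0, fs=fs)
```

With c = 0 the function reduces to a pure frequency shift, and with Δf = 0 it returns the input unchanged, so pure audification remains a special case of this sketch.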
One issue of FA when dealing with signals of harmonic complexes needs to be discussed. Many physical processes are, at least approximately, periodic. The related signals therefore consist of harmonic partials, and their audification makes use of human audition, which groups these frequency components into a single auditory gestalt. In pure audification, frequency ratios and thus the periodicity of harmonic complexes are preserved, resulting in one "sound" with a certain timbre and pitch. In FA, on the contrary, the frequency shift destroys the harmonic relationship between the partials and thus the periodicity of the signal. This results in a complex superposition of individual sinusoidal tracks instead of one gestalt with a certain timbre.

Fig. 2 A schematic spectrum of a pure audification signal X(f) of bandwidth B is compared to the absolute hearing threshold in a. The frequency shift by Δf, depicted in b, transposes the spectrum to a more sensitive region of hearing, while narrowing the resulting pitch range. This can be compensated by the frequency modulation controlled by the parameter c. The resulting signal $X_{FA}(f)$ is located in a more sensitive region of human hearing and has comparable bandwidth to the original signal X(f)

[...]tions for Seismology [13]. The data set is a seismological event with 5 s length given $f_p$ = 44.1 kHz. Figure 3a shows the spectrogram. The corresponding sound examples can be found at http://phaidra.kug.ac.at/o:92490. The event is characterized by an impulsive sequence in the beginning with a bandwidth of around 5 kHz. The rest of the example shows relevant signal energies within 600 Hz bandwidth. In the first half of the data set, we find high energies at very low frequencies that are perceivable as a low-pitched glissando in the pure audification.
For focused audification, the playback rate is now reduced by a factor of 4 ($f_p$ = 5.5 kHz) and the frequency shifted by Δf = 250 Hz in order to stay in the audible range. Zooming into the glissando sequence (Fig. 3c), the frequency is shifted even more, Δf = 500 Hz. The modulation of c = 3/12 leads to a pitch transposition of a minor third. The sound behaves as an auditory graph, and the former dull glissando event can be explored in detail.

Listening test with electrocardiogram data
In this section we present the evaluation of FA in an interactive setting. The main focus of the presented experiments was the adjustment of the free parameters of the model. As data set we chose electrocardiogram (ECG) data: on the one hand, there are well-established, scientifically labeled data sets of ECG data available. On the other hand, these are communicable even to medical laypeople-our test subjects-who are able to categorize these data (this is arguable as, e.g., Ballora et al. [1] have shown that part of their test subjects could achieve 90% correct identification rates with ECG data in a sonification of four different cardiac states). Previous sonification research on ECG signals was conducted by Worrall et al. [29] and Terasawa et al. [24], with a more diagnostic focus.
We performed a pilot test and a subsequent experiment, with both quantitative and qualitative analysis, to answer our research questions:
- Which are the optimal (preferred and efficient) interindividual parameter settings for the FA of ECG signals?
- Which general lessons can be learned using our approach in an interactive, explorative setting?

Choice of data files
We used data from the online MIT-BIH Arrhythmia Database. It is one of the most used ECG databases due to its long and consistent data series [10,14,17], digitized with an average sampling rate of 360 Hz. One cycle of the basic ECG trace is shown in Fig. 4. Different types of arrhythmias (i.e., irregular heartbeats) have lengthened or shortened intervals within one cycle, or may exhibit an abnormal polarity of part of the signal. Furthermore, some arrhythmias appear alternating, where each second or third heartbeat is different (see footnote 4). For our experiment, we chose three types of cardiac states, as labeled in the database: "Paced rhythm", "Premature ventricular contraction" and "Ventricular trigeminy". This selection was made following the consultation of an internist with specialization in cardiac insufficiency, ensuring that the signals show differences that are obvious to laypeople. For each of those, we used three segments of three different patients of 20 s length each (i.e., there were 9 different files divided into three anonymized groups, A, B, and C). The files showed some variability based on the individuality of the patients, both in terms of heart rate and the form of the signal. The mean maximum amplitude was 0.32 with a standard deviation of 0.06 (calculated over the means of the amplitude spread for each file).

Footnote 4: In analyzing the ECG signal, the cardiologist uses a template heart beat (averaged over, e.g., a hundred beats) for each of the twelve ECG leads (i.e., taken from 12 positions on the patient's body), and measures their behavior. This process is often automated today using specialized algorithms to discern different arrhythmias. Still, the expert knowledge of the cardiologists comes into play for border cases, when s/he uses the data plots to explore the signal, as in previous times.

Scenario and task
Our scenario was the usage of FA to monitor ECG data in real-time in an explorative setting. The subjects' task was to find one optimal parameter set (i.e., for frequency shift and pitch modulation) for all nine sound files, with the goal to (a) be able to hear differences between the groups A, B, and C (and verbalize their findings) and (b) have the most amenable sound possible.
NB: We did not test whether FA is better (e.g., more efficient or its sound more amenable) than other sonification approaches. Developing new methodologies for sonification also requires working with well-known data sets and with the environmental conditions of real scientific data. For sorting the chosen cardiac states by auditory means, a variety of methods could be chosen. The states differ from each other and might even be differentiated by pure audification. In our experiment, subjects were free to choose their preferences from pure audification to FA within a wide range of parameters. The efficiency of FA versus, e.g., audification can be deduced from the theoretical considerations regarding human hearing in Sect. 2, and would need further testing.

Test conditions
The procedure was repeated in four conditions with different playback rates, see Table 1. We chose these values for the following reasons: we wanted to explore the two free parameters Δf and c, while the playback rate was freely choosable as well. This 3D-space of inter-depending factors is arguably too complex to fully explore within the limited time of an experiment. Therefore we fixed three playback speeds in independent conditions. In real-time, the heartbeat of a healthy person varies between 50 and 100 beats per minute, i.e. roughly 1 Hz. Exploring the details of each individual cycle as shown in Fig. 4 requires a slower playback rate that we set to one fourth of real-time as a result of informal tests by the authors. For exploring the macro-structure over several cycles, the acceleration of a factor of 5 was chosen, leading to a rhythm of roughly 5 beats per second, the upper range of the speaking rate measured as vowels per second (see footnote in Sect. 2.1 and [19]). In the pilot test, we only tested conditions (1) to (3). For the full experiment, we added a fourth condition with adapted task: using individual averages, Δf mean and c mean , as calculated over the conditions (1) to (3) for each subject, the subject should set the optimal playback speed. Figure 5 shows the graphical user interface (GUI) of the experiment. For each condition, the subjects were free to choose any sound file and change the parameters as often and as long as they wanted. The experiment was accompanied by an observer, who led through it and collected qualitative data based on an open questionnaire. Questions covered differences between the sound file groups A, B, and C; between the playback speeds, i.e., conditions; and general remarks on the sound quality and the understanding of the mapping (in particular, the correlation between the graph and the sound).
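The three fixed playback conditions can be made concrete with a short calculation. The sketch below uses the 360 Hz sampling rate of the recordings and the nominal heartbeat frequency of 1 Hz mentioned in the text; the function name and the dictionary of conditions are illustrative. It also lists the bandwidth $f_p/2$ that a pure audification would have in each condition.

```python
DATA_RATE_HZ = 360.0     # sampling rate of the ECG recordings
HEART_RATE_HZ = 1.0      # nominal heartbeat frequency (roughly 60 bpm)

# Speed factors relative to real time for the three fixed conditions.
conditions = {"slow": 0.25, "real-time": 1.0, "fast": 5.0}


def playback_condition(speed_factor):
    """Return (playback rate in Hz, beats per second, pure-audification bandwidth in Hz)."""
    f_p = DATA_RATE_HZ * speed_factor
    beat_rate = HEART_RATE_HZ * speed_factor
    bandwidth = f_p / 2.0
    return f_p, beat_rate, bandwidth


for name, factor in conditions.items():
    f_p, beat_rate, bandwidth = playback_condition(factor)
    print(f"{name}: f_p = {f_p:.0f} Hz, {beat_rate:.2f} beats/s, B = {bandwidth:.0f} Hz")
```

Even in the fast condition the pure audification stays below 1 kHz, and in the slow condition below 50 Hz, which illustrates why a frequency shift is needed for these data.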

Test design of the pilot test
The pilot test consisted of two successive rounds that were repeated after a pause that ranged from one hour to 2 days for each subject. The only difference between round one and round two was the use of a different slider design. Our hypothesis was that participants might rely on their visual memory of the slider positions in the first [...]

The pilot test showed statistical differences between the two slider designs; this was not intended and was due to poor experiment design. Our main hypothesis is that the SQRT design is perceptually counter-intuitive as compared to the exponential dependency between pitch and frequency. The QUAD design is more similar to the psychometric curve of pitch sensitivity [21]. For this reason, we re-designed the interaction paradigm of setting the parameters for the main experiment.

Test design of the experiment
Instead of sliders in the GUI, we used an Apple Mighty Mouse with one miniature trackball that served as a simple, "endless" slider interface. The subjects thus did not receive any visual or tactile feedback on the position of the parameter setting relative to its possible range.
The experiment was conducted in one round.

Test subjects and time
For the pilot test, 12 test subjects were recruited out of the colleagues of the authors and the authors themselves, i.e., all experienced listeners, but all laypeople in the field of medicine/cardiology (7 out of 12 with a certified hearing loss of less than 15 dB, being part of a trained expert listening panel [9,22]). In the experiment, 14 subjects participated out of which 7 had already taken part in the pilot test (5 of them are part of the trained expert listening panel). The experiment took roughly 15 min each.

Results
First, we compared results from the two slider designs in the pilot test with the ones of the experiment. Figure 6 shows the 95% confidence ellipses for the parameter settings chosen by the participants. The influence of the slider design on c is significant neither for the mean nor for the median (p > 0.43). However, in the QUAD slider design, participants generally chose smaller values for Δf than in the SQRT design (p < 0.001). The sequence of the presented slider designs had no significant influence (p = 0.55 for c and p = 0.41 for Δf). Figure 6 also shows that the mean results of the QUAD design coincide with the mean results of the experiment. Therefore, we combined the results from the QUAD slider design of the pilot test and the ones from the endless slider of the experiment in the further analysis.
Results of the experiment are given in Table 2, showing mean, standard deviation, and 95% confidence interval for Δf and c for all three conditions and the overall average. The mean ideal playback rate (for the individually calculated mean parameters, i.e. condition (4) in Table 1) was found to be 3.6 (minimum 0.9, maximum 7, standard deviation 1.7). Figure 7 shows the summarized results for the settings of parameters, averaged over all conditions. We may conclude a useful parameter range for the ECG data between [...]

There is no trend, e.g. a linear one, as might be expected with rising playback rate: given a heartbeat frequency of, e.g., 1 Hz, the playback conditions lead to heartbeat rates of 0.25 Hz (slow), 1 Hz (real-time), and 5 Hz (fast). This initial offset can be neglected as compared to a frequency shift of 240-350 Hz. Therefore, the ellipses in Fig. 7 largely overlap.

Discussion
The results show a clear, inter-individual preference for setting the parameters to a specific range. We examined in more detail why this might be. The findings in Fig. 7 show a rather uniform distribution, with a few outliers. We hypothesized that the test subjects tried to set the resulting spectra according to their preferred listening range, e.g., as used in speech. We therefore computed the spectra of the audio files resulting from the parameter settings of each subject, and compared these spectra to an averaged speech spectrum computed from English, French, and German female and male speakers from EBU SQAM recordings [7]. Figure 8 shows the spectra of sound files of FA for different conditions. [...] Fig. 8b reveal that their main energies lie in the region where speech also has most energy. We may conclude that test subjects tried to adjust the spectrum resulting from the FA to the region that they are used to from speech. Furthermore, results of the mean ideal playback rate (based on a heartbeat rate of approximately 1 Hz) correspond to the rate of syllables in, e.g., English language, which is approximately 4 Hz as discussed in Sect. 2.1.
Finally, we may draw general conclusions from the qualitative results on the effectiveness of the new method. Preliminary qualitative analysis of the answers of the test subjects leads to the following conclusions:
- FA is efficient for categorizing data. Most subjects could verbalize differences between the data categories A, B, and C. As far as the understanding of the authors is concerned (all of us being medical laypeople), these differences correspond well to the specificities of the cardiac arrhythmias as described in the database.
- FA is flexible in interactive data exploration. In the experiment, we could categorize the participants into two groups: the larger one focused on rhythmic aspects and preferred the fast condition. A smaller group of subjects liked the slow condition better because they were more interested in the details of the modulation within one heartbeat cycle. This finding shows one of the benefits of FA: the interactive, seamless setting of parameters allows the listener to focus on different aspects or scales of a data set. Cardiologists focus both on the behavior within one heartbeat and on the general rhythm. Both behaviors have been found and explored by our laypeople listeners.
- FA provides an acceptable sound. Participants were rather neutral towards the sound quality, many stating that within the context it would be ok (the context had not been defined but was assumed by the listeners to be a clinical one). A few participants stated that they would not like to listen to the sound for a longer period. The relationship between the data plot and the sound was reported as clear, even if no participant drew 100% correct conclusions about the underlying mapping. Many participants hypothesized about the data sets; thus they clearly used our approach to explore the data.
Obvious interpretations on the data (e.g., "like a heartbeat", "again arrhythmies") were equally mentioned as music metaphors ("a syncopated rhythm", "strange beat") or technical ones ("metallic piston noise", "background noise as in our server room", "as a remote disco sound", "chaotic"), and general statements ("unagitated/dull", "annoying", "cool/interesting"). It would need a cardiologist to check if the individual findings of the participants could be useful in diagnosis.

Optimized parameter selection for FA
Concluding from the experiment of Sect. 5 we propose a procedure for selecting the FA parameters Δf and c for arbitrary data sets. These are optimal in the sense that they adjust the resulting spectrum of the FA as much as possible to the spectrum of average human speech.
Based on $B_{speech}$ we set Δf equal to $f_{mid,speech}$:

$$\Delta f = f_{mid,speech} \tag{11}$$

By this we achieve a frequency shift of the original signal spectrum to the typical center frequency of the speech spectrum. Equation 11 is an approximation for signals with negligible bandwidth $B_{sig}$ as compared to $B_{speech}$ (as is the case, e.g., for the ECG signals in the experiment above). For the case where the signal is broadband, ranging from $f_{low,sig} = f_{sig} - B_{sig}/2$ to $f_{up,sig} = f_{sig} + B_{sig}/2$, we need to account for the logarithmic frequency scale, and the frequency shift Δf has to satisfy the following equation:

$$(\Delta f + f_{low,sig})\,(\Delta f + f_{up,sig}) = f_{mid,speech}^2 \tag{12}$$

Solving the quadratic equation gives as the one physical solution

$$\Delta f = -f_{sig} + \sqrt{f_{mid,speech}^2 + B_{sig}^2/4} \tag{13}$$

For the special case of low-frequency signals, we may approximate:

$$\Delta f \approx f_{mid,speech} - B_{sig}/2 \tag{14}$$

6. The optimal value of the modulation parameter c is based on the effective speech bandwidth and is set to c = 1 for amplitude-normalized narrow-band signals. For broadband signals, the following equation has to be solved for c:

$$2^c = \frac{f_{up,speech}}{\Delta f + f_{up,sig}} = \frac{2 \cdot f_{mid,speech}}{\Delta f + f_{up,sig}} \tag{16}$$

Again, for the special case of low-frequency signals (Eq. 14), the optimal value of the modulation parameter c can be approximated (see footnote 7) by:

$$c \approx 1 - 0.7 \cdot \frac{B_{sig}}{f_{mid,speech}} \tag{17}$$

In the case of data sets exhibiting different time regimes of information, i.e. a larger rhythmic structure of events and individual events of interest, we recommend selecting optimal values for $f_p$, Δf, and c for every event rate found in Step 2 and letting the user explore all of them.
With this semi-automatic selection of parameters for FA, the resulting spectrum of the sonification is similar to the one of speech, and thus comfortable for human hearing.
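The selection of Δf and c can be sketched as two small functions. This is an illustrative sketch under explicit assumptions, not a definitive implementation: the function names are hypothetical; the broadband shift is chosen so that the geometric mean of the shifted band edges equals the speech center frequency (one way of centering on a logarithmic frequency scale); and the speech center frequency is passed in as a parameter, with the value 300 Hz in the usage example being a placeholder rather than a value taken from the text. The modulation parameter follows Eqs. 16 and 17.

```python
import math


def select_fa_parameters(f_mid_speech, f_low_sig, f_up_sig):
    """Return (delta_f, c) for a signal band [f_low_sig, f_up_sig] in Hz.

    Assumption: delta_f centres the shifted band on f_mid_speech in the
    geometric-mean sense; c then follows from Eq. 16.
    """
    f_sig = 0.5 * (f_low_sig + f_up_sig)        # band centre
    b_sig = f_up_sig - f_low_sig                # signal bandwidth
    delta_f = -f_sig + math.sqrt(f_mid_speech ** 2 + b_sig ** 2 / 4.0)
    c = math.log2(2.0 * f_mid_speech / (delta_f + f_up_sig))
    return delta_f, c


def select_fa_parameters_lowfreq(f_mid_speech, b_sig):
    """Approximation for low-frequency signals occupying 0..b_sig Hz (Eq. 17 for c)."""
    delta_f = f_mid_speech - b_sig / 2.0
    c = 1.0 - 0.7213 * b_sig / f_mid_speech
    return delta_f, c


# Usage with a placeholder speech centre frequency of 300 Hz and a
# low-frequency signal of 30 Hz bandwidth.
F_MID = 300.0  # hypothetical value, to be replaced by the measured speech centre
print(select_fa_parameters(F_MID, 0.0, 30.0))
print(select_fa_parameters_lowfreq(F_MID, 30.0))
```

For narrow bandwidths the exact and the approximate results should nearly coincide, with Δf close to the speech center frequency and c close to 1; for wider bands the exact variant is the safer choice.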

Conclusions and outlook
We presented Focused Audification as a method that allows adjusting a sonification between a pure audification and a pitch-based auditory graph. As opposed to pure audification, where only the playback rate can be changed, two more model parameters can be chosen independently. One parameter, Δf, controls the magnitude of a frequency shift. The second, c, sets the excursion of a pitch modulation. The implementation of FA is simple and preserves preferable properties of audification whilst permitting a true "zooming" at any time scale for the interactive exploration of a data set.

Footnote 7: Using the identity $x = e^{\ln(x)}$ yields $e^{c \cdot \ln 2} = e^{\ln 2 - \ln(1 + B_{sig}/(2 \cdot f_{mid,speech}))}$, $c \cdot \ln 2 = \ln 2 - \ln(1 + B_{sig}/(2 \cdot f_{mid,speech}))$, and $c = 1 - 1.4427 \cdot \ln(1 + B_{sig}/(2 \cdot f_{mid,speech}))$. Since $B_{sig}/(2 \cdot f_{mid,speech}) \ll 1$, and taking into account that $\ln(1+x) \approx x$ for $x \ll 1$, the logarithm can be approximated. This leads to $c = 1 - 1.4427 \cdot B_{sig}/(2 \cdot f_{mid,speech})$ or $c = 1 - 0.7213 \cdot B_{sig}/f_{mid,speech}$, which leads to Eq. 17.
The method has been discussed using the example of a seismological data set. Preferred and efficient settings for the model's free parameters have been explored in an experiment with ECG data. They appeared to be adjusted in a relatively narrow region, whose spectrum has maximum energy within the one of speech. We therefore derived a procedure to find parameters for FA for any data set in such a way that the resulting spectrum is as similar as possible to the one of speech. Further research has to test the procedure with different types of data and check the efficiency of FA against other sonification methods.

Fig. 9 Synth definition for an FA implemented in SuperCollider (Version 3.9.3). The implementation of FA in SC starts from a given buffer b, with an adjustable playback rate and a start position startpos from which the buffer read-out starts. The model parameters are called deltaf and c according to the model definition in Eq. 6. The instantaneous frequency fMod is defined, its sine and cosine calculated. The existing unit generator HilbertFIR returns a two-dimensional array: hilb[0] contains the primary signal sig (and is multiplied by the cosine), hilb[1] contains its Hilbert transform (which is multiplied by the sine). The final output is the difference between those two, according to Eq. 8

[...] thus we prepared the sound examples and plots discussed in Sect. 4 in MATLAB. SuperCollider, on the other hand, is handier for real-time, interactive use of the method, and was thus used for the experiment described in Sect. 5. We present the basic SC code in Fig. 9.