Experiment 2 examined the transit duration of MIDI messages, from the time they are sent to the time they are first read, when they are routed through a MIDI–PCI or MIDI–USB interface (see Fig. 1b). The latency and variability of six MIDI interfaces were tested, three MIDI–PCI and three MIDI–USB, to determine whether MIDI–PCI interfaces produce shorter and less variable latencies than MIDI–USB interfaces. Finally, I tested whether MIDI–USB devices poll, that is, whether the USB port is sampled periodically for new information at intervals greater than 1 ms.
Method
Materials
The “send” and “read” Arduino triggers (audio data) were recorded using the same setup as Experiment 1. An Intel Core i7-2670QM, 2.2 GHz, running Linux Ubuntu v3.2.0-23 was used to perform the FTAP loop test (Finney, 2001).
Procedure
Latencies were measured using the SMIDIBT route test, which sends 4,002 MIDI “note on” and “note off” messages and compares the sent time with the received time. Experiment 1 indicated that transit durations were shorter for the “single-byte” read method. Accordingly, that method was used here to record the transit and total durations. As in Experiment 1, sent and received times are demarcated, respectively, by a change from high to low for the “send Arduino” and by a change from low to high for the “read Arduino.” The difference between these times indicates the transit duration for the MIDI message. When the message is routed through a MIDI–PCI or MIDI–USB interface, the transit duration indexes the latencies incurred when a PC receives and sends a MIDI message. The onset and offset times used to measure transit durations were recorded at a sampling rate of 44,100 Hz, allowing a temporal resolution of 0.023 ms. The SMIDIBT route test was conducted 20 times for each device (6) by interval (3) combination. The three interval rates represented different information loads: 1, 2, and 3 ms between sends. Moreover, a baseline was calculated on the basis of the latencies produced when the MIDI messages were not routed through an interface but were, instead, directly sent from the “send Arduino” to the “read Arduino” (as in Exp. 1).
To measure polling, the FTAP loop test was used, which sends 4,002 MIDI messages through a MIDI device that connects its own output to its input and records the latency between the sent and received times (rounded to the nearest millisecond). I then assessed whether the distribution of latencies was bimodal or unimodal. A bimodal distribution would be indicative of polling, whereas a unimodal distribution would indicate either that the data transfers were not subject to polling or that they polled at a rate that was distributed around one central latency value. The FTAP loop test was conducted 20 times per device.
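The edge-based timing used in the route test can be illustrated with a minimal sketch. It assumes that each trigger channel is available as a list of samples already thresholded to 0 (low) or 1 (high). The function names, the pairing rule, and the toy signals are hypothetical and are not taken from the SMIDIBT itself.

```python
SAMPLE_RATE_HZ = 44100  # sampling rate reported in the Procedure


def falling_edges(samples):
    """Indices where the signal changes from high (1) to low (0)."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] == 1 and samples[i] == 0]


def rising_edges(samples):
    """Indices where the signal changes from low (0) to high (1)."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] == 0 and samples[i] == 1]


def transit_durations_ms(send, read, rate=SAMPLE_RATE_HZ):
    """Pair each send edge with the next unused read edge (an assumed rule)
    and return the differences in milliseconds."""
    sends = falling_edges(send)   # "send Arduino": high -> low
    reads = rising_edges(read)    # "read Arduino": low -> high
    durations = []
    j = 0
    for s in sends:
        # Skip read edges that occurred before this send edge.
        while j < len(reads) and reads[j] < s:
            j += 1
        if j == len(reads):
            break
        durations.append((reads[j] - s) * 1000.0 / rate)
        j += 1
    return durations


# Hypothetical toy signals: two messages, read 44 and 54 samples after sending.
send = [1] * 10 + [0] * 90 + [1] * 10 + [0] * 90
read = [0] * 54 + [1] * 46 + [0] * 64 + [1] * 36
durations = transit_durations_ms(send, read)
resolution_ms = 1000.0 / SAMPLE_RATE_HZ  # one sample period, about 0.023 ms
```

One sample period at 44,100 Hz is 1000/44100 ≈ 0.023 ms, which is the temporal resolution stated above.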
Design and hypotheses
The dependent variables were the raw latencies and the standard deviation (SD) of latencies for each loop test. To better reflect the typical standard deviations found in sensorimotor synchronization experiments, the SD of each group of 40 consecutive events was calculated. The FTAP loop test further included Hartigan’s distribution statistic as a dependent variable for measuring multimodality for each trial (Hartigan & Hartigan, 1985). The independent variables were the interval (1, 2, 3 ms; SMIDIBT route test only; see Footnote 1) and the device model (six levels: LogiLink MIDI–USB, M-Audio UNO, Roland UM-ONE, Labway Soundboard D66, Sound Blaster Live, TerraTec TT Solo 1-NL). The former three models were MIDI–USB interfaces, and the latter three were PCI devices. Following the suggestions of Finney (2001), I hypothesized that the three MIDI–PCI cards would produce lower and less variable latencies than the MIDI–USB interfaces. I also hypothesized that the latencies of the three MIDI–PCI cards would follow a unimodal distribution, whereas those of the MIDI–USB devices would follow a multimodal distribution if they poll (or a unimodal distribution if they do not poll). I further hypothesized that the MIDI messages would be read and sent within 1 ms.
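The grouped variability measure can be sketched as follows. The text does not state whether the groups of 40 overlap, which SD estimator was used, or how leftover events were handled, so this sketch assumes non-overlapping groups, the sample SD (n − 1 denominator), and that a trailing partial group is dropped. The function name is hypothetical.

```python
import statistics


def windowed_sds(latencies, window=40):
    """SD of each non-overlapping group of `window` consecutive latencies.

    Assumptions (not stated in the source): groups do not overlap, the
    sample SD is used, and an incomplete final group is discarded.
    """
    sds = []
    for start in range(0, len(latencies) - window + 1, window):
        sds.append(statistics.stdev(latencies[start:start + window]))
    return sds


# Hypothetical latencies alternating between 0 and 1 ms, 4,002 events long.
example = [float(i % 2) for i in range(4002)]
example_sds = windowed_sds(example)
```

Under these assumptions, a trial of 4,002 events yields 100 complete groups, with the last 2 events left over.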
Statistical analyses
Since there were unequal variances between devices, the data were analyzed using linear mixed-effects models (LMEM) with the fixed factors device and interval and a random effect of repetition (20 levels), and unequal variances were permitted across the levels of the device factor. The model was fit using the lme function from the nlme library (Pinheiro, Bates, DebRoy, Sarkar, & R Core Team, 2015) for the R environment for statistical computing (R Core Team, 2013), and unequal variances were implemented using the varIdent model formula term. Pair-wise contrasts were computed using generalized linear hypothesis testing for Tukey contrasts, using the glht function in the multcomp library (Hothorn et al., 2008). The LMEM was used to analyze the dependent variables latency and variability. Bayes factors were calculated to quantify the evidence for accepting the null hypothesis (values less than 1) or rejecting it (values greater than 1; Rouder, Speckman, Sun, Morey, & Iverson, 2009). The Bayes factor is an odds ratio, but I adopt the nomenclature used by Jeffreys (1961) that values around 1 suggest “no evidence,” values between 1 and 3 suggest “anecdotal evidence,” those between 3 and 10 suggest “substantial evidence,” those between 10 and 30 suggest “strong evidence,” those between 30 and 100 suggest “very strong evidence,” and those over 100 suggest “decisive evidence.” Moreover, I represent evidence for the alternative hypothesis as BFHA, and evidence for the null hypothesis as BFH0 (i.e., 1/BF). The Bayes factor was computed using the ttestBF function in the BayesFactor library (Morey, Rouder, & Jamil, 2009).
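The evidence categories and the BFH0 = 1/BF convention can be summarized in a short sketch. The band labels follow the list above, with the 30 to 100 band labeled “very strong” as in the usual reading of Jeffreys’s scale. The treatment of values falling exactly on a boundary is an assumption, and the function names are hypothetical.

```python
def evidence_label(bf):
    """Label a Bayes factor >= 1 that is expressed in its favored direction.

    Boundary values are assigned to the lower band (an assumption).
    """
    if bf < 1:
        raise ValueError("take 1/BF so that the Bayes factor is >= 1")
    if bf == 1:
        return "no evidence"
    if bf <= 3:
        return "anecdotal"
    if bf <= 10:
        return "substantial"
    if bf <= 30:
        return "strong"
    if bf <= 100:
        return "very strong"
    return "decisive"


def describe(bf_alt):
    """Return (favored hypothesis, Bayes factor, label) for a Bayes factor
    given in favor of the alternative hypothesis."""
    if bf_alt >= 1:
        return ("HA", bf_alt, evidence_label(bf_alt))
    bf_null = 1.0 / bf_alt  # BFH0 = 1/BF
    return ("H0", bf_null, evidence_label(bf_null))
```

For example, `describe(0.5)` returns evidence for H0 with BFH0 = 2.0, which falls in the “anecdotal” band.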
Multimodality of latency distributions was assessed using Hartigan’s dip test for unimodality (Hartigan & Hartigan, 1985), where values lower than .05 indicate unimodality and values greater than .05 indicate multimodality. The FTAP loop test only measures latencies to the nearest millisecond and, with the resulting range of 0 to 3 ms, this provided up to four bins per condition, thus failing to meet the recommended number of bins, 1 + log2(N), or about 13 bins for the 4,002 data points produced in each loop test (see Sturges, 1926). Since Hartigan’s dip test could not be performed on the rounded raw data, a resampling method was employed whereby random numbers drawn from a uniform distribution between −.49 and .49 were added to the raw data, and any resulting values less than zero were made positive. This method was chosen because it reflects both the possible latency values prior to rounding and the mean delay between output scheduling calls reported by FTAP during the FTAP loop test (0.49 ms). Moreover, a send/receive delay of zero is both theoretically and practically impossible given that MIDI messages require approximately 1 ms to be transmitted (Kierley, 1991), as demonstrated in Experiment 1. A uniform distribution was used because a normal distribution would have favored a multimodal distribution. Thus, the use of a uniform distribution is more conservative given the hypothesis of multimodality for the MIDI–USB devices. Five different random distributions were applied to the latencies of the six devices for each of the 20 repetitions, and the Hartigan’s dip test statistic was calculated for each repetition by random distribution combination using the dip.test function in the diptest library (Maechler, 2015). The resulting distribution statistics were compared to the critical value of .05 using one-sample two-tailed t tests.
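The bin-count check and the resampling step can be sketched as follows. This is only an illustration of the jittering procedure described above, not the original analysis script: the function names, the seed, and the example latencies are hypothetical, and the dip statistic itself is not computed here (the study used dip.test from the R diptest library).

```python
import math
import random


def sturges_bins(n):
    """Recommended number of histogram bins, 1 + log2(N) (Sturges, 1926)."""
    return 1 + math.log2(n)


def jitter(latencies_ms, half_width=0.49, rng=None):
    """Add uniform noise in [-half_width, half_width] to each rounded
    latency and make any negative result positive."""
    rng = rng or random.Random()
    return [abs(x + rng.uniform(-half_width, half_width))
            for x in latencies_ms]


# Hypothetical rounded latencies spanning the 0 to 3 ms range.
rounded = [0, 1, 1, 2, 3, 1, 0, 2]
resampled = jitter(rounded, rng=random.Random(1))  # seed chosen arbitrarily
bins_needed = sturges_bins(4002)
```

With N = 4,002, the rule gives roughly 12.97, or about 13 bins, compared with the four bins available from the rounded data.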
Results
SMIDIBT route test
In the 1-ms condition, the LogiLink device was unable to complete any trial without an error, and the M-Audio demonstrated a constant drift, with transit durations monotonically increasing in each trial. For these reasons, two LMEM analyses were conducted: one with the 1-ms interval condition removed and another with the LogiLink device removed. For latencies, the interaction was significant both when excluding the 1-ms interval [F(5, 480240) = 941.97, p < .001, ηG2 = .99] and when excluding the LogiLink device [F(5, 480240) = 941.97, p < .001, ηG2 = .99], so I proceeded with pair-wise comparisons. All conditions were significantly different from each other, with the exception of the following: The TerraTec and Labway were not significantly different for the 2-ms and 3-ms intervals, and there were no significant differences between the 2-ms and 3-ms intervals for the LogiLink and Sound Blaster. As is shown in Fig. 4, the 1-ms interval produced higher latencies than the 2-ms and 3-ms intervals for all devices (ps < .001). The 2-ms interval only produced significantly larger latencies than the 3-ms interval for the M-Audio (p < .001) and Roland UM-ONE (p = .03). For the 1-ms interval (i.e., high information load), the Sound Blaster had the lowest latency, followed by the Labway, TerraTec, Roland UM-ONE, and M-Audio (ps < .001). For both the 2-ms and 3-ms intervals, the Labway and TerraTec had the lowest latencies (ps = 1), followed by the Sound Blaster, M-Audio, LogiLink, and Roland UM-ONE (ps < .001). These results support the hypothesis that MIDI–PCI interfaces have significantly lower latencies than MIDI–USB interfaces. Finally, to test whether 1-ms latencies could be achieved, the mean baseline value for each interval was subtracted from each latency within that interval, and one-sample, two-tailed t tests were performed against the test value 1. For all devices and intervals, the latencies were significantly greater than 1 ms (ps < .001, dfs = 19).
These results fail to support the hypothesis that 1-ms performance is achievable for either MIDI–PCI or MIDI–USB interfaces. For descriptive statistics related to the route test, see Appendix 3.
For variability, two LMEM analyses were conducted, separately excluding the 1-ms interval and then excluding the LogiLink (as was performed for latencies). The LMEM with the 1-ms interval excluded only showed a significant main effect of device [F(5, 85) = 3587.47, p < .001, ηG2 = .988], whereas the LMEM with the LogiLink excluded demonstrated significant main effects of device and interval (ps < .001, ηG2s > .99), as well as a significant two-way interaction [F(8, 280) = 19,823, p < .001, ηG2 = .998]. On the basis of these results, I proceeded with planned comparisons only between the 1-ms interval and the 2-ms and 3-ms intervals for each device and between devices. Only the Labway, TerraTec, and M-Audio had significantly greater variability for the 1-ms interval compared to the 3-ms interval (ps < .04), and only the TerraTec and M-Audio had significantly greater variability for the 1-ms interval compared to the 2-ms interval (ps < .001; all other ps > .99). As is shown in Fig. 4 (bottom panel), for the 1-ms interval, the PCI interfaces (Sound Blaster, Labway, and TerraTec) all had the least variability (ps > .10), followed by the Roland UM-ONE (ps < .001), whereas the M-Audio was the most variable (ps < .001). For the 2-ms and 3-ms intervals, the Roland UM-ONE was significantly more variable than all other devices (p < .001), and no other devices demonstrated significant differences (ps > .99). These results partially support the hypothesis that MIDI–PCI interfaces are less variable than MIDI–USB interfaces: MIDI–PCI interfaces are only less variable than MIDI–USB interfaces when under high information load, that is, in the 1-ms interval condition.
FTAP loop test
For FTAP loop test latencies, there was a significant main effect of device [F(5, 480240) = 941.97, p < .001, ηG2 = .99]. Pair-wise comparisons yielded significant differences between devices (ps < .001), except between the M-Audio UNO and the three MIDI–PCI devices (ps > .83) and among the three PCI devices (ps > .86). As is shown in Table 2, the three PCI devices and the M-Audio UNO were the fastest, followed by the Roland UM-ONE, and the LogiLink MIDI–USB was the slowest. Bayes factor t tests between the M-Audio UNO and PCI devices suggested substantial evidence for the null hypothesis for the Sound Blaster Live card (BFH0 = 8.12), and decisive evidence for the null hypothesis for the TerraTec TT and Labway Soundboard (BFH0s > 245.5). Bayes factor t tests between the three PCI devices revealed strong evidence for the null hypothesis between the Labway and Sound Blaster Live (BFH0 = 17.54) and between the Sound Blaster Live and TerraTec (BFH0 = 20.61), and decisive evidence for the null hypothesis between the Labway and TerraTec (BFH0 = 249.83).
Table 2 Descriptive statistics for FTAP loop latency, variability, and Hartigan’s D statistic
For variability, I observed a significant main effect of device [F(5, 120) = 7,865.6, p < .001, ηG2 = .99]. Pair-wise comparisons yielded significant differences between devices (ps < .001), with the exception of the three PCI devices (ps > .71). As is shown in Table 2, the Sound Blaster Live card was the least variable, then the Labway and TerraTec, then the M-Audio UNO and the Roland UM-ONE, and the LogiLink was the most variable. Bayes factor t tests between the three PCI devices revealed anecdotal evidence for the null hypothesis between the devices (BFH0s < 2.65).
Regarding unimodality, all three PCI cards demonstrated Hartigan’s distribution statistics that were significantly less than the critical value for multimodality (.05) [ts < −397.4, ps < .001], indicating a unimodal distribution (see Fig. 5). Conversely, all MIDI–USB devices showed Hartigan’s distribution statistics that were significantly greater than the critical value for multimodality (.05) [ts > 31.9, ps < .001], indicating multimodal distributions. Note that none of the resampling methods produced values approaching .05 for the PCI devices, and that all of the resamples were greater than .05 for the MIDI–USB devices (see Fig. 5).
Discussion
The results of Experiment 2 indicated that the three MIDI–PCI cards have lower latencies than the MIDI–USB interfaces (except for the M-Audio UNO in the FTAP loop test). Moreover, the MIDI–PCI cards were less variable than the MIDI–USB interfaces, but only when placed under high information load (i.e., the 1-ms interval of the SMIDIBT route test and in the FTAP loop test). I also found support for the hypothesis that the MIDI–USB interfaces poll, as suggested by the multimodal distributions of the latencies of these devices and further corroborated by the lack of polling for the PCI cards. These results support the statements of Finney (2001) suggesting that MIDI–PCI cards produce smaller latencies than alternative options. Polling of the MIDI–USB devices could be a driving factor of the poorer performance of these devices as compared to MIDI–PCI devices. In the SMIDIBT route test, latencies were significantly above the 1-ms resolution reported by Finney (2001), regardless of the interval or interface type. In the FTAP loop test, the average sub-millisecond latencies reported by Finney were replicated here for the MIDI–PCI cards, and also for the M-Audio UNO. However, the latency values for the MIDI–PCI cards ranged from 0 to 2 ms, the lower limit being theoretically impossible, given that serial MIDI messages are not instantaneous (Kierley, 1991), and the upper limit remaining above the within-1-ms resolution that is sometimes assumed for MIDI interfaces. Thus, even nonpolling MIDI–PCI devices do not consistently produce sub-millisecond latencies when recording responses, and MIDI–PCI interfaces are only significantly less variable than MIDI–USB interfaces under high information load.
The present study only involved FTAP software because it had been found to produce lower and less variable latencies than Max/MSP in a previous experiment (Schultz & van Vugt, 2016). Other programs are available (e.g., Max/MSP, Python) with which one could make custom scripts for parsing MIDI messages. Since there are no standard scripts for parsing MIDI messages using these programs, there could be variations in the latencies produced by the different scripts. Future studies could use the SMIDIBT to examine differences between custom scripts and various software packages used to parse MIDI messages. Moreover, the effect of different computer operating systems on MIDI latencies could be benchmarked using cross-platform software, such as Max/MSP (available for Windows or Macintosh OS) or Python (available for Windows, Macintosh, or Linux OS).