A noninvasive swallowing measurement system using a combination of respiratory flow, swallowing sound, and laryngeal motion
The assessment of swallowing function is important for the prevention of aspiration pneumonia. We developed a new swallowing monitoring system that uses respiratory flow, swallowing sound, and laryngeal motion. We applied this device to 11 healthy volunteers and 10 patients with dysphagia. Videofluoroscopy (VF) was conducted simultaneously with swallowing monitoring using our device. We measured laryngeal rising time (LRT), the time required for the larynx to elevate to the highest position, and laryngeal activation duration (LAD), the duration between the onset of rapid laryngeal elevation and the time when the larynx returned to the lowest position. In addition, we evaluated the coordination between swallowing and breathing. We found that LAD was correlated with a VF-derived parameter, pharyngeal response duration (PRD) in healthy subjects (LAD: 959 ± 259 ms vs. PRD: 1062 ± 149 ms, r = 0.60); however, this correlation was not found in the dysphagia patients. LRT was significantly prolonged in patients (healthy subjects: 320 ± 175 ms vs. patients: 465 ± 295 ms, P < 0.001, t test). Furthermore, frequency of swallowing immediately after inspiration was significantly increased in patients. Therefore, the new device may facilitate the assessment of some aspects of swallowing dysfunction.
Keywords: Swallowing, Dysphagia, Deglutition apnea, Coordination between swallowing and breathing
Dysphagia, or swallowing difficulty, is a condition in which the swallowing process is disrupted and eating ability is impaired. Patients with dysphagia can be at higher risk of pulmonary aspiration and subsequent aspiration pneumonia. According to a 2012 WHO report, pneumonia ranked third among causes of death worldwide (World Health Organization, Fact sheets of Media Centre, The top 10 causes of death, http://www.who.int/mediacentre/factsheets/fs310/en/), and the majority of pneumonia cases in the elderly population are associated with aspiration. Swallowing abnormality may also contribute to exacerbations of pulmonary diseases [4, 10, 13, 24, 33, 43, 44]. Aspiration pneumonia frequently recurs if the underlying swallowing problems have not been properly treated. Therefore, the assessment of swallowing function and early intervention are critical for preventing the occurrence and recurrence of aspiration pneumonia.
There are several assessment methods that can be applied to evaluate patients with dysphagia. The two widely used bedside swallow assessment tests, the repetitive saliva swallowing test (RSST) and the modified water swallowing test (MWST), lack quantitative analyses. Currently, videofluoroscopy (VF) and videoendoscopy (VE) are considered the gold standards for evaluating dysphagia. However, VF cannot be conducted frequently since X-ray exposure may endanger the health of patients as well as medical staff. In addition, VF cannot be done outside a medical facility. VE is more portable than VF; however, specifically trained medical doctors (or dentists) must be available to interpret the findings on site. Swallowing sound and motion analyses are alternative swallowing assessment techniques [1, 5, 22, 34]. Here, ‘motion’ refers not to the motion of the vocal cords but to the elevation of the larynx, which causes the epiglottis to move downward and cover the airway during swallowing. Although they are safe, relatively simple, and easily repeatable, these techniques need to process acoustic or kinetic signals obtained by specially designed sensor devices, typically a laryngeal microphone or an accelerometer. Therefore, many researchers have developed algorithms to process these signals for the assessment of swallowing function [8, 22, 27, 36, 42]. To date, however, there is still no sufficiently accurate and efficient way to objectively monitor swallowing behavior in typical daily life environments. Therefore, we have devised a swallowing monitoring system that utilizes a combination of respiratory, acoustic, and kinetic signals for more integrated monitoring and assessment of swallowing function. The rationale for the use of respiratory information is twofold. First, it serves as a good marker to detect swallows, since respiratory flow stops during swallowing (deglutition apnea). Second, it enables assessment of the coordination between swallowing and breathing, an important airway protection mechanism.
This paper mainly focuses on the method by which the new swallowing monitoring system detects swallowing events and assesses swallowing function from the collected signal components, i.e., respiratory flow, swallowing sound, and laryngeal motion. This paper extends the previous work by Yagi et al. [47].
To evaluate the efficiency and effectiveness of the newly developed monitoring system, we performed VF simultaneously with monitoring by our system in both volunteer subjects and patients with dysphagia. We then compared the results obtained by our system with those obtained by VF. We found that the newly developed method is able to accurately detect swallowing events and yield quantitative indices that may facilitate the assessment of some aspects of swallowing dysfunction.
2 Materials and methods
2.1 Recorded components
2.2 Experimental protocol
Test food texture
5104 ± 354 | 14 ± 11 | 0.262 ± 0.015
11,618 ± 846 | 10 ± 1 | 0.430 ± 0.060
451 ± 16 | 83 ± 9 | 0.862 ± 0.015
4682 ± 247 | 40 ± 11 | 0.246 ± 0.021
11,414 ± 596 | 24 ± 6 | 0.292 ± 0.017
476 ± 19 | 74 ± 7 | 0.808 ± 0.012
2.3 Respiratory flow component processing
2.4 Sound component processing
The sound data during respiratory cessation periods were retrieved, and the percent power in the 500–2300 Hz frequency band was calculated for each sound epoch. If this percent power was less than 20 %, the epoch was judged unlikely to be a swallow, given the sensor characteristics. The sound signal was then decomposed into pulses to obtain two parameters, the number of pulses and the maximal pulse width (Fig. 3b), from which swallowing sounds were discriminated. If the number of pulses was greater than 20, or if the maximal pulse width was greater than 40 ms, the sound was considered to be an artifact or noise.
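The band-power criterion can be illustrated with a short sketch. It is a simplification under stated assumptions: the text refers to a mel-scale spectrogram, whereas the code below uses a plain discrete Fourier transform of a single epoch. The function name `band_power_percent`, the sampling rate, and the test tones are hypothetical.

```python
import cmath
import math

def band_power_percent(samples, fs, f_lo=500.0, f_hi=2300.0):
    """Percent of non-DC spectral power lying within [f_lo, f_hi] Hz.

    Assumption: a plain DFT stands in for the mel-scale spectrogram.
    """
    n = len(samples)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2 + 1):            # skip the DC bin (k = 0)
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        total += power
        if f_lo <= k * fs / n <= f_hi:
            in_band += power
    return 100.0 * in_band / total if total > 0 else 0.0

fs = 8000                                      # hypothetical sampling rate (Hz)
tone_1k = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(400)]
tone_100 = [math.sin(2 * math.pi * 100 * i / fs) for i in range(400)]
pct_1k = band_power_percent(tone_1k, fs)       # tone inside the band
pct_100 = band_power_percent(tone_100, fs)     # tone below the band
```

With the 20 % threshold from the text, the 1000 Hz tone would pass the criterion and the 100 Hz tone would be rejected.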
A normal swallowing sound typically consists of three sound components. The first sound (Sound I) and the third sound (Sound III) are not always detected; however, the second component (Sound II) is consistently and most remarkably audible among the three swallowing sound components. Therefore, the program searches for the time point of the most prominent sound power peak within each respiratory cessation period to identify the possible Sound II (Fig. 2).
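The search for Sound II can be sketched as locating the window of maximal short-time power inside a respiratory cessation period. The 10 ms window length, the function names, and the synthetic signal are assumptions made for illustration; the text does not specify how sound power is computed.

```python
def short_time_power(samples, n):
    """Mean squared amplitude in consecutive non-overlapping windows of n samples."""
    return [sum(x * x for x in samples[s:s + n]) / n
            for s in range(0, len(samples) - n + 1, n)]

def find_sound_ii(samples, fs, t_start, t_end, win_s=0.010):
    """Time (s) of the centre of the most powerful window within [t_start, t_end]."""
    i0 = int(round(t_start * fs))
    i1 = int(round(t_end * fs))
    n = max(1, int(round(win_s * fs)))         # assumed window length
    powers = short_time_power(samples[i0:i1], n)
    if not powers:
        return None
    k = max(range(len(powers)), key=powers.__getitem__)
    return t_start + (k * n + n / 2) / fs

# Synthetic record: silence with a 10 ms burst starting at 1.200 s.
fs = 1000
signal = [0.0] * 2000
for i in range(1200, 1210):
    signal[i] = 1.0
t_sound_ii = find_sound_ii(signal, fs, t_start=1.0, t_end=1.5)
```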
2.5 Detection of swallowing candidate periods
First, respiratory cessation periods (>0.35 s) are extracted (Step 1). If an extracted period contains sound whose intensity is greater than a certain threshold (e.g., noise level + 2 × standard deviation), the algorithm proceeds to the next step (Step 2). In the next step, sound characteristics are analyzed as described above. If the pulse count is <20, the maximal pulse width is <40 ms, the mel-scale spectrogram %power within the 500–2300 Hz frequency band is >20 %, and the sound is associated with laryngeal motion of amplitude >5 % of the maximal amplitude within the entire record, then the extracted period is registered as a swallowing candidate period.
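The decision rule described in this section can be written as a small predicate. This is a minimal sketch, assuming the per-period features have already been extracted; the type `CessationPeriod`, its field names, and the noise values are hypothetical. The pulse count, pulse width, and %power of the first example are the healthy-subject means reported in Sect. 3.2. A pulse count of exactly 20 is treated as a non-candidate here, following the strict inequality in this section.

```python
from typing import NamedTuple

MIN_APNEA_S = 0.35            # Step 1: minimum respiratory cessation length
MAX_PULSE_COUNT = 20          # pulse count must be below this value
MAX_PULSE_WIDTH_MS = 40.0     # maximal pulse width must be below this value
MIN_BAND_POWER_PCT = 20.0     # %power in 500-2300 Hz must exceed this value
MIN_MOTION_FRACTION = 0.05    # motion amplitude relative to the record maximum

class CessationPeriod(NamedTuple):
    duration_s: float
    sound_peak: float
    pulse_count: int
    max_pulse_width_ms: float
    band_power_pct: float
    motion_amplitude: float

def is_swallow_candidate(p, noise_level, noise_sd, max_motion_amplitude):
    """Apply Step 1, Step 2, and the sound/motion criteria in order."""
    if p.duration_s <= MIN_APNEA_S:
        return False
    if p.sound_peak <= noise_level + 2.0 * noise_sd:
        return False
    sound_ok = (p.pulse_count < MAX_PULSE_COUNT
                and p.max_pulse_width_ms < MAX_PULSE_WIDTH_MS
                and p.band_power_pct > MIN_BAND_POWER_PCT)
    motion_ok = p.motion_amplitude > MIN_MOTION_FRACTION * max_motion_amplitude
    return sound_ok and motion_ok

swallow = CessationPeriod(1.2, 0.8, 6, 8.1, 71.1, 0.6)
speech = CessationPeriod(1.2, 0.8, 35, 8.1, 71.1, 0.6)      # too many pulses
short_pause = CessationPeriod(0.2, 0.8, 6, 8.1, 71.1, 0.6)  # apnea too short
results = [is_swallow_candidate(p, noise_level=0.1, noise_sd=0.05,
                                max_motion_amplitude=1.0)
           for p in (swallow, speech, short_pause)]
```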
2.6 Laryngeal motion component processing
2.7 Laryngeal rising time (LRT)
Within each identified respiratory cessation period, the program first searches for the time point (P) at which the sensor output (LM) reaches the highest peak. Due to the differential characteristics of the piezoelectric sensor, this corresponds to the instant when the laryngeal elevation speed becomes maximal. Since the time point P corresponds to the onset of pharyngeal swallow, which usually occurs after the onset of respiratory cessation, and Sound II is associated with bolus transit, we limited the possible position of P to the range between (the onset of deglutition apnea − 200 ms) and (Sound II + 200 ms). We then defined P as the local maximum within this range (Fig. 5a).
Next, the program searches backward for the zero-cross point to estimate the start point of LRT (T1). If the integrated laryngeal motion signal (ILM) at this zero-cross point has a positive value, the backward search is continued until a local minimum of ILM with a negative value is found. The program then looks for the time point M where the larynx reaches the maximal elevation (Fig. 5b). M corresponds to the first zero-cross point in LM searched forward from P. Finally, LRT is calculated as the duration between T1 and M. Because the swallowing patterns recorded from subjects with different swallowing functions and postures vary in structure, several types of ILM patterns can be observed (Fig. 6). Therefore, if LRT is less than 45 ms, the program finds the second highest LM peak within the range between (the onset of deglutition apnea − 200 ms) and (Sound II + 200 ms) and repeats the LRT calculation until LRT > 45 ms.
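The search for P, T1, and M can be sketched as follows. This is one possible reading of the procedure, not the original program: the backward zero-cross is interpreted as an upward zero-crossing of LM (equivalently, a local minimum of ILM), ILM is taken as the running sum of LM, and all function names and the synthetic signal are hypothetical.

```python
import math
from itertools import accumulate

MIN_LRT_S = 0.045     # LRT shorter than this triggers the next-highest peak
MARGIN_S = 0.200      # search margin around apnea onset and Sound II

def local_maxima(lm, lo, hi):
    """Indices of local maxima of lm within [lo, hi]."""
    return [i for i in range(max(lo, 1), min(hi, len(lm) - 2) + 1)
            if lm[i] > lm[i - 1] and lm[i] >= lm[i + 1]]

def find_t1(lm, ilm, p):
    """Search backward from p for an upward zero-cross of LM where ILM <= 0."""
    for i in range(p, 0, -1):
        if lm[i - 1] <= 0 < lm[i] and ilm[i - 1] <= 0:
            return i - 1
    return None

def find_m(lm, p):
    """First index after p at which LM falls to zero or below."""
    for j in range(p + 1, len(lm)):
        if lm[j] <= 0:
            return j
    return None

def laryngeal_rising_time(lm, fs, apnea_onset_s, sound_ii_s):
    """Return (LRT in s, P, T1, M) as sample indices, or None if not found."""
    ilm = [s / fs for s in accumulate(lm)]           # integrated LM
    lo = int(round((apnea_onset_s - MARGIN_S) * fs))
    hi = int(round((sound_ii_s + MARGIN_S) * fs))
    peaks = sorted(local_maxima(lm, lo, hi), key=lambda i: lm[i], reverse=True)
    for p in peaks:                                  # highest peak first
        t1, m = find_t1(lm, ilm, p), find_m(lm, p)
        if t1 is None or m is None:
            continue
        lrt = (m - t1) / fs
        if lrt > MIN_LRT_S:
            return lrt, p, t1, m
    return None

# Synthetic LM: small downward drift, 300 ms elevation lobe, 500 ms descent lobe.
fs = 1000
lm = ([0.0] * 100 + [-0.2] * 50
      + [math.sin(math.pi * (k + 0.5) / 300) for k in range(300)]
      + [-0.5 * math.sin(math.pi * (k + 0.5) / 500) for k in range(500)]
      + [0.0] * 250)
result = laryngeal_rising_time(lm, fs, apnea_onset_s=0.2, sound_ii_s=0.4)
```

In this synthetic record the elevation lobe spans samples 150 to 449, so T1 falls on the last non-positive sample before it and M on the first non-positive sample after it.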
2.8 Laryngeal activation duration (LAD)
We next define laryngeal activation duration (LAD) as the duration between the time point P and the time point at which the integrated sensor output reaches a trough (T2) during the descent of the larynx (Figs. 5 and 6). Since LAD represents the duration of the pharyngeal swallow, LAD should be greater than 500 ms; otherwise, the next trough is searched forward.
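The forward search for T2 can be sketched in a few lines, assuming that P and the integrated signal are already available. The function names and the piecewise-linear test signal are hypothetical; a trough is taken here to be a local minimum of the integrated signal.

```python
MIN_LAD_S = 0.500     # troughs closer than this to P are skipped

def laryngeal_activation_duration(ilm, fs, p):
    """Return (LAD in s, T2 index): first trough of ilm more than 500 ms after p."""
    for i in range(p + 1, len(ilm) - 1):
        is_trough = ilm[i] < ilm[i - 1] and ilm[i] <= ilm[i + 1]
        if is_trough and (i - p) / fs > MIN_LAD_S:
            return (i - p) / fs, i
    return None

def piecewise(points):
    """Linear interpolation through (index, value) breakpoints."""
    out = []
    for (i0, v0), (i1, v1) in zip(points, points[1:]):
        out += [v0 + (v1 - v0) * k / (i1 - i0) for k in range(i1 - i0)]
    out.append(points[-1][1])
    return out

# Synthetic integrated signal sampled at 100 Hz with P at index 10:
# an early shallow trough 300 ms after P and the final trough 800 ms after P.
ilm = piecewise([(0, 0.0), (10, 0.0), (30, 2.0), (40, 1.0),
                 (50, 1.5), (90, -0.5), (120, 0.0)])
lad_s, t2 = laryngeal_activation_duration(ilm, fs=100, p=10)
```

The early trough is skipped because it lies less than 500 ms after P, which mirrors the rule that the next trough is searched forward.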
We set minimal values for LRT and LAD, since ILM sometimes displayed zero-cross activity patterns (Fig. 6b, c). Zero-cross activity patterns were often observed in association with extension and flexion of the subject's head, since such swallowing maneuvers cause the piezoelectric sensor to bend and generate a signal that overlaps with the laryngeal motion signal.
2.9 Swallowing simulator
2.10 Videofluorographic (VF) measurements
Swallowing function is often assessed by temporal parameters measured using videofluoroscopic images. Among these parameters, we measured the pharyngeal response duration (PRD) and the laryngeal elevation delay time (LEDT). PRD reflects the dynamics of the hyoid bone during swallowing. The hyoid bone slowly elevates posteriorly before the initiation of the swallowing reflex and then rapidly starts moving forward upon initiation of the swallowing reflex (pharyngeal swallow) to elevate the larynx. When the suprahyoid muscles relax and the infrahyoid muscles are activated, the hyoid bone moves backward and downward to return to its original position, and the swallowing reflex is completed. PRD is defined as the duration between the beginning of forward movement and the end of backward and downward movement of the hyoid bone.
LEDT is the time difference between the time when the test food reaches the piriform recess and the time when the larynx reaches the highest position to complete laryngeal closure. An LEDT of >0.35 s indicates a risk of aspiration. The prolongation of LEDT may be caused by two independent factors: a delay of the initiation of the swallowing reflex and a decrease in laryngeal elevation speed. Since LRT reflects the laryngeal elevation speed, we sought to clarify the relationship between LRT and LEDT.
For spatial measurements, the Y-axis was defined as the line connecting the anterior–superior edge of C3 and the anterior–inferior edge of C5, and the X-axis was defined as the line perpendicular to the Y-axis. Trajectories of the larynx and the hyoid bone were measured by tracking the vocal cord and the anterior ridge of the hyoid bone using a two-dimensional motion analysis software (Move-tr/2D, Library Co. Ltd., Japan).
To compensate for movement of the body, the X–Y coordinates of the anterior–inferior edge of the C4 vertebral body were also measured, which served as the anchor point. Then, the anterior and vertical displacements of the hyoid bone were calculated according to the method described by Kim and McCullough.
Occurrence rates of specific coordination patterns between swallowing and breathing in healthy subjects and in patients with dysphagia were compared using the chi-square test. The swallowing characteristics of different food textures/levels were tested using ANOVA. Correlations between the parameters derived from the new device and VF-derived parameters were evaluated by Pearson’s correlation coefficient. LRT and LAD values in healthy subjects and patients were compared using the unpaired t test, with all data presented as mean ± standard deviation. P values were two-sided, and P < 0.05 was considered statistically significant. Statistical analyses were performed using JMP Pro version 12 (SAS Institute Inc.).
3.1 Accuracy of semiautomatic swallowing detection
The accuracy of semiautomatic swallowing detection was assessed using data from 7 healthy subjects, for whom two speech therapists judged swallow candidates. First, swallowing candidate periods were automatically extracted using the algorithm described in Methods. The automatic detection algorithm picked up 94 swallow candidate periods from 7 subjects, which included 55 test food swallowing periods (confirmed by the timing coincident with foot switch signals) and 39 additional dry (saliva) swallowing candidate periods. Since each subject swallowed test foods eight times, the sensitivity of the automatic swallowing detection algorithm with regard to test food swallows was 55/(7 × 8) = 0.982.
At this point, the additional swallowing candidate periods contained false-positive detections (non-swallowing sounds) due to environmental noises, e.g., speech, motion artifacts, and electrical interference. Subsequently, the sound within these respiratory cessation periods (swallowing candidate periods) was played back, and two speech therapists independently judged whether the sound and the laryngeal motion (Fig. 5b) were compatible with a swallow. Each speech therapist judged 28 candidates as dry swallows (true positives) and 11 candidates as non-swallowing sounds (false positives). Their judgments matched perfectly, and thus Cohen’s kappa coefficient was 1.0. Therefore, the specificity of the automatic swallowing detection algorithm was (94 − 11)/94 = 0.883. When we used only the laryngeal motion characteristics to detect swallows, the sensitivity with regard to test food swallows was 1.0, and the specificity was 0.712.
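The two ratios reported above follow directly from the stated counts, as the short calculation below shows. The variable names are illustrative only. The second ratio is computed exactly as written in the text, i.e., as the fraction of automatically extracted candidates that the raters confirmed as swallows, a quantity that is often called the positive predictive value.

```python
# Counts taken from the text.
subjects = 7
test_food_swallows_per_subject = 8
detected_test_food_swallows = 55
candidates = 94
false_positives = 11

# Sensitivity with regard to test food swallows: 55 / (7 x 8).
sensitivity = detected_test_food_swallows / (subjects * test_food_swallows_per_subject)

# Ratio reported as specificity in the text: (94 - 11) / 94.
confirmed_fraction = (candidates - false_positives) / candidates

summary = (round(sensitivity, 3), round(confirmed_fraction, 3))
```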
3.2 Characteristics of swallows
Normal swallows in healthy subjects were accompanied by deglutition apneas, the duration of which was 1441 ± 1152 ms (range 302–5834 ms). It is known that, in general, typical swallows occur during expiration and are followed by expiration (E–SW–E pattern; Fig. 2a). However, in healthy subjects, 4 of 98 swallows occurred during inspiration (I–SW pattern), and 5 of 98 swallows were followed by inspiration (SW–I pattern; exemplified in Fig. 2b). In patients with dysphagia, the duration of deglutition apneas was 2386 ± 2089 ms (range 375–11,599 ms), 7 of 46 swallows occurred during inspiration, and 6 of 46 swallows were followed by inspiration. The occurrence rate of I–SW pattern but not SW–I pattern was significantly increased in the patient group (Chi-squared test of proportion, I–SW: P = 0.019 and SW–I: P = 0.094).
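The comparison of pattern frequencies can be checked with a 2 × 2 chi-square calculation. The sketch below assumes a Pearson chi-square test without continuity correction, which the text does not state explicitly; under that assumption the counts above give P values that agree with the reported ones to three decimal places. The function name is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, two-sided P value with 1 degree of freedom).
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the survival function is erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))

# I-SW pattern: 4 of 98 swallows (healthy) vs. 7 of 46 swallows (patients).
chi2_isw, p_isw = chi_square_2x2(4, 98 - 4, 7, 46 - 7)

# SW-I pattern: 5 of 98 swallows (healthy) vs. 6 of 46 swallows (patients).
chi2_swi, p_swi = chi_square_2x2(5, 98 - 5, 6, 46 - 6)
```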
Normal swallowing sounds in healthy subjects consisted of 6 ± 4 pulses (range 1–20 pulses), the maximal pulse width of which was 8.1 ± 5.8 ms (range 2.4–33.3 ms), and mel-scale spectrogram %power within 500–2300 Hz frequency bands was 71.1 ± 21.9 % (range 21.5–97.3 %). In patients with dysphagia, swallowing sounds consisted of 7 ± 4 pulses (range 1–15 pulses), the maximal pulse width of which was 9.1 ± 7.4 ms (range 2.4–34.4 ms), and mel-scale spectrogram %power within 500–2300 Hz frequency bands was 57.3 ± 16.0 % (range 26.4–93.4 %).
3.3 Estimation of swallowing function
Following the semiautomatic swallowing detection, we estimated LRT and LAD and compared those with LEDT and PRD derived from videofluoroscopic image analysis. Since water did not contain a contrast agent, LEDT was not measured for water swallows.
3.4 Temporal relationships between swallowing sound, motion, and respiratory flow
In the present study, we developed a new swallowing monitoring system that uses respiratory flow, swallowing sound, and laryngeal motion. We found that LAD was moderately correlated with the VF-derived parameter, PRD; however, this correlation was not observed in patients with dysphagia, suggesting that the motion of the hyoid bone and that of the thyroid cartilage were uncoordinated in these patients. On the other hand, although LRT was not correlated with the comparable VF-derived parameter LEDT, LRT was significantly prolonged in patients with dysphagia. Therefore, LRT may also be a useful parameter for detecting dysphagia. Furthermore, the frequency of the I–SW pattern was significantly increased in patients with dysphagia. These results suggest that the new device may facilitate the assessment of some aspects of swallowing dysfunction, as well as the detection of aspiration risk in specific patient populations.
4.1 Consideration of sound component
The origin of swallowing sound components remains controversial. Vice et al. described the three components of swallowing sound as follows: (1) the initial discrete sound, which corresponds to the period when the cricopharyngeus muscle opens, (2) the bolus transit sound, which corresponds to the passage of the bolus into the esophagus, and (3) the final discrete sound, which does not always occur. On the other hand, Sato et al. proposed that swallowing sound consists of three sound phases: (1) Sound phase I, which may be considered a closure sound of the epiglottis, (2) Sound phase II, a passage sound through the UES, and (3) Sound phase III, an opening sound of the epiglottis. More recently, Moriniere et al. identified three sound components according to the position of the bolus and the anatomic structure in movement: (1) the laryngeal ascension sound when the bolus is located in the oropharynx and/or hypopharynx, (2) the upper esophageal sphincter opening sound when the bolus goes through the sphincter, and (3) the laryngeal release sound when the bolus is located in the esophagus.
4.2 Correspondence between LAD and PRD
Videofluoroscopic studies have shown that coordinated neuromuscular activity of the mouth, pharynx, larynx and esophagus occurs during swallowing. During swallowing, the larynx is elevated by the contraction of suprahyoid muscles and the thyrohyoid muscle, and the epiglottis covers the laryngeal orifice for airway protection. Although these coordinated activities are generated by a reflex and thus stereotypic, the onset of the pharyngeal swallow is variable in its time of occurrence relative to the position of the bolus. This might be one of the reasons why in the present study LRT was poorly correlated with LEDT (Fig. 9a). In contrast, PRD, which does not depend on the position of the bolus, was moderately correlated with LAD (Fig. 9b). PRD is a temporal parameter associated with the motion of the hyoid bone and estimated from VF images, whereas LAD is a temporal parameter associated with the laryngeal motion, and it is estimated from the piezoelectric sensor signal. Although these two parameters are measured by different systems, they are defined to indicate the same temporal property, i.e., the duration between the onset and the offset of the pharyngeal swallow. The duration of the pharyngeal swallow can be one of the parameters defining the swallowing function, since, for example, the duration of the pharyngeal swallow is prolonged in COPD patients.
In the present study, we evaluated by simultaneous recordings whether the onset and the offset of the swallowing reflex, as measured by the two systems, match. The onset of PRD is the time point when the hyoid bone starts its rapid anterior and upward movement. This time point was coincident with, or slightly (−200 ms) delayed relative to the onset of LAD, which is defined as the time point when the upward laryngeal motion reaches the highest speed (Fig. 10). Since the hyoid bone moves by being pulled by the contraction of suprahyoid muscles, there may be a lag between the muscle contraction and the hyoid bone movement. Therefore, the piezoelectric sensor may detect the muscle contraction associated with the laryngeal elevation and consequently respond slightly earlier than the upward hyoid bone movement. Further, the slow frame rate (30 frames/s) may cause an additional time lag.
We defined the offset of PRD as the time point when the larynx returns to the resting position. We did not choose the time point when the hyoid bone returns to the onset position because the hyoid bone moves slightly upward and posteriorly before the onset of the swallowing reflex and therefore does not return to the onset position. Further, since the resting position of the larynx is determined by the balance between suprahyoid and infrahyoid muscle tones, and these muscle tones are modified by swallowing activity, it was sometimes difficult to judge whether the larynx had returned to the resting position. Indeed, it has been pointed out that the reliability of parameters tracked on videofluorographic images is poor. Therefore, we added an additional constraint that the thyroid cartilage should be at the locally lowest position when the larynx is at the resting position. As a result, the offset of PRD was coincident with, or slightly (−200 ms) delayed relative to the offset of LAD, which is defined as the time point when the integrated laryngeal motion signal reaches a local minimum (Figs. 5, 10). Considering that the inter-rater reliability of temporal VF parameter values, assessed by Cohen’s kappa coefficient, ranged between 0.35 and 0.46, we think that the value of the correlation between PRD and LAD in healthy subjects (r = 0.6) was reasonable. However, this correlation was disrupted in patients, suggesting that the linkage between the motion of the hyoid bone and that of the larynx is altered in dysphagic patients.
The speed at which food enters the pharynx depends on its texture. For instance, L2 and L3 foods enter the pharynx faster than L0 food, and thus, for dysphagic patients, L2 and L3 foods are more difficult to eat than L0 food. Swallowing characteristics change depending on the food texture, and the alteration is more marked in patients. In the present study, we observed that LRT values of patients were prolonged as compared to those of healthy subjects, and this prolongation was more marked for L2 and L3 foods. Therefore, the use of L2 or L3 foods may be preferable for distinguishing swallowing abnormality.
4.3 Coordination between swallowing and breathing
Swallowing and breathing share a common anatomical pathway in the pharynx. Therefore, the airway must be protected against aspiration by a sequence of laryngeal closure, and precise coordination between breathing and swallowing is controlled by neuronal networks in the medulla. This coordination is critical during swallowing, and its failure can lead to serious consequences. Normal swallowing most frequently occurs during the expiratory phase of the breathing cycle: the swallow interrupts exhalation, and breathing resumes with expiration after swallowing has been completed. However, in elderly persons, the chance of a swallow following inspiration and the chance of post-deglutitive respiration resuming with inspiration (not expiration) increase [17, 40]. A similar alteration of the coordination between swallowing and breathing occurs in patients with COPD and Parkinson’s disease, which may increase the risk of aspiration.
Martin et al. [20] reported that laryngeal elevation follows the onset of respiratory cessation by 0.19 ± 0.15 s for water swallows. We also observed that the onset of rapid laryngeal elevation associated with the swallowing reflex usually follows the onset of respiratory cessation; however, we found that a preparatory slow laryngeal elevation, during which the closure of the oropharynx occurs, is initiated before deglutition apnea (Fig. 5b). Furthermore, we observed that LRT was greater in dysphagic patients, suggesting that this preparatory slow laryngeal elevation is marked in patients. As to the relationship between laryngeal descent and the termination of respiratory cessation, Martin et al. [20] reported that expiration resumes 0.47 ± 0.44 s before the completion of laryngeal descent. We also observed that expiration resumed before the completion of laryngeal descent in a majority of the healthy subjects (Fig. 11a). However, such a phenomenon was less evident in the patients (Fig. 11b). The physiological significance of expiration before the completion of laryngeal descent remains unclear and requires further exploration.
The coordination between swallowing and breathing occurs by the interaction of central pattern generators (CPGs) for swallowing and breathing within the brainstem [7, 30]. Bautista et al. proposed that balanced synaptic interaction along the nucleus of the solitary tract (NTS)/Kölliker–Fuse (KF) nucleus axis is pivotal for effective swallowing/breathing coordination, and an imbalance of the synaptic interaction between and within NTS and KF may have an important role in the pathophysiology of swallowing disorders. In the present study, similar to the cases of COPD and Parkinson’s disease, the I–SW pattern was more frequently observed in patients with dysphagia. Thus, we suggest that the disordered coordination between swallowing and breathing may be a sensitive and early indicator of a functional abnormality of swallowing CPG and/or an interaction between swallowing and respiratory CPGs. In addition, altered properties of peripheral effector organs, e.g., lung function impairments, would profoundly affect the coordination between swallowing and breathing. Interestingly, CPAP improves respiration–swallowing coordination during sleep, and the improvement in respiratory–swallowing coordination results in favorable effects on airway protection and bolus clearance. Therefore, detection of discoordination between breathing and swallowing may lead to early intervention for asymptomatic patients to prevent aspiration.
Swallows can be viewed as external stimuli to the respiratory CPG. The respiratory rhythm is reset by a swallow, and the respiratory phase is shifted. The amount of shift depends on the timing of the swallow within the respiratory cycle. Such phase–response characteristics reflect the internal structure of the respiratory CPG as well as the properties of relay pathways and effector organs (diaphragm, lung, and chest wall) [29, 31]. Paydarfar et al. reported that, in the case of water swallowing, the interval between the swallowing event and the onset of inspiration is shortest when swallowing occurs at the E–I transition and longest when swallowing occurs at the I–E transition. Therefore, swallows at early expiration are the safest with regard to the risk of aspiration. We observed similar phase–response characteristics in both the healthy subjects and patients. The interval between swallowing and subsequent inspirations was highly variable for swallows which occurred near the I–E transition. This may result from the difference in food consistency. In the case of level 0 and level 2 test foods (jelly consistency), the interval tended to be prolonged (Fig. 8). Further study is necessary to elucidate factors altering the variation in the phase–response near the I–E transition.
In addition to the autonomic regulation, voluntary and behavioral controls by higher brain centers may affect the coordination between laryngeal motion and breathing activity. For instance, anticipation of the speed of bolus movement may advance or delay the onset of slow laryngeal ascension before the pharyngeal swallow, because depending on the food consistency, subjects can predict the speed of bolus passing through the oropharynx based on their experience. On the other hand, the residue awareness may delay the laryngeal relaxation to prepare for a dry swallowing, or to clear the residual food.
4.4 Technical considerations
In the present study, we adopted a semiautomatic swallowing detection method. We adopted a semiautomatic rather than a fully automatic detection method because the sensitivity must be almost 100 % for the clinical assessment of swallowing function. It was reported that the accuracy of fully automatic swallowing detection methods was 82–85 % [8, 39], and we also achieved a similar accuracy; however, it was not sufficient. Although the present study was done while the participants were awake, the device may be used to monitor swallowing while subjects are asleep. In a practical situation, several artifacts such as head movement, talking, snoring and electrical interference may further deteriorate the accuracy of swallowing activity detection. In addition, mouth breathing, often observed during the chew–swallow complex behavior, may blunt the respiratory flow signal captured by the nasal cannula-type flow sensor, thereby obscuring the expiratory flow. Therefore, we adopted the semiautomatic detection method to pick up all swallows. We found that the inter-rater variability judged from played back sound and displayed laryngeal motion was extremely small (kappa coefficient = 1.0).
We set the cutoff frequency at 100 Hz to divide sound and motion components. This cutoff frequency might be too high, because the low-frequency component contains both the laryngeal motion signal and a low-frequency (20–100 Hz) acoustic signal. Lee et al. reported that most of the signal energy measured by accelerometry is contained in the low frequencies, approximately below 100 Hz. However, they suggested that the accelerometry signal may be primarily attributed to a mechanical rather than acoustic phenomenon. On the other hand, swallowing sound typically contains a high-frequency component of greater than 750 Hz (see Fig. 1 in Sazonov et al.), and Sarraf-Shirazi et al. used the 100 Hz cutoff frequency to characterize the swallowing sounds recorded in the ear, nose, and on the trachea. Therefore, we assumed that this high (>100 Hz)-frequency component is essential in discriminating the swallowing sound from environmental noises. Indeed, speech therapists were able to accurately discriminate swallows by playing back piezoelectric signals above 100 Hz. Therefore, we think that the signal above 100 Hz captures important features of the swallowing sound; however, the cutoff frequency can be optimized by future development.
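The effect of a 100 Hz split can be illustrated with a simple complementary filter pair. This is only a sketch under stated assumptions: the text does not specify the filter type or order, so a single-pole low-pass filter is used here for the motion component and the residual is taken as the sound component. The function names, the sampling rate, and the test tones are hypothetical.

```python
import math

def split_motion_sound(x, fs, cutoff_hz=100.0):
    """Split x into a low-pass part (motion) and the residual (sound).

    Assumption: single-pole recursive low-pass filter; motion + sound equals x.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / fs
    alpha = dt / (rc + dt)
    motion, y = [], 0.0
    for sample in x:
        y += alpha * (sample - y)          # exponential smoothing step
        motion.append(y)
    sound = [s - m for s, m in zip(x, motion)]
    return motion, sound

def rms(values):
    """Root mean square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

fs = 8000                                   # hypothetical sampling rate (Hz)
slow = [math.sin(2 * math.pi * 5 * i / fs) for i in range(fs)]      # motion-like
fast = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(fs)]   # sound-like
slow_motion, slow_sound = split_motion_sound(slow, fs)
fast_motion, fast_sound = split_motion_sound(fast, fs)
```

With such a gentle filter, a 5 Hz component ends up almost entirely in the motion output and a 1000 Hz component mostly in the sound output, while components near 100 Hz are shared between the two, which is the overlap discussed above.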
4.5 Study limitation
The sample size in the present study is too small to draw firm conclusions. Further data collection from healthy subjects as well as patients with dysphagia is necessary to improve the detection algorithm and to define normal swallows. In addition, a fully automatic detection method, such as one using pattern recognition methods, needs to be developed in the future.
The coordination between breathing and swallowing is important in the detection of aspiration, although this study was not designed to investigate the effect of aspiration on the parameters. Further study is necessary to elucidate whether the discoordination between breathing and swallowing detects an aspiration event and/or predicts the risk of aspiration.
In this study, we proposed a novel sound, motion, and air flow recognition technique to detect swallowing events and assess the swallowing function. To our knowledge, this is the first bedside swallowing monitoring system that can assess the swallowing sound, the laryngeal motion during swallowing, and the coordination between swallowing and breathing in a systematic manner. With the device developed in the present study, swallowing activity is semiautomatically detected at a high sensitivity, and the quality of swallows can be assessed from various aspects, i.e., the duration of swallowing reflex, the timing of swallowing sound relative to the laryngeal motion, and the coordination between breathing and swallowing. Therefore, the new device may facilitate the assessment of some aspects of swallowing dysfunction, especially with respect to the coordination between swallowing and breathing.
We thank Prof. Akira Ishikawa of Kobe University and Dr. Hajime Takahashi, the President of Takahashi Hospital, for providing the opportunity to acquire videofluoroscopic data, Prof. Jun Kayashita and Dr. Yoshie Yamagata of Hiroshima Prefectural University for providing test foods, Hiroshi Ueno and Hiroyuki Takeda of J Craft Co., Ltd. for constructing the ‘swallowing simulator,’ Dr. Masako Kijima of Wakakusa rehabilitation hospital, Kayoko Ohtsuka, SLP of Yamato University and Yuri Katsuta, SLP of Wakakusa rehabilitation hospital for helping to obtain patient data, and Prof. Michiaki Mishima and Prof. Ryosuke Takahashi of Kyoto University for giving us critical comments. This work was supported by Grant-in-Aid for Scientific Research (C) 16K01546.
Compliance with ethical standards
Conflict of interest
This study has been conducted with funding support of Foodcare Co., Ltd. and J Craft Co., Ltd.
- 20. Martin BJ, Logemann JA, Shaker R, Dodds WJ (1994) Coordination between respiration and swallowing: respiratory phase relationships and temporal integration. J Appl Physiol 76:714–723
- 47. Yagi N, Takahashi R, Ueno H, Yabe T, Oke Y, Oku Y (2014) Swallow-monitoring system with acoustic analysis for dysphagia. In: Paper presented at the IEEE International Conference on Systems, Man and Cybernetics (SMC), San Diego, CA, 5–8 Oct
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.