1 Introduction

Asthma is a chronic recurring disease caused by an increase in airway sensitivity [1] that can affect both large and small airways [2, 3]. It is characterized by bronchial inflammation, which induces increased airway secretions, bronchial wall swelling, and smooth muscle contraction [4]. In 2019, asthma afflicted 262 million people globally and caused 455,000 deaths [5, 6]. Its incidence is increasing [7], a trend exacerbated by rising obesity, stress, mood disorders, medication use, and environmental exposure to pollen, pets, smoke, air pollution, and indoor allergens such as house dust mites [8,9,10]. Although asthma can present at any age, 30% of cases occur in the first year of life. Indeed, asthma is the most common chronic disease in children. The risk of developing asthma is higher in those with a family history of asthma [11].

Asthmatic symptoms include recurrent coughing, wheezing, and shortness of breath, which can be triggered by dust, smoke, odor, and pollen. Asthma may be caused by allergies or it can develop independently of allergies [12]. In the presence of frequent cough episodes, day/night coughing, chest tightness, a family history of asthma, and allergic symptoms, asthma should be considered. Spirometry and peak expiratory flow rate (PEFR) measurement are diagnostic tests that can also quantify the severity of airflow limitation caused by bronchial narrowing, which is proportional to the level of airway inflammation [13]. Serial portable peak flow meter measurements can be used to track changes in airway narrowing, which is particularly useful for managing asthma in children [14]. Because allergens can cause asthma, adjunctive allergy tests may be performed to identify culprit allergens [15]. Avoiding asthma triggers can help reduce the frequency of asthmatic attacks [16]. With contemporary medical care, asthma can generally be well controlled with inhaled medications, allowing patients to lead normal lives. However, underdiagnosis and under-treatment, which are more pervasive in low- and middle-income countries, are still responsible for residual risks, including mortality, among asthma patients.

While the clinical symptoms and signs of asthma are well-known, the physical examination may neither be sufficiently sensitive nor specific. Various machine learning models have been proposed to detect asthma automatically using recorded respiratory sounds. Haider and Behera [17] developed an automated method for detecting asthma and chronic obstructive pulmonary disease based on Hurst analysis, empirical mode decomposition, and spectral subtraction methods. Trained and tested on a dataset of lung sounds acquired from 80 normal, 80 asthmatic, and 80 chronic obstructive pulmonary disease subjects, the model attained 99.30% accuracy using a decision tree classifier. Kilic et al. [18] proposed a new machine learning model called global chaotic logistic pattern to discriminate asthma from other lung conditions and a healthy control group. TQWT-based signal decomposition was applied and four different feature selectors were used for this purpose. The support vector machine (SVM) algorithm was used as a classifier in the model, and a classification success of 98.53% was reported. Iqbal et al. [19] used machine learning methods for asthma detection on a four-class lung sound dataset comprising 100 normal, 321 wheezes (typical of asthma), 98 stridor, and 73 rattle sounds and attained 100% accuracy. Iqbal et al. [20] proposed a forecasting technique for real-time asthma disease detection on a cough sound dataset of 18 asthma patients and attained 99.91% accuracy. Sen et al. [21] recorded lung sounds with a 14-channel device to discriminate between chronic obstructive pulmonary disease (COPD) and asthma. Multivariate autoregressive model, Gaussian mixture model and SVM were used together. In this study, 98% classification accuracy was achieved using sound samples from 50 subjects. Khan et al. [22] developed a real-time system for real-time asthma detection on lung sounds based on signal normalization and empirical mode decomposition embedded on Raspberry Pi. 
The model attained 9.40% accuracy. On a lung sound dataset of 64 pneumonia, 48 asthma, and 100 healthy subjects, Yahyaoui and Yumusak [23] applied machine learning to detect pneumonia and asthma with 95.00% accuracy using the k-nearest neighbor classifier. Topaloglu et al. [24] proposed a ResNet18-based asthma detection model. For this purpose, a sound dataset was created using a digital stethoscope and these sound signals were converted into images using a Mel spectrogram. Next, features were extracted from these images using ResNet18 deep network architecture, and the most significant features were selected by iterative NCA algorithm. The selected features were classified using kNN and SVM algorithms. This model achieved a classification accuracy of 99.73%. Yue and Xu [25] developed an automated asthma and pneumonia detection model based on short-time energy and Mel-frequency cepstral coefficients. Their study of 850 cough sounds recorded from patients attained 93.34% accuracy using a SVM classifier. Tasar et al. [26] developed a piccolo pattern-based respiratory sound classification model. Three cases were created in the research. They classified 7 respiratory diseases including asthma. In the developed model, the kNN algorithm achieved the highest classification success and a classification performance of over 99% was obtained for each case. A small number of asthma detection studies were based on other signal inputs.

1.1 Motivation and our model

The main goal of our research is to design and implement an automated system dedicated to asthma screening. Built on a mobile-based machine learning model, our approach integrates a state-of-the-art feature extractor carefully tailored to the analysis of cough sounds. The ultimate goal is to achieve a higher level of accuracy in asthma detection, providing a solid foundation for the development of an efficient and reliable screening tool. We were motivated to develop an automated system to screen for asthma that is accessible and easy to implement, i.e., computationally lightweight. The reference standard for asthma diagnosis is formal pulmonary function tests, while PEFR measurement may be used for therapeutic monitoring in diagnosed patients. On clinical examination, lung auscultation may uncover wheezing sounds typical of acute asthma attacks, but stethoscopes are not readily available, and lung auscultation requires prior training. Many respiratory conditions, including asthma, present with cough. Cough sounds, which can be readily recorded using mobile devices, may contain features that can be used to discriminate the underlying respiratory condition. To test this hypothesis, a new cough sound-based machine learning model was proposed in this work and tested on a new dataset of cough sounds recorded from more than 1000 participants. We chose handcrafted machine learning over deep learning to economize on computational demands and running time. The proposed model comprised four steps: (i) preprocessing; (ii) feature extraction; (iii) feature selection; and (iv) classification. Asthma classification using cough sound recordings is technically challenging because the recordings inadvertently contain ambient noise such as speech and are of variable duration, with inutile pauses between the actual coughs.
Hence, preprocessing was obligatory to remove unwanted speech and ambient noise from the sound signal and to segment the signal into standardized segment lengths containing analyzable cough sounds, thereby optimizing the fidelity and data efficiency of the input cough sound signals. We used a one-dimensional (1D) version of the popular attractive repulsive center-symmetric local binary pattern (ARCSLBP) image descriptor [27] to extract features from the cough sound signals. However, 1D-ARCSLBP is a handcrafted feature extractor that, by itself, can only generate features at a low level. To overcome this constraint, we applied the tunable Q-factor wavelet transform (TQWT) [28] to decompose the cough sound signal into multiple low- and high-frequency wavelet subbands. Applying 1D-ARCSLBP to both the cough sound input signal and the wavelet subbands allowed features to be extracted at both low and high levels. A simple neighborhood component analysis (NCA) [29] feature selector was applied to the extracted features to select the most discriminative ones. These were fed to a downstream standard shallow SVM [30, 31] for classification.

1.2 Novelties and contributions

Novel contributions of this work include:

  • A new cough sound dataset was acquired from more than 1,000 subjects.

  • New preprocessing method for efficient removal of ambient noise, speech, and unwanted pauses in cough sound recordings.

  • Handcrafted 1D-ARCSLBP feature extraction enabled the model to inherit known advantages of ARCSLBP [27] for advanced signal processing without the need for parameter tuning.

  • Efficient and accurate computationally lightweight architecture built on shallow models that required near-linear running time to attain excellent performance commensurate with deep models (> 96% classification accuracy) on robust tenfold and LOSO CV.

2 Proposed model

The handcrafted asthma classification model comprised four phases: (i) cough sound signal preprocessing; (ii) TQWT- and 1D-ARCSLBP-based feature extraction; (iii) NCA-based feature selection; and (iv) SVM-based classification using tenfold and LOSO CV (Fig. 1). Details of the steps are explained in the following subsections.

Fig. 1

Block diagram of the proposed 1D-ARCSLBP-based model for asthma classification using recorded cough sounds. First, cough sound signal input samples of variable time lengths were preprocessed to remove ambient speech and segment the samples into one-second segments that each contained an analyzable “clean cough sound”. Next, TQWT was applied to each one-second clean cough sound segment to generate 12 wavelet subbands (t1, t2, t3,…t12). Next, 1D-ARCSLBP was applied to the clean cough sound segment and its 12 wavelet subbands to extract low- and high-level features. This yielded 13 feature vectors (f1, f2, f3,…f13) each of length 256, which were concatenated into a final feature vector of length 3,328 (= 13 × 256). From the latter, NCA selected the top 100 most discriminative features to feed to the SVM classifier for two-class classification of input cough sound samples into asthma versus non-asthma classes using tenfold and leave-one-subject-out cross-validation strategies

2.1 Preprocessing step

First, the recorded cough sound input samples were preprocessed to remove ambient speech using power spectrogram-based classification [32], which was used to detect the cough sounds. Next, inutile stop sounds (pauses) between the coughs were removed by applying a predetermined threshold (0.01 dB for these sounds), and the remaining signal, which varied in duration across samples, was divided into standardized one-second segments, each containing an analyzable “clean cough sound”. The effect of the preprocessing step on cough sounds from asthma patients (before/after) is summarized in Fig. 2. The steps of the preprocessing and its pseudocode (Algorithm 1) are given below:

Fig. 2

Before and after the preprocessing step

As shown in Fig. 2, a preprocessing step was applied to remove the ambient and stop sounds from the raw signal. This step is effective in improving the accuracy of the proposed model. The steps of the preprocessing process are as follows:

Step 1 Obtain clean sounds from the collected sound samples by removing speech signals.

Step 2 Delete stop sounds (pauses) from the clean sound, retaining only usable cough sounds.

Step 3 Create one-second sound segments containing cough sounds to input into the model.

Algorithm 1

Pseudocode of cough sound preprocessing
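
As a rough illustration of Steps 2 and 3, the following Python sketch removes low-amplitude samples and cuts what remains into one-second segments. It is not a reproduction of Algorithm 1: the function names, the reading of the 0.01 threshold as an absolute-amplitude cut-off on a normalized signal, and the decision to discard a trailing partial segment are all assumptions made for illustration. Speech removal (Step 1), which relies on the classifier in [32], is omitted.

```python
def remove_pauses(signal, threshold=0.01):
    """Keep only samples whose absolute amplitude exceeds the threshold.

    Assumption: the signal is normalized to [-1, 1] and the threshold acts
    as an absolute-amplitude cut-off; the exact rule in the paper may differ.
    """
    return [x for x in signal if abs(x) > threshold]


def segment(signal, fs=48000, seconds=1.0):
    """Split a signal into non-overlapping fixed-length segments.

    Assumption: a trailing remainder shorter than one segment is discarded.
    """
    size = int(fs * seconds)
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]


def preprocess(signal, fs=48000, threshold=0.01):
    """Steps 2 and 3: delete pauses, then cut one-second segments."""
    return segment(remove_pauses(signal, threshold), fs=fs)
```

Both helpers make a single pass over the samples, which is in keeping with the linear cost attributed to preprocessing in Sect. 3.5.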

2.2 Feature extraction step

TQWT was used to create wavelet subbands, from which low- and high-level features could be extracted. TQWT is a parametric signal transformation that uses three parameters to define the wavelet filters: Q (Q-factor) controls the oscillatory behavior of the wavelet; R is the redundancy; and J is the number of decomposition levels. A J-level transform yields J + 1 wavelet subbands. In this work, we selected Q, R, and J parameter values of 1 (non-oscillatory decomposition), 3, and 11, respectively, to generate 12 wavelet subbands. The choice of J was determined by the signal sampling frequency (48 kHz, i.e., each one-second segment contained 48,000 values) and the length of the overlapping block (9) employed in 1D-ARCSLBP-based local textural feature extraction, based on the following calculation: number of subbands = \(\left\lfloor \log_{2} \frac{48000}{9} \right\rfloor = 12\). The steps of the feature extraction are given below:
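
The value of 12 quoted above can be checked numerically. The short Python sketch below assumes that the logarithm is rounded down to the nearest integer, which the text does not state explicitly; the variable names are illustrative only.

```python
import math

fs = 48000          # samples in a one-second segment at 48 kHz
block_length = 9    # overlapping block length used by 1D-ARCSLBP

ratio = fs / block_length                # about 5333.3
subbands = math.floor(math.log2(ratio))  # assumed rounding: floor
J = subbands - 1                         # a J-level TQWT yields J + 1 subbands
```

Under this assumption, the result agrees with the chosen setting of J = 11 and the 12 subbands used in Eq. (1).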

Step 4 Generate subbands from each sound segment using TQWT.

$$w=\phi ({\text{cs}},1,3,11)$$
(1)

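where \(w\) represents the wavelet subband structure with 12 wavelet subbands; \({\text{cs}}\), the clean one-second cough sound segment; and \(\phi (.)\), the TQWT function with parameters Q = 1, R = 3, and J = 11.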

Step 5 Extract features from wavelet subbands and cough segments.

$${f}^{1}=\varphi ({\text{cs}})$$
(2)
$${f}^{i+1}=\varphi \left({w}^{i}\right), i\in \left\{1,2,\dots ,12\right\}$$
(3)

where \(\varphi (.)\) represents the 1D-ARCSLBP function, the pseudocode of which is given below.

Algorithm 2

Pseudocode of 1D-ARCSLBP feature extraction function

From Algorithm 2, it can be seen that 1D-ARCSLBP generated seven attractive and seven repulsive bits per overlapping block. Two map signals, one attractive and one repulsive, were generated by converting these bit patterns to decimal values. By extracting the histograms of these map values, \(2 \times 2^{7} = 256\) features were obtained. A block diagram summarizing Algorithm 2 is given in Fig. 3.

Fig. 3

1D-ARCSLBP feature extraction method proposed in this work

As can be seen in Fig. 3, the median value of both the main signal and the overlapping block is calculated. In addition, the mean value of the overlapping block is determined. These values are then compared with the center value of the overlapping block. In this way, the first three bits of the attractive and repulsive bits are determined. The remaining four bits are obtained by sequential comparison. After this process, the attractive and repulsive bits are converted to decimals and added to the map signal. Finally, histograms are extracted using the map signal, and the two histograms are combined to obtain the feature vector.
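
To make the flow described above more concrete, a minimal Python sketch of a 1D-ARCSLBP-style extractor is given here. It should be read as an illustration rather than as Algorithm 2 itself: the exact comparison rules are not recoverable from the text, so the sketch assumes that an attractive bit is 1 when the compared value is greater than or equal to the block center, that a repulsive bit is 1 when it is smaller, and that the four "sequential" bits come from the center-symmetric sample pairs of the nine-sample block. The function name `arcslbp_1d` is likewise hypothetical.

```python
from statistics import median


def arcslbp_1d(signal, block=9):
    """Illustrative 1D-ARCSLBP-style extractor returning 256 histogram features.

    Assumptions (not confirmed by the paper): attractive bits use ">= center",
    repulsive bits use "< center", and the last four bits are taken from the
    center-symmetric pairs of each overlapping block.
    """
    if block != 9:
        raise ValueError("this sketch only supports the 9-sample block")
    half = block // 2
    global_median = median(signal)
    hist_attr = [0] * 128   # 2**7 bins for the attractive map
    hist_rep = [0] * 128    # 2**7 bins for the repulsive map

    for start in range(len(signal) - block + 1):
        b = signal[start:start + block]
        center = b[half]
        refs = [global_median, median(b), sum(b) / block]

        # First three bits: signal median, block median, block mean vs. center.
        attr = [int(r >= center) for r in refs]
        rep = [int(r < center) for r in refs]

        # Remaining four bits: center-symmetric pairs (assumed rule).
        for k in range(half):
            left, right = b[k], b[block - 1 - k]
            attr.append(int(left >= center and right >= center))
            rep.append(int(left < center and right < center))

        # Convert each 7-bit pattern to a decimal map value and count it.
        hist_attr[int("".join(map(str, attr)), 2)] += 1
        hist_rep[int("".join(map(str, rep)), 2)] += 1

    # Concatenate the two 128-bin histograms into one 256-feature vector.
    return hist_attr + hist_rep
```

Whatever the precise bit rules, the output shape follows from the description: two 7-bit map signals give two 128-bin histograms, i.e., 256 features per input signal.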

Step 6 Concatenate all features generated from the cough sound segment and its 12 subbands into a final feature vector.

$$ X\left( {d,j + 256 \times \left( {i - 1} \right)} \right) = f^{i} \left( {d,j} \right), d \in \left\{ {1,2, \ldots ,\dim } \right\}, i \in \left\{ {1,2, \ldots ,13} \right\}, j \in \left\{ {1,2, \ldots ,256} \right\} $$
(4)

where \(X\) represents the final feature generated, and \(dim\) is the dimension of the dataset (number of signals). The length of \(X\) is 3328 (= 13 × 256).
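
The index mapping in Eq. (4) amounts to placing the 13 feature vectors side by side. A minimal Python sketch for a single signal is shown below; the function name is an assumption, and the 1-based indices of Eq. (4) become 0-based list positions in the code.

```python
def concatenate_features(feature_vectors):
    """Concatenate 13 vectors of length 256 into one vector of length 3328.

    feature_vectors[i - 1] plays the role of f^i in Eq. (4); element j of f^i
    lands at 0-based position (j - 1) + 256 * (i - 1) of the result.
    """
    final = []
    for f in feature_vectors:
        final.extend(f)
    return final
```

Applying the function to every signal in the dataset and stacking the results row by row would give the matrix \(X\) with \(\dim\) rows and 3328 columns.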

2.3 Feature selection step

NCA, a simple but effective feature selector, was deployed to select the 100 most discriminative features from the 3328 features in \(X\) based on calculated individual feature weights, which represent how discriminative each feature is. NCA aims to bring instances of the same class closer together in the feature space and to push instances of different classes apart. To this end, it optimizes an objective function that measures the similarity of pairs of instances. In this research, the features were sorted in descending order of weight and the top 100 were chosen. Detailed steps of feature selection are given below:

Step 7 Apply NCA to the generated features to calculate 3,328 individual weights.

Step 8 Sort these weights and obtain sorted indexes.

Step 9 Choose the best 100 features.
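
Steps 8 and 9 can be sketched in a few lines of Python. The sketch assumes that the NCA weights of Step 7 have already been computed by some NCA implementation and are passed in as a plain list; the function names are illustrative.

```python
def select_top_features(weights, k=100):
    """Return the indices of the k largest weights, in descending order."""
    order = sorted(range(len(weights)), key=lambda idx: weights[idx], reverse=True)
    return order[:k]


def apply_selection(feature_vector, selected):
    """Keep only the selected positions of one feature vector."""
    return [feature_vector[idx] for idx in selected]
```

With 3328 weights and k = 100, `select_top_features` returns the 100 indices used to reduce each row of \(X\) before classification.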

2.4 Classification step

We used cubic SVM [33], a standard shallow classifier, for the two-class classification of input cough sound segments into asthma versus non-asthma classes. The main goal of SVM is to classify a dataset into two or more classes. In the case of binary classification, SVM tries to find a hyperplane that separates the classes as clearly as possible. In this paper, default hyperparameter settings were employed:

Kernel: polynomial;

Polynomial order: three;

Kernel scale: automatic;

Box constraint: one;

Validation: tenfold CV and LOSO CV.

Step 10 Classify the top 100 features by deploying SVM with tenfold and LOSO CVs.
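
The LOSO strategy differs from tenfold CV in that all segments of one subject are held out together. The Python sketch below illustrates only this splitting logic; it assumes that a subject identifier is available for each segment, and the function name is hypothetical. Training and evaluating the cubic SVM on each split is left to the classifier implementation of choice.

```python
def loso_splits(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for LOSO CV.

    subject_ids[i] is the (assumed available) subject label of segment i.
    Each subject is held out exactly once.
    """
    for held_out in sorted(set(subject_ids)):
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test
```

Because no subject contributes to both the training and the test side of a split, LOSO results are a stricter indicator of how the model behaves on unseen individuals than tenfold CV over segments.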

3 Experiments

3.1 Experimental setup

The model was implemented in central processing unit mode on a personal computer with the following specifications: 16 GB memory, 512 GB solid-state disk, Intel i7 processor with a 4.3 GHz clock, and Windows 11 operating system. The MATLAB programming environment was used for model development; the toolboxes and libraries used are listed in Table 1.

Table 1 Details of the programming platform

3.2 Dataset

The dataset comprised 994 and 881 cough sound recordings obtained from 511 asthmatics (103 male, 408 female; mean age 55.23 ± 14.97 years, range 10–2 years) and 815 non-asthmatic subjects (509 male, 306 female; most of the subjects in this group were healthy university students without a history of asthma), respectively. The cough sounds were recorded with varying durations in the hospital environment using a Samsung S6 Edge mobile phone. All recordings have a sampling frequency of 48 kHz. The duration of the recordings obtained from the asthmatics ranges from 0.5 s to 6.59 s, and that of the recordings from non-asthmatic subjects ranges from 0.23 s to 5.42 s. The cough sound recordings are in .wav file format. The hospital ethics committee approved the retrospective collection of the cough sound dataset.

3.3 Performance evaluation metrics

For the evaluation of model performance for binary classification into asthma versus non-asthma classes, standard metrics were calculated: accuracy, sensitivity, specificity, precision, geometric mean (of sensitivity and specificity), and F1-score (harmonic mean of sensitivity and precision) [34, 35]. These metrics are defined in Eqs. (5)-(10).

$$ {\text{Accuracy}} = \frac{{\text{True Positives}} + {\text{True Negatives}}}{{\text{Total Examples}}} $$
(5)
$$ {\text{Sensitivity}} = \frac{{\text{True Positives}}}{{\text{True Positives}} + {\text{False Negatives}}} $$
(6)
$$ {\text{Specificity}} = \frac{{\text{True Negatives}}}{{\text{False Positives}} + {\text{True Negatives}}} $$
(7)
$$ {\text{Precision}} = \frac{{\text{True Positives}}}{{\text{True Positives}} + {\text{False Positives}}} $$
(8)
$$ {\text{Geometric mean}} = \sqrt {{\text{Sensitivity}} \times {\text{Specificity}}} $$
(9)
$$ F1{\text{-score}} = 2 \times \frac{{{\text{Precision}} \times {\text{Sensitivity}}}}{{{\text{Precision}} + {\text{Sensitivity}}}} $$
(10)
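
For reference, Eqs. (5)-(10) translate directly into code. The Python sketch below computes all six metrics from the four confusion-matrix counts; the function name and the dictionary keys are illustrative, and the sketch assumes that none of the denominators is zero.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute the metrics of Eqs. (5)-(10) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (fp + tn)
    precision = tp / (tp + fp)
    geometric_mean = math.sqrt(sensitivity * specificity)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "geometric_mean": geometric_mean,
        "f1": f1,
    }
```

In the two-class setting used here, the asthma class would be treated as the positive class, so sensitivity reflects the proportion of asthma segments that are correctly detected.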

3.4 Results

Our model attained excellent results for binary classification of cough sounds into asthma versus non-asthma classes, with 98.24% and 96.91% accuracy rates on tenfold and LOSO CV (Table 2).

Table 2 Performance of the 1D-ARCSLBP-based cough sound classification model for asthma detection

These metrics highlight the robust performance of our model, showcasing its accuracy, sensitivity, specificity, precision, geometric mean, and F1-score across different cross-validation techniques. As shown in Table 2, the proposed method achieves a very high classification success for both cross-validation techniques (98.24% and 96.91%). In addition, when the F1-score result is analyzed, a very high-performance value is achieved. This result demonstrates the ability of the proposed model to strike a harmonious balance between precision and sensitivity.

3.5 Time burden

The time complexity of each step of our model, expressed in big O notation, is shown in Table 3.

Table 3 Transition table of the proposed model

As can be seen in Table 3, the approximate computational complexity of the proposed method is \(O(n+k+d)\). Feature selection and classification steps are well-known methods in the literature, and their complexities are \(O(k)\) and \(O(d)\), respectively. The preprocessing and feature extraction steps consist of multiple phases. In this context, the time complexity of preprocessing (Algorithm 1) is analyzed in detail in Table 4.

Table 4 Detailed computational complexity analysis of the preprocessing step

When the analysis is performed using the cost and time information given in Table 4, the time complexity of the preprocessing step is given below:

$$ {\text{Total}}\,{\text{time}}\,{\text{cost}} = c_{1} \times 1 + c_{2} \times \left( {L + 1} \right) + c_{3} \times L + c_{4} \times L + c_{5} \times L + c_{6} \times L + c_{7} \times {\text{cnt}} + c_{8} \times \left( {f + 1} \right) + c_{9} \times f $$
(11)
$$ = c_{1} + L \times \left( {c_{2} + c_{3} + c_{4} + c_{5} + c_{6} } \right) + c_{2} + c_{7} \times {\text{cnt}} + f \times \left( {c_{8} + c_{9} } \right) + c_{8} $$
(12)
$$ = O\left( n \right) $$
(13)

Herein, \(c\) represents the cost and \(L\) represents the length of the signal. The values of \({\text{cost}}\), \({\text{cnt}}\) and \(f\) are neglected in the algorithmic analysis. In addition, since \(L\cong n\), the time complexity of the preprocessing step is \(O(n)\). The next phase of the proposed model is feature extraction. In this phase, the signal is decomposed into 12 levels using the TQWT algorithm. Then, features are extracted from both the raw signal and the subbands using the 1D-ARCSLBP method. The pseudo code for this process is given in Algorithm 3, and the time complexity calculated using this algorithm is shown in Table 5.

Table 5 Detailed computational complexity analysis of the feature extraction step

As shown in Table 5, the proposed methods consist of TQWT-based signal decomposition and 1D-ARCSLBP. Since TQWT is a signal decomposition process, the computational complexity of this step is \(O({\text{log}}n)\). In addition, an analysis of the computational complexity of the 1D-ARCSLBP method is given in Eqs. (14)-(19).

$$\text{Total time complexity} = \text{number of levels (TQWT)} \times \text{1D-ARCSLBP}$$
(14)
$${\text{TQWT}} \, {\text{signal}} \, {\text{decomposition}} \, {\text{complexity}}\cong O({\text{log}}n)$$
(15)
$$ \begin{aligned} = & O\left( {\log n} \right) \times \left[ c_{1} \times 1 + c_{2} \times \left( {{\text{ln}} - 7} \right) + c_{3} \times \left( {{\text{ln}} - 8} \right) \right. \\ & + c_{4} \times \left( {{\text{ln}} - 8} \right) + c_{5} \times \left( {{\text{ln}} - 8} \right) + c_{6} \times \left( {{\text{ln}} - 8} \right) + c_{7} \times \left( {{\text{ln}} - 8} \right) \\ & + c_{8} \times \left( {4 \times \left( {{\text{ln}} - 8} \right) + 1} \right) + c_{9} \times \left( {4 \times \left( {{\text{ln}} - 8} \right)} \right) + c_{10} \times \left( {4 \times \left( {{\text{ln}} - 8} \right)} \right) \\ & + c_{11} \times \left( {{\text{ln}} - 8} \right) + c_{12} \times \left( {7 \times \left( {{\text{ln}} - 8} \right) + 1} \right) + c_{13} \times \left( {7 \times \left( {{\text{ln}} - 8} \right)} \right) \\ & \left. + c_{14} \times \left( {7 \times \left( {{\text{ln}} - 8} \right)} \right) + c_{15} \times 2 + c_{16} \times 1 \right] \\ \end{aligned} $$
(16)
$$= O\left( \log n \right) \times \left[ c_{1} + {\text{ln}} \times \left( c_{2} + c_{3} + c_{4} + c_{5} + c_{6} + c_{7} + 4 \times \left( c_{8} + c_{9} + c_{10} \right) + c_{11} + 7 \times \left( c_{12} + c_{13} + c_{14} \right) \right) - 7 \times c_{2} - 8 \times \left( c_{3} + c_{4} + c_{5} + c_{6} + c_{7} + c_{11} \right) - 31 \times c_{8} - 32 \times \left( c_{9} + c_{10} \right) - 55 \times c_{12} - 56 \times \left( c_{13} + c_{14} \right) + c_{15} \times 2 + c_{16} \right]$$
(17)
$$=O\left({\text{log}}n\right)\times O(n)$$
(18)
$${\text{Total time cost}}=O\left(n\times \log n\right)$$
(19)

As shown in Eq. (19), the total computational complexity of the TQWT-based 1D-ARCSLBP method is \(O(n{\text{log}}n)\). In this context, considering all the steps performed, the preprocessing step is \(O(n)\), feature extraction step is \(O(n{\text{log}}n)\), feature selection step is \(O(k)\), and classification step is \(O(d)\). As a result of these calculated values, the time complexity of the model developed in this research is \(O(n{\text{log}}n+k+d)\).

4 Discussion

In this work, we have presented a new 1D-ARCSLBP-based cough sound classification architecture that was tested on a new two-class asthma cough sound dataset collected from more than 1000 participants. We were motivated by the success of ARCSLBP-based feature extraction in computer vision applications to develop a novel asthma cough sound classification model using a one-dimensional version of this feature extractor (1D-ARCSLBP). TQWT was incorporated to generate multilevel wavelet subbands from which both low- and high-level features could be extracted, effectively surmounting the limitation that 1D-ARCSLBP alone can extract only low-level features. As a result of our conscious decision to employ only shallow functions, our model possessed a low time burden (Table 3). Despite this, our handcrafted multileveled feature extraction-based architecture attained excellent performance with 98.24% and 96.91% accuracy rates on tenfold and LOSO CV, respectively, which is better than or commensurate with other state-of-the-art methods, including deep models (Table 6).

Table 6 Comparison of automated asthma detection models

As shown in Table 6, the lightweight 1D-ARCSLBP and NCA-based model proposed in this research was validated by applying LOSO and tenfold CV strategies, achieving classification accuracies of 96.91% for LOSO CV and 98.24% for tenfold CV. The dataset used in this research is larger than those used by most of the state-of-the-art methods in the literature [21, 37, 39,40,41, 43]. In addition, two different validation strategies were used, which increases the reliability of the results obtained using the proposed method. The proposed method provides a lightweight solution compared with the literature [24]. Tasar et al. [26] used 8 classes in their research; however, the computational complexity of their method is higher than that of our model. The LOSO strategy, which is used to assess the generalizability of the results, has been used in only two studies [18, 43]. One of these, Singh et al. [43], achieved 94.52% classification accuracy on a quite small dataset. Kilic et al. [18] achieved a higher classification accuracy than our model; however, the computational complexity of their model is higher than that of our model. In this context, when state-of-the-art methods are analyzed, the automatic asthma detection model presented in this research offers a favorable balance of classification performance and computational complexity.

The model developed in this work uses the SVM algorithm as the classification method. In addition to this algorithm, some well-known classification algorithms in the literature were also tested. These are k-nearest neighbor (kNN), artificial neural network (ANN), decision tree (DT), and random forest (RF) algorithms. The SVM algorithm showed the best performance among these methods and the calculated classification accuracies (for tenfold CV) are comparatively given in Fig. 4.

Fig. 4

Summary of classification results obtained with tenfold CV

As shown in Fig. 4, the best classification performance was obtained with the SVM algorithm. The DT algorithm showed the lowest classification performance with about 73%.

The advantages and limitations of our model are discussed below.

Advantages:

  • A learning architecture was built using novel 1D-ARCSLBP and other shallow methods. While computationally lightweight, it was demonstrated to be highly accurate when tested on a new asthma cough sound dataset comprising 1,875 samples recorded from 1,326 hospitalized subjects.

  • Over 96% accuracy rates for binary classification into asthma versus non-asthma classes were attained by deploying robust validation techniques. In particular, the LOSO CV results support the readiness of the model for implementation in real-world applications, e.g., in the clinic or hospital ward.

  • Our proposed architecture has low, near-linear time complexity (\(O(n\log n+k+d)\)), is very simple, and can be coded by researchers/developers efficiently.

Limitations:

  • The cough sounds were collected from a single center. The reliance on data from a solitary center may introduce potential biases and limit the generalizability of our findings. The specificity of the dataset to a particular demographic, environmental conditions, or healthcare setting raises concerns about the external validity of our model. The potential consequences of this constraint include the risk of model overfitting to the characteristics unique to the single-center dataset. Variabilities in cough sound patterns influenced by regional accents, environmental factors, or demographic differences may not be adequately captured. Consequently, the model may exhibit reduced performance when applied to diverse populations or alternative healthcare environments. However, in future works, multiple centers can contribute to a common larger dataset. Collaboration with multiple institutions or leveraging existing databases with diverse samples could be instrumental in addressing this limitation.

  • Cubic SVM was implemented with default settings. The use of default settings without hyperparameter optimization may result in suboptimal model performance. Cubic SVM has hyperparameters that, if left unoptimized, might not be well-suited for the specific characteristics of the dataset. This could lead to issues such as underfitting or overfitting, compromising the model's ability to generalize to new, unseen data. Hyperparameters can be further tuned using an optimizer to obtain better classification results. By optimizing hyperparameters, researchers can fine-tune the model to extract the best possible performance from the chosen algorithm, ensuring it aligns with the specific characteristics of the data.

Potential advantages and challenges in clinical applications:

  • The successful translation of our proposed model from a research context to practical, real-world scenarios, such as clinics or hospital wards, holds significant potential. The model's high accuracy in asthma screening can facilitate early detection of the condition. This, in turn, enables prompt intervention and treatment, potentially improving patient outcomes and reducing the burden on healthcare resources.

  • By automating the screening process, the proposed model has the potential to optimize healthcare resources. Clinics and hospital wards can allocate personnel more efficiently, directing attention to confirmed cases while streamlining the diagnostic workflow.

  • Seamless integration with existing healthcare systems can enhance the model's adoption. Compatibility with Electronic Health Records (EHR) or other clinical databases ensures a smooth incorporation into routine medical practices.

  • The challenges associated with diverse patient populations and varied cough sound characteristics across different clinical settings may affect the model's generalizability. Addressing this requires continuous validation and adaptation of the model to accommodate diverse datasets.

  • Ensuring regulatory compliance and adherence to healthcare standards are paramount. The proposed model must meet stringent regulatory requirements to guarantee patient safety and data security, adding an extra layer of complexity to implementation.

  • Clinician acceptance and training are crucial factors for successful integration. Clinicians may initially be skeptical of automated systems, and effective training programs must be implemented to familiarize healthcare professionals with the model’s capabilities and limitations.

5 Conclusions

Using a novel 1D-ARCSLBP feature extractor, a new handcrafted architecture was proposed and studied on a new large cough sound dataset that comprised 1875 cough sound recordings acquired from 1326 participants. Our model attained excellent 98.24% and 96.91% accuracy rates using tenfold and LOSO CV, respectively. Moreover, the model possessed a low time complexity of O(n log n + k + d). The excellent results confirmed the feature extraction capability of 1D-ARCSLBP on cough sound signals. In addition, the low computational demands and ease of implementation position our model favorably against published state-of-the-art models, suggesting that our model is ready to be developed into a real-world application.