1 Introduction

Thermal comfort in the indoor environment plays a significant role in a building's energy consumption (Petidis et al. 2018). The air-speed and temperature setpoints and the number of occupants present affect both energy saving and occupant thermal comfort (Hoyt et al. 2015). Many studies have addressed reducing a building's energy consumption while maintaining indoor comfort for the occupants (Sharma et al. 2019; Petidis et al. 2018).

As per the ASHRAE standard, thermal comfort is 'the state of mind which expresses satisfaction with the thermal environment' (ANSI/ASHRAE 2017). It is therefore influenced by a person's mindset, culture, and social factors; individual differences in thermal perception were first reported in the early 1970s, and comfort varies from person to person because not everyone feels the same. Thermal comfort in an indoor environment is evaluated using two approaches: heat-balance models and adaptive models. Heat-balance models are derived from controlled laboratory experiments, whereas adaptive models are derived from field studies in buildings. Experimental studies proposed the Standard Effective Temperature (SET) to develop the human heat-balance model (Gagge et al. 1986). Fanger (1967) developed the most widely used indices, the Predicted Mean Vote (PMV) and the Predicted Percentage of Dissatisfied (PPD), which estimate thermal comfort from four environmental and two personal factors. The comfort models developed to date lack the self-learning ability to update themselves so that comfort is maintained for the occupants of different types of buildings (Deng and Chen 2018; Wu et al. 2018). The traditional thermal comfort evaluation method needs in-depth subject knowledge, is complex to implement in a real-time environment, and is computationally expensive to update based on human perception. Accordingly, data-driven machine learning approaches help overcome the shortcomings of the current comfort models and allow the models to be updated in a real-time environment.

Wu et al. (2018) proposed an ensemble-based machine learning method to predict TS, PMV, PPD and SET from indoor and outdoor environmental parameters. Kim et al. (2018) proposed a machine learning-based personal comfort model for the individual occupant. Deng and Chen (2018) proposed ANN and ensemble-based models to predict indoor thermal comfort from occupant behaviour and thermal sensation. However, a robust design of heat-balance predictive models (PMV, PPD, SET, TS) with a suitable algorithm, feature selection, learning method (regression and classification), and appropriate evaluation techniques to improve generalization ability is still lacking. Feature selection and hyper-parameter optimization are receiving growing attention in machine learning model development because they reduce complexity and improve model performance (Probst et al. 2019). Based on the ASHRAE comfort database, Wang et al. (2020) proposed a Principal Component Analysis (PCA) based dimensionality reduction technique for thermal comfort metrics.

Recently, NCA feature selection (Amasyali and El-Gohary 2019; Yang et al. 2012) and Bayesian hyper-parameter optimization (Shang et al. 2019) have gained importance and proven effective in improving model performance. Amasyali and El-Gohary applied NCA feature selection to identify convenient features and showed that it predicts energy consumption effectively. The current literature lags in applying automatic feature selection together with hyper-parameter optimization to predict the thermal indices in a real-time environment for a group of occupants in an indoor space. Most methods rely on manually chosen relevant features to achieve better prediction accuracy. Preprocessing with a proper feature selection technique helps find the most suitable features, reduces complexity, and improves the model's performance.

Similarly, recent studies developed thermal comfort predictive models using the default model parameters provided by the software or plain k-fold cross-validation. To derive the best model, we used NCA for automatic feature selection and the Bayesian hyper-parameter optimization technique to tune the parameters. For comparing multiple algorithms across multiple data sets (training, testing and validation), we considered the Analysis of Variance (ANOVA) method (Demšar 2006). As per the literature, Friedman's/Kruskal-Wallis tests (Alpaydin 2014) are better suited to detecting whether one algorithm performs significantly better than the others.

This paper used physical measurements to collect the data and a subjective survey to evaluate clothing insulation as per the ASHRAE clothing chart. The proposed method leads to pre-emptive control of occupant thermal comfort through the HVAC/AC temperature setpoint or the fan speed (air velocity) in the control loop of a split air-conditioning system. The main contributions of this research are: (1) the NCA feature selection method to find the most suitable features automatically while developing the comfort models; (2) the BOA to optimize the hyper-parameters, which helps select the best parameters and improve model performance; (3) a proper evaluation method to estimate performance; (4) real-time prediction of the thermal comfort of the group of occupants present in an indoor environment; and (5) selection of the proper setpoint based on this prediction.

2 Methodology

This section delineates the development of the machine-learning models, including data collection, data preprocessing, feature selection, Bayesian hyper-parameter optimization-based model development, and the statistical evaluation method for both the regression (PMV, SET) and the classification (TS) models, leading to selection of the best model for predicting occupant thermal comfort. Figure 1 shows the detailed architecture of the machine-learning model development. Details of the CBE Comfort Tool block are available in Tartarini et al. (2020).

Fig. 1 Flow chart for the machine learning-based thermal comfort model development and model selection

2.1 Data collection

We collected the data for model development and evaluation from a college building (longitude 74.79° E, latitude 13.36° N). The data was collected during college hours, from 8 AM to 6 PM, across different seasons. In this work, we followed the ASHRAE 55-2017 standard for data collection and sensor positioning to calculate the thermal comfort of a group of occupants in centralized and split air-conditioned indoor environments based on the classical thermal indices.

The data was collected every day, with and without occupants, from the classrooms, the computer labs and the test room (Kumar et al. 2020a, b), which have centralized air-conditioning (A/C) and split A/C systems. During this period we recorded the indoor and outdoor temperature and humidity, the number of occupants, the A/C operation status (on or off), the A/C temperature setpoint, clothing insulation, and air velocity. The field data covers both occupied and unoccupied hours from 8:00 AM to 6:00 PM along with the A/C status (on/off). We included periods without A/C operation to capture warm and hot conditions as per the ASHRAE comfort model, because during A/C operation the indoor environment stays comfortable for the occupants, with some states drifting slightly warm or slightly cool. The students and faculty members present were taken as the field-study sample for estimating clothing insulation. We did not conduct a questionnaire survey for adaptive thermal comfort; instead, we used physical measurements to estimate occupant comfort as per the ASHRAE 55-2017 standard (Standard 2017; de Dear 2011; ANSI/ASHRAE 2017) and the CBE tool (Tartarini et al. 2020). Following ASHRAE 55-2017, the indoor relative humidity, air temperature and air velocity were recorded every 5 min, with the sensors positioned at a height of 1.1 m above the ground. A temperature and relative humidity (RH) sensor (Si7021i/Hoboware sensor), an anemometer (AVM 08) and a pyranometer (BF5 sunshine sensor) were the main sensors used to measure the indoor and outdoor environmental conditions.

2.1.1 Features and selection of output index

The input features were chosen based on their correlation with the target variables, as given in Sect. 2.2.1. These features fall into categories such as environmental, personal, occupancy, and A/C parameters. We first determined the candidate features by focusing on the Fanger PMV index (Fanger 1967): 'the index indicated that the thermal sensation depended on four environmental parameters: indoor temperature, relative humidity, mean radiant temperature, and air velocity, and two personal parameters: clothing insulation and metabolic rate'. Additionally, earlier studies showed that the outdoor thermal history, the number of occupants present in the indoor environment, and the A/C system information are related to thermal sensation (von Grabe 2016; Wu et al. 2018). Accordingly, we selected the number of occupants present in the indoor environment, the outdoor environmental parameters (outdoor temperature and outdoor humidity), and the A/C parameters (operation status, represented in binary as '0' for 'off' and '1' for 'on', and the A/C temperature setpoint). To assess the thermal sensation and evaluate the accuracy of the comfort algorithms for many people, several classical thermal indices (TS, SET, and PMV) were selected as the output features (responses). The statistical ranges of the collected data used for finding thermal comfort are shown in Table 1; after data collection, data preprocessing was carried out. Table 2 shows the thermal sensation scale based on PMV.

Table 1 Statistical characteristics of the collected data
Table 2 Thermal sensation scale based on PMV

2.2 Data preprocessing

Data preprocessing is vital because the models are based on real-time sensor data, and the recorded data requires cleaning. For thermal comfort, the sensors measure and log data every 5 min, which is downsampled to thirty-minute samples by averaging the data points. During preprocessing, we discarded redundant data points, which would otherwise increase the controller's memory usage. After data cleaning, suitable features that influence the response must be selected to enhance generalization performance and reduce complexity, as described in the Feature Selection section; once the most appropriate features are chosen, the data is standardized as per Eq. (1) before developing the model.

$$Z=\frac{\left(x-\mu \right)}{\sigma }$$
(1)

where \(x\) represents a predictor sample from a population, \(\mu\) a sample mean, and \(\sigma\) a standard deviation.
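
As a minimal illustration of Eq. (1), the sketch below standardizes the selected predictors column-wise; the array values and names are placeholders, not data from the study.

```python
import numpy as np

def standardize(X):
    """Column-wise z-score standardization as in Eq. (1)."""
    mu = X.mean(axis=0)       # per-feature sample mean
    sigma = X.std(axis=0)     # per-feature standard deviation
    return (X - mu) / sigma, mu, sigma

# Placeholder sensor readings: indoor temperature (°C), RH (%), air velocity (m/s)
X = np.array([[27.5, 62.0, 0.15],
              [24.0, 71.0, 0.20],
              [30.1, 55.0, 0.10]])
Z, mu, sigma = standardize(X)
print(Z.mean(axis=0).round(6), Z.std(axis=0).round(6))  # ~0 and ~1 per column
```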

2.2.1 Feature selection

During machine learning model development, selecting suitable features (predictors) plays a vital role in improving model performance. Figure 2 shows the pseudocode of NCA feature selection. The feature with the highest weight has the greatest impact on the response. Figure 3 shows the NCA feature weights for the different indices. We then kept the features whose weights exceed 5% of the highest feature weight. This threshold on the feature weights was chosen by trial and error over several settings (5%/15%/20%/25%), and 5% was found appropriate for selecting suitable features in our application. Adding data points collected under different conditions may change the feature weights. Table 3 shows the best features selected to develop the predictive thermal index models.
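
A minimal sketch of the weight-thresholding step, assuming the per-feature weights have already been produced by an NCA feature-selection routine; the weight values below are placeholders, not the study's results.

```python
import numpy as np

feature_names = ["outdoor_temp", "outdoor_rh", "indoor_temp", "indoor_rh",
                 "ac_setpoint", "occupants", "ac_status", "air_velocity", "clo"]
# Placeholder NCA weights; in practice these come from the NCA routine in Fig. 2.
weights = np.array([0.8, 0.1, 2.4, 1.1, 0.05, 0.6, 0.3, 1.7, 0.9])

threshold = 0.05 * weights.max()   # keep features above 5% of the top weight
selected = [name for name, w in zip(feature_names, weights) if w > threshold]
print(selected)
```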

Fig. 2 Pseudocode for NCA Feature Selection (NCAFS)

Fig. 3 NCA feature weights for the indices a TS, b PMV, and c SET

Table 3 Predictive features considered to develop the models for PMV, SET and TS

We used these selected best features throughout to develop the predictive models for PMV, SET, and TS. In addition, this procedure allows the models to be updated with new features and the optimum temperature setpoint for the air-conditioning system to be found.

Let T = {((outdoor temperature, outdoor humidity, indoor temperature, indoor humidity, temperature setpoint, number of occupants, HVAC operation, air velocity, clothing insulation), (PMV, SET, TS))} = {(\(x\in {\mathbb{R}}^{d}\), \(y\in {\mathbb{R}}\))}, where \({\mathbb{R}}^{d}\) denotes the d-dimensional feature space and \({\mathbb{R}}\) the individual response (PMV/SET/TS).
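
A hedged sketch of how such a table might be assembled from the 30-min averaged log using pandas; the file name and column names are illustrative assumptions, not the study's actual data layout.

```python
import pandas as pd

# Hypothetical 30-min averaged log produced during preprocessing
df = pd.read_csv("comfort_log_30min.csv")

feature_cols = ["outdoor_temp", "outdoor_rh", "indoor_temp", "indoor_rh",
                "ac_setpoint", "occupants", "ac_status", "air_velocity", "clo"]
target_cols = ["PMV", "SET", "TS"]

X = df[feature_cols].to_numpy()                                  # x ∈ R^d with d = 9
y_pmv, y_set, y_ts = (df[c].to_numpy() for c in target_cols)     # separate responses y ∈ R
```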

2.3 Development and selection of Machine Learning (ML) model

We tested different regression algorithms for predicting PMV and SET, and different classification algorithms for predicting the 7-point TS scale, whose labels indicate how the occupants feel subjectively, as shown in Table 2. For regression, we tested Support Vector Regression (SVR) with a Gaussian kernel function (Megri et al. 2005), a decision tree (Loh 2002), a bagged ensemble with tree learners (Dietterich 2000), and an ANN with a Bayesian regularization learner (Liu et al. 2007). The classification algorithms tested for TS prediction were a decision tree, a bagged ensemble tree, and an ANN trained with scaled conjugate gradient backpropagation (Anyanwu and Shiva 2009). During training, we used k-fold cross-validation together with the Bayesian hyper-parameter optimization technique to select suitable parameters for the predictive thermal index models. The k-fold cross-validation technique helps avoid model overfitting.

In k-fold cross-validation, the samples are split into k groups. The model is trained on k − 1 groups and validated on the remaining kth group, and the step is repeated for all the groups. All of the data is thus used for training and validation, and the errors from each fold are averaged to validate the model. In this work, tenfold cross-validation (Wu et al. 2018) was used to evaluate the models. Initially, the data was divided into two sets: a training set (80%) and a testing set (20%). The training set was used to train the ML models with k-fold cross-validation and the parameter optimization technique, and the testing set was used to validate model performance based on standard statistical error metrics. All the developed machine-learning models were then compared using the standard statistical evaluation technique to select the most suitable model. After the best thermal comfort model was identified, it was retrained with the best hyper-parameters on the full data (the training and the test data) to obtain the final ML model for predicting thermal comfort.
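
A minimal sketch of the split/cross-validate/retrain workflow using scikit-learn; the estimator and data below are placeholders standing in for the toolchain actually used in the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split, cross_val_score, KFold
from sklearn.neural_network import MLPRegressor

# Placeholder standardized features and PMV targets (see Sects. 2.2 and 2.2.1)
X = np.random.rand(500, 9)
y_pmv = np.random.rand(500) * 4 - 2

X_train, X_test, y_train, y_test = train_test_split(X, y_pmv, test_size=0.2, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(32,), learning_rate_init=0.01,
                     max_iter=2000, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
cv_mse = -cross_val_score(model, X_train, y_train, cv=cv,
                          scoring="neg_mean_squared_error").mean()
print(f"10-fold CV MSE: {cv_mse:.4f}")

# After model selection, retrain on all data with the chosen hyper-parameters
model.fit(np.vstack([X_train, X_test]), np.concatenate([y_train, y_test]))
```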

2.3.1 Bayesian hyper-parameter optimization

While developing a machine-learning model, the hyper-parameters play a leading role in improving the model's performance. Hyper-parameters are inputs to the machine-learning model that govern how the model's own parameters are learned and hence affect its accuracy. 'Parameters which define the model architecture are referred to as hyper-parameters, and the process of searching for the ideal model architecture is referred to as hyper-parameter tuning'. This work used the Bayesian optimization algorithm (BOA) to tune the machine-learning hyper-parameters (Gelbart et al. 2014); it auto-selects the most suitable parameters by minimizing an objective function. The objective function used to tune the parameters is the Mean Square Error (MSE) (Eq. 3) for the regression-based models and the classification error (percentage of misclassification, Eq. 4) for the classification models. The BOA has an advantage over traditional hyper-parameter optimization techniques because it requires fewer iterations and searches faster. Figure 4 shows the pseudocode of the Bayesian optimization algorithm for optimizing the hyper-parameters. A more detailed explanation of the BOA is available in the literature, including the authors' publications (Snoek et al. 2012; Gelbart et al. 2014; Kumar et al. 2020a, b). This work finally selects the ANN model, based on its statistical error performance and the ranking obtained from the standard hypothesis tests, for both the regression and the classification models. For the ANN model, the hyper-parameters considered for optimization are the number of neurons (1–100) in a single hidden layer and the learning rate (0–1). Based on the optimization, the optimum number of neurons and learning rate are selected to build the thermal comfort models. This study used the log-sigmoid function in the hidden layer and the Bayesian regularization training algorithm to train the model. Equation (2) shows the basic neural network model used in this study, and Fig. 5 shows the neural network structure of the models for the different comfort indices, where w and b represent the weights and the biases.

$$\widehat{Y}={\mathbf{w}}_{\text{output}}{\left\{1+\text{exp}\left[-\left({\mathbf{w}}_{\text{hidden}}\mathbf{X}+{\mathbf{b}}_{\text{hidden}}\right)\right]\right\}}^{-1}+{\mathbf{b}}_{\text{output}}$$
(2)
$$MSE= \frac{1}{N}\sum_{i=1}^{N}{\left({y}_{i}-{\widehat{Y}}_{i}\right)}^{2}$$
(3)
$${\text{Percentage Error}} = \sum\limits_{{{\text{i}} = 1}}^{{\text{N}}} {\frac{{{\text{FP}}_{{\text{i}}} + {\text{FN}}_{{\text{i}}} }}{{{\text{TP}}_{{\text{i}}} + {\text{TN}}_{{\text{i}}} + {\text{FP}}_{{\text{i}}} + {\text{FN}}_{{\text{i}}} }}}$$
(4)

where \(\widehat{Y}\) represents the predicted thermal comfort output, \(\mathbf{X}\) the selected features, \({\mathbf{w}}_{\text{output}}\) the output weight matrix, \({\mathbf{w}}_{\text{hidden}}\) the hidden-layer weight matrix, \({\mathbf{b}}_{\text{hidden}}\) the bias in the hidden layer, and \({\mathbf{b}}_{\text{output}}\) the bias in the output layer.
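
As a concrete reading of Eqs. (2) and (3), a minimal numpy sketch of the single-hidden-layer, log-sigmoid forward pass and the MSE objective; the weights are random placeholders, not trained parameters.

```python
import numpy as np

def forward(X, w_hidden, b_hidden, w_output, b_output):
    """Eq. (2): log-sigmoid hidden layer followed by a linear output layer."""
    hidden = 1.0 / (1.0 + np.exp(-(X @ w_hidden.T + b_hidden)))   # log-sigmoid activation
    return hidden @ w_output.T + b_output

def mse(y_true, y_pred):
    """Eq. (3): mean squared error."""
    return np.mean((y_true - y_pred) ** 2)

rng = np.random.default_rng(0)
X = rng.random((10, 9))                                     # 10 samples, 9 selected features
w_hidden, b_hidden = rng.random((32, 9)), rng.random(32)    # 32 hidden neurons (placeholder)
w_output, b_output = rng.random((1, 32)), rng.random(1)
y_hat = forward(X, w_hidden, b_hidden, w_output, b_output).ravel()
print(mse(rng.random(10), y_hat))
```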

Fig. 4 A pseudocode for Bayesian Optimization (Snoek et al. 2012)
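
A hedged sketch of the search over the neuron count (1–100) and the learning rate using Gaussian-process Bayesian optimization, as outlined in Fig. 4; scikit-optimize and scikit-learn are used here as stand-ins for the toolchain actually used in the study, the lower bound of 1e-4 on the learning rate is a practical assumption for the log-uniform prior, and the data are placeholders.

```python
import numpy as np
from skopt import gp_minimize
from skopt.space import Integer, Real
from skopt.utils import use_named_args
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor

X_train = np.random.rand(400, 9)        # placeholder standardized features
y_train = np.random.rand(400) * 4 - 2   # placeholder PMV targets

space = [Integer(1, 100, name="n_neurons"),
         Real(1e-4, 1.0, prior="log-uniform", name="learning_rate")]

@use_named_args(space)
def objective(n_neurons, learning_rate):
    # Objective of Eq. (3): 10-fold cross-validated MSE for the candidate hyper-parameters
    model = MLPRegressor(hidden_layer_sizes=(n_neurons,), learning_rate_init=learning_rate,
                         activation="logistic", max_iter=2000, random_state=0)
    return -cross_val_score(model, X_train, y_train, cv=10,
                            scoring="neg_mean_squared_error").mean()

result = gp_minimize(objective, space, n_calls=30, random_state=0)
print("Best (n_neurons, learning_rate):", result.x, "CV MSE:", result.fun)
```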

Fig. 5 Sample structure of the neural network: a 7-point TS prediction multiclass neural network model, b PMV prediction neural network model

2.4 Model evaluation

This section discusses the performance of the models developed with the features selected by NCA. The regression-based models are assessed using MSE, RMSE, MAE, and MAPE (Wu et al. 2018). We also used error histograms to visualize model performance in terms of the minimum and maximum error levels and how effectively the model predicts within the lowest error range. A model with low MAE/RMSE/MAPE has good prediction accuracy, and a model with high R2 is more efficient. Standard statistical metrics were used to evaluate the classification models; the confusion matrix provides the counts of correct and incorrect predictions for each class and allows the per-class precision, recall, and F1-score to be estimated. The diagonal elements of the confusion matrix give the correctly predicted responses, summarized by the accuracy metric: the higher the accuracy, the better the model performance. The off-diagonal elements give the misclassification rate, represented as the percentage error. Another commonly used metric for classification models is the area under the curve (AUC), which lies in the range 0–1; the higher the value, the better the model. The receiver operating characteristic (ROC) curve, and hence the AUC, is evaluated from the true-positive rate (sensitivity) versus the false-positive rate (1 − specificity).
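
A minimal sketch of computing these metrics with scikit-learn; the prediction arrays below are placeholders, not results from the study.

```python
import numpy as np
from sklearn.metrics import (mean_squared_error, mean_absolute_error,
                             mean_absolute_percentage_error, r2_score,
                             confusion_matrix, classification_report)

# Regression metrics (e.g. for the PMV model)
y_true = np.array([0.2, -0.5, 1.1, 0.8])
y_pred = np.array([0.25, -0.45, 1.0, 0.9])
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_true, y_pred)
mape = mean_absolute_percentage_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)

# Classification metrics (e.g. for the 7-point TS model)
ts_true = [-1, 0, 0, 1, 2, -2]
ts_pred = [-1, 0, 1, 1, 2, -1]
print(confusion_matrix(ts_true, ts_pred))
print(classification_report(ts_true, ts_pred, zero_division=0))  # precision, recall, F1 per class
```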

2.4.1 Model selection

We considered multiple algorithms (SVM, decision tree, ensemble bagged tree and ANN) to develop the predictive models. For comparing these algorithms over multiple data sets, we considered ANOVA. As per the literature, Friedman's/Kruskal-Wallis tests (Alpaydin 2014) are better suited to detecting whether at least one algorithm performs significantly differently from the others. Moreover, these tests rank the algorithms from the best-performing one to the poorest (Alpaydin 2014). In the second step, we applied a pairwise test with the corresponding post-hoc correction for multiple comparisons to identify significant differences between the algorithms.

However, it is sometimes preferable to run a test that determines which pairs of column effects differ significantly. Therefore, in this work we used Friedman's test to check for a significant difference between the algorithms and Fisher's least significant difference post-hoc test for pairwise comparison of the ranks, using the 5 × 2 cross-validation losses for the hypothesis tests and a 5% significance level to reject the null hypothesis. Finally, the model ranked first was selected as the best-performing model for final production.

The goal was to test the null hypothesis of no significant difference between the algorithms and then analyze all pairwise comparisons. The first hypothesis test therefore checks whether all the algorithms perform equally or whether some of them behave significantly differently. When the differences were tested for every pair of algorithms, the resulting p-values showed which pairs of algorithms differ significantly. The details of the model selection and the comparative analysis are given in the result analysis section.
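
A minimal sketch of the Friedman test over 5 × 2 cross-validation losses of the four algorithms; scipy is used as an illustrative stand-in, the loss values are placeholders, and the pairwise step is shown only as a mean-rank comparison rather than a full Fisher LSD implementation.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

# Placeholder 5x2 CV losses (10 folds) for SVR, Tree, Ensemble, ANN
losses = np.array([
    [0.08, 0.12, 0.10, 0.03],
    [0.09, 0.11, 0.09, 0.02],
    [0.07, 0.13, 0.11, 0.03],
    [0.10, 0.12, 0.10, 0.02],
    [0.08, 0.14, 0.09, 0.03],
    [0.09, 0.11, 0.10, 0.02],
    [0.07, 0.12, 0.11, 0.03],
    [0.08, 0.13, 0.09, 0.02],
    [0.09, 0.11, 0.10, 0.03],
    [0.08, 0.12, 0.10, 0.02],
])

stat, p = friedmanchisquare(*losses.T)   # one column of losses per algorithm
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4g}")

mean_ranks = rankdata(losses, axis=1).mean(axis=0)   # lower loss gives a better (lower) rank
for name, r in zip(["SVR", "Tree", "Ensemble", "ANN"], mean_ranks):
    print(f"{name}: mean rank {r:.2f}")
```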

3 Results and analysis

This section presents the performance of the PMV, SET, and TS prediction models developed to estimate occupant comfort. We analyzed the prediction capability of the SVR, tree, bagged ensemble tree, and ANN models using correlation plots of the true response values against the predicted response values. Based on the results in Tables 4 and 5, the proposed ANN model outperforms the other machine learning methods in terms of accuracy, linear relationship, and ranking. Figure 6 shows the confusion matrices of true class versus predicted class on the 7-point TS scale for the SVM, tree, bagged ensemble tree, and ANN models, which helps identify the most suitable model for predicting TS. The confusion matrix gives the probability of correct and incorrect classification of observations. Table 7 shows the recall, precision, and F1-score for the different classification models. The BO selects suitable hyper-parameters automatically by minimizing the objective function and is one of the techniques used for automatic hyper-parameter selection for ML algorithms.

Table 4 Statistical performance of the SVR, Tree, Ensemble Tree, and ANN model for PMV prediction
Table 5 Statistical performance of the SVR, Tree, Ensemble Tree, and ANN model for SET prediction
Fig. 6 A confusion matrix of the true class versus the predicted class for different models

A model with high precision, recall, and accuracy represents a good model. These metrics estimate the quality of each classifier individually based on sensitivity and specificity for each class, and the ROC is obtained by applying threshold values across the interval [0, 1] to the outputs. Tables 6, 7 and 8 indicate that the ANN model outperforms all the other models. The AUC shown in Table 8 indicates the overall quality of each classifier; the higher the AUC, the better the performance. Finally, we conducted hypothesis testing to compare the different models quantitatively and selected the best-performing one based on the ranking from Friedman's rank test. The quantitative estimation used the training, testing, and 5 × 2-fold cross-validation losses for the hypothesis tests. Tables 9 and 10 show the results of the statistical testing for the PMV model; they indicate a significant difference between the performances of the algorithms, give Friedman's ranking, and show that the ANN model stands at rank 1.

Table 6 Statistical performance of the SVR, Tree, Ensemble Tree, and ANN model for TS prediction
Table 7 Precision, recall, and F1-score for thermal comfort model
Table 8 AUC numbers which help to measure the quality of the classifier
Table 9 Average rank position for four different algorithms determined during Friedman’s test
Table 10 A comparison of results with the corresponding group names

The accepted error level for the PMV model is ± 0.1, and at this error level the selected ANN predicts 100% of the instances within the accepted range. The maximum and minimum error levels for the ANN are also smaller than for the other models.

This work selects the ANN model to predict PMV, which helps find the occupants' thermal comfort. The chosen model gives the lowest MAPE of 0.56 and the highest coefficient of determination (R2) of 100%, as shown in Table 4. Although the other models also give a high R2, their MAPE is high compared to the ANN model.

The accepted error level for the SET model is ± 0.5 °C, and based on this, the selected ANN predicts 100% of the instances within the accepted error level, while the SVR, tree, and bagged ensemble tree models predict 95.93%, 76.51%, and 80.72% of the instances, respectively. The maximum and minimum error levels for the ANN are also smaller than for the other models.

The selected model gives the lowest MAPE of 0.5326 and the highest coefficient of determination (R2) of 99.98%, as shown in Table 5. The confusion matrix analysis helps select the most suitable model to predict the thermal sensation, which indicates how the occupants feel in the indoor environment. Even though the SVM and the ensemble tree models perform well, with accuracies of 91.53% and 93.89%, respectively, over all the classes from − 3 to 3, the per-class sensitivity (recall) shows that the SVM model classifies the cold condition 70% and 24.5% incorrectly as cool and slightly cool and only 32.5% correctly as cold. Likewise, the ensemble tree model misclassifies the cold state 61.6% and 2.6% incorrectly as cool and slightly cool and classifies 35.8% correctly as cold. Misclassification also occurs in the other classes, with a high chance of mispredicting cool, slightly cool, warm, and slightly warm. In comparison, the ANN model predicts the six categories from hot to cool with a probability greater than 0.9; the cold level is classified 26% incorrectly as cool and 74% correctly, with an overall accuracy of 98.71% across all classes from − 3 to 3.

Finally, since the model's performance cannot be judged from the confusion matrix alone, we performed a ROC-AUC analysis to assess the quality of the classifiers. Table 8 shows the AUC values, which measure the overall quality of each classifier; the higher the AUC, the better the performance. Based on the ROC-AUC comparison, the ANN model performed better than the other algorithms. In this comparison, the ROC curves of all classes of the ANN model (7 classes: 'cold', 'cool', 'slightly cool', 'comfort', 'slightly warm', 'warm', 'hot') were close to 1, and its overall quality is given by the AUC measurement. Tables 6, 7 and 8 thus indicate that the ANN model performs better than all the other models.

3.1 Testing the hypothesis

Once again, we conducted Friedman's ANOVA test to compare the performance of the different models and confirm the final model. The ranking in Table 9 indicates that the ANN model performs better than all the other models.

Secondly, we conducted Fisher's pairwise comparison test to estimate the significance of the differences between the performances of the multiple algorithms. Table 10 shows the significant differences between the various algorithms. The p-value column indicates whether two algorithms differ significantly: if the p-value is less than or equal to 0.05, the null hypothesis is rejected and there is a significant difference between the algorithms; otherwise it is not rejected. The ANN model is significantly different from the SVM, ensemble tree, and decision tree models; only the pairwise comparison with a p-value of 0.299 fails to reject the null hypothesis. As per the ranking, the ANN is in first place and was therefore selected as the final model. Hypothesis tests were similarly conducted for the SET and TS models, and the ANN again performed better than the other models. The p-values of 1.3800e−06 and 2.39e−10 obtained for SET and TS, respectively, indicate a significant difference between the performances of the algorithms.

3.2 Pre-emptive control of A/C setpoint

The real-time ML-based PMV model helps find the temperature setpoint of the A/C system automatically. Here, the predicted PMV is the reference for estimating the A/C setpoint temperature that maintains thermal comfort for the occupants present in the indoor environment. The conditions considered in estimating the temperature setpoint are: (1) initially, the PMV range between − 0.5 and 0.5 is taken as neutral, representing comfort, and the human parameters that are difficult to monitor in a real-time environment are initialized; (2) if the PMV exceeds 0.5, the temperature setpoint is reduced by 1 °C and the PMV is re-estimated, treating the setpoint temperature as the room temperature, because in A/C rooms the room temperature is maintained close to the setpoint, as shown in the data collection section; (3) similarly, if the PMV falls below − 0.5, the temperature setpoint of the A/C system is increased and used as the input to the PMV model in place of the measured air temperature to estimate occupant comfort. After the setpoint is applied to the A/C system, the controller waits 5–10 min and then reads the room temperature and the other measurable sensor data (because the A/C system needs at least 5–10 min to bring the room to the new setpoint) and uses them as inputs to the predictive model to estimate the PMV and hence the comfort of the occupants. Figure 7 shows the flowchart for estimating the A/C temperature setpoint.
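
A minimal sketch of this setpoint adjustment loop under the stated assumptions; `predict_pmv` stands for the trained ANN model, and the sensor-reading and setpoint functions are hypothetical placeholders for the building interface, not functions from the study.

```python
import time

COMFORT_LOW, COMFORT_HIGH = -0.5, 0.5
STEP_C = 1.0        # setpoint change per iteration, in °C
SETTLE_MIN = 10     # wait for the room to settle after a setpoint change

def control_loop(predict_pmv, read_sensors, set_ac_setpoint, setpoint_c):
    """Adjust the A/C setpoint until the predicted PMV falls in the comfort band."""
    while True:
        features = read_sensors()              # indoor/outdoor conditions, occupancy, etc.
        features["ac_setpoint"] = setpoint_c
        features["indoor_temp"] = setpoint_c   # assume the room tracks the setpoint (Sect. 2.1)
        pmv = predict_pmv(features)

        if pmv > COMFORT_HIGH:
            setpoint_c -= STEP_C               # too warm: cool the room
        elif pmv < COMFORT_LOW:
            setpoint_c += STEP_C               # too cool: warm the room
        else:
            return setpoint_c                  # comfort band reached

        set_ac_setpoint(setpoint_c)
        time.sleep(SETTLE_MIN * 60)            # allow 5-10 min for the room to settle
```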

Fig. 7 Flowchart for A/C temperature setpoint estimation leading to thermal comfort

4 Conclusion

In this work, we derived machine learning-based models for predicting the thermal comfort of occupants. We developed PMV, SET, and TS models using different algorithms and identified the best model using several performance measures. The work proposes an NCA-BOA based ANN model to predict the thermal indices. The NCA feature selection method was used to select the features with the greatest influence on the response, and Bayesian hyper-parameter optimization was used to tune the parameters during model development. A regression-based strategy was used for the PMV and SET predictive models, and a classification approach for the 7-point TS model. Initially, four algorithms were considered to develop the predictive models: SVM, DT, ET, and ANN. After training, standard statistical evaluation metrics were used to assess the performance of all four algorithms, and hypothesis tests were conducted to compare the multiple algorithms. This methodology helps select the best-performing algorithm and detect significant differences between the performances of the different algorithms; for the hypothesis tests, Friedman's ranking and Fisher's least significance test were used. Finally, the statistical error results and Friedman's ranking test show that the recommended ANN model is highly accurate and more reliable in predicting the thermal indices.

Table 11 summarizes the features of the proposed model; we highlight the adoption of feature selection and parameter tuning, which makes it possible to obtain high accuracy compared with the models in the recent literature.

Table 11 Summarized features of the model

The predicted PMV is further used to estimate the setpoint that leads to occupant thermal comfort. For a building space to be ASHRAE 55 compliant, the PPD should be below 20% and the PMV should be in the range of − 0.5 to + 0.5. The thermal conditions of the room were observed for different humidity ranges. An operative temperature of 25 °C satisfied relative humidity values of 61%, 71%, and 81% when the airspeed was 0.2 m/s, the clothing level was 0.6 clo, and the metabolic rate was set as 1.2 met. The PMV and PPD values were − 0.31 and 7%, respectively, when the humidity was 61% and the operative temperature 25 °C, and − 0.14 and 5% for the same conditions with 81% humidity. For higher humidity ranges, a cooling setpoint of 24 °C is more comfortable, whereas 25 °C is satisfactory for low humidity values. Each 1 °C change in setpoint leads to a 3–6% change in energy consumption.
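
These PMV/PPD figures can be cross-checked with the Python companion of the CBE tool, pythermalcomfort (Tartarini et al. 2020). The sketch below is an illustrative check under the stated conditions; it assumes pythermalcomfort's `pmv_ppd` interface (function and argument names may vary between library versions) and takes the mean radiant temperature equal to the air temperature, so the exact numbers may differ slightly from those reported above.

```python
# Illustrative cross-check of the reported comfort conditions (assumed API, see lead-in)
from pythermalcomfort.models import pmv_ppd

for rh in (61, 71, 81):
    # tdb: air temperature, tr: mean radiant temperature (assumed equal to tdb),
    # vr: relative air speed, rh: relative humidity, met/clo as stated in the text
    result = pmv_ppd(tdb=25, tr=25, vr=0.2, rh=rh, met=1.2, clo=0.6, standard="ASHRAE")
    print(f"RH {rh}%: PMV = {result['pmv']:.2f}, PPD = {result['ppd']:.1f}%")
```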

Finally, the model was implemented in a test room on the myRIO platform; it was verified against subjective measures with an error rate of 5%, and the setpoint changes were verified manually. The proposed model is not universal; it can be made universal by retraining with a global dataset covering different climatic conditions and locations. In machine learning model development, the time and space requirements of the models vary with the training data, processor, and system specification (high-end or GPU-based systems take less time to train even more complex models); in this work, we have not focused on time and space complexity.