1 Introduction

Computer vision is the discipline of artificial intelligence concerned with how computers acquire knowledge from images and videos. This interdisciplinary field enables machines to see the real world much as the human eye does in coordination with the brain [1,2,3]. A computer vision system extracts features from images, analyses the images using those features, and derives useful knowledge automatically, relying on mathematical techniques throughout [4]. The source of information is often images or videos from one or several cameras. The data can also be multidimensional, such as the output of medical imaging devices [5]. Such systems find applications in many domains: identification (fingerprint, iris, face and voice recognition), fault detection in manufacturing, robotics, surveillance, medical image processing (for example, detection of cancer and neurological disorders) and unmanned vehicles. The objective of such a system is typically to decide whether a target image belongs to a particular class. This decision is reached through steps such as image acquisition, preprocessing, feature extraction, segmentation, image recognition, image registration and decision making. In this paper, a few of these steps are used for image classification. The remainder of the paper is organized as follows. Section 2 reviews the literature. Section 3 presents the proposed framework. Section 4 discusses the findings, and Section 5 concludes the paper.

2 Literature Review

Table 1 summarizes the literature on image classification that uses a convolutional neural network for feature extraction. The review shows that researchers have recently focused on feature extraction using convolutional neural networks. The benefit of this approach is that a huge volume and wide variety of images can be used to train the network to extract features. The extracted features are therefore of good quality and lead to a better classifier model. In addition, the considerable time and hardware required to learn from such a volume of images can be drastically reduced by employing a pre-trained convolutional neural network for feature extraction. Fuzzy logic based classification has also become popular recently because of its capacity to handle uncertainty and to produce highly interpretable knowledge.

Table 1. Summary of literature.

3 Proposed Framework

The proposed framework is depicted in Fig. 1. The Caltech-101 dataset [20] contains images of 101 categories of objects. ResNet [21] is a convolutional neural network trained on the ImageNet dataset, which has 1000 object categories and 1.2 million images. The Caltech-101 images are given as input to ResNet-50, which extracts their features. The extracted features are then partitioned into training and test datasets. The training dataset is given as input to the Brain Storm Optimization [22] algorithm, which derives the optimal rule base for image classification. The test set is then classified using the Fuzzy Inference System with the optimal rule base.

Fig. 1. Proposed framework

The Brain Storm Optimization [22] algorithm, described below, is customized for the proposed framework to produce the optimal rule base. Brain Storm Optimization is preferred over other algorithms such as the Genetic Algorithm and Particle Swarm Optimization because it is modelled on the brainstorming activity of humans, whereas algorithms such as Particle Swarm Optimization and Ant Colony Optimization are modelled on the social behavior of birds and ants, respectively. Since humans are considered the most advanced species, Brain Storm Optimization is assumed to produce the best results.

figure a

The individuals in the population of the Brain Storm Optimization algorithm are called ideas. The ideas are represented as vectors for ease of computation. Each idea vector consists of ‘m + 2’ elements, where ‘m’ is the number of features extracted by ResNet-50, the (m + 1)th element represents the image class and the (m + 2)th element represents the AND or OR method used by the Fuzzy Inference System. The initial population consists of ‘n’ ideas.
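
The encoding described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the names `Idea`, `random_idea` and `firing_strength`, the integer codes for linguistic terms (with 0 meaning that a feature is not used), the definition of rule length as the number of used features, and the min/max realization of AND/OR are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import List

# Assumed encoding (not specified in the paper): each antecedent slot holds
# 0 = "feature not used", or 1..N_TERMS = index of a linguistic term.
N_TERMS = 3      # hypothetical, e.g. low / medium / high
AND, OR = 0, 1   # hypothetical codes for the (m + 2)th element


@dataclass
class Idea:
    antecedents: List[int]  # elements 1..m, one per extracted feature
    image_class: int        # (m + 1)th element
    connective: int         # (m + 2)th element: AND or OR

    def as_vector(self) -> List[int]:
        """Flatten to the m + 2 element vector described in the text."""
        return self.antecedents + [self.image_class, self.connective]

    def rule_length(self) -> int:
        """Assumed definition: number of features the rule actually uses."""
        return sum(1 for term in self.antecedents if term != 0)


def random_idea(m: int, n_classes: int, rng: random.Random) -> Idea:
    """Draw one random idea for the initial population."""
    antecedents = [rng.randint(0, N_TERMS) for _ in range(m)]
    return Idea(antecedents, rng.randrange(n_classes), rng.choice([AND, OR]))


def firing_strength(memberships: List[float], connective: int) -> float:
    """Combine antecedent membership degrees: min for AND, max for OR."""
    if not memberships:
        return 0.0
    return min(memberships) if connective == AND else max(memberships)


rng = random.Random(0)
population = [random_idea(m=8, n_classes=101, rng=rng) for _ in range(5)]
print(len(population[0].as_vector()))  # m + 2 = 10
```

Here `m=8` is used only to keep the example small; in the framework ‘m’ would be the number of features produced by ResNet-50.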

The fitness function that guides the Brain Storm Optimization algorithm towards an optimal rule base is designed around two factors: the length of the rule and the adaptiveness of the rule to the training dataset. Adaptiveness describes how well the rules match the training dataset. In general, optimal rules are short and highly adaptive. Hence the fitness function is inversely proportional to the rule length and directly proportional to the adaptiveness. Equation (1) gives the proposed fitness function.

$$ F = \left( w \cdot \frac{m}{l} \right) + \left( w \cdot \frac{r}{p} \right) $$
(1)

Here, ‘w’ is a constant that sets the weightage of the rule length and adaptiveness factors. Generally ‘w’ takes the value 0.5, indicating that both factors carry equal weightage. ‘m’ is the number of features extracted by ResNet-50, ‘l’ is the length of the rule generated by the Brain Storm Optimization algorithm, ‘r’ is the number of rules matching the training dataset and ‘p’ is the total number of instances in the training dataset.
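
Equation (1) translates directly into a small function. The sketch below follows the symbols defined above; the function name `fitness`, the input validation and the sample numbers are assumptions added for illustration and do not come from the experiments.

```python
def fitness(m: int, l: int, r: int, p: int, w: float = 0.5) -> float:
    """Eq. (1): F = (w * m / l) + (w * r / p)."""
    if l <= 0 or p <= 0:
        raise ValueError("rule length l and dataset size p must be positive")
    return w * m / l + w * r / p


# Hypothetical numbers, chosen only to illustrate the effect of rule length.
short_rule = fitness(m=2048, l=4, r=60, p=100)
long_rule = fitness(m=2048, l=16, r=60, p=100)
print(short_rule > long_rule)  # True: the shorter rule scores higher
```

With the adaptiveness term held fixed, the shorter rule receives the higher fitness, which matches the stated preference for short, adaptive rules.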

The constants ‘e’ and ‘k’, which range between 0 and 1, are decided experimentally. The input probabilities P5a, P6b and P6c are chosen randomly between 0 and 1. Initially, ‘n’ ideas are generated and clustered into groups. The center of each cluster is chosen based on the fitness values of the ideas in that cluster. New ideas are then generated and replace the worst ideas. This process is repeated until the cluster centers reach the maximum fitness value. These cluster centers form the optimal rule base for the proposed framework.
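
The loop described above (cluster the ideas, pick the fittest idea of each cluster as its center, generate new ideas, replace the worst ones) can be sketched as follows. This is a deliberately simplified illustration under stated assumptions, not the customized algorithm of the paper: ideas are real-valued vectors, random grouping stands in for the clustering step, a fixed iteration count replaces the stopping criterion, and the hypothetical parameters `p_center` and `step` stand in for the probabilities and constants named in the text, which are not modelled individually.

```python
import random
from typing import Callable, List

Idea = List[float]


def brain_storm(fitness: Callable[[Idea], float], dim: int, n: int = 20,
                n_clusters: int = 4, iterations: int = 100,
                p_center: float = 0.5, step: float = 0.1,
                seed: int = 0) -> List[Idea]:
    """Simplified brain storm loop; returns the final cluster centers."""
    if n < n_clusters:
        raise ValueError("need at least one idea per cluster")
    rng = random.Random(seed)
    # Initial population of n random ideas.
    ideas = [[rng.random() for _ in range(dim)] for _ in range(n)]

    for _ in range(iterations):
        # Assumption: random grouping stands in for the clustering step.
        rng.shuffle(ideas)
        clusters = [ideas[i::n_clusters] for i in range(n_clusters)]

        for cluster in clusters:
            # The cluster center is the fittest idea in the cluster.
            center = max(cluster, key=fitness)
            # Start from the center or from a random member of the cluster.
            base = center if rng.random() < p_center else rng.choice(cluster)
            # Generate a new idea by perturbing the base idea.
            candidate = [x + rng.gauss(0.0, step) for x in base]
            # Replace the worst idea of the cluster if the new one is fitter.
            worst = min(cluster, key=fitness)
            if fitness(candidate) > fitness(worst):
                ideas[ideas.index(worst)] = candidate

    clusters = [ideas[i::n_clusters] for i in range(n_clusters)]
    return [max(cluster, key=fitness) for cluster in clusters]


# Toy fitness (hypothetical): ideas closer to (0.5, 0.5, 0.5) score higher.
def toy_fitness(idea: Idea) -> float:
    return -sum((x - 0.5) ** 2 for x in idea)


centers = brain_storm(toy_fitness, dim=3)
print(len(centers))  # one center per cluster: 4
```

In the proposed framework the fitness argument would be Eq. (1) evaluated on a decoded rule, and the returned cluster centers would play the role of the rule base.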

4 Results and Discussions

First, features are extracted from the Caltech-101 dataset using the traditional Local Binary Pattern (LBP) [23] approach and classified using traditional classifiers: support vector machine, naïve Bayes, decision tree and k-nearest neighbours. Features are then extracted from the Caltech-101 dataset using ResNet-50 and classified using the Fuzzy Inference System.

The ‘e’ and ‘k’ values for the Brain Storm Optimization algorithm are decided experimentally. Figure 2 shows that the average classification accuracy is constant for ‘e’ values in the range 0.3 to 0.7, and Fig. 3 shows that it is constant for ‘k’ values in the range 10 to 30. Thus ‘e’ and ‘k’ are chosen within these ranges.

Fig. 2. Comparison of accuracy for various ‘e’ values

Fig. 3. Comparison of accuracy for various ‘k’ values

The accuracy of the classifier is defined as the ratio of correctly classified images to the total number of images in the dataset. The k-fold average accuracy of the above combinations of feature extraction and classification is tabulated in Table 2. The results show that the proposed framework outperforms classification based on traditional feature extraction techniques.
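
The accuracy measure and its average over folds can be sketched as below. The helper names `accuracy`, `k_fold_indices` and `k_fold_average_accuracy`, the `fit_predict` callback and the use of contiguous, unshuffled folds are assumptions for illustration; the number of folds is called `n_folds` here to avoid confusion with the ‘k’ constant of the optimization algorithm.

```python
from typing import Callable, List, Sequence, Tuple


def accuracy(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Ratio of correctly classified images to total images."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label sequences of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def k_fold_indices(n: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into (train, test) pairs of contiguous folds.

    A real experiment would shuffle or stratify first, since datasets such
    as Caltech-101 are typically stored grouped by class.
    """
    bounds = [i * n // n_folds for i in range(n_folds + 1)]
    splits = []
    for i in range(n_folds):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(0, bounds[i])) + list(range(bounds[i + 1], n))
        splits.append((train, test))
    return splits


def k_fold_average_accuracy(
    labels: Sequence[int],
    n_folds: int,
    fit_predict: Callable[[List[int], List[int]], Sequence[int]],
) -> float:
    """Average the per-fold accuracy of a train-then-predict callback."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), n_folds):
        predicted = fit_predict(train_idx, test_idx)
        scores.append(accuracy(predicted, [labels[i] for i in test_idx]))
    return sum(scores) / n_folds


# Toy check with a hypothetical classifier that always predicts class 0.
labels = [0, 1] * 5
always_zero = lambda train_idx, test_idx: [0] * len(test_idx)
print(k_fold_average_accuracy(labels, 5, always_zero))  # 0.5
```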

Table 2. Comparison of accuracy.

5 Conclusion

A novel image classification framework has been proposed. The framework employs a pre-trained convolutional neural network for feature extraction. The Brain Storm Optimization algorithm is designed to learn the classification rules from the extracted features, and a fuzzy rule based classifier is used for classification. The proposed framework is applied to the Caltech-101 dataset and evaluated using classification accuracy as the performance metric. The results demonstrate that the proposed framework achieves better classification accuracy than classification techniques based on traditional feature extraction.