1 Introduction

Recently, deep convolutional neural networks (CNNs) have been overwhelmingly successful across a variety of visual perception tasks. LeNet5 [1], designed by Yann LeCun and Yoshua Bengio in 1998, is considered the beginning of CNNs. Over the past several years, many successful CNN architectures have emerged, such as AlexNet [2], VGG [3], GoogLeNet [4], ResNet [5, 6], MobileNet [7], and DenseNet [8]. Most deep neural networks are trained by gradient descent (GD) based algorithms and their variants [1, 3]. However, gradient descent based training of deep neural networks suffers from inherent instability, which hampers the learning of the earlier or later layers. Moreover, although CNNs achieve good results, they require considerable expert knowledge to use and take a long time to train.

In this paper, we propose a method that combines the Gabor kernel [9], the random kernel, and the pseudoinverse kernel, which together correspond to multiple convolutional kernels. The Gabor feature produced by a Gabor kernel is a handcrafted feature that can be obtained faster than learned features. In [10], a perturbation layer is proposed as an alternative to the convolutional layer, and the authors' theoretical analysis shows that the perturbation layer can approximate the response of a standard convolutional layer. Inspired by the perturbative neural network, we propose a random kernel with the same size as the input data. The pseudoinverse learning algorithm, proposed by Guo et al. [11,12,13], is a fast feedforward training algorithm. In our method, a random weight is used as the input weight of the pseudoinverse learning algorithm. As a result, the training time is reduced significantly, and the random weight also regularizes the whole model.

Our model combines multiple fixed convolutional kernels: the Gabor kernel, the random kernel, and the pseudoinverse kernel. The parameters of these convolutional kernels are obtained without iteration, so the training process is accelerated. Moreover, the different kernels contribute different image features, which facilitates the recognition task. Instead of a gradient-based algorithm, the pseudoinverse learning algorithm is used to speed up training significantly. Three base learners are trained, and their predictions are fed to a meta learner to obtain the final result. Our model was tested on the MNIST and CIFAR-10 datasets without using a GPU. The experimental results show that our model is faster than existing benchmark methods while achieving comparable recognition accuracy.

2 Related Work

Recently, random features have attracted researchers' attention and shown significant success in many research fields. The NIPS 2017 test-of-time award paper [14] presented two methods, random Fourier features and random binning features, for mapping input data to random features; random feature mapping speeds up the training of large-scale kernel methods. Perturbative Neural Networks [10] presented a perturbative layer as an alternative to the convolutional layer. The perturbative layer computes its response as a weighted linear combination of non-linearly activated additive noise perturbed inputs; the input data with fixed random noise added is a kind of random feature. The perturbation layer in [10] suggests that convolutional layers may not need to be learned from the input image, as Perturbative Neural Networks perform as well as standard convolutional neural networks.

The pseudoinverse learning algorithm was originally proposed by Guo et al. [11,12,13] and is a fast feedforward training algorithm. As a variant of the pseudoinverse learning algorithm, the pseudoinverse learning autoencoder [15] is a useful method for training multilayer neural networks.

Our previous works combine handcrafted features with the pseudoinverse learning algorithm [16, 17]. These works perform well in terms of training time; however, their accuracy is not satisfactory, especially on complicated datasets. In this paper, we propose a method that combines multiple fixed convolutional kernels and uses the pseudoinverse learning algorithm to accelerate training. Our method is faster than the baseline methods and obtains comparable accuracy. Meanwhile, it does not require large computational resources and can meet the needs of edge learning.

3 Proposed Methodology

3.1 Gabor Kernel

The first base learner is shown in Fig. 1. Features are first extracted from the input image by Gabor kernels and then trained by PIL1 [18], which is the original pseudoinverse learning algorithm with a Gaussian noise perturbation matrix added. The Gabor kernel corresponds to a convolutional kernel, and the Gabor feature corresponds to a convolutional feature, as shown in formula (1),

Fig. 1. The first base learner. The Gabor kernels are varied and set in advance.

$$ {\mathbf{I}}_{G} = {\mathbf{I}} \oplus {\mathbf{G}}, $$
(1)

where \( \mathbf{I} \) is the grayscale distribution of the image, \( \mathbf{I}_{G} \) is the feature extracted from \( \mathbf{I} \), "\( \oplus \)" stands for the 2D convolution operator, and \( \mathbf{G} \) is the predefined Gabor kernel. As a handcrafted feature, the Gabor feature can be obtained faster than learned features. Meanwhile, multiple Gabor features facilitate recognition.
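
As a concrete illustration of formula (1), the following minimal Python sketch convolves a grayscale image with a small bank of Gabor kernels. The use of OpenCV and the kernel size and parameters are our own illustrative assumptions, not the values used in our experiments.

```python
# Minimal sketch of Gabor feature extraction (formula (1)).
# Kernel size and parameters are illustrative assumptions.
import cv2
import numpy as np

def gabor_features(image, thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Convolve a grayscale image with a small bank of Gabor kernels G."""
    features = []
    for theta in thetas:
        kernel = cv2.getGaborKernel(ksize=(7, 7), sigma=2.0, theta=theta,
                                    lambd=5.0, gamma=0.5, psi=0.0)
        # I_G = I (+) G, one feature map per kernel orientation
        features.append(cv2.filter2D(image, ddepth=cv2.CV_32F, kernel=kernel))
    return np.stack(features, axis=0)
```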

3.2 Random Kernel

The second base learner is shown in Fig. 2. The PIL1 part is the same as that described in Sect. 3.1; the difference lies in the front part. A random kernel with the same size as the input image is added to the input image. The values in the random kernel are drawn from a specific distribution; both the Gaussian distribution and the uniform distribution work well. It is better to keep the mean of the drawn values at zero [19, 20]. At the same time, the noise values should be small; otherwise, the original information in the input is buried by heavy noise. The noise-perturbed inputs are activated by ReLU and then combined with linear weights. The obtained feature is as follows,

Fig. 2. The second base learner. The random kernel is a random noise matrix with the same size as the input image.

$$ F = \sum\nolimits_{i = 1}^{q} W_{i} \, f_{\mathrm{relu}} (X + R_{i}), $$
(2)

where \( q \) is the number of random features, \( R_{i} \) is the \( i \)-th random kernel matrix, and \( W_{i} \) is the corresponding linear combination weight.

Random features are obtained by adding random noise to the input image, which is the simplest and fastest way to get them. Moreover, adding noise to the input data of a neural network regularizes the model, as sketched below.
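
The following sketch implements formula (2) under the assumptions above; the number of kernels \( q \), the noise scale, and the use of average weights are illustrative choices borrowed from the experimental settings in Sect. 4.1.

```python
# Minimal sketch of the random-kernel feature in formula (2).
# q, the noise scale, and the averaging weights are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def random_kernel_features(x, q=4, scale=0.05):
    """x: 2D grayscale image; returns the combined feature F."""
    # q fixed random kernels with zero mean and small magnitude
    kernels = rng.uniform(-scale, scale, size=(q,) + x.shape)
    weights = np.full(q, 1.0 / q)                         # average weights W_i
    perturbed = np.maximum(x[None, :, :] + kernels, 0.0)  # f_relu(X + R_i)
    return np.tensordot(weights, perturbed, axes=1)       # sum_i W_i * (...)
```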

3.3 Pseudoinverse Kernel

The third base learner is shown in Fig. 3. The input image is sent to PIL0 [18]. The input weight of PIL0 is a random weight whose values lie within a small range, such as [−1, 1]. The number of input data is \( n \), and the number of hidden neurons is \( p \) (\( p \le n \)); the size of the random input weight is \( n \times p \). The random input weight is usually drawn from a Gaussian or uniform distribution, and it is necessary to restrict its mean to zero and keep its variance limited [19, 20].

Fig. 3. Pseudoinverse kernel. The input weight V is obtained randomly.
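
A minimal sketch of this base learner is given below, assuming the standard one-step pseudoinverse solution for the output weight and ReLU as the hidden activation; the helper names pil0_fit and pil0_predict are hypothetical, and the input weight here maps the input dimension d to p hidden neurons.

```python
# Hedged sketch of the pseudoinverse kernel (PIL0): a random input
# weight followed by a pseudoinverse-solved output weight.
import numpy as np

rng = np.random.default_rng(0)

def pil0_fit(X, T, p):
    """X: (n, d) inputs, T: (n, c) one-hot targets, p: hidden neurons."""
    d = X.shape[1]
    V = rng.uniform(-1.0, 1.0, size=(d, p))  # random input weight in [-1, 1]
    H = np.maximum(X @ V, 0.0)               # hidden activations (assumed ReLU)
    W = np.linalg.pinv(H) @ T                # output weight via pseudoinverse
    return V, W

def pil0_predict(X, V, W):
    return np.maximum(X @ V, 0.0) @ W
```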

3.4 Ensemble Model

Our proposed ensemble model is shown in Fig. 4. The ensemble model contains three base learners, combining the Gabor kernel, the random kernel, and the pseudoinverse kernel. Each training data set is extracted from the original sample set: \( N_{t} \) training samples are drawn using the bootstrap method. In this way, we obtain three training data sets that are independent of each other, each corresponding to one base learner. The predictions of the three base learners are fed to a meta learner to obtain the final result. The meta learner here is a multilayer neural network trained by the pseudoinverse learning algorithm; a sketch of this stacking procedure is given after Fig. 4.

Fig. 4. Ensemble model. The model contains three base learners, combining the Gabor kernel, random kernel, and pseudoinverse kernel.
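
The following sketch shows the training flow under the assumptions above; bootstrap, train_ensemble, meta_fit, and the (fit, predict) pairs are hypothetical names standing in for the three base learners and the pseudoinverse-trained meta learner.

```python
# Hedged sketch of the stacking ensemble: bootstrap one training set
# per base learner, then fit a meta learner on their predictions.
import numpy as np

rng = np.random.default_rng(0)

def bootstrap(X, T, n_t):
    idx = rng.integers(0, len(X), size=n_t)   # sample with replacement
    return X[idx], T[idx]

def train_ensemble(X, T, base_learners, meta_fit, n_t):
    """base_learners: list of (fit, predict) pairs; meta_fit: meta learner."""
    models, meta_inputs = [], []
    for fit, predict in base_learners:        # Gabor / random / pseudoinverse
        Xb, Tb = bootstrap(X, T, n_t)
        model = fit(Xb, Tb)
        models.append((predict, model))
        meta_inputs.append(predict(X, model))
    # the meta learner consumes the concatenated base predictions
    meta = meta_fit(np.concatenate(meta_inputs, axis=1), T)
    return models, meta
```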

4 Performance Evaluation

We use our proposed method to classify real-world datasets, namely the MNIST and CIFAR-10 datasets. All experiments are conducted on the same computer with a 3.20 GHz Core i7 processor.

4.1 MNIST Dataset

Table 1 shows the comparison between our method and other benchmark methods on the MNIST dataset. In this experiment, three training data sets are extracted from the original data, each with 50000 samples. In the first base learner, four Gabor kernels chosen from a Gabor kernel bank are used to obtain Gabor features, and the number of hidden neurons in PIL1 is 576. In the second base learner, the random noise is drawn from a uniform distribution over (−0.05, 0.05); four different fixed noise matrices are added to the training data separately, activated by ReLU, and then combined linearly with average weights. In the third base learner, the input weight of PIL0 is drawn randomly from a uniform distribution. For the baselines, LeNet5 uses 5 and 10 convolutional kernels in its different layers, the MLP uses one hidden layer with 300 neurons, and PILAE has one encoder layer. The same Gabor features are used in GF + PILAE and in the method proposed in this paper.
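
For reference, the MNIST settings described above can be collected into a single configuration; this sketch merely restates the text, and the key names are our own.

```python
# MNIST experiment settings from Sect. 4.1 (key names are ours).
mnist_config = {
    "train_samples_per_set": 50000,   # three bootstrapped training sets
    "gabor": {"num_kernels": 4, "pil1_hidden": 576},
    "random": {"num_kernels": 4, "noise_range": (-0.05, 0.05),
               "activation": "relu", "combine": "average"},
    "pseudoinverse": {"input_weight": "uniform"},
}
```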

Table 1. Performance comparison on MNIST dataset.

From the experimental results shown in Table 1, we can see that the proposed method is faster than the other methods, while its accuracy is comparable to theirs.

4.2 CIFAR-10 Dataset

Table 2 shows the comparison between our method and other benchmark methods on the CIFAR-10 dataset. In this experiment, each training data set has 50000 samples. In the first base learner, four Gabor filters are used, and the number of hidden neurons in PIL1 is 789. In the second base learner, the random noise is drawn from a uniform distribution over (−0.25, 0.25). In the third base learner, the input weight of PIL0 is drawn randomly from a uniform distribution over (−0.5, 0.5). For the baselines, LeNet5 has 20 and 50 kernels in its different layers, the MLP uses one hidden layer with 2000 neurons, and PILAE has three layers. The same Gabor features are used in GF + PILAE and in the method proposed in this paper.

Table 2. Performance comparison on the CIFAR-10 dataset.

From the experimental results shown in Table 2, we can see that the proposed method is superior to the other methods in speed and comparable to them in accuracy.

4.3 Discussion

From the results shown in Tables 1 and 2, we can see that our method is faster than the other methods. It combines multiple kernels, including the Gabor kernel, the random kernel, and the pseudoinverse kernel, which correspond to the Gabor convolutional kernel, the random convolutional kernel, and the pseudoinverse convolutional kernel. The Gabor feature is a handcrafted feature that is faster to obtain than learned features. Adding noise is the easiest way to obtain random features, which speeds up training; moreover, adding noise to the input data regularizes the whole network. The pseudoinverse learning algorithm trains the network in a feedforward manner; unlike the backpropagation algorithm, it does not need repeated iterations. The input weight of the pseudoinverse learning network is set to a random weight, which both speeds up training and regularizes the whole network. Our method performs well on the MNIST dataset in both speed and accuracy. On the CIFAR-10 dataset, it performs well in speed, but the accuracy is not good enough. The reason may be that we use only three layers in the pseudoinverse learning network, so the network is not deep enough to obtain better results. In the future, we will design more sophisticated network architectures to improve the classification performance on complicated images.

5 Conclusions

In this paper, a method is proposed to improve the performance of the image classification task. The classification model contains three base learners, taking advantage of the Gabor convolutional kernel, the random convolutional kernel, and the pseudoinverse convolutional kernel. The multiple convolutional kernels generate different submodules, which satisfies the diversity requirement of ensemble learning. In the proposed model, the convolution kernels are set manually, without iteration. Instead of the time-consuming gradient descent based backpropagation algorithm, the fully connected layers are trained by the pseudoinverse learning algorithm. All pseudoinverse learning submodules use random input weights drawn from a uniform distribution. The Gabor kernels, the random kernels, and the random input weights all speed up the training process. The performance of our model was tested on benchmark datasets, MNIST and CIFAR-10, without using a GPU. The results show that our model is superior to other models in learning speed while its accuracy is comparable to theirs.