Abstract
We introduce a novelty in the method of the deep salient wood image quality assessment (DS-WIQA) for no-reference image quality assessment (NR-IQA). We exploit a five-layer deep convolutional neural network (DCNN) for the salient wood image map. DS-WIQA also employs the n-convex-concave model. The outcomes obviously prove that our DCNN and DS-WIQA architectures can deliver a superior achievement on Zenodo and Lignoindo datasets, respectively. We compute a salient wood image map of each wood image in small wood image patches. Our exploratory outcomes evince that the proposed DCNN and DS-WIQA methods are superior to other the advanced methods on Zenodo and Lignoindo datasets, respectively. Our proposed DCNN for NR-IQA also obtains a better result compared with the other NR-IQA methods in the five distortion types of JP2K, JPEG, white noise Gaussian, blocking artifact, and the fast fading and also in the undistorted wood images. Our DCNN outruns the recent most sophisticated methods in terms of SROCC and LCC evaluation, respectively. DS-WIQA outpaces other the advanced methods by \(0.38\%\) and \(0.22\%\) greater than our proposed DCNN, and \(34.84\%\) and \(30.15\%\) greater than other methods with respect to SROCC and LCC, respectively. In computational complexity of our proposed DCNN and DS-WIQA cut down the shift add operation in exponential, logarithmic, and trigonometric functions. DS-WIQA shows up to be more significant than our proposed DCNN and the other DCNN methods.
Similar content being viewed by others
Avoid common mistakes on your manuscript.
1 Introduction
Wood species recognition is still a new discovery in the computer vision which has a challenging task for the well-trained experts to study the characteristic on the wood surfaces under the macroscopic and microscopic views. The wood image quality intensely depends on the wood capturing quality. Many objective image quality assessment (IQA) methods propose to codify image quality. If we use a full-reference image quality assessment (FR-IQA), the observer can better assess the image by considering the distorted and undistorted image. We assess the wood image quality from no-reference wood image.
In the study of [1], they observed two types of IQA methods. They use the distorted image which causes Gaussian white noise or Gaussian blur and also human visual system (HVS) method. We propose the problem solving of that two points by combining deep convolutional neural networks (DCNNs) as a sophisticated method with saliency map. The IQA methods were mentioned in the distortion type. The study of [2] offered a NR-IQA method for JPEG2000 compression by associating a couple Gaussian mixture and wavelet coefficient. The most recent study observes the more distortion type and also the unknown distortion type.
NR-IQA methods can be restricted into natural scene statistics (NSS) and the training-based methods. In NSS, the distorted image can be detected in the undistorted image as mentioned in [2,3,4,5,6,6]. In the training-based method, we study the features learned from images where the classifier is trained. The training-based method can be recognized as the conventional machine learning assessment [7,8,9,10,11]. The conventional CNN method extracts the image features in recognition and training-based of the IQA. CNN technique concerns in the object classification [12], age and gender recognition [13], or fashion recognition [14]. Many CNN outcomes have been established to NR-IQA and accomplished the advanced outcomes [7,8,9] and also in the feature derivation [12, 13, 15, 16].
We propose a deeper CNN, which combines with salient wood image map, namely, deep salient wood image-based quality assessment (DS-WIQA). DS-WIQA architecture has five convolutional layers in our proposed DCNN, which is deeper than AlexNet which has three convolutional layers [17]. Compared with a closely DCNN, AlexNet [17] does not fit the deeper training model. DS-WIQA uses the convex and concave n-square methods for the salient wood image map. The saliency map of CNN-based in [1] did not analyze HVS into IQA. While in the works of [7, 8], all of images can be extracted to many image patches. In Fig. 3, HVS codifies the wood images quality. Unfortunately, it is difficult to perceive the difference between one and the other wood image patches. It causes a low quality assessment to all patches within a wood image. To introduce HVS, DS-WIQA combines the proposed DCNN which has the convex and concave n-square methods for the salient wood image map method. The more closer of our studies [9, 18] exploit a gradient map as wood’s image patch court. The others, [19, 20] calculates saliency map for each image and its saliency patch score of each patch.
We experimented with a proposed DCNN algorithm on Zenodo wood species [21] and Lignoindo [22] datasets. Our experiment employs Spearman’s rank order correlation coefficient (SROCC) and the linear correlation coefficient (LCC) scores, respectively. Our outcome shows that our salient wood image maps can improve DCNN in NR-IQA. To validate the work of our DS-WIQA, we also analyze a DS-WIQA model on the Zenodo dataset and spread it on the Lignoindo dataset for cross-dataset evaluation. Our DS-WIQA obtains the advanced outcomes on Zenodo and Lignoindo datasets.
2 Related work
The combining NSS-based NR-IQA method explores to capture statistical properties of the undistorted wood images. To evaluate the NSS performance, most algorithms formulate the distributions or train a model. In the study of [6], the NSS is evaluated on a set of wavelet coefficients. They identify the distortion image before applying a distortion classifier. The other case, method in [2] transforms each image using discrete cosine transform (DCT) and the resulting coefficients are used for a generalized Gaussian density model. Method in [4] extracts features in the shearlet domain by using a NR-IQA and neural network method. However, the auto-encoder used in that method is different blue from CNN. Many methods in [12,13,14,15,16] achieve state-of-the-art results in unsupervised reduction. Recently, method in [3] extracts a quality feature from the wavelet transform domain. Unfortunately, this transformation in wavelet domain is highly consuming time. Method in [5] proposes the NSS-IQA method, which is applied in the spatial domain. This latter method shows that subtracted, contrasted, and normalized coefficients can represent a statistical properties of distortion in the local normalization.
CNN-based NR-IQA methods in [7, 8], including our proposed method, are also based on spatial domain, but the difference is that our DCNN learn quality features instead of the naturalness wood image. The method in [10] proposes a method that is developing a codebook in image patches. The training model [10] is similar to CNN-based methods that the learned quality feature is not handcrafted. The codebook was combined with object detection in this method [11].
The saliency map is more advisable for the distorted and undistorted treatment [23, 24] in NR-IQA methods. Regrettably, it has a drawback of feature learning quality appraisal by employing sparse coding. To explore the CNN performance, method in [7, 8] proposed a CNN architecture by using the median subtracted contrast normalised coefficients (MDSCN) [5]. However, the depth of this CNN model is limited in the feature extraction and it is not persistent with HVS. The more closer study of our proposed method is analyzed in [9, 25], which associates in a couple CNNs and salient gradient map algorithm. A two-layer CNN architecture is exploited for feature extraction and classification. Notably, the Prewitt method probes the edges of each image in [9]. However, the weighing on the edges can lose the consideration on the important characteristic of image quality, such as a luminance [26, 27].
Two other closer studies [19, 20], they explore the salient image patches to appraise the image quality. In [19], their saliency-based DCNN (SDCNN) method divides an image into more image patches for saliency mapping which is considered a threshold value to cut out non-salient image patches in the weighted length of [0,1].The image quality is artificially determined by the weighted average in salient image patches. Accordingly, this SDCNN imprecises to evaluate image quality score. The other proposed deep learning based and saliency map method [20] measure HDR images quality from salient image patches. After all, this method needs a lot of training data.
The proposed method considers DCNN in feature extraction. Two or more convolutional layers of small filter size derive more good enough features [15]. Notwithstanding, our proposed DCNN model contains five convolutional layers. To raise the HVS in NR-IQA, we nominate salient wood image maps to evaluate the importance of each wood image patch. Nonetheless, the Gaussian white noise or Gaussian blurs and HVS methods [1] have some drawbacks. They will lower contrast and inconsistently level of detail with human visual perception. We are inspired by the method of [18] out of the existing salient algorithms of IQA performance.
3 Methods
3.1 DCNN architecture
Multilayer perceptron (MLP) in [7] uses the stacking multiple convolution and max-pooling layers, and one-column CNN, which has only the image patch of the different image input, and three-column CNN which has the image patches of left- and right-view image and the different image input. The study in [7] also constructs the CNN-learned features for stereoscopic images in NR-IQA. However, it is very difficult to recognize CNN featuring a highly HVS quality perception. The study of [28] shows fine-tuning which uses pre-trained CNNs of visual recognition tasks and a feature extractor.
The other studies [12, 13, 15, 16] show better performance of CNN architecture, which has more layers in feature extraction. Our proposed DCNN architecture is expanded from our previous study [25] that we explored five convolutional layers with an effective transfer learning which can well investigate wood image classification in NR-IQA. We are inspired by the study of [7]. We separate each input of wood’s image patches of the size 227x227. To reconstruct a DCNN architecture, we inspect the large kernels [15] which are represented as our powerful small kernels 7x7. Three overlapping of max pooling units (Pool 1, Pool 2, and Pool 3) are added to the output in first, second and fifth convolutional layer, respectively. We also propose the local feed method in the data pre-processing by using local response normalization (LRN). LRN1 and LRN2 are computed to the outputs in Pool 1 and Pool 2 units, respectively. We also configure first, second, third, and forth convolutional layers with 100 channels for alleviation. The fifth convolutional layer is configured with 25 channels because it is an interface between convolutional and fully connected (FC) layers. We connect FC layers to the higher number of channels because FC layers do not share them with convolutional layers. FC layers would result in a large total number of learnable parameters to be trained.
SR-CNN [29] applies activation functions of ReLU [30] and other derivatives, such as leaky ReLU (LReLU) [31] and parametrized ReLU (PReLU) [32]. SR-CNN avoids a vanishing gradient of positive values. However, ELU [33, 34] has fixed the learning characteristics among the other activation functions [30,31,32]. ELU also has a smaller computational time and the mean activators close to zero. The activation functions [30,31,32] have negative values, and they do not ensure a deactivated noise. In this case, we expand the \(\alpha\) hyperparameter of the ELU activation function in our previous study [25] as shown in Eq. 1.
where \(\alpha\) is the hyperparameter of the ELU, which controls the negative values of satellite image inputs \(\rho _i\). When x is leading more than zero, its achievement like Rectified Linear Units.The opposite, when x is less or equal than zero, the function close to a negative constant value for \(\beta =1\) (Fig. 1).
3.2 DS-WIQA architecture
In DS-WIQA, we nominated the n-convex and n-concave salient wood image maps. The salient wood image map of a n-square is convex as shown in Fig. 2 (red color). The expansion of convexedly rectangular means that all diagonals of each vertex are placed in this n-square, except the end points. So, from \(A_1\), we make diagonals of \(\overline{{\mathrm{A}}_1 {\mathrm{A}}_{{\mathrm{j}}}}\); \(j = 3,4, \dots , n-1\) which are all of this n-square except the end points. This means that the n-square is a combination of triangles \(\triangle A_1, \triangle A_i, \dots , \triangle A_{i + 1}; i = 2, 3, \dots , n-1\). Suppose \(A_i (x_i, y_i); \quad i = 2, 3, \dots , n\), which has the area of each triangle \(L_i\) is as follows.
So, we calculated the salient wood image map n-convex L from Eq. 2 as follows.
Salient wood image map of a n-square is concave as shown in Fig. 2 (blue color). We create it in two types, namely n-square concave which has a vertex and no-vertex. All the diagonals of each vertex are the diagonals of the end points, which are outside of the n-square, except the vertex \(A_1\). Therefore, this n-square can be formed from triangles of \(\triangle A_1, \triangle A_i, \dots , \triangle A_ {i + 1}; i = 2, 3, \dots , n-1\). We need to emphasize that \(A_1\) on concave n-square can not be extracted from any of these n-square points. This is only true if \(A_1\) is a n-square corner, so all diagonals of \(A_1\) is placed again inside or on the concave n-square.
It is incorrect if \(A_1\) is replaced by \(A_2\), because there is the diagonal \(\overline{{\mathrm{A}}_2 {\mathrm{A}}_{{\mathrm{j}} + 1}}\) and \(\overline{{\mathrm{A}}_2 {\mathrm{A}}_{{\mathrm{n}}-1}}\) from \(A_2\) of the end points which are outside this concave n-square. Likewise, if \(A_1\) is replaced by \(A_3\). But, there is the diagonal \(\overline{{\mathrm{A}}_3 {\mathrm{A}}_{\mathrm{k}}}\) of \(A_3\) which in addition to being at the end points is also outside this concave n-square. Replacing \(A_1\) with \(A_m\); \(m = 4, 5, \dots , n\) is still incorrect, because it can always be shown that there is a diagonal of \(A_m\) which in addition to being at the end points is also outside this concave n-square.
Determining a triangle with a vertex of \(A_1\) and a vertex diagonally from that vertex besides its end points outside the concave n-square means false. This happens because there is a side of the \(\triangle A_{m}, \triangle A_{p}, \triangle {A_{p + 1}}; 1 \le p \ne m \le n\) which is located outside this concave n-square, so that it has an effect:
where \(L_{\triangle A_{m}, \triangle A_{p}, \triangle {A_{p + 1}}} = the p-triangle square; 1 \le p \ne m \le n; and L_n = the n-square\).
If an n-square is concave n-square which has a vertex so that all diagonals from this vertex are located in or at this n-square, then the area can be determined by Eq. 2, after taking \(A_1 (x_1, y_1)\) as a vertex’s concave.
If the concave n-square does not have a vertex, so all diagonals of this vertex lie within or on this n-square, then the concave n-square is divided into \(n-square_{m}\). So, \(m = 1, 2, \dots , k\) part, so that every two n-square parts adjacent are only allied on one side. For each \(n-square_{m}\) there is a vertex. All diagonals of this vertex is located inside or at \(n-square_{m}\). Thus, the width of each concave n-square with \(m = 1, 2, \dots , k\) can be determined from Eq. 3. If the width of each \(n-square_ {m}\); \(m = 1,2, \dots , k\) is \(A_m\), then the salient wood image map is
The relationship between \(n_m\) and n; \(m=1,2, \dots , k\) of \(n-square and n-square_{m}\), where every part is
4 Experiments and results
4.1 Datasets
Zenodo dataset [21] trains and tests our DCNN and also DS-WIQA methods. Lignoindo dataset [22] is used for DS-WIQA cross-dataset evaluation. We train a classifier on five types of distortion included in Zenodo dataset. They are JP2K (JPEG2000), JPEG, white noise Gaussian (WG), blocking artifact (BA) and the fast fading (FF). We assess our proposed model from a Lignoindo dataset of those distortion types. Zenodo dataset contains 544 distorted wood images from five types of distortion. Zenodo also comes with 109 reference wood images. We gather the perceptive model for this dataset which uses difference mean opinion score (DMOS) in the length of [0, 99] as mentioned in [35,36,37,38]. The higher DMOS indicates a lower quality.
In this work, each wood image of the whole Zenodo dataset is assigned for training, validation, and test sets in the smaller non-overlapping patches. At the same time, Lignoindo dataset is desired for cross-dataset validation in the length of [0, 1]. We also redound the predicted Zenodo dataset into the similar range with Lignoindo dataset by exploiting a nonlinear function.
4.2 Evaluation measurement
We consider our prospective DCNN and DS-WIQA methods as illustrated in Fig. 3 by calculating LCC and SROCC appraisals. LCC computes validation and test sets of the predictions and ground truth. Considering that, ground truth can represent a SROCC value from the same set appraisal. In the distorted wood images, all the wood images from five types distortion are extracted into \(70\%\), \(15\%\), and \(15\%\) of training, validation, and testing, respectively. The all of corrected wood images on the Zenodo dataset are also allotted in the similar protocol code. On the validation set, the highest LCC resolves the best result of each training iteration, which is repeated until 10 time iterations. We employ the five types of distorted wood image on the cross-dataset validation in 10 iterations by using the Zenodo and Lignoindo datasets.
To generate the training and testing data, we analyzed that the five types distortion of wood image is determined by using a distortion coefficient value within \(10^{-7}\). Consistently, to get each wood image with a distortion coefficient value, we make a set of number \([-250, -249, -248, ..., -1, 0]\) correlating to \(-250 x 10^{-7}\) in 10 epochs, and so on. The saliency coefficient value will be defined during the testing by our DS-WIQA on the whole Lignoindo dataset.
4.3 DCNN experiment
In the training set of DCNN, we divide each wood image into small sizes of 7x7 wood image patches as the initial wood image in the length of [0.01, 0.9], 10 of training epochs, and 32 of the mini-batch size. The learning rate cuts down 0.1 each five epochs iterations.
The average of LCC and SROCC in every ten iterations testing is demonstrated in Table 1. For the distorted wood image, our DCNN outperforms the multilayer perceptron (MLP) which uses CNN-based [7] on five types distortion. We outperform MLP in SROCC by \(18.84\%\) JP2K, \(21.29\%\) JPEG, and \(16.62\%\) WG, \(48.44\%\) BA, and \(26.06\%\) FF. Our DCNN is also superior compared with MLP in LCC by \(29.93\%\) JP2K, \(32.43\%\) JPEG, and \(29.10\%\) WG, \(39.66\%\) BA, and \(33.69\%\) FF. The undistorted wood image is represented as All in Table 1. All wood images from Zenodo dataset are used in the training regardless of their distortion types, as shown in Table 1. The same measurements are used in the distorted wood image. The higher LCC and SROCC are achieved by our proposed DCNN. We outperform MLP by \(25.44\%\) SROCC and \(32.88\%\) LCC, respectively.
4.4 DS-WIQA experiment
In the later study, we incorporate the proposed DCNN with the salient wood image map of [21]. When DS-WIQA concludes the wood image quality appraisal, the insignificant wood image patches can ignored by using a coefficient \(\alpha \in \{ 0, 0.01, 0.1, 0.7 \}\). If \(\alpha = 0\), DS-WIQA is similar with DCNN and otherwise, (\(\alpha \ge 0.7\)), DS-WIQA reads on a subset, as exposed in Fig. 4.
Table 2 presents SROCC’s and LCC’s learning of the ten iterations on Zenodo dataset validation where the distinct \(\alpha\) values apply to the whole distinct of distorted wood images. The saliency map raises achievement on the five types distortion of JP2K, JPEG, WG, BA, and FF in SROCC and LCC, respectively. Unfortunately, the salient map in SROCC is admirable perfomance than LCC. Our proposed DS-WIQA also achieves improvement of \(7.85\%\) for the fine wood image of the Zenodo dataset on SROCC compared with LCC.
Also, the best coefficient \(\alpha ^*\) of each distortion type is proved by exploring SROCC and LCC of the ten iterations of the Zenodo dataset in Table 2. For JP2K and JPEG, the highest SROCC and LCC, respectively, is achieved when \(\alpha = 0\) on Zenodo validation set. Table 3 reveals that the similar DS-WIQA’s to DCNN’s performance on the test set occurs in the two distortion types of no-salient map.
Table 3 describes the DS-WIQA’s performance on the Zenodo dataset of the other methods by using \(\alpha ^*\). Our DS-WIQA exceeds other methods of image quality appraisal on the five types distortion. On the all distortion types, our DCNN and DS-WIQA achieve 0.9763 and 0.9800, respectively, on SROCC and 0.9686 and 0.9705, respectively, at LCC, which outrun the other advanced no-referrence image quality appraisal methods [2,3,4,5,6,7, 9,10,11]. Our DS-WIQA also achieved the highest result at all.
In the DS-WIQA cross-dataset test, we train on the Zenodo dataset and test on the Lignoindo dataset for the five types distortion in our cross-dataset experiment. The output of our DS-WIQA is the length of [0, 99] in the DMOS. We follow the settings in [8], the Lignoindo dataset is extracted to two subsets. They are \(90\%\) of the data training and \(10\%\) of data testing. We rerun these data training and testing about 10 times to get a cross-dataset performance on the Lignoindo dataset. The best performance of DS-WIQA coefficient is \(\alpha =0.1\) as used on the fine wood image of blocking artifact in Table 2. Our DS-WIQA has a better state-of-the-art compared the other methods [2,3,4,5,6,7, 9,10,11, 24] as we can see in the Table 4.
Our study proves that the overlapping max pooling units achieve the best among other types of pooling units for SROCC and LCC appraisals, respectively. The overlapping max pooling outrun \(5.6\%\) and \(4.2\%\) outrun than non-overlapping max pooling on SROCC and LCC appraisals, respectively. From this case, we stop the overlapping max pooling units until \(3rd\) and \(4th\) Layers. We analyze that the overlapping max pooling tends to outrun in \(5th\) layer, as demonstrated in Table 5.
Overall, DS-WIQA surpasses the other advanced methods [2,3,4,5,6,7, 9,10,11, 24]. DS-WIQA surpasses \(0.38\%\) our DCNN and \(34.84\%\) other methods [2,3,4,5,6,7, 9,10,11, 24], respectively, on SROCC. Our DCNN also outruns \(34.33\%\) other methods [2,3,4,5,6,7, 9,10,11, 24] on SROCC. DS-WIQA also exceeds \(0.22\%\) our DCNN and \(30.15\%\) other methods [2,3,4,5,6,7, 9,10,11, 24], respectively, on LCC. At the same time, our proposed DCNN exceeds \(29.86\%\) other methods [2,3,4,5,6,7, 9,10,11, 24] on LCC.
In computational complexity, our proposed methods have better performance in operational function times for shift add operation compared to other methods [2,3,4,5,6,7, 9,10,11, 24] as described in Tables 6 and 7. We have an amount of time (\(\mu\) second), which is performed in the number of elementary operations. Our DCNN and DS-WIQA reduced the terms of shift add operation. DS-WIQA emerged to be first-rate to our DCNN and the other advanced methods [2,3,4,5,6,7, 9,10,11, 24].
5 Conclusion
We propose a new DCNN of NR-IQA which has a better result compared with the other NR-IQA methods [2,3,4,5,6,7, 9,10,11, 24]. In the distorted wood image, our DCNN outperforms the recent sophisticated MLP [7] in five types distortion which has \(18.84\%\) JP2K, \(21.29\%\) JPEG, and \(16.62\%\) WG, \(48.44\%\) BA, and \(26.06\%\) FF in SROCC and \(29.93\%\) JP2K, \(32.43\%\) JPEG, and \(29.10\%\) WG, \(39.66\%\) BA, and \(33.69\%\) FF in LCC. In the undistorted wood image, our poposed DCNN is superior to MLP [7] by \(25.44\%\) SROCC and \(32.88\%\) LCC, respectively.
Our DS-WIQA exceeds other state-of-the-art methods [2,3,4,5,6,7, 9,10,11, 24]. DS-WIQA obtains \(0.38\%\) and \(0.22\%\) higher than our DCNN, and \(34.84\%\) and \(30.15\%\) higher than other methods [2,3,4,5,6,7, 9,10,11, 24] on SROCC and LCC, respectively. Experimental results show that the computational complexity of our DCNN and DS-WIQA reduce shift add operation in exponential, logarithmic, and trigonometric functions. DS-WIQA raises to be more noteworthy than our DCNN and the other methods [2,3,4,5,6,7, 9,10,11, 24].
References
Oszust M (2019) No-reference quality assessment of noisy images with local features and visual saliency models. Elsevier Inf Sci 482:334–349. https://doi.org/10.1016/j.ins.2019.01.034
Wang T, Zhang L, Jia H (2019) An effective general-purpose NR-IQA model using natural scene statistics (NSS) of the luminance relative order. Elsevier Signal Process Image Commun 71:100–109. https://doi.org/10.1016/j.image.2018.11.006
Gupta P, Moorthy AK, Soundararajan R, Bovik AC (2018) Generalized Gaussian scale mixtures: a model for wavelet coefficients of natural images. Elsevier Signal Process Image Commun 66:87–94. https://doi.org/10.1016/j.image.2018.05.009
Li Y, Po L-M, Xu X, Feng L, Yuan F, Cheung C-H, Cheung K-W (2015) No-reference image quality assessment with shearlet transform and deep neural networks. Elsevier Neurocomput 154:94–109. https://doi.org/10.1016/j.neucom.2014.12.015
Zeng Z, Yang W, Sun W, Xue J-H, Liao Q (2018) No-reference image quality assessment for photographic images based on robust statistics. Elsevier Neurocomput 313:111–118. https://doi.org/10.1016/j.neucom.2018.06.042
Li Q, Lin W, Gu K, Zhang Y, Fang Y (2019) Blind image quality assessment based on joint log-contrast statistics. Elsevier Neurocomput 331:189–198. https://doi.org/10.1016/j.neucom.2018.11.015
Zhang W, Qu C, Ma L, Guan J, Huang R (2016) Learning structure of stereoscopic image for no-reference quality assessment with convolutional neural network. Elsevier Pattern Recognit 59:176–187. https://doi.org/10.1016/j.patcog.2016.01.034
Li F, Shao F, Jiang Q, Fu R, Jiang G, Yu M (2018) Local and global sparse representation for no-reference quality assessment of stereoscopic images. Elsevier Inf Sci 422:110–121. https://doi.org/10.1016/j.ins.2017.09.011
Yang J, Liu J, Jiang B, Lu W (2018) No reference quality evaluation for screen content images considering texture feature based on sparse representation. Elsevier Signal Process 153:336–347. https://doi.org/10.1016/j.sigpro.2018.07.006
Siahaan E, Hanjalic A, Redi JA (2018) Semantic-aware blind image quality assessment. Elsevier Signal Process Image Commun 60:237–252. https://doi.org/10.1016/j.image.2017.10.009
Ji W, Wu J, Shi G, Wan W, Xie X (2019) Blind image quality assessment with semantic information. Elsevier J Vis Commun Image Represent 58:195–204. https://doi.org/10.1016/j.jvcir.2018.11.038
Chen L, Song Z, Lu J, Zhou J (2019) Learning principal orientations and residual descriptor for action recognition. Elsevier Pattern Recognit 86:14–26. https://doi.org/10.1016/j.patcog.2018.08.016
Duan M, Li K, Yang C, Li K (2018) A hybrid deep learning CNN-ELM for age and gender classification. Elsevier Neurocomput 275:448–461. https://doi.org/10.1016/j.neucom.2017.08.062
Wang H, Zhang Q, Wu J, Pan S, Chene Y (2019) Time series feature learning with labeled and unlabeled data. Elsevier Pattern Recognit 89:55–66. https://doi.org/10.1016/j.patcog.2018.12.026
Fan Q, Shen X, Hu Y, Yu C (2019) Simple very deep convolutional network for robust hand pose regression from a single depth image. Elsevier Pattern Recognit Lett 119:205–213. https://doi.org/10.1016/j.patrec.2017.10.019
Hana Y, Zhang P, Zhuo T, Huang W, Zhang Y (2019) Going deeper with two-stream ConvNets for action recognition in video surveillance. Elsevier Pattern Recognit Lett 107:83–90. https://doi.org/10.1016/j.patrec.2017.08.015
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceedings of neural information processing systems (NIPS’12), pp 1097–1105. https://doi.org/10.1145/3065386
Riche N, Mancas M, Duvinage M, Mibulumukini M, Gosselin B, Dutoit T (2013) RARE2012: a multi-scale rarity-based saliency detection with its comparative statistical. Elsevier Signal Process Image Commun 28:642–658. https://doi.org/10.1016/j.image.2013.03.009
Jia S, Zhang Y (2018) Saliency-based deep convolutional neural network for no-reference image quality assessment. Springer Multimed Tools Appl 77:14859–14872. https://doi.org/10.1007/s11042-017-5070-6
Jia S, Zhang Y, Agrafiotis D, Bull, D (2017) Blind high dynamic range image quality assessment using deep learning. In: Proceedings of the 2017 IEEE international conference on image processing (ICIP), pp 765–769.https://doi.org/10.1109/ICIP.2017.8296384
Barmpoutis P, Stathaki T (2019) Zenodo wood species dataset (8544 macroscopic images). Imperial College London-Aristotle University of Thessaloniki (Version 1). https://doi.org/10.5281/zenodo.2545611
Damayanti R, Prakasa E, Krisdianto, Dewi LM, Wardoyo R, Sugiarto B, Pardede HF, Riyanto Y, Astutiputri VF Panjaitan GR (2019) LignoInd: image database of Indonesian commercial timber. The \(8^{th}\) international symposium for sustainable humanosphere. IOP Conf Ser Earth Environ Sci 374:1–10. https://doi.org/10.1088/1755-1315/374/1/012057
Zhang J, Zhao Y, Chen S (2018) Object-level saliency: fusing objectness estimation and saliency detection into a uniform framework. Elsevier J Vis Commun Image Represent 53:102–112. https://doi.org/10.1016/j.jvcir.2018.03.002
Hu Y, Gao Q, Zhang B, Zhang J (2018) On the use of joint sparse representation for image fusion quality evaluation and analysis. Elsevier J Vis Commun Image Represent 61:225–235. https://doi.org/10.1016/j.jvcir.2019.04.005
Risnandar Aritsugi M (2018) Real-time deep satellite image quality assessment. Springer J Real-Time Image Process 2:477–494. https://doi.org/10.1007/s11554-018-0798-4
Barten PGJ (1999) Pixel density and luminance quantization. Contrast sensitivity of the human eye and its effects on image quality. Press Monograph Series-SPIE Optical Engineering Press: Bellingham, Washington, USA. 194–198. https://doi.org/10.6100/IR523072
Hulusic V, Debattista K, Valenzise G, Dufaux F (2017) A model of perceived dynamic range for HDR images. Elsevier Signal Process Image Commun 51:26–39. https://doi.org/10.1016/j.image.2016.11.005
Cetinic E, Lipic T, Grgic S (2018) Fine-tuning convolutional neural networks for fine art classification. Elsevier Expert Syst Appl 114:107–118. https://doi.org/10.1016/j.eswa.2018.07.026
Dong C, Loy CC, He K, Tang X (2018) Learning a deep convolutional network for image super-resolution. Eur Conf Comput Vis Comput Vis (ECCV 2014) 8692:184–199. https://doi.org/10.1007/978-3-319-10593-2_13
Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of The \(27^{th}\) international conference on international conference on machine learning 2010 (ICML’10), pp 807–814. http://dl.acm.org/citation.cfm?id=3104322.3104425
Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the \(30^{th}\) international conference on machine learning 2013 (ICML’13), vol 28, pp 1–6
He K, Zhang X, Ren S, Sun J (2015) Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE international conference on computer vision (ICCV), pp 1026–1034. https://doi.org/10.1109/ICCV.2015.123
Clevert D-A, Unterthiner T, Hochreiter S (2015) Fast and accurate deep network learning by exponential linear units (ELUs). In: Proceedings of international conference on learning representations 2016 (ICLR2016), pp 1–14. arXiv:1511.07289
Heusel M, Clevert D-A, Klambauer G, Mayr A, Schwarzbauer K, Unterthiner T, Hochreiter S (2015) ELU-networks: fast and accurate CNN learning on imagenet. In: Poster of the international conference on computer vision (ICCV), pp 35–68
Janowski L, Pinson M (2015) The accuracy of subjects in a quality experiment: a theoretical subject model. IEEE Trans Multimed 17:2210–2224. https://doi.org/10.1109/TMM.2015.2484963
International Telecommunication Union (2016) Recommendation ITU-T P.913: Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment. Series P: Terminals and subjective and objective assessment methods audiovisual quality in multimedia services; Recommendation ITU-T P.913. International Telecommunication Union: Geneva, Switzerland. 1–34. http://handle.itu.int/11.1002/1000/12775
Luo X, Wang S, Yuan D (2016) Subjective score predictor: a new evaluation function of distorted image quality. Hindawi Math Probl Eng 2016:1–10. https://doi.org/10.1155/2016/1243410
Chow LZ, Rajagopal H (2017) Modified-BRISQUE as no reference image quality assessment for structural MR images. Elsevier Magn Reson Imaging 43:74–87. https://doi.org/10.1016/j.mri.2017.07.016
Acknowledgements
We thank all researchers and supporter of the Research Center for Informatics-Indonesian Institute of Sciences (LIPI), Ministry of Research and Technology/National Agency for Research and Innovation (BRIN), Republic of Indonesia, and the Forest Products Research and Development Center-Ministry of Forestry, Republic of Indonesia for the support and cooperation of this research collaboration.
Author information
Authors and Affiliations
Contributions
This paper was created by the team’s collaboration. Risnandar is the main contributor and the others are supporting contributors, as follows: (1)Conceptualization: Risnandar and Esa Prakasa; (2)Methodology: Risnandar; (3)Software: Risnandar and Esa Prakasa; (4)Validation: Iwan Muhammad Erwin and Elli Ahmad Gojali; and (5)Resources: Puji Lestari and Herlan.
Corresponding author
Ethics declarations
Conflict of interest
On behalf of all authors and contributors, the corresponding author states that there is no conflict of interest in this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Risnandar, Prakasa, E., Erwin, I.M. et al. Deep salient wood image-based quality assessment. SN Appl. Sci. 2, 1034 (2020). https://doi.org/10.1007/s42452-020-2671-x
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s42452-020-2671-x