Light field super-resolution using complementary-view feature attention

Light field (LF) cameras record multiple perspectives by a sparse sampling of real scenes, and these perspectives provide complementary information. This information is beneficial to LF super-resolution (LFSR). Compared with traditional single-image super-resolution, LFSR can exploit the parallax structure and perspective correlation among different LF views. However, the performance of existing methods is limited, as they fail to deeply explore the complementary information across LF views. In this paper, we propose a novel network, called the light field complementary-view feature attention network (LF-CFANet), to improve LFSR by dynamically learning the complementary information in LF views. Specifically, we design a residual complementary-view spatial and channel attention module (RCSCAM) to effectively exchange complementary information between complementary views. Moreover, RCSCAM captures the relationships between different channels, and it is able to generate informative features for reconstructing LF images while ignoring redundant information. Then, a maximum-difference information supplementary branch (MDISB) is used to supplement information from the maximum-difference angular positions based on the geometric structure of LF images. This branch can also guide the process of reconstruction. Experimental results on both synthetic and real-world datasets demonstrate the superiority of our method. The proposed LF-CFANet reconstructs more faithful details with higher SR accuracy than state-of-the-art methods.


Introduction
Light field (LF) cameras, e.g., Lytro and Raytrix, provide 4D LF images, unlike conventional cameras, and LF imaging technology has thus been widely used in many applications, such as VR [1,2], tracking [3-6], 3D reconstruction [6,7], and saliency detection [8,9]. As shown in Fig. 1(a), these cameras place a microlens array between the main lens and the sensor to provide multiple views of a scene. LF images, captured by a handheld LF camera [1,2], record spatial information (accumulation from the same object point) and angular information (intensity values for all ray directions). However, due to the limitation of sensor resolution, the spatial resolution of LF images is much lower than that of commercial 2D cameras. Therefore, image super-resolution (SR) technology plays an important role in LF applications, as it effectively enhances the quality of LF images.
LF super-resolution (LFSR) is an ill-posed problem. It can be addressed by making efficient use of sub-pixel information from different views to reconstruct SR images. Traditional methods generally solve the SR problem using multiple views based on prior disparity information, such as a Bayesian framework [10], a variational framework [11,12], or a Gaussian mixture framework [13]. However, these methods suffer from inaccurate prior disparity estimation and high computational cost. With the development of deep learning, learning-based methods [14-17] have been used to handle the complex 4D structure of LF data and to improve results compared to traditional approaches. Although improvements have been continuously made [18,19], the inherent complementary information provided by sub-aperture images (SAIs) is still not fully utilized, because parallax information is treated equally for each view, and feature fusion between complementary views is inadequate. These issues limit the improvement of LFSR methods.
Taking advantage of the attention mechanism in SR networks [20-22], we propose a spatial and channel attention network, namely the light field complementary-view feature attention network (LF-CFANet), to improve the spatial resolution of LF images. As shown in Fig. 2, this network consists of two main modules, the residual complementary-view spatial and channel attention module (RCSCAM) and the maximum-difference information supplementary branch (MDISB). Specifically, the RCSCAM is designed to fuse the complementary information among pairs of LF images. With RCSCAM, reconstruction features can be combined with complementary sub-pixel information and local similarity information from different auxiliary views by computing an attention map. Meanwhile, providing this module with a channel attention mechanism allows it to capture global channel-level information by adaptively adjusting the response of the feature map for each channel. To guide LF reconstruction both effectively and efficiently, the MDISB is designed to obtain maximum-difference information from LF views. In MDISB, the features of a reference view and four auxiliary views are collected from a reservoir based on the maximum-difference angular positions. The maximum-difference feature is used to guide reconstruction of the reference view. Through these designs, the complementary information across LF views can be effectively utilized to reconstruct SR LF images.
Extensive experimental results using real-world and synthetic LF datasets demonstrate that the proposed method achieves quantitatively and qualitatively better results than state-of-the-art methods. Our contributions are summarized as follows:
1. We propose an RCSCAM to better exploit correlation cues for LF complementary-view pairs and generate effectively fused complementary-view features by introducing an attention mechanism. The RCSCAM consists of two types of attention: channel attention and spatial attention. Channel attention enhances the global perception of feature channels, while spatial attention strengthens the interaction of spatial information between complementary views.
2. We develop an MDISB to guide supplementation of the most informative difference for SR views by treating each perspective unequally. The information is provided from a reservoir by concatenating two feature pairs consisting of four maximum-difference fused features based on the angular position of LF images.
3. Our LF-CFANet can exploit the complementary information and the local similarity information from different auxiliary views at the pixel level based on the geometric characteristics of LF images. It uses attention maps to fuse this information to enhance the spatial resolution. Extensive experiments demonstrate the design effectiveness and improved results compared to state-of-the-art methods.
The rest of this paper is organized as follows. Section 2 gives a brief review of related work. Light fields and the architecture of our LF-CFANet are outlined in Section 3. We provide extensive analysis and experiments in Section 4, using synthetic and real-world datasets. Finally, Section 5 concludes the paper.

Fig. 2
Network architecture of the proposed LF-CFANet, comprising four parts: feature extraction, feature fusion, feature compression, and reconstruction. The input of our network is made up of sub-aperture images (SAIs). One view is randomly selected from the SAIs as a reference view, and the remaining images are taken as auxiliary view images. The output is an enhanced resolution version of the reference view. C denotes concatenation.

Related work
In this section, we review related work on both single image super-resolution (SISR) and LFSR.

Single image super-resolution
SISR reconstructs a high-resolution image from a blurry low-resolution (LR) image. This technology plays an important role in fields such as surveillance, satellite imaging, and microscope imaging. Several studies have reviewed SISR in detail [23,24]. Here, we review several recent advances. Deep learning has gradually become the dominant approach for SISR, and has significantly improved the reconstruction quality over traditional methods. Dong et al. [25,26] proposed the SR deep convolutional neural network (SRCNN), a seminal method in the field of SR. This simple and shallow model outperformed earlier work. Kim et al. [27] proposed a very deep convolutional network (VDSR) combined with residual learning, which was more efficient and achieved higher quality than Dong et al.'s work [25,26]. Note that VDSR obtained a larger receptive field by stacking filters, and the problem of slow convergence was solved by applying global residual learning. To better exploit intra-view information, more powerful models have been developed based on deep networks. Lim et al. [28] proposed an enhanced deep SR network (EDSR), which achieved substantially better results than previous methods by revising the residual module and multi-scale model [29]. Zhang et al. [30,31] proposed a residual dense network (RDN), which fully utilized the hierarchical features from all convolutional layers and provided better feature extraction than EDSR. Applying an attention mechanism, Zhang et al. [32] proposed a residual channel attention network (RCAN), which inserts a channel attention module to consider the interdependence of channels. Recently, Dai et al. [33] proposed a second-order attention network (SAN), which applies a trainable second-order attention module to capture spatial information. Both RCAN and SAN have achieved promising results in SISR reconstruction.
As shown in the above review, SISR methods efficiently and effectively reconstruct the spatial information for single images. However, these methods cannot directly handle correlations between multiple views, so they cannot be directly applied to LFSR.

LF super-resolution
For LFSR, a straightforward approach is to fine-tune the network parameters of an SISR model. However, LFSR requires using complementary information from multiple LF images of one scene to reconstruct a high-resolution image. Existing LFSR methods can be mainly divided into optimization-based approaches and learning-based approaches.
Optimization-based approaches reconstruct SR images based on the estimated disparities between different views. Bishop and Favaro [10] first used a Bayesian framework for LFSR. Wanner and Goldluecke [11,12] proposed a variational method for SR by introducing disparity maps obtained from epipolar plane images (EPIs). Mitra and Veeraraghavan [13] proposed a patch-based approach modeled by a Gaussian mixture model to solve LF problems. The framework of this method could handle many different processing tasks. Zhang et al. [34] proposed the patch-based PlenoPatch method, which generates more realistic results than previous methods. To better supplement complementary information and avoid costly disparity estimation, Rossi and Frossard [35] proposed an LFSR framework for homogeneous reconstruction of all views in the LF by using a graph-based regularizer. Later, Alain and Smolic [36] proposed a method to convert the inverse problem of LFSR into an optimization problem based on prior sparsity. Although these methods can encode the complex 4D LF well, they are not effective in combining the spatial information from different views. Moreover, most of these methods are based on handcrafted image priors, which limits the quality of reconstruction.
Learning-based approaches show superiority over optimization-based approaches in using complementary information from different views, which can improve the quality of LFSR. Yoon et al. [15,16] introduced CNNs to the field of LF (LFCNN), while Yuan et al. [14] proposed an SR method that fully exploited the structure of the LF with an SISR module and an EPI enhancement module. These modules captured the structural characteristics of the LF well. By extending BRCN [37], Wang et al. [38] proposed a bidirectional recurrent convolutional neural network, LFNet, and used stacked generalization techniques to synthesize the final sub-aperture images. In this network, the recurrent neural network was adapted to handle horizontal and vertical structures, so that spatial correlations between neighboring views could be modeled more effectively and flexibly. Inspired by the residual network, Zhang et al. [18] proposed a multi-branch residual network (resLF) to handle image stacks with consistent sub-pixel offsets; each branch could extract high-frequency details from LF images. In order to preserve the parallax structure, Jin et al. [39] proposed a two-step method for LF spatial super-resolution (LF-ATO) by introducing a perspective feature fusion module and a structural consistency regularization loss. More recently, Wang et al. [40] proposed LF-InterNet, which extracts spatial and angular information and gradually combines them. Mo et al. [41] proposed a dense dual-attention network (DDAN), with spatial attention within LF views and channel attention within different channels. This method achieved high-quality LF reconstruction.
In summary, these methods implicitly learn the internal correspondences of the LF structure, and they are gradually improving LFSR. However, due to the design of the network structure, complementary information is still not fully utilized. For example, LFNet uses a bidirectional recurrent network to fuse angular information among SAIs. This information only considers row and column directions, and it cannot be efficiently used to reconstruct LF images. Instead, we propose a complementary-view feature attention approach that uses the information from all auxiliary views to reconstruct the reference view.

Architecture of LF-CFANet
In this section, we introduce the 4D LF representation, and propose a many-to-one LFSR network. The architecture of our LF-CFANet is shown in Fig. 2. The feature fusion part is composed of two branches, MDISB and a reservoir branch.

Problem formulation
A 4D LF can be parameterized by two parallel planes. As shown in Fig. 1(b), the spatial plane, $\Pi = (h, w)$, and the angular plane, $\Omega = (u, v)$, are used to describe the structure of a 4D LF. Together, these planes represent the LF images $L(\Pi, \Omega)$. Following existing LFSR methods [39], we only use Y channel images as input, which are obtained by converting input RGB images to YCbCr images and retaining only the Y channel. The input can be denoted $L_{LR} \in \mathbb{R}^{U \times V \times H \times W}$, ignoring the channel dimension. The goal of the LFSR task is to generate an SR LF from the LR input LF. The reconstructed LF images are $L_{SR} \in \mathbb{R}^{U \times V \times \alpha H \times \alpha W}$, where $\alpha$ is the upsampling rate.
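The input preparation and the shape relation between the LR and SR light fields can be sketched as follows. This is only an illustration: the BT.601 luma weights are an assumption (the text does not state the exact RGB-to-YCbCr constants or value range), and the function names `rgb_to_y` and `sr_shape` are hypothetical.

```python
def rgb_to_y(r, g, b):
    """Luma of one RGB pixel with values in [0, 1].

    Assumption: full-range BT.601 weights; the paper's exact
    conversion constants are not given in the text.
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


def sr_shape(lr_shape, alpha):
    """Map an LR light-field shape (U, V, H, W) to the SR shape.

    Only the spatial dimensions H and W are scaled by the
    upsampling rate alpha; the angular dimensions are unchanged.
    """
    u, v, h, w = lr_shape
    return (u, v, alpha * h, alpha * w)


# Example: a 5 x 5 light field of 32 x 32 views, upsampled by 2 and 4.
shape_x2 = sr_shape((5, 5, 32, 32), 2)
shape_x4 = sr_shape((5, 5, 32, 32), 4)
```

Only the spatial dimensions change, which matches the definition of $L_{SR}$ above.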

Feature extraction
Discriminative features with rich contextual information are very useful for SR reconstruction. Such features can be obtained by using multi-scale receptive fields and feature learning. Therefore, the feature-extraction module of our LF-CFANet follows [21,48] and uses an atrous spatial pyramid pooling (ASPP) module to extract the LF image features. Figure 2 shows the overall network architecture of the proposed LF-CFANet. The input $L_{LR}$ is composed of SAIs. The initial features (with 64 channels) of $L_{LR}$ are extracted by a 3 × 3 convolution, shown in Fig. 2(a), and then we use the multi-cascaded residual ASPP (MRASPP) module in Fig. 2(b) for multi-scale feature extraction to support downstream processing. Specifically, the initial features of the LF views are first fed to the ASPP blocks, which share weights for each view. Each ASPP block consists of three different dilated convolutions with a leaky ReLU layer. These dilated convolutions, with dilation rates (D) 1, 2, and 4, are used to extract $L_{LR}$ features with different receptive fields. After a leaky ReLU layer, we concatenate the three output features and compress the number of channels through a 1 × 1 convolution to make them more compact. These ASPP blocks not only obtain multiple receptive fields without changing the size of the feature maps, but also enrich the diversity of the convolutions. After three cascaded residual ASPP blocks, the feature of each view is extracted. These features can be expressed as
$$F^{n}_{each} = f_0(L_{LR}),$$
where $f_0$ represents the MRASPPBlock and $n$ is the number of SAIs. For the output of the MRASPPBlock, $F^{n}_{each}$, the reference feature is randomly selected from the $n$ output features, and the auxiliary features are the remaining features. These two types of features can be expressed as
$$F_{ref} = F^{i}_{each}, \quad F^{j}_{aux} = F^{j}_{each},$$
where $i, j$ ($1 \le i, j \le U \times V$, $i \ne j$, $i + j = n$) represent the angular positions. There is one $i$-indexed feature, and there are $(U \times V - 1)$ $j$-indexed features.
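To make the effect of the three dilation rates concrete, the following sketch computes the effective kernel extent of each 3 × 3 dilated convolution and the zero padding that keeps the feature-map size unchanged. It uses the standard dilated-convolution arithmetic with stride 1; the helper names are hypothetical and not taken from the paper.

```python
def effective_kernel(kernel_size, dilation):
    """Spatial extent covered by a dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def same_padding(kernel_size, dilation):
    """Zero padding per side that preserves size (stride 1, odd kernel)."""
    return dilation * (kernel_size - 1) // 2


def output_size(size, kernel_size, dilation, padding):
    """Output length of a stride-1 convolution along one axis."""
    return size + 2 * padding - effective_kernel(kernel_size, dilation) + 1


# The three ASPP branches described above: 3 x 3 kernels, D = 1, 2, 4.
extents = [effective_kernel(3, d) for d in (1, 2, 4)]
paddings = [same_padding(3, d) for d in (1, 2, 4)]
sizes = [output_size(64, 3, d, same_padding(3, d)) for d in (1, 2, 4)]
```

With these settings the branches cover 3, 5, and 9 pixels per axis while all producing maps of the input size, so their outputs can be concatenated along the channel dimension before the 1 × 1 compression.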

RCSCAM in reservoir branch
The feature fusion part includes two branches. The first branch is a reservoir branch, and the second branch is MDISB. The reservoir branch is the key to fusing auxiliary-view information with reference-view information, using the RCSCAM (Residual Complementary-View Spatial and Channel Attention Module). Inspired by stereo-attention mechanisms [22,49], we develop an RCSCAM to supplement the sub-pixel information of the reference view.
As Fig. 2(c) shows, the input pair of features $(F_{ref}, F^{j}_{aux})$ is separately fed to two ResBlocks, $f_1$, with 64 channels. These two ResBlocks share weights.
The output features of $f_1$ are $F_r, F^{j}_{a} \in \mathbb{R}^{H \times W \times 64}$. To explore the correlation between feature channels, we introduce SEBlocks following Ref. [20]. Pseudocode to capture the channel attention is provided in Algorithm 1. This block processes the input feature in three steps: squeeze, excite, and reweight. $F_r$ and $F^{j}_{a}$ are respectively fed to global adaptive pooling $(F^{1}_{sq}, F^{1,j}_{sq})$ to obtain feature channels with 1 × 64 aggregated information. To capture channel-wise dependencies, two fully-connected (FC) layers are used. The output weights of the excitation process represent the importance of each feature channel. They are applied to each channel by multiplication. These processes are denoted $(F^{1}_{ex}, F^{1,j}_{ex})$. Then, the outputs are separately fed to 1 × 1 convolutions to generate the feature maps $(F^{1}_{r}, F^{1,j}_{a})$. These outputs can be expressed as
$$F^{1}_{r} = H_{\alpha}\big(f_{SE_1}(F_r)\big), \quad F^{1,j}_{a} = H_{\beta}\big(f_{SE_2}(F^{j}_{a})\big),$$
where $f_{SE_1}$ and $f_{SE_2}$ represent the SEBlocks, and $H_{\alpha}$ and $H_{\beta}$ represent the 1 × 1 convolutions.

Algorithm 1 Squeeze and excite blocks
1: Squeeze: Each feature channel becomes a number, which has a global receptive field.
2: Excite and Reweight: Each feature channel generates a weight to represent its importance. The weight output by Excite is regarded as the importance of each feature channel, and is applied to each channel by multiplication.
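The squeeze, excite, and reweight steps can be sketched in plain Python on a tiny feature map stored as a list of channels. This is a minimal illustration, not the authors' implementation: the ReLU and sigmoid activations follow the usual squeeze-and-excitation design and are assumptions here, and the weight matrices `w1` and `w2` are made-up values.

```python
import math


def squeeze(feature):
    """Global average pooling: one number per channel (C x H x W -> C)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature]


def fully_connected(vec, weights):
    """Matrix-vector product; weights has one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def excite(pooled, w1, w2):
    """Two FC layers: ReLU after the first, sigmoid after the second."""
    hidden = [max(0.0, h) for h in fully_connected(pooled, w1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in fully_connected(hidden, w2)]


def se_block(feature, w1, w2):
    """Reweight every channel by its learned importance."""
    scale = excite(squeeze(feature), w1, w2)
    return [[[scale[c] * v for v in row] for row in ch]
            for c, ch in enumerate(feature)]


# Toy example: 2 channels of size 2 x 2, one hidden unit (hypothetical weights).
feature = [[[1.0, 1.0], [1.0, 1.0]],
           [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[0.5, 0.5]]        # 2 channels -> 1 hidden unit
w2 = [[0.0], [1.0]]      # 1 hidden unit -> 2 channel weights
pooled = squeeze(feature)
reweighted = se_block(feature, w1, w2)
```

Each channel is scaled by a single weight in (0, 1), so the spatial layout of the feature map is unchanged while informative channels are emphasized.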
To generate a reference-auxiliary attention map, $F^{1,j}_{a}$ is first transposed to $(F^{1,j}_{a})^{T}$, and then this geometry-aware matrix is multiplied by the matrix $F^{1}_{r}$. The product of these two matrices is processed by softmax to produce the final attention map, $M^{j}_{aux \to ref} \in \mathbb{R}^{H \times W \times W}$. Similarly, $M^{j}_{ref \to aux}$ is generated. This process can be expressed as Eq. (4):
$$M^{j}_{aux \to ref} = \mathrm{softmax}\big(F^{1}_{r} \otimes (F^{1,j}_{a})^{T}\big), \quad M^{j}_{ref \to aux} = \mathrm{softmax}\big(F^{1,j}_{a} \otimes (F^{1}_{r})^{T}\big),$$
where $\otimes$ represents batch-wise matrix multiplication. To combine feature information between the reference view and the auxiliary view, $W^{j}_{ref \to aux}$ and $W^{j}_{aux \to ref}$ are generated by multiplying the input pair of features $(F_{ref}, F^{j}_{aux})$ and the attention maps $(M^{j}_{ref \to aux}, M^{j}_{aux \to ref})$, respectively. Both $W^{j}_{ref \to aux}$ and $W^{j}_{aux \to ref}$ contain the reference-view and auxiliary-view information. They can be computed using
$$W^{j}_{ref \to aux} = M^{j}_{ref \to aux} \otimes F_{ref}, \quad W^{j}_{aux \to ref} = M^{j}_{aux \to ref} \otimes F^{j}_{aux}.$$
As Fig. 2 shows, these two features $(W^{j}_{ref \to aux}, W^{j}_{aux \to ref})$ are fed into two new SEBlocks to generate reweighted versions of these features. To retain the original features of the reference and auxiliary views, the input pair of features $(F_{ref}, F^{j}_{aux})$ is concatenated with the reweighted features and fed into another 1 × 1 convolution. Here, Cat denotes the concatenation operator, and $H_{\gamma}$ and $H_{\delta}$ denote the 1 × 1 convolutions that fuse the two concatenated features.
In the training process, the reference-view feature $F_{ref}$ is generated by randomly selecting from the initial features. Due to the complex geometric structure of LF images, the fusion features $F^{out,j}_{ref}$ obtained by RCSCAMs contain complementary information and local similarity information from different auxiliary views. The principle of RCSCAM is to obtain the feature similarities for all possible disparities between each pixel in the reference view and the auxiliary view to generate an attention map. By introducing the attention mechanism, we can fully fuse the complementary information at the feature level for SR reconstruction.
The effectiveness of RCSCAM is demonstrated in Section 4.3.
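The row-wise attention described in this subsection can be sketched as follows. For every image row, each reference pixel is compared with all auxiliary pixels in the same row, the similarities are normalized by softmax, and the resulting weights are used to gather auxiliary features, giving an attention map of size H × W × W. This is a simplified sketch under stated assumptions: it omits the ResBlocks, SEBlocks, and 1 × 1 convolutions, and the function names and toy values are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map(query, key):
    """query, key: H x W x C nested lists -> attention of shape H x W x W.

    Entry [h][i][k] weights auxiliary pixel k for reference pixel i in row h.
    """
    return [[softmax([sum(q * k for q, k in zip(q_pix, k_pix))
                      for k_pix in k_row])
             for q_pix in q_row]
            for q_row, k_row in zip(query, key)]


def apply_attention(att, value):
    """att: H x W x W, value: H x W x C -> gathered features H x W x C."""
    out = []
    for a_row, v_row in zip(att, value):
        channels = len(v_row[0])
        out.append([[sum(a * v_pix[c] for a, v_pix in zip(weights, v_row))
                     for c in range(channels)]
                    for weights in a_row])
    return out


# Toy example with H = 1, W = 3, C = 2 (made-up values).
f_ref = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]]
f_aux = [[[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]]
m_aux_to_ref = attention_map(f_ref, f_aux)
w_aux_to_ref = apply_attention(m_aux_to_ref, f_aux)
```

Because the softmax runs over all W positions of the auxiliary row, no explicit disparity estimate is needed; the map implicitly covers every candidate horizontal shift.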

MDISB
As the second branch of our feature fusion, MDISB (Maximum-Difference Information Supplementary Branch) selects four maximum-difference fusion features to guide reference-view reconstruction. This branch chooses, from the reservoir, the four fusion features with maximum-difference information relative to the reference view. After RCSCAM, each pair of the reference view and an auxiliary view generates one fusion feature, so the total number of fusion features is $n_1 = U \times V - 1$. Due to the parallax structure of the LF, the difference information in each auxiliary view varies, and supplements the reference-view information. The four selected angular-position features are fusion features that supplement the complementary-view information to $F_{ref}$ by using RCSCAM. As Fig. 2(a) shows, we concatenate these four features and compress them using a 3 × 3 convolution. The depth of the final feature is 64.
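One plausible reading of the selection step is sketched below: the auxiliary views are ranked by their angular distance from the reference view, and the four farthest are kept. This ranking criterion is an assumption made for illustration, since the text only states that the selection is based on maximum-difference angular positions; the function name and the tie-breaking rule are hypothetical.

```python
def max_difference_views(ref, u_size, v_size, count=4):
    """Return the `count` auxiliary angular positions farthest from `ref`.

    Assumption: "difference" is measured by squared Euclidean distance
    in the angular plane; ties are broken by position so the result
    is deterministic.
    """
    ru, rv = ref
    candidates = [(u, v) for u in range(u_size) for v in range(v_size)
                  if (u, v) != ref]
    candidates.sort(key=lambda p: (-((p[0] - ru) ** 2 + (p[1] - rv) ** 2), p))
    return sorted(candidates[:count])


# 5 x 5 light field with the central view as reference (0-based indices).
selected = max_difference_views((2, 2), 5, 5)
# Every auxiliary view yields one fusion feature: U * V - 1 in total.
all_aux = max_difference_views((2, 2), 5, 5, count=5 * 5 - 1)
```

For a central reference view this rule picks the four corner views, which have the largest baseline relative to the reference.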

Feature compression
Feature compression reduces the feature depth to match the input expected by the reconstruction part. We use ResBlocks to process each feature, and the processed features are then stacked, where Stack represents feature stacking.

Reconstruction
Inspired by the architecture of Ref. [50] for SISR, we use a similar structure to reconstruct the SR images. Following the method of Ref. [19], the feature $F^{out,ref}_{all}$ from the compression module is first reshaped and processed by a SASBlock. The SASBlock is repeated 3 times to integrate angular and spatial domain information. The output feature is fed into two ResBlocks with 64 channels. One ResBlock (with two residual blocks) provides channel-wise view fusion. The other ResBlock (with three residual blocks) provides channel fusion to generate the final reference-view feature $F^{out,ref}_{ref}$.
To save memory and computation, we utilize an up-sampling block Up(·) to increase the resolution of the reference-view image $L^{ref}_{SR}$. This block, inspired by Ref. [18], is composed of a convolution layer, a shuffle layer, and a convolution layer, in that order. Finally, $L^{ref}_{SR}$ is generated by adding the residual map to the up-sampled image, where Up represents the process of reconstruction for one angular position. To simplify our LFSR network, we follow the approach in Fig. 3. We randomly choose a view as the reference view and feed all views into our network. Through feature extraction, feature fusion, feature compression, and reconstruction, our network can fully learn the differential sub-pixel information from the auxiliary views. This information can be added to the reference view for reconstruction.
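The shuffle layer in the up-sampling block can be illustrated with a plain-Python pixel shuffle, which rearranges groups of r × r channels into an r-times larger spatial grid. This sketch assumes that the shuffle layer is the common sub-pixel (pixel shuffle) operation with the usual channel ordering; the surrounding convolutions and the residual addition are omitted, and the function name is hypothetical.

```python
def pixel_shuffle(feature, r):
    """Rearrange a (C*r*r) x H x W feature into C x (H*r) x (W*r).

    Channel c*r*r + i*r + j of the input fills output positions
    (y*r + i, x*r + j) of output channel c.
    """
    c_in, h, w = len(feature), len(feature[0]), len(feature[0][0])
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):
            for j in range(r):
                src = feature[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


# Four 1 x 1 channels become one 2 x 2 channel for an upscaling factor of 2.
low_res = [[[1.0]], [[2.0]], [[3.0]], [[4.0]]]
high_res = pixel_shuffle(low_res, 2)
```

Because the rearrangement has no parameters, the preceding convolution only has to produce r × r times more channels at low resolution, which is where the memory and computation savings come from.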

Experiments
In this section, we first introduce the datasets used and implementation details. Then, we compare our LF-CFANet with several state-of-the-art SISR and LFSR methods. Finally, we conduct ablation studies to evaluate the contribution of individual component modules in our network.

Datasets
Our LF images come from both synthetic datasets and real-world datasets. The real-world datasets were captured by various devices with different baseline lengths. Therefore, LF algorithms should be able to adapt to different datasets. As listed in Table 1, 6 public LF datasets (EPFL [42], HCInew [43], HCIold [44], INRIA [45], STFgantry [46], and STFlytro [47]) were used for training and testing in our experiments, which include a total of 394 LF scenes for training

Implementation details
Our network has two types of convolutional layers: 3 × 3 and 1 × 1. All the 3 × 3 convolutional layers were zero-padded to retain the spatial resolution, and the ResBlocks contained 2, 2, and 3 residual blocks, in order. The feature depths of the residual blocks were all 64.
In the training stage, we randomly cropped the input LF images to a spatial size of 64 × 64 and augmented them by random horizontal or vertical flipping and rotation by 90°. The upscaling factor r was 2 or 4, and we trained a separate network for each factor. We trained our network with the Adam optimizer ($\beta_1 = 0.9$, $\beta_2 = 0.999$). The initial learning rate was set to $10^{-4}$ and decreased by a factor of 0.5 every 250 epochs.
Training of the full LF-CFANet was stopped after 600 epochs.
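The learning-rate schedule described above can be written as a small step-decay function. This is a sketch that assumes epochs are counted from 0 and that the halving takes effect at epochs 250 and 500; the function name is hypothetical.

```python
def learning_rate(epoch, base_lr=1e-4, decay=0.5, step=250):
    """Step decay: multiply the rate by `decay` once every `step` epochs."""
    return base_lr * decay ** (epoch // step)


# Rates at the start of training, after each decay, and at the last epoch.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 249, 250, 500, 599)}
```

Over the 600 training epochs the rate is therefore halved twice, ending at a quarter of its initial value.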

Quantitative comparison results
A quantitative evaluation of PSNR and SSIM at 5 × 5 angular resolution for the 6 test datasets is given in Table 2. Our method achieved higher PSNR and SSIM than the RCAN [32] SISR method. Specifically, our method had an average PSNR increase of 1.7 dB (×2) and 1.2 dB (×4) in the test. That is because complementary information can be used effectively in [...] HCIold). That is because our LF-CFANet is based on feature fusion driven by the attention mechanism, which is sensitive to the disparity. Due to different angular resolutions, the PSNR of each view in SAIs is not identical. Figures 5 and 6 compare the PSNR of individual SAIs for LFSSR, LF-ATO, LF-InterNet, and our method. Compared with the other many-to-one approach (LF-ATO), our approach shows significant performance improvements, as shown by Fig. 6. Although LFSSR, LF-ATO, and LF-InterNet can use the angular information from all input views to super-resolve each view, the gap among maximum-difference views of our method is much smaller than for other methods. That is because our method introduces MDISB to reduce the information degradation of maximum-difference views. The reconstruction quality of LF-CFANet is slightly higher than that of other LFSR methods. The computational load of some state-of-the-art methods (EDSR, resLF, LF-ATO, LF-InterNet) is presented in Table 4. Note that our method consumes fewer computational resources but achieves the best results compared with SISR and LFSR methods, especially for the reference view. Our LF-CFANet has higher computational efficiency than LF-InterNet, because LF-InterNet processes all LR LF views simultaneously. In contrast, our method reconstructs the reference view by supplementing the complementary information from auxiliary views. Note that the PSNR for our reference view is much higher than the average PSNR for LF-InterNet: see Fig. 5.
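For readers who want to relate the reported decibel gains to reconstruction error, the PSNR metric can be sketched as follows. The sketch uses the standard definition on images with values in [0, 1]; the evaluation protocol actually used for Table 2 (value range, border cropping, averaging over views) is not described in the text, so those details are assumptions.

```python
import math


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2
             for row_r, row_e in zip(reference, estimate)
             for a, b in zip(row_r, row_e)]
    mse = sum(diffs) / len(diffs)
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio(gain_db):
    """Factor by which the mean squared error shrinks for a PSNR gain."""
    return 10.0 ** (gain_db / 10.0)


# Toy 2 x 2 images that differ by 0.1 everywhere (made-up values).
ground_truth = [[0.5, 0.5], [0.5, 0.5]]
reconstruction = [[0.6, 0.6], [0.6, 0.6]]
score = psnr(ground_truth, reconstruction)
```

Under this definition, a PSNR gain of 1.7 dB corresponds to a mean squared error that is smaller by a factor of `mse_ratio(1.7)`, which is close to 1.5.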

Qualitative comparison
We provide a visual comparison of the results of different methods in Fig. 4 for ×2 and ×4 LFSR. Our LF-CFANet can recover fine details and textures, such as the letters in the STFgantry cards. In contrast, the other methods lose most high-frequency details in the reconstructed results. Compared with our method, VDSR and SAN, state-of-the-art SISR methods, produce poor details, because they lack the complementary information to supplement image reconstruction. Although the resLF, LF-SAS, LF-ATO, and LF-InterNet methods generate better results than SISR methods, they do not effectively make use of complementary information in the LFSR process. Our method can effectively and efficiently reconstruct LF images by using a channel and spatial attention mechanism. Figures 7 and 8 further compare the LF parallax structure produced by LFSR methods: our method produces clearer and straighter lines than the other LFSR methods, and can thus preserve the structural characteristics of LFs.

Ablation study
We conducted several experiments to evaluate the results using different architectures.

Effectiveness of MRASPP
The MRASPP is used to extract discriminative features. We used variants of LF-CFANet (onlyMRASPP and rmMRASPP) to show the effectiveness of the MRASPP. The results are given in Table 3. As expected, removing MRASPP (rmMRASPP) decreased the PSNR by 0.06 dB. That is because MRASPP extracts features at different scales, which can make the feature representations more robust. Moreover, discriminative features with rich context information can be extracted by using the multiple receptive fields of atrous convolutions. Therefore, our model can obtain accurate features to reconstruct the LF.

Effectiveness of MDISB
MDISB is used to guide reference-view reconstruction. To validate the effectiveness of the MDISB, we removed this block, and again show the results in Table 3. rmMDISB suffered a 0.04 dB PSNR decrease compared to LF-CFANet. That is because this block can enhance the influence of angular-position features in the process of reconstructing the reference-view image. Recall that in Eq. (8), we select four angular-position features with maximum-difference information according to the structure of the LF to boost the impact of maximum-difference views. We also evaluated the performance of MDISB with different angular positions for auxiliary views; see Table 5 and Fig. 9. The reconstruction accuracy consistently improved as the degree of differentiation of the information increased. Table 5 shows that MDISB (maximum) had the best result: MDISB

Effectiveness of RCSCAM
The RCSCAM plays a key role in our LF-CFANet. This module enhances the exploitation of complementary information between the reference view and the complementary view by introducing an attention mechanism. For comparison, we replaced our RCSCAM with simple feature concatenation. As shown in Table 5, this block had a significant influence on the result: without it, the PSNR decreased by 0.13 dB. Without a spatial and channel attention mechanism, complementary information from cross-parallax images cannot be effectively learned to supplement the reference view.

Conclusions
In this paper, we propose the complementary-view feature attention network (LF-CFANet) for LFSR. The main contribution of our method is the fusion of complementary-view information by using RCSCAM and MDISB. For RCSCAM, we use spatial and channel attention to effectively extract complementary-view feature information to supplement the reference view. To guide reference-view reconstruction, MDISB is proposed to supplement the most differentiated feature-level information. Our experiments show that MDISB works well in the reconstruction process, allowing the reference-view image to be effectively and efficiently reconstructed. Our method achieves state-of-the-art LFSR results in both quantitative and qualitative evaluations, and it is more robust for real-world scenes. It is worth noting that the quality of the supplementary information from MDISB is crucial and improves the reconstruction accuracy. Therefore, a further study of the maximum-difference views is needed, and we could possibly use fewer views to reconstruct the whole set of LF views. In future work, we aim to use an encoder and decoder framework to improve the quality of feature fusion with fewer LF views, providing a further step toward consumer applications.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Other papers from this open access journal are available free of charge from http://www.springer.com/journal/41095. To submit a manuscript, please go to https://www.editorialmanager.com/cvmj.