Introduction

The concept of steganography has its origins in the Greek language, specifically from two Greek words: “steganos,” meaning “covered” or “protected,” and “graphia,” meaning “writing” or “drawing.” These words combined to form “steganographia,” which can be interpreted as “covered writing” or “hidden writing.” Over time, the term evolved into “steganography” as we know it today. Steganography refers to the technique of concealing information within various forms of media or carriers. It is considered an art and science of hiding information within seemingly innocent carriers like images, audio files, or text, without arousing suspicion. Throughout history, steganography has been used to covertly transmit sensitive information or maintain secret communication channels. Ancient methods included hiding messages in wax tablets, tattooing them on messengers’ shaved heads, or using invisible ink. One of the earliest recorded instances of steganography dates back to the Greek historian Herodotus, who described a method where messages were written on a slave’s shaved head, allowing the hair to regrow before reaching the intended recipient [1]. Steganography techniques involve hiding information in different types of media. Image steganography involves embedding information within the pixels of digital images, using methods like least significant bit (LSB) insertion or spread spectrum techniques. Audio steganography hides information within audio files, utilizing characteristics of sound such as phase coding or audio masking. Text steganography conceals information within textual content, employing methods like invisible ink, modifying font styles, or utilizing hidden spaces or punctuation marks. As steganography techniques evolve, methods of steganalysis also develop to detect and analyze hidden messages. Steganalysis [2] involves statistical analysis, machine learning algorithms, and forensic techniques to identify the presence of steganographic content. 
Common steganalysis methods include statistical analysis of carrier files, machine learning-based approaches using algorithms like support vector machines (SVMs) [3] or artificial neural networks (ANNs) [4], and visual inspection by trained experts to identify anomalies [5]. Peak signal-to-noise ratio (PSNR) [6], structural similarity index (SSIM) [7], and mean squared error (MSE) [8] are metrics commonly used in image and video processing to assess the quality or fidelity of a reconstructed or compressed signal compared to the original signal. The work presented in [9] proposed a steganography technique using pixel swapping-quantum Hilbert image scrambling and the discrete wavelet transform (DWT), followed by a stego image smoothing operation. The first step selects the most suitable cover image for the secret image by comparing the secret image with the cover images in the sender’s cover image database. Before embedding (encoding), the secret image is scrambled, and the DWT is then applied to both the scrambled image and the cover image. The two resulting images are then combined to generate a new image, the stego image. The quality of the stego image is increased using a pixel swapping-based operation. The decoding process is the reverse of the encoding process, followed by the application of a convolutional neural network (CNN) to improve the quality of the extracted secret image. Two metrics were employed to show the effectiveness of the technique, namely, PSNR and normalized cross-correlation (NCC). The experiments used three publicly available photos, and the results showed an improvement of around 2 dB over existing approaches, both before and after the smoothing operation. In [10], a novel least significant bit (LSB) substitution steganography approach is presented. Details about other recently published relevant approaches were also given.
The image is flipped, transformed, and partitioned into the three color channels. The red, green, and blue channels are employed to embed the secret message; the blue channel is shuffled using Magic Matrix, a MATLAB built-in function. A multi-level encryption algorithm (MLEA) is deployed to increase the robustness of the approach. Twelve colored images, nine gray-scale images, nine texture images, and nine aerial images were utilized to test the performance of the technique. All images in the first set were tested with varying dimensions (e.g., \(128 \times 128, 256 \times 256, 512\times 512,\) and \(1024 \times 1024\)). The size of the secret message was one of the following values: 2, 4, 6, 8, 10, 12, 14, and 16 KB. The metrics employed to assess the technique were PSNR, MSE, structural similarity index (SSIM), and normalized cross-correlation (NCC). The results showed that the approach outperformed the existing techniques it was compared with. More details about the results in that paper are given in the “Results” section. In one research paper [11], existing approaches, current trends, and obstacles in steganography research were examined. The paper analyzed publicly accessible databases and evaluation measures commonly used in these studies. It also conducted a comparative analysis of different methods and discussed their identified gaps, advantages, and disadvantages. Another paper [12] introduced an image steganography approach based on k least significant bits (LSB) coding. The proposed method concealed an image by utilizing a specific number of least significant bits. The paper compared this approach with other state-of-the-art methods. In another related work [13], a robust and secure video steganographic algorithm was proposed, utilizing discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains, motion-based tracking, and error correcting codes. 
The paper demonstrated improved embedding capacity, imperceptibility, security, and robustness against various attacks. A novel technique for image steganography based on Huffman Encoding was presented in [14], showcasing high capacity, good invisibility, and satisfactory security.

In this paper, a new image steganography insertion approach is introduced. The approach divides the cover image into non-overlapping blocks and transforms them using the 2D discrete wavelet transform (DWT). An adaptive algorithm [15,16,17,18] determines the weights of each coefficient in the wavelet domain for each block. The block whose coefficients have lower total weights than the others is selected. The secret data is embedded in the least significant bit (LSB) of the binary representation of the selected block. The block is converted back to the decimal representation, and the inverse DWT is applied to obtain the stego image. The performance of the proposed technique is evaluated using MSE, PSNR, and human visual inspection. A comparison is made with two other techniques: spatial LSB and energy-based DCT insertion. The impact of the size of the cover image block is also examined. The experimental results demonstrate that the proposed approach outperforms the other techniques when tested with samples from BOSSBase [19] and a customized database. The remaining sections of the paper are organized as follows: “Methods” presents details about steganography techniques, “Proposed technique” explains the proposed technique, “Results” presents the results, “Results discussion” contains the discussion, and “Conclusions” summarizes the conclusions.

Methods

This section provides an overview of different techniques used in steganography. These techniques operate either in the spatial domain or in a transform domain, here the discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains.

Spatial domain LSB insertion

The spatial domain LSB insertion technique is a commonly used method in steganography. It involves hiding information within the least significant bit of a pixel or a sample in an image or audio signal. The least significant bit is chosen because altering it has minimal impact on the overall perception of the signal. In this technique, the binary representation of the secret message is embedded by replacing the least significant bit of selected pixels or audio samples with the corresponding bits from the message. This process is repeated until all bits of the secret message are embedded. By modifying only the LSB, the changes introduced to the carrier signal are generally imperceptible to humans. However, it is important to consider the capacity of the carrier signal to ensure that the secret message can be embedded without causing noticeable artifacts. While LSB insertion provides a simple method for steganographic data hiding, it may be vulnerable to detection by steganalysis techniques that analyze statistical properties or deviations from expected patterns in the carrier signal. To enhance security, additional techniques like encryption and more advanced steganographic methods can be employed.
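The LSB replacement described above can be illustrated with a minimal NumPy sketch. It assumes an 8-bit gray-scale cover and embeds the message bits sequentially in row-major pixel order; the function names (`embed_lsb`, `extract_lsb`, `text_to_bits`, `bits_to_text`) are hypothetical helpers introduced only for illustration, and pixel selection in a practical system may differ.

```python
import numpy as np

def text_to_bits(text):
    """Convert a UTF-8 string to a flat array of 0/1 bits."""
    return np.unpackbits(np.frombuffer(text.encode("utf-8"), dtype=np.uint8))

def bits_to_text(bits):
    """Inverse of text_to_bits (expects a multiple of 8 bits)."""
    return np.packbits(bits).tobytes().decode("utf-8")

def embed_lsb(cover, bits):
    """Replace the LSB of the first len(bits) pixels, in row-major order."""
    flat = cover.astype(np.uint8).ravel()  # astype returns a copy
    bits = np.asarray(bits, dtype=np.uint8)
    if bits.size > flat.size:
        raise ValueError("message does not fit in the cover image")
    # Clear the least significant bit, then write the message bit into it.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    """Read back the LSB of the first n_bits pixels."""
    return stego.ravel()[:n_bits] & 1

# Toy example: a random 8x8 cover and a two-character message (assumed data).
rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
secret = "hi"
bits = text_to_bits(secret)
stego = embed_lsb(cover, bits)
recovered = bits_to_text(extract_lsb(stego, bits.size))
```

Because only the lowest bit changes, every stego pixel differs from its cover pixel by at most one gray level, which is why the distortion is usually imperceptible.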

Discrete wavelet transform (DWT)

The wavelet process, as described in [20, 21], produces four frequency bands: LL (low pass–low pass), LH (low pass–high pass), HL (high pass–low pass), and HH (high pass–high pass), which are combined together in a matrix. When this process is applied to 2D signals such as images, a single-level discrete wavelet transform (DWT) decomposition involves a scaling function \(\varphi (x,y)\) and three wavelets denoted \(\psi ^H(x,y)\), \(\psi ^V(x,y)\), and \(\psi ^D(x,y)\). These functions are computed as follows:

$$\begin{aligned} \varphi (x,y)={} & {} \varphi (x)\varphi (y)\end{aligned}$$
(1)
$$\begin{aligned} \psi ^H(x,y)={} & {} \psi (x)\varphi (y)\end{aligned}$$
(2)
$$\begin{aligned} \psi ^V(x,y)={} & {} \varphi (x)\psi (y)\end{aligned}$$
(3)
$$\begin{aligned} \psi ^D(x,y)={} & {} \psi (x)\psi (y) \end{aligned}$$
(4)

In this context, the scaling function \(\varphi (x,y)\) represents the low-frequency component, or LL band, which captures the overall variations in the image. The wavelet \(\psi ^H(x,y)\) measures variations along the columns, which form the LH band. Similarly, the wavelet \(\psi ^V(x,y)\) responds to variations along the rows, which form the HL band. Finally, the wavelet \(\psi ^D(x,y)\) captures variations along the diagonal direction, which correspond to the HH band. The 2D DWT of an image \(g(x,y)\) with dimensions \(M \times M\) is therefore:

$$\begin{aligned} W_\varphi (j_0,m,n)=\frac{1}{\sqrt{M \times M}} \sum \limits _{x=0}^{M-1} \sum \limits _{y=0}^{M-1} g(x,y) \varphi _{j_0,m,n}(x,y) \end{aligned}$$
(5)
$$\begin{aligned} W^i_\psi (j,m,n)={} & {} \frac{1}{\sqrt{M \times M}} \sum \limits _{x=0}^{M-1} \sum \limits _{y=0}^{M-1} g(x,y) \psi ^i_{j,m,n}(x,y)\\\nonumber i={} & {} \{H,V,D\} \end{aligned}$$
(6)

Here \(j_0\) is an arbitrary initial scale, m and n are the translation indices along the two axes, and the coefficients \(W_\varphi (j_0,m,n)\) serve as an approximation of the function \(g(x,y)\) at scale \(j_0\). The coefficients \(W^i_\psi (j,m,n)\) contribute additional details in the horizontal, vertical, and diagonal directions for scales \(j \ge j_0\). Typically, \(j_0\) is set to 0 and \(M=2^J\) is chosen, so that \(j=0, 1, 2, \ldots , J-1\) and \(m, n=0, 1, 2, \ldots , 2^j-1.\)
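As a concrete illustration of a single-level 2D decomposition into four bands, the following sketch implements the orthonormal Haar wavelet with plain NumPy. The Haar choice, the function names `haar_dwt2` and `haar_idwt2`, and the way the two mixed bands are labeled are assumptions made for this example (band naming conventions differ between references).

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def haar_dwt2(g):
    """Single-level orthonormal 2D Haar DWT; returns (ll, lh, hl, hh)."""
    g = np.asarray(g, dtype=float)
    if g.shape[0] % 2 or g.shape[1] % 2:
        raise ValueError("both image dimensions must be even")
    # Filter along axis 0: pairwise sums (low pass) and differences (high pass).
    lo = (g[0::2, :] + g[1::2, :]) / SQRT2
    hi = (g[0::2, :] - g[1::2, :]) / SQRT2
    # Filter each result along axis 1.
    ll = (lo[:, 0::2] + lo[:, 1::2]) / SQRT2
    lh = (lo[:, 0::2] - lo[:, 1::2]) / SQRT2
    hl = (hi[:, 0::2] + hi[:, 1::2]) / SQRT2
    hh = (hi[:, 0::2] - hi[:, 1::2]) / SQRT2
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2."""
    rows, cols = ll.shape
    lo = np.empty((rows, 2 * cols))
    hi = np.empty((rows, 2 * cols))
    lo[:, 0::2] = (ll + lh) / SQRT2
    lo[:, 1::2] = (ll - lh) / SQRT2
    hi[:, 0::2] = (hl + hh) / SQRT2
    hi[:, 1::2] = (hl - hh) / SQRT2
    g = np.empty((2 * rows, 2 * cols))
    g[0::2, :] = (lo + hi) / SQRT2
    g[1::2, :] = (lo - hi) / SQRT2
    return g

# Toy 4x4 "image" (assumed data) and a round trip through the transform.
image = np.arange(16, dtype=float).reshape(4, 4)
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

Since the transform is orthonormal, the total energy of the four bands equals the energy of the input, and the inverse transform recovers the image up to floating-point rounding.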

Cosine domain insertion

The discrete cosine transform (DCT) is a mathematical method widely employed in signal processing and data compression. It is also utilized in specific types of steganography to conceal data within digital media like images or videos. The forward and inverse equations used to compute the DCT coefficients are [14]:

$$\begin{aligned} G(m,n)={} & {} \frac{2}{\sqrt{M \times N}}\sum \limits _{u=0}^{M-1} \sum \limits _{v=0}^{N-1} g(u,v) c_m \\\nonumber{} & {} \cos \left( \frac{m(2u+1)\pi }{2M}\right) c_n \cos \left( {\frac{n(2v+1)\pi }{2N}}\right) , \end{aligned}$$
(7)

where \(g(u,v)\) is the signal in the spatial domain and \(G(m,n)\) is the DCT coefficient in the \(m^{th}\) row and \(n^{th}\) column, with \(u, m=0,1,\ldots ,M-1\) and \(v, n=0,1,\ldots ,N-1\).

$$\begin{aligned} g(u,v)={} & {} \frac{2}{\sqrt{M \times N}}\sum \limits _{m=0}^{M-1} \sum \limits _{n=0}^{N-1} G(m,n) c_m \\\nonumber{} & {} \cos \left( \frac{m(2u+1)\pi }{2M}\right) c_n \cos \left( {\frac{n(2v+1)\pi }{2N}}\right) \end{aligned}$$
(8)

where \(c_m\) is given below and \(c_n\) is defined analogously:

$$\begin{aligned} c_m = \left\{ \begin{array}{ll} \frac{1}{\sqrt{2}} &{}\text {for}~ m=0\\ 1 &{}\text {otherwise} \end{array}\right. \end{aligned}$$
(9)
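The forward and inverse transforms in Eqs. (7) and (8) are separable, so they can be sketched as two matrix products. The following NumPy example is a minimal illustration under that observation; the helper names `dct_matrix`, `dct2`, and `idct2` are hypothetical and not taken from any library.

```python
import numpy as np

def dct_matrix(size):
    """Matrix A with A[m, u] = sqrt(2/size) * c_m * cos(m(2u+1)pi / (2 size))."""
    m = np.arange(size).reshape(-1, 1)  # frequency index (rows)
    u = np.arange(size).reshape(1, -1)  # sample index (columns)
    c = np.ones((size, 1))
    c[0, 0] = 1.0 / np.sqrt(2.0)        # c_m = 1/sqrt(2) for m = 0, else 1
    return np.sqrt(2.0 / size) * c * np.cos(m * (2 * u + 1) * np.pi / (2 * size))

def dct2(g):
    """Forward 2D DCT of an M x N array, matching Eq. (7)."""
    rows, cols = g.shape
    return dct_matrix(rows) @ g @ dct_matrix(cols).T

def idct2(G):
    """Inverse 2D DCT, matching Eq. (8)."""
    rows, cols = G.shape
    return dct_matrix(rows).T @ G @ dct_matrix(cols)

# A constant 4x4 block (assumed data): all energy ends up in G[0, 0].
block = np.full((4, 4), 5.0)
coeffs = dct2(block)
restored = idct2(coeffs)
```

The product of the two row and column scale factors, \(\sqrt{2/M}\,\sqrt{2/N}\), reproduces the \(2/\sqrt{M \times N}\) factor in the equations, and the matrices are orthonormal, which is why the same matrices (transposed) give the inverse.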

In steganography, the DCT is applied to blocks or segments of the original image. Steganography techniques that utilize the DCT often select specific frequency coefficients for hiding information. The selection of coefficients is typically based on their perceptual significance, favoring those that are less noticeable to the human eye. Low-frequency components are commonly chosen for steganography purposes. The confidential data is typically represented as binary bits, which are then embedded by modifying the selected DCT coefficients. This modification involves adding small values to, or subtracting them from, the coefficients to encode the secret information. After embedding the data, the modified DCT coefficients are quantized and compressed. Quantization reduces the precision of the coefficients, making the changes caused by embedding less noticeable, while compression further reduces the size of the resulting stego image.

When the stego image reaches the recipient, the reverse process is applied to retrieve the hidden information. The DCT coefficients are inverted to the spatial domain through an inverse transformation, resulting in the reconstructed image. The hidden data is extracted by analyzing the modified coefficients. It’s important to note that the specific techniques and algorithms used in steganography can vary, and there are numerous variations and refinements to the process just described.

Adaptive algorithm

The adaptive algorithm is a technique used in steganography to optimize the selection of coefficients for embedding secret information. The algorithm involves several steps:

  1. Calculation of the total energy of the cover image;

  2. Application of the 2D DCT [22] to obtain the first representation of DCT coefficients;

  3. Selection of a predefined number of coefficients, with the rest transformed back to the spatial domain;

  4. Transformation of the current version of the image to the wavelet domain using the 2D DWT. A predefined number of coefficients is then chosen, and the rest are transformed back to the spatial domain;

  5. The current total energy is calculated. If the calculated value is less than \(0.05\%\) of the value in step 1, go to step 2; otherwise, the algorithm halts;

  6. The final outputs are the weights of each coefficient in the cosine and the wavelet domains.

The cost function to be minimized is the leftover energy, denoted as \(\Phi (\alpha , \beta )\), which measures the inconsistency between the initial energy and the energies retained in each domain. Specifically, \(\Phi (\alpha , \beta )\) is computed in the following manner:

$$\begin{aligned} \Phi (\alpha , \beta )=[C_1]^2-[T_{2,1}(C_2)]^2-[T_{3,1}(C_3)]^2 \end{aligned}$$
(10)

where \([\ ]^2\) is the square of each element separately. The procedure starts by employing a steepest descent algorithm [23] to reduce the remaining error. After the iteration is complete, a specific number of coefficients are saved in two separate domains: the 2D-DCT and the 2D-DWT domains. These preserved coefficients are then combined to form the resulting feature vector for each pose. The parameters used in the training phase are as stated below. The weight matrices \(\alpha\) and \(\beta\) are initially set with elements of 0.5 and 0.3, respectively. The updating equations for each iteration are described in [24]:

$$\begin{aligned} \alpha _{i,j}(n+1)={} & {} \alpha _{i,j}(n)-\mu _{\alpha _{i,j}} \nabla _{\alpha _{i,j}} \Phi \end{aligned}$$
(11)
$$\begin{aligned} \beta _{i,j}(n+1)={} & {} \beta _{i,j}(n)-\mu _{\beta _{i,j}} \nabla _{\beta _{i,j}} \Phi \end{aligned}$$
(12)

where i and j span the entire domain; \(\alpha _{i,j}\) and \(\beta _{i,j}\) are elements of \([\alpha ]\) and \([\beta ]\), respectively; n is the iteration index; and \(\mu\) is the converging factor. The converging factors, \(\mu _{\alpha _{i,j}}\) and \(\mu _{\beta _{i,j}}\), are calculated in the following fashion:

$$\begin{aligned} \mu _\alpha ={} & {} \frac{\Phi (n)}{\sum _{i=0}^{N-1}\sum _{j=0}^{N-1}\left[ \nabla _{\alpha _{i,j}}\Phi \right] ^2}\end{aligned}$$
(13)
$$\begin{aligned} \mu _\beta ={} & {} \frac{\Phi (n)}{\sum _{i=0}^{N-1}\sum _{j=0}^{N-1}\left[ \nabla _{\beta _{i,j}}\Phi \right] ^2} \end{aligned}$$
(14)
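The update rule in Eqs. (11) to (14) can be illustrated on a toy problem. The sketch below is not the cost function of Eq. (10): as a loudly stated assumption, it replaces \(\Phi\) with a simple squared distance to a hypothetical target matrix so that the effect of the step size \(\mu = \Phi / \sum [\nabla \Phi ]^2\) can be observed in isolation. The names `descent_step`, `target`, and `history` are illustrative only.

```python
import numpy as np

def descent_step(weights, cost, grad):
    """One update w <- w - mu * grad, with mu = cost / sum(grad**2)."""
    denom = np.sum(grad ** 2)
    if denom == 0.0:          # gradient vanished: nothing left to update
        return weights
    mu = cost / denom
    return weights - mu * grad

# Toy stand-in for Phi (an assumption): squared distance to a fixed target.
target = np.array([[0.2, 0.9], [0.4, 0.1]])
alpha = np.full((2, 2), 0.5)  # initial elements of 0.5, as stated for alpha

history = []
for n in range(30):
    residual = alpha - target
    cost = np.sum(residual ** 2)   # toy Phi(n)
    grad = 2.0 * residual          # gradient of the toy cost
    history.append(cost)
    alpha = descent_step(alpha, cost, grad)
```

For this quadratic toy cost the step size evaluates to 0.25 at every iteration, so the residual halves and the cost drops by a factor of four per step. The behavior on the actual leftover-energy cost depends on the transforms involved and is not reproduced here.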

Performance metrics

Performance metrics are utilized to assess how well a system, process, algorithm, or model performs in terms of its effectiveness, efficiency, accuracy, or quality. The selection of performance metrics relies on the particular task or application at hand.

Mean squared error (MSE)

Mean squared error (MSE) is a metric commonly used in regression to gauge the average squared disparity between the pixel values of the initial signal and the reconstructed or compressed signal. It offers a numerical assessment of the overall distortion existing between these two signals. The mathematical expression for MSE is as follows:

$$\begin{aligned} MSE = \frac{1}{m \times n} \sum \limits _{x=0}^{m-1} \sum \limits _{y=0}^{n-1} \left( I(x, y) - K(x, y)\right) ^2 \end{aligned}$$
(15)

In the formula, I(x, y) represents the pixel value of the original signal at position (x, y), K(x, y) represents the pixel value of the reconstructed or compressed signal at the same position, and \(m \times n\) is the total number of pixels in the image. A lower MSE value indicates a smaller average difference, suggesting a higher quality of reconstruction or compression. However, MSE alone may not offer a perception-based measure of quality since it does not account for the human visual system’s sensitivity to various image characteristics.

Peak signal-to-noise ratio (PSNR)

Peak signal-to-noise ratio (PSNR) is a logarithmic metric that establishes a connection between the maximum achievable power of a signal (such as the highest pixel value) and the power of the noise (i.e., the dissimilarity between the original and reconstructed/compressed signals). PSNR is typically measured in decibels (dB). The mathematical representation for PSNR is as follows:

$$\begin{aligned} PSNR = 10 \log _{10}\left( \frac{MAX^{2}}{MSE}\right) \end{aligned}$$
(16)

The maximum pixel value, represented as MAX (such as 255 for an 8-bit grayscale image), is used in calculating the PSNR (peak signal-to-noise ratio). PSNR is a measure of quality that considers the range of pixel values and follows a logarithmic scale, making it more relevant in terms of human perception. A higher PSNR value signifies better quality since it represents a lower ratio of noise to the maximum strength of the signal.
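The two metrics defined so far can be computed in a few lines. This is a minimal NumPy sketch of Eqs. (15) and (16), assuming 8-bit images (MAX = 255 by default); the function names `mse` and `psnr` and the toy images are introduced only for this example.

```python
import numpy as np

def mse(original, processed):
    """Mean squared error between two images of the same shape."""
    diff = np.asarray(original, dtype=float) - np.asarray(processed, dtype=float)
    return float(np.mean(diff ** 2))

def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, processed)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

# Toy example (assumed data): a flat 4x4 cover with a single pixel changed by 1,
# as would happen when one LSB is flipped.
cover = np.full((4, 4), 100, dtype=np.uint8)
stego = cover.copy()
stego[0, 0] = 101

error = mse(cover, stego)     # one of 16 pixels differs by 1, so MSE = 1/16
quality = psnr(cover, stego)  # roughly 60 dB for this toy case
```

Converting to floating point before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise introduce.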

Structural similarity index (SSIM)

The SSIM reflects the similarity between two images, or parts of them, by providing a quantitative assessment of how well the perceived structural information of an image is preserved after processing. The SSIM takes into account three components, namely, luminance, contrast, and structure. The SSIM index lies in the range \([-1, 1]\), with 1 indicating perfect similarity between the images. This index is calculated as follows:

$$\begin{aligned} SSIM(x, y) = l(x, y) \cdot c(x, y) \cdot s(x, y) \end{aligned}$$
(17)

where x and y are the two input images (or windows) being compared, l(x, y) represents the luminance comparison, c(x, y) represents the contrast comparison, and s(x, y) represents the structure comparison. Each of these coefficients is calculated as follows:

$$\begin{aligned} l(x, y) ={} & {} \frac{2 \mu _x \mu _y + C1}{\mu ^2_x + \mu ^2_y + C1} \nonumber \\ C1 ={} & {} (K1 \times L)^2 \nonumber \\ c(x, y) ={} & {} \frac{2 \sigma _x \sigma _y + C2}{\sigma ^2_x + \sigma ^2_y + C2}\\ C2 ={} & {} (K2 \times L)^2 \nonumber \\ s(x, y) ={} & {} \frac{\sigma _{xy} + C3}{\sigma _x \sigma _y + C3} \nonumber \end{aligned}$$
(18)

where \(\mu\) is the mean, \(\sigma\) is the standard deviation, \(\sigma ^2\) is the variance, \(\sigma _{xy}\) is the covariance of x and y, K1 = 0.01, K2 = 0.03, L = 1 is the dynamic range of the pixel values for gray-scale images, and C3 is a small constant.
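The three SSIM components can be sketched directly from Eqs. (17) and (18). The example below is a simplification: it computes the statistics over the whole image as a single window, whereas practical SSIM implementations usually average over local windows. The choice C3 = C2/2 is a common convention adopted here as an assumption (the text only states that C3 is small), and `ssim_global` is a hypothetical name.

```python
import numpy as np

def ssim_global(x, y, dynamic_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM: product of luminance, contrast and structure terms."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    c3 = c2 / 2.0                      # assumption: common convention for C3
    mu_x, mu_y = x.mean(), y.mean()
    sigma_x, sigma_y = x.std(), y.std()
    sigma_xy = np.mean((x - mu_x) * (y - mu_y))  # covariance of x and y
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sigma_x * sigma_y + c2) / (sigma_x ** 2 + sigma_y ** 2 + c2)
    structure = (sigma_xy + c3) / (sigma_x * sigma_y + c3)
    return float(luminance * contrast * structure)

# Toy images in [0, 1] (assumed data): an image, itself, and its negative.
rng = np.random.default_rng(1)
image = rng.random((16, 16))
same = ssim_global(image, image)
inverted = ssim_global(image, 1.0 - image)
```

Comparing an image with itself gives a value of 1 up to rounding, while the negative image yields a negative structure term because its covariance with the original is negative.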

General comments on the performance metrics

It should be emphasized that MSE (mean squared error), PSNR (peak signal-to-noise ratio), and structural similarity index (SSIM) have their own limitations. They do not perfectly encompass all aspects of image quality, such as the way humans perceive visual information, and may not consistently align with subjective evaluations. Consequently, it is advisable to utilize these metrics in conjunction with other methods for assessing quality and to take into account the unique requirements and characteristics of the particular application or task.

Proposed technique

The proposed technique is depicted in Fig. 1. The process begins by dividing the original image into non-overlapping blocks. These blocks are then transformed into the wavelet domain using the 2D discrete wavelet transform (2D DWT). To determine the weights of each coefficient within the wavelet domain, the adaptive algorithm described in “Adaptive algorithm” is applied to each block individually. The block or blocks whose coefficients have lower total weights than the other blocks are selected. The selected coefficients are converted into a binary representation called “cover in binary.” The secret data, in its binary form, is then embedded in the least significant bit (LSB) of the cover in binary. Afterwards, the block is converted back to its decimal representation. To obtain the stego image, the 2D inverse DWT (2D IDWT, here the inverse discrete Haar transform) is applied. At the receiver’s end, the recipient can extract the secret data by partitioning the image into blocks with the same dimensions as those used in the encoding process. For this, the index of the chosen block must be securely transmitted to the receiver. The effectiveness of the proposed method is assessed using mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM), in addition to human visual examination. A comparative analysis is performed between the proposed system and recently reported approaches.
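The block partitioning and block selection steps described above can be sketched as follows. This is only an illustration of those two steps, not the full method: the per-coefficient weights are a hypothetical stand-in (absolute values of the block entries), whereas in the proposed technique they come from the adaptive algorithm applied to the DWT coefficients. The names `split_blocks`, `merge_blocks`, and `select_block` are assumptions made for this example.

```python
import numpy as np

def split_blocks(image, block):
    """Split an H x W image into non-overlapping block x block tiles (row-major)."""
    h, w = image.shape
    if h % block or w % block:
        raise ValueError("image dimensions must be multiples of the block size")
    return (image.reshape(h // block, block, w // block, block)
                 .swapaxes(1, 2)
                 .reshape(-1, block, block))

def merge_blocks(blocks, shape):
    """Inverse of split_blocks: reassemble tiles into an image of the given shape."""
    h, w = shape
    block = blocks.shape[1]
    return (blocks.reshape(h // block, w // block, block, block)
                  .swapaxes(1, 2)
                  .reshape(h, w))

def select_block(weights):
    """Index of the block whose coefficients have the lowest total weight."""
    totals = weights.reshape(weights.shape[0], -1).sum(axis=1)
    return int(np.argmin(totals))

# Toy 8x8 image (assumed data) split into four 4x4 blocks.
image = np.arange(64, dtype=float).reshape(8, 8)
blocks = split_blocks(image, 4)
weights = np.abs(blocks)      # hypothetical stand-in for the adaptive weights
index = select_block(weights)
rebuilt = merge_blocks(blocks, image.shape)
```

The value of `index` is the quantity that, per the description above, would have to be shared securely with the receiver so that the same block can be located during extraction.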

Fig. 1

The proposed technique utilized in steganography system. The two modules of the system are shown

Results

The results are presented for two scenarios: the BOSS database and a customized database.

First scenario: BOSS database

In this part of the results, the proposed method is tested using a set of 10 gray-scale samples obtained from the BOSSBase database (Break Our Steganographic System Base) [19]. This database comprises 10,000 gray-scale images designed for experiments related to detecting hidden data in JPEG images. It includes various features extracted from clean images, as well as images with hidden random data using different techniques such as the JPEG universal wavelet relative distortion, the nsF5 method, and the uniform embedding revisited distortion (UERD) algorithm. The features include discrete cosine transform residuals (DCTR), Gabor filter residuals (GFR), and the phase-aware projection model (PHARM). To study the effect of the block size on the PSNR, the following non-overlapping block sizes were chosen: \(4 \times 4, 8 \times 8 , 16 \times 16, 32 \times 32, 64 \times 64,\) and \(128 \times 128\). Figure 2 shows samples from the BOSS database (original cover and stego images with different block sizes).

To study the effect of the secret message size on the PSNR of the stego image, the following message sizes were chosen: 4, 6, 8, 10, 12, 14, and 16 KB (1 KB \(= 1024\times 8\) bits). Figure 4 shows the PSNRs for all images with different message sizes.

Second scenario: customized database

In this part of the results, the proposed method is tested using a set of 5 RGB images that appeared in [10]. The unprocessed samples of these images are shown in Fig. 3. The proposed approach is compared with 7 other approaches explained in [10]. These are labeled in the results as follows: 1st Modified LSB RGB [25], 2nd Improved LSB for RGB [26], 3rd LSB Replacement through XOR [27], 4th Multi-Stego for Grey Scale [28], 5th Value Difference using adjacent pixel LSB [29], 6th Grey Level Modification and Multi Level Encryption [30], and 7th Novel Least Significant Bit Technique [10].

Each cover image (the red layer only) is partitioned into non-overlapping blocks of \(4 \times 4\) and a 16 KB message is embedded in it. No pre-processing step is utilized except the image resizing to obtain the required dimensions. Two image dimensions, \(512 \times 512\) and \(1024 \times 1024\), were chosen to have a fair comparison with other recently reported results and to fit the Haar transform requirement.

To study the effect of the block size on the PSNR, the following non-overlapping block sizes were chosen: \(4 \times 4, 8 \times 8 , 16 \times 16, 32 \times 32, 64 \times 64,\) and \(128 \times 128\).

Results discussion

Based on the results displayed, the suggested method outperformed the other techniques being compared. It consistently maintained lower levels of mean squared error (MSE) while achieving higher peak signal-to-noise ratios (PSNRs) and SSIMs that are close to 1 in the majority of cases. Moreover, upon visual examination by humans, it was observed that the proposed technique did not alter the visual characteristics of the original image.

First database

The results for this database, in terms of PSNR, are shown in Fig. 4 for both the spatial LSB and the proposed techniques. Figure 2 contains samples from the BOSS database before and after processing with the proposed LSB technique: the original cover image and stego images with different block sizes. As shown in the results, the proposed technique outperformed the spatial technique, or at least matched it, in all but two cases. Figure 5 shows the MSEs for different message sizes. The parts of the charts that are barely visible correspond to low error values. The SSIMs were equal to, or very close to, 1 for all cases.

Fig. 2

Samples from the BOSS database/proposed LSB technique. a Original cover image, b stego image with block size of 4, c stego image with block size of 8, d stego image with block size of 16, e stego image with block size of 32, f stego image with block size of 64, and g stego image with block size of 128

Fig. 3

Unprocessed samples of the customized database

Fig. 4

PSNRs for the BOSS database/proposed LSB technique with different message sizes with \(4 \times 4\) block size. a Message size = 4 KB, b message size = 6 KB, c message size = 8 KB, d message size = 10 KB, e message size = 12 KB, f message size = 14 KB, and g message size = 16 KB

Fig. 5

MSEs for the BOSS database/proposed LSB technique with different message sizes with \(4 \times 4\) block size. a Message size = 4 KB, b message size = 6 KB, c message size = 8 KB, d message size = 10 KB, e message size = 12 KB, f message size = 14 KB, and g message size = 16 KB

Second database

Figure 3 shows processed samples of the customized database with \(4 \times 4\) blocks and a 16 KB secret message. Figure 6a shows the PSNRs obtained from the proposed approach and the other 7 approaches when the cover images’ dimensions are \(1024\times 1024\). As shown in this subfigure, the proposed approach outperforms the rest of the approaches by several dB. Figure 6b shows the same comparison when the cover images’ dimensions are \(512\times 512\); the proposed approach again outperforms the rest of the approaches. The SSIMs were equal to, or very close to, 1 for all cases. Figure 6c and d show the PSNRs of the proposed approach when the block size is changed to the values mentioned above for image sizes \(1024 \times 1024\) and \(512 \times 512\), respectively. As shown in these subfigures, the smaller the block size (i.e., the fewer coefficients selected and processed individually), the higher the PSNR obtained. Nevertheless, a small block size increases the total number of blocks that have to be processed by the system, and hence more processing time is required. Figure 7 shows the corresponding MSEs: the proposed approach and the other 7 approaches for cover image dimensions of \(1024\times 1024\) (a) and \(512 \times 512\) (b), and the proposed approach with different block sizes for image dimensions of \(1024\times 1024\) (c) and \(512 \times 512\) (d). The results clearly show the better performance of the proposed approach compared with the rest of the reported approaches.

Fig. 6

PSNRs for the customized database. a Proposed approach and other recently published results, image dimensions 1024\(\times\)1024, b proposed approach and other recently published results, image dimensions 512\(\times\)512, c proposed approach with different block sizes, image dimensions 1024\(\times\)1024, and d proposed approach with different block sizes, image dimensions 512\(\times\)512

Fig. 7

MSEs for the customized database. a Proposed approach and other recently published results, image dimensions 1024\(\times\)1024, b proposed approach and other recently published results, image dimensions 512\(\times\)512, c proposed approach with different block sizes, image dimensions 1024\(\times\)1024, and d proposed approach with different block sizes, image dimensions 512\(\times\)512

Conclusions

A new approach is presented for concealing a hidden message within an image, utilizing the two-dimensional discrete wavelet transform (2D DWT). The method involves transforming the image into the wavelet domain using the 2D DWT, followed by the selection of a specific number of coefficients to embed the binary secret message. This selection process incorporates an analysis of the image in two distinct domains: the 2D DCT and the 2D DWT. The analysis is adaptively executed to minimize any potential alterations to the original image. An adaptive algorithm is employed to assign weights to each coefficient in both domains, with lower-weighted coefficients chosen for embedding the secret message. To assess the efficacy of the technique, gray-scale samples from BOSSBase and RGB images from a customized database were utilized, and three metrics, mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index (SSIM), were employed. Additionally, a visual examination of the resulting image by human observers was taken into consideration. The results clearly indicated that the proposed technique outperformed other recently reported techniques in terms of MSE, PSNR, SSIM, and visual quality.