1 Introduction

In practice, practitioners, experimenters, and energy experts are often interested in investigating the relationship between variables recorded at different places and in different years. Correlation analysis serves this objective. The Z-test for two correlation coefficients under classical statistics is helpful for testing whether the relationship between a pair of variables recorded at one place or in one year differs significantly from the relationship between the same pair of variables recorded at a second place or in another year. This type of study can guide them in estimating or forecasting the relationship between these variables in subsequent years. Kumar and Kumar [1] studied the relationship between meteorological variables and the COVID-19 pandemic. Weaver and Wuensch [2], Lee [3], Chen [4], Giakoumis [5], Shan et al. [6] and Rezaee et al. [7] studied various applications of correlation analysis in a variety of fields.

Statistical analysis has been widely applied to wind speed data for forecasting purposes. Correlation analysis is also helpful in analyzing the relationship between wind speed and temperature. For example, energy experts or meteorologists may be interested in comparing the relationship between wind speed and temperature in the year 2019 with the relationship between the same variables in the year 2020. Bechrakis and Sparis [8] used correlation to study the relationship between the wind speeds of various stations. Su et al. [9] studied the relation between wind speed and wind turbines. Shen et al. [10] presented work on the correlation between energy variables. More applications of statistical methods in the forecasting and analysis of wind speed can be seen in Rahimiyan [11], Arias-Rosales and Osorio-Gómez [12], Katinas et al. [13], Min et al. [14], Barhmi et al. [15], and Ben et al. [16].

The Z-test for two correlation coefficients under classical statistics is applicable under the assumption that the data follow the normal distribution and that all observations are determined. In real life, wind speed and temperature data are often recorded as intervals. When the data are intervals, or when there is uncertainty in the parameters or observations, statistical methods based on fuzzy logic are applied. Damousis et al. [17] studied the relationship between energy-related variables using fuzzy logic. Some statistical tests developed using fuzzy logic can be seen in Montenegro et al. [18], Petković [19], Grzegorzewski and Śpiewak [20], Sezer et al. [21] and Nie et al. [22].

Neutrosophic logic is said to be more efficient than fuzzy logic because it gives additional information about the measure of indeterminacy, see Smarandache [23]. Smarandache and Khalid [24] proved its efficiency over fuzzy logic and interval-based analysis. Several applications of neutrosophic logic can be found in Abdel-Basset et al. [25], Smarandache [26] and Nabeeh et al. [27]. Smarandache [28] introduced neutrosophic statistics, which gives efficient results when the data are interval-valued, imprecise or indeterminate. Chen et al. [29, 30] discussed the evaluation of the measure of indeterminacy for neutrosophic numbers. Aslam [31] presented a wind forecasting method under neutrosophic statistics. More details can be seen in Aslam [32] and Aslam [33].

A rich literature is available on wind speed analysis using correlation under classical statistics and fuzzy logic. The existing tests are unable to give information about the measure of indeterminacy. To the best of our knowledge, there is no work on the Z-test for two correlation coefficients under neutrosophic statistics. In this paper, we introduce a Z-test for two correlation coefficients under neutrosophic statistics. The statistic of the proposed test will be introduced under indeterminacy. The proposed test will be applied to energy data with the expectation that it will be efficient and flexible to apply under uncertainty. It is expected that the proposed test will be helpful for studying the relationship between wind speed and temperature at various places, stations and years.

2 Preliminaries

Suppose that \({X}_{i1N}={X}_{i1L}+{X}_{i1U}{I}_{1xN}\left(i=1,2,3,\dots ,{n}_{N}\right); {I}_{1xN}\in \left[{I}_{1xL},{I}_{1xU}\right]\) and \({Y}_{i1N}={Y}_{i1L}+{Y}_{i1U}{I}_{1yN}\left(i=1,2,3,\dots ,{n}_{N}\right); {I}_{1yN}\in \left[{I}_{1yL},{I}_{1yU}\right]\) are a pair of neutrosophic random variables from the first sample of size \({n}_{N}\in \left[{n}_{L},{n}_{U}\right].\) Let \({X}_{i2N}={X}_{i2L}+{X}_{i2U}{I}_{2xN}\left(i=1,2,3,\dots ,{n}_{N}\right); {I}_{2xN}\in \left[{I}_{2xL},{I}_{2xU}\right]\) and \({Y}_{i2N}={Y}_{i2L}+{Y}_{i2U}{I}_{2yN}\left(i=1,2,3,\dots ,{n}_{N}\right); {I}_{2yN}\in \left[{I}_{2yL},{I}_{2yU}\right]\) be a pair of neutrosophic random variables from the second sample, where \({I}_{1xN}\), \({I}_{1yN}\), \({I}_{2xN}\) and \({I}_{2yN}\) denote the corresponding measures of indeterminacy. Note that \({X}_{i1L}, {Y}_{i1L}, {X}_{i2L}\) and \({Y}_{i2L}\) are the determinate parts and \({X}_{i1U}{I}_{1xN}, {Y}_{i1U}{I}_{1yN},\) \({X}_{i2U}{I}_{2xN}\) and \({Y}_{i2U}{I}_{2yN}\) are the indeterminate parts of the neutrosophic forms. Based on this information, the neutrosophic correlation for the first sample, say \({r}_{1N}\in \left[{r}_{1L},{r}_{1U}\right]\), can be defined as follows

$${r}_{1N}=\frac{{n}_{N}\sum {X}_{1N}{Y}_{1N}-\sum {X}_{1N}\sum {Y}_{1N}}{\sqrt{\left\{{n}_{N}\sum {X}_{1N}^{2}-{\left(\sum {X}_{1N}\right)}^{2}\right\}\left\{{n}_{N}\sum {Y}_{1N}^{2}-{\left(\sum {Y}_{1N}\right)}^{2}\right\}}};{n}_{N}\in \left[{n}_{L},{n}_{U}\right]$$
(1)

The neutrosophic correlation for the second sample, say \({r}_{2N}\in \left[{r}_{2L},{r}_{2U}\right]\) can be defined as follows

$${r}_{2N}=\frac{{n}_{N}\sum {X}_{2N}{Y}_{2N}-\sum {X}_{2N}\sum {Y}_{2N}}{\sqrt{\left\{{n}_{N}\sum {X}_{2N}^{2}-{\left(\sum {X}_{2N}\right)}^{2}\right\}\left\{{n}_{N}\sum {Y}_{2N}^{2}-{\left(\sum {Y}_{2N}\right)}^{2}\right\}}};{n}_{N}\in \left[{n}_{L},{n}_{U}\right]$$
(2)

The neutrosophic correlation, say \({r}_{N}\in \left[{r}_{1N},{r}_{2N}\right]\) is defined as follows

$${r}_{N}=\left\{\frac{{n}_{N}\sum {X}_{1N}{Y}_{1N}-\sum {X}_{1N}\sum {Y}_{1N}}{\sqrt{\left\{{n}_{N}\sum {X}_{1N}^{2}-{\left(\sum {X}_{1N}\right)}^{2}\right\}\left\{{n}_{N}\sum {Y}_{1N}^{2}-{\left(\sum {Y}_{1N}\right)}^{2}\right\}}},\frac{{n}_{N}\sum {X}_{2N}{Y}_{2N}-\sum {X}_{2N}\sum {Y}_{2N}}{\sqrt{\left\{{n}_{N}\sum {X}_{2N}^{2}-{\left(\sum {X}_{2N}\right)}^{2}\right\}\left\{{n}_{N}\sum {Y}_{2N}^{2}-{\left(\sum {Y}_{2N}\right)}^{2}\right\}}}\right\}$$
(3)

The neutrosophic form \({r}_{N}\in \left[{r}_{1N},{r}_{2N}\right]\) can be written as

$$ r_{N} = r_{1N} + r_{2N} I_{rN} ; I_{rN} \in \left[ {I_{rL} ,I_{rU} } \right],r_{1N} \in \left[ {r_{1L} ,r_{1U} } \right],r_{2N} \in \left[ {r_{2L} ,r_{2U} } \right] $$
(4)

In Eq. (4), the first part \({r}_{1N}\in \left[{r}_{1L},{r}_{1U}\right]\) denotes the determinate part, that is, the correlation under classical statistics, and \({r}_{2N}{I}_{rN}\) denotes the indeterminate part, where \({I}_{rN}\in \left[{I}_{rL},{I}_{rU}\right]\) is the measure of indeterminacy associated with \({r}_{N}\in \left[{r}_{1N},{r}_{2N}\right]\). The neutrosophic form reduces to the correlation under classical statistics if \({I}_{rL}=0\).
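To make Eqs. (1) and (2) concrete, the following minimal Python sketch evaluates the computational formula of the correlation separately on the lower and the upper values of each interval observation. Treating the two bounds separately is an assumption about how the interval \(\left[{r}_{L},{r}_{U}\right]\) is obtained, and the function names and the toy numbers are hypothetical; they are not the data of the application.

```python
import math


def pearson_r(x, y):
    """Correlation via the computational formula used in Eqs. (1) and (2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denom = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    if denom == 0:
        raise ValueError("correlation is undefined for constant data")
    return (n * sxy - sx * sy) / denom


def neutrosophic_r(x_low, y_low, x_up, y_up):
    """Return (r_L, r_U): the formula applied to the lower and upper values.

    Assumption: each bound of the correlation interval is computed from the
    matching bound of the data intervals.
    """
    return pearson_r(x_low, y_low), pearson_r(x_up, y_up)


# Hypothetical toy intervals (lower and upper values of five observations).
wind_low = [2.0, 3.0, 5.0, 4.0, 6.0]
wind_up = [4.0, 6.0, 7.0, 8.0, 9.0]
temp_low = [10.0, 12.0, 11.0, 15.0, 14.0]
temp_up = [18.0, 17.0, 21.0, 20.0, 24.0]

r_low, r_up = neutrosophic_r(wind_low, temp_low, wind_up, temp_up)
print(r_low, r_up)
```

When every interval collapses to a single value, the two bounds coincide and the sketch returns the classical correlation twice.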

3 Design of the Proposed Test

The existing Z-test for two correlation coefficients has been widely applied for testing whether two correlation coefficients differ significantly when the observations in each pair of the two samples are precise, exact and determined. The existing test may be misleading when the samples contain interval-valued, inexact or indeterminate observations. In this section, the design of the proposed Z-test for two correlation coefficients under neutrosophic statistics is presented. It is assumed that the two samples are drawn from neutrosophic normal distributions and that the relationship between the neutrosophic independent variables and the neutrosophic dependent variables is linear. Suppose that \({\rho }_{1N}\) and \({\rho }_{2N}\) are the corresponding neutrosophic population correlations, see Kanji [34] and Smarandache [28]. The Z-test of a correlation coefficient for the first sample under neutrosophic statistics is calculated as follows

$${Z}_{1N}=\frac{1}{2}{log}_{e}\left(\frac{1+{r}_{1N}}{1-{r}_{1N}}\right)=1.1513{log}_{10}\left(\frac{1+{r}_{1N}}{1-{r}_{1N}}\right); {r}_{1N}\in \left[{r}_{1L},{r}_{1U}\right]$$
(5)

In neutrosophic form, the values of \({Z}_{1N}\in \left[{Z}_{1L},{Z}_{1U}\right]\) can be written as

$${Z}_{1N}={Z}_{1L}+{Z}_{1U}{I}_{z1N}; {I}_{z1N}\in \left[{I}_{z1L},{I}_{z1U}\right]$$
(6)

The Z-test of a correlation coefficient for the second sample under neutrosophic statistics is calculated as follows

$${Z}_{2N}=\frac{1}{2}{log}_{e}\left(\frac{1+{r}_{2N}}{1-{r}_{2N}}\right)=1.1513{log}_{10}\left(\frac{1+{r}_{2N}}{1-{r}_{2N}}\right); {r}_{2N}\in \left[{r}_{2L},{r}_{2U}\right]$$
(7)

In neutrosophic form, the values of \({Z}_{2N}\in \left[{Z}_{2L},{Z}_{2U}\right]\) can be written as

$${Z}_{2N}={Z}_{2L}+{Z}_{2U}{I}_{z2N}; {I}_{z2N}\in \left[{I}_{z2L},{I}_{z2U}\right]$$
(8)
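The transformation in Eqs. (5) and (7) is commonly known as Fisher's z-transformation, and the constant 1.1513 is \(\frac{1}{2}{log}_{e}10\) rounded to four decimals. The short Python sketch below checks that the two forms agree and applies the transformation to each bound of a correlation interval; the function names are hypothetical, and the bound-by-bound treatment is an assumption consistent with the intervals reported in the application.

```python
import math


def fisher_z(r):
    """Natural-log form of Eqs. (5) and (7)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def fisher_z_log10(r):
    """Base-10 form of Eqs. (5) and (7) with the rounded constant 1.1513."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.1513 * math.log10((1.0 + r) / (1.0 - r))


def fisher_z_interval(r_low, r_up):
    """Apply the transformation to each bound of [r_L, r_U] (assumption)."""
    return fisher_z(r_low), fisher_z(r_up)


# Correlation bounds of the second sample, taken from the application section.
z2_low, z2_up = fisher_z_interval(0.2804, 0.1290)
print(round(z2_low, 4), round(z2_up, 4))
print(abs(fisher_z(0.5) - fisher_z_log10(0.5)))  # small rounding gap only
```

The two forms differ only through the rounding of the constant, so either can be used in practice.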

The neutrosophic means of \({Z}_{1N}\in \left[{Z}_{1L},{Z}_{1U}\right]\) and \({Z}_{2N}\in \left[{Z}_{2L},{Z}_{2U}\right]\) are given by

$${\mu }_{{Z}_{1N}}=\frac{1}{2}{log}_{e}\left(\frac{1+{\rho }_{1N}}{1-{\rho }_{1N}}\right)=1.1513{log}_{10}\left(\frac{1+{\rho }_{1N}}{1-{\rho }_{1N}}\right)$$
(9)
$${\mu }_{{Z}_{2N}}=\frac{1}{2}{log}_{e}\left(\frac{1+{\rho }_{2N}}{1-{\rho }_{2N}}\right)=1.1513{log}_{10}\left(\frac{1+{\rho }_{2N}}{1-{\rho }_{2N}}\right)$$
(10)

The neutrosophic standard deviations of \({Z}_{1N}\in \left[{Z}_{1L},{Z}_{1U}\right]\) and \({Z}_{2N}\in \left[{Z}_{2L},{Z}_{2U}\right]\) are given by

$${\sigma }_{{Z}_{1N}}=\frac{1}{\sqrt{\left({n}_{1N}\right)-3}}$$
(11)
$${\sigma }_{{Z}_{2N}}=\frac{1}{\sqrt{\left({n}_{2N}\right)-3}}$$
(12)

The test statistic, say \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) under neutrosophic statistics is given by

$${Z}_{N}=\frac{\left({Z}_{1N}-{Z}_{2N}\right)-\left({\mu }_{{Z}_{1N}}-{\mu }_{{Z}_{2N}}\right)}{{\sigma }_{N}};{Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right],$$
(13)

where \({\sigma }_{N}\in \left[{\sigma }_{L},{\sigma }_{U}\right]\) is defined by

$${\sigma }_{N}=\sqrt{{\sigma }_{{Z}_{1N}}^{2}+{\sigma }_{{Z}_{2N}}^{2}}$$
(14)
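As a worked illustration of Eqs. (11), (12) and (14), take the sample sizes \({n}_{1N}={n}_{2N}=31\) used in the application section. Both bounds of the sample-size interval coincide there, so the result is a single number rather than an interval:

$${\sigma }_{{Z}_{1N}}={\sigma }_{{Z}_{2N}}=\frac{1}{\sqrt{31-3}}=\frac{1}{\sqrt{28}}\approx 0.1890,\qquad {\sigma }_{N}=\sqrt{\frac{1}{28}+\frac{1}{28}}=\sqrt{\frac{1}{14}}\approx 0.2673$$

This agrees with the value 0.2672 reported in the application up to rounding in the last digit.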

In neutrosophic form, the statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) can be written as

$${Z}_{N}={Z}_{L}+{Z}_{U}{I}_{ZN}; {I}_{ZN}\in \left[{I}_{ZL},{I}_{ZU}\right]$$
(15)

Note that \({I}_{ZN}\in \left[{I}_{ZL},{I}_{ZU}\right]\) is the indeterminacy interval associated with \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\), and that the statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) reduces to the statistic under classical statistics when \({I}_{ZL}=0\).
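Eqs. (5) to (14) can be combined into a short computation. The Python sketch below is a minimal illustration, assuming that each bound of \({Z}_{N}\) is obtained from the matching bounds of the correlations and sample sizes and that \({\mu }_{{Z}_{1N}}-{\mu }_{{Z}_{2N}}=0\) under the null hypothesis of equal population correlations; the function names are hypothetical. The inputs are the correlation intervals and sample sizes reported in the application section.

```python
import math


def fisher_z(r):
    """Transformation of Eqs. (5) and (7)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_statistic(r1, r2, n1, n2, mu_diff=0.0):
    """Eq. (13) for one bound; mu_diff is the hypothesised mean difference."""
    if n1 <= 3 or n2 <= 3:
        raise ValueError("both sample sizes must exceed 3")
    sigma = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # Eqs. (11), (12), (14)
    return (fisher_z(r1) - fisher_z(r2) - mu_diff) / sigma


def neutrosophic_z(r1, r2, n1, n2):
    """Return (Z_L, Z_U); every argument is a (lower, upper) pair.

    Assumption: bounds are paired, lower with lower and upper with upper.
    """
    z_low = z_statistic(r1[0], r2[0], n1[0], n2[0])
    z_up = z_statistic(r1[1], r2[1], n1[1], n2[1])
    return z_low, z_up


z_low, z_up = neutrosophic_z(
    (-0.0763, -0.3190), (0.2804, 0.1290), (31, 31), (31, 31)
)
print(round(abs(z_low), 4), round(abs(z_up), 4))
```

The absolute values produced this way are close to the interval [1.3642, 1.7226] reported for the energy data; small differences come from rounding of intermediate quantities.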

The proposed test can be implemented as follows

Step 1: State the null hypothesis that \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\) vs. the alternative hypothesis \({H}_{1N}:{\rho }_{1N}\ne {\rho }_{2N}\).

Step 2: State the level of significance \(\alpha \).

Step 3: Compute the values of the test statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) using the neutrosophic sample information.

Step 4: Select the critical value from the Z-table corresponding to \(\alpha \) and decide about the rejection region according to \({H}_{1N}\).

Step 5: Do not reject \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\) if the calculated value of \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) falls within the acceptance region.
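Steps 2, 4 and 5 above can be sketched in Python using only the standard library. The two-sided critical value is taken from the standard normal quantile function, and the decision is evaluated for each bound of \({Z}_{N}\) separately. The function names are hypothetical, and how to decide when the two bounds disagree is not addressed in the steps above, so the sketch simply reports both outcomes.

```python
from statistics import NormalDist


def critical_value(alpha):
    """Two-sided critical value of the standard normal distribution."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def reject_null(z_interval, alpha=0.05):
    """Return (reject at Z_L, reject at Z_U) for the two-sided alternative."""
    z_crit = critical_value(alpha)
    return tuple(abs(z) > z_crit for z in z_interval)


# Interval of |Z_N| reported in the application section.
print(round(critical_value(0.05), 2))
print(reject_null((1.3642, 1.7226)))
```

For the interval reported for the energy data, neither bound exceeds the critical value at \(\alpha =0.05\), so the null hypothesis is not rejected at either bound.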

4 Application Using Energy Data

In this section, the proposed test is applied to weather data. Two important weather variables, namely temperature and wind speed, are selected for the study. The purpose of the application is to test whether the relationship between temperature and wind speed differs significantly between time periods. The minimum and maximum values of the two variables for the month of January were recorded for Lahore, Pakistan, and are reported in Table 1 for the years 2019 and 2020. The energy experts are interested in whether the relationship between temperature and wind speed in the year 2019 differs significantly from that in the year 2020. As the data are recorded as indeterminate intervals, the existing test under classical statistics is not appropriate and may mislead the energy experts. In this situation, the proposed test can be applied to see whether the correlation between temperature and wind speed differs significantly between the two years. Let \({r}_{1N}\in \left[{r}_{1L},{r}_{1U}\right]\) be the correlation between wind speed, say \({X}_{1N}\), and temperature, say \({Y}_{1N}\), for the year 2019. Let \({r}_{2N}\in \left[{r}_{2L},{r}_{2U}\right]\) be the correlation between wind speed, say \({X}_{2N}\), and temperature, say \({Y}_{2N}\), for the year 2020. The neutrosophic correlations \({r}_{1N}\in \left[{r}_{1L},{r}_{1U}\right]\) and \({r}_{2N}\in \left[{r}_{2L},{r}_{2U}\right]\) are computed as follows

Table 1 Temperature and wind speed data

\({r}_{1N}\in \left[-0.0763,-0.3190\right]\); \({n}_{1N}\in \left[31,31\right]\) and \({r}_{2N}\in \left[0.2804,0.1290\right]\); \({n}_{2N}\in \left[31,31\right]\).

The neutrosophic form \({r}_{N}\in \left[{r}_{1N},{r}_{2N}\right]\) can be written as

$${r}_{N}=\left\{\left[-0.0763+\left(-0.3190\right){I}_{1rN}\right],\left[0.2804-0.1290{I}_{2rN}\right]\right\}; {I}_{1rN}\in \left[0,0.76\right],{I}_{2rN}\in \left[0,1.17\right]$$

The Z-test of a correlation coefficient for the first sample is computed as follows

$${Z}_{1N}=1.1513{log}_{10}\left(\frac{1+\left[-0.0763,-0.3190\right]}{1-\left[-0.0763,-0.3190\right]}\right)=[-0.0765,-0.3306]$$

In neutrosophic form, the values of \({Z}_{1N}\in \left[{Z}_{1L},{Z}_{1U}\right]\) are given as

$${Z}_{1N}=-0.0765+\left(-0.3306\right){I}_{z1N}; {I}_{z1N}\in \left[0,0.7686\right]$$

The Z-test of a correlation coefficient for the second sample under neutrosophic statistics is computed as follows

$${Z}_{2N}=1.1513{log}_{10}\left(\frac{1+\left[0.2804,0.1290\right]}{1-\left[0.2804,0.1290\right]}\right)=[0.2881,0.1298]$$

In neutrosophic form, the values of \({Z}_{2N}\in \left[{Z}_{2L},{Z}_{2U}\right]\) can be written as

$${Z}_{2N}=0.2881-0.1298{I}_{z2N}; {I}_{z2N}\in \left[0,1.2196\right]$$

The test statistic, say \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) under neutrosophic statistics is computed as

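\({Z}_{N}=\frac{\left[-0.0765,-0.3306\right]-\left[0.2881,0.1298\right]}{{\sigma }_{N}}=\frac{\left[-0.3646,-0.4604\right]}{{\sigma }_{N}}\); \({\sigma }_{N}\in \left[0.2672,0.2672\right]\) and \(\left|{Z}_{N}\right|\in \left[1.3642,1.7226\right]\), where \({\mu }_{{Z}_{1N}}-{\mu }_{{Z}_{2N}}=0\) under the null hypothesis of equal population correlations.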

In neutrosophic form, the statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) is given by

$${Z}_{N}=1.3642+1.7226{I}_{ZN}; {I}_{ZN}\in \left[0,0.2081\right]$$
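The upper bounds of the indeterminacy intervals reported in this section are consistent with the relation \({I}_{U}=\left(U-L\right)/U\) for an interval \(\left[L,U\right]\) written as \(L+U{I}_{N}\). This relation is not stated explicitly in the text; it is inferred here from the reported numbers, and the function name below is hypothetical. A minimal Python check:

```python
def indeterminacy_upper(lower, upper):
    """Upper bound I_U such that lower + upper * I_U equals upper.

    Assumption: the neutrosophic form is lower + upper * I with I in [0, I_U].
    """
    if upper == 0:
        raise ValueError("upper must be non-zero")
    return (upper - lower) / upper


i_z1 = indeterminacy_upper(-0.0765, -0.3306)   # first-sample Z values
i_z2 = indeterminacy_upper(0.2881, 0.1298)     # second-sample Z values
i_z = indeterminacy_upper(1.3642, 1.7226)      # test statistic |Z_N|
print(round(i_z1, 4), round(i_z2, 4), round(i_z, 4))
```

For the second sample the relation gives a negative value of the same magnitude as the reported 1.2196, because the reported form carries the minus sign in front of the second term instead.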

The proposed test for the real data sets is stated as follows

Step 1: State the null hypothesis that \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\) vs. the alternative hypothesis \({H}_{1N}:{\rho }_{1N}\ne {\rho }_{2N}\).

Step 2: State the level of significance \(\alpha =0.05\).

Step 3: Compute the values of the test statistic \(\left|{Z}_{N}\right|\in \left[1.3642,1.7226\right]\) using the neutrosophic sample information.

Step 4: The critical value from the Z-table is 1.96 corresponding to \(\alpha =0.05\) and \({H}_{1N}:{\rho }_{1N}\ne {\rho }_{2N}\).

Step 5: Do not reject \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\) as \(\left|{Z}_{N}\right|<1.96\).

From the proposed test, it can be concluded that the correlation between temperature and wind speed in January 2019 does not differ significantly from the correlation in January 2020. Similarly, the relationship between the variables can be studied for other months of the years 2019 and 2020.

5 Comparative Study

As discussed earlier, the proposed test is a generalization of the test under classical statistics and reduces to the existing test when all observations in the data are exact, determined and certain. In this section, the proposed test is compared with the existing test in terms of the measure of indeterminacy, flexibility and information. For the comparison, only the neutrosophic form of the statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) is considered; the other neutrosophic quantities can be explained in the same manner. The neutrosophic form of \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) is \({Z}_{N}=1.3642+1.7226{I}_{ZN}; {I}_{ZN}\in \left[0,0.2081\right]\). This neutrosophic form reduces to the statistic under classical statistics when \({I}_{ZL}=0\). Therefore, the first part of the neutrosophic form, 1.3642, is the value of the test statistic under classical statistics, and the second part, \(1.7226{I}_{ZN}\), is the indeterminate part. The measure of indeterminacy associated with this test is 0.2081. According to the proposed test, the values of the statistic \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) are flexible and lie in the indeterminate interval \(\left|{Z}_{N}\right|\in \left[1.3642,1.7226\right]\); that is, under an uncertain environment, the value of the statistic can be expected to lie between 1.3642 and 1.7226. This range differentiates the proposed test from the existing test under classical statistics, which gives a single determined value that is not appropriate under uncertainty. In addition, the proposed test gives more information about the testing procedure, namely the measure of indeterminacy.

For the energy example, when testing \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\), the probability of not rejecting \({H}_{0N}\) when it is true is 0.95, the chance of rejecting it when it is true is 0.05 and the chance of uncertainty about \({H}_{0N}\) is 0.2081. Under fuzzy statistics, only the lower value of the interval (measure of truth) and the upper value of the interval (measure of falseness) are available, which means that \({Z}_{N}\in \left[{Z}_{L},{Z}_{U}\right]\) can range from 1.3642 to 1.7226. The analysis based on fuzzy statistics does not give information about the measure of indeterminacy. From this study, it is clear that the proposed test is flexible, informative and reasonable to apply for testing \({H}_{0N}:{\rho }_{1N}={\rho }_{2N}\) under uncertainty. It is concluded that the proposed test under neutrosophic statistics is better than the tests under classical and fuzzy statistics in terms of information and flexibility.

6 Concluding Remarks

This paper introduced a Z-test for two correlation coefficients under neutrosophic statistics. The necessary steps to implement the proposed test were given. The statistic of the proposed test under indeterminacy was introduced for the first time. The proposed test was applied to temperature and wind speed data. The proposed test is an extension of the existing test under classical statistics. Its application to the energy data showed that it is efficient in terms of the measure of indeterminacy, flexibility and information. The proposed test can be applied to ocean big data in future research. The efficiency of the proposed test under other distributions is also a fruitful area for future research.