Data interpolation methods with the UNet-based model for weather forecast

Wang, Jiayu

doi:10.1007/s41060-024-00611-z


Regular Paper
Open access
Published: 09 August 2024


International Journal of Data Science and Analytics


Jiayu Wang¹


Abstract

Deep learning improves weather predictions, and most machine learning applications need data preparation, including interpolation. Since data collected by meteorological satellites contain many missing values, interpolation in weather forecasting is worth studying. This paper used a UNet-based model to evaluate 10 interpolation methods with different parameters on a short-term weather prediction task from the IEEE Big Data Competition 2021. Each strategy was evaluated using 3 groups of evaluation aspects, totaling 7 metrics. One of the specific issues explored in this research was reducing the influence of possible displacement in satellite images, which is often overstated by existing evaluation standards. After interpolation, some solutions increased the model performance to some extent. Although there was no universally optimal method, interpolation using linear relationships performed rather well in most cases and produced the best results when all evaluation metrics were taken into account. However, the most effective method is time-consuming and requires a great number of calculations. In addition to interpolation with a linear relation, computing the mean value of a limited region is beneficial and efficient. The conclusions of this study are expected to improve future weather prediction and meteorological data processing, and to be extended with other evaluation metrics to better assess a deep learning model’s effectiveness.


1 Introduction

Climate change causes increasingly unpredictable weather, threatening agriculture, transportation, and human safety. Meteorologists are constantly developing methods for accurate and timely weather predictions. Traditional numerical weather prediction (NWP) relies on solving a set of nonlinear equations for weather forecasts. However, NWP faces several challenges. It is highly sensitive to initial conditions, and even small differences can have a significant impact on the prediction results, leading to a decrease in accuracy as the prediction time increases. In addition, the computational cost of solving these equations rises dramatically as the size and complexity of the data increase, resulting in an increasing reliance on supercomputers for NWP [1].

Satellites collect atmospheric data for meteorology, but their high dimensionality makes analysis difficult. In recent years, machine/deep learning has been successful in almost every field, and is therefore being used to efficiently collect, extract, and analyze meaningful data over a wide, high-resolution area. Both traditional machine learning [2, 3] and deep learning-based models [4,5,6,7,8] can improve short- and long-term prediction accuracy over NWP.

Since the 2010s, many deep learning-based methods have been widely used for weather forecasting. Such methods can be applied to several specific areas, such as the prediction of climate change [9, 10], air quality [11, 12], and extreme weather conditions in the form of extreme temperature [13, 14], forest fires [15, 16], flooding [17], cloud-to-ground lightning [18, 19], and typhoons [20, 21].

Big data tasks have missing values [22], which may cause loss of efficiency, complexity in data processing, and bias from inaccurate data [23]. Therefore, in addition to the structure of the model and the size of the datasets, data preprocessing is crucial to deep learning benefits since it helps the model to learn from the data and interpret it [24].

Due to technical limitations (e.g., sensor failures) or objective climatic conditions (e.g., cloud cover, atmospheric pollution) [25,26,27,28], satellite sensors are often unable to generate high spatial-temporal resolution images, resulting in missing values [29]. The absence of these values may reduce data integrity and accuracy and make it difficult to properly identify and analyze surface features, thus having a negative impact on many applications.

A proper interpolation strategy can improve the performance of a model when dealing with imperfect data with missing values, which appear as blank or white dots in the satellite image [23]. Deep learning is widely used for data interpolation because of its ability to efficiently capture complex spatial and temporal patterns. However, these models usually require a large amount of computational resources for training and inference, and the time and computational costs increase significantly when large-scale satellite images are preprocessed.

Over the past decades, many traditional interpolation methods have been proposed. Masking datasets with missing values is the simplest method. Mean values or other dataset-specific values can also replace missing values. Maximum likelihood techniques sample probabilistic models for interpolation. These methods are increasing in popularity and do not require prior experience [30]. In addition, some statistics-based algorithms have been frequently used in climate data interpolation. Kriging, for example, was established over fifty years ago and has numerous variations [31,32,33,34].

This study evaluates missing value interpolation methods in data preprocessing for regression weather prediction in continuous datasets. Various interpolation strategies were investigated to improve multichannel weather prediction with a UNet-based deep learning algorithm in a specific region. Ten interpolation methods were analyzed. Model performance is evaluated using seven metrics: mean squared error (MSE), peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), percentage rate (PR), cosine similarity, Minkowski distances, and a newly proposed measurement, ’Shift’, which attempts to minimize the impact of slight movements that may occur after the prediction. Several improvements were made to traditional evaluation metrics to reduce the impact of slight location inaccuracy. To reduce time and avoid large calculations, this study mainly focuses on basic interpolation methods.

2 The problem, dataset, and the prediction framework

2.1 The problem

Many studies have been presented to manage missing values in data preprocessing [35, 36]. However, data mining and interpolation methods are still a challenge. Mismanaged methods can lead to unnecessary complexity and large distortions of the results, resulting in misleading conclusions [37]. The removal of missing values often biases the learning process and discards important data [29]. The elimination of useful data makes it difficult to draw conclusions [30]. In addition, some solutions, such as using a fuzzy similarity matrix to describe fuzzy relations, focus on discrete values and discretize continuous values before interpolation, which might cause the conversion process to lose valid characteristics [38]. Furthermore, some computing algorithms concentrate on missing conditional qualities rather than values [39].

As the size of datasets grows, information extraction and utilization become more challenging. Deep learning has powerful feature extraction and modeling capabilities, which is ideal for data processing. However, deep learning-based methods, as well as some machine learning-based algorithms (e.g., random forest), are inefficient and computationally intensive due to the need to build equations or predictive models for each missing point. This makes information extraction more time-consuming and reduces its efficiency [40]. Therefore, basic interpolation methods may be more suitable than complex methods when dealing with large-scale data, as they can speed up the computational process and improve overall efficiency while maintaining accuracy.

Due to the importance of missing value computing in deep learning-based prediction models, it is worth investigating which interpolation strategy improves the model performance the most. This study explored a specific weather prediction challenge and intends to generalize its findings for further studies.

Additionally, when traditional evaluation methods are employed, image displacement, which frequently occurs during data collection, might make prediction distortions appear worse than they are. Therefore, this study applied newly proposed evaluation strategies that reduce the negative impact of location inaccuracy and make comparisons more accurate.

2.2 The datasets

This study applied the IEEE Big Data Competition 2021 weather4cast data set [41] to investigate weather prediction. The dataset, collected in the Nile area from February 2019 to February 2020, includes four channels of messages at each time stamp.

- Cloud Top Temperature (CTT): Obtained from cloud surfaces. The cloud temperature for cloudy regions and the surface temperature for unclouded regions.

- Convective Rainfall Intensity (CRI): Convective rainfall accumulation capacity for hours.

- Probability of occurrence of tropopause folding (ASII-TF): Linked to upper-level frontogenesis and jet stream dynamics, and extreme weather.

- Cloud mask (CMA): 1 for the clouded region and 0 for the unclouded region.

Images are taken every 15 min throughout the year. Each 4 km by 4 km 256 $\times $ 256 image represents one of these four channels. Longitude, latitude, altitude and “time lead” (time serial number) are also provided. Each day has 96 files (24 h x 4 times per hour) with 8 images: 4 channels and 4 additional features.

About 312 days and 29,866 valid data files were produced for each weather product (CTT/CRI/ASII-TF/CMA), each containing a 256 by 256 weather array. For several days and times, the data source did not provide full files. There are 27,597 sets of continuous 4-time data. 85% were for training and 15% for validation. A total of 63 sets were randomly and evenly selected for testing.

Each weather product has 1.96 billion valid values. Unprocessed CTT contains approximately 88.8 million missing values, approximately 2974.30/image, with a value range of 174 K to 343 K; CRI contains 219,136 missing values, approximately 7.34/image, with a value range from 0 mm/h to 34 mm/h; ASII-TF contains 218,904 missing values, approximately 7.36/image, with a value range from 0 to 100%; CMA contains 45,059 missing values, approximately 1.51/image, and has only 2 valid values: 0 for the unclouded region and 1 for the clouded region. During preprocessing, the values of each product were normalized to [0, 1] and the missing values were replaced with computed values.

2.3 The prediction framework structure

Each input sample was processed as 4 time steps with 4 channels in training. At each time step, the [1, 4, 256, 256] array was concatenated with the additional features mentioned in Sect. 2.2, resulting in a [1, 8, 256, 256] array. The final input was a [4*8, 256, 256] array. The model outputs a [1, 4, 256, 256] array: the 4 channels at one time step, without the additional features, representing the weather at the next time step. Iterative one-step predictions were made for the [32, 256, 256] test inputs. The final output was a [n, 4, 256, 256] array containing the 4-channel weather images of the next n time steps in chronological order. n is 6 in this study. The following sections provide further information. Figure 1 illustrates the structure of this work.

Missing values fall into three types [42, 43]: missing at random (MAR), missing completely at random (MCAR), and missing not at random (MNAR). A detailed analysis of the data shows that the missing values in this study are MCAR. They are random and independent and are not affected by observed or missing data. An appropriate missing value computation method can complete datasets, enhance data analysis, and improve the performance of weather forecasts.

2.4 Interpolation methods

Efficiency is the main focus of this study. For this big data task, the number of missing values was high. For CTT, each image had 2972.50 missing pixels. On a relatively high-performance computer, some traditional interpolation methods (e.g., singular value decomposition) take more than 24 h to finish an epoch and 2 to 3 months to complete training. To avoid such time-consuming processes, most of the methods selected are of low complexity and involve minimal computations.

Table 1 shows 14 strategies, using 10 algorithms with different parameters. Samples of traditional interpolation methods were chosen. Standard and successful machine learning methods were also used. In addition, this study proposed a computing strategy for gradually filling in missing values from outside to inside to interpolate missing points with as much relevant data as possible. Each method emphasizes different data features with various data distributions.

The methods were classified as follows: masking missing values (the default method provided on the data source), filling constant values, filling with basic statistics (like the mean, median, etc.), filling with common machine learning interpolation methods (like KNN), and the newly proposed method mentioned above.

This section describes the interpolation methods used in this study. Strategy abbreviations are given in parentheses for clarity, and Table 1 lists the abbreviated interpolation methods. Each method is explained in detail in the following subsections.

Table 1 Ten interpolation methods and 14 strategies

2.4.1 Default method: mask missing values

The usual way to hide the missing values was given by the data source (referred to as ‘Mask’). It is also the default method provided in the original study [41]. White noise spots show where missing values are on the topographic map of the source picture. The image file has two arrays of the same size: a true information array that is used for training and testing, and a Boolean array that indicates whether each point is missing. Based on the information in the second array, the missing points in the first array are filled with a default number and ignored by the next-stage procedure.

2.4.2 Constant value

The constant value (referred to as ‘Value X’) interpolates all missing values with a specific value. As the data distribution range after preprocessing was between 0 and 1, two X values were used: 0 as the minimum valid value and 1 as the maximum valid value. Those two values were treated as two baselines to approximate extreme data distributions. This study assumed such an interpolation may help to maintain the overall stability of the dataset, especially when missing data are considered absent or inapplicable.

2.4.3 Statistic value

Four values obtained with basic and standard statistical methods were used for interpolation, which is the classical choice based on the overall data distribution [44,45,46]. Such simple methods usually provide efficient interpolation and are particularly useful when missing data are randomly distributed and the dataset is large. The four values are the mean value (referred to as ’Mean’), the median value (referred to as ‘Median’), and the mode value (referred to as ‘Mode’) of the valid values in the image, as well as an adjusted method: the mean value of a limited region of N * N pixels (referred to as ‘Mean/N’).

The last method fills the missing point [x, y] with the mean of the valid values between $[x-\frac{N}{2}, y- \frac{N}{2}]$ and $[x+\frac{N}{2}, y+\frac{N}{2}]$ (when the given number N is even) or between $[x- \frac{N - 1}{2}, y - \frac{N - 1}{2}]$ and $[x + \frac{N + 1}{2}, y + \frac{N + 1}{2}]$ (when the given number N is odd). The distribution of details in the input data is not uniform. Some regions are smooth over a large range, while others contain more detailed features. In addition, the missing values are also unevenly distributed. Some areas have only one isolated missing point, while others have large missing blocks.

Thus, three values of N (5, 10 and 15) were used. The small 5 $*$ 5 square was used to capture small localized variations in images and provide more detailed information. Setting N to 5 cannot guarantee enough contextual information, but it is less computationally expensive and can capture local details. The medium 10 $*$ 10 square is used to find a balance between capturing localized features and reducing computational costs. The large 15 $*$ 15 square is suitable for wide smooth areas and for data with large missing areas. Setting N to 15 can handle large missing regions and provide smoother interpolation results, but it requires more computation.

Although the whole-image statistics are very efficient and fast enough for big data tasks, they could result in large deviations when the spatial and time gaps are too large [47, 48]. This is why this study also implemented interpolation with the mean value of small regions. Three regions (5 pixel * 5 pixel, 10 pixel * 10 pixel, 15 pixel * 15 pixel) were tried to find the optimal solution.
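
The ‘Mean/N’ strategy described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation used in the study: the function name `fill_local_mean`, the nested-list image, the Boolean `missing` array, and the symmetric window of half-width `n // 2` clipped at the image border are all assumptions made for the example.

```python
def fill_local_mean(image, missing, n):
    """Fill each missing pixel with the mean of the valid pixels in a
    window centred on it (half-width n // 2, clipped at the borders).

    image   -- 2-D list of floats
    missing -- 2-D list of bools, True where the pixel is missing
    n       -- nominal window size (hypothetical parameter name)
    """
    rows, cols = len(image), len(image[0])
    half = n // 2  # assumed symmetric window
    filled = [row[:] for row in image]  # do not modify the input
    for r in range(rows):
        for c in range(cols):
            if not missing[r][c]:
                continue
            window = [
                image[i][j]
                for i in range(max(0, r - half), min(rows, r + half + 1))
                for j in range(max(0, c - half), min(cols, c + half + 1))
                if not missing[i][j]
            ]
            # A pixel whose window holds no valid value is left unchanged.
            if window:
                filled[r][c] = sum(window) / len(window)
    return filled
```

A larger `n` averages over more context and smooths more strongly, at the cost of visiting more pixels per missing point, which mirrors the trade-off between N = 5, 10 and 15 discussed above.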

2.4.4 Classic interpolation algorithms

Linear, K-nearest neighbors (KNN), and inverse distance weighted (IDW) algorithms computed missing values and interpolated missing points with neighboring valid values. They are typical and have been proven effective for years, and they can complete the interpolation process rapidly to match the timely requirement of big data.

The linear relationship (referred to as ‘Linear’) replaces isolated missing points by averaging the four closest valid values in the top, bottom, right and left directions. When a missing point is connected to other missing points in a certain direction, the two valid values found in this direction in the first stage are weighted before the calculation, according to the straight-line distance between each detected valid point and the missing point. This interpolation method is simple, efficient, and easy to implement. It guarantees continuity of the interpolation results within the region, which is consistent with natural continuity in geographic data such as temperature or the quantity of rainfall [49]. Thus, linear interpolation has been widely used in recent years, especially in weather data [50,51,52,53].
KNN (referred to as ‘KNN/K’) is a machine learning algorithm for interpolation. It interpolates based on the similarity of the data and is particularly useful when spatial interrelationships between data points are critical to the interpolation results. It is a popular interpolation method in weather prediction, since it is intuitive, requires no parametric assumptions, and is flexible enough to adapt to the local distribution of geographic data [54,55,56,57,58]. The computation of KNN begins with choosing a reasonable K. All distances are calculated using Eq. (1), and the K pixels closest to the missing location in the feature space are chosen. Measuring the distance between every pair of points requires a considerable number of calculations, but the cost is acceptable [59].
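
One possible reading of the ‘Linear’ strategy is sketched below in plain Python. It is an illustration under stated assumptions rather than the study's code: the helper names are hypothetical, each axis is interpolated linearly between the nearest valid values on either side, and the horizontal and vertical estimates are then averaged. For an isolated missing point this reduces to the mean of its four direct neighbours.

```python
def nearest_valid(image, missing, r, c, dr, dc):
    """Walk from (r, c) in direction (dr, dc) and return (value, distance)
    of the first valid pixel, or None if the border is reached first."""
    rows, cols = len(image), len(image[0])
    i, j, dist = r + dr, c + dc, 1
    while 0 <= i < rows and 0 <= j < cols:
        if not missing[i][j]:
            return image[i][j], dist
        i, j, dist = i + dr, j + dc, dist + 1
    return None


def axis_estimate(a, b):
    """Distance-weighted (linear) estimate from two hits on one axis.
    With a single hit, that value is used; with none, return None."""
    if a and b:
        (va, da), (vb, db) = a, b
        return (va * db + vb * da) / (da + db)  # the closer point weighs more
    hit = a or b
    return hit[0] if hit else None


def fill_linear(image, missing):
    """Fill missing pixels by averaging the horizontal and vertical estimates."""
    filled = [row[:] for row in image]
    for r in range(len(image)):
        for c in range(len(image[0])):
            if not missing[r][c]:
                continue
            horiz = axis_estimate(nearest_valid(image, missing, r, c, 0, -1),
                                  nearest_valid(image, missing, r, c, 0, 1))
            vert = axis_estimate(nearest_valid(image, missing, r, c, -1, 0),
                                 nearest_valid(image, missing, r, c, 1, 0))
            estimates = [e for e in (horiz, vert) if e is not None]
            if estimates:
                filled[r][c] = sum(estimates) / len(estimates)
    return filled
```

The scan along each direction is what makes this strategy slower than a fixed-window mean when missing blocks are large.
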
$$\begin{aligned} d(x_i, x_j) = \left( \sum _{l=1}^{n}|x_{i}^{(l)}-x_{j}^{(l)}|^p\right) ^{\frac{1}{p}} \end{aligned}$$
(1)
For images with simpler textures, a smaller K value ensures that the clustering centers represent the main color or texture features in the image. For images containing complex details, larger K values can capture richer and more detailed features. Since it is too time-consuming and computationally intensive to perform the interpolation test for every K value separately, this study ran a silhouette coefficient analysis on several sets of training data. The results showed that smaller values of K (from 3 to 8) have higher silhouette coefficients. Thus, this study chose K = 5 for interpolation. To avoid the risk of losing richer and more complex details, K = 15 was also chosen.
IDW (referred to as ‘IDW’) follows Tobler’s first law of geography [30]. It is based on an inverse-distance weighted average, in which data points closer to the target point have higher weights, and can be computed using Eq. (2). This calculation is consistent with the reality of geospatial data, and it uses the relationship between geographic locations effectively to predict the value of unknown points [40]. These characteristics give IDW the flexibility to adapt to different data distributions and to local or global variations, which makes IDW a reliable choice for the interpolation of temperature and other geographic data [60,61,62,63]. IDW-based adaptive approaches with different distance-decay relationships have been proposed [37, 64]. In certain instances, these advanced methods can outperform constant parameters and universal kriging. However, because of computational costs, only the traditional IDW method was utilized in this study.
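
A minimal sketch of spatial KNN filling is given below, assuming that the "feature space" is simply the pixel coordinates and that the distance of Eq. (1) is the Minkowski distance of order p. The function names and the brute-force search are illustrative choices, not the study's implementation.

```python
def minkowski(a, b, p=2):
    """Minkowski distance of order p between two coordinate tuples."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def fill_knn(image, missing, k, p=2):
    """Fill each missing pixel with the mean of its k nearest valid pixels.

    Brute force: every missing pixel is compared with every valid pixel,
    which is fine for a toy example but far too slow for 256 x 256 images.
    """
    rows, cols = len(image), len(image[0])
    valid = [(r, c) for r in range(rows) for c in range(cols)
             if not missing[r][c]]
    filled = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if missing[r][c] and valid:
                nearest = sorted(
                    valid, key=lambda rc: minkowski((r, c), rc, p))[:k]
                filled[r][c] = (sum(image[i][j] for i, j in nearest)
                                / len(nearest))
    return filled
```

With a small K only the immediate surroundings contribute, whereas a large K such as 15 pulls in pixels farther away and smooths the result.
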
$$\begin{aligned} f(x, y)=\frac{\sum _{i=1}^{n}\frac{1}{d_i^k}*Z_i}{\sum _{i=1}^{n}\frac{1}{d_i^k}} \end{aligned}$$
(2)
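
Eq. (2) can be illustrated with a short plain-Python sketch. It assumes that d is the Euclidean distance from the missing pixel to the i-th valid pixel, that Z_i is the value of that pixel, and that all valid pixels of the image take part in the sum; the function name `fill_idw` and the `power` parameter (the exponent k) are hypothetical.

```python
import math


def fill_idw(image, missing, power=2):
    """Inverse-distance-weighted fill: each missing pixel becomes the
    weighted mean of all valid pixels, with weight 1 / distance**power."""
    rows, cols = len(image), len(image[0])
    valid = [(r, c) for r in range(rows) for c in range(cols)
             if not missing[r][c]]
    filled = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            if not missing[r][c]:
                continue
            num = den = 0.0
            for i, j in valid:
                # The distance is never zero: (r, c) is missing, (i, j) is valid.
                w = 1.0 / math.hypot(r - i, c - j) ** power
                num += w * image[i][j]
                den += w
            if den:  # stays unchanged if the image has no valid pixel
                filled[r][c] = num / den
    return filled
```

In practice the sum is often restricted to a neighbourhood of the missing point to limit the cost, which would be a further design choice on top of this sketch.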

2.4.5 A new proposed method: ‘step by step’

This paper proposed a cluster-efficient missing block computation method. This method (named ‘Step by Step’, referred to as ‘SbS’) calculates missing values by iteratively invading valid data. Each iteration computes only missing-value pixels surrounded by n valid values and interpolates using the average of its n neighbors. In the initial round, n is 8 and reduces by 1 in each round until it reaches 3, after which it remains unchanged until the interpolation is complete.

Isolated missing islands are calculated using the eight surrounding points. For scattered missing points that contain neighboring missing ones, this method estimates the outermost points surrounded by as many valid values as possible in each round and repeats until there is no missing value.

Missing points sometimes form a block. The outermost points have enough information, but the innermost point has only missing neighbors. This approach interpolates from outside to inside to calculate each missing point using as much valid information as possible. The interpolation starts in the sixth cycle, when n reaches 3, and computes the farthest points, lowering the missing block’s size to (X-2) * (Y-2). This continues until the missing block disappears.

Table 2 Three inaccuracy groups and evaluation metrics/strategies
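
The round-based filling described in this subsection can be sketched as follows. This is an illustrative reading, not the study's code: it assumes that a pixel is filled in a round when it has at least n valid 8-neighbours, that every round works on a snapshot of the previous state, and that the loop stops if a round with n = 3 fills nothing (for example when the whole image is missing). The function name is hypothetical.

```python
def fill_step_by_step(image, missing):
    """Fill missing pixels from the outside in, round by round.

    The required number of valid neighbours n starts at 8, drops by one
    per round down to 3, and then stays at 3.
    """
    rows, cols = len(image), len(image[0])
    filled = [row[:] for row in image]
    miss = [row[:] for row in missing]
    n = 8
    while any(any(row) for row in miss):
        updates = {}
        for r in range(rows):
            for c in range(cols):
                if not miss[r][c]:
                    continue
                neighbours = [
                    filled[i][j]
                    for i in range(max(0, r - 1), min(rows, r + 2))
                    for j in range(max(0, c - 1), min(cols, c + 2))
                    if (i, j) != (r, c) and not miss[i][j]
                ]
                if len(neighbours) >= n:
                    updates[(r, c)] = sum(neighbours) / len(neighbours)
        # Apply the round's results only after the full scan (snapshot).
        for (r, c), value in updates.items():
            filled[r][c] = value
            miss[r][c] = False
        if n > 3:
            n -= 1
        elif not updates:
            break  # remaining pixels can never reach 3 valid neighbours
    return filled
```

Values filled in an earlier round count as valid in later rounds, which is how information propagates toward the centre of a missing block.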

2.5 Evaluation metrics

Interpolation distorts images and most prediction models generate noise, both of which distort the expected outcomes. The following measurements compare the ground-truth image with the predicted image to evaluate the model’s performance with different interpolation strategies.

This study classified probable inaccuracies into intensity, structure, and location, based on the author’s understanding. Seven measurements and their related tactics were employed independently and in combination to obtain a clearer and more comprehensive conclusion.

Intensity inaccuracy metrics evaluate pixel values in the corresponding images to analyze the precision of a prediction. They consider physical meaning, but not perceived visual quality. Structure inaccuracy metrics concern the appearance of the image. They can analyze shape distortion and vector distances between corresponding images. Furthermore, this study also introduced location inaccuracy to alleviate the negative influence of possible displacement on the evaluation, an influence that is often overstated in prior work when the precision of the prediction results and the performance of the associated work are analyzed.

This section describes the evaluation metrics used for this research. For clarity, the metrics’ abbreviated names are provided in parentheses. Table 2 lists all the evaluation metrics. The following section provides more information.

2.5.1 Intensity inaccuracy

MSE, PSNR, and PR were used for intensity inaccuracy. These measurements concentrate on the different values of two parallel points in the predicted image and in the true image.

Mean Squared Error (referred to as ‘MSE’) [65]: MSE is the most widely used image quality measurement. MSE takes the average squared difference between the truth data and the predictions. It is calculated by Eq. (3).

$$\begin{aligned} MSE=\frac{1}{n}\sum _{i=1}^{n}(y_i-\hat{y_i})^2 \end{aligned}$$

(3)

Peak Signal-to-Noise Ratio (referred to as ‘PSNR’) [65]: PSNR is the ratio of the maximum possible signal power to the signal-affecting noise power, calculated by Eq. 4. It approximates the human sense of reconstruction quality and is widely used to measure image quality [39]. The larger the result, the better the performance of the model.

$$\begin{aligned} PSNR(dB)=10*log_{10}\left( \frac{MAX^2}{MSE}\right) =20*log_{10}\left( \frac{MAX}{\sqrt{MSE}}\right) \end{aligned}$$

(4)
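
Eqs. (3) and (4) translate directly into code. The sketch below uses plain Python on nested lists; the function names are illustrative, and `max_value` defaults to 1.0 on the assumption that the data have been normalized to [0, 1] as described in Sect. 2.2.

```python
import math


def mse(truth, pred):
    """Mean squared error between two equally sized 2-D lists (Eq. 3)."""
    pairs = [(t, p) for t_row, p_row in zip(truth, pred)
             for t, p in zip(t_row, p_row)]
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)


def psnr(truth, pred, max_value=1.0):
    """Peak signal-to-noise ratio in dB (Eq. 4).

    Returns infinity for identical images, where the MSE is zero.
    """
    err = mse(truth, pred)
    if err == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / err)
```

A lower MSE and a higher PSNR both indicate a prediction closer to the ground truth, so the two metrics rank strategies in the same order for a fixed `max_value`.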

Percentage rate (referred to as ‘PR’): the percentage error of a prediction is 100 % * | true value - predicted value | / | true value |. In this study, ’PR: A%’ means that A% of predicted values have a percentage error <20% of their true values. 5 A values were used to evaluate the deviation data after the prediction.

2.5.2 Structure inaccuracy

SSIM, cosine similarity, and Minkowski distances analyzed structure inaccuracy. They are sensitive to geometric deformations [39], focus on the structure or vector retrieved from the image, and calculate the similarity or distance between parallel images.

The Structure Similarity Index Measure (referred to as ‘SSIM’): SSIM assesses structural similarity between two images based on perception [39]. It contains three elements: brightness, contrast, and structure, and is calculated by Eq. 5. Performance is improved when the SSIM is closer to 1.

$$\begin{aligned} SSIM(x, y)=\frac{(2\mu _x\mu _y+c_1)(2\sigma _{xy}+c_2)}{(\mu _x^2+\mu _y^2+c_1)(\sigma _x^2+\sigma _y^2+c_2)} \end{aligned}$$

(5)

Cosine Similarity (referred to as ‘Cosine’): Cosine similarity evaluates sequence similarity. It is the vector cosine of the angle between two sequences, calculated by Eq. 6. The similarity value only considers directions and ignores their length.

$$\begin{aligned} SC(A, B)=cos(\theta )=\frac{\overrightarrow{A}\cdot \overrightarrow{B}}{||\overrightarrow{A}||||\overrightarrow{B}||} \end{aligned}$$

(6)
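
Eq. (6) can be applied to two images by flattening each into a single vector. The sketch below is a plain-Python illustration; the function name and the convention of returning 0.0 when either image is all zeros are assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two 2-D lists treated as flat vectors."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: undefined angle reported as 0
    return dot / (norm_a * norm_b)
```

Because only the direction matters, a prediction that is a uniformly scaled copy of the truth still scores close to 1, which is why this metric is paired with intensity metrics such as MSE.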

Minkowski distances (referred to as ‘Distances’): The Minkowski distance, a normed vector space metric that generalizes Manhattan and Euclidean distances, is determined using Eq. 7. When p=1, the Minkowski distance is the Manhattan distance, and when p=2, it is the Euclidean distance.

$$\begin{aligned} dist(x, y)=\left( \sum _{i=1}^n{|x_i-y_i|}^p\right) ^{\frac{1}{p}} \end{aligned}$$

(7)

2.5.3 Location inaccuracy

The displacement in Fig. 2 is small. In such a case, the distortion reported by intensity or structure metrics does not reflect the actual quality of the prediction. This work expanded the predicted image to a set as shown in Fig. 3, including the original image and eight new images with small displacements from [-1, -1] to [1, 1], compared each image in the new set with the true image, adopted the best performance score, and recorded the direction of its displacement.

Location inaccuracy reduces the effect of a small displacement. In Fig. 2, the intensity inaccuracy is high, but the shape inaccuracy does not disrupt the pattern. These displacements can be diagnosed with location error, making the analysis easier.

With the idea of location inaccuracy, two strategies were used: results after shift (referred to as ‘Shift MSE/PSNR/SSIM’) and improvement after ‘Shift’ (referred to as ‘Improved MSE/PSNR/SSIM’), calculated using shifted MSE / PSNR / SSIM - MSE / PSNR / SSIM.
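
The ‘Shift’ idea can be sketched for MSE as follows. This is an illustration under assumptions, not the study's code: the predicted image is shifted by every offset in {-1, 0, 1} x {-1, 0, 1}, the error is computed only on the region where the shifted prediction overlaps the truth (the source does not say how borders are handled), and the function names are hypothetical.

```python
def shifted_mse(truth, pred, dr, dc):
    """MSE between truth and pred moved by (dr, dc), on the overlap only."""
    rows, cols = len(truth), len(truth[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            sr, sc = r - dr, c - dc  # pixel of pred that lands on (r, c)
            if 0 <= sr < rows and 0 <= sc < cols:
                total += (truth[r][c] - pred[sr][sc]) ** 2
                count += 1
    return total / count


def best_shift_mse(truth, pred):
    """Return (shift MSE, best offset, improved MSE).

    'Improved' follows the text: shifted score minus unshifted score,
    so it is zero or negative for MSE.
    """
    scores = {(dr, dc): shifted_mse(truth, pred, dr, dc)
              for dr in (-1, 0, 1) for dc in (-1, 0, 1)}
    best = min(scores, key=scores.get)
    return scores[best], best, scores[best] - scores[(0, 0)]
```

For PSNR and SSIM the same search would keep the maximum instead of the minimum, since higher values are better for those metrics.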

2.5.4 Extra process

MSE, PSNR, and SSIM have always been analyzed using extra strategies to enrich their meaning and evaluate more accurately. This study examined MSE, PSNR, and SSIM using two additional processes, as described below.

Weight (referred to as ‘Weight’): Not all four weather products are equal. The CTT product, which represents the local temperature, is the most important. Each channel should also prioritize the extremely global condition. These datasets commonly had CTT with 1e-3 (0.1 mm/h) or lower precipitation, which should be considered valid instead of discarded. In this work, scenarios with different data distributions were weighted to reflect their relative relevance, eliminate irrelevant interference elements, and provide accurate and useful weather predictions. Figure 4 shows the weights of the products under various conditions. The ’Weight’ operation is applied to MSE, PSNR, and SSIM, and the ’Weight MSE’ is recorded.

Mean / N (referred to as ‘Mean’): The Mean was used to fully assess each part of the image by removing distant influences and focusing on locally extreme situations. After weighting, ’Mean’ splits each image into blocks of N pixels * N pixels. This study used N = 16, where N is the number of pixels along the length and width of each block. ’Mean Weight MSE / PSNR / SSIM’ values were recorded.

3 Result

This section will describe the implementation details, evaluation and comparison principle, and the experiment results.

3.1 Implementation

The models were trained on a UNet-18 model provided as a baseline in the big data competition [41], using the PyTorch Lightning framework. Some modifications were made to increase model performance. Unless otherwise specified, the hyperparameters are as listed below.

The combination of UNet with DenseNet outperforms the prior UNet design [66], therefore each convolution layer block was replaced with a micro-DenseNet block to train the model more deeply and accurately [67]. This change utilized DenseNet’s dense connectivity to reduce information loss and promote feature reuse, allowing the model to learn complex features more efficiently.

In addition, the baseline model utilized ReLU as the activation function, in which a positive input is passed through unchanged and a negative input is set to zero, i.e., $f(x) = max(0, x)$. In this study, Mish is utilized as an alternative to ReLU, as it can improve model performance through smooth nonlinear transformations and better gradient flow. Mish is defined by Eq. (8) [68].

$$\begin{aligned} f(x) = x * tanh (softplus (x)) \end{aligned}$$

(8)
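
The Mish activation can be written in a few lines of scalar Python, where softplus(x) = ln(1 + e^x). The sketch below is only an illustration of the formula; in a PyTorch model one would normally use a tensor-based implementation, and the numerically stable form of softplus used here is a choice made for this example.

```python
import math


def softplus(x):
    """ln(1 + e^x), written so that large |x| does not overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def relu(x):
    """ReLU, shown for comparison with the baseline activation."""
    return max(0.0, x)
```

Unlike ReLU, Mish is smooth around zero and lets small negative values pass through with a reduced magnitude, while behaving almost like the identity for large positive inputs.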

The training batch size was 64, and the worker number was 8 due to hardware limitations. Since weight decay improves generalizability, the AdamW optimizer [69] with a weight decay of 6e-2 was employed to improve calculation efficiency. The learning rate was dynamically decreased from 1e-3 to 2e-4 in stages to accelerate training, suppress oscillation, and converge to a local minimum [70]. The final learning rate was set to be cyclical, ranging from 1.5e-4 to 2.5e-4, to speed up training and improve precision [71]. For best performance, all hyperparameters and strategies were extensively optimized.

Ten interpolation methods and seven evaluation metrics were applied independently. The results are presented in the following section. Each channel’s results were also evaluated using MSE, PSNR, and SSIM to directly analyze each weather product’s prediction and better understand how the data distribution affects the various interpolation strategies, thus improving the model’s performance.

3.2 Comparison principle

This study combined all evaluation results with varying result values to provide a more precise view and compare overall performance more fairly. Prediction results for each evaluation metric were scored and ranked using Table 3’s formulas.

Table 3 The calculated formula of scores for each evaluation metric


3.3 Overall performance

Table 4 shows the overall results, including the results for the 7 metrics, sorted by the performance score calculated as in Table 3. The ability of the interpolation strategies to increase model performance was assessed using various metrics. The total score is calculated from intensity incorrectness (referred to as ’II’), structure inaccuracy (referred to as ’SI’), and location inaccuracy (referred to as ’LI’), using Eqs. (9) to (12). The principle for calculating the score of each metric under the different strategies is provided in Table 4.

Table 4 The calculation of the score of each metric groups


Table 5 The overall performance, the top 3 strategies, and the default method are written in bold


Table 5 provides more information recorded in this study.

$$\begin{aligned} \text{Total Accuracy Score} = 0.5 \times \text{II Score} + 0.3 \times \text{SI Score} + 0.2 \times \text{LI Score} \end{aligned}$$

(9)

$$\begin{aligned} \text{II Score} = 0.4 \times \text{MSE Score} + 0.4 \times \text{PSNR Score} + 0.2 \times \text{PR Score} \end{aligned}$$

(10)

$$\begin{aligned} \text{SI Score} = 0.6 \times \text{SSIM Score} + 0.4 \times \text{Cosine Score} \end{aligned}$$

(11)

$$\begin{aligned} \text{LI Score} = 0.8 \times \text{Shifted Score} + 0.2 \times \text{Improved Score} \end{aligned}$$

(12)
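
The weighted combination in Eqs. (9) to (12) can be sketched in a few lines of Python. The weights are taken from the equations; the per-metric input scores would come from the formulas in Table 3, which are not reproduced here, so the example values below are hypothetical.

```python
def ii_score(mse, psnr, pr):
    # Eq. (10): intensity group.
    return 0.4 * mse + 0.4 * psnr + 0.2 * pr

def si_score(ssim, cosine):
    # Eq. (11): structure group.
    return 0.6 * ssim + 0.4 * cosine

def li_score(shifted, improved):
    # Eq. (12): location group.
    return 0.8 * shifted + 0.2 * improved

def total_score(scores):
    # Eq. (9): weighted sum of the three group scores.
    return (0.5 * ii_score(scores["MSE"], scores["PSNR"], scores["PR"])
            + 0.3 * si_score(scores["SSIM"], scores["Cosine"])
            + 0.2 * li_score(scores["Shifted"], scores["Improved"]))

# Hypothetical per-metric scores for one strategy (not values from the paper).
example = {"MSE": 0.8, "PSNR": 0.6, "PR": 0.5,
           "SSIM": 0.9, "Cosine": 0.7, "Shifted": 0.4, "Improved": 0.2}
print(round(total_score(example), 4))
```

Because the weights within each group and across the groups sum to 1, a strategy that scores 1 on every metric receives a total score of 1.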

10 of the 14 interpolation methods outperform ’Mask’, and ’Linear’ works best. However, ’Linear’ takes 1.5 h to complete 1 training epoch, 5 times longer than ’Mean / 5’ and ’Mean / 10’, which are not far behind ’Linear’. When time is considered, interpolation using the mean value of such a constrained region would be the best choice. Despite outperforming some common methods, ’SbS’ did not perform as intended.

Besides, a strategy’s computational complexity affects the efficiency of interpolation but does not necessarily improve model performance. A complex algorithm like ’IDW’ improves performance, while ’KNN’, a complex procedure involving numerous calculations, does not work as expected.

Depending on its parameters, the same interpolation method can perform differently. ’KNN / 15’ outperformed ’KNN / 5’. Performance also decreased across the ’Mean’ variants, from the region-limited ones such as ’Mean / 10’ to ’Mean’ (which can be seen as ’Mean / 256’). For a given method, the number of neighbor points used to estimate a missing point’s value has an optimum, and using more or fewer points than that reduces performance. This optimum varies by method.
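
To illustrate why the size of the neighborhood matters, here is a minimal NumPy sketch of mean interpolation over a limited region. It assumes that missing values are marked as NaN and that the parameter controls a square window around each missing pixel. The function name and the `radius` parameter are hypothetical, and the exact definition of ’Mean / N’ used in this study may differ.

```python
import numpy as np

def fill_with_window_mean(img, radius):
    """Fill each NaN pixel with the mean of the valid pixels inside a
    (2 * radius + 1) x (2 * radius + 1) window centered on it.

    Pixels whose window contains no valid value are left as NaN.
    """
    img = np.asarray(img, dtype=float)
    filled = img.copy()
    rows, cols = np.nonzero(np.isnan(img))
    for r, c in zip(rows, cols):
        # Read from the original image so filled values do not propagate.
        window = img[max(r - radius, 0): r + radius + 1,
                     max(c - radius, 0): c + radius + 1]
        valid = window[~np.isnan(window)]
        if valid.size > 0:
            filled[r, c] = valid.mean()
    return filled

row = np.array([[0.0, np.nan, 10.0, 10.0, 10.0]])
print(fill_with_window_mean(row, radius=1))  # small region
print(fill_with_window_mean(row, radius=3))  # larger region
```

With the small window the missing pixel is estimated from its two immediate neighbors only, while the larger window pulls the estimate toward more distant values, which is the kind of parameter sensitivity described above.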

3.4 Performance of 4 separate channels

The four weather products have distinct meanings and data distributions. To analyze them further and gain a more complete understanding, this study examined MSE, mean weight MSE, PSNR, and SSIM for each product. Table 6 shows the performance over all weather products (referred to as ’Weather Products’ or ’WPs’) and for each product (referred to as ’CTT’ / ’CRI’ / ’ASII’ / ’CMA’). The scores are calculated using Eqs. (13) and (14).

$$\begin{aligned} \text{Weather Products Score} = 0.3 \times \text{CTT Score} + 0.2 \times \text{CRI Score} + 0.3 \times \text{ASII Score} + 0.2 \times \text{CMA Score} \end{aligned}$$

(13)

$$\begin{aligned} \text{Each Product Score} = 0.2 \times \text{MSE Score} + 0.2 \times \text{Mean Weight MSE Score} + 0.3 \times \text{SSIM Score} + 0.3 \times \text{PSNR Score} \end{aligned}$$

(14)
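
Eqs. (13) and (14) nest in the same way as the overall score: each product first receives a score from its four metrics, and the four product scores are then combined. A minimal sketch follows, with the weights taken from the equations and hypothetical input scores; the dictionary keys are illustrative names.

```python
# Weights from Eq. (13) and Eq. (14).
PRODUCT_WEIGHTS = {"CTT": 0.3, "CRI": 0.2, "ASII": 0.3, "CMA": 0.2}
METRIC_WEIGHTS = {"MSE": 0.2, "MeanWeightMSE": 0.2, "SSIM": 0.3, "PSNR": 0.3}

def product_score(metric_scores):
    # Eq. (14): score of one weather product from its four metric scores.
    return sum(METRIC_WEIGHTS[m] * metric_scores[m] for m in METRIC_WEIGHTS)

def weather_products_score(per_product):
    # Eq. (13): weighted sum of the four product scores.
    return sum(PRODUCT_WEIGHTS[p] * product_score(per_product[p])
               for p in PRODUCT_WEIGHTS)

# Hypothetical metric scores, identical for every product for brevity.
same = {"MSE": 0.5, "MeanWeightMSE": 0.5, "SSIM": 1.0, "PSNR": 0.0}
per_product = {p: same for p in PRODUCT_WEIGHTS}
print(round(weather_products_score(per_product), 4))
```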

Table 6 The overall performance, the top 3 strategies and the default method are written in bold


Considering the four products, the weather product ranking in Table 6 was close to the overall ranking in Table 5, but the product scores are often lower. As in Sect. 3.3, ’Linear’ is the best, and ’IDW’ also performed well for both weather products.

An interpolation method does not affect all products equally, because the data distribution can affect its performance. Even the best interpolation strategy cannot optimize all four weather products.

Surprisingly, ’Value 0’ did well on ’CRI’ and ’CMA’. Since most values in the ’CRI’ and ’CMA’ products were close to 0, these products probably favor methods that fill in small values. The prediction tends to be higher than the true value, and ’Value 0’ countered this trend and considerably improved performance on these products. For the same reason, methods that tend to predict larger values were less effective.

3.5 Scores of each metric

For each evaluation metric, this paper compared the trained model’s prediction with the ground truth. The outcomes and related information are shown in Fig. 5 and analyzed below. To simplify the results, only the top 3 strategies and the default method are shown in the figure. Table 7 records the best result across all strategies for each metric.

Table 7 Best scores (Rank 1 Strategy) with each metric


No interpolation strategy works best for all measurements. However, most good-performing strategies scored well on all evaluation metrics and weather products, while the low-performing strategies always performed poorly.

Most methods worked well on ’SSIM’, but none performed well on ’PSNR’ (from 18.63 to 23.31) or ’PR’ (less than 5%, 10%), indicating that interpolation cannot significantly improve how precisely the model predicts individual values. ’PSNR’ and ’SSIM’ decreased after the mean process (described in Sect. II.E.4: Extra Process). Surprisingly, ’Mask’ outperforms most strategies for ’Cosine’.
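
To give a sense of scale for the reported ’PSNR’ range, the sketch below converts between PSNR and MSE using the standard relation $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. It assumes data normalized to a peak value of 1, which is an assumption made for illustration, and it ignores the additional mean process applied in this study, so the numbers are indicative only.

```python
import math

def psnr_from_mse(mse, peak=1.0):
    # Standard PSNR in dB for a signal with maximum value `peak`.
    return 10.0 * math.log10(peak ** 2 / mse)

def mse_from_psnr(psnr, peak=1.0):
    # Inverse of the relation above.
    return peak ** 2 / 10.0 ** (psnr / 10.0)

for psnr in (18.63, 23.31):
    print(psnr, "dB ->", round(mse_from_psnr(psnr), 4))
```

Under this assumption, the range from 18.63 to 23.31 corresponds to an MSE of roughly 0.0137 down to 0.0047, so the whole spread between strategies amounts to about a threefold difference in squared error.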

The newly proposed method ’SbS’ performed best in ’MSE’, improving the model’s performance by 9.91% over ’Mask’.

Unfortunately, the interpolation methods had little effect on the location inaccuracy. For the newly proposed evaluation metric ’Shift’, ’Mean / 15’ and ’IDW’ performed slightly better. Although each model’s predictions improved slightly in ’MSE’, ’PSNR’, and ’SSIM’ after ’Shift’, they did not differ significantly from the unshifted results. In general, the small change after ’Shift’ does not reflect the expected capacity to increase model performance.

4 Discussions and conclusions

The study examines data interpolation strategies for a deep learning-based weather prediction task. Some conclusions can be drawn from comparing the weather prediction model’s performance with different interpolation strategies.

Most strategies outperformed the default ’Mask’, but no strategy was universally superior. ’Linear’ scored highest for most metrics in both cases. However, it takes almost 1.5 h to finish an epoch, several times longer than other viable strategies. ’Mean / 10’ and ’Mean / 5’ performed well overall and may be better suited to big-data workloads, as they are practical and efficient.

Moreover, some simple interpolation methods can outperform complicated ones. Furthermore, hyperparameters like K in KNN can affect method performance.

The new interpolation method proposed in this study (described in Sect. II.D.5: A New Proposed Method ’Step by step’) shows respectable results when evaluated with both metrics and has the highest score in ’MSE’. On the other hand, the proposed evaluation metric ’Shift’ (described in Sect. II.E.3: ’Location Inaccuracy’) has only a slight influence.

For future experimentation, this study plans to test more interpolation methods, some of which are time-consuming. Additionally, parameterized methods can be tested with more choices, for example to see whether ’KNN / K’ can improve performance with a proper K value. Thirdly, how weather data segmentation and data transformation affect weather predictions can be explored. Finally, more tasks can be carried out to make the outcome more widely relevant and valid for more datasets and models.

This study is expected to be supplemented with more computing strategies or evaluation measures to produce a more comprehensive and universal investigation. In addition, it is hoped that the conclusions will serve as a basic overview and continuation of findings on interpolation methods, provide as much useful information as possible, and offer potential suggestions in various fields. The study is expected to be valuable for future weather prediction or data preprocessing projects, especially for meteorological datasets, as well as for other machine learning-based tasks applied to other models and datasets in investigations and applications.

References

1. Ren, X., Li, X., Ren, K., Song, J., Zichen, X., Deng, K., Wang, X.: Deep learning-based weather prediction: a survey. Big Data Res. 23, 100178 (2021)
2. Rasp, S., Dueben, P.D., Scher, S., Weyn, J.A., Mouatadid, S., Thuerey, N.: Weatherbench: a benchmark data set for data-driven weather forecasting. J. Adv. Model. Earth Syst. 12(11), e2020MS002203 (2020)
3. Holmstrom, M., Liu, D., Vo, C.: Machine learning applied to weather forecasting. Meteorol. Appl. 10, 1–5 (2016)
4. Lim, B., Zohren, S.: Time-series forecasting with deep learning: a survey. Phil. Trans. R. Soc. A 379(2194), 20200209 (2021)
5. Hewage, P., Trovati, M., Pereira, E., Behera, A.: Deep learning-based effective fine-grained weather forecasting model. Pattern Anal. Appl. 24(1), 343–366 (2021)
6. Scher, S., Messori, G.: Predicting weather forecast uncertainty with machine learning. Q. J. R. Meteorol. Soc. 144(717), 2830–2841 (2018)
7. Fu, Q., Niu, D., Zang, Z., Huang, J., Diao, L.: Multi-stations’ weather prediction based on hybrid model using 1d cnn and bi-lstm. In: 2019 Chinese control conference (CCC), pp. 3771–3775. IEEE (2019)
8. Salman, A.G., Kanigoro, B., Heryadi, Y.: Weather forecasting using deep learning techniques. In: 2015 International Conference on Advanced Computer Science and Information Systems (ICACSIS), pp. 281–285. IEEE (2015)
9. El-Habil, B.Y., Abu-Naser, S.S.: Global climate prediction using deep learning. J. Theor. Appl. Inf. Technol. 100(24), 4824–4838 (2022)
10. Kumar, P., Chandra, R., Bansal, C., Kalyanaraman, S., Ganu, T., Grant, M.: Micro-climate prediction-multi scale encoder-decoder based deep learning framework. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 3128–3138 (2021)
11. Li, X., Peng, L., Yuan, H., Shao, J., Chi, T.: Deep learning architecture for air quality predictions. Environ. Sci. Pollut. Res. 23, 22408–22417 (2016)
12. Mao, W., Wang, W., Jiao, L., Zhao, S., Liu, A.: Modeling air quality prediction using a deep learning approach: Method optimization and evaluation. Sustain. Cities Soc. 65, 102567 (2021)
13. Chattopadhyay, A., Nabizadeh, E., Hassanzadeh, P.: Analog forecasting of extreme-causing weather patterns using deep learning. J. Adv. Model. Earth Syst. 12(2), e2019MS001958 (2020)
14. Jacques-Dumas, V., Ragone, F., Borgnat, P., Abry, P., Bouchet, F.: Deep learning-based extreme heatwave forecast. Front. Clim. 4 (2022)
15. Saha, S., Bera, B., Shit, P.K., Bhattacharjee, S., Sengupta, N.: Prediction of forest fire susceptibility applying machine and deep learning algorithms for conservation priorities of forest resources. Remote Sens. Appl.: Soc. Environ. 29, 100917 (2023)
16. Naderpour, M., Rizeei, H.M., Ramezani, F.: Forest fire risk prediction: a spatial deep neural network-based framework. Remote Sens. 13(13), 2513 (2021)
17. Chen, C., Jiang, J., Liao, Z., Zhou, Y., Wang, H., Pei, Q.: A short-term flood prediction based on spatial deep learning network: a case study for Xi county, China. J. Hydrol. 607, 127535 (2022)
18. Zhou, K., Zheng, Y., Dong, W., Wang, T.: A deep learning network for cloud-to-ground lightning nowcasting with multisource data. J. Atmos. Oceanic Technol. 37(5), 927–942 (2020)
19. Leinonen, J., Hamann, U., Germann, U.: Seamless lightning nowcasting with recurrent-convolutional deep learning. Artif. Intell. Earth Syst. 1(4), e220043 (2022)
20. Jiang, G.-Q., Jing, X., Wei, J.: A deep learning algorithm of neural network for the parameterization of typhoon-ocean feedback in typhoon forecast models. Geophys. Res. Lett. 45(8), 3706–3716 (2018)
21. Jiang, S., Fan, H., Wang, C.: Improvement of typhoon intensity forecasting by using a novel spatio-temporal deep learning model. Remote Sens. 14(20), 5205 (2022)
22. García, S., Ramírez-Gallego, S., Luengo, J., Benítez, J.M., Herrera, F.: Big data preprocessing: methods and prospects. Big Data Anal. 1(1), 1–22 (2016)
23. Luengo, J., García, S., Herrera, F.: On the choice of the best imputation methods for missing values considering three groups of classification methods. Knowl. Inf. Syst. 32, 77–108 (2012)
24. Hewage, P., Behera, A., Trovati, M., Pereira, E., Ghahremani, M., Palmieri, F., Liu, Y.: Temporal convolutional neural (tcn) network for an effective weather forecasting using time-series data from the local weather station. Soft. Comput. 24, 16453–16482 (2020)
25. Pondaven, A., Bakler, M., Guo, D., Hashim, H., Ignatov, M., Zhu, H.: Convolutional neural processes for inpainting satellite images. arXiv preprint arXiv:2205.12407 (2022)
26. Zhang, Q., Yuan, Q., Zeng, C., Li, X., Wei, Y.: Missing data reconstruction in remote sensing image with a unified spatial-temporal-spectral deep convolutional neural network. IEEE Trans. Geosci. Remote Sens. 56(8), 4274–4288 (2018)
27. Wang, Y., Zhou, X., Ao, Z., Xiao, K., Yan, C., Xin, Q.: Gap-filling and missing information recovery for time series of modis data using deep learning-based methods. Remote Sens. 14(19), 4692 (2022)
28. Zhao, Q., Le, Yu., Zhenrong, D., Peng, D., Hao, P., Zhang, Y., Gong, P.: An overview of the applications of earth observation satellite data: impacts and future trends. Remote Sens. 14(8), 1863 (2022)
29. Liu, M., Yang, W., Zhu, X., Chen, J., Chen, X., Yang, L., Helmer, E.H.: An improved flexible spatiotemporal data fusion (ifsdaf) method for producing high spatiotemporal resolution normalized difference vegetation index time series. Remote Sens. Environ. 227, 74–89 (2019)
30. Kaiser, J.: Dealing with missing values in data. J. Syst. Integr. 5(1) (2014)
31. Kanaroglou, P.S., Soulakellis, N.A., Sifakis, N.I.: Improvement of satellite derived pollution maps with the use of a geostatistical interpolation method. J. Geogr. Syst. 4, 193–208 (2002)
32. Bhattacharjee, S., Mitra, P., Ghosh, S.K.: Spatial interpolation to predict missing attributes in gis using semantic kriging. IEEE Trans. Geosci. Remote Sens. 52(8), 4771–4780 (2013)
33. He, Z., Lei, L., Zhang, Y., Sheng, M., Wu, C., Li, L., Zeng, Z.-C., Welp, L.R.: Spatio-temporal mapping of multi-satellite observed column atmospheric co2 using precision-weighted kriging method. Remote Sens. 12(3), 576 (2020)
34. Kostopoulou, E.: Applicability of ordinary kriging modeling techniques for filling satellite data gaps in support of coastal management. Model. Earth Syst. Environ. 7(2), 1145–1158 (2021)
35. Horton, N.J., Kleinman, K.P.: Much ado about nothing: a comparison of missing data methods and software to fit incomplete data regression models. Am. Stat. 61(1), 79–90 (2007)
36. Lakshminarayan, K., Harp, S.A., Samad, T.: Imputation of missing data in industrial databases. Appl. Intell. 11(3), 259–275 (1999)
37. Wang, H., Wang, S.: Mining incomplete survey data through classification. Knowl. Inf. Syst. 24, 221–233 (2010)
38. Chen, S.-M., Huang, C.-M.: Generating weighted fuzzy rules from relational database systems for estimating null values using genetic algorithms. IEEE Trans. Fuzzy Syst. 11(4), 495–506 (2003)
39. Somasundaram, R.S., Nedunchezhian, R.: Evaluation of three simple imputation methods for enhancing preprocessing of data with missing values. Int. J. Comput. Appl. 21(10), 14–19 (2011)
40. Pyle, D.: Data Preparation for Data Mining. Morgan Kaufmann (1999)
41. IEEE BigData 2021. Weather4cast: multi-sensor weather forecast competition at scale (2021)
42. Choong, H.L., Hyung-Jin, Y.: Medical big data: promise and challenges. Kidney Res. Clin. Pract. 36(1), 3 (2017)
43. Kantardzic, M.: Data Mining: Concepts, Models, Methods, and Algorithms. Wiley, London (2011)
44. Noor, N.M., Al Bakri, A., Mohd, M., Yahaya, A.S., Ramli, N.A.: Comparison of linear interpolation method and mean method to replace the missing values in environmental data set. In: Materials Science Forum, vol. 803, pp. 278–281. Trans Tech Publ (2015)
45. Du, H., Shu, L.: Camouflage images based on mean value interpolation. In: Proceedings of the 2012 International Conference on Information Technology and Software Engineering: Software Engineering and Digital Media Technology, pp. 775–782. Springer (2013)
46. Bhatt, P., Shah, A., Patel, S., Patel, S.: Image enhancement using various interpolation methods. Int. J. Comput. Sci. Inf. Technol. Secur. 2(4) (2012)
47. Feng, L., Nowak, G., O’Neill, T.J., Welsh, A.H.: Cutoff: a spatio-temporal imputation method. J. Hydrol. 519, 3591–3605 (2014)
48. Bokde, N., Beck, M.W., Álvarez, F.M., Kulat, K.: A novel imputation methodology for time series based on pattern sequence forecasting. Pattern Recognit. Lett. 116, 88–96 (2018)
49. Blu, T., Thévenaz, P., Unser, M.: Linear interpolation revitalized. IEEE Trans. Image Process. 13(5), 710–719 (2004)
50. Soltani, A., Meinke, H., De Voil, P.: Assessing linear interpolation to generate daily radiation and temperature data for use in crop simulations. Eur. J. Agron. 21(2), 133–148 (2004)
51. Ruzanski, E., Chandrasekar, V.: Weather radar data interpolation using a kernel-based Lagrangian nowcasting technique. IEEE Trans. Geosci. Remote Sens. 53(6), 3073–3083 (2014)
52. Gore, R., Gawali, B., Pachpatte, D.: Weather parameter analysis using interpolation methods. Artif. Intell. Appl. 1, 260–272 (2023)
53. Li, L., Revesz, P.: Interpolation methods for spatio-temporal geographic data. Comput. Environ. Urban Syst. 28(3), 201–227 (2004)
54. Xing, Y., Song, Q., Cheng, G.: Benefit of interpolation in nearest neighbor algorithms. arXiv preprint arXiv:1909.11720 (2019)
55. Hamed, Y., Mustaffa, Z.B., Idris, N.R.B., et al.: An application of k-nearest neighbor interpolation on calibrating corrosion measurements collected by two non-destructive techniques. In: 2015 IEEE 3rd International Conference on Smart Instrumentation, Measurement and Applications (ICSIMA), pp. 1–5. IEEE (2015)
56. Kiani, K., Saleem, K.: K-nearest temperature trends: a method for weather temperature data imputation. In: Proceedings of the 2017 International Conference on Information System and Data Mining, pp. 23–27 (2017)
57. Poloczek, J., Treiber, N.A., Kramer, O.: Knn regression as geo-imputation method for spatio-temporal wind data. In: International Joint Conference SOCO’14-CISIS’14-ICEUTE’14: Bilbao, Spain, June 25th–27th, 2014, Proceedings, pp. 185–193. Springer (2014)
58. Sharif, M., Burn, D.H., Wey, K.M.: Daily and hourly weather data generation using a k-nearest neighbour approach. In: Canadian Hydrotechnical Conference, pp. 1–10. Citeseer (2007)
59. Hisham, M.B., Yaakob, S.N., Raof, R.A., Nazren, A.B., Wafi, N.M.: An analysis of performance for commonly used interpolation method. Adv. Sci. Lett. 23(6), 5147–5150 (2017)
60. Setianto, A., Triandini, T.: Comparison of kriging and inverse distance weighted (idw) interpolation methods in lineament extraction and analysis. J. Appl. Geol. 5(1) (2013)
61. Yang, W., Zhao, Y., Wang, D., Huihui, W., Lin, A., He, L.: Using principal components analysis and idw interpolation to determine spatial and temporal changes of surface water quality of Xin’anjiang river in Huangshan, China. Int. J. Environ. Res. Public Health 17(8), 2942 (2020)
62. Bartier, P.M., Keller, C.P.: Multivariate interpolation to incorporate thematic surface data using inverse distance weighting (idw). Comput. Geosci. 22(7), 795–799 (1996)
63. Ikechukwu, M.N., Ebinne, E., Idorenyin, U., Raphael, N.I.: Accuracy assessment and comparative analysis of idw, spline and kriging in spatial interpolation of landform (topography): an experimental study. J. Geogr. Inf. Syst. 9(3), 354–371 (2017)
64. Gong, G., Mattevada, S., O’Bryant, S.E.: Comparison of the accuracy of kriging and idw interpolations in estimating groundwater arsenic concentrations in texas. Environ. Res. 130, 59–69 (2014)
65. Sara, U., Akter, M., Uddin, M.S.: Image quality assessment through fsim, ssim, mse and psnr: a comparative study. J. Comput. Commun. 7(3), 8–18 (2019)
66. Li, X., Chen, H., Qi, X., Dou, Q., Chi-Wing, F., Heng, P.-A.: H-denseunet: hybrid densely connected unet for liver and tumor segmentation from ct volumes. IEEE Trans. Med. Imaging 37(12), 2663–2674 (2018)
67. Huang, G., Liu, Z., Van Der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4700–4708 (2017)
68. Misra, D.: Mish: a self regularized non-monotonic activation function. arXiv preprint arXiv:1908.08681 (2019)
69. Loshchilov, I., Hutter, F.: Fixing weight decay regularization in adam (2017)
70. You, K., Long, M., Wang, J., Jordan, M.I.: How does learning rate decay help modern neural networks? arXiv preprint arXiv:1908.01878 (2019)
71. Smith, L.N.: Cyclical learning rates for training neural networks. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 464–472. IEEE (2017)


Author information

Authors and Affiliations

Independent Researcher, Evanston, IL, USA
Jiayu Wang


Corresponding author

Correspondence to Jiayu Wang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Wang, J. Data interpolation methods with the UNet-based model for weather forecast. Int J Data Sci Anal (2024). https://doi.org/10.1007/s41060-024-00611-z


Received: 09 January 2024
Accepted: 12 July 2024
Published: 09 August 2024
DOI: https://doi.org/10.1007/s41060-024-00611-z


1 Introduction