Introduction

Due to the rapid development of space technology, the accuracy of Earth Orientation Parameter (EOP) estimates improved dramatically in the 1990s (Schuh et al. 2002) and has remained at a high level ever since. The EOPs consist of precession–nutation, polar motion (PM), the difference between Universal Time (UT1) and Coordinated Universal Time (UTC), that is, UT1-UTC, and Length of Day (LOD) (Petit and Luzum 2010). The Earth Rotation Parameters (ERPs) comprise PM (with components PMX and PMY), UT1-UTC, and LOD. PM describes the instantaneous movement of the Earth's rotation axis with respect to the terrestrial reference frame. A major complication for PM is that it is driven by partially unpredictable mass redistributions on the surface and in the interior of the Earth (Gross 2007; Dobslaw et al. 2010; Sun et al. 2019; Börger et al. 2023). Modern space navigation and deep space exploration increasingly require accurate real-time prediction of ERPs. Because of the complicated data processing of modern geodetic techniques, such as the Global Positioning System (GPS), ERP results become available only with a delay of 15–20 h; for Very Long Baseline Interferometry (VLBI) and Satellite Laser Ranging (SLR), obtaining ERPs takes several days. These latencies make it challenging to acquire real-time ERPs, emphasizing the need for accurate predictions (Zhang et al. 2012). Several national and international services publish predicted values of EOPs, such as the International Earth Rotation and Reference Systems Service (IERS) Rapid Service/Prediction Center (RS/PC), operated by the US Naval Observatory (USNO) (Guo et al. 2013), whose predictions are published in the IERS Bulletin A files, or the EOP service of the Institute of Applied Astronomy of the Russian Academy of Sciences (IAA RAS) (Suvorkin et al. 2015). The products provided by these agencies comprise estimates of PM, UT1-UTC, LOD, and other parameters, usually for a year into the future at daily sampling.

The polar motion comprises a regular deterministic component and an irregular stochastic component. The deterministic part consists of the long-term trend, the Chandler wobble (CW) (Chandler 1891; Zharkov and Molodensky 1996), the annual wobble (AW), and the semi-annual wobble (SAW) (Wang et al. 2016; Gross 2000). The Chandler wobble is a resonant rotational mode of the Earth that decays freely owing to the Earth's viscoelasticity; studies have shown that, without excitation, the CW would decay to the minimum rotational energy state within about 68 years. It is generally believed that the period and amplitude of the CW vary over time, with the period fluctuating between 1.13 and 1.20 years (Schuh et al. 2001). The annual oscillation of the pole path includes both prograde and retrograde components. The amplitude of the prograde part is about 10 times that of the retrograde part, and the period of the prograde annual oscillation varies considerably, between 356 and 376 days (Joachim 2004). Considering the characteristics of the secular drift, CW, and AW, scholars have conducted extensive studies and proposed various methods for predicting the ERPs. In general, these methods fall into linear and non-linear models. Kalman filtering (Babcock and Wilkins 1989), least squares (LS) extrapolation, fuzzy inference systems (Akyilmaz and Kutterer 2004), autoregressive (AR) models (Sun and Xu 2012), autocovariance models (Kosek 2002), and different combinations of these methods (Kosek and Popiński 2005; Kosek et al. 2004; Kosek et al. 2008) are linear models. Methods such as threshold autoregressive models, artificial neural networks (Liao et al. 2012; Egger 1992), and fuzzy reasoning are non-linear models.

More hybrid and machine learning methods have been introduced in recent years for predicting ERP variations. The rapid growth in computing power and data volume has made the application of deep learning in geodesy increasingly promising. In particular, the long short-term memory (LSTM) network (Hochreiter and Schmidhuber 1997), one of the most popular forms of recurrent neural networks (RNNs), is well suited to geodetic time series prediction. Owing to the unique structure of its cells, the LSTM network can capture the non-linear relations between different epochs of a time series (Gers et al. 2000; Graves and Schmidhuber 2005). Some researchers have used the LSTM model to predict LOD (Gou et al. 2021), which suggests that it may also be suitable for PM prediction. This study investigates the potential of combining LSTM with traditional methods for predicting PM. The novelty of the proposed method is that the non-linear (stochastic) part of PM is predicted not by the linear AR model but by the LSTM deep learning model.

This paper is structured as follows: In the second section, we describe the LSTM and LS + AR algorithms. Section three introduces the dataset and processing strategy, including the data used in each experiment, the amplitude variations and characteristics of the AW, SAW, and CW in PM obtained through Fast Fourier Transform (FFT) spectrum analysis, and the detailed PM prediction procedures of the LSTM, LS + AR, and LS + LSTM models. Next, we present the different models used to estimate PM variability, including LS + AR, LSTM, and LS + LSTM, all of which draw on IERS EOP 14C04 data from 2011 to the end of 2020. At the same time, Bulletin A from the IERS RS/PC is used to compare its prediction accuracy with the results derived in this paper. A summary of the findings is given in the last section.

Materials and methods

LSTM prediction model

Introduction of the general concept of LSTM

LSTM is now widely used and has proven to perform well on various problems such as handwriting recognition, speech recognition, and time series prediction (Schmidhuber 2015; Alex et al. 2018). However, a neural architecture would not be so widely utilized in practice without a solid foundation. Greff et al. (2017) reviewed several LSTM variants and their performance relative to the so-called vanilla model. The variant LSTM is an improved model based on the original LSTM (Hochreiter and Schmidhuber 1997; Gers and Schmidhuber 2000); its main change compared to the original model is the addition of cell state information to the inputs of the three control gates. Unlike feedforward neural networks, RNNs contain feedback loops, so that activations from previous steps are fed back as input and influence the current output. However, training recurrent or very deep neural networks is challenging because they frequently suffer from exploding and vanishing gradient problems (Hochreiter 1991; Hochreiter et al. 2001). The LSTM architecture was developed to address this deficiency and to learn long-term dependencies. Figure 1 depicts the LSTM network structure, which is detailed in Appendix A.

Fig. 1 The architecture of LSTM with a forget gate
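To make the gate structure in Fig. 1 concrete, the following minimal NumPy sketch implements a single forward step of an LSTM cell with a forget gate. The weight shapes, initialization, and variable names are illustrative assumptions, not taken from the software used in this study.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of an LSTM cell with a forget gate.

    x_t    : input vector at time t, shape (n_in,)
    h_prev : previous hidden state, shape (n_hid,)
    c_prev : previous cell state, shape (n_hid,)
    W, U   : dicts of input/recurrent weight matrices for gates 'f', 'i', 'o', 'g'
    b      : dict of bias vectors for the same gates
    """
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])   # candidate cell state
    c_t = f * c_prev + i * g                               # cell state update
    h_t = o * np.tanh(c_t)                                 # new hidden state
    return h_t, c_t

# Tiny usage example with random weights (purely illustrative)
rng = np.random.default_rng(0)
n_in, n_hid = 1, 50
W = {k: rng.normal(0, 0.1, (n_hid, n_in)) for k in 'fiog'}
U = {k: rng.normal(0, 0.1, (n_hid, n_hid)) for k in 'fiog'}
b = {k: np.zeros(n_hid) for k in 'fiog'}
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_cell_step(np.array([0.17]), h, c, W, U, b)
```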

LSTM training results analysis

In the LSTM network training, the number of hidden layers is set to 2 and the number of LSTM cells per hidden layer is 50. The number of time steps is set to 365 and the number of training iterations to 1000. The dropout rate is 0.1. The Savitzky–Golay (SG) smoothing filter is applied in the experiments of this paper. The initial learning rate is set to 0.1 and the learning rate drop factor to 0.2 (Greff et al. 2017; Ren et al. 2020). The gradient threshold is set to 1 (Din et al. 2019). Other parameter settings are listed in Appendix A, Table 4. Figure 2 shows the LSTM network training based on the PM time series. Figure 2a and b indicates that the correlation between the original and output sequences is 0.99982 for PMX and 0.99987 for PMY. Figure 2c and d shows that the Root Mean Square Error (RMSE) of PMX and PMY is 1.7916 mas and 1.6128 mas, respectively. Figure 2e and f shows that the mean errors of PMX and PMY are −0.1682 mas and 0.3365 mas, and the Standard Deviations (STD) are 1.7840 mas and 1.5776 mas, respectively.

Fig. 2 LSTM network training for PMX and PMY. a, b raw PMX and PMY data (blue line) and network outputs (red line); c, d the corresponding errors; e, f error histograms of PM from 2011 to 2020
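The hyperparameter choices listed above can be reproduced in most deep learning frameworks. The sketch below shows one possible PyTorch configuration with the same settings (2 layers × 50 cells, dropout 0.1, initial learning rate 0.1 with a decay factor of 0.2, gradient clipping at 1); it is an illustrative equivalent under these assumptions, not the original implementation, and the learning rate drop period is assumed here.

```python
import torch
import torch.nn as nn

class PMLstm(nn.Module):
    """Two-layer LSTM (50 cells per layer) followed by a linear read-out."""
    def __init__(self, n_features=1, n_hidden=50, n_layers=2, dropout=0.1):
        super().__init__()
        self.lstm = nn.LSTM(n_features, n_hidden, n_layers,
                            batch_first=True, dropout=dropout)
        self.head = nn.Linear(n_hidden, 1)

    def forward(self, x):                 # x: (batch, time_steps, n_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict the next epoch

model = PMLstm()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)          # initial learning rate 0.1
scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                            step_size=125,       # drop period: assumed
                                            gamma=0.2)           # learning rate drop factor 0.2
loss_fn = nn.MSELoss()

def train_step(x_batch, y_batch):
    optimizer.zero_grad()
    loss = loss_fn(model(x_batch), y_batch)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)  # gradient threshold 1
    optimizer.step()
    return loss.item()
# scheduler.step() is called once per epoch to apply the 0.2 decay
```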

LS + AR prediction model

LS model

We use the following model to fit the trend and periodic terms of EOP, whose parameters can be estimated using the least squares method. The residuals are then analyzed by the AR and other models. The least squares model can be described as

$$F\left(t\right)=A+Bt+{C}_{1}\mathrm{cos}\left(\frac{2\pi t}{{P}_{SA}} \right)+{C}_{2}\mathrm{sin}\left(\frac{2\pi t}{{P}_{SA}} \right)+{D}_{1}\mathrm{cos}\left(\frac{2\pi t}{{P}_{A}} \right)+{D}_{2}\mathrm{sin}\left(\frac{2\pi t}{{P}_{A}} \right)+{E}_{1}\mathrm{cos}\left(\frac{2\pi t}{{P}_{C}} \right)+{E}_{2}\mathrm{sin}\left(\frac{2\pi t}{{P}_{C}} \right)+\omega ,$$
(1)

where A is the constant term, B is the linear trend parameter, C1 and C2 are the SAW parameters, D1 and D2 are the AW parameters, and E1 and E2 are the CW parameters. \({P}_{SA}\), \({P}_{A}\), and \({P}_{C}\) are the periods (in years) of the SAW, AW, and CW, respectively.

We additionally conducted an FFT analysis of the EOP 14C04 series. From Fig. 3a and b, it can be seen that the CW and AW dominate the PM spectrum, manifested as cusps of power between 413 and 439 days (CW) and between 356 and 376 days (AW). These values are broadly consistent with estimates given elsewhere (McCarthy and Luzum 1991; Schuh et al. 2001; Joachim 2004). In our model (Eq. 1), the AW period is 365.25 days, the CW period is 434 days, and the SAW period is 182.62 days. \(\upomega\) is the random error and \(t\) is the UTC epoch of the series, converted into years for the LS fitting. The meaning of each parameter is identical for the PMX and PMY series.

Fig. 3 Amplitude spectrum of the two components of polar motion (PMX in red, PMY in blue) as deduced from FFT of the respective time series. Units are [mas]
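Once the periods are fixed (182.62, 365.25, and 434 days), Eq. (1) is linear in its coefficients and can be fitted by ordinary least squares. The following NumPy sketch fits the model to a daily pole-coordinate series and extrapolates it; it works with time in days rather than years, and the function names and synthetic input are illustrative assumptions, not the code used in this study.

```python
import numpy as np

P_SA, P_A, P_C = 182.62, 365.25, 434.0   # periods in days (SAW, AW, CW)

def design_matrix(t_days):
    """Columns: constant, trend, and cos/sin terms for SAW, AW, CW (Eq. 1)."""
    cols = [np.ones_like(t_days), t_days]
    for period in (P_SA, P_A, P_C):
        w = 2.0 * np.pi * t_days / period
        cols += [np.cos(w), np.sin(w)]
    return np.column_stack(cols)

def ls_fit_extrapolate(series, n_forecast=365):
    """Fit Eq. (1) to a daily series and extrapolate n_forecast days ahead."""
    t_obs = np.arange(series.size, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design_matrix(t_obs), series, rcond=None)
    t_all = np.arange(series.size + n_forecast, dtype=float)
    model = design_matrix(t_all) @ coeffs
    fit, forecast = model[:series.size], model[series.size:]
    residuals = series - fit        # stochastic part, later handled by AR or LSTM
    return fit, forecast, residuals

# Example with a synthetic PMX-like series (10 years, daily sampling)
t = np.arange(3653.0)
pmx = 0.05 + 1e-5 * t + 0.1 * np.sin(2 * np.pi * t / P_A) + 0.15 * np.cos(2 * np.pi * t / P_C)
fit, forecast, residuals = ls_fit_extrapolate(pmx, n_forecast=365)
```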

AR model

The AR(p) model describes the relationship between the values of a random series \({z}_{t}\) (t = 1, 2, …, N) before time t and its current value. It can be written as follows:

$${z}_{t}=\sum_{i=1}^{p}{\varphi }_{i}{z}_{t-i}+{\omega }_{t},$$
(2)

where \({\varphi }_{1}, {\varphi }_{2}, \dots , {\varphi }_{p}\) represent the autoregressive coefficients, obtained by solving the Yule–Walker equations with the Levinson–Durbin recursion (Brockwell and Davis 1997), \({\omega }_{t}\) is white noise with zero mean, and \(p\) is the model order. The above equation, denoted \(\mathrm{AR}\left(p\right)\), is the AR model of order \(p\), and determining the order \(p\) is crucial. Three criteria are commonly used to determine \(p\): Akaike's final prediction error (FPE) criterion, the information criterion, and the transfer function criterion. In this paper, the FPE criterion is adopted, and the order \(p\) is chosen to minimize the FPE (Akaike 1971):

$${\mathrm{FPE}}_{p}={P}_{p}(N+p+1)/(N-p-1),$$
(3)
$${P}_{p}=1/(N-p)\sum_{t=p+1}^{N}{\left({z}_{t}-\sum_{j=1}^{p}{\varphi }_{j}{z}_{t-j}\right)}^{2}.$$
(4)
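A compact way to realize Eqs. (2)–(4) is to estimate the AR coefficients from the sample autocovariances via the Yule–Walker equations and to scan candidate orders for the minimum FPE. The sketch below is a plain NumPy illustration that solves the Yule–Walker system directly rather than with the Levinson–Durbin recursion; it is a hedged example, not the code used in this study.

```python
import numpy as np

def autocovariance(z, max_lag):
    z = z - z.mean()
    n = z.size
    return np.array([np.dot(z[:n - k], z[k:]) / n for k in range(max_lag + 1)])

def fit_ar_yule_walker(z, p):
    """AR(p) coefficients from the Yule-Walker equations (cf. Eq. 2)."""
    gamma = autocovariance(z, p)
    R = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, gamma[1:p + 1])

def fpe(z, p):
    """Akaike's final prediction error for order p (Eqs. 3-4)."""
    phi = fit_ar_yule_walker(z, p)
    n = z.size
    resid = z[p:] - np.array([phi @ z[t - p:t][::-1] for t in range(p, n)])
    P_p = np.sum(resid ** 2) / (n - p)
    return P_p * (n + p + 1) / (n - p - 1)

def select_order(z, max_order=60):
    """Choose the order that minimizes the FPE."""
    return min(range(1, max_order + 1), key=lambda p: fpe(z, p))

def ar_forecast(z, phi, steps):
    """Iterated multi-step AR prediction of the residual series."""
    history = list(z)
    p = phi.size
    for _ in range(steps):
        history.append(phi @ np.array(history[-p:])[::-1])
    return np.array(history[len(z):])
```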

The mean absolute error (MAE) is utilized to evaluate the prediction accuracy. It can be expressed as follows:

$${\mathrm{MAE}}_{j}=\frac{1}{n}\sum_{i=1}^{n}\left|{P}_{i}-{X}_{i}\right|,$$
(5)

where \({P}_{i}\) is the \(i\)-th predicted value, \({X}_{i}\) is the corresponding observed value, \(n\) is the total number of predictions, and \({\mathrm{MAE}}_{j}\) is the MAE at prediction span \(j\).
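For completeness, Eq. (5) corresponds to the following short evaluation routine, applied per prediction span; the array names are illustrative.

```python
import numpy as np

def mae(predictions, observations):
    """Mean absolute error (Eq. 5) between predicted and observed values."""
    return np.mean(np.abs(np.asarray(predictions) - np.asarray(observations)))
```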

Data description and processing strategy

Data description

In particular, we use the PM time series from IERS EOP 14C04 with a daily sampling interval, available at https://hpiers.obspm.fr/eoppc/eop/eopc04/. In this study, we use the PM series from January 8, 2011 to December 31, 2021. The results are compared to Bulletin A (558 files), available at https://www.iers.org/IERS/EN/DataProducts/EarthOrientationData/eop.html, for the same period as IERS EOP 14C04. The LSTM network training is based on the PM time series from January 1, 2011 to December 31, 2020.
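Reading the daily C04 series takes only a few lines of pandas. The sketch below is a hedged example: the column order and the number of header lines are assumptions based on the standard C04 ASCII layout (year, month, day, MJD, x, y, UT1-UTC, LOD, ...) and should be checked against the header of the downloaded file.

```python
import pandas as pd

def load_c04(path, skip_header=14):
    """Load pole coordinates from an EOP 14 C04 ASCII file (layout assumed)."""
    # Whitespace-separated values; assumed column order:
    # year, month, day, MJD, x, y, UT1-UTC, LOD, ... (verify against file header)
    df = pd.read_csv(path, sep=r"\s+", skiprows=skip_header, header=None)
    df = df.rename(columns={3: "mjd", 4: "x", 5: "y"})
    # Restrict to the study period: 2011-01-08 (MJD 55569) to 2021-12-31 (MJD 59579)
    mask = (df["mjd"] >= 55569) & (df["mjd"] <= 59579)
    return df.loc[mask, ["mjd", "x", "y"]].reset_index(drop=True)
```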

PM prediction processing strategy

Figure 4 depicts a schematic representation of the methodology adopted for predicting PM with the various models. The observed PM can be divided into deterministic and stochastic components. The known component is referred to as the a priori model, consisting of the long-term trend, CW, AW, and SAW. In this study, the LS + AR model (first method) is applied to forecast PMX and PMY and compared to results based on the LS + LSTM (second method) and LSTM models (third method). Figure 4 describes the respective processing schemes. For the first two methods, the residuals obtained after subtracting the a priori model are used as input: in the LS + AR approach they are modeled by the AR process, whereas in the LS + LSTM approach they serve as training patterns for the LSTM. The predicted residuals are then added to the extrapolated a priori model to obtain the final predicted values of PMX and PMY. The third method predicts PM with the LSTM model directly, relying on the IERS EOP 14C04 time series.

Fig. 4 Flowchart of the LS + AR, LS + LSTM, and LSTM approaches for PM prediction
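One practical detail of the LS + LSTM branch in Fig. 4 is turning the residual series into supervised training samples for the LSTM and rolling the trained one-step predictor forward over the 365-day forecast horizon. A common way to do this, assuming an input window of 365 daily values predicting the next value (consistent with the network settings above), is sketched below; it is an illustration, not the authors' exact preprocessing.

```python
import numpy as np

def make_windows(residuals, window=365):
    """Slice a 1-D residual series into (input window, next value) pairs."""
    X, y = [], []
    for start in range(residuals.size - window):
        X.append(residuals[start:start + window])
        y.append(residuals[start + window])
    # Shapes: (n_samples, window, 1) for inputs, (n_samples,) for targets
    return np.asarray(X)[..., None], np.asarray(y)

def iterate_forecast(predict_one, last_window, steps=365):
    """Roll a one-step predictor forward to obtain a 365-day residual forecast.

    predict_one : callable returning the next residual for an input of shape (1, window, 1)
    last_window : the most recent `window` residual values
    """
    window = np.array(last_window, dtype=float)
    out = []
    for _ in range(steps):
        nxt = float(predict_one(window[None, :, None]))
        out.append(nxt)
        window = np.append(window[1:], nxt)   # slide the window forward
    return np.array(out)
```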

Results and discussion

PM prediction using the LS + AR model

We initially preprocess the PM series using the LS model and derive the residuals (that is, the stochastic components) by subtracting the LS analysis results from the original pole coordinates. Figure 5a shows the PM residuals (purple line) derived from the IERS EOP 14C04 from 2011 to 2020. The residuals of PMX and PMY are within \(\pm 0.08\) arcsecond (as). Owing to the nature of the LS fitting model, the fluctuations at the start and end of the time series are somewhat larger than in the middle part. Figure 5b depicts the first-order differences of the residual sequences (brown line) for PMX and PMY. Most of the residual values and the first-difference values are within \(\pm 0.1\) as and \(\pm 2.0\) milliarcseconds (mas), respectively.

Fig. 5 Residual series (purple line) of the LS fitting model and the first-order difference series (brown line) of the residuals from 2011 to 2020

Determining the appropriate type of Autoregressive Integrated Moving Average (ARIMA) model is essential. When \(p=0\), the \(\mathrm{ARIMA}(p,q)\) model reduces to \(\mathrm{MA}(q)\), i.e., a q-order moving average model; when \(q=0\), it reduces to \(\mathrm{AR}(p)\), i.e., a p-order autoregressive model (Box et al. 1976). For the PM time series, Table 1 lists the criteria by which the model type can be identified from the autocorrelation and partial autocorrelation functions of the series. Figure 6 depicts the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the first-order differences of the residual time series for lags of 1 to 40 days. The results indicate that the ACF of the first-order difference sequence of the residuals tails off, while the PACF cuts off, so the \(\mathrm{AR}(p)\) model can be used for prediction, i.e., \(q=0\) (Schaffer et al. 2021). In this research, the FPE criterion described in Eqs. (3) and (4) is used to determine the order p. The optimal order for the AR model, as determined by the final prediction error criterion, is p = 50.

Table 1 Judgment criteria for identifying the ARIMA model type

Fig. 6 The ACF and PACF of the first-order differences of the PM residual time series for lags from 1 to 40 days. All results in units of [mas]
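The identification step above can be reproduced with standard tools: difference the residuals once, inspect the ACF and PACF out to 40 lags, and check where the PACF drops below its approximate significance bound. The sketch below uses statsmodels for the correlation functions; the residual array is assumed to come from the LS fit described earlier, and the order itself would then be fixed with the FPE routine sketched in the AR subsection.

```python
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

def identify_ar(residuals, nlags=40):
    """Inspect ACF/PACF of the first-differenced residuals (cf. Table 1 and Fig. 6)."""
    diff1 = np.diff(residuals)                     # first-order differences
    acf_vals = acf(diff1, nlags=nlags)             # should tail off for an AR process
    pacf_vals = pacf(diff1, nlags=nlags)           # should cut off after lag p
    conf = 1.96 / np.sqrt(diff1.size)              # approximate 95% significance bound
    significant_pacf = np.flatnonzero(np.abs(pacf_vals[1:]) > conf) + 1
    return acf_vals, pacf_vals, significant_pacf   # a PACF cut-off suggests AR(p)
```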

In this experiment, we extrapolate the deterministic part for 2021 (365 days) using the LS model, based on the least squares fit to the IERS EOP 14C04 series from January 1, 2011 to December 31, 2020 (10 years). Figure 7 shows the LS model time series (red line) and the LS extrapolation time series (blue line) of PM. The purpose of the experiment is to use the LS + AR model for PM prediction. The final prediction of the LS + AR model is the sum of the LS extrapolation of the deterministic part and the AR prediction based on the residual sequence.

Fig. 7 The LS fitting results of PMX and PMY (red line) between 2011 and 2020, and the LS extrapolation of PMX and PMY (blue line) for the 365 days of 2021. All results are given in units of [as]

PM prediction using the LSTM model

Figure 8 displays the PM prediction based on the IERS EOP 14C04 time series (blue line) using the LSTM model. The green line represents the prediction outcomes for 2021 (365 days). It can be seen that the 365-day LSTM predictions of PMX and PMY for 2021 are consistent with the overall trend of the PMX and PMY time series from the IERS EOP 14C04. Most PMX prediction values fall between 0.15 as and 0.21 as, and the PMY prediction values lie between 0.28 as and 0.42 as.

Fig. 8 The PM prediction results (green line) from the LSTM model in 2021, based on the PMX and PMY observations (blue line) from IERS EOP 14C04 between 2011 and 2020. All results in units of [as]

PM prediction using the LS + LSTM model

To investigate the contribution of the LS + LSTM model to PM prediction, the LSTM is applied to forecast the residual part for 2021, using the basic residual time series. Figure 9 presents the predicted residual values for 2021 (blue line) from the LSTM model. The final PM prediction of the LS + LSTM model is the sum of the LS extrapolation from the LS model and the residuals predicted by the LSTM model.

Fig. 9 Residuals of PM (red line) from 2011 to 2020, and the residual prediction results (blue line) of PMX and PMY based on the LSTM model in 2021. All results are given in units of [as]

Figure 10a and b depicts the final PM predictions obtained with the different methods, i.e., the LS + AR, LSTM, and LS + LSTM models. In addition, the forecast results of Bulletin A for 2021 are included (purple line). The IERS EOP 14C04 time series is taken as the benchmark for comparing the outcomes of the various techniques. In terms of PM prediction, the results of the LS + LSTM model (green line) in Fig. 10c and d are the closest to the IERS EOP 14C04 time series (red) over the mid- and long-term prediction spans. Although the improvement is marginal, the PMX predictions of the LS + LSTM model are very close to or better than Bulletin A in the mid and long term, and it can also be seen that the PMX prediction accuracy of the LSTM model is higher than that of the LS + AR model in the mid and long term. For PMX, the RMSE of the results is 0.035 as, 0.031 as, 0.018 as, and 0.030 as for LS + AR, LSTM, LS + LSTM, and Bulletin A, respectively. For PMY, the RMSE is 0.038 as, 0.035 as, 0.015 as, and 0.035 as, respectively.

Fig. 10 The PMX and PMY predictions from the different models, i.e., LS + AR, LSTM, LS + LSTM, and Bulletin A, are shown in panels a and b, respectively, with IERS EOP 14C04 as a reference. Panels c and d show the 2021 forecasts separately. All results in units of [as]

Evaluating the PM prediction results

Based on the previous analysis (Fig. 10), the prediction accuracy of the LS + LSTM model is higher than that of the other models. To assess the accuracy of this method over a more extended period, we compare the PM predictions of the LS + LSTM, LS + AR, and LSTM models for different periods, as shown in Fig. 11. The prediction span is 365 days with a ten-year basic sequence, and the statistical period is from 2011 to 2020. The models used in this experiment are based on the IERS EOP 14C04 for PM prediction. In Fig. 11 the orange, brown, green, and purple lines represent the LS + AR, LSTM, LS + LSTM, and Bulletin A predictions of PM, respectively. We also compare the PM prediction results of the LS + LSTM model to the IERS EOP 14C04. In the mid- and long-term prediction of PM, the results based on the LS + LSTM model are closer to the observed IERS EOP 14C04 time series than those based on Bulletin A.

Fig. 11 Predicted values of PMX and PMY from the different models, including LS + AR, LSTM, and LS + LSTM. All results in units of [as]. IERS EOP 14C04 from 2011 to 2020 is shown as a reference for comparison

AE of PM prediction with different models

Experiments have demonstrated that the LS + LSTM model is superior for predicting PM, especially in the mid- and long term. To further explore the advantages of the LS + LSTM model in the accuracy of PM prediction, four different cases were designed to predict PM for 11 years (from 2011 to 2021). In this experiment, the prediction span was 365 days with a weekly sliding window. The experiment is divided into four parts, considering the following methods:

Case 1: PMX and PMY prediction based on the LS + AR model;

Case 2: PMX and PMY prediction based on the LSTM model;

Case 3: PMX and PMY prediction based on the LS + LSTM model;

Case 4: PMX and PMY prediction from Bulletin A obtained from the IERS RS/PC.

The four cases listed above correspond to the LS + AR model, the LSTM model, the LS + LSTM model, and the Bulletin A product provided by the IERS, respectively. Previous studies generally rely on a 10-year IERS EOP 14C04 time series as the basic series for PM prediction (Xu et al. 2012; Xu and Zhou 2015; Kenyon et al. 2012). In the following experiments, we likewise choose a ten-year base sequence with the various methods to predict the PM for the next 365 days.
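The evaluation in the four cases can be summarized as the MAE per prediction day, averaged over all weekly-start 365-day forecasts. A generic sketch of this bookkeeping is given below; `forecasts` is assumed to be a list of (start index, 365-day prediction array) pairs and `truth` the reference EOP 14C04 series, both hypothetical names for illustration.

```python
import numpy as np

def mae_per_lead_day(forecasts, truth, horizon=365):
    """MAE for each prediction day 1..horizon over all sliding-window runs.

    forecasts : list of (start_index, np.ndarray of length `horizon`) pairs,
                where start_index is the position in `truth` of the first predicted day
    truth     : reference series (e.g., PMX from IERS EOP 14C04)
    """
    errors = np.full((len(forecasts), horizon), np.nan)
    for k, (start, pred) in enumerate(forecasts):
        obs = truth[start:start + horizon]          # may be shorter near the series end
        errors[k, :obs.size] = np.abs(pred[:obs.size] - obs)
    return np.nanmean(errors, axis=0)               # one MAE per lead day (cf. Fig. 13, Table 2)
```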

Figure 12 shows the absolute errors (AE) of the PM predictions for the four cases. All experimental results take the IERS EOP 14C04 time series as a reference. It can be seen that the accuracy of the four cases (LS + AR, LSTM, LS + LSTM, and Bulletin A) in 2011–2015 is inferior to that in 2016–2021. One potential reason could be the 2011 earthquake off the Pacific coast of Tōhoku (the 3.11 Japan earthquake). According to data from the Jet Propulsion Laboratory (JPL) of the National Aeronautics and Space Administration (NASA), the 3.11 Japan earthquake shifted the Earth's rotation axis by 25 cm and accelerated the Earth's rotation, shortening the day by 1.8 microseconds (Gross 2007). Earthquakes not only cause significant changes in the Earth's rotation on the day they occur, but also affect the positions of surface stations over the following 3–5 years, thus affecting ERP monitoring (Souriau 1986; Bizouard 2005; Bogusz et al. 2015). The IERS introduced post-seismic deformation (PSD) modeling in 2017 when establishing the most recent International Terrestrial Reference Frame (ITRF2014) to reduce the influence of earthquakes on ground stations and obtain more accurate ERP data; ITRF2014 is the most recent ITRF solution at the time of this study. The IERS EOP 14C04 series likewise applies the PSD correction for the March 2011 Japan earthquake to resolve the PM changes during this period more precisely; hence, the PM results in this phase deviate from the previous overall trend. However, this deviation was not accounted for when the models described in this study were used to predict PM, likely resulting in prediction errors. Thus, our preliminary conclusion is that the larger deviations of the prediction results from the observed values (EOP 14C04) between 2011 and 2015 are attributable to the effects of large earthquakes. To improve the accuracy of PM prediction following a major earthquake, further PSD processing in the prediction algorithm is required. In mid- and long-term prediction, the results of the LS + LSTM model are closer to the IERS EOP 14C04 series than those of the LS + AR, LSTM, and Bulletin A.

Fig. 12 Absolute errors (AE) of the PMX and PMY predictions from 2011 to 2021; all results in units of [as]. All results are based on 10-year IERS EOP 14C04 time series, using the LS + AR, LSTM, and LS + LSTM models, respectively. The AE of Bulletin A is based on the PM predictions from the IERS RS/PC

MAE of PM prediction with different models

Figure 13 shows the mean absolute errors (MAE) of the PMX and PMY predictions in the four cases. Compared to the other models, the proposed LS + LSTM model yields smaller MAE in mid- and long-term prediction. Since the LS + LSTM model better captures the overall characteristics of the base series, it obtains a more accurate long-term trend and long-period terms than the LS model alone during extrapolation, thus improving the mid- and long-term PM prediction accuracy.

Fig. 13 Mean absolute errors (MAE) of the PMX and PMY predictions for the four cases, namely, the LS + AR model (blue line), the LSTM model (brown line), the LS + LSTM model (red line), and Bulletin A (green line). All results in units of [mas]

Table 2 presents the prediction accuracy of PM determined with the different models, i.e., the LS + AR, LSTM, and LS + LSTM models, together with Bulletin A. The MAE of the predicted PM at various prediction spans (1, 5, 10, 15, 20, 30, 45, 90, 120, 180, 270, 320, and 365 days) is listed in Table 2. Combined with the PM prediction accuracy statistics, the improvement of LS + LSTM over Bulletin A becomes clear after 120 days. The improvement gradually increases with the lengthening of the prediction span, reaching maxima of 33.7% and 31.9% for PMX and PMY, respectively. Generally, the LS + LSTM model has more advantages than the LSTM model, the traditional linear prediction model (LS + AR), and Bulletin A in mid- and long-term PM prediction.

Table 2 Mean absolute errors (MAE) of the polar motion (PMX, PMY) prediction [mas] in 2021, using the LS + AR, LSTM, LS + LSTM, and Bulletin A time series from IERS, respectively

However, for short-term prediction Bulletin A exhibits the smallest MAE among the four cases. Table 2 demonstrates that Bulletin A outperforms the other models in short-term prediction, especially in the ultra-short term (the first ten days into the future). This advantage is primarily due to Bulletin A accounting for the effects of atmospheric angular momentum (AAM) and oceanic angular momentum (OAM). In addition, the statistical results demonstrate that the LSTM model is superior to the traditional model (LS + AR) in the long-term (270–365 days) prediction of PM.

Conclusions

Polar motion is a crucial parameter describing the instantaneous movement of the Earth's rotation axis relative to the body-fixed reference frame. Among existing prediction approaches, linear models such as the LS + AR model are often used to predict PM. Here, we have analyzed the PM series from January 8, 2011 to September 11, 2021 with different models, including LS + AR, LSTM, LS + LSTM, and Bulletin A. The residual series used in this research is obtained by removing the long-term trend and the fitted AW, SAW, and CW terms. In this paper, based on the characteristics of PM and its inherent periodic and trend terms, the LS + LSTM prediction model is proposed. To verify the advantages of LSTM and its combination with LS in PM prediction, a basic sequence length of 10 years is selected, which is optimal for the LS + AR model. The experimental findings demonstrate that the LS + LSTM model is superior for mid- and long-term forecasting of PM. Compared to Bulletin A, published by the IERS, the LS + LSTM model improves the PMX and PMY prediction accuracy by up to 33.7% and 31.9%, respectively, and the LSTM model outperforms the LS + AR model in the mid and long term.

The study's findings rely heavily on the 10-year snippet of the PM time series between 2011 and 2021. Future research will investigate the relationship between the length of the basic time series, seismic factors, and the accuracy of the LSTM and LS + LSTM models in predicting PM. A prediction model such as LS + LSTM, based on a suitable base-sequence length and corrections for seismic effects, will be established to improve short-term PM prediction. In addition, the benefits of combining the LSTM with LS and other traditional methods for short-term PM prediction need to be further explored.