Assessment of length-of-day and universal time predictions based on the results of the Second Earth Orientation Parameters Prediction Comparison Campaign

Predicting Earth Orientation Parameters (EOP) is crucial for precise positioning and navigation both on the Earth’s surface and in space. In recent years, many approaches have been developed to forecast EOP, incorporating observed EOP as well as information on the effective angular momentum (EAM) derived from numerical models of the atmosphere, oceans


Introduction
Time-variable Earth Orientation Parameters (EOP) serve as an essential link between the celestial and terrestrial reference frames, allowing for the transformation of coordinates between them (Petit and Luzum 2010). This connection has a wide range of applications in modern geodesy (precise positioning and navigation on the Earth's surface and in space, and determining orbits of satellites), astronomy (orientation of astronomical instruments), and the operation of space missions. EOP comprise polar motion (PM), the difference between universal time and coordinated universal time (UT1-UTC) or its time derivative, length-of-day (LOD) variations, and corrections to the conventional precession-nutation model, i.e., celestial pole offsets (CPO). EOP are routinely determined with high accuracy by means of Global Navigation Satellite Systems (GNSS, Ferland et al. 2009; Steigenberger et al. 2006; Zajdel et al. 2020), Satellite Laser Ranging (SLR, Bloßfeld et al. 2018; Glaser et al. 2015; Sośnica et al. 2019), Very Long Baseline Interferometry (VLBI, Karbon et al. 2017; Robertson et al. 1985; Schuh and Böhm 2013; Sovers et al. 1998), and Doppler Orbitography and Radiopositioning Integrated by Satellite (DORIS, Moreaux et al. 2023).
The International Earth Rotation and Reference Systems Service (IERS) is a widely recognized institution that is responsible for the regular delivery of the EOP series (Bizouard and Gambis 2009). The monitoring and sharing of daily, monthly, and long-term EOP data, as well as leap second announcements, are the responsibilities of the Earth Orientation Center hosted by the Paris Observatory, which acts under the auspices of the IERS (Bizouard et al. 2019; Gambis 2004; Gambis and Luzum 2011). The former version of the final EOP series provided by the IERS, i.e., IERS EOP 14 C04, was consistent with the conventional International Terrestrial Reference Frame 2014 (ITRF 2014) and the Second International Celestial Reference Frame (ICRF2) (Bizouard et al. 2019). The latest implementation of the C04 series, namely IERS EOP 20 C04, aligned with the most recent international terrestrial reference frame ITRF 2020 (Altamimi et al. 2023) and celestial reference frame ICRF3 (Charlot et al. 2020), was made available in February 2023 (IERS Message No. 471 distributed by the IERS Central Bureau: https://datacenter.iers.org/data/2/message_471.txt). The formal errors associated with the final IERS EOP 20 C04 data currently indicate an uncertainty level of approximately 10 microseconds (μs) for UT1-UTC, while rapidly processed observations may have errors several times higher.
Many operational applications require the knowledge of EOP in real time. However, the complexity of computations and the processing of data from various observational techniques, each characterized by different levels of accuracy, stability, availability, and temporal resolution, increase the processing time required for the EOP determination. As a result, rapidly processed but less precise EOP datasets are provided once per day, while the most accurate solutions are delivered with delays of 30 days and longer. Consequently, most real-time applications exploit short-term EOP predictions, which are currently processed by institutions around the world. For example, the U.S. Naval Observatory (USNO) routinely issues forecasts that are subsequently officially disseminated by the IERS (Luzum et al. 2001). In addition, Deutsches GeoForschungsZentrum (GFZ; Dill et al. 2019), Eidgenössische Technische Hochschule (ETH; Kiani Shahvandi et al. 2022a; Soja et al. 2022), Jet Propulsion Laboratory (JPL; Gross et al. 1998), and the European Space Agency (ESA; Bruni et al. 2021) deliver EOP forecasts on a regular basis. The latter two institutes provide truly independently processed series of EOP based on data from various observational techniques (Bruni et al. 2021; Ratcliff and Gross 2019), whereas other groups such as ETH and GFZ partly rely on IERS input data (Kiani Shahvandi et al. 2022a; Dill et al. 2019).
For most real-time applications, such as precise positioning and navigation, it is sufficient to have an EOP forecast for the next few days. EOP predictions for a duration of at least 1 year into the future might be utilized for climate forecasting and long-term satellite orbit prediction (Lei et al. 2023). In terms of required EOP prediction accuracy, the current accuracy of predictions (below 1 ms for UT1-UTC ultra-short-term predictions disseminated by the IERS) meets the needs for ephemerides or pointing astronomical instruments (Luzum 2010). For the purpose of tracking and navigating interplanetary spacecraft, forecast accuracy requirements of approximately 10 mas for PM and 0.65 ms for UT1-UTC have been formulated (Oliveau and Freedman 1997). Even higher accuracy of EOP predictions is essential for real-time satellite orbit determination and VLBI analysis. Given the broad range of possible future satellite applications related to, e.g., the monitoring of precipitable water vapor, tsunamis, and earthquakes, the demand for accurate estimates and predictions of EOP may increase further.
In addition to the observed EOP values, contemporary Earth orientation predictions commonly incorporate analyses and forecasts of effective angular momentum (EAM). In theory, changes in the rotation of the solid Earth can be examined by employing the principle of conservation of angular momentum within the Earth system, including the surrounding fluid layers (atmosphere, oceans, continental hydrosphere) (Gross 2007). According to this principle, the rotation of the solid Earth is altered by external torques, internal redistribution of mass, and the exchange of angular momentum between the solid Earth and the adjacent fluid layers. EAM functions describe the excitation of Earth orientation changes caused by the sum of atmospheric, oceanic, and hydrospheric mass redistributions (Barnes 1983; Brzeziński 1992). In particular, the axial component (χ3) of EAM excites UT1-UTC. Previous works (e.g., Gross et al. 2004) have shown that atmospheric and oceanic effects can explain up to 90% of the observed UT1-UTC variation. Such relationships between modelled EAM and the EOP have prompted the adoption of EAM forecasts for predicting Earth rotation, especially for shorter time horizons (Dobslaw and Dill 2018; Freedman et al. 1994).
Between 2006 and 2008, a comprehensive comparison and evaluation of different EOP forecasts was carried out as part of the EOP Prediction Comparison Campaign (EOP PCC) organized by Vienna University of Technology and Centrum Badań Kosmicznych Polskiej Akademii Nauk (CBK PAN) with the support of the IERS (Kalarus et al. 2010). The primary objectives of this initiative were to identify the optimal methods for EOP forecasting and to eventually develop a combined series of EOP predictions. The EOP PCC yielded valuable insights into various prediction techniques under consistent rules and conditions. The benefits of combining submitted solutions were demonstrated, and it was shown that incorporating atmospheric angular momentum (AAM) forecast data as an input improves EOP prediction accuracy. However, the choice of the best prediction technique was shown to depend on the selected EOP and the targeted prediction horizon, indicating that no single technique outperformed all the others (Kalarus et al. 2010).
Since the completion of the EOP PCC in 2008, considerable progress has been made in the development of observational data, the advancement of new forecasting methods, and the understanding of the role of EAM in EOP variations. There has also been a substantial increase in the number of teams involved in EOP prediction, with different teams applying various inputs, forecasting algorithms, and prediction horizons. Consequently, there are clear differences in the accuracy of the resultant individual predictions. As a result of the progress in the field of EOP forecasting, the IERS has established a dedicated working group (Working Group on the 2nd EOP PCC, for details see https://www.iers.org/IERS/EN/Organization/WorkingGroups/PredictionComparison/predictionComparison.html) with the primary objective of conducting a new comprehensive reassessment of the current capabilities of EOP forecasting. The 2nd EOP PCC was handled in collaboration between CBK PAN and GFZ. The EOP PCC Office, established at CBK PAN, was responsible for routinely collecting and evaluating forecasts submitted by registered campaign participants (Śliwińska et al. 2022). The operational phase of the 2nd EOP PCC lasted 70 weeks from September 1, 2021, until December 31, 2022, involving 18 active teams from 23 institutes around the world, who routinely provided predictions of all EOP based on 50 different methods distinguished with individual IDs. During the campaign, 7327 individual predictions of any EOP were collected. A summary of the most relevant events related to the 2nd EOP PCC and some technical aspects, such as file format requirements and data submission rules, as well as more detailed statistics, can be found in Śliwińska et al. (2022).
Although the values of PM, UT1-UTC, and precession-nutation alone are sufficient to perform coordinate transformations between the celestial and terrestrial reference frames, the 2nd EOP PCC Office also collected and analyzed LOD predictions, even though LOD is not directly included in the transformation matrix. LOD can be defined as the negative first derivative of UT1-UTC with respect to time, i.e., LOD is equal to the negative rate of change of UT1-UTC over time:
$$\mathrm{LOD} = -\frac{d(\mathrm{UT1} - \mathrm{UTC})}{dt}$$
In turn, UT1-UTC can be estimated from LOD by integrating LOD values and adding the value of UT1-UTC at zero epoch (Mikschi et al. 2019). Estimating LOD is important because UT1-UTC cannot be reliably determined from GNSS data: it is challenging to distinguish between linear drifts in the satellite constellation and changes in the Earth's rotational angle. LOD remains unaffected by this issue, making it feasible to estimate accurately using GNSS alone, without the requirement of a consistent combination with VLBI. LOD can therefore be used to densify and predict UT1-UTC (Senior et al. 2010).
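The relation between LOD and UT1-UTC described above can be illustrated with a minimal sketch for daily sampled series. It assumes a 1-day sampling step, values in milliseconds, a simple forward difference, and no leap second inside the span (a leap second introduces a 1 s jump in UT1-UTC that would have to be removed first). The function names and the sample values are hypothetical and are not taken from the campaign software or data.

```python
from itertools import accumulate


def ut1utc_to_lod(ut1utc_ms):
    """Approximate daily LOD (ms) from daily UT1-UTC samples (ms).

    Uses a forward difference with a 1-day step, so the result has one
    element fewer than the input.
    """
    return [-(later - earlier) for earlier, later in zip(ut1utc_ms, ut1utc_ms[1:])]


def lod_to_ut1utc(lod_ms, ut1utc_at_epoch_ms):
    """Rebuild daily UT1-UTC (ms) by accumulating -LOD from the zero-epoch value."""
    return list(
        accumulate(lod_ms, lambda ut, lod: ut - lod, initial=ut1utc_at_epoch_ms)
    )


# Illustrative values only, not campaign data.
ut1utc = [-100.0, -101.2, -102.1, -103.5]
lod = ut1utc_to_lod(ut1utc)              # one LOD value per day-to-day step
rebuilt = lod_to_ut1utc(lod, ut1utc[0])  # round trip back to UT1-UTC
```

The round trip reproduces the input series up to floating-point rounding, which mirrors the statement that UT1-UTC can be recovered from LOD once the value at the zero epoch is known.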
In the current study, we present a scientific summary of the 2nd EOP PCC results focusing on a thorough assessment of predictions of UT1-UTC and LOD submitted by the campaign participants. Our goal is to provide an objective analysis of a wide variety of UT1-UTC and LOD forecasts developed using diverse approaches, prediction methods, and input data. We intend to determine the current forecasting capabilities for these parameters and offer recommendations on the most effective methodologies for this purpose.
No such comprehensive analysis of multiple UT1-UTC and LOD forecasts has been conducted since the conclusion of the 1st EOP PCC. Section 2 of this paper provides a brief overview of prediction methodologies and input data used by contributors, along with statistics regarding the number of valid files received and the most popular prediction horizons. Section 3 describes the methodology for prediction evaluation. Section 4 presents a thorough evaluation of the predictions, divided into a general overview of predictions without distinguishing specific IDs (Sect. 4.1) and a detailed assessment of all predictions (Sect. 4.2). Section 5.1 presents an analysis of the dependence of prediction accuracy on the considered time period, whereas Sect. 5.2 deals with the transformation between LOD and UT1-UTC. Section 6 includes a ranking of all methods based on previously applied criteria, and finally, Sect. 7 summarizes all the results and provides concluding remarks.

Prediction methods and input data exploited by participants
During the 2nd EOP PCC, UT1-UTC was predicted by 15 participants using 25 different combinations of methods and inputs (indicated with individual IDs), while LOD predictions were performed by 12 teams using 25 approaches (Table 1). Fifteen of the IDs provided forecasts for both parameters, which makes a total of 35 IDs providing UT1-UTC and/or LOD predictions. In total, the EOP PCC Office received 1399 files for UT1-UTC and 1226 files for LOD predictions. A summary of the prediction methods and input data used by the participants to predict UT1-UTC and LOD is presented in Table 2 and a full description of the groups is provided in "Appendix 1". A wide variety of methods are exploited, but algorithms based on least squares (LS) and their modifications, and approaches based on machine learning (ML), were the most popular among all IDs. However, only two institutes used ML-based methods for UT1-UTC and LOD forecasting. ML has been declared in 14 different IDs, 13 of which were developed by one group (ETH) (see Table 2 and "Appendix 1" for more details). The LS methods are usually combined with autoregression (AR), autoregressive integrated moving average (ARIMA), convolution, local approximation (LA), or kriging methods. Of the methods that do not belong to the two most popular groups (ML and LS), the most noteworthy are Kalman filtering, adaptive polyharmonic models, normal time-frequency transform (NTFT), singular spectrum analysis (SSA), and Copula approaches (Table 2).
Notes to Table 1: The table only contains information relating to active participants of the campaign (i.e., those who have submitted their forecasts at least once). A participant is an institute or a consortium of several institutes that work together on a given forecast and have a common ID or several common IDs. An institute is a single research center that either acts as a participant in developing its own forecasts or cooperates with other institutes. (a) Based on Kalarus et al. (2010). (b) Ultra-short-term: < 10 days; short-term: < 30 days; medium-term: < 500 days.
The input data exploited are more homogeneous than the methods of prediction, as almost all participants use the EOP 14 C04 series provided by the IERS, usually supplemented with daily datasets from the IERS Rapid Service/Prediction Center at USNO. Only two IDs use the EOP final series from other data sources. A total of 23 IDs declared the use of EAM data as an additional input and most of these exploit EAM series provided by GFZ (Dill et al. 2013, 2019, 2022). Although GFZ routinely delivers data and 6-day forecasts for atmospheric, oceanic, hydrological, and sea-level angular momentum (AAM, OAM, HAM, and SLAM, respectively), not all participants used each of the four components.

Submissions of UT1-UTC and LOD predictions
Figure 1 presents statistics of submitted files for UT1-UTC and LOD predictions, specifically the number of files submitted by each ID throughout the entire campaign period (Fig. 1a), the total number of all prediction files received on individual submission days (Fig. 1b), and the most common prediction lengths (Fig. 1c). Figure 1a shows that only two IDs (IDs 100 and 126) provided valid predictions of UT1-UTC in all 70 weeks of the campaign, but another 10 out of the 25 total IDs involved in UT1-UTC prediction provided more than 60 UT1-UTC forecasts. The IDs for which we received the lowest number of predictions (IDs 146-149) were registered around halfway through the 2nd EOP PCC operation. Even though as many methods were used for LOD forecasting as for UT1-UTC prediction, over 170 fewer files were submitted for LOD (Table 1, Fig. 1). This is partly because in the case of LOD more methods were registered later in the campaign (including IDs 156 and 157), and some participants stopped (e.g., ID 108) or started (ID 117) forecasting LOD a few weeks after initiating active participation. Over 60 LOD predictions were sent by 10 out of the 25 IDs, of which only 2 provided files in all weeks of the campaign duration. The least active participant was ID 157.
For about the first 7 months of the campaign, apart from a few occasional drops, the number of valid UT1-UTC and LOD forecasts received each Wednesday by the EOP PCC Office was relatively stable (around 20 predictions of UT1-UTC and 16 predictions of LOD) (Fig. 1b). After a noticeable decrease at the end of March 2022, the quantity of submitted files increased (to around 25 for UT1-UTC and 22 for LOD) because of the registration of several new methods by one of the participants. These numbers eventually diminished in September 2022 to reach 18 or 19 for both parameters.
The 2nd EOP PCC participants had full freedom in terms of the forecast horizon, with the only condition that the predictions could not be longer than 365 days into the future. A histogram of the prediction horizons used in the 2nd EOP PCC is shown in Fig. 1c. The most popular forecast length for both UT1-UTC and LOD was 90 days into the future and the second most popular was the prediction for 365 days ahead. In contrast, in the 1st EOP PCC, the most frequently submitted prediction lengths were ultra-short-term forecasts (predictions for up to 10 days into the future) (Kalarus et al. 2010).

Prediction evaluation methodology
Because of the large number of IDs predicting both parameters, we decided to present detailed results in groups of contributions. As stated in Sect. 2.1, the exploited input data were rather homogeneous, so we formed the groups manually based on the prediction method used. The most popular approach among participants was exploiting LS in combination with AR and with possible modification of the method, so we decided to distinguish two groups based on this family of methods. The first group, "LS + AR", includes IDs that do not use EAM data, while the second, "LS + AR + EAM", includes IDs that additionally use EAM data. The third group ("ML") includes ML approaches, most of which use EAM data as well (only for one ID, i.e., ID 115, do we lack clear information on whether this type of data was used). The last group, "Other", contains more unique methods that cannot be included in the previous three groups. In this group, 4 out of 8 IDs declared the use of EAM data (IDs 102, 104, 117, and 123). Every group contains between 5 and 8 IDs. A summary of the method clustering with information on the assignment of IDs to the groups is shown in Table 3.
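Table 3 Assignment of IDs to the method groups
UT1-UTC, LS + AR: 100, 107, 113, 114, 118, 122
UT1-UTC, LS + AR + EAM: 101, 105, 108, 112, 116, 136
UT1-UTC, ML: 115, 126, 130, 146, 147, 148, 149
UT1-UTC, Other: 102, 103, 104, 117, 121, 123
LOD, LS + AR: 100, 113, 114, 122, 156
LOD, LS + AR + EAM: 101, 105, 108, 112, 136
LOD, ML: 115, 124, 125, 129, 142, 143, 144, 145
LOD, Other: 102, 104, 117, 121, 123, 141, 157
Following the previous campaign, we use the mean absolute error (MAE) as a basic parameter for EOP prediction evaluation (Kalarus et al. 2010; Kur et al. 2022; Śliwińska et al. 2022):
$$\mathrm{MAE}_i = \frac{1}{n_p} \sum_{j=1}^{n_p} \left| x_i^{\mathrm{obs}} - x_{i,j}^{\mathrm{pred}} \right|$$
where n_p is the number of valid prediction files submitted by a campaign participant under a single ID, x_i^obs is the EOP reference data for the ith day of the reference series, x_{i,j}^pred is the predicted value for the ith day of the jth prediction, and I is the forecast horizon, so that MAE_i is evaluated for each day i up to I. In this study, we use I = 10 days and I = 30 days.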
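The MAE defined above can be sketched in a few lines. This is a minimal illustration and not the EOP PCC Office software: it assumes that each valid prediction file has already been paired with the reference values for the same days, that index 0 corresponds to the submission day, and that all series cover the full horizon. The function name and the sample values are hypothetical.

```python
def mae_per_day(pairs, horizon):
    """Mean absolute error for each prediction day 0..horizon.

    pairs: list of (reference, predicted) sequences, one pair per valid
           prediction file of a single ID; index i is the i-th prediction day.
    """
    n_p = len(pairs)
    if n_p == 0:
        raise ValueError("at least one prediction is required")
    return [
        sum(abs(ref[i] - pred[i]) for ref, pred in pairs) / n_p
        for i in range(horizon + 1)
    ]


# Illustrative values only, not campaign data.
pairs = [
    ([0.0, 1.0, 2.0], [0.0, 1.5, 2.5]),  # first prediction file
    ([1.0, 2.0, 3.0], [1.0, 1.5, 4.0]),  # second prediction file
]
mae = mae_per_day(pairs, horizon=2)  # [0.0, 0.5, 0.75]
```

Averaging is done across prediction files for a fixed prediction day, so the result shows how the error of one ID grows with the forecast length.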
As reference data, we use the IERS 14 C04 series. However, some forecasts evaluated in this study are predicted from other input data (e.g., IDs 104, 116, see Table 2). Since prediction algorithms are typically optimized with respect to the underlying input data, this could potentially result in increased MAE for forecasts based on data other than the official IERS solution. Some insights regarding the differences between individual reference EOP series are included in Śliwińska et al. (2022), where the IERS 14 C04 solution was compared with other combined EOP data (e.g., SPACE, Bulletin A), as well as with the single-technique series based on GNSS, SLR, and VLBI measurements. It was found that for UT1-UTC, the RMS of differences between IERS 14 C04 and other solutions ranged from 0.019 to 0.212 ms, while for LOD, these values ranged from 0.010 to 0.098 ms. Hence, the choice of reference data for assessing EOP prediction may be important in individual cases.
The results for UT1-UTC from each group are also compared with predictions disseminated by the IERS, which are produced by USNO for 90 days into the future. The IERS forecasts for UT1-UTC rely on observations from VLBI, GNSS, and SLR, alongside AAM analysis and prediction data. The AAM data utilized for UT1-UTC predictions are sourced from a combination of the operational National Centers for Environmental Prediction (NCEP) and Navy Global Environmental Model (NAVGEM) data (IERS 2023). USNO employs AAM-based projections to determine UT1-UTC forecasts for a period extending up to 7.5 days ahead. For longer-term predictions, LOD excitations are combined smoothly with the longer-term UT1-UTC predictions. The method for predicting UT1-UTC beyond 7.5 days utilizes a straightforward approach known as differencing (McCarthy and Luzum 1991). Details on the computation are provided in the IERS Annual Report (IERS 2023).
The IERS forecasts were taken from daily updated files finals.2000A.daily (https://www.iers.org/IERS/EN/DataProducts/EarthOrientationData/eop.html, accessed 2023-05-01). The ID 200 was assigned for these predictions. Note that LOD forecasts are not provided by the IERS. However, to relate the LOD predictions provided by campaign participants to the forecasts delivered by the IERS, we converted the UT1-UTC predictions from ID 200 into LOD values.
During routine checking of prediction files, the EOP PCC Office detected some erroneous predictions. Sometimes outliers were caused by problems in input preparation resulting from availability issues with the most recent EOP or EAM data, defects in the software, or data errors in the files (e.g., predictions provided for the wrong days) (personal communication with participants). Incorrect predictions may affect the objective assessment of the accuracy of a given forecasting method, which is undesirable; however, the EOP PCC Office had to avoid interfering with the supplied files or any manual modification of the submissions. To solve the problem of erroneous predictions, we decided to incorporate a two-step screening process.
In the first step of data selection, called "σ criterion", we computed the standard deviation S_j of the differences between reference and prediction (x_i^obs − x_{i,j}^pred) for each submitted prediction separately, and then the standard deviation of the differences for all submissions together (S_total). Individual predictions with S_j > S_total were eliminated from further processing. This step removes highly inaccurate predictions whose values significantly differ from observational data and other predictions, which could have been caused by factors such as incorrect units.
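The σ criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions: the differences between reference and prediction are already available per prediction file, the population standard deviation is used (the text does not say whether the population or sample form was applied), and the function name, file labels, and numbers are hypothetical.

```python
from statistics import pstdev


def sigma_screen(differences):
    """Split predictions into kept and rejected according to S_j > S_total.

    differences: dict mapping a prediction label to the list of
                 (reference - prediction) values of that prediction.
    """
    # S_total: standard deviation of the differences pooled over all submissions.
    s_total = pstdev([d for diffs in differences.values() for d in diffs])
    kept, rejected = [], []
    for label, diffs in differences.items():
        s_j = pstdev(diffs)  # standard deviation for one prediction
        (rejected if s_j > s_total else kept).append(label)
    return kept, rejected


# Illustrative values only: three ordinary predictions and one gross error.
differences = {
    "file_1": [0.1, -0.1, 0.2],
    "file_2": [0.0, 0.1, -0.2],
    "file_3": [-0.1, 0.1, 0.0],
    "file_4": [37.0, -37.0, 36.0],
}
kept, rejected = sigma_screen(differences)
```

Because S_total is dominated by the gross errors in the pooled set, only predictions that are far less consistent than the bulk of submissions exceed it, which matches the intent of removing cases such as incorrect units.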
In the second step, called "β criterion", we considered each ID individually to eliminate those single predictions of a given participant that noticeably differ from the other predictions by the same provider. To achieve this, following Kalarus et al. (2010), we determined a threshold independently for each ID using the β parameter, the value of which depends on the median of differences between an observation and a prediction (median absolute error, MDAE) and on a parameter α, whose value was chosen subjectively to preserve a representative number of predictions. In this study, we used α = 3. All predictions with β_j < 0 were eliminated from further processing.
Table 4 provides the number of UT1-UTC predictions eliminated by the σ and β criteria for each ID for the 10- and 30-day prediction horizons as well as the ratio of rejected files to all submitted files. The use of the σ condition resulted in the elimination of nine files for ID 102 (for both prediction horizons), whereas no gross errors were detected for the other IDs. Indeed, for ID 102, we noticed some problems mainly in October 2021 and again between September and November 2022. The participant reported that gross errors in the first period might have resulted from the use of the UT1-TAI difference instead of UT1-UTC, while the GFZ AAM forecasts (upon which those predictions are based for the first 7 days) were not downloaded correctly in the second period (personal correspondence with Christian Bizouard). Some of the participants with the largest number of rejections reported problems with the availability of the most recent EOP values from the IERS EOP Rapid solution, using observational data from wrong days, or unintentional lack of use of daily IERS EOP Rapid data. In general, after applying the σ and β criteria, 6.0% of UT1-UTC predictions for the 10-day horizon and 4.8% of UT1-UTC predictions for the 30-day prediction horizon were eliminated.
The numbers of LOD predictions eliminated by the σ and β criteria are presented in Table 5. Again, the use of the σ criterion eliminated several files for ID 102, which was probably related to the problem with the download of the AAM predictions. In addition, two files for ID 129 were eliminated and this participant reported a few problems with the availability of EAM forecasts delivered by GFZ. The largest number of rejections for β was detected for ID 105. This was caused mainly by improper accounting for long-periodic ocean tides at the beginning of the campaign. This problem was then traced back and solved by the participant (the details of this issue are described in Kur et al. 2022). In general, after applying the σ and β criteria, 4.7% of LOD predictions for the 10-day horizon and 2.2% of LOD predictions for the 30-day prediction horizon were eliminated.
As mentioned above, one of the reasons for the reduced accuracy of the 10-day forecasts that were eliminated from further processing may be problems with the availability of EAM data and their 6-day forecasts provided by GFZ, which are routinely used by participants, especially for LOD forecasting. Therefore, in Tables 4 and 5, we also included the number of rejections for days on which the GFZ EAM predictions were unavailable. According to GFZ (personal correspondence with Robert Dill), in the operational phase of the campaign, in 11 out of 70 weeks there were problems with the availability of EAM forecasts on the day of submitting the EOP forecasts to the EOP PCC Office (Wednesday) or on the day before. However, the elimination of outlier predictions coincides with the unavailability of EAM predictions only in some cases (for LOD, 13 deletions out of a total of 52 occurred on days when GFZ EAM predictions were inaccessible) (Table 5), indicating that there is no single reason for all the identified problems. It is also worth mentioning that a lack of EAM forecasts on days other than the day of submission or the day before may also reduce the accuracy of the predictions. When participants use the full set of daily EAM forecasts in their algorithm, missing values on any other day also matter. The most critical effect of EAM data unavailability concerns those predictions where the lack of EAM caused an error that was not large enough for the prediction to be eliminated. Such submissions, affected by a lack of EAM but not eliminated, can significantly degrade the average performance.

General overview of predictions
We begin the assessment of UT1-UTC and LOD predictions with a general overview of forecasts (without distinguishing individual IDs) to study the overall accuracy achieved during the campaign. Boxplots of differences between reference and predicted values for the 1st, 5th, 10th, 15th, 20th, 25th, and 30th day into the future are shown in Fig. 2 (for UT1-UTC) and Fig. 3 (for LOD). Statistics for the differences between reference and predicted values for the 10- and 30-day horizons are given in Table 6.
In the case of UT1-UTC, the lowest inter-quartile range (IQR, the range between the first quartile (Q1) and the third quartile (Q3) of the data) of differences and the fewest outlier values were detected for the LS + AR + EAM and ML groups (Fig. 2). However, for the LS + AR + EAM group only, the most extreme data points that are not indicated as outliers (outliers are defined as data points that fall below Q1 − 1.5 · IQR or above Q3 + 1.5 · IQR) are within the range of ±10 ms for the 30-day prediction horizon. A gradual increase in the range of differences can be observed as the prediction length increases. For all considered prediction days in the "Other" group and starting from the 20th prediction day for the LS + AR group, there is a clear asymmetry in the distribution of differences (there are more positive differences), indicating that the predictions underestimate the observed values in these cases. Figure 3 shows that in the case of LOD predictions, except for the ML group, the highest number of outlier values is received for the initial prediction days (1st and 5th). Unlike UT1-UTC forecasts, there is no clear increase in the range of differences with increasing day of forecast.
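The outlier rule quoted above (points below Q1 − 1.5 · IQR or above Q3 + 1.5 · IQR) can be sketched as follows. Quartile conventions differ between software packages; this sketch assumes linear interpolation between data points (the "inclusive" method of Python's statistics module), which is not necessarily the convention used for Figs. 2 and 3. The function name and the sample values are hypothetical.

```python
from statistics import quantiles


def boxplot_outliers(values):
    """Return Q1, Q3, IQR, and the values flagged by the 1.5 * IQR rule."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return q1, q3, iqr, outliers


# Illustrative values only, not campaign data.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
q1, q3, iqr, outliers = boxplot_outliers(sample)
# q1 = 3.25, q3 = 7.75, iqr = 4.5, outliers = [100]
```

The whiskers of a boxplot then extend to the most extreme values that remain inside the two fences.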
The IQRs for all UT1-UTC predictions for the 10- and 30-day prediction horizons are 0.575 and 2.031 ms, respectively, which is around twice those of ID 200 (Table 6). It is worth noting that despite eliminating outlier predictions, the range between the minimum and maximum values is prominent (83.523 and 106.902 ms for the 10- and 30-day prediction horizon, respectively), which was mainly influenced by outlier values of differences for just a few IDs (IDs 102, 122, and 126). Even though the difference values for EOP PCC contributors' predictions were generally considerably greater than those of the forecasts from ID 200, mean and median values for predictions from the campaign participants are lower than the corresponding values for ID 200, especially for the 30-day prediction horizon. For both ID 200 and 2nd EOP PCC predictions, the differences between observed and predicted values as well as their statistics (min, max, median, mean, root-mean-square (RMS), range, IQR) are greater for the 30-day than for the 10-day prediction horizon, which shows that the error of the UT1-UTC forecast grows as the prediction horizon increases. For statistics of differences computed for LOD predictions, there is less difference between the 10-day and 30-day horizons than in the case of UT1-UTC (Table 6). The IQRs for all LOD predictions for the 10- and 30-day prediction horizons are 0.138 and 0.245 ms, respectively, while the ranges between the minimum and maximum values are as high as 9.445 and 9.718 ms for 10- and 30-day predictions, respectively. The comparison of statistics of differences obtained for LOD between ID 200 and 2nd EOP PCC participants shows that in both cases, the IQR is at a similar level for both time horizons. In turn, the differences received in the campaign exhibit higher extreme values than in the case of ID 200, but the mean and median values are lower for the campaign participants than for ID 200.

Detailed assessment of predictions
We further assess the quality of UT1-UTC and LOD predictions by specifically studying MAE for the 10-and 30-day horizons.Analysis is performed for the groups described in Sect.3. Figure 4 presents MAE plots for UT1-UTC for the 10-day prediction horizon for each group compared with the mean MAE for the group and the mean MAE for all IDs.The plots also contain MAE values for Day 0, which is the submission day (i.e., Wednesday, the last day for which observational data are available).By showing this value, we can verify whether a given participant had errors at the stage of preparing observational data, which could later reduce the accuracy of the forecast itself.Except for IDs 112 (Fig. 4b), 148 (Fig. 4c), 117 and 121 (Fig. 4d), there were no major issues at the data preparation stage for most participants, since MAE for Day 0 is usually very low.The providers of   prediction with ID 117 performed a thorough analysis and validation of their methodology to uncover the source of the offset observed on Day 0. They discovered that the issue was caused by an error in input data for several submissions.Typically, ID 117 used a combination of IERS 14 C04 (with a 30-day delay) and Bulletin A (with a 2-day delay) as an input.However, it appeared that for a few submissions, inadvertently, only IERS 14 C04 was used (omitting Bulletin A).
Due to the latency of IERS 14 C04, predictions for UT1-UTC (but also for LOD and PM) pertained to the last month rather than the current day and the subsequent days. For ID 108, the MAE for Day 0 is close to the rest of the participants, but starting from Day 1 of prediction, the MAE increases rapidly, which might suggest some weaknesses of the prediction method. The comparison of MAE plots for individual groups (Fig. 4a-d) shows that the IDs from the LS + AR + EAM group had the lowest forecasting errors (the mean for this group is lower than for the other groups, and as many as 5 out of 6 IDs in this group achieve results better than the mean for all methods). Most IDs from the ML group show a similar MAE change with increasing prediction day, with values congruous with those obtained in the LS + AR + EAM group (Fig. 4c). However, IDs 115 and 148 differ notably from the other participants in the ML group. Results for ID 115 are characterized by a much more rapid increase in the MAE with the length of prediction, and ID 148 suffers from high error at Day 0. Group LS + AR presents consistent results for all IDs (except for the more rapid MAE growth observed in ID 122); however, all submissions are characterized by the largest errors compared with participants from other groups (the mean MAE for this group is higher than for other groups and all IDs from the LS + AR group present results with lower accuracy than the mean for all IDs). This suggests that not including EAM data in the prediction procedure can deteriorate forecast quality. As the Other group contains several unique forecasting methods, results for individual IDs differ from each other both in the MAE values on particular prediction days and in the MAE increase rate.
To quantitatively assess the accuracy of UT1-UTC predictions for the 10-day prediction horizon, we show in Table 7 numerical values of MAE for two cases: MAE for the 10th day of prediction (MAE[10]) and mean MAE for the whole 10-day horizon, i.e., mean of MAE for the days from 0 to 10 (MAE[0-10]). In the first case, we focus only on errors for the last day of the considered prediction horizon, while in the second case, we examine the performance of the method for the entire prediction horizon. The results confirm the findings from Fig. 4 that methods from the LS + AR + EAM group are most reliable for UT1-UTC prediction, as MAE[10] and MAE[0-10] for this group are 0.57 and 0.29 ms, respectively. The lowest MAE[10] and MAE[0-10] values were obtained for ID 136 (0.27 and 0.12 ms, respectively). It should be noted that IDs 136 and 105 are the only methods that perform better than ID 200 for the 10-day prediction horizon and both were submitted by the same team (GFZ). The only difference between the IDs is that in the case of ID 136, the last observed UT1-UTC value is taken from Bulletin A only after its last update around 19:00 UTC, while the prediction from ID 105 is processed earlier in the day, which means that the initial value from Bulletin A of the day before is used (the value for Day 0 provided by ID 105 is actually a prediction and not an observed quantity). The analyses have shown that waiting for the most recent observed value can slightly improve the accuracy of the UT1-UTC prediction at the 10-day prediction horizon, which is plausible in view of the additional geodetic data incorporated. However, relying on additional input that is not available until late afternoon generally increases the risk that the forecast will not be provided in time.
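The two quantities reported in Table 7 can be sketched as follows. This is a minimal illustration, assuming that MAE for a prediction day is the mean absolute difference between prediction and reference over all submissions of one ID; the function names and the sample numbers are hypothetical.

```python
def mae_per_day(predictions, references):
    """MAE for each prediction day, averaged over submissions.

    predictions[s][d] and references[s][d] hold the predicted and reference
    values of submission s on prediction day d (day 0 = submission day).
    """
    n_sub = len(predictions)
    n_days = len(predictions[0])
    return [
        sum(abs(predictions[s][d] - references[s][d]) for s in range(n_sub)) / n_sub
        for d in range(n_days)
    ]


def horizon_summary(mae, horizon):
    """Return (MAE[horizon], MAE[0-horizon]) from a per-day MAE list."""
    return mae[horizon], sum(mae[: horizon + 1]) / (horizon + 1)


# Two hypothetical submissions whose errors grow linearly with the prediction day (ms).
references = [[0.0] * 11, [0.0] * 11]
predictions = [[0.1 * d for d in range(11)], [-0.3 * d for d in range(11)]]
mae = mae_per_day(predictions, references)
mae_10, mae_0_10 = horizon_summary(mae, 10)
```

MAE[10] reflects only the end of the horizon, whereas MAE[0-10] also carries any offset present on Day 0, which is why the two values can rank methods differently.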
MAE plots for UT1-UTC for the 30-day prediction horizon for each group are given in Fig. 5, while statistics of MAE for UT1-UTC for Day 30 of prediction and for the whole 30-day forecast horizon are given in Table 8. Figure 5 shows that IDs 102, 113, 115, 121, and 122 suffer from an exceptionally sharp MAE increase starting from the first days of the prediction, while for many other methods (IDs 105, 116, 136, 146, 147, 149, 200) the MAE value starts to increase after about Day 10 into the future. Again, forecasts from the LS + AR + EAM group are characterized by the smallest prediction error (MAE[30] and MAE[0-30] of 3.21 and 1.29 ms, respectively), while the LS + AR group generally performs worst (MAE[30] and MAE[0-30] of 4.86 and 2.33 ms, respectively). This underlines the importance of using EAM information in improving the accuracy of UT1-UTC forecasting. Indeed, the best IDs in groups ML and Other are also those that exploit EAM data. Nevertheless, although several participants come close to the accuracy obtained for ID 200 (IDs 103, 104, 105, 116), only ID 105 achieves a MAE slightly lower than that obtained for forecasts from ID 200. Notably, starting from around Day 12, MAE for ID 136 becomes slightly higher than for ID 105, even though the former uses the most up-to-date observational data. It can also be observed from Fig. 5 that for the first days of prediction, the MAE values for most ML […]
Table 8 shows that for the group of most promising prediction methods for UT1-UTC (LS + AR + EAM), the mean MAE[30] and MAE[0-30] are 3.21 and 1.29 ms, respectively, while the corresponding values for ID 200 are 2.79 and 1.00 ms, which indicates there is still some room for improvement for groups dealing with UT1-UTC forecasting. For ID 105, which has been identified as the most promising method, these values are only slightly lower (2.77 and 0.97 ms for MAE[30] and MAE[0-30], respectively) than for the forecast disseminated by IERS.
The MAE plots for UT1-UTC predictions reveal a gradual increase in MAE along with the day of prediction, and the growth starts to accelerate within 10 days (Figs. 4, 5). Only ID 122 shows a reverse trend (after an initial rapid surge, the rate of MAE change slows after Day 10). An important indicator of the accuracy of a given forecasting algorithm is not only the error for a given day of prediction but also the rate at which this error increases. Therefore, to study the growth rate of the MAE, we determined the mean MAE increase per day by differencing the MAE values for consecutive prediction days within non-overlapping time intervals (i.e., 0-5 days, 6-10 days, 11-15 days, and 16-30 days) and over the whole 30-day prediction horizon (0-30 days). These values were computed for each group along with ID 200 and all IDs together and are presented in Fig. 6. Methods from the LS + AR + EAM group show the least rapid increase in MAE. Apart from the LS + AR group, the mean MAE increase is lowest for the 0-5-day interval and highest for the 16-30-day interval. Although the MAE increase for all groups is generally greater than that for ID 200, if we take into account individual submissions, some of them (IDs 105 and 136) are characterized by a slightly lower MAE growth rate than ID 200.
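The growth-rate computation described above can be sketched as below. How the differences at interval boundaries are attributed is not specified in the text; this sketch assumes that the step from day d-1 to day d belongs to the interval containing day d. The function name and the quadratic MAE curve are hypothetical.

```python
def mean_daily_increase(mae, start, end):
    """Mean of first differences MAE[d] - MAE[d-1] for days d in start..end."""
    first = max(start, 1)  # day 0 has no preceding day to difference against
    steps = [mae[d] - mae[d - 1] for d in range(first, end + 1)]
    return sum(steps) / len(steps)


# Intervals named in the text, plus the full 30-day horizon.
INTERVALS = [(0, 5), (6, 10), (11, 15), (16, 30), (0, 30)]

# Hypothetical MAE curve in ms that grows quadratically with the prediction day.
mae_curve = [0.01 * d * d for d in range(31)]
rates = {f"{a}-{b}": mean_daily_increase(mae_curve, a, b) for a, b in INTERVALS}
```

For this accelerating toy curve the rate rises from interval to interval, mirroring the pattern of lowest growth in the 0-5-day interval and highest growth in the 16-30-day interval.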
We now move on to a detailed assessment of the LOD predictions. The MAE plots for a 10-day horizon (Fig. 7) indicate certain issues with the MAE on Day 0 for several submissions (IDs 100, 108, 112, and 117). It is worth noting that while we observe a significant error on Day 0 for ID 101, the MAE values are much lower for the following prediction days. This discrepancy could suggest that the incorrect observed LOD value on Day 0 was included only in the file sent to the EOP PCC Office, rather than erroneous observations being used in the forecasting process itself. The MAE results are most uniform for the ML group, with mean MAE[10] and MAE[0-10] of 0.102 and 0.064 ms, respectively (Table 9). Similar to UT1-UTC predictions, for LOD, the most promising methodology for a 10-day prediction horizon is that utilized by ID 136 (MAE[10] and MAE[0-10] are equal to 0.072 and 0.040 ms, respectively). The other submission from the same participant (ID 105) exhibits only slightly higher errors. The corresponding values obtained for ID 200 (LOD predictions transformed from UT1-UTC predictions) are 0.104 and 0.083 ms for MAE[10] and MAE[0-10], respectively. Notably, the mean MAE for the LS + AR and LS + AR + EAM groups is distorted by a few individual submissions, where high errors are observed starting from the submission day. This naturally impacts the overall performance of these groups. If we take into account the mean MAE for the whole 10-day prediction horizon (MAE[0-10]) for the entire group, ML methods turn out to provide the lowest errors. However, it is worth mentioning that the use of EAM data may be important in this case, not only the method itself. Indeed, in the ML group almost all participants use EAM data (note that we have not obtained information about whether EAM was used for ID 115). The LS + AR group is the only one in which none of the methods use EAM data, which may be the reason for the highest mean MAE values in this group. The impact of EAM inclusion on prediction accuracy improvement should be dominant for 10-day predictions, since EAM forecasts are only available for 6 days into the future.
The results presented in Fig. 8 and Table 10 indicate that the ML methods perform best in the case of LOD for a 30-day horizon (mean MAE[30] and MAE[0-30] of 0.161 and 0.119 ms, respectively). However, if we take into account individual approaches, it turns out that the most promising results were recorded for ID 157 (Table 10). Nevertheless, this finding should be treated with caution, as this participant registered relatively late (November 2022) in the campaign, and therefore the MAE was determined on the basis of only seven submissions. To evaluate this method objectively, more predictions from this participant should be assessed.
Table 11 summarizes the performance of methods developed by the EOP PCC participants compared with the algorithm exploited for ID 200, giving the number of IDs with […] In contrast, for LOD predictions, about half of the IDs forecast with an error lower than that for ID 200.

Evolution of prediction accuracy
This part of our analysis focuses on how the accuracy of individual predictions has evolved over time and aims to verify whether the improvements made by the participants during the campaign were effective. To do so, we analyze MAE for a 10-day horizon in eight consecutive two-month periods named P1-P8, whose start and end dates are shown in Table 12.
Figure 9 shows MAE for UT1-UTC for up to 10 days into the future for consecutive 2-month periods. It can be seen that for many participants prediction errors were not stable throughout the duration of the campaign. For example, ID 117 experienced some difficulties on Day 0 between the […] 31, 2022. Another participant with ID 148 started submitting forecasts in April 2022 with relatively high MAE, but then managed to reduce the errors in the following months.

Table 12 Two-month periods (P1-P8) considered in the analysis of forecast evolution over time and their start and end dates

Fig. 9 MAE for UT1-UTC for up to 10 days into the future for consecutive 2-month periods (a-h). The thick black line represents the mean MAE for the period and the thick magenta line represents the mean MAE for the whole campaign (the same for all subplots)

We are unable to identify a single ID for which errors consistently decreased from period to period, which would indicate continuous improvement of the method. Instead, we observe periodic increases and decreases in MAE for individual participants. In general, the methods that exhibited the lowest errors throughout the entire campaign period (IDs 104, 105, 116, 136, 146, 147, and 149, see Fig. 4) demonstrate visibly greater stability in terms of MAE across individual periods than other submissions. Comparing the average MAE for a given period with the average MAE for the entire operational phase of the campaign allows us to conclude that the period between March and May 2022 had the highest number of forecasts with errors below the overall average. It is worth noting that at the end of the campaign (in the last four months), the majority of methods exhibited low errors, and the mean MAE values for those periods were only disrupted by a few individual submissions.
When considering LOD, it becomes apparent that during the first 2 months, the MAE values for IDs 105 and 117 exhibit a distinct sinusoidal fluctuation that is not present in any of the other submissions (Fig. 10). This problem was thoroughly discussed in Kur et al. (2022), who provided a preliminary evaluation of LOD predictions collected during the 2nd EOP PCC. It was concluded that the issue was caused by incorrect handling of the impact of long-period ocean tides (Bizouard et al. 2022). In subsequent periods, the oscillation vanished, which is considered an effective intermediate result of the preliminary evaluations by the EOP PCC Office (Kur et al. 2022). There was a notable enhancement of accuracy when comparing MAE results for ID 108 for Period P1 with MAE for Period P2; however, starting from December 22, 2021, this participant stopped providing LOD forecasts while continuing to predict UT1-UTC and PM. Figure 10 shows that the issue with the incorrect value at Day 0 for ID 101 discussed in the previous section (Fig. 7) became apparent from May 2022 and was not solved until the end of the 2nd EOP PCC. Similar to the results for UT1-UTC, MAE values for almost all predictions after July 2022 are below the mean for the whole campaign duration and only individual methods (IDs 100 and 102) have MAE visibly higher than the mean MAE computed for the whole 2nd EOP PCC period.

Fig. 10 MAE for LOD for up to 10 days into the future for consecutive two-month periods (a-h). The thick black line represents the mean MAE for the period and the thick magenta line represents the mean MAE for the whole campaign (the same for all subplots)

To study the time evolution of prediction accuracy in more detail, for each ID, we computed a percentage change (PCh) of MAE for a given 2-month period compared with the preceding period, where MAE_i is the value for the ith point of the prediction computed for the nth period (n = 1, …, 7). If PCh > 0, MAE is lower than in the preceding period (prediction accuracy has improved); if PCh < 0, MAE is higher than in the preceding period (prediction accuracy has deteriorated).
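The defining equation of PCh is not reproduced in this text, so the sketch below only illustrates one plausible form and should be read as an assumption rather than the formula used in the evaluation: the relative MAE change is computed for each prediction point and then averaged, with the sign chosen so that a positive value means improved accuracy. The function name and the numbers are hypothetical.

```python
def percentage_change(mae_prev, mae_curr):
    """Mean relative MAE change (in %) between two consecutive periods.

    Assumed form: average over prediction points i of
    100 * (MAE_i[preceding] - MAE_i[current]) / MAE_i[preceding].
    A positive result means that MAE dropped, i.e. accuracy improved.
    """
    terms = [
        100.0 * (prev - curr) / prev
        for prev, curr in zip(mae_prev, mae_curr)
    ]
    return sum(terms) / len(terms)


# Hypothetical MAE values (ms) at four prediction points in two consecutive periods.
period_1 = [0.02, 0.10, 0.20, 0.40]
period_2 = [0.01, 0.08, 0.15, 0.30]
pch = percentage_change(period_1, period_2)
```

A per-point relative change is sensitive to very small MAE values on Day 0, so an implementation following this form would need to guard against division by values close to zero.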
Values of PCh for UT1-UTC and LOD predictions from each participant are displayed in Fig. 11. The figure confirms that practically no method exhibits a continuous improvement in accuracy, but there are also no instances in which the quality consistently declines. Rather, we observe alternating periods of better and worse prediction performance. Forecasts from ID 200 also demonstrate such tendencies. This may not necessarily be linked to modifications of the method itself but perhaps to the temporal occurrence of certain phenomena affecting LOD and UT1-UTC, which may be more challenging to forecast.
The PCh statistics presented in Table 13 indicate a noticeable decrease in accuracy for several campaign participants and for ID 200 as well (minimum PCh values), which could have been caused by periodic errors in data preparation, lack of observational EOP data and EAM predictions, or delays in access to the data, as discussed before. Both for UT1-UTC and LOD, the number of positive and negative values of PCh is similar, which, combined with a median value close to zero, allows us to conclude that the majority of predictions on a global scale were rather stable over time.

Transformation between LOD and UT1-UTC
In this section, we analyze MAE values obtained for the original UT1-UTC and LOD predictions, as well as the MAE of these parameters transformed from the respective LOD and UT1-UTC forecasts. Figure 12 presents MAE for up to 10 days into the future for original and transformed UT1-UTC and LOD predictions as well as the differences in MAE between the original and transformed predictions. The results are provided only for IDs that forecast both UT1-UTC and LOD. In general, for the methods that performed equally well (or poorly) for both UT1-UTC and LOD predictions (e.g., IDs 104, 105, 112, 117, and 136), the differences in MAE before and after transformation are relatively small. However, in cases where there are large prediction errors for one parameter and small errors for the second parameter, the corresponding difference becomes more pronounced. For example, ID 108 predicts LOD with a relatively large error compared with other participants, while UT1-UTC forecasts provided by this participant are accurate. Consequently, it turns out that transformed UT1-UTC predictions from ID 108 exhibit higher MAE than the original UT1-UTC predictions, but the transformed LOD shows a lower MAE than the original LOD forecasts. In other words, for this method, it is better not to forecast LOD but to transform it from the UT1-UTC predictions. The reverse applies to the method from ID 122, which poorly predicts UT1-UTC but has a more accurate prediction of LOD. This analysis demonstrates the influence of differences in accuracy between UT1-UTC and LOD predictions on the results of the transformation between these parameters, rather than the impact of the transformation itself on the accuracy of the transformed predictions. The forecasting method also seems to have little influence, as the IDs with the smallest differences in MAE before and after transformation belong to different groups.
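The transformation between the two parameters follows from LOD being the negative time derivative of UT1-UTC. The sketch below is a simplified illustration, not the procedure used in the evaluation: it assumes daily samples, a forward finite difference, no leap second inside the window, and no separate treatment of tidal terms. Function names and sample values are hypothetical.

```python
def lod_from_ut1(ut1_utc):
    """Daily LOD from daily UT1-UTC: LOD_k = (UT1-UTC)_k - (UT1-UTC)_{k+1}.

    Input and output share the same unit (e.g. ms); the result has one
    element fewer than the input.
    """
    return [ut1_utc[k] - ut1_utc[k + 1] for k in range(len(ut1_utc) - 1)]


def ut1_from_lod(ut1_utc_day0, lod):
    """Integrate LOD forward in time from a known UT1-UTC value on Day 0."""
    series = [ut1_utc_day0]
    for excess in lod:
        series.append(series[-1] - excess)
    return series


# Hypothetical UT1-UTC values in ms on four consecutive days.
ut1_utc = [-10.0, -10.5, -11.2, -11.6]
lod = lod_from_ut1(ut1_utc)
ut1_back = ut1_from_lod(ut1_utc[0], lod)
```

Because the integration accumulates every daily LOD error, a method with a biased LOD forecast yields a drifting transformed UT1-UTC, while differencing an accurate UT1-UTC forecast gives a comparatively accurate LOD, which matches the asymmetry discussed for IDs 108 and 122.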

Ranking of prediction approaches
Although it is not possible to identify a single ID that would provide the highest prediction accuracy for all EOP and for different forecast horizons, we attempted to find the most universal and reliable combination of prediction methodology and input data. To do so, we have developed a ranking of all IDs based on the following criteria: (A) percentage of rejected submissions, to assess the credibility of predictions for a given algorithm; (B) range of differences between prediction and reference, to evaluate prediction repeatability (accurate predictions with high stability over time should be characterized by a small range of differences); (C) […] and MAE[10], to check the quality of predictions at the beginning, middle, and end of the 10-day prediction horizon; and (D) median of PCh, to assess the stability of the accuracy of the method. To include all IDs in the ranking, we do not consider the prediction for 30 days into the future.
Each ID has been assigned points equal to its position in the classification, so that the lower the number of points, the higher the place reached in the classification. The ranking for UT1-UTC is shown in Table 14, and the ranking for LOD is in Table 15. Overall, the prediction from ID 200 (IERS) is placed 2nd for UT1-UTC and 10th for LOD predictions transformed from UT1-UTC predictions, which confirms the reliability of the algorithm used at USNO. For both UT1-UTC and LOD, the highest places are dominated by IDs exploiting EAM data, mostly from groups LS + AR + EAM and ML. The leader for both parameters turned out to be ID 136 with the LS + AR + EAM method exploited by GFZ.
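The point scheme can be illustrated with a short sketch. It assumes that positions are determined separately for each criterion and then summed, that every criterion is expressed so that a lower value is better, and that ties receive no special treatment; none of these details is stated in the text. The IDs "A", "B", "C" and all values are hypothetical.

```python
def rank_positions(scores):
    """Map each ID to its 1-based position when sorted by ascending score."""
    ordered = sorted(scores, key=scores.get)
    return {pid: pos for pos, pid in enumerate(ordered, start=1)}


def overall_ranking(criteria):
    """Sum per-criterion positions; fewer points means a higher place."""
    points = {}
    for scores in criteria.values():
        for pid, pos in rank_positions(scores).items():
            points[pid] = points.get(pid, 0) + pos
    return sorted(points.items(), key=lambda item: item[1])


# Hypothetical criterion values for three fictitious IDs (lower is better).
criteria = {
    "rejected_pct": {"A": 0.0, "B": 5.0, "C": 2.0},
    "range_ms": {"A": 1.2, "B": 3.4, "C": 0.9},
    "mae10_ms": {"A": 0.3, "B": 0.9, "C": 0.5},
}
ranking = overall_ranking(criteria)
```

A position-based score of this kind rewards methods that are consistently good across criteria and limits the influence of a single extreme value, at the cost of ignoring how large the gaps between neighbouring IDs are.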

Summary and conclusions
The main objective of this research was to conduct a thorough evaluation of multiple predictions of UT1-UTC and LOD obtained during the 2nd EOP PCC, using the IERS 14 C04 solution as a reference. The primary goal of the campaign was to assess the current potential of EOP prediction, which encompassed exploring new methodologies such as ML methods that have been rapidly evolving in recent years, along with studying the contribution of input data (both EOP observations and EAM data and predictions) to forecast performance. The 2nd EOP PCC provided an objective evaluation platform for scientists from different countries and institutions to collaborate and compete in enhancing EOP predictions. Thanks to the effort and experience of the 23 participating institutions from around the world, an unprecedented set of EOP forecasts was gathered during an operational phase spanning 70 weeks.
Since the conclusion of the 1st EOP PCC, there has been an increased interest in EOP forecasts, which was evident in the considerably larger number of teams participating in the most recent EOP PCC. Lessons from the first campaign have been learned in the sense that, as recommended in the conclusions of the 1st EOP PCC, there has been an increased interest in utilizing EAM when predicting EOP in order to enhance forecasting algorithms. In the case of UT1-UTC and LOD, 23 out of 35 IDs exploited such data. While the focus in the first campaign was primarily on utilizing AAM, in the current campaign, OAM, HAM, and SLAM data and their predictions were also incorporated. However, at present, EAM predictions developed by GFZ are accessible for a maximum of 6 days ahead, and this notably influences the outcomes obtained by 2nd EOP PCC participants who utilized these data. Extending the length of EAM predictions could potentially help to reduce the prediction errors for longer prediction horizons. The only center, apart from GFZ, that currently provides EAM predictions is ETH Zurich (Kiani Shahvandi et al. 2022a, 2023; Soja et al. 2022). Daily forecasts of all four EAM components from ETH are available up to 14 days ahead; however, these predictions are generated through the utilization of ML techniques applied to data sourced from GFZ, so they are not entirely independent. Nevertheless, their utilization has the potential to enhance the accuracy of EOP predictions for a maximum forecast horizon of about two weeks. Future research aiming to improve EOP predictions should encompass the development of new EAM data and longer-term EAM forecasts, by making use of even longer prediction runs that are now being performed by various numerical weather prediction centers (Scaife et al. 2022).
Regarding the best methods for predicting UT1-UTC and LOD, the Kalman filter (with AAM forecast from NCEP used), wavelet decomposition + autocovariance, and adaptive transformation from AAM to LOD residuals (LODR) were identified as the most effective approaches in the 1st EOP PCC (Kalarus et al. 2010). In the current campaign, LS + AR with EAM data and predictions and ML with EAM data and predictions were found to achieve the highest accuracy. It should be noted that although ML methods have been rapidly developing in recent years, and the ML groups considered in this article were the largest in terms of the number of IDs, the majority of these methodologies were developed by a single participant (ETH). Therefore, it would be beneficial for other teams to join in the exploitation of this promising new technology. All ML methods that achieved high accuracy utilized EAM as an additional data source, and the usage of these data seems to be crucial in improving the accuracy of UT1-UTC and LOD predictions. The source of the EOP observations used appears to play a secondary role. However, proper implementation of input data is crucial, as errors for the submission day resulting from issues such as the limited availability of EOP and/or EAM data, or internal problems with the data retrieval algorithm, contribute to a bias specific to each method.
When it comes to the numbers, for the best prediction methodologies chosen for UT1-UTC, MAE[10] was 0.27 ms, while MAE[30] was 2.77 ms. For forecasts from ID 200 those quantities were only slightly larger, i.e., 0.37 ms and 2.79 ms for MAE[10] and MAE[30], respectively. In turn, during the 1st EOP PCC, the best UT1-UTC prediction methods ensured MAE[10] of around 0.60 ms and MAE[30] as high as 3.80 ms. In the case of LOD, the best achievements from the 2nd EOP PCC were 0.072 ms for MAE[10] and 0.097 ms for MAE[30], while in the 1st EOP PCC, optimal values were around 0.130 ms and 0.220 ms for MAE[10] and MAE[30], respectively. There has clearly been considerable progress made in EOP forecasting over the past years. Nevertheless, there remains room for improvement for teams predicting UT1-UTC as only 2 (in the case of MAE[10] and MAE[30]) out of the 25 IDs revealed slightly higher prediction accuracy than official forecasts disseminated by IERS (ID 200).
To summarize the achievements of the 2nd EOP PCC and provide some perspectives, it should be stated that currently, the most important factor in improving the accuracy of UT1-UTC and LOD forecasts is the use of precise and reliable EAM data and predictions. Therefore, the first step in this regard should be the development of EAM forecasts for longer horizons. It has also been demonstrated that modern ML-based algorithms have tremendous potential and should continue to be developed, although classical approaches based on LS + AR (with added EAM information) have also proven to be reliable.

The result for prediction from IERS is shown in bold
a ID 157 submitted only seven predictions

[…] adjusted to have the same bias and rate, the stated uncertainties of the measurements are adjusted, and data points considered to be outliers are deleted (Freedman et al. 1994; Gross et al. 1998).

105
Authors with affiliations: Robert Dill, Henryk Dobslaw, Jan Saynisch-Wagner, Christopher Irrgang, Maik Thomas
Department 1: Geodesy, Deutsches GeoForschungsZentrum GFZ, Potsdam, Germany
Method description: EOP prediction is based on the GFZ EAM Predictor (Dill et al. 2019). The sum of EAM (4 years of model-based effective angular momentum functions including EAM 6-day forecasts) and the residual of geodetic angular momentum (GAM) (4 years of geodetic angular momentum derived from IERS 14 C04), extrapolated for the last ~30 days up to the end of the EAM 6-day forecasts by a 1st LS + AR step, is predicted into the future by a 2nd LS + AR step. GFZ uses this two-step GFZ EAM Predictor to provide daily updated EAM predictions from -90 days in the past to +90 days into the future with 3-hourly sampling (http://esmdata.gfz-potsdam.de:8080/). The submitted EOP prediction #105 is generated as soon as the EAM prediction is available (~11 UTC). Using the latest available EOP coordinates from IERS rapid EOPs (finals.daily) from the day before as initial values for the Liouville equation, a 90-day EOP prediction is derived from the EAM prediction. The predicted 3-hourly EOPs are re-sampled to daily time intervals and the first day is cut off to start the time series at the actual day of submission.

107
Authors with affiliations: Zhijin Zhou, Lue Chen, Weitao Lu, Songtao Han
National Key Laboratory of Science and Technology on Aerospace Flight Dynamic, Beijing Aerospace Control Center, Beijing, China
Method description: The method first determines the length of the training data and reads the raw data, which are preprocessed according to their type. A least-squares (LS) model is then fitted, and the residuals between the data and the LS model are computed. The residual prediction is obtained by autoregressive (AR) modeling, and the LS model is extrapolated. The two are added together and post-processed to obtain the prediction product (Chen et al. 2014).
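The LS + AR idea described above can be sketched in a heavily simplified form. The sketch is not the implementation of ID 107: it swaps in a straight-line LS model and a first-order AR model for the residuals, whereas the functional form of the LS model and the AR order used by the participants are not given here. All function names and values are hypothetical.

```python
def fit_line(y):
    """Least-squares fit of y[t] = a + b*t for t = 0..n-1; returns (a, b)."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    b = sxy / sxx
    return y_mean - b * t_mean, b


def fit_ar1(res):
    """Least-squares AR(1) coefficient for res[t] = phi * res[t-1] + noise."""
    num = sum(res[t] * res[t - 1] for t in range(1, len(res)))
    den = sum(r * r for r in res[:-1])
    return num / den if den else 0.0


def ls_ar_forecast(y, horizon):
    """Extrapolate the LS line and add the AR(1) forecast of the residuals."""
    a, b = fit_line(y)
    res = [v - (a + b * t) for t, v in enumerate(y)]
    phi = fit_ar1(res)
    n = len(y)
    forecast = []
    last = res[-1]
    for h in range(1, horizon + 1):
        last = phi * last  # AR(1) residual forecast decays geometrically
        forecast.append(a + b * (n - 1 + h) + last)
    return forecast


# Hypothetical noise-free series: the residuals vanish, so the forecast follows the line.
series = [2 + 0.5 * t for t in range(10)]
prediction = ls_ar_forecast(series, 2)
```

In an operational setting the LS part would typically also include periodic terms and the AR order would be selected from the data; both are left out here to keep the sketch short.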

108
Authors […] In this method, the EAM from the Earth System Modelling Group of GeoForschungsZentrum Potsdam (ESMGFZ) are selected as the input excitation series. Afterward, the interannual, seasonal and sub-seasonal terms of EOP are calculated from the EAM predictions by the Liouville convolution equation. Meanwhile, the rest of the EOP trend terms are extrapolated by the polynomial LS model. Finally, the total EOP predictions are obtained by combining the excitation calculations and trend extensions (Xu et al. 2023).

116
Authors with affiliations: Erik Schoenemann (1), Sara Bruni (2), Michiel Otten (2) […]
Concerning the prediction phase, the software implements a combination of least-squares fitting and autoregressive modelling based on the whole history of ESA ERP estimates to characterize the deterministic part of the ERP variability and the high-frequency variability induced by non-tidal atmospheric and oceanic dynamics. Then, the short-range (6 days) EAM forecasts provided by GFZ are used to predict the irregular variations generated by the atmospheric, oceanic and hydrological dynamics. In order to stabilize the short-term predictions, EAM forecasts are also combined in the excitation domain with an additional signal extrapolating the difference between the geodetic excitation necessary to generate the observed history of ERP variations and the corresponding excitation extracted from geophysical models.
Additional details on the ESA ERP Service can be found in Bruni et al. (2021); the ESA ERP Software is described in Kehm et al. (2023).

117
Authors with affiliations: […]
We used two different approaches for Earth Rotation Parameters (ERP) and celestial pole offsets (CPO) prediction. The proposed algorithm for predicting the ERP is called SSA + Copula-based analysis (Modiri et al. 2018, 2020; Modiri 2021). The algorithm splits the observed PM time series into periodic terms and anomalies, which are modelled using SSA and Copula-based analysis, respectively. The SSA periodic terms estimation involves selecting a window parameter, forming a trajectory matrix, performing singular value decomposition, selecting a proper group of singular values and corresponding singular vectors, and calculating the trend. The Copula anomaly modelling involves forming the trajectory matrix of the residual time series, computing the marginal distribution, transforming data to the rank space, computing the empirical and conditional Copula, and sampling random data from the conditional Copula cumulative distribution function (CDF). The final predicted PM data is the sum of the predicted periodic terms using SSA and the predicted anomaly using the Copula-based model.

124
Authors with affiliations: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja
Institute of Geodesy and Photogrammetry, ETH Zurich, Zurich, Switzerland
Method description: This method focuses on the 10-day prediction of LOD using a hybrid modeling approach, which combines physical and statistical models with an encoder-decoder LSTM (EDLSTM) network while considering geophysical excitations. We first remove the secular trend and known signals from the LOD time series by combining the Savitzky-Golay filter (Savitzky and Golay 1964), tidal corrections (Petit and Luzum 2010) and least-squares adjustment (Brockwell and Davis 2002) to generate the LOD residuals (LODR). The GFZ AAM data (Dobslaw and Dill 2018) and corresponding 6-day forecasts are also preprocessed using least-squares adjustment to obtain AAM residuals (AAMR).
Then, the LODR, AAMR, and 6-day forecasts of AAMR are concatenated and considered as input features. The final dimension of the input tensors is 30 × 8 since we consider the previous 30 days as the input sequence. Then we feed this tensor into the EDLSTM network (Hochreiter and Schmidhuber 1997; Nayak and Ng 2020) to predict LODR for the next ten days. In order to restore the full LOD, we should also predict the previously removed components. The tidal and seasonal signals can be easily predicted since they are estimated using deterministic models. The long-term trend is predicted using PCHIP (Piecewise Cubic Hermite Interpolating Polynomial) extrapolation (Fritsch and Carlson 1980).
For more details, please refer to Gou et al. (2023).

125
Authors with affiliations: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja Institute of Geodesy and Photogrammetry, ETH Zurich, Zurich, Switzerland Method description: This architecture is designed (Kiani Shahvandi and Soja 2022a) to extract meaningful information even from scarce data. The elements of this architecture are as follows: (1) residual learning; (2) attention mechanism; (3) long short-term memory (LSTM). Several consecutive blocks of these elements are stacked together in an auto-encoder manner. Each of these blocks is pretrained using so-called greedy layer-wise pretraining in order to facilitate the main training phase. As in the studies on the Transformer architecture (Kiani Shahvandi and Soja 2021), the number of blocks depends on the accuracy obtained on the validation set during training, but for the prediction of LOD there are usually two blocks in the architecture. It is also important to note that the architecture of the LSTM element is similar to the one used in Gou et al. (2023). The input and output sequence lengths of this architecture are 30 and 12, respectively. The loss function is the mean absolute error. The algorithm is retrained at each prediction epoch to take advantage of newly available EOP and EAM data. This architecture is trained on LOD residuals (i.e., after the removal of secular trends, tidal effects, and seasonal signals), with the AAM and its 6-day forecasts as additional features (seasonal signals are removed from AAM). The final prediction is the sum of the output of the architecture and the prediction of the secular trends, tides, and seasonal signals of the LOD.

The architecture used is based on first-order neural ordinary differential equations (Neural ODEs). In this architecture, the hidden state in the hidden layer should follow a differential equation. To apply this concept to the EOPs, it is assumed that EOPs follow first-order differential equations, the exact form of which should be determined by fitting neural networks to the observations. The general approach of Neural ODE differential learning (Kiani Shahvandi et al. 2022a) is modified (i.e., in a way that does not require using the rates of EOPs) and used as the primary architecture. A variation of this architecture is the so-called simple recursive method (Kiani Shahvandi et al. 2022b), in which an attempt is made to incorporate the uncertainties of the observational data into the training for a more reliable estimation of the parameters of the neural networks (Kiani Shahvandi and Soja 2022b). As a result, the loss function here is the mean squared error. The architecture does not require any preprocessing of the input features. However, in the case of LOD prediction it is applied to the LOD residuals after the removal of secular trends, tides, and seasonal signals (Gou et al. 2023). The architecture provides forecasts for both 10-day and 30-day horizons. The input sequence length of the architectures is 10 days. The architectures are trained at each prediction epoch to take advantage of the most recent available EOP and EAM data.

136
Authors with affiliations: Robert Dill, Henryk Dobslaw, Jan Saynisch-Wagner, Christopher Irrgang, Maik Thomas Department 1: Geodesy, Deutsches GeoForschungsZentrum GFZ, Potsdam, Germany Method description: EOP prediction is based on the GFZ EAM Predictor (Dill et al. 2019). The sum of EAM (4 years of model-based effective angular momentum functions, including the EAM 6-day forecasts) and the residual of GAM (4 years of geodetic angular momentum derived from IERS 14 C04), extrapolated for the last ~30 days up to the end of the EAM 6-day forecasts by a first LS + AR step, is predicted into the future by a second LS + AR step. GFZ uses this two-step GFZ EAM Predictor to provide daily updated EAM predictions from −90 days in the past to +90 days into the future with 3-hourly sampling (http://esmdata.gfz-potsdam.de:8080/). The submitted EOP prediction #136 is generated as soon as the rapid EOP solution for the current day is available (~17:15 UTC). The latest non-predicted EOP values from the IERS rapid EOPs (finals.daily) are taken as initial values for the Liouville equation to derive a 90-day EOP prediction from the EAM prediction. The predicted 3-hourly EOPs are resampled to daily time intervals.

141
Authors with affiliations: Maciej Michalczak, Marcin Ligas AGH University of Krakow, Adama Mickiewicza 30, 30-059 Kraków, Poland Method description: Ordinary kriging belongs to a broad family of geostatistical prediction methods, and is optimal in the sense of Best Linear Unbiased Prediction (BLUP) if the mean value of a random function is an unknown constant. Prediction by means of kriging requires a structure function that describes the continuity and variability of a random process. A semivariogram, due to its advantages over a covariance function, is used to describe the structure hidden in the residual series, which is obtained after the removal of a linear trend, periodic components, and the periodicity associated with solid Earth tides. Among the theoretical semivariogram models, the exponential model turned out to perform best. The final forecast consists of the extrapolated deterministic part combined with the predicted (kriged) stochastic part. The entire process is presented in Michalczak and Ligas (2021, 2022).
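As a rough illustration of the kriging step, the sketch below predicts one future value of a residual series by ordinary kriging with an exponential semivariogram. It is a minimal example under stated assumptions rather than the participants' code: the sill and range values, the absence of a nugget, the toy residuals, and all function names are hypothetical, and in practice the semivariogram parameters would be fitted to an empirical semivariogram.

```python
import math


def exp_semivariogram(h, sill=1.0, rng=5.0):
    """Exponential model without nugget: gamma(h) = sill * (1 - exp(-|h| / rng))."""
    return sill * (1.0 - math.exp(-abs(h) / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(times, values, t_new, gamma=exp_semivariogram):
    """Return the kriged value at t_new and the kriging weights."""
    n = len(times)
    # Semivariogram matrix bordered by ones (unbiasedness constraint).
    a = [[gamma(times[i] - times[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [gamma(t - t_new) for t in times] + [1.0]
    sol = solve(a, b)
    weights = sol[:n]   # the last entry of sol is the Lagrange multiplier
    return sum(w * z for w, z in zip(weights, values)), weights


# Hypothetical residual series (deterministic part already removed).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
residuals = [0.3, -0.1, 0.2, 0.0, 0.4]
forecast, weights = ordinary_kriging(times, residuals, 5.0)
```

The weights sum to one, which is the unbiasedness condition behind the BLUP property mentioned above; the kriged residual would then be added to the extrapolated deterministic part.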

142
Authors with affiliations: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja Institute of Geodesy and Photogrammetry, ETH Zurich, Zurich, Switzerland Method description: This method focuses on the 10-day prediction of LOD using a hybrid modeling approach, which combines physical and statistical models with an encoder-decoder LSTM (EDLSTM) network while considering geophysical excitations. We first remove the secular trend and known signals from the LOD time series by combining the Savitzky-Golay filter (Savitzky and Golay 1964), tidal corrections (Petit and Luzum 2010), and least-squares adjustment (Brockwell and Davis 2002) to generate the LOD residuals (LODR). The GFZ EAM (AAM + OAM + HAM) data (Dobslaw and Dill 2018) and the corresponding 6-day forecasts are also preprocessed using least-squares adjustment to obtain EAM residuals (EAMR).
Then, the LODR, EAMR, and 6-day forecasts of EAMR are concatenated and used as input features. The final dimension of the input tensors is 30 × 8, since we consider the previous 30 days as the input sequence. We then feed this tensor into the EDLSTM network (Hochreiter and Schmidhuber 1997; Nayak and Ng 2020) to predict LODR for the next ten days. To restore the full LOD, the previously removed components must also be predicted. The tidal and seasonal signals can be predicted easily, since they are estimated using deterministic models. The long-term trend is predicted using PCHIP (Piecewise Cubic Hermite Interpolating Polynomial) extrapolation (Fritsch and Carlson 1980).
For more details, please refer to Gou et al. (2023).

143
Authors with affiliations: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja Institute of Geodesy and Photogrammetry, ETH Zurich, Zurich, Switzerland Method description: The basis of this architecture is the multilayer perceptron. However, in order to use several hidden layers and overcome the problems of deep neural networks, we have added residual learning blocks to this architecture, similar to Kiani Shahvandi and Soja (2022a). We therefore call this architecture the revised multilayer perceptron. The activation functions of the hidden layers are hyperbolic tangent, except for the last layer, which is linear. There are three hidden layers. The input sequence length in this algorithm is 10. The output sequence length, however, is either 12 or 32 depending on the forecasting horizon, both of which are provided. The architecture is used for LOD prediction in the same manner as in Gou et al. (2023). The inputs to this algorithm are preprocessed, i.e., in the case of LOD the trends, tides, and seasonal signals are removed, while for the EAM functions the seasonal signals are subtracted from the observations and forecasts. The subtracted signals are subsequently added back to the predictions of the architecture to give the final LOD prediction.

147, 148, 149
Authors with affiliations: Mostafa Kiani Shahvandi, Matthias Schartner, Junyang Gou, Benedikt Soja Institute of Geodesy and Photogrammetry, ETH Zurich, Zurich, Switzerland Method description: This architecture is based on first-order neural ordinary differential equations (Neural ODEs). The general Neural ODE differential learning architecture (Kiani Shahvandi et al. 2022a) is modified in a way that does not incorporate the rates of EOPs into the architecture. A simplified form of this architecture, referred to as simple recursive (Kiani Shahvandi et al. 2022b), is also used, in which the uncertainties in the input data are used to weight the loss function (Kiani Shahvandi and Soja 2022b), here the mean squared error, for a more reliable estimation of the parameters of the neural networks. However, investigating the residuals of the training phase reveals that some signals in the observations are not well captured by the Neural ODEs. Therefore, an attempt is made to model these residuals with Long Short-Term Memory (LSTM) neural networks in the manner suggested by Gou et al. (2023). First, the Neural ODEs are trained, and the fitted values are then subtracted from the observations to compute the residuals. Subsequently, these residuals are modelled by the LSTM. The input to the LSTM architecture consists only of the past values of the training residuals of the Neural ODE architecture. The input sequence length here is 10 and the loss function is the mean absolute error. For this purpose, the predictions of IERS are also incorporated into the algorithm in order to predict the residuals at each training epoch (retraining is required).
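One simple way to let observational uncertainties weight a mean squared error loss is inverse-variance weighting, sketched below. This is an assumption made for illustration only: the text does not state which weighting scheme is used, and the function name `weighted_mse` and the sample numbers are hypothetical.

```python
def weighted_mse(observed, predicted, sigma):
    """Mean squared error with inverse-variance weights (assumed scheme)."""
    weights = [1.0 / s ** 2 for s in sigma]
    num = sum(w * (o - p) ** 2 for w, o, p in zip(weights, observed, predicted))
    return num / sum(weights)


observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]

# Equal uncertainties reduce to the ordinary mean squared error.
plain = weighted_mse(observed, predicted, [1.0, 1.0, 1.0])

# A large uncertainty on the third observation reduces its influence on the loss.
down_weighted = weighted_mse(observed, predicted, [1.0, 1.0, 10.0])
```

With such a scheme, observations with large formal errors contribute less to the parameter estimates, which matches the stated aim of a more reliable estimation.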

156
Authors with affiliations: Maciej Michalczak, Marcin Ligas AGH University of Krakow, Adama Mickiewicza 30, 30-059 Kraków, Poland Method description: The deterministic part, i.e., the estimated linear trend, periodic components, and tidal effects, is first removed from the raw time series. An ARIMA model is then used to predict the residual part. The autoregressive integrated moving average model ARIMA(p, d, q) is a combination of an autoregressive model (p), a moving average model (q), and a differencing process (the integrated part; d) that accounts for potential non-stationarity of the residual process. The deterministic component is extrapolated to future time instances and then combined with the ARIMA-based prediction of the stochastic part. The best set of parameters p and q is selected by means of the corrected Akaike Information Criterion (AICc). The parameter d determines the degree of differencing to be applied in order to transform a non-stationary time series into one that is stationary in the mean sense. Stationarity of the residual process is checked using the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test. The entire prediction procedure is described in Michalczak et al. (2022).
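The two selection ingredients named above, differencing of order d and the corrected AIC, can be illustrated with a short sketch. It is not the participants' procedure: the residual sums of squares are made-up numbers, the parameter count k = p + q + 1 (including the noise variance) is an assumption, and the Gaussian form of the AIC drops constant terms.

```python
import math


def difference(series, d=1):
    """Apply d rounds of first differencing (the integrated part of ARIMA)."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def aicc(rss, n, k):
    """Corrected AIC for a Gaussian model with k estimated parameters,
    computed from the residual sum of squares (constant terms dropped)."""
    aic = n * math.log(rss / n) + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)


# A quadratic sequence becomes constant after differencing twice.
twice_differenced = difference([1, 4, 9, 16, 25], d=2)

# Hypothetical fits: (p, q) -> residual sum of squares on n = 100 samples.
n = 100
candidates = {(1, 0): 52.0, (2, 1): 50.0, (3, 3): 49.9}
scores = {pq: aicc(rss, n, k=pq[0] + pq[1] + 1) for pq, rss in candidates.items()}
best_order = min(scores, key=scores.get)
```

In this toy case the small gain in fit of the larger models does not offset their penalty, so the simplest candidate is selected.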

157
Authors with affiliations: Maciej Michalczak, Marcin Ligas AGH University of Krakow, Adama Mickiewicza 30, 30-059 Kraków, Poland Method description: Dynamic mode decomposition (DMD) is a relatively new data decomposition technique that emerged in the field of fluid dynamics through the work of Schmid (2010). It is described as an ideal marriage of data decomposition methods (e.g., principal component analysis) and the Fourier transform (Kutz et al. 2016). It is a data-driven, equation-free technique whose only assumption is that some dynamics are present in the data. It can reconstruct and forecast data in a single numerical procedure. Although DMD is of spatio-temporal origin, with a slight modification it can be applied to univariate time series by splitting an input time series of length T into L subseries of length K, each shifted by one time step ahead (Tirunagari et al. 2017). This generates a trajectory matrix known from, e.g., singular spectrum analysis (SSA). The main goal of the method is to capture the low-rank structure of the analyzed dynamical system, i.e., to decompose it into the most dominant components (trends, harmonics) that can later be used to predict a future state. For reasons of numerical feasibility, during the 2nd EOP PCC the method was used without the dimensionality reduction step (no low-rank approximation involved).
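The splitting of a univariate series into shifted subseries can be sketched as below. The function name `trajectory_matrix` and the toy series are hypothetical; the relation L = T - K + 1 follows from shifting each subseries by one time step.

```python
def trajectory_matrix(series, k):
    """Return the L = T - k + 1 subseries of length k, each shifted by one step."""
    t = len(series)
    if not 1 <= k <= t:
        raise ValueError("k must be between 1 and len(series)")
    return [series[i:i + k] for i in range(t - k + 1)]


series = [0, 1, 2, 3, 4, 5]             # T = 6
windows = trajectory_matrix(series, 4)  # K = 4, hence L = 3

# DMD works on pairs of consecutive snapshots: all but the last subseries
# paired with all but the first.
snapshots_now = windows[:-1]
snapshots_next = windows[1:]
```

Treating each subseries as one snapshot gives the pair of shifted matrices from which the DMD operator is estimated.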

Fig. 2 Boxplot of differences between reference and predicted values for UT1-UTC for the 1st, 5th, 10th, 15th, 20th, 25th, and 30th day into the future for groups (a) LS + AR, (b) LS + AR + EAM, (c) ML, and (d) Other

Fig. 4 MAE for UT1-UTC for up to 10 days into the future for groups (a) LS + AR, (b) LS + AR + EAM, (c) ML, and (d) Other. The thick black line represents the mean MAE for the group, and the thick magenta line represents the mean MAE for all IDs (the same for all subplots)

Fig. 5 MAE for UT1-UTC for up to 30 days into the future for groups (a) LS + AR, (b) LS + AR + EAM, (c) ML, and (d) Other. The thick black line represents the mean MAE for the group, and the thick magenta line represents the mean MAE for all IDs (the same for all subplots)

Fig. 6 Mean MAE increase per day for UT1-UTC prediction for each group, all predictions together (Total), and ID 200

Fig. 8 MAE for LOD for up to 30 days into the future for groups (a) LS + AR, (b) LS + AR + EAM, (c) ML, and (d) Other. The thick black line represents the mean MAE for the group, and the thick magenta line represents the mean MAE for all IDs (the same for all subplots)

Fig. 11 Percentage change (PCh) of MAE of (a) UT1-UTC and (b) LOD predictions in individual analysis periods (P2-P8) in relation to the previous periods (P1-P7)

Fig. 12 MAE for up to 10 days into the future for (a) UT1-UTC, (b) LOD, (c) UT1-UTC transformed from LOD, (d) LOD transformed from UT1-UTC, (e) difference between MAE of UT1-UTC and MAE of UT1-UTC

ESA/ESOC, Robert-Bosch-Straße 5, Darmstadt, Germany (2) PosiTim UG at ESA/ESOC, Robert-Bosch-Straße 5, Darmstadt, Germany (3) LSE Space GmbH at ESA/ESOC, Robert-Bosch-Straße 5, Darmstadt, Germany Method description: ESA's contribution to the second IERS EOP PCC was based on the output of the ESA ERP Service (http://navigation-office.esa.int/products/erp/) generated from September 2021 to December 2022. The Service provides daily updates of ERP estimates and the corresponding predictions for 119 days into the future. The estimation phase is based on a rigorous combination of different geodetic solutions at the normal equation level. The combination takes into account the full correlation matrices and realizes a seamless transition between ERP estimates based on final and rapid input products. For the IERS EOP PCC, the combination included ESA's official GNSS, SLR, and DORIS products submitted to the relevant International Association of Geodesy (IAG) Services, as well as BKG and Deutsches Geodätisches Forschungsinstitut (DGFI) solutions for intensive and rapid-turnaround VLBI sessions, respectively.

Table 1
Summary of the number of all registered participants, methods, and prediction files involved in the 2nd EOP PCC compared with the 1st EOP PCC.

Table 4
For IDs not listed (IDs 113, 115), none of the files was removed. Note that the prediction horizon for IDs 126 and 130 is shorter than 30 days; therefore, the 30-day prediction horizon values are not provided for those IDs
a N/A: not applicable

Table 5
For IDs not listed (IDs 101, 113, 114, 136, 157), none of the files was removed. Note that the prediction horizon for IDs 124, 125, 129, and 142 is shorter than 30 days; therefore, the 30-day prediction horizon values are not provided for those IDs

Table 6
Statistics (min, max, mean, root-mean-square (RMS), median, range, and inter-quartile range (IQR)) for the differences between reference and predicted values for UT1-UTC and LOD for predictions from all participants (Total) and for predictions from ID 200 for prediction horizons of 10 and 30 days into the future

These discrepancies ultimately impacted the MAE values. The forecast group identified the submissions with incomplete input data, which relied solely on C04 data with a 30-day lag, as those with MJD submission dates 59528, 59535, 59542, 59556, 59591, and 59899 (personal communication with Sadegh Modiri). The σ and β criteria utilized by the EOP PCC Office flagged only the dates 59528, 59535, and 59556 as outliers. The participant also pointed to the inconsistent combination of two distinct input time series (IERS 14 C04 and Bulletin A) as a potential reason for the offset observed on Day 0.
Table 7 includes the highest, lowest, and mean values of MAE[10] and MAE[0-10] for each group in comparison with quantities received for ID 200.

Table 7
Statistics (maximum, minimum, and mean) of MAE for UT1-UTC for the 10th day of the prediction and for the whole 10-day forecast horizon

Table 8
Statistics (maximum, minimum, and mean) of MAE for UT1-UTC for the 30th day of the prediction and for the whole 30-day forecast horizon

Table 9
Statistics (maximum, minimum, and mean) of MAE for LOD for the 10th day of the prediction and for the whole 10-day forecast horizon

Table 10
Statistics (maximum, minimum, and mean) of MAE for LOD for the 30th day of the prediction and for the whole 30-day forecast horizon

Table 14
Ranking of IDs according to the adopted criteria and the number of points awarded to each ID in individual categories for UT1-UTC

Table 15
Ranking of IDs according to the adopted criteria and the number of points awarded to each ID in individual categories for LOD

Availability of data and materials
All EOP predictions analyzed in this study were submitted to the EOP PCC Office by registered participants in the frame of the 2nd EOP PCC. The data can be accessed from the GFZ Data Services under the following link: https://doi.org/10.5880/GFZ.1.3.2023.001. Predictions developed by IERS/USNO, as well as the IERS 14 C04 solution used in this study to validate EOP predictions, are available at https://www.iers.org/IERS/EN/DataProducts/EarthOrientationData/eop.html.

The prediction method is LS + MAR, in which LS stands for difference LS and MAR for multi-element AR. The inputs include EOP data released by IERS and AAM data released by GFZ. The predicted parameters include both PM components and UT1-UTC; the longest prediction horizon is 365 days.