Development of the UKESM-TOPAZ Earth System Model (Version 1.0) and Preliminary Evaluation of its Biogeochemical Simulations

Earth system models (ESMs) comprise various Earth system components and simulate the interactions between these components. ESMs can be used to understand climate feedbacks between physical, chemical, and biological processes and predict future climate. We developed a new ESM, UKESM-TOPAZ, by coupling the UK ESM (UKESM1) and the Tracers of Phytoplankton with Allometric Zooplankton (TOPAZ) biogeochemical module. We then compared the preliminary simulated biogeochemical variables, which were conducted over a period of 70 years, using observational and existing UKESM1 model data. Similar to UKESM1, the newly developed UKESM-TOPAZ closely simulated the relationship between the El Niño-Southern Oscillation and chlorophyll concentration anomalies during the boreal winter. However, there were differences in the chlorophyll distributions in the eastern equatorial Pacific between the two models, which were due to dissolved iron, as this value was higher in UKESM-TOPAZ than in UKESM1. In a mean field analysis, the distributions of the major marine biogeochemical variables in UKESM-TOPAZ (i.e., nitrate, silicate, dissolved oxygen, dissolved inorganic carbon, and alkalinity) were not significantly different from those of UKESM1, likely because the models share the same initial conditions. Our results indicate that TOPAZ has a simulation performance that does not lag behind UKESM1’s basic biogeochemical model (Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration, and Acidification; MEDUSA). The UKESM-TOPAZ model can simulate the variability of the observed Niño 3.4 and 4 indices more closely than UKESM1. Thus, the UKESM-TOPAZ model can be used to deepen our understanding of the Earth system and to estimate ESM uncertainty.


Introduction
In the Earth system, the ocean is a significant factor that controls the climate (Reid et al. 2009;Pörtner et al. 2014;Xie 2020). For example, the El Niño-Southern Oscillation (ENSO), a climate phenomenon resulting from interactions between the ocean and atmosphere, has a major impact on global climate through teleconnections (Bjerknes 1969;McPhaden et al. 2007;Kim and Seo 2016;Yeh et al. 2018;Kim et al. 2020). The ocean plays a role in decelerating climate change by absorbing atmospheric heat and by storing greenhouse gases such as CO 2 (Reid et al. 2009;Meehl et al. 2011;Pörtner et al. 2014;Xie 2020). Further, ocean biogeochemical processes are major components of the Earth's carbon cycle and are important factors that affect climate by interacting with ocean physics or atmospheric chemical processes (Charlson et al. 1987;Reid et al. 2009;Park et al. 2014aPark et al. , b, 2015Kim et al. 2018). Atmospheric CO 2 Korean Meteorological Society absorbed by the ocean can be converted to O 2 through photosynthesis by phytoplankton, the ocean's primary producer, or can be stored in the ocean through the biological pump that controls the amount of CO 2 in the atmosphere (Reid et al. 2009). In addition, marine phytoplankton can change ENSO or Arctic warming through marine environmental and geophysical feedbacks (Park et al. 2014a(Park et al. , 2014b(Park et al. , 2015(Park et al. , 2017Kang et al. 2017;Lim et al. 2018) and can release dimethyl sulfide (DMS) into the atmosphere that acts as cloud condensation nuclei and can affect climate through cloud albedo feedback (Charlson et al. 1987;Kim et al. 2018). Thus, ocean biogeochemical processes can influence various climate-related factors, including ocean physics and atmospheric chemistry, through direct and indirect feedback mechanisms, thereby consequently affecting the climate.
Ocean biogeochemistry and climate feedback studies are mostly performed using Earth system models (ESMs). Recently, an increased understanding of various global climate processes has benefited studies of the direct and indirect effects of chemical and biological components on the climate. Simultaneously, computing resources have continued to improve, allowing for environments that can run models with high calculation costs. Consequently, the complexity of recently developed models has also increased. Existing climate models implement interactions between each component of the atmosphere, ocean, ice, and land surface within their physical processes. The latest ESMs also implement chemical components and their movement and circulation, as well as climate model components Swart et al. 2019;Sepulchre et al. 2020). Cycling processes implemented in ESMs include the carbon, iron, and nitrate cycles, as well as biogeochemical cycles with changing forms and energy transfer among atmospheric aerosols, chemical processes, land biogeochemistry, vegetation, and ocean biogeochemistry. Thus, the latest ESMs can simulate complex physical and biogeochemical processes, as well as their interactions with actual natural phenomena, in great detail.
Previous studies have confirmed that marine environmental simulation results obtained from ESMs, including ocean biogeochemical processes, vary from those of other models (Kang et al. 2017;Park et al. 2017Park et al. , 2019Lim et al. 2018;Ham et al. 2020). Park et al. (2014b) determined that the direct and indirect effects of chlorophyll affect the amplitude and asymmetry of ENSO in the Geophysical Fluid Dynamics Laboratory (GFDL) model results. In addition, using the Community ESM of the National Center for Atmospheric Research (NCAR CESM), Kang et al. (2017) showed that a model with chlorophyll feedback simulated the ENSO amplitude better than a model without this feedback. Nevertheless, models that consider marine biogeochemical processes still have varying uncertainties (Anav et al. 2013;Bopp et al. 2013;Yool et al. 2013;Frölicher et al. 2016;Lovenduski et al. 2016;Kwiatkowski et al. 2020). Considering the historical and Representative Concentration Pathway 8.5 (RCP8.5) scenarios of 14 Coupled Model Intercomparison Project Phase 5 (CMIP5) models, Hammond et al. (2020) found that the simulated regional surface chlorophyll trends differed widely among the models. Moreover, Bopp et al. (2013) and Frölicher et al. (2016) found that the influence of model uncertainty on O 2 and net primary production (NPP) simulations in the twenty-first century projections of the CMIP5 models increased after the mid-twenty-first century, suggesting that the multi-model ensemble should be expanded.
In fact, as CMIP5 progressed to CMIP Phase 6 (CMIP6), various ESMs that consider marine biogeochemical cycles were developed, including the UK ESM (UKESM1) developed by the UK Met Office (UKMO) , the IPSL-CM5A-LR (Dufresne et al. 2013) and the IPSL-CM6A-LR ) developed by the Institute Pierre Simon Laplace in France, the CNRM-ESM1 (Séférian et al. 2016) and the CNRM-ESM2-1 (Séférian et al. 2019) developed by the National Center for Meteorological Research (CNRM)-Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique (CERFACS), and the GFDL-ESM2 , GFDL-CM4 (Held et al. 2019), and the GFDL-ESM4.1  developed by GFDL. Marine biogeochemical models were developed for several ESMs as targets for the specific marine models used by each institution. For example, the marine models of CNRM-ESM2-1 and IPSL-CM6A-LR are the Nucleus for European Modeling of the Ocean (NEMO; Madec et al. 2017;Storkey et al. 2018) and use NEMO's basic ocean carbon cycle model, i.e., the Pelagic Interactions Scheme for Carbon and Ecosystem Studies volume 2 (PISCES-v2; Aumont et al. 2015) as an ocean biogeochemical model. Despite the diverse ESMs that have been developed, the combinations of marine models and biogeochemical models have been limited. Therefore, diversifying the present combinations of ocean models and ocean biogeochemistry models is required to better assess the links between climate and ocean biogeochemistry, as well as to make improved future predictions.
The two CMIP6 models, the Canadian ESM version 5 (CanESM5; Swart et al. 2019) and the CanESM5-Canadian Ocean Ecosystem model (CanESM5-CanOE; Swart et al. 2019), differ only in their implemented ocean biogeochemistry models, i.e., the Canadian Model of Ocean Carbon (CMOC) and CanOE, respectively. According to Gier et al. (2020), these two models have differences between their atmospheric CO 2 simulations, which suggest that differences in ocean biogeochemistry models may eventually yield differences in the interactions between systems and the simulated climate results. Cossarini et al. (2017) coupled the existing marine model from the Massachusetts Institute Korean Meteorological Society of Technology general circulation model (MITgcm) with the Biogeochemical Flux Model (BFM) using BFMCOU-PLER. Furthermore, GOTM-TOPAZ ) is a column ocean biogeochemical model that was developed by coupling the marine biogeochemical model Tracers of Phytoplankton with Allometric Zooplankton (TOPAZ) ) of the Modular Ocean Model (MOM) from the GFDL-ESM2 and the General Ocean Turbulence Model (GOTM), which is a one-dimensional ocean model. NEMO-TOPAZ (Jung et al. 2020) coupled the biogeochemical processes from TOPAZ with the three-dimensional NEMO marine model. Thus, new combinations of marine and marine biogeochemical models can be obtained by coupling previously developed models.
Accordingly, we developed a new ESM, named UKESM-TOPAZ, by replacing the Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration and Acidification (MEDUSA; Yool et al. 2013), which is an original marine biogeochemical model in UKESM1, with TOPAZ, wherein we coupled NEMO from UKESM1 with TOPAZ from GFDL. In this study, we illustrate the development of UKESM-TOPAZ and introduce some preliminary model results by evaluating marine biogeochemical variables. The composition of the paper is as follows. In Sect. 2, UKESM1 and TOPAZ are introduced, and Sect. 3 contains the description and research method of the developed UKESM-TOPAZ. In Sect. 4, we evaluated representative ocean biogeochemical variables, such as chlorophyll, nitrate, silicate, dissolved oxygen (DO), dissolved inorganic carbon (DIC), and alkalinity, and analyzed the relationship between ENSO and chlorophyll concentrations in the tropical Pacific simulated using the UKESM-TOPAZ model. In Sect. 5, the evaluation results are summarized and discussed.

Earth System Model: UK Earth System Model Version 1
The overall structure of the UKESM1 model is shown in  Archibald et al. 2020) is used for the atmospheric chemical compositions, while JULES is used as the terrestrial biogeochemistry model . MEDUSA is used as the ocean biogeochemistry model. The scientific configuration of the ocean model was the Global Ocean 6.0 (GO6.0), and the NEMO3.6_stable version code was used. The horizontal resolution was approximately 1° of eORCA1 (332 × 362 grid points), with approximately 1/3° of equatorial resolution. The vertical resolution was L75, with a layer spacing of ~ 1 m at the surface that gradually increased to ~ 200 m in the bottom layer (5902 m).
Compared with nutrient-phytoplankton-zooplankton-detritus (NPZD) models, the ocean biogeochemical model MEDUSA-2.0 has an "intermediate complexity" that divides phytoplankton classes into diatoms and non-diatoms, and considers various elements that limit their growth (Yool et al. 2011(Yool et al. , 2013. In addition, the model includes O 2 and CO 2 gas exchanges in the calculations, and the ocean carbon cycle is simulated by considering the variable C:N ratios in the phytoplankton, zooplankton, and detritus. The method developed by Anderson et al. (2001) was introduced in MEDUSA, thereby adding the calculation of ocean surface DMS. Thus, the MEDUSA model receives the partial pressure of CO 2 (pCO 2 ) and dust from the UM through NEMO for its calculations, while sending the calculated chlorophyll, CO 2 , and DMS concentrations to the UM. The UKESM1 model produces 20 MEDUSA outputs. Further details on UKESM1 and its components can be found in Sellar et al. (2019) and the studies of each component (Anderson et al. 2001;Best et al. 2011;Clark et al. 2011;Yool et al. 2013;Craig et al. 2017;Madec et al. 2017;Mulcahy et al. 2018;Ridley et al. 2018;Storkey et al. 2018;Andrews et al. 2019;Walters et al. 2019;Archibald et al. 2020).

Ocean Biogeochemistry Model: Tracers of Phytoplankton with Allometric Zooplankton Version 2
The TOPAZ code version 2 was coupled to the UKESM1 model, replacing MEDUSA (the existing biogeochemical model). TOPAZ was originally coupled to MOM, which is a marine model developed by GFDL that predicts 30 tracers and diagnoses 11 tracers while categorizing phytoplankton into small, large, or diazotrophic groups based on their shapes and sizes and implementing the circulation of various tracers (nitrogen, phosphorus, silicon, carbon, and O 2 ). The continuity equation for state variable C is calculated by the model as follows: where ̃ is the velocity vector, K is the diffusivity, and S C is the source minus sink term.
TOPAZ has already been coupled to GOTM, which is a one-dimensional column model, and NEMO, which is a global ocean model . In this study, as in , TOPAZ was separated from MOM5 and coupled to UKESM1 after adding the necessary routines. First, a module related to chlorophyll feedback (Manizza et al. 2005) was added, in which the ocean temperature changes with phytoplankton photosynthesis. Then, because several TOPAZ calculations require the surface flux supplied from the atmosphere to the ocean, we provided atmospheric chemistry values as climatological values instead of supplying them from an atmospheric model. Specifically, climatological values were used for the surface fluxes (Dunne 2012) of DIC, O 2 , nitrate, ammonia, alkalinity, lithogenic aluminosilicate, and iron deposition (FeD), which are required for sediment calcite cycling and external bottom flux calculations. Additionally, alkalinity was calculated using the nitrate climatological value of deposition, and the air-sea O 2 and CO 2 gas transfer processes were implemented. In this calculation, the atmospheric O 2 mole fraction and pCO 2 were fixed values, and the sea level pressure (SLP) was set as the climatology field.

UKESM-TOPAZ Version 1.0
The UKESM-TOPAZ version 1.0 (UKESM-TOPAZ) model was developed by combining the UKESM1 model with the TOPAZ code version 2. This model is a new ESM that considers marine primary production and chemical processes, unlike UKESM1 (Table 1 and Fig. 1). The u-bf935 suite (Byun 2020), which is a UKESM1 historical member implemented by the National Institute of Meteorological Science-Korea Meteorological Administration, was used as the experimental setup and source code of the UKESM1 model for developing the new UKESM-TOPAZ model. The TOPAZ code was independently compiled, then linked when NEMO was compiled. NEMO defines the C Pre-Processor key during the compilation stage to determine which calculation process to use in the source code. The existing UKESM1 model defines and uses key_medusa and the related keys to use the MEDUSA code. To turn off MEDUSA, the keys were deleted from the marine model's compilation configuration (fcm-make.cfg). Instead, key_topaz was newly defined to turn on the TOPAZ code, and code to use the TOPAZ module was added to the NEMO source code. The TOPAZ module obtains information on each component model required for calculation via NEMO, and then calculates and updates the calculated information back to NEMO. TOPAZ was initialized during the NEMO initialization stage and tracer calculations were performed in NEMO's Tracers in the Ocean Paradigm (TOP) module. The main calculations in the TOPAZ module, i.e., the optical feedback and column physics process, were performed in the trcsms module, which calculates the sources and sinks of the biogeochemical model; the results of which were then updated back to NEMO. The biogeochemical tracer transport calculation was performed in the trctrp module of TOP after trcsms was executed, and the next time step was calculated. Finally, the TOPAZ output was exported as a NetCDF formatted file in NEMO.
As the UKESM-TOPAZ and UKESM1 model exhibited some differences in the exchange variables between the ocean and atmosphere through biogeochemical processes (Table 1), partial modification of the code was necessary.
In the existing UKESM1 model, pCO 2 and dust are supplied from the atmosphere to the marine biogeochemical process, while chlorophyll, CO 2 , and DMS in the ocean affect the atmosphere. In contrast, the UKESM-TOPAZ version 1.0 model currently uses the necessary atmospheric chemical fluxes and SLP as climatological data and does not implement the biogeochemical supply process from the ocean to the atmosphere or the DMS concentration calculation. Hence, the NEMO and UM codes related to variables exchanged through the OASIS coupler and the Rose configuration file were modified according to the UKESM-TOPAZ model, and the namelists (namelist_ topaz_ref, namelist_topaz_cfg) that set the flux data required by TOPAZ were added.
The number of marine biogeochemical variables simulated in the MEDUSA module increased from 20 (UKESM1) to 41 (UKESM-TOPAZ). The current version of UKESM-TOPAZ did not calculate the parameters related to the CMIP6 protocol, including ideal tracer concentrations, CFC-11, CFC-12, sulfur hexafluoride (SF6), and seawater age since surface contact. Therefore, the XIOS namelist (iodef.xml, field_def_bgc.xml) related to output production and the Rose configuration file related to the NEMO tracer namelist (namelist_top_cfg) were modified (refer to Supplementary information and Code availability for information on other modifications, input data, and namelists related to the combination of these modules).

Experimental Setup
In this study, we compared the major biogeochemical variables simulated using UKESM-TOPAZ and UKESM1 with observational data to evaluate the effect of replacing the marine biogeochemical model MEDUSA with TOPAZ. For the initial conditions used in the UKESM-TOPAZ model, except the marine biogeochemistry module, we used the January 1, 2565 data from the piControl experiment, which is the initial condition of the UKESM1 model. The TOPAZ initial conditions were composed of two data types. First, the initial conditions of eight variables that UKESM1 and UKESM-TOPAZ commonly simulate [nitrate, silicate, dissolved iron (DFe), DIC, detrital nitrate, alkalinity, DO, and chlorophyll] were obtained from UKESM1's piControl data in 2565. For chlorophyll, a field obtained by summing the diatom and non-diatom chlorophyll values was used. Second, 33 tracers, including phosphate and ammonia, used NEMO-TOPAZ's spin-up data in 3000. The biogeochemistry field of the spin-up run for the NEMO-TOPAZ model was initialized using data provided by MOM5, and CORE-II (Griffies et al. 2009) climatological data were used for forcing. As this model was performed at the resolution of ORCA2 and L31, data were interpolated to the eORCA1 and L75 resolution.
Surface fluxes, including those of nitrate, ammonia, lithogenic aluminosilicate, and FeD, required for TOPAZ calculations were interpolated to the ORCA1 resolution from the climatological data provided by the TOPAZ module. The 10 m wind, SLP, O 2 mole fraction, and pCO 2 were used to calculate the O 2 and CO 2 air-sea interactions in TOPAZ. The 10 m wind speed was calculated using the atmospheric model data, with the CORE-II climatological data as the SLP (Griffies et al. 2009). Moreover, the atmospheric O 2 mole fraction and pCO 2 used in the TOPAZ calculations were fixed as 0.21 and 286 ppm, respectively. Throughout the entire period of the CMIP6 historical simulation , an experiment on the initial 70 years (1850-1919) was performed. The first 20 years (1850-1869) were considered to be a stabilization period in the upper ocean; thus, the results for the last 50 years  were used for analyses. Parameters related to basic ocean production and carbon cycling (i.e., chlorophyll, nitrate, silicate, DO, DIC, and alkalinity) were analyzed, and their horizontal and vertical distributions and errors were thoroughly checked. The ocean areas defined for the vertical distribution analyses are shown in Fig. 2. Further, the simulated chlorophyll values in the equatorial Pacific were analyzed in detail. Variables including DFe, sea temperature, zonal current, and vertical current were also assessed, and the sensitivity of the chlorophyll concentrations to DFe and nitrate concentrations was tested using the GOTM-TOPAZ model ( Table 2).
The model was run using the Cray XC40 supercomputing environment at the Korea Meteorological Administration (National Center for Meteorological Supercomputer), with model execution times of ~ 630 and ~ 560 min for the UKESM-TOPAZ and UKESM1 models, respectively, when performing one model year using 720 cores (Table 1).
Multiple observational datasets were used to compare the simulation performances of the UKESM-TOPAZ and UKESM1 models. Observational data from different periods and the model simulation results were compared directly, as in previous studies that analyzed simulated marine biogeochemical variables (Yool et al. 2011Dunne et al. 2013;Held et al. 2019 (Garcia et al. 2018a(Garcia et al. , 2018b were used, with a horizontal data resolution of 1° × 1° and 102 standard vertical levels from the surface to 5500 m. For the DIC and alkalinity, the GLobal Ocean Data Analysis Project version 2 (GLODAPv2) (Lauvset et al. 2016) dataset was used, with a horizontal resolution of 1° × 1° and 33 vertical layers from 0 to 5500 m. The NOAA Extended Reconstructed Sea Surface Temperature version 4 (ERSSTv4) (Huang et al. 2015) dataset was used for sea surface temperatures (SSTs), with a horizontal resolution of 2° × 2°.

Chlorophyll
In the ocean, the primary producers are phytoplankton that greatly influence the entire food web and can affect the physical environment by changing the ocean color through photosynthesis, which is an important factor in the Earth system (Park et al. 2014a(Park et al. , b, 2015. Figure 3 shows the differences in chlorophyll concentrations (a proxy for phytoplankton biomass) simulated for 1870-1919 using the UKESM-TOPAZ and UKESM1 models, as well as the SeaWiFS satellite data from 1998 to 2008. Considering that the SeaWiFS data include chlorophyll concentrations  at the sea surface and in the upper part of the mixed layer, the average of 12 vertical levels from 0 to 20 m was used (Jochum et al. 2009;Park et al. 2014b;Jung et al. 2020). The models correctly simulated the chlorophyll concentrations, which are high at the equator and high latitudes and low in the subtropical gyres (Fig. 3a, c, d). The latitudinal averaged chlorophyll concentrations were overestimated by both models in the lower-middle latitudes of the Southern Hemisphere (0-60° S), and underestimated in the middle latitudes of the Northern Hemisphere (30-60° N) (Fig. 3b). Further, both models simulated chlorophyll concentrations that were higher than the observed data in the eastern equatorial Pacific and in the Antarctic Circumpolar Current, and significantly underestimated the high chlorophyll concentrations along the continental coasts at high latitudes (above 70° S and 80° N) (Fig. 3e, f). Similarly, on average, overestimates at high latitudes in the Southern Hemisphere and underestimates at high latitudes in the Northern Hemisphere were indirectly confirmed in an ocean primary production analysis of the CMIP5 ESMs (Anav et al. 2013), which may be caused by inaccuracies in the models and observational data at high latitudes (Gregg and Casey 2004;Moore et al. 1999Moore et al. , 2013, by model inaccuracy for the iron cycle in ironlimited ocean regions including the sub-Arctic Pacific and Southern oceans (Williams and Follows 2011;Andrew et al. 2019), or by model underestimation of nutrient supplies, which play an important role in phytoplankton blooms along the continental coasts (Vichi and Masina 2009). Note that the chlorophyll results compared with the SeaWiFS data at high latitudes (≥ 60°) may be affected by the lack of observations at high latitudes during the winter.
In the eastern equatorial Pacific, with strong air-sea interactions, chlorophyll feedback affects ENSO (Park et al. 2014a, b;Kang et al. 2017;Lim et al. 2018), which subsequently affects global climate by changing the atmospheric pressure anomalies, wind circulation, and SST (Bjerknes 1969;McPhaden et al. 2007;Kim and Seo 2016;Yeh et al. 2018;Kim et al. 2020). Therefore, differences in chlorophyll feedbacks in this ocean area between the models, owing to the different biogeochemical models used, may cause variability in the simulated global climate results. Hence, differences between the models in the equatorial Pacific chlorophyll simulation results and the interactions between chlorophyll and ENSO were analyzed in further detail. Figure 4 shows the differences in the chlorophyll concentrations between the two models and the most likely associated variables in the equatorial Pacific (5° S-5° N). Both models exhibited the highest chlorophyll concentrations at 30-50 m depth in the eastern equatorial Pacific (Fig. 4a, b). However, in the vertical structure leading to the western equatorial Pacific, where high chlorophyll concentrations occur in the eastern equatorial Pacific, the UKESM-TOPAZ model led to the deep western Pacific, whereas UKESM1 led to the surface. Therefore, a characteristically different structure was produced (Fig. 4c) compared to the UKESM1 simulations, wherein the UKESM-TOPAZ model overestimated the chlorophyll concentrations from the depths of the western equatorial Pacific to the eastern equatorial Pacific surface.
Differences in the vertical distributions of chlorophyll between the two models were caused by differences in the DFe concentrations and the physical environment. The UKESM-TOPAZ model, which simulated higher chlorophyll concentrations, exhibited higher DFe concentrations than those in the UKESM1 model (Fig. 4d-f), particularly in the western Pacific where the concentrations were almost doubled. This relationship between DFe and chlorophyll concentrations is consistent with DFe acting as a limiting factor for phytoplankton growth in the equatorial Pacific (Aumont and Bopp 2006;Schneider et al. 2008;Ito et al. 2019;Hamilton et al. 2020). The impact of DFe is important in the euphotic zone, which receives abundant light for phytoplankton to photosynthesize. Thus, although the differences in DFe between the two models were larger in the deeper layers than in the shallower part of the water column (Fig. 4f), the differences in the chlorophyll concentrations were large from the surface to 100 m depth (Fig. 4c). The differences in the other ocean biogeochemical variables between the two models were insufficient to explain the differences in the chlorophyll concentrations. At depths of ~ 50-250 m, the UKESM-TOPAZ ocean temperature was higher and lower than that of UKESM1 in the western and eastern equatorial Pacific, respectively (Fig. 4g). Thus, the east-west slope of water temperature in the UKESM-TOPAZ model was larger than that of the UKESM1 model. As the hydrostatic gradient from west to east and upwelling in the eastern Pacific were reinforced, the Cromwell Current, i.e., the undercurrent in this ocean area, was stronger in the UKESM-TOPAZ model than in the UKESM1 model (Fig. 4h, i). Therefore, overestimated ocean currents may have transported high DFe concentrations from the western to the eastern equatorial Pacific, resulting in high chlorophyll concentrations. Coinciding with the differences in the east-west hydrostatic pressure gradient, the trade winds in the UKESM-TOPAZ model were also stronger than those in the UKESM1 model, which indirectly indicates that differences in ocean biogeochemical processes cause differences in ocean physics and biogeochemical environments, potentially affecting atmospheric elements such as winds.
A sensitivity test using the GOTM-TOPAZ model was performed to determine if the differences in DFe in the equatorial Pacific contributed the most to the differences in the chlorophyll concentrations, as suggested by Fig. 4. The experiment used different initial conditions for DFe and nitrate concentrations (Table 2), which mainly affect chlorophyll productivity (Aumont and Bopp 2006;Schneider et al. 2008;Ito et al. 2019;Hamilton et al. 2020). The initial conditions of the other variables (excluding DFe and nitrate) for all of the experiments were set to the equatorial Pacific environment (5° S-5° N, 140° W) averaged through the 1870-1919 UKESM-TOPAZ simulations, while the initial conditions of DFe and nitrate were set as the outputs of the UKESM-TOPAZ and UKESM1 models, respectively. First, the control experiment (CTL) used DFe and nitrate simulated by UKESM-TOPAZ, after which experiments were performed wherein the DFe, nitrate, or DFe and nitrate values were modified to the outputs produced by UKESM1 (i.e., the MEDUSA model outputs DFe_M, NO3_M, and DFeNO3_M, respectively). The initial DFe and nitrate concentrations from the UKESM-TOPAZ output were approximately 69% more and 14% less than those obtained from the UKESM1 model, respectively. The model was integrated for 9 days, and the average during the last 3 days was obtained (Fig. 5). Using the UKESM-TOPAZ DFe concentrations, the 20-50 m chlorophyll concentration was 6% higher in the CTL and NO3_M than when the UKESM1 model DFe were used (DFe_M and DFeNO3_M). Differences in the nitrate concentrations had little effect on the chlorophyll concentrations. Moreover, a sensitivity experiment was conducted wherein the atmospheric forcing and marine physical environment were changed to the values from each model, but the effects on chlorophyll concentrations were not significant. Overall, the results indicate that chlorophyll concentrations in the upper equatorial Pacific in the model are greatly affected by DFe (Fig. 4), which is consistent with previous findings regarding the eastern equatorial Pacific being an iron-limited high-nutrient, low-chlorophyll (HNLC) ocean area (Aumont and Bopp 2006;Schneider et al. 2008;Williams and Follows 2011;Ito et al. 2019;Hamilton et al. 2020).
We also determined that the UKESM-TOPAZ model can adequately simulate the negative correlation between chlorophyll concentrations and ENSO during the boreal winter (December-February; Fig. 6). The inter-annual variability of the Niño 3.4 index and chlorophyll averaged from 5° S  , e); c, f differences in chlorophyll and Dfe concentrations between the two models. Differences between the physi-cal ocean variables in the two models (UKESM-TOPAZ minus UKESM1), i.e., g sea temperature (°C), h vertical current (m s −1 ), and i zonal current (m s −1 ). Temp temperature, vert vertical, curr current to 5° N in the equatorial Pacific indicate that chlorophyll concentrations decreased during El Niño periods when the Niño 3.4 index was located east of the 180° line, compared with other periods. Conversely, the chlorophyll concentrations increased during La Niña periods when the Niño 3.4 index was west of the 180° line. During El Niño periods, upwelling is weakened and nutrient supplies decrease, resulting in lower phytoplankton biomass. Meanwhile, phytoplankton growth increases during La Niña periods owing to enhanced upwelling and increased nutrient supplies. The inverse relationship between the Niño index and chlorophyll concentrations were therefore adequately simulated by both models.
We then analyzed the regression coefficient of the chlorophyll anomaly and the sign-reversed Niño (− Niño) 3.4 index during the boreal winter (Fig. 7). The regression coefficient was high from 5° S to 10° N and from the eastern Pacific to 150° E. Both models exhibited regression coefficients in the western (120° E-180°) and eastern equatorial Pacific that were larger than those observed. Moreover, in the Niño 3.4 region (5° S-5° N, 120-170° W), the regression coefficient of the UKESM-TOPAZ model was higher than that of the UKESM1 model.
The − Niño indices and average chlorophyll anomalies during the winter (December-February) are shown in Fig. 8 (dashed and solid lines, respectively). As shown  in Fig. 6, both models exhibited an inverse relationship between increasing chlorophyll concentrations and decreasing SSTs, and the correlation coefficients were ≥ 0.93 in all ocean areas. These values were higher than the correlation coefficients between the observed SeaWiFS and ERSSTv4 datasets, which were 0.86, 0.95, and 0.86 in the Niño 3, 3.4, and 4 regions during 1998-2009, respectively. Comparing the two models, the UKESM1 correlation coefficient was higher than that of UKESM-TOPAZ (averages of 0.97 and 0.95, respectively). The chlorophyll standard deviation in the model was > 2.0 times greater than that of the observed value (Table 3), while there was great variability in the UKESM-TOPAZ Niño 3 and Niño 3.4 area values and the UKESM1 Niño 4 area value. The SST standard deviations of UKESM1 in the Niño 3 area, as well as those of UKESM-TOPAZ in the Niño 3.4 and Niño 4 areas, were the most similar to the observed data.
Although the UKESM-TOPAZ and UKESM1 models use the same chlorophyll initial field, the marine environment and the chlorophyll concentrations differed from the beginning of the integration, which are may be related to the large differences in the amount of DFe in seawater, because the UKESM-TOPAZ model prescribing the FeD as climatological data, whereas the UKESM1 model used dust deposition from the atmospheric model. However, despite these differences, it is remarkable that the inverse relationship between chlorophyll and the Niño 3.4 index in the equatorial Pacific was reproduced by both models.

Nitrate and Silicate
Phytoplankton require various nutrients, including nitrate, phosphate, and silicate, for growth (Myers and Iverson 1981;Rhee and Gotham 1981;Smith 1984;Egge and Aksnes 1992;Nelson and Tréguer 1992;Hoffmann et al. 2008). Given that nutrients are essential for phytoplankton growth and can affect primary productivity (Pasquero et al. 2005), the UKESM-TOPAZ simulated nitrate and silicate concentrations from 1870 to 1919 were compared with those obtained from the UKESM1 model and with the WOA18 climatological data. According to the nitrate concentration distributions and errors (Fig. 9), surface nitrate concentrations in the eastern equatorial Pacific and high-latitude regions were higher than those in other ocean areas (Fig. 9a,  d), and the simulated nitrate distribution was similar to the observed distribution. The patterns of the surface and vertical errors from the two models were similar (Fig. 9b, c, f, g, j, k). However, concentration differences between models are gradually increased over time, but not their vertical structures. According to the surface error distribution and zonal average concentrations (Fig. 9b-d), the model overestimated the surface nitrate concentrations. At the sea surface, the values obtained from both models were similar, or those from UKESM-TOPAZ were lower than those from UKESM1 ( Fig. 9b-d, f-h, j-l).
Vertical distributions were analyzed by averaging over the Pacific and Atlantic regions (Fig. 2), wherein the Pacific (Fig. 9e), exhibited high nitrate concentrations in the North Pacific Deep Water (NPDW). Considering the entire Pacific, the model overestimated the average nitrate profile at a depth of 1300 m, and underestimated it in deeper regions (Fig. 9h). In the Pacific, the vertical structure of the nitrate error was overestimated at 0-1000 m depths in the Southern Hemisphere, and underestimated it at > 1000 m depths in the Northern Hemispheres (Fig. 9f, g). These results indicate that the errors are related to the Antarctic Intermediate Water (AAIW) and NPDW simulations, respectively. Moreover, observations in the Atlantic exhibited high and low nitrate concentrations along the AAIW and the North Atlantic Deep Water (NADW) in the Southern and Northern Hemispheres, respectively (Fig. 9i). In the mean profile of the area-averaged Atlantic, the model tended to underestimate the observations (Fig. 9l). The underestimation error in the 500-1500 m layer at 30° N from the Southern  Hemisphere seems to be related to the AAIW simulation, whereas the error observed in the 500-3000 m layer in the Northern Hemisphere seems to be related to the NADW simulation. Silicate concentrations in the equatorial Pacific, Antarctic, and Northern Hemisphere high latitudes were higher than those in the subtropical gyres (Fig. 10a), and the surface silicate error distributions for the two models were similar (Fig. 10b, c), with surface silicate concentrations being underestimated overall (excluding those in the Southern Ocean). Vertical errors were underestimated and overestimated in the northern Pacific and at ~ 1000 m depth in the southern Pacific, respectively (Fig. 10f, g). Moreover, in the Atlantic, the models underestimated silicate concentrations in the Northern Hemisphere and apparently overestimated them at depths of ~ 500-1000 m in the Southern Hemisphere (Fig. 10j, k). Similar to nitrate concentrations, the vertical error structure of silicate was related to water mass simulations, including those of AAIW, NPDW, and NADW.
In contrast to the significant differences in simulated chlorophyll concentrations between the two models discussed in Sect. 4.1, nutrients did not exhibit significant differences during the analytical period, and the structural errors were confirmed vertically. In previous studies, the simulated AAIW and NADW in HadGEM (the physical model of UKESM1) and UKESM were underestimated (Zhu et al. 2018;Kuhlbrodt et al. 2018;Heuzé 2020). This simulation error also caused errors in the concentrations and behaviors of the tracers simulated by the model, and is likely reflected in the initial conditions of the UKESM1 model. The effect of this initial condition was maintained for approximately 70 years, which comprised the model integration period; as a result, the vertical errors of the simulated tracers in the UKESM-TOPAZ and UKESM1 models exhibited similar structures.

Dissolved Oxygen
DO is an important factor in marine biogeochemistry, as it is related to air-sea gas exchanges, phytoplankton photosynthesis, respiration of marine life, and deep water mass formation at high latitudes (Roy-Barman and Jeandel 2016). The simulated DO results were compared with the WOA18 climatological data and the UKESM1 results (Fig. 11). The observational data indicate that surface DO concentrations were low and high in the warm pool and at high latitudes, Korean Meteorological Society respectively (Fig. 11a), which was simulated well by the model (Fig. 11d). The UKESM-TOPAZ model tended to underestimate the overall surface DO concentrations, while the UKESM1 model tended to overestimate them ( Fig. 11b-d). The UKESM-TOPAZ model yielded zonally averaged surface DO values that were closer to the observations than the UKESM1 model (Fig. 11d). According to the vertical error distributions, the DO errors observed in the Pacific and Atlantic (Fig. 11f, g, j, k) were the opposite of the nitrate errors (Fig. 9f, g, j, k). In the Pacific, the DO error distribution (Fig. 11f, g) indicates that the DO concentrations in the North Pacific Intermediate Water (NPIW) and the AAIW were lower than the observed values, while the DO concentrations in the NPDW (deeper compared to the NPIW and AAIW) were higher than the observed values. Considering the mean DO profile (Fig. 11h), the average concentration was simulated up to 500 m in the upper layer, where the effects of these water masses cancel each other out, and was similar to the observed value; however, the concentrations were overestimated in the deeper layer where the NPDW is mainly located. In the Atlantic, the AAIW was simulated weakly, and the DO concentrations in the oxygen minimum zone were relatively high (Fig. 11j-l).

Dissolved Inorganic Carbon and Alkalinity
As DIC and alkalinity are the major parameters of marine biogeochemical processes that are related to the carbon cycle (Egleston et al. 2010;Bates et al. 2014), we compared their results during 1870-1919 using the two models with the observed climatological data from . Surface DIC had higher concentrations in the Antarctic, Atlantic, and Arctic Oceans than in the other oceans (Fig. 12a), and the models exhibited positive errors near the equator and in the surface layer of the Antarctic Ocean, while negative errors were obtained in the subtropical gyres (Fig. 12b, c). The distribution of zonally averaged DIC values obtained from the models was largely in agreement with that of the observations (Fig. 12d). Similar vertical distributions were obtained in the two models, overestimating on average in the Pacific (Fig. 12h) and underestimating in the upper Atlantic at depths shallower than 2500 m (Fig. 12l). Moreover, the vertical error structures (Fig. 12f, g, j, k) were similar to those of nutrients (Figs. 9f,g,j,k and 10f,g,j,k). Unlike UKESM1, the UKESM-TOPAZ model does not implement CO 2 exchanges with the atmospheric model, but  Fig. 2 the error trends in the integration results did not exhibit significant differences between the models for approximately 70 years, as the initial conditions appear to have a large effect.
According to the alkalinity analysis (Fig. 13), the observations generally showed high alkalinity in subtropical surface waters (Fig. 13a), and both models overestimated the alkalinity along the equator in the Indian, Pacific, and Atlantic oceans compared with the observed values (Fig. 13b, c). The zonally averaged alkalinity was overestimated in the models from the Southern Hemisphere to approximately 30° N and underestimated from 40° to 70° N (Fig. 13d). As in the DIC errors, the errors of the two models were similar, tending to overestimate alkalinity except in several areas of the surface layer ( Fig. 13f-h, j-l). In the Pacific, the error magnitude increased (on average) with depth. In the Atlantic, the depth of the lowest layer was deeper than that of the observations (approximately ≥ 1500 and 700 m, respectively), which may be related to the weakly simulated NADW in the Atlantic, as suggested by the nutrient analysis (Sect. 4.2).

Summary and Discussion
In this study, we described the development of a new ESM, the UKESM-TOPAZ, by coupling the existing UKESM1 model with the marine biogeochemical model TOPAZ. Moreover, the preliminary results of this biogeochemical integration of the UKESM-TOPAZ model were compared with those of the UKESM1 model and observational data.
Essential calculation processes were added by separating the TOPAZ module from the MOM5 model. After compiling the TOPAZ code independently, it was linked to NEMO when the UKESM code was compiled. This linking method was similar to the method developed for the GOTM-TOPAZ ) and the NEMO-TOPAZ (Jung et al. 2020) models. Currently, exchanges of CO 2 , chlorophyll, DMS, or dust between the atmospheric and marine biogeochemical processes have not been implemented, and several atmospheric chemical fluxes (such as FeD) used in the TOPAZ calculations utilize climatological values. These limitations will require improvements in future versions of the model. The initial conditions of the model employed the piControl of the UKESM model in the year 2565, which are the initial conditions of the existing UKESM model, while the TOPAZ marine biogeochemical parameters, which are not provided by UKESM, were obtained from the NEMO-TOPAZ spin-up data over 3000 years. The CMIP6 historical experiments were performed and integrated from 1850 to 1919, and data from the last 50 years (1870-1919) of the integration period were used for analysis.
Chlorophyll, nitrate, silicate, DO, DIC, and alkalinity were evaluated. Chlorophyll is related to the primary productivity of the ocean, and the observed chlorophyll distribution was reproduced well by the UKESM-TOPAZ model. Compared with the UKESM1 results, the differences in chlorophyll distributions in the eastern equatorial Pacific were clear from the beginning of the integration, even though the two models used the same initial field. In the eastern equatorial Pacific, the chlorophyll concentrations obtained from the UKESM-TOPAZ model were higher than those obtained from UKESM1 at the upper boundary of the enhanced Cromwell Current, which is in an iron-limited area. A vertical distribution analysis and GOTM-TOPAZ sensitivity experiments indicate that the DFe concentrations simulated by UKESM-TOPAZ were higher than those simulated by UKESM1; thus, the chlorophyll concentrations were also higher. Meanwhile, the difference in the east-west hydrostatic pressure gradient of the UKESM-TOPAZ model was larger; thus, the Cromwell Current and upwelling were stronger and more widespread than in the UKESM1 model, which strengthened the DFe supply. The two models used the same initial DFe conditions, but differed in the way that FeD was prescribed from the atmosphere. The UKESM-TOPAZ model used the climatological field, while UKESM1 used the dust calculated from the atmospheric chemistry model. Therefore, there was a difference in the soluble iron flux, which resulted in relatively high concentrations of chlorophyll in the UKESM-TOPAZ model.
The results demonstrate that differences in marine biogeochemical processes can create changes in the feedbacks between the marine biogeochemical processes and climate; consequently, climate predictions can differ widely. The models overestimated the chlorophyll and SST variabilities, and the variabilities of the Niño 3.4 and Niño 4 indices in the UKESM-TOPAZ model were closer to the observed values than those of UKESM1. Regarding the correlation between the Niño indices and chlorophyll during the boreal winter, phytoplankton decreased during El Niño periods, but increased during La Niña periods (Park et al. 2014a). Both models simulated this inverse relationship between the Niño indices and chlorophyll in the eastern equatorial Pacific. Thus, the UKESM-TOPAZ model can be used for chlorophyll-climate feedbacks or for interrelationship studies involving ENSO.
The results of the mean field analyses of nitrate, silicate, DO, DIC, and alkalinity in the UKESM-TOPAZ model did not differ significantly from those of UKESM1. The surface distribution patterns of the biogeochemical variables were simulated well by both models, which also exhibited similar error trends for the observations in the vertical distributions. As the differences in the error distributions between the models were not large, the influence of the initial conditions was suggested to be large throughout the analytical period. We estimated that characteristics of the UKESM1 model occurred because the UKESM1 spin-up data were used as the initial conditions for common variables and because the UKESM-TOPAZ model did not spin up separately. However, concentration differences between the models, but not the structures, gradually increased over time, particularly those of nitrate and DO. In the Pacific, the errors were large at depths of 500-1000 m in the Southern Hemisphere and at depths of over 1000 m in the Northern Hemisphere. In the Atlantic, the error gradually expanded from the depth of the Southern Hemisphere to the north, and the opposite phase error was observed at a depth of 2500-3000 m. This vertical error structure was related to HadGEM3-GC3.1, which is a physical model in UKESM1 that underestimated the intensities of major water masses, such as the AAIW, NPDW, and NADW (Zhu et al. 2018;Kuhlbrodt et al. 2018;Heuzé 2020).
As the biogeochemical model was changed from MEDUSA to TOPAZ, chlorophyll and various biogeochemical processes also changed. The UKESM-TOPAZ model exhibited different simulation aspects regarding chlorophyll concentrations, while similar interactions were observed between chlorophyll concentrations and ENSO compared to those of the UKESM1 model. The chlorophyll concentrations related to NPP and the diversity of NPP between ESMs was established in previous studies. Bopp et al. (2013)  with GLODAPv2. h, l Vertical profiles of the average alkalinity concentrations using GLODAPv2 (black line), UKESM-TOPAZ (red line), and UKESM1 (blue line). The Pacific and Atlantic regions in e-l were defined as shown in Fig. 2 showed that the inter-model standard deviation of NPP between the future RCP 8.5 scenarios of the CMIP5 models was 7.5% in the 2090s, which is relatively large considering the change in the mean NPP was − 8.6% during the 2090s (compared to the 1990s). Moreover, Frölicher et al. (2016) showed that the uncertainty in NPP projections is governed by the model uncertainty (> 40%), rather than the internal variability or the scenario uncertainty. Although most of the CMIP6 models exhibited slightly improved NPP simulations during 1998-2014 compared with the CMIP5 models , large inter-model uncertainties were still present Kwiatkowski et al. 2020). As suggested by Bopp et al. (2013) and Frölicher et al. (2016), inter-model comparisons and improved multi-model projections are required to understand the inter-model uncertainty. The UKESM-TOPAZ and UKESM1 models use different ocean biogeochemical models, while other Earth system components are the same. Thus, the UKESM-TOPAZ model can increase the number of ESMs, and can contribute to the understanding of inter-model uncertainty caused by different biogeochemical models in historical and future warming scenarios by comparing its results with those of the UKESM1 model.
In the CMIP6 Ocean Model Intercomparison Project protocol, it is recommended to spin up the ocean biogeochemical field for a period of over 2000 years (Orr et al. 2017). The model developed in this study may also require a longer integration time than the analytical period to re-equilibrate the marine environment. Therefore, it is necessary to spin up the UKESM-TOPAZ model through the piControl experiment. In addition, it is possible that the initial conditions of the variables that are not simulated by MEDUSA, but are only simulated by TOPAZ, indirectly caused the differences in the simulated results between the two models. The model results may also differ depending on how the variables that interact with the atmospheric model (such as FeD) are considered. In the future, feedback processes such as chemical exchanges with the atmospheric model (i.e., dust and CO 2 ) will be implemented in the UKESM-TOPAZ model, and the CMIP6 variables related to subsurface ventilation or water mass ages (e.g., CFC-11, CFC-12, and SF6;Orr et al. 2017) need to be added. In addition, the ocean physical and atmospheric environments also differed between the two models, as mentioned in Sect. 4.1. As observed by Park et al. (2014b), the difference in the chlorophyll concentrations can change the SST, trade winds, thermocline depth, and upwelling in the equatorial Pacific through direct and indirect feedbacks. Therefore, these differences should be further analyzed in future studies. Eventually, this newly developed UKESM-TOPAZ ESM is expected to contribute considerably to furthering our current understanding of uncertainties in climate simulations across models, and can be used in a variety of studies regarding Earth system interactions, including studies of the correlations between ENSO and chlorophyll feedbacks.