1 Introduction

Data assimilation, which combines observations and numerical models to obtain the statistically best state estimate, has long been recognized as a useful tool for wide-ranging purposes in geophysics such as meteorology and oceanography. One of the major targets of data assimilation is “reanalysis,” which aims at reconstructing a long-term historical dataset for the atmosphere, ocean and so on. In meteorology, various atmospheric reanalysis datasets have been produced (e.g., Kalnay et al. 1996; Uppala et al. 2005; Onogi et al. 2007) and have been extensively utilized in a variety of research fields of not only meteorology but also climate study and oceanography.

Ocean reanalyses have also become available along with advancements of ocean observing systems such as remotely sensed and in situ observing platforms in recent decades. One of the most important attempts in ocean reanalysis study is the production of global ocean reanalysis that covers a period of multiple decades for climate research, and such datasets are available from operational centers and some research groups (e.g., Scott et al. 2007; Carton and Giese 2008; Köhl and Stammer 2008; Saha et al. 2010; Storto et al. 2011; Balmaseda et al. 2013; Toyoda et al. 2013; Fujii et al. 2015; Osafune et al. 2015). Due partly to the size of the problem and technical limitations, relatively low-resolution ocean models are usually employed in these global reanalyses. Furthermore, a multi-model ensemble approach has been proposed recently in order to construct a more accurate dataset (Toyoda et al. 2015b; Han et al. 2015). State-of-the-art global reanalysis datasets are summarized in Balmaseda et al. (2015).

In addition to the enhancement of ocean observing systems, recent advances in computing capability enable us to perform long-term basin-scale reanalyses at eddy-resolving resolutions (e.g., Miyazawa et al. 2009; Moore et al. 2011; Oke et al. 2013), which give a realistic representation of mesoscale variability as well as variations in large-scale circulation. The reanalysis period for the high-resolution datasets is, however, limited to the satellite altimeter era (since 1993) because realistic reproduction of actual mesoscale features such as meanders in a jet and mesoscale eddies strongly depends on satellite altimeter data (Oke and Schiller 2007). Therefore, producing long-term high-resolution reanalysis data over a period longer than the altimeter era is challenging, and at present there is no such dataset for the western North Pacific, which is the main target region in this study.

The western North Pacific is characterized by the existence of strong western boundary currents, the Kuroshio and Oyashio. Their variations accompany energetic variabilities in the associated mesoscale features such as eddies and meanders. Another prominent feature in this region is the existence of low-frequency variability on interannual to decadal time scales (e.g., Qiu 2003; Ohshima et al. 2006; Taguchi et al. 2007; Na et al. 2012; Usui et al. 2013; Kuroda et al. 2015), which has a strong connection with changes in atmospheric forcing. Accumulated evidence shows that the low-frequency variability is accompanied by modulation of the mesoscale variability. This region is also characterized by its marginal seas such as the East China Sea, the Japan Sea and the Okhotsk Sea, which connect with adjacent seas through narrow straits on the order of tens of kilometers in width. Therefore, eddy-resolving resolution with a typical horizontal grid spacing of about 1/10° is indispensable for realistic representation of ocean processes in the western North Pacific.

Strong sea surface temperature (SST) fronts associated with the western boundary currents exhibit an extremely large heat exchange between the ocean and atmosphere in the western North Pacific (Kida et al. 2015). This area, hence, is believed to be a key area for understanding mid-latitude air-sea interaction (e.g., Hirose et al. 2009; Taguchi et al. 2009; Kelly et al. 2010; D’Asaro et al. 2011; Nakamura et al. 2012; Sasaki et al. 2012). Moreover, the area is well known to contain some of the most productive fishing grounds supported by their high primary production, and variations in the western boundary currents at both the large- and meso-scales have a large impact on the area’s marine ecosystems and fisheries (e.g., Miller et al. 2004; Nishikawa et al. 2011). Therefore, a long-term high-resolution reanalysis dataset for the western North Pacific would be a great benefit for various research fields such as oceanography, meteorology, climate study and fisheries. In this study, we produce a long-term ocean reanalysis dataset for the western North Pacific using an advanced ocean data assimilation system.

The Meteorological Research Institute (MRI) of the Japan Meteorological Agency (JMA) developed a data assimilation system with a multivariate three-dimensional variational (3DVAR) analysis scheme, namely the MRI Multivariate Ocean Variational Estimation system, MOVE-3DVAR (Usui et al. 2006a). The western North Pacific version of MOVE-3DVAR with an eddy-resolving ocean model was implemented at JMA in 2008 and has been used for monitoring and forecasting the ocean state around Japan. The four-dimensional variational (4DVAR) analysis scheme version of the MOVE system (MOVE-4DVAR) was recently developed, and it was demonstrated that MOVE-4DVAR provides better representation of mesoscale features compared to MOVE-3DVAR (Usui et al. 2015).

Running a 4DVAR assimilation system to produce a long-term ocean reanalysis with the eddy-resolving configuration is still challenging even on current high-performance computer systems because of its high computational cost. Owing to the update of the Earth Simulator at the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) in spring 2015, we now have enough computing resources to perform a high-resolution, long-term reanalysis experiment with the 4DVAR system. Using this updated Earth Simulator, we conducted the reanalysis experiment and produced the new regional ocean reanalysis dataset, FORA-WNP30: Four-dimensional variational Ocean Re-Analysis for the Western North Pacific over 30 years.

FORA-WNP30 is produced for use in various research fields, not only oceanography but also meteorology, climate study and fisheries. The purpose of this article is, therefore, to describe the FORA-WNP30 dataset in detail and to document its basic performance. Configurations of the data assimilation system and reanalysis settings are described in Sects. 2 and 3, respectively. In Sect. 4, the FORA data are compared with independent observations, and the basic performance is discussed. The interannual to decadal variability represented by FORA-WNP30 is shown in Sect. 5. Section 6 gives a summary and future work.

2 Data assimilation system

The data assimilation system used in FORA-WNP30 is MOVE-4DVAR (Usui et al. 2015), which was developed as an extension of the 3DVAR version of the MOVE system (MOVE-3DVAR) for the western North Pacific. MOVE-4DVAR consists of an eddy-resolving ocean general circulation model (OGCM) and the 4DVAR assimilation scheme. The following subsections give descriptions of the OGCM and the assimilation scheme used in FORA-WNP30.

2.1 Ocean model

The OGCM used in MOVE-4DVAR is the western North Pacific version of the MRI Community Ocean Model version 2.4 (MRI.COM-WNP; Tsujino et al. 2006). MRI.COM is a depth-coordinate model that solves the primitive equations under the hydrostatic and the Boussinesq approximations. The vertical coordinate near the surface follows the surface topography like \(\sigma\) coordinates. The model configuration is basically the same as that used in Tsujino et al. (2006) except for the vertical grid spacing near the bottom and for the incorporation of the sea ice model. A biharmonic operator is applied to the horizontal mixing of tracers. For the horizontal momentum mixing, a biharmonic Smagorinsky viscosity (Griffies and Hallberg 2000) is used. In order to determine the vertical viscosity and diffusivity, we adopt the turbulent closure scheme of Noh and Kim (1999). The model is driven by wind stress and heat fluxes. Latent heat and sensible heat fluxes are calculated by the bulk formula of Kondo (1975) using the model sea surface temperature (SST). More detailed descriptions of the framework of MRI.COM and the setting of MRI.COM-WNP are given by Tsujino et al. (2006, 2010), respectively.

The model domain of MRI.COM-WNP covers the western North Pacific, spanning from 117°E to 160°W zonally and from 15°N to 65°N meridionally. The horizontal resolution is variable: it is 1/10° from 117°E to 160°E and 1/6° from 160°E to 160°W zonally, and it is 1/10° from 15°N to 50°N and 1/6° from 50°N to 65°N meridionally. There are 54 vertical levels with thickness increasing from 1 m at the surface to 600 m at 6300 m depth. Ocean states at the open boundaries are obtained from a North Pacific model with a horizontal resolution of \(1/2^\circ \times 1/2^\circ\) using a one-way nesting method. Daily outputs from the North Pacific model are linearly interpolated in both time and space to update the open boundaries of MRI.COM-WNP at every time step.

A multicategory sea ice model with single-layer sea ice and snow cover is coupled to MRI.COM-WNP. In this model, five categories of sea ice are considered according to the thickness of the sea ice (0.0–0.6, 0.6–1.4, 1.4–2.4, 2.4–3.6 and 3.6–30.0 m). The sea ice model includes a thermodynamic and a dynamic component. The thermodynamic part is based on Mellor and Kantha (1989), and the dynamic part is based on the Los Alamos sea ice model (Hunke and Dukowicz 1997; Hunke and Dukowicz 2002). The sea ice model was recently introduced into MOVE-4DVAR although it was not included in the original version (Usui et al. 2015). It should, however, be noted that propagation of adjoint sensitivity through the sea ice model is ignored in the adjoint model.

The tangent linear and adjoint codes of MRI.COM-WNP were generated manually. We made some approximations to the codes. The vertical mixing coefficients in the turbulent closure scheme are calculated from the background state, and the dependency of the coefficients on the deviation from the background state is ignored. We also apply the same approximation to the bottom friction. In addition, some modifications are applied in order to avoid singularities. We have confirmed that the tangent linear and adjoint codes satisfy \(({\mathbf L} {\mathbf a})^T({\mathbf L} {\mathbf a})={\mathbf a}^T {\mathbf L}^T ({\mathbf L} {\mathbf a})\) for any vector \(\mathbf a\), where \({\mathbf L}\) denotes the tangent linear operator for MRI.COM-WNP. The left-hand side, the square of \(\mathbf {La}\), is obtained by integrating \(\mathbf a\) using the tangent linear model. On the other hand, the right-hand side, the inner product of \(\mathbf a\) and \({\mathbf L}^T\mathbf {La}\), is calculated by integrating \(\mathbf {La}\) using the adjoint model \({\mathbf L}^T\). The tangent linear and adjoint codes of MRI.COM-WNP have already been used for various analyses. Fujii et al. (2008) conducted a singular vector analysis to estimate the optimal perturbations for a Kuroshio large meander. Furthermore, Fujii et al. (2013) identified the pathway of the North Pacific Intermediate Water from its origin in the subarctic North Pacific to the subtropical region.
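The consistency check described above can be illustrated with a small sketch. It is not the MRI.COM-WNP code: the tangent linear operator is replaced by a hypothetical random matrix `L`, the adjoint by its transpose, and all function names are illustrative assumptions. The point is only to show how the two sides of the identity are evaluated and compared.

```python
import random

def tangent_linear(L, a):
    """Apply a stand-in tangent linear operator (a plain matrix) to vector a."""
    return [sum(L[i][j] * a[j] for j in range(len(a))) for i in range(len(L))]

def adjoint(L, b):
    """Apply the adjoint (here simply the transpose of L) to vector b."""
    return [sum(L[i][j] * b[i] for i in range(len(L))) for j in range(len(L[0]))]

def dot(u, v):
    """Euclidean inner product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def adjoint_test(L, a):
    """Return both sides of (La)^T (La) = a^T L^T (La)."""
    La = tangent_linear(L, a)          # forward (tangent linear) integration
    lhs = dot(La, La)                  # squared norm of La
    rhs = dot(a, adjoint(L, La))       # inner product of a and L^T (La)
    return lhs, rhs

# Hypothetical 4x3 operator and perturbation vector, chosen only for illustration.
random.seed(0)
n_out, n_in = 4, 3
L = [[random.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
a = [random.uniform(-1.0, 1.0) for _ in range(n_in)]

lhs, rhs = adjoint_test(L, a)
relative_error = abs(lhs - rhs) / abs(lhs)
print(f"lhs={lhs:.15e} rhs={rhs:.15e} relative error={relative_error:.2e}")
```

For a correctly coded adjoint, the relative error should be at the level of floating-point round-off; a substantially larger value would point to an inconsistency between the two codes.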

2.2 Assimilation scheme

The assimilation scheme used for FORA-WNP30 consists of two parts. One is ocean data assimilation, and the other is sea ice assimilation. The ocean data assimilation scheme is based on MOVE-4DVAR (Usui et al. 2015), but some modifications have been made compared to the original version. The largest difference from the original scheme in Usui et al. (2015) is the use of MOVE-3DVAR analysis fields as the first guess in the 4DVAR calculation, which is effective in reducing the number of iterations in the 4DVAR calculation (Usui and Fujii 2015). A schematic diagram of the FORA assimilation cycle is shown in Fig. 1.

Fig. 1

Schematic diagram of the FORA assimilation cycle. The 3DVAR calculation is first carried out to estimate temperature and salinity increments \(\Delta {\mathbf x}^f\) using the final state in the previous assimilation window as the background state \(\mathbf x^b\). Then, 4DVAR is performed to estimate the increment \(\Delta {\mathbf x}^a\) using \(\Delta {\mathbf x}^f\) as the first guess of the increment. In the 4DVAR calculation, the forward and adjoint models are integrated iteratively. The forward model with the initialization scheme of the incremental analysis update (IAU) is integrated to calculate the cost function J, and the adjoint model is integrated backward in time to calculate the gradient of the cost function \(\nabla J\). The IAU scheme works as the incremental digital filtering (IDF) during the adjoint model integration. Finally, the forward model is again integrated with the increment \(\Delta {\mathbf x}^a\) and sea ice assimilation. The final state of the forward model \({\mathbf x_{t_F}}\) is transferred to the background state in the next assimilation window

Both MOVE-3DVAR and MOVE-4DVAR estimate temperature and salinity fields above 1500 m using a multivariate variational method with vertically coupled temperature-salinity (T-S) empirical orthogonal function (EOF) modal decomposition for the background error covariance matrix (Fujii and Kamachi 2003). The model domain of MRI.COM-WNP is divided into 13 subregions, and monthly T-S EOF modes are calculated in each subregion. In order to prevent discontinuities in the analysis fields, the subregions overlap in the boundary areas.

MOVE-4DVAR attempts to optimize the ocean state over the assimilation window \([t_I, t_F]\) by controlling temperature and salinity increments to the initial condition at \(t_I\). The cost function is defined by

$$\begin{aligned} J ({\mathbf z} )&= {} {1 \over 2} \sum _l {\mathbf z}_l^T ({\mathbf B}_H)_l^{-1} {\mathbf z}_l + \sum _{t=t_I}^{t_F} \biggl [ {1 \over 2} \left( \mathbf {Hx}_t - \mathbf {y}_t^\mathrm{TS} \right) ^T {\mathbf R}_t^{-1} \left( \mathbf {Hx}_t - \mathbf {y}_t^\mathrm{TS} \right) \nonumber \\&\quad+ {} {1 \over {2\sigma _h^2}} \left( {\mathcal H} (\mathbf {x}_t) - {\mathbf y}_t^\mathrm{SLA} - {\mathbf H}_h \overline{\mathbf h} \right) ^T \left( {\mathcal H} (\mathbf {x}_t) - {\mathbf y}_t^\mathrm{SLA} - {\mathbf H}_h \overline{\mathbf h} \right) \biggr ] + J^c, \end{aligned}$$
(1)

where \(\mathbf z_l\) is the control variable composed of amplitudes for the vertically coupled T-S EOF modes, and \(\mathbf x\) is the state vector of temperature and salinity fields, which is a function of the control variable \(\mathbf z\). Matrices \((\mathbf B_H)_l\) and \(\mathbf R_t\) are the horizontal correlation matrix of background errors and the observation error covariance matrix, respectively. The horizontal correlation matrix is modeled by using a Gaussian-like function with area-dependent decorrelation scales, which are determined according to Kuragano and Kamachi (2000). Vectors \(\mathbf y^\mathrm{TS}\) and \(\mathbf y^\mathrm{SLA}\) are T-S profile observations and altimeter-derived sea level anomalies (SLAs), matrices \(\mathbf H\) and \(\mathbf H_h\) denote spatial interpolation from the model grid to observation locations for T-S and SLA data, respectively, \(\sigma _h\) is the observation error for the altimeter-derived SLA data, and \(\overline{\mathbf h}\) is the mean sea surface dynamic height (SDH). The last term on the right-hand side, \(J^c\), represents additional constraints, which are used for various purposes such as avoiding density inversion (Fujii et al. 2005) and excessively cold temperatures below the freezing point (Usui et al. 2011). Note that in this study the constraint to prevent low temperature is set to work only in the region east of Japan. The subscripts l and t denote the l-th subregion and the time index, respectively. If we replace \({\mathbf x}_t\) with \({\mathbf x}_{t_I}\), (1) becomes the cost function for MOVE-3DVAR.

The operator \(\mathcal H\) calculates SDH at each observation point from the T-S field of the state vector \(\mathbf x\) as follows:

$$\begin{aligned} {\mathcal H} \left( \mathbf {x} \right) &= \mathbf H_h \mathbf h, \end{aligned}$$
(2)
$$\begin{aligned} h_i= & {} -\frac{1}{\rho _s} \int _0^{z_m} \rho '_i(T,S,p)dz, \end{aligned}$$
(3)

where \(\mathbf h\) is the vector consisting of SDH at each horizontal grid point, \(h_i\) is the ith component of \(\mathbf h\), i.e., SDH at the ith horizontal grid point, z and \(z_m\) denote the vertical coordinate and the reference depth for the SDH calculation, T and S are temperature and salinity, p is the pressure, \(\rho _s\) is the surface density, and \(\rho '\) is the density deviation from the reference state (\(T=0\,^\circ\)C and \(S=35\) psu). Note that the z-axis is positive upward, and \(z=0\) is the sea surface. This analysis scheme enables us to estimate variations in salinity accompanying temperature changes by using the T-S EOF modes, even if few salinity data are available.

The analysis increment for the temperature and salinity fields with respect to the background state at the initial time is calculated by

$$\begin{aligned} \Delta \mathbf x = {\mathbf {x(z)}} - {\mathbf x^b} = {\mathbf S} \sum _l \mathbf W_l {\mathbf U}_l {\mathbf \Lambda }_l {\mathbf z}_l , \end{aligned}$$
(4)

where \(\mathbf x^b\) represents the background state, \(\mathbf S\) is a diagonal matrix for standard deviation of background errors, \(\mathbf U_l\) is a matrix composed of dominant T-S EOF modes, \(\mathbf \Lambda _l\) is a diagonal matrix for the singular values of T-S EOF modes, and \(\mathbf W_l\) is a diagonal weight matrix for the lth subregion whose mth diagonal element, \(w_{lm}\), needs to satisfy \(\sum _l w_{lm}^2=1\) (Fukumori 2002).
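The normalization condition on the subregion weights can be illustrated with a short sketch. The actual weight profile used in MOVE is not specified here, so the cosine/sine taper below is purely an assumed example of one pair of weights that satisfies \(\sum _l w_{lm}^2=1\) at a grid point shared by two overlapping subregions; the function name and the coordinate `s` are hypothetical.

```python
import math

def overlap_weights(s):
    """Weights of two overlapping subregions at relative position s in [0, 1].

    s = 0 is the inner edge of subregion A's overlap zone, s = 1 the inner
    edge of subregion B's. The cos/sin taper is an assumed form, chosen only
    because it satisfies w_a**2 + w_b**2 == 1 everywhere in the overlap.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    theta = 0.5 * math.pi * s
    return math.cos(theta), math.sin(theta)

# Check the normalization at several positions across the overlap zone.
positions = [0.0, 0.25, 0.5, 0.75, 1.0]
sums_of_squares = [sum(w * w for w in overlap_weights(s)) for s in positions]
for s, total in zip(positions, sums_of_squares):
    print(f"s={s:.2f}  sum of squared weights={total:.12f}")
```

With weights of this kind, the contribution of each subregion fades smoothly across the boundary area while the total background error variance is preserved.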

In order to filter out high-frequency noise, MOVE-4DVAR employs the Incremental Analysis Updates initialization scheme (IAU; Bloom et al. 1996), in contrast to the conventional 4DVAR scheme, in which the analysis increment is added to the first-guess field all at once at the initial time. The time evolution of the state vector during [\(t_I\), t] is thus expressed by

$$\begin{aligned} \mathbf {x}_t = \mathcal {M}_{t_I,t} \left( \mathbf {x}_{t_I} \right) + \Delta \mathbf x_t^\mathrm{cor}, \end{aligned}$$
(5)

where

$$\begin{aligned}&\Delta \mathbf x_t^\mathrm{cor} \equiv g_t \left( {\mathbf {x(z)}} - \mathbf {x}_{t_I} \right) = g_t \left( \Delta {\mathbf x} + {\mathbf x^b} - \mathbf {x}_{t_I} \right) , \nonumber \\&g_t = \left\{ \begin{array}{ll} 1/N &{} (t_I< t \le t_I+N)\\ 0 &{} (t_I+N < t \le t_F). \end{array} \right. \end{aligned}$$
(6)

Operator \(\mathcal M_{t_I,t}\) denotes a nonlinear model, \(\mathbf x_{t_I}\) is the model state at the initial time step corresponding to the final state of the previous assimilation window, and \(\Delta \mathbf x_t^\mathrm{cor}\) represents the correction term of IAU, which works during the initialization period from \(t_I\) to \(t_I+N\). It should be noted that an Incremental Digital Filtering (IDF; Polavarapu et al. 2000) works on adjoint variables during the backward integration of the adjoint model, because IDF is the adjoint of IAU (Usui et al. 2015).
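The weight \(g_t\) in (6) can be sketched as follows. For illustration only, time is counted in whole days, with a 10-day window and a 3-day IAU period (the values given in Sect. 3); in the actual system the correction is applied at model time steps, and the function name below is hypothetical.

```python
def iau_weight(t, t_i, n_iau, t_f):
    """Weight g_t of Eq. (6): 1/N for t_I < t <= t_I + N, zero until t_F."""
    if not (t_i < t <= t_f):
        raise ValueError("t must lie in (t_I, t_F]")
    return 1.0 / n_iau if t <= t_i + n_iau else 0.0

# Illustrative setting: time in days, 10-day window, 3-day IAU period.
t_i, t_f, n_iau = 0, 10, 3
weights = [iau_weight(t, t_i, n_iau, t_f) for t in range(t_i + 1, t_f + 1)]
total = sum(weights)

print(weights)
print(f"sum of weights over the window = {total:.12f}")
```

Because the weights sum to one over the initialization period, the full increment is inserted gradually during the first N steps rather than at once at \(t_I\).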

To reduce model bias the background state \(\mathbf {x}^b\) is determined by

$$\begin{aligned} \mathbf {x}^b = r_c\ \mathbf {x}_\mathrm{clim} + (1-r_c)\ \mathbf {x}_{t_I}, \end{aligned}$$
(7)

where \(\mathbf x_\mathrm{clim}\) is the climatology and \(r_c\) is the weight for the climatology. The weight is set to be 0.02, which is equivalent to a restoring time scale of about 16 months. Note that if \(r_c\) is zero \(\mathbf {x}^b\) becomes \(\mathbf x_{t_I}\).
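The stated correspondence between \(r_c = 0.02\) and a restoring time scale of about 16 months can be checked with a rough calculation. The sketch assumes that the blending in (7) is applied once per 10-day assimilation window (the window length given in Sect. 3), so that a departure from climatology decays by a factor \(1-r_c\) per window; the function name and the mean month length of 30.44 days are illustrative choices.

```python
import math

def restoring_time_scale_days(r_c, window_days):
    """E-folding time of a departure from climatology under repeated blending.

    After k windows the departure is multiplied by (1 - r_c)**k, which equals
    exp(-k * window_days / tau) with tau = -window_days / ln(1 - r_c).
    """
    return -window_days / math.log(1.0 - r_c)

r_c = 0.02            # weight for the climatology
window_days = 10.0    # assumed interval between successive applications of (7)
tau_days = restoring_time_scale_days(r_c, window_days)
tau_months = tau_days / 30.44   # mean length of a month in days

print(f"restoring time scale = {tau_days:.0f} days = {tau_months:.1f} months")
```

For small \(r_c\) the result is close to the simpler estimate window length divided by \(r_c\), i.e., 500 days.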

The ocean data assimilation for FORA-WNP30 is carried out according to the following steps (see Fig. 1): (1) MOVE-3DVAR is conducted to estimate the increment \(\Delta {\mathbf x}^f\); (2) MOVE-4DVAR is performed by using \(\Delta {\mathbf x}^f\) as the first guess of the increment. In both MOVE-3DVAR and MOVE-4DVAR, the same observation data within the assimilation window are used. In order to minimize the cost function, the preconditioned optimizing utility for large-dimensional analyses (POpULar; Fujii 2005) is adopted in the MOVE system. POpULar is based on a nonlinear preconditioned quasi-Newton method and can minimize a nonlinear cost function without inversion of a nondiagonal background error covariance matrix.

After step (2), the forward model is again integrated using the increment estimated by MOVE-4DVAR. During this model integration, ice concentration data are assimilated. The sea ice assimilation scheme is based on a simple least squares analysis and nudging. The least squares analysis follows a method similar to that of Lindsay and Zhang (2006). The analysis value of the ice concentration at the ith horizontal grid point, \(C_i^a\), is written as

$$\begin{aligned} C_i^a = C_i^f + K_i \left( C_i^o - C_i^f \right) , \end{aligned}$$
(8)

where \(C_i^f\) is the first guess obtained from the model results, \(C_i^o\) is the observation interpolated onto the model grid, and \(K_i\) is the weighting factor, which determines the strength of the analysis increment. The main target of the sea ice assimilation in FORA-WNP30 is changes in sea ice distribution in the Okhotsk Sea north of Hokkaido, Japan, where a large seasonal change in ice area is apparent. In order to assign a large value to the weighting factor around the sea ice edge, we assume that the weight depends on the mean ice thickness in each grid cell (ice volume divided by grid area). To be specific, the weight is set to 1.0 when the mean thickness is less than 0.3 m and to 0.0 when it is greater than 2.0 m, and it decreases with increasing thickness between the two values.
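A minimal sketch of the analysis step (8) with a thickness-dependent weight is given below. The text specifies only the end values of the weight and that it decreases in between; the linear ramp used here is an assumption, and the function names are hypothetical.

```python
def ice_weight(mean_thickness_m, thin=0.3, thick=2.0):
    """Weight K_i as a function of mean ice thickness (m).

    1.0 below `thin`, 0.0 above `thick`; the linear decrease in between
    is an assumed form, not taken from the text.
    """
    if mean_thickness_m <= thin:
        return 1.0
    if mean_thickness_m >= thick:
        return 0.0
    return (thick - mean_thickness_m) / (thick - thin)

def analyse_concentration(c_model, c_obs, mean_thickness_m):
    """Eq. (8): C_a = C_f + K (C_o - C_f)."""
    k = ice_weight(mean_thickness_m)
    return c_model + k * (c_obs - c_model)

# Thin ice near the ice edge: the analysis follows the observation.
edge = analyse_concentration(c_model=0.2, c_obs=0.8, mean_thickness_m=0.1)
# Thick pack ice: the model first guess is retained.
pack = analyse_concentration(c_model=0.2, c_obs=0.8, mean_thickness_m=3.0)
# Intermediate thickness: a partial correction.
mid = analyse_concentration(c_model=0.2, c_obs=0.8, mean_thickness_m=1.15)

print(edge, pack, mid)
```

With this choice the observations dominate near the ice edge, where the ice is thin, while the model state is left almost unchanged in regions of thick ice.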

The model ice concentration is then nudged to the analysis result with a restoring time scale of 1 day. During the nudging procedure, ice thickness is also corrected so that the volume of sea ice is conserved except for cases of adding (production) or removing (melting) ice (Usui et al. 2010). When changing the ice volume because of the nudging correction, water fluxes corresponding to the changes in ice volume are transferred to the ocean model after conversion to salt fluxes.

3 Reanalysis settings

This section describes experimental settings for FORA-WNP30, which are summarized in Table 1. The FORA-WNP30 dataset covers the period 1 January 1982–31 December 2012. The assimilation window and the IAU period in MOVE-4DVAR are set to 10 and 3 days, respectively. We assimilate in-situ temperature and salinity profiles above 1500 m, gridded SST, altimeter-derived sea surface height (SSH) and sea ice concentration. The temperature and salinity profiles are collected from the World Ocean Database 2013 (WOD13; Boyer et al. 2013) and the Global Temperature Salinity Profile Program (GTSPP; Hamilton 1994). The gridded SST data are Merged satellite and in situ Global Daily Sea Surface Temperature (MGDSST; Kurihara et al. 2006) and have a horizontal resolution of 1/4\(^\circ \times 1/4^\circ\). The altimeter-derived SSH observations are AVISO along-track multimission products for TOPEX/Poseidon, Jason-1/2, ERS-1/2, Envisat, GFO and Cryosat-2 (AVISO 2015). The sea ice concentration observations used for the ice assimilation are gridded data with a horizontal resolution of 1/4\(^\circ \times 1/4^\circ\) derived from the Defense Meteorological Satellite Program (DMSP) special sensor microwave imager (SSM/I).

Table 1 Overview of the FORA-WNP30 dataset

Figure 2 shows the time series of the number of assimilated observations in each assimilation window. The SSH data have been available since 1993, and their number is closely linked to the number of satellites (Fig. 2b). The number of temperature and salinity data (the sum of satellite-based MGDSST and in-situ profile observations) is almost constant before 2003. Although the number of observation points increases only slightly (blue line in Fig. 2c), the number of temperature and salinity data roughly doubles between 2003 and 2008, indicating the large contribution of Argo floats. The total number of observations shows a jump between 1992 and 1993 due to the introduction of satellite altimetry and increases continuously after the 2000s (Fig. 2a).

Fig. 2

The time series of the number of assimilated observations in each window. a Total, b SSH, c sum of SST and profile data (black line) and number of observation points (blue line)

It should be noted that in-situ temperature and salinity profiles are assimilated after vertical interpolation from observed depth levels to model vertical levels (37 levels in the top 1500 m). The number of vertical levels for each profile is taken into consideration in the number of observations shown by the solid black line in Fig. 2. For example, a temperature profile extending from the surface to 1500 m is counted as 37 observations, whereas an SST datum, which is available only at the top level, is counted as 1. In contrast, the number of observation points shown by the solid blue line in Fig. 2c does not count the number of vertical levels and hence corresponds to the number of observation stations. To check the horizontal distribution of observation points, in Fig. 3 we plot the density distribution of observation points in each \(5^\circ \times 5^\circ\) grid, calculated every 5 years. The figure indicates that there are relatively dense in-situ observations around Japan in the 1980s, although for many of them the observation depths are limited to the surface layers. On the other hand, in-situ observation points in the open ocean, which are sparse in the 1980s and 1990s, have become much denser since the 2000s because of the deployment of Argo floats.

Fig. 3

Geographical distribution of the average station number of in-situ observations per 10 days in each 5° grid. Means of every 5 years are presented in each panel

The monthly T-S EOF modes and their related statistics such as standard deviations of background errors and singular values of the T-S EOF modes are calculated from historical T-S profiles. We use 12 dominant modes for calculating the background term of the cost function in both MOVE-3DVAR and MOVE-4DVAR, which explain more than 90 % of the total variance. The mean SDH \(\overline{\mathbf h}\) in (1) is calculated by the following procedure. First, we produce monthly mean temperature and salinity fields during 1993–2012, corresponding to the reference period for the AVISO SLA observations. The monthly fields are obtained from a 3DVAR analysis using in-situ T-S profiles and monthly mean climatology from World Ocean Atlas 2013 (WOA13; Locarnini et al. 2013; Zweng et al. 2013) as a first guess. Second, monthly SDH fields are calculated from the monthly T-S fields using (3). Then, we obtain the mean SDH by averaging the monthly SDH fields over the period 1993–2012.

The MOVE-4DVAR assimilation scheme described in the previous section is based on the so-called strong constraint 4DVAR, in which it is assumed that model errors originate from the initial condition of its 10-day assimilation window, and the model physics and forcing field do not give rise to any errors. Our former studies have shown that MRI.COM-WNP, the forward model in MOVE-4DVAR, has a good performance in reproducing mesoscale features such as Kuroshio meanders and mesoscale eddies as well as large-scale circulation patterns (e.g., Tsujino et al. 2006; Usui et al. 2006). In order to reduce the model error, we also performed 3DVAR analysis over the North Pacific model from which the boundary values are provided to the western North Pacific model. The model resolution (\(\sim\)10 km), however, is not sufficient to represent the coastal phenomena, and it does not include tidal forcing. We therefore apply a special treatment to in-situ observations in coastal regions with bathymetry less than about 1000 m. To be specific, the observation time of the profile data within an assimilation window in the coastal region is set to the initial time of the assimilation window, indicating that those observations are assimilated through 3DVAR.

The JRA-55 atmospheric reanalysis product (Kobayashi et al. 2015) is used to force the ocean model. Surface atmospheric parameters originally defined on a reduced Gaussian grid (TL319) are interpolated to the regular longitude grid of about 0.56° resolution. Further, 6-hourly data are averaged to produce daily data. In order to avoid unphysical atmospheric forcing in the coastal zone due to the discrepancy in land mask definitions between JRA-55 and FORA-WNP, JRA-55 ocean grid data are extended onto the land grid by successive applications of the Laplacian operator. Since atmospheric reanalysis products in general have relatively large errors in precipitation, the freshwater flux is corrected by restoring sea surface salinity toward the monthly mean climatology with a time scale of 1 day to reduce a model bias.

The initial condition on 1 January 1982 was obtained from a preliminary experiment, in which the ocean model is spun up from an initial state of no motion with climatological temperature and salinity for about 20 years using climatological forcing, and subsequently data assimilation of in-situ profile data is conducted with MOVE-3DVAR using the forcing of JRA-55 during the period 1971–1981.

One of the biggest technical challenges in deriving a long-term reanalysis product using an advanced data assimilation system is its high computational cost. In the FORA system, the original data assimilation package, MOVE, is tuned to give the best performance on the specific high-performance computational platform, NEC SX-ACE. This tuning reduces the computational time by about 12 % relative to the original MOVE system. Further, the FORA analysis system adopts the hybrid approach of 3DVAR and 4DVAR in optimizing its control parameters for computational efficiency, as mentioned before. The upper limit on the number of iterations in the descent algorithm is set to 10 in 3DVAR and 30 in 4DVAR. The upper limit of 3DVAR iterations is reduced to 9 when excessive reduction of the cost function due to a local minimum is observed. On average, the initial 3DVAR analyses reduce the cost function by about 20 %, and the subsequent 4DVAR analyses reduce it by about 40 %, implying that 4DVAR analysis fields have higher accuracy than 3DVAR.

4 General features of the reanalysis data

4.1 Comparison with independent observations

4.1.1 Current field

To examine the performance of the FORA data, the current field of FORA is compared to the observations by the JMA research vessels. The current data are not assimilated in the system. We use the current observations during 2001–2012 obtained by the acoustic Doppler current profilers (ADCPs) installed on two JMA research vessels, R/V Ryofu Maru and R/V Keifu Maru. To correct misalignment of the transducer and/or gyro settings and reduce other uncertainties, the ADCP data were processed by the CODAS software system, which was developed at the University of Hawaii. The ensemble averaging interval is 300 s. To exclude tidal currents from the observations, we estimate barotropic tidal currents using the OSU tidal data inversion global model “TPXO7.2” (Egbert and Erofeeva 2002) and subtract them from the ADCP data.

In addition to the FORA data, we evaluate another reanalysis product, which was recently produced at JMA using the MOVE-3DVAR system. As mentioned before, MOVE-3DVAR employs the same ocean model (MRI.COM-WNP) as that used in FORA, and hence the 3DVAR product has the same horizontal resolution. The atmospheric forcing used is JRA-55, which is the same as in FORA except for 3-hourly temporal resolution. Assimilated observations in MOVE-3DVAR are also basically the same as those used in FORA except for in-situ temperature and salinity profiles. In the 3DVAR product, in addition to in-situ T-S observations obtained from WOD13 and GTSPP, T-S observations uniquely collected at JMA are assimilated. Thus, differences in assimilated fields between FORA and MOVE-3DVAR mainly originate from the difference in the analysis schemes used.

All ADCP data at 100 m depth obtained during cruising (ship speed exceeding 10 knots) are compared to the FORA and MOVE-3DVAR values at the same positions, which are linearly interpolated from the four nearest grid points. Table 2 shows the current field statistics in the six areas shown in Fig. 4, i.e., the averaged current velocity and its standard deviation, correlation coefficient of current velocity, slope of the regression line, standard deviation of the reanalysis data from the regression line, and standard deviation of the current direction difference between the reanalysis data and ADCP data. Both FORA and MOVE-3DVAR show high correlations with the ADCP current velocity in South of Honshu, the East China Sea and the Kuroshio extension region, and relatively low correlations in the Japan Sea and Oyashio region. In each region, the correlation coefficient of the current velocity of FORA is higher than that of MOVE-3DVAR, and those differences are statistically significant at the 99 % confidence level. As for the standard deviations of the difference from the regression line and of the current direction difference, those of FORA are smaller than those of the 3DVAR product in most regions. No statistic of FORA is significantly worse than that of MOVE-3DVAR.
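As an illustration of the matchup procedure described above, the following minimal sketch interpolates a gridded field to a ship position from the four surrounding grid points. It assumes a regular, ascending longitude-latitude grid; the function name `bilinear` and the array layout are hypothetical and do not reflect the actual FORA processing code.

```python
from bisect import bisect_right


def bilinear(lons, lats, field, lon, lat):
    """Interpolate field[j][i] (lat index j, lon index i) to (lon, lat).

    lons and lats are ascending 1-D coordinate lists, and the target must
    lie inside the grid. The four grid points surrounding the target are used.
    """
    if not (lons[0] <= lon <= lons[-1] and lats[0] <= lat <= lats[-1]):
        raise ValueError("target position is outside the grid")
    # Index of the lower-left corner of the enclosing cell (clamped at the edge).
    i = min(bisect_right(lons, lon) - 1, len(lons) - 2)
    j = min(bisect_right(lats, lat) - 1, len(lats) - 2)
    # Fractional position inside the cell, between 0 and 1 in each direction.
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    south = (1 - tx) * field[j][i] + tx * field[j][i + 1]
    north = (1 - tx) * field[j + 1][i] + tx * field[j + 1][i + 1]
    return (1 - ty) * south + ty * north
```

In such a sketch, the zonal and meridional velocity components would each be interpolated separately before the speed and direction are compared with the ADCP values.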

Table 2 Current field statistics of reanalysis data (FORA and 3DVAR system) at 100 m depth in each area in Fig. 4 compared to the observed data in terms of the averaged current velocity and its standard deviation, correlation coefficients, slope of regression line, standard deviation of the reanalysis data from regression line and standard deviation of the current direction difference
Fig. 4

Map of areas to calculate current field statistics shown in Table 2. A snapshot of the FORA daily current field (unit in cm s\(^{-1}\)) at 100 m depth is also shown. Area names are as follows: a East China Sea, b South of Honshu, c Kuroshio extension region, d mixed water region, e Oyashio region and f Japan Sea

In April 2011, a warm eddy existed off Sanriku at around 39°N and 143°E, and R/V Keifu Maru cruised around this area to deploy and recover an ocean bottom seismograph. Figure 5 shows the comparison of reanalysis data with the ADCP data of R/V Keifu Maru during 11–20 April 2011. In Fig. 5, locations of assimilated observation data during 11–20 April are also plotted. The data density seems sufficient to resolve the mesoscale eddies in the area well. The FORA data reproduced the current field around the warm eddy well, especially in the southwest part. The 3DVAR product also reproduced the warm eddy field; however, the detailed shape of the eddy (for example, around 38.5°N and 143.3°E) does not match the ADCP data well. Figure 6 shows the scatter diagram of the current velocity and the histogram of the difference in current direction between the reanalysis data and ADCP data. The results show that the current velocity and direction are reproduced much better in the FORA data than in MOVE-3DVAR.

Fig. 5

Comparison of reanalysis data with ADCP data for the a FORA and b 3DVAR system. Streamlines (light blue) are the current field at 100 m depth of reanalysis for the daily mean of 16 April 2011. Sticks (red) are the current vector of ADCP data for 11–20 April 2011 observed by R/V Keifu Maru. Black dot indicates the origin of a vector and is plotted at the ship position of each datum. Assimilated observation during 11–20 April 2011 is also shown as an orange circle (in-situ temperature and salinity), blue circle (in-situ temperature) and purple cross (SSH)

Fig. 6

Comparison of the current velocity of reanalysis data and ADCP data during 11–20 April 2011 in the area shown in Fig. 5 for a the FORA and b 3DVAR system. A linear regression is plotted as a black line in each panel. Histograms of the direction differences between reanalysis and ADCP data in 5° bins for c the FORA and d 3DVAR system. The average difference is shown as a black line in each panel. Statistics are also shown in each panel

4.1.2 Sea level

We next evaluate sea-level variability in FORA using tide gauge observations, which are also not used in data assimilation. Daily mean sea-level data during the period of 2001–2010 at 63 tide gauge stations (see Fig. 7 for their locations) are used for the evaluation. Tidal variations consisting of diurnal and semi-diurnal signals are eliminated by a tide-killer filter with a cutoff period of 48 h designed by Hanawa and Mitsudera (1985). In addition, inverted barometer (IB) correction represented by

$$\begin{aligned} (\mathrm {IB})=(P-P_\mathrm{ref})/\rho g \nonumber, \end{aligned}$$

is applied to the daily mean sea level, where \(\rho\) is the water density, g is the acceleration of gravity, P is sea level pressure (SLP), and \(P_\mathrm{ref}\) is the mean SLP over the entire ocean. We use daily-mean SLP observed at a meteorological weather station close to each tide gauge station. The mean SLP over the entire ocean is calculated using SLP derived from JRA-55.
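The IB term above can be evaluated as in the following minimal sketch. The density and gravity values, the hPa and cm units, and the sign convention (adding the term to the observed sea level, so that a high-pressure depression of the sea surface is compensated) are assumptions made here for illustration; the function names are hypothetical.

```python
RHO = 1025.0  # assumed sea-water density [kg m^-3]
G = 9.81      # assumed acceleration of gravity [m s^-2]


def inverted_barometer_cm(slp_hpa, slp_ref_hpa, rho=RHO, g=G):
    """Return (P - P_ref) / (rho * g) in cm for pressures given in hPa."""
    pressure_anomaly_pa = (slp_hpa - slp_ref_hpa) * 100.0  # hPa -> Pa
    return pressure_anomaly_pa / (rho * g) * 100.0          # m -> cm


def ib_corrected_sea_level_cm(sea_level_cm, slp_hpa, slp_ref_hpa):
    """Apply the IB term to an observed daily mean sea level (assumed sign)."""
    return sea_level_cm + inverted_barometer_cm(slp_hpa, slp_ref_hpa)
```

With these assumed constants, a local SLP that is 1 hPa above the ocean-mean SLP corresponds to a correction of roughly 1 cm.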

Figure 7a shows the root-mean square difference (RMSD) between the FORA SSH field and tide gauge observations during the period of 2001–2010. The matchup between the tide gauge observations and FORA data is taken after linear interpolation of the FORA SSH field into the location of each tide gauge station. In order to minimize the ground deformation effect in the observed sea level, RMSD is calculated with sea-level anomalies from the annual mean for each year. The calculated RMSD is relatively large along the south coast of Japan, where natural sea-level variability is relatively high because of variations in the Kuroshio Current flowing along the coast (e.g., Kawabe 1980). In other regions, it is largely less than 4.5 cm except for some specific tide gauge stations. The largest RMSD occurs at Hachijojima Island (RMSD: 7.5 cm) located close to the Kuroshio Current axis, but it is sufficiently small compared to sea level variance there (35.9 cm), which is mainly caused by Kuroshio path variations.

Fig. 7

a RMSD between the FORA SSH field and tide gauge records at 63 stations (unit in cm). b Improve ratio (%) of sea-level variability in FORA defined as the relative difference in SSH RMSD for FORA and MOVE-3DVAR (see text for detailed definition). Black dotted lines indicate the climatological annual mean SSH field with an interval of 10 cm

We then calculate the RMSD for MOVE-3DVAR during 2001–2010 in the same manner as that for FORA and compare the two RMSDs. To make it easy to evaluate how much the 4DVAR scheme used in FORA improves the sea level variability compared to the 3DVAR product, the improve ratio (IR) defined by

$$\begin{aligned} (\mathrm{IR}) = \frac{(\mathrm {RMSD})_\mathrm{3DVAR} - (\mathrm{RMSD})_\mathrm{FORA}}{(\mathrm{RMSD})_\mathrm{3DVAR}} \times 100 \nonumber, \end{aligned}$$

is plotted in Fig. 7b. The calculated IR is positive at most tide gauge stations, indicating that the FORA result improves sea-level variability in most areas compared to the 3DVAR results. At tide gauge stations located in the open ocean such as Hachijojima and Chichijima, the IR is relatively high, and some of them exceed 10%. There are however several stations showing negative IR although their values are relatively small. The IR at tide gauge stations surrounded by complex coastal topography (e.g., the inner bay) or near the tip of a peninsula tends to be small. This is probably because representation of the coastal topography of the 10 km model used in both FORA and 3DVAR is insufficient.
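The RMSD and IR calculations described in this subsection can be sketched as follows. This is a minimal illustration under the assumption that the daily series are already collocated and tide-filtered; the function names are hypothetical.

```python
from math import sqrt


def yearly_anomalies(values, years):
    """Subtract each year's mean from the values belonging to that year."""
    sums, counts = {}, {}
    for value, year in zip(values, years):
        sums[year] = sums.get(year, 0.0) + value
        counts[year] = counts.get(year, 0) + 1
    return [v - sums[y] / counts[y] for v, y in zip(values, years)]


def rmsd(a, b):
    """Root-mean-square difference between two equal-length series."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def improve_ratio(rmsd_3dvar, rmsd_fora):
    """IR in percent; positive when FORA has the smaller RMSD."""
    return (rmsd_3dvar - rmsd_fora) / rmsd_3dvar * 100.0
```

For instance, RMSDs of 5.0 cm for the 3DVAR product and 4.0 cm for FORA would give an IR of 20 %.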

To further look into the accuracy of the FORA result, 1-year time series of sea level in 2010 (2009 only for Tsushima) at six tide gauge stations are shown in Fig. 8. For comparison, we also plot the sea level derived from the AVISO daily gridded multi-satellite product with a horizontal resolution of 0.25° (AVISO 2015). Note that the AVISO data are not shown for the tide gauge station of Abashiri because grid point values adjacent to the station are all missing values. RMSDs of FORA, MOVE-3DVAR and AVISO data for the corresponding year are depicted in the figure.

The calculated RMSDs for FORA are smaller than not only those for MOVE-3DVAR, but also those for AVISO at all tide gauge stations except for Tsushima, where FORA has the largest RMSD among the three datasets. Although RMSDs in Fig. 8 are calculated for the specific year (2010 or 2009), we confirmed that the results are robust by calculating the RMSD for 2001–2010. Comparison of the time series indicates that the FORA result reproduces features of sea level variability observed by the tide gauges well. At Hachijojima Island, where the variance of the sea level is the largest among the six stations, short-term variability on the time scale of about a month is enhanced. This variability, mainly due to short-term fluctuations of the Kuroshio path propagating downstream, is well represented in FORA. In contrast, temporal variability in AVISO is too small, and consequently it fails to represent the short-term sea level variations (for example, September–October 2010 in Fig. 8b).

Fig. 8

Time series of SLAs at six tide gauge stations: a Abashiri, b Hachijojima, c Amami, d Chichijima, e Tsushima and f Sado (see Fig. 7a for their locations). The SLAs here denote anomalies from annual mean sea level for each year. The time series except for Tsushima are from 1 January through 31 December 2010. That at Tsushima is for 2009 because available data in 2010 are limited. Four lines indicate time series for (gray) the tide gauge, (green) AVISO, (red) FORA and (blue) 3DVAR. Note that time series of AVISO data are not shown for Abashiri because grid point values adjacent to the tide gauge station are all missing values. RMSDs of SLAs for AVISO, FORA and 3DVAR calculated with tide gauge observations for 2010 (or 2009 at Tsushima) are shown by green, red and blue numbers (unit in cm)

At Amami and Chichijima, which are located in the open ocean, variations in sea level on the time scale of a few months are predominantly caused by westward propagating mesoscale eddies. Such variations of sea level in FORA compare very well with the tide gauge observations, although MOVE-3DVAR and AVISO also capture basic features of the observed sea level variability, implying that mesoscale eddies are properly assimilated in FORA. Those features are consistent with Usui et al. (2015), in which it was shown that the 4DVAR scheme provides better representation of mesoscale variability compared to 3DVAR.

Sea levels at Abashiri, Tsushima and Sado, which are located in marginal seas such as the Japan Sea and the Okhotsk Sea, exhibit high-frequency variability on a time scale shorter than 1 month, which is known to be induced by atmospheric forcing (Fukumori et al. 1998). Sea levels for FORA and MOVE-3DVAR show similar high-frequency variability, but it is not represented in AVISO. At the tide gauge station of Tsushima, which is located at Tsushima Island, the IR is the lowest among the 63 tide gauge stations. Sea levels at Tsushima not only for FORA but also for MOVE-3DVAR and AVISO show systematic bias compared to the tide gauge observation (Fig. 8e). Tsushima Island, which is not resolved in the AVISO data, is situated at the Tsushima Strait, and the bottom topography around the island is complicated. Thus, the tide gauge record at Tsushima might be affected by local processes. In addition, the bathymetric data are uncertain around the island (Hirose 2005). These might lead to the systematic bias.

4.2 SST and SSH fields

In this section, we evaluate SST and SSH fields in FORA using satellite-derived observations. Figure 9a, b compares mean and standard deviation of FORA SST during the period of 1982–2012 with MGDSST. The annual mean SST in FORA compares well with that for MGDSST. Taking a difference of the mean SST between FORA and MGDSST, we however recognize a weak cold bias over a wide area. We have confirmed that a similar bias pattern appears in MOVE-3DVAR as well as in a free-simulation experiment using the same model and the same atmospheric forcing (not shown), implying that this cold bias originates from the atmospheric forcing or errors in model physics. It is also worth noting that a warm bias is recognizable along the Kuroshio in the East China Sea and south of Japan. In SST fields, the Kuroshio is recognized as a warm tongue, which especially emerges in winter (Kida et al. 2015). The FORA SST field shows a realistic representation of the warm tongue compared to satellite images (not shown). In contrast, it is modest in MGDSST because the horizontal resolution is coarse compared to FORA, and the dataset is produced by an objective analysis without a dynamic model. Therefore, this warm bias is caused by the difference in representation of SST fronts around the Kuroshio between FORA and MGDSST.

Fig. 9

(Contours) annual mean and (shades) standard deviation of SST for a MGDSST and b FORA. c Mean difference in SST between FORA and MGDSST. These values are calculated using the SST fields during the period from 1982 to 2012. Units in °C

The observed feature of the variance field is also well captured in FORA. The SST fields for both FORA and MGDSST show large variability in higher latitudes, shelf regions and frontal regions (Fig. 9). The variability in higher latitudes and shelf regions is due to large seasonal changes (e.g., Levitus 1987). In frontal regions such as the Kuroshio-Oyashio confluence region east of Japan, on the other hand, ocean current variability is considered to have a large impact on variations in SST. The current variability is caused by energetic mesoscale eddy activity associated with meanders of the Kuroshio Extension Current and pinch-off eddies. SST variability and its relation to current variability around the frontal regions in the western North Pacific are well summarized in Kida et al. (2015).

In order to evaluate mesoscale eddy variability with a time scale longer than 1 month, the SSH field in FORA is then compared to the AVISO SSH field. We take the correlation between FORA and AVISO using monthly mean SSH fields in 1993–2012. For comparison, the correlation between MOVE-3DVAR and AVISO is also calculated. The two correlation maps are compared in Fig. 10. For both FORA and MOVE-3DVAR, the correlation exceeds 0.7 in most areas in the western North Pacific except for some seasonal sea ice areas. The area with correlation greater than 0.9 is widely distributed in the subtropical gyre for FORA, while it is limited to the western part of the subtropical gyre in MOVE-3DVAR. Comparing the two maps, the correlation for FORA is much improved in the Kuroshio Extension and mixed water region off the east of Japan, where mesoscale eddy activity is the most energetic in the western North Pacific. Therefore, the higher correlation implies better representation of mesoscale variability in FORA.
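A point-by-point correlation map of this kind can be sketched as below. The sketch assumes that both products have already been regridded to a common grid and stored as nested lists indexed `[time][lat][lon]`; whether the seasonal cycle is removed beforehand is not specified in the text and is left out here. The function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series; None if undefined."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return None  # constant series, e.g. a masked grid point
    return sxy / sqrt(sxx * syy)


def correlation_map(model, obs):
    """Correlate model[t][j][i] with obs[t][j][i] over time at each grid point."""
    nt, ny, nx = len(model), len(model[0]), len(model[0][0])
    return [
        [
            pearson([model[t][j][i] for t in range(nt)],
                    [obs[t][j][i] for t in range(nt)])
            for i in range(nx)
        ]
        for j in range(ny)
    ]
```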

Fig. 10

Correlation maps of SSH fields between AVISO and the reanalysis product: a FORA and b 3DVAR. Monthly mean SSH fields from January 1993 to December 2012 are used for calculation of the correlation

In the subarctic region of the North Pacific where barotropic responses to atmospheric forcing are enhanced (Kuragano et al. 2014), the correlation is not so high compared to that in the lower latitudes. This is probably because the observation operator \(\mathcal H\) for altimeter-derived SLA observations in (2) does not take into consideration the barotropic component of sea level changes. Therefore, a refinement of the observation operator for SLA observations could lead to further improvement of the SSH field.

4.3 Mixed layer depth and water mass properties

The ocean surface mixed layer directly communicates with the atmosphere. It provides the boundary condition (e.g., sea surface temperature) to the atmosphere, and, on the other hand, water masses are generated in the mixed layer under the atmospheric forcing and subducted to the interior ocean. Therefore, spatial and temporal variability of the surface mixed layer depth (MLD) and water masses is one of the important metrics for assessing the reproduction of the surface layer variability (e.g., Toyoda et al. 2015a). In addition, it has been indicated that smaller scale (eddy) processes are essential for the reproduction of MLD and surface-intermediate water masses (e.g., Ishizaki and Ishikawa 2004; Tsujino and Yasuda 2004; Nishikawa et al. 2010). Thus, the present product (FORA-WNP30) is expected to effectively reproduce the MLD-water mass variability by adopting a high-resolution OGCM.

Figure 11 presents the distributions of the winter- and summertime MLDs. MLD is defined in this study as the depth where potential density exceeds the 10-m depth value by 0.125 kg m\(^{-3}\) (e.g., Levitus 1982). The mean MLD distribution from FORA (Fig. 11a, c) captures the gross features of the observational MLD climatologies (Fig. 11b, d), although the wintertime MLDs are slightly deeper in the former case. At least in terms of the climatology, the seasonal cycle of the MLD distribution in the western North Pacific is, therefore, well reproduced in FORA. The horizontal patterns of the wintertime MLD differences between the composites in the positive and negative PDO (Pacific decadal oscillation; Mantua et al. 1997) phases are shown in Fig. 11e, f. Our result indicates large positive MLD anomalies in the central subtropics. Figure 11f shows that this pattern is supported by the observational estimate by using the recent Argo observations (MILA-GPV; Hosoda et al. 2010). Therefore, FORA represents the overall MLD variability on the interannual-decadal scale as well as the seasonal scale.
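The density-threshold MLD definition given above can be illustrated with the following minimal sketch for a single profile. Linear interpolation between model levels is an assumption made here, since the text does not state how the crossing depth is located; the function names are hypothetical.

```python
def interp_profile(depths, values, z):
    """Linearly interpolate a profile to depth z (depths ascending, in m)."""
    if z <= depths[0]:
        return values[0]
    for k in range(len(depths) - 1):
        if depths[k] <= z <= depths[k + 1]:
            w = (z - depths[k]) / (depths[k + 1] - depths[k])
            return (1 - w) * values[k] + w * values[k + 1]
    return values[-1]


def mixed_layer_depth(depths, sigma, ref_depth=10.0, threshold=0.125):
    """Depth where potential density first exceeds the 10-m value by 0.125.

    depths are in m (ascending) and sigma is potential density in kg m^-3.
    Returns None if the threshold is never reached within the profile.
    """
    sigma_ref = interp_profile(depths, sigma, ref_depth)
    target = sigma_ref + threshold
    z_prev, s_prev = ref_depth, sigma_ref
    for z, s in zip(depths, sigma):
        if z <= ref_depth:
            continue
        if s >= target:
            # First crossing: interpolate between the previous level and this one.
            return z_prev + (target - s_prev) / (s - s_prev) * (z - z_prev)
        z_prev, s_prev = z, s
    return None
```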

Fig. 11

a–d MLD distributions in March (a, b) and September (c, d) from FORA averaged over the 1982–2012 period (a, c) and from MILA-GPV climatology data (b, d). e, f Differences between the March MLD composites in the positive (2003, 2005 and 2010) and negative (2006, 2007 and 2009) PDO phases from FORA (e) and from MILA-GPV monthly data (f)

Figure 12 compares cross sections of the 31-year (1982–2012) mean climatological temperature and salinity along 160.5°E and 179.5°W of the FORA-WNP30 with those of the latest observational climatology (WOA13). The temperature section along 160.5°E (Fig. 12a–c) contains the central-eastern part of Subtropical mode water (STMW; Masuzawa 1969) (about 16–19 °C), which originates from the Kuroshio water. The FORA result reproduces the subsurface temperature structure including STMW, although a small warm bias in the upper thermocline region (upper 200–300 m) is found (Fig. 12c). The salinity section along 179.5°W (Fig. 12d–f) includes the salinity minimum structure of North Pacific intermediate water (NPIW; Talley 1993) (green shades in Fig. 12d, e), which is generated in the Okhotsk Sea and Kuroshio-Oyashio interfrontal zone (e.g., Yasuda et al. 1996). Note that the realistic reproduction of the NPIW distribution is known to be a difficult issue in model simulation (e.g., Ishizaki and Ishikawa 2004). The FORA result reproduces this structure well.

Fig. 12

Climatological mean temperature [°C] along 160.5°E for a FORA and b WOA13. c Difference (shades) between a and b. Contours in c are the same as in a. d–f Same as in a–c but for salinity [psu] along 179.5°W. The FORA values are calculated using the fields during the period from 1982 to 2012

Figure 13 presents the time series of annual mean area and temperature of STMW in the vertical cross section along the 137°E line from the FORA results (red lines) and observations (black lines). Here, the observations are based on the long-term ship observation dataset (temperature and salinity, from 1971 to 2011) along 137°E by the Japan Meteorological Agency (JMA). STMW is defined by the following criteria: potential vorticity (PV) value is less than \(2.1 \times 10^{-10}\) m\(^{-1}\) s\(^{-1}\) with potential density range from 24.8 to 25.8 (see the figure caption for details). The observational results show decadal and interannual variability of the STMW area and temperature along 137°E. The FORA results reproduce these variabilities fairly well, although there are some differences (e.g., in 2003). The differences may partly come from temporal and spatial inhomogeneity of the observational data. Previous studies indicated that the decadal variability of STMW is closely related to that of the Kuroshio Extension (KE) state (e.g., Oka et al. 2015). Therefore, the good reproduction of the KE state in the present high-resolution system as described later (Sect. 5.2) would contribute to the above-mentioned reproduction of the STMW features.
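The STMW criteria can be illustrated with a minimal sketch for a single layer of a density section. The PV form used here, \((f/\rho_0)\,\partial\sigma_\theta/\partial z\) with a constant reference density, is an assumption consistent with a PV calculated only from the vertical density gradient and the Coriolis parameter; the function names and the reference density value are hypothetical.

```python
from math import radians, sin

OMEGA = 7.292e-5  # Earth's rotation rate [rad s^-1]


def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude) [s^-1]."""
    return 2.0 * OMEGA * sin(radians(lat_deg))


def potential_vorticity(lat_deg, sigma_upper, sigma_lower, dz, rho0=1025.0):
    """Assumed PV form (f / rho0) * d(sigma)/dz for a layer of thickness dz [m]."""
    return coriolis(lat_deg) / rho0 * (sigma_lower - sigma_upper) / dz


def is_stmw(lat_deg, sigma_upper, sigma_lower, dz,
            pv_max=2.1e-10, sigma_range=(24.8, 25.8)):
    """Check the low-PV and potential-density criteria for one layer."""
    sigma_mid = 0.5 * (sigma_upper + sigma_lower)
    pv = potential_vorticity(lat_deg, sigma_upper, sigma_lower, dz)
    return pv < pv_max and sigma_range[0] <= sigma_mid <= sigma_range[1]
```

The STMW area along the section would then follow by summing the cross-sectional areas of all layers that satisfy `is_stmw` within the latitude and depth limits of the definition.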

Fig. 13

Time series of annual STMW a area [km\(^2\)] and b area-mean temperature [°C] along 137°E. Here, STMW is defined as the water with potential vorticity (PV) less than \(2.1 \times 10^{-10}\) m\(^{-1}\) s\(^{-1}\) and potential density range from 24.8 to 25.8 kg m\(^{-3}\) in the latitudes 23°–33°N and the depths 91–570 m. PV here is simply calculated from the vertical density gradient and the Coriolis parameter. Red lines are from the FORA results, and black lines are from the JMA’s 137°E line ship observation dataset (temperature and salinity)

Overall, FORA-WNP30 reproduces the climatological structures and decadal–interannual variability of MLD and major water masses in the North Pacific well.

5 Interannual to decadal variability and anomalous events in the seas around Japan

5.1 Kuroshio south of Japan

It is well known that the Kuroshio south of Japan takes different paths: the straight path and the large meander (LM) path (Kawabe 1995). The straight path is sometimes called the non-large meander (NLM) path. This bimodal feature is unique to the Kuroshio and is not observed in other western boundary currents such as the Gulf Stream and Agulhas Current.

In order to evaluate the Kuroshio path variations south of Japan, we plot in Fig. 14 the time series of the Kuroshio southernmost latitude defined between 136°–141°E. The Kuroshio latitude of JMA data in Fig. 14 is subjectively determined on the basis of various information such as satellite SST images, tide gauge data, objectively analyzed subsurface temperature distribution using hydrographic observations and recent MOVE-WNP assimilation model results. The southernmost latitude for FORA is identified by tracking the maximum velocity at 100 m. During the reanalysis period, four LM events occurred, and their periods are depicted by gray shades.

Fig. 14

Time series of the southernmost latitude of the Kuroshio path south of Japan defined between 136°E and 140°E. Black line is based on JMA data obtained from the web site at http://www.data.jma.go.jp/gmd/kaiyou/data/shindan/b_2/kuroshio_stream/kuro_slat.txt, and red line represents the time series calculated from FORA monthly fields. Gray shading indicates large meander periods

The Kuroshio path variation represented by FORA compares quantitatively well with the JMA data. It is worth emphasizing that FORA captures the observed features well, even in the 1980s, when there is no altimeter data. One reason for the precise representation of the Kuroshio path variation in the 1980s would be the relatively dense in-situ observations around the Japanese coasts as described in Sect. 3.

Formation of past LMs is also well represented in FORA. As an example, the time evolution of the Kuroshio path during formation of the LM in 1986–1988 is shown in Fig. 15. A small-scale meander first appears off the southeast of Kyushu (Fig. 15a). Then, the meander moves eastward and finally develops into an LM path. This feature is consistent with that indicated by previous studies addressing the LM formation (e.g., Usui et al. 2008). In addition, a sizable anticyclonic eddy appears in the deep layer around 3000 m during the formation of this LM (not shown). Previous modeling studies have indicated that the LM is caused by baroclinic instability, and a deep-anticyclonic eddy is formed below the Kuroshio because of the baroclinic instability (Endoh and Hibiya 2001; Tsujino et al. 2006). Here it should be noted that the 4DVAR system used in FORA controls the temperature and salinity fields above 1500 m as described in Sect. 2, indicating that the deep-anticyclonic eddy represented in FORA is generated through the model dynamics.

Fig. 15

Time evolution of the FORA SSH field during the formation stage of the Kuroshio large meander in 1986–1988 (unit in cm). Red arrows indicate a meander of interest that first appears southeast of Kyushu, moves eastward and finally develops into the large meander. The location of the ASUKA line, which is used for evaluation of the Kuroshio transport in Fig. 16, is shown in a. Note that the ASUKA line extends southward to 26°N

We next look at the Kuroshio transport across the ASUKA (Affiliated Surveys of the Kuroshio off Cape Ashizuri) line off Shikoku, which coincides with a subsatellite track of TOPEX/Poseidon and Jason-1/2 (see Fig. 15a for geographical location). Figure 16 compares the time series of the Kuroshio transport above 1000 m across the ASUKA line between FORA and an altimeter-based transport estimate, which is deduced from sea-level difference across the Kuroshio observed by TOPEX/Poseidon and Jason-1 (Imawaki et al. 2001).

We calculated an eastward and throughflow transport of the Kuroshio in a similar way to the method of Imawaki et al. (2001). As can be found in Fig. 15, a stationary anticyclonic circulation exists to the south of Shikoku. The eastward transport includes the contribution of the stationary anticyclonic circulation. In contrast, the throughflow transport is calculated by subtracting a westward transport associated with the anticyclonic circulation from the eastward transport. It can thus be regarded as the net Kuroshio transport.

Fig. 16

Time series of the Kuroshio a eastward and b throughflow transport above 1000 m across the ASUKA line. See text for definition of the two transports. Black lines indicate transports estimated from satellite altimeter data (Imawaki et al. 2001), and red lines are FORA results. Mean transport for each time series averaged over October 1992–March 2010 is also shown

Both time series of the eastward and the throughflow transport for FORA are in good agreement with the observation. The eastward transport, however, has a somewhat large bias. The mean eastward transports averaged from October 1992 to March 2010 are 61.0 and 67.2 Sv for observation and FORA, respectively. This bias reduces for the throughflow transport, implying that the bias originates in a strong anticyclonic circulation off Shikoku. The assimilation scheme for altimeter-derived sea-level anomalies could be one possible cause for this bias. As described in Sect. 2, the observation operator for altimeter data in MOVE-4DVAR takes model-data misfits in terms of surface dynamic height, which is calculated with a reference depth of 1500 m. The reference depth might not be appropriate for the stationary anticyclonic circulation, which has a deep vertical structure (e.g., Kobashi and Hanawa 2004). At the same time, the observation operator does not take into consideration ocean mass-related sea-level changes. Thus, improving the assimilation scheme for altimeter-derived sea-level anomalies might lead to a reduction of the bias for the anticyclonic circulation off Shikoku.

5.2 Kuroshio extension

The Kuroshio Extension (KE) is an eastward-flowing free jet formed after the Kuroshio separates from the Japanese coast. The KE region exhibits the highest mesoscale eddy variability in the North Pacific. In addition, low-frequency changes on a decadal time scale are one of the prominent features of the KE jet variability. The KE decadal variability is considered to be remotely forced by the large-scale wind stress field in the North Pacific (Qiu 2003; Taguchi et al. 2007).

In order to look into the KE variability represented by FORA, we plot in Fig. 17 the time series of three indices for the KE jet. For comparison, indices calculated from the AVISO altimetry data are also shown. All indices for both FORA and AVISO are calculated on the basis of the monthly mean fields. The latitudinal position of the KE jet in Fig. 17a is determined by zonally (141°–153°E) averaging latitudinal positions defined by the maximum surface velocity. The path length of the KE jet, which characterizes the KE dynamic state, is calculated by tracking the maximum surface velocity (141°–153°E). The eddy kinetic energy (EKE) is calculated using the surface velocity anomaly and is averaged over 141°–153°E and 32°–38°N. The surface velocity anomaly used for the EKE calculation is defined as a deviation from the climatological mean state from 1993 to 2012 for both FORA and AVISO.
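Two of the indices defined above, the KE latitude and the area-averaged EKE, can be sketched as follows. The sketch assumes monthly surface velocities stored as nested lists, takes the climatological mean over the supplied months, and uses a simple unweighted average over grid points; area weighting and the function names are choices made here for illustration only.

```python
def ke_latitude(lats, speed):
    """Zonal mean of the latitude of maximum surface speed.

    speed[j][i] is the surface speed at latitude index j and longitude
    index i, restricted beforehand to the longitude band of interest.
    """
    nx = len(speed[0])
    jet_lats = []
    for i in range(nx):
        j_max = max(range(len(lats)), key=lambda j: speed[j][i])
        jet_lats.append(lats[j_max])
    return sum(jet_lats) / nx


def eke_time_series(u, v):
    """Area-mean EKE for each month from u[t][n] and v[t][n].

    Anomalies are deviations from the time mean at each of the n grid
    points inside the averaging box.
    """
    nt, npts = len(u), len(u[0])
    u_mean = [sum(u[t][n] for t in range(nt)) / nt for n in range(npts)]
    v_mean = [sum(v[t][n] for t in range(nt)) / nt for n in range(npts)]
    return [
        sum(0.5 * ((u[t][n] - u_mean[n]) ** 2 + (v[t][n] - v_mean[n]) ** 2)
            for n in range(npts)) / npts
        for t in range(nt)
    ]
```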

Fig. 17

Time series of three indices for the KE state. a KE latitude defined as a zonal average of the latitudinal position of the KE jet between 141°E and 153°E, b KE path length integrated from 142°E to 153°E and c area-averaged EKE within 141°–153°E and 32°–38°N. Black and red lines denote indices calculated from AVISO data and FORA, respectively. Time series of SSH anomaly averaged over the KE southern recirculation region (140°–155°E, 30°–35°N) with a 5-month running mean is also shown by the gray dotted line in (a)

The KE variability on interannual to decadal time scales in FORA is in good agreement with that in the AVISO data. The meridional shift of the KE jet in Fig. 17a on decadal time scales is synchronized with the decadal changes in the strength of the KE southern recirculation gyre characterized by the area-averaged SSH anomaly (see the gray dotted line in Fig. 17a), suggesting that the northward (southward) shift of the KE jet is accompanied by a strong (weak) recirculation gyre. The time series of the path length of the KE jet shown in Fig. 17b also exhibits distinct low-frequency changes, which largely correlate with the meridional shift of the KE jet. When the KE jet migrates southward, the path length tends to be large, meaning a spatially convoluted KE path (unstable state). In contrast, the KE path length tends to be small, that is, KE is in a stable state when the KE jet migrates northward. These features of the KE variability in FORA are consistent with findings in a number of previous studies (e.g., Qiu and Chen 2005; Nakano and Ishikawa 2010). Furthermore, the time series of the KE latitude and path length in the 1980s, when there are no altimeter data, also seems to be consistent with a time series by Seo et al. (2014), who inferred the KE dynamic state using the wintertime SST gradient and constructed the time series of the KE latitude and the path length from 1982 to 2011 (see Fig. 4 in Seo et al. 2014). We, however, need further evaluation of the KE variability especially in the 1980s, which remains for future studies.

We finally mention the continuity of the FORA data in the KE region. Figure 17c shows the time series of EKE in the KE region, which reflects mesoscale eddy activity. A distinct gap is recognizable in 1993, when the satellite altimeter data begin to be assimilated. The EKE level before 1993 is obviously low compared to that during the altimeter era. A similar EKE gap can be seen in the Subtropical Countercurrent region (not shown), where mesoscale eddy activity is also high, as in the KE region. We therefore need to pay special attention to the energy gap when looking into the energy field, and reduction of the gap should be the subject of future studies.

5.3 Oyashio

The Oyashio Current, the western boundary current of the wind-driven subarctic gyre in the North Pacific Ocean, flows southwestward along the Kuril Islands and the southeast coast of Hokkaido (see Fig. 25b for geographical location) and then splits into two branches. One branch continues southward along the east coast of Honshu and is known as the coastal branch or the first branch. The other, called the offshore branch or the second branch, turns direction and flows eastward. The Oyashio water is characterized by cold and fresh water, which is formed as a result of merging waters from the East Kamchatka Current (EKC) and Okhotsk Sea (e.g., Kono and Kawasaki 1997). It meets with warm and saline Kuroshio water off the east of Honshu to form complicated frontal structures, and thus this region is often called the mixed water region (MWR).

The first branch of the Oyashio has significant seasonal variation, which is well characterized by the horizontal extent of the Oyashio water in MWR. The horizontal extent increases from winter to early spring and reaches the maximum in March–April. It then gradually decreases, and the minimum extent occurs in November (Yasuda 2003). Moreover, the Oyashio Current exhibits prominent interannual variability, which is considered to have a relation to changes in large-scale atmospheric forcing in the North Pacific (Sekine 1988; Sekine 1999) although its mechanism is still controversial.

Figure 18 presents year-to-year changes of the Oyashio area in MWR averaged over March–May. The Oyashio area is defined as the area of water colder than 5 °C at 100 m (Kawai 1972) within a rectangular region (141°–148°E, 30°–43°N). The time series of the Oyashio area represented by FORA indicates that the maximum and minimum during 1982–2012 appeared in 1984 and 2000, respectively. The maximum extent in 1984 represented by FORA is consistent with the fact that an anomalous southward intrusion of the Oyashio occurred in 1984 (Sekine 1988).
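The area definition above can be illustrated with a minimal sketch. It assumes a regular latitude-longitude grid of cell centers, a spherical Earth, and land points stored as NaN; the function names `cell_areas_km2` and `oyashio_area_km2` and the synthetic temperature field are hypothetical and are not part of the FORA processing chain.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth with mean radius


def cell_areas_km2(lat_deg, lon_deg):
    """Approximate cell areas (km^2) on a regular lat-lon grid of cell centers."""
    dlat = np.deg2rad(abs(lat_deg[1] - lat_deg[0]))
    dlon = np.deg2rad(abs(lon_deg[1] - lon_deg[0]))
    strip = EARTH_RADIUS_KM**2 * dlat * dlon * np.cos(np.deg2rad(lat_deg))
    return np.repeat(strip[:, None], lon_deg.size, axis=1)


def oyashio_area_km2(temp_100m, lat_deg, lon_deg, threshold=5.0,
                     lon_range=(141.0, 148.0), lat_range=(30.0, 43.0)):
    """Sum the area of cells colder than `threshold` (degC) inside the box."""
    area = cell_areas_km2(lat_deg, lon_deg)
    in_lat = (lat_deg >= lat_range[0]) & (lat_deg <= lat_range[1])
    in_lon = (lon_deg >= lon_range[0]) & (lon_deg <= lon_range[1])
    in_box = in_lat[:, None] & in_lon[None, :]
    # Land points are assumed to be NaN; they never count as cold water.
    cold = np.nan_to_num(temp_100m, nan=np.inf) < threshold
    return float(area[in_box & cold].sum())


# Synthetic March-May mean field: 3 degC north of 40N, 10 degC elsewhere.
lat = np.arange(30.0, 43.01, 0.5)
lon = np.arange(141.0, 148.01, 0.5)
temp = np.where(lat[:, None] >= 40.0, 3.0, 10.0) * np.ones((lat.size, lon.size))
area_demo = oyashio_area_km2(temp, lat, lon)
print(f"Oyashio area: {area_demo / 1e4:.2f} x 10^4 km^2")
```

Applying the same function to the March–May mean temperature at 100 m for each year would give a time series comparable in form to Fig. 18.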

Fig. 18 Time series of the Oyashio area in spring (March–May) during the period from 1982 to 2012 (unit in \(10^4\) km\(^2\)). The Oyashio area is defined by the area of cold water at 100 m with temperature less than 5 °C and is calculated within 141°–148°E and 30°–43°N

To examine the difference between the two anomalous years, Fig. 19 compares the temperature and horizontal velocity distributions at 100 m in April 1984 and April 2000. In April 1984, the Oyashio water characterized by temperature less than 5 °C is widely distributed in MWR. It intrudes southward along the east coast of Honshu, and the southern tip of the Oyashio water extends to around 36°N. These features shown by FORA are in good agreement with observed features reported by former studies (Okuda 1986; Sekine 1988).

In contrast, almost no Oyashio area is found in MWR in April 2000. A notable feature is a warm anticyclonic eddy around 146°E and 41.5°N, which seems to block the southward intrusion of the Oyashio water. The red solid line in Fig. 19 denotes the trajectory of the eddy from 3 November 1998 through 23 September 2000, which is detected using daily SSH fields in FORA. This long-lived mesoscale eddy is first detected in the KE region in November 1998 and then migrates northward along the Japan Trench; it reaches the area southeast of Hokkaido in the winter to spring of 2000. Similar northward-moving mesoscale eddies are often observed in this area (Itoh and Yasuda 2010), and their relation to variability in the Oyashio Current has also been discussed (Isoguchi and Kawamura 2006; Kuroda et al. 2015).

Fig. 19 Comparison of monthly mean temperature (unit in °C) and velocity (unit in cm s\(^{-1}\)) fields at 100 m in a April 1984 and b April 2000. The solid white line indicates the 5 °C temperature isoline, which is often used as an indicator of the Oyashio area. The red solid line represents the trajectory of the anticyclonic warm-core eddy located around 146°E and 41.5°N in April 2000, plotted for the period from November 1998 to September 2000. The solid purple line along 41.5°N denotes a repeat hydrographic line observed by R/V Kofu Maru

Figure 20 shows vertical sections of temperature and salinity along 41.5°N (see Fig. 19) for hydrographic observations and FORA in April 1984 and April 2000. The hydrographic data in Fig. 20 were obtained by the research vessel Kofu Maru, which was operated by the Hakodate Marine Observatory of JMA. There are some discrepancies between the observation and FORA. For example, the subsurface temperature inversion in 1984, known as the mesothermal structure (Ueno and Yasuda 2000), is somewhat weak in FORA, and the temperature near the surface in 2000 is warmer than in the observation. The FORA results, nevertheless, capture the observed differences in the temperature and salinity fields between 1984 and 2000 well.

In April 1984, the surface layers are occupied by cold and fresh Oyashio water. On the other hand, warm and saline Kuroshio water is widely distributed in April 2000. A distinct feature of the temperature and salinity fields in 2000 is the existence of the long-lived warm-core eddy located around 146°E. The vertical structure of the eddy extends to about 600 m. The above results imply that the southward intrusion of the Oyashio water, as measured by the Oyashio area in MWR, should be considered together with mesoscale eddies in the region as well as the strength of the Oyashio Current. In addition, it is worth noting that the KE jet was in an unstable state in 2000 (see Fig. 17b); hence, there were many mesoscale eddies in the KE region. This suggests that the KE state may also be a factor controlling the variations in the Oyashio intrusion through the mesoscale eddy activity. The FORA data, a long-term high-resolution dataset, could be useful to evaluate this hypothesis, which remains for future studies.

Fig. 20 a, c Temperature (unit in °C) and b, d salinity (unit in psu) sections from the surface down to 1000 m along 41.5°N (see also Fig. 19) in (a, b) April 1984 and (c, d) April 2000. Observation and FORA results are compared

5.4 Japan Sea

The Japan Sea is a semi-enclosed marginal sea of the North Pacific Ocean. It is connected to adjacent seas through four narrow and shallow straits. The straits of Tsushima (TSM) and Tsugaru (TGR) connect the Japan Sea to the East China Sea and the North Pacific, respectively. There are two straits between the Japan Sea and Okhotsk Sea: the Soya Strait (SOY) and Tatar Strait (TTR). Because of the semi-enclosed character, volume transports through the straits are thought to be important for mean state and variability in the Japan Sea. This is why a number of observational studies have attempted to estimate an accurate volume transport through each strait (e.g., Fukudome et al. 2010; Ito et al. 2003; Fukamachi et al. 2010). Recent observational efforts are well summarized in Na et al. (2009).

Annual mean transport over 31 years (1982–2012) and its standard deviation at each strait are shown in Table 3. A positive transport value represents northward flows in TSM and TTR and eastward flows in TGR and SOY. In the water budget of the Japan Sea represented by FORA, the inflow transport at TSM (2.32 Sv) is almost balanced by the outflow transports at TGR and SOY (1.63 and 0.67 Sv). The contribution of the transport at TTR (0.016 Sv) is very small. The ratios of the outflow transports to the inflow through TSM are thus estimated at 70 and 29 % for TGR and SOY, respectively, which is consistent with former estimates (Cho et al. 2009; Na et al. 2009).
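The budget arithmetic quoted above can be checked with a short sketch. The transport values are those given in the text; the variable names are hypothetical, and treating the positive (northward) TTR transport as an outflow from the Japan Sea is an assumption made here for illustration.

```python
# 31-year mean volume transports in FORA (Sv), as quoted in the text.
transport_sv = {"TSM": 2.32, "TGR": 1.63, "SOY": 0.67, "TTR": 0.016}

inflow = transport_sv["TSM"]

# Ratio of each main outflow to the TSM inflow, in percent.
ratios = {name: 100.0 * transport_sv[name] / inflow for name in ("TGR", "SOY")}

# Assumption: northward flow through TTR leaves the Japan Sea.
outflow = transport_sv["TGR"] + transport_sv["SOY"] + transport_sv["TTR"]
residual = inflow - outflow

for name, value in ratios.items():
    print(f"{name}/TSM = {value:.0f} %")
print(f"budget residual = {residual:.3f} Sv")
```

With these numbers the residual of the long-term mean budget is of the order of 0.01 Sv or less, which is why the inflow and outflows are described as almost balanced.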

Table 3 Comparison of volume transport at each strait in the Japan Sea between FORA and observations

Observational transport estimates at the straits are also given in Table 3 for comparison with the FORA results. Fukudome et al. (2010) showed a 10-year time series of volume transport at TSM obtained from acoustic Doppler current profiler (ADCP) data from February 1997 to February 2007, and the 10-year average of the transport is estimated at 2.65 Sv. They also showed volume transports through the east and west channels of TSM as 1.20 and 1.45 Sv. Ito et al. (2003) estimated the mean volume transport through TGR as 1.5 Sv using ADCP data from November 1999 to March 2000. Although this observation covers only a short period, the value is almost the same as other observational results (Shikama 1994; Nishida et al. 2003). Fukamachi et al. (2010) showed that the annual volume transport through SOY was 0.62–0.67 Sv from September 2006 to July 2008 using bottom-mounted ADCP data; however, the annual mean from 2004 to 2005 was 0.94–1.04 Sv (Fukamachi et al. 2008). They suggested that the interannual variation in the transport through SOY is large. We also calculate the volume transport in FORA at each strait for the same period as the reference observation, which is summarized in Table 3. Note that the transports calculated for FORA over these periods are closer to the observational estimates at almost all straits than the annual mean transports for 1982–2012.

The mean transports represented by FORA are largely consistent with the observations, although there are some minor discrepancies. The volume transport in FORA is slightly lower (higher) at TSM (TGR). The lower transport at TSM seems to be attributable to an underestimate in the western channel of TSM (1.09 Sv). It should however be noted that there is an imbalance among the observation-based transport estimates at the straits (Han et al. 2015), which could also be one cause of the mismatch between the FORA and observed transports.

Figure 21a shows the seasonal variation of volume transport calculated as a 31-year average at each strait. The annual cycles of volume transport at TSM and SOY are similar, with maxima in summer and minima in winter. In contrast, the volume transport at TGR has its maximum in November and minimum in February. These features in FORA are consistent with seasonal variations indicated by previous studies (e.g., Fukudome et al. 2010). Figure 21b shows year-to-year variations in annual mean volume transports through the four straits. In addition to the interannual variations, the time series of the annual mean transports exhibit positive linear trends, which are estimated at \(1.62\times 10^{-2}\), \(9.28\times 10^{-3}\) and \(6.12\times 10^{-3}\) Sv/year for TSM, SOY and TGR, respectively. These trends are significant at the 99 % confidence level.
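As a minimal sketch of how a monthly climatology and a linear trend of annual means might be obtained from a monthly transport series, the following uses ordinary least squares and a Student-t statistic for the slope. The significance test actually used for the trends above is not stated, so the t-test here (which ignores serial correlation) is an assumption; the function names and the synthetic series are hypothetical. The series is assumed to start in January and to cover whole years.

```python
import numpy as np


def monthly_climatology(monthly, n_years):
    """Mean and standard deviation for each calendar month."""
    table = np.asarray(monthly, dtype=float).reshape(n_years, 12)
    return table.mean(axis=0), table.std(axis=0, ddof=1)


def annual_mean_trend(monthly, n_years):
    """Least-squares slope (per year) of annual means and its t-statistic."""
    annual = np.asarray(monthly, dtype=float).reshape(n_years, 12).mean(axis=1)
    x = np.arange(n_years) - (n_years - 1) / 2.0  # centered year index
    slope = (x @ annual) / (x @ x)
    resid = annual - annual.mean() - slope * x
    std_err = np.sqrt((resid @ resid) / (n_years - 2) / (x @ x))
    return slope, slope / std_err


# Synthetic 31-year series (Sv): trend + seasonal cycle + noise.
rng = np.random.default_rng(0)
n_years = 31
t = np.arange(n_years * 12)
monthly = (2.3 + 0.0162 * (t // 12) + 0.3 * np.sin(2 * np.pi * t / 12)
           + rng.normal(0.0, 0.05, t.size))

clim_mean, clim_std = monthly_climatology(monthly, n_years)
slope, t_stat = annual_mean_trend(monthly, n_years)

T_CRIT_99 = 2.756  # two-sided 99 % Student-t critical value, 29 degrees of freedom
print(f"trend = {slope:.2e} Sv/year, significant at 99 %: {abs(t_stat) > T_CRIT_99}")
```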

Fig. 21 Volume transport through the Tsushima Strait (red), Tsugaru Strait (green), Soya Strait (blue) and Tatar Strait (black). a Monthly mean averaged over 31 years with standard deviations for each month (error bar) and b annual mean with linear trend (black line)

The upper ocean heat content in the Japan Sea also shows a significant increasing trend, as shown in Fig. 22. The spatial pattern of the trend compares well with that in Na et al. (2012). Further, in Fig. 23 we compare time-depth sections of temperature between FORA and observations at the JMA long-term monitoring station PM-5 (134.72°E, 37.72°N) located around the Yamato Basin, where the upper ocean heat content shows a significant increasing trend (see Fig. 22). The isotherms deepen with time in both datasets. This feature is consistent with the increasing trend in the upper ocean heat content.

Fig. 22 Linear trend over 31 years of the upper 300 m heat content in the Japan Sea. The thick black line is the 300 m depth contour. The triangle marks the observation station (PM-5) used for the comparison

Na et al. (2012) also suggested that low-frequency variability in the upper ocean heat content on a decadal time scale is apparent in the Japan Sea. To examine the long-term variation of the heat content in FORA, we calculated the detrended heat content anomaly averaged over the Japan Sea (130°E–140°E, 36°N–43°N) and applied a 13-month running-mean filter (Fig. 24). The time series of the heat content anomaly in the Japan Sea exhibits significant variations on interannual to decadal time scales. The decadal variations are in phase with those in Na et al. (2012). Although the previous studies are limited to temperature analysis, the FORA data, which include both the temperature and velocity fields, enable us to investigate the variations of the Japan Sea in more detail, such as the relationship between the ocean heat content and volume transport through TSM. This dataset therefore has the potential to clarify the decadal variations in the Japan Sea using not only the temperature but also the velocity field.
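The detrending and filtering steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the processing actually used: the trend is removed by a least-squares straight line, the running mean is centered, and the six months at each end where the 13-month window is incomplete are left as NaN. The function names and the synthetic series are hypothetical.

```python
import numpy as np


def detrend_linear(series):
    """Remove the least-squares straight line from a 1-D series."""
    series = np.asarray(series, dtype=float)
    t = np.arange(series.size)
    coef = np.polyfit(t, series, 1)
    return series - np.polyval(coef, t)


def running_mean(series, window=13):
    """Centered running mean; points without a full window are NaN."""
    series = np.asarray(series, dtype=float)
    half = window // 2
    smooth = np.full(series.size, np.nan)
    kernel = np.ones(window) / window
    smooth[half:series.size - half] = np.convolve(series, kernel, mode="valid")
    return smooth


# Synthetic monthly area-mean heat content: trend + decadal + seasonal signal.
months = np.arange(31 * 12)
heat_content = (0.01 * months
                + 1.0 * np.sin(2 * np.pi * months / 120.0)
                + 2.0 * np.sin(2 * np.pi * months / 12.0))

anomaly = detrend_linear(heat_content)   # analogous to the gray line in Fig. 24
filtered = running_mean(anomaly, 13)     # analogous to the black line in Fig. 24
```

A 13-month window spans slightly more than one annual cycle, so it strongly damps the seasonal signal while retaining interannual to decadal variations.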

Fig. 23 Time series of vertical profiles of potential temperature in winter at station PM-5 for (left) the observation and (right) the reanalysis

Fig. 24 Time series of the monthly detrended heat content anomaly in the upper 300 m in the Japan Sea (gray line) and its 13-month running mean (black line)

5.5 Sea ice in the Okhotsk Sea

One of the characteristics of FORA-WNP30 is that a sea ice model and an ice assimilation scheme are incorporated. In this subsection, we therefore present sea ice variability in the Okhotsk Sea and its relation to oceanic variations represented by FORA. The Okhotsk Sea, located to the northeast of Hokkaido, is known to be one of the southernmost seasonal sea ice zones in the Northern Hemisphere. A large amount of sea ice is produced at coastal polynyas in the northwest shelf region (Kimura and Wakatsuchi 2004), which are open water areas caused by strong offshore wind (Martin et al. 1998). In the northwestern coastal polynya, dense shelf water is formed because of brine rejection associated with the active ice production (Shcherbina et al. 2003), and it is thought to be the ventilation source of NPIW (Talley 1991).

In autumn, sea ice begins to cover the northwestern shelf region of the Okhotsk Sea and then spreads southward along the east coast of Sakhalin. The southernmost tip of sea ice reaches the north coast of Hokkaido in mid to late winter, and the maximum ice extent occurs in March. Subsequently, the sea ice extent starts decreasing, and most of the sea ice disappears by June. The sea ice extent in the Okhotsk Sea shows large interannual variability (Ohshima et al. 2006). In Fig. 25a, we compare interannual variations of the sea ice extent in March between observations and FORA. The observational data used for the comparison are calculated from quasi-five-daily ice charts produced by JMA, which are created through a manual analysis of in situ, satellite and aircraft observations. The ice extent for FORA is defined here as the cumulative area of grid cells in the Okhotsk Sea that have more than 15 % sea ice concentration.
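The ice extent definition above amounts to a thresholded area sum. A minimal sketch follows, assuming that concentration is stored as a fraction between 0 and 1 and that the cell areas and a mask for the Okhotsk Sea are already available; the function name `sea_ice_extent` and the small example arrays are hypothetical.

```python
import numpy as np


def sea_ice_extent(concentration, cell_area_km2, region_mask, threshold=0.15):
    """Total area (km^2) of cells in the region with concentration above threshold."""
    icy = (np.asarray(concentration) > threshold) & np.asarray(region_mask)
    return float(np.asarray(cell_area_km2)[icy].sum())


# Tiny synthetic example: six cells of 100 km^2 each.
conc = np.array([[0.00, 0.10, 0.20],
                 [0.15, 0.90, 1.00]])
area = np.full(conc.shape, 100.0)
okhotsk = np.ones(conc.shape, dtype=bool)

extent_all = sea_ice_extent(conc, area, okhotsk)

# Excluding one ice-covered cell from the region reduces the extent.
okhotsk_partial = okhotsk.copy()
okhotsk_partial[1, 2] = False
extent_partial = sea_ice_extent(conc, area, okhotsk_partial)
```

Note that extent counts the full area of every cell above the threshold, in contrast to ice area, which would weight each cell by its concentration.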

Fig. 25 a Time series of sea ice extent in the Okhotsk Sea in March for (black) observation and (red) FORA. b Year-to-year variations of the sea ice edge in March defined by the 15 % isoline of monthly mean sea ice concentration in FORA. The climatological ice edge in March and those in the maximum and minimum ice extent years (2001 and 2009) are depicted by green, red and blue solid lines, respectively

Interannual variability of the sea ice extent in FORA compares well with the observed variability. The maximum sea ice extent during the period from 1982 to 2012 occurred in 2001, and the minimum was in 2009 in both datasets. It should be noted that a new record low in sea ice extent since 1970 was set in the 2014/2015 winter. Figure 25b displays the year-to-year change of the sea ice edge in March, which is defined by the 15 % ice concentration isoline. The figure indicates that the year-to-year variation depends on how far sea ice expands into the southeastern part of the Okhotsk Sea.

A recent study by Nakanowatari et al. (2010) pointed out that the maximum sea ice extent in the Okhotsk Sea is highly correlated with SST around the EKC region in the preceding autumn. To confirm this relation using the FORA data, in Fig. 26 we plot the lag correlation maps between the sea ice extent of the Okhotsk Sea in March and SST in the preceding October to December. As shown by Nakanowatari et al. (2010), the correlation maps indicate that SST off the east coast of the Kamchatka Peninsula in autumn has a large negative correlation with the sea ice extent in the subsequent winter. It is also worth emphasizing that the area with high correlation coefficients in FORA is more closely confined along EKC than in the maps of Nakanowatari et al. (2010). This area moves southwestward along EKC, and a part of the correlation signal enters the southeastern part of the Okhotsk Sea through straits along the Kuril Islands, where there is large interannual variability in the sea ice extent. These features in FORA support a hypothesis proposed by Nakanowatari et al. (2010) that the interannual variability of the sea ice extent in the Okhotsk Sea is influenced by advection of temperature anomalies by EKC.
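A lag correlation map of this kind can be sketched as a point-wise correlation between the March ice extent and the SST field of a month in the preceding calendar year. The sketch below assumes the Pearson correlation coefficient, which is not stated explicitly in the text; the function name `correlation_map` and the synthetic data are hypothetical.

```python
import numpy as np


def correlation_map(index, field):
    """Pearson correlation of a 1-D index (n,) with a field (n, ny, nx)."""
    idx = np.asarray(index, dtype=float)
    fld = np.asarray(field, dtype=float)
    idx = idx - idx.mean()
    anom = fld - fld.mean(axis=0)
    cov = np.tensordot(idx, anom, axes=(0, 0))
    denom = np.sqrt((idx ** 2).sum() * (anom ** 2).sum(axis=0))
    with np.errstate(invalid="ignore", divide="ignore"):
        return cov / denom  # NaN where the field has no variance (e.g., land)


# Synthetic data for 31 years on a 1 x 2 grid.
rng = np.random.default_rng(1)
sst_oct = rng.normal(size=(31, 1, 2))        # October SST of each year
ice_march = np.empty(31)
ice_march[0] = 0.0                            # no preceding October available
# March extent of year k+1 set by October SST of year k at the first grid point.
ice_march[1:] = 5.0 - 2.0 * sst_oct[:-1, 0, 0]

# Pair March of year k+1 with October of year k (the preceding autumn).
r_map = correlation_map(ice_march[1:], sst_oct[:-1])
```

In this synthetic case the first grid point is perfectly anticorrelated with the ice extent by construction, which mimics the sign of the relation reported for SST along EKC.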

Fig. 26 Lag correlation maps between sea ice extent of the Okhotsk Sea in March and monthly mean SST in the preceding a October, b November and c December. Climatological monthly mean fields of FORA surface velocity (unit in cm s\(^{-1}\)) are also shown. Note that the climatological surface velocities with magnitude less than 3 cm s\(^{-1}\) are masked

6 Summary and future work

In this study, we have produced a four-dimensional variational ocean re-analysis for the Western North Pacific over 30 years (FORA-WNP30), which is the first-ever dataset covering the western North Pacific over three decades at eddy-resolving resolution. MOVE-4DVAR, which consists of an ocean ice model for the western North Pacific and a 4DVAR assimilation scheme, is employed to assimilate in situ profiles, gridded SST and altimeter-derived SSH observations. In addition, ice concentration observations are assimilated through a least squares analysis and nudging. A long-term assimilation experiment is conducted using the JRA-55 atmospheric forcing to produce the reanalysis dataset for 1982 to 2012.

We first compare the FORA data with the MOVE-3DVAR reanalysis dataset and independent observations such as ADCP current measurements and tide-gauge sea-level records. The comparison indicates that the FORA data reproduce the observed variability in ocean current and sea level. The comparison between FORA and MOVE-3DVAR also shows that FORA has higher accuracy than the 3DVAR product. We then evaluate the mean field represented by FORA focusing on the SST, SSH and MLD fields. Overall, the FORA dataset reproduces basic features in the western North Pacific well, although there are some biases.

The interannual to decadal variability and anomalous events in the seas around Japan are then analyzed. One of the outstanding features in FORA is that anomalous events such as the Kuroshio large meander and anomalous intrusion of the Oyashio in the 1980s, when there are no altimeter data, are successfully reproduced. In the Japan Sea, FORA reproduces variations in the upper ocean heat content as well as inflow/outflow transports on interannual to decadal time scales well, implying that the FORA dataset is useful for further analysis of the climate variability in the sea. Variations of sea ice extent in the Okhotsk Sea and their relation to SST changes in the preceding autumn are also properly represented by FORA compared to former observational studies. To summarize the above, we conclude that FORA is a valuable dataset for a variety of oceanographic research topics and potentially for related fields such as climate study, meteorology and fisheries.

There are, however, some issues found in FORA that need to be resolved in future work. As mentioned in Sect. 5.2, the EKE field in FORA has a distinct gap in 1993, when the satellite altimeter data begin to be assimilated. The mesoscale eddy activity in FORA before 1993 is at a level similar to that in a free simulation, but is relatively low compared to that in the altimeter era, resulting in the EKE gap. One likely cause of the EKE gap is a model bias derived from the atmospheric forcing used, and thus using an atmospheric forcing adjusted for ocean models would be a useful approach to reduce the gap. Use of such adjusted forcing is becoming common in the ocean modeling community (Brodeau et al. 2010; Griffies et al. 2009). We also pointed out a cold bias in the FORA SST and a potential issue in the design of the observation operator for the SLA data assimilation. Another issue found in FORA but not explicitly mentioned in the previous sections is that excessively cold water with temperature below the freezing point is sometimes analyzed near the surface in the southern part of the Okhotsk Sea. This issue appears in summer and arises from the analysis increment crossing over a strong temperature front between cold Okhotsk Sea water and warm water brought by the Soya Warm Current. As described in Sect. 2, we added a constraint term to the cost function in MOVE-4DVAR to prevent such unrealistically cold temperatures, but it was set to work only in MWR off the east of Honshu, where the same issue can occur (Usui et al. 2011). It should therefore be possible to resolve this issue by setting the constraint to work in the Okhotsk Sea as well as in MWR. These improvements should be made when updating the dataset in the future.

The FORA-WNP30 dataset is available for research use. Instructions for obtaining the dataset can be found at http://synthesis.jamstec.go.jp/FORA/.