1 Introduction

Assessing the carbon dioxide (CO2) fluxes at the earth’s surface, between the atmosphere–land and atmosphere–oceans, is essential for studying the global carbon cycle. Within the atmosphere, the mole fraction of CO2 has distinct seasonal variations caused by natural CO2 fluxes that are superimposed with the human-induced increasing trend. These natural fluxes are induced by dynamic photosynthesis and respiratory activity in the terrestrial biosphere and by the sea–air CO2 exchange driven by ocean thermodynamic activity, oceanic surface mixing, and biological activity. In addition to seasonal variations, these natural fluxes exhibit notable interannual variations in response to major climate events, such as El Niño (e.g., Keeling et al. 1995). Changes in physical or biological land and ocean conditions can further influence the CO2 flux trends on the decadal scales. However, the mechanisms underlying the natural CO2 flux variations are still unclear, resulting in considerable uncertainties in global warming predictions made by earth system models equipped with climate–carbon cycle feedbacks (Friedlingstein et al. 2006).

An inverse analysis of atmospheric CO2, in which an atmospheric transport model connects the surface fluxes to the observed mole fractions of atmospheric CO2, is a prominent method that provides quantitative, observation-based estimates of spatiotemporal surface flux variations. Ground-based stations and other observational platforms have measured atmospheric CO2 for decades; inverse analyses of these data have revealed considerable interannual variations in region-by-region CO2 fluxes (Bousquet et al. 2000; Patra et al. 2005a,b; Baker et al. 2006; Chevallier et al. 2010; Peylin et al. 2013). Over the last decade, the four-dimensional variational method (4D-Var) and the ensemble Kalman smoother (or filter), which are well-established data assimilation methods in meteorological and oceanic research, have become common inverse analysis methods and have made use of a wide variety of observations (Peters et al. 2007; Chevallier et al. 2010; van der Laan-Luijkx et al. 2017; Niwa et al. 2017b; Liu et al. 2021). However, Friedlingstein et al. (2020) demonstrated that the interannual flux variations estimated by different inverse analyses span large ranges, even for zonal means over certain latitudinal bands, which highlights the limited reliability of such inverse analyses.

Recently, societal momentum to reduce CO2 emissions has grown globally in response to the 1.5 °C/2.0 °C target of the Paris Agreement. Consequently, CO2 flux data from inverse analyses are attracting attention because of their potential to provide useful information for verifying national inventories. Recent studies by Chevallier (2021) and Deng et al. (2022) compared inversion fluxes with national inventories based on agriculture, forestry, and other land-use changes (LUCs). However, they used data from global-scale inversions with horizontal resolutions of hundreds of kilometers. Therefore, further development is required to increase the horizontal resolution so that fluxes can be compared more consistently with national inventories. This is a demanding task because investigating long-term trends requires an analysis spanning several decades, which is computationally expensive.

Niwa et al. (2017a, b) developed a sophisticated and computationally efficient inversion system based on the Nonhydrostatic Icosahedral Atmospheric Model (NICAM: Tomita and Satoh 2004; Satoh et al. 2008, 2014). The CO2 inverse analysis using this model is termed the NICAM-based Inverse Simulation for Monitoring CO2 (NISMON-CO2) (Niwa et al. 2021), and the CO2 flux estimates made using this system were included in a multi-inversion analysis of the Global Carbon Project (GCP) (Friedlingstein et al. 2020, 2022). The general features of NISMON-CO2 were described by Niwa et al. (2017b), who performed a pseudo-observation experiment. However, the experiment was performed for only a year under ideal conditions; therefore, the practical performance of NISMON-CO2 for long-term analysis is yet to be demonstrated.

In this study, we describe the latest inversion settings of NISMON-CO2, specifically those intended to improve the inversion performance. Furthermore, we performed a long-term inverse analysis using pseudo-observations that emulate actual observation networks. We focused on the use of NISMON-CO2 in investigating interannual variations in regional CO2 fluxes and on the effectiveness of the newly introduced techniques. Apart from this study, an inverse analysis was performed using NISMON-CO2 with the same inversion settings, but instead using actual observations for 1990–2020, whose data are publicly available as NISMON-CO2 ver. 2021.1 (Niwa 2020). This study also aims to present a detailed description of NISMON-CO2 ver. 2021.1.

2 Methods

2.1 NICAM-TM and NISMON-CO2

First, we describe the fundamental features of the NISMON-CO2 inversion system. One distinct feature of NICAM, on which NISMON-CO2 is based, is that it adopts an icosahedral grid system and solves the continuity equation using the finite volume method (Tomita and Satoh 2004). Owing to the finite volume method, the NICAM-based transport model (NICAM-TM) conserves tracer mass completely without any numerical mass fixer, which motivates the use of NICAM for simulating the atmospheric transport of long-lived species such as CO2 (Niwa et al. 2011a, b). Simulations with NICAM-TM are performed iteratively in an inverse analysis of NISMON-CO2. Further details about NICAM-TM are described by Niwa et al. (2017a).

The inverse analysis system NISMON-CO2 uses 4D-Var to optimize surface CO2 fluxes consistently with atmospheric observations (Niwa et al. 2017b). In 4D-Var, the cost function J, whose minimum state is sought, is defined as:

$$J\left( {\delta {\mathbf{x}}} \right) = \frac{1}{2}\delta {\mathbf{x}}^{{\text{T}}} {\mathbf{B}}^{ - 1} \delta {\mathbf{x}} + \frac{1}{2}\left( {M\left( {\delta {\mathbf{x}}} \right) - {\mathbf{d}}} \right)^{{\text{T}}} {\mathbf{R}}^{ - 1} \left( {M\left( {\delta {\mathbf{x}}} \right) - {\mathbf{d}}} \right),$$
(1)

where δx represents the model parameter vector to be optimized, which comprises the deviations of the surface fluxes and the initial global mole fraction offset from their prior estimate \({\mathbf{x}}^{{{\text{pri}}}}\). Meanwhile, M(.) is an operator representing the model calculation, and d is a vector defined by the observational vector y as \({\mathbf{d}} = {\mathbf{y}} - M\left( {{\mathbf{x}}^{{{\text{pri}}}} } \right)\). The matrices B and R are the error covariance matrices of the prior estimates and the model–observation mismatches, respectively. The iterative calculation of 4D-Var uses the gradient of the cost function:

$$\nabla_{{\delta {\mathbf{x}}}} J = {\mathbf{B}}^{ - 1} \delta {\mathbf{x}} + {\mathbf{M}}^{{\text{T}}} {\mathbf{R}}^{ - 1} \left( {M\left( {\delta {\mathbf{x}}} \right) - {\mathbf{d}}} \right)$$
(2)

to find the optimal model parameters (i.e., posterior estimates) that minimize the cost function. \({\mathbf{M}}^{{\text{T}}}\) in Eq. (2) represents the transpose of M, where M is the tangent linear operator of M(.). The relation \({\mathbf{M}}\Delta {\mathbf{x}} \approx M\left( {\Delta {\mathbf{x}}} \right)\) is satisfied because the model calculates atmospheric transport linearly and does not include any chemical reactions. The calculations of M(.) and \({\mathbf{M}}^{{\text{T}}}\) are performed by forward and adjoint simulations using NICAM-TM, which simulates only CO2 transport in the so-called offline mode with prescribed meteorological data. The meteorological data are prepared in the “online mode,” in which meteorological fields are simulated by nudging horizontal winds toward those of the reanalysis data. In this study, we used the same model settings and prescribed meteorological data as those of Niwa et al. (2017a). The forward and adjoint transport simulations were performed with the horizontal grid level “glevel-5” of NICAM, whose mean grid interval is 223 km. The model has 40 vertical layers extending up to approximately 45 km above sea level.
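Equations (1) and (2) can be illustrated with a small numerical sketch. The example below assumes a purely linear model stored as a dense matrix; the toy dimensions and random values are hypothetical. Because the problem is tiny, the minimizer is obtained by a direct linear solve, which is not how NISMON-CO2 proceeds (it iterates with forward and adjoint simulations).

```python
import numpy as np

def cost_and_gradient(dx, B_inv, R_inv, M, d):
    """Return J (Eq. 1) and its gradient (Eq. 2) for a linear model matrix M."""
    residual = M @ dx - d                       # M(dx) - d
    J = 0.5 * dx @ B_inv @ dx + 0.5 * residual @ R_inv @ residual
    grad = B_inv @ dx + M.T @ (R_inv @ residual)
    return J, grad

# Hypothetical toy problem: 3 flux parameters, 4 observations.
rng = np.random.default_rng(0)
M = rng.normal(size=(4, 3))
B_inv = np.diag(1.0 / np.array([1.0, 2.0, 0.5]))
R_inv = np.diag(1.0 / np.array([0.1, 0.2, 0.1, 0.3]))
d = rng.normal(size=4)

# Setting the gradient to zero gives (B^-1 + M^T R^-1 M) dx = M^T R^-1 d.
dx_opt = np.linalg.solve(B_inv + M.T @ R_inv @ M, M.T @ R_inv @ d)
J_opt, grad_opt = cost_and_gradient(dx_opt, B_inv, R_inv, M, d)
J_zero, _ = cost_and_gradient(np.zeros(3), B_inv, R_inv, M, d)
```

At `dx_opt` the gradient vanishes to rounding error, and the cost is no larger than at the prior (`dx = 0`).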

The model operator M(.) includes spatiotemporal interpolation from the model grids and temporal steps to observational locations and times and the conversion of flux data from latitude–longitude grids to NICAM’s icosahedral grid, as well as atmospheric transport. Therefore, the matrix M and its transpose can be expressed as:

$${\mathbf{M}} = {\mathbf{HM}}^{\prime}{\mathbf{G}},$$
(3)
$${\mathbf{M}}^{{\text{T}}} = {\mathbf{G}}^{{\text{T}}} {\mathbf{M}}^{{\prime {\text{T}}}} {\mathbf{H}}^{{\text{T}}} ,$$
(4)

where H and G represent the spatiotemporal interpolation and the flux data conversion, respectively. Matrix M′ represents the integration of atmospheric transport. This study newly introduces G, which allows the flux data contained in δx to be arrayed in latitude–longitude grids, so that the optimization can be performed in that space rather than in the model grid space (i.e., the icosahedral grids). The detailed conversion scheme is described in the next section. In NISMON-CO2, the optimization is performed using a quasi-Newton scheme termed the Preconditioned Optimizing Utility for Large-dimensional analyses (POpULar: Fujii and Kamachi 2003; Fujii 2005).
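A common way to check that an adjoint code matches its forward operator is the dot-product test, which verifies that the inner product of M δx with δy equals that of δx with M^T δy. The sketch below applies it to Eqs. (3) and (4) with small random stand-in matrices; the sizes and the matrices themselves are hypothetical and bear no relation to the actual NICAM-TM operators.

```python
import numpy as np

rng = np.random.default_rng(1)
n_latlon, n_ico, n_state, n_obs = 12, 5, 8, 3    # hypothetical toy sizes

G = rng.random((n_ico, n_latlon))                # flux conversion to model grid
M_prime = rng.normal(size=(n_state, n_ico))      # stand-in for linear transport
H = rng.random((n_obs, n_state))                 # interpolation to observations

def forward(dx):
    """Apply M = H M' G to a flux perturbation (Eq. 3)."""
    return H @ (M_prime @ (G @ dx))

def adjoint(dy):
    """Apply M^T = G^T M'^T H^T to an observation-space vector (Eq. 4)."""
    return G.T @ (M_prime.T @ (H.T @ dy))

dx = rng.normal(size=n_latlon)
dy = rng.normal(size=n_obs)
lhs = forward(dx) @ dy       # inner product of M dx with dy
rhs = dx @ adjoint(dy)       # inner product of dx with M^T dy
```

The two inner products agree to rounding error only if each factor is transposed and the order of application is reversed, as in Eq. (4).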

2.2 Grid conversion scheme

The cells of the icosahedral NICAM grid are hexagons or pentagons, whereas flux input datasets are originally prepared in latitude–longitude grids; an elaborate conversion between the two is therefore required. Previous studies (Niwa et al. 2011a, 2012, 2017a, b, 2021) adopted simple area averaging, in which data from within a certain distance in the surrounding latitude–longitude grids were averaged. Subsequently, flux values were slightly modified using scaling to ensure mass conservation. In this study, we adopt a more sophisticated grid conversion method that no longer requires mass-conserving scaling and is therefore linear.

The new method initially divides the latitude–longitude grid data into a finer-scale latitude–longitude grid, as depicted in Fig. 1 (the bold dark-gray mesh becomes the thin light-gray mesh), i.e., rasterizing. Next, each finer-scale grid is allocated to a hexagonal grid whose borders are trigonometrically defined in latitude–longitude coordinates. In Fig. 1, the colored grid areas are allocated to the centered icosahedral grid, which derives contributions from the surrounding latitude–longitude grids. Consequently, the flux value converted into the ith icosahedral grid f′i can be expressed as:

$$f_{i}^{{\prime }} = \mathop \sum \limits_{j = 1}^{{N_{i} }} a_{i,j} f\left( {G_{j} } \right)/A_{i} ,$$
(5)

where Gj denotes the original latitude–longitude grid that overlaps the ith icosahedral grid and Ni is the number of those grids (Ni = 9 in the case of Fig. 1). The partial area of each original latitude–longitude grid overlapped by the ith icosahedral grid is derived from the finer latitude–longitude grids, which is represented as ai,j (individual colored areas in Fig. 1). The area of the ith icosahedral grid is denoted as Ai. Because every finer latitude–longitude grid is allocated to one of the surrounding icosahedral grids, the total amount of global data is conserved. Because of the linearity of Eq. (5) and the absence of any mass-conserving scaling, this grid conversion scheme is entirely linear and thus represented by the matrix G in Eq. (3). This property enables us to easily develop the adjoint code represented by GT in Eq. (4). By introducing G and GT into the model operator and its adjoint, respectively, the control variables can be arrayed in latitude–longitude grids.
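The rasterize-and-allocate idea behind Eq. (5) can be sketched on a flat toy grid. The example below makes two simplifying assumptions that differ from the actual scheme: source cells are unit squares on a plane (no spherical geometry), and each fine cell is assigned to the nearest target centre instead of to a hexagon with trigonometrically defined borders. The function name and the centre coordinates are hypothetical.

```python
import numpy as np

def build_conversion_matrix(n_lat, n_lon, centers, sub=10):
    """Build G with G[i, j] = a_ij / A_i (Eq. 5) on a flat, unit-cell toy grid.

    Assumption: each fine cell goes to the nearest target centre, a stand-in
    for the trigonometrically defined hexagon borders of the real scheme.
    """
    n_src = n_lat * n_lon
    a = np.zeros((len(centers), n_src))           # overlap areas a_ij
    fine_area = 1.0 / sub**2                      # area of one fine cell
    for j in range(n_src):
        row, col = divmod(j, n_lon)
        for p in range(sub):
            for q in range(sub):
                y = row + (p + 0.5) / sub         # fine-cell centre
                x = col + (q + 0.5) / sub
                dist2 = (centers[:, 0] - y) ** 2 + (centers[:, 1] - x) ** 2
                a[np.argmin(dist2), j] += fine_area
    A = a.sum(axis=1)                             # target cell areas A_i
    return a / A[:, None], A

centers = np.array([[1.0, 1.0], [1.0, 3.0], [3.0, 1.0], [3.0, 3.0], [2.0, 2.0]])
G, A = build_conversion_matrix(4, 4, centers)
flux = np.arange(16, dtype=float)                 # flux per unit area, source grid
flux_ico = G @ flux                               # converted flux per unit area
total_src = flux.sum()                            # source cells have unit area
total_ico = (A * flux_ico).sum()
```

Because every fine cell is assigned to exactly one target cell, the area-integrated totals `total_src` and `total_ico` agree without any rescaling, and the adjoint of the conversion is simply `G.T`.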

Fig. 1

Illustration of flux conversion from latitude–longitude grids to icosahedral grids. The black line indicates the boundaries of the icosahedral grids, while the dark-gray and light-gray lines represent original and finer latitude–longitude grids, respectively. In the conversion, fluxes in colored areas are allocated to the centered icosahedral grid

Optimizing fluxes in the latitude–longitude grid has two technical advantages. First, conversion of flux data from icosahedral grids to latitude–longitude grids after inversion for comparison with other datasets is not required. Such conversion often causes deterioration of mass conservation and requires additional modification. Second, the flux data can be provided at a higher resolution than the icosahedral grid data. In this study, we optimized the flux data at a 1° × 1° resolution, which is finer than the resolution at the icosahedral grid (approximately 223 km). This allows the optimized fluxes to retain the high-resolution information originally contained in the prior flux data. During the grid conversion in this study, we divided 1° × 1° latitude–longitude grids by 10 × 10 to obtain finer grids.

2.3 Flux model

The surface CO2 fluxes input to NICAM-TM can be described as:

$$\begin{aligned} f_{{{\text{CO}}_{2} }} \left( {x,t} \right) & = f_{{{\text{fos}}}} \left( {x,t} \right) - \beta_{{{\text{GPP}}}} \left( {f_{{{\text{GPP}}}} \left( {x,t} \right) + \Delta f_{{{\text{GPP}}}} \left( {x,t} \right)} \right) + \beta_{{{\text{RE}}}} \left(f_{{{\text{RE}}}} \left( {x,t} \right) + \Delta f_{{{\text{RE}}}} \left( {x,t} \right)\right) \\ & \quad + \left( {1 + \Delta \alpha_{{{\text{LUC}}}} \left( {x,t} \right)} \right)f_{{{\text{LUC}}}} \left( {x,t} \right) + \left( {1 + \Delta \alpha_{{{\text{fire}}}} \left( {x,t} \right)} \right)f_{{{\text{fire}}}} \left( {x,t} \right) \\ & \quad + f_{{{\text{ocn}}}} \left( {x,t} \right) + \Delta f_{{{\text{ocn}}}} \left( {x,t} \right), \\ \end{aligned}$$
(6)

where x and t represent the flux location and time, respectively. Fluxes from fossil fuel use and cement production, gross primary production (GPP) and respiration (RE) of the terrestrial biosphere, LUC, biomass burning, and oceans are denoted as ffos, fGPP, fRE, fLUC, ffire, and focn, respectively; they are prescribed using flux datasets and here have a monthly temporal resolution.

In Eq. (6), we separate the terrestrial biosphere fluxes into GPP and RE components following the method of Lokupitiya et al. (2008) because GPP and RE vary differently. The coefficients, \(\beta_{{{\text{GPP}}}}\) and \(\beta_{{{\text{RE}}}}\), are scaling factors that produce diurnal variations. \(\beta_{{{\text{GPP}}}}\) and \(\beta_{{{\text{RE}}}}\) distribute fluxes at a three-hourly resolution from the monthly fGPP and fRE, respectively, whose values are derived from the downward shortwave radiation at the earth’s surface and 2-m height air temperatures data from the Japanese 55-year Reanalysis (JRA-55: Kobayashi et al. 2015; Harada et al. 2016). This is similar to the method of Olsen and Randerson (2004). The nature of the terrestrial biosphere causes \(\beta_{{{\text{GPP}}}}\) and \(\beta_{{{\text{RE}}}}\) to show different diurnal variations (e.g., \(\beta_{{{\text{GPP}}}}\) is zero at night because photosynthesis is inactive, which is not the case for \(\beta_{{{\text{RE}}}}\)).
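The role of the diurnal scaling factors can be illustrated with a simple normalization. The sketch below is not the NISMON-CO2 formulation: it assumes that the GPP factor is proportional to downward shortwave radiation and that the RE factor follows a Q10 = 2 temperature response, and all input values are hypothetical. Each factor is normalized to a mean of 1 so that the time-mean flux is unchanged.

```python
def diurnal_scaling(driver):
    """Scale factors with mean 1 that redistribute a time-mean flux."""
    mean = sum(driver) / len(driver)
    return [v / mean for v in driver]

# Hypothetical three-hourly values for one day (eight steps).
shortwave = [0.0, 0.0, 100.0, 400.0, 600.0, 300.0, 0.0, 0.0]    # W m-2
temperature = [10.0, 9.0, 11.0, 15.0, 18.0, 17.0, 14.0, 12.0]   # deg C

beta_gpp = diurnal_scaling(shortwave)
# Assumed Q10 = 2 response: respiration doubles for every 10 deg C of warming.
beta_re = diurnal_scaling([2.0 ** (t / 10.0) for t in temperature])
```

In this toy case `beta_gpp` is zero at night, whereas `beta_re` stays positive throughout the day, which mirrors the contrast described above.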

In this study, fossil fuel emissions were not optimized, i.e., they were fixed as ffos, whereas the flux deviations \({\Delta }f_{{{\text{GPP}}}}\), \({\Delta }f_{{{\text{RE}}}}\), and \({\Delta }f_{{{\text{ocn}}}}\) were optimized. Furthermore, the LUC and biomass burning fluxes vary through scaling factors, whose deviations from 1, \({\Delta }\alpha_{{{\text{LUC}}}}\) and \({\Delta }\alpha_{{{\text{fire}}}}\), were optimized. Consequently, the optimization cannot change the LUC and biomass burning fluxes at locations and times where their prior fluxes are zero.
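The difference between the additive deviations and the multiplicative scaling factors in Eq. (6) can be seen in a direct evaluation at a single point. The function below is a minimal sketch; its name, argument names, and the numbers passed to it are hypothetical and chosen only to show the behaviour of each term.

```python
def total_flux(f_fos, f_gpp, f_re, f_luc, f_fire, f_ocn,
               beta_gpp=1.0, beta_re=1.0,
               d_gpp=0.0, d_re=0.0, d_ocn=0.0,
               d_alpha_luc=0.0, d_alpha_fire=0.0):
    """Evaluate Eq. (6) at one location and time (all fluxes in the same units)."""
    return (f_fos
            - beta_gpp * (f_gpp + d_gpp)
            + beta_re * (f_re + d_re)
            + (1.0 + d_alpha_luc) * f_luc
            + (1.0 + d_alpha_fire) * f_fire
            + f_ocn + d_ocn)

# Prior flux with all deviations at zero; the fire flux is zero at this point.
prior = total_flux(1.0, 5.0, 4.5, 0.2, 0.0, -0.1)
# A fire scaling deviation has no effect where the prior fire flux is zero.
scaled_fire = total_flux(1.0, 5.0, 4.5, 0.2, 0.0, -0.1, d_alpha_fire=2.0)
# An additive GPP deviation changes the net flux (stronger uptake).
more_gpp = total_flux(1.0, 5.0, 4.5, 0.2, 0.0, -0.1, d_gpp=0.5)
```

Here `scaled_fire` equals `prior`, whereas `more_gpp` is lower than `prior` by the added uptake of 0.5.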

A monthly temporal resolution was selected for the flux parameters \({\Delta }f\) and \({\Delta }\alpha\), except for the ocean flux \({\Delta }f_{{{\text{ocn}}}}\), whose temporal resolution is set annually. This is because ocean flux variations are smaller than terrestrial flux variations, and some erroneous values could arise in the optimized ocean fluxes due to “leakage” of terrestrial signals to the oceans. An annual temporal resolution may help mitigate such flux leakage; however, this does not allow the inversion to optimize seasonal variations in ocean fluxes.

In NISMON-CO2 ver. 2021.1, where real observations were used, we used fossil fuel emission data from the GCP-Gridded Fossil Emission Dataset (GCP-GridFED: Jones et al. 2021) ver. 2021.2 for ffos, and the GPP, RE, and LUC data were obtained from the Vegetation Integrative SImulator for Trace gases (VISIT: Ito and Inatomi 2012; Ito 2019, 2021) for fGPP, fRE, and fLUC, respectively. The biomass burning emission data were obtained from the Global Fire Emissions Database (GFED) ver. 4.1s (van der Werf et al. 2017) for ffire, and the air–sea CO2 exchange data were taken from the Japan Meteorological Agency (JMA) (Iida et al. 2015, 2021) for focn. Except for focn, these data were originally provided at a higher resolution than 1° × 1°; however, for ease of analysis, we uniformly aggregated the data at 1° × 1°. Some of these flux datasets were replaced in the pseudo-observation experiment in this study, which is described in the next section.

Note that although the terrestrial flux is separated into GPP, RE, LUC, and biomass burning, they are not intended to be fully optimized independently. As demonstrated by Niwa et al. (2021), inversion can identify drastic flux signals, such as a large-scale biomass burning event; however, estimating fluxes independently for each component is difficult because the inversion only uses observations of atmospheric CO2 mole fractions. The separation of those fluxes is intended to represent the different diurnal variations in GPP and RE and to consider future possible constraints such as those from carbonyl sulfide (Campbell et al. 2015).

2.4 Prior error covariance

A 4D-Var calculation requires not only a prior estimate but also its error covariance matrix, denoted by B in Eqs. (1) and (2). In the inverse experiment by Niwa et al. (2017b), the prior error covariances (i.e., the off-diagonal elements of B) were simply modeled with a single Gaussian function applied uniformly over the globe, in which the correlation scale length was arbitrarily determined. However, that assumption is too simple because the error covariances should vary in space and time according to the flux mechanisms.

In this study, we define the error covariance matrix using an ensemble set of prior fluxes as follows. First, we prepare a monthly long-term flux dataset and consider each year of data as a member of the ensemble. Therefore, the number of years equals the number of ensemble members, and each member contains 12 temporal components. For a prior flux component n, we approximate the covariance matrix using m members as

$${\tilde{\mathbf{B}}}_{{n,{\text{ cyc}}}} = \frac{1}{m - 1}\mathop \sum \limits_{k = 1}^{m} \delta {\mathbf{x}}_{n}^{k} \left( {\delta {\mathbf{x}}_{n}^{k} } \right)^{{\text{T}}} ,$$
(7)

where the subscript “cyc” stands for cyclostationary, meaning that the same matrix is applied repeatedly for each year. The superscript k is an ensemble index, and \(\delta {\mathbf{x}}_{n}^{k}\) describes the deviation of the flux of the kth member from the ensemble mean (i.e., the long-term mean). However, because the number of ensemble members (years) is limited, spurious correlations arise between remote areas, i.e., so-called sampling errors. Therefore, to localize covariances, we apply the Schur product of \({\tilde{\mathbf{B}}}_{{n,{\text{ cyc}}}}\) with a correlation matrix C as

$${\mathbf{B}}_{{n,{\text{cyc}}}} = \gamma_{n} {\tilde{\mathbf{B}}}_{{n,{\text{cyc}}}} \circ {\mathbf{C}}.$$
(8)

Here, we use the Gaussian function for each element of C as

$$C_{ij} = {\text{exp}}\left( { - \frac{{l_{ij}^{2} }}{{L^{2} }}} \right),$$
(9)

where the indices i and j represent different flux locations, whose distance is denoted by lij. L is a constant and set to 2000 km, which is long enough to only damp erroneous correlations in remote areas. In practice, we let Cij = 0 where lij > 4000 km to minimize the computational burden. In Eq. (8), the scaling factor \(\gamma_{n}\) is introduced as a tuning parameter to adjust the degree to which fluxes are subjected to change by observational constraints. Furthermore, we replaced the resulting negative covariances with zeros, as we consider negative correlations to be primarily generated by sampling errors. The technique described in Eq. (8) has been used as “the covariance localization” in an ensemble Kalman filter assimilation (Lorenc 2003); however, we have applied it here for the 4D-Var inverse analysis.
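The sequence of Eqs. (7) to (9), together with the 4000 km cutoff and the removal of negative covariances, can be sketched as follows. The function name, the one-dimensional arrangement of locations, and the random ensemble are hypothetical; in the actual system each member also carries 12 monthly components, which this toy example omits.

```python
import numpy as np

def localized_covariance(members, dist_km, gamma=1.0, L=2000.0, cutoff=4000.0):
    """Eqs. (7)-(9): ensemble covariance, Gaussian localization, zeroed negatives.

    members: array (m, n) with one row per ensemble member (year).
    dist_km: array (n, n) of distances between flux locations.
    """
    dev = members - members.mean(axis=0)             # deviations from ensemble mean
    B_tilde = dev.T @ dev / (members.shape[0] - 1)   # Eq. (7)
    C = np.exp(-(dist_km ** 2) / L ** 2)             # Eq. (9)
    C[dist_km > cutoff] = 0.0                        # truncate beyond the cutoff
    B = gamma * B_tilde * C                          # Eq. (8), Schur product
    B[B < 0.0] = 0.0                                 # drop negative covariances
    return B

# Hypothetical example: 30 members at 4 locations spaced 1500 km apart on a line.
rng = np.random.default_rng(2)
members = rng.normal(size=(30, 4))
positions = np.arange(4) * 1500.0
dist_km = np.abs(positions[:, None] - positions[None, :])
B = localized_covariance(members, dist_km, gamma=4.0)
```

The diagonal of `B` is the ensemble variance inflated by `gamma`, while the covariance between the two end points (4500 km apart) is set to zero by the cutoff.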

For the prior covariances of the terrestrial biosphere GPP and RE, we used a long-term VISIT simulation for 1901–2020, generating 120 members for the covariance calculation (Eq. 7). After applying the correlation matrix, the resulting covariance was inflated by \(\gamma_{n} = 4\) so that the globally integrated variance was approximately \(\left( {2\;{\text{Pg C yr}}^{ - 1} } \right)^{2}\). The defined uncertainty of \(2\;{\text{Pg C yr}}^{ - 1}\) is derived from the difference between the global total net flux of the prior data and that estimated from the growth rate of the atmospheric observations; this is under the assumption that terrestrial flux is the dominant driver of global flux variations. To determine the error covariance of the prior ocean flux, we used the JMA ocean flux data for 1990–2019. Because we optimized the annual means of ocean fluxes, each annual mean was considered one ensemble member (with 30 members in total). The covariance for the ocean fluxes was derived similarly to that of the terrestrial biosphere using Eqs. (8) and (9) with \(\gamma_{n} = 2\). The resulting global variance was approximately \(\left( {0.17\;{\text{Pg C yr}}^{ - 1} } \right)^{2}\); this number is rather arbitrary compared with that for the terrestrial flux, but it was derived after several trial experiments. When \(\gamma_{n}\) was increased from 2, we obtained unrealistic variations in posterior ocean fluxes. Although \(0.17\;{\text{Pg C yr}}^{ - 1}\) is too small as a global ocean flux uncertainty, such small prior errors are needed to stabilize ocean flux estimates.

For the other flux components such as LUC and biomass burning, we do not consider error correlations and define each prior flux error covariance matrix \({\mathbf{B}}_{{n,{\text{cyc}}}}\) as a diagonal matrix whose diagonal elements (i.e., variances) are all set to 4.0; this corresponds to a standard deviation of 2.0 in the scaling factors, i.e., errors of 200%. We combined all \({\mathbf{B}}_{{n,{\text{cyc}}}}\) and applied them to every year; their elements are arrayed on the same grid and at the same resolution as the prior fluxes, i.e., 1° × 1°. Finally, we obtained the full matrix B by including the variance of the global initial mole fraction offset, \(\left( {0.5\;{\text{ppm}}} \right)^{2}\).

Examples of the prior flux error covariance are given in Fig. 2, which shows error correlations \(\left( { = B_{ij} /\left( {B_{ii}^{1/2} B_{jj}^{1/2} } \right)} \right)\) of a given prior GPP for April and July. The original error correlations were significant in both months, even at a large distance. That feature was more pronounced in July, whereas larger correlations were present at a short distance in April (gray crosses in Fig. 2a and b). After performing the calculation in Eq. (8), those correlations became localized, whereas the above-mentioned feature was maintained (red crosses in Fig. 2a and b). The error correlations generated using this method have anisotropic spatial structures, as shown in Fig. 2c and d. Furthermore, they differ by location as shown in Fig. 2e and f. In fact, these anisotropic correlation features arise from year-to-year variations in the long-term VISIT data, which are driven by changes in meteorological conditions (shortwave radiation, temperature, moisture, etc.). We expect them to reflect actual biogeochemical mechanisms better than the previously used simple isotropic correlations.

Fig. 2

Error correlations of a prior gross primary production (GPP) flux at near 60.5° N, 100.5° E along with distance for April (a) and July (b). Gray and red crosses represent correlations before and after localization (Eq. 8), respectively. Spatial structures of the localized error correlations of a and b are shown in c and d, respectively. Furthermore, e and f are the same as c and d but for a different prior GPP flux at near 30.5° N, 110.5° E

2.5 Observations

The observational data input to NISMON-CO2 were derived from the Observation Package (ObsPack)-GLOBALVIEWplus dataset and the related near-real-time version ObsPack-NRT, provided through the National Oceanic and Atmospheric Administration (Masarie et al. 2014). Additionally, we used other independently provided data. In NISMON-CO2 ver. 2021.1, versions 6.1_2021_03-01 (Schuldt et al. 2021a) and 6.1.1_2021-05-17 (Schuldt et al. 2021b) of ObsPack-GLOBALVIEWplus and ObsPack-NRT were used, respectively; all data are listed in Niwa (2020). To demonstrate the potential performance of NISMON-CO2 ver. 2021.1, we performed pseudo-observation inversions for this study by emulating observational data at actual sites and times. The locations of the observations are shown in Fig. 3. For this dataset, we used all available observations provided by selected institutes (Niwa 2020); some sites have a continuous data record over the entire period, and some do not. Therefore, several temporal observational gaps are present, and the pseudo-observation experiment allows us to investigate the effect of these data gaps on flux estimates. As shown in Fig. 3, we used only surface observations (flask samplings and in situ measurements at ground-based stations and tower sites, and shipboard observations). This is because NISMON-CO2 ver. 2021.1 was intended to contribute to the inversion comparison of GCP (Friedlingstein et al. 2022), which is designed to use only surface observations so that aircraft data can serve as independent data for evaluating the inversions.

Fig. 3

Locations of observations used in the pseudo-observation inversion experiment. The observations are the same as those of NISMON-CO2 ver. 2021.1 (Niwa 2020) but limited to those available during 2007–2018. Flask air sampling and in situ observations at ground-based stations are depicted by filled magenta and unfilled purple triangles, respectively. Light blue circles denote shipboard observational locations. Tower observation sites are denoted by cyan crosses

2.6 Observation–model mismatch errors

We defined the observation–model mismatch error covariance matrix R as:

$$R_{ii}^{ - 1} = \frac{1}{{\left( {\beta r_{i} } \right)^{2} N_{i} }},$$
(10)

where ri denotes the standard deviation of the mole fractions around the ith observation and Ni represents the number of observations within a certain spatiotemporal range of the ith observation. In this study, we apply a spatiotemporal range of one week, a 1000 km horizontal diameter circle, and a 1 km vertical depth. Furthermore, we introduce a scaling factor \(\beta\) so that \(\chi^{2} \left( {: = \frac{{2J_{{{\text{min}}}} }}{m}} \right)\), where m is the number of observations and Jmin is the minimum of the cost function, should be < 1 (Tarantola 2005). We calculated ri from the simulated daily mole fraction variations of CO2 at the site of observation i over the period from one week before to one week after the observation time; the simulation was performed using NICAM-TM with prior fluxes. As denoted by Eq. (10), R is a diagonal matrix, which assumes that all observations are independent of each other. However, this is a simplistic assumption, specifically where observations are dense. Therefore, we inflate the variances for such areas by introducing Ni, which reduces the weights of dense observations in space and time. This variance inflation prevents continuous in situ observations from imposing exceedingly strong constraints compared with discrete flask sampling observations. For NISMON-CO2 ver. 2021.1, we adopted a strategy of using as many available observations as possible. This resulted in duplicated observations because several institutes share the same observational platforms; Ni reduces the combined constraint of such collocated observations to that of a single site.
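The effect of Ni on the observational weights can be sketched with a small counting routine. The example below follows the description in the text, in which Ni inflates the variance so that each observation receives a weight of 1/((β ri)² Ni). The flat geometry, the assumption that the window is centred on each observation, and the sample data are all hypothetical.

```python
import math

def neighbour_counts(obs, dt_days=7.0, diameter_km=1000.0, depth_km=1.0):
    """Count observations (including itself) near each observation.

    obs: list of (time_days, x_km, y_km, z_km) tuples in a flat toy geometry.
    Assumption: the window is centred on each observation, so the half-widths
    are half of the stated week, diameter, and depth.
    """
    counts = []
    for t0, x0, y0, z0 in obs:
        n = 0
        for t, x, y, z in obs:
            close_t = abs(t - t0) <= dt_days / 2
            close_h = math.hypot(x - x0, y - y0) <= diameter_km / 2
            close_z = abs(z - z0) <= depth_km / 2
            n += close_t and close_h and close_z
        counts.append(n)
    return counts

def mismatch_variance(r, n, beta=1.0):
    """Inflated variance (beta * r)^2 * N, i.e. a weight of 1 / ((beta r)^2 N)."""
    return (beta * r) ** 2 * n

# Hypothetical data: three hourly in situ records at one site plus a remote flask.
obs = [(0.0, 0, 0, 0), (0.04, 0, 0, 0), (0.08, 0, 0, 0), (0.0, 3000, 0, 0)]
counts = neighbour_counts(obs)
variances = [mismatch_variance(1.0, n) for n in counts]
site_weight = sum(1.0 / v for v in variances[:3])
```

With equal ri, the three collocated records each receive one third of the weight, so their combined weight `site_weight` matches that of the single flask sample.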

2.7 Pseudo-observation inversion experiment

To investigate the general performance of NISMON-CO2 and effectiveness of the proposed inversion settings, we performed an inverse analysis under ideal conditions; that is, we know the “true” state of fluxes. We prepared two datasets, one of which was regarded as “true” and the other one was used as a prior estimate. First, we performed a forward transport simulation of atmospheric CO2 using the true flux dataset and extracted the simulated CO2 mole fractions at the above-mentioned observational points and times. Next, we performed inverse experiments incorporating those pseudo-observations. Niwa et al. (2017b) conducted a similar experiment; however, the experiment in this study was designed more practically, using a combination of continuous in situ and discrete flask measurements (Fig. 3) and covering a longer analysis period. Furthermore, by virtue of the implemented grid conversion scheme, we optimized fluxes in a latitude–longitude grid of 1° × 1° rather than in the icosahedral grid.

The forward simulation used to generate the pseudo-observations (true run) employed the same NICAM-TM settings used for the inversion, except for the horizontal resolution. The horizontal resolution is “glevel-6,” whose mean grid interval is 112 km. This is one level higher than that used for the inverse analysis (glevel-5, 223 km). Therefore, the true run can simulate much smaller mole fraction variations. This can provide insights into the influence of errors in model representativeness, although the mean grid interval of 112 km is not sufficiently small to fully consider this aspect.
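The quoted grid intervals can be reproduced with a short calculation. The sketch below uses the standard point count of a recursively subdivided icosahedral grid, 10 × 4^n + 2 at grid level n, and assumes that the mean grid interval is defined as the square root of the mean cell area on a sphere of radius 6371 km; the function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius

def icosahedral_points(glevel):
    """Number of grid points after `glevel` recursive subdivisions."""
    return 10 * 4 ** glevel + 2

def mean_grid_interval_km(glevel):
    """Square root of the mean cell area on the sphere."""
    area = 4.0 * math.pi * EARTH_RADIUS_KM ** 2 / icosahedral_points(glevel)
    return math.sqrt(area)

for level in (5, 6):
    print(level, icosahedral_points(level), round(mean_grid_interval_km(level)))
```

Under these assumptions, glevel-5 has 10242 points with an interval of about 223 km, and glevel-6 has 40962 points with an interval of about 112 km, so one additional level halves the grid interval.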

The true fluxes used in this study were obtained from the posterior fluxes of NISMON-CO2 ver. 2021.1 (Niwa 2020), which were generated from real observations; therefore, the generated pseudo-observations of atmospheric CO2 mole fractions have realistic seasonal and interannual variations, and the true fluxes are assumed to reflect these variations. For the prior fluxes, except for the fossil fuel emissions, which were taken from the same GCP-GridFED dataset, we made moderate changes from those in NISMON-CO2 ver. 2021.1 (described in Sect. 2.3). For the terrestrial fluxes (fGPP, fRE, fLUC, and ffire), we employed the same VISIT and GFEDv4.1s datasets as in NISMON-CO2 ver. 2021.1, but they were climatologically averaged for 2006 to 2018. Meanwhile, the ocean fluxes (focn) were based on different ocean flux data from Takahashi et al. (2009), which provide climatological fluxes for the reference year 2000. Therefore, excluding the fossil fuel emissions, the prior fluxes do not contain interannual variations.

The inversion calculation was performed from January 2007 to March 2018, with the first 12 months and last three months disregarded as the spin-up and spin-down times of the analysis, respectively. Therefore, the inversion fluxes were evaluated over a ten-year period. Note that this analysis period is roughly one-third the length of that of NISMON-CO2 ver. 2021.1 (1990–2020), which eases the computational cost of performing the multiple inverse experiments described in the next section.
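The length of the evaluation window follows from simple month counting, as the sketch below shows; the helper function is hypothetical.

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of calendar months from the start month to the end month, inclusive."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

total_months = months_inclusive(2007, 1, 2018, 3)   # January 2007 to March 2018
evaluated_months = total_months - 12 - 3            # drop spin-up and spin-down
evaluated_years = evaluated_months / 12
reference_years = 2020 - 1990 + 1                   # 1990-2020, inclusive
ratio = evaluated_years / reference_years
```

The inversion spans 135 months, of which 120 months (January 2008 to December 2017) are evaluated; ten years is about 32% of the 31-year period covered by NISMON-CO2 ver. 2021.1.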

2.8 Sensitivity tests

In addition to the control experiment performed with the settings mentioned above (referred to as CTL), we performed four inversions with different settings (Table 1). The first (referred to as NO_ERR) used pseudo-observations produced by the model at the same horizontal resolution as the inversion (i.e., glevel-5), so that the model’s representation error is no longer present. For the other three tests, we reverted one of the updated methods at a time to that used in previous studies (Niwa et al. 2017b, 2021). The second inversion (referred to as ICO) was performed by optimizing fluxes on the icosahedral grids. The third inversion (referred to as NO_WTS) did not consider the observational weighting applied to the observation–model mismatch error, i.e., Ni = 1 in Eq. (10). In this test, for a fair comparison with the CTL case, we changed β in Eq. (10) so that the cost function (Eq. 1) is nearly equivalent to that of CTL at the beginning of the iterative calculation. The fourth inversion (referred to as ISO_ERR) employed an isotropic prior error covariance that was simply defined using a Gaussian function with correlation scale lengths of 500 km and 1000 km for land and ocean fluxes, respectively. In this test, also for a fair comparison, each \(\gamma_{n}\) in Eq. (8) was tuned so that the globally integrated value of the resulting prior error covariance was nearly equivalent to that of CTL.

Table 1 List of the sensitivity tests

3 Results

3.1 Cost function and number of iterations

In the inverse analysis by Niwa et al. (2017b), the optimization for estimating 12 monthly fluxes required approximately 50 iterations. The experiment assumed weekly observation sampling at 65 sites. The optimization was expected to require over 50 iterations when using an expanded observation network with more than a hundred sites consisting of continuous measurements and flask air sampling (Fig. 3). In fact, it required a few hundred iterations. Moreover, Fig. 4 depicts the changes in cost function from the beginning to the 300th iteration for the CTL experiment and the other four sensitivity cases. As shown in Fig. 4, all cost functions decrease smoothly but do not fully converge at the 300th iteration. However, we did confirm that seasonal and interannual variations in the optimized fluxes, which will be discussed later, mostly converged at the 300th iteration. Although they did not converge perfectly, 300 iterations seemed to be the limit in practical use due to the computational demand. Hereafter, we present results from fluxes optimized by 300 iterations for all experiments.

Fig. 4

Cost function changes with iterations for the CTL case (blue) and the other four sensitivity tests of NO_ERR (light blue), ICO (light green), NO_WTS (orange), and ISO_ERR (pink)

3.2 Comparison of CO2 mole fractions with the pseudo-observations

Figure 5 compares the modeled CO2 mole fractions with the pseudo-observations at Minamitorishima (24.3° N, 154.0° E) and Syowa Station (69.0° S, 39.6° E), which are well-established background stations in the northern and southern hemispheres, respectively. At both sites, the model with the posterior flux of CTL reproduced atmospheric CO2 mole fractions without bias, indicating that the inversion reasonably incorporated the pseudo-observations. Furthermore, the root-mean-square (RMS) differences denoted in the lower panels of Fig. 5 indicate that, except for NO_ERR, the compatibility with the pseudo-observations is similar across all sensitivity tests. The significantly smaller RMS differences in NO_ERR are attributed to the absence of the transport model error. The inset of Fig. 5a shows the CO2 mole fraction time series zoomed for January 2017, including the lower-resolution pseudo-observation used for NO_ERR. The CTL model captured the elevated events; however, their magnitudes are underestimated even after the inversion and are rather comparable to those of the low-resolution pseudo-observation. This indicates that the inversion did not adjust the surface fluxes and consequent atmospheric mole fractions beyond the model’s representation error (i.e., it did not overfit the observations).
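The bias and RMS difference referred to above can be computed as in the following minimal sketch. The function name and the sample mole fractions are hypothetical, and skipping missing values is an assumption about how data gaps would be handled.

```python
import numpy as np

def bias_and_rms(model, obs):
    """Mean bias and RMS of (model - obs), skipping pairs with missing values."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(model) | np.isnan(obs))
    diff = model[valid] - obs[valid]
    return diff.mean(), np.sqrt(np.mean(diff ** 2))

# Hypothetical mole fractions in ppm; the last observation is missing.
obs_ppm = [400.0, 401.0, 402.0, np.nan]
model_ppm = [400.5, 400.5, 402.0, 403.0]
bias, rms = bias_and_rms(model_ppm, obs_ppm)
```

In this toy case the positive and negative differences cancel, so the bias is zero while the RMS difference is not, which mirrors the situation where a model is unbiased at a site but still misses individual events.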

Fig. 5

Upper panels indicate the time series of CO2 mole fractions at Minamitorishima (24.3° N, 154.0° E) (a) and Syowa Station (69.0° S, 39.6° E), Antarctica (b). Black and red lines represent the pseudo-observation and the model with the CTL posterior flux, respectively. The inset of panel (a) depicts the CO2 mole fraction time series zoomed for January 2017, including the lower-resolution pseudo-observation (gray line). (The x-axis covers the whole period of January 2017, and the y-axis ranges from 406 to 420 ppm.) Lower panels indicate the differences between the CTL model and the pseudo-observation (model–observation) at Minamitorishima (c) and Syowa Station (d), where the number indicates the root-mean-square difference for each sensitivity experiment. Note that the NO_ERR case is compared with the pseudo-observations generated from the lower-resolution model

3.3 CO2 flux maps and comparison with icosahedral grid optimization

An example of the optimized fluxes is depicted in Fig. 6, which shows the distributions of CO2 fluxes averaged for July 2011. (Fossil fuel emissions are excluded.) Here, we compare the optimized flux of CTL with that of ICO, as well as the true (i.e., NISMON-CO2 ver. 2021.1) and prior fluxes. Both the CTL and ICO cases successfully retrieved the true flux pattern from the prior fluxes, as evidenced by the continental sinks in Eurasia and the east side of North America, and the sources in central Africa, South Asia, and the southwest side of North America. However, the south of the Amazon shows notable discrepancies from the true fluxes. This is probably because inland observations are lacking and because convective vertical mixing causes flux signals to be poorly captured by surface stations.

Fig. 6

Distributions of CO2 fluxes averaged for July 2011 from the true (a) and prior (b) flux datasets and the posterior fluxes of the CTL (c) and ICO (d) cases. White color areas depict fluxes close to zero. Fossil fuel emissions, which were not optimized, were excluded

Although the ocean fluxes were optimized annually, the flux pattern was retrieved to some extent. The tongue shape of the sources in the eastern equatorial Pacific, which persisted throughout the entire year, was suitably retrieved, and the strength of the sources increased toward the true fluxes.

Notably, CTL, where fluxes were optimized in the latitude–longitude grid, showed a similar global distribution to the ICO case while resolving smaller flux distributions. For instance, over the Himalayas, CTL shows a flux pattern with finer-scale variations (not necessarily closer to the true), reflecting the complexity of the topography. These results indicate that the newly introduced latitude–longitude grid optimization in this study functions reasonably well.

3.4 Seasonal and interannual variations in regional CO2 fluxes

Here, we present seasonal and interannual variations in CO2 fluxes at various numbers of iterations to demonstrate how those temporal variations converge as the iterative calculation progresses. The seasonal variations in nonfossil fuel CO2 fluxes in the 11 TransCom terrestrial regions (Gurney et al. 2002; Baker et al. 2006) averaged for 2008–2017 are illustrated in Fig. 7. In most areas, the optimized seasonal variation is highly consistent with that of the true flux. Furthermore, the optimization converges well within 50 iterations for northern high-latitude areas, which contain dense observations (Europe, Boreal Asia, and Boreal North America). Even in other regions, the flux variations converged within 100 iterations.

Fig. 7

Monthly variations of nonfossil fuel CO2 fluxes averaged for 2008–2017 for the 11 TransCom terrestrial regions (Gurney et al. 2002; Baker et al. 2006). Light-gray and dark-gray solid lines represent the prior and true fluxes, respectively. The other colors represent posterior fluxes from different iterations (10–300) of CTL. The blue dotted line denotes the 300th iteration result in the case where model errors do not exist (NO_ERR)

However, for interannual variations, convergence required more iterations. For example, in Fig. 8, interannual variations in regional CO2 fluxes had not converged after 100 iterations, even in northern high-latitude areas. Moreover, interannual variations converged much more slowly than seasonal variations toward the 300th iteration. Significant differences also remain between 200 and 300 iterations (e.g., Boreal N. America), indicating that 300 iterations may not be fully sufficient for convergence. This is consistent with the cost function still decreasing at the 300th iteration (Fig. 4).

Fig. 8

Same as Fig. 7, but for yearly variations from 2008 to 2017

Nevertheless, after 300 iterations, the posterior flux revealed interannual variations consistent with those of the true flux, specifically in northern high-latitude areas. In other areas, notable peaks and troughs of the interannual variations were more or less well retrieved, albeit with some noticeable errors (e.g., consistently smaller than the true flux in South America).

The slower convergence of the interannual variations compared with that of the averaged seasonal variations is attributed to their small magnitude. Fluxes in northern high-latitude areas have a seasonal amplitude of approximately 10 Pg C yr−1, whereas their annual averages vary by approximately 1 Pg C yr−1. These results imply that optimizing such small flux variations requires a large number of iterations. This also applies to ocean fluxes. The yearly variations in ocean fluxes for the 11 TransCom ocean regions are depicted in Fig. 9. Although the posterior ocean fluxes changed gradually, they approached the true fluxes from the prior data. In fact, the ocean flux variations were much smaller than the terrestrial flux variations; the range of interannual variations was at most 0.4 Pg C yr−1. The inversion mostly reproduced those interannual variations and 10-year trends (generally decreasing trends), although some areas had persistent biases (approximately 0.1–0.2 Pg C yr−1). However, the posterior fluxes do not fully represent the recent uptake increases in the North Pacific and North Atlantic (Fig. 9b and d), despite those regions having relatively dense observation networks of marine stations and ships (Fig. 3). This is probably because these areas are located on the downwind sides of the continents. Terrestrial flux signals from the continents are transported over the oceans with higher magnitudes than those of ocean fluxes, which might make it difficult to clearly distinguish ocean fluxes in the inversion.
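The contrast between seasonal amplitude and interannual variation can be illustrated with a short sketch that separates a monthly flux series into a mean seasonal cycle and annual means. The series below is synthetic, constructed only to mimic the orders of magnitude quoted above (about 10 and 1 Pg C yr−1); the function name is hypothetical.

```python
import numpy as np

def seasonal_and_interannual(monthly_flux, n_years):
    """Peak-to-peak size of the mean seasonal cycle and of the annual means."""
    flux = np.asarray(monthly_flux, dtype=float).reshape(n_years, 12)
    climatology = flux.mean(axis=0)   # mean seasonal cycle over all years
    annual_mean = flux.mean(axis=1)   # one value per year
    seasonal_amplitude = climatology.max() - climatology.min()
    interannual_range = annual_mean.max() - annual_mean.min()
    return seasonal_amplitude, interannual_range

# Synthetic 10-year series in Pg C yr-1 (not data from this study).
n_years = 10
months = np.arange(12 * n_years)
seasonal_part = 5.0 * np.cos(2 * np.pi * months / 12)
yearly_offset = np.repeat(np.linspace(-0.5, 0.5, n_years), 12)
amp, rng = seasonal_and_interannual(seasonal_part + yearly_offset, n_years)
```

Here the interannual signal is an order of magnitude smaller than the seasonal one, so a flux increment that fits the seasonal cycle well can still leave most of the year-to-year signal unresolved.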

Fig. 9

Same as Fig. 8, but for the 11 TransCom ocean regions

Figures 7, 8, and 9 also depict seasonal and interannual variations that were derived under the assumption of no model error (NO_ERR, blue dotted line). The magnitude of the model error can be gauged from the inset of Fig. 5a and the RMS differences in Fig. 5c and d. As shown in Fig. 7, the posterior flux of NO_ERR showed approximately the same patterns in terrestrial seasonal variations as CTL. Although the NO_ERR case was better than the CTL case for the interannual variations of the terrestrial fluxes, it did not perfectly follow the true flux (Fig. 8). Conversely, NO_ERR showed little improvement in interannual variations over the oceans (Fig. 9). Figures 10 and 11 confirm those NO_ERR features: the RMS errors of the yearly posterior flux variations for NO_ERR are smaller than those of CTL over terrestrial areas and similar to those of CTL over the oceans (see blue and cyan bars).

Fig. 10

Root-mean-square errors of the yearly fluxes against the true fluxes for the TransCom terrestrial regions. Results of the CTL and the other four sensitivity tests are illustrated here

Fig. 11

Same as Fig. 10, but for the TransCom ocean regions

3.5 Global land–ocean partitioning

Table 2 summarizes the global land, ocean, and total carbon budgets, and their interannual variations are depicted in Fig. 12. For 2008 to 2017, the global mean true sinks over the land and the ocean were 2.62 and 2.04 Pg C yr−1, respectively (estimated using NISMON-CO2 ver. 2021.1 with real observations), and the prior estimates were 4.61 and 1.41 Pg C yr−1, respectively. Meanwhile, the inversion estimates by CTL were 3.39 and 1.29 Pg C yr−1, respectively; thus, the estimated global land–ocean partitioning had a bias of approximately 0.8 Pg C yr−1. However, the global total sink was nearly identical to that of the true flux, indicating that mass conservation was ensured. In fact, the bias was persistently present over the entire analysis period (Fig. 12). Nevertheless, Fig. 12 shows that the inversion successfully reproduced the interannual variation patterns from the constant prior estimates, even for the ocean, whose flux changes only moderately. Figure 12 also indicates that 100–150 iterations are required for convergence of the land–ocean partitioning, which is fewer than those required for estimates of regional interannual variations (Figs. 8 and 9). Compared with the CTL case, the NO_ERR case did not necessarily provide better estimates of land–ocean partitioning and displayed the same persistent bias and similar interannual variations (Table 2; Fig. 12). Furthermore, Fig. 12 shows that the differences between the posterior and true fluxes of the land and the ocean gradually increase over time. From the former five-year period (2008–2012) to the latter five-year period (2013–2017), the mean differences between the posterior and true fluxes of the global land and ocean changed from 0.66 to 0.89 Pg C yr−1 and from 0.69 to 0.82 Pg C yr−1, respectively. This is attributable to the fact that prior ocean fluxes do not have interannual variations but are accompanied by small prior uncertainties, which makes it slightly difficult to keep up with the increasing trend of the ocean sink and, consequently, the decreasing trend of the land sink.
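As a worked check of the partitioning bias, the differences between the CTL and true sinks can be computed from the rounded 2008–2017 means quoted above. The symbols \(\Delta_{\mathrm{land}}\), \(\Delta_{\mathrm{ocean}}\), and \(\Delta_{\mathrm{total}}\) are introduced here only for illustration, and the last digit may differ from Table 2 because of rounding.

```latex
\begin{align*}
\Delta_{\mathrm{land}}  &= 3.39 - 2.62 = +0.77~\mathrm{Pg\,C\,yr^{-1}},\\
\Delta_{\mathrm{ocean}} &= 1.29 - 2.04 = -0.75~\mathrm{Pg\,C\,yr^{-1}},\\
\Delta_{\mathrm{total}} &= (3.39 + 1.29) - (2.62 + 2.04) = 4.68 - 4.66 = +0.02~\mathrm{Pg\,C\,yr^{-1}}.
\end{align*}
```

The land and ocean differences are of nearly equal size and opposite sign, so they largely cancel in the global total, which is why the total sink agrees with the true value even though the partitioning is biased.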

Table 2 Global-scale net sinks and errors for 2008–2017. The sink values do not include fossil fuel emissions. These errors are derived using root-mean-square errors against the true fluxes over the land, the ocean, and the global total carbon fluxes

Fig. 12

Yearly variations of the global land (a), ocean (b), and total (c) CO2 fluxes. (Fossil fuel emissions are excluded.) Light-gray and dark-gray solid lines represent the prior and true fluxes, respectively. The other colors indicate posterior fluxes from different iterations (10–300) of CTL. The blue dotted line denotes the 300th iteration result where model errors are nonexistent (NO_ERR)

3.6 Sensitivity to inversion settings

As shown in Table 2, NO_ERR (2.74 × 10−1 gC m−2 day−1) had the smallest RMS error for the overall fluxes during 2008–2017, calculated against the true fluxes, while the second smallest value was obtained in CTL (3.49 × 10−1 gC m−2 day−1). The smaller RMS error of NO_ERR compared with that of CTL is due to fluxes over land. This could be because land fluxes are larger and more inhomogeneous than ocean fluxes, so that mole fractions simulated from terrestrial fluxes are more sensitive to the model transport resolution. The difference between CTL and NO_ERR lies in the model reproducibility of atmospheric mole fractions, which is determined by the horizontal resolution of the model that produced the pseudo-observations. Therefore, the result indicates that a change in the model horizontal resolution could affect the terrestrial flux estimates more strongly than the ocean flux estimates.

The other sensitivity tests, ICO, NO_WTS, and ISO_ERR, produced similar values for global land and ocean uptakes, i.e., similar biases; however, their RMS errors for the overall fluxes were larger than those of CTL (Table 2). In the ICO case, the larger RMS error came from the terrestrial fluxes, a feature that can also be found in interannual flux variations (Fig. 10). This result implies that the latitude–longitude optimization introduced in this study not only provided higher resolution fluxes but also improved the inversion estimates. The NO_WTS case showed larger RMS errors over both the land and the ocean, for the overall fluxes (Table 2) as well as for the regional interannual variations (Figs. 10 and 11). This impact is the largest among the sensitivity tests, indicating that the spatiotemporal weighting of the observation–model mismatch errors plays a crucial role. The ISO_ERR case also had a larger RMS error than the CTL case for the overall terrestrial fluxes (Table 2). However, this was not necessarily the case for the regional interannual variations; ISO_ERR had smaller errors in some regions than CTL (Fig. 10).

4 Discussion

The temporal variations in CO2 fluxes were retrieved well in this study, demonstrating the reliability of the flux estimates by NISMON-CO2. It is also promising that a weakly decreasing trend was retrieved in some ocean regions, because detecting such small flux changes is critical for monitoring the carbon sink capacities of natural reservoirs and for gaining insights into the future global carbon cycle under global warming.

However, areas in the southern hemisphere and tropics still have substantial errors in flux estimates, possibly due to the sparse observational network there. As depicted in Fig. 5, the inversion reproduced atmospheric mole fractions at the two sites without any bias; however, those sites are in regions with persistent flux biases (i.e., North Pacific and Southern Ocean, Fig. 9). This suggests that fluxes in other areas compensate to make the modeled mole fractions consistent with the observations. Such flux compensations could occur where observations are sparse, which may lead to further errors in flux estimates. This is a typical inversion problem for long-lived species, such as CO2.

The persistent bias and long-term trend error in the global carbon budgets indicate the difficulty of separating the land and ocean budgets with the current observation network. This land–ocean partitioning error could be mitigated by using the best ocean flux dataset available for the prior estimate; however, assessing which dataset is the best is difficult. Therefore, for the prior ocean flux in the real-observation inversion of NISMON-CO2 ver. 2021.1, we used the JMA’s yearly varying ocean flux data (Iida et al. 2021), which likely represent the actual ocean sink trend better than the climatological data (Takahashi et al. 2009) used in the pseudo-observation experiment. Performing multiple inversions with different ocean flux datasets could also be useful to account for ocean flux uncertainties. Furthermore, using additional O2/N2 data, as Rödenbeck et al. (2008) did, might be another approach to improve the land–ocean partitioning. However, these approaches are left for future studies.

It should be noted that we did not select observational sites based on their temporal coverage. Therefore, well-reproduced interannual variations, as shown in Figs. 8 and 9, indicate that such temporal gaps in the observational data do not significantly affect the estimation of interannual variations. Nonetheless, some interannual variations remain poorly estimated and might be affected by the observational gaps. Such observational gap effects could be mitigated by selecting only observations with sufficient temporal coverage, as some other inversion studies do (e.g., Chevallier et al. 2010). However, reducing observations would increase overall flux errors. Furthermore, due to changes in meteorological conditions, misestimation of interannual variations could occur even with temporally sufficient observations. When wind direction at a site changes from year to year, its observational constraints may also change. Therefore, we adopted a strategy to use as many observations as possible without considering any site-specific selection in terms of temporal data coverage. This research demonstrates that such a strategy can still reproduce reasonable interannual variations. It should also be noted that this inversion only used surface observations. However, various aircraft data are available, which could provide effective constraints in flux estimates (Niwa et al. 2012, 2021). Furthermore, satellite observations (Crisp et al. 2008; Yokota et al. 2009) may provide different insights from surface and aircraft observations, although assessing biases and errors of satellite observations can be difficult. Incorporating such observations is one way to improve NISMON-CO2.

In contrast to a previous study by Niwa et al. (2017b), we used a higher resolution model to prepare pseudo-observations (~ 112 km). This provided insight into the representation error of the model, although the actual degree of the model’s representation error is probably much greater. The inversion performance in retrieving seasonal and interannual variations in surface fluxes depicted in this study is possibly overestimated due to the small model errors. Nevertheless, the difference between CTL and NO_ERR provides valuable insight into the effect of model resolution, from which we can predict a degree of improvement in a future inversion with a higher resolution model.

One major update to the inversion settings of Niwa et al. (2017b) is the implementation of a grid conversion in the inverse calculation, which enables a high-resolution latitude–longitude grid optimization. The computational time required for the grid conversion is mere minutes, and the convergence speed is almost the same as that of ICO (Fig. 4); therefore, this implementation does not induce any substantial increase in computational costs. Because it uses the same transport model resolution as before, the amount of information that can be extracted from the observations does not necessarily increase. Nevertheless, it is valuable because the original information in the prior flux datasets does not deteriorate. Furthermore, such highly resolved fluxes could be more useful for regional- or national-scale budget analyses, demand for which has recently increased for use in monitoring and reducing national carbon emissions. This scheme can be widely utilized because it is also applicable to inversion systems that use a conventional latitude–longitude grid model. The grid conversion scheme can be applied to convert from higher- to lower-resolution latitude–longitude grids.
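The higher- to lower-resolution latitude–longitude conversion mentioned above can be sketched as an area-weighted coarsening that conserves the globally integrated flux. This is a generic illustration, not the grid conversion scheme of NISMON-CO2 (whose icosahedral-to-latitude–longitude step is not reproduced here); the function names, the 1° test grid, and the requirement that grid dimensions be divisible by `factor` are assumptions.

```python
import numpy as np

def cell_areas(lat_edges, lon_edges, radius=6.371e6):
    """Areas (m^2) of latitude-longitude cells from edge coordinates in degrees."""
    band = np.sin(np.radians(lat_edges[1:])) - np.sin(np.radians(lat_edges[:-1]))
    width = np.radians(np.diff(lon_edges))
    return radius ** 2 * np.outer(band, width)

def coarsen_flux(flux, lat_edges, lon_edges, factor):
    """Area-weighted mean of a flux density field on a grid `factor` times coarser.

    Assumes both grid dimensions are divisible by `factor`.
    """
    nlat, nlon = flux.shape
    area = cell_areas(lat_edges, lon_edges)
    blocks = (nlat // factor, factor, nlon // factor, factor)
    mass = (flux * area).reshape(blocks).sum(axis=(1, 3))    # integrated flux per coarse cell
    coarse_area = area.reshape(blocks).sum(axis=(1, 3))
    return mass / coarse_area, coarse_area

# Hypothetical 1-degree field coarsened to 2 degrees.
lat_edges = np.linspace(-90.0, 90.0, 181)
lon_edges = np.linspace(-180.0, 180.0, 361)
fine_flux = np.random.default_rng(0).normal(size=(180, 360))
coarse_flux, coarse_area = coarsen_flux(fine_flux, lat_edges, lon_edges, factor=2)
```

Weighting by cell area rather than taking a plain block mean matters near the poles, where cells on a regular latitude–longitude grid shrink; without it the regional and global budgets would change under the conversion.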

Other updates to the inversion settings include the observational weighting and anisotropic prior error covariance. Although both of these updates improved the inversion results, the impact of observational weighting was considerably greater. Weighting is performed using Eq. (10) with three parameters (horizontal lengths, vertical lengths, and temporal range for counting nearby observations), which is much simpler than changing the model resolution or the prior flux error covariance design. In this study, these parameters were determined after a few preliminary sensitivity tests but were not thoroughly investigated. Therefore, equation modification or parameter sweeping could efficiently improve the inversion. The impact of the anisotropic prior error covariance is comparable to that of observational weighting on the RMS error of the overall fluxes (Table 2), whereas its impact on the RMS errors of the regional and temporal variations is marginal (Figs. 10 and 11). In fact, the improved result from CTL compared to ISO_ERR is partially attributed to the fact that the true flux was derived from NISMON-CO2 ver. 2021.1, which used the same inversion settings, including the prior error covariance, as those of CTL. This might have given CTL some advantages. Nonetheless, the difference between CTL and ISO_ERR indicates potential improvements made by the anisotropic error covariance and suggests that inversion estimates could be improved by shaping the error covariance according to real flux variations. Because a long-term flux dataset is used to derive not only the prior estimate but also its error covariance, preparing such a flux dataset has become more vital. Furthermore, the anisotropic error covariance is also valuable for providing high-resolution fluxes with the grid conversion scheme. If an isotropic error covariance is used, the patterns of flux increments from prior fluxes would be blurred even when the optimization is performed in the latitude–longitude grid, because the isotropic error covariance acts as a spatial smoother. However, as depicted in Fig. 2, the anisotropic error covariance retains the fine-scale patterns derived from flux dynamics.
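The counting of nearby observations that underlies the weighting can be sketched as follows. Only the counting step is shown; how the count enters the mismatch error through Eq. (10) is not reproduced here. The function name, the thresholds, the inclusion of each observation in its own count (so that an isolated observation has a count of 1), and the sample altitudes and times are all assumptions for illustration.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def count_nearby(lat_deg, lon_deg, alt_m, time_h, horiz_km, vert_m, window_h):
    """Number of observations (including itself) close to each observation."""
    phi = np.radians(np.asarray(lat_deg, dtype=float))[:, None]
    lam = np.radians(np.asarray(lon_deg, dtype=float))[:, None]
    alt = np.asarray(alt_m, dtype=float)[:, None]
    t = np.asarray(time_h, dtype=float)[:, None]
    # Haversine great-circle distance for all pairs (O(n^2) memory).
    a = (np.sin((phi - phi.T) / 2) ** 2
         + np.cos(phi) * np.cos(phi.T) * np.sin((lam - lam.T) / 2) ** 2)
    dist_km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))
    near = ((dist_km <= horiz_km)
            & (np.abs(alt - alt.T) <= vert_m)
            & (np.abs(t - t.T) <= window_h))
    return near.sum(axis=1)

# Two samples one hour apart at one site, plus one sample at a distant site.
counts = count_nearby(lat_deg=[24.3, 24.3, -69.0], lon_deg=[154.0, 154.0, 39.6],
                      alt_m=[8.0, 8.0, 29.0], time_h=[0.0, 1.0, 0.0],
                      horiz_km=100.0, vert_m=500.0, window_h=24.0)
```

With these inputs the two co-located samples each count two neighbors and the distant sample counts one. Sweeping the three thresholds in such a routine is inexpensive compared with rerunning the transport model, which is the sense in which tuning the weighting is described above as simpler than changing the model resolution.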

5 Conclusions

In this study, we described the updated inversion settings for NISMON-CO2. Furthermore, to demonstrate the practical application of long-term inverse analysis, we conducted 10-year inversion experiments using pseudo-observations that emulate an inhomogeneous observation network consisting of continuous measurements and flask air samplings with actual data gaps. Despite a persistent global bias in the land–ocean partitioning, the seasonal and interannual variations were sufficiently retrieved in the experiment, demonstrating the reliability of NISMON-CO2 and its real-observation inversion product (NISMON-CO2 ver. 2021.1). Moreover, sensitivity experiments revealed that the flux estimation improvements were attributed to grid conversion, observational weighting, and anisotropic prior error covariance, all of which were newly introduced to NISMON-CO2. Hence, these schemes will be useful as the default settings of NISMON-CO2 and can also be applied to inversion analyses of other atmospheric constituents (e.g., methane).

Since the adoption of the Paris Agreement in 2015, countries worldwide have been working toward reducing greenhouse gas emissions with the common goal of limiting global warming to well below 2 °C, preferably below 1.5 °C, compared with pre-industrial levels. To verify the achievements in CO2 emission reductions, regular evaluations of surface CO2 fluxes are imperative. To this end, we will operationally conduct inverse analysis using NISMON-CO2 and provide science-based estimates of CO2 fluxes, while developing the system further.