Errors and uncertainties in a gridded carbon dioxide emissions inventory

Emission inventories (EIs) are the fundamental tool to monitor compliance with greenhouse gas (GHG) emissions and emission reduction commitments. Inventory accounting guidelines provide the best practices to help EI compilers across different countries and regions make comparable, national emission estimates regardless of differences in data availability. However, there are a variety of sources of error and uncertainty that originate beyond what the inventory guidelines can define. Spatially explicit EIs, which are a key product for atmospheric modeling applications, are often developed for research purposes and there are no specific guidelines to achieve spatial emission estimates. The errors and uncertainties associated with the spatial estimates are unique to the approaches employed and are often difficult to assess. This study compares the global, high-resolution (1 km), fossil fuel, carbon dioxide (CO2), gridded EI Open-source Data Inventory for Anthropogenic CO2 (ODIAC) with the multi-resolution, spatially explicit bottom-up EI geoinformation technologies, spatio-temporal approaches, and full carbon account for improving the accuracy of GHG inventories (GESAPU) over the domain of Poland. By taking full advantage of the data granularity that the bottom-up EI offers, this study characterized the potential biases in spatial disaggregation by emission sector (point and non-point emissions) across different scales (national, subnational/regional, and urban policy-relevant scales) and identified the root causes. While the two EIs are in agreement in total and sectoral emissions (2.2% difference for the total emissions), the emission spatial patterns showed large differences (10–100% relative differences at 1 km), especially at the urban-rural transitioning areas (90–100%). However, we found that the agreement of emissions over urban areas is surprisingly good compared with the estimates previously reported for US cities.
This paper also discusses the use of spatially explicit EIs for climate mitigation applications beyond the common use in atmospheric modeling. We conclude with a discussion of current and future challenges of EIs in support of successful implementation of GHG emission monitoring and mitigation activity under the Paris Climate Agreement from the United Nations Framework Convention on Climate Change (UNFCCC) 21st Conference of the Parties (COP21). We highlight the importance of capacity building for EI development and coordinated research efforts of EI, atmospheric observations, and modeling to overcome the challenges.


Introduction
Emission inventories (EIs) are the fundamental tool to quantify the amount of man-made emissions, such as those of greenhouse gases (GHGs) and other air pollutants, and to keep track of their changes over time. For GHGs, nationally reported EIs are generally compiled following the guidelines prepared by the Intergovernmental Panel on Climate Change (IPCC) (e.g., IPCC 2006). Emissions are reported by countries in order to monitor international compliance with GHG reduction commitments (e.g., under the Kyoto Protocol or Paris Agreement). National EIs are primarily based on statistical data (e.g., on fuel production, consumption, and trade), and emission estimates are often made at the national scale by economic sector or by fuel type. The IPCC Guidelines provide "best practice" to compile EIs in a consistent manner, regardless of the data availability in different countries. The uncertainties associated with national estimates for fossil fuel carbon dioxide (CO2) emissions (FFCO2) are often relatively small, especially for developed countries (e.g., ± 4% for the USA). However, the uncertainty reported with EIs often serves as an indicator for the level of confidence, rather than for the accuracy (Jonas et al. 2010). As previously discussed in Liberman et al. (2007), White et al. (2011), and Ometto et al. (2015), studying the variety of sources of errors and uncertainties is crucial in order to make EIs more robust and accurate for providing science-based guidance to global climate mitigation.
Adding an atmospheric, observational (top-down) constraint on statistically based emission estimates (bottom-up) should help improve the accuracy of emission estimates and provide a verification support to the current global GHG monitoring framework (e.g., Nisbet and Weiss 2010; Pacala et al. 2010; Ciais et al. 2015; Pinty et al. 2017). Because the effective spatial and temporal resolution of emissions estimates depends highly on the availability of observational data and the model reproducibility, how top-down approaches can play a role in the bottom-up vs. top-down exercise cannot be easily generalized (see Ciais et al. 2010). However, the increased volume of recent atmospheric CO2 data collected from intensive urban observation networks (e.g., Lauvaux et al. 2016 for Indianapolis; Staufer et al. 2016 for Paris; Verhulst et al. 2017 for Los Angeles; Martin et al. 2018 for the Baltimore-Washington area) and the recently available carbon observing satellites, such as the Japanese Greenhouse gases Observing SATellite (GOSAT, Yokota et al. 2009) and NASA's Orbiting Carbon Observatory-2 (OCO-2, Crisp et al. 2017), have placed us in a better position to implement bottom-up vs. top-down analyses at policy-relevant scales. For example, Lauvaux et al. (2016) developed a state-of-the-art, high-resolution atmospheric inversion system that demonstrated the feasibility of a top-down approach at a city scale, and confirmed the bottom-up emission estimates. Vogel et al. (2013) also demonstrated the use of radiocarbon measurements to detect potential biases in a bottom-up EI.
In such bottom-up vs. top-down exercises, bottom-up emission estimates generally need to be given in a spatially explicit form (e.g., gridded EIs). In fact, both Lauvaux et al. (2016) and Vogel et al. (2013) employed locally constructed, fine-grained, spatially explicit EIs for their atmospheric CO2 model simulations (1.3-km resolution for Lauvaux et al. (2016) and 5-km for Vogel et al. (2013)). The Hestia inventory, which was used by Lauvaux et al. (2016), is based on a multi-resolution emission modeling approach, and emission estimates are achieved at the resolution of the emission sources of interest (e.g., point, line, and area sources). The multi-resolution, bottom-up approach makes Hestia unique compared with spatially explicit EIs that are based on spatial disaggregation of national or regional emission estimates (e.g., Andres et al. 1996; Janssens-Maenhout et al. 2012; Oda and Maksyutov 2011). While the multi-resolution modeling approach is considered to be the best approach to achieve emission estimates at policy-relevant scales, the development of such EIs is extremely labor-intensive and they are only available for limited places and times. A few other spatially explicit EIs that employ a multi-resolution modeling approach (e.g., Gurney et al. 2012; Bun et al. 2018; Mori et al. 2015) also share these difficulties, and none of them cover the full globe to support global climate mitigation. It is important to note that large-scale, top-down GHG emission verification support systems, such as the one proposed by Pinty et al. (2017), assume the use of a disaggregation-based EI such as the Emission Database for Global Atmospheric Research (EDGAR, Janssens-Maenhout et al. 2012), rather than detailed bottom-up estimates based on multi-resolution modeling. A challenge for top-down monitoring systems is to achieve accurate, disaggregated, subnational emission estimates from national-level emission estimates.
The spatial disaggregation is often an independent process from the regular, bottom-up, national EI compilation defined by the IPCC (2006). However, the uncertainty evaluation of spatially disaggregated emission estimates, especially for diffuse (area) emission fields obtained with proxy approaches, is challenging, primarily due to the lack of physical measurements (e.g., Andres et al. 2016; Oda et al. 2018). To achieve accurate estimates, errors and uncertainties due to the emission disaggregation process need to be quantified and the error/uncertainty characterization needs to be incorporated into the top-down estimation (e.g., Rayner et al. 2010; Lauvaux et al. 2016; Oda et al. 2017). In principle, spatial patterns in disaggregated emission estimates, and their changes in time, are driven by changes in the total emissions and in the spatial patterns in proxy data. Thus, the changes in disaggregated emission estimates might not accurately reflect actual changes in emissions. Given the requirements for useful emission estimates suggested by Ciais et al. (2015) (e.g., 1-km spatial resolution and hourly temporal resolution) and the labor expected for these detailed bottom-up EIs, the use of disaggregation-based EIs for climate mitigation analyses still remains valid. To successfully use disaggregated emissions to monitor emission changes at subnational levels in a verification support system, we need to characterize the biases in disaggregated emission fields at different spatial levels of disaggregation, such as countries, provinces/states, and cities.
The Open-source Data Inventory for Anthropogenic CO2 (ODIAC, Oda and Maksyutov 2011; Oda et al. 2018) is so far the only global, spatially explicit EI data product that meets the requirements of Ciais et al. (2015). ODIAC is based on disaggregation of national FFCO2 estimates made by the Carbon Dioxide Information Analysis Center (CDIAC) at the Oak Ridge National Laboratory (ORNL) and projections. Since its establishment in 2009, ODIAC has been intensively used for global and regional atmospheric inversions (e.g., Takagi et al. 2011; Maksyutov et al. 2013; Saeki et al. 2013; Thompson et al. 2016; Feng et al. 2016a; Shirai et al. 2017). ODIAC has also been used for regional- to urban-scale studies because of its high spatial resolution (e.g., Ganshin et al. 2012; Oda et al. 2013; Brioude et al. 2013; Wong et al. 2016; Lauvaux et al. 2016; Oda et al. 2017; Ye et al. 2017; Wu et al. 2018; Martin et al. 2018; Hedelius et al. 2018). The fair agreement with local estimates and atmospheric CO2 model reproducibility support the utility of ODIAC subnational emissions at regional to urban scales; however, it is as yet unclear how well ODIAC subnational emissions reflect the true emission dynamics at policy-relevant spatial scales.
This study evaluates the ODIAC high-resolution emission fields by comparing them with a locally developed, fine-grained EI, the geoinformation technologies, spatio-temporal approaches, and full carbon account for improving the accuracy of GHG inventories (GESAPU, Bun et al. 2018; Charkovska et al. 2019). GESAPU is based on a multi-resolution approach and covers the domain of Poland. By taking full advantage of GESAPU emission fields, we characterize the biases and uncertainties in ODIAC across spatial scales, from the national level (zero disaggregation) through the subnational and city levels to the native 1-km grid scale of ODIAC. Following the "Data and methods" section, we compare ODIAC with GESAPU by emission sectors (point and non-point sources of emissions as defined in ODIAC) at different levels of disaggregation (national, province, city, and native 1-km grid) ("Results" section). In the "Discussions" section, we discuss the current limitations and challenges in emission data studies (e.g., development and evaluation) and how we could potentially overcome them. We also respond to general questions about the merger of bottom-up and top-down approaches. We conclude this paper with some recommendations to establish a good, meaningful EI-based framework for international agreements on emissions limits.

Emissions data
This subsection describes the two spatially explicit CO2 emission data products used in this study: (1) the Open-source Data Inventory for Anthropogenic CO2 (ODIAC, Oda et al. 2010, 2018; Oda and Maksyutov 2011, 2015) and (2) the geoinformation technologies, spatio-temporal approaches, and full carbon account for improving the accuracy of GHG inventories (GESAPU, Bun et al. 2018; Charkovska et al. 2018, 2019; Danylo et al. 2019). Table 1 compares the specifications of ODIAC and GESAPU. Figure 1 shows the two estimates of CO2 emissions from fossil fuel use in Poland during 2010, presented at a common 1-km domain.

ODIAC global 1-km emission data product
ODIAC is a global, high-resolution (1 × 1 km), monthly gridded emission data product that is based on the spatial disaggregation of country total emission estimates (e.g., Oda and Maksyutov 2011; Oda et al. 2018). ODIAC first introduced the combined use of point source information for large point sources and satellite-observed nightlight data for global emission spatial disaggregation in order to achieve emission spatial distributions. The current ODIAC data product is based on country-level emission estimates made by CDIAC/ORNL, which consist of CO2 emission estimates from fuel use (coal, oil, and gas), cement production, and gas flaring (e.g., Marland and Rotty 1984). CO2 emissions from cement production and gas flaring are not due to fossil fuel use; however, those emissions are often included as a part of FFCO2 by definition (e.g., Andres et al. 2012). The emissions are distributed differently depending on the type of emissions (e.g., point source and non-point source). ODIAC employs the global power plant database CARMA (CARbon Monitoring and Action, www.carma.org; Wheeler and Ummel 2008; Ummel 2012) to estimate the power plant portion of a country's total emissions and maps its emissions as point sources. The rest of the emissions (total emissions minus point source emissions) are distributed as an aggregated area source sector using the Defense Meteorological Satellite Program (DMSP) calibrated radiance nightlight product (Ziskin et al. 2010).
Table 1 Specifications of ODIAC and GESAPU (excerpt)
Emission calculation: ODIAC, CDIAC approach (Marland and Rotty 1984); GESAPU, IPCC guideline (IPCC 2006)
Emission spatial distributions: ODIAC, downscaled with point source information and nightlight; GESAPU, achieved by a multi-resolution (point, line, and area) approach
a The actual resolution is 30 arcsec. Thus, the grid size is smaller than 1 km2 as it goes to higher latitude
b The latest version of the ODIAC data product (ODIAC2018) covers 2000-2017
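The split between point and non-point emissions described above can be illustrated with a minimal sketch. It assumes that the non-point remainder is allocated strictly in proportion to nightlight radiance; the function name, the toy 2 × 2 domain, and all numbers are hypothetical and are not taken from the ODIAC code or data.

```python
import numpy as np

def disaggregate_nonpoint(national_total, point_emissions, nightlight):
    """Distribute (total - point source) emissions in proportion to nightlight.

    Assumption: a simple proportional allocation; the actual ODIAC
    processing may differ in detail.
    """
    nightlight = np.asarray(nightlight, dtype=float)
    nonpoint_total = national_total - float(np.sum(point_emissions))
    if nonpoint_total < 0:
        raise ValueError("point source emissions exceed the national total")
    weight_sum = nightlight.sum()
    if weight_sum <= 0:
        raise ValueError("nightlight field has no positive radiance")
    return nonpoint_total * nightlight / weight_sum

# Toy 2 x 2 domain with hypothetical values (arbitrary units)
nightlight = np.array([[0.0, 1.0], [3.0, 4.0]])
point = np.array([[0.0, 0.0], [20.0, 0.0]])  # one power plant in one cell
national_total = 100.0

nonpoint = disaggregate_nonpoint(national_total, point, nightlight)
gridded = point + nonpoint  # the gridded field conserves the national total
```

By construction the gridded field sums to the national total, so any error introduced at this step is purely a spatial allocation error.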
This study used version 2016 of the ODIAC data (ODIAC2016, 2000-2015). Oda et al. (2018) [...] Charkovska et al. 2018, 2019; Danylo et al. 2019) and is based on earlier studies (Bun et al. 2007; Boychuk and Bun 2014). We here define GESAPU as a bottom-up EI for convenience, and also on the basis of its EI calculation approach. However, we also distinguish GESAPU from other existing spatially explicit EIs, such as EDGAR (Janssens-Maenhout et al. 2012), which are also often classified as bottom-up EIs in comparison with top-down atmospheric inversion studies. Many gridded EIs, including ODIAC as mentioned earlier, are based on emission spatial disaggregation (e.g., Andres et al. 1996, 2011, 2014, 2016; Kurokawa et al. 2013; Asefi-Najafabady et al. 2014; Janssens-Maenhout et al. 2012); however, GESAPU employed a multi-resolution, high-definition (HD) emission modeling approach and the emissions are, in principle, calculated at the individual source level. GESAPU's HD approach is similar to the approach taken for US cities by Gurney et al. (2012) and allows us to achieve detailed emission accounting and emission modeling simultaneously at a policy-relevant scale. Where information disaggregation is needed (e.g., diffuse sources such as settlements and line sources such as road segments), GESAPU employs high-granularity data at the municipality or district level, rather than country-level data. The GESAPU approach, which is an HD approach extended to a country, should help provide constraints on subnational emissions information and reduce potential biases due to the use of large-scale data (e.g., national data) for emission disaggregation. Given the GESAPU emission modeling approach, the authors believe GESAPU is at the opposite end of the spectrum of spatially explicit EIs from ODIAC. Thus, the authors expect that GESAPU should allow us to thoroughly evaluate a disaggregation-based EI like ODIAC.
GESAPU is based on the best available official statistical data and geospatial information data, which are collected at the best (smallest) possible administrative levels, such as municipalities and districts. Emission calculations are done according to the IPCC methodology for CO2, CH4, and N2O. Unlike ODIAC, which holds its own modified, fuel-based emission categories, GESAPU obtains emission estimates for the IPCC-defined sectors and categories. Since GESAPU employs a multi-resolution modeling approach, the resulting emission estimates are accompanied by point locations, lines, and/or polygon spatial information, rather than a grid point coordinate. GESAPU is then able to seamlessly prepare emission fields at a spatial resolution of interest (up to 100-m resolution) via emission spatial aggregation. The vector emission source maps for all human activity-induced emission categories covered by the IPCC guidelines were developed using the official company disclosure information available, the administrative boundary maps, the Corine Land Cover map, and other available data. GESAPU also employs region-specific parameters (e.g., the differentiated characteristics of the fossil fuel used in the energy sector, the climatic conditions and the energy sources available in the residential sector, the species and age composition of forests, and many others) for the emission calculation. Thus, the total GESAPU emissions at aggregated levels, such as province and national levels, should be estimated more precisely than other estimates that are often calculated using national-specific parameters. Besides GHGs, GESAPU also reports non-methane volatile organic compounds (NMVOCs) and air pollutants such as SO2. GESAPU data have been provided as a part of Bun et al. (2018) (see Supplementary Material in Bun et al. 2018).
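The aggregation of source-level emissions onto a grid of a chosen resolution can be sketched for the simplest case, point sources. The sketch below is an illustration only: it assumes a regular latitude/longitude grid and uses hypothetical coordinates and emission values; it is not the GESAPU implementation, and line and polygon sources would need a proper geospatial overlay.

```python
import numpy as np

def grid_point_sources(lons, lats, emissions, lon_edges, lat_edges):
    """Sum point source emissions falling into each grid cell.

    Returns an array of shape (n_lat_cells, n_lon_cells).
    """
    grid, _, _ = np.histogram2d(
        lats, lons, bins=[lat_edges, lon_edges], weights=emissions
    )
    return grid

# Hypothetical point sources (degrees, arbitrary emission units)
lons = np.array([19.05, 19.07, 19.95])
lats = np.array([50.05, 50.06, 50.95])
emis = np.array([10.0, 5.0, 2.0])

# Hypothetical 0.1-degree grid over a 1 x 1 degree domain
lon_edges = np.linspace(19.0, 20.0, 11)
lat_edges = np.linspace(50.0, 51.0, 11)

grid = grid_point_sources(lons, lats, emis, lon_edges, lat_edges)
```

Changing the bin edges is all that is needed to produce the same emissions at another resolution, which is the practical advantage of keeping emissions in vector form until the last step.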

Emission comparison principles in this study
Comparing gridded EIs has become a common evaluation approach, as seen in previous studies (e.g., Hutchins et al. 2016; Hogue et al. 2016; Gately and Hutyra 2017; Oda et al. 2018), with an increasing interest in gridded EI uncertainties. Due to technical difficulties, which will be discussed later in this manuscript, comparisons of gridded EIs often only provide a limited opportunity to partially evaluate the uncertainty of interest and do not offer an objective measure of their accuracy (e.g., Oda et al. 2018). This is a fundamental limitation due to the fact that (1) gridded EIs are often achieved via two independent processes (i.e., emission calculations and emission spatial disaggregation and/or mapping) and (2) emission estimates at grid level are often not evaluated objectively due to the lack of physical measurements (e.g., Andres et al. 2016, 2018). Thus, it is very important to clearly define the objective and implementation of the emission comparison and describe its limitations. This study attempts to evaluate the ODIAC global 1 × 1 km gridded emissions over the domain of Poland by comparing them with the GESAPU emissions aggregated to the same 1-km resolution domain. As done in previous studies, such as Hutchins et al. (2016) and Gately and Hutyra (2017), we use a bottom-up EI, which is GESAPU in this study, as the truth. This is primarily because GESAPU, as described in the previous section and elsewhere (e.g., Bun et al. 2018; Charkovska et al. 2019), is a detailed, spatially explicit EI that is locally developed using the best available data, while ODIAC is a global disaggregation EI. But the true significance of this comparison comes from the fact that GESAPU's HD emission fields are achieved by its multi-resolution approach, in which little use is made of emission disaggregation. This type of comparison could potentially be achieved for a few US cities using Gurney et al. (2012), but GESAPU provides a unique opportunity to evaluate ODIAC emissions at a high resolution across the entire country.
The authors acknowledge that there are potential emission modeling errors and uncertainties associated with GESAPU. From the bottom-up vs. top-down perspective discussed by Jonas et al. (2011), the emissions estimates are not constrained by atmospheric observations. This study assumes that those errors and uncertainties are minor when compared with the ODIAC-GESAPU differences, defined as ODIAC minus GESAPU, and that the differences can be attributed to ODIAC. This is because the ODIAC-GESAPU difference is expected to be largely driven by the emission representation errors in ODIAC due to the use of global power plant data and nightlight data for emission disaggregation, especially at a high-spatial resolution where large-scale downscaling approaches often fail (e.g., Gately and Hutyra 2017). This could be also supported by a comparison of 1-km resolution emission fields presented in Fig. 1. While major spatial patterns of emitting areas (mainly major cities and their suburb areas) are shared by two emission fields, GESAPU offers more spatial details in the emissions field due to the data granularity. This study thus uses ODIAC-GESAPU differences as a proxy measure for errors and uncertainties associated with ODIAC emissions.
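The proxy error measure defined above (ODIAC minus GESAPU on a common grid) can be written down as a short sketch. The use of GESAPU as the denominator of the relative difference and the masking of cells with zero GESAPU emissions are assumptions made here for illustration; the toy arrays are hypothetical.

```python
import numpy as np

def difference_fields(odiac, gesapu):
    """Return the absolute (ODIAC - GESAPU) and relative (%) difference fields.

    Assumption: the relative difference is taken with respect to GESAPU and
    is undefined (NaN) where GESAPU reports no emissions.
    """
    odiac = np.asarray(odiac, dtype=float)
    gesapu = np.asarray(gesapu, dtype=float)
    diff = odiac - gesapu
    with np.errstate(divide="ignore", invalid="ignore"):
        rel = np.where(gesapu > 0, 100.0 * diff / gesapu, np.nan)
    return diff, rel

# Hypothetical 2 x 2 emission fields on the same grid (arbitrary units)
odiac_field = np.array([[12.0, 0.0], [5.0, 8.0]])
gesapu_field = np.array([[10.0, 0.0], [10.0, 8.0]])

diff, rel = difference_fields(odiac_field, gesapu_field)
```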

Emission comparison setup
Another common limitation we often face in emission comparisons is the difference in emission calculations, such as calculation methods and emission definitions. Ideally, the differences in gridded EIs should be explained by individual components of the emission estimation (e.g., emission calculation and emission disaggregation), but this is often not done. For example, an attempt has been made to mitigate this issue by scaling the gridded emissions to the same total and combining them with the global emission uncertainty. In fact, the use of GESAPU makes this even harder as its spatially explicit emissions are not based on emission disaggregation. Thus, we do not separate the two error sources, as also done in previous studies (e.g., Hutchins et al. 2016; Gately and Hutyra 2017).
We here focus on mitigating, as far as possible, the differences due to the different emission sectors covered in ODIAC and GESAPU. As described earlier, GESAPU reports emissions by IPCC sector (calculated at source level), while ODIAC has its own unique emission categorization (point source and non-point source over land) built upon the CDIAC fuel-based emission categories. Depending on the comparison we implement in this study, we make our best effort to mitigate the emission definition differences so that the emission comparisons are meaningful. For example, emissions from refineries and coke plants are indicated as point source emissions in GESAPU, but those are not explicitly indicated in ODIAC and are assumed to be a part of the non-point source emissions. We thus make an ad hoc adjustment for each emission comparison to best support the results. Table 2 summarizes the domain-wide emission totals from the two EIs and their breakdown, when emissions are compared as they are. We see this as a comparison at the zero disaggregation level. We found that the domain-wide totals from ODIAC and GESAPU are fairly close (87,502 ktC/year for ODIAC and 85,612 ktC/year for GESAPU). The difference between the two totals was only 2.2%, which is well within the 2 sigma uncertainty range of the typical country-level emissions for developed countries (e.g., 5% as estimated by Andres et al. 2012). Note that the ODIAC total differs from the original CDIAC estimate for Poland (86,246 ktC/year as estimated by Boden et al. (2016)) because CDIAC national emissions are scaled in the ODIAC emission data development in order to account for the difference between the global total emission and the sum of national total emissions, which is mainly due to the inconsistency in the import/export portion of the statistical data (see more in Oda et al. 2018). When only the emission estimates taken from CDIAC and GESAPU are compared, the difference is only 0.7% (CDIAC = 84,130 ktC/year).
When the cement and gas flare emissions (2.5% of the total) are subtracted from ODIAC (85,355 ktC/year), the difference is even smaller (− 0.3%). Andres et al. (2012) showed that the agreement among different national-level estimates is often reasonable, and this initial comparison is consistent with that study. The differences in the point source and non-point source emission categories are also small (− 0.1% for point source emissions and 4.5% for non-point source emissions) (the definition of point and non-point emissions in this study will be discussed in detail later). The small differences support the view that the differences between the two emission spatial fields are largely explained by the differences in emission modeling (thus, most likely, errors and uncertainties in ODIAC). The results from this comparison can be combined with the national total uncertainty (CDIAC total uncertainty by Andres et al. (2012) in the case of ODIAC) to obtain the total uncertainty, following a previously proposed method.
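The percentages quoted for the domain-wide totals can be reproduced with a few lines of arithmetic. The sketch below assumes that relative differences are expressed with respect to the GESAPU total (the text does not state the denominator explicitly); the totals themselves are the values quoted above.

```python
def relative_difference_percent(estimate, reference):
    """Relative difference of `estimate` with respect to `reference`, in %."""
    return 100.0 * (estimate - reference) / reference

# Domain-wide totals quoted in the text (ktC/year)
gesapu_total = 85612.0
odiac_total = 87502.0
odiac_fuel_only = 85355.0  # ODIAC minus cement and gas flare emissions

total_diff = relative_difference_percent(odiac_total, gesapu_total)
fuel_only_diff = relative_difference_percent(odiac_fuel_only, gesapu_total)
cement_flare_share = 100.0 * (odiac_total - odiac_fuel_only) / odiac_total

print(round(total_diff, 1), round(fuel_only_diff, 1), round(cement_flare_share, 1))
# prints: 2.2 -0.3 2.5
```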

Results
In this section, we compare ODIAC with GESAPU by emission type (point and non-point emissions in the "Point source emissions comparison" and "Non-point source comparison" sections), from the national scale to a policy-relevant city scale ("City-level comparison" section). It is challenging to put all of the evaluations done in this study together and come up with a single, universal evaluation metric. We thus attempted to summarize our results by focusing on the levels of the overall errors and uncertainties in ODIAC as a function of spatial resolution. We compared the emission spatial patterns from the two EIs at different spatial resolutions, from the native 1-km resolution to aggregated spatial scales that are roughly consistent with the spatial resolutions commonly used in transport model studies ("The disaggregation errors across different spatial resolutions-putting all together" section).
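How a comparison across spatial resolutions can be set up is sketched below: both fields are block-summed to a coarser grid and a single mismatch score is computed at each resolution. The score used here (sum of absolute differences as a percentage of the reference total) is an illustrative choice, not necessarily the metric used in this study, and the 4 × 4 fields are hypothetical.

```python
import numpy as np

def aggregate(field, factor):
    """Block-sum a 2-D field by an integer factor along both axes."""
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("field shape must be divisible by the factor")
    return field.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))

def mismatch_percent(test, reference):
    """Sum of absolute cell differences as a percentage of the reference total."""
    return 100.0 * np.abs(test - reference).sum() / reference.sum()

# Hypothetical case: the same source placed one cell apart in the two fields
reference = np.zeros((4, 4))
reference[0, 0] = 10.0
test = np.zeros((4, 4))
test[0, 1] = 10.0

scores = {f: mismatch_percent(aggregate(test, f), aggregate(reference, f))
          for f in (1, 2, 4)}
```

In this toy case the one-cell displacement produces a 200% mismatch at the native resolution and none once both cells fall into the same aggregated cell, which is the sense in which aggregation mitigates geolocation errors.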

Point source emissions comparison
Background, issues, and focus in this comparison
The combined use of the point source information and nightlight data was key to achieving the global high-resolution emission field in ODIAC (see Rayner et al. 2010; Oda and Maksyutov 2011). A major known issue has been the inaccuracies in facility-level emission estimates and geolocations in the power plant database, which alias directly into the resulting emission field. The errors, especially the geolocation errors, could be mitigated by spatially aggregating the emission field. However, a high-spatial-resolution EI, such as ODIAC, requires a very high accuracy in the geolocation. The power plant information for the USA is often considered to be one of the best, but Woodard et al. (2014) showed a mean geolocation error of 0.84 km for 500 randomly selected plants and thus demonstrated that geolocation error in power plant databases is an issue even for the USA. Such geolocation errors can be significantly reduced by a simple data review, but such reviews would be labor-intensive. For example, Oda and Maksyutov (2011) reviewed data on approximately 400 power plants, but that was just a little more than 2% of the 17,000 CARMA plants. More fundamentally, the information available in power plant databases is often incomplete (missing power plants), sparse in time (limited base year), and often outdated. We here evaluate the point source part of ODIAC emissions by taking advantage of the facility-level power plant emission estimates with verified geolocations that GESAPU offers.
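A mean geolocation error of the kind reported by Woodard et al. (2014) can be computed once database entries have been matched to verified locations. The sketch below assumes great-circle (haversine) distances on a spherical Earth, which may differ from the distance measure used in that study; the matched coordinate pairs are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def mean_geolocation_error_km(matched_pairs):
    """Mean distance over ((lat, lon) reported, (lat, lon) verified) pairs."""
    distances = [haversine_km(*reported, *verified)
                 for reported, verified in matched_pairs]
    return sum(distances) / len(distances)

# Hypothetical matched plants: reported vs. verified coordinates
pairs = [
    ((50.06, 19.94), (50.06, 19.94)),  # exact match
    ((50.00, 20.00), (50.01, 20.00)),  # about 1.1 km apart
]
mean_error = mean_geolocation_error_km(pairs)
```

The expensive part in practice is not this calculation but the matching of entries between databases, which is the labor-intensive review discussed above.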

Point source definition differences
First, we review the definitions of the two point source data sources (CARMA/ODIAC and GESAPU). The point source definition in ODIAC was inherited from CARMA (i.e., electric power plants, as defined in Wheeler and Ummel (2008) and Ummel (2012)). CARMA was originally developed as a monitoring tool for GHG emissions from power plants (Wheeler and Ummel 2008). CARMA is primarily based on the individual plant information from the World Electric Power Plants (WEPP) database (https://www.platts.com/products/world-electricpower-plants-database now at https://www.spglobal.com/platts/en/products-services/electricpower/world-electric-power-plants-database), which is a commercial, subscription-based global database provided by the company S&P Global Platts (https://www.platts.com/). According to the website ("It (WEPP) contains design data for plants of all sizes and technologies operated by regulated utilities, private power companies, and industrial autoproducers") and a product description (https://www.platts.com/im.platts.content/downloads/udi/wepp/descmeth.pdf), WEPP covers a wide variety of electricity generators worldwide (> 1 kW), not limited to major electric power plants regulated by the authorities. WEPP includes facility-level information and the geographical locations, but not CO2 emission estimates. CO2 emission estimates in CARMA were obtained in two ways: (1) taken from publicly available national power plant data if a facility entry can be matched up and (2) estimated using their own emission estimation scheme defined by Wheeler and Ummel (2008). Wheeler and Ummel (2008) reported that 2922 entries in an earlier version of CARMA (which has been used for the ODIAC emission data development) were matched up with publicly available CO2 emissions data globally, and 260 for Europe with the European Pollutant Emissions Register (EPER) database.
In the newer version of CARMA (CARMA v3.0), 6200 entries were matched up with the publicly available data globally and 63% of the total emissions in Europe were covered (Ummel 2012). However, the matching is extremely labor-intensive, and Wheeler and Ummel (2008) acknowledged that the data matching was incomplete and might be inaccurate. Geographical coordinates (latitude and longitude) were derived from the postal address indicated in the WEPP using a fuzzy string matching approach (e.g., Wheeler and Ummel 2008; Ummel 2012). In CARMA v3.0, the geographical coordinates are also taken from the matched publicly available data. Oda and Maksyutov (2011) used the CARMA power plant entries with CO2 emission estimates, assuming them to be fossil fuel-fired power plants. In ODIAC, we loosely estimate the point source portion of national emissions using 2007 CARMA emissions. For other years, Oda and Maksyutov (2011) scaled the total point source emissions using national total emissions. The point source emissions thus might not be identical to the ones originally indicated in CARMA. The potential errors of the power plant modeling approach have been evaluated in Oda and Maksyutov (2011). Emissions from cement production plants and gas flares should also be defined as point sources, but currently, emissions from cement production are distributed as a part of non-point source emissions and gas flare emissions are distributed using the spatial distribution of a gas flare nightlight product (see Oda et al. 2018).
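The scaling of base-year point source emissions to other years can be illustrated with a minimal sketch. It assumes a simple proportional scaling, in which every plant is multiplied by the ratio of the target-year to the base-year national total; the exact procedure of Oda and Maksyutov (2011) may differ, and the plant names and values are hypothetical.

```python
def scale_point_sources(base_year_plants, national_base, national_target):
    """Scale each plant's base-year emissions by the national total ratio.

    Assumption: purely proportional scaling; plant-level changes (openings,
    closures, fuel switching) are not represented.
    """
    factor = national_target / national_base
    return {name: value * factor for name, value in base_year_plants.items()}

# Hypothetical base-year (2007) plant emissions and national totals
plants_2007 = {"plant_a": 100.0, "plant_b": 50.0}
scaled = scale_point_sources(plants_2007, national_base=1000.0, national_target=1100.0)
point_share_base = sum(plants_2007.values()) / 1000.0
point_share_target = sum(scaled.values()) / 1100.0
```

Under this assumption the point source share of the national total stays fixed at its base-year value, which is one reason the scaled plant emissions may depart from those originally indicated in CARMA.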
In contrast, point source emissions in GESAPU are calculated at the facility level and their geolocations are reviewed and verified. The GESAPU point source category includes non-power plant point sources such as facilities for petroleum refining and manufacturing solid fuels (coke plants) (in fact, these emissions are mapped using industrial area polygons, see Charkovska et al. 2019). Because of the emission estimation method, emissions from refineries and coke plants are not explicitly represented in CDIAC (hence in ODIAC). Thus, we considered that the corresponding emissions in ODIAC are distributed as a part of the non-point source emissions; accordingly, the corresponding GESAPU emissions (six refineries, and solid fuels production by eight coke plants) are excluded from the GESAPU point sources and added to the non-point source emissions. In GESAPU, power plants with a capacity less than 20 MW are not included in the point source sector (specifically, electricity and heat generation), but are included in the sector "Manufacturing Industry and Construction" as a part of non-point source emissions (see Charkovska et al. 2019). This classification in GESAPU originated from the power plant categories defined in the Polish data collection framework (big as a rule; separate statistical reporting) and industrial (small as a rule; electricity generation by industrial plants for their needs; statistical reporting within industrial plants reporting). Table 3 summarizes the differences between the CARMA/ODIAC and GESAPU point source information after the adjustments mentioned earlier. Figure 2 shows the spatial distributions and intensities of the point source emissions from the two datasets. Regardless of the adjustments made to mitigate the point source definition differences, the differences between the two data sources are still large, and it seems difficult to characterize them in a meaningful way.
Although the totals from the two data sources are very close (0.1% difference), the numbers of point sources are significantly different. Looking at the spatial distributions, some of the major power plants seem to be well co-located (e.g., ones in Lublin (LU) and Mazovian (MZ) provinces; a list of two-letter codes is shown in Appendix Table 6). However, many small CARMA plants seem to be distributed without being co-located with GESAPU plants and do not show any clear systematic pattern. This is shown more clearly in the enlarged view of the southern part of the Silesian province (SL in Fig. 2). The geolocation errors seem to be significantly larger than the estimates by Woodard et al. (2014) for the USA (0.8 km).
Our original intention in this comparison was to estimate an average geolocation error, as done by Woodard et al. (2014), but this turned out not to be straightforward, and it is questionable whether we could derive a meaningful conclusion given the large difference between the two data sources. In theory, one could review the point source information, match the entries, and carry out a geolocation error assessment as done by Woodard et al. (2014), although this would be extremely labor-intensive. Before doing so, we decided to focus on a subset of point sources in the hope of getting a general sense of what the difference looks like. We will try to answer the geolocation question in a different way in the "The disaggregation errors across different spatial resolutions-putting all together" section. (See Table 6 in Appendix A for a list of two-letter Polish voivodeship (province) codes and full names.)

Some other CARMA plants are located close to GESAPU plants, but those do not correspond to electric power plants in GESAPU. For example, the CARMA plants in the middle of Kraków (indicated as Sendzimir/Barycz) look like they represent the power plant near the city center, Krakow Leg, which is the largest GESAPU plant in MA. In fact, Sendzimir/Barycz (CARMA_ID 40540/3454; total 108 ktC/year) is a combination of multiple sources, such as a steel plant and a landfill, which are both located elsewhere. The steel plant is located just a few kilometers east of the city center. We think the reason might be that the central office of all these plants is located in the central part of Kraków and its postal address was used to set the geographical coordinates of these plants. This is obviously an error from the point source information point of view.
Placing emissions in city centers, however, might have mitigated errors in the atmospheric CO2 simulations to some degree, because it avoids creating an imaginary point-wise emission gradient by assigning an intense point source emission to a completely non-emitting region. In fact, we found that Krakow Leg in CARMA (CARMA_ID 23019; 464 ktC/year for GESAPU and 765 ktC/year for CARMA) was placed in a village named Kraków located in the administrative gmina (municipality) Warta, Łódź Voivodeship (LD), in central Poland, which is approximately 215 km away from the city of Kraków in MA. This seems to be explained by an error in the fuzzy string match done by Wheeler and Ummel (2008), probably related to whether Slavic characters were included accurately or inaccurately. While Krakow Leg in CARMA introduced an error in LD, Sendzimir/Barycz helped somewhat to make up for the missing large emissions, although only by a quarter. Another large power plant, Siersza (CARMA_ID 41552; 365 ktC/year for GESAPU and 1009 ktC/year for CARMA), was not found in the ODIAC emissions, although it is listed in the original CARMA. This was because its geographical coordinates were not available and the emission was distributed as a part of non-point source emissions. Even with a correct geolocation, ODIAC would have overestimated the Siersza emission by more than 200%. The electricity utility company near Trzebinia (CARMA_ID 46415; 11 ktC/year) might have helped to reduce the emission representation error, but only by 3% of the true emission. For the rest of the CARMA plants, we confirmed, based on information on the web, that Alwernia (CARMA_ID 1249; 13 ktC/year) represents a chemical plant and that Wieliczka (CARMA_ID 49640; 13 ktC/year) is a salt mine open to tourists. Klucze (CARMA_ID 22496; 11 ktC/year) most likely indicates a hygiene product manufacturer, but the location indicated in CARMA did not match the company's actual facility location.

Background, issues, and focus in this comparison
The data on nightlights observed from satellites have been identified as an excellent indicator of the intensity of human activities (e.g., Elvidge et al. 1999). The use of nightlight data allows us to incorporate the dynamic changes in satellite-observed human emissions in a timely and globally coherent way (e.g., Oda and Maksyutov 2011; Oda et al. 2018). Separating point source emissions (which are not always co-located with human settlements) from total emissions further improved the performance of the nightlight data as an emission proxy, even at a higher spatial resolution (Oda and Maksyutov 2011). The performance of the nightlight data as a proxy for CO2 emissions, however, has not been fully evaluated, especially at a subnational level. This is because of the difficulties in evaluating disaggregated emissions, as discussed earlier (see the "Emissions dataset comparison" section) and elsewhere (Oda et al. 2018). ODIAC emission distributions have been compared with other disaggregated or semi-disaggregated emissions (e.g., Hutchins et al. 2016; Hogue et al. 2016; Gately and Hutyra 2017), but those evaluations do not allow us to evaluate the performance of nightlight data, as the comparisons did not take the differences in the disaggregation approaches into account.
In this comparison, we evaluate the performance of the satellite-observed nightlights as a proxy for diffuse source emissions, with a special focus on characterizing the biases in the resulting emission field. The non-point emissions defined here are the residual of the total minus point source emissions. As already shown, the non-point totals were close enough (only 4.5% difference, see Table 2) that we did not subtract the emissions from cement production and gas flaring from the ODIAC field. The subnational differences in non-point source emissions between ODIAC and GESAPU are expected to be larger than the difference in the total non-point emissions. The spatial distributions of non-point source emissions in ODIAC are estimated purely from the nightlight data. Thus, this comparison reveals how well nightlight data can explain the emission spatial distributions over the domain of Poland. Our special interest is the performance of the nightlight data along the urban-rural transition, as such a detailed evaluation of the emission distribution is only possible with a multi-resolution EI such as GESAPU that covers not only cities but the entire country domain.

The province-level accuracy of ODIAC disaggregation
In principle, the proxy-based disaggregation approach should work reasonably well at a large scale. Andres et al. (1996), for example, disaggregated national emissions using population distribution to a 1 × 1 degree resolution global domain (i.e., CDIAC gridded EI). The gridded EI has been used for forward and inverse model calculations of CO2 at large scales (e.g., Gurney et al. 2002). Population is a good estimator of the intensity and spatial extent of human activities (hence, CO2 emissions) at an aggregated large spatial scale (e.g., state/province levels). The correlation between population and CO2 emissions, however, is expected to become weak at higher spatial resolutions, where the spatial and temporal patterns of individual emission sources (e.g., power plants and traffic) are more apparent (e.g., Oda et al. 2018). This should remain true regardless of the choice of proxy data, such as nightlights, gross domestic product (GDP), and any other spatially distributed variables that have a fair correlation with human activities. As the name suggests, those variables are used as a proxy and could poorly represent regional differences in the degree of correlation with human activities, which could be a source of disaggregation bias. In fact, as shown by Raupach et al. (2007), regional emission drivers are very different over different parts of the world, and this is expected to be applicable to subnational emissions. For example, nightlight data show different levels of correlation with population over different countries with different economic development status (e.g., Raupach et al. 2010; Oda et al. 2010). Those are the sources of uncertainty we have not been able to study in detail and thus are the focus of this comparison.
Here we compared the non-point emission totals of ODIAC and GESAPU calculated at the province (called voivodeship in Poland) level. This is an important check to confirm whether nightlight data are a fair proxy and/or estimator of province-level CO2 emissions, before disaggregating emissions to much higher spatial resolutions (where the disaggregation error can be significant). Total emissions at provincial levels are available for some other parts of the world (e.g., USA, Japan, and China). Thus, a comparison of provincial-level emission estimates could be an option to further evaluate the performance of the proxy approach used. A fair performance of the proxy (nightlight data) at the provincial level would support the idea that subnational emission changes driven by provincial-level mitigation activities could be detected with the proxy-based disaggregated emissions. Figure 4 compares the percentage provincial share of the Polish national emissions in ODIAC (hence, the nightlight data, shown in blue in Fig. 4) with the comparable values from GESAPU. All of the values calculated are shown in Table S3. We also plotted the population (shown in red in Fig. 4) as a reference to characterize the performance of the nightlight data over different provinces. What Fig. 4 essentially shows is the accuracy of the proxy-based provincial emission estimates. Figure 4 shows the fair performance of nightlights in estimating province emissions (R² = 0.86). However, the percentage differences at the province level (estimation errors) range from −33 to 58.4%. The estimation errors of provinces with a large emission share, such as Masovian province (MZ, 18.3% in GESAPU) and Silesian province (SL, 14.0% in GESAPU), which are the top two provinces with the highest per capita non-point emissions (see Table S3), are larger in absolute value than those of provinces with a smaller emission share.
ODIAC underestimated the MZ emission by 15% (1196 ktC, 2.8% of the total GESAPU) and the SL emission by 32% (1995 ktC, −4.7% of the total GESAPU). Also, the percentage differences for provinces such as Podlaskie (PD, 58%) and Holy Cross (SK, 53%) are more prominent, although their emission shares are small (2.4% and 2.9%, respectively) (hence, small estimation errors in absolute value). Those differences are minor given the 4.4% total difference (1901 ktC), especially in large-scale transport modeling applications. The good spatial subnational emission partitioning (supported by the excellent correlation with GESAPU subnational emissions) (at approximately 139-km resolution based on the average area size of provinces) and the relatively low emission errors compared with the total emissions (−4.7-2.5%, in the presence of the 4.4% total difference) support the good quality of the disaggregated emissions at this disaggregation level. However, an improvement will be needed before the estimated subnational emissions can be used to keep track of subnational emission changes, as the estimation errors at the province level are of almost the same magnitude as, or larger than, the emission reductions proposed in local climate actions.

Fig. 4 A comparison of the % share of the provincial ODIAC emissions (hence, nightlight (NTL)), population, and GESAPU emissions
We also found that population data outperformed nightlight data in estimating provincial emissions (R 2 = 0.95). Although we acknowledge that this comparison is only done for a single year (year 2010) and a single country, thus, the conclusions here might not remain the same for other years and other countries, the correlation might imply that demographic data such as population data might be able to provide regional constraint on subnational emissions for Poland and potentially improve the accuracy of nightlight-based emission disaggregation. One thing we can do is to calibrate the provincial nightlight to population (if an EI like GESAPU is not available). In fact, provincial-level information has been proven to be useful to improve the subnational disaggregated emissions. Nassar et al. (2013) applied a per capita correction to ODIAC and CDIAC population-based gridded emissions over Canada. The study found the correction to ODIAC was smaller than that to CDIAC gridded emissions.
The estimation error (estimation accuracy) of the two proxies seems to be comparable. The size of the corrections we could make by using population data in addition to nightlight data might be subtle, but reducing the emission estimation errors and improving the ability to accurately partition provincial emissions should help to make a robust logical link between subnational emissions and total emissions. Provincial-level demographic data and/or statistical data could potentially provide a measure of the accuracy of disaggregated emissions and/or provide a constraint on emission disaggregation.
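The province-level check described in this subsection can be illustrated with a minimal sketch. It assumes that the proxy-based provincial estimate is the national total multiplied by each province's share of the summed proxy, and that R² is the squared Pearson correlation; neither detail is spelled out in the text. All function names and the four-province numbers are hypothetical and are not taken from Table S3.

```python
import numpy as np

def percent_share(values):
    """Percentage share of each province in the national total."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values / values.sum()

def percent_difference(estimate, reference):
    """Signed estimation error of the proxy-based estimate, in percent."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return 100.0 * (estimate - reference) / reference

def r_squared(x, y):
    """Squared Pearson correlation (assumed meaning of R2 in the text)."""
    return np.corrcoef(x, y)[0, 1] ** 2

# Hypothetical provincial values (NOT the values of Table S3).
reference_ktc = np.array([400.0, 300.0, 200.0, 100.0])  # bottom-up inventory
nightlight_sum = np.array([30.0, 35.0, 25.0, 10.0])     # summed proxy per province

# Proxy-based estimate: the national total split by the proxy share.
national_total = reference_ktc.sum()
proxy_ktc = national_total * nightlight_sum / nightlight_sum.sum()

share_reference = percent_share(reference_ktc)
share_proxy = percent_share(proxy_ktc)
errors_pct = percent_difference(proxy_ktc, reference_ktc)
fit = r_squared(share_proxy, share_reference)
print(share_reference, share_proxy, errors_pct, fit)
```

The same functions could be applied to population counts in place of `nightlight_sum` to compare the two proxies on equal terms.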

ODIAC-GESAPU subnational differences at a grid-scale level
Although population showed better performance in estimating province emissions, the advantages of nightlight data over population data remain valid. The high-resolution images of nightlights, which are collected far more frequently than demographic data and in a globally coherent manner, allow us to disaggregate emissions to a 1-km resolution. Here we compared ODIAC and GESAPU emissions on a common 1-km grid and evaluated the high-resolution emission disaggregation. Figure 5 shows the absolute and relative differences between ODIAC and GESAPU. The actual non-point maps are only presented in Appendix A (see Fig. 9), as their differences from the total maps (Fig. 1 in the main text) are not obvious in the same color scale. To mitigate the differences due to pre-disaggregation emission estimates and focus on the differences in emission spatial patterns, we scaled the ODIAC total emissions to GESAPU, although the difference is only 4.5%. The differences seen in the plots are thus largely explained by the differences in disaggregation approaches, mainly the lack of underlying data granularity in ODIAC.
The comparison reveals several interesting spatial features in the ODIAC-GESAPU differences. In general, ODIAC underestimates the emissions at the urban centers (see urban cores in blue in the absolute difference plot) and overestimates emissions outside of them. The underestimations at cities stand out in the absolute difference plot, but they are often of the order of 10-30% in relative difference. This is very good agreement, especially compared with the recent study in the northeastern USA by Gately and Hutyra (2017), which showed 50-250% relative differences over urban areas. The overestimations in the suburban areas, especially immediately outside of the cities, are relatively larger (90-100% in relative difference). The relative differences decrease towards remote areas (~10% relative difference). This general difference feature (underestimation in cities and overestimation outside of them) could be explained by the lack of a traffic sector in ODIAC, as transportation is often a major sector in urban areas. The lack of a transportation sector thus incorrectly shifts those emissions to suburban areas. The isolation of transportation emissions will be the next key to addressing the urban-rural emission biases. The underestimation over remote areas is largely explained by ODIAC having zero-emission areas due to the nightlight data proxy. From the spatial patterns of the underestimated emissions over the remote areas, we speculate that the underestimation is mostly related to the transport sector (roads), centralized heat production (cities), and manufacturing industry (industrial zones in/near cities). Approximately 25,000 pixels (4.2% of the total pixels) indicate a 100% relative difference (dark brown); for 66% of them, GESAPU indicates emissions of 2~10 tC/year while ODIAC indicates zero emissions (hence, 100% relative difference).
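The grid-scale comparison above (scaling the ODIAC total to GESAPU, then mapping ODIAC minus GESAPU) can be sketched as follows. The text does not give the formula for the relative difference; the bounded form used here, (ODIAC − GESAPU) / max(ODIAC, GESAPU), is an assumption chosen only because it yields ±100% where one inventory is zero, as described for the remote pixels. The function names and the 2 × 2 arrays are hypothetical.

```python
import numpy as np

def scale_to_reference_total(field, reference):
    """Rescale `field` so that its domain total equals that of `reference`."""
    return field * (reference.sum() / field.sum())

def difference_maps(odiac, gesapu):
    """Absolute (ODIAC minus GESAPU) and relative difference maps.

    The relative difference is an ASSUMED bounded definition:
    (odiac - gesapu) / max(odiac, gesapu), in percent, and 0 where both are 0.
    """
    absolute = odiac - gesapu
    larger = np.maximum(odiac, gesapu)
    relative = np.zeros_like(absolute, dtype=float)
    np.divide(absolute, larger, out=relative, where=larger > 0)
    return absolute, 100.0 * relative

# Hypothetical 2 x 2 emission grids (tC/year per cell).
odiac_demo = np.array([[0.0, 4.0], [6.0, 0.0]])
gesapu_demo = np.array([[2.0, 8.0], [10.0, 0.0]])

odiac_scaled = scale_to_reference_total(odiac_demo, gesapu_demo)
abs_diff, rel_diff = difference_maps(odiac_scaled, gesapu_demo)
print(abs_diff)
print(rel_diff)
```

With this definition, a cell where ODIAC is zero and GESAPU is positive gives −100%, and a cell where both are zero is reported as 0% instead of being left undefined.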

Nightlight proxy bias at urban-rural transition areas
As we took a closer look at the difference plot, we became curious about the large relative differences (overestimations in ODIAC) over the urban-suburban transition areas. To our eyes, the high relative differences seem to be located to the northwest of cities (see Fig. 11).
Here we hypothesized that the large difference might be due to an error in the geolocation of the nightlight data. We roughly estimated that we could significantly mitigate the difference by shifting the nightlight data by approximately 1.6 km in the south-east direction, at 27.3 degrees (see Appendix B). This geolocation error of slightly less than 2 km, which is larger than the ODIAC native grid (30 arcsec~1 km), could be a non-negligible source of errors, especially when ODIAC emissions are used for urban studies. We further speculate that a light aureole around cities, strengthened by the vegetation in the areas, might force ODIAC to incorrectly distribute emissions. Figure 6 shows the relative emission difference around the city of Białystok (300,000 inhabitants). We superposed several data layers, such as the city administrative boundaries (black), forest maps (green), and agricultural land maps (red, from Corine land cover map), onto the relative difference map. The forest and agricultural maps only indicate major patches in order not to make the plot busy. The city lights give a certain halo at a short distance from the urbanized area. The DMSP sensor (or retrieval algorithm) incorrectly identified them as electrical lights, although those are from forested or agricultural areas. With the weak non-electrical lights, ODIAC thus allocates weak emissions over the areas (order of 100 tC/year), while GESAPU indicates zero emissions (hence, yielding 90-100% relative differences). With increasing distance from the urban area, this bias becomes weaker as the sky reflection gets weaker and eventually falls below the instrument detection limit.

Fig. 5 ODIAC-GESAPU absolute (left) and relative (right) differences. The differences are defined as ODIAC minus GESAPU
This could be confirmed or rejected by a further investigation with new nightlight data collected from the Visible Infrared Imaging Radiometer Suite (VIIRS) on board the Suomi National Polar-orbiting Partnership (Suomi-NPP) satellite (e.g., Román and Stokes 2015; Román et al. 2018). The Suomi-NPP/VIIRS has an improved light sensitivity over previous nightlight instruments (e.g., Elvidge et al. 2013) and has been collecting improved nightlight images since 2012 (e.g., Román et al. 2018). As mentioned earlier, ODIAC was originally designed for atmospheric CO2 inverse flux calculations to reduce the potential model biases due to coarse-resolution gridded EIs. Given the simple nightlight-based downscaling in ODIAC, and as shown earlier in this section, urban emissions derived from ODIAC are subject to errors associated with the emission disaggregation. However, a few US domain-based studies have shown the utility of ODIAC downscaled urban emissions (e.g., Brioude et al. 2013; Lauvaux et al. 2016; Hedelius et al. 2018). Those studies have partially supported the view that ODIAC downscaled urban emissions reasonably allocate emissions to urban areas. For example, Lauvaux et al. (2016) reported that the difference from locally developed GIS-based emissions by Gurney et al. (2012) was just 20%, regardless of the significant differences in emission modeling approaches. The recent study by Gurney et al. (2019) further compared ODIAC and Hestia products for four US cities (Los Angeles, Salt Lake City, Indianapolis, and Baltimore) and found that the city-wide emission differences range from −1.5% (Los Angeles) to 20.8% (Salt Lake City).

Fig. 6 The relative difference around the city of Białystok. Several layers such as the boundaries of cities (black), the boundaries of forests (green), and the boundaries of agricultural lands (red, from Corine land cover map) are superposed. The forest and agricultural maps only indicate major ones just not to make the plot too busy
Gately and Hutyra (2017) also found that ODIAC, among several downscaled emissions such as EDGAR and FFDAS, showed the best agreement with their 1-km ACES bottom-up emission data product. An encouraging message from this study is that the urban relative differences (shown in the previous subsection) are much smaller than those in previous studies such as Gately and Hutyra (2017). This might imply that, despite its regional dependency, the nightlight proxy is working reasonably well for urban areas in Poland. In the absence of highly detailed EIs such as GESAPU and Hestia, getting reasonably accurate urban emissions via global disaggregation is significant for global GHG emissions monitoring.

Disaggregating national emissions to the city level
To make such downscaled emissions more useful for urban studies, we need to assure the accuracy of the spatially distributed emission estimates for urban high-resolution transport modeling. Previous studies have shown fair model reproducibility using ODIAC emissions. The ability to evaluate ODIAC emissions might be limited by the model ability (Martin et al. 2018). Due to the issues with the nightlight data (e.g., blooming effect), it is challenging to accurately map electrical light patterns without biases (hence, errors in the resulting emission fields). Such errors might be too subtle to detect. Also, for policy applications, showing reasonable emission distributions is not good enough. We need to assure that the emission changes in the field reflect changes in the local emission driver (emission reduction), which might be difficult to achieve with current national emission downscaling. However, it might be possible with EIs collected by local climate actions, such as the C40 cities climate leadership group (https://www.c40.org/) and the global covenant of mayors for climate and energy (https://www.globalcovenantofmayors.org/). Regular EI reporting is often a requirement under these climate mitigation activities. For example, the global covenant of mayors has four processes: commitment, inventory, target, and plan. EI reporting is the first process after cities declare their commitment. The covenant of mayors defines at least scope 1 emissions following an inventory guide defined by the Global Protocol for Community-Scale Greenhouse Gas Emission Inventories (https://ghgprotocol.org/greenhouse-gas-protocol-accounting-reportingstandard-cities). When locally compiled city EIs are used, creating a reasonably good spatial distribution becomes more significant (e.g., Oda et al. 2017).

Disaggregated urban emissions for Warsaw
Here we took a look at the emission fields over the city of Warsaw, the capital city of Poland (population 1.7M in the year 2010). Warsaw is one of the world megacities that have been active in global climate mitigation activities, such as the C40 cities climate leadership group (since 2005, as one of the founding member cities), the global covenant of mayors (since 2009), and the district energy in cities initiative (http://www.iclei.org/activities/agendas/low-carbon-city/districtenergy.html). Warsaw has been working towards low-carbon development (https://www.c40.org/case_studies/c40-good-practice-guides-warsawsustainable-energy-action-plan-for-warsaw-in-the-perspective-of-2020) and better air quality (https://www.c40.org/case_studies/cities100-warsaw-district-heating-upgrades-cut-airpollution). As a part of the climate action activities, Warsaw completed the first phase of the requirement (inventory development) (https://www.globalcovenantofmayors.org/cities/warsaw/). Including those locally compiled emission inventories should reduce the disaggregation bias significantly, compared with the current national emission disaggregation, although the uncertainty assessment of the city-level emission estimates will be another major challenge. Figure 7 compares three Warsaw emission fields, from ODIAC (1 km) and from GESAPU at two different aggregated spatial resolutions (1 km and 100 m), with a 15-arcsec (approximately 500 m) nightlight image collected from the VIIRS on the Suomi-NPP spacecraft, developed by the National Oceanic and Atmospheric Administration (NOAA)'s National Centers for Environmental Information. As we expected, and as also shown in the results earlier (Fig. 6), urban emission spatial distributions are poorly represented in ODIAC. The 2010 total emissions within the city boundary (this should roughly correspond to the scope 1 territorial emissions) are 3638 ktC in GESAPU and 2554 ktC in ODIAC (30% difference).
We conclude that the difference is reasonably small compared with the cases of US cities, such as those in Lauvaux et al. (2016) and Gurney et al. (2019). The agreement could be better if we took a slightly bigger domain for the emission calculation to mitigate the poor nightlight-driven city patterns. Here we also see that the ODIAC urban emission core is shifted roughly towards the west, compared with the city boundary as well as the VIIRS nightlight data. Comparing the ODIAC and the VIIRS nightlight data, our hypothesis for the high relative differences around cities appears plausible: ODIAC distributes emissions over areas without electrical lights in the VIIRS nightlight data (note, however, that the two nightlight datasets were not taken in the same year). This suggests that the case shown earlier is not the only example (see Fig. 10 for absolute and relative difference plots for Warsaw). Also, it is very clear that the inclusion of traffic emissions would help to achieve a better emission estimate at this scale, given the underestimation in ODIAC emissions outside the city, as also discussed by Oda et al. (2017) and Gurney et al. (2019). As also shown in the previous subsection, the relative differences within the city boundary range from 0 to 60% and are smaller than in the case of Gately and Hutyra (2017) in the northeastern part of the USA.
Comparing the Warsaw city boundary with the VIIRS nightlight spatial pattern, the use of the VIIRS nightlight data for emission downscaling would significantly improve the spatial representation of urban emissions, which has been proven to be useful for urban inversion. With the improved light sensitivity over previous nightlight instruments (e.g., Elvidge et al. 2013) as well as an improved nightlight data retrieval (Román et al. 2018), disaggregation errors due to the current emission representation errors can be greatly reduced. However, it is very clear that nightlight patterns within Warsaw do not always represent sectoral CO2 emissions reasonably well, and better within-city emission modeling is required to achieve an improved emission gradient within the city. Oda et al. (2017) have shown the validity of the combined use of satellite data and geospatial modeling for a better approximation of emission spatial patterns in urban inversion applications. This approach should improve the urban-remote area emission allocation we looked at earlier in this section. Alternatively, instead of starting from the national estimate, we could start with city-level emission estimates available from local climate actions.

The disaggregation errors across different spatial resolutions-putting all together
This study is unique compared with previous EI evaluation studies in that we evaluated ODIAC emissions by component (point and non-point) over different spatial scales (national, province, city, and in between). The series of emission comparisons in this study not only estimated the disaggregation errors but also identified the possible causes of the biases and provided implications for reducing the disaggregation errors in ODIAC and possibly other nightlight-based disaggregation frameworks. The final challenge at the end of the series of emission comparisons is to combine all the knowledge about the disaggregation errors we obtained from the different comparisons shown earlier and come up with one single metric that guides the users of the ODIAC emission data product. The authors believe that such a metric would benefit the data users who prescribe ODIAC emissions in atmospheric transport models and the potential users who wish to use ODIAC beyond its original intended use.
The errors in disaggregated emissions are expected to change as a function of the spatial scale (e.g., Hogue et al. 2017). In atmospheric modeling, the error is also a function of the sensitivity of the observation to the error. In most global atmospheric CO2 flux inversions (e.g., Bousquet et al. 1999; Gurney et al. 2002; Baker et al. 2006; Feng et al. 2009; Chevallier et al. 2010; Peylin et al. 2013; Houweling et al. 2015; Feng et al. 2016a), for example, FFCO2 is given as a known quantity and is never optimized. Gurney et al. (2005) have pointed out that FFCO2 needs to be accurately given in the inversions in order to obtain robust estimates of natural uptake; otherwise, the errors in FFCO2 are aliased into the final inverse flux estimates. Historically, most of the atmospheric CO2 data have been collected at remote background sites in order to infer natural carbon uptake (hence, FFCO2 atmospheric signals are minor) (e.g., Tans et al. 1990; Ballantyne et al. 2012). Also, the spatial resolutions of working atmospheric transport models have often been coarser than 1 degree (e.g., Gurney et al. 2002), with few exceptions. The use of perfect FFCO2 thus has been a fair assumption in conventional atmospheric inversions, and the accuracy of the emission disaggregation has not been a major concern. However, atmospheric CO2 data from dense city-focused observation networks (e.g., Lauvaux et al. 2016; Martin et al. 2018) and satellites (e.g., Kort et al. 2012; Janardanan et al. 2016; Schwandner et al. 2017; Hakkarainen et al. 2016; Nassar et al. 2017) are recording local emission signatures from human activities down to spatial scales of a few kilometers, and models are capable of replicating those concentrations (e.g., Oda et al. 2012; Feng et al. 2016b; Lauvaux et al. 2016; Oda et al. 2017; Ye et al. 2017; Martin et al. 2018; Wu et al. 2018; Hedelius et al. 2018).
As the spatial resolution of transport modeling increases to fully utilize those observations, the system should become more sensitive to the errors in FFCO2 (e.g., Oda and Maksyutov 2011). However, many global and regional inversion studies do not incorporate any kind of uncertainty information regarding the FFCO2 imposed. This is partly because the error characterization is challenging and the uncertainty estimates from multi-inventory comparisons might not be suitable for use in inversions. At the urban scale, it is often challenging to define errors due to the lack of EIs (e.g., Lauvaux et al. 2016; Oda et al. 2017). This study also shares the difficulties in evaluating uncertainties in spatially explicit EIs. The difficulties do not allow us to fully assess the accuracy of the disaggregation, but this study attempts, as a case study, to demonstrate the change in disaggregation errors as a function of spatial scale in order to show to what extent spatial aggregation would help to reduce the errors in the downscaled emissions. Figure 8 shows the levels of ODIAC-GESAPU differences in the emission magnitude and the agreement in the spatial patterns (spatial correlation) as a function of the spatial resolution (1, 5, 10, 25, 50, 100, 200, and 400 km). Some of the spatial resolutions roughly match popular spatial resolutions in transport model simulations (e.g., 1 km, urban simulations; 10 km, regional simulations; and 200 km, global inversions; see Table 4). The differences are defined as the sum of the absolute differences, which we consider a conservative measure. The initial difference (at 1-km resolution) is approximately 155% (200% for point and 112% for non-point). To highlight the level of mitigation achieved by going from the native 1-km resolution to lower spatial resolutions, we normalized the difference by the initial difference. Thus, the difference begins at 100%.
As expected, the emission differences are reduced as emissions are aggregated to coarser spatial resolutions. The difference in total emissions was significantly mitigated, by slightly more than 70%, during the first 100 km (close to 1 degree) of aggregation. The decrease in the difference (mitigation) beyond 100 km was subtle compared with that in the first 100 km of aggregation, but it remained monotonic. At 400-km resolution (slightly less than 4 degrees), the difference was mitigated by approximately 85%. Point and non-point emission differences showed similar changes as a function of spatial resolution, with the point source difference being a few percent higher (preceded by the non-point emission difference). For example, non-point source emissions achieved an 80% difference reduction at 200 km, while point source emissions achieved it at 350-km resolution and coarser. It is also important to note that the total emission differences are largely driven by point source differences rather than non-point emission differences. The overall behavior of the correlation over the spatial scales is similar, but some features are worth describing. The three correlation curves reached 0.9 at different spatial scales (100 km for total emissions, 200 km for point source emissions, and a little after 50 km for non-point emissions), with the point source curve running ahead of the non-point source curve at resolutions finer than 25 km. Currently, ODIAC emission data products are provided at two spatial resolutions (1 km and 1 degree). Based on this analysis, the disaggregation error seems to be well mitigated at 1 degree (a 76.8% reduction at 100 km). If we move from 1 degree to 0.1 degree, which is a typical global gridded EI resolution (e.g., EDGAR), we need to accept a larger disaggregation error (a 50.3% reduction at 10 km). Figure 8b shows the change in difference and correlation at resolutions finer than 25 km. We found that approximately 50% of the difference was mitigated during the first 10 km (approximately 0.1 degree).
Also, it is now clearly seen that the correlation improvement was driven by point source aggregation rather than by non-point emissions, which suggests the total agreement is dominated by point emissions. This implies that mitigating the errors in point source emissions should effectively improve the overall accuracy of the ODIAC emission field, especially at high spatial resolutions. As shown in the previous section, the current ODIAC point source emissions have significant errors due to the use of CARMA. That point source emissions hold more potential for mitigating disaggregation errors than non-point source emissions is a promising message for ODIAC: improving point sources is very labor-intensive but achievable, and this study has identified the sources of errors and provided implications for how to improve them.

Table 4 A summary of the ODIAC-GESAPU emission differences and spatial pattern agreements at significant spatial resolutions for atmospheric modeling. Values are for total emissions (point + non-point), and values for point and non-point are in brackets. The error reduction is defined as 100% minus the normalized emission difference. The resolutions and narrative are for reference only and do not absolutely define high or low resolution.
Spatial resolution | Error reduction by spatial aggregation (point/non-point) | Spatial correlation (point/non-point) | Narrative, application examples
1 km | - | 0.04 (0.04/0.10) | ODIAC native resolution, high-resolution urban simulations (e.g., Lauvaux et al. 2016)
5 km: Fine-grained EI (e.g., Vogel et al. 2013; Thiruchittampalam 2012)
10 km: Typical global and regional gridded EI spatial resolution (0.1 degree; e.g., the Emission Database for Global Atmospheric Research (EDGAR) and the Regional Emission Inventory in Asia (REAS))
25 km: MRV a fossil fuel data assimilation system (e.g., Rayner et al. 2010; Pinty et al. 2017)

While we found that the differences can be significantly mitigated by spatial aggregation (coarsening the spatial resolution), this analysis also shows that the absolute difference does not fall below 12% of the initial error (18% without normalization). This is not surprising, as we think the remaining error corresponds to the biases due to the use of nightlight data as an emission proxy (see the "Non-point source comparison" section). The absolute error at the provincial level is about 21.6% (non-point emissions only), while the average provincial area is approximately 140 km × 140 km. From this analysis, the error at 140 km is slightly less than 30% (without normalization). It is natural to expect this comparison to yield higher errors, as the provincial-level comparison is less sensitive to errors in the subnational emission spatial distribution. This is another form of the emission representation error of the nightlight proxy, in addition to the emission representation errors at higher spatial scales.

A previous study defined the gridded emission error in terms of two components, E_est and E_disagg, where E_est is the error associated with the emission calculation and E_disagg is the error associated with disaggregation. The focus of that study was to quantify the uncertainty in global inversions due to the fossil fuel assumption. When the formulation is applied to a single downscaled gridded emission field, however, evaluating the second term is challenging. That study assumed that E_disagg equals zero at the global level (zero disaggregation), which is essentially what is assumed in global inversions. In reality, it is not zero, as shown in this analysis. These disaggregation errors should nevertheless be relatively easy to fix compared with the emission representation errors at the native 1-km resolution. For example, we should be able to significantly mitigate the second type of emission representation error (which we see in this subsection) by improving the subnational distribution using population.

Discussion
The analyses reported here have explored errors and uncertainties in the CO2 emissions estimates of the ODIAC model. The use of the error and uncertainty estimates derived in this study is thus limited, but the estimates have practical implications for EI developers and for those who use the emissions data in atmospheric transport models. The authors believe that the results can be generalized to some extent, as ODIAC typifies spatially explicit emission estimates created by taking national emissions estimates and distributing them across a country using some proxy (such as satellite-observed nightlights and population) that plausibly correlates with CO2 emissions. In this section, we discuss the importance and challenges of EIs in future GHG monitoring and management, the prospects of improving GHG emission estimates using atmospheric data, the utility of spatially explicit EIs (for both scientific and political purposes), and the future of EIs and their scientific and political applications. There were five key topics addressed in the three previous workshops in this series on uncertainty in emissions inventories (see Liberman et al. 2007; White et al. 2011; Ometto et al. 2015):
1. Achieving reliable GHG inventories at national and sector scales and reporting uncertainties reliably at these and finer scales
2. Bottom-up versus top-down GHG emission analysis
3. Reconciling short-term emissions commitments and long-term atmospheric concentration targets, detecting and analyzing GHG emission changes vis-à-vis uncertainty, and addressing issues of compliance with commitments
4. Issues of the scale of GHG inventories
5. Trading emissions and emission offsets
In this discussion, we touch on topics 2, 3, and 4 with special focus on topic 2. In the "Perspectives towards the utility and the future of robust, spatially explicit EIs" section, we focus on the utility of spatially explicit inventories and the new challenges in the era of the Paris Agreement. In the "Recommendations for the future global emissions-inventory framework" section, we discuss how to proceed to achieve more robust, more accurate, spatially explicit EIs that can act as a part of a future GHG monitoring framework under the Paris Agreement and beyond.

New challenges for EI in the Paris era
As seen in studies, such as those of Guan et al. (2012) and Liu et al. (2015), following common, established guidelines, such as the IPCC guidelines, does not assure the accuracy of national emissions estimates. Guan et al. (2012) demonstrated a significant difference between two Chinese estimates of total-country emissions based on (1) national-level fuel statistics, which is an analogue of what is reported by the country, and (2) a collection of province-level statistics. Liu et al. (2015) found that the emission factor for coal recommended by the IPCC guidelines is larger than what they measured locally, and they suggested a downward correction to the country total emissions estimate for China. Charkovska et al. (2018) also found a similar systematic bias in the case of Poland. These are examples of biases that might not be always detected in the verification process defined in current inventory frameworks. Such biases, however, do not necessarily prevent us from keeping track of emission changes under frameworks such as the Kyoto Protocol. The emission reductions are measured as a relative change from the base year (typically, 1990). If we could reasonably assume that the errors in the EIs remain at the same level (or biased in the same way) every year, the relative change from the base year and the interannual changes (i.e., estimates of emission reduction effort) should be somewhat robust.
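The robustness of the relative change can be illustrated with a simple case. Assume, purely for illustration, that the reported emissions $\hat{E}_t$ in year $t$ differ from the true emissions $E_t$ by a constant fractional bias $b$ (the symbols are ours and are not part of any inventory guideline). The bias then cancels in the relative change from the base year $t = 0$:

```latex
\hat{E}_t = (1 + b)\,E_t
\quad\Longrightarrow\quad
\frac{\hat{E}_t - \hat{E}_0}{\hat{E}_0}
= \frac{(1 + b)\,(E_t - E_0)}{(1 + b)\,E_0}
= \frac{E_t - E_0}{E_0}
```

The cancellation holds only as long as $b$ is the same in every year; a bias that drifts over time would leak into the estimated change.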
The authors argue, however, that the situation will change under the Paris Agreement, and the two Chinese emission studies provide a good example of the biases that future monitoring systems need to detect. A future monitoring system will not be good enough if it just keeps track of percentage changes in emissions from a single signatory country using reported EIs; it needs to provide accurate estimates of how much GHG each country has emitted to the atmosphere. Accurate emission estimates are needed to clearly define each country's responsibility under a global framework. Under the Paris Agreement, commitments for emission reductions should be made by each country in order to meet the 2 °C global target, or even go below it. Thus, the sum of country emission estimates needs to be consistent with the actual increase or decrease of GHG in the atmosphere and allow us to accurately project future climate. EIs will be used not only for short-term commitments but also for long-term commitments (Jonas et al. 2014). Atmospheric inversions, for which ODIAC was originally conceived, also need accurate FFCO2 estimates in order to obtain robust estimates of past and current natural sinks and to evaluate the future sink capacity in response to changes in environmental conditions. Under the Paris Agreement, EI reporting is mandatory for all signatory countries, including developing countries that were exempt under the Kyoto Protocol. Many developing countries are still in need of economic development and thus are projected to be major GHG emitters in the future. With data collection being a big part of EI compilation, EIs from developing countries with weak infrastructure for data collection and processing are more likely to contribute to uncertainties in global total estimates of FFCO2. Thus, one of the keys to the successful implementation of the global EI framework under the Paris Agreement is to help these countries with less experience to develop robust EIs.
Estimating GHG emissions is, in fact, a challenge not only for developing countries but also for developed countries, when it comes to emission estimates at subnational levels. Emission reduction efforts at subnational levels, such as cities and states, and in the private sector (particularly at large point sources), are not accurately quantified by EIs compiled at the national level. EI compilation at the subnational level introduces a new challenge in data collection to accurately quantify emissions, for example in cities, where the boundaries in data collection are less clear. Studying the sources of uncertainty will become more important over time.

Towards the use of atmospheric measurements
The challenges discussed in the previous subsection suggest the use of atmospheric measurements for examining EIs (for example, a bottom-up vs. top-down comparison). In theory, surface emissions lead to enhanced concentrations in the atmosphere; these emissions can be coupled to an atmospheric model, and the calculated concentrations can then be compared with atmospheric measurements of concentration. Biases in EIs should appear as a deviation of calculated concentrations from the atmospheric measurements. Such a top-down approach for emissions verification has been performed for some non-CO2 GHGs, i.e., nitrous oxide (N2O) and methane (CH4) (Leip et al. 2018). But FFCO2 is a more challenging target for a top-down approach because of the smaller uncertainty associated with it relative to other gases, such as N2O and CH4, and because of the presence of natural sinks and sources of CO2 with large uncertainty. The use of atmospheric measurements from different platforms, such as ground-based stations, aircraft, and satellites, has been considered (see Pacala et al. 2010; Ciais et al. 2015; Pinty et al. 2017), and several studies have demonstrated the feasibility of useful atmospheric measurements (e.g., Vogel et al. 2013; Lauvaux et al. 2016). Table 5 summarizes the current technological status of bottom-up vs. top-down analyses in general and some of their challenges and difficulties. Each type of observation has strengths and weaknesses, and these change depending on the spatial and temporal scales of monitoring (e.g., national inventory vs. large emitters, fossil fuel emissions vs. biomass emissions). Feng et al. (2016b), for example, employed a high-resolution meteorological simulation with a detailed spatially explicit EI to evaluate the ability of the GHG observation network for Los Angeles to monitor the city's emissions.
Globally coordinated carbon observations have been discussed by groups of international space mission partners, such as the Group on Earth Observations (GEO), in support of policy-making through global atmospheric GHG monitoring (e.g., Ciais et al. 2010). With future technological developments and improvements, such extensive observation capabilities should place us in a better position to successfully implement a system that provides verification support for global emissions monitoring (e.g., Ciais et al. 2015).

Table 5 Current technological status of bottom-up (BU) vs. top-down (TD) analyses and some of their challenges and difficulties
National
BU challenges: Poor data quality for some countries, need for expanded data collection, need for estimating the spatial patterns (e.g., disaggregation).
TD status: No established TD system to constrain national emissions; prototype studies have been conducted to demonstrate the concept (e.g., Ciais et al. 2015; Basu et al. 2016; Pinty et al. 2017; Wang et al. 2018).
TD challenges: Need for observational data to constrain emissions in the presence of fundamental difficulties (e.g., biogenic emissions, boundary conditions); cost of observational systems.
Province/state
BU status: Available for some countries, but not required for reporting. We recommend this spatial level as a good meeting point for national EIs and spatial EIs for accurate emissions and verification.
BU challenges: No guidelines specified; data collection will be challenging.
TD status: No established operational systems, but prototype studies have been conducted (e.g., Basu et al. 2016; Fischer et al. 2017).
TD challenges: Same as above.
City
BU status: BU emissions estimates are available for some cities on a research basis; local climate actions require some cities to report EIs.
BU challenges: Data collection, accuracy, consistency with larger area emissions estimates, quality assurance across cities, often no spatial distributions.
TD status: Proven to be technologically feasible with a well-designed in situ observation network (e.g., Lauvaux et al. 2016). Satellites detect signals from cities (e.g., Kort et al. 2012; Schwandner et al. 2017). The estimation methods are being studied (e.g., Ye et al. 2017).

Among the technological developments and improvements towards the establishment of an atmospheric measurement-based emission monitoring system is the continued effort to improve spatially explicit EIs. If one wants to incorporate the reported total emissions from a country into an atmospheric model in order to calculate regional and/or local concentration enhancements in the atmosphere and to compare these enhancements to observations, subnational spatial emission distributions need to be available, and the estimation errors associated with these subnational emissions estimates need to be reasonably small. Emissions from large point sources (LPS), such as power plants (which need to be accurately located), and from cities should be a good target for improving the bottom-up vs. top-down analyses. Emissions from LPS are large enough to be detected even from current satellites (e.g., Pacala et al. 2010; Bovensmann et al. 2010; Nassar et al. 2017), and the detection can be enhanced with the aid of auxiliary data and/or observations (such as 14C, NO2, and CO) for separating fossil fuel and biogenic sources. Recently, Nassar et al. (2017) demonstrated the possibility of detecting emission signals from a space-based instrument such as the OCO-2 satellite. Exploring such an approach is currently limited largely by the availability of data collected in suitable meteorological and geophysical conditions (e.g., Ye et al. 2017). More spatial and temporal data over cities and large point sources would be helpful.
A multi-scale emission modeling approach such as GESAPU will probably be the ideal candidate to serve as a part of the monitoring system for accurately introducing emission information to the global system. Challenges are the workload required to create an EI like GESAPU and that EIs reported to the United Nations Framework Convention on Climate Change (UNFCCC) do not cover the globe, which will be required to prescribe global simulations (e.g., Denier van der . The emissions dataset EDGAR is currently closest to what will be required. EDGAR follows the IPCC guideline to calculate national emissions and it does emission spatial disaggregation in a systematic way. Although the errors and biases in the downscaled spatial emissions in EDGAR have been pointed out (e.g., Gately and Hutyra, 2017), the global systematic disaggregation has considerable value as it allows us to make the spatial emissions traceable. We need to build a framework to support the systematic development of spatially explicit EIs around the current EI framework.

New challenges in the global EI framework
Regardless of what a future EI framework will be, we need to expand our capability of collecting data to improve the accuracy of subnational emissions estimates. As demonstrated in this study, extensive data collection (e.g., GESAPU) will allow us to characterize the biases in downscaled EIs, identify the root causes of biases, and plan a remedy for them. Reducing the uncertainties associated with the terms, such as emission factors and activity data, is important and is directly connected to the current EI system. There is a need to expand the collection of more regionally and/or sectorally varying emission factors.
But how far do we need to expand the data collection? For example, Ciais et al. (2015) suggested a need for global emissions estimates at 1 km2 and hourly scales, an extremely challenging goal. Given finite financial and time resources, we need to come up with a reasonable plan to make the EI system as accurate as possible and help global monitoring. This study has benefited greatly from the data granularity that GESAPU offers and highlights an ideal form of spatially explicit EIs. The multi-resolution EI of Gurney et al. (2012) is thought to be the best effort for quantifying emissions from US cities. But such detailed EIs are extremely labor-intensive, which makes it difficult to cover a large area or to conduct replicate inventories over time. More input data are required, and more data means more labor for quality assurance (QA) and quality checks (QC). Establishing a data collection system or guideline that allows systematic data collection for EI development would greatly reduce the labor, possibly deliver EIs more quickly, and reduce the level of uncertainty associated with the emission estimates. Note that Gurney et al. (2012) and Gately and Hutyra (2017) have extended their emissions over time by scaling the base year or assuming the same total emissions (see https://daac.ornl.gov/CMS/guides/CMS_Carbon_Emissions_NE_US.html).
In this situation, there is a significant benefit in the use of downscaled, disaggregated emissions estimates. Although this study revealed large room for improvement in disaggregated emissions estimates, it is conceptually possible both to improve disaggregation and to move towards emissions estimates at 1 km2 and hourly scales using the ODIAC framework. As demonstrated in this study, if data are available, it is possible to downscale emissions estimates and to calibrate the ODIAC model to a regional model. The use of satellite data, such as nightlight data, has an advantage in its global, timely, and systematic data collection (observation) and coherence. With a wider variety of useful satellite data (e.g., nightlight, land cover, imperviousness) and geospatial statistical data (e.g., road networks, gridded demographic data), we should be able to achieve emission fields that meet the requirements of a future global system. EIs like GESAPU and disaggregated emissions estimates are complementary in the future system, and we need to come up with a complementary combination of them. GESAPU has the unique strength of resolving local human activities while being a large-scale (country-level) model. However, GESAPU is only available for Poland and Ukraine, and we have to stitch together small models and large models to achieve national and then global coverage.
We further propose that emission calculations at an intermediate administrative level, such as state or province, have a good chance of achieving useful emission estimates from the viewpoint of data availability. At this spatial scale, the errors from emission disaggregation should be less severe. Provincial emissions estimates could be used as a regional constraint to reduce disaggregation errors due to the lack of regional differences in carbon emission drivers. This would also provide an additional verification that could be implemented in the current EI framework. As seen in the provincial-level comparison (shown in Fig. 4), the difference between GESAPU and ODIAC was only 4%, regardless of the differences in the calculation methods.
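One simple way to impose such a provincial constraint, sketched here under our own assumptions rather than as an implemented ODIAC feature, is to rescale the gridded emissions within each province so that they sum to an independent provincial estimate. The function name, the integer province mask, and the toy numbers below are hypothetical.

```python
import numpy as np

def apply_provincial_constraint(grid, province_id, provincial_totals):
    """Rescale gridded emissions so each province sums to its prescribed total.

    grid              : 2-D array of disaggregated emissions
    province_id       : 2-D integer array (same shape), province label per cell
    provincial_totals : dict {province label: target total}
    Provinces without a target, or with zero gridded emissions, are left unchanged.
    """
    out = grid.astype(float).copy()
    for pid, target in provincial_totals.items():
        mask = province_id == pid
        current = out[mask].sum()
        if current > 0:
            # Uniform scaling keeps the within-province spatial pattern.
            out[mask] *= target / current
    return out

# Toy example: two provinces (columns 0 and 1) on a 2 x 2 grid.
grid = np.array([[1.0, 3.0], [2.0, 2.0]])
province_id = np.array([[0, 1], [0, 1]])
constrained = apply_provincial_constraint(grid, province_id, {0: 6.0, 1: 10.0})
print(constrained)
```

The scaling changes only the provincial totals; the proxy-based pattern inside each province is preserved, so errors at finer scales are not addressed by this step. If the national total is to be conserved, the provincial targets need to sum to it.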

Recommendations for the future global emissions-inventory framework
Based on the outcome of this study, we offer three recommendations to those who build and those who use inventories of greenhouse gas emissions. These are aimed at using emission inventories for monitoring of emissions agreements and for seeking aggregate global goals for limiting global climate change.

A capacity building for extended data collection for EI development
Because of their nature, EIs depend strongly on the quantity and quality of the data collected. One of the biggest sources of difficulty is that EIs are largely based on repurposed data, that is, data that were collected for other purposes with their own criteria and methods. A coordinated data collection system to support more robust EI compilation is recommended. Information on large point sources is particularly important for disaggregation purposes and for bottom-up vs. top-down exercises. Existing efforts collecting point source emissions, such as US EPA's Emissions and Generation Resource Integrated Database (eGRID, https://www.epa.gov/energy/emissions-generation-resource-integrated-database-egrid), the Global Energy Observatory (http://globalenergyobservatory.org/), and the Global Power Plant Database recently published by the World Resources Institute (http://www.wri.org/publication/global-power-plant-database), are close to what we would suggest, but they lack key variables (e.g., reported emissions, timely updates, accurate geolocations). Some of these efforts are attempting to achieve global coverage, but this has not yet been achieved. Since emissions from the power sector often account for a significant portion of national emissions, this focus on large point sources will contribute to reducing the errors associated with the national and sectoral totals. At the same time, this should help the spatial disaggregation of emissions. The inventory calculation for CO2 needs to be extremely accurate to claim an independent, objective monitoring capability (see Quick, 2014). It also means we need to expand the parameters collected for power plants, such as plant-specific values and operation status. These should put satellite applications in a better position (e.g., Nassar et al. 2017).

A consistency check at the provincial-state intermediate aggregation level
Our study demonstrated the utility of emission estimates at an intermediate level such as province or state. In some countries, population data alone at this level might be useful for consistency checks. The international data collection system should support this effort. State-level data can provide an additional check on EI calculations, potentially add regional constraints, and reduce the uncertainties associated with spatially explicit EIs.
Also, state-level data should help connect the small models and large models (or subsystem and full system). Although disaggregation models are often unable to maintain the linkage between emission estimates and spatial extent, we can use multi-resolution local models, such as Hestia and GESAPU, together with global disaggregated models to achieve a global emission field for use in a future monitoring system. By doing this, the separation of the emission calculation and disaggregation is mitigated. It is important for emission models to support local climate actions, as emission models can reveal emission changes resulting from local climate actions as a part of the global monitoring framework.

Coordinated research efforts towards future emission monitoring system
With more data and more technological developments, we expect to obtain more robust estimates of emissions from top-down and bottom-up exercises at different scales. To examine the optimal use of future observations, simulation experiments (Observing System Simulation Experiments, OSSEs) have been conducted (e.g., Feng et al. 2016b). OSSEs provide a useful tool to design an optimal top-down vs. bottom-up exercise setup, taking observations, emissions estimates, and other environmental conditions into account (e.g., Feng et al. 2016b). To make OSSEs more effective, a key is to feed realistic statistics based on actual experiments to the simulation experiment. For example, OSSEs for cities might have to rely on only several city cases. Thus, it is important to have a coordinated effort of emission inventories, observations, and modeling. Several successful projects such as Jet Propulsion Laboratory's Megacities Carbon Project (Duren and Miller 2012, https://megacities.jpl.nasa.gov/portal/, see Lauvaux et al. 2016)

Conclusions
In this study, we evaluated the global, high-resolution, gridded EI ODIAC using the multi-resolution EI GESAPU over the domain of Poland. We focused on the errors associated with the emissions disaggregation from the national total to 1-km grid cells in ODIAC, accepting GESAPU as the truth. The differences between the two data sets were thus taken as a proxy for the disaggregation error (or uncertainty). The emissions data granularity that GESAPU offers should justify largely attributing the differences to errors in ODIAC. The total emissions of the two EIs for Poland are very close. This study evaluated the ODIAC emission field by emission type (point and non-point) across different scales (national, provincial, and city) and identified the root causes of the errors.
With the verified point source information from GESAPU, we took a close look at point source emissions in ODIAC and investigated the root causes of the disagreement with GESAPU emissions. The agreement between ODIAC and GESAPU in total, point source, and non-point source emissions was fairly good. However, we found the good agreement was partially dependent on consistency between emissions categories and power plant information. Errors in CARMA, a global data set on power plants, turned out to be a significant source of error in ODIAC. CARMA information seems to be reasonable for large emitting facilities, but the inclusion of small facilities brought in difficulty to the ODIAC modeling. The errors included both plant characteristics and the exact geographic locations. Comparisons with the detailed studies in GESAPU provided implications of how we should model power plant emissions in ODIAC.
As previously known, disaggregating non-point sources of emissions, here using nightlight data, to higher spatial resolution also leads to errors. This study showed the degree of goodness of disaggregated emissions estimates at the provincial level and confirmed the excellent performance of the nightlight to be a proxy for subnational, non-point emissions at this scale. We also found that population density is a slightly better predictor at this aggregated level and could potentially serve as an emission constraint for ODIAC.
We also evaluated disaggregation errors at the city level. Although we do not often expect the disaggregated emissions at the city level to be robust, the emission error level for Warsaw was 30%. This is a surprisingly good agreement given what is reported for US cities. The error seems to be dominated by the emission proxy representation errors and future improvements in nightlight data should reduce the errors.
There is a general interest in the uncertainty of emissions estimates as a function of spatial resolution. As expected, the emissions uncertainty can in general be mitigated by aggregating emissions over space. The error level at the native spatial resolution of 1 km can be mitigated by nearly 80% at 100 km (76.8% in this analysis). At 10 km, the error can be mitigated by approximately 50%. The biggest numerical errors are attributed to the uncertainty at large point sources. Also, as suggested by the provincial-level emission comparison, even at the 400-km resolution, the errors in non-point source emissions do not converge to zero.
From the outcome of this study, the authors believe that spatially explicit emissions inventories can be improved through three major initiatives. These will allow emissions inventories to contribute to a future global emission monitoring framework:
1. Capacity building to support extended, systematic data collection for EI development
2. An EI consistency check at the provincial or state intermediate level for robust EIs and for connecting EIs at different spatial levels
3. Coordinated research efforts in EI, atmospheric observations, and modeling towards a future emission monitoring framework
This study was solely about FFCO2, but its implications should apply to other compounds and monitoring frameworks that are based on spatially explicit EIs.
Acknowledgments The ODIAC emission data product is hosted by the National Institute for Environmental Studies (NIES), Japan.
Funding information TO is supported by NASA Carbon Cycle Science program (Grant No. NNX14AM76G).

Supporting information for the ODIAC-GESAPU emission comparison

Table 8 ODIAC and GESAPU provincial non-point source emissions comparison (see Fig. 2 for the provincial locations). The values are given in units of ktC/year. The provincial totals do not sum to the non-point totals presented in Table 2. The differences are defined as ODIAC minus GESAPU. Note the differences in total numbers indicated in

Estimating the geolocation in the nightlight data
We loosely estimated the magnitude of the shift (geolocation error) we found in the nightlight spatial distribution; this shift would appear as a spatial error (bias) in the resulting disaggregated emission fields. Here we estimate a nationwide average geolocation error, simply assuming that the estimate is obtained when the total emissions within the city boundaries (black polygons shown in Fig. 11) are maximized. We iteratively calculated the total emissions while varying (a) the shift distance and (b) the shift angle. The iterative optimization yielded the maximum total emissions at distance = 1.6 km and angle = 27.3 degrees. Thus, the correction needs to be made by shifting the nightlight distribution by 1.6 km towards the south-east (at 27.3 degrees).
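The search described above can be sketched as a brute-force grid search. The example below is an illustration under simplifying assumptions, not the procedure used in this study: shifts are rounded to whole grid cells, np.roll wraps at the domain edges, the angle is measured counterclockwise from east with north towards row 0, and the function names and the synthetic field are hypothetical.

```python
import math
import numpy as np

def shift_grid(field, distance_px, angle_deg):
    """Shift a 2-D grid by whole cells.
    Assumed convention: angle counterclockwise from east, north towards row 0."""
    theta = math.radians(angle_deg)
    dcol = int(round(distance_px * math.cos(theta)))
    drow = -int(round(distance_px * math.sin(theta)))
    # np.roll wraps around the edges, so the field should have empty margins.
    return np.roll(field, shift=(drow, dcol), axis=(0, 1))

def estimate_shift(field, city_mask, distances_px, angles_deg):
    """Grid search for the (distance, angle) that maximizes the field total
    inside the city mask. Returns ((distance, angle), maximized total)."""
    best = (0.0, 0.0)
    best_total = field[city_mask].sum()  # total with no shift applied
    for d in distances_px:
        for a in angles_deg:
            total = shift_grid(field, d, a)[city_mask].sum()
            if total > best_total:
                best, best_total = (d, a), total
    return best, best_total

# Synthetic check: a 3 x 3 bright patch lies 3 cells west of the "city" mask.
field = np.zeros((20, 20))
field[8:11, 8:11] = 1.0
city_mask = np.zeros((20, 20), dtype=bool)
city_mask[8:11, 11:14] = True
print(estimate_shift(field, city_mask, [1, 2, 3, 4], [0, 90, 180, 270]))
```

A shift of 1.6 km is close to the grid spacing, so a realistic application would need sub-cell shifts with interpolation, which this whole-cell sketch does not attempt.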