This research begins by extending the USGS’s estimates (Merrill et al. 2018) through 2019 using the same general emissions accounting methods and emission factors used by USGS (US Environmental Protection Agency (EPA) 2020a) (Fig. 1a and b). We then impute federal production to 2030 and again estimate associated life cycle CO2e emissions.
To estimate historic federal fossil fuel emissions we first aggregate publicly available data on historic coal, oil and gas production from 2005 to 2019 for federal lands and waters, not including American Indian and Tribal lands, from the Office of Natural Resources Revenue (ONRR) (US Office of Natural Resources Revenue (ONRR) 2020). To be consistent with USGS we use 2005 as our start year. We calculate associated life cycle emissions of greenhouse gases (carbon dioxide, methane, and nitrous oxide) from the federal mineral estate (coal, onshore natural gas, offshore natural gas, onshore oil, and offshore oil) using calculation methods and assumptions employed by the EPA Inventory (US Environmental Protection Agency (EPA) 2020a).
To calculate upstream and midstream emissions, we scale down EPA’s national-level, fuel- and segment-specific emissions data (US Environmental Protection Agency (EPA) 2020a) using the ratio of federal production (US Office of Natural Resources Revenue (ONRR) 2020) to the Energy Information Administration’s (EIA) national production (US Energy Information Administration (EIA) 2020a). To calculate downstream emissions, we multiply production volumes (minus the natural gas volume emitted as fugitive methane or burned in the upstream and midstream segments) by annual fuel-specific consumption by end-use sector from EIA’s Monthly Energy Review (US Energy Information Administration (EIA) 2020f), and apply sector-specific emission factors. Sector-specific emission factors are derived by multiplying average annual heat content by fuel type and consuming sector from the EIA (US Energy Information Administration (EIA) 2020f) by EPA’s emission factors by gas or annual carbon content coefficient by fuel type (US Environmental Protection Agency (EPA) 2018, 2020b). To keep our analysis consistent with USGS for the overlapping years 2005–2014, we also employ the EPA Inventory methods, which at the time of this writing use IPCC Fourth Assessment Report (AR4) 100-year global warming potentials (GWP) to convert total emission estimates into a common carbon dioxide equivalent unit. We use a GWP value of 25 for methane and 298 for nitrous oxide (Solomon et al. 2007), which differ from the GWP values recommended in the IPCC’s Sixth Assessment Report published in 2021 (Supp Fig. 1). These estimates may underestimate methane leakage: the EPA revised its methane emissions inventory methodology in 2019 to show a 1.1% leakage rate for the natural gas system, which is below top-down estimates of 2.36% (Alvarez et al. 2018).
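The scaling and unit conversion described above can be illustrated with a minimal Python sketch. The function names and all numeric inputs below are hypothetical and chosen only for illustration; only the AR4 GWP values (25 for methane, 298 for nitrous oxide) come from the text.

```python
# AR4 100-year global warming potentials, as stated in the text.
GWP_AR4 = {"co2": 1.0, "ch4": 25.0, "n2o": 298.0}


def federal_share(federal_production, national_production):
    """Ratio of federal to national production for one fuel and year."""
    if national_production <= 0:
        raise ValueError("national production must be positive")
    return federal_production / national_production


def scale_to_federal(national_emissions, share):
    """Scale national emissions (mass of each gas) by the federal share."""
    return {gas: mass * share for gas, mass in national_emissions.items()}


def to_co2e(emissions_by_gas, gwp=GWP_AR4):
    """Convert per-gas masses into a single CO2-equivalent mass."""
    return sum(mass * gwp[gas] for gas, mass in emissions_by_gas.items())


# Hypothetical inputs (not values from the study).
share = federal_share(federal_production=25.0, national_production=100.0)
national = {"co2": 1000.0, "ch4": 40.0, "n2o": 2.0}  # tonnes of each gas
federal = scale_to_federal(national, share)
print(federal, to_co2e(federal))
```

The same pattern would be repeated for each fuel, segment, and year; the downstream step additionally applies sector-specific emission factors, which this sketch does not cover.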
We employ the same overarching emissions accounting methods used by USGS (based on the methods and emission factors from the EPA Inventory energy chapter). Our accounting differs because of differences in data availability. ONRR continues to adjust historic annual federal production data for up to seven years. This means that in addition to including five years of additional historic production data (2015–2019), we also use updated historic volumes for the overlapping years covered by USGS (2005–2014). We use the annual emission factors from the 2020 EPA Inventory, whereas USGS used the factors employed by the 2016 EPA Inventory.
To estimate future emissions we collect additional data from the EIA. Specifically, the EIA’s Annual Energy Outlook (AEO) provides projections for oil and gas production for offshore federal waters, but not onshore federal lands. To predict future onshore federal production we assemble a panel of variables of interest covering 2005–2030 by harmonizing available historic data from EIA and ONRR with projected data from the AEO. This approach includes reconciling regional categorization discrepancies between the historic and projected data sets. The panel includes consumption, export, import, and regional production data from the Energy Information Administration’s historic data sets (US Energy Information Administration (EIA) 2020b, 2020c, 2020d, 2020e, 2020f) and the 2020 AEO Reference Case Scenario, which runs from 2020 to 2050 (US Energy Information Administration (EIA) 2020g). In total, we use 31 control variables (Supp Fig. 2) to predict three output variables: total coal, oil, and gas production from federal lands and waters.
With our panel of control variables, we predict future federal coal, oil, and gas production to 2030 via a regularized regression technique, synthetic controls with elastic net (SC-EN) introduced by Doudchenko and Imbens (2016). SC-EN uses an elastic net penalty term (a combination of LASSO and ridge penalties) to limit overfitting. This machine learning approach combines the original synthetic controls method, introduced by Abadie et al. (2010), with regularized regression. SC-EN provides a more flexible regression technique than traditional synthetic controls or ordinary least squares. Formally, SC-EN estimates values via:
$$ \hat{Y}_{j,T}(0) = \hat{\mu}^{en}(j;\alpha,\lambda) + \sum_{i}\hat{\omega}^{en}_{i}(j;\alpha,\lambda)\, Y^{obs}_{i,T}. \tag{1} $$
\(\hat{Y}_{j,T}(0)\) is the predicted value of variable \(j\) in period \(T\). α sets the relative weight placed on the LASSO and ridge components of the elastic net estimator and is fixed at 0.5 in our model. λ is the penalty term and is chosen by cross validation (Doudchenko and Imbens 2016). \(\hat{\omega}_{i}\) is the weight on each control observation, \(Y^{obs}_{i,T}\). \(\hat{\mu}\) is the intercept.
Effectively, each pre-period predicted variable is regressed separately on pre-period control variables and penalized by an elastic net operator, thereby providing coefficients for each control variable. These coefficients are applied to the post-period control variables to estimate the post-period predicted variable in each future year.
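To make the two-stage logic concrete, the following is a minimal pure-Python sketch of an elastic net fit by coordinate descent, followed by a post-period prediction in the form of Eq. 1. It is an illustration under stated assumptions, not the implementation used in this study: the objective is scaled in the common `glmnet` style, λ is fixed by hand rather than chosen by cross validation, and the function names and all data values are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the LASSO step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net(controls, target, alpha=0.5, lam=0.1, sweeps=500):
    """Fit an intercept and weights on pre-period data.

    Assumed objective (glmnet-style scaling):
      1/(2n) * sum_t (y_t - mu - sum_i w_i x_ti)^2
      + lam * (alpha * |w|_1 + (1 - alpha)/2 * |w|_2^2)
    """
    n_periods, n_controls = len(controls), len(controls[0])
    weights = [0.0] * n_controls
    intercept = sum(target) / n_periods
    for _ in range(sweeps):
        for j in range(n_controls):
            rho = 0.0    # correlation of control j with the partial residual
            scale = 0.0  # mean squared value of control j
            for t in range(n_periods):
                other = sum(weights[k] * controls[t][k]
                            for k in range(n_controls) if k != j)
                rho += controls[t][j] * (target[t] - intercept - other)
                scale += controls[t][j] ** 2
            rho /= n_periods
            scale /= n_periods
            denom = scale + lam * (1.0 - alpha)
            weights[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0 else 0.0
        # The intercept is unpenalized: mean of the remaining residual.
        intercept = sum(
            target[t] - sum(w * x for w, x in zip(weights, controls[t]))
            for t in range(n_periods)
        ) / n_periods
    return intercept, weights


def predict(intercept, weights, controls):
    """Apply fitted coefficients to control values, as in Eq. 1."""
    return [intercept + sum(w * x for w, x in zip(weights, row)) for row in controls]


# Hypothetical pre-period panel: rows are years, columns are control variables.
pre_controls = [[1.0, 2.0], [1.5, 2.1], [2.0, 2.4], [2.4, 2.2], [3.0, 2.9]]
pre_target = [3.1, 3.8, 4.6, 4.9, 6.2]
post_controls = [[3.2, 3.0], [3.5, 3.1]]  # projected control values

mu, omega = fit_elastic_net(pre_controls, pre_target, alpha=0.5, lam=0.1)
print(predict(mu, omega, post_controls))
```

In practice the controls would typically be standardized before fitting, and λ would be selected by cross validation as described above; both steps are omitted here for brevity.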
We use cross validation to test our predictive accuracy. Here, we randomly select control variables with replacement and re-estimate held-out variables in each bootstrap run. We focus our performance test on three similar control variables supplied in the EIA data (total US coal, oil, and gas production) and re-predict their AEO projected values from 2020 to 2030. In 100 bootstrap samples per fuel type, we find an absolute mean difference between observed and imputed values of 0.86%, 1.55%, and 1.23% for coal, oil, and gas, respectively.
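The resampling and scoring steps can be sketched as follows. This is only an illustration: the helper names are hypothetical, the data are invented, and the accuracy metric is assumed to be a mean of absolute percentage differences, which is one plausible reading of the statistic reported above.

```python
import random


def resample_controls(control_names, rng):
    """Draw control variables with replacement for one bootstrap run."""
    return rng.choices(control_names, k=len(control_names))


def mean_abs_pct_diff(observed, imputed):
    """Mean absolute difference between imputed and observed values, in percent."""
    if len(observed) != len(imputed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(i - o) / abs(o) for o, i in zip(observed, imputed))
    return 100.0 * total / len(observed)


rng = random.Random(0)  # fixed seed so the draw is reproducible
controls = ["control_a", "control_b", "control_c", "control_d"]  # placeholder names
print(resample_controls(controls, rng))

# Hypothetical held-out values and their re-predictions.
print(mean_abs_pct_diff([100.0, 200.0], [101.0, 198.0]))
```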
Applying the same bootstrap approach with randomly selected controls, we impute our three variables of interest (federal coal, federal oil, and federal gas) and record the mean of 100 model runs as our reference case. We then subtract EIA’s federal offshore oil and gas projections from our predicted total federal oil and gas production estimates to obtain projected onshore by-fuel estimates from 2020 to 2030 (Fig. 1). We report low and high cases by averaging the second and third lowest and the second and third highest bootstrap outputs per year and per fuel type. Finally, we calculate CO2e emissions from our production figures to obtain a future profile of CO2e produced on federal lands and waters.
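A short sketch of how the reference, low, and high cases could be computed from a set of bootstrap outputs for one fuel and one year, and how onshore production follows by subtraction. The function names and numbers are hypothetical; only the selection rule (mean of all runs; mean of the second and third lowest; mean of the second and third highest) is taken from the text.

```python
from statistics import mean


def summarize_runs(runs):
    """Reference, low, and high cases from bootstrap outputs for one fuel-year."""
    if len(runs) < 6:
        raise ValueError("need at least 6 runs so low and high cases do not overlap")
    ordered = sorted(runs)
    return {
        "reference": mean(ordered),
        "low": mean(ordered[1:3]),     # second and third lowest
        "high": mean(ordered[-3:-1]),  # third and second highest
    }


def onshore_production(total_federal, offshore_federal):
    """Onshore federal production as total federal minus federal offshore."""
    return total_federal - offshore_federal


# Hypothetical bootstrap outputs (the study uses 100 runs per fuel and year).
runs = [10, 1, 5, 7, 3, 9, 2, 8]
cases = summarize_runs(runs)
print(cases)
print(onshore_production(cases["reference"], offshore_federal=2.0))
```

Discarding the single lowest and highest run before averaging makes the low and high cases less sensitive to one extreme bootstrap draw.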
We calculate the social cost of emissions for years 2005–2019 using the average annual per-ton dollar value of CO2e emissions established by the Interagency Working Group (IWG) on Social Cost of Greenhouse Gases under Executive Order 12866 (Interagency Working Group on the Social Cost of Greenhouse Gases 2016). For 2020–2030 we calculate the social cost of carbon, social cost of methane, and social cost of nitrous oxide using the annual values reported in the interim IWG 2021 Technical Support Document (Interagency Working Group on the Social Cost of Greenhouse Gases 2021a, 2021b; Supp Fig. 3). We use the IWG annual global values and a central 3% average discount rate to account for the cost of climate impacts to future generations. The 2005–2019 values are adjusted for inflation to 2020 dollars.
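The arithmetic behind these social cost figures can be sketched as below. The per-ton dollar values, emission masses, and inflation factor are placeholders invented for illustration, not IWG values, and the function name is hypothetical.

```python
def social_cost_usd(emissions_by_gas, cost_per_ton_by_gas, inflation_factor=1.0):
    """Sum of mass times per-ton cost over gases, scaled to a target dollar year.

    inflation_factor converts the cost's dollar year to 2020 dollars
    (1.0 when the values are already in 2020 dollars).
    """
    total = sum(
        mass * cost_per_ton_by_gas[gas] for gas, mass in emissions_by_gas.items()
    )
    return total * inflation_factor


# 2005-2019 style: one CO2e quantity and one CO2e value (placeholder numbers).
historic = social_cost_usd({"co2e": 1000.0}, {"co2e": 40.0}, inflation_factor=1.1)

# 2020-2030 style: gas-specific values (placeholder numbers).
projected = social_cost_usd(
    {"co2": 100.0, "ch4": 2.0, "n2o": 0.1},
    {"co2": 50.0, "ch4": 1500.0, "n2o": 18000.0},
)
print(historic, projected)
```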