Introduction

The rapid development of unconventional tight oil (Footnote 1) resources (commonly referred to as “shale oil”) in the USA has had a significant impact on the pricing and availability of global oil supplies. The abundance of the resource surprised many in the petroleum industry (Sanderson et al., 2019), and it has once again turned the USA into an energy powerhouse. Plentiful shale oil arguably provided the impetus for repealing the 1975 oil export ban in December 2015, and in October 2019 the USA became a net oil exporter, however briefly (EIA, n.d.a). The country’s newfound source of oil supply may have been responsible, at least in part, for a global oil price decline experienced after 2014.

The development of shale oil resources in the last decade has raised a series of interesting questions centered on the long-term viability of shale oil production, including the following:

  • Rapidly falling oil prices in 2015 and 2016 caused drilling and well completion to drop precipitously; and yet, surprisingly, production held firm during this timeframe, despite the faster rate of production decline in shale oil wells compared to conventional oil wells. How can this be explained?

  • As a “marginal resource,” how important has shale oil become in balancing global oil demand and supply?

  • The decline in price, attributed to the shale boom by observers like Hanewald (2017), prompted them to declare the OPEC cartel moribund. How much, in fact, has the shale boom reduced OPEC’s market power?

  • In the short- and medium-term, marginal costs for North American shale plays exceed those for conventional oil in the Persian Gulf. How vulnerable are shale oil producers to volatile pricing and supply decisions by OPEC+? (Footnote 2)

  • Although production in shale oil wells declines rapidly in the first few months, it also has a “long tail.” The vast majority of shale oil wells continue to produce at some level for many years. How will this impact the future oil market?

  • Any natural resource has an earth-bound limit, and yet shale oil exhibits characteristics of unboundedness. Will “peak oil” again emerge as a concern, and if so, when (Ngai et al., 2019)? Put another way, is shale oil a passing fad or a permanent reset of the oil market?

While we do not pretend to have full and complete answers to all of these questions, we present an in-depth review of sample data pertaining to the Bakken Shale oil play in North Dakota, one of the major shale provinces in the USA. Our research provides some insight into the productivity and price sensitivity of well development within the state and, by inference, the resource in general.

Background Information

The “Bakken Shale” is a term commonly used to identify an expansive rock formation that lies beneath Northeastern Montana, Western North Dakota, and Northwestern South Dakota, as well as Southeastern Saskatchewan and Southwestern Manitoba (Fig. 1). Historically, the region has been known as the “Williston Basin.” The Bakken Shale is a world-class, layered petroleum system spanning three different strata or formations (listed in ascending order): the Devonian Three Forks Formation; the Upper Devonian to Lower Mississippian Bakken Formation; and the lower part of the Mississippian Lodgepole Formation (Gaswirth et al., 2013; Gaswirth & Marra, 2015). The Bakken Formation consists of four distinguishable members or subunits, again listed from deepest to shallowest: Pronghorn Member (sometimes called the “Sanish Sand”); Lower Bakken Shale; Middle Bakken Member (which is silty and dolomitic); and Upper Bakken Shale (Theloy & Sonnenberg, 2013). The Upper Bakken Shale is the most geographically expansive of the four units, and the Pronghorn Member is the least. The Pronghorn has been found to be in fluid communication with the underlying Three Forks formation (and is sometimes regarded as part of that formation), which is similarly divided into upper and lower members (Gaswirth et al., 2013). The Middle Bakken, Lower Lodgepole, and Upper Three Forks comprise the primary reservoirs (NETL, 2011), with the upper and lower shale members being the main source rocks (Gaswirth et al., 2013). These various formations are commonly lumped together as the “Bakken Shale.” The Bakken Shale is largely a technology-driven play (Theloy & Sonnenberg, 2013), with progressive advances in horizontal drilling and hydraulic fracturing, along with other factors, thought to be principally responsible for wells with greater initial productivity being drilled over time.

Figure 1

Map illustrating the approximate extent of the Williston Basin (beige) and the Bakken Formation (tan). Originally published by the Spokane Spokesman-Review and reproduced with permission. The approximate boundaries of the Fort Berthold Reservation are added

Production from the Bakken Shale is centered in North Dakota (in the area of the Fort Berthold Reservation) where the Middle Bakken and Three Forks formations represent the thickest part of the basin. Annual oil production in North Dakota increased from 123 thousand barrels (Footnote 3) per day in 2007 to 1.5 million barrels per day at the end of 2019 (North Dakota State Industrial Commission [NDIC], n.d.a), a more than ten-fold increase. The state provides publicly available data on the number and types of oil wells, producers/operators, well locations, monthly production, and other pertinent information (NDIC, n.d.b). As of 2019, North Dakota listed 17,220 producing oil wells, 97% of which were horizontal hydraulically fractured wells (NDIC, n.d.c). The modeling effort described in this paper focuses on 4360 such wells that were completed between March 2015 and May 2019 in the Middle Bakken and Three Forks members of the Bakken Formation.

Participation by industry players in development of the Bakken Shale has been remarkably diverse. In our sample data set, there were 40 producers, the vast majority of which were small, independent oil companies. Over time, Continental Resources has been the largest company operating in the basin, but it produced only 11.4% of the crude oil considered here.

Infrastructure shortages have persistently plagued the Williston Basin region. Until recently, pipeline takeaway capacity out of the region was limited, leaving much of the interstate transportation to more costly rail (Footnote 4). In addition, because monthly well production varies widely, traditional gathering pipelines can be somewhat impractical and uneconomic to construct, and movement of crude oil away from individual wells is often relegated to trucking.

The diverse and constrained infrastructure situation has had an impact on pricing, in addition to production operations. During periods of rapid growth (2012–2014), North Dakota “first purchase” oil prices trailed West Texas Intermediate (WTI) prices by as much as $20 per barrel (EIA, n.d.a), or 19%, even as WTI itself was discounted against the price of Brent crude oil by 15% in the international market. In 2018 and 2019, the discount below WTI averaged $4.34 per barrel (EIA, n.d.a), or 7%. Generally, the high cost of moving Bakken oil to market explains the differential, but companies with hedged risk to congestion on interstate pipelines may obtain a much higher netback.

Description of Sample Data

Table 1 provides an overview of horizontal wells in North Dakota and the sample we extracted to analyze changing well productivity. Our sample was drawn in two steps. First, North Dakota’s horizontal well data were extracted from NDIC’s webpage (NDIC, n.d.b), which identifies six producing zones: Middle Bakken, Three Forks, Middle Bakken/Three Forks, Lodgepole, Upper Bakken Shale, and Lodgepole/Middle Bakken (Footnote 5). In our sample period, however, only the first three were relevant. Next, well production data for each month from May 2015 to May 2019 were merged with the set of North Dakota’s horizontal wells completed beginning March 2015 (NDIC, n.d.c). This action reduced the number of wells to be analyzed by about two-thirds.

Table 1 Historical completion and production activity in the Bakken Formation within North Dakota

Table 1 summarizes historical completion and production activity in the Bakken Shale within North Dakota. Note that wells completed in 2015 averaged 1926 barrels per month in May 2019. In contrast, wells completed in the first 5 months of 2019 averaged 21,643 barrels per month in May 2019. This gap reflects a combination of natural decline over 4 years for the wells completed in 2015, productivity increases, and the randomness of the data. Our econometric analysis quantifies these effects separately.

Response to a Drop in Oil Prices

Until recently, oil prices remained on the low side after 2013 (and troughed in early 2020). Lower prices had a direct impact on drilling and the number of wells completed, even though there is a time lag between drilling and well completion. According to the NDIC, new horizontal wells in the Bakken Formation declined from 2273 in 2014 to 738 in 2016, a drop of 68% (Footnote 6). Figure 2 illustrates the relationship between oil prices and the number of North Dakota horizontal wells completed from 2005 to 2019. Additionally, Figure 2 includes an index of annual oil production. For comparability, well completion and production data are presented as indexes, with those for 2018 equal to 100.
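The percentage drop and the indexing convention described above can be checked with a short sketch. It uses only the two completion counts quoted in the text; because the 2018 count is not quoted here, the example rebases on 2014 purely to show the mechanics (an assumption for illustration, not the base year used in Figure 2). The function names are hypothetical.

```python
def percent_change(old, new):
    """Percentage change from old to new (negative means a decline)."""
    return (new - old) / old * 100.0


def to_index(values_by_year, base_year):
    """Rescale a series so that the chosen base year equals 100."""
    base = values_by_year[base_year]
    return {year: value / base * 100.0 for year, value in values_by_year.items()}


# New horizontal Bakken wells in North Dakota, as quoted in the text (NDIC).
completions = {2014: 2273, 2016: 738}

drop = percent_change(completions[2014], completions[2016])

# Assumption: 2014 is the base year here only because the 2018 count
# (the base used in Figure 2) is not quoted in the text.
indexed = to_index(completions, base_year=2014)

print(f"Change in completions, 2014 to 2016: {drop:.1f}%")
print(f"2016 completions with 2014 = 100: {indexed[2016]:.1f}")
```

The computed change rounds to the 68% decline stated in the text.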

Figure 2

Horizontal wells completed in North Dakota compared to Bakken oil production and first purchase price; 1 bbl ≈ 159 L; sources are EIA (n.d.b.), NDIC (n.d.a.), and FRED (n.d.)

It is evident in Figure 2 that the precipitous drop in well completions tracked the collapse of the oil price in 2014, but production did not. Although the state’s horizontal well completions fell 68% between 2014 and 2016, oil production from horizontal wells fell only 4%. The industry was rocked again in 2020 following the swift drop in oil prices due to Covid-19, and this time production did not snap back. North Dakota’s July 2020 production was 30% lower than in December 2019. Perhaps more significantly, as of July 2020, only two North Dakota horizontal wells from the Bakken had been completed since 2019.

Technological Gain

The most extraordinary feature of the 4-year period was a significant increase in well productivity. Without controlling for other variables, production from March 2019 well completions was initially almost four times higher than from wells completed in March 2015. This reflects what we interpret to be a “technological gain” (sometimes referred to as an “experiential gain”) in productivity. (Our econometric modeling of oil supply in the North Dakota Bakken Shale suggests that, over 49 months, productivity nearly doubled, rather than quadrupling, but that analysis controls for location, prices of oil and oil and gas equity, and other factors.) Various drilling, completion, and operating improvements achieved in recent years have all contributed to this technological shift or enhancement, including overall improved knowledge of the resource, a longer drilling range, better fracking techniques, improved chemicals and drilling fluids, and the like. Additionally, until the Covid-19 crisis in 2020, small oil companies still had reasonable access to financing.

Over time, the development of shale oil has sometimes proceeded by trial and error (at least, in the early stages), with successes quickly mimicked and failures left behind. In this respect, technological progress is similar to that of other industries. As noted by Killefer (1948), “Inventions do not spring up perfect and ready for use … One seldom knows who the real father is. The period of gestation is long with many false pains and strange forebirths.”

The idea of technological gain associated with drilling and development of unconventional oil and gas wells is of considerable interest and has been addressed elsewhere in the petroleum literature (Covert, 2015; Fitzgerald, 2015; Montgomery & O’Sullivan, 2017). However, to date, no definitive metric or indicator has been established. For purposes of the present study, we use a simple function of time to account for technological progress, since time can encompass many individual and often unidentified contributors. For example, it can serve as a proxy for changes in the mix of fracking fluids, flexible pipelines that can be reused, and a variety of other innovations that are particularly well-suited for shale oil development. Assuming the other variables in a model represent the factors of production well, a time-trend as a proxy for technological change is similar to the “Solow (1956) residual,” which measures technological progress in his classic model of economic growth (Footnote 7).

Further, in some cases, new technologies may have a limited impact by themselves, but may lead to unanticipated synergies over time when combined with other innovations. Indeed, in the case of shale oil development, the combination of two technologies—horizontal drilling and hydraulic fracturing—has resulted in much greater productivity than likely could have been attributed to both technologies used separately (EIA, 2018).

In the Bakken Shale, technological gain is also confounded with attempts by operators to accommodate the naturally occurring variation in geology, petrophysics, and reservoir response. Several authors (Denney, 2011; Pilcher et al., 2011; LaFollette, 2013; Sanderson et al., 2019; Attanasi & Freeman, 2020; Attanasi et al., 2020a, 2020b) address the physical drivers of well productivity in shale oil plays.

We also note in passing that oil and natural gas are produced in conjunction with each other (production complements), and that progressive increases in the productivity of natural gas plays, including the Bakken (EIA, 2020), have also been observed.

Long Tail of Production

The rapid decline in production from a horizontal hydraulically fractured shale oil well does not continue indefinitely, and, as in the case of conventional wells, horizontal Bakken Shale oil wells can have substantial longevity (Footnote 8). Further, only a few years after completion, they also have relatively lower production rates than most conventional wells. Among other factors, the change in flow rate is the consequence of the natural progression of a horizontal shale well’s production from transient flow to boundary-dominated flow (or some other flow regime), as described in, among others, Kabir et al. (2011), Zhou et al. (2017), Zhang et al. (2016a, 2016b), Luo et al. (2018), Yang et al. (2019) and Attanasi et al. (2019). The full history of horizontal wells drilled in the Bakken Shale and summarized in Table 1 demonstrates this revealing characteristic. As the table illustrates, the decade from 1986 to 1995 appears to have been primarily a period of experimentation. By the end of the period, horizontal wells were only 8.6% of North Dakota’s total production. Low oil prices effectively ended the experiment until 2004.

Figure 3 compares average monthly production from all horizontal wells in the Bakken and Three Forks formations completed in March 2015, 2016, 2017, and 2018. (Insufficient observations, ending in May, would make wells completed in March 2019 uninteresting.) The chart illustrates the initial rapid decline followed by the long tail of production. Although the figure demonstrates the steep decline rate, there remains a great deal of randomness, even on average. Note that the early months of production in 2016 showed considerable improvement compared to 2015, but average levels converged after the sixth month of production. In contrast, there appeared to be improvement in every month in 2017 and 2018 (average rates in later months were not yet available).

Figure 3

Monthly average production from wells completed in March 2015, 2016, 2017, and 2018. “bbl” stands for “barrels,” each of which contains nearly 159 L of crude oil

Shale Oil as a Balance Wheel

Easy-to-find conventional oil fields have long since disappeared in North America. As the industry turned to offshore and more difficult environments, lead times increased. For example, Prudhoe Bay was discovered in 1968, but the oil did not come to market until 1977, following the completion of the Alyeska Pipeline. In the early decades of the American industry, rapid conventional oil production expansion in the USA led to pro-rationing, but economic theory tells us that the first opportunities exploited in producing a good will be the easiest. Prudhoe Bay was lower on the list in terms of ease of exploration, development, production, and transportation, and the fact that it was exploited decades ago underscores the point that the easy opportunities for conventional production in North America are gone. Currently, at least, the time needed to develop shale oil resources is much shorter.

Moreover, production flow from conventional wells can last for decades, while, as noted previously, production from tight oil wells declines rapidly in the first few months—more rapidly than at conventional wells. North Slope oil production peaked in 1989, over a decade after production began, and has declined at a slow rate since then. Nearly half of a horizontal, hydraulically fractured shale well’s total production will be produced in the first 12 months. In this sense, shale oil development and drilling are more like a standard manufacturing process that can be explained by the Marshallian partial equilibrium model of a firm (Newell et al., 2016; Mandy, 2017). In this competitive model, flexible production, along with rapid entry and exit, makes long-term prices more predictable. Previously, prices have peaked in periods of excess demand, or disruption of supply, and then fallen back for decades. Prices still fluctuate, but, in the presence of tight oil, the variations have been smaller since 2014 than they have been since the turn of the century, and smaller than during the 1970s and early 1980s (Fig. 4).

Figure 4

Monthly average world price and US tight oil production. “bbl” stands for “barrels,” each of which contains nearly 159 L of crude oil; “MMb/d” stands for “million barrels per day.” Source: EIA (2020) and EIA (n.d.c)

Given the long lead times and massive capital investment needed to develop most conventional oil fields, prices match long-term marginal costs only by accident. In contrast, shale oil’s lead time is typically 1–3 years, making it much more responsive to changing prices, and keeping long-run marginal cost closer to price.

The Organization of Petroleum Exporting Countries (OPEC) depends on a swing producer or producers to optimize prices. Given pressure from other oil exporters, however, the cartel’s largest producer, Saudi Arabia, has found this role increasingly difficult and resists going it alone. In most circumstances, the margin of oil supply that balances the market is thought to be small, on the order of three to five million barrels per day or less, out of global demand in 2019 of just over 100 million barrels per day. (The massive oil demand drop due to Covid-19 obviously shifted this equation.) In contrast, tight oil production in the USA alone reached 8.2 million barrels per day in early 2020 and in a normal year would have continued as the margin of production. If drilling and well development were to completely stop, this figure could drop by half in a little over a year. Likewise, given adequate incentives, production could ramp up quickly. For example, total North Dakota oil production, the majority of which comes from tight oil resources, increased by 18% in 2018 and 12% in 2019 in the face of relatively modest oil prices.
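As a rough illustration of the scale involved, the sketch below backs out the constant monthly decline rate that would halve the 8.2 million barrels per day quoted above. Two things here are assumptions rather than statements from the text: "a little over a year" is taken to mean 13 months, and the decline is treated as a constant percentage per month, which is a simplification of actual shale decline behavior. The function names are hypothetical.

```python
def implied_monthly_decline(fraction_remaining, months):
    """Constant monthly decline rate that leaves `fraction_remaining` after `months`."""
    return 1.0 - fraction_remaining ** (1.0 / months)


def production_path(start, monthly_decline, months):
    """Production level in each month under a constant percentage decline."""
    return [start * (1.0 - monthly_decline) ** m for m in range(months + 1)]


TIGHT_OIL_MMBD = 8.2   # US tight oil production in early 2020, from the text
HALVING_MONTHS = 13    # assumption: one reading of "a little over a year"

rate = implied_monthly_decline(0.5, HALVING_MONTHS)
path = production_path(TIGHT_OIL_MMBD, rate, HALVING_MONTHS)
lost = TIGHT_OIL_MMBD - path[-1]

print(f"Implied decline: {rate:.1%} per month")
print(f"Supply removed after {HALVING_MONTHS} months: {lost:.1f} million barrels per day")
```

Under these assumptions, the volume removed is of the same order as the three to five million barrels per day balancing margin mentioned in the text.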

The ability of shale oil to impact markets can be measured by its price elasticity. Based on our econometric modeling of oil supply in the North Dakota Bakken Shale, along with comparisons to prior work, we conclude that the short-run supply price elasticity of shale oil is higher than that of other non-OPEC sources, and that the resource can, indeed, act as a balance wheel.

Econometric Modeling of Oil Supply in the North Dakota Bakken Shale

Sample Data

Table 2 describes the sample database on which the econometric analysis was performed. In total, we analyzed 4360 wells completed between March 2015 and May 2019 in the Bakken and Three Forks formations. Production from the wells averaged 3442 barrels per month, but the standard deviation was twice that level, reflecting the data’s volatility. We excluded observations of positive production below 100 barrels, as those data largely appeared to be erroneous. Since the number of new wells dropped off in 2016 and 2017 before rising in 2018, the average time in production was just under 10 months, and slightly less from the month of peak production. Average well production in the peak month was 19,710 barrels, again with a huge range, from 1 to 139,068 barrels per month. On average, there were two wells drilled in each quarter–quarter section (Public Land Use Survey System [PLSS], USGS (n.d.)), but one quarter–quarter section had 21 wells. North Dakota’s first purchase price averaged around $49 per barrel, but had a low of $24 and a high of $68, again reflecting the volatile market, and giving us rich variation without outliers for the estimation of price elasticity. Likewise, the value of the Standard & Poor’s Oil & Gas Exploration and Production Energy Trading Fund, which we used to measure access to capital, fluctuated widely, from a low of $25.17 to a high of $52.43. As noted, much of the drilling was located in or near the Fort Berthold Reservation (Fig. 1).

Table 2 Summary of sample data by well and month; May 2015–May 2019

General Modeling Approach

We used a panel approach (Diggle et al., 2002; Hsiao, 2003; Frees, 2004; Baltagi, 2008) to estimate physical and economic relationships among the data, where the cross-sectional unit was the well, of which there were 4360, and the temporal unit was the month, of which there were 49. For this wide-short panel arrangement, we followed the recommendations of Kennedy (2008), employing pooled ordinary least squares (OLS), fixed effects, or random effects estimation for “a wide, short panel, in which N, the number of cross-sectional units, is large, and T, the number of time periods, is small.” The methodology proceeds by testing the null hypothesis, \(H_0\), of equal intercepts for all cross-sectional units, and pooling the data if the hypothesis is not rejected (Kennedy, 2008). A Hausman test (Hausman, 1978; Torres-Reyna, 2007) is then normally used to compare fixed and random effects, because the random effects estimator is unbiased only if the regressors are independent of the random effects. However, fixed effects cannot be estimated when any of the regressors are time-invariant, and we use several transformations of latitude and longitude as regressors.
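The idea of testing for equal intercepts can be sketched on synthetic data. The example below computes one common version of the statistic, an F ratio comparing a pooled fit with a fit that gives every unit its own intercept. The text does not say which test statistic was used in the paper, so this is an illustration of the general idea and not a reproduction of the authors' procedure; the data, the single regressor, and all function names are hypothetical.

```python
import random


def pooled_ssr(x, y):
    """Sum of squared residuals from OLS of y on x with one common intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))


def within_ssr(panel):
    """SSR when each unit has its own intercept (data demeaned within each unit)."""
    xd, yd = [], []
    for x, y in panel:
        mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
        xd += [a - mean_x for a in x]
        yd += [b - mean_y for b in y]
    slope = sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)
    return sum((b - slope * a) ** 2 for a, b in zip(xd, yd))


def equal_intercept_f(panel):
    """F statistic for H0: all units share the same intercept (one slope regressor)."""
    n_units = len(panel)
    n_obs = sum(len(x) for x, _ in panel)
    all_x = [a for x, _ in panel for a in x]
    all_y = [b for _, y in panel for b in y]
    ssr_restricted = pooled_ssr(all_x, all_y)
    ssr_unrestricted = within_ssr(panel)
    df_num = n_units - 1
    df_den = n_obs - n_units - 1
    return ((ssr_restricted - ssr_unrestricted) / df_num) / (ssr_unrestricted / df_den)


def make_panel(intercepts, n_months=12, slope=-0.4, noise=0.1, seed=0):
    """Synthetic (hypothetical) panel: one linear series per unit."""
    rng = random.Random(seed)
    panel = []
    for a in intercepts:
        x = [float(t) for t in range(1, n_months + 1)]
        y = [a + slope * t + rng.gauss(0.0, noise) for t in x]
        panel.append((x, y))
    return panel


f_different = equal_intercept_f(make_panel([8.0, 9.0, 10.0, 11.0]))
f_same = equal_intercept_f(make_panel([9.5, 9.5, 9.5, 9.5]))
print(f"F with unit-specific intercepts: {f_different:.1f}")
print(f"F with a common intercept:       {f_same:.2f}")
```

A large statistic argues against pooling, while a small one is consistent with a common intercept. These are the two outcomes the methodology distinguishes.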

Using this general approach, we developed two different equations, or modeling scenarios. In the first case (Model 1), we assumed continuous production; i.e., months of no production were omitted. In this case, we rejected \(H_0\) and concluded that the intercepts are not equal. We took the random effects estimator to be unbiased because we could not perform the Hausman test against fixed effects with the time-invariant regressors.

In the second case (Model 2), we allowed gaps in production for either technical or economic reasons, and counted months prior to completion as observations of zero production (to account for the impact of regressors such as price on the timing of completion). In this case, we did not reject \(H_0\), which led us to pool the data (Kennedy, 2008). Note that random effects were also considered proxies for omitted variables (regressors), and finding them to be superfluous suggests that Model 2 was well specified.

Model 1: Continuous Production

“Continuous production” is defined here as uninterrupted flow and excludes breaks or pauses in production encompassing an entire month, as well as all other occurrences that can be regarded as zero production. This means that we discarded observations of zero production for an entire month, both before and after a well is completed, from our sample. (Such observations are restored and discussed in the description of Model 2: Discontinuous Production.) Partial months of production were included in the sample. This simplified the task of estimating the rate at which production declines over the life of a well.

Estimation of Model 1

Using the data summarized in Table 2, and excluding observations of zero production, we used the software Stata® (StataCorp, n.d.) to estimate an equation with a different randomly distributed intercept for each well, having rejected a null hypothesis that the intercept is the same for each well, and being unable to use a fixed well effect because the transformations of latitude and longitude that we include as regressors do not vary over time. The random effects reduced any omitted variable bias. For example, wells may differ in terms of the number and durations of shut-ins.

Results for Model 1

The following equation was estimated:

$$\begin{aligned} q_{it} & = \mathop {53.7654}\limits_{4.7054} - \mathop {0.0037}\limits_{0.0003} M_{it} - \mathop {0.4472}\limits_{0.0046} m_{it} + \mathop {0.2555}\limits_{0.0052} q_{it - 1} - \mathop {0.0165}\limits_{0.0018} W_{it - 1} \\ & \quad + \mathop {0.0090}\limits_{0.0003} I_{i} - \mathop {0.0211}\limits_{0.0067} e^{{{Lat}_{i} }} - \mathop {173.4584}\limits_{19.2171} e^{{{Long}_{i} }} - \mathop {2.7164}\limits_{0.4644} {Lat}_{i} - \mathop {13.9810}\limits_{1.9704} {Long}_{i} \\ & \quad + \mathop {4.4193}\limits_{1.1299} \ln \,{Lat}_{i} - \mathop {73.0228}\limits_{8.7841} \ln \left( { - {Long}_{i} } \right) - \mathop {0.4665}\limits_{0.0444} {Lat}_{i} \cdot {Long}_{i} \\ & \quad + \mathop {0.0598}\limits_{0.0179} {dp}_{t} + \mathop {0.1847}\limits_{0.0173} {dp}_{t - 3} + \mathop {0.0006}\limits_{0.0003} F_{t - 1} \\ \end{aligned}$$
(1)

where \(q_{it}\) is log production at well \(i\) in month \(t\) in bbl (Footnote 9), \(M_{it}\) is months in production starting with peak at well \(i\) in month \(t\), \(m_{it}\) is the natural log of \(M_{it}\), \(W_{it}\) is the number of producing wells in the PLSS quarter–quarter section at well \(i\) as of month \(t\), \(I_{i}\) is the initial month of production at well \(i\), \({Long}_{i}\) is the longitude of well \(i\) plus 100°, \({Lat}_{i}\) is the latitude of well \(i\) minus 45°, \({dp}_{t}\) is the first difference in the log of the North Dakota first purchase price in month \(t\) in $/bbl, adjusted for inflation using the Consumer Price Index for All Urban Consumers, and \(F_{t}\) is the share price of Standard & Poor’s Oil & Gas Energy Trading Fund in month \(t\), also adjusted for inflation using the Consumer Price Index for All Urban Consumers.
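To make the price terms concrete, the sketch below builds the first difference of the log real price and applies the two price coefficients reported in Eq. 1. The price and CPI series are hypothetical placeholders, and the function names are illustrative. The result is only the direct contribution of the price terms to log production, before the lagged-production term carries it forward.

```python
import math

B_DP = 0.0598        # coefficient on dp_t in Eq. 1
B_DP_LAG3 = 0.1847   # coefficient on dp_{t-3} in Eq. 1


def real_log_price_changes(nominal_prices, cpi):
    """First differences of the log of the CPI-deflated price.

    Element k is the change from month k to month k + 1.
    """
    real = [p / c for p, c in zip(nominal_prices, cpi)]
    return [math.log(later / earlier) for earlier, later in zip(real, real[1:])]


def price_contribution(dp, k):
    """Direct contribution of the price terms to log production at position k."""
    if k < 3:
        raise ValueError("need three earlier price changes for the lagged term")
    return B_DP * dp[k] + B_DP_LAG3 * dp[k - 3]


# Hypothetical series: a flat $50 price that rises 10% once, with a flat CPI.
nominal = [50.0, 50.0, 50.0, 50.0, 50.0, 55.0, 55.0, 55.0, 55.0]
cpi = [100.0] * len(nominal)

dp = real_log_price_changes(nominal, cpi)
same_month = price_contribution(dp, 4)     # month of the price rise
three_later = price_contribution(dp, 7)    # three months after the rise

print(f"Contribution in the month of the rise: {same_month:.4f}")
print(f"Contribution three months later:       {three_later:.4f}")
```

Because both production and price enter in logs, each contribution reads approximately as a proportional change in production.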

The logarithm of production was used as the dependent variable, so predicted production can never be negative. Both levels and logarithms of months in production since peak had negative coefficients, and the functional flexibility afforded by including both transformations of the variable allowed for the characteristic rapid initial decline rates of shale wells.
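The combined effect of the level and log terms can be illustrated with the coefficients from Eq. 1. The sketch below compares production in month 12 after peak with production in the peak month, holding every other regressor fixed. Two simplifications are assumptions of this example and not the paper's procedure: the peak month is numbered 1, and persistence is handled with the steady-state multiplier \(1/(1 - 0.2555)\) instead of a month-by-month simulation of the lagged term.

```python
import math

B_MONTHS = -0.0037       # coefficient on M_it in Eq. 1
B_LOG_MONTHS = -0.4472   # coefficient on m_it = ln(M_it) in Eq. 1
B_LAG = 0.2555           # coefficient on lagged log production in Eq. 1


def time_terms(months_since_peak):
    """Contribution of months in production to log output (level plus log term)."""
    return B_MONTHS * months_since_peak + B_LOG_MONTHS * math.log(months_since_peak)


def decline_share(m_start, m_end, use_lag_multiplier):
    """Share of production lost between two months, other regressors held fixed."""
    change = time_terms(m_end) - time_terms(m_start)
    if use_lag_multiplier:
        # Assumption: steady-state approximation of the lagged-production effect.
        change /= (1.0 - B_LAG)
    return 1.0 - math.exp(change)


impact_only = decline_share(1, 12, use_lag_multiplier=False)
with_persistence = decline_share(1, 12, use_lag_multiplier=True)

print(f"Decline from the time terms alone: {impact_only:.0%}")
print(f"Decline with the lag multiplier:   {with_persistence:.0%}")
```

The log term accounts for most of the early decline, which is why including both transformations lets the fitted curve fall steeply at first and flatten later.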

Standard errors are shown below the coefficients. Inference is robust to heteroscedasticity and autocorrelation. That is, in calculating the standard errors, the (Huber–White) error term is not restricted to be spherical (Freeman, 2006).

All of the coefficients in Model 1 were statistically significant at the 0.05 level or better. The overall \(R^{2}\) was 0.66. \(\rho\), the fraction of the variance in \(q_{it}\) explained by the random effects, was 0.18. The full Stata header is shown in Table 8 in the appendix. The estimated persistence in production, the coefficient of 0.2555 on \(q_{it-1}\), suggests that operators had flexibility to vary production from one month to the next, but that some such changes would be difficult or impose a cost. Since the sample omits gaps in production, the coefficient on \(q_{it-1}\) does not reflect the cost of shutting in or restarting a well.

The coefficient on initial month of production, \(I_{i}\), was positive, which we interpreted as reflecting technological progress. This is noteworthy because, by specifying initial month in levels and production in logs, the impact of technological progress is shown to be greatest when wells are at their most productive; i.e., early in their lives. Nonetheless, technological progress has the potential to impact production throughout the life of a well. As Kah (2018) has noted, “While improvement from any specific activity may come to an end, there should still be a long way to go in overall technological advancement.” Fitzgerald (2015) also found evidence of learning in hydraulic fracturing using data from the Bakken Shale.
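A back-of-the-envelope reading of this coefficient is sketched below. It converts the estimate of 0.0090 per month into a production ratio between two wells whose first months of production are 48 months apart, the gap between the first and last months of the 49-month sample. The use of the steady-state multiplier \(1/(1 - 0.2555)\) for the lagged-production term is an assumption of this example and may not match how the paper computes its own figure.

```python
import math

B_INITIAL_MONTH = 0.0090   # coefficient on I_i in Eq. 1
B_LAG = 0.2555             # coefficient on lagged log production in Eq. 1


def productivity_ratio(months_apart, use_lag_multiplier=True):
    """Ratio of production for a later well to an earlier one, all else equal."""
    effect = B_INITIAL_MONTH * months_apart
    if use_lag_multiplier:
        # Assumption: steady-state approximation of the lagged-production effect.
        effect /= (1.0 - B_LAG)
    return math.exp(effect)


MONTHS_APART = 48  # assumption: first versus last month of the 49-month sample

impact = productivity_ratio(MONTHS_APART, use_lag_multiplier=False)
long_run = productivity_ratio(MONTHS_APART, use_lag_multiplier=True)

print(f"Ratio from the direct effect alone: {impact:.2f}")
print(f"Ratio with the lag multiplier:      {long_run:.2f}")
```

With the multiplier, the ratio comes out somewhat below 2, in line with the statement earlier in the paper that productivity nearly doubled over the sample period once other factors are controlled for.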

Economists, like Solow (1956), define “technological progress” as an increase in the ratio of production to (a weighted average of) the factors of production used, a.k.a. “total factor productivity.” This technological progress must be identified separately from the effects of changes in the use of factors when total factor productivity does not change; per well, the volume of injected fluids, proppant, and the number of fracture treatments changed within the sample period. Equation 1 sorts this out. A small, profit-maximizing producer will only increase the use of a factor of production (e.g., proppant) if (1) the price of the product (here, crude oil) rises, (2) the price of proppant falls, (3) the price of another factor (e.g., well length) changes, or (4) she learns how to produce more crude oil with no change in total cost, or in the price of any factor of production. If the prices of crude oil and its factors of production are held constant, then one can rule out Cases 1–3, and Case 4 is what we call “technological progress.” Including a variable in a regression is a way to hold its value constant while the others change. Case 1 is modeled in Eq. 1 in the price terms, and Case 4 in the term for initial month of production, \(I_{i}\), whose coefficient, our estimate of technological progress, could be biased if we did not somehow model Cases 2 and 3. In the parlance of econometrics, this would be referred to as “omitted variable bias.”

Here is how we model Cases 2 and 3: Data on prices or quantities of materials and labor employed in the Bakken Shale are not publicly available, but these omitted variables correlate with the prices of oil included in Eq. 1, and with oil and gas equity, represented by \(F_{t - 1}\) in Eq. 1. The price of equity is the value of present and expected future profits; \(F_{0} = \sum\nolimits_{t = 0}^{\infty } {\left( {P_{t}^{{{\text{Ind}}}} Q_{t}^{{{\text{Ind}}}} - \vec{W}_{t}^{{{\text{Ind}}}} \vec{\theta }_{t}^{{{\text{Ind}}}} } \right)\left( {1 + r} \right)^{ - t} }\), where \(P_{t}^{{{\text{Ind}}}}\) is the price of crude oil at Time \(t\), \(Q_{t}^{{{\text{Ind}}}}\) is crude oil produced, \(\vec{W}_{t}^{{{\text{Ind}}}}\) is a vector of factor prices, \(\vec{\theta }_{t}^{{{\text{Ind}}}}\) is a corresponding vector of factors employed to produce \(Q_{t}^{{{\text{Ind}}}}\), the “Ind” superscripts indicate that the variables are industry-wide, and \(r\) is the temporal discount rate. An increase in any \(W_{0}^{{{\text{Ind}}}}\) will directly lower \(F_{0}\), and likely indirectly lower it by influencing expectations for \(t > 0\). \(\vec{\theta }_{t}\) weights the prices of intensively used inputs relatively heavily, and so they effectively have larger coefficients in Eq. 1. Financial markets are very sensitive to changes in the costs of companies that issue financial instruments, so we have meaningfully controlled for the prices of factors of production used throughout the industry in Eq. 1.
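The present-value expression for \(F_{0}\) can be illustrated numerically. The sketch below truncates the infinite sum at three periods and uses entirely hypothetical prices, quantities, and factor inputs, so the numbers carry no empirical meaning. It only shows that raising one factor price in period 0 lowers the computed equity value by the added cost. The function and variable names are illustrative.

```python
def equity_value(prices, quantities, factor_prices, factor_quantities, discount_rate):
    """Discounted sum of revenue minus factor costs over a finite horizon.

    Assumption: the infinite sum in the text is truncated at len(prices) periods.
    """
    total = 0.0
    periods = zip(prices, quantities, factor_prices, factor_quantities)
    for t, (p, q, w, theta) in enumerate(periods):
        cost = sum(w_k * theta_k for w_k, theta_k in zip(w, theta))
        total += (p * q - cost) / (1.0 + discount_rate) ** t
    return total


# Hypothetical industry-wide inputs for three periods.
prices = [50.0, 50.0, 50.0]                    # oil price P_t
quantities = [100.0, 80.0, 60.0]               # oil produced Q_t
factor_quantities = [[100.0, 50.0]] * 3        # factors employed, theta_t
base_factor_prices = [[10.0, 5.0]] * 3         # factor prices W_t
# Raise the first factor's price in period 0 only.
higher_factor_prices = [[12.0, 5.0]] + [[10.0, 5.0]] * 2

base = equity_value(prices, quantities, base_factor_prices, factor_quantities, 0.10)
shocked = equity_value(prices, quantities, higher_factor_prices, factor_quantities, 0.10)

print(f"Equity value, baseline:            {base:.2f}")
print(f"Equity value, higher factor price: {shocked:.2f}")
```

The example captures only the direct effect described in the text; any indirect effect through expectations for later periods would have to be added by changing the later entries of the factor price list.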

Indeed, fracking fluids (Alibaba, n.d.) and proppant (Market Watch, 2021) are traded globally, so changes in their prices in the Bakken correlate with changes worldwide and, therefore, with the price of globally traded oil and gas equity. There is also a geographically mobile pool of oil field labor that causes wages in the Bakken to correlate with those elsewhere in the industry and, therefore, also with the price of equity. Inasmuch as greater use of fluid, proppant, or labor represents a movement along the supply curve (Case 1), we modeled that by making production, \(q_{it}\), a function of the price of oil, \(p_{t}\); inasmuch as it represents a shift in the industry supply curve (Cases 2 and 3), we have modeled that by making production a function of the price of equity, \(F_{t-1}\).

As to the supply curve for the firm, the random effects, which allow for a different intercept term at each well, can also stand in for omitted prices or quantities of variable factors of production (Cases 2 and 3) and may correct any remaining omitted variable bias at an individual well. According to Kennedy (2008; pp. 281–282), "Panel data can be used to deal with heterogeneity in the micro units. In any cross-section, there is a myriad of unmeasured explanatory variables that affect the behavior of the people (firms, countries, etc.) being analyzed. (Heterogeneity means that these micro units are all different from one another in fundamental unmeasured ways.) Omitting these variables causes bias in estimation. The same holds true for omitted time series variables that influence the behavior of the micro units uniformly, but differently in each time period. Panel data enable correction of this problem. Indeed, some would claim that the ability to deal with this omitted variable problem is the main attribute of panel data."

Finally, when we added the Producer Price Index for support activities for oil and gas extraction as a regressor, its sign was negative, as expected, but it hardly changed the coefficient on Ii, and certainly not in a statistically significant way. This suggests that the equity price and random well effects have adequately addressed what might otherwise be omitted variable bias in the coefficient on Ii.

Based on Eq. 1 and projecting forward 75 months, Figure 5 illustrates the rapid decline rate of Bakken Shale oil production in the early months, as well as the impact of a technological shift in production between wells where production began in August of 2015 and those where production began in August of 2018. The figure indicates the existence of a proportional upward shift in production from 2015 to 2018; the rate of production decline did not change. In fact, in the first year (12 months) of production, output in the model declined 78% for wells beginning production in 2015 and 79% for those beginning production in 2018.

Figure 5 Expected monthly production from wells beginning production in 2015 and 2018. Note: “bbl” stands for “barrels,” each of which contains nearly 159 L of crude oil

In Figure 5, price, location, and distance between wells are fixed in order to isolate the effect of the technological shift, which is smaller than the quadrupling of production in the raw data mentioned under “Technological Gain,” above. We conjectured that fixing location and well-spacing likely understated the technological shift, inasmuch as developers tend to “learn as they go” when progressing from one situation to the next, as noted in Attanasi and Freeman (2020). Smith (2018) used a proportional shift in the decline curve, like that in Figure 5, based on an assumption of diminishing marginal productivity of new wells. Our results helped validate his assumption.

As suggested earlier, a very general function of latitude and longitude was specified to describe well location in order to account for spatial variation in production. A flexible functional form was needed in order to approximate the irregularity in the geographic distribution of the resource attributable to naturally occurring geological dispersion and any induced spatial variation attributable to operational differences. Dummy variables for township/range/section were considered, but the precision of latitude and longitude explained production as well as any of the dummy variables, with far fewer regressors. 45° was subtracted from the values of latitude, and 100° was added to the values of longitude so that they would exhibit some significant variation in relation to their means, and Stata would not omit them due to collinearity.

Table 3 shows the relative productivity of wells by geographic location. Equation 1 indicates a range of over five to one in the magnitude of the effect of location on production. In other words, when drilling moved from the most productive center of the formation, at about (47.54, −102.68), to the periphery, productivity dropped substantially. Figure 6 shows the relationship between location and production after controlling for the other variables in Eq. 1. The exact center of the “sweet spot” was at 47.4939° latitude and −102.5729° longitude. This was about 20 km from the center of the Fort Berthold Reservation, at 47.7031° latitude by −102.2978° longitude (Fig. 1).

Table 3 Effects of latitude and longitude on production (ratio to average)

Figure 6 Effect of location on the production of Bakken Shale oil wells

It has been suggested that well spacing is another factor in productivity (Matthews et al., 2019). Using the PLSS, the model counted the number of wells in each quarter–quarter section in order to estimate the impact of well spacing. To guard against endogeneity of the regressor, we used a 1-month lag in the number of co-located wells. The number of co-located (in the same PLSS QQ section) wells that were in production, Wit−1, bore a negative coefficient in Eq. 1, suggesting that, if wells were not adequately spaced, production was adversely affected, possibly as a result of interference (Ajani & Kelkar, 2012; Jacobs, 2017; Brown et al., 2019; Matthews et al., 2019). A co-located well was counted as being “in production” if it produced more than 100 barrels. Once a well is completed, an additional co-located well reduces production at the original well by 1.65%, with a 95% confidence interval of [1.29%, 2.00%].
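
The lagged count of co-located wells described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the record layout, the section labels, and the sample volumes are hypothetical, and the sketch assumes that the count excludes the well itself, which the text does not state explicitly.

```python
from collections import defaultdict

# Hypothetical records: (well_id, qq_section, month_index, barrels_produced).
records = [
    ("A", "QQ-1", 1, 5000), ("A", "QQ-1", 2, 4000), ("A", "QQ-1", 3, 3500),
    ("B", "QQ-1", 2, 6000), ("B", "QQ-1", 3, 4500),
    ("C", "QQ-1", 2, 50), ("C", "QQ-1", 3, 800),
    ("D", "QQ-2", 1, 7000), ("D", "QQ-2", 2, 5500),
]

THRESHOLD_BBL = 100  # a well counts as "in production" above this monthly volume


def lagged_colocated_counts(rows, threshold=THRESHOLD_BBL):
    """Return {(well_id, month): number of OTHER wells in the same
    quarter-quarter section that were in production one month earlier}."""
    producing = defaultdict(set)  # (section, month) -> wells above threshold
    for well, section, month, bbl in rows:
        if bbl > threshold:
            producing[(section, month)].add(well)
    counts = {}
    for well, section, month, _ in rows:
        # Assumption: the well itself is excluded from its own count.
        neighbours = producing.get((section, month - 1), set()) - {well}
        counts[(well, month)] = len(neighbours)
    return counts


counts = lagged_colocated_counts(records)
print(counts[("A", 3)], counts[("C", 3)])
```

Using the previous month's count, as in the text, keeps the regressor from being determined simultaneously with current production at the well in question.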

In Model 1, the coefficient on the price of equity, as measured by the first monthly lag relative to production of the Standard & Poor’s Oil & Gas Exploration and Production Energy Trading Fund, Ft−1, was positive. Higher stock prices reflect higher expected rates of return on investment, consistent with the Hotelling (1931) pricing model and results previously reported by Smith (1981) and Bjørnland et al. (2017). Another line of reasoning is that better access to financing increases optimal spending on capital, materials, and labor and, therefore, increases supply, as in the discussion of omitted variable bias preceding Figure 5.

Note that price trended up during the sample period (Fig. 2). In fact, the correlation between log price and time was 0.66, while the correlation between its first difference and time was only 0.10. Hence, the log of price was used so that the supply curve would be convex, and the first difference was used to avoid spurious inference. In Eq. 1, the coefficients on the contemporaneous first difference in log price, dpt, and its third lag, dpt−3 (Footnote 10), were positive; production increased in response to an increase in price. Further, the considerably larger coefficient on the third lag may reflect that, given more time, operators can do more in response to a change in price. With the lagged dependent variable included, the impact on production of a change in price lasting s months increased over time as the sum of a geometric series (Weisstein, n.d.). To illustrate, and ignoring the effect of the third lag,

$$\begin{aligned} dq_{it} & = 0.0598dp_{t - s} \sum\limits_{j = 0}^{t - s} {0.2555^{j} } \\ & = 0.0598dp_{t - s} \left( {\frac{{1 - 0.2555^{t - s + 1} }}{1 - 0.2555}} \right) \\ & \quad \to \frac{{0.0598dp_{t - s} }}{1 - 0.2555}\;{\text{as}}\;t \to \infty \\ \end{aligned}$$
(2)
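
As a numerical check on Eq. 2, the short sketch below compares the term-by-term sum with the closed form and with the limit. The two coefficients are those quoted in Eq. 2; the function names are ours, and the third-lag effect is ignored, as in the text.

```python
# Coefficients quoted in Eq. 2 (third-lag effect ignored, as in the text).
BETA_PRICE = 0.0598  # coefficient on the first difference of log price
RHO = 0.2555         # coefficient on lagged log production


def cumulative_effect(n_terms, beta=BETA_PRICE, rho=RHO):
    """Partial sum beta * (1 + rho + ... + rho**(n_terms - 1))."""
    return beta * sum(rho ** j for j in range(n_terms))


def closed_form(n_terms, beta=BETA_PRICE, rho=RHO):
    """Same partial sum via the finite geometric-series formula."""
    return beta * (1 - rho ** n_terms) / (1 - rho)


long_run = BETA_PRICE / (1 - RHO)
for n in (1, 2, 3, 6, 12):
    print(n, round(cumulative_effect(n), 4), round(closed_form(n), 4))
print("limit:", round(long_run, 4))
```

Because 0.2555 raised to successive powers shrinks quickly, the partial sums approach the limit of about 0.080 within a handful of months.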

Even without the additional effect of the third lag, it seems clear that, given more lead time, operators can do more to change the rate of flow from a well. It is also important to consider the predictive capability of Model 1. To predict production in the first month using Eq. 1, we let t − 1 approach t and set Mi = 1, which resulted in:

$$\begin{aligned} q_{i} & = 72.2115 - 0.0221W_{i} + 0.0007F \\ & \quad + \,0.0121I_{i} - 0.0284e^{{{Lat}_{i} }} - 232.9852e^{{{Long}_{i} }} - 3.6486Lat_{i} - 18.7789{Long}_{i} \\ & \quad + \,5.9360\ln \,{Lat}_{i} - 98.0825\ln \left( { - {Long}_{i} } \right) - 0.6265Lat_{i} \cdot {Long}_{i} \\ \end{aligned}$$
(3)
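
To make Eq. 3 easier to apply, the sketch below evaluates its right-hand side. It rests on assumptions that should be checked against the full model: that Lat and Long in Eq. 3 are the shifted coordinates described earlier (latitude minus 45°, longitude plus 100°), and that the values passed for W, F, and I are placeholders, since their units are not restated in this section. The function name and the sample inputs are ours; the location is the one used for Table 4.

```python
import math


def eq3_first_month(lat_deg, lon_deg, colocated_wells, equity_price, initial_month):
    """Evaluate the right-hand side of Eq. 3.

    Assumes Lat and Long are shifted coordinates (latitude - 45,
    longitude + 100). equity_price (F) and initial_month (I) are
    placeholders whose units are not given in this section.
    """
    lat = lat_deg - 45.0
    lon = lon_deg + 100.0  # negative for longitudes west of 100 degrees W
    return (72.2115
            - 0.0221 * colocated_wells
            + 0.0007 * equity_price
            + 0.0121 * initial_month
            - 0.0284 * math.exp(lat)
            - 232.9852 * math.exp(lon)
            - 3.6486 * lat
            - 18.7789 * lon
            + 5.9360 * math.log(lat)
            - 98.0825 * math.log(-lon)
            - 0.6265 * lat * lon)


# Placeholder values for W, F, and I; only the difference below is meaningful.
base = eq3_first_month(47.9242, -103.1173, 0, 30.0, 12)
one_more_well = eq3_first_month(47.9242, -103.1173, 1, 30.0, 12)
print(round(one_more_well - base, 4))  # the coefficient on W_i
```

Comparing two evaluations that differ in a single input isolates that input's coefficient, which is a safer use of the sketch than reading off a level while F and I are placeholders.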

In the sample, 1332 wells peaked in the first full month of production, 1275 in the second, 683 in the third, 404 in the fourth, and 243 in the fifth. It is safe to say that the first month, if it were a full month of production, would be the modal peak month. However, to avoid the problem of mid-month starts, we set the second month as the first full month in Model 1.

Table 4 shows the predicted production in the second month falling between the median and mean in the sample, lending some credence to the model. In this example, location was fixed at 47.9242° latitude and −103.1173° longitude. For all but 1 month (September 2017), median production for wells in the second month was below the mean, and the average mean-to-median ratio across months was 1.12. Over the 49-month sample period, median production in the second month of production increased by 66%, and the mean increased by 73%.

Table 4 Summary statistics for the second month of production (bbl/month)

Price Elasticity Under Model 1

Oil supply price elasticity has been approached from a variety of perspectives and contexts, and it continues to be of considerable interest to the petroleum industry, especially given the volatility of price since 1973. Using a series of simulations, Ikonnikova et al. (2017) showed that “the oil price is probably the single most important factor that can drive cumulative Bakken production (2015–2045) up or down as much as 50%.” As also demonstrated here, an important application of the econometric modeling was to estimate the price elasticity for the Bakken Shale. Table 5 summarizes the estimates of Bakken Shale oil supply price elasticity obtained using Model 1, with their 95% confidence intervals. The entries in Table 5 represent the percent changes in production after the indicated number of months resulting from a one percent change in price lasting that length of time. For example, if price rises by one percent for 5 months, production 5 months after the onset of that change would be 0.31% higher than it would have been without the change in price.

Table 5 Estimated supply price elasticity for Bakken Shale oil in North Dakota based on Model 1 (response to a 1% change in price)

Model 2: Discontinuous Production

Model 1 assumes that wells are continuously in production (i.e., there are no gaps), a constraint that excludes actual operating situations associated with completing, shutting in, and restarting wells, which may involve full months of no production. Excluding observations of zero production is largely benign for the purpose of estimating a continuous production curve like that shown in Figure 4. However, gaps in production, including when they begin, how long they continue, and when they end, to some degree reflect the impact of price, even if that is not the main consideration. This connection was evident in the following response to low prices during Spring 2020:

Louisiana Oil and Gas Association President Gifford Briggs said the group is in the process of updating … projections in light of the recent price crash. “Our members are telling me that they’ve instructed their field people to begin shutting in production immediately,” he said (Mackrael et al., 2020).

An opposite response became apparent with a rapid rebound in activity as prices increased in June 2020:

U.S. shale producers are expected to restore roughly half a million barrels per day (bpd) of crude output by the end of June, according to crude buyers and analysts, amounting to a quarter of what they shut-in since the coronavirus pandemic cut fuel demand and hammered oil prices.

Such a swift rise in U.S. production would complicate efforts by top producers Saudi Arabia and Russia to encourage global allies to fulfill their pledges to make record production cuts…

U.S. producers cut supply by roughly two million bpd. But the recovery in benchmark oil prices to around $40 a barrel makes some shale output profitable again, even though that level is unlikely to spur additional new drilling activity. Larger producers are re-opening the taps in low-cost plays in Texas, but also in expensive shale basins in North Dakota and Oklahoma (Kumar & Hiller, 2020).

In the longer term, producers may respond to changing conditions by scheduling and re-scheduling drilling and completion, shutting in and restarting wells, and modulating production at wells that are up and running. Operators may also choose to delay completion of drilled wells until prices improve. The US Energy Information Administration (EIA) noted that, as crude oil prices fell in the second half of 2014 and remained low in 2015, the count of new wells drilled in North Dakota fell. During that time, the time to drill a new well remained relatively unchanged, but the time to complete a well increased from about 3 months to nearly a year (EIA, 2019). All of these behaviors enhance shale oil’s role as a balance wheel to the larger oil market.

Estimation of Model 2

For purposes of constructing Model 2, production was assumed to be one barrel during months when production was actually zero, so that the dependent variable, the log of production, would be defined. The variation between one barrel and observations over 100 barrels was virtually the same as between zero barrels and the same observations over 100 barrels. Months from peak production was excluded because observations of zero production before completion have nothing to do with the number of months until peak production occurs. Similarly, the timing of the initial month of production (though not production in that month) was excluded because, when drilling and completion costs are variable, so is the timing of completion; and the later completion occurs, the higher the production, since improved technology can be used. In this case, the timing of initial production would be an endogenous regressor, correlated with innovations in production. Hence, for simplicity, it was omitted. Instead, we controlled for technological progress using a common deterministic trend.
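
The treatment of zero-production months can be illustrated with a minimal sketch. The well history below is hypothetical, and the function name is ours.

```python
import math


def log_production(barrels):
    """Log of monthly production, treating a zero month as one barrel."""
    return math.log(1.0 if barrels == 0 else barrels)


monthly_bbl = [0, 0, 12500, 9800, 0, 7400]  # hypothetical well history
print([round(log_production(b), 3) for b in monthly_bbl])
```

Since the log of one is zero, months before completion and shut-in months enter the dependent variable as zeros instead of being undefined.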

As in the case of Model 1, we tried a random effects modeling approach. However, we found that the random effects explained almost none of the variance in production, and a pooled ordinary least squares (OLS) approach produced the same coefficients to five decimal places. Neither running 10,000 bootstrap replications nor down-weighting high-variance observations nor iterating the variance–covariance matrix and coefficients improved inference, and so we invoked Occam’s Razor. We took OLS to be efficient under a heteroscedasticity- and autocorrelation-robust error structure and used it to estimate Eq. 4. The full Stata header is shown in Table 9 in the appendix.

Model 2 Results

Model 2 is given by:

$$\begin{aligned} q_{it} & = - \mathop {9.7530}\limits_{2.5982} + \mathop {0.0072}\limits_{0.0003} t + \mathop {0.9197}\limits_{0.0015} q_{it - 1} + \mathop {0.0501}\limits_{0.0024} W_{it - 1} \\ & \quad + \,\mathop {0.0267}\limits_{0.0041} e^{{{Lat}_{i} }} + \mathop {38.2003}\limits_{10.6452} e^{{{Long}_{i} }} - \mathop {1.1578}\limits_{0.2894} Lat_{i} + \mathop {2.9320}\limits_{1.0856} {Long}_{i} \\ & \quad + \,\mathop {2.9204}\limits_{0.6886} \ln \,{Lat}_{i} + \mathop {15.8138}\limits_{4.8442} \ln \left( { - {Long}_{i} } \right) + \mathop {0.1297}\limits_{0.0259} Lat_{i} \cdot {Long}_{i} \\ & \quad + \,\mathop {0.0614}\limits_{0.0278} dp_{t} + \mathop {0.0337}\limits_{0.0248} dp_{t - 1} + \mathop {0.0616}\limits_{0.0261} dp_{t - 3} + \mathop {0.0043}\limits_{0.0003} F_{I_{i} - 4} \\ \end{aligned}$$
(4)

The coefficient on the deterministic trend, 0.0072, was highly significant (p < 0.0001), though somewhat smaller than that on initial month of production in Eq. 1. Both models may be interpreted to depict rapid technological progress. We again addressed potential omitted variable bias by including the price of equity, \({F_{{{I}_{i}} - 4}}\), and by trying a model with random well effects. The finding that the intercept term did not differ by well was consistent with there being no such bias in Eq. 4. At 0.9197, persistence, the coefficient on qit−1, was considerably larger than in Eq. 1. This likely reflects the greater costs associated with shutting in and restarting wells, as compared to those of simply modulating continuous production.

The coefficient on co-located wells, Wit−1, was positive, in contrast to that in Eq. 1, which was negative. This implies that, once a well has been drilled, as in Eq. 1, adding wells nearby lowers production (Ajani & Kelkar, 2012; Pang et al., 2015; Ajisafe et al., 2017), but when choosing where to drill, as in Eq. 4, it is advisable to drill where others have drilled already. In Eq. 1, the addition of a nearby well lowers output of the original well by 1.65%. In Eq. 4, one can expect 5.01% more output from a prospective well for each nearby existing well, with a 95% confidence interval of [4.54%, 5.48%]. This is an interesting twist on the common resource problem, suggesting that unitization of shale plays (or portions thereof) could lead to a Pareto improvement, if it could be technically and legally accomplished (De Kok, 2014; Kramer, 2015; Kleit et al., 2020).
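
The interval quoted for Eq. 4 can be reproduced from the coefficient on Wit−1 and the number printed beneath it. The sketch assumes that this number (0.0024) is a standard error and that a normal critical value of 1.96 applies. It also reads a coefficient on log production as an approximate percent change. The function name is ours.

```python
Z_95 = 1.96  # normal critical value for a two-sided 95% interval (assumption)


def percent_effect_with_ci(coef, std_err, z=Z_95):
    """Return (point, lower, upper) in percent, reading coef as 100 * coef %."""
    return (100 * coef, 100 * (coef - z * std_err), 100 * (coef + z * std_err))


point, lower, upper = percent_effect_with_ci(0.0501, 0.0024)
print(f"{point:.2f}% [{lower:.2f}%, {upper:.2f}%]")  # 5.01% [4.54%, 5.48%]
```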

The coefficient on the stock price, lagged 4 months before the date of initial production (as distinct from current production, as in Eq. 1), \({F_{{{I}_{i}} - 4}}\), was positive. This implies that when capital is more available, more is invested, and production is increased. We used the 4-month lag because a period of time shorter than this would not allow for much change in the physical investment necessary to complete a well. The costs of these investments were correlated with prices in markets for inputs large enough for financial traders to notice, like operating costs for fluid, proppant, and labor. A rise in the cost of such inputs that is expected at Time Ii−4 will therefore both shift the supply curve up, decreasing production, qit, at any given price for oil, pt, and lower the present value of expected profits, lowering the stock price, \({F_{{{I}_{i}} - 4}}\). Production and the stock price thus move together, and so the coefficient was positive. The fact that random well effects, which would stand in for any omitted variables, were found to be superfluous supports the adequacy of the regressors in Model 2, including \({F_{{{I}_{i}} - 4}}\), as a representation of cost.

Because random effects were needed in Model 1, which differed from Model 2 mainly in that months of zero production were omitted from the data, the random effects in Model 1 likely only reflect differences across wells related to shut-ins, re-starts, and the timing of completion, insofar as the differences related to completion are not monotonically related to time; the date of completion was not an omitted variable in Model 1.

Price Elasticity Under Model 2

Table 6 reports the estimates of supply price elasticity for Bakken Shale oil based on Model 2 (discontinuous production), along with their 95% confidence intervals. Recall that Model 2 encompasses situations of zero production associated with various well shut-in/re-start scenarios, as well as observations preceding completions of wells.

Table 6 Estimates of supply price elasticity for Bakken Shale oil in North Dakota, with 95% confidence intervals, based on Model 2 (response to a 1% change in price)

Note that the values reported in Table 6 are higher than those reported in Table 5, reflecting the greater response to a change in price when stopping and starting production is possible, in contrast to a situation wherein completion/shut-in/re-start scenarios are not considered, except in the month when production begins or ends (Footnote 11). The more options one has for changing production, the greater the change in production in response to a given change in price. Further, the coefficients on dpt in Eqs. 1 and 4 were not appreciably different, suggesting that it takes more than a few weeks of higher prices for an operator to respond positively by completing or restarting a well, and more than a few weeks of lower prices for an operator to respond negatively by shutting one in. Significantly, around 85% of the supply impact occurred within the first 2 years, again pointing toward shale oil’s important role as a balance wheel to the oil market.
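
The time path behind these statements can be sketched directly from the coefficients in Eq. 4. The sketch follows the convention of Eq. 2 in holding the first difference of log price at one for every month of the price change; that reading, the month indexing, and the function name are our assumptions, not a reproduction of how Table 6 was computed.

```python
# Coefficients from Eq. 4 (Model 2).
RHO = 0.9197                                      # lagged log production
PRICE_COEFS = {0: 0.0614, 1: 0.0337, 3: 0.0616}   # lag -> coefficient on dp


def elasticity_path(n_months, rho=RHO, price_coefs=PRICE_COEFS):
    """Response of log production when dp is held at 1 from the first month.

    Entry k (0-based) is the response in month k + 1 of the price change.
    """
    path = []
    previous = 0.0
    for month in range(n_months):
        # Only lags that fall inside the price-change window contribute.
        direct = sum(c for lag, c in price_coefs.items() if lag <= month)
        previous = rho * previous + direct
        path.append(previous)
    return path


path = elasticity_path(240)
long_run = sum(PRICE_COEFS.values()) / (1 - RHO)
print(round(path[2], 2), round(long_run, 2), round(path[23] / long_run, 2))
```

Under these assumptions, the third-month response is roughly 0.23, the long-run value is roughly 1.95, and about 85% of the long-run response has accrued by month 24, in line with the figures cited from Table 6.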

Evaluating Model 2 Results in Light of Prior Elasticity Estimates

Work focused specifically on the price elasticity of shale oil is still somewhat limited, a gap that our analysis and results help to fill. A popular approach has been to calculate a breakeven point (BEP) for the price of shale oil (Business Insider, 2014), which Kleinberg et al. (2016) clarified. Ansari and Kaufman (2019) followed with a different BEP calculation and concluded that production is most sensitive to price changes near that point. Our approach differs from these two in that we sought to quantify the price elasticity of tight oil supply directly: low oil prices slow shale oil development, and high prices accelerate it. And, as this research has demonstrated, breakeven points can shift dramatically in a few years due to rapid technological progress. Our approach is more closely aligned with that of Smith and Lee (2017), who estimated the elasticity of economically recoverable reserves of shale oil, though we do not explicitly model that variable.

Bjørnland et al. (2017) also estimated elasticity of tight oil supply using North Dakota data, but with the following differences in time span and modeling approach. Their data end in 2015, about when ours begin. They used (1) the price of WTI, while we used first purchase price in North Dakota, (2) a combination of current and futures prices, while we found lagged “spot” prices and oil and gas equity prices, which vary directly with expected oil prices, to be significant predictors of production, (3) fixed time effects, while we used completion date or a deterministic trend, and (4) effects of location aggregated into fixed effects on wells, while we specified several transformations of latitude and longitude. In addition, Bjørnland et al. (2017) did not control for well density, while we found that it informs the problem of common resources. Both approaches lead to essentially the same qualitative conclusions. They find support for Hotelling’s (1931) theory of optimal extraction, while we found that production depends positively on common stock prices for extraction of oil and gas, and, moreover, “firms using shale oil technology are more flexible in allocating output intertemporally.”

Estimates of supply price elasticity for the more general category of non-OPEC crude oil are much lower, especially in the short term. For example, Golombek et al. (2018), Alhajji and Huettner (2000), and Vatter (2017) all had previously estimated the price elasticity of non-OPEC oil supply, of which shale oil represents a subset. Golombek et al. (2018) estimated the long-run price elasticity of non-OPEC supply to be 0.32, citing Alhajji and Huettner (2000), who estimated it to be 0.29, whereas Vatter (2017) estimated it to be 0.24. By way of contrast, the last row in Table 6 shows a much larger value of 1.95. A current production-weighted average of our estimate of long-run elasticity of tight oil supply of 1.95 and conventional non-OPEC supply of 0.24 comes out to 0.53, indicating that shale oil has made non-OPEC supply much more elastic, but still decidedly inelastic. Vatter (2017) further estimated within-quarter elasticity of non-OPEC supply to be 0.015, well below the 3-month estimate of 0.23 in Table 6.
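
The 0.53 figure can be checked by backing out the production weight it implies. Writing \(w\) for the tight oil share of the production being averaged (a symbol introduced here; its value is not stated in this section), the weighted average is:

$$\begin{aligned} 0.53 & = 1.95\,w + 0.24\,\left( {1 - w} \right) \\ w & = \frac{0.53 - 0.24}{1.95 - 0.24} = \frac{0.29}{1.71} \approx 0.17 \\ \end{aligned}$$

Under this reading, tight oil would account for roughly 17% of the production in the average. This share is inferred from the reported elasticities, not quoted from the source data.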

Testing Model 2 Against Three Sources of Aggregate US Shale Oil Production

As a test of the robustness of Model 2, Table 7 shows how Eq. 4 performs out of sample, when applied to three different sources of aggregate US shale oil production: Bakken (North Dakota plus Montana); Permian Spraberry; and all US shale oil combined. Here, we used aggregate monthly data because more granular information was not universally available throughout the USA (Footnote 12). Time-invariant variables and well density, assumed not to change, were represented in a constant, which is shown in the last row, and calibrated to provide a prediction for May 2019. None of the projected errors exceeded 8%. For reasons suggested by Coburn and Attanasi (2020), the model did a better job predicting aggregate US production than it did for the Bakken (North Dakota plus Montana) alone. Predictions for the Permian were almost as good as those for the Bakken (North Dakota plus Montana). It is not unreasonable, then, to generalize our results from the Bakken to other shale plays in North America.

Table 7 Projection for Bakken, Permian, and Total US tight oil production (MMb/d)

Conclusions and Further Discussion

As stated in the introduction, our work does not fully address the many questions pertaining to shale oil resources and their role in global markets. However, using the Bakken Shale as a case study, our work yielded important insights into the productivity and price sensitivity associated with well development within shale plays that inform knowledge of the resource, its ongoing development, and its ultimate role in global energy markets.

We conclude that technological progress in the Bakken Shale and, by extension, other shale plays, was rapid. Analysis of horizontal well production data has revealed significant technological progress over a 4-year period; so much so that it offset a major production decline that would have been expected following a drop in drilling. Nonetheless, oil is a depletable resource, and, at some point, there must be diminishing returns. For the time being, however, oil prices will largely determine the pace of shale oil development in the USA.

We found that this source of supply is more price-elastic than non-OPEC supply in general. It is said that the Bakken is the most price-sensitive of the shale plays (Clemente, 2018), but, inasmuch as the greater elasticity of the Bakken is indicative of the supply of shale oil in general, it is good news for the world’s consumers and the world economy, as greater elasticity of supply dampens extreme price swings. Because of the asymmetric effects of oil prices on the macroeconomy (Mork, 1994), those price swings lower world GDP, and so shale oil’s dampening of them raises world GDP. Consistent with this, Melek et al. (2020) “show that the shale boom boosted U.S. real GDP by a little more than 1 percent” [abstract]. Our formal results underscored the important role of shale oil in the coming decades in balancing global supplies with demand. The range of high and low prices is likely to be smaller than it otherwise would have been, and, ultimately, oil prices and the increasing, long-run marginal cost of developing the shale resource will tend to converge. However, despite the greater elasticity of shale oil supply, at current production levels, non-OPEC supply as a whole remains decidedly inelastic.

Assessing shale oil’s future potential starts with the recognition that, so far, development has been constrained to regions with historic oil production, such as the Williston Basin. This may be due to the industry’s familiarity with historic producing areas, but most likely it is simply due to geology. Virtually all of the production is located in well-known oil regions. Indeed, 89% of shale oil production is from source rock supporting the Permian, Bakken, and East Texas conventional plays. Consequently, while shale oil has proven to be prolific, it remains concentrated in a few regions in which substantial conventional reserves were previously found. This observation hints at the future for shale oil in other countries and the impact it may have in the global arena.

Continued production of tight oil may also be consistent with optimal climate policy. Vatter (2021) argued that optimal tax rates on emissions of CO2 are progressive, varying directly with the wealth of the taxpayer, and argued against policy mechanisms that apply “at the source” for oil, because the latter raise prices by the same absolute amount for all consumers, including the poor, which is regressive.