Algorithms, human decision-making and predictive policing

Abstract

Given their technical sophistication, it is easy to overlook the human choices that underpin predictive policing algorithms and, importantly, the basic structures of decision theory that they embed. To make a problem amenable to algorithmic computation, the problem must be transformed, often metaphorically, and the problem space delineated. Problem space delineation is one pathway through which human decision-making processes may enter and shape algorithm design, construction, and application. We use decision theory, including behavioural economics, to highlight the choices embedded within this problem space delineation and raise awareness of the potential effect of these choices on the outcomes of applications of predictive policing algorithms. We highlight the importance of balancing the technical-formal evolution of predictive policing algorithms with lockstep advancement in awareness and application of decision theory. Such awareness may help to mitigate some of the recognised weaknesses of this emerging technology and, also, help to identify those aspects of human decision-making that can be augmented positively by algorithms.

Introduction

The foundations of predictive policing are human, not machine, and the approach embeds the ‘delineate and prioritise’ methodology of decision theory. Predictive policing is about using technology to forecast where crimes are likely to occur and who is likely to perpetrate them.Footnote 1 To accomplish this, a problem space must be delineated and, within that problem space, elements are ordered or prioritised.Footnote 2 Speaking generally, the distinguishing features of predictive policing are its data-driven nature, its reliance on information technology to process data according to rules (algorithms) and its focus on forecasting (Meijer and Wessels 2019, p. 1032). Moses and Chan (2018, p. 808) note, “Conceptually, predictive policing is closely connected to, but distinguished from, a range of other approaches to law enforcement, including intelligence-led policing (ILP), data-driven policing, risk-based policing, ‘hot spots’ policing, evidence-based policing, and pre-emptive policing”. Moses and Chan (2018, p. 808) point out that the difference between predictive policing and hot spot policing, for example, is that hot spot policing presumes that crime hot spots remain stable over time, while predictive policing presumes evolution and tries to predict the next hot spot (Gorr and Lee 2015). Similarly, the difference between pre-emptive policing and predictive policing is that pre-emptive policing need not be data-driven (Moses and Chan 2018). In practice, these approaches may operate alongside each other.

The core technical component of predictive policing is the algorithm. An algorithm is a set of steps that a computer can be programmed to follow to generate a result. A predictive policing algorithm can be a simple formula carried out by a commercial spreadsheet program, as we will explain in a later section. The best way to see into the heart of predictive policing and, ultimately, the best way to observe its human roots, is to look more closely at the models that are being used. Here is the PredPolFootnote 3 model that has been applied to predictive policing in Los Angeles (Mohler et al. 2015; Lum and Isaac 2016). First, we must imagine the city as a grid that can be divided up into blocks or cells. Then, the predicted rate of crime in each block is given by

$$\lambda_{n} \left( t \right) = \mu_{n} + \mathop \sum \limits_{{t_{n}^{i} < t}} \theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}},$$

where \(\lambda_{n} \left( t \right)\) is the predicted rate of crime in block \(n\) at time \(t\) (today), \(t_{n}^{i}\) are the times of events in block \(n\) in the history of the process, \(\mu_{n}\) is a baseline rate of events and \(\theta \omega e^{ - \omega t}\) reflects the increase in risk following a recent crime (Brantingham et al. 2018, p. 2). The general idea is to program a computer to use historical crime statistics to compute the best estimates for the variables of the model and, ultimately, a value for \(\lambda_{n} \left( t \right)\) for each block of a grid-representation of the city. The police resources can then be allocated to those blocks with the highest \(\lambda_{n} \left( t \right)\). Like all exercises in decision theory, it is shaped by two primary factors: (1) the decision problem is an ordering or prioritisation task, in this case the ordering of grid blocks according to their \(\lambda_{n} \left( t \right)\) values; and (2) for the problem to be amenable to algorithmic computation, the problem space must be transformed and delineated, in this case the ‘city as a grid’ of blocks of a particular size.
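
To make the structure of this computation concrete, the following sketch evaluates \(\lambda_{n} \left( t \right)\) for a handful of grid blocks and ranks them. It is an illustration of the formula above, not the PredPol implementation; the parameter values, event histories and block names are hypothetical.

```python
import numpy as np

def predicted_rate(mu_n, event_times, t, theta, omega):
    """lambda_n(t) = mu_n + sum over past events of theta * omega * exp(-omega * (t - t_i))."""
    past = np.array([ti for ti in event_times if ti < t])
    if past.size == 0:
        return mu_n
    return mu_n + np.sum(theta * omega * np.exp(-omega * (t - past)))

# Hypothetical inputs: baseline rates and event histories (times in days) for three blocks.
t_today = 100.0
theta, omega = 0.2, 0.1
blocks = {
    "block_1": (0.05, [30.0, 95.0, 99.0]),   # low baseline, some recent events
    "block_2": (0.20, [10.0, 40.0]),         # high baseline, older events
    "block_3": (0.10, [98.0, 99.5]),         # medium baseline, very recent events
}

rates = {name: predicted_rate(mu, times, t_today, theta, omega)
         for name, (mu, times) in blocks.items()}

# Patrol resources would be directed to the blocks with the highest predicted rate.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: lambda = {rate:.3f}")
```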

Although it is hidden beneath the surface, the essence of the algorithm is a human essence. In this case, a human decision-maker decided to represent the city as a grid. When we imagine a grid, we imagine the city devoid of its living features. Not only is the grid different from the city of lived experiences, it is also just one possible representation. Critically, the grid is the only representation of the city that provides a problem space conducive to the operation of this type of predictive policing model. If a human decision-maker decides to depict the city differently, a different type of algorithm would need to be developed. The delineation of the problem space, sometimes metaphorically (i.e. the city is a grid), is a crucial step that shapes the development of the algorithm from its inception. In delineating the problem space, choices are made that shape the way the algorithm operates in practice. The significance of these choices is the primary focus of this paper.

Of course, once it is ready for operation, the algorithm or set of steps cannot be followed by the computer at all unless some data have been collected and fed into the model. The choice of data is also the product of human choice. Indeed, the data themselves are human data and cannot exist separately from the human actions that other humans seek to record and the methods they use to record it. And the relationships that are contained in the predictive policing model itself are a human representation of the relationships between a recent crime and the increased risk of future crime. The interpretations of the results are human interpretations. And the decisions regarding specific allocation of police patrols are human decisions. As too are the decisions made by the police officers who patrol one of the blocks within the grid and, in doing so, become a part of the collection of data that re-enter into the predictive policing process at the beginning.Footnote 4

A primary concern among the critics of predictive policing is that predictive policing may be biased, despite its seeming objectivity (Ferguson 2016, 2018; Lum and Isaac 2016; Brantingham 2017; Brantingham et al. 2018; Richardson et al. 2019). This concern has been the subject of an academic debate which has, at times, spilled over into the popular press (e.g. Smith 2018; Moravec 2019). It is a sub-set of the broader debate, currently emerging across various fields of study, about the pros and cons of using algorithms in private and public sector decision-making and the potential for wrong or biased decision-making. Lepri et al. (2018) list hiring decisions, criminal sentencing and stock trading as areas in which decisions are now made or assisted by algorithms. Kemper and Kolkman (2019) list several day-to-day examples, including Google’s PageRank algorithm, Spotify’s music recommendation algorithm and dynamic pricing models used by retailers. Hilbert et al. (2018) explore the implications of YouTube’s watch time optimisation algorithm. The fairness, transparency and fundamental correctness of the decisions made or assisted by these algorithms are now producing a fast-growing academic discussion (Berk et al. 2018).

This paper is organised as follows. First, we work through a simple example of an algorithm from another context (finance) to illustrate the types of choices that are made during the process of problem space delineation and the way in which the problem is transformed in order to make it amenable to computerised analysis. Second, we show how this same transformation and problem space delineation occurs during the development of predictive policing models and algorithms and how the delineate and prioritise structure of decision theory finds application in predictive policing. Third, we introduce some foundational concepts from decision theory to explain how decision-making biases may enter the structure of algorithms through pathways other than the data that are used to operationalise them. This allows us to identify those aspects of human decision-making that may be positively augmented by algorithms or artificial intelligence and those aspects that may require deeper consideration.

Problem space delineation and decision: a basic example from financial economics

The tension between economic theory and human experience stems, at least in part, from the roots of orthodox economic theory in metaphors and analogies drawn not from fields of study dedicated to human society (e.g. sociology or anthropology) but from physics and, later, biology (McCloskey 1983, 1985; Mirowski 1989; Klamer and Leonard 1994). The clockwork universe of Newton lies beneath the classical models of the market economy,Footnote 5 while the particle physics of Einstein lies beneath the modern theory of financial markets. And Wall Street firms are among the biggest recruiters of physics PhDs (Derman 2004). The very same mathematical formula that describes the price movements of certain types of financial assets also describes the diffusion of milk through a cup of coffee or heat across a hotplate (Lowenstein 2000). This is the product of a conscious effort, a series of human decisions, to emulate the precision of the physical sciences in economics and finance by mapping elements of the economic reality into mathematical objects, especially those already in use within theoretical physics.

The cognitive role of metaphor and its place in the development of scientific knowledge is widely acknowledged (Black 1962; Hesse 1966; Leatherdale 1974; Ricoeur 1977; Ortony 1979; Lakoff and Johnson 1980). Models as metaphors naturally highlight certain features while deemphasising others (Phillips 2007). Computational models emphasise those features of a context that are amenable to computation. Mathematical models emphasise those features of a context that are amenable to mathematical analysis. And so on. One of the most important algorithms in financial economics, the critical line algorithm developed by Markowitz (1952, 1959), could not exist without Markowitz first having delineated a problem space within which the algorithm could function. In this case the delineated problem space represents a metaphorical mapping of the real world of the financial markets into a narrowly delineated set of statistical objects. The delineation of the problem space emphasises some factors and relegates most of the others. We can use the Markowitz algorithm to explain the necessity of problem space delineation and to highlight the choices that shape it.

In building his algorithm, which was designed to identify portfolios that have the highest expected return for each level of risk, Markowitz (1952, 1959) made a series of decisions in pursuit of an analogy between the investment decision and a constrained maximisation quadratic programming problem. The investor, says Markowitz, is primarily interested in his or her expected returnFootnote 6 and risk. If we equate the investor’s expected return with the average (mean, \(\mu\)) return that the investment has generated in the past and if we equate the risk that the investor must bear with the variability of returns over time (measured by standard deviation, \(\sigma\)), we can say that the investor is interested in each investment as a (\(\mu ,\sigma\)) ‘pair’. The result of Markowitz’ series of decisions was the reduction of all investments and all features of those investments (i.e. all stocks, all bonds, all property, all portfolios and all human capital) to two statistics. Furthermore, the investment decision was recast as a ranking or prioritisation of alternatives based solely on these two statistics. Now instead of deciding about an investment or portfolio based on innumerable qualitative factors such as management, products, prospects, markets and all the other things listed in Fig. 1, investors are portrayed as ranking individual investments based solely on \(\mu\) and \(\sigma\).

Fig. 1 Metaphorical mapping and problem space delineation

A difficult problem emerges when the investor seeks to rank or prioritise portfolios. It is this problem that Markowitz really set out to solve. While choosing a single investment on the basis of \(\mu\) and \(\sigma\) might be a matter of choosing the highest \(\mu\) at the level of \(\sigma\) with which the investor is comfortable, once investments are mixed together in a portfolio, interesting things begin to happen. The ups and downs of each investment in the portfolio dampen and reinforce the ups and downs in every other investment. A portfolio consisting of very volatile investments might be a very calm portfolio if the ups and downs of the investments it contains offset one another. There is a strong motivation, therefore, to diversify and build portfolios rather than put all of one’s eggs in a single basket. But working out the portfolio standard deviation is more difficult than working out the standard deviation for a single investment because these interconnected ups and downs (measured by covariance) must be considered. The investor must work this out for each portfolio before he or she can begin to rank alternative portfolios.
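
The covariance calculation that makes portfolio risk harder to compute than single-investment risk can be illustrated in a few lines. The sketch below assumes two hypothetical investments and a hypothetical (negative) correlation between them; it simply shows that the portfolio’s standard deviation depends on the covariance terms and can be lower than that of either investment on its own.

```python
import numpy as np

mu = np.array([0.10, 0.08])       # expected returns of two hypothetical investments
sigma = np.array([0.20, 0.18])    # their standard deviations
rho = -0.6                        # hypothetical correlation between their ups and downs

weights = np.array([0.5, 0.5])    # an equally weighted portfolio

cov = np.array([[sigma[0] ** 2,               rho * sigma[0] * sigma[1]],
                [rho * sigma[0] * sigma[1],   sigma[1] ** 2]])

port_mu = weights @ mu                          # weighted average expected return
port_sigma = np.sqrt(weights @ cov @ weights)   # requires the covariance terms

# With these numbers the portfolio is far less volatile than either investment alone.
print(f"portfolio mu = {port_mu:.3f}, portfolio sigma = {port_sigma:.3f}")
```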

If the expected return and risk for every possible portfolio is worked out and the results are plotted, the whole set of portfolios takes shape.Footnote 7 The plot always looks like the shape we have drawn in Fig. 2. It always has a curved or concave upper edge that slopes gradually upwards from left to right and it always has a somewhat ‘stub’ nose on its western boundary. It is this simpler representation of the entire set of investment opportunities that is the delineated problem space within which Markowitz’s algorithm can be applied. Markowitz was now poised to present the algorithm that would solve the investor’s main problem. Investors, of course, are not interested in most portfolios. Portfolios along the bottom edge of the set in Fig. 2, for example, have lower returns at the same level of risk than portfolios within the set. And portfolios within the set have lower returns at the same level of risk than portfolios along the top edge of the set. How could investors find the portfolios along the top edge of the set without computing the risk and reward for every possible portfolio? Markowitz (1952, 1959) developed the ‘critical line algorithm’ to solve this problem. Today, this approach is put into practice by telling the computer to find the allocations that should be made to each individual investment such that the expected return for the portfolio is maximised for a given level of risk. The computer will produce the results relatively quickly and we are left with just the set of portfolios that are ‘efficient’. This is Markowitz’ efficient set (Fig. 3). Without recasting every individual investment and portfolio as a (\(\mu ,\sigma\)) pair, there is no problem space within which to apply an algorithm that computes the highest portfolio \(\mu\) for a given \(\sigma\).
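
In practice, the step described above (‘telling the computer to find the allocations’) is often carried out with a generic constrained optimiser rather than Markowitz’s original critical line algorithm. The sketch below, using hypothetical means and covariances and SciPy’s general-purpose solver, maximises expected portfolio return subject to a risk ceiling and traces out a few points on the efficient set.

```python
import numpy as np
from scipy.optimize import minimize

mu = np.array([0.06, 0.10, 0.14])              # hypothetical expected returns
cov = np.array([[0.010, 0.002, 0.001],
                [0.002, 0.040, 0.010],
                [0.001, 0.010, 0.090]])        # hypothetical covariance matrix

def max_return_at_risk(target_sigma):
    """Maximise expected return subject to portfolio sigma <= target_sigma."""
    n = len(mu)
    cons = [
        {"type": "eq",   "fun": lambda w: np.sum(w) - 1.0},                        # fully invested
        {"type": "ineq", "fun": lambda w: target_sigma - np.sqrt(w @ cov @ w)},    # risk ceiling
    ]
    res = minimize(lambda w: -(w @ mu), x0=np.full(n, 1.0 / n),
                   method="SLSQP", bounds=[(0.0, 1.0)] * n, constraints=cons)
    return res.x, -res.fun

# One efficient portfolio per level of risk traces out the upper edge of the set.
for s in (0.12, 0.18, 0.25):
    w, r = max_return_at_risk(s)
    print(f"sigma <= {s:.2f}: weights = {np.round(w, 2)}, expected return = {r:.3f}")
```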

Fig. 2 The risk and reward of every possible individual investment and portfolio

Fig. 3 The ‘efficient’ set of portfolios and the operation of the Markowitz Algorithm

Obviously, investors can use the Markowitz algorithm to guide practical decisions but something more significant resulted from his work. It became commonplace to picture real investors selecting their portfolios, not only with the help of the algorithm, but in the manner described by it. Markowitz’ critical line algorithm became the basis for models of human choice and financial decision-making through which we came to see investors as making decisions in the manner described by the algorithm, ordering their portfolios based on (\(\mu ,\sigma\))-pairs and selecting from the set of efficient portfolios. It was not too long before theoretical models of the financial markets were describing general equilibrium conditions based on this type of ‘transformed’ human behaviour (e.g. Sharpe 1964; Lintner 1965a, b; Mossin 1966, with intertemporal extensions by Merton 1973). Markowitz’ delineated problem space became the foundation for a theory that not only transformed a practical computation problem by making it easier and quicker to perform. Much more than this, Markowitz transformed our picture of the financial markets and the behaviour of the people participating in them.

Before closing this section, we might ask whether predictive policing algorithms can reshape our thinking about how police officers make decisions. It is possible. The example we have been discussing certainly alerts us to the possibility. And there are other examples. Much formal analysis of human decision-making and artificial intelligence has its origins in the study of chess. As Ensmenger (2011, p. 5) explains, “In 1965, the Russian mathematician Alexander Kronrod, when asked to justify the expensive computer time he was using to play correspondence chess at the Soviet Institute of Theoretical and Experimental Physics, gave a prescient explanation: it was essential that he, as a premier researcher in the burgeoning new discipline of artificial intelligence (AI), be allowed to devote computer time to chess because chess was the ‘drosophila’ of artificial intelligence”. Drosophila is a genus of flies, the study of which forms the basis for genetic research (Kohler 1994). Ensmenger (2011) argues that the ‘chess as drosophila’ metaphor has subsequently shaped the study of human decision-makingFootnote 8 and AI. In many ways, the metaphor provided a reductionist simplification of the problem space by mapping elements of lived reality into chess play and back again. Algorithms can become models of human behaviour because they can become metaphors for human decision-making.

Predictive policing algorithms and problem space delineation

We have seen how Markowitz transformed and delineated the portfolio selection problem space before applying an algorithm to solve a problem within that newly delineated space. Figures 2 and 3 would hardly correspond to most people’s mental images of (1) the entirety of their investment opportunities; and (2) the complete set of ‘best’ portfolios. Predictive policing algorithms involve an analogous metaphorical-mapping process to delineate the problem space before developing and operationalising an algorithm. Just as Markowitz delineated the investment problem space to facilitate the analysis of the investment and portfolio problem based solely on the set of (\(\mu ,\sigma\)) pairs, Mohler et al. (2015) defined the predictive policing problem space as a grid consisting of a number of individual blocks where police resource allocation decisions are based on a set of (\(\mu_{n} ,\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)) pairs. The predicted crime rate, \(\lambda_{n} \left( t \right)\), in each block of the grid is based on just two factors, \(\mu_{n}\) and \(\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\), and the decision problem becomes a block prioritisation or ranking problem based on these two criteria.

In the PredPol model, \(\mu_{n}\) is the ‘background rate’ of crime for block \(n\). The background rate, though not directly a crime variable, is a function of crime volume. A higher crime volume results in a higher background rate (Mohler et al. 2015, p. 1402). Depending on the availability and precision of data, \(\mu_{n}\) can be determined over a smaller or larger area.Footnote 9 Mohler et al. (2015) settled on \(150 \times 150\) metre square blocks that together redefine the city as a ‘grid’ and delineate the problem space in a manner conducive to algorithmic computation.
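
The ‘city as a grid’ delineation is itself a simple computational step: each recorded incident is assigned to the cell that contains it, and counts per cell feed the background rate. The following sketch assumes incident locations are already expressed in metres (the coordinates are hypothetical) and uses the 150 m cell size reported by Mohler et al. (2015).

```python
from collections import Counter

CELL_SIZE = 150.0  # metres, following Mohler et al. (2015)

def cell_of(x, y, x0=0.0, y0=0.0):
    """Return the (column, row) index of the grid cell containing the point (x, y)."""
    return int((x - x0) // CELL_SIZE), int((y - y0) // CELL_SIZE)

# Hypothetical incident coordinates (easting, northing) in metres from the grid origin.
incidents = [(120.0, 80.0), (140.0, 160.0), (900.0, 450.0), (910.0, 470.0)]

counts = Counter(cell_of(x, y) for x, y in incidents)
print(counts)   # e.g. Counter({(6, 3): 2, (0, 0): 1, (0, 1): 1})
```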

The other component, \(\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\), is what Mohler et al. (2015, p. 1402) call a ‘triggering kernel’. The triggering kernel adjusts the background rate for recent crime. Two areas with the same background rate will not necessarily represent equivalent risks for future crime occurrence if one area has ‘newer’ crime while the other has ‘older’ crime. Just as an investment with an average (expected) return (\(\mu\)) has a greater (lesser) likelihood of producing an actual return that diverges from this expected return if the standard deviation (\(\sigma\)) is higher (lower), a particular block, n, within the grid has a greater likelihood of producing more crime for a given background rate (\(\mu_{n}\)) the higher the triggering kernel (\(\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)). The triggering kernel embeds an exponentially decaying function such that the increment that is added to \(\mu_{n}\) will be smaller (larger) if the crimes recorded within the block, n, are older (newer). One way to visualise this is as a set of lines for each block of the grid with the intercept for each block equated to the background rate, \(\mu_{n}\), with increasing probability of crime as \(\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\) increases. Some blocks ‘dominate’ others and it is to these that policing resources will be allocated. This is depicted in Fig. 4.

Fig. 4 Some blocks dominate others in terms of predicted crime

With crime in a city now reduced to a set of (\(\mu_{n} ,\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)) pairs, where each pair is associated with a block, n, of some size, we are in a position to use an algorithm to solve the equation:

$$\lambda_{n} \left( t \right) = \mu_{n} + \mathop \sum \limits_{{t_{n}^{i} < t}} \theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}.$$

The blocks are prioritised or ranked for allocation of police resources based on \(\lambda_{n} \left( t \right)\), so we need to estimate \(\lambda_{n} \left( t \right)\) for each block. The most interesting thing about this is that the parameters \(\omega\), \(\mu\) and \(\theta\) are not directly observable crime variables. What the algorithm really does in this case is to compute maximum likelihood estimates (MLE) of these parameters. Formally, it is an expectation–maximisation (EM) algorithm of the type originally analysed by Dempster et al. (1977). The algorithm computes the ‘best’ values for the variables of the model.Footnote 10 Then, using these values, we can produce an estimate of \(\lambda_{n} \left( t \right)\) for each of the (\(\mu_{n} ,\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)) pairs (the n blocks of the grid). The blocks with the highest \(\lambda_{n} \left( t \right)\) will be those with the highest background rates and the greatest number of recent crime events. Importantly, the basic predictive policing model is ‘block-bordered’ in the sense that the model does not predict the spread of crime across blocks but only the change in the expected rate of crime within blocks.Footnote 11 Remember that a block is an element of the grid problem space, not an actual city block.
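
The expectation–maximisation logic can be sketched for a single series of event times. The code below is a stylised illustration of how the ‘responsibilities’ (background versus triggered) are alternated with parameter updates; it ignores boundary corrections and other refinements, uses simulated event times, and should not be read as the estimator implemented in the commercial product.

```python
import numpy as np

def em_hawkes(times, T, n_iter=200):
    """Estimate (mu, theta, omega) for lambda(t) = mu + sum_i theta*omega*exp(-omega*(t - t_i))."""
    times = np.sort(np.asarray(times, dtype=float))
    n = len(times)
    mu, theta, omega = 0.5 * n / T, 0.3, 1.0            # crude starting values
    for _ in range(n_iter):
        # E-step: how likely is each event to be 'background' versus triggered by an earlier event?
        dt = times[None, :] - times[:, None]             # dt[i, j] = t_j - t_i
        trig = np.where(dt > 0, theta * omega * np.exp(-omega * np.clip(dt, 0, None)), 0.0)
        lam = mu + trig.sum(axis=0)                       # intensity evaluated at each event time
        p_bg = mu / lam
        p_trig = trig / lam
        # M-step: update the parameters from these responsibilities
        # (boundary effects at the end of the observation window are ignored in this sketch).
        mu = p_bg.sum() / T
        theta = p_trig.sum() / n
        omega = p_trig.sum() / (p_trig * dt).sum()
    return mu, theta, omega

# Hypothetical event times (days) for one block over a one-year window: scattered
# background events plus a few 'near-repeat' clusters.
rng = np.random.default_rng(1)
background = rng.uniform(0, 365, size=40)
clusters = np.concatenate([c + rng.exponential(2.0, size=3) for c in (60.0, 180.0, 300.0)])
print(em_hawkes(np.concatenate([background, clusters]), T=365.0))
```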

Prioritisation and problem space delineation

To improve the transparency and accessibility of algorithmic computation in the closely related field of hot spot forecasting, Lee & O (2020) designed an algorithm that can be run in Excel. Like PredPol, historical crime data are the basis for the algorithm’s computations. The length of the historical window that designers decide to use varies from case to case. It is generally recognised, however, that recentness of crime is important. Based on population heterogeneity theory and state dependence theory, Lee & O’s (2020) algorithm uses the previous twelve months of incident-based crime data to first compute the Poisson probability of crime for each monthFootnote 12 followed by a weighting process that adjusts for recentness of crime. The algorithm is the set of steps, performed by Excel, that are followed in order to identify and rank soon-to-be crime hot spots.

Like the other algorithms that we have discussed, Lee & O’s (2020) algorithm can only operate within the delineated problem space. Within this problem space, the decision problem becomes a prioritisation or ranking task. Once more, the city is transformed into a grid and grid blocks are ordered according to two criteria: (1) Poisson probability, denoted by \(\lambda\); and (2) a ‘boost’ component, which we shall denote by \(\beta\), that reflects recentness of crime. As such, within the city as a grid, blocks are prioritised based on \(\left( {\lambda , \beta } \right)\)-pairs. We have, once more, the ‘delineate and prioritise’ methodology of modern decision theory being applied to the decision problem. The problem space is delineated to make the problem tractable and computable, in this case by delineating the city as a grid, the blocks or cells of which are to be prioritised, ordered or ranked.
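
A stylised sketch of this structure is given below: a Poisson probability computed from twelve months of counts per cell, a ‘boost’ for recent crime and a ranking of cells on the resulting \(\left( {\lambda , \beta } \right)\)-pairs. The recency weights and the way the two criteria are combined here are illustrative assumptions, not the specification given by Lee & O (2020), and the monthly counts are hypothetical.

```python
import math

def poisson_prob_at_least_one(mean_monthly_count):
    """P(at least one crime next month) under a Poisson model with the given mean rate."""
    return 1.0 - math.exp(-mean_monthly_count)

def recency_boost(monthly_counts, weights=(0.5, 0.3, 0.2)):
    """Weighted count over the most recent three months (illustrative weights only)."""
    recent = monthly_counts[-len(weights):]
    return sum(w * c for w, c in zip(reversed(weights), recent))

# Hypothetical monthly counts (oldest to newest) for three grid cells.
cells = {
    "cell_A": [1, 0, 2, 1, 0, 1, 0, 1, 2, 0, 3, 4],   # rising recent activity
    "cell_B": [3, 2, 3, 2, 3, 2, 3, 2, 2, 1, 0, 0],   # high volume, going quiet
    "cell_C": [0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1],   # sparse throughout
}

scores = {}
for name, counts in cells.items():
    lam = sum(counts) / len(counts)                    # mean monthly rate
    scores[name] = (poisson_prob_at_least_one(lam), recency_boost(counts))

# Rank cells on the (lambda, beta)-style pair: probability first, recency boost as tie-breaker.
for name, (p, b) in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: P(crime) = {p:.2f}, recency boost = {b:.2f}")
```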

For decision problems that are approached by the ‘delineate and prioritise’ methodology, the criteria on which the ranking is based could be expected utility,Footnote 13 prospect value,Footnote 14 (\(\mu ,\sigma\)), (\(\mu_{n} ,\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)), \(\left( {\lambda , \beta } \right)\) depending on the specific decision problem the analyst is trying to solve. Fundamentally, however, the structure remains the same and working a natural problem into that structure is subject to a series of human decisions. It might, according to some authors, be time to reconsider the decisions that have been made up till now. Taylor and Ratcliffe (2020, p. 966), for example, suggest that practitioners consider shifting to larger spatial frames. Such would be a choice to change the delineation of the problem space. What is very interesting and too easily overlooked is that while the decision theory methodology of ‘delineate and prioritise’ is so fundamental to the formal depiction of decision problems, the decisions that people make are not usually constrained by the delineated problem space or the prioritisation or ranking process that is supposed to take place within it. Human users of predictive policing ‘see’ crime differently than their algorithm ‘sees’ it.Footnote 15

Referring to yet another predictive policing model, CommandCentral, Kirkpatrick (2017, p. 23) quotes a former police officer and consultant as saying, “It takes a seasoned police officer to look at the data and say, ‘Hey, I know what that is.’ It may be seemingly benign, but to that seasoned officer who knows the patterns, who knows the persons in that area, that sounds like ‘Bob’. ‘Bob’ used to do that and Bob just got out of prison”. Unfortunately, one of the systematic biases to which human decision-making is susceptible is the perception of patterns that are not there (Croson and Sundali 2005). As this example shows, the use of predictive policing algorithms might assist in overcoming this susceptibility (since the algorithm only ‘says’ a certain thing) but it might also reinforce it (since humans might interpret the results in particular ways). We turn now to the discussion of these types of issues.

Predictive policing, prioritisation processes and decision theory

During the performance of prioritisation tasks, human decision-makers are known to exhibit systematic patterns of behaviour. Using algorithms can (1) help mitigate the influence of these systematic patterns; (2) submerge the systematic patterns beneath a veneer of objectivity; and (3) introduce different systematic patterns. Why do human decision-makers exhibit systematic patterns of behaviour when prioritising alternatives, especially under conditions of risk and uncertainty? Daniel Kahneman and Amos Tversky, two Israeli psychologists, began researching what they called ‘heuristics and biases’ in the 1960s. They found that people use ‘fast thinking’ (see Kahneman 2011), intuitive rules of thumb or heuristics, when making decisions, especially those involving probability, and consequently are often led into error. Moreover, the errors are systematic, repeated in the same way time and again. The heuristics that Tversky and Kahneman (1974)Footnote 16 identified are the following:

  • Availability where the ease with which events can be recalled influences the decision-maker’s judgement of the likelihood of a future occurrence.

  • Representativeness where the likelihood that A (e.g. a person) belongs to a particular class B (e.g. a profession) is assessed by the degree to which A is representative of B. If a person’s description is representative of the stereotype for a librarian, he is judged more likely to be a librarian than a farmer, even though there might be many more farmers than librarians.

  • Anchoring and Adjustment where the decision-maker fails to shift sufficiently from an initial estimate. Interestingly, the anchor can be unrelated to the problem. A person who views a completely random low number produces lower initial estimates than a person who views a completely random high number (e.g. see Switzer and Sniezek 1991).

In the late 1970s, Kahneman and Tversky (1979) incorporated some of their most important observations into a new model of decision-making that they called prospect theory. In doing so, they identified several additional features of the human decision-making process that can lead to departures from the optimal ordering of alternatives. Importantly, while the heuristics listed above will lead to errors in the assessment of probabilities, prospect theory implies that decision-makers will produce sub-optimal solutions to prioritisation tasks even if they have made (or have been given) absolutely correct judgements of the possible outcomes and their likelihoods. The factors that can distort even the most accurate of judgements are as follows:

  • Reference Points where decision-makers do not assess the outcomes of risky decisions absolutely. They assess them relative to a reference point. As such, a positively valued outcome, such as a 10% decrease in burglaries, might be viewed as a loss if the reference point was 20%.

  • Changeable Risk Preferences when facing losses, people become risk seeking. They will take risks to turn losses around. When facing gains, people become risk averse. They will avoid risk to protect gains.

  • Loss Aversion a loss of some magnitude hurts more than the positive feeling of a gain of the same magnitude.

  • Probability Weighting people overweight unlikely outcomes and underweight more likely outcomes. If something has, say, a 5% chance of occurring, a rational decision-maker would accord the outcome a weight of 0.05. A prospect theory decision-maker, however, would accord the outcome a somewhat higher weighting, perhaps 0.08. The opposite happens at the other end of the probability distribution.

  • Diminishing Sensitivity where outcomes that are further away from the reference point influence the decision-maker less and less.

These heuristics and biases impact the decision-maker when he or she attempts to prioritise or rank alternatives. Each alternative is characterised by a range of outcomes each of which occurs with some probability. The decision-maker must therefore accurately assess both the outcomes and their likelihoods for each alternative as a necessary (but not sufficient) condition for making an optimal decision. The alternatives might be portfolios, parts of a city, or individual suspects. The outcomes might be expected investment returns, changes in the crime rate or crimes solved or prevented. Correctly assessing outcomes and probabilities, however, is not sufficient. During the decision-making process, outcomes and probabilities can be distorted by the factors listed above. Their combined influence is usually enough to cause at least some divergence from an optimal ordering of the alternatives. If a human decision-maker or team is tasked with prioritising parts of the city for police patrols, we might even expect sub-optimality to be introduced by factors seemingly removed from the actual problem itself. If, for example, the individual or team has not met a performance benchmark, being in the domain of losses prompts risk seeking in the assessment of the available information.Footnote 17 If one alternative offers a small chance of an outcome significant enough to recover the team’s standing relative to their performance benchmark, the risk seeking group is more likely to choose it.
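
These distortions can be expressed formally. The sketch below uses the value and probability-weighting functions of cumulative prospect theory with the median parameter estimates reported by Tversky and Kahneman (1992) (curvature 0.88, loss aversion 2.25, weighting parameter 0.61); the two alternatives being compared are hypothetical, and each has at most one non-zero outcome, so the simple separable combination used here is adequate.

```python
def value(x, alpha=0.88, loss_aversion=2.25):
    """Outcomes are valued relative to a reference point of zero; losses loom larger than gains."""
    return x ** alpha if x >= 0 else -loss_aversion * ((-x) ** alpha)

def weight(p, gamma=0.61):
    """Small probabilities are overweighted and large probabilities underweighted."""
    return p ** gamma / ((p ** gamma + (1 - p) ** gamma) ** (1 / gamma))

def prospect_value(outcomes):
    """Separable prospect value of a list of (payoff relative to reference point, probability) pairs."""
    return sum(weight(p) * value(x) for x, p in outcomes)

# Two hypothetical alternatives with the same expected value (+1):
sure_thing = [(1, 1.0)]                   # a certain small gain
long_shot = [(20, 0.05), (0, 0.95)]       # a small chance of a large gain

print(prospect_value(sure_thing), prospect_value(long_shot))
print(weight(0.05))   # roughly 0.13: the 5% chance is overweighted, as described above
```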

A theme that recurs in the literature is that algorithms inherit or reflect the biases of their creators (e.g. Soll et al. 2015, p. 6, Lepri et al. 2018, p. 614). How? An obvious answer is that bias comes from the data that are input into the algorithm (also see Christin et al. 2015; Joh 2017). This is quite different, though, from saying that the algorithm itself (i.e. its structure) inherits the bias of its creator. If the data were better, this bias would presumably disappear and yet if there was bias embedded in the very structure of the algorithm, it would remain. There must be deeper answers that we can provide to this important question.

Human decision-making shapes algorithms from their very inception. We can understand the choices that must be made during the development of an algorithmic approach to predictive policing as a series of prioritisation problems each of which can be impacted by heuristics and biases and each of which might result in a sub-optimal choice:

  • The prioritisation of alternative approaches, including an algorithm-based approach. The decision to use a data-driven, algorithm-based method for assisting with or making decisions is only one of the alternatives that might have been chosen.

  • Next, the problem space must be delineated to enable the model to work. This step alone shapes the whole process.

  • Once an algorithm-based approach has been decided upon and the problem space delineated, the prioritisation of alternative algorithms is another choice problem. For example, PredPol uses a ‘self-exciting point process’ together with the expectation–maximisation algorithm. This works within a problem space defined as a grid. Developers of other predictive policing models have made different choices.

  • If the predictive policing algorithm is to be built from the ground up, there are various interconnected micro-level prioritisation problems involved with this step. For example, the choice of variables to include or exclude. A human decision-maker will judge the likelihood that a person belongs to a certain class (e.g. criminal) by the degree to which that person fits the stereotype of a criminal. This representativeness heuristic can make its way into the structure of the algorithm. Also, because the algorithm designers are intent on facilitating computation, they approach the design task with the most easily recalled basis for computation in mind. This is the type of data that are available. The availability heuristic may shape the algorithm’s design from this point onwards. The ease with which certain algorithm types can be called to mind is another factor to consider. There are relatively few algorithm structures from which to choose and a designer’s background will shape the ease with which they can call to mind a certain structure. They then take steps to fit the problem into that structure.

  • If the predictive policing model assists with decision-making in the sense that its outputs are not unquestioningly implemented, the interpretation of the results and the ultimate decision to allocate police resources to locations is, of course, the overarching prioritisation process. In performing this task, the human decision-maker can overlay the outputs of any formal model with his or her own error-inducing biases. For example, if a model gave a probability of crime today in a particular location as 10%, the decision-maker would tend to overweight this probability estimate. Conversely, if the model gave a 90% chance, the human decision-maker would be inclined to underweight the estimate.Footnote 18

If the multi-layered process of algorithm selection and development is viewed as a series of prioritisation problems, all of which are potentially impacted by heuristics and biases, the ultimate prioritisation task cannot help but be shaped by the human decision-making process even if the ultimate prioritisation task is left entirely to the machine. In a policing context, there always have been and always will be humans whose task is to solve prioritisation problems. If one such problem is the allocation of police resources to parts of a city, we can leave it entirely to the police officers and administrators or make some other arrangement, up to and including the use of algorithms and AI. We know that human decision-making will produce systematic departures from optimal prioritisation. Algorithms do not do away with human decision-making. The question is whether, despite the type of development process we have just described, using algorithms to make or assist with decision-making yields better solutions to prioritisation tasks.

There is quite a lot of evidence to suggest that using algorithms to make or assist decisions can produce better decisions than humans acting alone. Surprisingly, some of this evidence is quite old. Meehl (1954), for example, found that algorithms consistently outperformed human decision-makers in various settings. One of the more extensive studies into this problem was conducted by Grove et al. (2000). They looked at 136 investigations and found that algorithms, on the average, outperformed human forecasters.Footnote 19 In policing and criminal justice contexts, predictive policing models have been shown to perform well against human crime forecasts (Mohler et al. 2015)Footnote 20 and sentencing algorithms have been shown to perform well against the decisions made by judges (Kleinberg et al. 2018). How can algorithms help to dampen the impact of the heuristics and biases that lead human decision-makers into sub-optimal solutions to prioritisation tasks? There are several possibilities:

  • An algorithm will not underweight (overweight) higher (lower) probabilities. Accurate probabilities available to the algorithm will not be distorted. Inaccurate probabilities will not be further distorted.

  • Algorithms do not react emotionally to gains and losses. A human decision-maker oscillates between the domain of gains and the domain of losses. In doing so, he or she prioritises the alternatives differently. For example, an officer who has recently been promoted might be inclined to take less risk, while an officer who has been overlooked might be inclined to take more. The algorithm has no aspirations or goals to serve as reference points and does not react to events outside of the problem space.

  • Algorithms are not subject to diminishing sensitivity. Because a reference point (and exponential decay function) would have to be built into an algorithm to replicate diminishing sensitivity, an algorithm will not tend to ‘ignore’ outcomes that are far removed from a reference point. Even if the algorithm does have a reference point of some sort, it is likely to be less volatile than a human decision-maker’s reference point, which can oscillate as circumstances change. As mentioned before, these include circumstances outside of the problem space.

  • Algorithms are not susceptible to loss aversion. For a human, a loss of some amount is felt more than a gain of the same amount. An algorithm does not experience the emotional reaction that prompts the human decision-maker to take more risk to avoid losses.

  • Algorithms use historical data but do not rely upon ‘ease of recall’ in judging probability. While a human decision-maker will be influenced by the ease with which crime events in particular areas can be recalled, algorithms are not susceptible to the availability heuristic. Sunstein (2019) attributes the success of ‘sentencing algorithms’ to this factor.

Sunstein (2019) appears confident that algorithms, while not overcoming discrimination, are not susceptible to cognitive biases. This seems to be too optimistic an assessment. Algorithms appear to be susceptible to the representativeness heuristic to the extent that their human developers allow it to influence variable choice during the construction of the algorithm. And the availability heuristic may enter by at least two avenues: (1) the ease with which certain datasets can be called to mind; and (2) the ease with which certain foundational algorithm structures can be called to mind. Considering that the whole process of algorithm construction is a series of prioritisation tasks, it would be incorrect to conclude that algorithms are completely de-humanised.

Algorithm aversion and future directions

Human decision-makers exhibit systematic patterns of behaviour when confronted with prioritisation tasks. Algorithms can be tasked with making these decisions for humans or with assisting humans to make better decisions. The human and the algorithm are not completely separable. Humans make prioritisation decisions during algorithm construction. In fact, the decision to use an algorithm rather than some other type of approach to solving the problem is one such decision. All algorithms are the product of human decision-making and, to some extent, it is to be expected that the particular ordering of alternatives that emerges from the algorithm (or with the help of the algorithm) will diverge to some degree from the purely optimal ordering. Interestingly, the orderings produced by algorithms (like the one developed by Markowitz) can become the benchmark for assessing the accuracy of human orderings. It should be clear from our discussion that this only has meaning within a delineated problem space.

As critiques of predictive policing begin to appear in the literature and, simply, as the model-building efforts evolve, a natural tendency might be to add more variables or more data to make the algorithms more objective. Perfect objectivity, though, is a mirage given that each one of these efforts would involve yet further prioritisation tasks performed by humans. Rather than push this beneath a façade of greater model complexity, the ways in which the algorithms are used could be given deeper consideration. Do the users of predictive policing take every algorithm output at face value? If the algorithm prioritises ten city blocks, is it these ten and only these ten that the human user also prioritises, or should the human user merely take the algorithm’s results under advisement? Human interpretation of the algorithm’s results might appear to open the door to further human biases, but it also brings with it the promise of generating better results.

In our discussion of PredPol we noted how it predicts the crime rate expected within city blocks, with its triggering kernel capturing ‘near-repeat’ or ‘contagion’ effects. Contagion, however, is a process devoid of agency, whereas criminal decision-making involves its own prioritisation tasks. A criminal whose choices are described by prospect theory finds himself in the domain of gains or losses at different times. If his actions within one location are noticed and reported, they will show up in the data for that location and contribute to a higher predicted crime rate for that location. Will the same criminal operate in the same area again? If the first action was successful, the criminal becomes more risk averse according to prospect theory. As such, he might be expected to drift away, perhaps to a neighbouring block, if he expects heightened police attention at the original location. If the first action was unsuccessful, this places the criminal in the domain of losses. His risk seeking might drive him to return to the same (or nearby) location to recoup the losses.Footnote 21

Perhaps the core feature of prospect theory and, consequently, much of behavioural economics, is that decision-makers assess outcomes as gains and losses against a reference point.Footnote 22 The algorithms are not influenced by an emotional response to gains and losses but both the users and developers of predictive policing algorithms are. The developers and advocates for predictive policing experience gains when the products are adopted and operate well (according to set indicators) but they experience losses when the products are discontinued or operate poorly or are found to suffer some flaws, such as producing biased results. As the debate continues, it is important for the developers of predictive policing algorithms to recognise their own response to an evolving situation. In the domain of gains, the adjustments they make to their products will be conservative. In the domain of losses, the adjustments will be shaped by risk seeking. The same pattern of behaviour is to be expected from those who advocate for, use and interpret the outputs of the algorithms.

The critics of predictive policing also have a potential behavioural quirk to consider. As the finance example shows, algorithms have been around for a long time. Their use has undoubted benefits, though the objectivity of the algorithm may be more apparent than real. Even so, people are often very suspicious of algorithms. Dietvorst et al. (2015) document a phenomenon called ‘algorithm aversion’. Even after seeing an algorithm outperform a human decision-maker, many people still prefer the human decision-maker’s forecasts. Neither the human forecaster nor the algorithm is perfect: while the algorithms in Dietvorst et al.’s (2015) study outperformed the human forecasters, their forecasts were not error-free. Upon seeing the algorithm make a mistake, people tend to lose confidence in it rather quickly, even though the algorithm still does better than the human decision-maker. In the sensitive area of predictive policing, where neither the algorithms nor the humans are perfect, there is a very real possibility that algorithm aversion will take hold before the use of predictive policing has fully played out.

We can end on a somewhat optimistic note. While we have been concentrating on algorithms, there is a broader theme emerging in the literature. This is the use of artificial intelligence (Joh 2018). The advantages of pairing human intelligence and artificial intelligence are being recognised in various fields and it is possible for policing to benefit considerably from this period of innovation. Some people now prefer to call artificial intelligence (AI) augmented intelligence (McKendrick 2019). ‘Augmented’ stresses the assistive role of AI, helping humans perform their tasks better or freeing humans from routine tasks so that they can concentrate on deeper ones. For example, during ‘reporting season’ on Wall Street thousands of companies report their annual financial accounts and journalists need to cover as many of these results as possible. Several years ago, the Associated Press left the task of reporting the mundane details of annual company accounts to AI and, as a result, was able to produce 4,400 news stories rather than the usual 300 (Kolbjørnsrud et al. 2016). This freed journalists to investigate more important matters and helped to highlight possible leads to follow up, leads that might never have been detected by a journalist sorting through large volumes of information. Such is the promise of AI and in law enforcement there are myriad potential applications. Identifying the potential might lead to greater acceptance of predictive policing and AI as part of police tradecraft (see Ratcliffe et al. 2019).

The augmentation of human intelligence has tremendous potential advantages. A dedicated chess computer, for example, can defeat a human chess champion, or at least tie the matches, much of the time. But a human chess champion paired with a computer, even one that is weak compared to the chess computer, can win most of the time! (McAfee 2010). The road to such positive pairings of humans and machines is littered with controversy and consternation. If predictive policing ever becomes completely embedded within law enforcement practice, we shall look back along the road travelled since 2010 and see the remnants of such controversy. Or we might never reach that point. An insurmountable obstacle may yet lie in the path of those who advocate the advantages of predictive policing, including the possibility that it just does not ‘work’. The fact that it is not possible to say right now which outcome is the most likely is testament to the multidimensionality of the contemporary debate.

Data availability

No dataset was used for this paper.

Notes

  1.

    More roundabout definitions are provided by Ferguson (2012) among others. Also see Brantingham (2017). Some predictive policing models are ‘place-based’, focusing on predictions of ‘where’, while others are ‘offender-based’, focusing on predictions of ‘who’.

  2.

    The most common choice is to delineate a city area as a grid and the elements to be ordered or prioritised are blocks or cells within that grid.

  3.

    PredPol is a commercial product developed by George Mohler. This is the Mohler from Mohler et al. (2015). While PredPol includes only a few variables, one product called HunchLab incorporates weather patterns, transport schedules and school cycles into its algorithms to move beyond crime types that cluster in time and space (Shapiro 2017).

  4.

    That this may generate ‘runaway’ feedback loops has been recognised by Ensign et al. (2018).

  5.

    The Newtonian Revolution projected a strong influence over the subsequent development of both the hard sciences and the social sciences. For example, the founding father of modern economics, Adam Smith, was certainly influenced by the Newtonian example (Hetherington 1983).

  6.

    In finance, the expected return (what the investor expects in the next time period) is estimated, in the first instance, by the average of past returns. If an investment has yielded an average return of 10%, that is a good starting point for what might be expected next year. If the standard deviation is 2%, then we might usually expect 10% ± 2% (i.e. somewhere between 8 and 12%).

  7.

    A portfolio can contain just a single investment. Individual investments can be thought of as ‘portfolios of one’.

  8.

    Chess was the experimental staging ground for much of Nobel Prize winning economist Herbert Simon’s work on administrative management and human decision-making (Newell and Simon 1972).

  9.

    That is, if we had crime data for every square metre of the city, each block of the grid could be 1 m² in size. Such micro-level data are not available, so the block size must be somewhat larger.

  10.

    The Markowitz algorithm produces the best portfolio for a given level of risk. The EM algorithm produces the best parameter values for the PredPol model.

  11.

    Beyond the transformation of city crime into a set of (\(\mu_{n} ,\theta \omega e^{{ - \omega \left( {t - t_{n}^{i} } \right)}}\)) pairs, there is a foundational metaphor that guides some of the human thinking behind the application of this predictive policing model. This foundational metaphor is ‘crime is a disease’. If crime is a disease, it can be modelled as a contagion process. Mohler et al. (2015, p. 1402) state, “\(\theta \omega e^{ - \omega t}\) models ‘near-repeat’ or ‘contagion’ effects in crime data”. The contagious disease metaphor as the basis for a model of crime has a very long history. In many cases, contagion has been used as the explanation or driving force for copycat crime. Some applications have been to violent crimes in general, aircraft hijacking, terrorism and school shootings (e.g. Berkowitz and Macaulay 1971; Hamblin et al. 1973; Midlarsky 1978; Midlarsky et al. 1980; Holden 1986; Towers et al. 2015) but also to crimes such as burglary (e.g. Johnson and Bowers 2004; Johnson et al. 2007; Ornstein and Hammond 2017).

  12.

    The Poisson distribution allows us to compute the probability of a given number of crime events if these occur independently of each other at a constant average rate. The calculation of the Poisson probability is straightforward. Lee and O (2020, p. 9) give the Excel formula that tells the software to compute it.

  13.

    von Neumann and Morgenstern’s (1947) expected utility theory.

  14.

    Kahneman and Tversky’s (1979) prospect theory.

  15.

    Something similar happened in artificial intelligence (AI) research based upon computer chess. The minimax algorithm that lies at the heart of computer chess programs ‘sees’ the game differently from how the human player sees it. As Ensmenger (2011, p. 23) explains, the computer and the human are playing entirely different games. Ensmenger (2011, p. 23) goes on to say, “Chess, as it was played by humans, turned out to be an even more complex cognitive activity than was imagined by the early artificial intelligence researchers. As a result, computer chess came to be seen as increasingly distinct from human chess”.

  16.

    Also see Kahneman et al. (1982).

  17.

    Reference points can be goals or aspirations (Heath et al. 1999).

  18.

    Also, if the interpreter sees crime in a particular way (e.g. as described by a contagion process), then the interpretation of the model’s results may be shaped by this. We saw in a previous section how the PredPol model sees recent crime as contributing to more crime in a particular block while a contagion process implies the spread of crime. The model could yield a high prioritisation for block 10 and the interpreter might see this not only as an indication about future crime in block 10 but as an indication of crime in the neighbouring blocks.

  19.

    It must be noted that algorithms vary in terms of their sophistication and some of the algorithms included in these studies are simple forecasting rules or formulas while others are embedded within models of behaviour in particular contexts.

  20.

    A ‘better’ performance in Mohler et al.’s (2015) assessment is that the algorithm resulted in predictions of 1.4–2.2 times as much crime compared to a human crime analyst and resulted in an average 7.4% reduction in crime volume as a function of patrol time. However, also see Saunders et al. (2016), whose results were not as encouraging.

  21.

    Mohler does not appear to have quite so nuanced a view of criminal behaviour. He says, “If someone breaks into a car in a certain neighbourhood and is successful, they’ll often return to that same neighbourhood a few days later and break into another car” (Kirkpatrick 2017, p. 23). The cases not covered by the adverb ‘often’ are possibly those where risk aversion dominates the criminal’s subsequent actions.

  22.

    Phillips and Pohl (2014) discuss terrorism from a prospect theory perspective.

References

  1. Berk R, Heidari H, Jabbari S, Kearns M, Roth A (2018) Fairness in criminal justice risk assessments: the state of the art. Sociol Methods Res 50:3–44

  2. Berkowitz L, Macaulay J (1971) The contagion of criminal violence. Sociometry 34:238–260

  3. Black M (1962) Models and metaphors. Cornell University Press, Ithaca

  4. Brantingham PJ (2017) The logic of data bias and its impact on place based predictive policing. Ohio State J Crim Law 15:473–486

  5. Brantingham PJ, Valasik M, Mohler GO (2018) Does predictive policing lead to biased arrests? Results from a randomised control trial. Stat Public Policy 5:1–6

  6. Christin A, Rosenblatt A, Boyd D (2015) Courts and predictive algorithms. Data Civil Rights Primer

  7. Croson R, Sundali J (2005) The Gambler’s Fallacy and the hot hand: empirical data from casinos. J Risk Uncertain 30:195–209

  8. Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc 39:1–38

  9. Derman E (2004) My life as a quant: reflections on physics and finance. Wiley, Hoboken

  10. Dietvorst BJ, Simmons JP, Massey C (2015) Algorithm aversion: people erroneously avoid algorithms after seeing them err. J Exp Psychol 144:114–126

  11. Ensign D, Friedler SA, Neville S, Scheidegger C, Venkatasubramanian S (2018) Runaway feedback loops in predictive policing. Proc Mach Learn Res 81:1–12

  12. Ensmenger N (2011) Is chess the Drosophila of artificial intelligence? A social history of an algorithm. Soc Stud Sci 42:5–30

  13. Ferguson AG (2012) Predictive policing and reasonable suspicion. Emory Law J 62:259–325

  14. Ferguson AG (2016) Policing predictive policing. Wash Univ Law Rev 94:1109–1190

  15. Ferguson AG (2018) Illuminating black data policing. Ohio State J Crim Law 15:503–526

  16. Gorr WL, Lee Y (2015) Early warning system for temporary crime hot spots. J Quant Criminol 31:25–47

  17. Grove WM, Zald DH, Lebow BS, Snitz BE, Nelson C (2000) Clinical versus mechanical prediction: a meta-analysis. Psychol Assess 12:19–30

  18. Hamblin RJ, Jacobsen RB, Miller JLL (1973) A mathematical theory of social change. Wiley, New York

  19. Heath C, Larrick RP, Wu G (1999) Goals as reference points. Cogn Psychol 38:79–109

  20. Hesse MB (1966) Models and analogies in science. University of Notre Dame Press, Notre Dame

  21. Hetherington NS (1983) Isaac Newton’s influence on Adam Smith’s natural laws in economics. J Hist Ideas 44:497–505

  22. Hilbert M, Ahmed S, Cho J, Liu B, Luu J (2018) Communicating with algorithms: a transfer entropy analysis of emotions-based escapes from online echo chambers. Commun Methods Meas 12:260–275

  23. Holden RT (1986) The contagiousness of aircraft hijacking. Am J Sociol 91:874–904

  24. Joh EE (2017) Feeding the machine: policing, crime data & algorithms. William Mary Bill Rights J 26:287–302

  25. Joh EE (2018) Artificial intelligence and policing: first questions. Seattle Univ Law Rev 41:1139–1144

  26. Johnson SD, Bowers KJ (2004) The burglary as clue to the future: the beginnings of prospective hot-spotting. Eur J Criminol 1:237–255

  27. Johnson SD, Bernasco W, Bowers KJ, Elffers H, Ratcliffe J, Rengert G, Townsley M (2007) Space-time patterns of risk: a cross national assessment of residential burglary victimisation. J Quant Criminol 23:201–219

  28. Kahneman D (2011) Thinking, fast and slow. Farrar, Straus and Giroux, New York

  29. Kahneman D, Tversky A (1979) Prospect theory: an analysis of decision under risk. Econometrica 47:263–291

  30. Kahneman D, Slovic P, Tversky A (eds) (1982) Judgement under uncertainty: heuristics and biases. Cambridge University Press, Cambridge

  31. Kemper J, Kolkman D (2019) Transparent to whom? No algorithmic accountability without a critical audience. Inf Commun Soc 22:2081–2096

  32. Kirkpatrick K (2017) It’s not the algorithm, it’s the data. Commun ACM 60:21–23

  33. Klamer A, Leonard T (1994) So what’s an economic metaphor? In: Mirowski P (ed) Natural images in economic thought: markets read in tooth and claw historical perspectives on modern economics. Cambridge University Press, Cambridge, pp 20–52

  34. Kleinberg J, Lakkaraju H, Leskovec J, Ludwig J, Mullainathan S (2018) Human decisions and machine predictions. Q J Econ 133:237–293

  35. Kohler R (1994) Lords of the fly: drosophila genetics and the experimental life. University of Chicago Press, Chicago

  36. Kolbjørnsrud V, Amico R, Thomas RJ (2016) How artificial intelligence will redefine management. Harvard Business Rev

  37. Lakoff G, Johnson M (1980) Metaphors we live by. Chicago University Press, Chicago

  38. Leatherdale WH (1974) The role of model, analogy and metaphor in science. North Holland, Amsterdam

  39. Lee Y, O S (2020) Flag and boost theories for hot spot forecasting: an application of NIJ’s real-time crime forecasting algorithm using Colorado Springs crime data. Int J Police Sci Manag 22:4–15

  40. Lepri B, Oliver N, Letouze E, Pentland A, Vinck P (2018) Fair, transparent and accountable algorithmic decision-making processes. Philos Technol 31:611–627

  41. Lintner J (1965a) The valuation of risk assets and the selection of risky investments in stock portfolios and capital budgets. Rev Econ Stat 47:13–37

  42. Lintner J (1965b) Security prices, risk and maximal gains from diversification. J Financ 20:587–615

  43. Lowenstein R (2000) When genius failed: the rise and fall of Long-Term Capital Management. Random House, New York

  44. Lum K, Isaac W (2016) To predict and serve? Significance 13:14–19

  45. Markowitz H (1952) Portfolio selection. J Financ 7:77–91

  46. Markowitz H (1959) Portfolio selection: efficient diversification of investment. Wiley, New York

  47. McAfee A (2010) Did Garry Kasparov stumble into a new business process model? Harvard Business Rev

  48. McCloskey DN (1983) The rhetoric of economics. J Econ Lit 31:434–461

  49. McCloskey DN (1985) The rhetoric of economics. University of Wisconsin Press, Madison

  50. McKendrick J (2019) Why AI should rightfully mean augmented intelligence, not artificial intelligence. Forbes, June 29

  51. Meehl PE (1954) Clinical versus statistical prediction: a theoretical analysis and review of the literature. University of Minnesota Press, Minneapolis

  52. Meijer A, Wessels M (2019) Predictive policing: review of benefits and drawbacks. Int J Public Adm 42:1031–1039

  53. Merton RC (1973) An intertemporal capital asset pricing model. Econometrica 41:867–887

  54. Midlarsky MI (1978) Analysing diffusion and contagion effects: the urban disorders of the 1960s. Am Polit Sci Rev 72:996–1008

  55. Midlarsky MI, Crenshaw M, Yoshida F (1980) Why violence spreads: the contagion of international terrorism. Int Stud Q 24:341–365

  56. Mirowski P (1989) More heat than light: economics as social physics, physics as nature’s economics. Cambridge University Press, Cambridge

  57. Mohler G, Short MB, Malinowski S, Johnson M, Tita GE, Bertozzi AL, Brantingham PJ (2015) Randomized controlled field trials of predictive policing. J Am Stat Assoc 110:1399–1411

  58. Moravec ER (2019) Do algorithms have a place in policing? The Atlantic, September 5

  59. Moses LB, Chan J (2018) Algorithmic prediction in policing: assumptions, evaluation, and accountability. Polic Soc 28:806–822

  60. Mossin J (1966) Equilibrium in a capital asset market. Econometrica 34:768–783

  61. Newell A, Simon HA (1972) Human problem solving. Prentice Hall, Englewood Cliffs

  62. Ornstein JT, Hammond RA (2017) The burglary boost: a note on detecting contagion using the Knox test. J Quant Criminol 33:65–75

  63. Ortony A (1979) Metaphor and thought. Cambridge University Press, Cambridge

  64. Phillips PJ (2007) Mathematics, metaphors and economic visualisability. Q J Austrian Econ 10:281–299

  65. Phillips PJ, Pohl G (2014) Prospect theory and terrorist choice. J Appl Econ 17:139–160

  66. Ratcliffe JH, Taylor RB, Fisher R (2019) Conflicts and congruencies between predictive policing and the patrol officer’s craft. Polic Soc

  67. Richardson R, Schultz JM, Crawford K (2019) Dirty data, bad predictions: how civil rights violations impact police data, predictive policing systems and justice. N Y Univ Law Rev 94:192–233

  68. Ricoeur P (1977) The rule of metaphor. University of Toronto Press, Toronto

  69. Saunders J, Hunt P, Hollywood JS (2016) Predictions put into practice: a quasi-experimental evaluation of Chicago’s predictive policing pilot. J Exp Criminol 12:347–371

  70. Shapiro A (2017) Reform predictive policing. Nature 541:458–460

  71. Sharpe WF (1964) Capital asset prices: a theory of market equilibrium under conditions of risk. J Financ 19:425–442

  72. Smith M (2018) Can we predict when and where a crime will take place? BBC News, October 30. https://www.bbc.com/news/business-46017239

  73. Soll JB, Milkman KL, Payne JW (2015) Outsmart your own biases. Harvard Business Rev

  74. Sunstein CR (2019) Algorithms, correcting biases. Soc Res 86:499–511

  75. Switzer FS III, Sniezek JA (1991) Judgement processes in motivation: anchoring & adjustment effects on judgement & behaviour. Organ Behav Hum Decis Process 49:208–229

  76. Taylor RB, Ratcliffe JH (2020) Was the Pope to blame? Statistical powerlessness and the predictive policing micro-scale randomised control trials. Criminol Public Policy 19:965–995

  77. Towers S, Gomez-Lievano A, Khan M, Mubayi A, Castillo-Chavez C (2015) Contagion in mass killings and school shootings. PLoS ONE 10:e0117259

  78. Tversky A, Kahneman D (1974) Judgement under uncertainty: heuristics and biases. Science 185:1124–1131

  79. von Neumann J, Morgenstern O (1947) Theory of games and economic behaviour, 2nd edn. Princeton University Press, Princeton

Funding

No funding was received for this paper.

Author information

Corresponding author

Correspondence to Peter J. Phillips.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

About this article

Cite this article

Phillips, P.J., Pohl, G. Algorithms, human decision-making and predictive policing. SN Soc Sci 1, 109 (2021). https://doi.org/10.1007/s43545-021-00109-6

Keywords

  • Algorithms
  • Predictive policing
  • Human choice
  • Decision-making
  • Decision theory