Neighbourhood effects and their implications for analytics and targeting

  • Tim Drye
Paper

DOI: 10.1057/dddmp.2011.33

Cite this article as:
Drye, T. J Direct Data Digit Mark Pract (2011) 13: 119. doi:10.1057/dddmp.2011.33

Abstract

Currently, there is a renewed interest in utilizing the power of interactions within consumer relationships to increase the effectiveness of marketing campaigns. These effects have long been seen in small-scale experimental units, and it is now relatively easy to track ‘virtual’ digital relationships, particularly through social media channels. This article examines the implications of the presence of strong off-line ‘real’ relationships; shows evidence of these effects at the local level in attitudinal research; examines a case study of the implications within the Comic Relief supporter database; and suggests ways that the application of analytics could be improved by taking cognisance of these interaction effects.

Keywords

consumer statistical modelling, exploratory data analysis, consumer segmentation, non-linear methods, multi-channel marketing, scaling laws, self-similarity

The evidence for the presence of strong ‘real’ neighbourhood effects

To be human is to have relationships with others

It seems almost vacuous to look for evidence of the presence of neighbourhood effects — after all, ‘Keeping up with the Joneses’ is part of the lingua franca, having originated in a cartoon strip of that name by Arthur R. ‘Pop’ Momand first published in 1913 by Associated Newspapers. To be human is to relate to friends, family, work colleagues and neighbours, but for the purpose of a complete argument it makes sense to consider the strength of these effects alongside other influences. As current evidence, recent data awaiting publication by the British Population Survey1 shows that 64.4 per cent of 1,995 representative household interviewees claimed that Friends and Family would be an influence on their purchases compared to 2.66 per cent of the interviewees who indicated that recommendations on their social media networks would influence their purchases.

Recently, the attention has been all about the power of relationships in the online environment — how many friends can be acquired on Facebook, can a YouTube clip or email message go viral, how many people are following on Twitter? The recent article by Weinberg and Burger,2 with references therein, describes an approach to valuing the business that is generated via one customer's online interactions with others through their social media networks. As with many things, the digital environment has not invented relationships but has simply made some of them more transparent and long distance. For example, the much-maligned offer of free flights by Hoover in August 1992 — to use a digital phrase — went ‘viral’ because the offer was so much talked about. Indeed, brand marketers often talk of the power of the ‘idea’ to translate across any medium, online or off-line, and then get talked about.3 These approaches predated the advent of the internet. As an example, see the recent Hollywood film ‘The Joneses’ (2010), in which an artificial ‘family’ unit is placed in a leafy neighbourhood to sell products to its friends; the plot line is all too spookily plausible.
Local connections generate large-scale geographic patterns

More statistical evidence for the power and the implications of real networks of consumer relationships is available from the open source US project ‘Where's George’.4 This project enables participants to track dollar bills by entering reference numbers and their current locations. Graphics of the progress of dollar bills illustrate the clustering of behaviour and the presence of ‘Lévy’ flights, which have been found to be very common descriptions of human mobility.5, 6 These Lévy flights consist of lots of small local hops, intermingled with long-range jumps to new locations: for example, your movement around your own house, followed by a drive to work, where you move around your workplace, before travelling elsewhere. This picture develops further once we notice that other time scales have their own types of hop; at weekends the hops are to a leisure activity, and at the longer time scale of a month or a year there are hops to holiday locations. These different types of movement, and the way they change across geographic scales, are a common characteristic of dynamic systems, be they human or otherwise, in which local interactions exist. There are a number of properties that characterize these types of non-linear behaviour, but we will focus on three: ‘clustering’, the tendency for characteristics to be variegated, ‘clumpy’ rather than smooth; ‘long tails’, where unusual properties are far more common than a typical linear model would predict; and ‘self-similarity’, where a system looks the same across a wide range of different scales of time and/or geography. We will demonstrate that these effects can all be identified within a consumer database.
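To make the pattern concrete, the following short sketch (purely illustrative, not drawn from the studies cited above; all parameters are arbitrary) simulates a heavy-tailed, Lévy-flight-style walk alongside an ordinary Gaussian walk. The contrast between the median and the largest step lengths shows the mixture of small local hops and occasional long-range jumps described here.

```python
import numpy as np

rng = np.random.default_rng(42)

def walk(step_lengths):
    """Turn a sequence of step lengths into a 2-D path with random headings."""
    angles = rng.uniform(0.0, 2.0 * np.pi, len(step_lengths))
    steps = np.column_stack([step_lengths * np.cos(angles),
                             step_lengths * np.sin(angles)])
    return np.cumsum(steps, axis=0)

n = 5000
levy_path = walk(1.0 + rng.pareto(1.5, n))            # heavy-tailed (Levy-like) steps
gauss_path = walk(np.abs(rng.normal(1.0, 0.3, n)))    # steps of a typical size

for name, path in [("Levy-like", levy_path), ("Gaussian", gauss_path)]:
    lengths = np.hypot(*np.diff(path, axis=0).T)      # recover individual step sizes
    print(f"{name}: median step {np.median(lengths):.1f}, "
          f"largest step {lengths.max():.1f}")
```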

The implications of ‘strong’ neighbourhood interactions

Neighbourhood interactions require non-linear models

If, as seems self-evident, neighbourhood interactions between consumers exist, do they matter? Most analytical methods start from the premise that they do not (see refs 7 and 8 for background on the typical methods of analysis used within a marketing context). In nearly every case, a model will be devised that assumes that each individual in an analysis set has a series of different attributes — be they demographic, transactional or attitudinal — that can be used to predict their propensity to make particular purchases or exhibit other behaviours. Implicit in these models is that the inevitable variations around the model predictions are random and independent of all other elements of the model. This independence assumption is a crucial element of the model that is taken for granted. So, in this context, ‘strong’ means that neighbourhood interactions are sufficiently strong to make the assumption of independence practically invalid.

To clarify, let us make the distinction between individual interactions and neighbourhood ones more explicit. In a typical model, we include factors that refer to attributes of the individual; in complex analyses this can grow to a large number of attributes, and we might also ‘linearize’ properties of a neighbourhood and treat them as individual characteristics. For example, we might estimate an individual's salary from summarized data for their locality, be it their postcode or some larger geographical unit. This ‘linearized’ approach is used widely within a group of methods classified as multi-level modelling or hierarchical regression.9 Effectively, these approaches assume that the system reaches a steady-state equilibrium, located at some optimum, often taken to be the maximum likelihood of occurrence.
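As a concrete illustration of this ‘linearized’ treatment of neighbourhood information, a minimal sketch follows (the column names and figures are hypothetical, not data from the study): an area-level summary is simply merged onto each individual record and then used as an ordinary predictor in a linear model.

```python
import pandas as pd

# Illustrative individual-level records (all names and values are hypothetical)
customers = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "postcode_sector": ["CB1 2", "CB1 2", "PE19 1", "PE19 1"],
    "spend": [120.0, 85.0, 40.0, 60.0],
})

# Area-level summary, for example an estimated average salary for the sector
sector_summary = pd.DataFrame({
    "postcode_sector": ["CB1 2", "PE19 1"],
    "sector_avg_salary": [38000, 29000],
})

# 'Linearizing' the neighbourhood: the area attribute becomes just another
# column on the individual record, ready to feed a conventional linear model.
modelling_frame = customers.merge(sector_summary, on="postcode_sector", how="left")
print(modelling_frame)
```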

In this article, what we mean by a neighbourhood interaction is a factor in the model that reflects the tendency for individuals to have the same opinion, attitude or response as their neighbours. These models become non-linear because they rely on an estimation of the outcome variable, in order to predict the outcome variable — since estimates of neighbours’ outcomes are required to construct a complete estimate of each individual's outcome. These non-linear properties (among other things) generate feedback loops, both positive and negative, and make the assumption about steady-state equilibrium untenable. A consequence of the feedback loops is the potential for very small effects to cause large-scale changes in outcome. The lack of stable optimums and the multitude of potential relationships make these models computationally difficult.

Rather than giving up because no final, definitive solution is available, other fields have found that some relatively straightforward techniques identify the likely characteristics of solutions. These help to illuminate the consequences of these interactions, even though a detailed computation is practically impossible, and provide a guide towards the most appropriate ways to linearize the analysis so that solutions become tractable while the important characteristics of the behaviour are preserved. It has to be stressed that these methods act as an exploratory tool: they do not provide a detailed understanding of the local behaviour, but they do indicate the types of behaviour present. As a result, they can be used as part of the exploratory data analysis phase of a statistical project, in order to indicate the presence of non-linear behaviour, guide the selection of appropriate predictor variables and perhaps give good estimates of the size of clusters to be used within either detailed cluster analysis or regression techniques.
Non-linear systems display scaling laws

Some detailed analyses of small networks (circa 100 participants) have been conducted.10 However, to examine large databases of customers (over 100,000) it is, as explained above, computationally impossible to track all the potential interactions, but various studies of the physical behaviour of matter provide clues. The methods adopted for these complex physical systems suggest some useful heuristics for identifying the consequences of comparable individual and relational interactions. Studies of the statistical physics of soft matter and complex fluids show what the implications are when the strength of neighbourhood interactions is similar to the strength of individual-level interactions; such systems are called ‘critical’ and show the presence of clusters and shapes of typical sizes and duration.11 One approach is to use scaling laws of the form f(r) ∝ r^d; equations of this type have been shown to allow the classification of the three properties we are interested in — clustering, long tails and self-similarity — which will be referred to later. A fruitful quantity to examine is the number of items within a typical structure of relationships as the scale of the measurement is increased. The scaling law then becomes N ∝ r^d, where N is the number of individuals within the structure, r is the geographical scale of the assessment and d is termed the ‘fractal dimension’. As a guide for interpreting structures in two dimensions, the area of a circle has a ‘fractal dimension’ of 2, the length of a line has a ‘fractal dimension’ of 1 and unstructured dots have a fractal dimension of 0.
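A minimal sketch of how such a scaling law can be estimated in practice is given below (on synthetic points with arbitrary parameters, purely to show the mechanics): count individuals per occupied grid square at a series of doubling cell sizes, then read the ‘fractal dimension’ d off the slope of a log-log fit of N against r.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic point set: clusters of points standing in for supporter locations.
# This is illustrative data only; the resulting d has no real-world meaning.
centres = rng.uniform(0, 100_000, size=(200, 2))
points = np.vstack([c + rng.normal(0, 300, size=(50, 2)) for c in centres])

scales = 100 * 2 ** np.arange(9)                      # 100 m, 200 m, ..., 25,600 m
typical_counts = []
for r in scales:
    cells = np.floor(points / r).astype(int)          # grid square of each point
    _, counts = np.unique(cells, axis=0, return_counts=True)
    typical_counts.append(np.median(counts))          # typical occupied-square count

# Fractal dimension d from the scaling law N ~ r^d (slope on log-log axes)
d, intercept = np.polyfit(np.log(scales), np.log(typical_counts), 1)
print(f"estimated fractal dimension d ≈ {d:.2f}")
```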

So can we find evidence within a consumer database of these non-linear characteristics reminiscent of the behaviour within more scientific environments?

Identifying the impact of neighbourhood effects on consumer databases

Scaling laws can be found in consumer databases

In other projects, using a number of different consumer databases that are not in the public domain, we have identified the presence of clustering of customers at different geographic levels. We show here the results that were obtained when examining the Comic Relief database of supporters. For those who are unfamiliar with it, Comic Relief conducts an annual fund-raising campaign in March, focused on raising money for humanitarian relief work and using a televised evening of celebrity performance and engagement as its trigger. This is a particularly interesting example, as Comic Relief has a broad reach of communication, predominantly via the BBC, to support its high-profile annual campaigns, and support and engagement have no geographical constraints. Nevertheless, neighbourhood ‘clustering’, long tails and self-similarity are all found to be present.

When studying the effect of neighbourhood interactions within consumer databases, it is necessary to identify geographic units that classify a location at different geographic scales. In the UK, it is possible to use a combination of the postcode geography generated by the Royal Mail12 and the geographic units used by the Office for National Statistics13 to distribute census and other official statistics. This provides six levels, each of which comprehensively covers the UK: Postcodes, Output Areas, Lower and Middle Layer Super Output Areas, Postcode Districts and Postcode Areas. The scope of the geographic units ranges from circa 1.4 M residential postcodes to 120 postcode areas. This provides an initial basis for exploration, to compare different scales. However, these geographic units have some structure of their own in the way they are derived: the postcodes are designed around the Royal Mail distribution network, and the framework of Output Areas has been generated by homogenizing, as far as practical, the types of house within any given area.

The more careful exploration presented here uses a series of square grids measured in metres (m); at each successive level the scale is doubled, with the lowest scale derived from the 100 m grid references distributed by the Office for National Statistics for the population mid-points of each unit postcode. The next level consists of a 200 m grid, the third a 400 m grid and so on up to a large-scale grid of 25,600 m. This series of grids is designed so that any structure appearing in the results is not an artefact of the reference geography used.
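In code, the construction of this grid hierarchy is straightforward. The sketch below (hypothetical column names and illustrative coordinates, not the actual supporter file) snaps each supporter's 100 m grid reference to successively doubled square sizes and counts supporters per occupied square.

```python
import pandas as pd

# Hypothetical input: one row per supporter with the 100 m grid reference
# (easting/northing in metres) of their postcode's population mid-point.
supporters = pd.DataFrame({
    "easting":  [530400, 530400, 530500, 547800, 612300],
    "northing": [181200, 181300, 181200, 265900, 308700],
})

grid_sizes = [100 * 2 ** k for k in range(9)]     # 100 m up to 25,600 m

counts_by_scale = {}
for size in grid_sizes:
    # Snap each supporter to the south-west corner of its grid square
    cell = (supporters[["easting", "northing"]] // size) * size
    counts = cell.groupby(["easting", "northing"]).size()
    counts_by_scale[size] = counts                # supporters per occupied square

for size, counts in counts_by_scale.items():
    print(size, "m grid:", counts.describe()[["50%", "75%", "max"]].to_dict())
```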
Exploratory Data Analysis reveals data descriptors that scale

If the Comic Relief database under examination showed perfectly linear behaviour, then the distribution of counts of supporters should theoretically follow a Poisson distribution at each geographic scale and have a ‘fractal dimension’ of 0 at low levels of geography. In the author's experience, across a number of consumer databases, this has never been the case. However, the data can often be approximated by an over-dispersed Poisson distribution, generated by assuming that the distribution consists of groups of customers, plus a long tail of geographic units whose counts of supporters are much higher than normal. Some of this long tail is found to consist of distribution sites and/or unusual concentrations such as student halls of residence, but the remainder represents part of the evidence for the presence of non-linear characteristics.
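A quick way to check for this departure from Poisson behaviour is to compare the variance-to-mean ratio of the counts per square with the value of 1 that a Poisson distribution implies, and to measure how heavy the upper tail is. The sketch below uses deliberately ‘clumpy’ synthetic counts in place of a real database; the distributional choice and parameters are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Illustrative stand-in for counts of supporters per occupied 200 m square
observed = rng.negative_binomial(n=2, p=0.15, size=5000)   # deliberately over-dispersed

mean, var = observed.mean(), observed.var(ddof=1)
print(f"variance-to-mean ratio: {var / mean:.1f}  (a Poisson distribution gives ~1)")

# Tail check: how often do counts exceed what a Poisson of the same mean
# would regard as extremely unlikely?
threshold = stats.poisson(mean).ppf(0.999)
print(f"share of squares above the Poisson 99.9th percentile: "
      f"{(observed > threshold).mean():.3f}  (Poisson expectation ≈ 0.001)")
```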

One approach would be to fit a re-scaled Poisson distribution carefully at each geographic scale and then estimate a scaling law for the fitted parameters. This is a delicate process, and it risks missing the point that we are conducting an exploratory exercise using a simplified model. Instead, we present here results from a non-parametric approach that makes no particular assumptions about the underlying distributions at each geographic level.

As can be seen from Figures 1 and 2, the percentiles of each distribution can be identified; the current study focuses on the 50th, 75th, 90th and 95th percentiles of the distributions of supporter counts. Inevitably, as the grids move to higher geographical scales, the count within each square increases; what we want to examine here is the relationship between the percentiles of these counts and the geographical scale, as shown in Table 1. The underlying structure of scaling laws can be identified by looking at logarithmic plots of scale against the values of the percentiles, as shown in Figure 3. Interpretation of Figure 3 indicates the three characteristics of non-linear systems. The long tails are shown by the high values of the 90th and 95th percentiles, and the clustering is implied by the relatively low value of the 50th percentile. However, the most striking effect is the consistent gradient of each of the different percentiles across a wide range of scales. Without making any assumptions about the nature of the underlying distribution, the fact that these lines are approximately parallel indicates that the distribution looks the same at each scale, with only a constant factor changing.
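The same calculation can be reproduced directly from the values in Table 1 below. The roughly equal gradients of log(percentile) against log(scale) are the numerical counterpart of the parallel lines in Figure 3, with the gradient playing the role of the ‘fractal dimension’ discussed later (a minimal sketch; the data are those published in Table 1).

```python
import numpy as np

# Values from Table 1: percentiles of supporter counts per grid square
scales = np.array([100, 200, 400, 800, 1600, 3200, 6400, 12800, 25600])
percentiles = {
    "50th": [2, 2, 4, 17, 78, 306, 1112, 4609, 13511],
    "75th": [7, 10, 20, 77, 326, 1324, 5138, 19959, 65496],
    "90th": [12, 54, 113, 342, 1121, 3788, 13535, 50083, 148128],
    "95th": [16, 92, 229, 668, 2051, 6265, 23136, 61225, 218673],
}

# Gradient of log(percentile) against log(scale); approximately equal
# gradients across percentiles are the signature of self-similarity.
for name, values in percentiles.items():
    slope, _ = np.polyfit(np.log(scales), np.log(values), 1)
    print(f"{name} percentile: gradient ≈ {slope:.2f}")
```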
Table 1: The scaling of Comic Relief supporter distribution of counts within increasing sizes of geographical grid

Scale of grid (m)   Percentile of size distribution
                    50        75        90        95
100                 2         7         12        16
200                 2         10        54        92
400                 4         20        113       229
800                 17        77        342       668
1,600               78        326       1,121     2,051
3,200               306       1,324     3,788     6,265
6,400               1,112     5,138     13,535    23,136
12,800              4,609     19,959    50,083    61,225
25,600              13,511    65,496    148,128   218,673

Figure 1: Distribution of the count of Comic Relief supporters within two geographic grids of 200 m and 400 m squares

Figure 2: Cumulative distribution of the count of Comic Relief supporters in two geographic grids of 200 m and 400 m squares

Figure 3: The percentiles of the count distribution of Comic Relief supporters within squares at a range of geographic scales

Given that these non-linear features can be identified, it is of interest to interpret the gradient of the line as the ‘fractal dimension’ previously used for physical systems. This gives a fractal dimension of approximately 1.7, below the behaviour expected for a circle. This could indicate a mixed distribution of dots (d=0) and circles (d=2), or, more intriguingly, a tendency towards ovals, that is, shapes with some local direction driven by the characteristics of local geography.

So if we accept that these non-linear characteristics are really present, how is this relevant to practical marketing analysis and what are the implications, without the need for complex analytical methods? We would advocate that even if complex modelling methods are not adopted, considerable benefit can be accrued by an understanding and acceptance that these effects are present and are also strongly driven by the physical locations of consumers.

Analytical strategies for accommodating the impact of neighbourhood effects

Use Conventional Linear models with care
Apply Linear models ‘softly’ to individual records
Place a high priority on models with few variables
Consider modelling at multiple levels of geography rather than one
Design pilot studies to assess the effect of neighbourhood interactions
Design Multi-channel Campaigns that impact across different geographic scales at the same time
1. Circumspect application of linear models

Linear models include a wide category of statistical regression methods, including linear regression, multiple regression and logistic regression, brought together by Nelder and McCullagh,14 but in this article the term refers more generally to any method where the outcome is predicted by the input conditions alone and the solution is assumed to be found at a steady-state equilibrium. Typically, a complete solution including the effects of the neighbourhood interactions discussed here would require iterative methods that make an initial estimate of the predictions and then include these estimates in the next step of the model, to re-predict the outcome; this iteration is then repeated until a stable solution is reached. However, despite the considerable effort involved in solutions of this nature, the result might easily be wide of the mark because of the difficulty of assessing the initial conditions accurately, given that small changes can have large implications for the end result. So what simple modifications to standard approaches can accommodate at least some of these non-linear characteristics?
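Before turning to those modifications, the following toy sketch (illustrative coefficients and adjacency, not the author's method) shows the iterative scheme just described: an initial individual-level prediction is repeatedly re-estimated with each person's neighbours' current estimates fed back in as an extra predictor, until the estimates stop changing.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy setup: n individuals, one individual-level predictor x, and a 0/1
# adjacency matrix saying who counts as whose neighbour (all values invented).
n = 500
x = rng.normal(size=n)
adjacency = (rng.random((n, n)) < 0.01).astype(float)
np.fill_diagonal(adjacency, 0.0)
row_sums = adjacency.sum(axis=1, keepdims=True)
row_sums[row_sums == 0] = 1.0
neighbour_weights = adjacency / row_sums          # row-normalised neighbour weights

beta_x, beta_nbr = 0.8, 0.5                       # assumed coefficients

# Step 1: initial estimate from individual attributes alone
y_hat = beta_x * x

# Steps 2..k: re-predict, feeding the current estimates of each person's
# neighbours back in as an extra predictor, until the estimates settle.
for _ in range(100):
    neighbour_term = neighbour_weights @ y_hat
    y_new = beta_x * x + beta_nbr * neighbour_term
    if np.max(np.abs(y_new - y_hat)) < 1e-8:
        break
    y_hat = y_new

print("converged estimates for first 5 individuals:", np.round(y_hat[:5], 3))
```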
a) When applying linear models to interpret and understand consumer behaviour, it has to be understood that the scaling and clustering effects cannot be reproduced. Linear models of whatever complexity rely on the predictor variables to reflect the scaling characteristics. For example, it is tempting to suggest that variables such as housing type and tenure, which are often found to be strong predictors in models, are strong because they have high levels of local correlation rather than because there is any inherent relationship between tenancy and the modelled characteristic. So the first way to utilize the exploratory technique presented here is to use within any linear model predictor variables that follow similar scaling laws, ideally having a ‘fractal dimension’ comparable to that of the outcome variable. For example, in any project where affluent individuals are thought to be a key component, it would be worthwhile comparing the customer database's ‘fractal dimension’ with the ‘fractal dimension’ of the distribution of counts of (say) detached houses within each geographic grid, as sketched below. In that sense, the non-linear characteristics will then look after themselves.
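A minimal sketch of the comparison suggested above, assuming counts per occupied square have already been built at several grid sizes (the figures shown are illustrative only):

```python
import numpy as np

def fractal_dimension(counts_by_scale, percentile=90):
    """Gradient of log(percentile of counts per square) against log(grid size)."""
    scales = np.array(sorted(counts_by_scale))
    values = [np.percentile(counts_by_scale[s], percentile) for s in scales]
    slope, _ = np.polyfit(np.log(scales), np.log(values), 1)
    return slope

# Illustrative inputs only: counts of the outcome (supporters) and of a
# candidate predictor (detached houses) per occupied square at three scales.
supporter_counts = {200: [1, 2, 2, 9], 400: [3, 5, 30], 800: [12, 95]}
detached_counts  = {200: [1, 1, 3, 6], 400: [2, 4, 21], 800: [10, 70]}

# Similar values suggest the predictor carries some of the clustering
# that the outcome variable itself exhibits.
print("outcome d ≈", round(fractal_dimension(supporter_counts), 2))
print("predictor d ≈", round(fractal_dimension(detached_counts), 2))
```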
       
b) Given that the non-linear behaviour can induce a high degree of variability around the estimated values that any linear model predicts (witness the long tails), it is worthwhile applying models in a probabilistic fashion rather than following a deterministic calculation. Thus, when applying linear models, we would suggest that the method of ‘top-slicing’ scoring models is reviewed in favour of a more probabilistic process, as advocated by Press.15 By ‘top-slicing’ we mean the process of generating a score, typically a propensity score or a cluster membership assignment, and then applying it by choosing the highest values only: if an individual is allocated to a cluster, this is on the basis of the highest score alone; if a group is selected, this is done by choosing those individuals with the highest scores. Recent studies indicate that top-slicing methods are only efficient when there is complete control of a situation and unlimited resources. In more practical scenarios, with constrained resources and many other sources of variation, most particularly those generated by the non-linear characteristics discussed here, it is more efficient to include a level of randomization in the profiled allocations. This reflects an acceptance that any specific individual may be driven by their own attributes, but could also be governed by the attributes of their neighbours. The analysis of the optimum selection process in these recent studies indicates that the selection of individuals should be based upon a process of square-root sampling. Therefore, when individuals are allocated to a particular cluster, this is based upon p_j^(1/2), where p_j is the probability of membership of cluster j. If individuals are selected on the basis of a propensity score, then a probabilistic sample is taken based on the square root of the propensity. As an example, suppose that three segments have been identified — A, B and C. If a segment attribution model predicts segment membership for one individual as p(A)=0.7, p(B)=0.2, p(C)=0.1, then instead of deterministically allocating this individual to segment A, the segment allocation is generated probabilistically according to the distribution p(A)=0.52, p(B)=0.28, p(C)=0.20. These probabilities are derived by taking the square roots of the segment membership probabilities and normalizing the results so that they sum to unity for each individual.
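A minimal sketch of this square-root (‘soft’) allocation, reproducing the worked example above:

```python
import numpy as np

rng = np.random.default_rng(3)

def soft_allocate(membership_probs):
    """Probabilistic allocation via square-root sampling:
    draw a segment with probability proportional to sqrt(p_j)."""
    p = np.sqrt(np.asarray(membership_probs, dtype=float))
    p /= p.sum()                      # normalise so the probabilities sum to 1
    return rng.choice(len(p), p=p), p

# Worked example from the text: p(A)=0.7, p(B)=0.2, p(C)=0.1
segment, sampling_probs = soft_allocate([0.7, 0.2, 0.1])
print("sampling distribution:", np.round(sampling_probs, 2))   # ≈ [0.52 0.28 0.20]
print("segment drawn for this individual:", "ABC"[segment])
```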
       
c) It is also relevant to accept the limitations of the modelling methodologies employed. It is tempting to suggest that a combination of increasing the range of variables included in a model and utilizing more complex linear algorithms will suffice to improve it. However, in studies of decision making within unbounded non-linear conditions,16 it has been shown that additional variables and complex optimization can have a detrimental effect on predictive power. These studies strongly advocate the discovery of simplified heuristics that capture most of the information within relatively few variables. It has to be emphasized that no amount of additional variables or complexity in a linear model can account for the non-linear effects described in this article. In fact, the consequence of this perspective is to suggest that the best linear models will be constrained to a handful of predictor variables, identified with the assistance of the exploratory tools demonstrated here, which as a result contain within themselves some approximate tracking of the effect of localized interactions.
       
d) It is also interesting to consider the implications of the property of self-similarity for multi-level modelling and for the current move within the UK to develop and promote ‘individual’-level segmentation products at the expense of ‘postcode’-level products. The general implication of self-similarity is that no particular size or scale has priority over another. As such, while it is a development to generate ‘individual’-level products, these should not be seen as an alternative to ‘postcode’-level products, but rather as a complement. In fact, using an individual-level segmentation and removing the postcode discriminator might well reduce performance rather than improve it. Any property of self-similarity would also drive the requirement for segmentations at higher levels of geography as well. Taken to its conclusion, this would suggest a set of segmentations with a small number of separate groupings, estimated at as many different geographic scales as is practical. It remains an area for further research to identify the best way of linking the different scales together, and whether and how the methods of hierarchical regression might be modified to accommodate non-linear characteristics. Given the research conducted into unbounded decision making,16 it is likely that the optimum solutions will be specific to different cases rather than there being a general method that suits all circumstances.

       
     
2. Design of marketing pilots and tests

Marketing design often includes a process of testing and piloting, which inevitably relies on sampling a small proportion of the target market. The process of sampling can affect the conclusions drawn about some of the behaviour discussed here. As an example, the Comic Relief database of supporters was matched to the reference set of postcode locations that had been used within the British Population Survey over the 3 years 2008–2010, a sample of circa 100 K different locations. The results for this sub-sample of the database are shown in Table 2 and Figure 4. While the chart still shows the presence of the self-similar behaviour, estimation of the other properties, most particularly the ‘fractal dimension’, is biased: the sampling lowers the estimate significantly, as can be seen from the reduced gradients in Figure 4 compared with Figure 3. It remains an area for research to understand the most efficient methods of research sampling under these non-linear conditions.

In order to help understand the implications of the non-linear effects demonstrated here, it is helpful to examine other areas where the consequences have already been considered. One area where exploration has previously taken place is the study of patterns of internet traffic to servers and ISPs.17 An understanding of these patterns helps to construct appropriate resource allocation, and the outcomes of this field of research may prove relevant to estimating call-centre allocations for a campaign rollout based on the results of a marketing pilot. Another area where these effects have been considered is the geographic growth of populations in and around cities.18, 19, 20 Most relevantly,19 there is a demonstration of how different geodemographic segments grow in different ways; that study reviews the growth of different types of job function and characterizes the growth using the ‘fractal dimension’. If nothing else, studies of this type could be used to refine the weights used within research samples when the results are projected over the whole population.

While it may well be impractical to study the nature of neighbourhood effects on marketing strategy, being aware of their potential presence suggests changes in the normal methods of testing and sampling. For example, it is common to take a 1-in-N sample of data sources to get an indication of a source's potential contribution. Taking this type of sample will reduce the data to at best one individual from any neighbourhood, and thus will at a stroke remove the opportunity to assess any neighbourhood effects, and also remove the potential to utilize the neighbourhood impact as a by-product of the encouraged ‘word-of-mouth’. Instead, we would advocate designing tests based on samples of streets, neighbourhoods and communities. This makes it possible to analyse the presence of local effects, and also to determine whether the test and pilot campaigns benefit from these interactions. By running geographically dispersed tests, the opportunity to assess any neighbourhood enhancement is removed. A sketch of this neighbourhood-based approach to sampling is given after the following sub-section.

a) Targeting and the Application of Diverse Communications

There has been some exploration of the implications for marketing strategies of the presence of non-linear behaviour.21, 22, 23 However, these studies look at different facets of the marketing agenda. On the whole, they focus on strategic management and the provision and management of resources to handle a diverse and chaotic environment, rather than an interpretation of the pattern of consumer distributions. However, a theme emerges from these articles, as well as from the studies of unbounded decision making: in each case, it is demonstrated that while specialization might have very short-term advantages, there is a strong longer-term benefit to diversification and to preparing responses that are flexible at different levels. To carry that approach into this context, so that neighbourhood interactions can be utilized to best effect, it seems appropriate to look for opportunities for positive feedback and enhancement. The very presence of neighbourhood interactions suggests that, where possible, marketing communications should use a tiered approach. As technology makes more individualized communications possible, there is a natural temptation to focus on these areas and to generate more individualized communications. It is natural that the cost benefits of digitally targeted communications seem most efficient, but they start to treat the smallest scale as the special one. It is probably best to challenge this instinct by seeking ways to deliver softer, more diffuse messages to complement the highly targeted techniques. As an example, this might be achieved by implementing a TV broadcast alongside a local newspaper-based insert distribution, to complement direct communication via email or mail. It is perhaps natural for each type of marketing service provider to see the scale at which their own service is delivered as in some way privileged; what these studies indicate is that while these scales are special to the organization, they are not recognized collectively by the consumer.
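Returning to the design of tests, the sketch below (hypothetical identifiers and sector codes, not the Comic Relief data) contrasts a conventional 1-in-N sample with a neighbourhood-based sample that keeps whole postcode sectors intact, which is the property needed if local interactions are to be observed or exploited.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)

# Hypothetical prospect file: individuals tagged with a postcode sector
prospects = pd.DataFrame({
    "person_id": range(10_000),
    "postcode_sector": rng.choice([f"SEC{i:03d}" for i in range(500)], size=10_000),
})

# Conventional 1-in-N sample: scatters the test thinly over every neighbourhood,
# so almost no neighbourhood contributes more than one or two test individuals.
one_in_n = prospects.sample(frac=0.05, random_state=1)

# Neighbourhood-based sample: pick 5% of sectors and take everyone in them,
# keeping whole local clusters intact so word-of-mouth effects can operate.
sectors = prospects["postcode_sector"].unique()
chosen = rng.choice(sectors, size=int(0.05 * len(sectors)), replace=False)
by_neighbourhood = prospects[prospects["postcode_sector"].isin(chosen)]

print("median individuals per sampled sector (1-in-N):",
      one_in_n.groupby("postcode_sector").size().median())
print("median individuals per sampled sector (neighbourhood):",
      by_neighbourhood.groupby("postcode_sector").size().median())
```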
Table 2: The scaling of Comic Relief supporter distribution of counts within the research sampling frame in increasing sizes of geographical grid

Scale of grid (m)   Percentile of count distribution
                    50        75        90        95
100                 2         4         7         9
200                 3         6         10        13
400                 5         10        18        25
800                 7         18        37        53
1,600               10        31        72        116
3,200               15        52        142       249
6,400               27        109       315       541
12,800              72        277       776       1,319
25,600              268       981       2,126     3,956

Figure 4: The percentiles of the count distribution of Comic Relief supporters within squares at a range of geographic scales, within the British Population Survey sampling frame (2008–2010)

     

Conclusion

This article has advocated that analytics should not ignore the presence of neighbourhood interactions, nor merely brush over them by assuming at the outset that subject independence is valid. It has shown that in other physical scenarios, localized neighbourhood effects generate large-scale patterns that provide potential analogies and suggest potential approaches to behaviour within high-volume databases, most particularly the presence of ‘long tails’, ‘clusters’ and ‘self-similarity’ at different geographical scales.

Using practical data about real consumers, within the Comic Relief database, it has demonstrated the presence of each of these properties. This was done using a non-parametric exploratory technique, which confirmed the presence of a scaling law within the Comic Relief data and allowed the estimation of a ‘fractal dimension’.

Rather than calling for large-scale investments in highly complex analysis, we have advocated some simple methods that can be utilized to mediate the effects of these neighbourhood interactions, even if the interactions themselves are difficult to analyse directly. These simplified methods have been applied to complex systems in other areas of study, but they remain to be adopted and widely applied within the analytical understanding of consumer behaviour.

The implications of non-linear characteristics have been suggested, particularly the methods of application of simplified linear models, the design of marketing campaign tests and the development of integrated communication strategies at different geographic levels. In each area, it is advocated that approaches that simplify the techniques, but then apply the results across a wide range of geographic scales, provide the most fruitful area both for immediate practical application and further research.

Acknowledgements

The author thanks the reviewers for the helpful feedback received in the development of the final version of this article.

Copyright information

© Palgrave Macmillan, a division of Macmillan Publishers Ltd 2011

Authors and Affiliations

  • Tim Drye
  1. DataTalk (Statistical Solutions) Ltd, Cambridgeshire, UK
