A representation learning framework for detection and characterization of dead versus strain localization zones from pre- to post-failure

Experiments have long shown that zones of near vanishing deformation, so-called “dead zones”, emerge and coexist with strain localization zones inside deforming granular media. To date, a method that can disentangle these dynamically coupled structures from each other, from pre- to post- failure, is lacking. Here we develop a framework that learns a new representation of the kinematic data, based on the complexity of a grain’s neighborhood structure in the kinematic-state-space, as measured by a recently introduced metric called s-LID. Dead zones (DZ) are first distinguished from strain localization zones (SZ) throughout loading history. Next the coupled dynamics of DZ and SZ are characterized using a range of discriminative features representing: local nonaffine deformation, contact topology and force transmission properties. Data came from discrete element simulations of biaxial compression tests. The deformation is found to be essentially dual in nature. DZ and SZ exhibit distinct yet coupled dynamics, with the separation in dynamics increasing in the lead up to failure. Force congestion and plastic deformation mainly concentrate in SZ. Although the 3-core of the contact network is highly prone to damage in SZ, it is robust to pre-failure microbands but is decimated in the shearband, leaving a fragmented 3-core in DZ at failure. We also show how loading condition and rolling resistance influence SZ and DZ differently, thus casting new light on controls on plasticity from the perspective of emergent deformation structures.

Here we develop a new data-driven framework to study structured deformation in granular media from the perspective of collective motion. The framework is broadly applicable to spatiotemporal kinematic data, not just those from granular matter. Hence a further benefit of this effort is that it may facilitate connections between granular deformation and ordered motions in the broad discipline of complex systems. In turn, these can form the basis for a common framework for the study of collective motion dynamics which transcend system particularities.
Our proposed framework proceeds in two-phases based on an input data on individual grain displacement and rotation (Fig. 1). In the first phase, representation learning (RL) [23] is used to identify strain localization zones (SZ) [4,7,8,10,11] and dead zones (DZ) [2,6,24]; in the second phase, a combination of novel statistics and existing networks metrics is employed to characterize the coupled dynamics of SZ and DZ from pre-to post-failure. To the best of our knowledge, no prior study has distinguished SZ and DZ regions to the individual grain level, for all stages of a test, as accomplished here. Much like the force network, our hypothesis is that the deformation of a granular sample may be usefully treated as embodying a coupled dual structure, SZ and its complementary DZ, each with its own unique dynamics. This work is inspired by artificial intelligence studies of complex systems, where the consensus view is that complex systems appear more complex than they truly are. That is, measurements of observed properties of such systems typically give rise to redundant high-dimensional data, despite their behavior being governed by relatively few dominant intrinsic properties. Thus when extracting useful insights from high-dimensional data, one can avoid the inevitable 'curse of dimensionality' by introducing a representation of the observed data in a reduced dimension with minimal to no loss of essential information. Accordingly, the framework we propose strategically builds on the newly introduced metric s-LID [25]. In essence, s-LID measures the local intrinsic dimensionality (LID) [26,27] of the grain motion data in a neighborhood of the displacement-state-space (DSS), centred at the focal grain's displacement. This representation proved to be discriminative of the kinematic structure of the data to the extent that multiple, concurrent strain localization patterns at different spatial scales can be detected [25]. More details of s-LID can be found later in Sect. 3. s-LID is ideally fit for the purposes of this study for three key reasons. One, we achieve a significant reduction in data dimensionality in representing the observed motions of a reference grain and its s neighbors in DSS: a dataset of (s + 1) displacement vectors is represented by a single scalar, namely, the s-LID value for the focal grain. Two, because s-LID measures the minimum number of latent variables which are necessary to describe the data, it is discriminative of the complexity of collective grain motions. As previously shown in experiments and simulations, grain motions in SZ are complex and strongly nonaffine [5,7,11], especially in shearbands where transient structures like vortices form [11-13, 15, 28-30]. Consequently, motions in SZ require more latent variables to describe than the quasi-rigidbody motions in DZ, which makes s-LID a very meaningful representation of the kinematic data for the purposes of distinguishing SZ from DZ. Third, s-LID is highly robust in detecting patterns of different shapes and length scales [25]. This is because it makes no assumptions about any geometrical features of the target pattern(s) and, unlike the traditional continuum mechanics measure of strain, s-LID obviates the need to prescribe a spatial scale. The only parameter in s-LID is s, the number of nearest neighbors to be studied in DSS. Increasing s allows the identification of outliers with respect to a larger neighborhood of DSS, thereby enabling the identification of patterns with more complex kinematics, as characterized by more outlying or anomalous grain motions relative to a focal grain's displacement. This is why persistent shearbands, having more complex kinematics than microbands, are exposed at higher s values.
The rest of the paper is arranged as follows. In Sect. 2, we briefly summarize pertinent research gaps in the study of grain motion data in light of relevant developments in data-driven methods and machine learning. In Sect. 3, we describe the components of our framework, followed by a description of the input data in Sect. 4. Results are presented and discussed in Sect. 5, before a brief conclusion in Sect. 6. Finally, we note that the methods are applicable to three dimensional deformation, but we confine our analysis to a well-studied set of four ensembles of spherical grains submitted to planar biaxial compression for ease of visualization of patterns and their interpretation. The four granular samples differ in material property and loading condition.

Filling research gaps with data-driven studies of collective motion
To put this work in context, we briefly review a class of datadriven tools that have objectively identified and characterized emergent deformation patterns in granular matter. These require no a priori information on spatial scale, geometry, and location of target structures. We emphasize that some of these tools arose from studies of strain localization where the same data studied in this work were examined alongside: (a) physical experiments on synthetic and natural granular materials including photoelastic disks, hydrogels, concrete and different types of sand (e.g., [28,29,[31][32][33][34][35][36][37][38]); and (b) field scale studies of catastrophic failure in both engineered and natural slopes (e.g., [19,39,40]). Consequently, we have a unique opportunity to directly compare and relate: (a) the new findings reported in this study to previously uncovered properties and mechanisms underlying the dynamics of patterned motions in these samples; and (b) strain localization phenomena across vastly different scales from controlled and idealized small-scale laboratory tests through to natural field-scale conditions.

Complementing strain
Using strain, emergent deformation patterns like intermittent microbands and persistent shearbands have been classified and studied under the phenomenon broadly known as strain localization [4,10,28,41]. Still, unresolved challenges remain. Among these is the accurate identification and differentiation of these structures at all stages of a test, from pre-to post-failure. This has proven difficult primarily 75 Page 4 of 19 because these structures: (a) coexist in time, (b) vary in spatial scales, (c) interact to give rise to interchanging passive and active components [10,30,40], and (d) produce secondary patterns bearing features of the primary ('parent') structures of microbands and shearbands [25]. A key limitation of strain measures in deciphering all these details from grain motion data is its reliance on a predefined spatial scale-a fixed 'window of observation' (representative volume element)-which may obscure important details, or even miss novel patterns of deformation (e.g., [10,42,43]). Indeed experiments and simulations have shown that even persistent shearbands are far from static structures [5,44,45]. Thus studies which rely on a prescribed static shearband boundary in the critical state regime may be subject to these drawbacks. Other gaps in knowledge on how such stationary shearbands emerge from microscopic and mesoscopic plastic events were recently discussed in the context of internal variables like fluidity which, though similarly entails a predefined spatial scale, can account for rates of plastic events [46]. Given the rapid advances in measurements, data-driven tools from artificial intelligence and modern data science may help fill these critical gaps, especially if they can offer complementary information on emergent patterned deformation with minimal a priori assumptions.

Identifying SZ in kinematic-state-spaces
Prior studies have explored methods for detection of kinematic patterns which are accomplished entirely in the abstract kinematic-state-space, as opposed to the physicalstate-space (e.g., [33,34,40]). These have uncovered novel patterns of strain localization in the pre-failure regime. Their key advantage is that they do not make a priori assumptions on the spatial scale, geometry and location of target patterns: instead they focus solely on the structure of (or pattern of correlations in) the kinematic data. For example, a study of shortest paths in kinematic complex networks, quantified through closeness centrality, has shown that the pattern formed by the ultimate shearband can be detected in virtual and real sand samples early in the pre-failure regime [33,34]. This pattern, coined embryonic shearband to distinguish it from the shearband that later forms in the failure regime in the same location and with the same geometry, was also detected using s-LID [25]. Furthermore, embryonic shearbands were found to coexist and interact with microbands in the early stages of the pre-failure regime in biaxial compression tests [25]. These co-occurring strain localization modes not only differed in spatial scales and geometry but also in relative dominance. The microbands dominated the early stages of the pre-failure regime, while the shearbands dominated the post-failure regime. Dead zones also appeared in most stages of the test. Clear criss-crossing patterns formed by the microbands enclosed small quadrilateral dead zones. In the intermediate states, as the sample transitioned from pre-to post-failure, new SZ band patterns formed, bearing properties of microbands and shearbands. That is, microbands in some parts of the sample vanish, as nearby dead zones grew in size and change shape; simultaneously, new SZ bands formed that are both wider and further apart from each other. In the mesoscience framework [47], these intermediate states comprise the so-called mesoregime where a compromise-in-competition governs the interaction between microbands, shearbands and dead zones [19,47].

Identifying and characterizing SZ from the contact network and birth-death dynamics of mesoscopic building blocks
Data-driven approaches using dynamical systems in conjunction with network flow theory have proven effective in predicting and characterizing emergent deformation structures, without need for a priori information on the geometrical properties, location, and member grains of target structures (e.g., see [35,38,48] and references therein). Distinct from complex network based approaches (e.g., force chain detection method in [49]), network flow approaches solve a dual optimization problem to model the coupled evolution of shearbands and force chains based solely on the contact network and the contact strengths. By design, constituent grains of each structure are distinguished by their dynamics and in turn the structure's evolving location in a sample across the different stages of a test. Predictions on the impending shearband and force chains form the output solution to the optimization problem (see [19,38,48] and references therein). Promising results have been obtained for various unbonded and bonded granular systems (e.g., sand, concrete), across vastly different spatial scales. These include real slopes where early and accurate predictions of the location of an impending landslide were attained using ground motion data from spaceborne and ground-based radar [19].
Complementary strategies, which also proved effective in the detection of SZ and the characterization of its dynamics, focus entirely on the salient internal structures of the contact network. An example is the k-core, the remaining components in a network after removing all the nodes with degree less than k (e.g., [36,50]). For the same samples studied in this work, the unjamming transition was found to be marked by the fragmentation of the system-spanning k-core, with a concomitant localization and burst to a peak in the local nonaffine motions inside persistent shearbands [32,36,37]. These findings concur with those from more recent rigidity percolation studies of the contact network in the jamming transition for granular packings under shear, for example, in 2D using the pebble game [51] and in 3D [50] using k-core. Although these did not examine strain localization per se, some useful connections can be drawn to the earlier work in [36] which quantitatively characterized the dynamics of the contact network with respect to emergent shearbands as the system unjams. Morone at al. [50] found jamming to be dominated by the sudden emergence of the giant k-core in the contact network. In Walker et al. [36], the percolating giant 3-core characterized all of the jammed states in the pre-failure regime, in marked contrast to the 3-core in the failure regime. That is, the 3-core fragments into multiple disconnected components at the onset of failure (the so-called critical state regime), when the system unjams, leaving no k-core ( k > 2 ) component lying in the shearbands for the rest of the failure regime [36]. In Henkes et al. [51], where the contact network was treated as a composite of socalled rigid clusters and floppy modes, the transition to jamming was found to be characterized by a system-spanning rigid cluster such that the grain displacements in the floppy regions were found to be more nonaffine compared to the rigid regions on both sides of the jamming transition. Studies of compression tests, from simulations to experiments, have shown that local nonaffine motions concentrate in SZ and that the onset of failure is marked by a sudden burst to a peak of the global average nonaffine motion [5, 7-9, 11, 28, 32, 36, 37, 44]. This suggests the possibility that floppy modes (rigid clusters) may share common properties with SZ (DZ). The elucidation of this relationship is important, but is outside the scope of the present study. An excellent recent review of these internal deformation modes and the challenges they pose for constitutive modelling can be found in [43].
The multiscale Ripley's spatial point pattern method [52,53] is another useful data-driven approach for studies of SZ dynamics. It has been used in ecology to study interspecies and species-event interactions as spatiotemporal patterns of attraction and repulsion (e.g., [54] and references therein). It was first applied to granular media to distinguish diffuse versus localized modes of failure [55] but later also proved effective in characterizing the precursory dynamics of failure from shearbands [30] to damage localization in quasibrittle fracture [56]. In [30], Ripley's method revealed several birthdeath dynamics of 3-cycles (3 grains in mutual contact) and force chains which are unique to the embryonic shearband at pre-failure and the actual shearband during failure. First, a preferential attrition of 3-cycles in the face of continuing 3-cycle births develops in the embryonic shearband during pre-failure. Second, also in the same region, is a distinct pattern of progressive destruction of the most stable subgroup of 3-cycles, the persistent 3-cycles that existed from the beginning of the test. This process culminates in the total eradication of persistent 3-cycles in the fully formed shearband at the start of the critical state regime, consistent with the disappearance of the 3-core in this area. Third, for the rest of the critical state regime, spatial repulsion dynamics can be observed between the persistent 3-cycles outside of the SZ and 3-cycle birth-death events inside SZ, giving rise to a length scale of around 7-10 particle diameters for the shearband thickness. Fourth, a pattern of spatial attraction between buckling force chains, 3-cycle birth-death events inside SZ formed during unjamming.

Identifying and characterizing DZ
Compared to SZ, relatively little work has been done to characterize DZ quantitatively. In the pre-failure regime, Kuhn [4] reported observing zones where the "slip deformations are less than the assembly mean" and which are thicker than, and separate, microbands. In general, however, DZ has been typically studied from incipient failure onwards. Some of the earliest observations of these came from experiments on landing gears on granular beds, in support of the Apollo program in the 1960s, which highlighted their evolution near solid boundaries during shearing loads [1,2,57]. More recently, Murthy et al. [6] examined solid-granular interaction systems using hybrid imaging techniques to uncover the microstructural details of motions around indenting bodies. DZ topology beneath intruders and penetrating bodies were found to be populated by percolating force chains fortified with truss-like 3-cycles [24].

Methods
The proposed two-phase framework is illustrated in Fig. 1. The s-LID value is employed in the Identification Phase to identify the dominant strain localization structure and its complement. In the Characterization Phase, we test the hypothesized dual structure of deformation by characterizing the dynamics of each class (SZ versus DZ) with respect to three aspects: fluctuating component of motion, contact network topology, and force congestion, as listed in Table 1.

Learning the intrinsic structure in data
Many methods have been proposed for learning the intrinsic structure embodied in the data. In machine learning, these methods have emerged into a class of their own known as representation learning, where they are then further classified into supervised or unsupervised, linear or nonlinear, shallow or deep [23]. The one relevant to this study, introduced in 2000, is the subclass called manifold learning [23], which is aimed at discovering the intrinsic structure of highdimensional datasets. The intuition behind manifold learning methods is that the dataset A key metric for describing the geometry of M is the intrinsic dimensionality (ID). In essence, ID is an estimate of d. ID is thus a global property of the dataset N , which describes the minimal number of latent factors that are needed to effectively represent its data points . The higher the ID, the more complex is the dataset. ID can be adapted to assess the local structure of a neighborhood (comprising s neighbors) around a chosen data point . This gives rise to the local intrinsic dimensionality (LID) of [26,27]. LID( ) is a measure of the complexity of the sub-manifold centered at . A lower LID indicates a local neighborhood (sub-manifold) with a simpler geometry that can then be described using fewer variables.
The kernel idea in the formulation of LID comes from the geometrical relationship that generalizes the volume enclosed by a hypersphere in n-dimensional space. That is, the volume of a hypersphere in ℝ n is determined by its radius r and the dimensionality n, where Γ is Euler's gamma function. By measuring concentric neighborhood volumes and radii, n can be found as follows Definition 1 (Local Intrinsic Dimensionality [26]) Given a query point , let r > 0 be a random variable denoting the distance from to its neighboring points in a given feature state space. If the cumulative distribution function F(r) is positive and continuously differentiable at distance r ≥ 0 , then the LID of F at r is, wherever the limit exists. Thus LID of is A geometric interpretation of LID, recalling Eq. (1), is that it is a measure of the space filling capacity of the local neighborhood of points around . Specifically, from Eq. (2), LID gives the rate at which the number of encountered points grows as the neighborhood expands in size with Mean absolute value of rotation in a given zone [0, +∞) ⟨ ⟩, ⟨ ⟩ Mean distance to kinematic center in a given zone based on displacement and rotation, respectively. The kinematic center of each grain is defined as the average displacement vector (or rotation scalar) of its own local neighborhood. The lower the value of ⟨ ⟩ , the closer is the motion in the zone to rigid-body translation By contact network topology c Subgraph centrality, a measure of node importance in the complex network based on the participation of each node in all subgraphs, where a subgraph is defined as a closed path starting and ending at a given node in the network. The higher the value, the higher is the connectivity.
[0, +∞) Likelihood of member grains of a given zone being in the 3-core of the contact network [0, 1] By force transmission ⟨‖ ‖⟩ Mean contact force magnitude ‖ ‖ in a given zone [0, +∞) Congestion index, the ratio between the contact force magnitude and the capacity of a contact, where capacity is approximated by the inverse of relative displacement between the pair of associated grains. The higher the value, the more force-congested is the contact, the closer is the contact to failure distance r from the query point , with F(r) being the probability of finding a neighbor within a distance r from . Both ID and LID have been used successfully in diverse areas of science including machine learning, ecology, and climate science [58][59][60][61][62]. In our study, the dataset comprises the individual grain displacements N = { i } N i=1 at every equilibrium state of the loading history, where N represents the total number of grains in the studied granular sample. F (r) is unknown. Hence we must turn to an estimation of LID( ) for every grain displacement. This was recently undertaken in [25], which highlighted the need to quantify and investigate the relative dominance of different strain localization patterns and their coevolution with dead zones throughout loading history. This is achieved using the method described in what follows.

s-LID: LID of grain displacement data
, a set of N points in the 2D displacement-state-space (DSS). The task is to estimate LID for every point in N at each equilibrium state of the loading history. To do this, Zhou et al. [25] adapted the method in [63] which used a maximum likelihood estimator from extreme value theory in statistics. This requires the choice of a parameter s , and hence we term the estimator s-LID: Here, is the displacement of a focal grain, a 2-dimensional vector representing its translation in the horizontal and vertical directions over an axial strain interval Δ yy = 0.002 (other strain intervals have been tested with similar results); d i ( ) is the Euclidean distance between the focal grain displacement and its i-th nearest neighbor in DSS; d s ( ) is the maximum of the neighbor distances in DSS; and s ≥ 2 is a parameter to control the size of the neighborhood to investigate (i.e., the number of nearest neighbors to use in the estimation procedure). Note further that the data points in DSS are assumed to not overlap with each other. This ensures that the s-LID value is well-defined as a unitless score ranging from 0 to +∞ : the higher the s-LID value of a grain, the more outlying is its motion. By varying s, one can probe different levels of kinematic resolution (zooming in and out) with respect to the focal grain's motion in DSS. A lower s focuses on the properties of motions nearest in values to , while a larger s focuses on a wider range of displacements relative to . This strategy of tuning the kinematic parameter s uncovered concurrent strain localization patterns at multiple spatial and kinematic scales. Results in [25] suggest that grains with higher than average s-LID score are likely to be located in the strain localization bands. Specifically, in the early pre-failure stages of the loading history, a small s identifies a system-spanning criss-crossing pattern of microbands, whereas a large s reveals the embryonic shearband which gives an early prediction of the location and geometry of the ultimate shearband pattern.

Identification of SZ and DZ
We refer to the dominant s-LID pattern at each strain state as P ( ) s * , and the associated critical neighborhood size as s * ( ) . To find P ( ) s * , a reference pattern must first be selected. There are two main reasons why the grain rotation field forms the natural first choice for this reference pattern. First, grain rotations have long been recognized as an informative indicator of not just one but multiple modes of strain localization [4,10,11,13,41,[64][65][66]. In particular, the intermittent criss-crossing pattern of microbands can be identified in the early stages, albeit often imperfectly [4,10,25], while the persistent shearband can be reliably identified in the failure regime (e.g., [32,35]). Second, many other fundamental kinematic events are governed by grain rotations, including rolling and sliding at the microscopic contact scale and mechanisms like force chain buckling and vortices at the mesoscopic scale [4-7, 11-13, 24, 29, 32]. Indeed recently uncovered kinematic details from more advanced measurement techniques have ignited renewed interest in the role of rotation in controlling plasticity, rheology and strength of granular materials [10,[67][68][69].
With the rotation field serving as the reference pattern, the dominant s-LID pattern P ( ) s * is next identified as the one having the highest similarity to it, as measured by normalized mutual information (NMI) [70]. Specifically, at a given strain state , the critical neighborhood size s * ( ) , and hence the dominant pattern P ( ) s * , is selected from a range of concurrent s-LID patterns associated with varying neighborhood sizes as: where P ( ) s is the s-LID pattern with a specific neighborhood size, which is defined as the set of grains whose s-LID values are higher than the average s-LID score among all grains. P ( ) r is the rotation pattern that identifies grains with higher than average cumulative rotation during strain states − Δ yy to , and S is the set of candidate neighborhood sizes. 1 is the indicator function, I(1 P ( ) s ;1 P ( ) r ) is the mutual information [70] between the two binary vectors extracted from the corresponding s-LID and rotation patterns defined as The denominator in (4) is a normalization factor based on entropy H(•) [70], where the entropy of a random variable X is defined as The normalization term scales the mutual information into the range from 0 to 1: the more similar two patterns are, the closer is NMI to 1. The indicator vector 1 P ( ) s * corresponding to the dominant s-LID pattern partitions the system into two distinct areas: SZ which represents the dominant strain localization structure for the given strain state, and DZ which contains the rest of grains. In the following, we show that these two zones differ from each other in several aspects.

Local deformation
The thermodynamical formalism of internal state variables which characterize the plastic response due to the rearrangements in the internal structure can be traced back to Coleman et al. [71]. With respect to shearbands in frictional granular materials, the nonaffine grain motions associated with the underpinning mechanism of force chain buckling have been derived and introduced as internal variables in thermomechanical continuum formulations, leading to realistic predictions on the averaged stress anisotropies inside emergent shearbands and coexisting dead zones in both fully confined and semi-confined granular systems [18,72]. Other sources of nonaffine motion have also been studied (e.g., shear transformation zones in amorphous solids [73] and trimer bending in photoelastic disk assemblies [74]). Accordingly, a natural first step is to characterize the dynamics of SZ and DZ with respect to the local deformation in a small spatial neighborhood: a focal grain and its first ring of contacting neighbors. This spatial neighborhood was considered in past investigations into local affine and nonaffine deformation in physical and numerical experiments which explored the evolution of strain localization [32]. By comparison, here we exploit pairwise distances in DSS to generate simpler measures of local collective motion for each zone.
We start with the simplest statistic: the mean absolute value of the corresponding grain motions in each zone Z (Z= SZ or DZ). That is, at each strain state, we measure the average displacement and rotation magnitude in each
zone, denoted by ⟨‖ ‖⟩ and ⟨� �⟩ , respectively: here ‖•‖ is the norm of a vector, |•| is the absolute value of a scalar, and ⟨•⟩ is the average operator. Next, we quantify in each zone the local fluctuating displacement and rotation [9,28], ⟨ ⟩ and ⟨ ⟩ , respectively. We measure these by first considering the deviation of the motion of a focal grain from the local average motion or kinematic center (i.e., the average motion of the focal grain and its first ring of neighbors): this is given by (or ), the Euclidean distance between the focal grain's displacement (or rotation) and its respective kinematic center. Consequently, ⟨ ⟩ is simply the average distance across all the grains in either SZ or DZ. Intuitively, the smaller the ⟨ ⟩ , the closer is the collective motion in each zone to rigid-body translation (i.e., ⟨ ⟩ = 0).

Structure, topology and dynamics of the contact network
We focus on two structures of different length scales in the contact network. The first is a local property that summarizes the local minimal cycles of a grain. The second is global in the sense that it is a property of the entire contact network, namely, the 3-core. The 3-core is generally a mesoscale structure: it encapsulates more than two grains in contact (microscale structure) but is typically smaller than the whole sample (macroscale structure), albeit it is possible for the global contact network to be identical to its 3-core. The subgraph centrality [75,76] is a measure of the local connectivity at the grain level. It gives a summary of the minimal cycle membership of a grain and is defined as the weighted sum of the number of closed paths (cycles) of different lengths: where k is the path length, and k (i) is the number of paths of length k for node i. As can be seen, shorter paths (smaller cycles) receive higher weights whereas the importance of longer paths (larger cycles) decay significantly. The higher is c for a given node, the more densely connected is the local neighborhood of the corresponding grain. Note that this is a grain property just like the coordination number (or degree of node i) which corresponds to the number of contacting neighbors of grain i. The subgraph centrality c i gives more information than degree since it takes into account the local packing or arrangement of grains around i and not just its contacts.
At the global level, an informative structure within contact networks of granular media is the k-core [33,77,78]. The k-core constitutes the remaining components in a network after removing all the nodes with degree less than k. It can be viewed as a type of network attack, which is a process of removing node(s) and/or link(s) from a given network. As shown in [78], the 3-core is the most informative. The 3-core removes all low-degree grains within the network while retaining only those grains in the minimal cycles of the contact network [79]. This corresponds to those parts of the sample where grains are either isostatically (coordination number = 3 ) or hyperstatically (coordination number > 3 ) constrained [31]. In contrast to other k-core graphs where k ≠ 3 , the fragmentation of the 3-core from an initially single component graph was found to coincide with the onset of failure, specifically, the start of the near-steady critical state regime, from which the shearband becomes fully formed. This means that for all of the failure regime, the k-cores for all k ≥ 3 are fragmented and exist in multiple components. Thus it suffices for the purposes of this study to focus only on the 3-core.
To relate a given zone Z (Z= SZ, DZ) to the 3-core, we introduce the likelihood of member grains of each zone being in the 3-core of the contact network: where N 3-core∩Z represents the number of grains in both the 3-core and zone Z, normalized by the zone size N Z .

Force transmission
A key lesson learned from physical and numerical studies of force transmission in granular contact networks is that a contact's proximity to failure (or breakage) is just as important as the actual force transmitted through it [19,35,38]. This predisposition to failure arises due to force congestion: when there is a large build up of force at a contact, which then brings that contact close to its limiting force capacity or strength ij (i.e., the minimum force needed to break apart the contact). Force bottlenecks, which comprise contacts that are prone to congestion, were found to be a powerful indicator of the likely location of impending shearbands in both laboratory and field scale studies (see [19,38] and references therein). Acoustic emission measurements have detected these internal differentials in proximity to failure [20,80]. Thus a useful measure for a given contact between grains i and j would be the level of force transmitted at the contact relative to its strength ij . The challenge though is that we have no direct measure of these limiting force capacities. Nevertheless, a robust proxy for contact strength is is the magnitude of the relative displacement [35,38]. The intuition here is that large relative motions develop in weaker and more unstable contacts, consistent with experiments [32,81]. Accordingly, we propose the following force congestion index ij A higher ij means a more force-congested contact which in turn means the contact is more predisposed to failure. Here ij is measured for three groups: SZ are contacts between two SZ grains; DZ are contacts between two DZ grains; IB are contacts between DZ and SZ grains.

Input data
We chose a unique set of data for this study to leverage the opportunity to gain insight into the robustness of deformation patterns across vastly different materials, loading conditions and spatial scales. The input data analyzed in this work are chosen from a well-studied family of discrete element simulations of spherical particles subjected to planar biaxial compression: CP1, CP2, CV1, CV2 [30,37]. Their mechanical properties and mechanisms underlying the dynamics of strain localization have been examined from different perspectives and compared to those observed in controlled laboratory experiments on different materials (e.g., sand, concrete, photoelastic disks and hydrogels) [31,32,34,35,55] and their associated discrete element simulations, and in field scale studies of catastrophic landslides in several countries, from quasibrittle to more ductile slopes, both engineered and natural [19,39,40]. Each assembly contains 5098 particles with isotropic initialization and a packing fraction of 0.858. They are subjected to biaxial compression at a constant strain rate in the vertical direction, while the motions are constrained to the plane. A combination of Hooke's law, Coulomb's friction, rolling friction and hysteresis damping is used to model the interactions between contacting particles. A rolling resistance and a sliding resistance act at the contacts, both of which are governed by a spring up to a limiting Coulomb value of ‖ n ‖ and r R min ‖ n ‖ , respectively, where n is the normal contact force, R min is the radius of the smaller of the two contacting particles, and r are the parameters to control the sliding and rolling friction, respectively. Complete details of these simulations can be found in [37].
The four samples differ from each other on two aspects. First is the loading condition: CP1 and CP2 are subject to constant confining pressure while CV1 and CV2 are subject to constant volume. Second, two different values of rolling resistance apply: r = 0.02 for samples CP1 and CV1 and r = 0.20 for samples CP2 and CV2. These differences affect the shape of the persistent shearband in the specimens, as visualized by the accumulated buckling force chains in the spatial domain [37]. As shown in the supplementary file: CP1 has a single diagonal band spanning from the bottom-left to the top-right of the sample (Fig. 2); CP2 contains two 'V's overlapping each other in the lower-left of the sample (Fig. SIa); CV1 has two bands meeting in the lower mid-point of the sample to form a 'V' (Fig. SIb); CV2 has three bands forming the pattern 'V/' (Fig. SIc). These accumulated buckling force chains serve as the "ground truth" shearband in the analysis.

Results and discussion
For brevity, attention here is mostly paid to the results from CP1, since the main trends are qualitatively similar across all the samples (see the supplementary file). That said, we highlight where differences arise in the spatiotemporal dynamics, including those due to loading condition and rolling resistance.

Identification of SZ and DZ
The identified SZ and DZ from s-LID can be found in Fig. 3, along with the reference pattern generated from grain rotations P ( ) r which identifies the dominant strain localization structure. For example, at a , high rotation grains are spread throughout the sample, forming a collection of thin, incomplete, disconnected bands, which resemble the systemspanning criss-crossing microband pattern. In contrast, from peak stress to failure ( df ), grains with relatively high rotations are seen to concentrate in the location of the shearband, albeit more noise points are picked up by P ( ) r compared to the ground truth shearband (Fig. 2). The pattern from s-LID, on the other hand, is capable of identifying multiple concurrent strain localization patterns: criss-crossing microbands can be observed for small neighborhood size s in the early pre-failure regime (e.g., s = 225, a ), while the shearband emerges for large s in the failure regime (e.g., s ≥ 900, e ). The role that grain rotations play in the development of strain localization is well-recognized [5,7,10,37,41,[66][67][68][69]81]. A recent study which completely suppressed grain rotations, however, underscored that grain rotations are a sufficient but not a necessary condition for strain localization [25]. Consequently, as discussed earlier, we home in on the dominant pattern at each strain state using P ( ) r . That is, we treat P ( ) r as the reference for selecting the critical neighborhood size s * for s-LID and the corresponding dominant s-LID pattern, P ( ) s * . Black boxes in Fig. 3, and the zoom-in images of SZ and DZ in the pre-failure regime in Fig. 4, highlight P ( ) s * . Notice that P ( ) r is noisy and incomplete in its representation of microbands and shearbands. Throughout loading history, over 70% of high rotation grains are captured by P ( ) s * . However, compared to the fuzzy pattern presented in P ( ) r , a far better partition of the spatial space can be seen in P ( ) s * . In the early stages, SZ forms the criss-crossing microband pattern that comprises a large number of short, thin, connected bands ( s = 225, a ). As loading history advances, the number of bands in SZ reduces while the band thickness gradually increases, ultimately forming the shearband pattern in the failure regime. On the other hand, DZ, the complement to SZ, comprises a number of small quadrilateral clusters whose member grains are well connected in the early pre-failure regime. With increasing strain, the number of spatial clusters in DZ drops, as the size of the clusters increases until peak stress, beyond which DZ essentially consists of two large spatial clusters separated by the shearband ( d − f ). The strain evolution of s * is informative of the evolving complexity in internal motions and corresponding regime change points. It is interesting to observe that the critical change point occurs not at peak stress but earlier at b , the onset of dilatation in the sample (Fig. 5). Prior to b , s * fluctuates at small values around 100, the kinematic scale that resolves microbands. At around b , s * sharply rises and thereafter mainly remains high, consistent with the more outlying motions observed in the shearband in the failure regime. This suggests that macroscopic dilatancy in the prefailure regime is governed by the dynamics of the developing shearband (transition from embryonic to the persistent shearband) where underlying grain motions are more complex thus leading to higher dimensionality than those seen in microbands.
It is interesting to consider the performance of the current method, designed for the purposes of distinguishing SZ from DZ, in relation to the well-known mesoscale structures that emerge inside the shearband, for example, vortices [11-13, 15, 28-30]. As can be seen from the patterns of P ( ) s * from d to f , the current method does not reveal the internal structure of shearbands: that is, the whole shearband is identified as one continuous domain of SZ (Fig. 3). This is because of our chosen featurestate-space which consists only of displacements. If one is interested in capturing vortices or other emergent mesoscale patterns where grains collectively rotate about a common centre, in a manner consistent with near-rigidbody rotation, then information on angular rotation angles Fig. 4 Examples of the identified SZ and DZ shown for a zoomedin area (black box) of the whole sample (grey area) at three different stages of the pre-failure regime. Proportion of SZ and DZ grains throughout the loading history Fig. 5 Change of s * throughout the loading history (if such information is available) may be incorporated as one of the features. Such a study and explorations into the internal structure of shearbands are outside the scope of this effort. The key point here is that the proposed method can be readily modified to capture different patterns of motion by changing the features that form the studied state-space. Here DSS serves the intended purpose of this study well. For instance, in Fig. 6, we show two vortices in similar locations at c (at pre-failure) and d (at onset of failure): s-LID identifies the former at pre-failure as associated with a small DZ inside the embryonic shearband, whereas the latter is classified entirely as SZ, suggesting that the target pattern is captured effectively.

Characterization of deformation
We observe a difference in the dynamics of motion in the identified SZ and DZ, which increases with strain, especially from the onset of dilatancy b (Fig. 7). Generally, the mean absolute value of displacement ⟨‖ ‖⟩ in SZ and DZ are positively correlated, initially increasing slowly with strain in the pre-failure regime, while rising sharply to a peak at unjamming stages when the stress ratio drops in the failure regime (Fig. 7a). However, the evidence in ⟨ ⟩ shows that the local fluctuating displacements in small spatial neighborhoods (a focal grain and its first ring of contacting neighbors) are essentially concentrated in SZ, while the motions in DZ are relatively uniform albeit with a slightly higher ⟨‖ ‖⟩ (Fig. 7b). DZ grains are in near rigid-body translation, with both particle rotation ⟨‖ ‖⟩ and local fluctuating rotations ⟨ ⟩ being essentially confined to SZ (Fig. 7c, d). These trends suggest that local nonaffine strain, nonaffine curvature and energy dissipation are mainly from SZ, consistent with past findings from simulations and experiments [28,32].

Characterization of the contact network
The evolution in the contact topology of SZ and that of DZ are correlated, with DZ having higher connectivity as reflected in higher ⟨c⟩ and 3-core likelihood across all stages of loading (Fig. 8a, b). Though pre-failure microbands and shearbands at failure can be detected in the spatial distributions of low c (Fig. S13a), the distinction between SZ and DZ is not as clear as those from s-LID.
Results on connectivity from the 3-core of the contact network show that initially ( a , b ) , over 80% of DZ grains lie in the 3-core subgraph (Fig. 8b). The likelihood of SZ grains being located in the 3-core is down by about 15% ( a , b , zoomed-in, Fig. 8d). From b onwards, the difference in the 3-core likelihood between SZ and DZ is enhanced gradually. Prior to peak stress state ( c − d ), less than half of SZ grains remain inside the 3-core, whereas more than two-third of grains in DZ preserved their 3-core membership. A dramatic drop to around 10% in the 3-core likelihood can be found in SZ at the peak stress state. From then, the shearband area in the main forward diagonal of the sample becomes devoid of 3-core grains, while the remaining 3-core occupies half of the DZ space ( e , f , Fig. 8d). This corroborates earlier trends reported in [78] during the strainsoftening regime. However, different from [78], the analysis conducted here is applied over the entire loading history, which highlights that throughout loading history, DZ forms the rigid backbone of the material with high connectivity and exhibiting more locally affine motion. This raises interesting potential connections to rigid clusters [51].
An interesting question that we now address is the robustness of emergent rigid structures to different forms of strain localization dynamics. Figure 8c shows that the 3-core can manifestly withstand microband dynamics, and prevail as a single connected component until the shearband is fully formed. Thereafter, the 3-core breaks apart into multiple components which remain essentially disconnected and separated by the shearband for the rest of loading history. Accordant spatial repulsion dynamics between the persistent 3-cycles outside of the SZ and 3-cycle birth-death events inside SZ can be observed in this regime, in marked contrast to the preceding pre-failure regime where this repulsion in entirely absent [30]. We next explore the causes of this through the lens of force transmission.

Characterization of force transmission
We investigate the difference between SZ and DZ based on the force transmitted at contacts relative to a measure of the contact capacity. The dynamics in the average contact force magnitude ⟨‖ ‖⟩ within each group is reported in Fig. 9a. In general, ⟨‖ ‖⟩ correlates well with the macroscopic stress, as expected. Consistent with the trends reported earlier in Sects. 5.2.1, 5.2.2, a progressive separation between SZ and DZ again manifests from the onset of dilatancy, b , through to peak stress.
Additionally, it can be seen that the IB group of contacts show consistently higher ⟨‖ ‖⟩ than DZ throughout the loading history. Note that the relatively low mean force magnitude in DZ is mainly due to its large population compared to SZ and IB (Fig. 9b). To get a better sense of the difference in the underlying dynamics of force transmission in each zone, we turn our attention to the average congestion index ⟨ ⟩ (Fig. 9c) and the visualizations of in different contact groups (Fig. 9d).
Similar to the trends we saw earlier in Fig. 7b-d, much higher ⟨ ⟩ can be found in SZ, compared to that of DZ. This large separation in ⟨ ⟩ between the two zones initiates from b and is further enhanced in the lead up to peak stress ratio. The ⟨ ⟩ in the IB group of contacts (yellow curve, Fig. 9c) correlates well with that for SZ (red curve, Fig. 9c). In fact, it is even slightly higher at several stages, suggesting that there is a higher likelihood of congestion in these IB contacts. This in turn leads to bottlenecks that can impede force transmission across the DZ-SZ interface. Essentially similar dynamics manifest in real sand, bonded grains in concrete samples and at the field scale, offering great potential for early prediction of failure from grain motions to ground motion data [19,35,38,40,48].
Global congestion increases with strain, as the number of highly force-congested contacts grows predominantly in SZ (Fig. 9d). The proportion of contacts with above mean congestion index increases with strain in SZ and IB but decreases in DZ (data not shown). This corroborates earlier studies that showed force chain contacts inside SZ continually become predisposed to collapse [32,37,44]. More importantly, we see here a mechanism that underlies the difference in the dynamics of pre-failure SZ versus SZ at failure. This mechanism is path redundancy [38,48], which corresponds to the system's ability to redirect forces to alternative pathways or contacts. In the stable pre-failure regime, path redundancy is relatively high, since grains are redundantly constrained (hyperstatic) [31]; recall Fig. 8a, b. In these stages, the system can redirect forces to alternative paths to avoid contacts reaching their limiting force capacity (i.e., the force needed to break apart the contact). Hence these force reroutes give the system resilience to the microband dynamics (Fig. 8c), which is governed by contact rolling and sliding [66]. By contrast, at failure, the sparse connectivity in shearbands means many member contacts are either isostatic or hypostatic. This makes them prone to force congestion (Fig. 9c, d) and ensuant cascading failure, especially those in force chains, resulting in intense acoustic emissions [20,80].

Insights on controls on plasticity
Here demonstrates how the method and knowledge gained so far can be used to deliver new insights on controls on plasticity at the microscopic and macroscopic scales by focussing on rolling resistance at contacts and loading conditions, respectively. The aim is to understand the relative influence on SZ versus DZ of each of these factors. As shown in the supplementary file, the dynamics in SZ and DZ are qualitatively similar across all the other which differ from CP1 in rolling resistance (CV2 and CP2) or loading condition (CV1 and CV2).

Influence of rolling resistance at contacts
The key influence of rolling resistance is threefold. First, although rolling resistance suppresses grain motions overall, it bears a stronger impact on SZ than DZ (Fig. 10). The SZ and DZ identified by s-LID are rendered in the background in red and blue, respectively. Each contact is visualized as a black line, whose thickness stands for its congestion index: the thicker the line, the more likely the contact is congested Fig. 10 Temporal dynamics in mean absolute rotation across samples with different rolling resistance and loading condition peak values of ⟨� �⟩ with respect to DZ in both CP1 and CP2 are similar ( ∼ 0.025 radians) and the same is true with CV1 and CV2 ( ∼ 0.01 radians): see Fig. 10-left. On the other hand, grain rotations in SZ are suppressed by more than half from CP1 to CP2 (Fig. 10-right) and by about one third from CV1 to CV2 (Fig. S7c v.s. Fig. S10c). Similar effects can be observed in the other kinematic descriptors (compare Figs. S4, S7, S10 to Fig. 7), which suggest that the large motions that develop in SZ can be greatly reduced by the increase in rolling resistance, while movements in DZ are less impacted.
Second, in terms of connectivity, the samples with higher rolling resistance has a more degraded contact network in the failure regime. As can be seen in Fig. 11, the 3-core likelihood in both samples show a gradual decline before reaching the stress peak. However, given the high rolling resistance, grains in CP2 are more locked together-CP2 can accommodate a much higher stress ratio (Fig. S1a) with a less connected contact topology (Fig. S5). On the other hand, it is also more vulnerable to "network attacks": for example, 3-core can be considered an attack since it removes grains with degree below 3 from the contact network. This in turn reduces the number of grains in the 3-core, as evident in the decline in the minimum value of p(3-core | DZ) (dashed curve, Fig. 11-left), along with the appearance of a larger number of disconnected components or 'islands' (Fig. S5c, d). These are consistent with earlier findings on the connectivity of the grain contact network [78] and its complementary pore space network [45] for these samples.
Thirdly, a higher rolling resistance improves the force capacity of the contacts in CP2, which explains why CP2 can transmit higher forces on average (Fig. S6a), while being less congested (thinner lines with lower spatial density in Fig. S6d).

Influence of loading condition
A key feature of the CV samples is the rise of the stress ratio to a peak that manifests a prolonged plateau before decreasing to its residual value (Fig. S1b, c). In the CP samples, there is a distinct stress peak (Fig. 2, Fig. S1a). Our analysis sheds light on the mesoscale structures and deformation that underlies this difference in the macroscopic stress. For example, different from the CP1, after the initial decline in the average subgraph centrality ⟨c⟩ of DZ in CV1, ⟨c⟩ of DZ stays at its minimum for a long period (during the same strain interval where the stress ratio plateaus), before a recovery in the connectivity leading to a higher centrality in the failure regime (Fig. S8a). This recovery is essentially absent in SZ (Fig. S7a, S11a), where interconnected and larger voids develop [5,45]. Thus, under constant volume, the formation of large voids in SZ leaves less space for DZ to occupy, resulting in DZ having a more connected set of force transmission paths.
Additionally, changing the loading condition from constant confining pressure to constant volume uniformly demotes the grain motions of CV samples in the critical failure regime. Compared to CP1, the grain rotations and other kinematic properties in both SZ and DZ of CV1 are reduced jointly (Fig. 10). Specifically, the peak value of ⟨� �⟩ decreases from 0.025 to 0.013 in DZ and a similar extent of rotation suppression can be found in SZ (from 0.175 to 0.088).

Conclusion and future outlook
We used concepts from representation learning to develop a new data-driven framework to identify and characterize the coupled dynamics of emergent dead zones (DZ) and strain localization zones (SZ) across all stages of deformation history. This framework, with the newly introduced metric s-LID at its core, delivered new insights into the dual structure of granular deformation and its dynamics. Compared to those in SZ, grains in DZ move collectively in near rigid-body motion, lie mostly in the 3-core of the contact network where redundant force transmission paths exist and thus less forcecongested contacts. Grain properties and loading conditions influence SZ and DZ differently. Increasing rolling resistance suppressed motions more significantly in SZ than those in DZ. Loading-wise, the samples under constant volume exhibited a distinct recovery in their internal connectivity close to failure, but this is mainly confined to DZ. How the coupled dynamics of SZ and DZ adapt to changes in other material properties