Introduction

Ending all forms of violence against women and girls (VAWG) is a sine qua non for achieving gender equality and women's empowerment worldwide. VAWG occurs in different spheres of life, including work, neighborhoods, and school, but it is more frequently observed in the context of intimate relationships [1,2,3].

The urgency of addressing this problem has increased during the COVID-19 pandemic, as women's preexisting vulnerabilities have been further exacerbated by confinement measures, which have led to an increase in the amount of time family members spend together at home [2]. Furthermore, the access to and use of protective services assisting women, such as emergency shelters and refuges, health institutions, and social networks, have also been disrupted during the pandemic [4]. This alarming situation puts the identification of the risk factors for intimate partner violence (IPV) victimization at the center of the debate.

The general purpose of this study is to contribute to the understanding of IPV by identifying and describing the extent to which a comprehensive set of risk factors is associated with IPV victimization. However, examining IPV presents two key challenges. First, most related studies have focused solely on analyzing individual and relationship levels while overlooking the significance of community and societal factors in relation to IPV [5, 6]. Second, IPV is a multifaceted social problem with numerous potential covariates. Additionally, the links between each of these covariates and IPV may be described by various alternative effects, including linear, nonlinear, or interaction effects, which cannot be established or assumed a priori. This complexity, in terms of the extensive range of potential covariates and their multiple alternative effects, may result in a high-dimensional system of equations that is difficult, or even impossible, to solve using the traditional inference methods applied in IPV studies.

To overcome these challenges, this paper proposes a research strategy that involves creating a large dataset of more than 35,000 observations by integrating microdata from the 2016 Mexican Survey on the Dynamics of Household Relationships (ENDIREH) with information from nine other sources. The dataset contains 42 theoretical categorical and continuous covariates that describe the individual, relationship, community, and societal levels of the ecological approach. To account for the different potential effects of these covariates, a model with an additive structure was proposed [7, 8]. However, due to the resulting high-dimensional setting, traditional inference methods cannot be used. Therefore, the boosting algorithm is employed to simultaneously estimate and select variables and choose the best model [9,10,11].

This paper relates to the scientific literature on the factors associated with experiencing IPV in Mexico. This research contributes to this literature by examining physical IPV from a multilevel approach, considering the individual, relationship, community, and societal levels of the ecological model. Methodologically, the contribution of this paper is the introduction of algorithm-based regression models able to deal with big and high-dimensional data into the study of IPV. These models, although more complex, are more realistic since they enable us to identify a wide class of effects (linear, nonlinear, random, spatial, and interaction) and derive a sparse and parsimonious model from which researchers can draw conclusions in the same way as if the data had been fitted by conventional regression models.

The ecological model of IPV

Under the ecological approach, factors associated with IPV are observed at four different interacting levels: the individual level, the context of the intimate relationship, the community, and the society [12]. The interaction of variables at these levels is key because no single factor can explain IPV. Instead, the combination of many of these factors determines women's risk of victimization [12]. To briefly discuss the risk factors most consistently found to be significant in the literature, the following paragraphs present some findings at each level of the ecological model. This literature review will then be used to propose variables for analyzing the case of Mexico.

Individual-level factors

Personal sociodemographic, biological, and psychological characteristics of individuals in an intimate relationship, along with their previous experiences of victimization and perpetration, influence the likelihood of becoming a victim, or a perpetrator, of IPV.

Regarding factors that increase a woman's vulnerability to victimization, it is widely recognized that women are more likely to suffer from IPV if they are young, face economic hardship, have a low educational level, have a personal history of abuse, and/or accept unequal gender norms [12,13,14]. Empirical support for these findings is not limited to multicountry studies [15,16,17]; rather, it includes country-specific analyses. For example, in Uganda, risk factors for IPV were found to include early initiation of sexual activity, limited educational attainment, younger age, a history of sexual abuse during childhood or adolescence, and acceptance of violence [18]. Similar results were reported in an examination of the Ethiopian case [19]. A study on the prevalence of IPV in Norway revealed that women with unsafe living conditions and a history of relationship issues are more likely to experience IPV [20]. More evidence in this respect is available in studies conducted in Ecuador and Ghana [21], Canada [22], Turkey [23], and the West Bank, Palestine [24].

On the other hand, regarding the partner's individual characteristics (perpetration risk factors), existing studies have shown that IPV is more prevalent in relationships where the male partner has a low education level, low socioeconomic status, controlling behaviors toward the woman, previous exposure to abuse, personal history of violence perpetration, harmful use of alcohol and/or drugs, and low gender-equitable attitudes [12, 13, 15, 25,26,27]. For instance, a study in Vietnam revealed that a partner's behaviors supporting male domination, alcohol misuse, and witnessing violence during childhood were risk factors for perpetrating IPV [28]. Additionally, a study in Sweden determined that partner migration background is a relevant factor altering women's IPV victimization risks [29]. Further studies corroborating these results can be found in [30] with data from Sub-Saharan African countries, in [31] focusing on Turkey, and in [32] examining the Chinese context.

Relationship-level factors

The second level of the model focuses on the woman's interpersonal relationships within her family (with other household members) and with social peers, as well as her relative situation with her partner.

Regarding the woman's relationship with her family, studies suggest that households with traditional gender roles, characterized by an unequal division of household chores, restrictive gender expectations, and the prioritization of male authority within the household, as well as overcrowding, are factors that increase the risk of IPV. These findings were reported in a study conducted in Chicago [33], and similar conclusions were documented in three autonomous Spanish communities [34], in Lagos, Nigeria [35], and Egypt [36]. Additionally, evidence also indicates that the woman's closest social circle, including peers, influences the occurrence of IPV. In Tanzania, it was suggested that peers impact IPV through the internalization and pressure to conform to peer network norms, as well as their direct involvement in couple power dynamics [37]. Furthermore, studies have indicated that victims of IPV often spend less time with friends, leading to a reduced support network within their social circle [38].

The relationship level also considers the woman's situation in relation to her partner, i.e. IPV victimization risks are not solely associated with a woman's individual-level characteristics; rather, her vulnerability depends on a combination of her own factors and those of her partner [12]. For instance, an analysis of Ethiopian data revealed that a woman’s education level is not a direct correlate of IPV; instead, the difference in education relative to men’s is significant [39]. Moreover, studies from European Union countries indicated that disparities between a woman’s economic status and that of her partner contribute to higher IPV risks [40]. Additionally, a significant age gap has been found to increase the likelihood of IPV in India [41] and Turkey [23].

Community-level factors

The third level of the ecological approach to IPV examines the various settings in which social interactions occur and aims to identify the contexts associated with IPV. Generally speaking, there is broad consensus that IPV is not evenly distributed across geographic space but rather tends to be concentrated in communities characterized by high rates of violence (including gender-based violence) and crime, and by development issues such as inequality, poverty, and unemployment [12, 15, 42, 43].

In a study analyzing data from Samoa, it was found that, in addition to individual- and relationship-level factors, community characteristics, such as a greater proportion of women engaged in household decision-making and higher rates of employed men, were associated with lower levels of IPV [44]. Similarly, data from India indicate that the level of political involvement among women at the community level correlates significantly with marital conflicts and instances of violence. The overarching observation of this study was that increased representation of women in political spheres at the district level is associated with higher risks of physical IPV experienced by women [45]. Community dissimilarities in urbanization and education were identified as contributing factors to the likelihood of experiencing IPV in Ethiopia [19]. This evidence has also been found in a survey in Athens, Budapest, London, Östersund, Porto, and Stuttgart [46] and in metropolitan areas in the U.S. [47].

Societal-level factors

At the societal level, specific conditions, including the prevalence of harmful traditional gender norms that support nonegalitarian beliefs, corruption, limited access to quality public services, and high levels of criminal activity, can influence IPV victimization risks.

Research in [48] revealed that greater collective tolerance towards violence in the Niger Delta, compared to other regions in the country, was linked to a significantly higher prevalence of IPV. In India, it was found that societal factors such as attitudes towards mistreatment and standards of living increase individual IPV risks [49]. Studies in northern Uganda suggest that IPV results from structural violence in conflict-affected areas [50]. Supporting this perspective, an analysis of the Afrobarometer data shows that experiences of corruption are associated with individual permissive attitudes towards violence against women, consequently heightening the likelihood of IPV [51].

In addition to the previously mentioned key findings, studies underscore the significance of considering the interaction among multiple factors. For example, a study in Spain revealed that unemployment is an IPV risk factor for adult women but not for younger or elderly women [52]. Another study found that the relationship between age and victimization risk varies between indigenous and non-indigenous Canadian women [53]. Additionally, in Argentina, the impact of age on IPV was found to vary depending on the respondent’s level of education [54].

The Mexican case has been examined using a multidimensional approach within the ecological framework, yielding consistent findings regarding individual- and relationship-level factors [55,56,57,58,59,60,61]. However, these studies predominantly rely on data from the ENDIREH, limiting the examination of community and societal levels. Furthermore, the conclusions drawn are based on models that do not account for variable selection or model choice, and alternative effects of covariates, such as linear, nonlinear, and interaction effects, are not considered. Previous studies have shown that nonlinear and interaction effects are significant in describing the association of several covariates with the likelihood of experiencing IPV [62, 63].

Methods

Variables and data sources

Dependent variable

The response variable in this study measures women’s experiences of victimization from various violent physical acts committed by an intimate partner, including instances where their male partners pushed, pulled their hair, slapped, smacked, tied them up, kicked, threw objects at them, hit them with a fist or object, attempted to strangle or suffocate them, attacked them with a knife or blade, and/or shot them with a firearm [64].

Data are sourced from self-reported responses to a 2016 ENDIREH question, which asked partnered women and girls about their victimization experiences within their current relationship over the past 12 months (from October 2015 to October 2016). Responses to this question could be “many times”, “sometimes”, “once”, or “never” and were subsequently recoded as “yes” or “no” to create a binary response variable. Given that the survey does not precisely define the frequency of “many times” or “sometimes”, leaving it to the respondent’s discretion, dichotomizing the variable helps reduce measurement error. In this way, focusing the study on the likelihood of experiencing physical IPV simplifies the interpretation of the findings and facilitates comparisons between two mutually exclusive population subgroups: women who were victims of physical IPV in the previous 12 months and those who did not experience physical IPV during the same period. See [64] for details.
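The recoding described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was carried out in R), and the function name and the example answers are hypothetical; only the four answer options come from the text.

```python
# Illustrative recoding of the four ENDIREH answer options into a 0/1 response.
# The names below are hypothetical and are not taken from the survey microdata.
VICTIM_ANSWERS = {"many times", "sometimes", "once"}


def dichotomize(answer: str) -> int:
    """Return 1 if any physical IPV was reported and 0 if the answer was 'never'."""
    normalized = answer.strip().lower()
    if normalized in VICTIM_ANSWERS:
        return 1
    if normalized == "never":
        return 0
    # Anything else is treated as an error rather than silently recoded.
    raise ValueError(f"unexpected answer: {answer!r}")


# Made-up example answers, not survey data.
responses = ["never", "once", "Sometimes", "many times", "never"]
y = [dichotomize(r) for r in responses]
print(y)
```

Collapsing the three affirmative options into a single category discards the frequency information, which is the trade-off the paragraph above accepts in exchange for a less ambiguous outcome.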

Independent variables

Based on the ecological model and aiming at collecting information on a wide range of potentially associated factors at the individual, relationship, community, and societal levels to properly characterize the victims, perpetrators, and contexts of victimization, different official sources were identified and used to obtain as much information as possible. The full list of potential covariates included in this study is shown in Table 1.

Table 1 List of covariates by level, type, effect, and source

As shown in Table 1, information at the individual and relationship levels is sourced from the 2016 ENDIREH. For individual-level factors, the included variables aim to capture women’s demographic information, their socioeconomic situation, personal experiences, and attitudes towards gender issues, alongside their partner’s primary individual demographic and socioeconomic characteristics. Concerning the relationship level, variables related to the woman’s interpersonal networks and family are included. Additionally, the interaction between a woman’s and her partner’s individual characteristics is considered to describe their relative roles and statuses within the relationship. For example, while a woman’s age is an individual-level factor, when contextualized within her intimate relationship, it is introduced in interaction with her partner’s age. This interaction is consequently regarded as a relationship-level factor. This approach allows for an examination of whether the age gap between the woman and her partner is relevant in explaining IPV victimization risks, as found in [23] and [41].

At the community level, all the variables correspond to official estimates at the municipal level. Data from various sources are incorporated, including geographic information (geographic centroids for municipal polygons) and homicide statistics from the INEGI, poverty statistics and the municipal Gini index from the National Council for the Evaluation of Social Development Policy (CONEVAL), the municipal human development index estimated by the United Nations Development Program (UNDP), information from the 2015 Intercensal Population Survey, data from the National Population Council (CONAPO), and information from the 2015 National Census of Municipal and Delegation Governments (CNGMD).

Finally, at the societal level, variables are incorporated to reflect conditions at the supracommunity level. Data sources for this include the 2015 National Survey of Quality and Governmental Impact (ENCIG) and the 2016 National Survey on Victimization and Perception of Public Safety (ENVIPE). All these variables are aggregated at the state level.

After combining the abovementioned data from the different sources and levels into a single, unified dataset, the data were prepared for the analysis (both the data integration and the data preparation processes are described in the Electronic Supplemental Material). The final dataset consisted of 35,004 observations of women who were surveyed at the age of 15 years or older, were married or cohabitating with a male partner, and had at least one child at the time of the survey. This dataset is freely available from Figshare at https://doi.org/10.6084/m9.figshare.22153463. More information on the variables can be found in the Electronic Supplemental Material and in the original sources [64,65,66,67,68,69,70,71,72]. The raw data can be accessed at www.inegi.org.mx, www.coneval.org.mx, www.conapo.gob.mx, and www.mx.undp.org.
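To give an idea of what attaching municipal- and state-level information to individual records involves, the following Python sketch joins three toy tables on area codes. It is only an assumption about the general shape of such a merge. The actual integration procedure is the one documented in the Electronic Supplemental Material, and every field name and value below is a hypothetical placeholder.

```python
# Toy multilevel merge: individual records receive the covariates of their
# municipality and state. All keys, field names, and values are hypothetical.
women = [
    {"id": 1, "mun_code": "09002", "state_code": "09", "age": 34},
    {"id": 2, "mun_code": "20067", "state_code": "20", "age": 27},
]
municipal = {"09002": {"gini": 0.41}, "20067": {"gini": 0.47}}
state = {"09": {"corruption_rate": 0.12}, "20": {"corruption_rate": 0.09}}


def attach_context(row, municipal_table, state_table):
    """Return a copy of an individual record with community and societal covariates."""
    merged = dict(row)
    merged.update(municipal_table[row["mun_code"]])   # community level
    merged.update(state_table[row["state_code"]])     # societal level
    return merged


dataset = [attach_context(w, municipal, state) for w in women]
print(dataset[0])
```

In this layout, women living in the same municipality share identical community-level values, which is the hierarchical structure that the model later accounts for.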

Model

Let us consider a generalized additive model as in [7, 8], with response variable \({y}_{i}\) following a \(Bernoulli\left({\pi }_{i}\right)\) distribution with probability \({\pi }_{i}\in \left[0, 1\right]\), indicating whether woman \(i\) was a victim or not of physical violence within the intimate relationship in the last 12 months (\(1=True\)), for \(i=1,\dots ,n\) observations. By inserting the covariates from Sect. "Variables and data sources" into a probit additive model, we can formally express it as follows:

$$\eta_i = g^{-1}\left(\pi_i\right) = \beta_0 + \sum_{j=1}^{13} \text{w}_{ij}^{\prime}\beta_j + \sum_{k=1}^{26} s_k\left(\text{z}_{ik}\right) + \sum_{l=1}^{4} \delta_l\left(varying_l\right) + \sum_{m=1}^{5} \theta_m\left(surface_m\right) + \varphi\left(sp_i\right) + \varepsilon_i \tag{1}$$

For \(g\left({\eta }_{i}\right)={\pi }_{i}\), the standard normal cumulative distribution is used, \({\beta }_{0}\) is the model intercept, and \({\varepsilon }_{i}\) are the standard normal errors. The structure of the additive model proposed has five components, each of which is related to a different effect based on the type of variable and previous findings from research:

  • Parametric effects: \({\sum }_{j=1}^{13}{\text{w}}_{ij}^{\prime}{\beta }_{j}\) is the model component for estimating the linear effects of the categorical covariates.

  • Smooth effects: \({\sum }_{k=1}^{26}{s}_{k}\left({\text{z}}_{ik}\right)\) is a model component that captures the smooth effects of univariate continuous independent variables. Each of these continuous covariates is centered at the mean to aid convergence [73]. One of the goals of this paper is to properly identify the functional form of the association between IPV and the covariates; consequently, both linear and nonlinear effects are considered modeling alternatives for continuous covariates. To incorporate this into the model, consider the decomposition \({s}_{k}\left({\text{z}}_{ik}\right)={\alpha }_{0}+{\alpha }_{1}{\text{z}}_{ik}+{s}_{k}^{centered}\left({\text{z}}_{ik}\right)\), where \({\alpha }_{0}+{\alpha }_{1}{\text{z}}_{ik}\) is a linear expression and \({s}_{k}^{centered}\left({\text{z}}_{ik}\right)\) is a smooth deviation from the linear form [11, 73]. Functions \({s}_{k}^{centered}\left({\text{z}}_{ik}\right)\) are modeled as p-splines with a second-order difference penalty and 20 equidistant inner knots [74]. From the decomposition of \({s}_{k}\left({\text{z}}_{ik}\right)\), three possible results can be derived: a nonsignificant effect, a linear effect, or a nonlinear effect.

  • Interaction effects: \(\sum_{l=1}^{4}{\delta }_{l}\left({varying}_{l}\right)\) captures the interactions between a continuous and a categorical variable. Four interactions are considered in this paper:

    1. age of the woman by indigenous origin,

    2. age of the woman by education level,

    3. age of the woman at her first sexual intercourse by condition of consent, and

    4. age of the woman at marriage or at cohabitation by condition of consent.

  • Surface effects: \(\sum_{m=1}^{5}{\theta }_{m}\left({surface}_{m}\right)\) incorporates the interactions between two continuous covariates, which are modeled as bivariate p-spline base-learners. The following surface effects are considered:

    1. age of the woman by age at first childbirth,

    2. age of the woman by age at her first sexual intercourse,

    3. age of the woman by age at marriage or at cohabitation,

    4. age of the woman by age of the husband or partner, and

    5. woman’s monthly earned income by husband’s or partner’s reported monthly earned income.

  • Spatial effects: \(\varphi \left({sp}_{i}\right)\) introduces geospatial information, estimated by bivariate tensor product p-splines [11].
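The role of the second-order difference penalty in the smooth effects above can be illustrated with a short Python sketch. It is a simplified illustration and not the implementation used in the paper: it evaluates only the penalty term on a hypothetical vector of spline coefficients, and all names and values are made up. The point is that coefficients lying on a straight line receive zero penalty, so the penalty acts only on deviations from linearity, which is what allows a linear part and a smooth nonlinear deviation to be treated as separate alternatives.

```python
# Second-order difference penalty on a vector of (hypothetical) spline coefficients.

def second_differences(coefs):
    """Second-order differences: c[i] - 2*c[i+1] + c[i+2]."""
    return [coefs[i] - 2 * coefs[i + 1] + coefs[i + 2] for i in range(len(coefs) - 2)]


def difference_penalty(coefs):
    """Sum of squared second-order differences (before scaling by a smoothing parameter)."""
    return sum(d * d for d in second_differences(coefs))


# Coefficients on a straight line are not penalized at all ...
linear = [0.5 * i + 1.0 for i in range(8)]
# ... whereas a strongly oscillating sequence is penalized heavily.
wiggly = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]

print(difference_penalty(linear), difference_penalty(wiggly))
```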

Moreover, to take into account the hierarchical data structure, in which individual observations are connected to the information of the municipalities, and in turn to the state information, cluster-specific random intercepts are introduced. In addition, to ensure the representativeness of the data, sampling design and weights from the ENDIREH are used in the model.

The model in Eq. (1) cannot be estimated via traditional inference methods due to its inherent high dimensionality and complex structure. To overcome this issue, the following strategy was used. First, the boosting algorithm is used for model optimization [10, 75]. This approach combines estimation with simultaneous variable selection and model choice. The algorithm iteratively selects only the best fitting effect in each step. In this paper, 2000 initial boosting iterations are performed with a shrinkage parameter of 0.5 [76]. To achieve unbiased selection of influential variables and model choice, one degree of freedom is assigned to every alternative effect, aiming to make them comparable in terms of flexibility [9, 11]. To avoid overfitting, cross-validation is applied to find the optimal number of iterations. Consequently, as discussed in [73], multicollinearity problems are also prevented.
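The selection mechanism of component-wise boosting can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions, not the model of Eq. (1): it uses a squared-error loss instead of the probit likelihood, only simple linear base-learners, a fixed number of iterations without cross-validated early stopping, and toy data. The function names, the number of iterations (200), and the shrinkage value (0.1) are hypothetical choices for the example.

```python
# Component-wise boosting with squared-error loss and linear base-learners.
# In each iteration every candidate effect is fitted to the current residuals,
# but only the best-fitting one is updated, scaled by the shrinkage parameter.

def fit_simple_linear(x, residuals):
    """Least-squares slope (no intercept) of residuals on a centered covariate."""
    sxx = sum(v * v for v in x)
    slope = sum(a * b for a, b in zip(x, residuals)) / sxx
    rss = sum((b - slope * a) ** 2 for a, b in zip(x, residuals))
    return slope, rss


def componentwise_boost(X, y, n_iter=200, nu=0.1):
    n = len(y)
    offset = sum(y) / n                      # start from the mean of the response
    coefs = {name: 0.0 for name in X}
    fitted = [offset] * n
    for _ in range(n_iter):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        best_name, best_slope, best_rss = None, 0.0, float("inf")
        for name, x in X.items():            # fit every base-learner separately
            slope, rss = fit_simple_linear(x, residuals)
            if rss < best_rss:
                best_name, best_slope, best_rss = name, slope, rss
        coefs[best_name] += nu * best_slope  # update only the selected effect
        fitted = [f + nu * best_slope * v for f, v in zip(fitted, X[best_name])]
    return offset, coefs


# Toy data: the response depends on x1 only; x2 is an irrelevant covariate.
X = {"x1": [-2.0, -1.0, 0.0, 1.0, 2.0], "x2": [1.0, -1.0, 0.0, -1.0, 1.0]}
y = [3.0 + 2.0 * v for v in X["x1"]]
offset, coefs = componentwise_boost(X, y)
print(offset, coefs)
```

Because the coefficient of an effect that is never selected stays at exactly zero, estimation and variable selection happen in the same pass, which is the property described in the paragraph above.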

After this initial selection of variables in their appropriate functional form, complementary pairs stability selection with per-family error rate (PFER) control was performed to avoid falsely selecting variables. For this purpose, a cutoff of 0.8 is set; i.e., to be considered stable, an effect had to be selected in at least 80% of the fitted boosting models. Given the number of covariates in the full model and their alternative effects, this cutoff of 0.8 corresponds to a PFER with a significance level of 0.0425. See [77, 78] for details.
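The logic of complementary pairs stability selection can be sketched as follows. The Python code is a toy illustration under stated assumptions: the selection step is replaced by a simple correlation-based ranking rather than a boosting fit, the PFER bound is not computed, and the data, the number of pairs, and all names are hypothetical. Only the cutoff of 0.8 is taken from the text.

```python
import random

# Complementary pairs stability selection (toy version): the data are split
# repeatedly into two disjoint halves, a selection procedure is run on each
# half, and only effects selected in at least `cutoff` of all runs are kept.

def top_q_by_abs_correlation(X, y, rows, q):
    """Toy selector (assumption): keep the q covariates most correlated with y."""
    def abs_corr(x):
        xs = [x[i] for i in rows]
        ys = [y[i] for i in rows]
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
        sxx = sum((a - mx) ** 2 for a in xs)
        syy = sum((b - my) ** 2 for b in ys)
        if sxx == 0 or syy == 0:
            return 0.0
        return abs(sxy) / (sxx * syy) ** 0.5
    ranked = sorted(X, key=lambda name: abs_corr(X[name]), reverse=True)
    return set(ranked[:q])


def stability_selection(X, y, selector, q=1, n_pairs=50, cutoff=0.8, seed=0):
    rng = random.Random(seed)
    n = len(y)
    half = n // 2
    counts = {name: 0 for name in X}
    for _ in range(n_pairs):
        rows = list(range(n))
        rng.shuffle(rows)
        # Two disjoint subsamples of size n // 2 form one complementary pair.
        for subsample in (rows[:half], rows[half:2 * half]):
            for name in selector(X, y, subsample, q):
                counts[name] += 1
    freq = {name: c / (2 * n_pairs) for name, c in counts.items()}
    stable = {name for name, f in freq.items() if f >= cutoff}
    return freq, stable


# Toy data: y is a linear function of x1; x2 is unrelated to the mechanism.
X = {"x1": [float(i) for i in range(20)],
     "x2": [float((2 * i) % 5) for i in range(20)]}
y = [2.0 * v + 1.0 for v in X["x1"]]
freq, stable = stability_selection(X, y, top_q_by_abs_correlation)
print(freq, stable)
```

An effect that is picked up only on some subsamples ends with a selection frequency below the cutoff and is discarded, which is how the procedure guards against false selections.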

Finally, 95% confidence intervals for each of the effects selected via stability selection were calculated by drawing 1000 random samples from the empirical distribution of the data using a bootstrap approach based on pointwise quantiles [73].
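A quantile-based bootstrap interval of this kind can be sketched in Python as follows. The example is an assumption-laden simplification: it resamples a toy 0/1 vector and computes the interval for a single scalar statistic (the sample proportion), whereas for an estimated effect curve the same quantiles would be taken pointwise at each evaluation point of the refitted model. The function names and data are hypothetical; the 1000 replicates and the 95% level follow the text.

```python
import random

def percentile(sorted_values, p):
    """Linear-interpolation percentile for p in [0, 1] on a pre-sorted list."""
    position = p * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    weight = position - low
    return sorted_values[low] * (1 - weight) + sorted_values[high] * weight


def bootstrap_interval(data, statistic, n_boot=1000, level=0.95, seed=0):
    """Quantile interval from resampling the data with replacement."""
    rng = random.Random(seed)
    n = len(data)
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = (1 - level) / 2
    return percentile(replicates, alpha), percentile(replicates, 1 - alpha)


def proportion(sample):
    return sum(sample) / len(sample)


# Toy binary outcome (made-up values, not survey data).
toy = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
lower, upper = bootstrap_interval(toy, proportion)
print(lower, upper)
```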

All computations were performed with the R package “mboost” [76]. The code and data used to replicate this study are available from Figshare at https://doi.org/10.6084/m9.figshare.22153463.

Results

In the data, it is observed that approximately 8.3% (2,892 respondents in the sample) of the surveyed women experienced some form of physical violence within the relationship in the 12 months prior to the interview. After the boosting algorithm was applied to these data, 3970 iterations were required to optimize the model according to Eq. (1). Subsequently, once the stability selection approach was used, nine covariates were found to be significantly associated with physical IPV victimization.

It is important to note that, as in traditional regression models, the coefficients in Table 2 indicate the direction and strength of the covariate effect on the response. To transform the coefficients into the probability of experiencing physical IPV, the cumulative standard normal distribution must be used. A significant effect is found when the corresponding 95% confidence interval does not contain zero. For categorical variables, estimates show the difference in comparison to the reference category. For the remaining selected effects, refer to the corresponding figures.
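The transformation from the linear predictor to a probability can be illustrated with a minimal Python sketch (the original analysis was carried out in R). The function names and the numerical inputs are hypothetical and are not estimates from Table 2. The sketch also shows why a coefficient on the probit scale does not translate into a fixed number of percentage points: the change in probability depends on the reference value of the linear predictor.

```python
from math import erf, sqrt

def probit_probability(eta: float) -> float:
    """Standard normal cumulative distribution evaluated at the linear predictor."""
    return 0.5 * (1.0 + erf(eta / sqrt(2.0)))


def effect_in_percentage_points(eta_reference: float, coefficient: float) -> float:
    """Change in probability, in percentage points, when the linear predictor
    moves from eta_reference to eta_reference + coefficient."""
    change = probit_probability(eta_reference + coefficient) - probit_probability(eta_reference)
    return 100.0 * change


# Hypothetical values: the same coefficient implies different changes in
# probability at different reference levels of the linear predictor.
print(effect_in_percentage_points(-1.5, 0.3))
print(effect_in_percentage_points(0.0, 0.3))
```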

Table 2 Estimates of the boosting additive probit model for physical IPV

Fig. 1 Physical IPV victimization risk and women’s age at first sexual intercourse by consent. The bold lines indicate the expected value, whereas the area within the dashed lines represents the 95% confidence intervals

Fig. 2 Physical IPV victimization risk and women’s age at marriage or at cohabiting by consent. The bold lines indicate the expected value, whereas the area within the dashed lines represents the 95% confidence intervals

Fig. 3 Physical IPV victimization risk and average number of household members per room in the dwelling. The bold line indicates the expected value, whereas the gray area represents the 95% confidence interval

Fig. 4 Spatial effects of physical IPV victimization risk

Fig. 5 Physical IPV victimization risk and share of senior positions in the public administration held by women. The bold line indicates the expected value, whereas the gray area represents the 95% confidence interval

At the individual level, only one variable was found to be significantly correlated with physical IPV: women’s age at first sexual intercourse by the condition of consent. This study found that women who initiated sex during childhood are at a higher risk of victimization compared to those who initiated sex as adults, as illustrated in Fig. 1. The data reveal a consistent pattern: the risk of victimization steadily decreases as the age at which a woman first had sex increases. No significant differences are observed based on the condition of consent (the 95% confidence intervals overlap). Therefore, the expected risk of victimization is consistent regardless of whether the first sexual experience was consensual or not.

At the relationship level, physical IPV was significantly associated with six distinct factors. These factors encompass three key dimensions: the woman's relative role within the relationship, her family environment, and her relationships with peers and friends. Specifically, concerning the woman's role within her intimate relationship, a significant correlation between physical violence and age at marriage or cohabitation was found only among women who agreed to marry or move in by choice (Fig. 2). The effect is depicted as an inverted U-shaped curve reaching a maximum at about 18 years of age with a probability of approximately 57% of becoming victims of physical IPV.

At the relationship level, a second significant finding related to the woman's relative role within the intimate relationship was the importance of her autonomy in making decisions about her sexual and professional life as a factor for physical IPV, as shown in Table 2. Women with a moderate level of autonomy experience a reduced risk of victimization compared to those with low or high levels of autonomy, with decreases of approximately 9.3 and 16.2 percentage points for professional and sexual autonomy, respectively.

In terms of household characteristics at the relationship level, both the number of members in the family and the distribution of housework among them are found to be significantly associated with physical IPV. Notably, regarding overcrowding, as the number of household members grows, the likelihood of suffering from physical violence in the context of intimate relationships increases at a rate of 1.6 percentage points per member per room in the dwelling, as shown in Fig. 3.

Furthermore, concerning the division of household chores as a significant household factor linked to IPV, it was found that women in families where all housework is solely performed by male members are approximately 8.6 percentage points less likely to experience physical IPV than those women in families with other housework arrangements, as indicated in Table 2.

Concerning the woman's relationship with peers and friends as a relationship-level factor, it was found that, on average, women with a moderate level of perceived support from social networks have a 9.7 percentage points higher probability of victimization compared to women with low or high levels of support, as shown in Table 2.

Finally, at the community level, two factors were found to be associated with physical IPV. In particular, it was found that physical IPV is not a phenomenon homogeneously distributed in geographic space (see Fig. 4). It is observed that women and girls living in municipalities located in the central region of Mexico are on average more likely to suffer from physical violence perpetrated by their partner than women living in other regions. Lower probabilities of physical victimization within intimate relationships are observed for women living in municipalities in northwestern Mexico.

Furthermore, another community factor relevant to physical IPV was women's participation in municipal government, as shown in Table 2. Specifically, the risk of victimization increases with the proportion of senior positions (Municipal President and heads of Municipal Secretaries) held by women, as illustrated in Fig. 5. In communities where all senior positions are held by men, the likelihood of a woman experiencing physical IPV is approximately 4.5 percentage points lower than the risk for a woman living in a community where 50 percent of these positions are held by women.

It is important to note that no factor at the societal level was found to be relevant for physical IPV.

Discussion

The abovementioned results indicate correlations between the covariates and women’s probability of suffering physical violence in the context of their intimate relationships. These significant correlations do not necessarily imply a causal effect; nevertheless, key insights can be obtained, and possible explanations can be drawn.

Firstly, at the individual level of the ecological model, only women’s age at first sexual intercourse was found to be significantly associated with physical IPV. Findings suggest that women who initiate sex during childhood, even if they do not consider it an unwanted or coerced experience, are a particularly vulnerable population to physical IPV in Mexico. This result is consistent with previous studies [2, 12, 18]. Moreover, this indicator is particularly important in gender studies as it reflects broader issues of gender inequality, power dynamics, and vulnerability to abuse [44, 79, 80]. It is crucial to recognize that the involvement of a child in sexual activity is a form of abuse, as the child is not able to give informed consent and is not emotionally or physically prepared for it. Consequently, sexual initiation during childhood can negatively impact individuals’ personality and emotional state, affecting their self-esteem, perception of healthy relationships, and behavior [79, 80]. Thus, sexual experience during childhood becomes a vulnerability factor for revictimization at later stages in life.

The relationship level emerged as the ecological level with the largest number of significant variables associated with physical IPV. This may underscore the importance of interpersonal dynamics (encompassing those with the intimate partner, family, and social peers) in understanding and addressing physical IPV. Specifically, concerning the woman's relative role within the intimate relationship, the effect of age at marriage or cohabitation for those who consented to marriage is described by an inverted U-shaped curve. One potential explanation for this pattern is that women who marry or cohabit during childhood, a form of violence against girls, may be more inclined to justify their partner's violent behavior and adhere to traditional gender roles, thereby making them less likely to report IPV. This explanation finds support in studies conducted in Pakistan [81] and Ethiopia [82], which identified a more tolerant attitude towards wife-beating among women married as children, generally with limited decision-making power. The curve reaches its peak for women who marry in adolescence. As reported in [83], this could suggest that adolescent marriages may be characterized by greater IPV due to increased antisocial behaviors such as alcohol use, disagreements, and jealousy, serving as a causal mechanism for future acts of violence. This dynamic within the relationship may persist into subsequent phases of life. The descending side of the curve indicates that as the age at marriage or cohabitation increases, the risk of victimization decreases. This decrease could be attributed to the fact that victimization risks may diminish as individuals mature and acquire greater social, emotional, and economic resources, as explained in [84].

Furthermore, a second significant finding at the relationship level highlights the importance of a woman’s autonomy in making decisions about her sexual and professional life. Interestingly, the victimization risk of women with high levels of sexual and professional autonomy does not differ significantly from that of women with low decision-making power. In line with previous studies [85, 86], this association suggests that, taking low autonomy as the reference, the victimization risk decreases by approximately 16.2 percentage points for sexual autonomy and 9.3 percentage points for professional autonomy as women's decision-making power increases to a medium level. However, as women's agency in these areas increases from a medium to a high level, their male partners may seek to exert domination and power over them in other aspects of the relationship, leading to an increased risk of physical violence, a phenomenon known as “male backlash” [86]. Consequently, the probability of women's victimization rises again.

At the relationship level, concerning household characteristics associated with IPV, it was observed that women living in overcrowded households face a higher risk of physical IPV victimization. The lack of sufficient living space for family members can intensify stress, tension, and conflicts among household members, thereby increasing the likelihood of physical violence by a partner against the woman [35]. Additionally, the division of household chores among family members was identified as another family characteristic linked to IPV victimization risks. The findings indicate that in households where housework is exclusively performed by women, the risk of IPV victimization is higher. As explained in [36], such households likely adhere to more traditional gender norms, which manifest in the use of violence as a mechanism of control by men over their partners.

At the relationship level, not only are interactions with the intimate partner and the family associated with IPV, but so is the woman's relationship with peers and friends. Regarding the social networks variable, defined as a woman's perception of having support from peers and friends, as described by [38], a potential causal mechanism leading to IPV could be that, as women improve their social connectedness, conflicts and disputes with their partners initially increase, raising their probability of suffering physical IPV. Once a medium level of social connectedness is surpassed, the likelihood of IPV victimization declines to its initial level, perhaps because men come to accept their partner's social interactions or because friends exert a positive influence on the woman, making her less likely to accept violence.

Lastly, community characteristics were also found to be associated with physical IPV. It is important to note that these characteristics were not related to the economic, demographic, or security conditions of the community, but exclusively to women's role and participation in the municipality and to the geographic distribution of IPV, which might indicate the nature of this phenomenon and the community conditions that influence intrahousehold dynamics. Specifically, the risk of physical IPV is disproportionately concentrated in central Mexico, while women in northwestern municipalities have a reduced likelihood of experiencing physical victimization in intimate relationships. This geographic distribution of physical IPV risks may reflect the spatial variation of other related factors, particularly those associated with gender and development. For instance, overlaying indicators from the Gender Atlas produced by INEGI on the map in Fig. 4 reveals that lower rates of women's informal labor and a smaller percentage of the female population living in multidimensional poverty are found in the northern regions of Mexico, whereas higher rates are found in the central-southern region. Additionally, higher average schooling grades for women are observed in the northern states of Mexico, with lower averages in central and southern Mexico [87].

Furthermore, the association between the risk of victimization and the proportion of senior positions (Municipal President and heads of Municipal Secretaries) held by women can be read as a two-way relationship. On the one hand, a more active role of women in the political public sphere may lead to more tensions and disputes in private life, specifically in the context of intimate relationships. This causal effect has been previously found in India [45]. On the other hand, since physical violence is one of the most visible faces of IPV, greater concern about this issue is observed in municipalities where women and girls are more likely to be victimized. As a consequence, more women become involved in public decisions to fight IPV in these municipalities.

No factor at the societal level was found to be significantly associated with physical IPV. This could indicate that elements closer to the individual, such as her interactions with peers, family, and partner, may be more relevant in understanding and addressing physical IPV than broader societal factors. It suggests that societal-level variables might not directly influence the occurrence of physical IPV in Mexico.

Conclusions

To determine the factors linked to women’s likelihood of suffering from physical violence by their intimate partners, we applied a boosting additive model with a binomial response variable to a dataset composed of more than 35,000 households. To properly describe the risk factors associated with physical IPV, following the ecological model approach, we introduced a set of 42 potential covariates by integrating data from nine different sources, including surveys, administrative records, and censuses, and incorporating information at the individual, relationship, community, and societal levels.

From a methodological point of view, by applying the boosting algorithm to the high-dimensional data structure of the model, we were able to automatically select, across the multiple covariates and modeling alternatives, those variables found to be significantly correlated with the probability of physical IPV victimization without establishing a priori a particular functional form. In this way, not only linear relationships but also nonlinear and interaction effects were selected.
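The selection mechanism described above can be illustrated with a minimal Python sketch of component-wise gradient boosting for a binomial response. It is not the implementation used in this study: it assumes linear base-learners only (the model in this paper also allows nonlinear and interaction effects), and the simulated data, the function name `componentwise_boost`, the step length `nu`, and the number of steps are hypothetical choices made for illustration.

```python
import numpy as np


def componentwise_boost(X, y, n_steps=100, nu=0.1):
    """Component-wise gradient boosting with the binomial (logistic) loss.

    X: (n, p) array of centered and scaled covariates.
    y: (n,) array of 0/1 responses.
    Returns the intercept, the coefficients, and how often each
    covariate was selected as the best base-learner.
    """
    n, p = X.shape
    prevalence = y.mean()
    intercept = np.log(prevalence / (1.0 - prevalence))  # offset: log-odds
    coef = np.zeros(p)
    selected = np.zeros(p, dtype=int)
    col_ss = (X ** 2).sum(axis=0)  # sum of squares of each column

    for _ in range(n_steps):
        eta = intercept + X @ coef
        prob = 1.0 / (1.0 + np.exp(-eta))
        resid = y - prob  # negative gradient of the binomial loss
        # Least-squares fit of every single covariate to the residuals.
        beta = (X.T @ resid) / col_ss
        # Residual sum of squares of each candidate base-learner.
        rss = (resid ** 2).sum() - beta ** 2 * col_ss
        j = int(np.argmin(rss))  # best-fitting covariate in this step
        coef[j] += nu * beta[j]  # update only that covariate
        selected[j] += 1
    return intercept, coef, selected


# Hypothetical simulated data: only covariates 0 and 3 affect the outcome.
rng = np.random.default_rng(0)
n_obs, n_cov = 2000, 10
X_raw = rng.normal(size=(n_obs, n_cov))
eta_true = 1.5 * X_raw[:, 0] - 1.0 * X_raw[:, 3]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true))).astype(float)
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

intercept, coef, selected = componentwise_boost(X, y, n_steps=100, nu=0.1)
print("selection counts:", selected)
print("coefficients:", np.round(coef, 3))
```

Because only one covariate is updated per iteration, covariates that are never chosen keep a coefficient of exactly zero, which is how estimation and variable selection happen simultaneously. In practice the number of iterations would be tuned, for example by cross-validation, rather than fixed in advance as in this sketch.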

The results contribute to the study of physical IPV in Mexico in four ways. First, the findings underscore the importance of including variables at different levels of the ecological model rather than restricting the analysis to individual and relationship factors, as has been done in most related studies. Second, a set of factors correlated with physical IPV is identified, including age at first sexual intercourse, age at marriage, autonomy over one’s sexual and professional life, social networks, overcrowding, division of housework, and women’s participation in government. Third, some groups at particular risk of victimization are identified, such as women who had sex for the first time as children and women living in overcrowded households. This implies that comprehensive IPV prevention programming is needed to delay sexual initiation, protect girls and women from forced sex and forced marriage, and develop strategies to address household overcrowding. Finally, this phenomenon is not homogeneously distributed across the country; rather, higher probabilities of victimization are concentrated in the center of Mexico.

The main limitation of this paper is that the findings refer exclusively to associations between the covariates and the likelihood of physical IPV victimization, which does not necessarily imply causality. A second limitation is that the analysis covers a single year, so some features may not apply to other periods. Additionally, this study exclusively analyzes the IPV experiences of married or cohabiting women with at least one child. Women in other relationship or family structures may face different risk factors for IPV. Furthermore, this study focuses only on physical IPV. Other forms of IPV, such as emotional, sexual, and economic violence, may have distinct risk factors due to their inherent nature and dynamics. This paper also does not examine other features of physical violence, such as frequency and severity. Lastly, while survey studies offer valuable insights into the prevalence and correlates of IPV, their inherent limitations must be recognized. One such concern is the risk of non-disclosure bias, whereby individuals may be hesitant to report their experiences of victimization, particularly if they are still in an abusive relationship and fear potential repercussions from their partner. This potential bias could result in an underestimation of the true prevalence of IPV within surveyed populations. Readers must consider these limitations when interpreting the findings and applying them to broader populations or different types of IPV.