Poverty dynamics and graduation from conditional cash transfers: a transition model for Mexico’s Progresa-Oportunidades-Prospera program

The effects of conditional cash transfers (CCTs) on poverty and well-being have been widely studied. However, little is known about how a CCT should respond to the dynamics of poverty. How should program administrators treat beneficiaries that exit poverty in period t-1 but exhibit a high probability of falling back into poverty in period t? This is a relevant, yet unanswered, question. This paper analyses the implications of poverty dynamics for the implementation of graduation strategies of CCTs, taking Mexico’s Progresa-Oportunidades-Prospera (POP) program as a reference case. We propose a Markovian transition model that allows us to control for unobserved heterogeneity, state dependence, and attrition. The model provides a framework for a generic graduation condition that can be applied to cash transfer programs that follow well-defined eligibility income thresholds. Overall, we find that only one-third of program beneficiaries that were poor in 2002 exhibited low probabilities of becoming poor in 2009–12 and could therefore be regarded as true ‘graduates’ of the program. We also find that the ‘recertification’ process of POP, which takes place every three years, would be more efficient if it took place every 3.7 and 5.1 years in urban and rural areas, respectively.


Introduction
Governments in developing countries have increasingly resorted to cash transfer programs to fight extreme poverty and vulnerability. The typology of these programs is complex and diverse in terms of design features, scope and policy objectives. 1 Since the early 2000s, conditional cash transfers (CCTs) have expanded rapidly in Latin America to cover, on average, 25% of households living in the region (Ibarrarán et al. 2017; Nino-Zarazua 2011). By design, these programs aim to tackle the intergenerational transmission of poverty by providing income support to people in poverty in exchange for regular school attendance of children and periodic health check-ups of household members. Although the transfer in these programs is received by parents, it is assumed that their children will be better equipped to escape poverty in the future.
Poverty dynamics and eligibility criteria play a pivotal role in the entry to, and exit from, CCTs. A distinctive feature of CCTs is their use of complex systems of identification and selection of beneficiaries, based on categorical and geographical criteria, and proxy-means tests that identify households' poverty status based on in-situ interviews. 2 These systems of identification and selection of beneficiaries have the specific objective of improving the efficacy of poverty targeting while restricting the political manipulation of antipoverty policies. However, due to the costs involved in updating these information systems, the poverty status of beneficiary households at the entry point remains unchanged until a recertification process is conducted.
In Brazil, Colombia, Chile and Mexico, some of the pioneer countries that introduced CCTs at scale, the first generation of beneficiary households could have received support for more than ten years (Accion Social 2010; Barrientos 2013; Behrman et al. 2008; Neidhöfer and Niño-Zarazúa 2017). With such a long treatment duration, one might expect that the poverty status of beneficiaries, and possibly their behaviour, could have changed over time. In reality, CCTs usually adopt graduation strategies that are based on categorical criteria (for example, as long as eligible school-age children remain enrolled in school) and also on periodic eligibility assessments that determine the income levels upon which beneficiaries leave the cash transfer program. To date, there has been little recognition of the importance of poverty dynamics in the design and implementation of conditional cash transfers. 3
The change in the poverty status of program beneficiaries can reflect structural factors but also behavioural responses that negatively affect the dynamics of poverty. The literature on poverty dynamics and welfare regimes highlights that households with past exposure to welfare benefits are more likely to participate in transfer programs in the future (Andrén 2007). In the assessment of poverty transitions we would need to separate, among the recurrent poor, those that had been de facto affected by economic and social hardship from those whose behaviour had been biased towards poverty and subsequent eligibility for the transfers. The latter is what Sen (1995) calls 'incentive distortion', in which households change their behaviour in order to keep their eligibility status. 4 There is a large literature examining the underlying causes of state dependence, particularly in Europe and North America. In Canada, for instance, Hansen and Liu (2014) report that the probability of recurrence of welfare program participation can reach up to 90%, in contrast with individuals who have never participated. Concerns about whether, and the extent to which, welfare regimes have generated present and intergenerational welfare dependency that undermines work incentives and entrepreneurship have been at the centre of scholarly work over the past half century (Weissberg 1970; Antel 1992; Gottschalk 1992; Maloney et al. 2003). In the presence of CCTs, the poor and vulnerable non-poor with previous program exposure could have incentives to change their behaviour to keep their benefits, relative to those eligible households that never participated in a CCT program. Under such conditions, poverty dynamics would exhibit, partly at least, state dependence whereby poverty and program participation can affect observed and unobserved heterogeneity that ultimately impacts future poverty (Heckman 1981).

1 For a typology, see Barrientos and Niño-Zarazúa (2011).
2 For example, the SIUBEN in Dominican Republic, the SELBEN in Ecuador or the FSU in Honduras.
3 To illustrate, in 2007 nearly 60,000 households were dropped from Colombia's Familias en Acción because they had crossed the eligibility poverty threshold to remain in the program. Three years later, 30% of those who had left the program fell back into poverty (Barrientos and Villa 2015a, b).
4 Kanbur et al. (1994) have explored the implications of labour incentives for program eligibility.
Issues about welfare dependency among the underclass, female-headed households and ethnic and racial minorities have been extensively studied (Darity and Myers 1983;Robins 1986;Harris 1993;Blume and Verner 2007). The literature highlights the role of family structures and norms (Lindbeck et al. 1999;Lindbeck and Nyberg 2006), substance abuse (Schmidt et al. 1998) and the generosity and length of welfare spells (Blank 1989;Hoynes and MaCurdy 1994;Bane and Ellwood 1996;Hu 1999;Fortin et al. 2004), as underlying factors behind undesirable behavioural patterns. 5 Although state dependence has been largely studied in high income countries, there are concerns that similar behaviours could also be observed in developing countries. 6 In the context of Latin America, however, where the size of benefits is considerably smaller, and their duration often much shorter, than in high income countries, evidence of state dependence from CCTs is weak. For instance, in Mexico (Skoufias and Di Maro 2008;Alzúa et al. 2013), Brazil (Barrientos et al. 2016) and Chile (Galasso 2006) no sizable negative effects of CCTs have been found on adult labour force participation, whereas in the Dominican Republic (Canavire Bacarreza and Vasquez-Ruiz 2013) and Colombia (Barrientos and Villa 2015a, b) studies even report small but positive labour supply effects.
Discussions on the duration of income support of CCTs and subsequent exit strategies should nevertheless not be confined to the current poverty status of beneficiaries, but should also account for possible changes in the incentives that transfer programs may generate. Some CCTs have adopted the concept of 'program graduation', which consists of detecting households that have moved up the income ladder and crossed the eligibility threshold to receive cash benefits. Mexico's Progresa-Oportunidades-Prospera (POP) program, for instance, conducts eligibility assessments (recertificaciones) of beneficiaries every three years. When households cross the eligibility threshold, program administrators either drop the household or reduce the level of benefits according to the household's predicted disposable income. 7 Ideally, program graduation should consider the non-recurrence of poverty (Munro 2008). However, the problem with this graduation approach is that it ignores the possibility of 'graduated' households exhibiting non-positive trajectories in their socioeconomic mobility. Vulnerable non-poor households are often at risk of becoming poor and, consequently, eligible to receive welfare benefits. How should CCTs respond to the conditions of vulnerability to poverty? More specifically, how should transfer programs treat beneficiaries that exit poverty in period t-1, but exhibit a high probability of falling into poverty in period t? This is a relevant, still unanswered question.

5 For reviews of the literature on welfare dependency, see Moffitt (1992), Penman (2006), and Barrett and McCarthy (2008).
6 Indeed, Narayan-  have found that in more than 15 developing countries, previous poverty experience, even in the most precarious conditions, is a positive predictor of future poverty.
7 A recent study of Mexico's Oportunidades (González-Flores et al. 2012) shows that households that became ineligible through the recertification process did so primarily because they had obtained durables or changed their composition. For a review of existing CCT graduation practices, see Cecchini and Madariaga (2011).
This paper contributes to the existing literature in two important ways. First, it conceptualises the implications of poverty dynamics for the implementation of graduation strategies of CCTs. We consider a hypothetical poverty trend of a typical household participating in a CCT and address the question of whether the program should end or continue support given the household's current and prospective wellbeing status. Second, we estimate the likelihood of vulnerable but non-poor households becoming poor and hence eligible to receive a cash transfer. To achieve this objective, we adopt a Markovian model of multivariate normal probabilities that allows us to estimate poverty dynamics and eligibility for program treatment while accounting for three important factors: (i) unobserved heterogeneity determining poverty status; (ii) the possibility of selection bias associated with behavioural responses that can lead to state dependence; and (iii) potential bias arising from panel attrition. Our empirical approach helps identify a 'graduation line' above which the non-poor would exhibit low probabilities of future poverty spells.
The conceptualization and empirical approach adopted in this study provide parameters for the assessment of graduation rules of CCTs in contexts where longitudinal household survey data or administrative records of transfer programs are available. 8 To illustrate the proposed method, we implement the empirical strategy using the nationally representative longitudinal Mexican Family Life Survey (MxFLS) and focusing on Mexico's POP program, which is the flagship anti-poverty program in the country and has been a general reference point for the replicability of similar strategies in other contexts (Nino-Zarazua 2011;Parker and Todd 2017).
POP is perhaps the most documented CCT, with extensive evidence of its impacts on a wide range of outcomes, including, inter alia, poverty and migration (Alcaraz and Alejandro 2012), consumption (Angelucci et al. 2012; Hoddinott and Skoufias 2004), health (Gertler 2004), and nutrition and early childhood (Hoddinott et al. 2008). Several design features of POP have been subject to extensive analysis, including the targeting mechanisms (Coady and Parker 2009; Skoufias et al. 1999), transfer size (Levy 2006), and periodicity and general operation rules. Recent findings have demonstrated upward social mobility among children who first benefitted from POP 20 years ago (Kugler and Rojas 2018).
POP was launched in August 1997 under the name Progresa, with the aim of breaking the intergenerational cycle of poverty. The program provides income support to households in poverty bimonthly, together with nutritional supplements for young children aged four months to two years, and for pregnant and lactating women. The cash transfer is given to female household heads in exchange for school attendance of children of school age, health checkups of household members and attendance at group meetings where health, hygiene, and nutrition issues are discussed. POP also provides school grants per child enrolled in primary, secondary and tertiary education (Levy 2006; Nino-Zarazua 2011; Skoufias et al. 2001). POP initially covered 300,700 households (about 1.6 million people) living in 6,344 rural municipalities. It rapidly expanded its coverage in rural and, after 2000, also in poor urban localities, and by 2002 (the baseline period of our study) the program already covered 21.6 million households, or approximately 21% of Mexico's population, under the name Oportunidades. In subsequent years, the program continued to grow, although modestly, to approximately 22% of the total population (see Fig. 3 in the Appendix).

8 Currently there are 41 developing and transition countries with household panel data suitable for the Markovian model proposed in this study. In addition, in Latin America alone, 19 countries have collected administrative records of beneficiary households of CCTs that contain valuable yet limited information. Administrative records can, under certain conditions, be matched with population and housing census data.
The program is centrally run by a federal agency that identifies and selects program beneficiaries through a system that involves: (i) a geographical criterion for the selection of poor areas using a census-based marginality index; (ii) a categorical criterion identifying eligible households with women in reproductive age; and (iii) a proxy-means test for which an estimated income should fall below the food poverty line, officially known as the 'minimum welfare line' (MWL). Another line, the so-called capability line (CL) also determines eligibility of existing beneficiaries. The CL is the sum of MWL plus education and health expenses. Households are entitled to POP if their predicted disposable income falls below the MWL and they are graduated from the program if their predicted disposable income becomes higher than the CL (SEDESOL 2013).
POP's operation rules indicate that the recertification process should take place between the third and sixth year and between the third and fourth year in rural and urban areas, respectively. If the estimated disposable income is higher than the MWL but lower than the CL, then the income support is reduced over a pathway in the subsequent three years that is referred to as the 'differentiated support scheme' (DSS). During the DSS, households are given income support but without the amount corresponding to support the education of children in primary school. The role of the CL is critical for the graduation process, as any household with income above that level is dropped from the program. POP's operation rules implicitly recognize the menace of poverty dynamics by allowing graduated non-poor beneficiaries to re-join the program if they become poor in the future. However, so far, the response to poverty dynamics has been limited as this can only be done four years after graduation.
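The recertification rule described above can be summarised in a few lines of code. The following is a minimal sketch and not POP's actual procedure: the function name, the `Status` labels and the treatment of incomes that fall exactly on a threshold are assumptions made for illustration, and the threshold values in the usage example are hypothetical.

```python
from enum import Enum


class Status(Enum):
    FULL_SUPPORT = "full support"
    DSS = "differentiated support scheme"
    GRADUATED = "graduated"


def recertify(predicted_income: float, mwl: float, cl: float) -> Status:
    """Classify a household at recertification from its predicted disposable income."""
    if cl < mwl:
        raise ValueError("the capability line (CL) cannot lie below the MWL")
    if predicted_income < mwl:
        return Status.FULL_SUPPORT
    if predicted_income <= cl:
        # Assumption: incomes exactly on either threshold are placed in the DSS.
        return Status.DSS
    return Status.GRADUATED


# Hypothetical thresholds and incomes, for illustration only.
for income in (800.0, 1200.0, 1700.0):
    print(income, recertify(income, mwl=1000.0, cl=1500.0).value)
```

The sketch makes explicit that the rule depends only on current predicted income; it carries no information on the probability that a graduated household falls back below the MWL.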
Thus, while the central objective of CCTs is to support human capital formation of children-which supports children's resilience to future poverty-the focus of this study is strictly on the eligibility of households to program treatment, which is determined by households' poverty status, and not the level of children's education or other wellbeing dimensions.
The remainder of this paper is structured as follows: Section 2 discusses the implication of poverty dynamics for the implementation of CCTs; Section 3 presents the transition model for an exit strategy. Section 4 describes the data and discusses the income calculation, covariates and instruments used in the empirical analysis, whereas Section 5 presents the results. Finally, Section 6 concludes with some reflections on policy.

Poverty dynamics and program implementation
Understanding poverty dynamics is critical in the context of antipoverty programs and, more critically, CCTs. The economic trajectory of participating households can exhibit poverty and non-poverty spells that influence their eligibility to receive program benefits. As cash transfer programs such as POP intend to focus on households below the poverty line, program administrators may incur prospective exclusion errors by dropping beneficiaries who become non-poor after crossing the poverty line in time t-1 but fall again into poverty in time t.
Poverty dynamics in such situations can affect eligibility for program benefits. Over time, the poor could be broadly divided into two aggregate groups: chronic poor and transient poor. 9 Two different sub-groups can be identified within the chronic poor, namely, always poor and usually poor. The always poor experience persistent poverty without being classified as non-poor over a given period. Wellbeing improvements tend to occur gradually, whereas declines tend to emerge abruptly. The usually poor tend to fluctuate sporadically below and above the poverty line; 10 therefore, they can be regarded as ineligible to receive a transfer program for a short period. Similarly, the transient poor are those who escape poverty but can fall back below the poverty threshold; they can be divided into churning poor and occasionally poor. The churning poor fluctuate below and above the poverty line in a seasonal pattern, especially in rural areas where households are reliant on seasonal food production (Dercon and Krishnan 2000). The occasionally poor tend to be above the poverty line most of the time but can experience a poverty spell at least once throughout the course of life (Hulme and Shepherd 2003).
Figure 1 illustrates hypothetical scenarios of poverty dynamics and their implications for the implementation of CCTs. The vertical axis is divided by the poverty threshold that separates poor households or individuals from the non-poor. The horizontal axis represents the time line divided into several hypothetical cases according to the observed patterns of household income. The solid line indicates the income level of a household initially classified as never poor that falls below the poverty threshold later on. Household B would experience a transitory poverty spell and become eligible for a cash transfer program for a given period. In contrast, descending household C might have experienced a negative shock that pushed it below the poverty line permanently. 11 Always poor household D would be persistently under the poverty threshold, with constant eligibility for any targeted intervention. Usually poor household E is persistently poor in the long run, although it may experience a temporary non-poverty spell. A CCT would stop the transfer to a usually poor household at point a, ignoring the fact that it would eventually fall into poverty again at point b and become eligible for support once again. CCTs would incur prospective exclusion errors if usually poor households were dropped at point a.

9 Jalan and Ravallion (2000) regard transient poverty as the varying component of consumption that can be mitigated by insurance or income stabilization schemes. Similarly, they define chronic eligibility as the nontransient component that remains once consumption is smoothed. They consider that chronic eligibility can be mitigated with long-term investments in human and physical capital.
10 This is also known as a saw-tooth trajectory (Davis 2009).
11 Baulch (2011) examines the causes of a household falling into chronic poverty. He identifies among the causes the lack of resilience to negative shocks (idiosyncratic or covariant), especially when the affected household is endowed with low levels of physical, natural, human, financial, or social capital.
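The typology above can be made concrete with a small classifier that labels a household's income trajectory. This is only an illustrative sketch: the categories follow the text, but the rule of classifying by the share of observed periods spent below the poverty line, and the one-third and two-thirds cut-offs, are assumptions introduced here and are not taken from the studies cited.

```python
def classify_trajectory(incomes, poverty_line):
    """Label an income trajectory using the share of periods spent in poverty.

    The cut-offs (more than 2/3 of periods poor: usually poor; between 1/3 and
    2/3: churning poor; less than 1/3: occasionally poor) are hypothetical.
    """
    if not incomes:
        raise ValueError("at least one income observation is required")
    n = len(incomes)
    n_poor = sum(1 for y in incomes if y < poverty_line)
    if n_poor == n:
        return "always poor"
    if n_poor == 0:
        return "never poor"
    if 3 * n_poor > 2 * n:
        return "usually poor"
    if 3 * n_poor >= n:
        return "churning poor"
    return "occasionally poor"


# Hypothetical trajectories against a poverty line of 100.
print(classify_trajectory([50, 60, 55, 120], 100))       # usually poor
print(classify_trajectory([150, 140, 90, 160, 170], 100))  # occasionally poor
```

Under this sketch, a household such as E in the discussion above would be labelled usually poor even in the period in which it is temporarily above the line, which is the information a recertification based on current income alone does not use.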
If the administrators of a CCT were able to estimate the transition probabilities of household income and determine whether a household was experiencing chronic or transient poverty, they could then prevent household E from being excluded from the program. An illustrative case is provided by the poverty reduction program Chile Solidario between 2002 and 2011. In its Puente (Bridge) component, participating households were required to meet a minimum socioeconomic level within 24 months to leave poverty. Once the first cohort of beneficiaries completed the Puente stage of Chile Solidario, the former Ministry of Development and Planning (known as MIDEPLAN, today the Ministry of Social Development) conducted a follow-up evaluation (MIDEPLAN 2009). Among those households that had left poverty at some point between 2002 and 2004, 30% slipped back into poverty at some point in the participation period. An ideal graduation strategy would guide program administrators to identify such income dynamics. However, the current practice of many CCTs is to drop non-poor beneficiaries after recertification, with the expectation that they will never fall into poverty in the future. 12 The official response of CCTs to changes in the poverty status of their beneficiaries has varied across countries in Latin America (see Table 5 in the Appendix). In some countries, households are merely dropped from the program, as in the case of Argentina's Asignación Universal por Hijo and Colombia's Families in Action. In Ecuador, graduating beneficiaries are given microcredits to set up businesses, while in El Salvador, Comunidades Solidarias provides income generation training and seed capital for businesses. At the moment, we are not aware of any explicit strategy to reincorporate graduated beneficiaries falling back into poverty.

A transition model for an exit strategy
The analysis of poverty dynamics has been advanced by economic models of vulnerability to poverty, although there is no consensus about concepts and measurement (Baulch 2011). Broadly, two general approaches have been influential. The first approach is based on ex ante estimates that predict the likelihood of a household or individual falling below a given threshold of income or well-being over t time periods (Briguglio et al. 2009; Calvo and Dercon 2013; Chaudhuri et al. 2002; Dercon and Krishnan 2000; Foster et al. 2010; Gaiha and Imai 2004; Harttgen and Günther 2007; Naude et al. 2009; Zhang and Wan 2009). Although these methods have made an important contribution to the understanding of poverty dynamics, they are limited in their ability to account for observed and unobserved heterogeneity in the context of the implementation of CCTs.
The second approach is based on Markovian transition matrices that estimate the probability of a household in poverty becoming non-poor and of a non-poor household becoming poor. This approach allows us to account for unit-level heterogeneity arising from individual characteristics that may drive households to switch from a poor to a non-poor status or vice versa. These characteristics can be observed (e.g. education levels, health status) or unobserved (e.g. cognitive and entrepreneurial abilities, risk aversion). Accounting for unobserved heterogeneity is relevant insofar as it may correlate with past program eligibility (Heckman and Borjas 1980). The Markovian approach also allows us to account for state dependence, i.e. the extent to which previous experiences of program participation can affect future poverty (and continued participation) (Azariadis and Stachurski 2005). In Heckman's words, "past experience has a genuine behavioural effect in the sense that an otherwise identical individual who did not experience the event would behave differently in the future than an individual who experienced the event" (Heckman 1981:91).
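As a point of reference for the Markovian approach, a raw first-order transition matrix can be computed directly from two-wave panel data. The sketch below is a descriptive baseline only, with hypothetical function and variable names: it counts observed transitions and therefore ignores unobserved heterogeneity, state dependence and attrition, which are the reasons a model-based estimate is needed.

```python
from collections import Counter


def transition_matrix(pairs):
    """Estimate Pr(state_t | state_t-1) from (previous, current) pairs.

    States are coded 1 = poor and 0 = non-poor. Rows with no observations
    are returned as NaN.
    """
    counts = Counter(pairs)
    matrix = {}
    for prev in (0, 1):
        row_total = counts[(prev, 0)] + counts[(prev, 1)]
        matrix[prev] = {
            now: (counts[(prev, now)] / row_total) if row_total else float("nan")
            for now in (0, 1)
        }
    return matrix


# Hypothetical two-wave panel: (poor in t-1, poor in t) for eight households.
panel = [(1, 1), (1, 0), (1, 1), (1, 1), (0, 0), (0, 1), (0, 0), (0, 0)]
m = transition_matrix(panel)
print("poverty exit rate :", m[1][0])   # share of the poor in t-1 who are non-poor in t
print("poverty entry rate:", m[0][1])   # share of the non-poor in t-1 who are poor in t
```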
Here we consider the case of program eligibility being affected by observed and unobserved heterogeneity that determine poverty status and non-random attrition, and also by the possibility of contemporary poverty being influenced by previous experiences of cash benefits that could lead to state dependence. One fundamental reason for our empirical strategy is that the poor population in time t−1 could be overrepresented because of an incentive distortion to participate (or remain) in POP, and because non-poor households would be more likely to attrit due to transaction and opportunity costs associated with program participation in time t. Indeed, Coady and Parker (2004), Das et al. (2005), Álvarez et al. The Markovian transition model that we adopt allows us to estimate the probability of becoming (or staying) poor in time t while simultaneously accounting for the initial conditions in t−1 and potential non-random attrition. 13 This is done by assuming that the current poverty status depends on previous observable characteristics, which in turn predict the probability of being eligible in the next period. 14 We follow Jenkins (2011) and consider, for i = 1, ..., N households, the latent propensity of being poor and eligible for POP in time t−1, as an initial condition, given by

$$p^{*}_{it-1} = \beta' X_{it-1} + u_{it-1}, \qquad u_{it-1} = \mu_i + \delta_{it-1} \tag{1}$$

where $\beta$ is a vector of parameters, $X_{it-1}$ is a vector of covariates at baseline, $\mu_i$ is a specific individual effect, and $\delta_{it-1}$ is an orthogonal white noise error. $u_{it-1}$ is assumed to be randomly and normally distributed. A household is classified as poor when this latent propensity is greater than zero. Thus, if $p^{*}_{it-1} > 0$ then $P_{it-1} = 1$. We also assume that a household whose poverty status is observed in t−1 is also observed in time t.
To illustrate, consider $r^{*}_{it}$ to be the latent propensity of retention of those households observed in both time periods, whose relationship with the observed covariates is given by the following expression:

$$r^{*}_{it} = \psi' W_{it-1} + v_{it}, \qquad v_{it} = \eta_i + \xi_{it} \tag{2}$$

where $\psi$ and $W_{it-1}$ are the vectors of parameters and baseline covariates, respectively, $\eta_i$ is a specific effect, and $\xi_{it}$ is the white noise error. $v_{it}$ is assumed to be normally distributed, with an expected value of zero and a variance of one. Similar to the previous case, a household is retained when $r^{*}_{it} > 0$, that is, $R_{it} = 1$. As we are interested in the implications of poverty dynamics for the implementation of POP, the third component of our equation system focuses on the transition of poor households in t−1 to non-poverty in t. The transition probability of being non-poor in t is thus determined by the transition from being poor or non-poor in t−1 as follows:

$$np^{*}_{it} = \left[ P_{it-1}\,\gamma_1' + (1 - P_{it-1})\,\gamma_2' \right] z_{it-1} + \varepsilon_{it}, \qquad \varepsilon_{it} = \tau_i + \zeta_{it} \tag{3}$$

where $\gamma_1$, $\gamma_2$ and $z_{it-1}$ are the vectors of parameters and baseline covariates, respectively; $\tau_i$ is the specific household effect and $\zeta_{it}$ a white noise, normally distributed error. Similar to the poverty status in the baseline period, if $np^{*}_{it} > 0$ then $NP_{it} = 1$, but if and only if $R_{it} = 1$. Note that $u_{it-1}$, $v_{it}$, and $\varepsilon_{it}$ from Eqs. 1, 2, and 3, respectively, represent the unobserved heterogeneity that cannot be explained by observed household characteristics. The correlations between these error terms have relevant implications for the interpretation of our results. Thus, the correlations between error terms are defined as

$$\rho_1 \equiv \operatorname{corr}(u_{it-1}, v_{it}) \tag{4}$$
$$\rho_2 \equiv \operatorname{corr}(u_{it-1}, \varepsilon_{it}) \tag{5}$$
$$\rho_3 \equiv \operatorname{corr}(v_{it}, \varepsilon_{it}) \tag{6}$$

The interpretation of Eqs. 4, 5, and 6 is straightforward.

13 See Jenkins (2002, 2004) for a detailed discussion on the Markovian approach.
14 We note that Biewen (2009) has pointed out that a drawback of this approach is that the longitudinal structure of data is not fully exploited. He proposes an alternative method that internalises state dependence but ignores attrition, which is a source of bias in our case, given the structure of the MxFLS.
$\rho_1$ indicates the relationship between initial poverty status and the retention of the household in the survey. When $\rho_1 > 0$, a household that is more likely to be poor in time t−1 is also more likely to be retained (that is, less likely to attrit) in time t. $\rho_2$ shows the extent of the association between the unobserved individual-level factors that determine poverty in the baseline period, t−1, and non-poverty in t. If $\rho_2 > 0$, then a higher probability of the household being poor in t−1 makes it more likely to leave poverty in time t. $\rho_3$ captures the correlation between the unobserved individual-level factors determining retention and those determining current non-poverty status. Thus, if $\rho_3 > 0$, households that are more likely to be observed in both periods are also more likely to be (or become) non-poor, vis-à-vis households with higher probabilities to attrit.
The validation of the model is made through the statistical significance of the correlations. If $\rho_1 = \rho_3 = 0$, then the retention of the households is not relevant and the attrition can be considered as random. If $\rho_1 = \rho_2 = 0$, the poverty status in the current period, t, is not endogenous to the eligibility status in the baseline period, t−1. Finally, if $\rho_1 = \rho_2 = \rho_3 = 0$, then the three equations are mutually exogenous, and therefore there is no need to estimate them simultaneously (a simple binary selection model could be a valid alternative).
Of particular interest for our study is the predicted poverty exit rate, that is, the predicted probability of being non-poor in t given that the same household was poor in time t−1, i.e. $l_{it} = \Pr(P_{it} = 0 \mid P_{it-1} = 1)$. If we assume that the poverty status has reached a steady state, we can define $1/l_i$ and $\log(0.5)/\log(1 - l_i)$ to measure the mean and median spells after which we expect a change in the poverty status of program beneficiaries. This step also provides us with information on the period in which upward socioeconomic mobility is likely to be observed, so that the non-poor could then be graduated from the program as they no longer comply with the eligibility criteria. Our analysis focuses on this prediction.
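The two duration measures can be computed directly from a predicted exit rate. The snippet below is a minimal sketch with hypothetical function names; it returns durations in units of the interval between observations, so converting to years requires multiplying by the length of that interval, which is an assumption left to the user. The exit rate in the usage example is illustrative and is not an estimate from this study.

```python
import math


def mean_spell(exit_rate: float) -> float:
    """Mean spell length, 1 / l, under a constant (steady-state) exit rate l."""
    if not 0.0 < exit_rate <= 1.0:
        raise ValueError("exit_rate must lie in (0, 1]")
    return 1.0 / exit_rate


def median_spell(exit_rate: float) -> float:
    """Median spell length, log(0.5) / log(1 - l), under a constant exit rate l."""
    if not 0.0 < exit_rate < 1.0:
        raise ValueError("exit_rate must lie strictly between 0 and 1")
    return math.log(0.5) / math.log(1.0 - exit_rate)


# Illustrative exit rate of 0.25 per period (hypothetical value).
print(mean_spell(0.25))     # 4.0 periods
print(median_spell(0.25))   # roughly 2.4 periods
```

The median is shorter than the mean because, with a constant exit rate, spell lengths are geometrically distributed and right-skewed.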
To do so, we resort to partial likelihood estimators for each household with an eligibility status, adopting a log-likelihood function (Eq. 7). Given the non-linearity of the log-likelihood function in Eq. 7, and the complexities that arise from trivariate normal distribution functions, we resort to simulations, as suggested by Cappellari and Jenkins (2004) and Gourieroux and Monfort (1996), to estimate the parameters of interest, which are presented in Section 5.
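To give a sense of why simulation is needed, the sketch below approximates a trivariate normal probability by drawing correlated errors and counting how often all three fall below given thresholds. It is a crude frequency simulator written for illustration, not the simulator used in the references above, and it is far less efficient than the methods they describe. The function names are hypothetical, and the ordering of the errors as (u, v, ε), with the three correlations passed as ρ1, ρ2 and ρ3, is an assumption that follows the description in the text.

```python
import math
import random


def cholesky3(rho1: float, rho2: float, rho3: float):
    """Lower-triangular Cholesky factor of the 3x3 correlation matrix of (u, v, e).

    rho1 = corr(u, v), rho2 = corr(u, e), rho3 = corr(v, e).
    """
    if abs(rho1) >= 1.0:
        raise ValueError("rho1 must lie strictly between -1 and 1")
    l22 = math.sqrt(1.0 - rho1 ** 2)
    l32 = (rho3 - rho2 * rho1) / l22
    remainder = 1.0 - rho2 ** 2 - l32 ** 2
    if remainder <= 0.0:
        raise ValueError("correlation matrix is not positive definite")
    return ((1.0, 0.0, 0.0), (rho1, l22, 0.0), (rho2, l32, math.sqrt(remainder)))


def simulate_trivariate_prob(upper, rho1, rho2, rho3, draws=20000, seed=0):
    """Monte Carlo estimate of Pr(u < a1, v < a2, e < a3) for standard normal errors."""
    rng = random.Random(seed)
    low = cholesky3(rho1, rho2, rho3)
    hits = 0
    for _ in range(draws):
        z1, z2, z3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = z1
        v = low[1][0] * z1 + low[1][1] * z2
        e = low[2][0] * z1 + low[2][1] * z2 + low[2][2] * z3
        if u < upper[0] and v < upper[1] and e < upper[2]:
            hits += 1
    return hits / draws


# With independent errors and zero thresholds the exact probability is 1/8.
print(simulate_trivariate_prob((0.0, 0.0, 0.0), 0.0, 0.0, 0.0))
```

In a likelihood setting, a probability of this kind would have to be evaluated for every household at every trial value of the parameters, which is why smoother and more efficient simulators are preferred in practice.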

Data
To estimate the Markovian model, we use the longitudinal Mexican Family Life Survey (MxFLS), which collects a wide range of information on socioeconomic indicators, demographics, and health indicators, on the Mexican population. The survey was implemented collaboratively by the National Council of Science and Technology, Universidad Iberoamericana, Centro de Investigación y Docencia Económica (CIDE), and the National Institute of Statistics and Informatics (INEGI). The National Institute of Public Health and the University of California at Los Angeles (UCLA) were also involved. The baseline (MxFLS-I) was collected in 2002; a second wave (MxFLS-II) was collected during 2005-06, with a re-contacting rate of 88.5%; and a third round (MxFLS-III) was collected over the period 2009-12, with a re-contacting rate of 83.5%.
While the survey is not collected frequently (e.g. yearly) vis-à-vis the expected changes in income levels, two central points are worth noting about our selected data. First, POP is designed to expect changes in the eligibility status of its beneficiaries every 3 to 5 years, which is consistent with the time gap between the MxFLS surveys. Second, the MxFLS surveys are a unique source of longitudinal data available in a developing country that has one of the world's largest CCT programs.
The MxFLS follows a probabilistic, stratified, multi-staged, and independent sampling frame designed to be nationally representative for rural and urban areas. Rubalcava and Teruel (2004) show that the sampling frame involved a random selection of localities in the 32 Mexican states as well as a random selection of households within the selected localities. The intended sample size was set for both rounds at 8,000 households and 35,000 individuals, with an oversampling that assumed a retention rate of 90%. The survey was collected in each round between the months of April and July. The questionnaire comprises ten modules that include information on household profiles, consumption expenditure, income (including income from social assistance programs), intra-household dynamics, and cognitive skills. 15

Income calculation
Since we focus on current poverty and the eligibility to receive POP, we calculate household disposable income following INEGI's methodology, upon which official figures on poverty and inequality in Mexico are based (INEGI 2013). 16 More specifically, the calculation of household disposable income is based on four sources. First, earnings from labour activities, including wages, gifts, profits, severance payments, in-kind goods paid for work, and earnings from self-employment. Second, rents from properties and financial assets, ownership of intellectual property, and profits from firms owned by household members. Third, transfers from private or public sources, which include pensions, cash, and in-kind transfers from friends, relatives, donations, charity, or state programs, including POP. As we are interested in poverty transitions for eligibility to POP, we exclude the transfer from this program from the income calculation. Finally, imputed income from housing rents. Each house-owner was asked how much they would have paid if they rented the house they occupy. The estimated rental value was added to disposable income, assuming that house-owners dispose of the income not allocated to housing rents.
A challenge emerged when calculating income with MxFLS data. In MxFLS-I, the questions on the value of housing rental costs were not included in the questionnaire. This generated an information gap for households owning their house, whose income would be incorrectly calculated unless an additional imputation was considered. To address this limitation, we adopted the following steps. First, we compared the information in MxFLS-I with that of MxFLS-II to identify those households that lived in the same owned house. We found that 97% of unattrited households surveyed in MxFLS-I were still living in the same residence in MxFLS-II. Second, we carried the rental value declared in MxFLS-II back to MxFLS-I, using the housing component of the consumer price index generated by INEGI to deflate rental values to the corresponding month of the survey in 2002.
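The deflation step amounts to scaling the later rental value by the ratio of the two price index values. A minimal sketch, assuming a simple ratio adjustment; the function name and all numbers below are hypothetical and do not come from INEGI's series.

```python
def deflate_rent(rent_later: float, cpi_base: float, cpi_later: float) -> float:
    """Express a rental value declared in a later month in base-month prices."""
    if cpi_base <= 0 or cpi_later <= 0:
        raise ValueError("price index values must be positive")
    return rent_later * cpi_base / cpi_later

# Hypothetical inputs: a rent of 1,500 pesos declared in MxFLS-II, with a housing
# price index of 100.0 in the 2002 survey month and 115.0 in the later month.
rent_2002 = deflate_rent(1500.0, cpi_base=100.0, cpi_later=115.0)
print(round(rent_2002, 2))
```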
We recall that the eligibility of a household to participate in POP relies on its poverty profile. Households with per capita incomes below the CL poverty line (food poverty line plus education and health expenses) that include members younger than 22 years of age and/or women between 15 and 49 years old are entitled to participate in the program. As eligibility is thus contingent on age progression, we focus on households with members under 12 and women under 39 years of age in 2002.
In MxFLS-I, the CL was set at 987.72 and 1027.35 Mexican pesos a month per person for rural and urban areas, respectively. For MxFLS-II and MxFLS-III, the poverty lines were indexed using the consumer price index at the corresponding month of each survey. Table 1 below displays the mean income and proportion of households below the CL. It indicates that 37.3, 30.7, and 32.2% of Mexican households had incomes below the CL in MxFLS-I, MxFLS-II, and MxFLS-III, respectively. Poverty incidence is higher in rural areas, where, instead of worsening, it declined slightly over the observed period. By contrast, in urban areas, poverty incidence increased between the last two survey rounds.
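The indexing of the poverty line and the resulting headcount can be sketched as follows. The 2002 poverty lines are those reported above; the price index values, incomes, and function names are hypothetical and serve only to exercise the calculation.

```python
# CL poverty lines in MxFLS-I (pesos per person per month), as reported in the text.
CL_2002 = {"rural": 987.72, "urban": 1027.35}

def indexed_line(area: str, cpi_base: float, cpi_survey: float) -> float:
    """Poverty line for `area` carried forward to a later survey month."""
    return CL_2002[area] * cpi_survey / cpi_base

def headcount_ratio(per_capita_incomes, line: float) -> float:
    """Share of households whose per capita income falls below the line."""
    if not per_capita_incomes:
        raise ValueError("at least one household is required")
    poor = sum(1 for income in per_capita_incomes if income < line)
    return poor / len(per_capita_incomes)

# Hypothetical price index values and incomes, not survey data.
line_later = indexed_line("rural", cpi_base=100.0, cpi_survey=110.0)
incomes = [650.0, 900.0, 1200.0, 2500.0]
print(round(line_later, 2), headcount_ratio(incomes, line_later))
```

The sketch ignores survey weights, which would be needed to reproduce population-level figures.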

Covariates
The descriptive statistics of the covariates used to predict the transition probabilities are presented in Table 6 of the Appendix. We clustered the set of covariates according to their role in household living conditions. The first group of covariates includes physical infrastructure and access to public services, which reflect the living standards of a household. Overall, households lived predominantly in houses with cemented floors and walls of hard materials. In rural areas, the use of latrines as a type of toilet was low, whereas access to electricity was nearly universal in urban areas. Better physical infrastructure and access to public services are expected to reduce the likelihood of being in, or transitioning into, poverty.
The second group of covariates relates to individual characteristics of the household head and his or her spouse. These include age, gender, and years of education. A third set of covariates includes the number of children and other adult members of the household, their mean age and years of education, health status, and dependency ratio. Households with higher dependency ratios and low human capital endowments are expected to exhibit a higher probability of falling into poverty. We have also included a vector of productive physical assets that help generate income and cope with idiosyncratic or covariate shocks. The vector includes the ownership of a house, vehicle, electric appliances, livestock, and poultry. Finally, the last group of covariates includes idiosyncratic and covariate shocks experienced in the past five years. Idiosyncratic shocks include the death or accidents of household members, whereas covariate shocks include the loss of employment or a crop, and whether the household had been the victim of a natural disaster. These contingencies are expected to increase the probability of falling into poverty.
As pointed out in Section 4, our Markovian transition probability model allows us to account for potential bias due to initial welfare conditions and non-random attrition. This requires the inclusion of instrumental variables that are strong predictors of Eq. 1 but not of Eqs. 2 and 3, and of instrumental variables that are strong predictors of Eq. 2 but not of Eqs. 1 and 3. For the first set of instruments, we experiment with the years of education of the household head's parents and the environmental characteristics of localities, measured by the level of precipitation, elevation and temperature. The underlying assumptions here are based on strong theoretical and empirical evidence showing that parents with low education tend to raise children with low education, which ultimately leads to an intergenerational persistence of poverty (Chevalier et al. 2013; Lundborg et al. 2014; Moav 2005). Furthermore, we expect that the strictly exogenous variation in weather conditions will affect food production and thus prices, which directly impact poverty rates in time t−1 (Carter et al. 2007; Battisti and Naylor 2009; Ivanic et al. 2012; Skoufias 2012).
The second set of instruments should be strongly associated with the probability of attriting before being eligible for the program. As migration and attrition are highly correlated, we opt for the inclusion of a variable that measures whether the household head was living in a different location from his or her birthplace at the age of 12. The underlying assumption here is that previous migration experience makes the household head more likely to migrate and therefore attrit from the survey sample (Thomas et al. 2012). This assumption is valid in the sense that previous migration experiences make individuals more likely to re-migrate compared to those who have never migrated. Bryan et al. (2014) recently tested this proposition in the context of Bangladesh by delivering economic incentives to households to out-migrate during the lean season. They found that households were 8 to 10 percentage points more likely to re-migrate, relative to those that did not, even after the economic incentive had long stopped.
While poverty can be a determinant of migration, earlier studies (e.g. Lopez and Schiff 1998; Mayda 2010) have found that poverty in fact imposes constraints on migration due to fixed costs and persistent credit market imperfections. While people often migrate in search of better opportunities, this more often occurs among the non-poor. Thus, there is no strong reason to suspect a systematic relationship between past migration and present poverty. McKenzie (2005) actually used past migration as an instrument for current migration in the context of Mexico and found that past migration did not affect present wellbeing outcomes.
To complement our identification strategy, we also include the distance from the municipality where households reside to the state capital city in t−1. Previous studies have shown that migration decisions are risky and costly (Jaeger et al. 2010), so distance as an instrument would capture the transaction costs and associated risks of mobility that are also related to the probability of attrition. 17 We tested empirically the assumption of joint exogeneity of our instruments and the results confirmed our priors. The results are presented in Table 4 in Section 5.1.

We begin the discussion by looking at the poverty transition matrix between MxFLS-I (2002) and MxFLS-II (2005-06), and between MxFLS-I (2002) and MxFLS-III (2009-12).
As discussed earlier, eligibility to POP is determined by the CL poverty threshold under which beneficiaries receive income support. The transition matrices are estimated using a sample with unattrited and attrited households following Jenkins' (2011) framework. Afterwards, we run the model on the full sample with attrited households. The parameters in Eqs. 1 to 6 are estimated using the partial likelihood estimators derived in Eq. 7.
The poverty transition matrices are presented in Table 2. Our poverty calculations are validated by comparing them with the official figures reported by the Consejo Nacional de Evaluación de la Política de Desarrollo Social (CONEVAL 2009), which show that poverty rates and eligibility to POP followed a similar scale and trend to our findings. 18 We begin by focusing on urban areas and using the unattrited sample. We observe that 17.7 and 22.4% of non-poor households in MxFLS-I (2002) became poor and thus eligible to receive POP in MxFLS-II (2005-06) and MxFLS-III (2009-12), respectively. We also find that 67.2 and 67.1% of urban poor households in MxFLS-I became non-poor, and thus ineligible to receive POP, in MxFLS-II and MxFLS-III, respectively.
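For readers who wish to reproduce a raw transition matrix of this kind, the sketch below tabulates row-normalised transition shares from two waves of binary poverty indicators. It is a simplified illustration that ignores survey weights and attrition; the function name and the toy data are hypothetical.

```python
from collections import Counter

def transition_matrix(status_prev, status_curr):
    """Row-normalised 2x2 matrix of poverty transitions.

    Rows index the status in t-1 and columns the status in t,
    with 0 = non-poor and 1 = poor.
    """
    if len(status_prev) != len(status_curr):
        raise ValueError("both waves must cover the same households")
    counts = Counter(zip(status_prev, status_curr))
    matrix = []
    for origin in (0, 1):
        row_total = counts[(origin, 0)] + counts[(origin, 1)]
        if row_total == 0:
            raise ValueError(f"no households with status {origin} in t-1")
        matrix.append([counts[(origin, dest)] / row_total for dest in (0, 1)])
    return matrix

# Toy data for eight households, not survey data.
wave_1 = [0, 0, 0, 0, 1, 1, 1, 1]
wave_2 = [0, 0, 0, 1, 0, 0, 1, 1]
for row in transition_matrix(wave_1, wave_2):
    print(row)
```

In this layout the entry in the second row and first column is the raw poverty exit rate, and the entry in the first row and second column is the raw poverty entry rate.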
Interestingly, the dynamics of poverty transitions in rural areas are found to be stronger than in urban areas. Our results indicate that 32.2 and 35.5% of non-poor households in [...]
17 Similarly, McKenzie et al. (2010) used distance from the household to the New Zealand consulate in Tonga as an instrument for migration when looking at welfare impacts on migrants in New Zealand.
18 Eligibility and program treatment may not necessarily be strongly correlated due to data limitations (e.g. missing values) and also inclusion and exclusion errors incurred in the implementation of CCTs. This, however, does not undermine the findings of our analysis, as we focus on the implications of poverty dynamics in the implementation of graduation strategies of CCTs, which are based on a set of eligibility criteria.
19 The GDP growth rate for 2009 was in the order of -4.7%, which also had a detrimental effect on the poverty rates. Taking the CL poverty line as reference point, the headcount index increased from 5.9% in 2008 to 6.3% in 2010 in urban areas, whereas it declined from 26.2 to 23.9% in rural areas in the same period.
Table 3 presents the typology of poverty dynamics based on the framework depicted in Fig. 1. We find that 52% and 41.6% of households were identified as never poor in urban and rural areas, respectively, whereas 12.5% and 11.7% of households were identified as always (chronically) poor in urban and rural areas, respectively. For the purpose of program graduation, we are interested in examining the dynamics of the transient poor, especially those exiting poverty. Based on our calculations, around 35.5 and 46.7% of the Mexican population in urban and rural areas, respectively, are transient poor, but only 10.7% of urban households and 11.6% of rural households that were poor in MxFLS-I were on a clear path to remain non-poor in the longer term and therefore ineligible to receive POP.
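A three-way typology of this kind can be sketched from each household's sequence of poverty indicators across the survey waves. The rule below (never poor, always poor, and transient poor for any mixed sequence) is an assumed simplification; the framework in Fig. 1 may draw finer distinctions. The function names and toy panel are hypothetical.

```python
from collections import Counter

CATEGORIES = ("never poor", "transient poor", "always poor")

def classify(statuses):
    """Map a household's sequence of poverty indicators (1 = poor) to a category."""
    if not statuses:
        raise ValueError("at least one wave is required")
    if all(s == 0 for s in statuses):
        return "never poor"
    if all(s == 1 for s in statuses):
        return "always poor"
    return "transient poor"

def typology_shares(panel):
    """Share of households in each category of the typology."""
    counts = Counter(classify(history) for history in panel)
    return {category: counts[category] / len(panel) for category in CATEGORIES}

# Toy panel of four households observed in three waves, not survey data.
panel = [(0, 0, 0), (1, 1, 1), (1, 0, 0), (0, 1, 0)]
print(typology_shares(panel))
```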

Estimating transition probabilities
Before turning to the results from the estimation of the predicted transition probabilities, we note that we are not interested in estimating the determinants of poverty dynamics, i.e. why some households exit poverty while others do not. Instead, we are interested in the correlates of poverty status, i.e. the implications of poverty dynamics for the graduation strategy of POP's beneficiaries. Thus, our focus is on the transition probabilities of households that were poor in t−1 but are non-poor in t.
Our model allows us to account, as discussed earlier, for the effect of previous poverty spells, program treatment experience (initial conditions) and attrition in the estimation of the probability of poverty exit. We estimated Eq. 7 using simulations of 320 draws for urban and rural areas, respectively, the number of draws after which the model reached stable results. Section A of Table 4 presents the correlations and hypothesis tests for the exogeneity assumption after estimating Eqs. 4, 5, and 6. Overall, the correlation between unobserved factors of the initial poverty status and attrition, $\rho_1$, is negative in both rural and urban areas but just statistically significant in urban areas. This indicates that unobserved factors determining poverty status in MxFLS-I were correlated with being unattrited in MxFLS-II and MxFLS-III. Furthermore, the correlation between unobserved characteristics and being poor and eligible to receive POP in MxFLS-I and non-poor in MxFLS-II and MxFLS-III, $\rho_2$, is negative and statistically significant in both urban and rural areas, indicating that, ceteris paribus, we cannot reject the null hypothesis that being poor and treated by POP in time [...]. Similarly, the correlation between unobserved characteristics affecting attrition and the current poverty status of households, $\rho_3$, is also negative and statistically significant in both urban and rural areas, indicating that non-poor households in time t are more likely to attrit. In Section B of Table 4 we also present the hypothesis tests that show that the joint correlation coefficients are different from zero, indicating that the assumption of joint exogeneity cannot be rejected, and hence the multivariate estimation of the transition probabilities is justified.
In Table 7 in the Appendix we present in more detail the results from the transition estimates and marginal effects of the model. Overall, we observe that several covariates are strong predictors of poverty transitions in rural areas. For example, households living in a dwelling with concrete walls had an 18% higher probability of exiting poverty after being in poverty in time t−1. Similarly, concrete roof materials, characteristic of apartment dwellings, also increased the probability of poverty exit by 13.3%, whereas having running water inside the dwelling increased the transition probability by 8.3%.
Furthermore, households with garbage collection services had an 11.6% higher probability of exiting poverty, while households living surrounded by waste and residuals were, unsurprisingly, 23.3% less likely to get out of poverty. Spouse's age and years of education also appeared as strong predictors of transition probabilities, with a positive effect: one additional year of spouse's education increased the probability of exiting poverty by 1.6% for the entire household.
The results also show that the number of children has a significant positive impact on rural transition probabilities, while dependency ratios have a negative effect. The largest negative marginal effect is reported for households with chronically ill members. A household with a chronically ill member was 19.5% less likely to move out of poverty and thus become ineligible to receive POP. Owning material assets such as an additional house, a vehicle, and agricultural machinery further increases the probability of becoming non-poor among rural households, by 21.7, 12.3 and 3.6%, respectively, whereas shocks, in particular those related to losing crops in time t−1, had a negative effect on the probability of crossing the poverty line and thus POP's eligibility threshold.
In the urban context, we found that households living in houses with concrete walls and running water in the dwelling, as well as garbage collection services, had higher probabilities of exiting poverty. Factors related to household composition also had a strong effect on the transition probabilities, with overcrowded dwellings, older household heads, and young children impacting negatively on exit routes from poverty. Interestingly, the only assets relevant for positive trajectories in the transition probabilities in urban areas were financial assets, which may also reflect the fact that urban households enjoy better access to financial services.
To verify the validity of our findings, we present in Section B of Table 4 the test for the exclusion restriction of the instruments in Eqs. 1 and 2. Overall, the results indicate that we cannot reject the null of exogeneity and therefore our instruments are correctly excluded in this system of equations. The last two rows of Section B show the test of inclusion of instruments, whose significance shows that in both cases, Eqs. 1 and 2, the instruments are correctly specified.
Finally, Section C of Table 4 presents the predicted transition probabilities. We find that the probability of escaping poverty in time t after being poor in t−1 is on average 63.3% in rural areas and 80.8% in urban areas. Urban households are thus more likely to exit poverty and, hence, stop receiving POP. If we assume that poverty trends reach a steady state in t, we can calculate the expected duration of an average eligibility experience. This information is crucial for the design of a graduation strategy for POP, which at the moment conducts costly eligibility tests (recertificaciones) every three years.
Our results indicate that the poverty status of an average household would change in 1.719 and 1.256 periods in rural and urban areas, respectively, which means that the expected eligibility duration to receive POP would be, on average, in the order of 5.1 years for rural households and 3.7 years for urban households. With this piece of information, we can also identify the income threshold above which the probability of a rural or urban household falling into poverty again would be below 0.5. Figure 2 presents the poverty entry probabilities. All in all, we find that the income threshold for program graduation would be at around 2086 and 1511 Mexican pesos at 2009 prices for urban and rural areas, respectively. Above that income threshold, program administrators could drop beneficiary households with a degree of certainty that they will not fall into poverty in the immediate future.
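The idea of an income threshold above which the poverty entry probability drops below 0.5 can be sketched with a simple root search. The logistic curve and its coefficients below are purely hypothetical stand-ins for the model's predicted entry probabilities, and the bisection routine is one of several ways to locate the crossing point.

```python
import math

def entry_probability(income, intercept=3.0, slope=-0.002):
    """Hypothetical logistic curve for the probability of falling into poverty."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * income)))

def graduation_threshold(prob, lo, hi, target=0.5, tol=1e-6):
    """Bisection for the income at which a decreasing `prob` crosses `target`."""
    if not (prob(lo) > target > prob(hi)):
        raise ValueError("target must be bracketed by prob(lo) and prob(hi)")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if prob(mid) > target:
            lo = mid  # entry is still too likely, so move the lower bound up
        else:
            hi = mid
    return (lo + hi) / 2.0

# With the hypothetical coefficients above, the curve crosses 0.5 where
# intercept + slope * income = 0.
print(round(graduation_threshold(entry_probability, 0.0, 5000.0), 2))
```

A stricter graduation rule could be explored by lowering `target`, which moves the threshold to a higher income level.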

Conclusions
We have provided a generic framework for the analysis of poverty dynamics in the context of graduation strategies of conditional cash transfer programs, taking Mexico's POP program as our reference case. Poverty and thus program eligibility are not static over time. They may vary due to household composition dynamics, income shocks and also behavioural factors. To date, most CCTs impose arbitrary timeframes to graduate beneficiaries without strict consideration of the income transitions that beneficiary households may experience. We propose a method that accounts for unobserved heterogeneity, state dependence, and attrition, which are likely to bias program graduation estimates under conventional approaches.
Our findings reveal an important degree of heterogeneity in the predicting factors of poverty dynamics across rural and urban environments. A key finding is that only one-third of program beneficiaries exhibited low probabilities of becoming poor in the future and therefore could be regarded as true 'graduates' of the program. We also find that the recertification process of POP, which takes place every three years, would be more efficient if it took place every 3.7 and 5.1 years in urban and rural areas, respectively.
In contexts of idiosyncratic and aggregate economic vulnerability, where households often move in and out of poverty, graduation strategies become fundamental for antipoverty policy design. We have provided empirical estimates for income levels above which the probability of falling into poverty is low. We regard these income levels as a 'graduation floor' for future implementation of conditional cash transfers.
An important question is whether CCTs should stop support to households abruptly or phase it out gradually. Gradual exit strategies have been adopted by tax credit schemes in OECD countries and a few CCTs in Latin America. The slope of the graduation line is usually estimated based on labour incentives that arise in the absence of cash transfers. The decision of whether CCTs should opt for an abrupt or a gradual exit strategy is beyond the scope of this paper; however, we point out that in the context of poverty and vulnerability, the slope of any graduation strategy will need to consider factors that relate to the process of human capital formation, the resilience of households to shocks and also potential state dependence behaviours.

Expansion of POP's coverage
Table fragments (layout not recoverable): (SISBEN) with youngest child; assignment score; changes in the SISBEN algorithm may drive beneficiaries to graduation.
Source: Authors with information from the operating rules for each program.
Notes: (1) * Dependence ratio is defined as the number of members under 14 years and over 65 years divided by the number of members between 15 and 64 years of age. (2) Eligibility according to CL. (3) Weighted number of observations. Source: Authors based on MxFLS-I.
Source: Mexican Family Life Survey-I, II and III. Notes: (1) Robust standard errors in parentheses.