Are the total fertility rates of men and women different at below-replacement levels? An answer obtained from the G7 countries

A child is born to a father and a mother. This fact, however, is yet to be recognized by demography, in which fertility refers to the number of children born to women. The main reason for the absence of men is that data on births are more often available for women than for men. In the last few decades, however, data availability has greatly improved. Recent studies show that total fertility rates (TFRs) of men can be calculated for most countries in the world and that the difference between the TFRs of men and women can be quite large. For low-fertility countries, nonetheless, these studies show little difference between the TFRs of men and women, giving rise to the question: Is men's fertility worth further investigation? To avoid ambiguity in describing a particular difference as small or big, this paper provides a formula for probabilistic TFRs. Using a hypothesis test on probabilistic TFRs, we show that the difference between the TFRs of men and women is statistically significant for all the G7 countries except France. To explain the differences between the TFRs of men and women, this study uses models of stable populations and concludes that one-sex stable population models cannot explain the results, whereas a two-sex joint stable population model can. The two-sex model explains why the TFR of men in France is almost the same as that of women, and why it is lower than that of women in the other six G7 countries. More specifically, the model explains 76% of the variance in the difference between the TFRs of men and women. Future studies may be able to show that men's TFR is significantly lower than women's in other countries too and explain why it is so. These findings require closer attention and further investigation, because low fertility could lead to socioeconomic problems.
Beyond TFRs, extending fertility studies from women alone to both women and men will fundamentally improve our knowledge about fertility age patterns, trends, determinants, policies and other related issues.


Introduction
In general, reproduction involves both men and women. But in demography, 'Fertility refers to the number of children born to women' (Weeks, 1999). Consequently, the fertility indicators commonly used in the social sciences and policy analyses are defined with reference to women. The main reason for the absence of men is 'that data on births are more often available for the mother than for the father' (Preston et al., 2001).
Data availability, however, has improved profoundly in recent decades. For example, the United Nations Demographic Yearbook includes data on births by the age of the father from civil registrations for multiple countries. Major international demographic surveys also include information on men in their questionnaires, which can be used to estimate male fertility. As such, data availability is no longer an insurmountable obstacle to studying men's fertility at national levels, which in turn may provide insights into the bias and consequences of ignoring male fertility. Keilman et al. (2014) calculated and analyzed the total fertility rates (TFRs) of men and women from more than 60 countries and regions. Schoumaker (2017a, 2017b) devised methods to estimate the TFRs of men from multiple sources and calculated the TFRs of men and women for most of the countries in the world.
These multi-country studies indicate that male fertility indicators can now be calculated, and that the differences between the TFRs of men and women can be quite large. For low-fertility countries, whose data are more reliable, these studies show that there is 'little difference' between the TFRs of men and women or that 'male fertility is often slightly lower than female fertility'. This raises the question of whether men's fertility is worth further investigation.
Using data from the G7 countries, this study found that the TFR of men is almost identical to the TFR of women for France, and lower than the TFR of women for the other six countries, by between 0.4% (for the UK) and 8.1% (for Italy). To avoid ambiguity in describing a particular difference as big or small, this paper provides a formula for probabilistic TFRs. Using a hypothesis test on probabilistic TFRs, we show that the difference between the TFRs of men and women is statistically significant for all the G7 countries except France.
In other words, the differences between the TFRs of men and women are not the result of random fluctuations, and therefore they need to be further investigated. To explain the differences between the TFRs of men and women, this study uses models of stable populations and concludes that the classic one-sex model cannot explain the results, whereas a two-sex joint stable population model can. The two-sex joint population model explains why the TFR of men is almost identical to that of women for France, and why it is lower than women's for the other six countries. More specifically, the model explains 76% of the variance in the differences between the TFRs of men and women.

Data and the total fertility rates of men and women in the G7 countries
To avoid the impact of the COVID-19 pandemic on fertility, which could influence the results, we collected data on births by the age of the mother and the father from the United Nations Demographic Yearbook for the year 2015 (United Nations, 2017) for Canada, Germany, Japan and the United States, for the year 2016 for Italy and the United Kingdom, and for the year 2014 for France. Data on the populations of men and women and survival probability by age were obtained from the Human Mortality Database (2022). The data on the G7 countries have been frequently used in academic studies, perhaps because they are both multinational in character and concise.
A special reason why we chose the data on the G7 countries for this paper is that their TFRs ranged from replacement level (for France) to very low fertility (for Italy). Fewer births were registered by the age of the father than by the age of the mother in the G7 countries; the shortfall ranged from 2% in Japan to 12% in the US. This discrepancy is addressed by scaling the births registered by the age of the father so that their total equals that registered by the age of the mother, assuming the age structure of the non-registered fathers is identical to that of the registered fathers.
The age-specific fertility rate for women is defined as the number of births to the women of an age group divided by the exposure to risk (person-years lived) of these women. The age-specific fertility rate for men is defined by replacing 'women' with 'men'.
By summing the age-specific fertility rates of men over age, we obtained the total fertility rate of men (TFR_m), which differs from the total fertility rate of women (TFR_w), as can be seen in Fig. 1, in which the age range for calculating the TFRs is 15-50 for women and 15-60 for men. A pattern is apparent in the differences between the TFRs of men and women. For France, the levels of TFR_m and TFR_w are almost identical and close to the replacement level, whereas TFR_m is lower than TFR_w for the other six countries. Also, the lower the levels of TFR_m and TFR_w are, the bigger the difference is. Whether these differences are substantively important is a matter of judgment, but whether they are statistically significant can be tested quantitatively. This paper provides a model for probabilistic TFRs, and uses a hypothesis test to indicate whether a difference between the TFRs of men and women is statistically significant.
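The adjustment of father-registered births described above amounts to a proportional rescaling. The following is a minimal sketch of that step, not the code used in this study; the function name `scale_father_births` and all counts are hypothetical and chosen only for illustration.

```python
def scale_father_births(births_by_father_age, total_births_by_mother):
    """Rescale births registered by father's age to match the mothers' total.

    Assumes, as stated in the text, that non-registered fathers have the same
    age distribution as registered fathers, so every age group is multiplied
    by the same factor.
    """
    registered = sum(births_by_father_age.values())
    if registered <= 0:
        raise ValueError("no births registered by father's age")
    factor = total_births_by_mother / registered
    return {age: count * factor for age, count in births_by_father_age.items()}


# Hypothetical counts, for illustration only (not data from this study).
father_births = {"20-24": 100.0, "25-29": 300.0, "30-34": 400.0, "35-39": 100.0}
mother_total = 1000.0  # here 10% of births lack the father's age

adjusted = scale_father_births(father_births, mother_total)
```

Because every age group is multiplied by the same factor, the relative age pattern of fatherhood is unchanged and only the level is raised to match the mothers' total.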
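As an illustration of the calculation just described, the sketch below computes age-specific fertility rates and sums them into a TFR. It is a simplified example under stated assumptions: the function names are hypothetical, the counts are invented, and rates for grouped ages are multiplied by the width of the age group.

```python
def age_specific_rates(births, exposure):
    """Births divided by exposure to risk, age group by age group."""
    if len(births) != len(exposure):
        raise ValueError("births and exposure must cover the same age groups")
    return [b / e for b, e in zip(births, exposure)]


def total_fertility_rate(rates, age_width=1.0):
    """Sum of age-specific rates, times the width of each age group in years."""
    return age_width * sum(rates)


# Hypothetical five-year age groups (illustrative only).
births = [10.0, 50.0, 40.0]
exposure = [1000.0, 1000.0, 1000.0]

rates = age_specific_rates(births, exposure)
tfr = total_fertility_rate(rates, age_width=5.0)  # about 0.5 children per person
```

The same two functions apply to men and women; only the inputs (births by father's or mother's age, and male or female exposure) and the age range differ.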

Probabilistic total fertility rate and hypothesis test
To derive the formula for the probabilistic total fertility rate of women, we cite the definition of TFR_w from the World Health Organization (2022): the total fertility rate is 'the average number of children a hypothetical cohort of women would have at the end of their reproductive period if they were subject during their whole lives to the fertility rates of a given period and if they were not subject to mortality'. This definition can be applied to TFR_m by replacing 'women' with 'men'. In the derivation, there is no need to distinguish men from women, and the formula for probabilistic total fertility rates can be used for both men and women.
Let us assume that the size of the 'hypothetical cohort' of men or women is N, the number of men or women in the minimal reproductive age interval (15 years of age in this study). Let f(x) be the unbiased age-specific fertility rate for ages (x, x + 1), which is also the probability of giving birth at ages (x, x + 1), assuming there is no mortality. The probabilistic process of childbearing of the ith woman (or man) can thus be characterized by the random function B_i(f(x)), which obeys the Bernoulli distribution: B_i(f(x)) takes the value 1 with probability f(x), representing a birth at ages (x, x + 1), and the value 0 otherwise.
Subsequently, the random number of children that the ith woman (or man) would have at the maximal reproductive age is

$$C_i = \sum_{x} B_i(f(x)),$$

where the sum runs over all reproductive ages. The total number of births given by all N women (or men) is $\sum_{i=1}^{N} C_i$. The probabilistic total fertility rate, namely Tfr, is this total averaged over the N women (or men):

$$\mathrm{Tfr} = \frac{1}{N}\sum_{i=1}^{N} C_i. \tag{1}$$

Using (1) to model the Tfr, the Bernoulli distribution assumes that a man or a woman can account for at most one birth in one year, which excludes the possibility of multiple births or polygamy. Nonetheless, this does not affect the calculation, because the additional births through multiple births or polygamy are counted in f(x) and distributed to all women (or men) in the corresponding age groups.
Noting that the mean and variance of B_i(f(x)) are f(x) and f(x)[1 − f(x)], and treating the Bernoulli variables as mutually independent, the mean and variance of Tfr are

$$E(\mathrm{Tfr}) = \sum_{x} f(x), \tag{2}$$

$$\mathrm{Var}(\mathrm{Tfr}) = \frac{1}{N}\sum_{x} f(x)\,[1 - f(x)]. \tag{3}$$

The calculations of the mean and variance of Tfr do not require more data than those needed for calculating the TFR. In national sample surveys, the variance of Tfr has been calculated by using sample values of the TFR obtained from different sample locations (e.g. Statistics Indonesia et al., 2013), which, however, requires more data than that for (3). Another way of doing so is by first calculating the variances of the fertility rate for each age group using the binomial distribution (e.g. Chiang, 1984), and then summing these variances to obtain the variance of Tfr. Since the population size changes with age, this variance of Tfr does not measure the uncertainty of the childbearing process experienced by the hypothetical cohort, whose size does not change with age.
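The mean and variance of Tfr can be computed directly from an age-specific fertility schedule and the cohort size N. The sketch below assumes the variance takes the form (1/N) times the sum of f(x)[1 − f(x)], which follows from independent Bernoulli trials; the function name and the flat schedule are hypothetical.

```python
def tfr_mean_and_variance(asfr, cohort_size):
    """Mean and variance of the probabilistic total fertility rate.

    asfr        -- single-year age-specific fertility rates f(x)
    cohort_size -- N, the size of the hypothetical cohort

    Assumes independent Bernoulli trials across individuals and ages.
    """
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    mean = sum(asfr)
    variance = sum(f * (1.0 - f) for f in asfr) / cohort_size
    return mean, variance


# Hypothetical flat schedule: f(x) = 0.05 at each of 30 ages (illustrative only).
asfr = [0.05] * 30
mean, variance = tfr_mean_and_variance(asfr, cohort_size=400_000)
```

With a cohort of several hundred thousand, the standard deviation of Tfr in this example is of the order of 0.002, which gives a sense of how small a difference between two TFRs can still be distinguished from random fluctuation.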
Tfr is a random variable that comprises the N random variables C_i, the random numbers of children that the ith woman (or man) would have, where i ranges from 1 to N. Since the N random variables C_i are independent of each other and identically distributed, Tfr approximately obeys a normal distribution when N is bigger than 30, according to the central limit theorem (see Agresti & Finlay, 1997). The bigger the N, the closer the distribution of Tfr is to normal.
For this study, N is the number of men or women at age 15 in one of the G7 countries, and it is quite large; thus the probability distribution of Tfr is very close to normal.
In view of a probabilistic total fertility rate that includes the uncertainty of the childbearing process, an observed TFR is a random sample of Tfr regardless of how large N is. The difference between an observed TFR and the mean value of Tfr is a sampling error or random fluctuation. Only when N is infinitely large does an observed TFR converge to the mean value of Tfr. When N is not infinitely large, the difference between TFR_m and TFR_w may include random fluctuations and may not represent the difference between the mean values of Tfr_m and Tfr_w. For example, a large difference between TFR_m and TFR_w does not necessarily imply a statistically significant difference between the mean values of Tfr_m and Tfr_w, and vice versa.
Besides, whether an observed difference is large or small often becomes an issue of debate among observers. So how do we know whether the difference between the mean values of Tfr_m and Tfr_w for a country is statistically significant? To find out, we use a hypothesis test as follows.
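The probabilistic model can be checked by simulation. The sketch below draws repeated realizations of Tfr for a small hypothetical cohort and compares their empirical mean and variance with the sum of f(x) and with (1/N) times the sum of f(x)[1 − f(x)]. The schedule, cohort size, seed and number of replications are arbitrary choices for illustration, not values from this study.

```python
import random
import statistics


def simulate_tfr(asfr, cohort_size, rng):
    """Draw one realization of Tfr for a hypothetical cohort.

    Each person has, at each age, one Bernoulli trial with success
    probability f(x); Tfr is the total number of births divided by N.
    """
    births = 0
    for _ in range(cohort_size):
        for f in asfr:
            if rng.random() < f:
                births += 1
    return births / cohort_size


# Hypothetical schedule: 10 ages with f(x) = 0.1 each, so the mean of Tfr is 1.0.
asfr = [0.1] * 10
cohort_size = 200
rng = random.Random(2015)  # fixed seed so the sketch is reproducible

draws = [simulate_tfr(asfr, cohort_size, rng) for _ in range(400)]

expected_mean = sum(asfr)
expected_var = sum(f * (1.0 - f) for f in asfr) / cohort_size
empirical_mean = statistics.mean(draws)
empirical_var = statistics.variance(draws)
```

In such a simulation the empirical values should lie close to the theoretical ones, and a histogram of `draws` should look approximately bell-shaped even for a cohort of only 200.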
Given that the probabilistic total fertility rates for men and women, namely Tfr_m and Tfr_w, are normal variables, and treating them as independent, the test statistic Z below is a standard normal variable under the null hypothesis that the mean values of Tfr_m and Tfr_w are identical:

$$Z = \frac{\mathrm{Tfr}_m - \mathrm{Tfr}_w}{\sqrt{\mathrm{Var}(\mathrm{Tfr}_m) + \mathrm{Var}(\mathrm{Tfr}_w)}}.$$

The 95% confidence interval of Z, or the interval that a random sample of Z will fall in with 95% probability, is [−1.96, 1.96]. A random sample of Z, namely z, can be calculated from the observed TFRs, with the variances obtained from (3):

$$z = \frac{\mathrm{TFR}_m - \mathrm{TFR}_w}{\sqrt{\mathrm{Var}(\mathrm{Tfr}_m) + \mathrm{Var}(\mathrm{Tfr}_w)}}.$$

If z falls outside (−1.96, 1.96), then the null hypothesis leads to an event that occurs with a probability smaller than 0.05. Such an unlikely event should not occur; if it does occur, it indicates that the null hypothesis is wrong and should be rejected. The conclusion is then that the difference between the mean values of Tfr_m and Tfr_w is statistically significant, or, simply speaking, that the difference between TFR_m and TFR_w is statistically significant. Otherwise, the difference between the means of Tfr_m and Tfr_w cannot be said to be statistically significant.
To display the test results more clearly, Log(|z|/1.96) is used to show the results for the G7 countries in Fig. 2. When Log(|z|/1.96) > 0, the difference between TFR_m and TFR_w is statistically significant, and vice versa.
Figure 2 indicates that the differences between TFR_m and TFR_w are not trivial, and are statistically significant for all the G7 countries except France. The pattern in the differences, as shown in Fig. 1, is not a result of random fluctuations and thus requires an explanation.
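The test can be sketched in a few lines. The example below assumes that Tfr_m and Tfr_w are independent, so that the variance of their difference is the sum of the two variances, and uses a base-10 logarithm for the index (the base does not affect its sign). The function names and input values are hypothetical and are not results from this study.

```python
import math


def z_statistic(tfr_m, var_m, tfr_w, var_w):
    """Observed z under the null hypothesis that Tfr_m and Tfr_w have equal means.

    Assumes Tfr_m and Tfr_w are independent normal variables, so the variance
    of their difference is the sum of the two variances.
    """
    return (tfr_m - tfr_w) / math.sqrt(var_m + var_w)


def significance_index(z, critical=1.96):
    """log10(|z| / 1.96): positive when the difference is significant at 5%.

    Undefined for z == 0, where the logarithm does not exist.
    """
    return math.log10(abs(z) / critical)


# Hypothetical values, for illustration only (not results from this study).
z = z_statistic(tfr_m=1.40, var_m=4e-6, tfr_w=1.45, var_w=4e-6)
index = significance_index(z)
```

Plotting the index rather than z itself places the decision threshold at zero, which makes countries with very different values of |z| easier to compare in one figure.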

Explanation models
Stable population models are used to explain the difference between the TFRs of men and women. Let the age-specific fertility rates (ASFRs) of men and women at age x be f_m(x) and f_w(x), and the probabilities for a female and a male birth to survive to age x be p_w(x) and p_m(x). The intrinsic growth rates for the stable populations of women and men, namely r_w and r_m, can be calculated separately from the characteristic equation of Lotka (1939). Let the difference between the TFRs of women and men be DTFR = TFR_w − TFR_m. The difference between the model TFRs of women and men is the model DTFR, which is denoted as DTFR*(r_m, r_w) = TFR*_w(r_w) − TFR*_m(r_m), where TFR*_w(r_w) and TFR*_m(r_m) are the model TFRs of women and men, b_mother(x) and b_father(y) the numbers of births attributed to mothers aged x and fathers aged y, P_w(x) and P_m(y) the numbers of women aged x and men aged y, and N_w and N_m the numbers of women and men at the minimal reproductive age.
When fertility is at the replacement level, r_w and r_m are zero. Since mortality is low at reproductive ages, p_w(x) and p_m(y) decline only slightly with age and can be simplified as constant over age, as p_w(α) and p_m(α). Since the sex ratio at birth is higher than 1, and since male mortality is higher than female mortality at pre-reproductive ages, N_m would be close to N_w. Noting also that the integral of b_mother(x) equals the integral of b_father(y), (9) yields a model DTFR of approximately zero. In this way, we explain why the DTFR should be approximately zero when fertility is close to the replacement level, which is the case for France.
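To illustrate how an intrinsic growth rate is obtained from a characteristic equation, the sketch below solves a discrete one-sex form, in which the sum of f(x) p(x) exp(−r x) over age equals 1, by bisection. This form is an assumption made for illustration: here f(x) counts same-sex births per person over each age interval, and the exact equation used in this paper may differ. The function names and the schedule are hypothetical.

```python
import math


def lotka_sum(r, ages, rates, survival):
    """Discrete version of the integral of f(x) * p(x) * exp(-r * x)."""
    return sum(f * p * math.exp(-r * x) for x, f, p in zip(ages, rates, survival))


def intrinsic_growth_rate(ages, rates, survival, lo=-0.2, hi=0.2, tol=1e-12):
    """Solve lotka_sum(r) = 1 for r by bisection.

    lotka_sum decreases as r increases, so the root is bracketed when
    lotka_sum(lo) > 1 > lotka_sum(hi).
    """
    if not (lotka_sum(lo, ages, rates, survival) > 1.0
            > lotka_sum(hi, ages, rates, survival)):
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lotka_sum(mid, ages, rates, survival) > 1.0:
            lo = mid  # sum still too large: r must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical schedule of same-sex births per person (illustrative only).
ages = [22.5, 27.5, 32.5, 37.5]       # mid-points of five-year age groups
rates = [0.15, 0.30, 0.25, 0.10]      # sums to 0.80: below replacement
survival = [0.99, 0.99, 0.98, 0.98]   # probability of surviving to each age

r = intrinsic_growth_rate(ages, rates, survival)  # negative for this schedule
```

Applying the same routine to a male schedule and a female schedule separately will in general give two different values, which is the source of the difficulty with one-sex models discussed in this section.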
Furthermore, let the mean ages of childbearing of women and men in the corresponding stationary populations be MAC_w and MAC_m, as given by (10) and (11). Using (10) and (11) and the Taylor expansion of the exponential terms in (9), (9) becomes the two-sex separate explanation model (12). Why DTFR > 0 at below-replacement fertility levels can be explained using a useful fact: men are normally older than their women partners. This fact is verified across religions and 130 countries (Ausubel et al., 2022) and can be interpreted as MAC_m > MAC_w. Given that MAC_m > MAC_w, model (12) leads to DTFR*(r_m, r_w) > 0 when r_m ≤ r_w. However, using Lotka's characteristic equation for men and women separately, r_m ≤ r_w cannot be guaranteed. Thus, model (12) cannot guarantee that DTFR*(r_m, r_w) > 0.
In Fig. 3, the black bars describe the observed values of DTFR, which are positive for all the G7 countries except France, for which the value is negative but almost zero. The white bars show the values of DTFR*(r_m, r_w), which are close to the observed value for Italy and acceptable for Germany and Japan, but mistakenly negative for Canada, the UK and the US, and far off for France. In other words, the two-sex separate explanation model cannot explain the difference between the TFRs of men and women for more than half of the G7 countries.
The failure of the two-sex separate explanation model is caused by the fact that r_m and r_w are different. The problem can be solved using a two-sex joint stable population model (Li, 2022), which measures the age-specific fertility rates for men and women jointly as

$$f(x, y) = \frac{b(x, y)}{\sqrt{P_w(x)\,P_m(y)}},$$

where b(x, y) are the numbers of births attributed to women aged x and men aged y and are calculated by using the estimating procedure of the model. The denominator of a joint fertility rate should be an average of the numbers of women and men. The geometric average is used because it is the simplest reasonable choice. The arithmetic average is simpler because it is linear, but it is not reasonable: a positive joint fertility rate and a positive arithmetic average of men and women would give a positive number of births even if there were no men or no women. Other averages may also be reasonable, but would be more complex.
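The argument for the geometric average can be made concrete with a small numerical check. The sketch below computes the births implied by a given joint rate under each kind of average; the function names and numbers are hypothetical and serve only to illustrate the point about a population with no men.

```python
import math


def births_geometric(rate, women, men):
    """Births implied by a joint rate applied to the geometric average."""
    return rate * math.sqrt(women * men)


def births_arithmetic(rate, women, men):
    """Births implied by a joint rate applied to the arithmetic average."""
    return rate * (women + men) / 2.0


# Hypothetical case: 1000 women and no men (illustrative only).
geo_no_men = births_geometric(0.1, women=1000.0, men=0.0)     # no births
arith_no_men = births_arithmetic(0.1, women=1000.0, men=0.0)  # about 50 births

# With equal numbers of women and men, the two averages coincide.
geo_equal = births_geometric(0.1, women=1000.0, men=1000.0)
arith_equal = births_arithmetic(0.1, women=1000.0, men=1000.0)
```

The geometric average returns zero births whenever either sex is absent, whereas the arithmetic average does not, which is the sense in which the latter is described as unreasonable.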
On the basis of this joint fertility rate, a two-sex joint stable population model is established, of which the two-sex joint intrinsic growth rate, r, is calculated using the characteristic equation below.
where s is the sex ratio at birth.
To study the difference between male and female fertility, the TFRs of men and women can also be modeled in two-sex joint stable populations as below.
Accordingly, using the two-sex joint stable population, model (9) becomes the two-sex joint explanation model (16), where r < 0 and MAC_w < MAC_m. Hence, the two-sex joint explanation model (16) is guaranteed to give a positive model difference, DTFR*(r) > 0; the question is how well it fits.
The gray bars in Fig. 3 show the values of DTFR*(r), the model difference between the TFRs of men and women: they are close to the observed DTFR, or at least acceptable, for all the G7 countries.
To be quantitative, the R-squared of the explanation models is calculated as

$$R^2 = 1 - \frac{\sum_i \left(\mathrm{DTFR}_i - \mathrm{DTFR}^*_i\right)^2}{\sum_i \left(\mathrm{DTFR}_i - \overline{\mathrm{DTFR}}\right)^2}, \tag{17}$$

where the subscript i stands for the ith country, $\overline{\mathrm{DTFR}}$ is the mean of the observed DTFR_i over the countries, and DTFR*_i can be DTFR*_i(r) for the two-sex joint model (16) or DTFR*_i(r_w, r_m) for the two-sex separate model (12). Applying model (16) to the G7 countries, we get an R-squared value of 0.76, which indicates that 76% of the variance in DTFRs is explained. As a comparison, the R-squared of model (12) is −0.92, implying that the two-sex separate model is worse than the corresponding mean value at explaining the differences between the TFRs of men and women.
Female stable population models (Lotka, 1939; Leslie, 1945) form the major basis of demography and are successfully applied to various one-sex issues. The failure of model (12) indicates that these models cannot work for two-sex cases and calls for the use of two-sex stable population models.
The two-sex joint explanation model (16) is suitable to describe the observed pattern for the following reasons. For the numerators of the ASFRs, most of the children are born to men at ages around MAC_m and to women at ages around MAC_w, and MAC_m is bigger than MAC_w. For the denominators of the ASFRs, the male population (N_m[p_m(x)/p_m(α)]e^{−r·x}) and the female population (N_w[p_w(x)/p_w(α)]e^{−r·x}) increase exponentially with age at the same rate (−r > 0).
Thus, the ASFR of men at ages around MAC_m would be reduced from the values corresponding to the stationary populations (r = 0), and such a reduction would be bigger than that of women at ages around MAC_w. Since the stationary TFRs of men and women are about the same, and since the reduction for men is bigger than that for women, the model TFR of men would be smaller than the model TFR of women, explaining the observed differences. Thus the difference between the TFRs of men and women is explainable.
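A negative R-squared may look surprising, so a small numerical sketch is given here. It assumes the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares around the observed mean, under which a model that fits worse than the mean yields a negative value. The function name and all numbers are hypothetical and are not the G7 results.

```python
def r_squared(observed, modeled):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of the observations."""
    if len(observed) != len(modeled):
        raise ValueError("observed and modeled must have the same length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - m) ** 2 for o, m in zip(observed, modeled))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed differences for three countries (illustrative only).
observed = [0.00, 0.05, 0.10]

good_model = [0.01, 0.05, 0.09]  # tracks the observations closely
bad_model = [0.10, 0.05, 0.00]   # reverses the pattern across countries

r2_good = r_squared(observed, good_model)  # close to 1
r2_bad = r_squared(observed, bad_model)    # negative: worse than using the mean
```

A model that gets the sign of the difference wrong for several countries, as the two-sex separate model does, can therefore score below zero even though its predictions are of a plausible magnitude.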

Discussion
Two factors determine whether a difference between the TFRs of men and women is statistically significant. One is the difference between the TFRs; the other is the size of the hypothetical cohort, or the number of men or women in the minimal reproductive age interval, namely N. The bigger the N and the difference between the TFRs, the more likely the difference is to be statistically significant, and vice versa.
The difference between the TFRs of men and women for France in 2014 was only 0.002, which is not statistically significant. But if such a difference were observed in a population ten times larger than the population of France, it would become statistically significant. Similarly, a difference observed in one year may not be statistically significant, but over five or ten years it could become significant, because random fluctuations tend to cancel each other out over longer periods of time. This can also be understood from the perspective of calculation: the longer the period, the larger the N.
To study the fertility of some special age groups, for example, the 15-19 age group, the model of probabilistic total fertility rates can be used flexibly. The lowest and highest ages can be set as 15 and 19 years, respectively. The hypothesis test can also be applied to the difference between the TFRs, or the fertility of a particular age group, of women in two countries or socioeconomic groups. These differences can be analyzed to understand the effects of socioeconomic factors on fertility. By attaching statistical significance to the process, we can improve such analyses. The hypothesis test can also be applied to the difference between the TFRs of women in two separate years. Statistical significance can also help in modeling the change over time in the TFR of women, which is important in population projection.
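The role of N can be illustrated numerically. In the sketch below, the same pair of fertility schedules is tested with two cohort sizes; because the variance of Tfr is inversely proportional to N, z grows with the square root of N. The schedules, the cohort sizes, and the simplifying assumption that men and women have the same N are all hypothetical.

```python
import math


def z_for_cohort(asfr_m, asfr_w, cohort_size):
    """z for the difference in TFRs when both sexes have the same cohort size N.

    Variances are (1/N) * sum of f(x) * (1 - f(x)), and the two Tfr values
    are treated as independent.
    """
    var_m = sum(f * (1.0 - f) for f in asfr_m) / cohort_size
    var_w = sum(f * (1.0 - f) for f in asfr_w) / cohort_size
    return (sum(asfr_m) - sum(asfr_w)) / math.sqrt(var_m + var_w)


# Hypothetical schedules with a small difference in TFR (illustrative only).
asfr_w = [0.05] * 30     # TFR_w = 1.500
asfr_m = [0.0499] * 30   # TFR_m = 1.497

z_small = z_for_cohort(asfr_m, asfr_w, 400_000)    # |z| below 1.96
z_large = z_for_cohort(asfr_m, asfr_w, 4_000_000)  # |z| above 1.96
```

In this example a difference of 0.003 is not significant for the smaller cohort but becomes significant when the cohort is ten times larger, since z is multiplied by the square root of 10.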
The explanation models are built on stable male and female populations. In the G7 countries, the TFR_w declined, falling below replacement levels in the 1970s, and then fluctuated around some constants, and so should the TFR_m according to the pattern shown in Fig. 1. Mortality declined only slightly at reproductive and younger ages because there was little room for any notable decline, and the numbers of emigrants and immigrants were small compared with the size of the population. Thus, the male and female populations would approach the corresponding stable populations. That they did not approach one-sex stable populations separately is indicated by the failure of model (12), which is built upon separate one-sex stable populations, to explain the difference between the TFRs of men and women.
Nonetheless, the two-sex separate model, which uses the one-sex models separately, is not useless. Female or one-sex stable population models are used to conduct various one-sex demographic analyses or develop calculation methods. When two sexes are taken into consideration, the theoretical problems with the one-sex models are well known: the intrinsic growth rates of men and women are always different, and thus the corresponding male and female stable populations cannot exist together in the long run. However, it was not known whether one-sex models would still work for practical two-sex issues during limited periods. The failure of model (12) indicates that such models cannot work for a two-sex issue within a limited period.
On the other hand, model (16) is built on two-sex joint stable populations and explains 76% of the variance of the difference between the TFRs of men and women, indicating that the male and female populations have been approaching two-sex joint stable populations, or that the age structures of men and women in the specified year of around 2015 could be approximated by those of the corresponding two-sex joint stable populations.
Model (16) simplifies the determinants of the differences between the TFRs of men and women (TFR_w − TFR_m) into two categories. One is the age gap between men and their women partners (MAC_m − MAC_w), which is the source of the difference. The other is the intrinsic growth rate (r), which determines how much difference a certain source could produce. For instance, if the age gap were zero, the difference would be zero regardless of the intrinsic growth rate. And for a given non-zero age gap, a change in the intrinsic growth rate can make the difference bigger or smaller.
In Fig. 2 of Schoumaker's paper (2017b), we observe a reverse pattern in the difference between TFR_m and TFR_w. In a situation where fertility is higher than the replacement level, TFR_m is bigger than TFR_w. Besides, the higher the fertility level, the bigger the difference. Model (16) appears to explain this pattern: with r > 0 and MAC_w < MAC_m, the DTFR*(r) is negative in model (16), which is consistent with the observed TFR_w < TFR_m. But at higher than the replacement level, fertility would be declining according to observations (e.g. United Nations, 2022) and the demographic transition theory, and the transitional populations should not be close to stable populations. Therefore, how model (16) could explain the observed pattern in high-fertility countries is a question that needs to be answered.
'Today, two-thirds of the global population lives in a country or area where lifetime fertility is below 2.1 births per woman' and the TFR_w across the world is projected by the United Nations to drop from 2.34 in 2020 to 1.84 in 2100 (United Nations, 2022). The populations of more countries would stabilize at below-replacement fertility levels, including those of developing countries, where the age gap between men and their women partners is bigger than the gap in developed countries (Ausubel et al., 2022).
Apart from explaining what has happened in the G7 countries, model (16) can also be used to forecast what could happen in developing countries which have reached or will reach below-replacement fertility: if their intrinsic two-sex joint growth rates stabilize at levels similar to those in the G7 countries, the TFR of their male population would be substantially lower than the TFR of their female population. Only time can tell if this forecast will come true.
In summary, the TFR of men is significantly lower than that of women in the G7 countries, and the difference between the TFRs of men and women could be more substantial in other countries in the foreseeable future. These findings call for further investigation because low fertility could create socioeconomic problems. And extending fertility studies from women to the entire population, which includes both men and women, can greatly improve our knowledge about fertility age patterns, trends, determinants, policies and related issues.

Fig. 1 Total fertility rates of men (TFR_m) and women (TFR_w) for the G7 countries. Source: United Nations Demographic Yearbooks and Human Mortality Database

Fig. 3 Observed and model difference between TFR_m and TFR_w for G7 countries. Source: As for Fig. 1