Sex Differentials in Rates of Birth and Infant Mortality in India
India has made tremendous progress since 1990. GDP per capita has increased fourfold, the adult literacy rate has increased by a factor of 1.5, life expectancy at birth has increased by 12 years with women’s now exceeding men’s, the maternal mortality rate has fallen by a factor of 2.5, neonatal, infant, and child mortality rates have all fallen by a factor of 3-4, and both the gender inequality index and the human development index have improved. Moreover, these have all improved steadily, almost every year.
What would you expect about sex ratio at birth in India?
I expected the following:
It would have improved across the whole country.
It would be worse in the north than in the south.
It would be worse in rural areas than in urban areas.
The law banning prenatal sex determination, which was passed almost 30 years ago, would have helped with the known problem of selective abortion of female fetuses.
I had also assumed these to be true and touted them as facts to multiple people, including twice in the last two years. Recently, when I tried to confirm this, I learnt the opposite:
Sex ratio at birth has gotten worse across the whole country.
It is better in the south than in the north, but not in every southern state, and it is not good in absolute terms even in the south.
It is worse in urban areas than in rural areas.
The impact of the law has been mild.
I also looked into a few adjacent topics and learnt a couple more surprising things:
Infant mortality rate is worse for girls than boys.
Sex ratio for the population as a whole has improved.
I found what I read counterintuitive, and was in sufficient disbelief to look for some original sources. In the rest of this article I lay out what I learnt in detail using data for the last decade. Scroll down to the bottom for links to all media and data.
Sex Ratio at Birth (SRB)
In an ideal state of equilibrium, we might expect an equal number of boys and girls being born. An evolutionary model called Fisher's principle explains this in an intuitive way. But due to various factors, slightly more boys than girls are born almost everywhere in the world. The exact number varies, but it is typically around 105 boys per 100 girls. There are just two outliers: East Asia, mainly China; and South Asia, mainly India. Could they have made a mistake about India, or had bad data?
The chart above shows the SRB in India according to three independent sources.
Chao et al is an academic model that uses a number of factors to estimate the SRB in all countries of the world.
The World Bank uses sampling data to estimate SRB in all countries. It uses a five-year moving average to smooth out sampling fluctuations. Strangely, the recent data looks like a straight line for many countries I looked at, and not just for India.
India's own census organization, ORGI, in addition to conducting census, also conducts half-yearly large scale sample surveys called Sample Registration System (SRS). Since 2000, it has been collecting and publishing data for SRB. It uses a three-year moving average to smooth out sampling fluctuations in SRB.
Overall, they all agree that our SRB is terrible.
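The moving-average smoothing mentioned above can be sketched in a few lines. This is a minimal illustration with made-up yearly values, not real SRS or World Bank data. It assumes a simple trailing average of yearly ratios; the actual sources may use a centered window or pool the underlying birth counts instead.

```python
def moving_average(values, window):
    """Trailing moving average: one output value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical yearly SRB estimates (boys per 100 girls), NOT real data.
yearly_srb = [110.0, 112.0, 111.0, 113.0, 109.0]

three_year = moving_average(yearly_srb, 3)  # SRS-style window (assumed trailing)
five_year = moving_average(yearly_srb, 5)   # World Bank-style window (assumed trailing)

print(three_year)
print(five_year)
```

A longer window gives a flatter line, which may be part of why the five-year series looks so smooth, though that alone would not make it perfectly straight.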
An explanation of the trend is that amniocentesis starting in the mid 70s and after that ultrasound starting in mid 80s became widely available and affordable. These were used for sex determination and selective abortion of female fetuses. In 1994, the government passed PCPNDT, a law banning prenatal sex determination. It had a small but positive impact in reducing the rate of deterioration. In 2003, the law was amended to introduce more stringent regulations and punishments, after which the ratio actually improved a little for the first time. But there is neither a sufficiently large improvement nor a sustained change in the right direction to feel good about.
There are studies showing a correlation between the availability of ultrasound testing equipment and worsening sex ratio.
Could the problem be due to a few states that have an outsized impact?
SRS also breaks down data for states and union territories that have at least 10 million people, covering more than 98% of our population. The above chart shows the differentials in each state, i.e. (SRB - 1). States are roughly grouped in regions from left to right: north (13% of total population), east (39%), center (26%), and south (21%).
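As a small sketch of the quantity being charted, the differential is just the number of boys per girl minus one. The function name and the sample numbers below are illustrative assumptions, apart from the regional shares, which are the ones quoted above.

```python
from math import isclose


def srb_differential(boys, girls):
    """Excess of boys over girls at birth, i.e. (SRB - 1)."""
    if girls <= 0:
        raise ValueError("girls must be positive")
    return boys / girls - 1


# The typical worldwide ratio of 105 boys per 100 girls gives roughly 0.05.
typical = srb_differential(105, 100)
print(round(typical, 4))

# Regional population shares quoted in the text, in percent.
region_share = {"north": 13, "east": 39, "center": 26, "south": 21}
covered = sum(region_share.values())  # the rounded shares add up to 99
print(covered)
```

A differential of zero would mean exactly as many boys as girls, so bars further from zero are worse.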
The SRB gets worse as we go from the south to the center to the east to the north. Chhattisgarh and Kerala are the best, followed by West Bengal and Himachal Pradesh. In absolute terms, every other state has a bad SRB. On the other end, Uttarakhand and Haryana are the worst, followed by Delhi and Gujarat. Punjab improved the most, and Karnataka and Tamil Nadu deteriorated the most.
How about the divide between rural and urban areas?
In both rural and urban areas, SRB gets worse as we go from the south to the center to the east to the north. Chhattisgarh is the best state for rural areas, and Kerala the best for urban and second best for rural areas. Punjab showed the largest improvements in both rural and urban areas, though in absolute terms the ratio is still bad.
Strikingly, in almost every state SRB gets worse as we go from rural to urban areas, with four exceptions: Punjab, Uttar Pradesh, Madhya Pradesh, and Maharashtra. Two other things stand out: SRB improved a lot in the urban areas of Uttar Pradesh immediately after the separation of Uttarakhand, and deteriorated a lot in the urban areas of Andhra Pradesh after the separation of Telangana.
The simplistic reason I expected SRB to be better in urban areas is that they have higher living standards, lower poverty rates, better healthcare, and greater opportunities overall, all of which might convince people that having a girl isn't really a tax to fear, much less one that calls for drastic measures.
One theory is that India is seeing a trade-off between fertility rates and sex ratio. I looked into it, and for now remain unconvinced. Birth rates have been steadily declining in every state, in both rural and urban areas. Comparing them with the less consistent sex ratio charts, both by eye and by calculating correlations, I didn't find a strong relationship. The trend is also in line with the rest of the world – fertility rates have been declining in every single continent, region, and country.
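For readers who want to run a similar check, here is a minimal sketch of a Pearson correlation between a birth-rate series and an SRB series. It is not necessarily the calculation used for this article, and the two series below are invented placeholders rather than SRS figures.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly values for one state, NOT real data.
birth_rate = [21.8, 21.4, 21.0, 20.4, 20.0]  # births per 1000 people
srb = [1.10, 1.11, 1.10, 1.12, 1.11]         # boys per girl

r = pearson(birth_rate, srb)
print(round(r, 2))
```

A value near -1 would support the trade-off theory (falling birth rates alongside a rising SRB), while a value near 0 would not. With only a handful of yearly points per state, any single coefficient should be read with caution.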
Another theory is that the ability to determine the sex of a baby, coupled with a cultural preference for boys, has increased the prevalence of sex-selective abortions. This theory also explains why it is worse in urban areas than in rural ones. The amenities arrived in cities first, are more widely available in urban areas than rural ones, and are affordable to a larger percentage of people in urban areas. Culture changes more slowly than life outcomes, so not only has the improvement in the standard of living been insufficient to offset the cultural preference for boys, but it may also have contributed to a stronger expression of those preferences.
If that is true, as the trends of increasing urbanization and improving healthcare continue, we can expect the SRB to get worse even in rural areas in the absence of any other interventions.
This is a more visual though less informative way of looking at the same data to convey the idea that SRB has gotten worse over the last decade almost everywhere.
Infant Mortality Rate (IMR)
The bias for male births worldwide begins getting naturally countered almost immediately. Boys are more likely to suffer from birth complications and infectious diseases, and are more likely to die in infancy than girls. This continues even as they age. Boys tend to die at greater rates than girls, and men earlier than women. We see that almost everywhere in the world, once again with the exception of China and India. The above time series shows IMR for girls along the X-axis and for boys along the Y-axis. We are on the x=y line, and sometimes even below it (towards x>y).
This time series shows IMR data from SRS plotted similarly, broken down by states, as well as separately for rural and urban areas. Following the big gray dot (India), it is clear that the IMR for girls is greater than for boys. I have no idea why. I have read that some of the differential may be due to selective negligence of girls. If that is true, I would expect to see it in the form of a greater skew during the earlier stages of infancy. But I haven’t been able to confirm that. Though SRS breaks down infant mortality data further into stillbirths, early neonatal deaths (< 7 days), late neonatal deaths (7-29 days), and post-neonatal deaths (29-365 days), I could not find the data separated by sex.
There are however two positive trends: the points are all moving towards the origin, meaning IMR is going down almost everywhere; and many points have crossed the line into the x<y territory, meaning IMR for girls in many states is reducing faster than for boys.
My intuition is that, unlike SRB, IMR trends are moving in the right direction. We are catching up with the rest of the world. So instead of worrying too much about the differential, India should continue its efforts to drive down the absolute numbers across the board. In some countries like Norway, infant mortality rates are as low as 0.25%, about one-fifteenth of India's. If we can achieve that rate, differentials in either direction will seem like rounding errors by comparison.
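The arithmetic behind that comparison can be made explicit. The sketch below only uses the two numbers quoted above (0.25% and a factor of 15); the Indian rate it prints is the figure implied by those two numbers, not an independently sourced one, and the 10% relative gap between girls and boys is a purely hypothetical value chosen for illustration.

```python
norway_imr_pct = 0.25  # quoted above
india_to_norway = 15   # quoted above

# Implied Indian IMR in percent: 0.25 * 15 = 3.75
india_imr_pct = norway_imr_pct * india_to_norway


def pct_to_per_1000(pct):
    """Convert a percentage into a rate per 1000."""
    return pct * 10


def absolute_gap(rate_pct, relative_gap):
    """Gap between the sexes in percentage points, for a given relative gap."""
    return rate_pct * relative_gap


hypothetical_relative_gap = 0.10  # assumption for illustration only

print(india_imr_pct, pct_to_per_1000(india_imr_pct))
print(absolute_gap(india_imr_pct, hypothetical_relative_gap))
print(absolute_gap(norway_imr_pct, hypothetical_relative_gap))
```

The same relative gap shrinks from roughly 0.4 percentage points to roughly 0.03 once the overall rate falls to Norway's level, which is the sense in which the differential starts to look like a rounding error.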
Kerala once again is the best state. Uttar Pradesh is the worst, and being the largest state with about one-sixth of the whole population, concentrating efforts to improve its situation will go a long way in addressing the issue.
Another positive trend is the rural vs urban distinction. Unlike SRB, IMR is without exception better in urban areas than in rural areas in every state. This is true even when looking only at IMR for girls or for boys. This is a clear indicator that improving standards of living, including better access to sanitation and healthcare, will be enough to keep moving us in the right direction. There doesn't seem to be any cultural baggage to worry about when it comes to reducing infant mortality. Rather, families with a strong preference for boys who may have neglected girls are simply not having girls, and the rest are taking care of their infants equally well regardless of sex.
Child Mortality Rate (CMR)
Mortality rates for children who have survived infancy start to look a lot better. There is no longer a uniform sex differential. By 2019, the worst rates are around 0.2%, and more than half the states have rates below 0.1%. There are some differences across states as well as regions, but the numbers are all so low that relative to SRB and IMR it is a less urgent problem.
Longer Term Trends
For the census years 1991, 2001, and 2011, the data includes population breakdown by five-year age groups. This doesn't get us SRB, but shows longer term trends of various age groups over two decades. Below are dot plots for the sex ratio in the years 1991, 2001, and 2011, for the total population as well as every age group.
Between 1991 and 2011, not just SRB but also the sex ratio in children up to the age of 10 became worse. But after that age, we start seeing the reverse. The ratio in every other age group has improved in at least rural or urban areas, and in many cases in both. The ratio for the population as a whole (first row) has also improved in both rural and urban areas. Things are still quite imbalanced, but the root cause is that many girls are being prevented from being born.
I tried looking at the data in other ways, including a state-wise breakdown, but there were several things I couldn't explain, mainly stemming from not understanding how the census is conducted, and more importantly how the process changed over the decades. Some examples:
The 1991 Census didn't have J&K data, and it was interpolated in some way.
The 1991 Census also didn't have data for the subsequently formed states of Chhattisgarh, Jharkhand, and Uttarakhand. So this would have impacted the numbers of the states that they had been previously part of.
How big a role do urbanization and inter-state migration play? How to take those into account?
How to take into account the 3-5 million people in each census categorized under "age not stated"?
Following an age group, e.g. considering the age group 0-4 in 1991, age group 10-14 in 2001, and age group 20-24 in 2011 as all representing the same group of population, I found inconsistencies in both directions – increasing in number for one census and then decreasing for the next. I found this for multiple age groups. How does one explain such inconsistencies?
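The cohort-following check in the last item can be sketched as follows. The data layout, the function names, and the counts are all hypothetical; the real census tables would need to be reshaped into something like this first.

```python
def cohort_label(start_age, years_elapsed, width=5):
    """Label of the five-year age group a cohort occupies after some years."""
    low = start_age + years_elapsed
    return f"{low}-{low + width - 1}"


def follow_cohort(census, base_year, start_age):
    """Return [(year, age_group, count)] for one birth cohort across censuses."""
    track = []
    for year in sorted(census):
        if year < base_year:
            continue
        label = cohort_label(start_age, year - base_year)
        if label in census[year]:
            track.append((year, label, census[year][label]))
    return track


def years_with_growth(track):
    """Census years in which the cohort is larger than in the previous census."""
    return [
        later[0]
        for earlier, later in zip(track, track[1:])
        if later[2] > earlier[2]
    ]


# Hypothetical counts in millions, NOT real census data.
census = {
    1991: {"0-4": 100.0, "5-9": 105.0},
    2001: {"10-14": 104.0, "15-19": 99.0},
    2011: {"20-24": 98.0, "25-29": 97.0},
}

track = follow_cohort(census, base_year=1991, start_age=0)
print(track)
print(years_with_growth(track))
```

Ignoring migration into the country, a cohort can only shrink over time, so any year flagged by `years_with_growth` points to a counting, age-reporting, or methodology issue rather than a real demographic change.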
Bonus: The above chart shows sex ratio spanning a whole century. I don't know if it even makes sense to plot all the data points on one graph as the method of getting these numbers has changed. Until 1981, the Indian census was conducted by just sampling about 10% of the population. I imagine as we go farther back, the process gets less and less reliable. I don't know what our ex-colonizers counted, but their numbers look great.
All images in their original quality, including many that weren't embedded in this article, can be found here. Links to all interactive visualizations, and all the raw data itself along with details of their original sources can be found here.
Unlike most of the world, India often measures the sex ratio as the "number of females per male". I stuck with the standard definition of "number of males per female" because it's easier to compare against other countries. Some of the older graphs are referring to the same thing when they say "gender ratio".
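Converting between the two conventions is a matter of taking a reciprocal. The sketch below assumes the Indian figures are scaled as females per 1000 males, which is how they commonly appear in census publications; the function names are mine.

```python
def males_per_female(females_per_1000_males):
    """Convert 'females per 1000 males' into 'males per female'."""
    if females_per_1000_males <= 0:
        raise ValueError("ratio must be positive")
    return 1000 / females_per_1000_males


def females_per_1000_males(males_per_female_ratio):
    """Convert 'males per female' back into 'females per 1000 males'."""
    if males_per_female_ratio <= 0:
        raise ValueError("ratio must be positive")
    return 1000 / males_per_female_ratio


# The typical worldwide SRB of 1.05 boys per girl, in the other convention.
print(round(females_per_1000_males(1.05)))

# A value of 800 females per 1000 males corresponds to 1.25 males per female.
print(males_per_female(800))
```

Note that the direction of "better" flips between the two conventions: a higher number is worse when counting males per female, and better when counting females per male.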
The two time series of girls vs boys scatter plots use percentages. All other charts represent birth rates as "number of births per 1000 people" and death rates as "number of deaths per 1000 people in that age group".
The raw data was obtained from miscellaneous documents hosted on the Census catalog, especially the ones titled "Sample Registration System Statistical Report 20XX". The surveys are being conducted twice a year without fail, and their sample size has steadily increased from 7.5 million in 2011 to 8.2 million in 2019. They claim to be one of the largest demographic surveys in the world. I can't assess survey design, but as per the descriptions included in the reports, as well as the thoroughness of the reports themselves, I think they are of high quality.
Note that some of the Datawrapper visualizations take more than ten seconds to load. Please let me know if you see anything missing or wrong.
Many thanks to MCK for all the discussions and suggestions.