By: Robert Woodberry

One argument in favor of religious liberty is that it unleashes religion’s pro-social potential and benefits. Conversely, an argument against religious liberty is that religion is a socially dangerous force that needs to be held in check or at least highly regulated. Given these contending positions, it becomes important to engage studies such as the one by University of Chicago professor Jean Decety and colleagues entitled “The Negative Association between Religiousness and Children’s Altruism across the World.”

Mainstream media outlets picked up on this study and publicized it widely. “Religious Children are Meaner than Their Secular Counterparts” proclaimed a headline in the Guardian. “Religious Kids are Jerks” raved the Daily Beast. Hundreds of other newspapers and blogs touted similar articles: the Economist, Forbes, Good Housekeeping, the LA Times, the Independent.

But what is the evidence behind these claims? Does it match previous research? Is it worth all the hype? My analysis of the article demonstrates that the project was poorly constructed and the data analysis sloppy. The authors do virtually nothing to test alternative explanations or mitigate the flaws in their research design. The results contradict the vast majority of other research on the topic. The authors extrapolate well beyond what the data show, and reporters extrapolate beyond even what the authors claim. However, making the problems clear for those not trained in statistics and not familiar with the previous research on the topic takes some space.

The authors ran an experiment with 1,170 children in six countries (Canada, China, Jordan, Turkey, USA, and South Africa). In the main experiment, the authors gave each child 30 stickers and allowed them to pick the 10 they wanted to keep. Researchers then told the child that they lacked time to run the experiment with other children in their school, but if the child gave up some of their 10 chosen stickers, the researchers would give those stickers to another child. The researchers then counted the number of stickers each student gave back as a measure of how “altruistic” the children were. The researchers then interviewed a parent of each child and asked the parent an open-ended question about the parent’s religion. The researchers decided whether or not they considered the parent religious, and then applied their religious designation to the child. The children (5-12 years old) were not asked about their own religiousness. The researchers then compared the number of stickers given away by “religious” and “non-religious” children and found that on average “non-religious” children gave away more stickers (or actually 86 percent of a sticker more). Yes, the global media campaign is about a fraction of a sticker. Despite the huge diversity of people in their cross-national sample (for example, Canada and Jordan) and the many factors that influence the generosity of children in such diverse contexts (for example, poverty), the researchers assumed that the only difference between the “religious” and “non-religious” children was being religious. [1]

In the second experiment the researchers showed the children a series of scenarios involving one child pushing another and other types of “interpersonal harm.” The religious students judged the behaviors as more “mean” than the non-religious children. Muslim children recommended a harsher punishment for the bad behavior than non-religious children. The punishments Christian children recommended were indistinguishable from those recommended by non-religious children.

However, in the conclusion of the article and in most media reports the various authors claim that “religious” children were “meaner,” “harsher,” or “more vindictive” without qualifying that only Muslim children were (if we assume the problematic sample applies to Muslim children in general). The researchers did not interpret the religious children’s concern for people who were harmed by another child as a sign of altruism, but as vindictiveness. The authors do not adjust their evaluations of the severity of the punishments to account for the children’s interpretation of the severity of the offences—if you don’t think pushing someone is bad, you obviously won’t want a strong punishment for it. Nor do they interpret the Christians as merciful for simultaneously thinking harming another student was meaner than non-religious students thought, yet calling for equally mild punishments as non-religious students did.

In the third experiment, researchers asked parents how empathetic their child is and how sensitive to injustice. Religious parents rated their children as more empathetic and sensitive to injustice than non-religious parents did. The researchers interpret this as parental blindness, assuming that the sticker experiment better captures the empathy and sensitivity of the children than either the child’s concern for children who are shoved, or a lifetime of parental experience. Alternatively, the researchers could have asked a teacher or other students about the empathy and sensitivity of the children (as outside corroboration), but they did not.

The researchers interpreted these three experiments as indicating that religious people think they are more helpful, when in fact they are actually “less helpful” and “more punitive.” In interviews with reporters Decety explains that if people think they are more moral they give themselves permission to be more immoral. Thus, thinking you are moral is detrimental. Decety also claims his research shows that secularization is good. “…secularization of moral discourse does not reduce human kindness. In fact, it does just the opposite.” Both claims are rather broad and not well supported by the data. We do not know if the children think they are more moral, only that their parents think they are more moral. We do not know why the religious children gave away fewer stickers (or even if the association is causal) let alone that they acted less “morally” because they think they are more moral. Nor did the researchers do any investigation about the effect of secularization on kindness.

Popular articles extrapolate even further. An article in the Mirror claims that “Children of atheists are kinder and more tolerant”—although it is unlikely that 28 percent of the parents in the sample coded as “non-religious” are all atheists. An article in Forbes claims the research demonstrates that religious people are “less moral” and that “history backs-up the scientific evidence that secular people are more moral.” I guess a fraction of a sticker outweighs Hitler and Stalin, but who’s counting?

So, how do we evaluate if this research is worth taking seriously? A six-question test, outlined in brief here and in more detail later, helps answer this question.

1) Does the research adequately deal with and explain previous literature? No.

2) Is the article in an appropriate, peer reviewed journal, where scholars are likely to have been able to catch the major flaws? No.

3) Do the authors use a representative sample of the groups they are studying? No.

4) Do the authors do sufficient work to demonstrate that the relationship between religion and giving behavior is causal? No.

5) Is the statistical analysis rigorous and appropriate? Is it plausible that difference in religious upbringing is the only thing that makes stickers more valuable to some of the children than other children? No.

6) Is religion carefully measured and are religious groups carefully distinguished? No.

Finally, as regards mainstream media coverage of this study, it’s striking that the many newspaper and magazine articles that reported this story consistently take the research by Jean Decety and his colleagues as objectively true and unproblematic. They do not interview any other scholar who has researched this topic, nor cite any of the many peer-reviewed journal articles and university press books that find a different result. Given the dozens of scholars who have researched religion and altruism, it would not have been hard to find another scholar who could have offered perspective. I thought it was standard journalistic procedure to get more than one point of view for a story. Maybe not if the story says something you desperately want to be true.

Evaluating Research on Religiousness and Altruism in Children: A Six-Question Test

Earlier, I referred to a six-question test to evaluate the quality of the study by Decety and colleagues claiming religious children are less altruistic and more vindictive than non-religious children. This study was widely lauded in mainstream media sources—generally without interviewing any other scholars who research this topic. Below are in-depth answers to each of the six questions:

1) Does the research adequately deal with and explain previous literature? No.

There are dozens of articles and books about the relationship between religion and altruism. The vast majority of this research shows that religious people are more altruistic than non-religious people. Much of this literature is based on self-report, but some is based on unobtrusive observation. Much of this research also comes from high-quality, random samples. However, there is some complexity in the evidence about the relationship between religion and altruism so we need to look at the evidence by type.

First, as both the article and most media reports affirm, there is a widespread popular belief that religious people are more helpful. However, Decety and colleagues dismiss this evidence out of hand. This implies that most people are stupid. If religious people were in fact significantly less generous than secular people, the popular perception that the reverse is true would be hard to sustain, especially for people who interact with them regularly (as opposed to academics and reporters who typically do not).

Second, survey research consistently finds that religious people give more time and money to both religious and non-religious causes, both in formal and informal settings. Most of this evidence is based on self-report. Therefore, Decety and colleagues suggest that the association is caused entirely by social desirability bias—that is, highly religious people exaggerating how helpful they are more than non-religious people exaggerating how helpful they are. Some social desirability bias is plausible, but neither Decety and colleagues, nor the one article on the topic they cite, give any concrete evidence that the association between religion and self-reported helping behavior is caused entirely by social desirability bias. They assume it is. This is a strong assumption. Some of the survey-based research on altruism even attempts to measure and control for social desirability bias—yet still finds an association between religion and helping behavior.

Third, laboratory studies based on games typically find either no relationship between religiosity and giving, or a weak positive relationship. Typically these studies are done with college students, often students from psychology or economics classes, and often in Europe. Little is known about whether or not behavior in these experimental games matches people’s altruistic behavior in real life, or if undergraduate psychology majors behave similarly to other people. Game situations may alter behavior—for example, we all know people who love violent video games and happily kill people on screen, but who are not unusually violent in real life. Moreover, since these types of games are used so often in psychology classes, it is unclear whether or not students have read about them before and know the purpose of the game while they are playing it. Even if we assume that games played in a laboratory perfectly capture how everyone acts in the real world (which I do not), laboratory-based experiments do not suggest a negative relationship between religion and altruism, just a neutral or weak positive relationship.

Finally, and most convincing to me, unobtrusive observation of real-life behavior suggests a positive relationship between religion and helping behavior both at the societal level and the individual level. This research also suggests that Christians, particularly Protestants, are more likely to be involved in institutional helping behavior. For example, in Japan virtually all the voluntary work with homeless people is done by religious organizations, the vast majority of which are Christian despite the fact that Christians are a tiny minority in the country. Similarly in countries like the United States, the vast majority of voluntary humanitarian organizations, private schools and so on were set up by religious groups/people. This would be unlikely if religious people were less generous with their time and money than the non-religious.

We see a similar pattern on an individual level. For example, when academics conduct surveys, they often ask interviewers to evaluate how friendly and cooperative the respondents were. As part of my master’s thesis, I analyzed every survey I could find that collected this type of information. I found that interviewers rated highly religious people as being significantly more helpful and cooperative than non-religious people, and that those who had to be convinced to participate in the survey in a follow-up attempt were significantly less religious than those who agreed to participate from the beginning. This suggests that in ordinary life religious people are more generous with their time than non-religious people. Moreover, because survey respondents are contacted in isolation, religious people’s greater helpfulness is not caused by their relational networks, greater social pressure, or higher likelihood of being asked to volunteer and give money.

Thus, most non-laboratory research suggests a strong positive association between religion and helping behavior, and laboratory research suggests a neutral or weakly positive association between religion and helping behavior. No line of research suggests a negative relationship between religion and helping behavior. The research by Decety and colleagues is clearly an outlier, and if reporters had cared to interview anyone who does research in this area, these scholars would have likely told them so.

2) Is the article in an appropriate, peer reviewed journal, where scholars are likely to have been able to catch the major flaws? No.

The article is published in a biology journal (Current Biology – Cell), despite the fact that the article does not focus on anything biological, and none of the authors are biologists. This seems odd. Perhaps publishing the article in a biology journal avoided getting reviewers that know the literature on religion and altruism, and who would likely force the authors to do a better job: for example, measure religion well, add sufficient controls for plausible alternative explanations, and at least deal with the previous literature on the topic. Basing an article primarily on t-tests from a non-random sample may be acceptable in biology, but I haven’t seen a peer-reviewed statistical article like this published in a reputable social science journal since the advent of personal computers (after which scholars did not have to calculate statistics by hand).

3) Do the authors use a representative sample of the groups they are studying? No.

Both the academic article, and the popular articles based on it, talk about religious children and non-religious children in general, but the authors did not sample children in a way that allows them to generalize to religious and non-religious children or even to religious and non-religious children in the seven cities where they conducted their research. Given the serious problems with the sample, we do not know who the results generalize to.

The authors picked six countries non-randomly (Canada, China, Jordan, Turkey, USA, and South Africa), picked one or two cities from each of these countries non-randomly, and then recruited respondents non-randomly. Nothing about the sample is random, and there are many ways this sampling method is likely to bias results towards religious children appearing less altruistic. For example, if you recruit religious children from a South African slum and non-religious children from the families of University of Toronto professors, you are likely to find some differences between the children that have nothing to do with religion.

Because the sample is not random, all generalizations from their sample and all significance tests using their sample are meaningless. Any first year statistics textbook will tell you this. When research based on random samples exists, we should ALWAYS privilege results from random samples over non-random convenience samples. And random samples consistently suggest a positive association between religion and altruism.

4) Do the authors do sufficient work to demonstrate that the relationship between religion and giving behavior is causal? No.

Even in good samples, correlation does not prove causation. But with a badly biased sample, even more effort is required to demonstrate that a correlation is plausibly causal. Unfortunately, the authors do not even go to the effort I would require in an undergraduate statistics class.

Of course, demonstrating causality is difficult. The authors cannot randomly assign religious background to children and then see if religion causes differences in altruistic behavior. Thus, social scientists typically try to account for as many alternative explanations as possible, to demonstrate that the association between religion and giving behavior is not caused by something else. Past research on altruism demonstrates that many factors predict giving behavior, but the authors do not control for them. If any of these omitted factors is correlated both with religiosity and with the giving behavior of children, or is correlated with which religious and non-religious people are sampled, then the relationship between religion and giving in the authors’ analysis will be biased.

For example, both wealth and trust can influence giving. If children from wealthier backgrounds have more access to stickers than children from poor backgrounds, on average this makes stickers less valuable to wealthy children than poor children. Giving stickers to other children is less costly for wealthy children than for poor children. Similarly, in contexts of high-trust, low corruption, and low violence, people generally trust “the system” more. A child from a high-trust context may trust an unknown researcher to give the sticker gift to another child more than children from low-trust environments. If in the sample “non-religious” children disproportionately come from wealthy, privileged families and live in high-trust environments relative to the religious children, this will create a spurious negative association between religion and giving. But religion is not reducing giving; poverty and trust are.
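The mechanism described above can be made concrete with a minimal sketch in Python. Every number below is invented for illustration and none comes from the study; the labels "wealthy" and "poor" are hypothetical stand-ins for the combined effect of wealth and trust. In this toy data religion has no effect at all on giving within either context, yet the pooled comparison still shows religious children giving less, simply because they are concentrated in the low-giving context.

```python
from statistics import mean

# Hypothetical records: (context, religious, stickers_given).
# All values are invented for illustration; none come from the study.
children = (
    [("wealthy", False, 5)] * 8   # non-religious, wealthy/high-trust context
    + [("wealthy", True, 5)] * 2  # religious, wealthy/high-trust context
    + [("poor", False, 2)] * 2    # non-religious, poor/low-trust context
    + [("poor", True, 2)] * 8     # religious, poor/low-trust context
)


def mean_given(rows, religious, context=None):
    """Average stickers given by one group, optionally within one context."""
    return mean(
        given
        for ctx, rel, given in rows
        if rel == religious and (context is None or ctx == context)
    )


# Naive comparison that ignores context (what an uncontrolled t-test compares).
pooled_gap = mean_given(children, True) - mean_given(children, False)

# The same comparison made separately inside each context.
wealthy_gap = mean_given(children, True, "wealthy") - mean_given(children, False, "wealthy")
poor_gap = mean_given(children, True, "poor") - mean_given(children, False, "poor")

print(f"pooled gap (religious - non-religious): {pooled_gap:+.1f}")
print(f"gap among wealthy children:             {wealthy_gap:+.1f}")
print(f"gap among poor children:                {poor_gap:+.1f}")
```

Under these assumed numbers the pooled gap is about -1.8 stickers while the gap inside each context is zero. The sketch does not show what happened in the actual sample; it only shows that an uneven mix of contexts is enough to produce a negative pooled difference.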

Problematically, it seems likely that the authors coded many more Canadians as “non-religious,” and more Jordanians and South Africans as “religious.” But Canadian children are also typically wealthier and trust strangers more than Jordanian and South African children. Similarly, if we think about the university contexts where the samples were taken, it seems likely that the authors sampled wealthier, high status “non-religious people” and poorer, lower status “religious people.” So for example, if we look at the people who live around the University of Chicago, wealthy, high-status, low-religiosity people disproportionately live in Hyde Park (immediately around the university), but they are surrounded by a large predominantly poor, African-American population, who live in government housing, have struggling schools, and are typically much more religious than their Hyde Park neighbors. Non-random samples taken at universities often have this problem—getting children of university employees (who are disproportionately privileged but secular) and those in the surrounding communities (who may be disproportionately less privileged and more religious).[2]

5) Is the statistical analysis rigorous and appropriate? Is it plausible that difference in religious upbringing is the only thing that makes stickers more valuable to some of the children than other children? No.

I cannot remember the last time I saw a published statistical research article in a peer-reviewed social science journal based primarily on t-tests (which assume the only relevant difference between the religious and non-religious children in the sample is their religion). If we compare the sticker giving of a poor Christian child from a South African slum with that of a wealthy non-religious child in a Toronto suburb, is it plausible to think the only difference between them is their religion? No. But both the authors and journalists focus on the comparison between the religious and the non-religious without any controls (this comparison assumes the two groups are identical in every other way). Because there are more non-religious people in Canada than South Africa or Jordan, and wealth probably influences how valuable stickers are to children, carefully controlling for country and socio-economic status (SES) is crucial. The authors do some of this, but in a weak and misleading way.

The authors back up only two of the t-tests with OLS regressions. In these regressions they only control for age, country, and what they misleadingly label “SES.” However, the only measure of SES they use is a rough measure of mother’s education (simplified to six categories), and they never mention this in the text. I had to search their supplemental material to find their measure of “SES.”[3] But is controlling for mother’s education (in six categories) sufficient to equalize the socio-economic status of all children? I don’t think so. That implies, for example, that every child whose mother has a high school degree has the same access to resources as every other child whose mother has a high school degree, regardless of income, wealth, father’s education, race, parental marital status, etc. I doubt the authors think mother’s education fully accounts for SES either, or they would not have hidden the measure in an online appendix. It takes six words to say, “We measured SES using mother’s education.” Not a lot. But that would have raised red flags.

Think of it this way: if you are wealthy and go to a well-financed school, you may have hundreds of stickers at home and regularly get more. Thus, stickers are not particularly valuable. It is easier to give stickers away because you can easily get more. Alternatively, if you come from a poor single-parent family and attend a poorly-financed school, you may rarely get stickers. This makes stickers much more valuable to you and makes giving them away harder. If two children have an equal amount of altruism, on average the child who has easy access to stickers is likely to give away more stickers than the otherwise identical child who has little access to stickers.

Now think about how this might work in the sample we are discussing. Presumably the University of Chicago team recruited people close to the university. Imagine they recruited two 8-year-olds, one named Gwyneth and the other Kanisha. Gwyneth is European-American and attends the Laboratory School, an elite private school with lots of resources. Her father is a physics professor at the University of Chicago and makes a large salary. Her mom earned a B.A. from Harvard, and works at the Chicago Art Museum. Both parents came from wealthy, well-educated families and are not religious.

Kanisha lives 10 blocks away from Gwyneth, but in a government housing project in a South Side slum. Kanisha is African-American and attends a struggling public school with few resources. Her mom is a single parent, who attended a local community college in the evenings and recently graduated with a degree in social work, but still works as a waitress at Denny’s and is struggling financially. Kanisha and her mom attend a local AME church every week.

In the regression the authors published, they assume Gwyneth and Kanisha are identical—that is, the only relevant difference between them is their religious identity. Both children are eight years old, both live in the United States, and both have a mother with a B.A.—thus the authors assume both children have identical socio-economic status, that stickers are equally valuable to both of them, and that the only cause of differences in how many stickers they give away is their religion. But it is hard to believe that stickers are equally plentiful in both their homes and both their schools. It is also unlikely both are equally trusting that “the system” will work fairly for them, or that an unknown stranger will actually give the stickers to another child. Even if Kanisha’s religiosity increases her generosity relative to other similar children, this increase may be insufficient to overcome the differences in wealth and generalized trust between her and Gwyneth.
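The point about Gwyneth and Kanisha can be sketched as a small data structure. This is only an illustration built from the hypothetical children described above: the field names, the string encodings, and the `control_profile` helper are my own assumptions, not the authors' variables or code. It shows that once each child is reduced to the three controls the text attributes to the regressions (age, country, mother's education), the two children become indistinguishable, and everything else about them is invisible to the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Child:
    """One hypothetical child; field names are illustrative assumptions."""
    name: str
    age: int
    country: str
    mother_education: str  # the only "SES" measure, according to the text
    religious: bool
    school: str            # not among the regression controls
    household: str         # not among the regression controls


# The controls the text says the regressions include.
CONTROLS = ("age", "country", "mother_education")
# Examples of characteristics the regressions do not include.
UNMODELED = ("school", "household")


def control_profile(child):
    """Return the child's values on the control variables only."""
    return tuple(getattr(child, field) for field in CONTROLS)


gwyneth = Child("Gwyneth", 8, "USA", "B.A.", False,
                "elite private school", "two parents, high income")
kanisha = Child("Kanisha", 8, "USA", "B.A.", True,
                "struggling public school", "single parent, low income")

# Identical on every control, so the model treats them as comparable.
same_on_controls = control_profile(gwyneth) == control_profile(kanisha)

# Differences the model never sees.
unmodeled_differences = [
    field for field in UNMODELED
    if getattr(gwyneth, field) != getattr(kanisha, field)
]

print("same on controls:", same_on_controls)
print("differences outside the model:", unmodeled_differences)
```

In this sketch any difference in stickers given between the two children would be credited to the `religious` field, because it is the only modeled characteristic on which they differ.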

To check the authors’ results, and see if poor planning prevented the authors from measuring any other aspect of SES (because they did not collect that data), I asked the authors if I could have a copy of their questionnaire and replication data. So far they have not responded. Sometimes scholars keep data private for a while so that they can finish more publications from it before others have access. However, the demographic questions asked in a questionnaire are typically freely shared.

6) Is religion carefully measured and are religious groups carefully distinguished? No.

Over 60 percent of the religious people in the sample are Muslims. This means that in most of the t-tests and all of the regressions, Muslims disproportionately drive the results. If Muslims are different from other religious groups, or if the Muslims in the sample are disproportionately from poor or distrustful communities, this would bias the results the authors attributed to all religious children. But are Muslims identical to all other religious groups? In the t-tests the authors show us, the difference between Muslims and Christians is statistically significant 50 percent of the time. So why do they lump them as one group in all the regressions and all their conclusions? When results for Muslims and Christians differ, why do they treat the pattern for Muslims as representing all religious people?
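The arithmetic behind this concern is just a weighted average, sketched below in Python. The shares and gaps are assumptions chosen for illustration: 0.6 stands in for "over 60 percent," the remaining 0.4 lumps all other religious children together, and the gap values (difference from non-religious children on some outcome, in arbitrary units) are invented. The `pooled_gap` helper is hypothetical and is not the authors' procedure.

```python
def pooled_gap(subgroups):
    """Weighted average of subgroup gaps.

    `subgroups` maps a group name to (share_of_sample, gap_vs_non_religious).
    """
    total_share = sum(share for share, _ in subgroups.values())
    weighted_sum = sum(share * gap for share, gap in subgroups.values())
    return weighted_sum / total_share


# Hypothetical inputs: one subgroup differs from non-religious children,
# the other does not differ at all.
religious = {
    "Muslim": (0.6, 1.0),               # assumed share, assumed gap
    "Christian and other": (0.4, 0.0),  # assumed share, assumed gap
}

gap = pooled_gap(religious)
print(f"pooled 'religious' gap: {gap:.2f}")
```

With these assumed inputs the pooled "religious" gap is 0.6 even though 40 percent of the group shows no gap at all, which is why a result driven by the largest subgroup can end up reported as a finding about religious children in general.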

The authors use two measures of religiosity: “How often do you attend religious services?” and “How often do you experience the ‘divine’ in your everyday life?” They then merge these into a single variable. The measure of frequency of attendance is biased toward Muslims. Because Muslims are expected to pray five times a day, every day, attending religious services more than once a week is more common among Muslims than Christians. The frequency of divine experience is likely biased toward Pentecostals. Thus, Muslims and Pentecostals will tend to cluster at the high end of the religiosity variable.

We never learn if either being religious or religiosity predicts lower generosity in all six countries in their sample and for both Muslims and Christians. Religion and religiosity are, for the most part, assumed to be one thing and assumed to work the same everywhere—as if the type of religion and the context of religion do not matter. Given the major sample problems, it would increase the plausibility of their results if the religious/low-giving association were consistent regardless of context and regardless of religious tradition.

In summation, Decety and colleagues ask an important and interesting question. However, they use a problematic non-random sample and inappropriate statistical techniques to analyze it. The sample is biased in a way that seems likely to make non-religious children seem more generous. The authors do almost nothing to account for alternative explanations. They lump religious groups together in a way that even their limited analysis suggests is inappropriate. They generalize to “religious” and “non-religious” children in a way that their sample does not allow. They make claims well beyond those supported by their analysis and reporters make claims well beyond even those made by the study’s authors. The authors have not (yet) made either their data or questionnaire available to other scholars to allow them to check their results. The article is published in a biology journal in which the editor and reviewers are unlikely to be familiar with either the previous research on the topic or the research standards required to make generalizable causal claims from sampled human populations. Both authors and journalists almost completely ignore the previous research on the topic, virtually all of which directly contradicts the study’s conclusions.

The fact that this study was published in a peer-reviewed journal and was so widely cited in the popular press—almost universally without interviewing or citing anyone else who has researched this topic or has a different point of view—is troubling. Although Decety claims that secularization makes people kinder and more moral, his research project does little to determine whether or not his belief is true.

[1] In some analyses the researchers also statistically controlled for age, country, and a rough measure of the education of the child’s mother. I discuss the adequacy of their controls later in this post.

[2] Alternatively, psychologists tend to sample students in their psychology classes, which creates other problems.

[3] For readers who are statistically trained, the authors’ models even violate the assumptions of OLS regression. The number of stickers children give away is a count variable, thus Poisson or negative binomial regression is appropriate, not OLS regression.

Robert Woodberry is an associate professor of political science and director of the Project on Religion and Economic Change at the National University of Singapore.

This piece was originally authored on November 17, 2015 for the Religious Freedom Project at Georgetown’s Berkley Center for Religion, Peace, and World Affairs.