Playing with Numbers

(For a parallel discussion see Armon A. Brott's paper: The Battered Statistic Syndrome.)

Science is everywhere. Or, more to my point, "science" is everywhere. You can't turn on the TV or pick up a newspaper without discovering that some psychologist, sociologist, behavioural expert, social activist, or big corporation has done some "research" and has some fascinating—and shocking—"statistics" to share with you. It's overwhelming. Unfortunately, it's also mostly bullshit.

I hate living in a post-scientific world. I admit it, I'm a scientist at heart, and the stuff that passes for "science" these days just makes my stomach churn. Whatever happened to the 1800s, when Science was considered a sacred trust? Yeah, they weren't very good at it, and put forth a lot of stupid ideas that flew only because of prejudice and limited knowledge, but they did their best to be honest.

These days it seems that anything goes. Misquote results, fudge the numbers a bit, selectively cull your data until you get the result you want or, if you're lazy, just make it up. It's all fair game.

If you want an example, just check out any "research" that touches on feminism or AIDS. Almost all of it is complete crap. Anything that touches both of these subjects is surely garbage. Anything on the environment is running a close third.

It seems that caring about truth, accuracy, or honesty is now considered quaint.

Although some people, who don't agree with my political views, may find the examples I've given irksome and even predictable, I don't want you to think that this sort of numerical monkey-business is restricted to liberals, feminists, and tree-huggers. My point in writing all of this is to demonstrate that whenever you hear a statistic or a number on television, or read it in the newspaper, your immediate response should be, "Where did this number come from? Was there any research? Was the research accurately quoted? Who did the research? Were they impartial?" I do this, even when the statistic or number backs up one of my pet beliefs or theories. Even when the number leads to the "right" conclusion I still question it, and I think that you should, too.

Here's a list of some of the "science" games I've seen in the media and on the Web, in no particular order.

"Recent studies have shown..." Anyone who includes this phrase in their argument is surely spewing nonsense. Which studies? By whom? I don't even trust most studies that are actually quoted, with names of researchers, institutions, and dates, let alone vague references to something that someone may have researched somewhere. This phrase can accurately be translated as "The following opinion is dear to my heart and I want everyone to accept it without question..."

Selectively quoting from studies. The vast majority of references to studies that you will see are misquotes. Like the Bible, a research paper is full of information, and it's easy to pick out just the bits that bolster your argument, while ignoring the bits that don't. Feminists are famous for quoting from studies of domestic violence, noting how prevalent male-on-female violence is, while not mentioning that the same studies found female-on-male violence at almost the same levels. Dr. Judith Kleinfeld discusses how this technique led to the pop wisdom that "schools shortchange girls." The only way to quote a study properly is to try to summarize all of the findings and present a balanced picture. Few people bother with this.

Quoting only "friendly" studies. This is a variation on the previous theme. If you find twenty studies on a topic and only three back your opinion, then quote only the three that agree with you, and don't mention that there are seventeen more that don't.

Equivocation. This is a popular and deceptive argument technique: ask people if they think that rapists should go to jail. Of course they should. Who would defend the idea of letting rapists go free? Then define a "rapist" as everyone from the man who violently forces a woman to have sex (which everyone was thinking of when they answered the question) to some guy who simply asked a woman a second time after she's said, "No." In other words, after winning the argument, change what the word "rapist" means to fit your own agenda. If anyone objects to your throwing witless college boys in prison, snap back with, "But you agreed! Do you want to let rapists go free?" This works with all sorts of words and situations, which explains the technique's popularity.

Define it later. This is similar to equivocation. Do a psychological study asking people about experiences they've had. Then, after they've answered the questions, define your syndrome or problem in the right terms so that the right number of people fall into your category. Whatever you do, never tell the people beforehand what you're hoping to discover, and don't use accepted definitions of words. Mary Koss's study of rape on campus is a famous example. She asked women if they had ever had sex that they didn't want with a man who bought them drinks. It's not uncommon for people (not just women) to get drunk and then do something that they later regret, so a lot of women answered yes. After the questionnaires were all back in, Ms. Koss defined this experience as "rape" and Presto! she instantly created a rape crisis on campus. Of course, someone went back to the women who answered the questionnaire, and the majority did not describe their experience as "rape," but that's because they weren't creative thinkers like Ms. Koss.

"Fastest growing..." Only two women on your campus OD'd this year versus twenty men? Last year there was only one female OD to eighteen male OD's? No problem: women are now the "fastest growing" group when it comes to campus drug overdoses, because there was one last year and two this year. Never mind that they OD at 10% the rate that men do. They need more help and special programs because there are 100% more this year than last year.

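The arithmetic behind the "fastest growing" trick is easy to check. Here is a minimal Python sketch using the overdose counts from the example above; the function and variable names are mine, chosen only for illustration.

```python
def percent_growth(last_year, this_year):
    """Year-over-year change, as a percentage of last year's count."""
    return (this_year - last_year) / last_year * 100

# Counts taken from the campus overdose example in the text.
women_last, women_this = 1, 2
men_last, men_this = 18, 20

women_growth = percent_growth(women_last, women_this)
men_growth = percent_growth(men_last, men_this)

# This year's female count as a percentage of the male count.
women_vs_men = women_this / men_this * 100

print(f"Women: {women_growth:.0f}% growth; men: {men_growth:.0f}% growth")
print(f"Women's count is {women_vs_men:.0f}% of men's")
```

Both printed lines are true at once: the smaller group can post the biggest percentage jump while remaining a small fraction of the total.
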
"Underreported. Unrecognized. Ignored." Only one female sufferer last year and still only one this year? Once again, no problem. The female side of this problem is now "underreported." I was once called by a woman, soliciting for a charity, who solemnly told me that her charity was the "most under-funded of charities." It sounds so poignant until one asks, "According to whom?" In order to decide whether her charity was under-funded, she would have to know two things: how much funding her charity is receiving (easy) and how much funding it should receive. Who came up with that last number, and how? Is this anything more than simply saying, "I think people ought to give us more money?" Is saying that some problem that women have is "underreported" anything more than simply saying, "I want more attention?"

Absolute numbers. "43 women were murdered in our city last year." It sounds awful, doesn't it? The question is, how many men were murdered? If the answer is 430, then 43 doesn't look so bad any more. If only 4 men were murdered then 43 is terrible. Beware articles that give "shocking" numbers out of context.
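
To see how much the missing context matters, here is a small Python sketch that turns the same "43 women" into a share of all murder victims under the two scenarios above. The helper name is hypothetical, and the male counts are the ones supposed in the text, not real data.

```python
def share_of_total(group, others):
    """Percentage of the combined total that `group` represents."""
    return group / (group + others) * 100

women_murdered = 43

# The two hypothetical male counts from the text.
for men_murdered in (430, 4):
    share = share_of_total(women_murdered, men_murdered)
    print(f"{men_murdered} men murdered: women are {share:.1f}% of victims")
```

The raw count never changes; only the denominator does, and the denominator is exactly what the "shocking" article leaves out.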

Right under your nose. "Twenty percent of murder victims are female!" Make this the first sentence in your article, then bang on for a while about how awful it is that all of these women and girls are being murdered. Chances are, you'll get sympathetic correspondence from angry women and apologetic men. Now, look again at the number. Twenty percent of murder victims are female. So, eighty percent of murder victims are male, yet those eighty percent are never mentioned in the article. This isn't so much an abuse of statistics as leveraging off of societal prejudice, but it's widespread nonetheless.

Ignoring the implications. Check out the following quote: "Over a quarter-million women are victims of violent crime in the workplace each year and homicide is the leading cause of death for women on the job." [U.S. Department of Justice as quoted by Patricia Ireland, president of N.O.W.] At first this seems like a shocking statistic until you think about it. Men die all the time on the job. Men are construction workers, garbage collectors, underwater welders, and stunt doubles. They do all sorts of dangerous things and frequently get killed doing them. On the other hand, we have that "...homicide is the leading cause of death for women on the job." Now let's do one more thing: look at the raw numbers. In 1995, 1024 people were murdered on the job, 780 men and 244 women. Apart from falling into several of the other categories of fallacies, Patricia Ireland's quote obscures something else: if homicide is the leading cause of death for women on the job, and only 244 women were murdered on the job, then very few women die on the job relative to men, and work is a very safe place for women.

There is one other thing in Ireland's quote worth mentioning. She claims that "over a quarter-million women are victims of violent crime in the workplace each year," yet we now know that only 244 were killed, at least in 1995. How can one reconcile a ratio of "violent crime in the workplace" to workplace homicide of roughly a thousand to one? Either only about one in a thousand violent workplace incidents results in a murder, or there is something rotten with Ireland's definition of "violent."
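
The ratio is easy to check from the figures already quoted. In this Python sketch, 250,000 is my assumption standing in for "over a quarter-million" (so the true ratio can only be higher); the homicide counts are the 1995 figures given above.

```python
# Assumption: "over a quarter-million" is taken as a floor of 250,000.
violent_crime_victims = 250_000

# 1995 workplace homicide counts quoted in the text.
homicides_women = 244
homicides_men = 780

incidents_per_homicide = violent_crime_victims / homicides_women
women_share = homicides_women / (homicides_women + homicides_men) * 100

print(f"About {incidents_per_homicide:,.0f} violent incidents per homicide")
print(f"Women were {women_share:.1f}% of workplace homicide victims")
```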

Quote from the best year. My source in "ignoring the implications" may be guilty of this. Maybe 1995 was a year in which there was a statistical "slump" in the number of women killed on the job; maybe in other years ten times as many women are killed as men. Studies or articles that quote from a single year rather than show a trend over time should be instantly suspect. One famous statistic that suffers from this is the assertion that in the year following a divorce, the ex-wife's standard of living goes down, while the ex-husband's goes up. Ever wonder why the study stopped at a year? It stopped there because after the first year, the trend reverses itself, so further study would contradict what the "researcher" was trying to prove in the first place. Solution? Publish only the results from the first year.

Statistics from abroad. The AIDS activists are experts at this. Did you know that millions of people every year contract AIDS? That it is not just a disease of homosexuals and intravenous drug users but affects heterosexuals at a far higher rate than it does either of these latter two groups? Well, yes, if you happen to live in Africa, all of that is true. However, the people who quote these numbers are arguing for more research spending in Canada and the U.S. Certainly we might want to do research on AIDS in order to save those afflicted in Africa, or out of forward-thinking self-interest to advance our knowledge of this kind of virus for when a more devastating variant comes along. However, this is not what these people are arguing. They are trying to make you and me believe that millions of people every year contract AIDS in the United States and Canada. By conveniently omitting the words, "in Africa" or "in the whole world, including Africa, where the disease is rampant," they're trying to pull the wool over your eyes and make you believe something that is far from true. If you challenge them on this, they will smile and reply, "I didn't say that all of those AIDS victims were in North America." Nonetheless, it's what they were hoping you would believe.

Pick your sample to give you the results you want. A recent study by none other than Statistics Canada researched violence against women by men. Their findings? Violence against women is a serious problem in Canada. Apart from the statistical errors ("lies") built into the final report, there was a huge, glaring error with this: violence against women is a serious problem in Canada... as compared to what? Stats Can didn't bother looking at any other side of the story: How much violence is there against men by women? How much of that violence resulted from mutual punch-ups? Without this information, their "violence against women" statistic is left hanging in empty space, and as such it isn't all that useful.

A number of studies by Women's Studies types in the United States have come to the same conclusion by interviewing women in battered women's shelters. This, as Armon Brott pointed out, is "...like interviewing people at McDonald's, asking them if they eat fast food." If you study only women, and ask only about violence done to those women by men, then you will obviously conclude that men are violent toward women. If you ask women in battered women's shelters if they think men are violent, you already know the answer you're going to get. Regardless of their results, all of these studies are weak. Logic dictates that in order to have a strong study, you should take the view of your opponents, anticipate their objections, and provide enough context to prove them wrong. Modern "studies" take the opposite approach by avoiding anything that might contradict what the researchers wanted to prove from the start.

Obscure your sample; let readers draw their own conclusions. The most famous (and oft-quoted) example of this is the ubiquitous statistic that women make 75¢ for every dollar that men make. What the people who quote this don't tell you is that it's a society-wide figure. It's taken over all working men and all working women. Never mind that more women work part-time than men; never mind that women frequently leave the workforce to have children then re-enter it at the same salaries they received before the pregnancy, while men stay in the workforce and their salaries go up. Never mind that thirty years ago there were great inequities in the workforce, and that the oldest workers are making the most money and are almost exclusively male, but that they'll be retiring shortly. The 75¢ figure ignores these three and a host of other variables. In fact, depending upon the job, once you correct for experience, education, hours of work, and everything else that could affect the figure, the correct number is more like 95¢ for every dollar. No, it's still not fair, but it's not nearly as shocking as 75¢, is it?

Did you know that in divorce, the children go to the ex-husband 75% to 80% of the time? I didn't know that either. In fact, when I heard it, I thought that it was patently ridiculous. "Everyone" knows that fathers get the children only 5% to 10% of the time. So what's up with this fem-statistic? Well, the people who came up with this one simply took a sample over all divorces, including uncontested ones, and redefined "joint custody" as "the ex-husband got custody." Sneaky, huh? It's not strictly a lie: if there is joint custody then true enough both parties got custody, and one of those is the ex-husband. However, what the listener is thinking of is divorce cases in which sole custody is being awarded (and the writer knows this). So, the feminists who coined this figure, while not lying in the strictest sense, were trying to mislead their audience.

Compare apples and oranges. A recent decision in Canadian courts upheld a Human Rights Commission study on gender bias in public sector salaries. The Commission had come up with an enormous number for the amount of back-pay the government should pay to the almost exclusively female (surprise!) "underpaid" workers. The courts upheld the Commission's findings (further evidence that Supreme Court judges are either biased, malicious, or stupid). Unfortunately, the "study" was nonsense. It decided, for example, that secretaries were underpaid by comparing them with construction workers. How, precisely, one decides that a secretary working hard in a (relatively) comfortable, air-conditioned office is doing "equivalent" work to a builder working hard in a (relatively) hazardous construction project is beyond me. However, people do try to pass such determinations off as "scientific truth."

"Don't ask, don't tell." This tactic is popular with AIDS people and anti-smoking crusaders. A good way to inflate the number of sufferers of diseases, short of outright lying, is to take the National Enquirer approach: rather than asking, "Are we sure that this person died from this disease?" ask yourself, "Is it possible that this person died from this disease?" Presto! Your numbers go way up.

Believe it or not, some people don't believe that AIDS is caused by the HIV virus¹. I'm no doctor, so I'm not qualified to endorse their theories, but on their pages is an interesting point. Many patients have had their cause of death listed as "AIDS" when in fact they did not test positive for the HIV virus. These people claim that this is evidence that the virus doesn't cause AIDS. Maybe. Or maybe it's evidence of overly zealous medical staff. After all, the more people that die from something, the more public sympathy one can rally and the more money shows up to research a cure. So, why not list "AIDS" as the cause of death for anyone who dies of something AIDS-ish, even if you're not sure? You're not exactly lying, but then it's not science, either.

The worst offenders in this area are by far the anti-smoking people. Now, I like anti-smoking campaigners. They've made my air more breathable. However, this doesn't excuse the kind of nonsense statistics that they spew. I've lost count of the number of times that I've heard that such-and-so number of people die from "smoking-related illnesses." Just what the hell is "smoking-related"? Lungs filled with tar gum? Or some disease that someone contracted because they had some other disease that maybe was caused by their smoking habit? My complaint is that it's just too easy to define "smoking-related" as anything you would like it to be. Did an 80-year-old man who succumbed to pneumonia die of a "smoking-related" disease? Maybe if he hadn't smoked he would have lived one month longer and incurred exactly the same medical costs. Does that count? Do all of the dollars we spent looking after him therefore accrue under the "costs of smoking"? Even if we would have spent the same money one month later if he had been a non-smoker? Anti-smoking advocates are hoping that you don't ask questions like this, lest their Godzilla numbers be pared down to iguana size.

Make it up. I have seen this several times, but one example sticks in my mind, from a CBC television documentary on statistics in the media. One of the people interviewed on the program used to write the science column for a New York newspaper. She said that she normally just summarized the studies and reports sent to her by various agencies until one day she received a report from the New York food bank stating that they served some astounding percentage of the New York population. For once, she did the math, and figured out how many people that was, how many food banks there were, and came to the conclusion that it just wasn't possible. She tracked down the person who had created the report, and asked them where the numbers came from. The response? "We wanted some strong numbers that would grab people's attention, and that percentage seemed alarming while still being plausible." In other words, they just made it up.

The idea, of course, is that if your cause is noble, truth doesn't matter. If you raise awareness and thus feed more poor people, who cares if you lied to do it? The problem with this ethic, of course, is that now that you've inflated your numbers, the charity across the street has to inflate theirs even more to get the attention they think they deserve, and on it goes. Eventually nobody (for example yours truly) believes anybody's statistics, whether they're true or not, and a sort of generalized malaise sets in. Despite the fact that lying nets short-term gain, in the end it leads to ruin.

The worst groups for completely fabricated statistics are feminists and the AIDS people. Statistics on domestic violence and rape are routinely plucked out of the air. Statistics on the prevalence of AIDS are not much better.

This article in Canada's National Post is primarily about gun licensing and gun controls, but it starts off with a very good debunking of various pop media statistics. It illustrates many of the things I've mentioned here with concrete examples.

¹ You can read a rebuttal of this idea on the Web. There are several sites by the "non-believers" themselves, but I've lost the URLs... I'll keep looking.