Calling Bullshit by Carl T. Bergstrom and Jevin West is about learning how to recognize different forms of bullshit.
Clicks
The internet news economy is driven by clicks, not by building relationships with readers. The quality of information is no longer as important as it was in the pre-internet age.
How to win attention? Sensationalism.
Another answer was discovered by Steve Rayson, who analyzed 100 million articles published in 2017, looking for the most common phrases in the most widely shared headlines.
The most successful headlines don’t convey facts, they promise an emotional experience. The most common phrases among successful Facebook headlines include “will make you”, “will break your heart”, “will make you fall in love”, “will make you look twice” or “will make you gasp in surprise.”
This also works on Twitter. Other winning phrases include “make you cry,” “give you goosebumps,” and “melt your heart.”
Intellectual experiences can’t compete. This is a huge shift in the way we consume media. In the olden days, headlines simply tried to give you the essence of the story: “Men Walk on Moon. Astronauts Land on Plain; Collect Rocks, Plant Flag.”
Today, The Washington Post announces, “One-Fifth of This Occupation Has a Serious Drinking Problem.” CNN promises to inform you “How to Evade the Leading Cause of Death in the United States.” “Iceland Used to Be the Hottest Tourism Destination. What Happened?” asks USA Today. (So as not to leave you in suspense: lawyers; don’t get in a car accident; and nobody knows.)
The printing press allowed for a variety of books to emerge. Cable TV allowed people to customize their experience. And before 1987, the Fairness Doctrine of the US Federal Communications Commission (FCC) tried to ensure balanced coverage of controversial issues on the news.
But the FCC repealed it under Reagan in 1987, and with the rise of the 24-hour news cycle, cable news channels proliferated and became more partisan.
Algorithms make things worse. Facebook, Twitter, and other social media platforms use algorithms to surface relevant posts and stories. But these algorithms aren’t designed to keep you informed; they’re designed to keep you active on the platform. They feed you more of what you want to hear, which makes you more biased.
Social media also helps spread misinformation – false claims not deliberately meant to deceive. The first outlet to break a story gets most of the traffic, so publishers skip fact-checking.
Fake Personalities
It used to be tricky to create fake personalities online, because a convincing profile needed a photo of a real-looking person.
Fake accounts sometimes use stock photos or images scraped from the Internet—but these are easily tracked down by savvy users using tools such as Google’s reverse image search.
No longer. New algorithms known as generative adversarial networks (a form of adversarial machine learning) can create realistic faces of people who do not exist.
In one online test, a real photo of a real person is paired with a computer-generated picture of someone who doesn’t exist, and your goal is to pick out the real one. People usually don’t do much better than chance, and even with lots of practice they are still fooled about one time in five.
Similar machine learning algorithms can “voiceshop,” generating fake audio and video that seem real by synthesizing speech from old recordings and grafting expressions from a model onto the target. These deepfake videos can make it seem like anyone is saying or doing anything.
Jordan Peele made one with Obama, who in the end says in his voice (Peele’s words): “How we move forward in the age of information is going to be the difference between whether we survive or whether we become some kind of fucked-up dystopia.”
There are three ways to defend against bad information online:
1) Technology: Use AI to detect fake news (but seems like a losing battle).
2) Government regulation: The first problem is that it conflicts with the First Amendment of the US Constitution (freedom of speech). Second, who gets to decide what counts as fake news?
3) Education: Teach media literacy and critical thinking. Solve the problem from the bottom-up.
Correlation and Causation
Scientific studies often find correlations, such as the link between exercise and reduced cancer rates. But when these findings are reported in the news, people get the wrong impression: they think they are getting prescriptive advice. People want to know what they ought to do, and when the popular press obliges, it does so with no evidence of causality.
Original scientific articles make this error too. Nutritionists have debated the benefits of whole milk versus reduced-fat milk in preventing obesity, and typically favor the latter. But a recent study of children in San Francisco found that whole-milk consumption was associated with lower obesity. The authors cautioned that this is a correlation and does not demonstrate a causal relationship, yet the title suggests otherwise: “Full Fat Milk Consumption Protects Against Severe Childhood Obesity in Latinos.”
The authors make another error: they suggest that their results contradict the recommendations promoting lower-fat milk consumption. But without a causal link, there is no evidence for this.
Delayed Gratification
Everyone has heard of the marshmallow test, the study designed to measure the correlation between children’s self-control and their later success in life. The takeaway was that parents should condition their children to associate reward with hard work. But when the experiment was replicated many years later, the original results did not hold up.
Smoking Doesn’t Kill?
There is a difference between necessary, sufficient, and probabilistic causes. Mike Pence argued that there was public hysteria over smoking: in fact, two out of three smokers do not die from a smoking-related illness, and nine out of ten do not contract lung cancer. But that is bullshit. Pence implies that smoking doesn’t kill, yet by his own numbers a third of smokers die from smoking-related illnesses. He conflates sufficient cause with probabilistic cause: smoking is not sufficient to guarantee a smoking-related illness, but it greatly increases the probability of dying from one. Similarly, one could point out that some miners get lung cancer despite never smoking, but that argument conflates necessary cause with probabilistic cause.
Scientific Problems
Scientists once claimed to have discovered cold fusion; this was later disproved. Science’s deepest foundations can be questioned, and even replaced, when they are incompatible with new discoveries. Geneticists and evolutionary biologists long thought that genes were the only molecular vehicles of inheritance. But when genetic sequencing became cheaper, strong evidence accumulated showing that there was more to the picture: parents sometimes pass on a second layer of nongenetic information about which genes to activate in different circumstances. This became known as epigenetics.
Replication Crisis
In many cases, scientific results are irreproducible. Sometimes this is because of fraud, but fraud is very rare and doesn’t explain why up to half of the results in some fields fail to replicate. How can we explain the replication crisis? P-values.
A P-value tells us the probability that a pattern at least as strong as the one we observed could have arisen through chance alone. If that is highly unlikely, we say the result is statistically significant; the conventional threshold is a P-value of 0.05. But this is tricky: an unlikely hypothesis remains unlikely even with a very small P-value.
Researchers are much more likely to publish positive results than negative ones. Sometimes they also hack P-values: rather than publishing everything they find, they massage the data and analysis until something crosses the significance threshold.
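A minimal simulation (my illustration, not the book’s) makes the danger concrete: if researchers run enough studies of effects that don’t exist, some will clear the 0.05 threshold by chance alone, and those are the ones that get published.

```python
# Hypothetical sketch (not from the book): run 20 "studies" of effects that
# don't exist and count how many clear the 0.05 significance threshold anyway.
import random
import statistics
from math import erf, sqrt

def two_sided_p_value(a, b):
    """Approximate two-sided z-test P-value for a difference in means."""
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

random.seed(1)
significant = 0
for _ in range(20):  # 20 independent studies, all with zero true effect
    group_a = [random.gauss(0, 1) for _ in range(50)]
    group_b = [random.gauss(0, 1) for _ in range(50)]
    if two_sided_p_value(group_a, group_b) < 0.05:
        significant += 1

print(f"{significant} of 20 null studies came out 'significant'")
# On average about 1 in 20 will; publish only those, and the literature
# fills up with effects that aren't real.
```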
Why Most Published Research Findings Are False
In 2005, the epidemiologist John Ioannidis explained the problematic nature of scientific research. To understand his argument, we need to know the base rate fallacy.
Imagine you’re a doctor, treating a patient who thinks he has Lyme disease. You test his blood for antibodies against the bacteria that cause the disease, and unfortunately, the test comes back positive. The test is quite accurate, but not exceptionally so: it returns a false positive around 5 percent of the time.
What are the chances that your patient has Lyme disease? Many people, including doctors, think the answer is 95 percent. But this is wrong.
Ninety-five percent is the chance that someone who doesn’t have Lyme disease would test negative. You want to know the chance that someone who tests positive has Lyme disease. It turns out that this is a low probability because Lyme disease is quite rare. In areas where it is endemic, only about one person out of one thousand is infected. So, imagine testing 10,000 people. You’d expect to have about 10 true positives, and about 0.05 × 10,000 = 500 false positives. Fewer than 1 in 50 of those who test positive are actually infected. Thus, you expect your patient would have less than a 2 percent chance of having the disease, even after testing positive.
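A back-of-the-envelope calculation (a sketch of the arithmetic above, assuming for simplicity that the test catches every true case) shows where that figure comes from:

```python
# Base-rate arithmetic for the Lyme disease example (assumes, for simplicity,
# that the test catches every true case).
prevalence = 1 / 1000       # about 1 in 1,000 people infected in endemic areas
false_positive_rate = 0.05  # test wrongly flags ~5% of uninfected people
population = 10_000

true_positives = prevalence * population                               # ~10
false_positives = (1 - prevalence) * population * false_positive_rate  # ~500

p_infected_given_positive = true_positives / (true_positives + false_positives)
print(f"P(infected | positive test) = {p_infected_given_positive:.1%}")  # about 2%
```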
A similar mistake is the prosecutor’s fallacy. Imagine a person on trial because his fingerprints match those on the murder weapon, and the prosecutor argues that there is only a one in 10 million chance of a fingerprint match if the suspect is innocent. This seems like conclusive evidence, but the probability being calculated is the probability of a match given that the person is innocent. The defense objects: since there are 60 million records in the database, about 6 people in the US would match the fingerprints, and 5 of them would be innocent. What we actually want is the probability that the person is innocent given that his fingerprints match (the match is the one thing we know for sure). And that gives a very different answer: only a 1/6 chance that he is guilty, or a 5/6 chance that he is innocent.
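The same kind of counting, using the numbers above and the defense’s assumption that the real culprit is somewhere in the database, exposes the fallacy:

```python
# Prosecutor's fallacy: P(match | innocent) is not P(innocent | match).
database_size = 60_000_000
p_match_if_innocent = 1 / 10_000_000   # the prosecutor's "one in 10 million"

expected_matches = database_size * p_match_if_innocent  # about 6 people match
# If the real culprit is one of those matches, the other ~5 are innocent.
p_guilty_given_match = 1 / expected_matches
print(f"P(guilty | match) ~= {p_guilty_given_match:.2f}")  # about 1 in 6
```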
With that out of the way, let’s come back to Ioannidis. In his paper “Why Most Published Research Findings Are False,” Ioannidis draws the analogy between scientific studies and the interpretation of medical tests. He assumes that because of publication bias, most negative findings go unpublished and the literature comprises mostly positive results. If scientists are testing improbable hypotheses, the majority of positive results will be false positives, just as the majority of tests for Lyme disease, absent other risk factors, will be false positives.
The authors don’t argue with Ioannidis’s mathematics, but they do argue with his assumptions. For most published findings to be false, tested hypotheses would need to be like rare diseases: so unlikely to be true that a positive result is probably a false positive. But scientists get rewarded for publishing interesting work, and it’s difficult to publish negative results, so we would expect them to test hypotheses that, while undecided, seem reasonably likely to be true.
If we want to actually measure how big of a problem publication bias is, we need to know (1) what fraction of tested hypotheses are actually correct, and (2) what fraction of negative results get published. If both fractions are high, we’ve got little to worry about. If both are very low, we’ve got problems.
So they argue that scientists tend to test hypotheses that stand a good chance of being true. That chance might be 10 percent or 75 percent, but it is unlikely to be 1 percent or 0.1 percent. As for publishing negative results? That happens around 15 percent of the time; within biomedicine, about 10 percent; in social psychology, only 5 percent. We don’t know whether this is because psychologists are less likely to publish negative results or because they choose experiments that are more likely to generate positive results.
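To see why the prior probability matters so much, here is a rough sketch with assumed numbers (the 5 percent false positive rate and 80 percent power are my illustrative choices, not figures from the book):

```python
# Illustrative false-discovery-rate calculation (alpha and power are my
# assumed values, not figures from the book).
alpha = 0.05   # chance of a false positive when the hypothesis is false
power = 0.80   # chance of a true positive when the hypothesis is true

for prior in (0.001, 0.01, 0.10, 0.75):   # fraction of tested hypotheses that are true
    true_pos = prior * power
    false_pos = (1 - prior) * alpha
    fdr = false_pos / (true_pos + false_pos)
    print(f"prior {prior:.1%}: {fdr:.0%} of positive findings are false")
# At a 0.1% prior, nearly all positives are false (Ioannidis's worry);
# at a 10-75% prior, most positives are true (Bergstrom and West's reply).
```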
The fraction of published results that are negative is not what we really want to know. We want to know the fraction of negative results that are published.
How can we get that?
We need to look at the files buried in the file drawers. Erick Turner, who had worked at the FDA, found a brilliant way around this problem. In the US, when investigators run a clinical trial (an experiment using human subjects to test the outcomes of a medical treatment), they are required by law to register the trial with the FDA, filing paperwork that explains what the trial is designed to test, how it will be conducted, and how the outcomes will be measured. After the trial, the team must report the results to the FDA, but they aren’t required to publish those results in a scientific journal.
This system gave Turner and his team a way of counting both the published and the unpublished trials in one area of research. Turner compiled a list of 74 clinical trials evaluating the effectiveness of 12 different antidepressant medications. Results from 51 of these trials were published: 48 with positive results (the drug is effective) and 3 with negative results.
If someone looked at the literature, they would conclude that these antidepressant drugs are effective.
But with access to the experiments as initially registered, the FDA sees a very different picture. They see 74 trials of which 38 yield positive results, 12 yield questionable results, and 24 yield negative results. From those numbers one would reach a more pessimistic conclusion—that some antidepressants seem to work somewhat under some circumstances.
What’s going on? How did clinical trials with a 51 percent success rate end up being reported as successful in 94 percent of the published papers?
One reason is that almost every positive result was published, while fewer than half of the negative or questionable results were. Worse, of the 14 negative or questionable results that were published, 11 were recast as positive findings.
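The arithmetic behind those two percentages is worth spelling out (a quick check using the counts reported above):

```python
# The antidepressant trials, as seen by the FDA registry vs. the journals.
total_trials = 74
positive_trials = 38                             # FDA view: 38 positive, 12 questionable, 24 negative
published_positive, published_negative = 48, 3   # journal view: 51 trials published

print(f"True success rate:     {positive_trials / total_trials:.0%}")                                  # ~51%
print(f"Apparent success rate: {published_positive / (published_positive + published_negative):.0%}")  # ~94%
```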
A sailor sees only the part of the iceberg above the water; similarly, a researcher sees only the mostly positive results that make it into the scientific literature. This makes it hard to tell what’s really happening.
Thankfully, there is a way to estimate the size of the submerged part of the iceberg. One approach is meta-analysis, which examines multiple studies at the same time. By doing this, we can tell when the published literature is likely to be representative of the full set of experiments conducted, and when it reflects bad behavior (P-value hacking, publication bias). Figuring out how to do this well is a hot area of statistical research.
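To build intuition for what such methods are trying to detect, here is a toy simulation (my illustration, not a method from the book): the true effect is zero, but if only “significant” results are published, the published average drifts well away from the truth.

```python
# Toy illustration of publication bias: the true effect is zero, but if only
# "significant" positive results are published, the published average looks real.
import random
import statistics

random.seed(0)
all_effects, published_effects = [], []
for _ in range(1000):                                     # 1,000 small studies
    sample = [random.gauss(0.0, 1.0) for _ in range(20)]  # true effect = 0
    effect = statistics.mean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    all_effects.append(effect)
    if effect / se > 1.96:                                # only "significant" results published
        published_effects.append(effect)

print(f"Mean effect across all studies:  {statistics.mean(all_effects):+.3f}")       # close to 0
print(f"Mean effect in the 'literature': {statistics.mean(published_effects):+.3f}")  # clearly > 0
```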
How can you tell whether a scientific article is legitimate? First, any scientific paper can be wrong, no matter where it was published, who wrote it, or how well supported its arguments are.
Linus Pauling was a brilliant scientist who remains the only person to have received two unshared Nobel Prizes, the prize in chemistry and the peace prize—but he also published papers and books that turned out to be completely wrong, from his proposed triple-helix structure of DNA to his views about the benefits of high doses of vitamin C. Nature and Science are the two most prestigious journals in the basic sciences, but they too have published some howlers. In 1969, Science published a paper about a nonexistent polymer of water known as polywater, which contributed to fears among defense researchers of a “polywater gap” between the US and Russia. Nature published an erroneous paper in 1988 purporting to show that homeopathy can be effective.