We’re regularly bombarded with news of the latest scientific research findings, and sometimes it seems like you can find a study to tell you just about anything. My concern with news reporting of research findings is that many people (including members of the media) have relatively limited research literacy. Research literacy refers to the ability to draw upon a body of knowledge about research methods in order to interpret and critically evaluate research findings. Greater research literacy means, among other things, a more effective bullshit detector.
In this post, I’ll break down some of the terminology that’s often used in research. The concepts involved are not necessarily intuitive, so building research literacy involves learning some fundamentals. The randomized, double-blinded, placebo-controlled trial is considered the “gold standard” for much medical research, and it’s useful to understand why that is.
Peer review is an important quality control step before research papers are published. Reviewers are independent of the journal the paper has been submitted to, and they remain anonymous (except to the journal editors). Peer reviewers are selected by editors because of their expertise in the field, and they will give feedback to guide the editor in making a decision about whether or not to publish a manuscript. Most often reviewers will recommend improvements be made before a paper is accepted.
The peer review process can potentially get a little shakier when it comes to “open access” journals. The articles in these journals are publicly available without a subscription, and instead, the journal’s revenue stream comes from authors, who must pay a publication fee. Some of these journals are credible and reputable, while others are predatory.
Because I’ve had work published in academic journals, my email address is out there in that world, and I regularly get emails from obscure open access journals asking me to do reviews for papers that are so completely unrelated to my research field it’s sometimes laughable. Hmmmm.
Quantitative vs qualitative research
Quantitative research collects quantifiable data and generally uses statistical methods to analyze that data. Experiments are one type of quantitative research design, and particular designs may be most suitable for answering particular types of research questions. Quantitative research is the basis of much of the medical knowledge that we have. However, numbers can never tell the whole story, and that’s where qualitative research comes in.
Qualitative research uses people’s stories to answer questions like “what does it feel like to have a mental illness?” I’m partial to a method called autoethnography, which is a study of the self in the context of culture. I used this in my master’s thesis to examine my own experiences of mental illness, and it’s a really exciting way for the voices of people with mental illness to be heard.
Qualitative research is often considered to produce lower-strength evidence than quantitative research, but it can be an excellent way to provide richer detail about what is studied quantitatively. For example, a quantitative study might look at the effectiveness of interventions to reduce the use of seclusion and restraints in a psychiatric ICU, while a qualitative study might talk to the patients in the ICU and get their descriptions of what it felt like to experience these different types of interventions.
Elements of a clinical trial with experimental design
There are a few elements of experimental design that allow it to support cause-and-effect inferences.
There is a great deal of person-to-person variability in health status, genetics, and many other factors. By randomizing which participants are assigned to the intervention group and which are assigned to the control group, the likelihood decreases that one particular group will be stacked with people having certain characteristics. The more similar the participants receiving intervention A are to those receiving intervention B at the outset, the more confidently any difference in outcomes can be attributed to the intervention itself, which improves the validity of the results obtained.
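As a rough illustration of what random assignment looks like in practice, here is a minimal Python sketch. The participant IDs, the `randomize` function name, and the simple equal split into two groups are all assumptions made for this example; real trials often use more elaborate randomization schemes.

```python
import random

def randomize(participants, seed=None):
    """Randomly split participants into intervention and control groups."""
    rng = random.Random(seed)      # a fixed seed makes the example reproducible
    shuffled = list(participants)  # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant IDs P01..P20
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = randomize(participants, seed=42)
print(len(groups["intervention"]), len(groups["control"]))  # 10 10
```

Because chance alone decides who lands in which group, neither the researchers nor the participants can stack one group with people who are, say, healthier or younger.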
Blinding refers to who knows what intervention the patient is receiving. In a double-blinded study, neither the patient nor the investigators evaluating the outcomes know this. Blinding is considered desirable because it reduces the chance of bias in the results.
In practical terms, in a study of drug A, drug A and a placebo might be contained in identical gelatin capsules. Each participant would receive a numerically coded dose, and a member of the research team not involved in administering the drug or evaluating the participant’s response would be responsible for managing the codes. Specifics vary from study to study, but the purpose is still the same.
Let’s say you give drug A to 20 people. A month later 15 of them are doing better. That’s great, but it doesn’t tell you the extent to which drug A is or is not affecting outcomes. That’s where controls come in. A control represents a comparison group that allows researchers to make some determination of what is causing what. Using a placebo control is often the most desirable to separate out how much of the response is actually due to the drug being studied.
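To make the role of a control group concrete, here is a small Python sketch. The 15 out of 20 figure comes from the example above; the placebo group's numbers (9 out of 20) are made up purely for illustration, as is the `response_rate` helper.

```python
def response_rate(improved, total):
    """Percentage of participants who improved."""
    return improved * 100 / total

drug_rate = response_rate(15, 20)    # 15 of 20 improved on drug A
placebo_rate = response_rate(9, 20)  # hypothetical placebo group, numbers made up

print(f"Drug A: {drug_rate:.0f}%")     # Drug A: 75%
print(f"Placebo: {placebo_rate:.0f}%")  # Placebo: 45%
print(f"Difference: {drug_rate - placebo_rate:.0f} percentage points")  # Difference: 30 percentage points
```

In this made-up scenario, a good chunk of the improvement would have happened anyway, and only the difference between the two groups can reasonably be credited to the drug.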
Sometimes another active drug is used as a control. One example of this is in studies of ketamine for major depressive disorder. Since ketamine is an anaesthetic, if a normal saline infusion were given as a placebo, any attempts at blinding would go right out the window. Therefore, studies might use something like intravenous midazolam, a rapid- and short-acting benzodiazepine, as a control. Other types of controls include “treatment as usual” (the control group receives whatever treatment is considered standard practice) or “waitlist” (patients who are currently on a waitlist to receive the intervention that’s being studied).
Risk is often quantified as a percentage, but a number needs sufficient context to be meaningful. Absolute risk refers to the probability of something happening, full stop, while relative risk refers to the probability of something happening in relation to some other designated population, factor, or situation. Relative risk numbers are meaningless unless you know the reference point. Let’s say drug A causes a 500% increase in the risk of cancer X. Sounds scary, right? But what if the absolute risk of cancer X is 0.00001%? A 500% increase means the original risk plus five times that again, so taking drug A increases the risk of cancer X to 0.00006%, which is a lot less frightening.
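The arithmetic for turning a relative increase into an absolute risk can be sketched in a few lines of Python. The function names here are just illustrative; the key point is that a 500% increase means the new risk is six times the baseline (the original risk plus five times the original).

```python
def risk_after_increase(baseline_pct, relative_increase_pct):
    """Absolute risk (in %) after applying a relative increase (in %)."""
    return baseline_pct * (1 + relative_increase_pct / 100)

def per_ten_million(risk_pct):
    """Convert a percentage risk into expected cases per 10 million people."""
    return risk_pct / 100 * 10_000_000

baseline = 0.00001                              # absolute risk of cancer X, in percent
with_drug = risk_after_increase(baseline, 500)  # 500% relative increase

print(f"Baseline: {baseline:.5f}% (about {per_ten_million(baseline):.0f} in 10 million)")
print(f"With drug A: {with_drug:.5f}% (about {per_ten_million(with_drug):.0f} in 10 million)")
```

Expressed as people rather than percentages, the risk goes from roughly 1 in 10 million to roughly 6 in 10 million, which puts the scary-sounding "500%" in perspective.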
Reports on the news may talk about an increased risk of death from drug B. Again, that sounds pretty scary. Except the absolute risk of death is 100%, with no exceptions (at least thus far in human evolution). That means we need more information. What time frame are we talking about? Risk of death within 1 year? Within 20 years? It makes a difference, and greater research literacy helps us to understand that.
Correlation vs causation
Correlation refers to things that tend to happen together, and this can mistakenly be taken to mean there is a causative relationship. If we looked at 100 people who had died, we would find that 100% of them had skin. Does that mean having skin causes death? Of course not.
This is a common type of mistake in the anti-vaxxer movement. Autism can show up at around the same age as children receive some vaccines, but that doesn’t mean there is a causative relationship. Causation can be difficult to establish definitively, which is why it’s important to have well-designed, rigorous, peer-reviewed research studies.
The term “significant” is used differently in research studies than it is in general usage. Significant results in a study indicate that from a statistical perspective the results obtained were unlikely to be due to chance. It does not mean that the results are large in scale (this would be described as a large effect size) and it does not mean that the results are necessarily important or meaningful.
Related to this is the concept of “confidence intervals”, which give a range of plausible values for the true result, calculated so that intervals built this way would contain the true value a certain percentage of the time, such as 95%. As a completely arbitrary example, say placebo was associated with a 10% improvement in symptom X, with a 95% confidence interval of +/- 5% (i.e. 5-15%). Study drug A was associated with a 25% improvement in symptom X with a 95% confidence interval of +/- 5% (i.e. 20-30%). Because drug A is associated with a greater improvement in symptom X, and the lower end of its confidence interval doesn’t overlap with the upper end of the placebo confidence interval, the results are considered statistically significant. These numbers are completely made up, but hopefully, this example gives you the right idea.
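The overlap check in this made-up example can be written out as a short Python sketch. The `interval` and `intervals_overlap` helpers are hypothetical names used only for illustration, and this is a simplified visual check rather than a formal statistical test.

```python
def interval(estimate, margin):
    """Return (lower, upper) bounds for an estimate plus or minus a margin."""
    return (estimate - margin, estimate + margin)

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

placebo = interval(10, 5)  # 5 to 15
drug_a = interval(25, 5)   # 20 to 30

print(placebo, drug_a)                     # (5, 15) (20, 30)
print(intervals_overlap(placebo, drug_a))  # False
```

One caution: intervals that don't overlap point to a statistically significant difference, but the reverse doesn't hold. Two intervals can overlap a little and the difference can still turn out to be significant when a proper statistical test is run.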
Review papers evaluate the existing research literature on a topic. “Systematic” refers to clearly laid out criteria for inclusion and exclusion of studies in the review and elements used to evaluate the quality of the included studies. Review articles are a great way to get a lot of information all in one place that’s already been carefully scrutinized.
So there it is, a quick overview of bringing a critical eye to research. The best way to improve research literacy is by actually consuming research literature. This is where Google Scholar can be your new best friend. Full research papers often aren’t available for free access, but for the most part, there is public access to a paper’s abstract, which is a concise summary of the key points including method, results, and conclusions.
PubMed is another great source of information, and often hits on Google Scholar link to the corresponding PubMed page. Papers with the results of research funded by the National Institutes of Health (NIH) in the United States are made available for free through PubMed Central, the full-text archive that PubMed links to. I’ve read a lot of research papers over the years, and while they can be a bit dry, they can also be an invaluable source of information. And the more you read, the easier it gets. Trust me. And improving your research literacy is a great way to make sure your bullshit detector is as well-tuned as possible.