3-Pronged Approach to Reading a Clinical Study

There has been an interesting discussion over the last day on the Evidence-Based Health listserv about how to critically appraise a study. It is instructive to see the different opinions on the role of critical appraisal.

One important thing to remember, as pointed out by Ben Djulbegovic, is that critical appraisal relies on the quality of reporting, as these 2 studies showed: http://www.ncbi.nlm.nih.gov/pubmed/22424985 and http://www.ncbi.nlm.nih.gov/pmc/articles/PMC313900/. The implications are important but difficult for the busy clinician to deal with.

There are 3 questions you should ask yourself as you read a clinical study:

  1. Are the findings TRUE?
  2. Are the findings FREE OF THE INFLUENCE OF BIAS?
  3. Are the findings IMPORTANT?

The most difficult question for a clinician to answer initially is whether the findings are TRUE. This question gets at issues of fraud in a study. Thankfully, major fraud (i.e. totally fabricated data) is a rare occurrence, and totally fraudulent data usually get exposed over time. Worry about fraudulent data when the findings seem too good to be true (i.e. not consistent with clinical experience); usually other researchers in the area will try to replicate the findings and can't. There are other elements of truth that are more subtle and occur more frequently. For example, did the authors go on a data dredging expedition to find something positive to report? This most commonly occurs with post hoc subgroup analyses. These should always be considered hypothesis generating and not definitive. Here's a great example of a false subgroup finding:

The Second International Study of Infarct Survival (ISIS-2) investigators reported an apparent subgroup effect: patients presenting with myocardial infarction born under the zodiac signs of Gemini or Libra did not experience the same reduction in vascular mortality attributable to aspirin that patients with other zodiac signs had.

Classical critical appraisal, using tools like the Users' Guides, is done to DETECT BIASES in the design and conduct of studies. If any are detected, then you have to decide the degree of influence that the bias(es) had on the study results. This is difficult, if not impossible, to determine with certainty, but there are studies that estimate the influence of various biases (for example, lack of concealed allocation in an RCT) on study outcomes. Remember, most biases lead to overestimates of effects. There are 2 options if you detect biases in a study: 1) reduce the "benefit" seen in the study by the amounts shown in the following table and then decide if the findings are still important enough to apply in patient care, or 2) discard the study and look for one that is not biased.

This table is synthesized from findings reported in the Cochrane Handbook (http://handbook.cochrane.org/chapter_8/8_assessing_risk_of_bias_in_included_studies.htm)

| Bias | Exaggeration of treatment benefit |
| --- | --- |
| Lack of randomization | 25% (-2 to 45%) |
| Lack of allocation concealment | 18% (5 to 29%) |
| Lack of blinding | 9% (NR) |
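To make option 1 concrete, here is a minimal sketch of the arithmetic in Python. The function name and the example numbers (a hypothetical trial reporting a 40% relative risk reduction) are my own illustration; the post doesn't prescribe an exact method beyond "reduce the benefit by the amounts in the table."

```python
# Point estimates of exaggeration from the table above (Cochrane Handbook).
EXAGGERATION = {
    "no_randomization": 0.25,           # 25% (-2 to 45%)
    "no_allocation_concealment": 0.18,  # 18% (5 to 29%)
    "no_blinding": 0.09,                # 9% (NR)
}

def deflate_rrr(reported_rrr, detected_biases):
    """Reduce a reported relative risk reduction (RRR) by the estimated
    exaggeration of each bias detected during critical appraisal."""
    adjusted = reported_rrr
    for bias in detected_biases:
        adjusted *= (1 - EXAGGERATION[bias])
    return adjusted

# Hypothetical trial: 40% RRR reported, but allocation was not concealed.
print(deflate_rrr(0.40, ["no_allocation_concealment"]))  # ~0.33, i.e. a 33% RRR
```

If the deflated benefit still crosses your clinically important threshold, the flaw may not change your decision; if it doesn't, you have your answer.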

Finally, if you believe the findings are true and free of significant bias, you have to decide if they are CLINICALLY IMPORTANT. This requires clinical judgment and an understanding of the patient's baseline risk of the bad outcome the intervention is trying to impact. Some people like to calculate NNTs to make this decision. Don't just look at the relative risk reduction and be impressed, because you can be misled by this measure, as I discuss in this video: https://youtu.be/7K30MGvOs5s
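Baseline risk is what turns a relative measure into something clinically meaningful. Here is a minimal sketch (the numbers are my own illustration, not from the video): the NNT is the reciprocal of the absolute risk reduction, so the same relative risk reduction yields wildly different NNTs at different baseline risks.

```python
def nnt(baseline_risk, rrr):
    """Number needed to treat = 1 / absolute risk reduction."""
    arr = baseline_risk * rrr  # absolute risk reduction
    return 1 / arr

# The same 30% RRR looks very different at different baseline risks:
print(round(nnt(0.20, 0.30)))   # high-risk patient: NNT ~ 17
print(round(nnt(0.002, 0.30)))  # low-risk patient: NNT ~ 1667
```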

What to do when evidence has validity issues?

I often wonder how different clinicians (and EBM gurus) approach the dilemma of critically appraising an article only to find that it has a flaw (or flaws). For example, a common flaw is lack of concealed allocation in a randomized controlled trial. Empirical studies show that the effects of experimental interventions are exaggerated by about 21% [ratio of odds ratios (ROR): 0.79, 95% CI: 0.66–0.95] when allocation concealment is unclear or inadequate (JAMA 1995;273:408-12).


So what should I do if a randomized trial doesn't adequately conceal the allocation scheme? I could discard the study completely and look for another one. But what if there isn't another study? Should I ignore the data of an otherwise perfectly good study? I could use the study and adjust the findings down by 21% (see above for why), and if the effect of the intervention still crosses my clinically important threshold, then I would implement the therapy. I could use the study as is and assume the flaw wasn't important because the reviewers and editors didn't think it was. This is foolish, as many of them probably didn't even recognize the flaw, nor would many of them understand its impact.
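One literal way to operationalize "adjust the findings down by 21%" is to divide the reported odds ratio by the pooled ROR of 0.79. This is my own arithmetic sketch, not a validated method, and it rests on the strong assumption that the average bias applies to the particular trial in front of you:

```python
ROR = 0.79  # ratio of odds ratios for unclear/inadequate allocation concealment

def debias_or(reported_or, ror=ROR):
    """Undo the average exaggeration, assuming reported OR = true OR * ROR
    (benefit expressed as OR < 1)."""
    return reported_or / ror

# Hypothetical unconcealed trial reports OR 0.60 for benefit:
print(round(debias_or(0.60), 2))  # ~0.76 -- still beneficial, but the
                                  # "40% reduction" shrinks to ~24%
```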

I don't have the right answer but wonder what more learned people do. I personally adjust the findings down and determine if I still want to use the information. The problem with this approach is that it assumes the estimate of effect in the particular study I am reviewing is in fact biased… something I can't really know.

What do you do?

Publication Bias is Common in High Impact Journal Systematic Reviews

A very interesting study was published earlier this month in the Journal of Clinical Epidemiology assessing publication bias reporting in systematic reviews published in high impact factor journals. Publication bias refers to the phenomenon that statistically significant positive results are more likely to be published than negative results; they also tend to be published more quickly and in more prominent journals. The issue matters because the goal of a systematic review is to systematically search for and find all studies on a topic (both published and unpublished) so that an unbiased estimate of effect can be determined from all studies (both positive and negative). If only positive studies, or a preponderance of positive studies, are published and only these are included in the review, then a biased estimate of effect will result.

Onishi and Furukawa's study is the first to examine the frequency of significant publication bias in systematic reviews published in high impact factor general medical journals. They identified 116 systematic reviews published in the top 10 general medical journals in 2011 and 2012: NEJM, Lancet, JAMA, Annals of Internal Medicine, PLOS Medicine, BMJ, Archives of Internal Medicine, CMAJ, BMC Medicine, and Mayo Clinic Proceedings. For each systematic review that did not report an assessment of publication bias, they assessed publication bias themselves using Egger's test of funnel plot asymmetry, contour-enhanced funnel plots, and tunnel effects (a minimal sketch of Egger's test appears after the figure below).

RESULTS: The included systematic reviews were of moderate quality, as shown in the graph below. About a third of "systematic reviews" didn't even perform a comprehensive literature search, while 20% didn't assess study quality. Finally, 31% of systematic reviews didn't assess for publication bias. How can you call your review a systematic review when you don't perform a comprehensive literature search and you don't determine if you missed studies?

[Figure: Quality of included reviews. From J Clin Epi 2014;67:1320]
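To make Egger's test less abstract, here is a minimal sketch of how it can be computed. This is my own illustration, not the authors' code, and the ten log odds ratios and standard errors are made up. The idea: regress the standardized effect (effect/SE) on precision (1/SE); an intercept that differs significantly from zero suggests funnel plot asymmetry, i.e. possible publication bias.

```python
import numpy as np
import statsmodels.api as sm

def egger_test(effects, ses):
    """Egger's regression test: regress standardized effect (effect/SE)
    on precision (1/SE) and return the intercept and its p-value."""
    effects, ses = np.asarray(effects), np.asarray(ses)
    z = effects / ses          # standardized effects
    precision = 1 / ses
    X = sm.add_constant(precision)
    fit = sm.OLS(z, X).fit()
    return fit.params[0], fit.pvalues[0]  # intercept, p-value for intercept

# Made-up log odds ratios and standard errors from ten studies:
log_ors = [-0.9, -0.8, -0.7, -0.6, -0.5, -0.45, -0.4, -0.35, -0.3, -0.28]
ses     = [0.50, 0.45, 0.40, 0.35, 0.30, 0.25, 0.20, 0.15, 0.10, 0.08]
intercept, p = egger_test(log_ors, ses)
print(f"Egger intercept = {intercept:.2f}, p = {p:.3f}")
```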

Of the 36 reviews that did not report an assessment of publication bias, 7 (19.4%) had significant publication bias. Said another way, if a systematic review didn't report an assessment of publication bias, there was about a 20% chance publication bias was present. The authors then assessed what impact publication bias had on the pooled results and found that the estimated pooled result was OVERESTIMATED by a median of 50.9% because of publication bias. This makes sense: mostly positive studies are published and negative studies aren't, so you would expect the estimates to be overly optimistic.

The figure below reports the results for individual journals. JAMA had significant publication bias in 50% of the reviews that didn't assess publication bias, while the Annals had 25% and BMJ 10%. It is concerning that these high impact journals publish "systematic reviews" that are of moderate quality and that a significant number of their reviews don't report any assessment of publication bias.

[Figure: Results by journal. From J Clin Epi 2014;67:1320]

Bottom Line: Always critically appraise systematic reviews published in high impact journals. Don't trust that an editor, even of a prestigious journal, did their job… they likely didn't.

Useful diagram to teach basic EBM concepts

Dr. La Rochelle published an article in BMJ EBM this month with a very useful figure in it (see below). It can help our learners (and ourselves) remember the relationship between the type of evidence and its believability/trustworthiness.

[Figure from Dr. La Rochelle's BMJ EBM article]

Let's work through this figure. The upright triangle should be familiar to EBM aficionados, as it is the typical hierarchy triangle of study designs, with lower quality evidence at the bottom and highest quality at the top (assuming, of course, that the studies were conducted properly). The "Risk of Bias" arrow next to the upright triangle reflects the quality statement I just made. Case reports and case series, because they have no comparator group and aren't systematically selected, are at very high risk of bias. A large RCT or systematic review of RCTs is at the lowest risk of bias.

The inverted triangle on the left reflects possible study effects, with the width of the corresponding area of the triangle (as well as the "Frequency of Potential Clinically Relevant Observable Effect" arrow) representing the prevalence of that effect. Thus, very dramatic, treatment-altering effects are rare (bottom of the triangle, very narrow). Conversely, small effects are fairly common (top of the triangle, widest part).

One way to use this diagram in teaching is to consider the study design you would choose (or look for) based on the anticipated magnitude of effect. Thus, if you are trying to detect a small effect, you will need a large study that is methodologically sound. Remember, bias is a systematic error in a study that makes the findings depart from the truth. Small effects seen in studies lower down the upright pyramid are potentially biased (i.e. not true). If you anticipate very large effects, then observational studies or small RCTs might be just fine.

An alternative way to use this diagram with learners is to temper the findings of a study. If a small effect is seen in a small, lower quality study, they should be taught to question that finding as likely departing from the truth. Don't change clinical practice based on it, but await another study. A very large effect, even in a lower quality study, is likely true but maybe not as dramatic as it seems (i.e. reduce the effect by 20-30%).

I applaud Dr. La Rochelle for developing a figure which explains these relationships so well.

Allocation Concealment Is Often Confused With Blinding

During journal clubs on randomized controlled trials there is often confusion about allocation concealment; it is often confused with blinding. In a sense it is a form of blinding, but not in the traditional sense. One way to think of allocation concealment is as blinding of the randomization schedule or scheme. Allocation concealment hides the randomization or allocation sequence (what's coming next) from patients and those who would enroll patients in a study. Blinding occurs after randomization and keeps patients, providers, researchers, etc. from knowing which arm of the study the patient is in (i.e. what treatment they are getting).

Why is allocation concealment important in a randomized controlled trial? Inadequate or unclear allocation concealment can lead to an overestimation (by up to 40%!) of treatment effect (JAMA 1995;273:408). First, consider why we randomize in the first place: to try to equally distribute confounding and prognostic factors between arms of a study so we can isolate the effect of the intervention. Now consider a physician who wants to enroll a patient in a study and wants to make sure her patient receives the therapy she deems most likely to be effective. What if she figured out the randomization scheme and knows which therapy the next patient will be assigned to? Hopefully you can see that this physician could undermine the benefits of randomization if she preferentially funnels sicker (or healthier) patients into one arm of the study. There could be an imbalance in baseline characteristics. It could also lead to patients who are enrolled in the study being fundamentally different from, or not representative of, the patient population.
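To see how an unconcealed allocation sequence can manufacture a treatment effect out of nothing, here is a toy simulation (entirely my own construction, not from the post or the JAMA paper): the true treatment effect is zero, but an enroller who can predict the next assignment quietly skips sicker patients when the treatment arm is up next.

```python
import random

random.seed(0)

def run_trial(concealed, n=10000):
    """Simulate a two-arm trial where the TRUE treatment effect is zero.
    If allocation is not concealed, the enroller peeks at the next
    assignment and skips sicker patients when 'treatment' is up next."""
    tx_events = tx_n = ctl_events = ctl_n = 0
    for _ in range(n):
        sickness = random.random()               # patient's risk of a bad outcome
        next_arm = random.choice(["tx", "ctl"])  # next slot in the randomization list
        if not concealed and next_arm == "tx" and sickness > 0.5:
            continue                             # sicker patient quietly not enrolled
        event = random.random() < sickness       # outcome depends only on sickness
        if next_arm == "tx":
            tx_events += event
            tx_n += 1
        else:
            ctl_events += event
            ctl_n += 1
    return (tx_events / tx_n) / (ctl_events / ctl_n)  # risk ratio, treatment vs control

print(round(run_trial(concealed=True), 2))   # ~1.0: no spurious benefit
print(round(run_trial(concealed=False), 2))  # ~0.5: a large, entirely artificial "benefit"
```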

[Figure from The Lancet]

You will have to use your judgment to decide how likely it is that someone could figure out the randomization scheme. You can feel more comfortable that allocation concealment was adequate if one of the following was used in the RCT:

  1. Sequentially numbered, opaque, sealed envelopes: these can't be seen through, even when held up to a light, and they are sealed so that you can't peek inside to see what the assignment is. As each patient is enrolled, you use the next numbered envelope.
  2. Pharmacy controlled: the enrolling physician calls the pharmacy, which enrolls the patient and assigns therapy.
  3. Centralized randomization: probably the most commonly used. The enrolling physician calls a central research site, and the central site assigns the patient to therapy.

Proper randomization is crucial to a therapy study, and concealed allocation is crucial to randomization. I hope this post helps readers of RCTs better understand what concealed allocation is and how to detect whether it was done adequately. Keep in mind that if allocation concealment is unclear or done poorly, the effect you see in the study needs to be tempered and possibly cut by up to 40%.

N-of-1 Trial for Statin-Related Myalgia: Consider Conducting These Studies in Your Practice

The March 4th edition of the Annals of Internal Medicine contains an article by Joy and colleagues in which they conducted N-of-1 trials in patients who had previously not tolerated statins. This is important because patients often complain that they cannot tolerate statins despite needing them. I have wondered how much of this is a self-fulfilling prophecy because they hear a lot about statin side effects from friends and various media outlets. An N-of-1 trial is a great way to determine if statin-related symptoms are real or imagined.

First, let's discuss N-of-1 trials.

What is an N-of-1 trial? It's an RCT of active treatment vs. placebo in an individual patient. The patient serves as his or her own control, thus perfectly controlling for a variety of biases.

When is an N-of-1 trial most useful? This design is not useful for self-limited illnesses, acute or rapidly evolving illnesses, surgical procedures, or prevention of irreversible outcomes (like stroke or MI). It's most useful for conditions that are chronic and for which therapy is prolonged. It's best if the effect you are looking for occurs fairly quickly and goes away quickly when treatment is stopped. These trials are a good way to determine the optimal dose of a medication for a patient. They are also good for determining whether an adverse effect is truly due to a medication. Finally, they are a good way to test a treatment's effect when the clinician feels it will be useless but the patient insists on taking it.

How is an N-of-1 trial conducted? Get informed consent from the patient and make sure they understand that a placebo will be part of the study. Next, the patient randomly undergoes pairs of treatment periods in which one period of each pair applies to the active treatment and one to placebo. A pharmacist will need to be involved to compound the placebo and to develop the randomization scheme (so as to keep clinician and patient blinded). Pairs of treatment periods are replicated a minimum of 3 times. There needs to be a washout period when moving from active to placebo and vice versa. The length of the treatment period needs to be long enough for the outcome of interest to occur. Use the rule of 3s here: if an event occurs on average once every x days, then observe for 3x days to be 95% confident of observing at least 1 event.

What outcome should be measured? Most commonly these trials are conducted to determine the effect of an intervention on quality-of-life measures (e.g. pain, fatigue, etc.). Ask the patient what is the most troubling symptom or problem they have experienced and measure that as your outcome. Have the patient keep a diary, or ask them to rate their symptoms on some meaningful scale at certain follow-up intervals. Do this during both active and placebo treatments. You will have to determine how much of a difference is clinically meaningful.

How do I interpret N-of-1 trial data? This can be a little difficult for non-statistically oriented clinicians. You could do the eyeball test and just see if there are important trends in the data. More rigorously, you could calculate the differences in mean scores of the placebo and active treatment periods; these would then be compared using a t test (freely available on the internet).
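To make that last analysis step concrete, here is a minimal sketch using made-up symptom scores; a paired t test across the treatment-period pairs is one reasonable implementation of the comparison described above.

```python
from scipy import stats

# Hypothetical mean symptom scores (0-100 VAS) from three pairs of periods;
# the numbers are made up for illustration.
active_periods  = [42, 38, 45]   # mean score during each active-drug period
placebo_periods = [40, 41, 43]   # mean score during the paired placebo period

t_stat, p_value = stats.ttest_rel(active_periods, placebo_periods)
mean_diff = sum(a - b for a, b in zip(active_periods, placebo_periods)) / len(active_periods)
print(f"mean difference = {mean_diff:.1f} points, p = {p_value:.2f}")
# Compare mean_diff against a prespecified clinically meaningful threshold
# before acting on the result (Joy and colleagues used 13 points, see below).
```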

Back to Joy and colleagues' N-of-1 trials of statins. They enrolled patients with prior statin-related myalgias. Participants were randomly assigned to receive either the same statin and dose that they previously didn't tolerate or placebo. They remained on "treatment" for 3-week periods with 3-week washout periods in between. Patients rated their symptoms weekly on visual analogue scales for myalgia and for their specific symptom (0-100, with 0 being no symptoms and 100 being the worst symptoms). A difference of 13 was felt to be clinically significant. What did they find? There were no statistically or clinically significant differences between statin and placebo on the myalgia score (difference of 4.37) or on the symptom-specific score (3.89). The neat thing the authors did was to determine whether patients resumed taking statins after reviewing the results of their N-of-1 trial: 5 of the 8 patients resumed statins (one didn't because a statin was no longer indicated).

So are statin-related myalgias mostly in our patients' heads? Maybe. This study is by no means definitive because it only enrolled 8 patients, but it at least suggests a methodology you can use to truly test whether a patient's symptoms are statin related or not. This is important to consider because the most recent lipid treatment guidelines focus on using statins only and not substituting other agents like ezetimibe or cholestyramine. So give this methodology a try. You and your patients will likely be amazed at what you find.

Conflicts of Interest in Online Point of Care Websites

Full disclosure: I was a Society of General Medicine UpToDate reviewer several years ago and received a free subscription to UpToDate for my services. I also use UpToDate regularly (through my institution's library).

Amber and colleagues published an unfortunate finding this week in the Journal of Medical Ethics. They found that UpToDate seems to have some issues with conflicts of interest by some of its authors and editors.

UpToDate makes this claim on its Editorial subpage: "UpToDate accepts no advertising or sponsorships, a policy that ensures the integrity of our content and keeps it free of commercial influences." Amber and colleagues' findings would likely dispute that the content is "free of commercial influences".

Amber and colleagues reviewed the Dynamed and UpToDate websites for COI policies and disclosures. They searched only a limited number of conditions on each site (male sexual dysfunction, fibromyalgia, hypogonadism, psoriasis, rheumatoid arthritis, and Crohn's disease), but their reasoning seems solid: treatment of these entities can be controversial (the first 3) or primarily involves biologics (the last 3). It seems reasonable that expert opinion could dominate recommendations on male sexual dysfunction, fibromyalgia, and hypogonadism, and that those experts could be conflicted. (Editorial side note: few doctors recommend stopping offending medications or offer vacuum erection devices instead of the "little blue pill". Most patients don't even realize there are treatments for ED other than "the little blue pill" and its outdoor-bathtub-loving competitor! But I digress.) The biologics also make sense to me because this is an active area of research, and experts writing these types of chapters could get monies from companies either for research or for speaking.

What did they find? Both Dynamed and UpToDate mandate disclosure of COIs (a point I will discuss later). No Dynamed authors or editors reported any COIs, while quite a few were found for UpToDate. Of the 31 different treatments mentioned for the 6 topic areas evaluated, for 14 (45%) of them the authors, editors, or both received grant monies from the company making the therapy. Similarly, for 45% of treatments the authors, editors, or both were consultants for the companies making them. For 5 of the 31 therapies, authors or editors were on the speakers' bureaus of the companies making them. What's most worrisome is that both the authors and the editors were conflicted for the psoriasis chapter. Thus there were no checks and balances for this topic at all!


While finding COIs is worrisome, it doesn't mean that the overall quality of these chapters was compromised or that they were biased. We don't know at this time what effect the COIs had; further study is needed. Unfortunately, this is probably just the tip of the iceberg. Many more chapters probably suffer from the same issues. Furthermore, traditional textbooks likely have the same problems.

Disclosing COIs is mostly useless; disclosure doesn't lessen their impact. I don't understand why nonconflicted authors can't be found for these chapters. Do we care so much about who writes a chapter that we potentially compromise ethics for name recognition? Those with COIs should not be allowed to write a chapter on topics for which they have a conflict. Period. If UpToDate is so intent on having them, maybe they could serve as consultants to the authors or as peer reviewers, but even that is stretching it. What really bothers me is that the editors for some of these chapters were also conflicted, thus leaving no check on potential biases. As Amber and colleagues point out, even though these chapters underwent peer review, what do we know about the peer reviewers? Did they have any COIs? Who knows.

So what should all you UpToDate users do? I suggest the following:

  1. Contact UpToDate demanding a change. They have the lion's share of the market, and until they lose subscribers nothing will likely change.
  2. Check for COIs yourself before using the recommendations in any UpToDate chapter. You should be especially wary if a recommendation begins with "In our opinion…" (an all too common finding).
  3. Use Dynamed instead. It has a different layout format than UpToDate, but it is quicker to be updated and seems less conflicted. And it's cheaper!