What to do when evidence has validity issues?

I often wonder how different clinicians (and EBM gurus) approach the dilemma of critically appraising an article only to find that it has one or more flaws. For example, a common flaw is lack of concealed allocation in a randomized controlled trial. Empirical studies show that the effects of experimental interventions are exaggerated by about 21% [ratio of odds ratios (ROR): 0.79, 95% CI: 0.66–0.95] when allocation concealment is unclear or inadequate (JAMA 1995;273:408-12).

So what should I do if a randomized trial doesn’t adequately conceal the allocation scheme? I could discard the study completely and look for another study. But what if there isn’t another study? Should I ignore the data from an otherwise perfectly good study? I could use the study but adjust the findings down by 21% (see above for why), and if the effect of the intervention still crosses my clinically important threshold then I would implement the therapy. Or I could use the study as is and assume the flaw wasn’t important because the reviewers and editors didn’t think it was. This is foolish, as many of them probably didn’t even recognize the flaw, nor would many of them understand its impact.

I don’t have the right answer but wonder what more learned people do. I personally adjust the findings down and then determine whether I still want to use the information. The problem with this approach is that it assumes the estimate of effect in the particular study I am reviewing is in fact biased…something I can’t really know.
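For what it’s worth, the down-adjustment I describe can be sketched in a few lines of code. This is a rough heuristic, not a validated method; the reported odds ratio and the clinical threshold below are hypothetical examples, and the ROR of 0.79 is the empirical average from the study cited above.

```python
def adjust_for_concealment_bias(reported_or: float, ror: float = 0.79) -> float:
    """Deflate a trial's odds ratio toward the null (OR = 1.0).

    If trials with unclear/inadequate allocation concealment exaggerate
    effects by a ratio of odds ratios (ROR) of ~0.79 on average, dividing
    the reported OR by the ROR undoes that average exaggeration.
    """
    return reported_or / ror

# Hypothetical example: a flawed trial reports OR 0.60 for a benefit.
reported = 0.60
adjusted = adjust_for_concealment_bias(reported)   # ~0.76, closer to the null
still_useful = adjusted < 0.80                     # hypothetical clinical threshold
```

If the adjusted estimate still clears the clinically important threshold, the flaw arguably doesn’t change the decision; if it doesn’t, the flaw matters.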

What do you do?

Learning materials for “Make Your PowerPoints Evidence-Based” workshop

I did a workshop on how to design multimedia slides to be consistent with Mayer’s Cognitive Theory of Multimedia Learning. The workshop materials are below.

The Google slides I used for the workshop: https://docs.google.com/presentation/d/1zNPsZNNODQxywAH9QZxi6mSgrxh4P08UYThO5ckfP5Q/edit?usp=sharing

Here are the two handouts I used:

1. Goals of Instructional Design Handout : reviews methods to reduce extrinsic cognitive load, manage intrinsic cognitive load, and foster germane cognitive load

2. Make Your PowerPoints Evidence-Based handout used during the workshop

Is tinzaparin better than warfarin in patients with VTE and cancer or not?

The CATCH trial results were published this week in JAMA. The abstract is below. Do you think this drug is useful for venous thromboembolism (VTE) treatment?

Importance  Low-molecular-weight heparin is recommended over warfarin for the treatment of acute venous thromboembolism (VTE) in patients with active cancer largely based on results of a single, large trial.

Objective  To study the efficacy and safety of tinzaparin vs warfarin for treatment of acute, symptomatic VTE in patients with active cancer.

Design, Settings, and Participants  A randomized, open-label study with blinded central adjudication of study outcomes enrolled patients in 164 centers in Asia, Africa, Europe, and North, Central, and South America between August 2010 and November 2013. Adult patients with active cancer (defined as histologic diagnosis of cancer and receiving anticancer therapy or diagnosed with, or received such therapy, within the previous 6 months) and objectively documented proximal deep vein thrombosis (DVT) or pulmonary embolism, with a life expectancy greater than 6 months and without contraindications for anticoagulation, were followed up for 180 days and for 30 days after the last study medication dose for collection of safety data.

Interventions  Tinzaparin (175 IU/kg) once daily for 6 months vs conventional therapy with tinzaparin (175 IU/kg) once daily for 5 to 10 days followed by warfarin at a dose adjusted to maintain the international normalized ratio within the therapeutic range (2.0-3.0) for 6 months.

Main Outcomes and Measures  Primary efficacy outcome was a composite of centrally adjudicated recurrent DVT, fatal or nonfatal pulmonary embolism, and incidental VTE. Safety outcomes included major bleeding, clinically relevant nonmajor bleeding, and overall mortality.

Results  Nine hundred patients were randomized and included in intention-to-treat efficacy and safety analyses. Recurrent VTE occurred in 31 of 449 patients treated with tinzaparin and 45 of 451 patients treated with warfarin (6-month cumulative incidence, 7.2% for tinzaparin vs 10.5% for warfarin; hazard ratio [HR], 0.65 [95% CI, 0.41-1.03]; P = .07). There were no differences in major bleeding (12 patients for tinzaparin vs 11 patients for warfarin; HR, 0.89 [95% CI, 0.40-1.99]; P = .77) or overall mortality (150 patients for tinzaparin vs 138 patients for warfarin; HR, 1.08 [95% CI, 0.85-1.36]; P = .54). A significant reduction in clinically relevant nonmajor bleeding was observed with tinzaparin (49 of 449 patients for tinzaparin vs 69 of 451 patients for warfarin; HR, 0.58 [95% CI, 0.40-0.84]; P = .004).

Conclusions and Relevance  Among patients with active cancer and acute symptomatic VTE, the use of full-dose tinzaparin (175 IU/kg) daily compared with warfarin for 6 months did not significantly reduce the composite measure of recurrent VTE and was not associated with reductions in overall mortality or major bleeding, but was associated with a lower rate of clinically relevant nonmajor bleeding. Further studies are needed to assess whether the efficacy outcomes would be different in patients at higher risk of recurrent VTE.

When I approach a study with marginally negative results I consider several things to help me decide if I would still prescribe the drug:

  1. Was the study powered properly? Alternatively, were the assumptions made in the sample size calculation reasonable? Sample size calculations require several inputs, chiefly: desired power, type I error rate, and the expected difference in event rates between the arms of the trial. The usual offender is authors overestimating the benefit they expect to see. Here, the authors expected a 50% relative reduction in event rates between the 2 arms of the study. That seems high but is consistent with a meta-analysis of similar studies and the CLOT trial. They only saw a 31% reduction. Detecting a difference this small would have required more patients, so the study is underpowered (post hoc power 41.4%).
  2. How much of the confidence interval is on the side of benefit? Most of the CI in this case is below 1.0 (0.41-1.03), so I pay more attention to this than to the p-value (0.07). The interval is consistent with as much as a 59% reduction in the hazard of VTE and with only a 3% increase. A reduction of that size would be clinically important.
  3. What are the pros and cons of the therapy? Preventing VTE is important, and the risk of bleeding was lower with tinzaparin. Had the bleeding been higher, I might have had different thoughts about prescribing this drug.
  4. Are the results of this trial consistent with previous studies? If so, then I fall back on it being underpowered and would likely prescribe the drug. A meta-analysis of 7 studies found a similar reduction in VTE (HR 0.47).
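To make point 1 concrete, here is a quick sketch of a post hoc power calculation using a two-proportion z-test and the 6-month cumulative incidences reported in CATCH (7.2% vs 10.5%, roughly 450 patients per arm). This uses a normal approximation, so it will only approximate the 41.4% figure, which may have been computed by a different method.

```python
from math import sqrt, erf

def normal_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_two_proportions(p1: float, p2: float, n_per_arm: int,
                          z_alpha: float = 1.96) -> float:
    """Approximate power of a two-sided two-proportion z-test
    (normal approximation, equal arms, alpha = 0.05)."""
    se = sqrt(p1 * (1 - p1) / n_per_arm + p2 * (1 - p2) / n_per_arm)
    return normal_cdf(abs(p1 - p2) / se - z_alpha)

# CATCH event rates: 7.2% tinzaparin vs 10.5% warfarin, ~450 per arm
power = power_two_proportions(0.072, 0.105, 450)   # roughly 0.41-0.42
```

Run the same formula with the 50% relative reduction the authors assumed (10.5% vs about 5.25%) and the power comes out above 80%, which is presumably why the trial was sized as it was.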

Thus, I think the study was underpowered for the event rates they encountered. Had more patients been enrolled, they likely would have found a statistically significant difference between groups, and I would not anticipate the results shifting from benefit to harm with more patients. It is likely the patients in this trial were “healthier” than patients in the previous trials. I feel comfortable saying tinzaparin is likely beneficial and I would feel comfortable prescribing it.

This demonstrates the importance of evaluating the confidence interval and not just the p-value. More information can be gleaned from a confidence interval than from a p-value.
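A side note on reading hazard ratio confidence intervals: converting a hazard ratio to a percent change in hazard (e.g. HR 0.41 means a 59% reduction) is simple arithmetic. A minimal sketch using the CATCH CI bounds:

```python
def hr_to_percent_change(hr: float) -> float:
    """Convert a hazard ratio to a percent change in hazard.
    Positive values are reductions; negative values are increases."""
    return (1 - hr) * 100

lower, upper = 0.41, 1.03   # CATCH 95% CI for recurrent VTE
best_case = hr_to_percent_change(lower)    # ~59 -> a 59% reduction
worst_case = hr_to_percent_change(upper)   # ~-3 -> a 3% increase
```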

Do lipid guidelines need to change just because there is a new, expensive drug on the market? NO!

Shrank and colleagues published a viewpoint online today positing that lipid guidelines should return to LDL-based targets. I think they are wrong. They use two studies to support their assertion.

First they use the IMPROVE IT study. In this study, patients hospitalized for ACS were randomized to a combination of simvastatin (40 mg) and ezetimibe (10 mg) or simvastatin (40 mg) and placebo (simvastatin monotherapy). The LDLs were already pretty low in this group: baseline LDL cholesterol levels had to be between 50 and 100 mg per deciliter if patients were receiving lipid-lowering therapy, or 50 and 125 mg per deciliter if not (average baseline LDL was 93.8 mg/dl). The results show minimal benefits, as demonstrated below:

IMPROVE IT results

Current guidelines would recommend a high-potency statin in this patient population. Adding ezetimibe to a moderate-dose statin is probably equivalent to a high-potency statin (from an LDL-lowering perspective). This study (and all ezetimibe studies) should have tested the difference between simvastatin 40 mg-ezetimibe 10 mg and simvastatin 80 mg or atorvastatin 40 or 80 mg. So to me IMPROVE IT doesn’t PROVE anything other than that a more potent statin regimen leads to fewer cardiovascular events…something we already know.

Now on to the second argument. They argue that alirocumab (Praluent), the first in a new class, the proprotein convertase subtilisin/kexin type 9 (PCSK-9) inhibitors, should lead to LDL-guided therapy again. Why? “Early results suggest these drugs have a powerful effect on levels of low-density lipoprotein cholesterol (LDL-C), likely more potent than statins“. A systematic review of studies of this drug shows a mortality reduction, but the comparators in these studies were placebo or ezetimibe 10 mg. Why? We have proven therapy for LDL, and this drug should have been compared to high-potency statins. That study will likely never be done (unless the FDA demands it) because the companies making this drug can’t risk finding that it works only as well as a high-potency statin, or possibly worse. Also, does this class of drugs have anti-inflammatory effects like statins? Are they safer? This is an injectable drug that has to be warmed to room temperature prior to use and is very costly compared to generic atorvastatin.

In my opinion, no guideline should be changed without appropriately designed outcomes studies for the drugs being recommended. In this case, the risk-benefit margin needs to be impressive to justify the cost as we have dirt cheap potent statins already.

The authors of this viewpoint make no great rational argument for guideline change other than that there is a new drug on the market and it might work. Let’s see if it does, and at what cost (both monetary and physiological).

Misconceptions about screening are common. Educate your patients.

An article published online today by JAMA Internal Medicine is very revealing about the misconceptions patients can have about screening, in this case lung cancer screening. This study was conducted at 7 VA sites launching a lung cancer screening program. Participants underwent semi-structured qualitative interviews about health beliefs related to smoking and lung cancer screening. Participants had some interesting beliefs:

    • Nearly all participants mentioned the belief that everyone who is screened will benefit in some way
    • Many participants wanted to undergo screening to see “how much damage” they had done to their lungs
    • Rather than being alarmed by identification of a nodule or suspicious findings requiring monitoring with future imaging, several participants expressed the belief that identification of the nodule meant their cancer had been found so early that it was currently harmless

From https://upload.wikimedia.org/wikipedia/commons/3/3f/Thorax_CT_peripheres_Brronchialcarcinom_li_OF.jpg

It’s important to educate our patients on what screening is and isn’t. They need to understand the role of screening. I like to ask patients what they expect to get out of screening; it can help you discover their misconceptions. They need to understand that they still need to change behaviors (in this case smoking) even if the screening test is negative. I think we all too often just order the screening test because a clinical reminder tells us to, without thinking of how it could be interpreted by our patients.

Food for thought: what is the false positive rate of CT scanning for lung cancer screening?

Click here and read the results section of this abstract for the answer. Shocking, isn’t it?

Which journals contain the best content for a general internist?

I did an entirely unscientific survey today of the tables of contents of several journals from March 1 through May 31, 2015. I evaluated the titles of original research and systematic review articles for their usefulness to a general internist. I considered an article useful if it informed me of information that, as a general internist, I could use to take care of an outpatient or inpatient (non-ICU). I then calculated a simple proportion: useful studies divided by total studies published. Admittedly there are major limitations to my study, but it was informative. Here’s what I found (from least to most useful):

  1. NEJM 20.4% useful
  2. Annals of Internal Medicine 25%
  3. Lancet 29.5%
  4. JAMA 32.4%
  5. BMJ 46.1%
  6. JAMA Internal Medicine 61.5%

I postulated that predigested/preappraised journals would be more useful, and they were.

  1. BMJ EBM 52.2% (2 caveats here: I am an associate editor of BMJ EBM so I have that conflict, and BMJ EBM publishes pediatric, OB/GYN and surgical studies also. Thus, the % of articles for a general internist is reduced as the total number of articles published is fixed at 23 per issue)
  2. ACP Journal Club 65.7%

Does this surprise you one way or another? Did you expect an even lower percentage or a higher percentage for each journal?

This could be a useful technique for deciding which journal(s) to subscribe to. Review the main ones in your area, decide which contains the best content, and focus on those journals. Don’t rely on a journal’s reputation alone.

Alternatively, get BMJ EBM or ACP Journal Club and read it cover to cover. You will be very up to date if you do that. Only clinically important and methodologically sound articles make it into these publications. In addition, these journals include expert commentary about each article.

Treating Low T can be dangerous

I am bombarded with low T (low testosterone) commercials on the radio and television. There is a men’s health clinic in my city that will screen and treat men for this horrendous affliction. They guarantee greater sexual prowess and a happy marriage. What they don’t mention are the side effects, which can be deadly.

An important study published in JAMA in 2013 showed increased cardiac risk in veterans who were prescribed testosterone. The caveat of this study is that all the patients had undergone cardiac catheterization and thus were at higher risk for CAD than those who don’t undergo cardiac cath. As shown in the image below, at any given point during follow-up those on testosterone were at 29% greater risk of death, MI, or stroke than those on no testosterone therapy. Adjusting for the presence of CAD had no effect on the estimate of outcomes; thus, even those without CAD (by catheterization) were at increased risk of death, MI, and stroke. Most patients in this study got patches or injections; around 1% got the gel.

Survival curve for testosterone
A new study has looked at differences in risk among different testosterone dosage forms. This was a huge retrospective cohort (544,000 patients): 37.4% injection, 6.9% patch, and 55.8% gel users. The outcomes of interest were myocardial infarction (MI), unstable angina, stroke, and a composite acute event (MI, unstable angina, or stroke); venous thromboembolism (VTE); mortality; and all-cause hospitalization. They compared these outcomes between injection users and gel users, and between patch users and gel users. They didn’t have a nonuser group, but that wasn’t really needed, as risk compared to nonusers has been established by the study I noted above. The results are shown in the 2 figures below.

Results 1

Results 2

Using injectable testosterone was associated with increased risk of stroke, death, MI, and hospitalization compared to testosterone gel (left figure above). Testosterone patches only increased the risk of MI compared to testosterone gel (right figure above). You should look at the absolute rates in the tables in the paper, as they are low; what I report above are relative rates, which can be misleading.

The bottom line is that you should have a good reason to replace testosterone, not just that the patient’s T is low. You should consider the cardiovascular risk of this drug and counsel the patient on this risk (in addition to the risks of prostate cancer and polycythemia). If you choose to replace T, then the gel is the safest, followed by patches.

Evidence-Based Teaching Principle 3: Modality Principle

The following is a slide I might use to begin teaching about p-values and type I and type II errors. What do you think about it? Will students learn deeply from it? (Would you like to see a larger version of the slide? Please click on it.)

Version 1

Or do you think students would learn more deeply from this slide? The words at the bottom of the slide would be spoken by the instructor while the graphic is displayed.

Version 2

Research would predict version 2 is better and will lead to deeper understanding. But why? What is different about them?

Version 1 violates the modality principle, which states that people learn more deeply from multimedia lessons when words explaining concurrent graphics are presented as speech rather than as on-screen text. In version 1, the visual channel would have to simultaneously process the graphic and the printed text, which would likely overload the channel. In contrast, in version 2 the educational message is split across separate cognitive channels: the graphic in the visual channel and the words in the auditory channel.

Some caveats or limitations of this principle:

  1. It’s more important for novice learners
  2. It’s more important if the material is complex and presented at a rapid pace in a lecture. If the learner can control the pace of the material, the modality principle is less important.
  3. It doesn’t apply if only printed words are presented on the screen (without any corresponding graphic)
  4. There are times when words should be presented on screen:
    • words are technical
    • words are not in the learner’s native language
    • words are needed for future reference (e.g. directions to a practice exercise)

What’s the evidence for this? The modality principle is supported by more research than any other multimedia principle. Mayer identified 21 studies published through 2004 and found an average effect size on transfer tests of 0.97 (effect sizes > 0.8 are large, 0.5 is moderate).

Evidence-Based Teaching Principle 2: Contiguity Principle

The following is a slide I might use to teach about interpreting a forest plot. What do you think about it? Will students learn deeply from it? (Would you like to see a larger version of the slide? Please click on it.)

Version 1

Or do you think students would learn more deeply from this slide?

Version 2

Research would predict version 2 is better and will lead to deeper understanding. But why? What is different about them?

Version 1 violates the spatial contiguity principle, which states that people learn more deeply from a multimedia message when corresponding words and pictures are presented near rather than far from each other on the page or screen. In version 1, the words describing the image are at the bottom of the slide. The learner has to look away from the graphic to find this description and then hold it in working memory (remember, working memory is limited both in capacity and in how long it can hold an object) while looking back at the image and trying to process the two together. This can overload cognitive capacity and impair learning. Version 2, on the other hand, has the words right next to the corresponding graphic, reducing cognitive work. This is especially important when words refer to parts of on-screen graphics.

Other common violations of the spatial contiguity principle include:

  • Feedback is displayed on a separate screen from the practice exercise or question
  • Directions to complete practice exercises are placed on a separate screen from the application screen
  • Key elements of a graphic are numbered but the legend is at the bottom of the screen

Watch the following video about how to calculate the number needed to treat. Will students learn deeply from this video?

Research would predict they won’t because the instructor violated the temporal contiguity principle which states that people learn more deeply from a multimedia message when corresponding animation and narration are presented simultaneously rather than successively. Cognitive capacity will be overloaded because the learner has to hold all of the relevant words in working memory until the animation is presented. This principle is especially important when narration and animation segments are long and when students can’t control the pace of the presentation.

What’s the evidence for this? Mayer, in Table 12.7 in the Cambridge Handbook of Multimedia Learning (2014), summarizes 22 studies on spatial contiguity published through 2012 and finds an average effect size of 1.10 (effect sizes > 0.8 are large, 0.5 is moderate). Table 12.8 summarizes 9 studies on temporal contiguity published through 2008 and finds an average effect size of 1.22. Thus, following the contiguity principle leads to deeper understanding.
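As an aside, the number needed to treat that the video above walks through is a one-line calculation. Here is a minimal sketch; the event rates below are hypothetical examples, not figures from the video:

```python
from math import ceil

def number_needed_to_treat(control_rate: float, treated_rate: float) -> int:
    """NNT = 1 / absolute risk reduction, rounded up to a whole patient."""
    arr = control_rate - treated_rate   # absolute risk reduction
    if arr <= 0:
        raise ValueError("No risk reduction: NNT is undefined")
    return ceil(1 / arr)

# Hypothetical example: events in 10.5% of controls vs 7.2% of treated patients
nnt = number_needed_to_treat(0.105, 0.072)   # ARR 3.3% -> NNT 31
```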

Evidence-based Teaching Principle 1: multimedia principle (Use words and pictures rather than words alone)

The following is a slide I might use to teach about one of the criteria for critically appraising a therapy study. What do you think about it? Will students learn deeply from it?

Version 1

The multimedia principle states that people learn more deeply from words and pictures than from words alone. Why might this be? Reflect upon the cognitive theory of multimedia learning and think about why the multimedia principle leads to better learning.

Here is another version of the previous slide that better adheres to the multimedia principle. (Note: Would you like to enlarge the image? If so, please click on it). What do you think about this one? Will students learn more deeply from it or version 1?

Version 2

Where are the words, you say? They would be spoken during the lecture, explaining the same information as on the version 1 slide. They just aren’t typed out on the slide. Another option would be to put the written words in the notes area in PowerPoint.

Research would predict that students will learn more deeply from version 2 than from version 1. Why? Remember, active processing occurs when we take words and images and develop verbal and pictorial models. The words and images work together to help learners develop the models. Words alone can mean more cognitive work for the learner in constructing a model. Also, words alone might not be effective in activating prior knowledge, which we need to do so that it can be integrated with our new model, which then leads to learning. I have left out an important explanation here (can you guess what it is?), but it is a multimedia principle of its own and will be covered in an upcoming post.

Are all images created equal? What kind of images should I use? The answer to both of these questions is: it depends. Let’s focus first on what you are trying to teach. If you are trying to teach a motor skill or a complicated manual task, animated images or video seem to work better. For everything else, static images are as good as or better than animation, and they seem to be better for promoting deep understanding.

Which graphic below do you think would lead to better understanding about heart function? A or B?

From Butcher. J of Educ Psychol 2006;98:182

Butcher (2006) found that the simpler visual (A) led to better understanding. The simpler visual led learners to make more attempts to understand how the heart works than the complex visual did, and making more attempts led to better mental models. It seems that overly complex images can overwhelm novice learners.

Some caveats or limitations of this principle:

  1. If learners can control the pace of instruction, complex images promote stronger knowledge gains (in a lecture setting, where the instructor controls the pace, simple images are better)
  2. It’s more important for novice learners
  3. Sometimes only words can be used to explain a topic

What’s the evidence for this? Mayer, in Table 7.1 in the Cambridge Handbook of Multimedia Learning (2014), summarizes 9 studies published through 2006 and finds an average effect size on retention tests of 0.19 and on transfer tests of 1.63 (effect sizes > 0.8 are large, 0.5 is moderate). Thus, this principle shows weaker effects for retention but good effects on deeper understanding.