Evidence-based Teaching Principle 1: multimedia principle (Use words and pictures rather than words alone)

The following is a slide I might use to teach about one of the criteria for critically appraising a therapy study. What do you think about it? Will students learn deeply from it?

Version 1

The multimedia principle states that people learn more deeply from words and pictures than from words alone. Why might this be? Reflect upon the cognitive theory of multimedia learning and think about why the multimedia principle leads to better learning.

Here is another version of the previous slide that better adheres to the multimedia principle. What do you think about this one? Will students learn more deeply from it or from version 1?

Version 2

Where are the words, you ask? They would be spoken during a lecture, explaining the same information as on the version 1 slide; they just aren't typed out on the slide. Another option is to put the written words in the notes area in PowerPoint.

Research predicts that students will learn more deeply from version 2 than from version 1. Why? Remember that active processing occurs when we take words and images and develop verbal and pictorial models. The words and images work together to help learners build these models. Words alone force the learner to do more cognitive work to construct a model. Words alone may also fail to activate prior knowledge, which must be integrated with the new model for learning to occur. I have left out an important part of the explanation here (can you guess what it is?); it is a multimedia principle of its own and will be covered in an upcoming post.

Are all images created equal? What kind of images should you use? The answer to both questions is: it depends. Let's focus first on what you are trying to teach. If you are teaching a motor skill or a complicated manual task, animated images or video seem to work better. For everything else, static images are as effective as or better than animation, and static images seem to be better for promoting deep understanding.

Which graphic below do you think would lead to better understanding about heart function? A or B?

From Butcher. J of Educ Psychol 2006;98:182

Butcher (2006) found that the simpler visual (A) led to better understanding. The simpler visual led learners to make more attempts to understand how the heart works than the complex visual did, and making more attempts led to better mental models. It seems that overly complex images can overwhelm novice learners.

Some caveats or limitations of this principle:

  1. If learners can control the pace of instruction, complex images promote stronger knowledge gains (in a lecture setting, where the instructor controls the pace, simple images are better)
  2. The principle matters more for novice learners
  3. Sometimes a topic can only be explained with words

What's the evidence for this? Mayer, in Table 7.1 of the Cambridge Handbook of Multimedia Learning (2014), summarizes 9 studies published through 2006 and finds an average effect size of 0.19 on retention tests and 1.63 on transfer tests (effect sizes > 0.8 are considered large, around 0.5 moderate). Thus, this principle shows weak effects on retention but strong effects on deeper understanding.

Evidence-based teaching of EBM (and anything else)

I am going to write a series of posts on multimedia teaching principles. I am pursuing a master's degree in instructional design and educational technology, and as I learn about instructional design I am realizing how poorly I have designed much of my own teaching material. Furthermore, violations of the principles I will discuss in this series are very common in medical education. It's not the fault of the instructors; they haven't been taught these principles. For some reason, in medicine we assume doctors and PhDs know how to teach.

The point of this series will be to present multimedia design concepts that the educational literature has shown to improve learning, based on tests of retention (do you remember the content, by simple recall?) and transfer (can you apply the information to solve a closely related problem?).

We all teach using multimedia materials. If you put words (spoken or written) and images together in a presentation that is a multimedia presentation. So this series will be applicable to all teachers.

This first post will set the stage for future posts. The theory upon which all other posts will be based is the Cognitive Theory of Multimedia Learning by Richard Mayer.

cognitive theory of multimedia learning

The main components of this theory are as follows:

  1. Dual channels: there are 2 pathways to process information: auditory and visual (designated in blue and green, respectively)
  2. There is limited capacity of each pathway to process information
  3. Active processing occurs in each pathway

When words or images are presented to us we first have to determine which words or images are important (or which portions of them are important).  After we select words, images, or sounds that are meaningful, we organize them in our working, active memory into verbal and/or pictorial models. We then actively integrate these models with activated prior knowledge to create new knowledge (learning).

Multimedia presentations should be designed to facilitate this process. During the remainder of this series I will present evidence-based ways to do this. I will delve further into the Cognitive Theory of Multimedia Learning when I discuss how this process is affected by each of the design principles.

Podcasting to help keep current

Intro image

I am taking a class on multimedia as part of my Master of Educational Technology degree program. This week our assignment was to develop a podcast, and I decided to make it EBM related (always make your work count twice). I used Audacity, a free audio editor and recorder, to create the podcast. There was a learning curve, but I have it mostly figured out. In the past, when I created my YouTube videos, I "lectured" off the top of my head. For this assignment I had to write a script first and read from it. This is much better than ad-libbing: I don't have any verbal tics (like "uhs"), and my cadence is better. If you do any recordings, even about things you know well, I suggest you write a script and read from it.

Medicine Review in a Few will be a podcast series in which I review what I consider important studies in Internal Medicine. Each episode will review one study and will last less than 10 minutes; hence the "in a few" portion of the title. I think it's important to keep information that isn't interactive, and that is processed through only one channel, fairly short. I personally lose interest and focus with long podcasts. According to data from Stitcher.com, the average listener abandons a podcast within 22 minutes.

In Episode 1 I review the ADJUST-PE study. I chose to begin my podcast series with this study because I recently used its findings to care for a patient; I wasn't aware of the study until one of my residents brought it to my attention. I plan to review only clinically useful studies and to comment on any methodological limitations that I think the average clinician wouldn't recognize, or wouldn't know how they impact the study findings. I think podcasts are a good medium for reviewing studies.

For now, the podcasts will only be posted here, but if I keep up with this endeavor I'll ultimately try to get them on iTunes.

The image I used above is from splitshire.com and requires no attribution. The music used in my podcast is royalty free from Looperman.com.

How to calculate patient-specific estimates of benefit and harm from a RCT

One of the more challenging concepts for students is how to apply information from a study to an individual patient. Students have been taught how to calculate a number needed to treat (NNT) but that isn’t often very useful for the current patient they are seeing. Usually our patients are sicker or healthier than those in the study we are reading. Studies include a range of patients so the effect we see in the results is the average effect for all patients in the study.

Imagine you are seeing Mr. Fick, a 70 yo M with ischemic cardiomyopathy (EF 20%) and refractory anemia (baseline Hg 7-10 g/dl). He reports stable CHF symptoms of dyspnea after walking about 30 ft around the house. Other signs and symptoms of CHF are stable. Medications include lisinopril 20 mg bid, aspirin daily, furosemide 80 mg daily, and iron tablets daily. He is not taking a beta blocker due to bradycardia and can't take a statin due to myopathy. He has refused an ICD in the past. BP is 95/62 mm Hg, pulse is 50 bpm, weight is stable at 200 lbs. Labs done one week earlier show a stable Na of 125 mmol/l, K 3.8 mmol/l, Hg 8 g/dl, platelets 162 k, WBC normal with 22% lymphs on differential, cholesterol 220 mg/dl, and uric acid 6.2. Since he has severe CHF you are considering adding spironolactone to his regimen. He is concerned because he has a hard time tolerating medications. He wants to know how much it will help him. What do you tell him?

This figure is from the RALES trial, a study of spironolactone in patients with advanced CHF. Use the figure below to work out Mr. Fick's individual estimated risk of death if he agrees to take spironolactone.

RALES figure

There are 4 methods I will demonstrate for calculating a patient-specific estimate of effect from an RCT. First, think about what information you need to estimate Mr. Fick's specific benefit from spironolactone: the NNT from the RALES trial and Mr. Fick's estimated risk of death (we call this the PEER, or patient expected event rate). Where do we get Mr. Fick's PEER for death? From a validated prediction rule. I use Calculate by QxMD: look in the Cardiology folder under heart failure and open the Seattle Heart Failure Model. Plug in Mr. Fick's data and you get his 1-year expected risk of death (56%).

Method 1: Calculate a patient-specific NNT using the PEER. The formula is 1 / (PEER x RRR), where RRR is the relative risk reduction from the RALES trial (30%; RRR = 1 - RR). Plugging in, Mr. Fick's NNT is 1 / (0.56 x 0.3) = 6 (the NNT from the RALES trial itself is 9).

Method 2: Estimate a patient-specific NNT using f. f is what I call the fudge factor: your guesstimate of how much higher or lower Mr. Fick's risk of death is than that of the average patient in the study. If you say he is 2 times as likely to die, then f is 2; if you think he is half as likely, then f is 0.5. To use f, divide the study NNT by f; this gives an estimate of Mr. Fick's NNT. So let's say Mr. Fick is twice as likely to die as those in the study. The NNT of the study is 9, so 9/2 = 4.5, which I would round up to 5.

NNTs are nice, but it's hard to use them directly with a patient. The next 2 calculations are more useful for patients.

Method 3: Use the RR to calculate Mr. Fick's expected risk of death. The RR of death in the RALES trial is 0.70. Multiply this by his estimated risk of death and you get his expected risk if he takes spironolactone instead of nothing. His baseline risk of death is 56%, so 0.70 x 0.56 = 39%. If Mr. Fick takes spironolactone, I expect his risk of death to fall from 56% to 39%. That's useful information to give the patient.

Method 4: Use the RRR to calculate Mr. Fick's expected risk of death. This is similar to the method above, except you must remember that the RRR (relative risk reduction) is relative. First calculate how much risk the treatment removes: the RRR is 30% (RRR = 1 - RR). Multiply this by the patient's baseline risk of death: 0.30 x 0.56 = 0.168. This 16.8% is the absolute risk removed from the baseline. Subtract it from the baseline risk to get his final risk: 0.56 - 0.168 = 0.39, or 39%. That's the same number as method 3, and it has to be, because it's just a different way of calculating the exact same thing.
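To make the arithmetic concrete, the four methods above can be sketched in a few lines of Python. This is a minimal sketch using the RALES numbers from this example; the function names are my own, not from any library:

```python
def nnt_from_peer(peer, rrr):
    """Method 1: patient-specific NNT = 1 / (PEER x RRR)."""
    return 1 / (peer * rrr)

def nnt_from_f(study_nnt, f):
    """Method 2: divide the study NNT by the 'fudge factor' f."""
    return study_nnt / f

def risk_on_treatment_rr(peer, rr):
    """Method 3: expected risk on treatment = RR x PEER."""
    return rr * peer

def risk_on_treatment_rrr(peer, rrr):
    """Method 4: subtract the absolute reduction (RRR x PEER) from PEER."""
    return peer - rrr * peer

peer = 0.56   # Mr. Fick's 1-year risk of death (Seattle Heart Failure Model)
rr = 0.70     # relative risk of death with spironolactone (RALES)
rrr = 1 - rr  # relative risk reduction = 0.30

print(round(nnt_from_peer(peer, rrr)))            # method 1: ~6
print(nnt_from_f(9, 2))                           # method 2: 4.5, round up to 5
print(round(risk_on_treatment_rr(peer, rr), 2))   # method 3: 0.39, i.e. 39%
print(round(risk_on_treatment_rrr(peer, rrr), 2)) # method 4: 0.39, same as method 3
```

Methods 3 and 4 must agree, since they compute the same absolute risk by different routes.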

I hope this is useful. Now you can give patients some real numbers instead of just saying "your risk is decreased by x%."

Remember, you need two things: the patient's risk of the event without treatment (usually from a prediction rule, or perhaps the placebo event rate of the study or of a subgroup) and the event rates from the study. You can make all the calculations from there.

The devil is in the details- overstating the results of the effects of corticosteroids in patients with pneumonia

This blog post will tie in nicely with what I blogged on earlier today about composite endpoints. Read that post first before reading this.

Today I received my e-table of contents from JAMA and read a study on the Effect of Corticosteroids on Treatment Failure Among Hospitalized Patients With Severe Community-Acquired Pneumonia and High Inflammatory Response. The primary outcome of the study was "treatment failure (composite outcome of early treatment failure defined as [1] clinical deterioration indicated by development of shock, [2] need for invasive mechanical ventilation not present at baseline, or [3] death within 72 hours of treatment; or composite outcome of late treatment failure defined as [1] radiographic progression, [2] persistence of severe respiratory failure, [3] development of shock, [4] need for invasive mechanical ventilation not present at baseline, or [5] death between 72 hours and 120 hours after treatment initiation; or both early and late treatment failure)."

The authors make a bold statement:

"The results demonstrated that the acute administration of methylprednisolone was associated with less treatment failure…"

I find this statement (the first sentence of the discussion section) to be a vast overstatement of what they actually found in this study. Examine the table below (I trimmed out the per-protocol analysis results) and see what was actually reduced by steroids.

From JAMA 2015;313(7):677-686

Steroids had no effect on "early treatment failure." They significantly reduced "late treatment failure," but this was driven entirely by one outcome: the only thing steroids did was reduce radiographic progression. They didn't help any other component of this large composite, yet the authors make this sweeping statement that steroids were associated with less treatment failure. This demonstrates the importance of looking at the individual components of a composite and not just the overall composite result.

It also demonstrates why I don't like to read the discussion section of a paper or the conclusions of an abstract: you will be misled. The reviewers and editors should have toned down these conclusions, as they are a gross overstatement of what was actually found.

How to decide when a composite endpoint should go into the compost

Composite endpoints are commonly used in studies. A composite endpoint is a single endpoint composed of several other endpoints; if a patient experiences any one of them, they are considered to have experienced the endpoint of the trial. For example, a typical composite endpoint in a cardiovascular study includes nonfatal MI, nonfatal stroke, and cardiovascular death. A patient doesn't have to have all three, just one of them.

Why use composite endpoints? The main reason is to reduce the number of patients needed in the study: the chance of a patient having one particular outcome is much less than the chance of having any one of three outcomes. Composites are also used to potentially shorten the follow-up needed, because a patient is likely to develop one of three outcomes more quickly than any single outcome, or because one component of the composite can occur sooner than another (e.g. doubling of serum creatinine vs. initiation of hemodialysis).
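A quick back-of-the-envelope calculation shows why a composite raises the event rate and thus shrinks the required sample size. This sketch is purely illustrative and assumes, unrealistically, that the components are independent:

```python
def p_any(component_probs):
    """P(at least one event) = 1 - product of (1 - p_i), assuming independence."""
    p_none = 1.0
    for p in component_probs:
        p_none *= 1 - p
    return 1 - p_none

# Three components, each with a 5% risk over follow-up: the composite
# event rate is nearly triple the single-outcome rate.
print(round(p_any([0.05, 0.05, 0.05]), 3))  # 0.143, vs 0.05 for one outcome
```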

Not all composites are created equal. Some are good and many are poorly developed. Examine the composite outcome below from the RENAAL trial published in the NEJM in 2001. The primary efficacy measure was the time to the first event of the composite end point of a doubling of the serum creatinine concentration, end-stage renal disease, or death. What do you think? Is this a good composite or a poor composite? (Note: I put a red mark next to the components of the composite)

From the RENAAL trial, NEJM 2001

I think this is a poorly designed composite. Why do I say that? A good composite should have the following characteristics:

  1. Each component should be valued equally by patients,
  2. Each component should occur with similar frequency, and
  3. The intervention should have the same relative effect on each component.

With this in mind, reevaluate the RENAAL composite endpoint. Hopefully you agree with me that it's not a good composite endpoint. Let's examine it more closely.

Issue #1: would patients consider each of the components to be of equal value? Patients would not consider death and doubling of serum creatinine equal; clearly they would value death as a much worse outcome. So this composite fails here.

Issue #2: does each component of the composite occur with similar frequency? Looking at the percentages in the losartan group, they are fairly close to one another (21.6%, 19.6%, and 21%), so I would give the composite a pass on this criterion.

Issue #3: does the intervention (losartan) have an equal effect on each of the components of the composite? Look under the risk reduction column and the answer is no. Doubling of serum creatinine is reduced by 25% and end stage renal disease by 28% but death is actually increased by 2%. Thus, the composite fails on this criterion.

What should you do if the composite endpoint is a bad one? Ignore the composite and look at the individual components. Even if a composite is a good one, you should always examine the individual components. In this case, losartan reduces the risk of ESRD and of doubling of serum creatinine but has no effect on mortality.
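Two of the three criteria above are numerical, so you can sanity-check a composite quickly. Here is a minimal sketch using the RENAAL figures quoted above; the tolerance thresholds are my own illustration, not a published standard, and criterion 1 (equal value to patients) is a judgment call that can't be computed:

```python
def similar_frequency(rates, tolerance=0.05):
    """Criterion 2: components occur at similar rates (spread within tolerance)."""
    return max(rates) - min(rates) <= tolerance

def similar_effect(risk_reductions, tolerance=0.10):
    """Criterion 3: the treatment has a similar relative effect on each component."""
    return max(risk_reductions) - min(risk_reductions) <= tolerance

# Losartan-group event rates: doubling of creatinine, ESRD, death
rates = [0.216, 0.196, 0.21]
# Relative risk reductions: death was actually *increased* by 2%, hence -0.02
rrrs = [0.25, 0.28, -0.02]

print(similar_frequency(rates))  # True: passes criterion 2
print(similar_effect(rrrs))      # False: fails criterion 3
```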

Overcoming Probability Inflation

Benjamin Roman, MD, MSPH wrote a wonderful piece in this week's New England Journal of Medicine. It might not get read much because it is listed far down the table of contents, but I think it is more clinically important than any other piece in the journal this week. He tells the story of his own sudden sensorineural hearing loss and of agreeing to an MRI even though the probability of a serious cause was low, the cost of the test was high, and the benefit of treatment was minimal (in fact, many patients don't need treatment). Furthermore, he is an ENT physician and knows all this, but he underwent testing anyway, mainly because his wife wanted him to!

He outlines an important problem in medicine for both physicians and patients: probability inflation.

This problem arises from the way we deal emotionally (added for emphasis) with risk and uncertainty, which are givens in health care, and the way we make decisions in the face of low-probability outcomes.

Emotions are a large part of the problem; this is the affect heuristic. When we make a decision, we consider it analytically but also from the standpoint of how we feel about it. If we have positive feelings about a situation, we magnify the probability of benefit or, conversely, minimize the magnitude of harm. Think about Dr. Roman's situation. He (or at least his wife) was worried about something bad happening (i.e. having an acoustic neuroma) but understood that was pretty unlikely. But what if he didn't do the MRI and he actually had a treatable tumor that would be missed? He (or at least his wife) had strong feelings about not missing an acoustic neuroma. Or maybe he would be relieved not to find one (that's a strong positive emotion, isn't it?) if the MRI was negative (assuming the sensitivity is good enough). Thus the probability of an acoustic neuroma becomes artificially inflated. He probably didn't even think about the downstream effects of finding one and the risks of surgery or radiation (which, if I had to guess, probably outweigh the benefits of finding it).

Many of us fear the uncertainty almost more than the disease itself. We want to know, even if we can't act on the information. We also like doing something; at least we will go down fighting. This affects both physicians and patients: we order things we shouldn't, and patients request things they shouldn't. Sometimes it's because of poor reasoning skills; the affect heuristic gets us. Sometimes it's more practical, as Dr. Roman notes:

"My doctor's recommendation was based on a similar reaction. Besides wanting to reassure himself and his patients that there is no acoustic neuroma, he told me, another reason he suggests MRIs in situations like mine is that he fears being sued should he fail to order one and end up missing something. He noted that court malpractice awards for missed acoustic neuromas commonly reach into the millions of dollars and that until we agree to an acceptable miss rate and physicians are no longer liable for missing just a single such case, their practices will not change. I'm not sure how common such verdicts are, but this rationale also reflects risk aversion in the face of a low-probability bad event — it's simply the doctor's risk that's at issue, rather than the patient's. (emphasis added)"

That last statement is telling. It’s a shame so much of medicine revolves around covering our proverbial asses.

Dr. Roman offers some solutions:

  1. comparative effectiveness and outcomes research (this exists for many things but gets ignored)
  2. educating doctors about how to discuss uncertainty, risk, and probability (first, doctors need to be taught these principles before they can teach anyone else; I see firsthand on a daily basis how little of this is understood)
  3. addressing the emotions and psychology of patients and physicians (good luck dealing with emotions… anyone have a teenage child?)
  4. nudging each other to do the right thing
    • consumers sharing the cost of marginal things they want (a good idea for sure)
    • government (local or national) regulation (Hell no! More bureaucracy is not needed and will only raise costs even more)

As Dr. Roman points out, all of these need to be done, but the devil is in the details: HOW? These solutions take a society or community perspective, while physicians mainly feel a duty to one individual: the one sitting in front of them. That relationship is powerful and affects decision making.

My dad had advanced dementia and fell in his bathroom, suffering a tibial plateau fracture. The surgeon wanted to fix it surgically, as this would give my dad the best chance to walk (though he couldn't actually tell me the probability). The only other option was splinting and rehab. Thankfully, I know enough about dementia, and specifically my dad's dementia, to know he would never be able to participate in rehab, keep the wound clean, or stay off his leg until it healed. I decided against the surgery and opted for rehab and splinting. My dad never walked again; he couldn't understand how to do rehab or use a walker. I believe I made the right decision, because I think the ultimate outcome would have been the same either way: not walking. But I have no way of knowing; it was a decision under uncertainty. I saved his insurance and Medicare a lot of money, though that wasn't my goal. My goal was to maximize outcomes in the most resource-sensitive way that would harm my dad the least, and I felt surgery would be more harmful than no surgery. Should the surgeon have even offered surgery? Should he have just said that splinting was best for someone like my dad with advanced dementia? When he offered surgery, did he really think it would help, or was it because he was a surgeon and that's what surgeons do?

Like all complex problems, the solutions are equally complex, if not more so. I will continue to do my small part, educating whomever I can on EBM principles, and hopefully a few of my learners will make good decisions.