EBM is just not a priority in medical education

When I reflect on what I do each day as a physician it occurs to me that I use EBM skills very commonly. Here is a sampling:

  • I think about and assess pretest probability a lot
  • I think about choosing appropriate tests a lot
  • I apply information from studies a lot. I weigh risks and benefits of therapies. I think about patient context. I try to incorporate patient values and desires as much as possible.
  • I search for information following Haynes’ 6S approach
  • I critically appraise primary studies and systematic reviews each week (not daily)
  • I make calculations because studies don’t always put information in the format I want
  • I have discussions with patients about the above issues

I am sure I am missing a lot of what I do that falls under “EBM”. I am revamping an introductory EBM course that I teach to 2nd-year medical students for the upcoming semester. It has been relegated to “just teach them enough to get a good score on Step 1”. Thankfully, I have a fuller online version that they will take during their scholarly time in the 3rd year, so all is not lost. To make myself feel better, I view the crash course I am teaching them this upcoming semester as scaffolding so that they can better understand my full online course. You can look at and use the materials I will use in the crash course in the tab above labeled “Online Teaching Resources” (I just realized I still have to add a few items that the students will use).

We spend so much time in the 1st 2 years of medical school teaching about things that I honestly never ever use but yet what I use daily gets short shrift. Why is that? Are EBM skills not important? Is it assumed they are easy to develop later in one’s career on one’s own (they aren’t)? Is it just kicking the can down the road assuming in residency these skills will be learned? Or during the clerkships?

I for one wish none of this material was on Step 1. I think it’s too early. Furthermore, I am so sick of my course evaluations including statements like “Taught too much stuff that wasn’t on step 1”. I think you need some clinical knowledge to really learn EBM but, more importantly, to understand its importance. EBM-type questions should get greater prominence on Step 2 and even more prominence on Step 3. Having only one or 2 questions reinforces the perceived lack of importance of EBM. EBM should have just as many questions as any of the specialties, and each test should have more questions to reinforce that these skills are important and will be used. Maybe Santa will grant me that wish one of these years. (I am keeping my fingers crossed that I get onto the NBME committee that writes the EBM questions. Maybe I can convince them of my plan.)

 

 

Tarnished Gold Chapter 4: Beating The Odds

Finally, a chapter I somewhat agree with.

This chapter discussed the difficulties in understanding probability. The examples they use aren’t good analogies for clinical probabilities but are interesting nonetheless.

Picture of a quote: its all relative

From QuoteAddicts.com

I’ll focus on what I agree with for this post. They discuss the misleading nature of reporting relative risks (and relative risk reductions) in research reports. This is a real problem because clinicians often don’t understand that, while the relative risk or benefit of an intervention is fairly constant across patient subgroups, the absolute benefit isn’t. In general, if something is beneficial, the sicker you are the more benefit you gain. For example, let’s say a treatment has a relative risk reduction for death in the next year of 75% (RR of 0.25) and we are seeing 2 patients. One has a risk (or probability) of death of 50% without the intervention and the other has a risk of death of 10%. If patient one is given the treatment her risk is reduced from 50% to 12.5% (to see how I did this watch this video). If patient two is given the treatment his risk is reduced from 10% to 2.5%. So the absolute benefit is greater for patient one (37.5%) than for patient two (7.5%) even though the relative benefit is the same (75%). This is often a difficult concept for physicians to understand but, once mastered, it is a useful way to discuss the benefits and harms of a proposed intervention with patients. Furthermore, it’s patient specific. To get the probability of an outcome for an individual patient you could use a validated clinical prediction rule, the placebo rate from a trial, the results from studies of disease frequency (though these are rare) or, as a last-ditch effort, guesstimation.
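The arithmetic in the two-patient example above can be sketched in a few lines of Python. The function name `absolute_effect` is just something I made up for illustration, and the number needed to treat (1 divided by the absolute risk reduction) is an extra I added that the example above doesn’t mention.

```python
def absolute_effect(baseline_risk, relative_risk):
    """Return (treated_risk, absolute_risk_reduction, number_needed_to_treat).

    baseline_risk: the patient's risk of the outcome without treatment (0 to 1)
    relative_risk: the RR from the study (0.25 means a 75% relative risk reduction)
    """
    treated_risk = baseline_risk * relative_risk
    arr = baseline_risk - treated_risk
    # NNT is only meaningful when the treatment actually lowers risk.
    nnt = 1 / arr if arr > 0 else float("inf")
    return treated_risk, arr, nnt


# Same RR of 0.25 applied to the two patients from the example.
for label, baseline in [("patient one", 0.50), ("patient two", 0.10)]:
    treated, arr, nnt = absolute_effect(baseline, relative_risk=0.25)
    print(f"{label}: {baseline:.1%} -> {treated:.1%}, ARR {arr:.1%}, NNT {nnt:.1f}")
```

Running it reproduces the 37.5% and 7.5% absolute benefits: same relative effect, very different absolute effect.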

Tarnished Gold Chapter 3: Prove It

This chapter dealt with the issues of what constitutes evidence. Instead of focusing on their views I will focus on my views of evidence.

A common criticism of EBM is that it is very strict in what it considers acceptable evidence and that it doesn’t consider clinical experience and pathophysiological rationale as important. Early EBM did focus too much on the RCT and Cochrane systematic reviews but this has changed. The current EBM paradigm focuses on multifactorial “evidence” including the patient’s clinical state and circumstances, clinical experience, and the best available evidence. Sometimes this will be a systematic review but often it will just be patient experience (what worked or didn’t work for them in the past) or pathophysiology. The early EBM paradigm cautioned us that we can be misled by our unsystematic observations and the pathophysiological rationale. For the latter, it’s because our understanding of pathophysiology changes, diseases are complex and multifactorial, and the interventions we study tend to be unifactorial. Nonetheless, clinical experience is evidence and is very important and no EBMer will say otherwise. Understanding pathophysiology is important and no EBMer will say otherwise. The key is to understand the limitations of any evidence source.

Evidence supports a belief and doesn’t have to be true. In clinical medicine we can never know the truth. We can only try to estimate the truth with a study because we can’t study every person with a given disease. We have to infer a lot. We generalize from a sample in a study to a whole population and back down to an individual patient. The authors of Tarnished Gold have a real problem with this paradigm but it’s what we do in clinical medicine. Bench research works differently. Rats can all be genetically and phenotypically the same. Bacteria can all be clones of each other. Bench scientists can study a whole population of something and declare an effect. We can’t do this in clinical medicine because we are all so heterogeneous and have free will.

EBM no longer worships only the RCT and the Cochrane review. Patient inputs are viewed as very important and slowly becoming equally important. Qualitative studies are gaining importance. Clinical experience will always be prominent in deciding what should be done from what could be done.

Tarnished Gold Chapter 2: Populations are not people

First off, the authors state that decision sciences do not relate to EBM. They feel decisions are personal and statistical information is not important. They give the example of organ transplantation. Unfortunately, they skip an important step in their argument: knowing that an organ transplant will be of benefit depends on studies showing that transplants prolong life, and those studies are based on statistical information.

They argue that EBM is based on a statistical blunder: the ecological fallacy. There is some merit to this argument. The average finding applies to the average patient. What if your patient isn’t average? There are a few options. First, you could calculate your patient’s estimate of benefit (or risk) using the results from the study like I demonstrate in this video. Second, almost every study report will include a confidence interval around the point estimate of benefit (or harm). The point estimate is the best guess about the findings of the study but there is uncertainty, and the confidence interval helps quantify that uncertainty. You could use the upper and lower bounds of the confidence interval and decide whether the interval includes a clinically important benefit. Finally, you could look for a subgroup analysis (yes, I recognize the limitations of this) of a group of patients similar to yours. Despite all this, science is based on inference. We can never measure the effect of an intervention in all people. We often use inductive and deductive reasoning in science.
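Here is a rough sketch of the confidence interval option: apply the point estimate and both bounds of a relative risk to your patient’s baseline risk and see whether even the least favorable bound clears the smallest benefit you would consider worthwhile. Every number below is made up for illustration, the function names are mine, and the sketch assumes the outcome is a bad one (so an RR below 1 is a benefit).

```python
def arr_range(baseline_risk, rr_point, rr_lower, rr_upper):
    """Absolute risk reduction at the point estimate and at each CI bound of the RR."""
    def arr(rr):
        return baseline_risk * (1 - rr)

    # The lower RR bound is the most optimistic effect; the upper bound is the least.
    return {
        "best_case": arr(rr_lower),
        "point": arr(rr_point),
        "worst_case": arr(rr_upper),
    }


def clinically_important(arr_values, threshold):
    """True only if even the worst-case ARR meets the chosen threshold."""
    return arr_values["worst_case"] >= threshold


# Hypothetical study result: RR 0.75 (95% CI 0.60 to 0.95).
# Hypothetical patient: 20% baseline risk, smallest worthwhile benefit of 2%.
result = arr_range(0.20, rr_point=0.75, rr_lower=0.60, rr_upper=0.95)
print(result)
print("Worst case still worthwhile?", clinically_important(result, threshold=0.02))
```

With these invented numbers the point estimate looks worthwhile but the worst-case bound does not, which is exactly the kind of uncertainty worth discussing with the patient.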

The authors spent several pages discussing pattern recognition in medicine and arguing that EBM doesn’t help with it. This is both true and false. It is true in that we are taught how certain things look and there will never be a study related to that. We do, though, have numerous studies of how good elements of the history and PE are for diagnosing disease. Many of these are pattern recognition. We learn that peripheral edema, orthopnea, PND, and DOE most likely mean congestive heart failure. That is pattern recognition, but there is also a study that examines how good each of these components is at increasing or decreasing the probability of CHF. Thus, pattern recognition is informed by EBM.
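The way a finding raises or lowers the probability of disease is usually expressed as a likelihood ratio. Below is a minimal sketch of the standard odds conversion. The pretest probability and likelihood ratios are invented for illustration and are not taken from any CHF study.

```python
def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability using a likelihood ratio."""
    pretest_odds = pretest_probability / (1 - pretest_probability)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical: 30% pretest probability of CHF.
pretest = 0.30
print(f"finding present (LR 2.0): {post_test_probability(pretest, 2.0):.1%}")
print(f"finding absent  (LR 0.5): {post_test_probability(pretest, 0.5):.1%}")
```

A likelihood ratio of 1 leaves the probability unchanged, which is a quick way to sanity check the function.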

There are more claims to be refuted in this chapter but these are the main ones worth refuting.

 

Tarnished Gold Chapter 1: Evidence-based Medicine

This is going to be a lot harder than I thought. I question why I am even wasting my time reading this tripe but I will plod forward so that there is a counterargument to this work. I also need to understand criticisms of this paradigm so that the paradigm can be improved.

Importantly, the authors focus on an outdated definition of EBM. This definition was the first iteration of the definition and is oft-quoted but it is out of date nonetheless.

Evidence based medicine (EBM) is the conscientious, explicit, judicious use of the current, best evidence in making decisions about the care of individual patients.

The current paradigm of EBM was published in 2002 well before this book was published and should have been included in this book. Hickey and Roberts claim it had its origins in the legal system which is total BS. If you read the early EBM papers there is no mention of the legal system driving this paradigm.  I also consider EBM to just be a set of skills (searching, critical appraisal, application) to use in the care of patients.

 

 

They focus on and have problems with 2 words in the above definition: best and evidence. They are concerned that best leads to selection of evidence and that “one bit of evidence is better than another”. Of course some evidence is better than others. Empirical studies (not done by the evil drug companies) have demonstrated that certain design flaws, for example lack of blinding, lead to overestimation of effects. Studies have also demonstrated that observational study designs can overestimate effects and even give opposite effects to randomized trials (see the HRT saga). I’m sure they will argue later in the book that all these types of studies are rigged and randomized trials are rigged (probably because their holy grail, Vitamin C, failed in controlled trials to be useful). There are too many studies showing similar effects to discount the evidence that supports the fact that some studies are better than others.

They claim “EBM’s evidence does not mean scientific information or data, but refers to legal justification”. First off, EBM does not possess evidence so the apostrophe is misplaced. Second, this statement doesn’t even possess any face validity. Journals are full of scientific information and data. What are they talking about?

They claim “EBM has little to offer the doctor treating a patient, beyond suggestions about what might be expected with an average patient”. Studies used to inform practice usually are based on a sampling of patients because we can’t study every single person with that problem. Sampling can be done to reflect a broad range of people with a given problem or it can be done to select for certain subpopulations of disease (for example, advanced disease or early disease). On average, most people are average. So their statement isn’t totally without merit. We can’t do studies on every type of patient. But here is where the current paradigm helps us. We (as doctors) take into account the patient’s state and circumstances when applying the best available evidence to their case. We use our clinical training and experience to decide what we should do from what we could do. There are ways to adapt study data to an individual patient like I demonstrate in this video. N-of-1 trials can also be done on individual patients to see if a therapy is effective (more on this in another post).

Finally, (though there is a lot more I could comment on) they have problems with using statistics to analyze data. As I mention above, in medicine we can only sample a small percentage of those with disease. If we could study everyone we wouldn’t need statistics but since we can’t we use statistics on a sample to try to generalize it to the whole population. I don’t know of any other way to do this because we simply can’t study everyone. (I recognize this is a gross simplification of what statistics do and not totally accurate.)

The next chapter I’ll critique is entitled “Populations are not People”. Stay tuned…

 

 

 

A Cartoon About Blinding: Using New Tools Can Be Fun

I had to make a few slides about blinding and decided a cartoon might be fun to make and a graphic way to display the information. As I am getting a degree in educational technology I have a proclivity to try new tools. I found an article on free cartoon making tools and decided to give one a try. It was intuitive and had reasonable features. I had initially planned to draw my own characters and put masks on them but in the interest of time just used the characters already in the program.

Try using new tools when you can and the situation fits. It can be fun and interesting. Always remember that the tool you use should facilitate learning and not just be used because it’s cool. I felt a graphic would help learners understand blinding more than just a word description.

What do you think?

A comic strip with 3 panels showing single blind (patient only), double blind (patient and researchers) and triple blind (patients, researchers, and statisticians).

How to decide when a composite endpoint should go into the compost

Composite endpoints are commonly used in studies. A composite endpoint is an endpoint composed of several other endpoints. If a patient experiences any one of them they are considered to have experienced the endpoint of the trial. For example, a composite endpoint in a typical cardiovascular study includes nonfatal MI, nonfatal stroke and cardiovascular death. A patient doesn’t have to have all three, just one of them.

Why use composite endpoints? The main reason is to reduce the number of patients needed in the study. The chance of a patient having one particular outcome is much lower than the chance of having any one of three outcomes. They are also used to potentially reduce the length of follow-up needed in a study. A patient is likely to develop one of three outcomes more quickly than any single outcome, and one component of the composite can occur sooner than another (e.g. doubling of serum creatinine vs. initiation of hemodialysis).

Not all composites are created equal. Some are good and many are poorly developed. Examine the composite outcome below from the RENAAL trial published in the NEJM in 2001. The primary efficacy measure was the time to the first event of the composite end point of a doubling of the serum creatinine concentration, end-stage renal disease, or death. What do you think? Is this a good composite or a poor composite? (Note: I put a red mark next to the components of the composite)

From the RENAAL trial, NEJM 2001

I think this is a poorly designed composite. Why do I say that? A good composite should have the following characteristics:

  1. Each component should be valued equally by patients,
  2. Each component should occur with similar frequency, and
  3. The intervention should have the same relative effect on each component.

With this in mind, reevaluate the RENAAL composite endpoint. Hopefully you agree with me that it’s not a good composite endpoint. Let’s examine it more closely.

Issue #1: would patients consider each of the components to be of equal value? Patients would not consider death and doubling of serum creatinine as being equal. Clearly they would value death as a much worse outcome. So this composite fails here.

Issue #2: does each of the components of the composite occur with similar frequency? Looking at the percentages of the components in the losartan group, they are pretty close to each other (21.6%, 19.6%, and 21%) so I would give the composite a pass on this criterion.

Issue #3: does the intervention (losartan) have an equal effect on each of the components of the composite? Look under the risk reduction column and the answer is no. Doubling of serum creatinine is reduced by 25% and end stage renal disease by 28% but death is actually increased by 2%. Thus, the composite fails on this criterion.

What should you do if the composite endpoint is a bad composite? Just ignore the composite and look at the individual components. Even if a composite is a good one you should always examine the individual components. So in this case losartan reduces the risk of ESRD and doubling of serum creatinine but has no effect on mortality.
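For the two criteria that can be read off a results table (similar frequency and consistent effect), the check can be sketched in code. The rates and risk reductions below are the RENAAL figures quoted above. The function names and the cutoffs (`max_ratio`, `max_spread`) are my own arbitrary assumptions, not established thresholds. Criterion #1, how patients value each component, is a judgment call that I haven’t tried to compute.

```python
# name: (event rate in the losartan group, relative risk reduction), as quoted in the post
components = {
    "doubling of serum creatinine": (0.216, 0.25),
    "end-stage renal disease": (0.196, 0.28),
    "death": (0.21, -0.02),  # negative reduction = risk increased
}


def similar_frequency(components, max_ratio=2.0):
    """Criterion 2: the most common component is at most max_ratio times the rarest."""
    rates = [rate for rate, _ in components.values()]
    return max(rates) / min(rates) <= max_ratio


def consistent_effect(components, max_spread=0.10):
    """Criterion 3: all effects point the same way and are within max_spread of each other."""
    rrrs = [rrr for _, rrr in components.values()]
    same_direction = all(r > 0 for r in rrrs) or all(r < 0 for r in rrrs)
    return same_direction and (max(rrrs) - min(rrrs)) <= max_spread


print("Similar frequency:", similar_frequency(components))
print("Consistent effect:", consistent_effect(components))
```

With these numbers the composite passes on frequency and fails on consistency of effect, matching the assessment above.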

What is EBM?

With all the discussion of EBM in crisis and EBM on trial, it strikes me that maybe these other folks have a different definition or concept of EBM than I do. I think any discussion needs to start from common ground about just what EBM is.

Evidence based medicine is the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients.

This is the original definition of EBM published in 1996. It urged us to strive to use best available evidence in making clinical decisions. It also cautioned not to be a slave to evidence as evidence was often not applicable to individual patients. This definition served us well until the patient-centered paradigm of care became popular and the definition of EBM evolved to its current form:

This definition is more explicit about the order of importance of the components of EBM: patient preferences and actions are foremost, followed by the clinical state and circumstances and the research evidence. All this is tempered or tied together by our clinical expertise. The evidence tells us what could be done while the rest tells us what should be done.

The other way to look at EBM is that it is just a set of skills:

  1. asking an answerable clinical question
  2. finding the best available evidence
  3. critically appraising the evidence
  4. applying the evidence to individual patients
  5. appraising how well you did on each step and, I think, appraising the impact on a patient

So from this background I find it difficult to lay blame on EBM for many of the problems with the evidence. I blogged on this previously and will refute their claims at EvidenceLive2015 in April.

Will EBM be found guilty or not guilty?

Carl Heneghan recently wrote a blog for BMJ blogs entitled Evidence based medicine on trial, focusing mostly on the problem with the evidence part of EBM. While I mostly concur with his list of the problems (distortion of the research agenda, very poor quality research, and lack of transparency for published evidence), I wonder who is at fault. “EBM” seems to get the blame as if there is an entity that is EBM and it controls all research. EBM is but a set of skills: question asking, searching, critical appraisal, and application to individual patients. It is nothing more. So why are people being so critical and placing so much blame on a set of skills? There will be several sessions at EvidenceLive 2015 (one of which I will be speaking at in defense of EBM) on real vs rubbish EBM.

I want to focus on the distortion of the research agenda. Professor Heneghan rightly points out that the research agenda is driven by industry. Is that good or bad? I think it’s both but mostly good. The only other major funders of research are governmental agencies like the NIH. Profit drives innovation. It is very expensive to bring a drug to market. The government could not afford to bring to market the drugs we currently have and need. One failed drug alone would deplete the coffers. Failure is the biggest driver of cost. Fewer than 1 in 10 drugs tested makes it to market. Would we tolerate that poor a success rate at such a big cost to the government? No.

…adjusting that estimate for current failure rates results in an estimate of $4 billion in research dollars spent for every drug that is approved.

I agree that industry seems at times to make a drug then find a “disease” for it. I think the example Professor Heneghan gives is spot on. I don’t believe in adult ADHD but we have drugs for it. Do we need them? No, and this video demonstrates why: Drug free treatment of ADHD. The ones really at fault are the doctors who prescribe the drugs that Professor Heneghan feels aren’t necessary, not the companies that make them.

On a serious note…what about all the devices we use regularly like stents, defibrillators, etc? Would government have independently brought these to market? Likely not. We had balloon angioplasty (without stenting) that worked just fine albeit short term only. It would have been “good enough for government work” as the saying goes. What about advancements in imaging modalities? Again likely not. The old CT scanners worked just fine. Industry is largely responsible for innovation and improvement in all walks of life. Yes for a profit but profit is not a bad thing. Those who say otherwise please return your iPhones.

Evidence Based Medicine Is Not In Crisis! Part 4

I’ve left the hardest issue to deal with for last: “Overemphasis on following algorithmic rules”. This has been the most frustrating aspect of my primary care practice. Patients quit being viewed as patients and became a set of goals that I had to achieve to be smiled upon fondly by my boss as being “a good doctor”. It took me some time to finally quit playing the game and just do the best I could do, and whatever the numbers were, so be it.

Algorithmic medicine couldn’t be any more antithetical to EBM. Everyone is viewed the same. EBM clearly, as I have argued in the last three posts, is about individual patient values and circumstances. It’s about clinical experience tempering what we could do into what we should do. Algorithmic medicine allows no individuality. No tempering. Thus to claim EBM is in crisis because of algorithmic medicine is wrong. True EBM protects us from the harms of algorithmic medicine.

Interestingly, computerized decision support systems (mentioned as a culprit in the first sentence of this section of Greenhalgh’s paper) are at the top of Haynes’ 6S hierarchy of preappraised evidence.

“In these computerized decision support systems (CDSSs), detailed individual patient data are entered into a computer program and matched to programs or algorithms in a computerized knowledge base, resulting in the generation of patient-specific assessments or recommendations for clinicians” – Brian Haynes

At the VA we have a moderately sophisticated CDSS. It warns me if my patient with heart failure is not taking an ACE inhibitor, and it’s smart enough that if I enter an allergy to ACE inhibitors it won’t prompt me to order one. If I tell it that a patient has limited life expectancy it will not prompt me to pursue certain routine health screenings. Thus, I don’t view CDSSs as problematic in and of themselves. The problem arises when physicians don’t consider the whole patient (remember those values and clinical circumstances) in deciding whether or not to follow prompted recommendations.
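To make the idea concrete, here is a toy sketch of the kind of rules described above. It is purely hypothetical: the `Patient` fields, the `reminders` function, and the prompt wording are invented for illustration and say nothing about how the VA system is actually built.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # All fields are hypothetical stand-ins for data a real CDSS would pull from the chart.
    diagnoses: set = field(default_factory=set)
    medication_classes: set = field(default_factory=set)
    allergies: set = field(default_factory=set)
    limited_life_expectancy: bool = False


def reminders(patient):
    """Return the list of prompts this toy rule set would show the clinician."""
    prompts = []
    # Prompt for an ACE inhibitor in heart failure unless already on one or allergic.
    if (
        "heart failure" in patient.diagnoses
        and "ACE inhibitor" not in patient.medication_classes
        and "ACE inhibitor" not in patient.allergies
    ):
        prompts.append("Consider an ACE inhibitor for heart failure")
    # Suppress routine screening prompts when life expectancy is limited.
    if not patient.limited_life_expectancy:
        prompts.append("Routine health screening due")
    return prompts


allergic = Patient(diagnoses={"heart failure"}, allergies={"ACE inhibitor"})
print(reminders(allergic))
```

Even in this toy version, the rules only know what has been entered. Whether to act on a prompt still depends on the patient’s values and circumstances, which is the clinician’s job.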

Greenhalgh has made great points about what happens when good ideas are hijacked and distorted for secondary gain but EBM is not to blame. Victor Montori (@VMontori) said it best in a Tweet to me:

“EBM principles are not in crisis, but corruption of healthcare has oft hidden behind the e-b moniker. EBM helps uncover it.”