A response to criticisms of intention-to-treat analysis

I received an email about an introductory video I made for medical students about intention-to-treat (ITT) analysis. The author of the email also made a comment about the video on YouTube. The points he makes are valid and he expands on them in this manuscript. I will defend, if you will, the concept of ITT analysis.

First, ITT analysis refers to analyzing all patients based on the arm of a study they are randomized to no matter what happens to them…”Once randomized, always analyzed”. Even if they don’t take the intervention or become non-compliant their outcomes count against the group to which they were randomized. It is considered the primary analysis for superiority studies (where we want to prove an intervention is better than something else).

Important background disclaimer: ITT analysis is just one part of the methodology of a randomized controlled trial. (The other parts being concealed allocation, randomization, blinding, equal cointerventions, etc.) There is a tendency of critics of ITT analysis to think of it in isolation and I think that is a mistake. It is just part of an overall plan for conducting a study.

Randomization is considered the gold standard methodology for allocation in therapy studies. It is done for a couple of main reasons: 1) to equalize both known and unknown prognostic or confounding factors and 2) to prevent selection bias or cherry picking of patients to one arm of the study or another based on prognosis. In general, ITT analysis helps preserve the effect of randomization. If you start pulling out people differentially from each arm of a study, the remaining people (to be analyzed) are likely not balanced prognostically. Yes, you can adjust for disparities between groups, but only those that are measured. You cannot adjust for unmeasured confounders. Randomization balances even unmeasured confounders (assuming sample size is large enough). So it’s important to preserve randomization.

Dr. Feinman points out in his comment that “In the experiment, people sensibly want to know what is the effect of surgery compared to aspirin. Instead, ITT answers instead what is the effect of TELLING PEOPLE to have surgery vs. taking aspirin. Is that really what you want to know?” This is true but this is what we do in practice. I recommend a strategy to a patient and they will either follow it or they won’t. So maybe it’s not the ITT analysis that is the problem but the question of the study. Another piece to this is that a per-protocol analysis can (and probably should) be done after the ITT analysis to see the effect in those who actually got the therapy. The results of both analyses should be reported so the reader can better understand the results. Data on the demographics, comorbidities, etc. of patients who dropped out or were noncompliant also need to be reported so we can better understand potential reasons for dropouts or noncompliance.

I mention in the video that ITT analysis is more conservative in that it is less likely to lead you to falsely reject the null hypothesis and falsely conclude that the intervention is effective. This is because those in the intervention group who don’t get the intervention are actually like the control group, which leads to more similar event rates and more difficulty in finding a significant difference. So using ITT analysis somewhat depends on the risk to the patient of falsely saying the intervention is effective (when it isn’t) or falsely saying it isn’t effective (when it is). Conversely, per-protocol analysis is less conservative in superiority studies.
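To make the dilution argument concrete, here is a minimal sketch in Python. The risks and adherence rate are made-up numbers, not data from any trial, and the function name is my own. It also assumes that people who skip the intervention have exactly the control group's risk and that skipping is unrelated to prognosis, which is a big assumption in real studies.

```python
def itt_vs_per_protocol(control_risk, treated_risk, adherence):
    """Return expected risk differences (control arm minus intervention arm).

    Assumption (hypothetical): patients in the intervention arm who do not
    take the intervention have the same event risk as the control arm.
    """
    # ITT: everyone randomized to the intervention arm is analyzed in it,
    # so the arm's risk is a mix of adherers and non-adherers.
    itt_arm_risk = adherence * treated_risk + (1 - adherence) * control_risk
    itt_difference = control_risk - itt_arm_risk

    # Per-protocol: only those who actually took the intervention are analyzed.
    per_protocol_difference = control_risk - treated_risk
    return itt_difference, per_protocol_difference


# Hypothetical trial: 20% event risk on control, 10% if the intervention
# is actually taken, and 70% of the intervention arm adheres.
itt, pp = itt_vs_per_protocol(control_risk=0.20, treated_risk=0.10, adherence=0.70)
print(f"ITT risk difference:          {itt:.3f}")
print(f"Per-protocol risk difference: {pp:.3f}")
```

Under these assumptions the ITT difference is smaller than the per-protocol difference, which is the sense in which ITT is the more conservative analysis in a superiority study.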

I’m not sure what I’ve written here will change anyone’s mind. I think a compromise is to use both ITT analysis and per-protocol analysis and report both results. Hopefully, they are similar or at least qualitatively similar.

What do you say about this controversy?

Academic promotion and salary structure is part of the problem with research

There has been an interesting string on a listserv I am a member of about EBM being hijacked, the poor quality of research, and how much research goes unpublished. Below are two responses that got me thinking about something:

I am not impressed by the recent declaration in BMJ just because it is about the bad studies and the need of new EBM. What new? The authors of the declaration published calls for new EBM almost annually.  And most the time the calls are to go beyond evidence, and to do good, not bad.

Yes, too many research are poorly designed and executed. Is it new finding? Was ever EBM quiet about the bad evidence?

“today’s world …so financially driven” – was it ever not like that? Can one name the Golden Age?

If not kill your self, one need to live in the real world and try to make it better. EBM is specifically about it: critical appraisal and education (in wide sense). It is about use of the best available evidence for the good of the patients. Best available, sic. To influence research, to improve research, to clean the publication practice is a good thing, but it is beyond EBM in pure sense.

and…

From the link, ” Too many research studies are poorly designed or executed. Too much of the resulting research evidence is withheld or disseminated piecemeal”. In short, medical research is not implemented correctly for a number of reasons. So my question is should we scrap medical research? OR RCT’s which are supposed to be the gold standard for trials should be dropped from research?

For me the most logical answer is better implementation. So the debate should how we can do this in today’s world which is so financially driven. In other words, the debate/discussions should not be about research or RCT’s are flawed and hence should be replaced.

Also, please don’t blame EBM for shoddy research practices. EBM has no control on what researchers like to do or don’t like to do. EBM can only make recommendations on how to do better research and interpret research.

My mini-epiphany while reading this string was that academic medicine is partly to blame for some of the problems in the evidence base. In academics you have to garner grants to pay for yourself (or at least part of your salary) and you have to publish to get promoted. So we have a system that rewards quantity over quality. Lots of research is done because we have to do it to get promoted. But how much of it is worthwhile research? Not much. Many academics aren’t trained in proper research technique so they perform lower quality studies.

So one fix to “EBM” could be to restructure how we finance academics. We need to quit focusing on grants (especially considering governmental funding of these goes down every year) for compensation. Teaching (and yes medical schools need to actually pay adequately for the teaching we do) and clinical activities should be the primary funders of academics. Then research could be done on important things and by properly trained researchers.

 

What can we do to reduce the amount of research that goes unpublished?

An excellent piece was published by Paul Glasziou and Iain Chalmers about the large percentage of research that goes unpublished. As they note, the estimate that 50% goes unpublished is likely an underestimate. Unfortunately they didn’t offer any significant solutions other than we need a “better understanding of the causes of, and cures for, non-publication.”

A “simple” solution is for the drug/device approval process to require all studies related to that product and/or conducted/funded by the requesting company be registered and published. This would miss studies done after drug/device approval or done by independent parties but a large number of nonpublished studies are conducted or funded by the companies that market the drug/device. This would also miss all the other studies not directly related to drug/devices (e.g. epidemiological studies).

Another significant challenge is where to publish this information. The web makes the most sense as this is the cheapest route of publication. Maybe the FDA (or some international commission) could have a page or pages on each drug/device that include full-text access to all studies done on that drug/device. Would these need peer and editorial review? Yes, but that is a daunting task as we already struggle to find willing and competent peer reviewers. FDA budgets shrink repeatedly and this would be a significant financial burden.

What I really wanted to do in this post was to give my thoughts on a  question raised by Jon Brassey (Director of the TRIP Database):

  • What is better a large RCT or a SR based on a “biased subsample”?

Is a large RCT more desirable than a systematic review (SR) based on a biased subsample of studies? This has been a conundrum for some time. You can argue both sides of this. The reason he says biased subsample is that we know more positive studies get published than negative, larger effects get published more than small effects, etc. Is the answer to this question “it depends”? It depends on your goals: a more precise estimate of the biased effect (favors SR), more generalizability (favors SR), a potentially more methodologically sound result (favors RCT). What is interesting to consider is that the same study repeated over and over will result in a distribution of results (this is why it shouldn’t surprise us that when we do seemingly the same study we don’t get the exact same result). Should we repeat studies? When should we stop repeating the studies (i.e. when have we adequately defined the distribution of results)?
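The point about a distribution of results, and about a biased subsample of it, can be sketched with a small simulation in Python. Everything here is hypothetical: the true risks, the trial size, the number of repeats, and especially the crude "publication" rule (only trials whose observed benefit exceeds a cutoff get published), which is just a stand-in for publication bias.

```python
import random
import statistics


def simulate_trial(rng, n_per_arm, control_risk, treated_risk):
    """Simulate one two-arm trial; return the observed risk difference."""
    control_events = sum(rng.random() < control_risk for _ in range(n_per_arm))
    treated_events = sum(rng.random() < treated_risk for _ in range(n_per_arm))
    return control_events / n_per_arm - treated_events / n_per_arm


rng = random.Random(2014)  # fixed seed so the run is repeatable

# Repeat "the same study" 200 times; the true risk difference is 0.05.
differences = [simulate_trial(rng, 500, 0.20, 0.15) for _ in range(200)]

# Hypothetical publication filter: only impressive-looking trials appear.
cutoff = 0.05
published = [d for d in differences if d > cutoff]

print(f"All trials:       mean {statistics.mean(differences):.3f}, "
      f"range {min(differences):.3f} to {max(differences):.3f}")
print(f"Published subset: mean {statistics.mean(published):.3f} "
      f"({len(published)} of {len(differences)} trials)")
```

Identical trials scatter around the true effect, and pooling only the "published" ones gives a precise-looking estimate that sits above the truth.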

I don’t think we can really answer this question as both of these study types have limitations. But if I had to pick one, I would rather have a large RCT that is well done than a SR based on a limited subset of the data, especially considering we don’t know what is missing or the effect seen in those missing studies.

 

What to do when evidence has validity issues?

I often wonder how different clinicians (and EBM gurus) approach the dilemma of critically appraising an article only to find that it has one or more flaws. For example, a common flaw is lack of concealed allocation in a randomized controlled trial. Empirical studies show that the effects of experimental interventions are exaggerated by about 21% [ratio of odds ratios (ROR): 0.79, 95% CI: 0.66–0.95] when allocation concealment is unclear or inadequate (JAMA 1995;273:408-12).

So what should I do if the randomized trial doesn’t adequately conceal the allocation scheme? I could discard the study completely and look for another study. What if there isn’t another study? Should I ignore the data of an otherwise perfectly good study? I could use the study and adjust the findings down by 21% (see above for why) and if the effect of the intervention still crosses my clinically important threshold then I would implement the therapy. I could use the study as is and assume the flaw wasn’t important because the reviewers and editors didn’t think it was. This is foolish as many of them probably didn’t even recognize the flaw, nor would many of them understand its impact.

I don’t have the right answer but wonder what more learned people do. I personally adjust the findings down and determine if I still want to use the information. The problem with this approach is it assumes that in the particular study I am reviewing that the estimate of effect is in fact biased…something I can’t really know.
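Here is a rough sketch in Python of the arithmetic behind adjusting the findings down, using the ROR of 0.79 (95% CI 0.66 to 0.95) quoted earlier. Dividing the observed odds ratio by the ROR is one reasonable reading of "adjust down by 21%", not an established method. The observed odds ratio and the clinical threshold are invented for illustration.

```python
def adjust_odds_ratio(observed_or, ror=0.79):
    """Shift an observed odds ratio toward the null by a ratio of odds ratios.

    ROR = (OR in trials with inadequate concealment) /
          (OR in trials with adequate concealment),
    so dividing by the ROR estimates what an adequately concealed trial
    might have shown. Assumes an OR below 1 means benefit.
    """
    return observed_or / ror


observed = 0.70    # hypothetical odds ratio reported by the flawed trial
threshold = 0.90   # hypothetical: I act only if the OR is at or below this

adjusted = adjust_odds_ratio(observed)
# Repeat with the confidence limits of the ROR to see how uncertain this is.
adjusted_range = (adjust_odds_ratio(observed, 0.95),
                  adjust_odds_ratio(observed, 0.66))
still_important = adjusted <= threshold

print(f"Adjusted OR: {adjusted:.2f} "
      f"(range {adjusted_range[0]:.2f} to {adjusted_range[1]:.2f})")
print(f"Still crosses my threshold: {still_important}")
```

The range matters as much as the point estimate: using the limits of the ROR, the adjusted odds ratio in this made-up example runs from clearly beneficial to no benefit at all.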

What do you do?

Podcasting to help keep current

Intro image

I am taking a class on Multimedia as part of my Master of Educational Technology degree program. This week our assignment was to develop a podcast and I decided to make it EBM related (always make your work count twice). I used Audacity, a free audio editor and recorder, to create the podcast. There was a learning curve but I have it mostly figured out. In the past when I created all my YouTube videos I “lectured” off the top of my head. For this assignment I had to write a script first and read from it. This is much better than ad-libbing. I don’t have any verbal tics (like “uhs”) and my cadence is better. I suggest that if you do any recordings, even about things you know a lot about, you make a script and read it.

Medicine Review in a Few will be a podcast series in which I review what I consider important studies in Internal Medicine. Each episode will review one study and will last less than 10 minutes; hence the “in a few” portion of the title. I think it’s important to keep information that isn’t interactive and is only processed through one channel fairly short. I personally lose interest and focus with long podcasts. According to data from Stitcher.com the average listener abandons a podcast within 22 minutes.

In Episode 1 I review the ADJUST-PE study. I chose to begin my podcast series with this study because I recently used the information in this study to care for a patient. I wasn’t aware of the findings of this study until one of my residents brought it to my attention.  I plan to only review clinically useful studies and will comment on any methodological limitations of the studies that I think the average clinician wouldn’t recognize or know how that limitation impacts the study findings. I think podcasts are a good medium to review studies.

For now, the podcasts will only be posted here but if I keep up with this endeavor I’ll ultimately try to get them on iTunes.

The image I used above is from splitshire.com and requires no attribution. The music used in my podcast is royalty free from Looperman.com.

2014 in review

The WordPress.com stats helper monkeys prepared a 2014 annual report for this blog.

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 8,400 times in 2014. If it were a concert at Sydney Opera House, it would take about 3 sold-out performances for that many people to see it.

Click here to see the complete report.