In noon conference today I reviewed the good, the bad, and the ugly of the recently released ACC/AHA cholesterol treatment guidelines. Below is a YouTube video review of the guidelines. It will be interesting to see how cholesterol management evolves over the next few years. There are groups like the National Lipid Association who feel that removing the LDL goals from the new guideline was a mistake. Likewise, the European Society of Cardiology lipid guidelines recommend titrating statins to LDL targets. Conflicting guidelines are always a problem. In my next post I will address conflicting guidelines and what to think about when you see conflicting recommendations on seemingly the same topic.
Physicians mainly read the abstract of a journal article (JAMA 1999;281:1129). I must admit I am guilty of this also. Furthermore, I would bet that the most often read section of the entire article is the conclusions of the abstract. We are such a soundbite society.
I had always thought the literature showed how bad abstracts were…that they were often misleading compared to the body of the article. But I was wrong. A recent study published in BMJ EBM found that 53.3% of abstracts had a discrepancy compared to information in the body of the article. That sounds bad, doesn't it? But only one of them was clinically significant. Thus most of the discrepancies were not important enough to potentially cause patient harm or alter a clinical decision.
This is good news as effectively practicing EBM requires information at the point of care. Doctors don’t have time to read an entire article at the point of care for every question they have but they do have time to read an abstract. It’s good to know that structured abstracts (at least from the major journals that were reviewed in this study) can be relied upon for information. I especially like reading abstracts in evidence based journals like BMJ EBM or ACP Journal Club as even their titles give the clinical information you need.
I recently discovered a cool tool for teaching: TouchCast. A TouchCast is an interactive video, meaning there is a background video and things pop up that can be touched and opened.
I made a Touchcast on case-control studies. Check it out and see what I mean. Make sure you touch one of the YouTube videos or the web site that I put on the screen to see how it works.
I find this very exciting. I can make a background video that gives a 30,000 foot view of a topic and embed further materials (other videos, websites, etc) for those that want a deeper understanding.
So what are the limitations? For now, the interactivity only works when viewing a TouchCast via their app or their website. The videos can be uploaded to YouTube but the interactivity is lost. The length of the video is also limited to 5 or 6 minutes. This isn't a killer for me because educational videos should be short, and in this case I can embed hours of other videos if I wanted to. Finally, the other limitation (for now; this will change soon) is that it's an iPad-only tool. A desktop version is coming soon, and hopefully an Android app also.
TouchCast has really broken ground here. This should open up more advancements that will do even more. The future is exciting for us flipped classroom types.
Last week the hotly anticipated cholesterol treatment guidelines were released, and they are an improvement over the previous ATPIII guidelines. The new guidelines abandon LDL targets, focus on statins rather than add-on therapies (which don't help), and emphasize stroke prevention in addition to heart disease prevention.
The problem with the new guidelines is that they developed a new risk prediction tool which frankly stinks. And the developers knew it stunk but promoted it anyway!
Let's take a step back and discuss clinical prediction rules (CPRs). CPRs are mathematical models that quantify the individual contributions of elements of the history, physical exam, and basic laboratory tests into a score that aids diagnosis or prognosis estimation. They can accommodate more factors than the human brain can take into account, and they always give the same result, whereas human judgment is inconsistent (especially in the less clinically experienced). To develop a CPR you 1) construct a list of potential predictors of the outcome of interest, 2) examine a group of patients for the presence of the candidate predictors and their status on the outcome of interest, 3) determine statistically which predictors are powerfully and significantly associated with the outcome, and 4) validate the rule [ideally by applying it prospectively in a new population (with a different spectrum of disease) by a variety of clinicians in a variety of institutions].
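Steps 2 and 3 can be sketched in a few lines of code. This is only a toy illustration of the screening idea, not how real CPRs are built (real derivations use multivariable regression and far larger cohorts); the predictors, the counts, and the odds-ratio cutoff of 2 are all invented:

```python
def odds_ratio(rows, predictor, outcome="event"):
    """Odds ratio for a binary predictor from a 2x2 table (0.5 continuity correction)."""
    a = sum(1 for r in rows if r[predictor] and r[outcome])        # exposed, event
    b = sum(1 for r in rows if r[predictor] and not r[outcome])    # exposed, no event
    c = sum(1 for r in rows if not r[predictor] and r[outcome])    # unexposed, event
    d = sum(1 for r in rows if not r[predictor] and not r[outcome])
    return ((a + 0.5) * (d + 0.5)) / ((b + 0.5) * (c + 0.5))

# Step 2: a tiny hypothetical derivation cohort (invented numbers)
derivation = (
    [{"smoker": 1, "htn": 1, "left_handed": 0, "event": 1}] * 30 +
    [{"smoker": 1, "htn": 0, "left_handed": 1, "event": 1}] * 10 +
    [{"smoker": 0, "htn": 1, "left_handed": 1, "event": 0}] * 20 +
    [{"smoker": 0, "htn": 0, "left_handed": 0, "event": 0}] * 40
)

# Step 3: keep only predictors strongly associated with the outcome
candidates = ["smoker", "htn", "left_handed"]
kept = [p for p in candidates if odds_ratio(derivation, p) >= 2.0]
print(kept)  # smoking and hypertension survive; left-handedness does not
```

Step 4 would then repeat the scoring in a separate, external cohort to see whether the kept predictors still discriminate.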
Back to the new risk tool. They decided to develop a new tool because the Framingham Score (previously used in the ATPIII guidelines) was insufficient (it was developed in an exclusively white population). How was it developed? The tool was developed using "community-based cohorts of adults, with adjudicated endpoints for CHD death, nonfatal myocardial infarction, and fatal or nonfatal stroke. Cohorts that included African-American or White participants with at least 12 years of follow-up were included. Data from other race/ethnic groups were insufficient, precluding their inclusion in the final analyses". The data they used were from "several large, racially and geographically diverse, modern NHLBI-sponsored cohort studies, including the ARIC study, Cardiovascular Health Study, and the CARDIA study, combined with applicable data from the Framingham Original and Offspring Study cohorts". I think these were reasonable derivation cohorts to use. How did they validate the tool? Importantly, they must use external testing because most models work in the cohort from which they were derived. They used "external cohorts consisting of Whites and African Americans from the Multi-Ethnic Study of Atherosclerosis (MESA) and the REasons for Geographic And Racial Differences in Stroke study (REGARDS). The MESA and REGARDS studies were approached for external validation due to their large size, contemporary nature, and comparability of end points. Both studies have less than 10 years of follow up. Validation using "most contemporary cohort" data also was conducted using ARIC visit 4, Framingham original cohort (cycle 22 or 23), and Framingham offspring cohort (cycles 5 or 6) data". The results of their validity testing showed C statistics ranging from a low of 0.5564 (African-American men) to a high of 0.8182 (African-American women).
The C statistic is a measure of discrimination (differentiating those with the outcome of interest from those without the outcome) and ranges from 0.5 (no discrimination- essentially as good as a coin flip) to 1.0 (perfect discrimination). The authors also found that it overpredicted events. See graph below.
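To make the C statistic concrete, here is a minimal sketch of how it can be computed: the proportion of all (event, non-event) patient pairs in which the patient who had the event was assigned the higher predicted risk. The risk scores below are invented for illustration:

```python
def c_statistic(events, non_events):
    """events / non_events: predicted risks for patients with / without the outcome."""
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # model ranks this pair correctly
            elif e == n:
                concordant += 0.5   # ties count half
    return concordant / (len(events) * len(non_events))

print(c_statistic([0.9, 0.8, 0.7], [0.1, 0.2, 0.3]))  # 1.0: perfect discrimination
print(c_statistic([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]))  # 0.5: a coin flip
```

A tool with a C statistic of 0.5564 is barely distinguishing future events from non-events at all.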
So why don’t I want to use the new prediction tool? 3 main reasons:
1) It clearly overpredicts outcomes. This would lead to more people being prescribed statins than likely need them (if you use only the tool to make this decision). One could argue that's a good thing, as statins are fairly low risk and lots of people die from heart disease, so overtreating might be the way to err.
2) No study of statins used any prediction rules to enroll patients. They were enrolled based on LDL levels or comorbid diseases. Thus I don’t even need the rule to decide on whether or not to initiate a statin.
3) Its discrimination is not good…see the C-statistic results. For Black men it's no better than a coin flip.
Hilda Bastian writes an important and well-written post on this topic in a recent Scientific American blog.
I don't think I have much else to add other than: read her post. There are some great links inside it to further your understanding of this topic.
I think we are too focused on p < 0.05. What if the p value is 0.051? Does that mean we should ignore the finding? Is it really any different from a p value of 0.0499?
Confidence intervals give information on both statistical significance and clinical significance, but I worry about how they are interpreted also. (Disclaimer: the interpretation and use of the confidence interval that follows is not statistically correct but is how we use them clinically.) Let's say a treatment improves a bad outcome with a relative risk (RR) of 0.94 with a 95% CI of 0.66-1.12. So the treatment isn't "statistically significant" (the CI includes 1.0), but there is potential for a relatively significant clinical benefit [the lower bound of the CI suggests a potential 34% reduction in the bad outcome (1 - RR = relative risk reduction, so 1 - 0.66 = 0.34, or 34%)]. There is also potential for a clinically significant increase in risk of 12%. So which is more important? That somewhat depends on whether you believe in this treatment or not. If you believe in it, you focus on the potential 34% reduction in outcomes. If you don't believe in the treatment, you focus on the 12% increased risk. So that's the problem with confidence intervals, but they give much more information than p values do.
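For readers who want to see where an RR confidence interval comes from, here is a sketch of the usual large-sample calculation, exp(ln RR ± 1.96 × SE of ln RR). The 2x2 counts are invented for illustration and do not reproduce the exact interval quoted above:

```python
import math

def rr_with_ci(a, n1, c, n2, z=1.96):
    """a/n1 = events/total in the treatment arm, c/n2 = events/total in the control arm."""
    rr = (a / n1) / (c / n2)
    # standard error of ln(RR) for a 2x2 table
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# 47/500 events on treatment vs 50/500 on control (made-up numbers)
rr, lo, hi = rr_with_ci(47, 500, 50, 500)
print(round(rr, 2), round(lo, 2), round(hi, 2))  # 0.94 0.64 1.37
```

The interval crosses 1.0, so this hypothetical result is "not significant," yet the lower bound still holds out the possibility of a sizable benefit, which is exactly the interpretive tension described above.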
This week I did an EBM "lecture" based around the game show Jeopardy. Now I know this isn't anything new. Lots of teachers have used a Jeopardy format to teach. The point is that it took the content of "EBM Potpourri" (a group of topics that don't fit well in other lectures that I give) and made it more interesting than a traditional 1-hour lecture (which is how I have given this material in the past).
The first challenge when doing this is to figure out your main teaching points and include only them, since you don't have a lot of extra space for less important topics (but shouldn't we be doing this anyway?). The next challenge was to make gradually harder questions within each topic. I limited some of the questions to certain learner levels (I teach internal medicine residents organized into interns, 2nd years, and 3rd years) to make sure everyone participated independently at least somewhat. The residents only got about 40% of the questions right…but that wasn't the point. The point was to convey my teaching points and to engage the learners. They worked in their teams (each team consisted of an intern, a 2nd year, and a 3rd year) to solve problems. The competition between teams for "great prizes" (a certificate of appreciation for the 3rd place team, Rice-a-Roni for the 2nd place team, and lunch with me for the winners) made them take it a little more seriously.
If you would like the original PowerPoint file to use in your teaching I’ll be happy to email it to you. Contact me at UABEBM@gmail.com
What unique ways have you taught EBM topics?
A commenter on my video about intention to treat analysis asked about my thoughts on a twist on intention to treat analysis in which an adjustment is made (via an instrumental variable) for “treatment contamination”. A disclaimer: I am not a statistician or epidemiologist.
First, let's start with some definitions:
1) Intention to treat analysis: once randomized, always analyzed in the group to which the patient was assigned (even if the patient doesn't get the intervention in the intervention arm or does get it in the control arm)
2) Superiority trial: a study designed to "prove" one intervention is better than the other. The null hypothesis is that there is no difference between the groups.
3) Noninferiority trial: a study designed to "prove" that one intervention is not worse than another treatment by some prespecified amount. The null hypothesis is that there is a difference between the groups.
4) Instrumental variable: a variable associated with the factor under study but not directly associated with the outcome variable or any potential confounders.
The authors of this paper An IV for the RCT: using instrumental variables to adjust for treatment contamination in randomised controlled trials state:
Intention to treat analysis estimates the effect of recommending a treatment to study participants, not the effect of the treatment on those study participants who actually received it. In this article, we describe a simple yet rarely used analytical technique, the “contamination adjusted intention to treat analysis,” which complements the intention to treat approach by producing a better estimate of the benefits and harms of receiving a treatment. This method uses the statistical technique of instrumental variable analysis to address contamination
So what do I think about this?
1) A main role of intention to treat (ITT) analysis is to be conservative in a superiority trial. That means we don't want to falsely reject the null hypothesis and claim the treatment is better than the control. Another main role of ITT analysis is to preserve randomization (remember, once randomized, always analyzed).
2) The authors of the BMJ paper point out that "Intention to treat analysis estimates the effect of recommending a treatment to study participants, not the effect of the treatment on those study participants who actually received it." This is true, but isn't that what real life is like? I recommend a treatment to my patients. Some take it, some don't. Some who I tell not to use something wind up using it.
3) The authors of the BMJ paper further point out that ITT analysis "underestimates value of receiving the treatment." That is possible, but it's also the point (see #1 above).
4) The instrumental variable in this scheme would be a variable entered into the model indicating whether or not a patient received the treatment (no matter which group they were assigned to). ITT analysis would still be used but would be adjusted for treatment receipt. I worry that this could lead to overfitting the model: a situation where adding too many variables makes the model start to detect noise rather than real relationships.
5) I think it would be difficult in a trial to judge adherence. What is the cutoff? Is it 100%? What about 60%? 40%? How much use by the control group is important? I think there are real issues in judging what counts as contamination.
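To make the idea in point 4 concrete, here is a minimal sketch of the simplest instrumental variable estimator (the Wald estimator), using randomization itself as the instrument: divide the ordinary ITT effect by the difference in treatment uptake between the arms. All counts and outcomes below are invented, and the actual technique in the BMJ paper uses a full two-stage regression rather than this toy:

```python
def wald_iv(assigned_tx, assigned_ctrl):
    """Each argument: list of (received_treatment, outcome) tuples for one randomized arm."""
    def arm_means(arm):
        received = sum(r for r, _ in arm) / len(arm)   # fraction who actually got treatment
        outcome = sum(y for _, y in arm) / len(arm)    # mean outcome in the arm
        return received, outcome

    rx_t, y_t = arm_means(assigned_tx)
    rx_c, y_c = arm_means(assigned_ctrl)
    itt_effect = y_t - y_c                  # ordinary ITT estimate
    adjusted = itt_effect / (rx_t - rx_c)   # scale by the uptake difference
    return itt_effect, adjusted

# Invented trial: 80% uptake in the treatment arm, 20% contamination in control
tx   = [(1, 1.0)] * 8 + [(0, 0.0)] * 2
ctrl = [(1, 1.0)] * 2 + [(0, 0.0)] * 8
itt, adjusted = wald_iv(tx, ctrl)
print(round(itt, 2), round(adjusted, 2))  # 0.6 1.0
```

In this toy data the ITT estimate (0.6) is diluted by nonadherence and contamination, and the adjustment inflates it back toward the effect among actual receivers, which illustrates both the appeal and my worry: the adjusted number depends entirely on how "received treatment" is defined.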
Time will tell if this technique should be used. We will have to study the treatment estimates from traditional ITT analysis and contamination adjusted ITT analysis. Until then I will stick with what is recommended…traditional ITT analysis.