Guidelines should not include opinions

The authors of this viewpoint have it wrong on a couple of fronts.

“The purpose of practice guidelines must be to develop the best possible recommendations from a body of evidence that may be contradictory or inadequate.”

While I agree that having recommendations come from an expert body is useful when there is inadequate or contradictory evidence, I don’t think they should be labeled guidelines. A consensus statement is a more appropriate term. After all, if evidence is lacking or contradictory, aren’t these experts just giving their opinion? Isn’t it possible that another group of experts would give a different opinion?

So don’t label it a guideline. That term has garnered reverence that was never intended. Guidelines become law almost. They are bastardized into punishing performance measures and become the cornerstone of legal argument. So, the term guideline should not be used lightly.

“…but those recommendations should always represent the best evidence and the best expert opinion currently available.”

NO! No expert opinion. Data is too open to interpretation. Humans filter information using prior knowledge, experience, and many heuristics (including, very importantly, the affect heuristic). A person’s specialty really influences how they interpret data. It’s one of the reasons it’s so important to have multidisciplinary panels so that conflicts and heuristics can be balanced. Unfortunately, most guideline panels are very homogeneous and conflicted.

I agree that we need unambiguous language in guidelines. They should only contain recommendations on things that have strong evidence that no one refutes. When they venture into the world of vagaries they become nothing more than opinion pieces.

What say you?

Do lipid guidelines need to change just because there is a new, expensive drug on the market? NO!

Shrank and colleagues published a viewpoint online today positing that lipid guidelines should return to LDL-based targets. I think they are wrong. They use two studies to support their assertion.

First, they use the IMPROVE IT study. In this study, patients hospitalized for ACS were randomized to a combination of simvastatin (40 mg) and ezetimibe (10 mg) or simvastatin (40 mg) and placebo (simvastatin monotherapy). The LDLs were already pretty low in this group: baseline LDL cholesterol levels had to be between 50 and 100 mg per deciliter if they were receiving lipid-lowering therapy or between 50 and 125 mg per deciliter if they were not (average baseline LDL was 93.8 mg/dl). The results show minimal benefits as demonstrated below:

[Figure: IMPROVE IT results]

Current guidelines would recommend a high potency statin in this patient population. Adding ezetimibe to a moderate dose statin is probably equivalent to a high potency statin (from an LDL lowering perspective). This study (and all ezetimibe studies) should have tested the difference between simva 40-ezetimibe 10 and simva 80 mg or atorvastatin 40 or 80 mg. So to me IMPROVE IT doesn’t PROVE anything other than that a more potent statin leads to fewer cardiovascular events…something we already know.

Now on to the second argument. They argue that alirocumab (Praluent), the first in a new class, the proprotein convertase subtilisin/kexin type 9 (PCSK-9) inhibitors, should lead to LDL guided therapy again. Why? “Early results suggest these drugs have a powerful effect on levels of low-density lipoprotein cholesterol (LDL-C), likely more potent than statins”. A systematic review of studies of this drug shows a mortality reduction, but the comparators in these studies were placebo or ezetimibe 10 mg. Why? We have proven therapy for LDL and this drug should have been compared to high potency statins. That study will likely never be done (unless the FDA demands it) because the companies making this drug can’t risk finding that it works only as well as a high potency statin or possibly worse. Also, does this class of drugs have anti-inflammatory effects like statins? Are they safer? This is an injectable drug that has to be warmed to room temperature prior to use and is very costly compared to generic atorvastatin.

In my opinion, no guideline should be changed without appropriately designed outcomes studies for the drugs being recommended. In this case, the risk-benefit margin needs to be impressive to justify the cost as we have dirt cheap potent statins already.

The authors of this viewpoint make no great rational argument for guideline change other than that there is a new drug on the market and it might work. Let’s see if it does and at what cost (both monetary and physiological).

Underrepresentation of certain patient populations in RCTs informing guidelines: Does it matter?

This week JAMA Internal Medicine published a research letter reporting data on the underrepresentation of women, elderly patients, and racial minorities in RCTs used to inform cardiovascular guidelines. The authors state that RCTs are considered to be the highest level of evidence that should be used to inform guideline development. I would argue systematic reviews would be even better, but I understand that questions to be addressed in guidelines often need individual RCTs to answer them. They then state that “RCTs can have limited external validity”. What do you think?

The authors evaluated all references and then focused on RCTs that were cited in the ACC/AHA guidelines on atrial fibrillation, heart failure, and acute coronary syndromes. They extracted data on age, gender, ethnicity, and continents from which subjects were recruited. What did they find?

Female representation was highest in RCTs in atrial fibrillation (33%), followed by ACS (29%) and heart failure (29%). The next question you should ask is how this compares to the actual gender distribution of people affected by these diseases. In US registries, women make up 55% of patients with atrial fibrillation, 42% of patients with ACS, and 47% of patients with heart failure. Thus women are underrepresented by up to 22 percentage points in these studies, but does this affect guideline recommendations? Another way to think about this is: would more data change recommendations for women? Hard to know for sure, but I suspect not. If enrollment is properly conducted, I would think that those enrolled would be a sample of all women with atrial fibrillation, ACS, and heart failure. Even though the sampling fraction is smaller, as long as they are representative of all women with those problems there should be no bias. The statistical inferences could be affected by the small sample sizes (less precise estimates), but the overall qualitative findings (ie benefit or harm) should not be affected.
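The "up to 22" figure above is a difference in percentage points between registry share and trial share. Here is a minimal sketch of that arithmetic using only the percentages quoted in this post. The function and variable names are mine, for illustration only; they do not come from the research letter.

```python
# Share of women (%) as quoted in this post: guideline RCTs vs. US registries.
rct_share = {"atrial fibrillation": 33, "ACS": 29, "heart failure": 29}
registry_share = {"atrial fibrillation": 55, "ACS": 42, "heart failure": 47}


def representation_gaps(trial, registry):
    """Return registry share minus trial share, in percentage points."""
    return {condition: registry[condition] - trial[condition] for condition in trial}


gaps = representation_gaps(rct_share, registry_share)
for condition, gap in gaps.items():
    print(f"{condition}: women underrepresented by {gap} percentage points")
```

The largest gap is in atrial fibrillation (55 minus 33), which is where the 22 comes from. The gaps for ACS and heart failure are smaller.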

As expected, the majority of patients enrolled in these studies were white. Black patients constituted 19% of heart failure RCT patients and 6% of both afib and ACS patients. In US registries of heart failure, afib, and ACS, black patients make up 6%, 21%, and 11%, respectively. Again, I don’t have a problem with this if sampling was done properly.

Elderly (defined as those >75 yrs of age) are very underrepresented constituting only 2% of patients in all the RCTs combined. In this case guideline developers will have to rely on observational data or use expert opinion to inform recommendations.

Finally, the authors point out that 94% of enrolled patients came from North America or Europe. Is this a problem? I don’t think so for the US as ACC/AHA guidelines are developed to guide treatment of American patients. Patients from other underrepresented continents will have less direct evidence informing recommendations on their care. Consequently, those recommendations will be based more on expert opinion.

Evidence Based Medicine Is Not In Crisis! Part 3

In this installment I want to jump ahead in Greenhalgh’s paper to address her last cause of the EBM crisis: “Poor fit for multimorbidity”. Not to worry, I will come back in a future post to cover the remaining “problems” of EBM.

I concur with Greenhalgh that individual studies, taken by themselves in a vacuum, have limited applicability to patients with multimorbidity. Guidelines don’t help, as they also tend to be single-disease focused and developed by single-disease -ologists. So is EBM at fault here again? Of course not. EBM skills to the rescue.

The current model of EBM demonstrated below contains 2 important elements: clinical state and circumstances, and clinical expertise.

Clinical state and circumstances largely refers to the patient’s comorbidities, various other treatments they are receiving, and the clinical setting in which the patient is being seen. Thus, the EBM paradigm is specifically designed to deal with multimorbidity. Clinical expertise is used to discern what impact other comorbidities have on the current clinical question under consideration and, along with the clinical state/circumstances, helps us decide how to apply a narrowly focused study or guideline in a multimorbid patient. Is this ideal? No. It would be nice if we had studies that included patients with multiple common diseases, but we have to treat patients with the best available evidence that we have.



Evidence Based Medicine Is Not In Crisis! Part 2

Greenhalgh and colleagues report that the “second aspect of evidence based medicine’s crisis… is the sheer volume of evidence available”. EBM is not the purveyor of what is studied and published. EBM is a set of skills to effectively locate, evaluate, and apply the best available evidence. For much of what we do there is actually a paucity of research data answering clinically relevant questions (despite there being a lot of studies, which gets back to her first complaint about distortion of the evidence brand. See part 1 of this series). I teach my students and housestaff to follow Haynes’ 6S hierarchy when trying to answer clinical questions. As much of the hierarchy is preappraised literature, someone else has had to deal with the “sheer volume of evidence”. Many clinical questions can be answered at the top of the pyramid.

I concur with Greenhalgh that guidelines are out of control. I have written on this previously. We don’t need multiple guidelines on the same topic, often with conflicting recommendations. I believe that we would be better off with central control of guideline development under the auspices of an agency like AHRQ or the Institute of Medicine. It would be much easier to produce trustworthy guidelines, and guidelines on topics for which we truly need guidance. (Really, American Academy of Otolaryngology…do we need a guideline on ear wax removal?) It can be done. AHCPR previously made great guidelines on important topics. Unfortunately we will probably never go back to the good ole days. Guidelines are big business now, with specialty societies staking out their territory and government and companies bastardizing them into myriad performance measures.



I’m Still Not Crazy About the Pooled Risk Equations in the New Cholesterol Guidelines

Two papers were published this week to further validate the pooled risk equations developed for the ACC/AHA Cholesterol Guidelines.

Muntner and colleagues used the REGARDS participants to assess the calibration and discrimination of the pooled risk equations. This study had potential as it oversampled patients from the stroke belt. This is important because the pooled risk equations were developed to overcome the limitations of the Framingham tool (mainly its lack of minorities). I have a real problem with this study because the pooled risk equations estimate 10 yr risk of CHD and stroke and this study only has 5 yrs of follow-up for the REGARDS participants. I don’t think their estimates of calibration and discrimination are valid. Risk of CHD and stroke should increase over time, so event rates could change with 5 more years of follow-up. The important thing this paper adds is the reminder that observational studies often lack active surveillance. Most observational studies rely on self-report of outcomes, so silent events would obviously be missed by the patient, as would events for which the patient didn’t seek evaluation. Muntner and colleagues also used Medicare claims data to identify events not detected through routine cohort follow-up and found 24% more events. This is a useful lesson from this study.

In a more useful study, Kavousi and colleagues compared 3 risk prediction tools (pooled risk equations, Framingham, and SCORE) using the Rotterdam Study, a prospective population-based cohort of persons aged 55 yrs and older. This cohort does have 10 yrs of follow-up.


This figure shows that at each level of risk the pooled risk equations overestimated risk, though less so in women.

[Figure: treatment recommendations]

This figure shows the proportion of patients for whom treatment is recommended (red bars), treatment should be considered (yellow bars), and no treatment is recommended (green bars). As you can see the new risk tool leads to the large majority of men “needing treatment” compared to previous guidelines (ATP III) and the current European guidelines (ESC).

[Figure: calibration curves]

Finally, this figure shows the calibration curves, and the calibration was not good. The blue dots should lie right upon the red line for good calibration. Furthermore, the c-statistic is 0.67. (The c-statistic is a measure of discrimination, meaning how well the tool can differentiate diseased from nondiseased patients. A c-statistic above 0.7 is considered moderate to good. The closer to 1 the better.)
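For readers who want to see what the c-statistic actually measures, here is a minimal sketch. It takes every pair made up of one patient who had an event and one who did not, and asks how often the tool assigned the higher predicted risk to the patient with the event (ties count as half). The function name and the six-patient data set are hypothetical, made up purely for illustration; they are not from the Kavousi study.

```python
from itertools import product


def c_statistic(predicted_risk, had_event):
    """Proportion of (event, non-event) pairs ranked correctly; ties count 0.5."""
    events = [p for p, e in zip(predicted_risk, had_event) if e]
    non_events = [p for p, e in zip(predicted_risk, had_event) if not e]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for event_risk, non_event_risk in product(events, non_events):
        if event_risk > non_event_risk:
            score += 1.0   # concordant pair
        elif event_risk == non_event_risk:
            score += 0.5   # tied pair
    return score / (len(events) * len(non_events))


# Hypothetical predicted 10 yr risks and observed outcomes (1 = event).
risks = [0.05, 0.10, 0.20, 0.30, 0.15, 0.40]
outcomes = [0, 0, 1, 0, 0, 1]
print(c_statistic(risks, outcomes))  # 7 of 8 pairs concordant -> 0.875
```

A value of 0.5 means the tool ranks patients no better than a coin flip, and 1.0 means every patient with an event was ranked above every patient without one. Note that this says nothing about calibration: a tool can rank patients well and still overestimate everyone's absolute risk.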

Why might the pooled risk equations overestimate risk? Maybe they don’t, if you believe the Muntner study. It could just be a problem with the lack of active surveillance in the cohort studies used to validate the tool. Or they really do overestimate risk, whether because they aren’t accurate, because more contemporary patients receive better therapies that improve overall health, or because the baseline risk characteristics of the validation cohorts differ too much from those of the development cohorts.

I am still not sold on the new pooled risk equations, but they might not be much worse than what we have been using, based on the Kavousi study (Framingham also overpredicted risk and had poor calibration). I think we need more study and tweaking of the tool, or we use the tool as is and focus more on cardiovascular risk reduction (with exercise, diet, tobacco cessation, diabetes and HTN control) and don’t focus so much on starting a statin right away.

The Mayo Clinic has a nice patient decision aid that you can use to help patients decide if a statin is right for them:



JNC 7 or JNC 8: Which Should I Use?

I gave a CME seminar this week on treating hypertension in the elderly and after my presentation a clinical pharmacist asked me an interesting question: “What do you follow? JNC 7 or JNC 8?”.


I thought this was an interesting question and one I hadn’t thought about at all. After all, shouldn’t an updated guideline trump the previous one? I like JNC 8 because its methodology is more explicit and more consistent with IOM principles than that of JNC 7. One can argue with some of the decisions made about the evidence review (ie that they only included RCTs and ignored systematic reviews and observational data) and be concerned about the degree of conflicts of interest of the panel members. But what JNC 8 did was make life simpler in that the BP goals are easily remembered: <150/90 for those over 60 yrs of age and <140/90 for everyone else, including those with diabetes or CKD (regardless of age). So for these reasons I prefer JNC 8. Is it perfect? No, but I suspect they will address many of the concerns critics have expressed, and further questions that need to be addressed, in future updates (that they promise will come in a timely fashion).
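To show just how simple the goals are, here they are written as a lookup. This is only a sketch of the rule as I summarized it above, not a clinical tool. The function name is mine, and treating "over 60" as strictly older than 60 is an assumption about the boundary.

```python
def jnc8_bp_goal(age_years, has_diabetes=False, has_ckd=False):
    """Return the (systolic, diastolic) goal in mm Hg as summarized in this post."""
    # Diabetes or CKD: <140/90 regardless of age.
    if has_diabetes or has_ckd:
        return (140, 90)
    # Over 60 without diabetes or CKD: <150/90.
    # Assumption: "over 60" is read as strictly greater than 60.
    if age_years > 60:
        return (150, 90)
    # Everyone else: <140/90.
    return (140, 90)


print(jnc8_bp_goal(72))                     # (150, 90)
print(jnc8_bp_goal(72, has_diabetes=True))  # (140, 90)
print(jnc8_bp_goal(45))                     # (140, 90)
```

Three branches and only two distinct goals: that is the whole thing.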