Evidence Based Medicine Is Not In Crisis! Part 4

I’ve left the hardest issue for last: “Overemphasis on following algorithmic rules”. This has been the most frustrating aspect of my primary care practice. Patients stopped being viewed as patients and became a set of goals I had to achieve so that my boss would smile upon me fondly as “a good doctor”. It took me some time to finally quit playing the game and just do the best I could; whatever the numbers were, so be it.

Algorithmic medicine couldn’t be more antithetical to EBM. Everyone is viewed the same. EBM, as I have argued in the last three posts, is clearly about individual patient values and circumstances. It’s about clinical experience tempering what we could do into what we should do. Algorithmic medicine allows no individuality. No tempering. Thus, to claim EBM is in crisis because of algorithmic medicine is wrong. True EBM protects us from the harms of algorithmic medicine.

Interestingly, computerized decision support systems (mentioned as a culprit in the first sentence of this section of Greenhalgh’s paper) sit at the top of Haynes’ 6S hierarchy of preappraised evidence.

“In these computerized decision support systems (CDSSs), detailed individual patient data are entered into a computer program and matched to programs or algorithms in a computerized knowledge base, resulting in the generation of patient-specific assessments or recommendations for clinicians” – Brian Haynes

At the VA we have a moderately sophisticated CDSS. It warns me if my patient with heart failure is not taking an ACE inhibitor, and it’s smart enough that if I enter an allergy to ACE inhibitors it won’t prompt me to order one. If I tell it that a patient has limited life expectancy, it will not prompt me to pursue certain routine health screenings. Thus, I don’t view CDSSs as problematic in and of themselves. The problem arises when physicians don’t consider the whole patient (remember those values and clinical circumstances) in deciding whether or not to follow prompted recommendations.

Greenhalgh has made great points about what happens when good ideas are hijacked and distorted for secondary gain, but EBM is not to blame. Victor Montori (@VMontori) said it best in a tweet to me:

“EBM principles are not in crisis, but corruption of healthcare has oft hidden behind the e-b moniker. EBM helps uncover it.”


Evidence Based Medicine Is Not In Crisis! Part 3

In this installment I want to jump ahead in Greenhalgh’s paper to address her last cause of the EBM crisis: “Poor fit for multimorbidity”. Not to worry, I will come back in a future post to cover the remaining “problems” of EBM.

I concur with Greenhalgh that individual studies, taken by themselves in a vacuum, have limited applicability to patients with multimorbidity. Guidelines don’t help, as they also tend to be single-disease focused and developed by single-disease -ologists. So is EBM at fault here again? Of course not. EBM skills to the rescue.

The current model of EBM demonstrated below contains 2 important elements: clinical state and circumstances and clinical experience.

Clinical state and circumstances largely refers to the patient’s comorbidities, the various other treatments they are receiving, and the clinical setting in which the patient is being seen. Thus, the EBM paradigm is specifically designed to deal with multimorbidity. Clinical expertise is used to discern what impact other comorbidities have on the current clinical question under consideration and, along with the clinical state and circumstances, helps us decide how to apply a narrowly focused study or guideline to a multimorbid patient. Is this ideal? No. It would be nice if we had studies that included patients with multiple common diseases, but we have to treat patients with the best available evidence that we have.



Evidence Based Medicine Is Not In Crisis! Part 2

Greenhalgh and colleagues report that the “second aspect of evidence based medicine’s crisis… is the sheer volume of evidence available”. EBM is not the purveyor of what is studied and published. EBM is a set of skills to effectively locate, evaluate, and apply the best available evidence. For much of what we do there is actually a paucity of research data answering clinically relevant questions (despite there being a lot of studies, which gets back to her first complaint about distortion of the evidence brand; see part 1 of this series). I teach my students and housestaff to follow Haynes’ 6S hierarchy when trying to answer clinical questions. Since much of the hierarchy is preappraised literature, someone else has already dealt with the “sheer volume of evidence”. Many clinical questions can be answered at the top of the pyramid.

I concur with Greenhalgh that guidelines are out of control. I have written on this previously. We don’t need multiple guidelines on the same topic, often with conflicting recommendations. I believe we would be better off with central control of guideline development under the auspices of an agency like AHRQ or the Institute of Medicine. It would be much easier to produce trustworthy guidelines, and guidelines on topics for which we truly need guidance. (Really, American Academy of Otolaryngology… do we need a guideline on ear wax removal?) It can be done. AHCPR previously made great guidelines on important topics. Unfortunately, we will probably never go back to the good ole days. Guidelines are big business now, with specialty societies staking out their territory and government and companies bastardizing them into myriad performance measures.



Evidence Based Medicine Is Not In Crisis! Part 1

Trisha Greenhalgh and colleagues wrote an opinion piece in BMJ recently lamenting (or perhaps exalting) that the EBM movement is in crisis for a variety of reasons. I don’t agree with some of the paper and I will outline in a series of posts why I disagree.

When most people complain about EBM or discuss its shortcomings, they usually are not basing their arguments on the current definition of EBM. They use the original definition, in which EBM was defined as the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients. This definition evolved to “the integration of best research evidence with clinical expertise and patient values. Our model acknowledges that patients’ preferences rather than clinicians’ preferences should be considered first whenever it is possible to do so”.

The circles in this diagram are ordered by importance, with patient preferences and actions being most important and research evidence the least important when practicing EBM. You can see that clinical expertise ties it all together to decide what should be done, not just what could be done.

Back to the Greenhalgh paper. Her first argument is that there has been distortion of the evidence brand. I agree. It seems everyone wants to add the “evidence based” moniker to their product. But she argues this goes beyond a labeling problem: the drug and medical device industry is determining our knowledge because it funds so many studies. Is this the fault of EBM? Or should funding agencies like the NIH and regulatory agencies like the FDA take the blame? I think the latter. Industry will always be the main funder of studies of its own products, and it should be. Companies should bear the cost of getting a product to market. That is their focus. To suggest they shouldn’t want to make a profit is just ridiculous.

The problem arises in what the FDA (and equivalent agencies in other countries) allows pharma to do. Greenhalgh points out the gamesmanship pharma plays when studying its drugs to get the outcomes it desires. I totally agree, and ample research proves her points. But it’s not EBM’s fault. The FDA should demand properly conducted trials with hard clinical outcomes as the standard for drug approval. Companies would do this if they had to in order to get a drug to market. I also blame journal editors who publish these subpar studies. Why do they? To keep advertising dollars? The FDA should also demand that any study done on a drug be registered and be freely available and published somewhere easily accessible (maybe clinicaltrials.gov). Those with adequate clinical and EBM skills should be able to detect when pharma is manipulating drug dosages, using surrogate endpoints, or overpowering a trial to detect clinically insignificant results. I look at this as a positive argument for continuing to train medical students and doctors in these skills.

Research has shown that industry funded studies overestimate the benefits of their drugs by maybe 20-30%. A simple way to deal with this is to take any result from an industry funded study and to reduce it by 20-30%. If the findings remain clinically meaningful then use the drug or device.
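This discounting heuristic is trivial to apply. Here is a minimal sketch in Python; the 30% relative risk reduction is a made-up example figure, and the 25% discount is simply the midpoint of the 20-30% range mentioned above.

```python
def discounted_effect(rrr, discount=0.25):
    """Shrink a reported relative risk reduction (RRR) by a discount
    factor (20-30% per the text for industry-funded trials)."""
    return rrr * (1 - discount)

# Hypothetical example: an industry-funded trial reports a 30% RRR
reported = 0.30
adjusted = discounted_effect(reported)
print(f"Reported RRR {reported:.0%} -> discounted RRR {adjusted:.1%}")
# If the discounted effect is still clinically meaningful,
# use the drug or device.
```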

I agree with Greenhalgh that current methods to assess study biases are outdated. The Users’ Guides served their purpose but need to be redone to detect the subtle gamesmanship going on in studies. Future and current clinicians need to be trained to detect these subtle biases. Alternatively, why can’t journals have commentaries about every article, similar to what BMJ Evidence-Based Medicine and ACP Journal Club do? These could then be used to educate journal users on these issues and put the results of studies into perspective.

Do You Have An Uncomfortable Relationship With Math? A Study Shows Most Doctors Do

If a test to detect a disease whose prevalence is 1/1000 has a false positive rate of 5%, what is the chance that a person with a positive test result actually has the disease? Assume the test is 100% sensitive.

Everyone taking care of patients, especially in primary care, needs to be able to figure this out. This is a basic understanding of what to do with a positive screening test result. If you can’t figure this out how would you be able to discuss the results with a patient? Or better yet how would you be able to counsel a patient on the implications of a positive test result prior to ordering a screening test?

Unfortunately, a study released online on April 21st found that 77% of respondents answered the question incorrectly. These results are similar to the results of a study in 1978, which used the same scenario. This is unfortunate as interpreting diagnostic test results is a cornerstone of EBM teaching and almost all (if not all) medical schools and residency programs teach EBM principles. So what’s the problem?

Here are some of my thoughts and observations:

  1. These principles are probably not actually being taught because the teachers themselves don’t understand them or if they do they don’t teach them in the proper context. This needs to be taught in the clinic when residents and medical students discuss ordering screening tests or on the wards when considering a stress test or cardiac catheterization, etc.
  2. The most common answer in the study was 95% (wrong answer). This shows that doctors don’t understand the influence of pretest probability (or prevalence) on post test probability (or predictive value). They assume a positive test equals disease. They assume a negative test equals no disease.  Remember where you end up (posttest probability) depends on where you start from (pretest probability).
  3. I commonly see a simple lack of thinking when ordering tests. How many of you stop to think: What is the pretest probability? Based on that do I want to rule in or rule out disease? Based on that do I need a sensitive or specific test? What are the test properties of the test I plan to order? (or do I just order the same test all the time for the same diagnosis?)
  4. I also see tests ordered for presumably defensive purposes. Does everyone need a CT in the ER? Does everyone need a d-dimer for every little twinge of chest pain? When you ask why a test was ordered I usually hear something like this: “Well I needed to make sure something bad wasn’t going on”.  I think this mindset transfers to the housestaff and students who perpetuate it.  I commonly see the results of the ER CT in the HPI for God’s sake!!!
  5. Laziness. There’s an app for that. Even if you can’t remember the formula or how to set up a 2×2 table your smartphone and Google are your friends.  Information management is an important skill.

So what’s the answer to the question above? 1.96%. (Remember PPV = true positives / (true positives + false positives), so 1 / (1 + 50) ≈ 1.96%.) If it’s easier, set up a 2×2 table.

This very sensitive (100%) and fairly specific (95%) test (positive LR is 20!) wasn’t very informative when positive. Probability only went from 0.1% to 2%. The patient is still not likely to have disease even with a positive test.  It would have been more useful if the test result was negative. Thus, in a low probability setting your goal is to rule out disease and you should choose the most sensitive test (Remember SnNout).
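The arithmetic in this section can be double-checked with a short Python sketch that builds the 2×2 table from a hypothetical cohort and then confirms the answer a second way using the likelihood ratio:

```python
def ppv(prevalence, sensitivity, specificity, n=100_000):
    """Positive predictive value from a hypothetical cohort of n people
    (a 2x2 table in code form)."""
    diseased = n * prevalence
    healthy = n - diseased
    true_pos = diseased * sensitivity
    false_pos = healthy * (1 - specificity)
    return true_pos / (true_pos + false_pos)

# The question in the text: prevalence 1/1000, sensitivity 100%,
# false positive rate 5% (i.e., specificity 95%)
print(f"PPV = {ppv(0.001, 1.0, 0.95):.2%}")  # ~1.96%

# Second route: posttest odds = pretest odds x LR+,
# where LR+ = sensitivity / (1 - specificity) = 20
pretest_odds = 0.001 / 0.999
posttest_odds = pretest_odds * 20
print(f"Posttest probability = {posttest_odds / (1 + posttest_odds):.2%}")
```

Either route makes the key point visible: where you end up (posttest probability) depends on where you start (pretest probability).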


PEITHO Trial Teaches an Important Lesson

The current issue of the New England Journal of Medicine contains an important trial: the PEITHO trial. It’s important because it tells us what not to do.

In the PEITHO trial patients with intermediate risk pulmonary embolism (right ventricular dysfunction and myocardial injury with no hemodynamic compromise) were randomized to a single weight-based bolus of tenecteplase or placebo. All patients were given unfractionated heparin. Patients were followed for 30 days for the primary outcome of death from any cause or hemodynamic decompensation within 7 days after randomization.

This table shows the efficacy outcomes. Looks promising, doesn’t it?

PEITHO efficacy outcomes

The primary outcome was significantly reduced by 56%. This composite outcome is not a good one, though. Patients would not consider death and hemodynamic decompensation equal, and the pathophysiology of the two outcomes can be quite different. A good composite also requires the intervention to have a similar effect on all its components, yet here the effect on hemodynamic decompensation was greater than the effect on death. Thus, don’t pay attention to the composite; look at its individual components. Only hemodynamic decompensation was significantly reduced (ARR 3.4%, NNT 30). Don’t get me wrong, this is a good thing to reduce.

But with all good can come some bad. This trial teaches that we must pay attention to adverse effects. The table below shows the safety outcomes of the PEITHO trial. Is the benefit worth the risk?

PEITHO safety outcomes

You can see from the table that major extracranial bleeding was increased fivefold (ARI 5.1%, NNH 20), as was stroke, most of which were hemorrhagic (ARI 1.8%, NNH 55).

This trial teaches a few important EBM points (I will ignore the clinical points it makes):

  1. You must always weigh the risks and benefits of every intervention.
  2. Ignore relative measures of outcomes (in this case the odds ratios) and calculate the absolute effects followed by NNT and NNH. These are much easier to compare.
  3. Watch out for bad composite endpoints. Always look at individual components of a composite endpoint to see what was affected.
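Point 2 above is quick to operationalize. A minimal sketch using the absolute risk differences quoted in this post (NNTs are conventionally rounded up to the next whole patient; note the text’s NNH of 55 for stroke rounds down instead):

```python
import math

def nn_from_ard(absolute_risk_difference):
    """NNT (or NNH): the reciprocal of the absolute risk difference,
    rounded up to the next whole patient."""
    return math.ceil(1 / absolute_risk_difference)

print(nn_from_ard(0.034))  # hemodynamic decompensation: ARR 3.4% -> NNT 30
print(nn_from_ard(0.051))  # major extracranial bleeding: ARI 5.1% -> NNH 20
print(nn_from_ard(0.018))  # stroke: ARI 1.8% -> 56 (the text rounds to 55)
```

Comparing an NNT of 30 against an NNH of 20 makes the risk-benefit tradeoff far more concrete than any odds ratio.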

I’m Still Not Crazy About the Pooled Risk Equations in the New Cholesterol Guidelines

Two papers were published this week to further validate the pooled risk equations developed for the ACC/AHA Cholesterol Guidelines.
Muntner and colleagues used the REGARDS participants to assess the calibration and discrimination of the pooled risk equations. This study had potential, as it oversampled patients from the stroke belt. This is important because the pooled risk equations were developed to overcome the limitations of the Framingham tool (mainly its lack of minorities). I have a real problem with this study, though: the pooled risk equations estimate 10-year risk of CHD and stroke, and this study had only 5 years of follow-up for the REGARDS participants. I don’t think their estimates of calibration and discrimination are valid. Risk of CHD and stroke should increase over time, so event rates could change with 5 more years of follow-up.

The important thing this paper adds is the reminder that observational studies often lack active surveillance. Most observational studies rely on self-report of outcomes, so silent events would obviously be missed by the patient, as would events for which the patient didn’t seek evaluation. Muntner and colleagues used Medicare claims data to identify events not detected through routine cohort follow-up and found 24% more events. That is a useful lesson from this study.

In a more useful study, Kavousi and colleagues compared 3 risk prediction tools (pooled risk equations, Framingham, and SCORE) using the Rotterdam Study, a prospective population-based cohort of persons aged 55 years and older. This cohort does have 10 years of follow-up.


This figure shows that at each level of risk the pooled risk equations overestimated risk, though less so in women.

treatment rec

This figure shows the proportion of patients for whom treatment is recommended (red bars), treatment should be considered (yellow bars), and no treatment is recommended (green bars). As you can see the new risk tool leads to the large majority of men “needing treatment” compared to previous guidelines (ATP III) and the current European guidelines (ESC).

calibration curves

Finally, this figure shows the calibration curves, and the calibration was not good; the blue dots should lie right on the red line for good calibration. Furthermore, the c-statistic was 0.67. (The c-statistic is a measure of discrimination: how well the tool differentiates diseased from nondiseased patients. A c-statistic above 0.7 is considered moderate to good, and the closer to 1 the better.)
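For readers unfamiliar with the c-statistic, this toy sketch (the predicted risks are entirely made up, not REGARDS or Rotterdam data) shows what it measures: the probability that a randomly chosen patient who had an event was assigned a higher predicted risk than a randomly chosen patient who did not.

```python
def c_statistic(risks_with_event, risks_without_event):
    """Discrimination as pairwise concordance: the fraction of
    event/non-event pairs in which the event patient received the
    higher predicted risk (ties count one half)."""
    concordant = 0.0
    for e in risks_with_event:
        for n in risks_without_event:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(risks_with_event) * len(risks_without_event))

# Hypothetical predicted 10-year risks
events     = [0.22, 0.15, 0.30, 0.08]
non_events = [0.05, 0.12, 0.18, 0.03, 0.10]
print(c_statistic(events, non_events))  # 0.8: moderate to good discrimination
```

A value of 0.5 is a coin flip; the 0.67 reported for the pooled risk equations sits uncomfortably below the usual 0.7 threshold.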

Why might the pooled risk equations overestimate risk? Maybe they don’t, if you believe the Muntner study; it could just be a problem with the lack of active surveillance in the cohort studies used to validate the tool. Or maybe they really do overestimate risk because they aren’t accurate. Or maybe more contemporary patients receive better therapies that improve overall health, or the baseline risk characteristics of the validation cohorts simply differ too much from those of the development cohorts.

I am still not sold on the new pooled risk equations, but they might not be much better than what we have been using, based on the Kavousi study (Framingham also overpredicted risk and had poor calibration). I think we either need more study and tweaking of the tool, or we use the tool as is and focus more on overall cardiovascular risk reduction (with exercise, diet, tobacco cessation, and diabetes and HTN control) rather than on starting a statin right away.

The Mayo Clinic has a nice patient decision aid that you can use to help patients decide if a statin is right for them: http://statindecisionaid.mayoclinic.org/index.php/site/index



Answering Clinical Questions at the Point of Care: It’s Time to Stop Making Excuses!

Del Fiol and colleagues published a systematic review of studies examining the questions raised and answered by clinicians in the context of patient care. The studies they examined used several methodologies, including after-visit interviews, clinician self-report, direct observation, analysis of questions submitted to an information service, and analysis of information resource search logs. Each of these methodologies has its pros and cons. I’ll review their findings using the four questions that they asked.

How often do clinicians raise clinical questions? On average, clinicians ask 1 question for every 2 patients (range 0.16-1.85).

How often do clinicians pursue questions they raise? On average, they only pursued 47% (range 22-71%).

How often do clinicians succeed at answering the questions they pursue? They were pretty successful when they decided to pursue an answer: 80% of the time they were able to answer the question. Interestingly, clinicians spent less than 2-3 minutes seeking an answer to a specific question. They were clearly choosing to pursue questions that could be answered fairly quickly.

What types of questions were asked? Overall, 34% of questions were related to drug treatment, while 24% were related to the potential causes of a symptom, physical finding, or diagnostic test finding.

I find 3 other findings (from the Box in the manuscript) interesting:

  • Most questions were pursued with the patient still in the practice (not sure if the clinicians searched in front of the patient or left the room- more about this later)
  • Most questions, as you would expect, are highly patient-specific and nongeneralizable.  This is unfortunate for long-term learning.
  • Also unfortunately, clinicians mainly used paper and human resources (more on this in a minute)


Even though Del Fiol examined barriers to answering questions, I refer to another study, by Cook and colleagues, that more closely examined barriers to answering clinical questions at the point of care. Cook conducted focus groups with a sample of 50 primary care and subspecialist internal medicine and family medicine physicians to understand the barriers and factors that influence point-of-care learning and question answering. Of course, the main barrier is time. This study was done in late 2011 into early 2012 and included a wide range of participant ages. With the resources now available on both desktop and handheld devices this barrier should be declining, especially when you consider that the most common question clinicians ask is about drug treatment.

Physicians frequently noted patient complexity as a barrier. Complex patients require more time and often raise more complex questions that are harder to answer with many resources. Almost all guidelines and studies focus on single-disease patients; multimorbidity is rarely covered. Thus many answers will likely rely on clinical expertise and judgment. This is where human resources are likely to be used. I bet few of us question how up to date the colleagues we ask actually are.

Interestingly, Cook’s study participants identified the sheer volume of information as a barrier. As a result, these physicians used textbooks more than electronic resources. I wonder if they understand that a print textbook is at least a year out of date by the time it hits market. How often do they update their textbooks? (likely rarely…just look at a private practice doctor’s bookshelf and you will often see books that are at least 2 or more editions out of date).

Finally, the physicians in Cook’s study felt that searching for information in front of the patient might “damage the patient-physician relationship or make patients uncomfortable.” They couldn’t be more wrong. Patients actually like when we look things up in front of them.  I always do this and I tell them what I am looking up and admit my knowledge limitation. I show them what I found so they can participate in decision making. No one can know everything and patients understand that. I would be wary of a physician who doesn’t look something up.

So, how should a busy clinician go about answering clinical questions?

  1. You must have access to trustworthy resources. Two main resources should suffice: a drug resource (like Epocrates or Micromedex; both are free and available for smartphones and desktops) and what Haynes labels a “Summary” (DynaMed or UpToDate). I leave guidelines out here (even though they are classified as “Summary” resources) because most guidelines are too narrowly focused and many are not explicit enough about their biases.
  2. Answer the most important questions (first, most important to the patient; second, most important to improving your knowledge). If the above resources can’t answer your question and you must consult a colleague, challenge them to support their opinion with data. You will learn something, and likely they will too.
  3. Answer the questions you can in the available time. Many questions should be answerable in 5 minutes or less using the above resources. You are more likely to search for an answer while the patient is in your office than to wait until the end of the day (the studies cited above can attest to that).
  4. Be creative in answering questions. I saw a great video by Paul Glasziou (sorry, I can’t remember which of his videos it was, so no link) where he discussed a practice-based journal club. Your partners are likely developing similar questions as you are. This is how he recommends organizing each journal club session: step 1 (10 min), discuss new questions or topics to research; step 2 (40 min), read and appraise (if needed) a research paper for last week’s problem; and step 3 (10 min), develop an action plan to implement what you just learned. This is doable and makes your practice evidence-based and feel somewhat academic. If you follow the Haynes hierarchy and pick the right types of journal articles (synopses and summaries) you can skip the appraisal part and use the evidence directly.

Ultimately you have to develop a culture of answering questions in your practice. It has to be something you truly value or you won’t do it.  Resources are available to answer questions at the point of care in a timely fashion. At some point we have to stop making excuses for not answering clinical questions.

Should Natriuretic Peptide Guide Chronic CHF Therapy?

Guideline-recommended heart failure care is generally followed by most primary care doctors. They prescribe ACE inhibitors, beta-blockers, and occasionally mineralocorticoid receptor antagonists. Despite prescribing these proven medications, many patients remain symptomatic, are frequently admitted to the hospital, and maybe even die prematurely. One of the main problems is that doctors often don’t titrate medications to the doses used in the landmark studies of these agents. There are several reasons for this, but one is just plain lack of knowledge of what the target doses are.

I know I am guilty of not teaching my house staff what the target doses are. We often start a medication in the hospital or titrate it slightly but leave the final dose adjustments to the outpatient physicians (we of course assume they will do what’s right). A missed teaching opportunity on my part, which I will be correcting this month on my inpatient service. I am going to use a nifty app on my iPad called MyStudies (no affiliation). The app is free but the full complement of journal articles costs $10/year. I plan to use it to review landmark trials with my house staff, focusing on making sure they know the target doses of drugs.

Most of us adjust medication doses based on symptoms or physical findings. Could using a lab test (one that seems to get checked by my ER no matter what the complaint) help us primary care physicians do a better job taking care of CHF patients? This month in the European Heart Journal, Troughton and colleagues published an individual patient data meta-analysis on using B-type natriuretic peptide to guide therapy in patients with chronic, largely systolic CHF. Previous meta-analyses on this topic used study-level data and were limited in what they could adjust for, something a patient-level meta-analysis is much better suited to handle.
The methodology was fair but not Cochrane level (somewhat limited search strategy, not very explicit about methods, no information on publication bias, etc.). They wanted to compare BNP-guided therapy with clinically guided therapy, and they found 9 studies with 2151 total patients that met their inclusion criteria (RCTs reporting all-cause mortality). Despite the lack of statistical heterogeneity, there is definite clinical heterogeneity in the included studies (different target levels of BNP, different study periods, different durations of follow-up, different treatment algorithms). Most studies were fairly small, enrolling 69-499 patients.


What did they find? BNP-guided therapy reduced all-cause mortality by 3.4% in absolute terms (19.3% mortality with clinically guided therapy vs 15.9% with BNP-guided therapy). This difference was only seen in those under 75 years of age, as can be seen in this figure. Heart failure hospitalizations were also reduced, by 4.7% (27.5% with clinically guided therapy vs 22.8% with BNP-guided therapy). Interestingly, there were similar declines in proBNP levels in both groups. I had anticipated that BNP-guided therapy would result in greater reductions in proBNP levels.

What explains their findings? There are several possible explanations:

  1. The most obvious is that patients in the BNP-guided arm received more dose titrations of medications. The authors found that only ACE inhibitor or ARB doses increased in the included studies, and only by 8.4%. No dose increases were seen with beta-blockers or mineralocorticoid receptor antagonists, and loop diuretic doses also stayed the same. The authors found that increasing doses of each of these medications was significantly associated with reduced all-cause mortality.
  2. I wonder if cointerventions (like diet counseling, medication compliance counseling) were intensified in the BNP guided arms more so than in the clinically guided arms. This would not likely be captured in these studies and could explain lower event rates despite minimal increases in ACE inhibitor doses.
  3. Could other agents not measured (like digoxin) have been added in the BNP-guided arms? Digoxin does lower hospitalization rates. No information is given in the meta-analysis about this, and I didn’t go back to the individual studies to see how they handled cointerventions.
  4. Referral to specialty care- this is possible I guess but one of the main effects would be greater titration of meds by cardiologists. ICDs or pacing could have also been done but again this would be related to cointerventions and I would hope this would not have been done differentially.
  5. BNP guided patients could have been seen more frequently by their providers- possible but this should mostly lead to dose titrations or lifestyle modification counseling.
  6. Hawthorne effect

I am not planning to order and follow BNPs on all my CHF patients. Why? Isn’t that what the evidence would say to do? I think the key is not the lab test itself but making sure you titrate doses to those used in the studies; dose increases are the only thing these studies suggest improves mortality. I think BNP is just a reminder to do that. We need a study of BNP-guided therapy vs a clinical pathway that titrates patients to goal doses independent of BNP monitoring. I suspect that study would show no differences in outcomes. You should also remember that BNP is not specific and is affected by renal function, and none of the included studies enrolled patients with AKI or CKD (a problem I deal with often).

N-of-1 Trial for Statin-Related Myalgia: Consider Conducting These Studies in Your Practice

The March 4th edition of the Annals of Internal Medicine contains an article by Joy and colleagues in which they conducted N-of-1 trials in patients who had previously not tolerated statins. This is important because patients often complain that they cannot tolerate statins despite needing them. I have wondered how much of this is a self-fulfilling prophecy, because they hear a lot about statin intolerance from friends and various media outlets. An N-of-1 trial is a great way to determine whether statin-related symptoms are real or imagined.

First, let’s discuss N-of-1 trials.

What is an N-of-1 trial? It’s an RCT of active treatment vs. placebo in an individual patient. The patient serves as his or her own control, thus perfectly controlling for a variety of biases.

When is an N-of-1 trial most useful? This design is not useful for self-limited illnesses, acute or rapidly evolving illnesses, surgical procedures, or prevention of irreversible outcomes (like stroke or MI). It’s most useful for conditions that are chronic and for which therapy is prolonged. It’s best if the effect you are looking for occurs fairly quickly and goes away quickly when treatment is stopped. These trials are a good way to determine the optimal dose of a medication for a patient. They are also good for determining whether an adverse effect is truly due to a medication. Finally, they are a good way to test a treatment’s effect when the clinician feels it will be useless but the patient insists on taking it.

How is an N-of-1 trial conducted? Get informed consent from the patient and make sure they understand that a placebo will be part of the study. The patient then randomly undergoes pairs of treatment periods in which one period of each pair applies the active treatment and one the placebo. A pharmacist will need to be involved to compound the placebo and to develop the randomization scheme (so as to keep clinician and patient blinded). Use a minimum of 3 pairs of treatment periods. There needs to be a washout period when moving from active treatment to placebo and vice versa, and the length of each treatment period needs to be long enough for the outcome of interest to occur. Use the rule of 3s here: if an event occurs on average once every x days, then observe for 3x days to be 95% confident of observing at least 1 event.

What outcome should be measured? Most commonly these trials are conducted to determine the effect of an intervention on quality-of-life measures (e.g., pain, fatigue). Ask the patient what is the most troubling symptom or problem they have experienced and measure that as your outcome. Have the patient keep a diary or rate their symptoms on some meaningful scale at certain follow-up intervals, during both active and placebo treatment periods. You will have to determine how much of a difference is clinically meaningful.

How do I interpret N-of-1 trial data? This can be a little difficult for non-statistically oriented clinicians. You could apply the eyeball test and just see if there are important trends in the data. More rigorously, you could calculate the difference in mean scores between the placebo and active treatment periods and compare them using a t test (calculators are freely available on the internet).
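The t-test comparison just described can be sketched in a few lines of Python (the symptom scores below are hypothetical, purely to show the arithmetic):

```python
import math

def paired_t(active, placebo):
    """Paired t statistic for N-of-1 data: the mean within-pair
    difference divided by its standard error."""
    diffs = [a - p for a, p in zip(active, placebo)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean, mean / math.sqrt(var / n)

# Hypothetical mean symptom scores (0-100 VAS) from 3 treatment pairs
mean_diff, t = paired_t([42.0, 38.0, 45.0], [35.0, 33.0, 36.0])
print(f"mean difference = {mean_diff:.1f}, t = {t:.2f}")
# Compare t against a t distribution with n-1 degrees of freedom, and,
# more importantly, compare the mean difference with your pre-specified
# clinically meaningful threshold.
```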

Back to Joy and colleagues’ N-of-1 trial on statins. They enrolled patients with prior statin-related myalgias. Participants were randomly assigned to receive either the same statin and dose that they previously didn’t tolerate or placebo. They remained on “treatment” for 3-week periods with 3-week washout periods in between. Patients rated their symptoms weekly on visual analogue scales for myalgia overall and for specific symptoms (0-100, with 0 being no symptoms and 100 the worst symptoms). A difference of 13 was considered clinically significant. What did they find? There were no statistically or clinically significant differences between statin and placebo on either the myalgia score (difference 4.37) or the symptom-specific score (3.89). The neat thing the authors did was to determine whether patients resumed taking statins after reviewing the results of their N-of-1 trial: 5 of the 8 patients resumed statins (one didn’t because a statin was no longer indicated).

So are statin-related myalgias mostly in our patients’ heads? Maybe. This study is by no means definitive because it enrolled only 8 patients, but it at least suggests a methodology you can use to truly test whether a patient’s symptoms are statin related. This is important to consider because the most recent lipid treatment guidelines focus on using statins only, not on substituting other agents like ezetimibe or cholestyramine. So give this methodology a try. You and your patients will likely be amazed at what you find.