How to Use A Treatment Recommendation
Gordon H. Guyatt, Jack Sinclair, Deborah J. Cook, Paul Glasziou, for the Evidence Based Medicine Working Group
Based on the Users' Guides to Evidence-based Medicine and reproduced with permission from JAMA. (1999;281(19):1836-1843) Copyright 1995, American Medical Association.
- Clinical Scenario
- The Process of Developing a Recommendation
- Linking Management Options and Outcomes -- Systematic Reviews
- Decision Analysis
- Practice Guidelines
- Current Sources of Treatment Recommendations
- Making recommendations: a hierarchy of rigour
- Are Treatment Recommendations Desirable at All?
Clinical Scenario
You are a primary care practitioner considering the possibility of anticoagulant therapy with warfarin in a new patient, a 76-year-old woman with chronic congestive heart failure and atrial fibrillation. The patient has no hypertension, valvular disease, or other comorbidity. Aspirin is the only antithrombotic agent that the patient has received over the 10 years during which she has been in atrial fibrillation. Her other medications include captopril, furosemide, and metoprolol. The duration of the patient's atrial fibrillation, and her dilated left atrium on echocardiogram, dissuade you from prescribing antiarrhythmic therapy. Discussing the issue with the patient, you find she places a high value on avoiding a stroke, a somewhat lower value on avoiding a major bleed, and would accept the inconvenience associated with monitoring anticoagulant therapy.
You have little inclination to review the voluminous original literature relating to the benefits of anticoagulant therapy in reducing stroke or its risk of bleeding, but hope to find an evidence-based recommendation to guide your advice to the patient. In your office file relating to this problem you find a report of a primary study, a decision analysis, and a recent practice guideline that you hope will help.
Each day, clinicians make dozens of patient management decisions. Some are relatively inconsequential, some are important. Each one involves weighing benefits and risks, gains and losses, and recommending or instituting a course of action judged to be in the patient's best interest. These decisions involve an implicit consideration of the relevant evidence, an intuitive integration of the evidence, and a weighing of the likely benefits and harms. In making choices, clinicians may benefit from structured summaries of the options and outcomes, systematic reviews of the evidence regarding the relation between options and outcomes, and recommendations regarding the best choices. This Users' Guide explores the process of developing recommendations, suggests how the process may be conducted systematically, and introduces a taxonomy for differentiating recommendations that are more rigorous (and thus more likely to be trustworthy) from those that are less rigorous (and thus at greater risk of being misleading).
While recommendations may be directed at health policy makers, our focus is advice for practicing clinicians. We will begin by considering the implicit steps that are involved in making a recommendation.
The Process of Developing a Recommendation
Figure 1 presents the steps involved in developing a recommendation, and the formal strategies that are available. The first step in clinical decision-making is to define the decision. This involves specifying the alternative courses of action, and the alternative outcomes. Often, treatments are designed to delay or prevent an adverse outcome such as stroke, death, or myocardial infarction. In our discussion, we will refer to the outcomes that treatment is designed to prevent as "target outcomes". Treatments are associated with their own adverse outcomes -- side effects or toxicity. Ideally, the definition of the decision will be comprehensive -- all reasonable alternatives will be considered, and all possible beneficial and adverse outcomes will be identified. In patients with non-valvular atrial fibrillation, like the woman in the scenario, options include not treating, giving aspirin, or anticoagulating with warfarin. Outcomes include minor and major embolic stroke, intracranial haemorrhage, gastrointestinal haemorrhage, minor bleeding, and the inconvenience associated with taking and monitoring medication.
Figure 1: A Schematic View of the Process of Developing a Treatment Recommendation
Having identified the options and outcomes, decision-makers must evaluate the links between the two -- what will the alternative management strategies yield in terms of benefit and harm? They must also consider how this impact is likely to vary in different groups of patients. Once the consequences of alternative strategies have been estimated, value judgements about the relative desirability or undesirability of possible outcomes become necessary to allow treatment recommendations. We will use the term "preferences" synonymously with "values" or "value judgements" in referring to the process of trading off positive and negative consequences of alternative management strategies.
Recently, investigators have applied scientific principles to the collection, selection, and summarization of evidence, and the valuing of outcomes. We will briefly describe these systematic approaches.
Linking Management Options and Outcomes -- Systematic Reviews
Unsystematic identification and collection of evidence risks biased ascertainment -- treatment effects may be under-, or more commonly, overestimated and side effects may be exaggerated or ignored. Unsystematic summaries of data run similar risks of bias. One result of these unsystematic approaches may be recommendations advocating harmful treatments, and failing to encourage effective therapy. For example, experts advocated routine use of lidocaine for patients with acute myocardial infarction when available data suggested the intervention was ineffective and possibly even harmful, and failed to recommend thrombolytic agents when data showed patient benefit.
Systematic reviews deal with this problem by explicitly stating inclusion and exclusion criteria for evidence to be considered, conducting a comprehensive search for the evidence, and summarizing the results according to explicit rules that include examining how effects may vary in different patient sub-groups. When a systematic review pools data across studies to provide a quantitative estimate of overall treatment effect we call it a meta-analysis. Systematic reviews provide strong evidence when the quality of the primary studies is high and sample sizes are large, and less strong evidence when designs are weaker and sample sizes small. Because judgement is involved in many steps in a systematic review (including specifying inclusion and exclusion criteria, applying these criteria to potentially eligible studies, evaluating the methodological quality of the primary studies, and selecting an approach to data analysis) systematic reviews are not immune from bias. Nevertheless, in their rigorous approach to collecting and summarizing data, systematic reviews reduce the likelihood of bias in estimating the causal links between management options and patient outcomes.
Decision Analysis
Rigorous decision analysis provides a formal structure for integrating the evidence about the beneficial and harmful effects of treatment options with the values or preferences associated with those beneficial and harmful effects. When done well, a decision analysis will use systematic reviews of the best evidence to estimate the probabilities of the outcomes and use appropriate sources of preferences (those of society, or of relevant patient groups) to generate treatment recommendations. When a decision analysis includes costs among the outcomes, it becomes an economic analysis, and summarizes tradeoffs between gains (typically valued in quality-adjusted life-years) and resource expenditure (valued in dollars). A decision analysis will be open to bias if it fails criteria for a systematic overview in accumulating and summarizing evidence, or uses preferences that are arbitrary or come from small or unrepresentative populations (such as a small group of health-care providers).
Practice Guidelines
Practice guidelines provide an alternative structure for integrating evidence and applying values to reach treatment recommendations. Practice guideline methodology places less emphasis on precise quantitation than does decision analysis. Instead, it relies on the consensus of a group of decision-makers, ideally including experts, front-line clinicians, and patients, who carefully consider the evidence and decide on its implications. Rigorous practice guidelines will also use systematic reviews to summarize evidence, and sensible strategies to attribute values to alternative outcomes as they generate treatment recommendations. Guideline developers may focus on local circumstances. For example, clinicians practicing in rural parts of less industrialized countries, without the resources to monitor the intensity of anticoagulation, may reject it as a management approach for patients with atrial fibrillation. Practice guidelines may fail methodologic standards in the same ways as decision analyses.
We will now contrast these systematic approaches to developing recommendations with historical practice.
Current Sources of Treatment Recommendations
Traditionally, authors of original, or primary, research into therapeutic interventions include recommendations about the use of these interventions in clinical practice in the discussion section of their papers. Authors of systematic reviews and meta-analyses also tend to provide their impressions of the management implications of their studies. Typically, however, individual trials or overviews do not consider all possible management options, but focus on a comparison of two or three alternatives. They may also fail to identify subpopulations in which the impact of treatment may vary considerably. Finally, when the authors of overviews provide recommendations, they are not typically grounded in an explicit presentation of societal or patient preferences.
Failure to consider these issues may lead to variability in recommendations given the same data. For example, several meta-analyses of selective decontamination of the gut using antibiotic prophylaxis for pneumonia in critically ill patients reached very similar results regarding the impact of treatment on target outcomes, yet their recommendations varied from suggesting implementation, to equivocation, to rejecting implementation. Varying recommendations reflect the fact that both investigators reporting primary studies and meta-analysts often make their recommendations without benefit of an explicit standardized process or set of rules. When benefits or risks are dramatic, and these benefits and risks are essentially homogeneous across an entire population, intuition may provide an adequate guide to making treatment recommendations. Such situations are unusual. In most instances, because of their susceptibility to both bias and random error, intuitive recommendations risk misleading the clinician.
These considerations suggest that when clinicians examine treatment recommendations, they should critically evaluate the methodologic quality of the recommendations. The greater the extent to which recommendations adhere to the methodologic standards we have mentioned, the greater faith clinicians may place in those recommendations. Table 1 presents a scheme for classifying the methodological quality of treatment recommendations, emphasizing the three key components: consideration of all relevant options and outcomes, a systematic summary of the evidence, and explicit and/or quantitative consideration of societal or patient preferences. In the next section of the text, we will describe the rating system summarized in Table 2.
Table 1: Methodologic Requirements for Systematic, Rigorous Recommendations
Table 2: A hierarchy of rigour in making treatment recommendations
Making recommendations: a hierarchy of rigour
Systematic summary of evidence for all relevant interventions using appropriate values
Quantitative summary of evidence and values
The most rigorous approach to making recommendations (which we will call a systematic synthesis) involves precisely quantifying all benefits and risks; determining the values of either a group of patients or the general population; where uncertainty exists, making a systematic and quantitative exploration of the range of possible true values; and using quantitative methods to synthesize the data. One approach to meeting these criteria involves conducting a formal decision analysis. Many decision analyses fail to carry out each step in the process in an optimally rigorous fashion; to do so usually requires a major research project.
Challenges for decision analysts include conducting the systematic reviews required to generate the best estimates of benefits and risks associated with treatment options, and measuring how the general public or patients value the relevant outcomes. Typically, a decision analysis values each treatment arm in terms of quality-adjusted life years. When costs are considered, the decision analysis becomes an economic analysis, and we think in terms of additional dollars spent to gain an additional quality-adjusted life year. The optimal therapy, or the cost-effectiveness of alternatives, may differ depending on untreated patients' risk of the target outcome.
What a decision analysis or economic analysis usually does not do is to value the benefits, risks and costs and provide an explicit threshold for decision-making. For example, a new treatment might cost $50,000 per quality-adjusted life-year gained. Is this a bargain, or too great a cost to warrant treatment? Often, decision analysts will refer to the cost/effectiveness or cost/utility ratios of currently-used treatments to help with this decision. For instance, the decision analysis from the scenario in this article concluded that while the cost of warfarin for patients with at least one factor increasing their risk of embolism was $8,000 per quality-adjusted life-year saved, the cost was $375,000 per quality-adjusted life-year saved for a 65-year-old with no risk factors. The authors compared these figures to the $50,000 to $100,000 cost per quality-adjusted life-year gained when screening adults for hypertension.
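The benchmark comparison described above can be sketched in a few lines of Python. The dollar figures are those quoted in the text; the function name and the three labels are illustrative assumptions, not part of the published decision analysis.

```python
# Benchmark range quoted in the text: cost per QALY of screening adults for hypertension.
BENCHMARK_LOW = 50_000
BENCHMARK_HIGH = 100_000


def compare_to_benchmark(cost_per_qaly: float) -> str:
    """Label a cost per QALY relative to the benchmark range (labels are illustrative)."""
    if cost_per_qaly < BENCHMARK_LOW:
        return "below benchmark range"
    if cost_per_qaly <= BENCHMARK_HIGH:
        return "within benchmark range"
    return "above benchmark range"


# Cost-per-QALY ratios reported by the decision analysis, as quoted in the text.
reported = {
    "warfarin, at least one risk factor": 8_000,
    "warfarin, 65-year-old with no risk factors": 375_000,
}

for group, ratio in reported.items():
    print(f"{group}: ${ratio:,} per QALY -> {compare_to_benchmark(ratio)}")
```

A comparison of this kind only ranks a treatment against an accepted practice; it does not say whether the benchmark itself is good value.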
Quantitative summary of evidence and values: Explicit Decision Thresholds
Investigators can use the principles of decision analysis to arrive at explicit decision thresholds and present these thresholds in ways that facilitate clinicians' understanding. One such approach involves the number of patients to whom one must administer an intervention to prevent a single target event, the Number Needed to Treat (NNT). Typically, the NNT falls as patients' risk of an adverse outcome rises, and may become extremely large when patients are at very low risk. In a previous Users' Guide, we have described the threshold NNT, the dividing line between when treatment is warranted (the NNT is low enough that the benefits outweigh the costs and risks), and when it is not (the NNT is too great to warrant treatment). Deriving the threshold NNT involves specifying the relative value associated with preventing the target outcome versus incurring the side effects and inconvenience associated with treatment.
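The inverse relation between baseline risk and NNT follows from the definition of the NNT as the reciprocal of the absolute risk reduction (ARR). As a sketch, writing $r_0$ for the untreated risk of the target outcome and RRR for the relative risk reduction (symbols introduced here for illustration, and assuming the RRR is constant across risk groups):

```latex
\mathrm{NNT} \;=\; \frac{1}{\mathrm{ARR}} \;=\; \frac{1}{r_0 \times \mathrm{RRR}}
```

With the RRR held fixed, halving $r_0$ doubles the NNT, which is why the NNT can become extremely large in very low risk patients.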
Investigators using this approach may also consider costs. If so, they face the additional requirement of specifying the number of dollars one would be willing to pay to prevent a single target event. With or without considering costs, investigators can plug the values they adduce into an equation that generates the threshold NNT. They can then look at the risk of the target outcome in untreated subpopulations to whom clinicians might consider administering the intervention. Combining this information with the relative risk reduction associated with the treatment, they can determine on which side of the threshold the treatment falls.
Returning to our example, warfarin decreases the risk of stroke in patients with non-valvular atrial fibrillation. Since anticoagulation increases bleeding risk, it is not self-evident that we should recommend the treatment for our patients, and we must find a way of trading off decreased strokes against increased bleeds. We can calculate the threshold NNT by specifying the major adverse outcome of treatment, bleeding, and the frequency with which it occurs due to treatment. We then specify the impact of these deleterious effects relative to the target event the treatment prevents, a stroke. A variety of studies of relevant patient populations suggest that on average, patients consider 1 severe stroke equivalent to 5 episodes of serious gastrointestinal bleeding. We use these figures to calculate our threshold NNT, which proves to be approximately 152 (Figure 2). This implies that if we need to anticoagulate fewer than 152 patients to prevent a stroke, we will do so; if we must anticoagulate more than 152 patients, then our recommendation will be to not treat.
Figure 2: Calculating the Threshold Number Needed to Treat (T-NNT) for Warfarin Treatment of Patients With Nonvalvular Atrial Fibrillation
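The logic of the threshold can be sketched as follows. This is a simplified illustration that ignores costs, not a reproduction of the calculation in Figure 2. It assigns the target event a value of 1 and sets the threshold where the expected harm per treated patient equals the benefit of preventing 1/NNT target events. The relative value of a bleed (one fifth of a stroke) comes from the text. The excess bleed rate is a hypothetical number chosen only for illustration, so the result is not expected to match the article's figure of 152.

```python
def threshold_nnt(adverse_events):
    """Threshold NNT when one target event is assigned a value of 1.

    adverse_events: iterable of (excess_rate, relative_value) pairs, where
    excess_rate is the treatment-induced event rate per patient-year and
    relative_value is the event's disvalue as a fraction of one target event.
    """
    harm_per_patient = sum(rate * value for rate, value in adverse_events)
    if harm_per_patient <= 0:
        raise ValueError("expected harm per patient must be positive")
    return 1.0 / harm_per_patient


# From the text: 1 severe stroke is valued like 5 serious gastrointestinal bleeds.
BLEED_RELATIVE_VALUE = 1 / 5
# HYPOTHETICAL excess bleed rate (3 per 100 patient-years), for illustration only.
EXCESS_BLEED_RATE = 0.03

t_nnt = threshold_nnt([(EXCESS_BLEED_RATE, BLEED_RELATIVE_VALUE)])
print(f"Illustrative threshold NNT: {t_nnt:.1f}")
```

Treatment is favoured when the NNT for a given patient group falls below the value returned. Further adverse outcomes, such as inconvenience, could be added as extra pairs in the list.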
The threshold NNT then facilitates recommendations for specific patient groups. Table 3 summarizes the calculation of the NNT, and the associated comparison with the threshold, for two groups of patients. A meta-analysis of randomized trials tells us that anticoagulation reduces the risk of stroke by 68% (95% confidence interval 50% to 79%), and that this risk reduction is consistent across clinical trials. The meta-analysis also provides stroke risk estimates for different groups of patients. Patients over 75 with any of previous cerebrovascular events, diabetes, hypertension, or heart disease have a stroke risk of approximately 8.1% per year. Anticoagulation reduces this risk to 2.6%, an absolute reduction of 5.5%, giving an NNT of 1 / 0.055 or approximately 18 per year. The NNT for this group is appreciably lower than the threshold NNT, suggesting that such patients should be treated. Patients younger than 65 with no risk factors have a one-year stroke risk of 1%, which anticoagulation reduces to 0.32%. The associated NNT of 146 approximates the threshold NNT of 152 and suggests the decision about whether or not to treat is a toss-up.
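The arithmetic in this paragraph can be checked with a short Python sketch. The baseline risks, the relative risk reduction and its confidence interval, and the threshold of 152 are taken from the text. The three-way rule in `classify` (treat, do not treat, toss-up) is an illustrative assumption about how the confidence interval might be used, not a rule stated by the authors.

```python
THRESHOLD_NNT = 152              # threshold NNT quoted in the text
RRR_POINT = 0.68                 # pooled relative risk reduction from the meta-analysis
RRR_LOW, RRR_HIGH = 0.50, 0.79   # 95% confidence interval for the RRR


def nnt(baseline_risk: float, rrr: float) -> float:
    """Patients treated for one year to prevent one stroke: 1 / (risk * RRR)."""
    return 1.0 / (baseline_risk * rrr)


def classify(baseline_risk: float) -> str:
    """Compare the NNT range implied by the RRR interval with the threshold."""
    smallest = nnt(baseline_risk, RRR_HIGH)  # largest RRR gives the smallest NNT
    largest = nnt(baseline_risk, RRR_LOW)    # smallest RRR gives the largest NNT
    if largest < THRESHOLD_NNT:
        return "treat"
    if smallest > THRESHOLD_NNT:
        return "do not treat"
    return "toss-up"


# Annual untreated stroke risks quoted in the text.
groups = {
    "over 75 with risk factors": 0.081,
    "under 65 with no risk factors": 0.01,
}

for label, risk in groups.items():
    print(f"{label}: NNT about {nnt(risk, RRR_POINT):.0f} -> {classify(risk)}")
```

Using unrounded risks, the point estimates come to roughly 18 and 147, close to the 18 and 146 reported above. For the low-risk group the NNT range implied by the confidence interval straddles the threshold, which is consistent with describing that decision as a toss-up.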
Table 3: Using the NNT to make treatment recommendations
Clinicians or health-care decision makers interested in considering costs in their decisions can look for help from the model. Costs can be included by specifying the dollar value associated with preventing adverse outcomes (for example, Laupacis and colleagues have suggested the most that society might be willing to pay to gain a quality-adjusted life year is $100,000). When we consider costs as calculated in the decision analysis from the patient scenario, we arrive at a threshold NNT of 53, suggesting a more conservative approach to anticoagulant administration.
Investigators can use units other than NNT to develop clinically useful decision thresholds. For example, for 81 patients previously treated with cis-platinum based chemotherapy, the average minimum gain in survival which was felt to make the chemotherapy worthwhile was 4.5 months for mild toxicity and 9 months for severe toxicity. Such a threshold could be integrated with information about the actual gain in life associated with the treatment to help form the basis for a recommendation about use of cis-platinum therapy.
Like other quantitative approaches, considering NNT and the threshold NNT, or alternative thresholds, is intended to supplement clinical judgement, not replace it. Investigators exploring different treatment choices have found the methodology useful. However clinicians use it, the approach highlights the necessity for both valuing the benefits and risks of treatment, and understanding the magnitude of those benefits and risks, in making a treatment decision.
Quantitative summary of evidence, qualitative summary of preferences
Practice guidelines, if they are to minimize bias, should not substitute expert opinion for a systematic review of the literature, an explicit and sensible process for valuing outcomes, an explicit consideration of the impact of uncertainty associated with the evidence and values used in the guidelines, and an explicit statement of the strength of evidence supporting the guideline. When a practice guideline meets these methodological standards, and thereby minimizes bias, we refer to the guideline as "evidence-based".
Once they have the evidence, investigators and clinicians are often uncomfortable with explicitly specifying preferences in moving from evidence to action. Their reluctance is understandable. Specifying a specific tradeoff between, say, a stroke and a gastrointestinal bleed is not an exercise with which we are familiar. People may feel that identifying a specific value -- a stroke is equivalent to 2.5 gastrointestinal bleeds, for instance -- implies more precision than is realistic. Discomfort may rise further when we specify a dollar value associated with preventing an adverse event.
This may be one reason that participants in the development of rigorous practice guidelines, including experts in the content area, methodologists, community practitioners, and patients and their representatives, seldom use numbers to identify the value judgements they are making. Still, a rigorous guideline will establish, reflect, and make explicit the community and patient values on which the recommendation is based.
Most practice guidelines fail to systematically summarize the evidence. Even those that meet criteria for evidence accumulation and summarization do not usually make their underlying values explicit. Guidelines that do not meet either set of criteria produce recommendations of low methodologic rigour.
Practice guidelines that meet these criteria provide an alternative to quantitative strategies to arrive at a systematic synthesis.
Systematic review of evidence, unsystematic application of values
Traditionally, investigators provide their results and then make an intuitive recommendation about the action that they believe should follow from their evidence. They may do so without considering all treatment options, or all outcomes. Even when they consider all relevant treatments and outcomes, they may fail to use community or patient values, or even to make the values they are using explicit. For instance, the authors of a meta-analysis of antithrombotic therapy in atrial fibrillation stated "about one patient in seven in the combined study cohort were at such low risk of stroke (1% per year) that chronic anticoagulation is not warranted." Here, the relative value of stroke and gastrointestinal bleeding is implicit in the recommendation. The nature of the value judgement is not transparent, and we have no guarantee that the implicit values reflect those of our patient or community.
Clinicians faced with such recommendations need to take care that they are aware of all relevant outcomes, both reductions in targets and treatment-related adverse events, and are aware of the relative values implied in the treatment recommendations.
Unsystematic review, unsystematic synthesis
The unsystematic approach represents the traditional strategy of accumulating evidence and summarizing evidence in an unsystematic fashion, and then applying implicit preferences to arrive at a treatment recommendation. The approach is open to bias, and is likely to lead to consistent, valid recommendations only when the gradient between beneficial and adverse consequences of alternative actions is very large.
Both quantitative strategies and practice guidelines, when done rigorously, are very resource-intensive. Investigators may adopt less onerous methods and still provide useful insights. Meta-analysts may wish to take the first steps in making treatment recommendations without a formal decision analysis or practice guideline development exercise. If they are to optimize the rigour of these tentative recommendations they will comprehensively identify all options and outcomes and use their meta-analysis to establish the causal links between the two. They may then choose to label values in only a qualitative way, such as: "we value preventing a stroke considerably more highly than incurring a gastrointestinal bleed. Given this value, we would be willing to treat a moderate to large number of patients to prevent a single target event, and would therefore recommend treating all but those at lowest risk of stroke."
Clinicians may find such recommendations useful, and they have the advantage of highlighting that if one does not share the specified values, one would choose an alternative treatment strategy. They may not, however, reflect community or patient preferences. In addition, they are less specific than the process of placing a number on our values. While quantifying values may make us uncomfortable, we are regularly (if unconsciously) making such judgements in the process of instituting or withholding treatment for our patients.
Are Treatment Recommendations Desirable at All?
The approaches we have described highlight that patient management decisions are always a function of both evidence and preferences. Clinicians may point out that values are likely to differ substantially between settings. Monitoring of anticoagulant therapy might take on a much stronger negative value in a rural setting where travel distances are large, or in a more severely resource-constrained environment where there is a direct inverse relationship between (for example) the resources available for purchase of antibiotics and those allocated to monitoring levels of anticoagulation.
Patient-to-patient differences in values are equally important. The magnitude of the negative value of anticoagulant monitoring, or the relative negative value associated with a stroke versus a gastrointestinal bleed, will vary widely between individual patients, even in the same setting. If decisions are so dependent on preferences, what is the point of recommendations?
This line of argument suggests that investigators should systematically search, accumulate and summarize information for presentation to clinicians. In addition, investigators may highlight the implications of different sets of values for clinical action. The dependence of the decision on the underlying values, and the variability of values, would suggest that such a presentation would be more useful than a recommendation.
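One way investigators might present the implications of different sets of values is a small sensitivity table. The Python sketch below varies how many serious bleeds a patient considers equivalent to one stroke and shows how a simplified, cost-free threshold NNT moves with that value. The value sets 2.5 and 5 appear in the text and the NNT of 146 is the figure quoted for low-risk patients. The value of 10 and the excess bleed rate are hypothetical, so the resulting thresholds are illustrative only.

```python
# HYPOTHETICAL excess rate of serious bleeding caused by treatment, per patient-year.
EXCESS_BLEED_RATE = 0.03
# NNT quoted in the text for patients younger than 65 with no risk factors.
LOW_RISK_NNT = 146


def threshold_nnt(bleeds_per_stroke: float,
                  excess_bleed_rate: float = EXCESS_BLEED_RATE) -> float:
    """Simplified threshold: 1 / (bleed rate * value of a bleed relative to a stroke)."""
    return bleeds_per_stroke / excess_bleed_rate


def recommend(bleeds_per_stroke: float, nnt: float = LOW_RISK_NNT) -> str:
    """Favour treatment only when the NNT is below the threshold for these values."""
    return "treat" if nnt < threshold_nnt(bleeds_per_stroke) else "do not treat"


# 2.5 and 5 bleeds per stroke are mentioned in the text; 10 is a hypothetical extra.
for bleeds_per_stroke in (2.5, 5, 10):
    t = threshold_nnt(bleeds_per_stroke)
    print(f"1 stroke = {bleeds_per_stroke} bleeds: threshold NNT {t:.0f}, "
          f"NNT {LOW_RISK_NNT} -> {recommend(bleeds_per_stroke)}")
```

Under these assumptions the same evidence supports opposite actions for patients with different values, which is the point of presenting the value dependence rather than a single recommendation.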
We find this argument compelling. Its implementation is, however, dependent on standard methods of summarizing and presenting information that clinicians are comfortable interpreting and using. Furthermore, it implies clinicians having the time, and the methods, to ascertain patient values that they can then integrate with the information from systematic reviews of the impact of management decisions on patient outcomes. These requirements are unlikely to be fully met in the immediate future. Moreover, treatment recommendations are likely to remain useful for providing insight, marking progress, highlighting areas where we need more information, and stimulating productive controversy. In any case, clinical decisions are likely to improve if clinicians are aware of the underlying determinants of their actions, and are able to be more critical about the recommendations offered to them. Our taxonomy may help to achieve both goals.
Resolution of the Scenario
The closest statement to a recommendation relevant to your patient from the original journal article is the following: "many elderly patients with atrial fibrillation are unable to sustain chronic anticoagulation. Furthermore, the risk of bleeding (particularly intracranial haemorrhage) was increased during anticoagulation of elderly patients in our study." Since this study neither summarized the available evidence nor explicitly stated its underlying values, its recommendation is low in rigour.
The decision analysis uses systematic summaries of the available evidence and specifies the patient values used in developing its conclusion that "Treatment with warfarin is cost-effective in patients with non-valvular atrial fibrillation and one or more additional risk factors for stroke", placing it in the high rigour category. Moreover, the patient values used in the analysis appear consistent with your patient's preferences. The only limitation of the decision analysis is that its bottom-line recommendation involves considerations of cost, which you have reservations about including.
The practice guideline once again uses a systematic summary of the evidence and, though making frequent reference to patients' values, does not specify the relative value of stroke and bleeding implied in its strong recommendation that high-risk patients such as ours be offered anticoagulant therapy. Nevertheless, armed with consistent recommendations from a systematic synthesis and a recommendation of intermediate rigour, you feel confident recommending that your patient begin taking warfarin.
Stroke Prevention in Atrial Fibrillation Investigators. Warfarin versus aspirin for prevention of thromboembolism in atrial fibrillation: Stroke Prevention in Atrial Fibrillation II Study. Lancet. 1994;343:687-91.
Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. Treatments for myocardial infarction. JAMA. 1992;268:240-8.
Richardson WS, Detsky AS. Users' guides to the medical literature. VII. How to use a clinical decision analysis. B. What are the results and will they help me in caring for my patients? Evidence Based Medicine Working Group. JAMA. 1995;273:1610-3.
Richardson WS, Detsky AS. Users' guides to the medical literature. VII. How to use a clinical decision analysis. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1995;273:1292-5.
Drummond MF, Richardson WS, O'Brien BJ, Levine M, Heyland D. Users' guides to the medical literature. XIII. How to use an article on economic analysis of clinical practice. A. Are the results of the study valid? Evidence-Based Medicine Working Group. JAMA. 1997;277:1552-7.
O'Brien BJ, Heyland D, Richardson WS, Levine M, Drummond MF. Users' guides to the medical literature. XIII. How to use an article on economic analysis of clinical practice. B. What are the results and will they help me in caring for my patients? Evidence-Based Medicine Working Group. JAMA. 1997;277:1802-6.
Hayward RS, Wilson MC, Tunis SR, Bass EB, Guyatt G. Users' guides to the medical literature. VIII. How to use clinical practice guidelines. A. Are the recommendations valid? The Evidence-Based Medicine Working Group. JAMA. 1995;274:570-4.
Wilson MC, Hayward RS, Tunis SR, Bass EB, Guyatt G. Users' guides to the medical literature. VIII. How to use clinical practice guidelines. B. What are the recommendations and will they help you in caring for your patients? The Evidence-Based Medicine Working Group. JAMA. 1995;274:1630-2.
Vandenbroucke-Grauls CM, Vandenbroucke JP. Effect of selective decontamination of the digestive tract on respiratory tract infections and mortality in the intensive care unit. Lancet. 1991;338:859-62.
Meta-analysis of randomised controlled trials of selective decontamination of the digestive tract. Selective Decontamination of the Digestive Tract Trialists' Collaborative Group. BMJ. 1993;307:525-32.
Guyatt GH, Sackett DL, Sinclair JC, Hayward R, Cook DJ, Cook RJ. Users' guides to the medical literature. IX. A method for grading health care recommendations. Evidence-Based Medicine Working Group. JAMA. 1995;274:1800-4.
Glasziou PP, Bromwich S, Simes RJ. Quality of life six months after myocardial infarction treated with thrombolytic therapy. AUS-TASK Group. Australian arm of International tPA/SK Mortality Trial. Med J Aust. 1994;161:532-6.
Laupacis A, Feeny D, Detsky AS, Tugwell PX. How attractive does a new technology have to be to warrant adoption and utilization? Tentative guidelines for using clinical and economic evaluations. CMAJ. 1992;146:473-81.
Robbins JM, Tilford JM, Jacobs RF, Wheeler JG, Gillaspy SR, Schutze GE. A number-needed-to-treat analysis of the use of respiratory syncytial virus immune globulin to prevent hospitalization. Arch Pediatr Adolesc Med. 1998;152:358-66.