James O. Sanders, MD
Evidence-Based Committee Chair
Levels of Evidence Based on Study Design

|Level|Study Design|
|---|---|
|1|Randomized Clinical Trials|
|2|Non-Randomized Clinical Trials or Cohort Studies|
|3|Case-Control Series|
As you may know, the American Academy of Orthopaedic Surgeons (AAOS) and North American Spine Society (NASS) have been heavily involved in creating evidence-based Clinical Practice Guidelines (CPGs). AAOS is currently beginning development of Appropriate Use Criteria (AUC), which was previously pioneered by the American College of Cardiology (ACC). At the 47th Annual Meeting, a Lunchtime Symposium explained these tools along with the more fundamental tool, systematic reviews.
Many people are familiar with the older version of the guidelines, which was based upon expert opinion. However, expert opinion has fallen on hard times. Many of the experts involved in earlier guidelines were found to have significant conflicts of interest and strongly biased views. Guidelines developed by expert opinion have not held up under scrutiny, and modern guidelines require the best evidence of all: our literature.
Developing a guideline requires a detailed systematic review of the literature. A systematic review differs from a standard review by critically analyzing the strength of the evidence: high-quality comparative analyses rank higher than non-comparative case series, which in turn rank higher than expert opinion. Without a quality systematic review aimed at answering the important questions, CPGs and AUCs are no better than our opinions.
Grades of Recommendation Based Upon the Quality of Supporting Evidence

|Grade|Supporting Evidence|
|---|---|
|Strong|Multiple Level I Evidence Sources|
|Moderate|One Level I, or Multiple Level II/III|
|Weak|One Level II or III, or Multiple Level IV|
|Inconclusive|Insufficient or Conflicting Evidence|
|Consensus|No Evidence, but Expert Work Group Opinion|
Evidence-based CPGs then ask the questions, "what works" and "what does not work" in the diagnosis and treatment of patients. The strength of a guideline reflects the quality of the questions asked and the strength of the underlying literature. Individual recommendations are graded based on the evidence strength, which also reflects whether the recommendation is likely to change with further research.
Evidence-based CPGs use language that is often intimidating to physicians, and the generally low levels of evidence in orthopaedics have often been disappointing.
However, without this structured questioning, other evidence-based items such as AUCs and Practice Improvement Modules (PIMs) are meaningless. For example, with high-quality evidence against vertebroplasty, an AUC or PIM for vertebroplasty would be useless. The advantage of well-done CPGs is that they synthesize a large amount of literature, allow flexibility in decision-making, and can help establish research priorities.
But because the medical literature is often not definitive and clinicians must still practice under uncertainty, AUCs were developed as a structured technique by the RAND Corporation to help define the boundaries of acceptable practice, i.e., "in whom is it useful?" AUCs are designed to combine the best available evidence with the collective judgment of experts to produce a statement regarding the appropriateness of performing a procedure based on patient-specific symptoms, medical history, and test results. For example, in AIS, is it appropriate to operate on a 15-degree thoracic curve, or on someone with a soon-to-be-fatal disease, versus a healthy immature adolescent with a 75-degree thoracic curve?
Grades of Recommendation and Recommendation Language

|Grade|Recommendation Language|
|---|---|
|Weak|"It is an Option"|
|Inconclusive|"We Can Neither Recommend for nor Against"|
|Consensus|"In the Absence of Reliable Evidence, it is the Opinion of the Work Group That . . ."|
AUCs are developed in three stages and require a sound understanding of the literature. In the initial stage, the important parameters of a disorder are defined and a matrix of plausible clinical scenarios is composed. In the second stage, another group assesses the reasonableness of those scenarios. Finally, in the third stage, the scenarios are discussed and the appropriateness of a procedure for each is ranked on a scale of 1-9, from inappropriate to necessary. Fewer than 50 percent of this last group's participants may be involved in the actual performance of the diagnostic technique or procedure.
Scenarios ranked as inappropriate (1-3) should not be performed. Those ranked necessary (7-9) should be performed and, if they are not, should be promoted. Those in the middle group (4-6) are potential subjects for research studies.
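The three appropriateness bands can be expressed as a simple rule. The sketch below is purely illustrative (the function name and band labels are ours, not part of any AAOS or RAND tool) and shows how a panel's 1-9 rating maps to an action:

```python
def classify_appropriateness(rating: int) -> str:
    """Map a 1-9 appropriateness rating to its band.

    Bands follow the article: 1-3 inappropriate (should not be
    performed), 4-6 uncertain (a potential research subject),
    7-9 necessary (should be performed and promoted).
    """
    if not 1 <= rating <= 9:
        raise ValueError("rating must be between 1 and 9")
    if rating <= 3:
        return "inappropriate"
    if rating <= 6:
        return "uncertain"
    return "necessary"
```

The middle band is the interesting one for investigators: scenarios landing there are exactly where the evidence is thin enough to justify a study.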
Having a basic understanding of systematic reviews, CPGs, and AUCs is important because these tools will become increasingly important under pressure to provide high-value care. Each is a method of making the underlying scientific evidence well understood, and each helps us bring the best practices to our patients.
Chair: James O. Sanders, MD Committee Members: David W. Polly, MD; Steven D. Glassman, MD; Jacob M. Buchowski, MD; Charles H. Crawford, III, MD; Justin S. Smith, MD; Douglas C. Burton, MD; Serena S. Hu, MD; Baron S. Lonner, MD.