BMJ  2003;327:1136-1138 (15 November), doi:10.1136/bmj.327.7424.1136

Paper

Retrospective analysis of evidence base for tests used in diagnosis and monitoring of disease in respiratory medicine

Z Borrill, clinical fellow1, C Houghton, clinical fellow1, P J Sullivan, consultant1, P Sestini, associate professor of respiratory diseases2

1 Department of Cardiorespiratory Medicine, Hope Hospital, Manchester M6 8HD, 2 Department of Clinical Medicine and Immunological Sciences, Division of Respiratory Diseases, University of Siena, Viale Bracci 3, 53100 Siena, Italy

Correspondence to: P J Sullivan Paul.sullivan{at}srht.nhs.uk

Abstract

Objectives To determine how many common clinical tests used in a respiratory medicine outpatient clinic are based on high quality evidence.

Design Retrospective review of case notes. Record of first three tests for each patient. Diagnostic tests, tests used to assess an existing condition, and explicit trials of therapy were included. Literature search for supporting evidence and grading of best evidence for each test.

Setting Inner city university teaching hospital in the United Kingdom.

Participants All new outpatients referred to a single respiratory medicine team over a period of three months.

Main outcome measures Proportion of tests supported by level 1a-1c evidence (scale developed by Centre for Evidence Based Medicine).

Results Only half the tests that were used to make or exclude a diagnosis and a fifth of the tests used to assess a known condition were supported by level 1a-1c evidence. There was no evidence to support trials of therapy.

Conclusions A large proportion of clinical tests in respiratory medicine are not supported by level 1a-1c evidence. None of the therapeutic trials that were used were supported by evidence.

Introduction

Clinical practice based on scientific evidence is a major goal of the clinical governance process.1 The randomised controlled trial is regarded as the standard for the assessment of therapeutic interventions.2 Several studies have examined how many treatments in everyday clinical practice are based on good evidence in a range of specialties and in general practice.3-6 However, good treatment relies on accurate diagnosis and doubts have been expressed regarding the quality and breadth of the current evidence base for diagnostic tests. Criteria for appraisal of papers that assess medical tests are available,7 just as they are for studies that look at therapeutic interventions, and in diagnostic testing poor study design has been shown to be associated with significant outcome bias.8

We used established criteria to assess the quality of available evidence for tests used in routine outpatient clinical practice in one respiratory medicine clinic. Previous studies of the proportion of therapeutic interventions that are evidence based have used the patient as denominator, expressing findings as the proportion of patients who received at least one evidence based intervention. Tests behave differently in that the final diagnosis may be based on a combination of test results. If an individual patient undergoes a series of tests that include high quality evidence based tests as well as inaccurate or unassessed tests the final diagnosis may be incorrect. We therefore used tests as the denominator rather than patients.

Methods

The study took place in a UK inner city teaching hospital that provides a referral service for primary care and other specialties. We examined the notes of all consecutive patients referred to the respiratory outpatient clinic in a three month period and recorded the first three eligible tests ordered for each patient. We included tests if they were performed to make a diagnosis or to assess a prediagnosed condition. We excluded tests performed as part of routine preclinical investigation and tests, such as full blood count, if they seemed to have been performed without any specific diagnosis in mind. Routine clinical examination was not included. The tests used were recorded along with the question that they were being used to answer. We used these test-question combinations as the denominator for this study—for example, "serum angiotensin converting enzyme concentration to diagnose sarcoidosis" or "serum angiotensin converting enzyme concentration to assess activity of known sarcoidosis" were considered separately.
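As a rough illustration of why the denominator matters, the sketch below counts distinct test-question combinations rather than patients. The records are hypothetical and are not data from this study; the function name `count_combinations` is likewise an assumption made for the example.

```python
from collections import Counter

# Hypothetical records, NOT data from the study: (patient_id, test, question).
records = [
    (1, "serum ACE", "diagnose sarcoidosis"),
    (1, "chest radiograph", "diagnose sarcoidosis"),
    (2, "serum ACE", "assess activity of known sarcoidosis"),
    (3, "serum ACE", "diagnose sarcoidosis"),
]


def count_combinations(records):
    """Count how often each (test, question) pair was ordered.

    The same test asked a different question is a separate combination.
    """
    return Counter((test, question) for _, test, question in records)


combos = count_combinations(records)
print("tests recorded:", len(records))
print("distinct test-question combinations:", len(combos))
```

In this toy example the same laboratory test appears under two different questions, so it contributes two combinations, each of which would need its own literature search.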

We divided tests into three groups: group A comprised tests aimed at making a diagnosis; group B comprised tests performed to assess a previously diagnosed condition; and group C was a trial of therapy, which we included as a special type of test, when a drug was prescribed for a limited period with the explicit intention of predicting future response in an individual. A comprehensive Medline search was performed (1966-2001) for each test-question combination by two researchers experienced in searching medical databases. We used a published strategy with a sensitivity of 92%9 followed by a freely improvised search for each test-question pair. The best evidence that we retrieved for each test-question pair was graded according to the scale devised by the Centre for Evidence Based Medicine, Oxford (www.cebm.net/levels_of_evidence.asp) (table 1). Some group A tests were regarded as absolutely specific and therefore graded as level 1c. In group C we searched for evidence that the result of a short term trial could predict the usefulness of a drug for an individual in the longer term.


Table 1 Levels of evidence according to criteria from Centre for Evidence Based Medicine, Oxford

Results

Referrals were received for 90 patients during the three month period. Patients were seen by a consultant (PJS) or specialist registrar (or equivalent) in the same team. Not all patients had three eligible tests. A total of 165 tests were recorded, 137 in group A, 15 in group B, and 13 in group C. The tests could be represented as 38 different test-question combinations: 26 in group A, 5 in group B, and 7 in group C. Table 2 shows the best evidence found for each test categorised and ranked according to the Centre for Evidence Based Medicine criteria. The finding of visible tumour on bronchoscopy with histological confirmation and the finding of Mycobacterium tuberculosis in bronchial washings when tuberculosis was the suspected diagnosis were regarded as absolutely specific and therefore level 1c. Both investigators agreed on the level of evidence assigned to each study. In group A there was level 1a-1c evidence for half of the test-question combinations and in group B a fifth. In group C we found no studies that examined the predictive role for five of the seven therapeutic trials. In the case of trials of oral or inhaled corticosteroids in chronic obstructive pulmonary disease we found literature that we thought did not show that these trials were predictive.


Table 2 Test-context combinations and best evidence found by literature review (or by applying rule that absolutely specific tests are level 1c)

Discussion

Few, if any, diagnostic tests give unambiguous results. To deal with this we are advised to combine clinical impressions of pretest probability with test results to derive a post-test probability of disease.10 This requires that the test be assigned a weighting, expressed formally as a likelihood ratio calculated from the results of scientific studies of the test's performance. Standards for research of diagnostic tests have been published,7 and when these standards are not met studies have been shown to overestimate the value of tests.11 Many of the trials of diagnostic tests that are available fall short of these standards.
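The combination of pretest probability and likelihood ratio can be sketched with the standard odds form of Bayes' theorem. The following is a minimal illustration, not a method taken from this paper; the function names and the example sensitivity, specificity, and pretest probability are assumptions chosen only to show the arithmetic.

```python
def likelihood_ratio_positive(sensitivity, specificity):
    """Likelihood ratio for a positive result: sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def post_test_probability(pretest_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Illustrative values only (assumed, not from any study cited here).
lr_positive = likelihood_ratio_positive(sensitivity=0.9, specificity=0.8)
probability = post_test_probability(pretest_probability=0.25,
                                    likelihood_ratio=lr_positive)
print(f"LR+ = {lr_positive:.2f}, post-test probability = {probability:.2f}")
```

If the sensitivity and specificity come from a biased study, the likelihood ratio is inflated and so is the post-test probability, which is why the quality of the underlying evidence matters.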


What is already known on this topic

Correct interpretation of test results requires information from scientific studies of test performance

If the studies do not meet quality standards the value of the test tends to be overestimated

What this study adds

Many diagnostic tests and tests used to monitor disease are not supported by high quality evidence


In 1996-7 only 30% of studies in one survey met at least six of eight standards11 and a similar survey in 1990-3 gave a figure of only 18%.12 Studies that evaluate diagnostic tests are also relatively rare. In a search of four prominent journals over a period of 16 years only 112 studies gave information on sensitivity, specificity, or likelihood ratios derived from more than 10 participants.13 It is therefore not surprising that a survey of 300 clinicians in a range of different specialties found that only 4% used formal methods to assess the accuracy of tests and 1% utilised likelihood ratios.14 Only half of the common tests we identified were supported by level 1a-1c evidence. We have also shown that there is little evidence to support tests that were used to assess previously diagnosed chronic diseases. The use of therapeutic trials to predict long term efficacy from short term response was similarly unsupported.

Our study reflects the practice in a single unit and the proportion of evidence based tests used elsewhere may be higher. Nevertheless, there is a clear need for further high quality research into medical tests, at least in the specialty that we have studied. There is also a need for an evidence base for the use of trials of therapy.


Contributors: PS had the original idea for the study. PJS and PS designed the study. PJS and ZB surveyed case notes, performed literature searches, and graded evidence. CH surveyed case notes. All authors commented on drafts. PJS is guarantor and can provide further details of the evidence found.

Funding: None.

Competing interests: None declared.

References

  1. McSherry R, Haddock J. Evidence-based health care: its place within clinical governance. Br J Nurs 1999;8: 113-7.
  2. Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-based medicine working group. JAMA 1993;270: 2598-601.
  3. Ellis J, Mulligan I, Rowe J, Sackett DL. Inpatient general medicine is evidence based. A-Team, Nuffield Department of Clinical Medicine. Lancet 1995;346: 407-10.
  4. Gill P, Dowell AC, Neal RD, Smith N, Heywood P, Wilson AE. Evidence based general practice: a retrospective study of interventions in one training practice. BMJ 1996;312: 819-21.
  5. Geddes JR, Game D, Jenkins NE, Peterson LA, Pottinger GR, Sackett DL. What proportion of primary psychiatric interventions are based on evidence from randomised controlled trials? Qual Health Care 1996;5: 215-7.
  6. Howes N, Chagla L, Thorpe M, McCulloch P. Surgical practice is evidence based. Br J Surg 1997;84: 1220-3.
  7. Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-based medicine working group. JAMA 1994;271: 389-91.
  8. Ransohoff D, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med 1978;299: 926-30.
  9. Haynes RB, Wilczynski NL, McKibbon MA, Walker CJ, Sinclair JC. Developing optimal search strategies for detecting clinically sound studies in Medline. J Am Med Inform Assoc 1994;1: 447-58.
  10. Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? Evidence-based medicine working group. JAMA 1994;271: 703-7.
  11. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JHP, et al. Empirical evidence of design related bias in studies of diagnostic tests. JAMA 1999;282: 1061-6.
  12. Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA 1995;274: 645-51.
  13. Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research. JAMA 1995;274: 645-51.
  14. Reid MC, Lane DA, Feinstein AR. Academic calculation versus clinical judgements: practicing physicians' use of quantitative measures of test accuracy. Am J Med 1998;104: 374-80.
(Accepted September 4, 2003)
