BMJ 2003;327:1136-1138 (15 November), doi:10.1136/bmj.327.7424.1136
Paper
Retrospective analysis of evidence base for tests used in diagnosis and monitoring of disease in respiratory medicine
Z Borrill, clinical fellow1,
C Houghton, clinical fellow1,
P J Sullivan, consultant1,
P Sestini, associate professor of respiratory diseases2
1 Department of Cardiorespiratory Medicine, Hope Hospital, Manchester M6 8HD,
2 Department of Clinical Medicine and Immunological Sciences, Division of Respiratory Diseases, University of Siena, Viale Bracci 3, 53100 Siena, Italy
Correspondence to: P J Sullivan Paul.sullivan{at}srht.nhs.uk
Abstract
Objectives To determine how many common clinical tests used
in a respiratory medicine outpatient clinic are based on high
quality evidence.
Design Retrospective review of case notes. Record of first three tests for each patient. Diagnostic tests, tests used to assess an existing condition, and explicit trials of therapy were included. Literature search for supporting evidence and grading of best evidence for each test.
Setting Inner city university teaching hospital in the United Kingdom.
Participants All new outpatients referred to a single respiratory medicine team over a period of three months.
Main outcome measures Proportion of tests supported by level 1a-1c evidence (scale developed by Centre for Evidence Based Medicine).
Results Only half the tests that were used to make or exclude a diagnosis and a fifth of the tests used to assess a known condition were supported by level 1a-1c evidence. There was no evidence to support trials of therapy.
Conclusions A large proportion of clinical tests in respiratory medicine are not supported by level 1a-1c evidence. None of the therapeutic trials that were used were supported by evidence.
Introduction
Clinical practice based on scientific evidence is a major goal
of the clinical governance process.1 The randomised controlled
trial is regarded as the standard for the assessment of therapeutic
interventions.2 Several studies have examined how many treatments
in everyday clinical practice are based on good evidence in
a range of specialties and in general practice.3-6 However,
good treatment relies on accurate diagnosis, and doubts have
been expressed regarding the quality and breadth of the current
evidence base for diagnostic tests. Criteria for appraisal of
papers that assess medical tests are available,7 just as they
are for studies that look at therapeutic interventions, and
in diagnostic testing poor study design has been shown to be
associated with significant outcome bias.8
We used established criteria to assess the quality of available evidence for tests used in routine outpatient clinical practice in one respiratory medicine clinic. Previous studies of the proportion of therapeutic interventions that are evidence based have used the patient as denominator, expressing findings as the proportion of patients who received at least one evidence based intervention. Tests behave differently in that the final diagnosis may be based on a combination of test results. If an individual patient undergoes a series of tests that include high quality evidence based tests as well as inaccurate or unassessed tests the final diagnosis may be incorrect. We therefore used tests as the denominator rather than patients.
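The difference between the two denominators can be illustrated with a minimal sketch. The patient records below are invented purely for illustration and are not data from this study, and the function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical records: one flag per test ordered for each patient,
# True if the test is supported by high quality evidence.
# These values are invented for illustration, not taken from the study.
patients: Dict[str, List[bool]] = {
    "patient_1": [True, False, False],
    "patient_2": [True],
    "patient_3": [False, False],
}


def proportion_by_patient(records: Dict[str, List[bool]]) -> float:
    """Share of patients who had at least one evidence based test."""
    with_any = sum(1 for tests in records.values() if any(tests))
    return with_any / len(records)


def proportion_by_test(records: Dict[str, List[bool]]) -> float:
    """Share of all tests ordered that were evidence based."""
    all_tests = [flag for tests in records.values() for flag in tests]
    return sum(all_tests) / len(all_tests)


print(f"patient as denominator: {proportion_by_patient(patients):.2f}")
print(f"test as denominator:    {proportion_by_test(patients):.2f}")
```

With these invented records the patient denominator gives two thirds and the test denominator gives one third, which shows how counting patients can mask the unsupported tests that also contribute to a diagnosis.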
Methods
The study took place in a UK inner city teaching hospital that
provides a referral service for primary care and other specialties.
We examined the notes of all consecutive patients referred to
the respiratory outpatient clinic in a three month period and
recorded the first three eligible tests ordered for each patient.
We included tests if they were performed to make a diagnosis
or to assess a prediagnosed condition. We excluded tests performed
as part of routine preclinical investigation and tests, such
as full blood count, if they seemed to have been performed without
any specific diagnosis in mind. Routine clinical examination
was not included. The tests used were recorded along with the
question that they were being used to answer. We used these
test-question combinations as the denominator for this study; for
example, "serum angiotensin converting enzyme concentration
to diagnose sarcoidosis" or "serum angiotensin converting enzyme
concentration to assess activity of known sarcoidosis" were
considered separately.
We divided tests into three groups: group A comprised tests aimed at making a diagnosis; group B comprised tests performed to assess a previously diagnosed condition; and group C was a trial of therapy, which we included as a special type of test, when a drug was prescribed for a limited period with the explicit intention of predicting future response in an individual. A comprehensive Medline search was performed (1966-2001) for each test-question combination by two researchers experienced in searching medical databases. We used a published strategy with a sensitivity of 92%9 followed by a freely improvised search for each test-question pair. The best evidence that we retrieved for each test-question was graded according to the scale devised by the Centre for Evidence Based Medicine, Oxford, (www.cebm.net/levels_of_evidence.asp) (table 1). Some group A tests were regarded as absolutely specific and therefore graded as level 1c. In group C we searched for evidence that the result of a short term trial could predict the usefulness of a drug for an individual in the longer term.
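The step of keeping only the best evidence for each test-question combination might be sketched as follows. This is an illustration rather than the procedure the authors ran: the ordering of levels is an assumption based on the Centre for Evidence Based Medicine scale (table 1 is not reproduced in this text), and the function names and example levels are hypothetical.

```python
from typing import Iterable, Optional

# Levels ordered from strongest to weakest evidence.
# Assumption: this ordering follows the Centre for Evidence Based
# Medicine scale cited in the text.
CEBM_LEVELS = ["1a", "1b", "1c", "2a", "2b", "2c", "3a", "3b", "4", "5"]
RANK = {level: position for position, level in enumerate(CEBM_LEVELS)}

# The study's main outcome counts levels 1a-1c as high quality evidence.
HIGH_QUALITY = {"1a", "1b", "1c"}


def best_level(levels: Iterable[str]) -> Optional[str]:
    """Return the strongest level among the retrieved studies, or None."""
    found = list(levels)
    if not found:
        return None  # no evidence retrieved for this test-question pair
    return min(found, key=RANK.__getitem__)


def is_high_quality(level: Optional[str]) -> bool:
    """True when the best evidence is graded 1a, 1b, or 1c."""
    return level in HIGH_QUALITY


# Hypothetical example: three studies retrieved for one test-question pair.
retrieved = ["3b", "1b", "2a"]
best = best_level(retrieved)
print(best, is_high_quality(best))
```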
Results
Referrals were received for 90 patients during the three month
period. Patients were seen by a consultant (PJS) or specialist
registrar (or equivalent) in the same team. Not all patients
had three eligible tests. A total of 165 tests were recorded,
137 in group A, 15 in group B, and 13 in group C. The tests
could be represented as 38 different test-question combinations;
26 in group A, 5 in group B, and 7 in group C.
Table 2 shows
the best evidence found for each test categorised and ranked
according to the Centre for Evidence Based Medicine criteria.
The finding of visible tumour on bronchoscopy with histological
confirmation and the finding of Mycobacterium tuberculosis in
bronchial washings when tuberculosis was the suspected diagnosis
were regarded as absolutely specific and therefore level 1c.
Both investigators agreed on the level of evidence assigned
to each study. In group A there was level 1a-1c evidence for
half of the test-question combinations and in group B a fifth.
In group C we found no studies that examined the predictive
role for five of the seven therapeutic trials. In the case of
trials of oral or inhaled corticosteroids in chronic obstructive
pulmonary disease we found literature that we thought did not
show that these trials were predictive.
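The counts reported above can be checked for internal consistency with a short sketch. The figures are those given in the text; the variable and function names are assumptions introduced only for this illustration.

```python
# Counts reported in the Results section.
patients_referred = 90
tests_by_group = {"A": 137, "B": 15, "C": 13}
combinations_by_group = {"A": 26, "B": 5, "C": 7}

total_tests = sum(tests_by_group.values())
total_combinations = sum(combinations_by_group.values())


def percentage_shares(counts: dict) -> dict:
    """Percentage of the total contributed by each group."""
    total = sum(counts.values())
    return {group: 100 * n / total for group, n in counts.items()}


print(f"total tests: {total_tests}")
print(f"total test-question combinations: {total_combinations}")
print(f"mean eligible tests per patient: {total_tests / patients_referred:.2f}")
for group, share in percentage_shares(tests_by_group).items():
    print(f"group {group}: {share:.0f}% of tests")
```

The group totals sum to the 165 tests and 38 test-question combinations stated in the text, and most of the recorded tests fall in group A.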
Table 2 Test-context combinations and best evidence found by literature review (or by applying rule that absolutely specific tests are level 1c)
Discussion
Few, if any, diagnostic tests give unambiguous results. To deal
with this we are advised to combine clinical impressions of
pretest probability with test results to derive a post-test
probability of disease.10 This requires that the test be assigned
a weighting, expressed formally as a likelihood ratio, that
is calculated from the results of scientific studies of the
test's performance. Standards for research of diagnostic tests
have been published,7 and when these standards are not met studies
have been shown to overestimate the value of tests.11 Many of
the trials of diagnostic tests that are available fall short
of these standards.
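The calculation referred to here, combining a pretest probability with a likelihood ratio to obtain a post-test probability, can be sketched using the standard odds form of Bayes' theorem. The function names and the sensitivity, specificity, and pretest values below are hypothetical and chosen only for illustration; they do not describe any test assessed in this study.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR for a positive result, LR for a negative result)."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood ratio cannot be negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical test: sensitivity 0.90, specificity 0.80, pretest probability 0.20.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.80)
print(f"positive result: {post_test_probability(0.20, lr_pos):.2f}")
print(f"negative result: {post_test_probability(0.20, lr_neg):.2f}")
```

The sketch makes the dependence explicit: if the sensitivity and specificity come from poorly designed studies, the likelihood ratio is inflated and so is the post-test probability derived from it.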
What is already known on this topic
Correct interpretation of test results requires information from scientific studies of test performance
If the studies do not meet quality standards the value of the test tends to be overestimated
What this study adds
Many diagnostic tests and tests used to monitor disease are not supported by high quality evidence
In 1996-7 only 30% of studies in one survey met at least six of eight standards11 and a similar survey in 1990-3 gave a figure of only 18%.12 Studies that evaluate diagnostic tests are also relatively rare. In a search of four prominent journals over a period of 16 years only 112 studies gave information on sensitivity, specificity, or likelihood ratios derived from more than 10 participants.13 It is therefore not surprising that a survey of 300 clinicians in a range of different specialties found that only 4% used formal methods to assess the accuracy of tests and 1% utilised likelihood ratios.14 Only half of the common tests we identified were supported by level 1a-1c evidence. We have also shown that there is little evidence to support tests that were used to assess previously diagnosed chronic diseases. The use of therapeutic trials to predict long term efficacy from short term response was similarly unsupported.
Our study reflects the practice in a single unit and the proportion of evidence based tests used elsewhere may be higher. Nevertheless, there is a clear need for further high quality research into medical tests, at least in the specialty that we have studied. There is also a need for an evidence base for the use of trials of therapy.
Contributors: PS had the original idea for the study. PJS and
PS designed the study. PJS and ZB surveyed case notes, performed
literature searches, and graded evidence. CH surveyed case notes.
All authors commented on drafts. PJS is guarantor and can provide
further details of the evidence found.
Funding: None.
Competing interests: None declared.
References
- McSherry R, Haddock J. Evidence-based health care: its place within clinical governance. Br J Nurs 1999;8:113-7.
- Guyatt GH, Sackett DL, Cook DJ. Users' guides to the medical literature. II. How to use an article about therapy or prevention. A. Are the results of the study valid? Evidence-based medicine working group. JAMA 1993;270:2598-601.
- Ellis J, Mulligan I, Rowe J, Sackett DL. Inpatient general medicine is evidence based. A-Team, Nuffield Department of Clinical Medicine. Lancet 1995;346:407-10.
- Gill P, Dowell AC, Neal RD, Smith N, Heywood P, Wilson AE. Evidence based general practice: a retrospective study of interventions in one training practice. BMJ 1996;312:819-21.
- Geddes JR, Game D, Jenkins NE, Peterson LA, Pottinger GR, Sackett DL. What proportion of primary psychiatric interventions are based on evidence from randomised controlled trials? Qual Health Care 1996;5:215-7.
- Howes N, Chagla L, Thorpe M, McCulloch P. Surgical practice is evidence based. Br J Surg 1997;84:1220-3.
- Jaeschke R, Guyatt G, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. A. Are the results of the study valid? Evidence-based medicine working group. JAMA 1994;271:389-91.
- Ransohoff D, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med 1978;299:926-30.
- Haynes RB, Wilczynski NL, McKibbon MA, Walker CJ, Sinclair JC. Developing optimal search strategies for detecting clinically sound studies in Medline. J Am Med Inform Assoc 1994;1:447-58.
- Jaeschke R, Guyatt GH, Sackett DL. Users' guides to the medical literature. III. How to use an article about a diagnostic test. B. What are the results and will they help me in caring for my patients? Evidence-based medicine working group. JAMA 1994;271:703-7.
- Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JHP, et al. Empirical evidence of design related bias in studies of diagnostic tests. JAMA 1999;282:1061-6.
- Reid MC, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA 1995;274:645-51.
- Reid C, Lachs MS, Feinstein AR. Use of methodological standards in diagnostic test research. JAMA 1995;274:645-51.
- Reid C, Lane DA, Feinstein AR. Academic calculation versus clinical judgements: practicing physicians' use of quantitative measures of test accuracy. Am J Med 1998;104:374-80.
(Accepted September 4, 2003)