Rate of over-diagnosis of breast cancer 15 years after end of Malmö mammographic screening trial: follow-up study
BMJ 2006; 332 doi: https://doi.org/10.1136/bmj.38764.572569.7C (Published 23 March 2006) Cite this as: BMJ 2006;332:689
Rapid responses
I have one more thought on the matter and submitted the following for the print issue with my colleagues Lisa M Schwartz and Steven Woloshin.
CONTEXT: In this issue of BMJ, Zackrisson et al. report on follow-up data from the Malmo mammographic screening trial and conclude that the rate of overdiagnosis of breast cancer was 10%. They do not, however, calculate the risk we believe is most relevant to women considering mammography: what is the chance that a screen-detected cancer represents overdiagnosis?
WHAT WAS REPORTED: After 15 years of follow-up, there were 1320 cancers diagnosed in the screened group and 1205 in the control group (Zackrisson Table 1). The excess detection of 115 cancers associated with screening led to their conclusion of an overdiagnosis rate of 10% (=115/1205).
THE PROBLEM: Because the intervention had stopped 15 years earlier and yet breast cancer cases continue to accumulate in both groups, their approach understates the risk of overdiagnosis.
THE SOLUTION: A more relevant denominator is the number of cancers found in the screened group at the end of the trial – 741 (Zackrisson Table 2). This addresses the question: Were I found to have cancer after being randomized to screening, how likely is it to represent overdiagnosis? As shown in the Figure below, using this denominator the risk of overdiagnosis is 15% (=115/741).
However, many of the cancers detected in the screened group are not detected by screening. They are instead clinically detected (either during the interval between screening exams or among non-attenders). The most relevant denominator is the number of screen-detected cancers found at the end of the trial. This addresses the question: Were I found to have cancer by a mammogram, how likely is it to represent overdiagnosis?
Although this denominator is not reported by Zackrisson et al., the original BMJ article describing Malmo reported that 64% of the cancers detected in the screened group were detected by screening mammography (BMJ 1988;297:943-8). Thus one can deduce that the number of screen-detected cancers at the end of the trial was about 475. As shown in the Figure below, using this denominator the risk of overdiagnosis is 24% (=115/475).
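The arithmetic behind the three competing denominators can be checked with a short sketch; the screen-detected count of 475 is the letter's own deduction from the 64% figure, not a number reported by Zackrisson et al.

```python
# Overdiagnosis rates under the three denominators discussed above.
# Counts are from Zackrisson et al. (Tables 1 and 2) and the 1988 Malmo report.
excess = 1320 - 1205            # excess cancers in the screened group: 115
denominators = {
    "all control-group cancers (Zackrisson)": 1205,
    "screened-group cancers at trial end": 741,
    "screen-detected cancers (~64% of 741)": 475,
}
for label, denom in denominators.items():
    # prints roughly the 10%, 15% and 24% quoted in the letter
    print(f"{label}: {excess / denom:.1%}")
```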
Competing interests: None declared
Updated results from the Malmö mammography screening trial have
suggested that screening caused an overdiagnosis of breast cancer of 10%
in women aged 55-69 years at randomisation (1). The authors noted that
evidence from randomised trials on the level of overdiagnosis was lacking.
This is not correct. Based on data from the Malmö trial (2), and the two
trials from Canada (3), we have previously estimated a level of
overdiagnosis of 30% (mean follow-up 8.8 and 7 years, respectively) (4)
and have also suggested an overdiagnosis of 33% in the other Swedish
trials, based on number of cancers identified before the control group was
screened (5).
In their paper (1), the authors followed the women for an additional
15 years after the trial ended and noted that they could have
underestimated the level of overdiagnosis as some asymptomatic women in
the control group received mammograms. They did not quantify this, but
in their original trial report (2) they noted that 24% of a random sample
of 500 women in the control group had undergone mammography during the
trial period at least once. The authors now report (1) that women aged 55-
69 years were never invited to screening after the trial ended, but it
might be expected that many of them - after having belonged to the control
group in a trial for so long - would have undergone mammography at least
once subsequently. If we assume (rather conservatively, compared to the
24% during the trial), that one quarter of the women had undergone
mammography for the first time in their lives during these additional 15
years of follow-up, it means that about half of the women in the control
group received mammograms. This would change the estimated level of
overdiagnosis from 10% to about 20%. If we assume that half of these women
received mammograms after the trial, the estimate becomes 40%. It is
therefore essential that the authors provide data on use of mammography
after the trial ended.
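One way to read this arithmetic (our reconstruction, not a model stated in the letter) is that screening of a fraction p of the control group erodes the observed excess proportionally, so the true rate is roughly the observed rate divided by (1 - p):

```python
# Hedged sketch: contamination of the control group by a fraction p of
# screened women dilutes the observed overdiagnosis estimate; a rough
# correction is observed / (1 - p).  The dilution model is our assumption.
observed = 0.10                  # Zackrisson et al.'s reported rate
for p in (0.5, 0.75):            # ~half and ~three quarters of controls screened
    corrected = observed / (1 - p)
    print(f"control-group screening {p:.0%}: corrected estimate ~{corrected:.0%}")
```

With p = 0.5 this reproduces the letter's "about 20%", and with p = 0.75 (24% during the trial plus half afterwards) its 40%.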
Because of the unavoidable screening in the control groups of the
trials, and the small sample size in the Malmö trial and therefore a wide
confidence interval for the overdiagnosis estimate, it is necessary to
look also at large and long-term observational studies of the increase in
the incidence of breast cancer after screening was introduced. Such data
exist from the USA (5), the UK (6), Australia (7), and Sweden (8,9), and they
suggest an overdiagnosis of about 40-60%. These estimates could be
somewhat inflated because of a possible concomitant increase in the use of
hormone replacement therapy which causes breast cancer, but this would
only explain a minor part of the increases in the incidence of breast
cancer. We therefore believe that our original estimate of 30%
overdiagnosis with screening (4) is still a very reasonable one.
1. Zackrisson S, Andersson I, Janzon L, Manjer J, Garne JP. Rate of
over-diagnosis of breast cancer 15 years after end of Malmö mammographic
screening trial: follow-up study. BMJ, doi:10.1136/bmj.38764.572569.7C
(published 3 March 2006).
2. Andersson I, Aspegren K, Janzon L et al. Mammographic screening
and mortality from breast cancer: the Malmo mammographic screening trial.
BMJ 1988;297:943–48.
3. Miller AB. The costs and benefits of breast cancer screening. Am J
Prev Med 1993;9:175–80.
4. Olsen O, Gøtzsche PC. Cochrane review on screening for breast
cancer with mammography. Lancet 2001;358:1340–42.
5. Gøtzsche PC. On the benefits and harms of screening for breast
cancer. Int J Epidemiol 2004;33:56-64.
6. Douek M, Baum M. Mass breast screening: Is there a hidden cost? Br
J Surg 2003;90:44-5.
7. Barratt A, Howard K, Irwig L, Salkeld G, Houssami N. Model of
outcomes of screening mammography: information to support informed
choices. BMJ 2005;330:936-8.
8. Zahl PH, Strand BH, Mæhlen J. Incidence of breast cancer in Norway
and Sweden during introduction of nationwide screening: prospective cohort
study. BMJ 2004;328:921-4.
9. Jonsson H, Johansson R, Lenner P. Increased incidence of invasive
breast cancer after the introduction of service screening with mammography
in Sweden. Int J Cancer 2005;117:842-7.
Competing interests: None declared
To the editor:
While I compliment Zackrisson et al on their investigation of overdiagnosis in the Malmo trial, I am concerned that they have inadvertently understated the magnitude of the problem.
Their “bottom line” figure of a 10% rate of overdiagnosis is based on a follow-up period that includes 15 years after the trial ended. Longer follow-up necessarily dilutes the relative incidence rate between the two groups – in fact, it approaches unity as both groups accumulate more cases of cancer. This is best understood with a simple example.
Imagine a randomized trial of screening mammography in which all excess cases represent overdiagnosis. If the cumulative number of breast cancers at the end of the trial was 300 in the screened group and 200 in the control group, the RR (screened to control) would be 1.5 – implying an overdiagnosis rate of 50%. Now imagine that in the ensuing 15 years (when both groups are cared for similarly) an additional, say, 800 cases are diagnosed in each group. Even though there are no "catch up" cases diagnosed in the control group, the overall RR falls to 1.1 (1100 vs. 1000) – implying an overdiagnosis rate of 10%.
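The dilution in this example can be verified directly:

```python
# Dilution of the relative rate when equal numbers of post-trial cases
# accrue in both arms, with no catch-up in the control group.
screened, control = 300, 200
print(screened / control)                      # 1.5 at trial end -> 50% excess
extra = 800                                    # cases added to each arm over 15 years
print((screened + extra) / (control + extra))  # 1100/1000 = 1.1 -> 10% excess
```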
To correct for this dilutional effect, the appropriate denominator for overdiagnosis is the estimated number of cancers in the control group at the end of the trial. This number includes cancers that were detected during the trial plus those cancers that can be inferred to exist at the end of the trial because they are subsequently diagnosed as a “catch-up” cancer.
At the end of the Malmo trial, 591 cancers were diagnosed in the control group while 741 were diagnosed in the screened group. One can infer that there are an additional 35 “catch up” cancers that ultimately appear in the control group (614 cancers that appear in follow-up among controls minus 579 cancers that appear in follow-up among those screened). Thus the estimated number of cancers in the control group at the end of the trial is 626 (591+35) and the final totals at the end of screening are 741 vs 626, producing an RR of 1.18 – implying that mammography in Malmo was associated with an overdiagnosis rate of 18%.
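The catch-up correction works out as follows, using the counts quoted in the letter:

```python
# Catch-up correction for the dilutional effect, from the letter's counts.
screened_end, control_end = 741, 591              # diagnoses at end of trial
followup_controls, followup_screened = 614, 579   # diagnoses during 15-year follow-up
catch_up = followup_controls - followup_screened  # 35 latent cancers inferred
control_corrected = control_end + catch_up        # 626
print(round(screened_end / control_corrected, 2)) # 1.18 -> ~18% overdiagnosis
```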
Competing interests: None declared
In their follow-up of the Malmö mammography trial Zackrisson et al.
[1] state that the reported levels of overdiagnosis vary from 5% to 50%.
However, to use cumulative incidence rates at the end of follow-up to
quantify the level of overdiagnosis is confusing because the resulting
estimates are highly sensitive to both the length of the follow-up periods
and the length of the screening periods. Suppose for example that during
screening from age 40 to 49 the incidence is increased by 50% and that
none of these extra cancers would have been detected in the patient’s
lifetime in the absence of screening. In this example the level of
overdiagnosis as defined by Etzioni et al [2] would be 50% irrespective
of when a follow-up is performed. In contrast, the level of overdiagnosis
as defined by Zackrisson et al would be 20% at a follow-up at age 60 but
only 7% at a follow-up at age 80.
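This sensitivity to follow-up length can be illustrated with a toy incidence schedule. The age-specific rates below are our own invented assumptions, chosen only to show the direction of the effect; they do not reproduce the letter's exact 20% and 7%.

```python
# Toy illustration: a fixed excess from screening at ages 40-49 is diluted
# in a cumulative-incidence estimate as follow-up lengthens, while an
# Etzioni-style estimate (excess relative to cases expected during the
# screening window) stays constant.  Incidence rates here are invented.
rates = {(40, 50): 1.0, (50, 65): 2.0, (65, 81): 3.0}  # cases per 1000 women per year

def cumulative(up_to_age):
    # cumulative expected cases from age 40 up to the given age
    return sum((min(hi, up_to_age) - lo) * r
               for (lo, hi), r in rates.items() if lo < up_to_age)

excess = 0.5 * cumulative(50)   # 50% extra cases during screening, never compensated
for age in (60, 80):
    print(f"cumulative-incidence estimate at age {age}: {excess / cumulative(age):.0%}")
print(f"Etzioni-style estimate at any age: {excess / cumulative(50):.0%}")
```

The first two estimates shrink as follow-up lengthens, while the Etzioni-style estimate remains 50%, which is the letter's point.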
Zackrisson et al. reported that the relative incidence rate for
women aged 45-69 years at randomization was 1.24 (1.12 to 1.39) during 10
years of screening. During the 15 year post screening period a slight
reduction in the relative rate (0.95 (0.85 to 1.06)) compensated for only
a fraction of the excess cases diagnosed during screening. We reported [3]
that during screening the relative rate was 1.45 (1.41 to 1.49) for
Swedish women in the last part of the 1990’s and that only a small
reduction in the relative rates occurred later in life. By analysing data
from the screening program in eleven Swedish counties Jonsson et al. [4]
reached a similar conclusion. We believe that the rising trend in
screening-related overdiagnosis since the time of the Malmö trial reflects
the development of more sensitive screening methods.
References
1. Zackrisson S, Andersson I, Janzon L, Manjer J, Garne JP. Rate of
overdiagnosis of breast cancer 15 years after end of Malmö mammographic
screening trial: follow-up study. BMJ 2006; 332:doi
10.1136/bmj.38764.572569.7C
2. Etzioni R, Urban N, Ramsey S, McIntosh M, Schwartz S, Reid B, et al.
The case for early detection. Nat Rev Cancer 2003; 3: 243-52.
3. Zahl PH, Strand BH, Mæhlen J. Incidence of breast cancer in Norway and Sweden during introduction of nationwide screening: prospective cohort study. BMJ 2004; 328: 921-4.
4. Jonsson H, Johansson R, Lenner P. Increased incidence of invasive
breast cancer after the introduction of service screening with mammography
in Sweden. Int J Cancer 2005; 117: 842-7.
Competing interests: None declared
Quantification of overdiagnosis: simple solutions are not necessarily true
The paper by Zackrisson et al.(1), reporting the follow-up of the
Malmo trial, provides important evidence. This is one of the few studies
with a long follow-up and, as one would expect from scientists, the
authors suggest an interpretation of their data, without assuming this to
be necessarily the truth. Zackrisson et al’s study should be considered in
the context of other evidence about over-diagnosis. Results, as usual,
need careful interpretation and different aspects of the study have to be
considered, including its design, its duration, and its statistical power.
There have been several breast cancer screening studies in which over-
diagnosis has been estimated, and estimates have varied considerably. The
authors were well aware of all these controversial aspects, and they made
a further step to solve a difficult problem.
It looks like the BMJ (and other journals) are not accustomed to discussing screening issues in this unavoidably complex way, and they seem to prefer sensational attention-grabbing headlines. This attitude is exemplified by the letter by Welch et al. (2), published as a rapid response on the BMJ website and then in the journal issue on breast cancer screening. Drawing instinctual conclusions about issues that are, unfortunately, difficult and sometimes abstruse is not the right approach for serious understanding and debate.
The conclusion of the editorial by Fiona Godlee
about over-diagnosis in breast cancer screening and the Norfolk trial is
an example of how scientific data may be discussed with strong
preconceptions. Comparing the difficult issue of harm and benefit balance
within a medical intervention with the Norfolk nightmare is, to say the
least, provocative.
Zackrisson et al. have attributed an excess of incidence at the end
of the study to over-diagnosis. This is a plausible interpretation, and
the estimate of a 10% excess in their data is possibly correct.
Welch et al. (2) suggest a short cut, attributing the end-of-study excess incidence first to incident cases by end of trial, and then to trial screen-detected cases. Their simplistic mathematical exercise seems to ignore that excess end-of-study incidence might also be due to cancers detected in the 15 years after the trial ended, as women did not stop having mammography and continued to be exposed to overdiagnosis.
Gøtzsche (3) also concentrates his attention on the fact that women in the control arm had access to mammography (and thus to overdiagnosis) during and after the trial, and surprisingly does not take into account that the same probably occurred in women in the screening arm.
If we assume women’s tendency to perpetuate their previous compliance with screening (after invitation, for the screening arm, or spontaneous, among controls), opportunistic exposure to mammography after the end of the trial might have been higher for women from the screening arm than for women from the control arm. Were this true, including in the analysis cancers occurring after end of trial would overestimate, rather than underestimate, overdiagnosis attributable to the trial itself. Analysis by intention to treat is correct, although it is open to interpretation of possible biases, but excluding incident cancers after end of trial, as Welch did, or ignoring cancers overdiagnosed after end of trial in women from the screening arm, as Gøtzsche did, opens their analysis to biases from other sources. As for mortality reduction estimates, the intention-to-treat analysis decreases the magnitude of the effect, but reduces biases.
A second important issue is about statistical uncertainty. Scientific data
are not deterministic. Welch et al. presented data giving the impression
that their point estimate is the truth. However, extrapolating a
scientific result with no consideration for statistical power is a
distortion of evidence. Such an exercise can be done only if the results
are strong enough (for example, the results of meta-analyses or very large
studies) and based on clear evidence. As acknowledged in Gøtzsche’s
letter (3), the Malmo trial had limited power.
Assessment of over-diagnosis in cancer screening is difficult, and should
be done with methodological attention. Gøtzsche’s estimate of a 30% excess
incidence was based only on some of the existing randomised trials.
It is surprising that the editor of the BMJ has not referred to other sources of overdiagnosis estimates except for Gøtzsche's. In the IARC handbook on breast cancer screening (4) the overdiagnosis issue was reviewed, concluding that there was no evidence of it at incident screening, and uncertainty about its magnitude at prevalent screening. Sue Moss, in a review of randomised trials (5), estimated excess incidence to be 11% in trials without screening of the control group (the Malmo trial was not yet updated in that review).
Over-diagnosis is an old concept in screening (a major issue, for example,
in prostate cancer screening), and there is no need to be complacent.
Unfortunately, the lack of good empirical data makes quantification of
over-diagnosis in breast cancer screening difficult and controversial.
This is possibly the reason for holding back information about over-
diagnosis from screening leaflets: it is difficult to communicate when you
do not know enough.
Informed decision making in organised screening programmes has improved in
recent years and continuous improvement is needed in the future, including
correct statements about the over-diagnosis issue. However, these
statements should inform women of the conclusions of sound evaluation of
research in this field, and not simply of what Gøtzsche or Welch believe
is the truth.
Our present knowledge, based on several studies, suggests that the problem of over-diagnosis in breast cancer service screening is much less significant than the recent BMJ issue has claimed. Quoting Hippocrates was absolutely correct. The “first do no harm” rule might also apply to simplistic and superficial conclusions discrediting a current health policy which has been demonstrated to have a major impact on breast cancer mortality.
References
1. Zackrisson S, Andersson I, Janzon L, Manjer J, Garne JP. Rate of
over-diagnosis of breast cancer 15 years after end of Malmo mammographic
screening trial: follow-up study. BMJ 2006; 332(7543): 689-92.
2. Welch HG, Schwartz LM, Woloshin S. Ramifications of screening for
breast cancer: 1 in 4 cancers detected by mammography are pseudocancers.
BMJ. 2006;332(7543):727.
3. Gotzsche PC. Ramifications of screening for breast cancer:
overdiagnosis in the Malmo trial was considerably underestimated. BMJ
2006;332(7543):727.
4. IARC Handbooks of Cancer Prevention. Vol 7: Breast Cancer Screening. Lyon, France: IARC; 2002: 248.
5. Moss S. Overdiagnosis and overtreatment of breast cancer:
overdiagnosis in randomised controlled trials of breast cancer screening.
Breast Cancer Res. 2005;7:230-4.
Competing interests: None declared