Ann Oakley
Social Science Research Unit, University of London Institute of Education, London WC1H 0NS
a.oakley{at}ioe.ac.uk
The research design of the randomised controlled trial is
primarily associated today with medicine. It tends either to be ignored
or regarded with suspicion by many in such disciplines as health
promotion, public policy, social welfare, criminal justice, and
education. However, all professional interventions in people's lives
are subject to essentially the same questions about acceptability and
effectiveness. As the social reformers Sidney and Beatrice Webb pointed
out in 1932, there is far more experimentation going on in "the world
sociological laboratory in which we all live" than in any other kind
of laboratory, but most of this social experimentation is "wrapped in
secrecy" and thus yields "nothing to
science."1
The Webbs argued for a more "scientific" social policy, with
social scientists being trained in experimental methods and evaluations of social interventions being carried out by independent investigators. They were apparently unaware that a strong tradition in experimental sociology had already been established, mainly in the United States. This was a precursor to a period between the early 1960s and the late
1980s when randomised controlled trials became the ideal for American
evaluators assessing a wide range of public policy interventions. This
history is conveniently overlooked by those who contend that randomised
controlled trials have no place in evaluating social interventions. It
shows clearly that prospective experimental studies with random
allocation to generate one or more control groups are perfectly
possible in social settings. Notably, too, the history of
experimentation in social science predates that in medicine in certain
key respects.
The original meaning of "control" is "check." In 1901 the American educationalists Thorndike and Woodworth identified
the need for a control group in their experiments on the use of
training to improve mental function.4 A series of
experiments with schoolchildren that addressed questions about the
transferability of memory skills from one subject to another, reported
by Winch in 1908,5 were among the first to use the design
of pretest, intervention, post-test in the experimental group and
pretest, nothing, post-test in the control group. These educational and
psychology researchers invented randomised assignment to experimental
treatments and Latin square designs independently of, and considerably
earlier than, R A Fisher's work at the Rothamsted Agricultural
Research Station.6 The psychologist C S Peirce introduced
both the idea of randomisation and that of "blindness" into
psychology experiments in the 1880s.7
Selection of experimental and control subjects by means of the
principle of chance is described in McCall's How to Experiment in Education, published in 1923: "Representativeness [of
research subjects] can be secured by making a chance selection from
the total group, or a chance selection from a chance portion of the total group .... Just as representativeness
can be secured by the method of chance, so equivalence may be secured
by chance .... One method of equating by
chance is to mix the names of the subjects to be used. Half may be
drawn at random. This half will constitute one group while the other
half will constitute the other group."8 McCall's book
also describes the Latin square design under the name of the
"rotation experiment"; this had been used in educational
experiments as early as 1916.9
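McCall's procedure for "equating by chance" can be sketched in a few lines of modern code. This is only an illustration of the quoted description (mix the names, draw half at random), not a reconstruction of anything McCall himself wrote; the function name `split_by_chance`, the `seed` parameter, and the pupil names are assumptions introduced here.

```python
import random


def split_by_chance(names, seed=None):
    """Mix the names and draw half at random, as in McCall's description.

    Returns two lists: the half drawn, and the remaining half.
    `seed` is a hypothetical convenience so that a draw can be repeated.
    """
    rng = random.Random(seed)
    mixed = list(names)          # copy, so the caller's list is untouched
    rng.shuffle(mixed)           # "mix the names of the subjects"
    half = len(mixed) // 2       # with an odd total, the second group gets one more
    return mixed[:half], mixed[half:]


# Hypothetical class list, for illustration only.
pupils = ["Ada", "Ben", "Cora", "Dan", "Eve", "Finn"]
group_one, group_two = split_by_chance(pupils, seed=1923)
```

Which half then receives the intervention is a separate decision; the sketch shows only that chance, rather than the investigator, determines who ends up in each group.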
The major impetus driving these new approaches to assessing
effectiveness was not the desire to imitate natural science, but, rather, to respond to an uneasiness within the research community of
educational psychology about the inability of existing evaluation methods to rule out plausible rival hypotheses. Similar methodological developments were occurring in other spheres. For example, in 1924-5 an
experiment using a mail campaign to increase electoral turnout was
carried out in Chicago, in which housing precincts were assigned either
to receive individual mail appeals or not.10 This
experiment followed earlier research which had suggested that the
strength of local party organisation was the main factor distinguishing
voters from non-voters, but the research design used in the first study
had made it impossible to have confidence in this finding. Thus, in the
social field as well as later in medicine, prospective experimental
studies with randomly chosen controls were seen to offer an important
solution to the problem of linking intervention with outcomes.
Two other American social scientists, Ernest Greenwood at Columbia
University and F Stuart Chapin at the University of Minnesota, pioneered the application of experimental methods to the study of
social problems in the early decades of the 20th century. Chapin first
wrote on this theme in 1917; his Experimental Designs in Sociological Research, published in 1947, details nine
experimental studies carried out by his research team and a number
undertaken elsewhere covering such topics as rural health education,
the social effects of public housing, recreation programmes for
"delinquent" boys, and the effects of student participation in
extracurricular activities.11 Chapin was particularly
interested in reviewing the use of experimental research designs in
"the normal community situation" because of the objection, voiced
at the time, that experimental studies could only be done in
"laboratory" settings.
Ernest Greenwood's Experimental Sociology, published in
1945, outlined the theoretical rationale for applying experimental methods to social issues.12 He defined an experiment as
"the proof of a causal hypothesis through the study of two controlled contrasting situations," recommended the use of case studies as a
prelude to experimental research, and supported Fisher's strategy of
randomisation as the best way of securing equivalent study groups.
Chapin's and Greenwood's interest in experimental research designs
was stimulated by the social reform concerns of the Depression, and
informed by a desire to establish the most effective methods of
improving people's lives. Their work was part of a general move in the
United States to make social science more experimental; by 1931 at
least 26 universities there were offering courses in experimental
sociology.13
Donald Campbell and Julian Stanley's Experimental and
Quasi-experimental Designs for Research, published in
1966,14 is to social research what Fisher's Design
of Experiments (1935) is to medical research. Campbell's paper
"Reforms as experiments" established an explicit link between
social reform and the use of rigorous experimental
design.15 His complaint that the randomised control group
design had not often been used in the social arena prompted another
American experimentalist, Robert Boruch, to publish a bibliography of
such studies in 1974.16 This listed 83 "randomised field
experiments" in such areas as criminal justice, legal policy, social
welfare, education, mass communications, and mental health. A revised
version of the bibliography produced four years later updated the total
in these areas to 245.17
This period in the United States has been nicknamed the "golden age
of evaluation."18 It was one in which there was an
enormous burst of activity in applying the randomised controlled trial design to the evaluation of public policy. The table shows nine of the
major evaluations of broadly based social programmes initiated between
the 1960s and early 1980s. Four of the studies were of income
maintenance experiments,19-23 one focused on an
experimental housing allowance scheme,24 25 two examined
programmes for supporting disadvantaged workers,19 26 and
two examined interventions for former prison inmates.27
All the studies included one or more prospectively generated control
groups, either by some method of random allocation or by matching.
Supporting all this effort was a government mandate specifying that 1%
of budgets for social programmes had to be spent on evaluation. There
was widespread recognition that social services were in a mess while
expenditure on them was rising exponentially; and, for a time at least,
there was a consensus in policy circles that randomised controlled
experiments provided the best way of assessing
effectiveness.
Summary points
Many social scientists argue that randomised controlled trials are inappropriate for evaluating social interventions, but they ignore a considerable history, mainly in the United States, of the use of randomised controlled trials to assess different approaches to public policy and health promotion
A tradition of experimental sociology was well established by the 1930s, built on the early use of controlled experiments in psychology and education
From the early 1960s to early 1980s randomised experiments were considered the optimal design for evaluating public policy interventions in the United States, and major evaluations using this design were carried out
This approach became less popular as policy makers reacted negatively to evidence of "near zero" effects
Lessons to be learnt about implementing randomised controlled trials in real life settings include the difficulty of assessing complex multi-level interventions and the challenge of integrating qualitative data
A short history of control groups
The word "control"
comes from "counter-roll," a duplicate register or account made to
verify an official account.2 The term "control"
entered scientific language in the 1870s in the sense of a standard of
comparison used to check inferences deduced from an experiment. The
main use of the term was in experimental psychology.3
Experimental sociology
A golden age of evaluation
Other evaluations (not shown in the table) carried out during this period included the Manhattan bail bond experiment with pre-trial release for prisoners,28 the Rand Corporation's well known study of health insurance (several components of which used a randomised controlled trial design),29 and studies of educational performance contracting.30
The reasons why the use of randomised controlled trials in evaluating
policy interventions has declined in attractiveness in the United
States over the past 20 years are as interesting as those explaining
its acceptance in the first place. A primary one was disenchantment
with the apparent ineffectiveness (sometimes seemingly damaging
effects) of the interventions in some of the evaluations. Secondly,
policy makers were often impatient with the length of time it took for
evaluations of their favoured approaches to provide answers: this was
particularly marked in the case of the income experiments. As Senator
Moynihan appositely said, "The bringing of systematic inquiry to bear
on social issues is not an easy thing. There is no guarantee of
pleasant and simple answers, but if you make a commitment to an
experimental mode it seems to me ... something larger
is at stake when you begin to have to deal with the
results."31
Conclusions
All claims to successful expertise need to tackle the issue of
causal inference: how do people know that what they do works, and how
can they reasonably demonstrate this to others? As Stanley noted in
1957, "Expert opinions, pooled judgements, brilliant intuitions, and
shrewd hunches are frequently misleading."32 Among the
reasons why randomised controlled trials gained legitimacy in medicine
was the realisation that the decisions of the medical profession need
to be regulated.33 The history of social experimentation indicates clearly that all the same issues have attended attempts to
evaluate the impact of social interventions.
Experts in the social domain, like those in medicine, have resisted the notion that rigorous evaluation of their work is more likely to give reliable answers than their own individual preferences. When randomised controlled trials find that new "treatments" are no better than old ones, a retreat to other methods of evaluation is particularly likely, as though the prime task is not to identify whether anything works but to prove that something does.
The forgotten history of social experimentation also shows that, as in clinical research, implementing randomised controlled trials in real life settings commonly carries a number of hazards: low participation rates or high attrition, problems with "informed consent," unanticipated side effects of the intervention, and a problematic relation between research and policy.
There are many lessons to be learnt from this experience about the
challenges of randomised controlled trials, including the difficulty of
establishing the effectiveness of complex multi-level interventions and
the problem of integrating ethnographic or qualitative data. But, as
Chapin wrote in 1931, "Experimental method in sociology does not mean
interference with individual movement or freedom. It does not endanger
life or limb or moral character."34 On the contrary,
what randomised controlled trials offer in the social domain is exactly
what they promise to medicine: protection of the public from
potentially damaging uncontrolled experimentation and a more rational
knowledge about the benefits to be derived from professional
intervention.
References
(Accepted 1 October 1998)
What can you learn from this BMJ paper? Read Leanne Tite's Paper+