Rapid Responses to:

PAPERS:
Thomas V Perneger
Relation between online "hit counts" and subsequent citations: prospective study of research papers in the BMJ
BMJ 2004; 329: 546-547

Rapid Responses published:

Those papers read most are cited most
Jon R Brassey (3 September 2004)
Citation errors
Phillip J. Colquitt (3 September 2004)
Prior evidence that downloads predict citations
Stevan Harnad, Tim Brody (6 September 2004)
On 'hit counts'
Maged N.K. Boulos (6 September 2004)
Real hits and shadow hits
Paul R. Mazur (6 September 2004)
Publication of Hit counts May Lead to Cheating.
Richard D Kennedy, Ruth S Barr (6 September 2004)
Scientific value versus scientific interest
Jeffrey Mann (6 September 2004)
Annoying that the BMJ published wrong hits data on their website
Gunther Eysenbach (6 September 2004)
Re: Annoying that the BMJ published wrong hits data on their website
Tony Delamothe (6 September 2004)
In response
Thomas V Perneger (7 September 2004)
Journal Readership and Impact Factors
Kai Ming Chow, Cheuk Chung Szeto (8 September 2004)
Journal online bias
Syed Abdul Mujeeb (10 September 2004)

Those papers read most are cited most 3 September 2004
Jon R Brassey,
NPHS, Wales and TRIP Database Ltd
NPHS, Mamhilad House, Pontypool NP4 0YP

I'm slightly concerned by the conclusions drawn by the author. It was not clear how the 'hits' are counted for each article. Do the hits only refer to people reading the full text?

With regard to the hypothesis that readers judge the "scientific value" of a paper I can see no evidence presented to support this. I would be more supportive of this notion if the hit rate showed people went from reading the abstract to reading the full text. If this is the case it is not presented in the paper.

Either way, isn't a simpler and more obvious conclusion "those papers that are read more are cited more"? If a paper isn't read, it isn't going to be cited!

Competing interests: None declared

Citation errors 3 September 2004
Phillip J. Colquitt,
Technician and RN
Independent comment

Citation error is itself a field of study that could consume an average scholar’s life. It suggests, at least, that the most cited papers are not necessarily the most read, contrary to what some responses so far have put forward. Authors do, it seems, cite inaccurately and quote inappropriately [1].

Usually, evaluations of citation accuracy come from within a given field, e.g. anaesthetics, surgery, geriatric nursing. The only field I know of that publishes little or no such evaluation of citation accuracy is neurology, though obviously the field has its errors.

[1] McLellan MF, Case LD, Barnett MC. Trust, but verify. The accuracy of references in four anesthesia journals. Anesthesiology 1992;77(1):185-8. PMID: 1609991

Competing interests: None declared

Prior evidence that downloads predict citations 6 September 2004
Stevan Harnad,
Canada Research Chair in Cognitive Science
Université du Québec à Montréal,
Tim Brody

Perneger's (2004) finding that download counts (what we call "usage impact") of British Medical Journal articles predict citation counts ("citation impact") for those articles in subsequent years confirms what Tim Brody's online usage/citation correlator http://citebase.eprints.org/analysis/correlation.php has been demonstrating for several years now across a number of areas in physics and mathematics (Brody & Harnad 2004, in prep.): there is a significant correlation between downloads today and citations two years later.

This correlation has two immediate implications:

(1) Download counts can be used as early performance indicators for papers and authors, even before their impact is reflected in citation counts: http://citebase.eprints.org/ (Hitchcock et al. 2003).

(2) Enhancing usage impact is yet another reason for authors to provide open access to their articles by self-archiving them.

References

Brody, T. & Harnad, S. (2004, in prep.) Using Web Statistics as a predictor of Citation Impact. http://www.ecs.soton.ac.uk/~harnad/Temp/timcorr.doc

Harnad, S. & Brody, T. (2004) Comparing the Impact of Open Access (OA) vs. Non-OA Articles in the Same Journals, D-Lib Magazine 10 (6) June http://www.dlib.org/dlib/june04/harnad/06harnad.html

Hitchcock, Steve; Woukeu, Arouna; Brody, Tim; Carr, Les; Hall, Wendy and Harnad, Stevan. (2003) Evaluating Citebase, an open access Web-based citation-ranked search and impact discovery service http://opcit.eprints.org/evaluation/Citebase-evaluation/evaluation-report.html

Perneger, T.V. (2004) Relation between online "hit counts" and subsequent citations: prospective study of research papers in the BMJ. BMJ 2004;329:546-547 (4 September), doi:10.1136/bmj.329.7465.546 http://bmj.bmjjournals.com/cgi/content/full/329/7465/546

Competing interests: None declared

On 'hit counts' 6 September 2004
Maged N.K. Boulos,
Lecturer in Healthcare Informatics, University of Bath
Bath BA2 7AY, UK

The method used to compute the number of times a manuscript has been accessed ('hit count' as Perneger calls it) is not clear to me. Did the author measure hits or pageviews? A simple Web page with four inline images accessed once will count as five raw hits, but only as a single pageview. The well documented limitations of Web server log analysis methods should have been acknowledged by the author, as well as any adjustments or corrections he might have applied in computing paper access statistics. Issues to consider include [1,2]:

o Multiple accesses to the same page by the same person, which could happen for a range of malicious or legitimate reasons, are often wrongly counted as/attributed to more than one reader. Server log analyses remain inaccurate in this respect even after considering user sessions and referring sites, and introducing user accounts and cookies.

o Visits/hits by Web robots like Google to the pages in question should be identified and excluded.

o Browser/proxy caching effects inevitably result in incomplete registration of all server requests.

o Page popularity/pageviews alone were never an accurate measure of user interest or user-perceived quality and usefulness of accessed pages.
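
The difference between raw hits and pageviews, and the exclusion of robot visits, can be illustrated with a minimal sketch. It is not the method used by Perneger or by bmj.com: the `Request` record, the image suffixes and the robot markers are all assumptions made for illustration, and real robot detection needs far more than a substring check.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical log record; not the format of any real server log."""
    path: str
    user_agent: str


# Assumed heuristics, for illustration only.
IMAGE_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png")
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(req):
    """Crude check: flag a request whose user agent contains a robot marker."""
    agent = req.user_agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def raw_hits(log):
    """Every successful request counts, inline images included."""
    return len(log)


def pageviews(log):
    """Count page requests only, skipping images and robot traffic."""
    return sum(
        1
        for req in log
        if not req.path.lower().endswith(IMAGE_SUFFIXES) and not is_robot(req)
    )


# One reader opens a page that has four inline images.
human_visit = [
    Request("/cgi/content/full/329/7465/546", "Mozilla/4.0"),
    Request("/img/fig1.gif", "Mozilla/4.0"),
    Request("/img/fig2.gif", "Mozilla/4.0"),
    Request("/img/fig3.gif", "Mozilla/4.0"),
    Request("/img/fig4.gif", "Mozilla/4.0"),
]
# A search engine robot fetches the same page once.
robot_visit = [Request("/cgi/content/full/329/7465/546", "Googlebot/2.1")]

print(raw_hits(human_visit), pageviews(human_visit))
print(raw_hits(human_visit + robot_visit), pageviews(human_visit + robot_visit))
```

With these assumed records, the single human visit registers as five raw hits but one pageview, and the robot request adds a raw hit without adding a pageview.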

References

1. Boulos MNK. A Two-method Evaluation Approach for Web-based Health Information Services: The HealthCyberMap Experience. In: A Geissbühler, C Boyer, JW van der Slikke, TN Arvanitis (eds). Proceedings of the 8th World Congress on the Internet in Medicine, Geneva, Switzerland, December 2003/Technology and Health Care 2003;11(5):333-334. Amsterdam: IOS Press. [http://www.hon.ch/Mednet2003/abstracts/289146271.html]

2. Tec-Ed, Inc, US. Assessing Web Site Usability from Server Log Files. White Paper. December 1999. [http://www.teced.com/PDFs/whitepap.pdf]

Competing interests: None declared

Real hits and shadow hits 6 September 2004
Paul R. Mazur,
Physician
The Urban Medical Group; Jamaica Plain, Boston, MA USA 02130

Couldn't the relation between online "hit counts" and subsequent citations be a function of something other than content? Consider M. Anantanarayanan's 1961 novel, "The Silver Pilgrimage", all but forgotten were it not for John Updike's wonderfully witty poem, "I missed his book but I read his name." More, I think, will have read the latter (and perhaps have found more content in it, for all its brevity) than the former. But since the Updike poem could not exist without the Anantanarayanan novel, he, Anantanarayanan, increments his own "hit count" by 1 each time a new reader discovers "I missed his book but I read his name."

Competing interests: None declared

Publication of Hit counts May Lead to Cheating. 6 September 2004
Richard D Kennedy,
Clinical Research Fellow
Queen's University of Belfast, University Floor, Belfast City Hospital. BT9 7AB,
Ruth S Barr

We read Perneger’s article with interest. We agree that it would be useful to have a rapid method to evaluate research articles. We are not surprised that historical hit counts reflect future article citation. However, we do not believe that this method of evaluation would work prospectively. If a paper’s prestige depended on a hit count score, it would be tempting for an author or institution to inflate the score artificially by repeatedly accessing the article. It would be relatively easy to write software to perform this task. Moreover, it would be possible to make each access appear unique by continually altering the internet protocol (IP) address.

At present, citation of a research article by another paper can be considered to have the quality guarantee of peer review. Therefore, we believe that citation of a publication still remains the most reliable method of evaluating its impact.

Competing interests: None declared

Scientific value versus scientific interest 6 September 2004
Jeffrey Mann,
Retired physician
Salt Lake City, UT 84103

I cannot understand how the author can rationally use the term "scientific value" in relation to the number of online "hits" that a journal article receives soon after publication. It would make more sense to have used the term "interest" rather than "value". I suspect that many online "hits" are generated by a controversial research result, which is somewhat unanticipated, rather than a scientifically valuable result. The scientific value of a clinical research result is very dependent on the quality of the research study, which cannot be discerned from the abstract.

Likewise, subsequent citations by other researchers may also reflect their need to deal with an unexpected (unanticipated or controversial) research result, and I know of no EBM evidence that demonstrates that the number of citations is closely correlated with the scientific value of the study. Am I wrongly informed?

Jeff Mann.

Competing interests: None declared

Annoying that the BMJ published wrong hits data on their website 6 September 2004
Gunther Eysenbach,
Senior Scientist, Centre for Global eHealth Innovation
Toronto M5G2C4

The question remains why the BMJ published wrong "hits" data on their website over the course of 5 years. I had the exact same study idea and a graduate student was working on this project, entering and working with all the (wrong) hits data the BMJ published on the website. This is more than a minor nuisance!

Competing interests: None declared

Re: Annoying that the BMJ published wrong hits data on their website 6 September 2004
Tony Delamothe,
web editor, bmj.com
BMA House, Tavistock Sq London

I am sorry that Gunther Eysenbach and maybe others have been misled by our original hit parades. As we found the (relatively rough and ready) results interesting, we thought others might too, although we never envisaged researchers using them as the basis for research projects until Perneger submitted his paper.

The script we originally used to count impressions omitted those that arose from users following email alerts back to source stories - which make up an increasing and substantial proportion of the total. The counts also weren't filtered to remove robot activity and activity originating from our electronic suppliers (HighWire) and the BMJ Publishing Group.

Newly available web traffic tools allow us easily to address these problems and to add counts for all versions of an article - not just the HTML of the full text, as before. As a glimpse at the results on http://bmj.bmjjournals.com/hitparade/ shows, the HTML counts for the full text article are not always a reliable predictor of the total traffic.

Obviously, it would have been preferable had we posted the counts with these caveats in 1999 - but we didn't know they existed until relatively recently.

Other points:

Hit counts = successful requests for pages; the numbers *exclude* all images.

As Perneger said in his Methods and results section, these relate to the HTML version of the full text article.

Competing interests: I'm editor of bmj.com

In response 7 September 2004
Thomas V Perneger,
professor of health services evaluation
Institute of social and preventive medicine, University of Geneva

Having searched the ISI database by hand, I am humbled by Brody's online hits/citations linkage system, described by Harnad. Harnad and Brody observe that the correlation between hits and citations in the field of math/physics increases over time, to reach 0.43 two years after publication. This is compatible with my observation of r=0.50 after 5 years.

I doubt we could expect a much higher correlation. Both measures are affected by errors, as Colquitt (citations) and Boulos (hits) correctly observe. Misclassification bias will weaken the observed association. Additional error is due to randomness: if the number of citations has a Poisson distribution, 95% of papers that would ideally get 25 citations will in fact lie between 15 and 35 citations (remember, expected value = variance, so the standard deviation is 5). The same reasoning holds for hits.

Brassey and Mann disagree with my hypothesis that scientific value is the hidden link between hits and citations. Well, so far it's just a hypothesis. I do not think that Brassey's explanation (papers that are read more are cited more) is incompatible, as one has to ask: why are some papers read and others not? Would scientific value have something to do with it? Also, authors (I hope!) do not cite papers they remember reading online a couple of years ago - usually a systematic search is performed.

As for scientific value versus scientific interest (Mann's argument), I am not sure I can see these concepts as clearly distinct. To me, "scientific value" is not an objective attribute that exists per se, but a social construct, reached by consensus in the scientific community. So you need interest to establish value, and you need value to rouse interest. I agree with Mann that the notion that the number of citations reflects scientific value is an a priori belief. But since enough people seem to share this belief, that is how citations are interpreted...

I fully agree with Kennedy that if the hit count gains currency in research assessment, all kinds of manipulation will occur. If self-citation is a problem, it is nothing compared to what "self-hitting" (please help me here!) will be.

I salute my kindred spirit, Dr Eysenbach, with apologies. I've been on the receiving end of this type of situation.

Finally, I will remember Mazur's suggestion that clever titles and alliterative author names will lead to more citations - or was it to get John Updike to write a poem about you? For those interested, here is the link: http://www.cs.rice.edu/~ssiyer/minstrels/poems/788.html

Competing interests: Author of paper commented upon

Journal Readership and Impact Factors 8 September 2004
Kai Ming Chow,
Research Fellow
Department of Medicine and Therapeutics, Prince of Wales Hospital, Chinese University of Hong Kong,
Cheuk Chung Szeto

Editor--In a recent study of scientific articles published in the BMJ, attention was brought to website hit counts which might predict subsequent citation [1]. Little doubt exists that medical journals vary in their ability to attract advertisements according to the Science Citation Index Impact Factors or paper citation frequency, a surrogate thought to reflect journal readership. In the same vein, we ask whether the Impact Factor has any influence upon journal readership, and hence attractiveness for pharmaceutical advertising investment. Given that primary care physicians received the journals with the most advertisements [2,3], we evaluated the patterns of pharmaceutical advertisements in four major medical journals targeted at family practitioners (American Family Physician, Canadian Family Physician, Journal of Family Practice, Postgraduate Medicine).

In a retrospective review of pharmaceutical advertisements in four large-circulation journals directed toward family practitioners, pharmaceutical advertisements were defined as those for prescription or over-the-counter medications. Vaccines and non-pharmaceutical advertisements (medical/surgical devices, academic meetings or classified employment advertising) were excluded. To account for monthly variations in advertising budgets, we randomly selected 4 months (January, April, July and October) for each journal between calendar years 1988 and 2002.

Overall, pharmaceutical advertisements comprised 112 ± 64 pages per journal issue. Strange as it might seem, there was a significant negative association between the page percentage of pharmaceutical advertisements (per issue) and the corresponding journal yearly Science Citation Index Impact Factor (Spearman's rho = -0.71, P < 0.001).
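
For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation of the ranks of the two variables. The sketch below shows the calculation only. The four data points are invented for illustration and are not the authors' data, and the function names are assumptions.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example values, NOT the study data.
ad_page_percentage = [60, 45, 30, 20]
impact_factor = [0.3, 0.5, 0.4, 1.2]
print(spearman_rho(ad_page_percentage, impact_factor))
```

A negative rho, as here, means that issues with a higher share of advertising pages tend to come from journals with a lower Impact Factor.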

It would therefore be interesting to speculate that less "prestigious" medical journals (as rated by the Impact Factor) attract a far larger share of industry advertising partly because they are more widely read (albeit less frequently cited). It is therefore important for all medical journals [4], particularly those with "less reputable impact factors", to take account of any potentially misleading content in deciding which advertisements to publish.

Funding: CUHK research account 6901031.
Competing interests: None declared.
Ethical approval: Not required.

References

1. Perneger TV. Relation between online "hit counts" and subsequent citations: prospective study of research papers in the BMJ. BMJ 2004;329: 546-7.

2. Glassman PA, Hunter-Hayes J, Nakamura T. Pharmaceutical advertising revenue and physician organizations: how much is too much? West J Med 1999;171: 234-8.

3. Dobson R. Pharmaceutical industry is main influence in GP prescribing. BMJ 2003;326: 301.

4. Oliver JJ, Maxwell SR. Journals should select drug advertisements more carefully. BMJ 2003;326: 1211.

Journal online bias 10 September 2004
Syed Abdul Mujeeb,
Assistant Professor
AIDS Surveillance Center, JPMC, Karachi

Hit count as a measure of the scientific value of a paper does not take into account readers of the print version, who may be systematically different from online readers. The measure can therefore introduce a kind of selection bias: “journal online bias”. It is quite probable that print readers are more often from the developing world, and their research interests, and the duration, preparation and publication time of their research work, may differ from those of online readers. It is therefore quite likely that their citations remain under-reported during the period over which hit counts are measured.

Competing interests: None declared