11

Journals ranking and impact factors: how the performance of journals is measured

Iain D. Craig, Liz Ferguson and Adam T. Finch

Abstract:

This chapter investigates measures of journal performance and ranking. It begins by exploring the principal conventional sources of citation data, Web of Science and Scopus, and compares these with alternatives such as Google Scholar. Critical variables in citation analyses include coverage by discipline and different article types, such as review articles compared to articles documenting new research. The chapter concludes with an exploration of alternative metrics.

Key words

journal impact factor

journals ranking

alternative bibliometrics

Why rank journals?

There are many ways in which a journal may be ranked, such as surveying the opinions of the individuals who read them, or using empirical measurements based on the numbers of citations or downloads the journal’s articles receive. Before considering in detail any of these ranking mechanisms, however, it is worth examining why we would wish to rank journals in the first place.

Estimates of the number of peer-reviewed journals in current circulation vary considerably but one work (Morris, 2007) places the number at around 23,000 and a more recent unpublished estimate (Morrison, 2012) suggests the number has grown to 26,000, based on information from Ulrich’s Periodicals Directory. With readers having limited time to search and read the literature, and libraries having limited financial resources to acquire journal content, some form of value judgement is required to provide focus. Assigning a numerical ranking to a journal (e.g., citations per article, cost per ranking unit, or downloads per article) allows for ready comparisons between titles with similar aims and scope. These average measures encapsulate one or more characteristics of the journal, and, depending on which perspective you are viewing from, enable the individual to focus their finite resources better.

An additional reason to rank journals is related to the increasing desire to rank individuals, research groups or even institutions by the quantity and quality of their output in peer-reviewed journals. While this practice is ill-advised due to the problems of extrapolating the average journal quality to the individual articles which are published in those journals, a number of current schemes nevertheless do just that. More sophisticated schemes will investigate the quality of individual articles themselves rather than rely on the bulk average; however, the use of journal quality as a proxy for author quality has the benefit of convenience, being far less labour intensive than collecting individual citation scores for each article. At the very least, journal quality can be taken as a proxy for author esteem.

From a publisher’s perspective, there is interest in pursuing activities that maximize the quality and visibility of journals, ensuring they reach as many interested users as possible and are attractive to potential authors seeking to publish their highest quality research. To this end, scoring highly in various systems of ranking is crucially important to the long-term success of a journal.

Conventional measurement types

Methods of ranking journals can be qualitative, quantitative or a combination of elements of both. With the emergence of online journal content platforms, and the availability of increasing computational power, quantitative methods have largely asserted dominance over the qualitative methods. This is not to say that quantitative methods are intrinsically better than qualitative methods; it is more a function of the ease with which data can be acquired and analysed. Lehmann (2006) said: ‘Institutions have a misguided sense of the fairness of decisions reached by algorithm; unable to measure what they want to maximise (quality), they will maximise what they can measure.’ Certain subject disciplines lend themselves particularly well to quantitative methods such as citation analysis. Other disciplines, particularly the arts and humanities, are less well suited to evaluation by citation analysis (see the section ‘Subject-specific citation differences’ on p. 263). Qualitative judgements as to the relative merits of different journals are much more prevalent in these subjects than in the sciences, for example.

Citation linkages

The most common form of quantitative measurement is based upon citation linkages, whereby two documents A and B are linked when document B cites document A by means of a reference to document A in its bibliography. This type of citation linkage is largely understood to be a means of substantiating and building upon previous work, although there are numerous alternative reasons why an author may cite previous work (Case and Higgins, 2000), including criticizing previous work or conforming to social hierarchical mechanisms. Despite these possible alternative reasons for the presence or absence of a citation, the interpretation of a citation-based analysis typically relies on the assumption that citations are provided for positive reasons. Nonetheless, citations do not equate directly to quality; they can be spoken of more correctly as a proxy measure for academic impact.

Counting the citation linkages between documents requires a well-structured citation index. A citation index allows a user to navigate from an article in question, both backwards to previous work and forwards to work which has referenced the article in question. While not the inventor of the concept of citation indexing, Eugene Garfield can be credited with commercializing the process and applying it to the scholarly communication process. Garfield described the concept of his citation index in 1955 (Garfield, 1955), and in 1961 the Institute for Scientific Information (ISI), the company he founded, launched the Science Citation Index. For many decades, the standards in citation indexing were the ISI citation indexes (now part of Thomson Reuters and collectively known as Web of Science). In recent years new citation indexes have emerged such as Scopus (http://www.scopus.com), CiteSeerX (http://citeseerx.ist.psu.edu) and Google Scholar (http://scholar.google.com).

The main differences between Scopus and Web of Science relate to the breadth and depth of coverage (how many journals they cover, and over how many years). Web of Science, for example, indexes fewer journals but has coverage back to the 1980s for many titles and as far back as 1954 for some; conversely, Scopus indexes more journals, but citation counts are only accurate for content published from 1996 onwards. However, the two indexes operate on the same principles: each receives a defined set of journal articles and their references directly from the publisher, initially as print copy for scanning and optical character recognition (OCR) and latterly as an e-feed. In this way, Scopus and Web of Science capture and index bibliographic metadata for a selected group of high-quality titles, allowing them to match citing references to target articles with a high degree of accuracy. These metadata and the associated citation counts are then available for downloading in record sets, for analysis by publishers and other stakeholders.

Google Scholar and CiteSeerX operate using an autonomous citation indexing principle, whereby indexing robots look for scholarly material by trawling the Internet, including the major journal content hosted on delivery platforms (such as Wiley Online Library, ScienceDirect, and so on), centralized subject repositories such as arXiv, institutional repositories, or even researchers’ personal web pages. Once the robots have identified suitable content, indexing algorithms extract bibliographic metadata and citing references, and these are matched to existing articles in their database. However, there are indications that this automated approach to crawling for metadata introduces quality problems (Falagas et al., 2008).

While the above citation indexes aim, to a greater or lesser extent, to cover the full gamut of subject areas in the journals universe, there are also subject-specific citation indexes. Examples of these include the Astrophysical Data Service for astronomy and CitEc for economics. Country-specific indexes also exist, including the Indian Citation Index and the Chinese Social Sciences Citation Index. Journal editors are sometimes quite enthusiastic about these indexes, but as they cover only selective parts of the journals universe, and since they often carry less complete metadata than the ‘universal’ citation indexes, wide-scale journal impact analysis is more difficult using them.

In addition to the standard document B cites document A citation, other forms of citation-based linkages are possible, such as bibliographic coupling (Kessler, 1963) – where two documents cite a common third document (document C) – or co-citation studies (Marshakova, 1973; Small, 1973) which measure the number of times documents A and B are cited together in subsequent literature.

Bibliographic coupling and co-citation studies of the journal literature have until recently been small-scale studies and largely confined to the academic realm of scientometrics because of the need to have access to the underlying raw citation data. However, the same principles can be applied to any collection of documents. In general, such relationships are used to demonstrate similarities between the topics of the articles, rather than as a proxy for impact; for example, SciVal Spotlight, a tool based on Scopus data, identifies potential peers for collaboration using this approach. The advent of institutional and subject-based repositories, which make their metadata readily available for harvesting, means that these measurements are becoming more commonplace.
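For readers who wish to experiment with these relationships, the sketch below (in Python, with entirely hypothetical document identifiers and reference lists) illustrates how bibliographic coupling and co-citation counts can be derived from nothing more than a set of reference lists; it is an illustration of the principle rather than of any particular index’s implementation.

```python
# Minimal sketch: bibliographic coupling and co-citation counts from reference lists.
# Document identifiers and reference lists below are hypothetical.

references = {
    "A": {"C", "D", "E"},
    "B": {"C", "F"},
    "G": {"A", "B"},        # G cites both A and B
    "H": {"A", "B", "C"},   # H also cites both A and B
}

def coupling_strength(doc1, doc2, refs):
    """Bibliographic coupling: number of references the two documents share."""
    return len(refs[doc1] & refs[doc2])

def cocitation_count(doc1, doc2, refs):
    """Co-citation: number of later documents that cite both doc1 and doc2."""
    return sum(1 for cited in refs.values() if doc1 in cited and doc2 in cited)

print(coupling_strength("A", "B", references))  # A and B both cite C -> 1
print(cocitation_count("A", "B", references))   # G and H cite both A and B -> 2
```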

Subject-specific citation differences

Researchers communicate research outputs of different subject areas in different ways. Some predominantly use journals, while others use a much broader range of media such as monographs, book chapters, working papers, reference works, handbooks and conference proceedings. This has dramatic consequences for the application and validity of journal-based citation analysis in different disciplines.

In Table 11.1, the first column, ‘Importance of journals (%)’, describes the proportion of references within ISI-indexed journals which refer to other journals (Moed, 2005). The data set consists of the references of original articles and reviews published in the ISI indexes during 2002, with only references to items from 1980 onwards being considered. In molecular biology and biochemistry, 96 per cent of references are to other journal articles, whereas in the humanities and arts, only 34 per cent of references are to other journal articles. A citation analysis based on journal citations will, therefore, capture much more of the research communication in molecular biology and biochemistry than it will in the humanities and arts.

Table 11.1

Importance and coverage of different subjects in ISI citation indexes (based on references in articles and reviews published during 2002)

Discipline Importance of journals (%) ISI coverage of journal literature (%) Overall ISI coverage (%)
Molecular biology and biochemistry 96 97 92
Biological sciences related to humans 95 95 90
Chemistry 90 93 84
Clinical medicine 93 90 84
Physics and astronomy 89 94 83
TOTAL ISI 84 90 75
Applied physics and chemistry 83 89 73
Biological sciences – animals and plants 81 84 69
Psychology and psychiatry 75 88 66
Geosciences 77 81 62
Other social sciences – medicine and health (1) 75 80 60
Mathematics 71 74 53
Economics 59 80 47
Engineering 60 77 46
Other social sciences (2) 41 72 29
Humanities and arts (3) 34 50 17


(1) includes public, environmental and occupational health, nursing, sport sciences;

(2) includes sociology, education, political sciences and anthropology;

(3) includes law

The ISI citation indexes provide variable coverage of different subject areas, with some areas being better represented than others. The column headed ‘ISI coverage of journal literature (%)’ in Table 11.1 indicates the proportion of references to articles in journals indexed by ISI, and therefore visible in any citation analysis. The journal coverage is highest in molecular biology and biochemistry and lowest in the humanities and arts.

Multiplying the importance of journals by the journal coverage provides a value for the effective ISI coverage of the totality of research communication in that particular subject area. In the humanities and arts, only 34 per cent of communication is via journal articles and only 50 per cent of those journal articles are indexed in ISI. This gives an overall coverage value of 17 per cent.

This combination of the significance of journal communication and the actual coverage of the journal literature indicates that the questions that can be answered through citation analysis vary greatly between subject areas. Perhaps even more importantly, they will vary between different citation indexes: Web of Science, Scopus, Google Scholar, CiteSeerX, and so on.

Finally, these differences will also vary over time. The data reported in Table 11.1 date from an examination of the ISI citation indexes in 2002. Since then, coverage has increased, most notably with the October 2008 integration of the ISI proceedings indexes. In some cases this has had a substantial effect on the coverage of subject areas. Table 11.2 shows, for selected subject areas, the proportion of content indexed in the journal citation indexes as opposed to the Conference Proceedings Citation Index. Those subjects – primarily computer science and engineering – where conference proceedings comprise more than 60 per cent of all content from 2000 to 2013 are shown, along with those where such documents comprise less than 0.5 per cent. Clearly, journal content is a less important vehicle for communicating research in some subjects than in others.

Table 11.2

Proportions of selected subject coverage in ISI journal vs. conference proceedings indexes

Web of Science categories Total publications 2000-13 Science, social science or arts and humanities citation indexes (%) Conference Proceedings Citation Index (%)
Imaging science and photographic technology 163,361 15.01 84.99
Robotics 81,153 18.57 81.43
Computer science, cybernetics 57,569 27.64 72.36
Computer science, artificial intelligence 489,403 27.66 72.34
Remote sensing 86,979 30.24 69.76
Computer science, hardware architecture 181,642 31.07 68.93
Automation and control systems 240,794 31.31 68.69
Telecommunications 396,704 31.62 68.38
Computer science, theory and methods 435,686 34.36 65.64
Computer science, information systems 376,441 34.48 65.52
Engineering, electrical and electronic 1,285,388 38.55 61.45
Transplantation 133,412 99.53 0.47
Dermatology 149,674 99.54 0.46
Haematology 297,215 99.54 0.46
Psychiatry 310,461 99.56 0.44
Developmental biology 75,758 99.64 0.36
Substance abuse 51,119 99.65 0.35
Chemistry, organic 257,958 99.69 0.31
Nursing 88,917 99.71 0.29
Literature, romance 59,251 99.72 0.28
Dance 23,654 99.73 0.27
Urology and nephrology 233,068 99.81 0.19
Folklore 9,829 99.84 0.16
Rheumatology 109,054 99.95 0.05
Primary health care 20,942 99.96 0.04
Literary reviews 107,743 99.98 0.02


Since 2008, the Conference Proceedings Citation Index has been joined by the Book Citation Index and the Data Citation Index. Thomson Reuters has suggested that 10,000 titles will be added to the Book Citation Index each year and some initial work has been completed (Leydesdorff and Felt, 2012) looking at whether book citation data are robust enough for reliable bibliometric analysis. This may yet become another key focal area of analysis for publishers. However, for now, longitudinal comparisons need to be undertaken with the knowledge that coverage is not fixed, but is constantly in motion.

Citation distributions

The distribution of citations to articles follows a skewed rather than normal distribution; that is, a small number of items is exceptionally highly cited, while the majority of items are seldom cited. A study of biomedical journals (Seglen, 1992) reported that: ‘Fifteen per cent of a journal’s articles collect 50 per cent of the citations, and the most cited half of the articles account for nearly 90 per cent of the citations. Awarding the same value to all articles would therefore tend to conceal rather than to bring out differences between the contributing authors.’ Similar findings have been reported in the fields of immunology and surgery (Weale et al., 2004) and for articles published in the journal Nature (Campbell, 2008). Figure 11.1 shows that a similar pattern exists for Australian publications across a range of subjects.

Figure 11.1 Proportion of Australian 2010 publications accounting for 50 per cent and 90 per cent of citations (to October 2012) in selected Web of Science subjects

A consequence of this skewed distribution is that average values tend to be highly influenced by the presence or absence of a handful of highly cited items from one year to the next. The implication is that consideration should be given to the type of statistical method or test that is applied when examining sets of citation data. Non-parametric methods will, in many cases, be more appropriate than their parametric counterparts.
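The skew can be illustrated with a small calculation. The sketch below, which uses hypothetical citation counts rather than real journal data, computes the smallest share of articles (most cited first) needed to account for 50 per cent and 90 per cent of a journal’s citations, in the spirit of Seglen’s observation and Figure 11.1.

```python
# Sketch: share of a journal's articles accounting for 50% and 90% of its citations.
# The citation counts below are hypothetical, chosen to show a typical skewed distribution.

citations = sorted([120, 45, 30, 18, 9, 7, 5, 4, 3, 2, 2, 1, 1, 0, 0, 0], reverse=True)

def share_of_articles(counts, target_share):
    """Smallest fraction of articles (most cited first) whose citations reach target_share of the total."""
    total = sum(counts)
    running = 0
    for i, c in enumerate(counts, start=1):
        running += c
        if running >= target_share * total:
            return i / len(counts)
    return 1.0

print(f"{share_of_articles(citations, 0.5):.0%} of articles account for 50% of citations")
print(f"{share_of_articles(citations, 0.9):.0%} of articles account for 90% of citations")
```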

Journal Citation Reports

The most common journal aggregated citation-based measurement in use today is the impact factor, which is published annually by ISI as a component of its Journal Citation Reports (JCR). But the impact factor is not the only measurement that is provided in the JCR, and some of the other measurements, such as total citations, cited half-life, and immediacy index provide valuable information at the journal level. Measurements aggregated to the level of the subject category provide useful benchmarking values or rankings.

The impact factor

The impact factor is a measure of the average number of citations to the articles contained within a given journal. This calculation is based on the citations originating from the journal subset indexed in ISI’s Web of Science, comprising approximately 12,000 journals. In practice, only those journals which appear in either the science or social science citation indexes will receive an impact factor; this represented 10,647 titles in the 2011 JCR.

The impact factor imposes some special restrictions on the calculation of this average, limiting it to citations made during a defined period to articles published in a second, earlier defined period. The definition of the 2012 impact factor is provided in the equation below:

2012 impact factor = (citations in 2012 to all items published in 2010 and 2011) / (number of citable items published in 2010 and 2011)
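Expressed computationally, the calculation is a simple ratio. The sketch below uses hypothetical citation and article counts; in practice the inputs are drawn from Web of Science data.

```python
# Sketch of the two-year impact factor calculation described above.
# All counts are hypothetical; in practice they come from Web of Science data.

citations_in_2012 = {2010: 410, 2011: 325}   # citations made in 2012 to items published in each year
citable_items = {2010: 180, 2011: 195}       # articles, reviews, etc. counted in the denominator

impact_factor_2012 = sum(citations_in_2012.values()) / sum(citable_items.values())
print(f"2012 impact factor: {impact_factor_2012:.3f}")   # (410 + 325) / (180 + 195) = 1.960
```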

The impact factor was created by Garfield in the early 1960s as a measure to select new journals to add to his growing Science Citation Index. By aggregating author-level citation data to the level of the journal, he could determine which journals were most commonly cited. For a fuller historical explanation of the origins of citation indexing and the impact factor, see Bensman (2007). Journals which Garfield identified as heavily cited but not indexed at that time were then added to the citation index. Garfield noted that a relatively small core of journals was responsible for the majority of citations, and this allowed him to cost-effectively cover a large proportion of cited articles, without necessarily indexing the entire corpus of research. In an essay currently available on the Thomson Reuters website this inner core of journals is highlighted:

Thomson Reuters analyzed the 7621 journals covered in the 2008 Journal Citations Report®. The analysis found that 50% of all citations generated by this collection came from only 300 of the journals. In addition, these 300 top journals produced 30% of all articles published by the total collection. Furthermore, this core is not static. Its basic composition changes constantly, reflecting the evolution of scholarly topics.

(Testa, 2012)

The suitability of the impact factor as a measure of journal quality has been thoroughly debated (e.g., Cameron, 2005; Seglen, 1997) and we do not propose to revisit old arguments here. Suffice to say, despite correctly noted shortcomings, the impact factor is a powerful metric. It is the preeminent metric in the author, library and research funding community. While many of the following JCR metrics may well not be recognized and are seldom used by journal stakeholders, the impact factor almost always will be. Although it is flawed and frequently used in an inappropriate manner, it cannot be dismissed.

Total citations

While the average citations-per-article calculation that is the impact factor mitigates the effects of journal size, it has a tendency to favour review journals, which typically publish a relatively small number of highly cited articles (although see the section ‘Review articles’ on p. 273 of this chapter). The total citations measure provides a broader perspective by counting citations from all citing articles from the current JCR year to all previous articles (from any year) in the journal in question. Such a metric will obviously highlight larger and higher-quality journals. In theory, as the journal’s total historical output, and hence citable material, can only ever increase from year to year, the total citations value should also increase from year to year. In practice, however, obsolescence of the literature occurs (see ‘Cited half-life’ below), with material becoming less likely to be cited the older it becomes. This, to a degree, keeps in check the year-on-year rise of the total citations figure.

Cited half-life

An article’s citation distribution over time can be monitored and characterized. Taking an idealized article, the citations per period will typically rise over time to a maximum. This maximum is dependent on the inherent quality of the article, the subject area (medical sciences will reach a peak more rapidly than social sciences) and the type of document (short communications will typically see an earlier peak than original research articles). From this peak it will then drop off, again at a rate dependent on quality, subject area and the type of article.

The cited half-life value characterizes the age distribution of citations to a journal, giving an indication of the rate of obsolescence. The cited half-life as defined by ISI in its JCR is the median age of the papers that were cited in the current year. For example, a cited half-life in JCR 2012 of 6.0 years means that of the citations received by the journal from all papers published in 2012, half were to papers published between 2007 and 2012.
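The sketch below illustrates the idea with hypothetical citation counts broken down by the publication year of the cited items; note that the JCR reports an interpolated value to one decimal place, whereas this simplified illustration returns a whole number of years.

```python
# Sketch: cited half-life as the median age of the items a journal's citations point to
# in the current JCR year. Citation counts by publication year of the cited item are
# hypothetical, and this integer approximation ignores the interpolation the JCR applies.

cites_by_pub_year = {2012: 40, 2011: 120, 2010: 150, 2009: 110, 2008: 90,
                     2007: 70, 2006: 50, 2005: 30, 2004: 20, 2003: 10}

def cited_half_life(cites):
    """Years back from the current year needed to account for half of this year's citations."""
    total = sum(cites.values())
    running = 0
    for age, year in enumerate(sorted(cites, reverse=True), start=1):  # most recent year first
        running += cites[year]
        if running >= total / 2:
            return age
    return len(cites)

print(cited_half_life(cites_by_pub_year))  # 4 -> half of 2012's citations were to items from 2009-2012
```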

The desired value for a cited half-life varies depending on the journal. A long cited half-life for a journal can be interpreted as meaning longevity in articles and a journal that serves an archival purpose, whereas a short cited half-life can indicate articles which are on the cutting edge but quickly rendered obsolete by the pace of change in the subject area. Both of these qualities are desirable for different reasons, and it is observed (Tsay and Chen, 2005) that for general and internal medicine and surgery there is no correlation between cited half-life and impact factor.

Immediacy index

The immediacy index is an indication of how rapidly citations to a journal take place, and is calculated as the ratio of the number of citations received in the current year to the number of citable items published in the current year. Within-year citations indicate that the research is being built upon rapidly, which is a desirable outcome for any journal. In contrast to the cited half-life data, it was observed that for general and internal medicine and surgery there is a correlation between immediacy index and impact factor (ibid.).

Subject aggregated data

Since JCR 2003 (published summer 2004), ISI has produced data aggregated to the level of the subject category in addition to the journal-specific values mentioned above. These data allow for benchmarking against peer journals, and enhance the usability of the journal-specific data. The data provided at subject level include: total citations, median impact factor, aggregate (mean) impact factor, aggregate immediacy index, aggregate citing half-life, aggregate cited half-life, and total articles.

By comparing, for example, the change in a journal’s impact factor over time against the change in the median or aggregate impact factor over the same time period, one can determine whether changes at a journal level are to do with the quality of the articles within the journal itself, or merely a reflection of an overall trend at the subject level. This is particularly useful when explaining step-changes in impact factor across a group of journals.

One note of caution should be sounded when performing longitudinal analysis of the journals in the JCR: the ISI journal universe is constantly in flux. The net change in the number of titles is positive, but journals cease, merge and split, while some are simply dropped as they no longer fulfil the selection criteria as research trends evolve over time. Conversely, new journals, and occasionally new subjects, are added to better reflect the current status of research. This change in the composition of the indexes, and hence the JCR, means that care needs to be taken when interpreting data to ensure that the observed result is a true result, and not simply an artefact of the evolving composition of the index.

Using JCR metrics in promoting journals

In the JCR there exist a number of metrics that can be used to rank journals. The choice of one particular metric over another depends on the message to be conveyed and the ranking of the journal in each of the metrics. For instance, a large journal without a particularly high impact factor may be ranked highly by total citations, and so may adopt that as its USP in marketing messages. That said, the impact factor is still the primary metric to be given weight in the community; indeed, even Elsevier, which produces and promulgates its own alternatives to the impact factor, such as SNIP and SJR, still uses this metric on its journal pages.

Ultimately, however, journal editors and publishers are competing in a crowded world for the attention of readers and potential authors. Marketing messages are prepared and distributed which make use of a variety of metrics, with the intention of appealing to as many potential authors as possible, and with the intention of improving those metrics in the future.

Author behaviour and journal strategies

Increasingly, the academic and financial success of a researcher is tied to his or her ability to publish in high-impact journals. In some instances this imperative results in journal submission choices being made not due to the journal’s suitability in terms of the article content and the audience to be served, but simply because it has an impact factor above a certain value. This strategy is clearly misguided, as simply appearing in a journal with a high impact factor does not guarantee that an article will receive any more citations than if it had appeared in a journal with a much lower impact factor (though that assumption cannot be tested). Nor does it follow that an individual article will receive the average number of citations that previous articles have received, due to the skewed distribution of citations in a journal’s articles. The 2005 CIBER Study of more than 5500 authors (Rowlands and Nicholas, 2005) clearly illustrated the importance of the impact factor in publication decisions; the impact factor was the third most important reason (after reputation and readership) for authors to select a journal for submission of their most recent paper. A survey carried out by Wiley into the submission behaviours of early-career authors identified the same trend, though interestingly it was slightly less pronounced than in more experienced researchers (L. Ferguson, personal communication, 2007).

Impact factors and rankings are a regular agenda item for journal editorial board meetings for the reasons outlined above. The impact factor has a substantial effect on author behaviour when choosing where to publish, and most editors believe, with justification, that a good impact factor and the high ranking it brings is one of the strongest drivers of high-quality submissions. Data made available by the editors of Aging Cell, published by Wiley, a young journal in a growing field, supports that notion (Figure 11.2). Data on manuscript submissions have been recorded in six-month periods, and are plotted alongside the impact factor that the journal held at the point the manuscript was submitted.

Figure 11.2 Manuscript submissions in six-month periods versus impact factor for the journal Aging Cell

The impact factor of Aging Cell increased from 2.118 to 5.960 between JCR 2005 and 2006. Comparing the level of submissions in the six-month period before the rise to the submissions once the new, higher impact factor was published, the journal received approximately 150 per cent more submissions than in the previous period.

A number of strategies are available to an editor who wishes to improve the likelihood of their journal gaining a good impact factor. Some approach these with more enthusiasm than others.

Review articles

Review articles typically attract more citations on average than primary research articles (see, for example, Moed et al., 1996; Peters and Van Raan, 1994). The effect of this can be seen in ISI impact factor listings, with many subject categories topped by review journals.

There is, however, some evidence of journal-specific effects here. Average citation rates of review and research articles in New Phytologist, for example, show marked differences, while the review and regular research articles in Journal of Urology and Alimentary Pharmacology and Therapeutics exhibit less pronounced differences (see Figures 11.3a–c).

Figure 11.3a Average number of citations (to end April 2013) to regular and review articles published in New Phytologist, 2009–11
Figure 11.3b Average number of citations (to end April 2013) to regular and review articles published in Journal of Urology, 2009–11
Figure 11.3c Average number of citations (to end April 2013) to regular and review articles published in Alimentary Pharmacology & Therapeutics, 2009–11

One explanation for these differences is the nature of the review articles themselves. Review is a catch-all term for numerous different document types, ranging from a full comprehensive review, to a mini-review, to a perspective, or to a tutorial. This heterogeneity is likely to lead to local differences in expected citation counts. For example, since 1985 New Phytologist has published its Tansley Reviews: reviews written by specialists but aimed at a readership beyond that which could be expected of a specialist review journal. The prestige of these articles may elevate them further above research articles than might otherwise be expected.

Because of the powerful effect they can have, many journal editors are keen to publish review articles. This is not simply to increase the impact factor, however. The data in Figure 11.4 suggest that as well as having a positive effect on the impact factor, review articles are typically downloaded significantly more often than primary research articles and can broaden readership.

Figure 11.4 Average article downloads (to end April 2008) for journal articles published in 2007 in three areas of science

It is interesting to note the gross differences in download ratios for the three journals in the different subject areas. Broadly speaking, the ratio for the anatomical science journal more closely mirrors the citation ratios of the medical journals (see Figure 11.3b and 11.3c), while the ecology and microbiology journals reflect the citation ratios of the plant science journal (see Figure 11.3a).

Writing a high-quality, comprehensive review article takes a significant amount of effort on the part of an author; most are more driven to publish primary papers instead because it is these that earn them tenure or continue to advance their reputations. Journal editorial teams also invest a significant amount of energy into devising strategies for acquiring review articles.

The question of whether the proliferation of review articles is desirable and whether this effort is warranted has been investigated in the pathology literature by Ketcham and Crawford (2007). They identified a sixfold increase in review articles between 1991 and 2006, compared with a twofold increase in primary research articles. Similarly, examining papers using hepatitis as a title or key word, the authors identified a 13-fold increase in review articles over a 20-year period, compared with a sixfold increase in primary research articles. In both cases, the growth of the review literature was largely outside review journals. Most importantly, from the perspective of a journal editor who may be seeking to improve an impact factor, the authors demonstrate that only a small proportion of review articles achieve a high number of citations. Therefore, the energy put into acquiring review articles by journal editorial teams may, in many cases, be misdirected. The degree to which review growth outstrips article growth may, in any case, be slowing; a review of the whole of Web of Science in the decade from 2003 to 2012 shows that article counts grew by 51.1 per cent while review counts grew by 86.5 per cent.

Increasing the focus on areas more likely to attract citations

Within all journals some subject areas will be cited more frequently than others, or will fit the two-year impact factor window better. It is not inconceivable that editors might devote more space in a journal to areas more likely to attract citations, although many editors oppose this practice as it would result in their journals not representing full and balanced coverage of their disciplines. Philip Campbell of Nature, quoted in an article in the Chronicle of Higher Education (Monastersky, 2005), rejected the suggestion that Nature focuses on areas likely to attract more citations by stating that if that were the case the journal would not publish papers in areas such as geology or palaeontology and would focus more on subjects such as molecular biology.

Picking up papers from higher-impact journals

In many cases, an author will submit to a high-impact but broad-based journal in preference to one of lower impact which is perhaps more suited to the subject matter of the manuscript in question. An unofficial hierarchy, from broad to niche and from high to low quality, emerges. If a generalist journal rejects a paper as being of insufficiently broad interest but methodologically sound, some high-level, subject-specific journals now informally encourage the authors to submit the referees’ comments along with their manuscript – this frequently negates the need for further peer review, as a decision can be made on the suitability of the paper using the reviews from the first journal. This has the dual effect of increasing the speed of review and publication while lessening the peer-review load on the community.

A more organized attempt at transferring reviewed papers between journals has been established in neuroscience. The Neuroscience Peer Review Consortium (http://nprc.incf.org/) has been operating since 2008, with the intention of reducing the load on reviewers. At the time of writing, 40 journals are participating in the initiative. On receiving a rejection, authors are given the option of having the reviews of their article automatically forwarded to another journal of their choice in the consortium.

Reducing the denominator

ISI does not include all document types in the denominator of its impact factor calculation (see the equation on p. 268 above), whereas all citations to any document type are counted in the numerator. This can lead to situations where some citations are not offset by the presence of a publication in the denominator, effectively making them ‘free’ citations. The items that ISI does include are termed ‘citable’ items, which typically comprise source items such as research articles, review articles, case reports and articles published in any supplements. Items frequently excluded from the denominator include letters (unless they function as articles, for example in Nature or Ecology Letters), commentaries, meeting abstracts, book reviews and editorials. The document type designation of a journal’s papers can be readily determined once they have been indexed in Web of Science, and some guidance has been published on this topic (McVeigh and Mann, 2009); however, it is not always clear how that designation has been reached, and this can be a source of some frustration for editors and publishers (The PLOS Medicine Editors, 2006). Editors and publishers are able to contact ISI to request that certain article types be treated as non-citable items, but this channel is informal, and it is entirely at ISI’s discretion whether such requests are granted.

Self-citation

Self-citation, and specifically a journal editor suggesting to authors during the peer-review process that they may wish to consider citing more work from that journal (either suggesting particular papers or suggesting that the authors identify some themselves), is regarded as the least acceptable way for a journal to improve its impact factor.

Editorials focusing on the journal’s own recent content can also be used to increase the number of citations. In 2004 this approach was observed in an editorial citing 100 of the journal’s own articles from previous years (Potter et al., 2004); furthermore, the editorial was published in not one but five different journals. In the event, only one of these editorials was indexed by ISI, and the self-citations represented only a small proportion of the citations received, so did not unduly disturb the journals’ impact factors. While obvious effects such as this can be readily identified (Reedijk and Moed, 2008), more subtle effects are harder to quantify, although with the widespread availability of citation data it is unlikely that these practices could go undetected indefinitely.

The JCR now identifies the proportion of self-cites for individual journals, and it is possible to re-calculate an impact factor once the effect of self-citing has been removed. In JCR 2011, published in summer 2012, the impact factors of 50 journals were suppressed and not reported, due to excessive self-citation. A notice accompanying JCR 2011 reported:
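The recalculation itself is straightforward once the self-citation count is known, as the hypothetical figures in the sketch below illustrate.

```python
# Sketch: recalculating an impact factor with journal self-citations removed,
# as the JCR's self-citation data make possible. All figures are hypothetical.

citations_to_2010_11 = 735        # citations in 2012 to items published in 2010-11
self_citations_to_2010_11 = 160   # of which, citations from the journal to itself
citable_items_2010_11 = 375

impact_factor = citations_to_2010_11 / citable_items_2010_11
impact_factor_no_self = (citations_to_2010_11 - self_citations_to_2010_11) / citable_items_2010_11

print(f"Impact factor:                    {impact_factor:.3f}")          # 1.960
print(f"Impact factor without self-cites: {impact_factor_no_self:.3f}")  # 1.533
```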

Suppressed titles were found to have anomalous citation patterns resulting in a significant distortion of the Journal Impact Factor, so that the rank does not accurately reflect the journal’s citation performance in the literature. The Journal Impact Factor provides an important and objective measure of a journal’s contribution to scholarly communication, and its distortion by an excessive concentration of citations is a serious matter.

(Journal Citation Report Notices, 2011)

While the language of this note has been altered from previous years, which made explicit reference to self-citation (Journal Citation Report Notices, 2008), it is worth noting that the number of suppressed titles has increased from eight in 2008 to 50 in 2011; instances of manipulation, the success rate of detection, or both, are increasing.

However, it is important to acknowledge that there is often a good reason for a journal having a high self-citation rate. Journals which are very specialized or which are the dominant publication for a particular subset of a subject are more likely to show high rates of self-citation than those where research is spread among a number of journals with similar aims and scope.

A new, and subtler, form of self-citation appears to have emerged in recent years, whereby members of a cartel of journals cite each other in much the same way as the editorial described above. As it is not the journal benefiting that is increasing citation to itself, but rather one of a multitude of other titles that cite it, such gaming can be difficult to detect. One example of this was identified recently (Davis, 2012) and led to the journals involved being stripped of their 2011 impact factors.

Alternative sources

It is worth noting that the Journal Citation Reports is not the only source of citation impact metrics for journals. The Essential Science Indicators, also produced by Thomson Reuters, list a selection of journals by total and average citations, focusing only on articles and reviews. Alternative metrics based on Scopus data (including the SJR [SCImago Journal Rank] and SNIP [Source Normalized Impact per Paper] – see below) are available on the SJR and JournalMetrics websites; and while the approach is still in its early stages, initiatives such as Altmetrics hope to construct and host metrics based on social media and other data elements.

Alternative metrics

In the years since the impact factor was created, numerous alternative journal ranking metrics have been proposed. Many of these have been minor modifications of the impact factor itself, with the aim of addressing some of the most commonly voiced concerns while keeping the fundamental simplicity of the measure. There have also been some complete departures, including a recent focus on eigenvector-based measurements. Whether any of these measurements provides a more accurate ranking picture of the journal hierarchy is debatable (Ewing, 2006):

For decades, scholars have complained about the misuses of the impact factor, and there is an extensive literature of such complaints and admonitions. But in a world gone mad with an obsession to evaluate everything ‘objectively’, it is not surprising that desperate and sometimes incompetent evaluators use a poorly understood, but easily calculated, number to comfort them.

What is clear is that the tools with which to build alternative ranking systems are readily available. The raw data that are the foundation for any new ranking system are becoming increasingly available from a variety of different sources, and the field of scientometrics has moved into the mainstream scientific consciousness. This has been accelerated by the influence of citation-based performance measurements, such as those being piloted for the UK Research Excellence Framework, which will affect increasingly large numbers of individuals.

h-index and descendants

Before describing some of the new methods for ranking journals, it is worth mentioning in passing the emerging methods for ranking individuals based not on the aggregate performance of the journals in which they have published but on the actual citation performance of their individual articles. Beginning with Hirsch’s h-index (Hirsch, 2005) – and quickly followed by a number of related indexes including the H1 index (Kosmulski, 2006), the g-index (Egghe, 2006) and the R- and AR-indexes (Jin et al., 2007) – there has been a growing realization that the journal is not the most appropriate unit of measurement.

Hirsch’s original h-index as applied to the individual is calculated as the natural number ‘h’ such that the individual has published ‘h’ articles which have each been cited ‘h’ or more times. An h-index of 3, therefore, means three papers have been cited at least three times each, while an h-index of 10 means ten papers have been cited at least ten times each, and so on.
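The calculation is easily expressed in code; the sketch below uses a hypothetical list of per-article citation counts.

```python
# Sketch: Hirsch's h-index computed from a list of per-article citation counts.
# The citation counts are hypothetical.

def h_index(citation_counts):
    """Largest h such that h articles have each been cited at least h times."""
    counts = sorted(citation_counts, reverse=True)
    h = 0
    for i, c in enumerate(counts, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

print(h_index([25, 8, 5, 4, 3, 2, 0]))  # four articles have at least four citations each -> h = 4
```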

Criticisms have been levelled against the h-index since its inception (Costas and Bordons, 2007; Schubert and Glänzel, 2007) and the subsequent indexes attempt to address these criticisms while retaining the simplicity of the original index. Some of these prove better suited to the task than others (Bornmann et al., 2008).

Impact factor modifications

Numerous alternative impact factor measurements have been proposed over the years. Examples of these include: the ‘per document type impact factor’ (Moed and Van Leeuwen, 1995), where the differences in inherent citability of different document types published within the same journal – for example, original research articles, review articles, letters – are mitigated; the ‘rank-normalized impact factor’ (Pudovkin and Garfield, 2004), where a percentile ranking based on the impact factor of all the journals in a particular subject category is calculated; and the ‘cited half-life impact factor’ (Sombatsompop et al., 2004), where the age of the cited references is factored into the calculation.

The fact that none of these has gained widespread acceptance may be interpreted in a number of ways. It may be simply a consequence of a ‘better the devil you know’ attitude in the community. While the impact factor is imperfect, who is to say that any new measurement will be more equitable? Alternatively, it may be a deeper-seated dissatisfaction with the whole process of ranking by citations. Another explanation is related to the fact that all these modifications rely upon the underlying citation data as provided by ISI – which may have little incentive to change a formula which works perfectly well within the defined limitations that it has set out.

Alternative journal indicators

In traditional journal ranking measurements such as the impact factor, a citation from a high-impact factor journal is treated in exactly the same way as a citation from a low-impact factor journal, i.e. no account is taken of the citing source, only that a citation linkage exists. In an eigenvector-based journal measurement, such as the Eigenfactor (http://eigenfactor.org) or the SJR indicator (http://www.scimagojr.com), the computation takes into account a quality ‘characteristic’ of the citing journal.

It should be noted that eigenvector-style analysis applied to the ranking of scholarly journals is not a new phenomenon; indeed, the process was described and applied to a selection of physics journals in the mid-1970s (Pinski and Narin, 1976). The re-emergence of this type of measurement has been driven by the success of the Google PageRank algorithm, which is itself based on eigenvector analysis. Google defines PageRank as follows:

PageRank reflects our view of the importance of web pages by considering more than 500 million variables and 2 billion terms. Pages that we believe are important pages receive a higher PageRank and are more likely to appear at the top of the search results…

PageRank also considers the importance of each page that casts a vote, as votes from some pages are considered to have greater value, thus giving the linked page greater value. We have always taken a pragmatic approach to help improve search quality and create useful products, and our technology uses the collective intelligence of the web to determine a page’s importance.

(Technology Overview, n.d.)
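The principle common to PageRank, the Eigenfactor and the SJR indicator can be illustrated with a simple power-iteration calculation over a journal-to-journal citation matrix. The sketch below is an illustration of that principle only: the published algorithms add refinements such as damping factors, the exclusion of journal self-citations and normalization by article counts, and the journals and citation counts shown here are hypothetical.

```python
# Minimal sketch of an eigenvector-style journal ranking via power iteration.
# This illustrates the general principle behind Eigenfactor/SJR/PageRank only;
# the published algorithms differ in important details. All data are hypothetical.

journals = ["J1", "J2", "J3"]
# cites[i][j] = citations from journal i to journal j
cites = [
    [0, 30, 10],
    [20, 0, 40],
    [5, 15, 0],
]

def eigen_rank(cites, iterations=100):
    n = len(cites)
    out_totals = [sum(row) for row in cites]   # each citing journal distributes one unit of influence
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new = [0.0] * n
        for i in range(n):          # citing journal
            for j in range(n):      # cited journal
                if out_totals[i]:
                    new[j] += scores[i] * cites[i][j] / out_totals[i]
        total = sum(new)
        scores = [s / total for s in new]      # keep scores summing to 1
    return scores

for name, score in sorted(zip(journals, eigen_rank(cites)), key=lambda x: -x[1]):
    print(f"{name}: {score:.3f}")
```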

The most significant of the eigenvector-style measurements is the SJR Indicator, which was released in November 2007. Its significance is not related to the mathematics of the calculation but to the underlying source of the citation data, which in this case was not ISI data but Scopus data.

Also based on Scopus data but taking a different approach is the SNIP (www.journalmetrics.com). It is the first metric calculated for the whole journal list that seeks to correct for the differences in average citation impact among different subject areas, allowing the comparison of journals across subjects. An average citations-per-paper figure is calculated for a journal and then divided by the ‘relative database citation potential’ (RDCP), which measures how likely the journal is to be cited, given how many citations are made overall by the articles in the journals that cite it.
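The sketch below gives a heavily simplified illustration of this normalization idea only; the published SNIP definition differs in several details (for example, in how the citation window and the set of counted references are defined), and all figures shown are hypothetical.

```python
# Heavily simplified sketch of the SNIP idea: raw citations per paper divided by a
# 'citation potential' for the journal's field, here taken as the mean number of
# references in the papers that cite the journal. The actual SNIP definition differs
# in several details; all numbers are hypothetical.

raw_citations_per_paper = 2.4                        # average citations per paper for the journal
references_in_citing_papers = [45, 30, 60, 25, 40]   # reference-list lengths of citing papers

citation_potential = sum(references_in_citing_papers) / len(references_in_citing_papers)
relative_potential = citation_potential / 40.0       # scaled against a database-wide value (assumed 40)

snip_like_score = raw_citations_per_paper / relative_potential
print(f"Normalized score: {snip_like_score:.2f}")
```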

Comparing different ranking systems

The differences between the journal rankings as produced by three different ranking systems – JCR, Eigenfactor and SJR – can be examined, and relative performance in each scheme determined. The analysis in Tables 11.3 to 11.7 examines the rankings of journals in dentistry and associated subject areas.2 Note that it is currently not very simple to perform a like-for-like comparison, nor even to establish what the most appropriate comparison would be.

Table 11.3

Constituent dentistry journals in three ranking systems, 2011

JCR Eigenfactor SJR SNIP
Number of journals 81 64 127 125


Table 11.4

Top ten journals ranked by impact factor

Top ten journals ranked by impact factor JCR IF Eigenfactor SJR SNIP
Periodontology 2000 1 25 5 14
Clinical Implant Dentistry and Related Research 2 38 23 31
Journal of Dental Research 3 1 8 12
Dental Materials 4 5 3 3
Journal of Clinical Periodontology 5 9 2 4
Journal of Dentistry 6 12 9 10
Journal of Endodontics 7 6 4 19
Oral Oncology 8 8 10 15
Oral Microbiology and Immunology 9 29
Molecular Oral Microbiology 10 17 39


Table 11.5

Top ten journals ranked by Eigenfactor

Top ten journals ranked by Eigenfactor Eigenfactor JCR IF SJR SNIP
Journal of Dental Research 1 3 8 12
Oral Surgery, Oral Medicine, Oral Pathology, Oral Radiology, and Endodontics 2 33 30 32
Journal of Oral and Maxillofacial Surgery 3 27 16 16
Journal of Periodontology 4 11 12 58
Dental Materials 5 4 3 3
Journal of Endodontics 6 7 4 19
Clinical Oral Implants Research 7 13 6 11
Oral Oncology 8 8 10 15
Journal of Clinical Periodontology 9 5 2 4
American Journal of Orthodontics and Dentofacial Orthopedics 10 35 7 13


Table 11.6

Top ten journals ranked by SJR

Top ten journals ranked by SJR SJR JCR IF Eigenfactor SNIP
International Endodontic Journal 1 17 14 24
Journal of Clinical Periodontology 2 5 9 4
Dental Materials 3 4 5 3
Journal of Endodontics 4 7 6 19
Periodontology 2000 5 1 25 14
Clinical Oral Implants Research 6 13 7 11
American Journal of Orthodontics and Dentofacial Orthopedics 7 35 10 13
Journal of Dental Research 8 3 1 12
Journal of Dentistry 9 6 12 10
Oral Oncology 10 8 8 15


Table 11.7

Top ten journals ranked by SNIP

Top ten journals ranked by SNIP SNIP JCR IF Eigenfactor SJR
Monographs in Oral Science 1 43
International Journal of Oral and Maxillofacial Implants 2 21 15 31
Dental Materials 3 4 5 3
Journal of Clinical Periodontology 4 5 9 2
Journal of Cranio-Maxillo-Facial Surgery 5 26 36 13
Community Dentistry and Oral Epidemiology 6 19 26 15
Caries Research 7 16 32 11
Journal of Adhesive Dentistry 8 48 41 38
International Journal of Prosthodontics 9 36 28 40
Journal of Dentistry 10 6 12 9


Of the 145 unique titles covered in these systems, 59 are common to all three systems. Tables 11.4 to 11.7 describe the top ten ranked titles in terms of their impact factor, Eigenfactor, SJR and SNIP respectively. A superficial comparison of this small sample of data suggests that there are differences between the three systems. Many journals highly ranked by impact factor have relatively low rankings by Eigenfactor and vice versa. This is perhaps unsurprising: the impact factor takes account of the number of articles in a journal, while the Eigenfactor is closer to a measure of total citation in that it does not.

Similarly, when looking at the SJR and SNIP rankings (Tables 11.6 and 11.7) there are few journals which do well in all four systems. Only the journals Dental Materials and Journal of Clinical Periodontology are ranked in the top ten by all four metrics. Even the Journal of Dental Research, which is generally acknowledged to be a leading journal in the field, does not crack the top ten for the SNIP and only just does so for the SJR. Whatever it is that each metric is measuring, it appears not to be the same thing, and even using a number of metrics in parallel will struggle to separate all but the top tier of journals.
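One way to quantify the level of agreement between two metrics is a rank correlation across the journals they both cover. The sketch below computes Spearman’s rho for two hypothetical sets of ranks (the tables above show only each metric’s top ten, so real values are not reproduced here); a value near 1 would indicate close agreement, a value near 0 little relationship.

```python
# Sketch: quantifying agreement between two ranking systems with Spearman's rank
# correlation over the journals they both cover. The rank pairs below are hypothetical.

ranks_metric_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranks_metric_b = [3, 1, 5, 2, 9, 4, 10, 6, 8, 7]

def spearman_rho(x, y):
    """Spearman's rho for two lists of ranks with no ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

print(f"Spearman rho: {spearman_rho(ranks_metric_a, ranks_metric_b):.2f}")  # approximately 0.66
```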

Download statistics

Although librarians have been monitoring use of print copies using a number of elaborate processes for some considerable time (Butkovich, 1996), comprehensive statistics based on actual user activities have not been available until recently. With the transition from print to online delivery of journal content, a new method for evaluating journals was born based on the actual activities undertaken by the user while at the journal website; namely, measuring the number of full-text articles downloaded. This is usually referred to by publishers as usage.

However, a terminology question arises when describing usage statistics. What does a full-text download actually represent? One cannot be certain that the user read the item they downloaded, or even that they truly intended to download the item in the first place. Further, while specific filters have been developed to eliminate double-counting of links clicked twice within ten seconds, users may choose to re-download an item every time they wish to view it, or they may download the PDF and store it locally for subsequent use.

It is clear that download figures need to be approached cautiously. While the activities that can lead to under- or over-counting of actual use can be assumed to take place in comparable ratios between different journals, there is no simple way of retrospectively examining any anomalous data, other than sifting through the server logs while questioning the user as to their motivation for each and every activity. Citation-based measurements are by no means without their flaws, but they do at least provide a permanence of record, and an ability to adjust data to account for factors such as self-citation. Usage data do not currently afford that ability.

As online journals proliferated from the late 1990s, and data on page impressions (views), number of sessions and, crucially, the number of articles downloaded could be easily collected, it became clear that a common standard was required in reporting usage information, in order to make valid comparisons between journals supplied by different publishers. In 2002, COUNTER (Counting Online Usage of Networked Electronic Resources) was launched. The aim of COUNTER was to provide a code of practice for the reporting of online usage data, in order to facilitate inter-journal and inter-publisher comparisons. The current code of practice is Release 4, which was published in April 2012. Any COUNTER 3 compliant publisher wishing to remain classed as a COUNTER-compliant vendor must have implemented all the standardized reporting criteria as described in this code by 31 December 2013.

While it is technically possible to measure every mouse-click on a website – a technique known as deep-log analysis (Nicholas et al., 2006) – a discussion of such analysis is outside the scope of this chapter. Suffice to say, the headline usage figure that is most commonly reported is the number of full-text downloads per journal per time period, which is referred to as Journal Report 1.

COUNTER Journal Report 1 and cost per access

From Journal Report 1, library administrators can compare the level of full-text downloads of their journal collection over monthly reporting periods. A common measure that is then derived is the cost per access (cost per use), which enables a comparison of the cost-effectiveness of different parts of a collection. Although superficially a simple process (to divide the cost of the journal by the number of full-text downloads in a specified period), the range of mechanisms in which journals are sold to institutions has a large bearing on the relevance and validity of this figure.

In the print era, journals were typically sold as single entities. Subscription agents made the process of subscribing to different journals from the same publisher a much simpler proposition, but, in essence, the cover price of a journal was what was paid, and a publisher’s revenues could be estimated by multiplying the cover price (minus an agent’s commission) by the number of subscriptions. In today’s online era, journals are sold as a mixture of single sales and larger bundles, and to a combination of individuals, institutes and vast library consortia occasionally spanning an entire nation, where the terms and conditions of the deal are collectively brokered. A consequence of this is that access to the journal literature has never been greater, with many institutions subscribing to a publisher’s entire collection of titles.

A by-product of this bundling process is that the actual cost to the library of an individual title is typically significantly less than the cover price. Precisely how much less will vary depending on the particular subscription model the publisher operates. Now factor in the changes in the operational cost structure of the library as a result of print to online migration (Schonfeld et al., 2004) and a simple cost per access calculation suddenly becomes a far more intricate undertaking (Price, 2007).

Usage factor

As with counting citations, the number of downloads a journal receives is determined in part by the number of articles it has online and accessible. All other things being equal, a larger journal will experience more downloads than a smaller journal. This size effect can be mitigated by calculating an average number of downloads per article, in the same way as the impact factor is an average of citations per article. This so-called usage factor, proposed by COUNTER and the UK Serials Group (UKSG) (Journal Usage Factor, n.d.), has the potential to enable meaningful comparisons between journals based on their usage, although a number of problems will need to be overcome before such a measure can be universally accepted.
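The arithmetic itself is simple, as the sketch below illustrates; note that this is a minimal illustration only, and that the choice of publication window and usage period is an assumption made for the example rather than the definition ultimately adopted by the COUNTER/UKSG project.

    # Hypothetical figures: total COUNTER-reported full-text downloads over a
    # usage period, divided by the number of articles published in a specified
    # window -- an average analogous to citations per article.
    def usage_factor(total_downloads, articles_published):
        return total_downloads / articles_published

    # The larger journal receives more raw downloads, but the per-article
    # average allows a size-independent comparison.
    print(usage_factor(120_000, 600))  # 200.0 downloads per article
    print(usage_factor(30_000, 120))   # 250.0 downloads per article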

The main challenge is in creating a measurement which is resilient to deliberate and systematic abuse. This is of particular relevance when considering a usage-based pricing model, or where reward of individuals is based (even partly) upon the usage of their journal articles by others. The project’s Stage 2 analysis (Journal Usage Factor, 2011) concluded that the chances of such manipulation being conducted successfully were small, but there exist a small number of scenarios in which the metric could be successfully gamed.

COUNTER’s planned response to this seems to be twofold: to ensure that definitions of downloads and the exclusion of suspicious activity are sufficiently precise to minimize the influence of deliberate manipulation; and to apply statistical audits to publisher usage data, to check for signs that gaming has taken place. Whether these efforts will be successful remains to be seen.

Peer-review panel judgements

As discussed in the section ‘Subject-specific citation differences’ (on p. 263), certain subject areas do not lend themselves to citation analysis. It can be no coincidence that while ISI produces three citation indexes – for science, social science, and arts and humanities – it produces Journal Citation Reports (the product which contains metrics such as the impact factor) only for science and social science. Evidently, an impact factor calculated for journals appearing only in the arts and humanities citation index would be of too little validity to be meaningful.

In the absence of a simple quantitative metric for the large number of journals without impact factors, particularly, but not exclusively, in the arts and humanities, the most common form of ranking is that which originates from peer opinion.

European Reference Index for the Humanities

An example of peer opinion being used to rank journals is the European Reference Index for the Humanities (ERIH), a project run by the European Science Foundation (ESF) (http://www.esf.org). The ERIH project provides categorized lists of journals in 14 areas of the humanities. Expert panels split the journals into three categories, A, B and C, based on peer review. However, ERIH reports that this was misinterpreted in the community as being qualitatively hierarchical (ERIH Foreword, n.d.) and that the differences between the categories were of kind, not quality. It should be noted, however, that in a document no longer available (ERIH Summary Guidelines, n.d.) both qualitative and categorical criteria appeared to be applied to A, B and C classifications.

The decision to present the categories as non-hierarchical may have been influenced by debates about the boundaries between the different categories, and about the overall wisdom of applying any such ranking to the humanities. In a joint editorial entitled ‘Journals under threat: a joint response from history of science, technology and medicine editors’, archived online in numerous discussion lists and forums, editors from over 40 journals raised concerns about the process:

This Journal has concluded that we want no part of this dangerous and misguided exercise. This joint Editorial is being published in journals across the fields of history of science and science studies as an expression of our collective dissent and our refusal to allow our field to be managed and appraised in this fashion. We have asked the compilers of the ERIH to remove our journals’ titles from their lists.

(Journals under Threat, 2008)

Instead of an A, B and C classification, journals have now been divided into INT(ernational)1, INT(ernational)2, NAT(ional) and W categories. The difference between INT1 and INT2 appears to be partially qualitative. At the time of writing, this reclassification has been completed for 12 of the 14 subjects, with the results hosted by the ESF.2 The stated motivation for creating the ERIH was to address the problem of low visibility for European humanities research, in that:

It was agreed that this was largely caused by the inadequacy of existing bibliographic/bibliometric indices, which were all USA-based with a stress on the experimental and exact sciences and their methodologies and with a marked bias towards English-language publication. A new Reference Index was needed which would represent the full range of high-quality research published in Europe in the humanities and thus also serve as a tool of access to this research.

(ERIH Foreword, n.d.)

It remains to be seen what the long-term future of the ERIH will be.

Other efforts to rank journals using qualitative methods have encountered complications. An attempt by the 2010 Excellence in Research for Australia (ERA) exercise to rate all journals from A*–C using expert review caused immense controversy (Creagh, 2011). The rankings were retired for the most recent ERA in 2012.

Combination peer review and quantitative evaluation

In an effort to provide a balance between peer review and purely quantitative evaluation, a ranking combining elements of both systems can be created. Such evaluations are gaining popularity in the assessment not only of journals, but also of research groups, departments, institutes and universities. However, the success of such a mixed model will depend on the distribution of the weighting factors. With a multivariate approach, it is possible to come up with any number of different overall rankings simply by varying these factors.
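The sensitivity to weighting can be seen in a small sketch (the journal names and scores below are invented and normalized to a 0–1 scale for illustration): the same two inputs yield different orderings as soon as the balance between peer judgement and citation data shifts.

    # Hypothetical normalized scores: a peer-review score and a citation-based
    # score for each journal, combined as a weighted sum.
    journals = {
        "Journal A": {"peer": 0.9, "cites": 0.4},
        "Journal B": {"peer": 0.5, "cites": 0.8},
        "Journal C": {"peer": 0.7, "cites": 0.6},
    }

    def ranking(weight_peer):
        weight_cites = 1.0 - weight_peer
        scores = {name: weight_peer * s["peer"] + weight_cites * s["cites"]
                  for name, s in journals.items()}
        return sorted(scores, key=scores.get, reverse=True)

    print(ranking(0.8))  # peer review dominates: ['Journal A', 'Journal C', 'Journal B']
    print(ranking(0.2))  # citations dominate:    ['Journal B', 'Journal C', 'Journal A']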

The UK Research Excellence Framework

Following the retirement of the UK Research Assessment Exercise (RAE) in 2008, the Higher Education Funding Council for England (HEFCE) began planning its successor, the Research Excellence Framework (REF). A November 2007 HEFCE consultation document (Research Excellence Framework, 2007) noted that quantitative indicators, and particularly bibliometrics, would be a key element in judging the quality of research. One of the major criticisms of the RAE – that it compelled authors and institutions to chase high-impact-factor journals – was explicitly dealt with in the consultation paper, in which HEFCE stressed that the bibliometrics applied would not involve the use or recording of journal impact factors.

The current plans for the incorporation of bibliometrics into the 2014 REF are based on a pilot study conducted in 2008–9 (Bibliometrics Pilot Exercise, n.d.). A fuller explanation of the metrics to be applied (Bibliometrics and the Research Excellence Framework, n.d.) indicates that a normalized citation measure will be used; this is a fairly robust metric in which the actual number of citations received is divided by the average received by other publications in the same subject in the same year, to give a measure relative to a world average of 1.00. If the result is 1.25, the article is cited 25 per cent more than the average; if 0.75, it is cited 25 per cent less. Each researcher will be invited to submit up to four publications for consideration; the REF will therefore not look at the entire corpus of an institution’s research output (Assessment Framework and Guidance on Submissions, 2012). This measure of relative citation impact will then be used to construct a citation profile, charting the proportion of work far below, below, around, above and far above the world average. In this respect, it is very similar to the ERA exercise that preceded it in Australia (see below).
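The calculation behind such a profile can be sketched as follows; the citation counts, baselines and class boundaries below are invented for the purposes of illustration and should not be read as the REF’s published parameters.

    # Normalized citation impact: actual citations divided by the world-average
    # citations for publications in the same subject and year (the baseline).
    def relative_impact(citations, baseline):
        return citations / baseline

    # Hypothetical outputs: (citations received, subject/year baseline).
    outputs = [(10, 8.0), (3, 12.0), (25, 20.0), (0, 6.0)]
    impacts = [relative_impact(c, b) for c, b in outputs]
    print(impacts)  # [1.25, 0.25, 1.25, 0.0]; 1.25 means 25 per cent above world average

    # A citation profile charts the share of work in bands relative to the
    # world average of 1.00 (the band edges here are assumptions, not policy).
    profile = {"far below": 0, "below": 0, "around": 0, "above": 0, "far above": 0}
    for x in impacts:
        if x < 0.5:
            profile["far below"] += 1
        elif x < 0.8:
            profile["below"] += 1
        elif x < 1.2:
            profile["around"] += 1
        elif x < 2.0:
            profile["above"] += 1
        else:
            profile["far above"] += 1
    print(profile)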

Currently, it is not clear how granular the subject scheme will be, but it is likely to be based on the Scopus database, for which world average citations received in a subject area (often called ‘citation baselines’) can be calculated but are not publicly available. Scopus was selected as the data provider for the REF, as it was for the ERA, possibly on the basis that it covers more publications.

An interesting question that arises from both the RAE and the REF is the extent to which researchers modify their behaviour in response to the evaluation process itself. In an assessment of UK science spanning the period 1985–2003, Moed (2008) concluded that the observed behaviour of UK scientists varied depending on the assessment criteria in the prevailing RAE. For instance, in RAE 1992, when total publication counts were requested (rather than the current situation of submitting a subset of ‘best’ work), UK researchers dramatically increased their article output. Furthermore, in RAE 1996, when the emphasis shifted from quantity to quality of output, the proportion of papers from UK researchers in high-impact journals increased. When a system of evaluation is created, those being evaluated will generally and rapidly work out which practices allow them to exploit the evaluation criteria. Such ‘gaming of the system’ is an inevitable consequence, and is an important factor to consider when developing any evaluation framework.

Excellence in Research for Australia (ERA)

Another example of a ranking for journals can be found within the ERA initiative (Excellence in Research for Australia, 2008), announced in February 2008. ERA aims to assess the research quality of the Australian higher education sector biennially, based on peer-review assessment of a number of performance measures, including bibliometric indicators. The first ERA was conducted in 2010 and reported in 2011; the following ERA was conducted in 2012 and reported in early 2013.

While more traditional peer-review elements such as income generated and various esteem measures (such as specific awards and memberships of respected bodies) will be considered by the peer-review panels, so too will publication outputs and their citation impact, with each unit of evaluation within a university rated on a scale from 1 to 5 (ERA, 2012). The outputs will only be counted if they appear on the ERA journal list (which, as noted before, no longer carries the A*–C ranking). The method for calculating the citation impact of an institution’s publications is similar to that to be employed by the REF – actual citations received divided by a baseline for publications in the same year and subject – except that both a global and a national baseline are used.

Three types of analysis are employed: the distribution of papers based on world and Australian citation centile thresholds; average relative citation impact; and, as with the REF, the distribution of papers against relative citation impact classes. The centiles are calculated such that the most cited papers in a subject are in the 1st centile and those receiving no citations are in the 100th centile.
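The centile convention can be pictured with the minimal sketch below; the citation distribution is invented, and the handling of ties and of uncited papers is an assumption made for illustration rather than a restatement of the ARC’s published method.

    import math

    def centile(count, subject_counts):
        # Most cited papers fall in (or near) the 1st centile; papers receiving
        # no citations are assigned to the 100th centile.
        if count == 0:
            return 100
        at_least_as_cited = sum(1 for c in subject_counts if c >= count)
        return max(1, math.ceil(100 * at_least_as_cited / len(subject_counts)))

    # Invented, deliberately skewed citation counts for 1000 papers in one subject.
    subject_counts = [300] + [40] * 9 + [12] * 40 + [5] * 150 + [1] * 300 + [0] * 500

    for c in (300, 40, 12, 5, 1, 0):
        print(f"{c:>3} citations -> centile {centile(c, subject_counts)}")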

Only those publications listing an author based at the institution at the date of the census will be considered by the panel. This permits a degree of gaming, with institutions ‘poaching’ well-cited authors prior to the assessment date. Gaming is also possible in the subject scheme applied: this uses a four-digit Field of Research code, dividing up science, social science, arts and humanities into 164 subjects (although arts and humanities subjects will not have citation indicators applied), and a single paper can be allocated to more than one subject. It is up to the submitting institution to decide the ‘share’ of each subject for a given paper, meaning that shares of less cited papers could be allocated to a less crucial subject, and shares of more cited papers allocated to subjects of preference.
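One plausible way in which such share allocation could feed through to an average relative citation impact is sketched below; the Field of Research codes, scores and aggregation rule are illustrative assumptions rather than a reproduction of the ARC’s method, but they show why shifting the shares of poorly cited papers towards a less crucial code flatters the preferred one.

    from collections import defaultdict

    # Invented papers: a relative citation impact plus the institution's chosen
    # share of each four-digit Field of Research (FoR) code (shares sum to 1).
    papers = [
        {"impact": 2.4, "shares": {"0301": 1.0}},               # well cited
        {"impact": 0.3, "shares": {"0301": 0.2, "0399": 0.8}},  # poorly cited, mostly shifted away
    ]

    def share_weighted_impact(papers):
        totals, weights = defaultdict(float), defaultdict(float)
        for paper in papers:
            for code, share in paper["shares"].items():
                totals[code] += share * paper["impact"]
                weights[code] += share
        return {code: totals[code] / weights[code] for code in totals}

    print(share_weighted_impact(papers))  # roughly {'0301': 2.05, '0399': 0.3}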

Despite this, and despite ongoing debates about the influence of self-citation, ERA was the first national research assessment scheme to successfully and consistently apply robust bibliometric indicators. There is no indication that this will change in the immediate future.

Combining usage and citation statistics

It is often argued that local rankings are more appropriate than global rankings for a librarian managing his or her collection, and many advocate counting local citations (i.e., citations from and to articles authored by the institution’s own researchers), rather than all citations, as a more valid measure of how useful a particular journal is to those researchers.

Currently, the reporting of full-text downloads takes place at the level of the subscribing institution, and allows administrators to observe usage of their content at the local level. In terms of ranking journals, however, this will produce a localized ranking based on the specific characteristics of the subscribing institution in question. These characteristics – such as whether the institution has a large undergraduate programme or is purely research focused, or whether the institution has a broad or niche subject base – play an enormous role in determining the download figures, and hence the ranking.

In early 2006, ISI announced a product which would allow the combination of COUNTER-compliant usage data with institution-specific publication and citation data. The Journal Use Report (Newman, 2005) promised to ‘provide users with a 360° view of how journals are being used at their institution’. With emerging standards such as SUSHI (Standardized Usage Statistics Harvesting Initiative, n.d.) enabling the aggregation of usage data from different publishers, the activity of evaluating journals from a local standpoint becomes a far simpler proposition than previously. The Web of Knowledge Usage Reporting System (WURS) is now available (http://wokinfo.com/usage/), but this only reports usage from the subscribing institution, unlike the citations reported, which are counted from any citing source. While this allows an institution to establish the usage of its subscribed materials (which many do anyway), it is unable to give a global view of usage across all subscribers.

Conclusion

In the not-too-distant future, new metrics will emerge to complement or even replace existing ones. For some of us, this future is almost within our grasp (Harnad, 2007). What will ultimately determine which of this new battery of measurements succeed and which fail, either individually or as composite measures, is likely to be how strongly they resonate with the communities they serve. The best ideas do not always make the best products; instead, simplicity and transparency can be the difference between success and obscurity.

Acknowledgements

The authors would like to thank Siew Huay Chong and Robert Campbell, both at Wiley, for providing the analysis in the ‘Review articles’ section on p. 273, and for a critical reading of the draft.

References

Assessment Framework and Guidance on Submissions. In Research Excellence Framework, 2012. Available from: http://www.ref.ac.uk/media/ref/content/pub/assessmentframeworkandguidanceonsubmissions/GOS%20including%20addendum.pdf (accessed 25 April 2013).

Bensman, S. Garfield and the impact factor. Annual Review of Information Science and Technology. 2007; 41(1):93–155.

Bibliometrics Pilot Exercise (n.d.). In Higher Education Funding Council for England. Available from: http://www.ref.ac.uk/background/bibliometrics (accessed 25 April 2013).

Bibliometrics and the Research Excellence Framework (n.d.). In Higher Education Funding Council for England. Available from: http://www3.imperial.ac.uk/pls/portallive/docs/1/46819696.PDF (accessed 25 April 2013).

Bollen, J., Van de Sompel, H., Rodriguez, M. A. Towards usage-based impact metrics: first results from the MESUR project. Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries. 2008:231–240. doi: 10.1145/1378889.1378928. Available from: http://dx.doi.org/10.1145/1378889.1378928; arXiv:0804.3791v1.

Bornmann, L., Mutz, R., Daniel, H.-D. Are there better indices for evaluation purposes than the h index? A comparison of nine different variants of the h index using data from biomedicine. Journal of the American Society for Information Science and Technology. 2008; 59(5):830–837.

Butkovich, N. J. Use studies: a selective review. Library Resources & Technical Services. 1996; 40:359–368.

Cameron, B. D. Trends in the usage of ISI bibliometric data: uses, abuses, and implications. Libraries and the Academy. 2005; 5:105–125.

Campbell, P. Escape from the impact factor. Ethics in Science and Environmental Politics. 2008; 8:5–7.

Case, D., Higgins, G. How can we investigate citation behavior? A study of reasons for citing literature in communication. Journal of the American Society for Information Science. 2000; 51(7):635–645.

Costas, R., Bordons, M. The h-index: advantages, limitations and its relation with other bibliometric indicators at the micro level. Journal of Informetrics. 2007; 1(3):193–203.

Creagh, S. Journal rankings ditched: the experts respond. The Conversation, 2011 (June). Available from: http://theconversation.com/journal-rankings-ditched-the-experts-respond-1598 (accessed 19 April 2013).

Davis, P. The emergence of a citation cartel. The Scholarly Kitchen, 2012 (10 April). Available from: http://scholarlykitchen.sspnet.org/2012/04/10/emergence-of-a-citation-cartel/ (accessed 22 April 2013).

Egghe, L. An improvement of the h-index: the g-index. ISSI Newsletter. 2006; 2(1):8–9.

ERA, Evaluation Handbook, 2012. In Australian Research Council. Available from: http://www.arc.gov.au/pdf/era12/ERA%202012%20Evaluation%20Handbook_final%20for%20web_protected.pdf (accessed 25 April 2013).

ERIH (n.d.) Context and Background of ERIH. In European Science Foundation. Available from: http://www.esf.org/research-areas/humanities/research-infrastructures-including-erih/context-and-background-of-erih.html (accessed 29 October 2008).

ERIH Foreword (n.d.) In European Science Foundation. Available from: http://www.esf.org/hosting-experts/scientific-review-groups/humanities/erih-european-reference-index-for-the-humanities/erih-foreword.html (accessed 25 April 2013).

ERIH Summary Guidelines (n.d.) In European Science Foundation. Accessed 29 October 2008. No longer available.

Ewing, J. Measuring journals. Notices of the American Mathematical Society. 2006; 53:1049–1053.

Excellence in Research for Australia, 2008, 2 October. In Australian Research Council. Available from: http://www.arc.gov.au/era (accessed 29 October 2008).

Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., Pappas, G. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. The FASEB Journal. 2008; 22(2):338–342.

Garfield, E. Citation indexes to science: a new dimension in documentation through association of ideas. Science. 1955; 122(3159):108–111.

Harnad, S. Open access scientometrics and the UK Research Assessment Exercise. In: Torres-Salinas, D., Moed, H. (eds) Proceedings of ISSI 2007: 11th International Conference of the International Society for Scientometrics and Informetrics, Vol. I, Madrid, Spain, 25–7 June. 2007:27–33.

Hirsch, J. E. An index to quantify an individual’s scientific research output. Proceedings of the National Academy of Sciences USA. 2005; 102(46):16569–16572.

Jin, B. H., Liang, L., Rousseau, R., Egghe, L. The R- and AR-indices: complementing the h-index. Chinese Science Bulletin. 2007; 52(6):855–863.

Journal Citation Report Notices, 2008. Available from: http://admin-apps.isiknowledge.com/JCR/static_html/notices/notices.htm (accessed 29 October 2008).

Journal Citation Report Notices, 2011. Available from: http://admin-apps.webofknowledge.com/JCR/static_html/notices/notices.htm (accessed 25 April 2013).

Journals under Threat: A Joint Response from History of Science, Technology and Medicine Editors, 2008, 19 October. Available from: http://listserv.liv.ac.uk/cgi-bin/wa?A2=ind0810&L=classicists&T=0&P=5072 (accessed 29 October 2008).

Journal Usage Factor (n.d.) Results, recommendations and next steps. Available from: http://www.projectcounter.org/documents/Journal_Usage_Factor_extended_report_July.pdf (accessed 25 April 2013).

Journal Usage Factor, Exploratory data analysis, 2011. Available from: http://www.projectcounter.org/documents/CIBER_final_report_July.pdf (accessed 25 April 2013).

Kessler, M. M. Bibliographic coupling between scientific papers. American Documentation. 1963; 14:10–25.

Ketcham, C. M., Crawford, J. M. The impact of review articles. Laboratory Investigation. 2007; 87(12):1174–1185.

Kosmulski, M. A new Hirsch-type index saves time and works equally well as the original h-index. ISSI Newsletter. 2006; 2(3):4–6.

Lehmann, S. J. Measures for measures. Nature. 2006; 444:1003–1004.

Leydesdorff, L., Felt, U. Edited volumes, monographs and book chapters in the Book Citation Index (BKCI) and Science Citation Index (SCI, SoSCI, A&HCI). Journal of Scientometric Research. 2012; 1(1):28–34.

Marshakova, I. System of documentation connections based on references (SCI). Nauchno-Tekhnicheskaya Informatsiya Seriya. 1973; 2(6):3–8.

McVeigh, M.E. (n.d.) Journal self-citation in the Journal Citation Reports – Science Edition (2002). In Thomson Reuters. Available from: http://wokinfo.com/essays/journal-self-citation-jcr/ (accessed 29 October 2008).

McVeigh, M. E., Mann, S. J. The Journal Impact Factor denominator: defining citable (counted) items. Journal of the American Medical Association. 2009; 302(10):1107–1109.

Moed, H. F. Citation Analysis in Research Evaluation. Dordrecht: Springer, 2005; 126.

Moed, H. F. UK Research Assessment Exercises: informed judgements on research quality or quantity? Scientometrics. 2008; 74(1):153–161.

Moed, H. F., Van Leeuwen, Th.N. Improving the accuracy of Institute for Scientific Information’s journal impact factors. Journal of the American Society for Information Science. 1995; 46(6):461–467.

Moed, H. F., Van Leeuwen, Th.N., Reedijk, J. A critical analysis of the journal impact factors of Angewandte Chemie and the Journal of the American Chemical Society inaccuracies in published impact factors based on overall citations only. Scientometrics. 1996; 37(1):105–116.

Monastersky, R. The number that’s devouring science. The Chronicle of Higher Education. 2005; 52(8):A12.

Morris, S. Mapping the journal publishing landscape: how much do we know? Learned Publishing. 2007; 20(4):299–310.

Morrison, H. G. Freedom for scholarship in the Internet age. Unpublished thesis, Simon Fraser University, 2012. Available from: http://summit.sfu.ca/system/files/iritems1/12537/etd7530_HMorrison.pdf (accessed 19 April 2013).

Newman, D. Journal use reports: easier collection development, 2005, February. In Thomson Reuters. Available from: http://scientific.thomsonreuters.com/news/2006-01/8310175 (accessed 29 October 2008).

Nicholas, D., Huntington, P., Jamali, H. R., Tenopir, C. What deep log analysis tells us about the impact of big deals: case study OhioLINK. Journal of Documentation. 2006; 62(4):482–508.

Peters, H. P.F., Van Raan, A. F.J. On determinants of citation scores: a case study in chemical engineering. Journal of the American Society for Information Science. 1994; 45(1):39–49.

Pinski, G., Narin, F. Citation influence for journal aggregates of scientific publications: theory with application to literature of physics. Information Processing & Management. 1976; 12(5):297–312.

Potter, C. V., Dean, J. L., Kybett, A. P., Kidd, R., James, M., et al. Comment: 2004’s fastest organic and biomolecular chemistry! Organic and Biomolecular Chemistry. 2004; 2(24):3535–3540.

Price, J. Are they any use? Hazards of price-per-use comparisons in e-journal management. In 30th UKSG Annual Conference: Plenary Sessions, 2007, 17 April. Available from: http://www.uksg.org/sites/uksg.org/files/jprice_plenary_presentation_2007.pps (accessed 4 January 2008).

Pudovkin, A. I., Garfield, E. Rank-normalized impact factor: a way to compare journal performance across subject categories. Proceedings of the American Society for Information Science and Technology. 2004; 41(1):507–551.

Reedijk, J., Moed, H. F. Is the impact of impact factor decreasing? Journal of Documentation. 2008; 64(2):183–192.

Research Excellence Framework, Consultation on the assessment and funding of higher education research post-2008, 2007, November. In Higher Education Funding Council for England. Available from: http://www.hefce.ac.uk/pubs/hefce/2007/07_34/ (accessed 29 October 2008).

Rowlands, I., Nicholas, D. New journal publishing models: the 2005 CIBER survey of journal author behaviour and attitudes, 2005. In UCL Centre for Publishing. Available from: http://www.publishing.ucl.ac.uk/papers/2005aRowlands_Nicholas.pdf (accessed 29 October 2008).

Schonfeld, R., King, D., Okerson, A., Fenton, E. The Nonsubscription Side of Periodicals: Changes in Library Operations and Cost between Print and Electronic Formats. Washington, DC: Council of Library and Information Resources, 2004.

Schubert, A., Glänzel, W. A systematic analysis of Hirsch-type indices for journals. Journal of Informetrics. 2007; 1(3):179–184.

Seglen, P. O. The skewness of science. Journal of the American Society for Information Science and Technology. 1992; 43(9):628–638.

Seglen, P. O. Why the impact factor of journals should not be used for evaluating research. BMJ. 1997; 314(7079):498–502.

Small, H. Co-citation in the scientific literature: a new measurement of the relationship between two documents. Journal of the American Society of Information Science. 1973; 24(4):265–269.

Sombatsompop, N., Markpin, T., Premkamolnetr, N. A modified method for calculating the impact factors of journals in ISI Journal Citation Reports: polymer science category in 1997–2001. Scientometrics. 2004; 60(2):217–235.

Standardized Usage Statistics Harvesting Initiative (SUSHI) (n.d.). In National Information Standards Organization. Available from: http://www.niso.org/workrooms/sushi (accessed 29 October 2008).

Technology Overview (n.d.). In Google – Corporate Information. Available from: http://www.google.com/corporate/tech.html (accessed 29 October 2008).

Testa, J. The Thomson Scientific journal selection process, 2012. In Thomson Reuters. Available from: http://thomsonreuters.com/products_services/science/free/essays/journal_selection_process/ (accessed 22 April 2013).

The PLoS Medicine Editors. The impact factor game. PLoS Medicine. 2006; 3(6):e291.

Tsay, M., Chen, Y.-L. Journals of general and internal medicine and surgery: an analysis and comparison of citation. Scientometrics. 2005; 64(1):17–30.

Weale, A. R., Bailey, M., Lear, P. A., The level of non-citation of articles within a journal as a measure of quality: a comparison to the impact factor. BMC Medical Research Methodology. 2004; 4(14), doi: 10.1186/1471-2288-4-14.


1. JCR data from http://isiknowledge.com/jcr was taken from the subject category Dentistry, Oral Surgery and Medicine, for the JCR Year 2011; Eigenfactor data from http://eigenfactor.org was obtained using the subject category Dentistry, Oral Surgery and Medicine, for the year 2011; SJR and SNIP data from http://www.journalmetrics.com was taken from the subject area Dentistry, for the year 2011.

2. Available from: https://www2.esf.org/asp/ERIH/Foreword/search.asp.
