Is the mean normalized citation score (MNCS), one of the most commonly used field-normalized scientometric indicators, fundamentally flawed as an indicator of scientific performance? This is the topic of a sharp debate in the scientometric community. Journal of Informetrics, of which I am the Editor-in-Chief, today published a special section in which a number of prominent scientometricians provide their viewpoints on this question. The debate opens with a provocative paper by Giovanni Abramo and Ciriaco Andrea D’Angelo, two well-known Italian scientometricians. They argue that the MNCS and other similar indicators are invalid indicators of scientific performance. Their critical perspective is highly relevant, because the MNCS and related indicators are widely used in evaluations of research groups and scientific institutes.
To illustrate the essence of the debate, let us take a look at an example. Suppose we have two research groups in the field of physics. These groups have exactly the same resources, for instance the same number of researchers and the same amount of funding. Suppose further that during a certain time period group X has produced 60 publications, of which 40 have a citation score of 2.0 after normalization for field and publication year, while the other 20 have a normalized citation score of 0.5. In the same time period, group Y has produced 100 publications, half of them with a normalized citation score of 2.0 and half of them with a normalized citation score of 0.5. Based only on this information, which of the two research groups is performing better?
Using the standard scientometric indicators applied at CWTS, the answer to this question would be that research group X is performing better. At CWTS, we normally use the MNCS indicator – and other so-called size-independent indicators, such as the PP(top 10%) indicator – to answer questions like this one. The MNCS indicator equals the average normalized citation score of the publications of a research group. The indicator has a value of (40 × 2.0 + 20 × 0.5) / 60 = 1.50 for group X and a value of (50 × 2.0 + 50 × 0.5) / 100 = 1.25 for group Y. Hence, on average the publications of group X have a higher normalized citation score than the publications of group Y, and therefore according to the MNCS indicator group X is performing better than group Y. The same answer would be obtained using indicators analogous to the MNCS that are used by other scientometric centers and that are provided in commercial scientometric analysis tools, such as InCites and SciVal.
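For readers who prefer to see the calculation spelled out, the following minimal sketch reproduces the MNCS values for the two hypothetical groups. It is based only on the numbers in the example above; the function and variable names are my own illustration and do not correspond to any particular scientometric tool.

```python
# A minimal sketch of the MNCS calculation for the two hypothetical groups.
# The publication counts and normalized citation scores (2.0 and 0.5) come
# from the example above; the names used here are illustrative only.

def mncs(normalized_scores):
    """Mean normalized citation score: the average of the field- and
    publication-year-normalized citation scores of a group's publications."""
    return sum(normalized_scores) / len(normalized_scores)

group_x = [2.0] * 40 + [0.5] * 20   # 60 publications
group_y = [2.0] * 50 + [0.5] * 50   # 100 publications

print(mncs(group_x))  # 1.5
print(mncs(group_y))  # 1.25
```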
Does the reasoning underlying the MNCS indicator make sense, and should research group X indeed be considered the one that is performing better? A counterargument could be as follows. Groups X and Y have exactly the same resources. Using these resources, group Y outperforms group X in terms of both its number of highly cited publications (50 vs. 40 publications with a normalized citation score of 2.0) and its number of lowly cited publications (50 vs. 20 publications with a normalized citation score of 0.5). In other words, using the same resources, group Y produces 10 more highly cited publications than group X and 30 more lowly cited ones. Since group Y is more productive than group X in terms of both highly cited and lowly cited publications, the conclusion should be that group Y is the one that is performing better.
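To make this counterargument concrete, the sketch below compares the groups in a size-dependent way, by summing the normalized citation scores of all their publications rather than averaging them. This simple total is my own illustration of the general idea, not the specific indicator proposed by Abramo and D’Angelo, and the comparison is meaningful here only because the two groups are assumed to have exactly the same resources.

```python
# An illustrative size-dependent comparison, assuming equal resources.
# Publication counts are taken from the example above.
highly_cited = {"X": 40, "Y": 50}  # publications with normalized citation score 2.0
lowly_cited = {"X": 20, "Y": 50}   # publications with normalized citation score 0.5

for group in ("X", "Y"):
    total_score = highly_cited[group] * 2.0 + lowly_cited[group] * 0.5
    print(f"group {group}: total normalized citation score = {total_score}")
# group X: total normalized citation score = 90.0
# group Y: total normalized citation score = 125.0

# Because both groups have the same resources, group Y's larger total
# points to higher performance under this size-dependent view.
```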
This reasoning is followed by Abramo and D’Angelo in their paper “A farewell to the MNCS and like size-independent indicators”. This paper is the starting point for the debate taking place in the above-mentioned special section of Journal of Informetrics. Abramo and D’Angelo argue that research group Y is performing better in our example, and they conclude that indicators such as the MNCS, according to which group X is the better one, are fundamentally flawed as indicators of scientific performance. Since the MNCS and other similar indicators are widely used, this conclusion could have far-reaching consequences.
Are Abramo and D’Angelo right in claiming that in our example research group Y is performing better than research group X? I agree that group Y should be considered the better one, and therefore I believe that Abramo and D’Angelo are indeed right. The next question, of course, is why the MNCS indicator fails to identify group Y as the better one. The reason is that the MNCS indicator does not use all the information available in our example. Groups X and Y have exactly the same resources, but the MNCS indicator ignores this information. It takes into account only the outputs produced by the groups, that is, the publications and their citations, and not the resources available to the groups. For this reason, the MNCS indicator is unable to detect that group Y is performing better than group X. This is a fundamental problem of the MNCS indicator and of any scientometric indicator that does not take into account the resources research groups have available.
One may wonder why indicators such as the MNCS are used so frequently even though they suffer from this fundamental problem. As I explain in a response (free preprint available here) that I have written to the paper by Abramo and D’Angelo, the answer is that in most cases there does not seem to be a better alternative. In my response, co-authored with CWTS colleagues Nees Jan van Eck, Martijn Visser, and Paul Wouters, we point out that information about the resources of a research group is usually not available, and even when it is available, its accuracy is often questionable. For this reason, scientometric assessments of the performance of research groups usually need to be made based only on information about the outputs of the groups. Since such assessments are based on incomplete information, they provide only a partial indication of the performance of research groups. This is the case for scientometric assessments in which the MNCS indicator is used, but it applies equally to any indicator that takes into account only the outputs of a research group.
A more extensive discussion of the limitations of indicators such as the MNCS, and of the proper use of these indicators, can be found in the above-mentioned special section of Journal of Informetrics. The special section includes eight responses to the paper by Abramo and D’Angelo. These responses, written by scientometricians with diverse backgrounds and ideas, provide a rich set of perspectives on the fundamental criticism by Abramo and D’Angelo of commonly used scientometric indicators. Gunnar Sivertsen, for instance, offers an interesting plea for methodological pragmatism, allowing for a diversity of indicators, each providing different insights and serving different purposes. Mike Thelwall argues that collecting the data required for calculating the indicators advocated by Abramo and D’Angelo is too expensive. Lutz Bornmann and Robin Haunschild even claim that employing the indicators proposed by Abramo and D’Angelo may be undesirable because of the risk of misuse by politicians and decision makers.
Many different viewpoints can be taken on the criticism of Abramo and D’Angelo, but probably the main lesson to be learned from the debate is that we need to continuously remind ourselves of the limitations of scientometric indicators. As scientometricians we need to be humble about the possibilities offered by our indicators.