More on the PBRF's new clothes

A few weeks ago I outed the multi-million-dollar exercise that is the Quality Evaluation component of the Performance-Based Research Fund (PBRF) as futile, because there is no net gain in research dollars for the NZ academic community.  Having revealed the Emperor's new clothes, I awaited the call from the Minister in charge to tell me they'd cancelled the round out of futility.  When that didn't come, I pinned my hope on a revolt by the University Vice-Chancellors. Alas, the VCs aren't revolting.  This week, my goal is for there to be mass resignations from the 30 or so committees charged with assessing the evidence portfolios of individual academics, and for individual academics to make last-minute changes to their portfolios so as to maintain academic integrity.

I love academic metrics – these ways and means of assessing the relative worth of an individual's contribution to academia, or of the individual impact of a piece of scholarly work, are fun.  Some are simple, merely the counting of citations to a particular journal article or book chapter; others are more complex, such as the various forms of the h-index. It is fun to watch the number of citations of an article gradually creep up and to think "someone thinks what I wrote is worth taking notice of".  However, these metrics are largely nonsense and should never be used to compare academics.  Yet, for PBRF and promotions we are encouraged to talk of citations and other such metrics.  Maybe, and only maybe, that's OK if we are comparing how well we are performing this year against a previous year, but it is not OK if we are comparing one academic against another.  I've recently published in both emergency medicine journals and cardiology journals.  The emergency medicine field is a small fraction of the size of cardiology, and, consequently, there are fewer journals and fewer citations.  It would be nonsense to compare citation rates for an emergency medicine academic with those of a cardiology academic.
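For readers unfamiliar with it, the h-index mentioned above is at least simple to define: it is the largest h such that h of a researcher's papers have at least h citations each. A toy sketch, using entirely made-up citation counts, shows the calculation (and also why the number says nothing about field size, the problem raised above):

```python
# Hypothetical citation counts for one researcher's papers (made-up numbers).
citations = [45, 33, 12, 8, 8, 5, 2, 1, 0, 0]

def h_index(cites):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(cites, reverse=True)
    # Counting works because the ranked counts only decrease as the rank grows.
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)

print(h_index(citations))  # prints 5: five papers have 5 or more citations
```

Note that a cardiologist and an emergency physician with identical h-indices have not achieved the same thing, since the pool of citing papers differs enormously between the two fields.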

If the metrics around individual scholars are nonsense, those purporting to assess the relative importance (“rank”) of an academic journal are total $%^!!!!.  The most common is the Impact Factor, but there are others like the 5-year H-index for a journal.  To promote them, or use them, is to chip away at academic integrity.  Much has been written elsewhere about impact factors.  They are simply an average of a skewed distribution.  I do not allow students to report data in this way.  Several Nobel prize winners have spoken against them.  Yet, we are encouraged to let the assessing committees know how journals rank.
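The "average of a skewed distribution" point is easy to see with a toy example. The numbers below are invented, but the shape is typical of journal citation data: a handful of highly cited articles drag the mean far above what a typical article in the journal actually receives.

```python
from statistics import mean, median

# Made-up two-year citation counts for 20 articles in one hypothetical journal.
cites = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 6, 8, 10, 15, 40, 120]

impact = mean(cites)     # 11.35 -- the impact-factor-style average
typical = median(cites)  # 3.0   -- what a typical article actually gets
below = sum(c < impact for c in cites)

print(impact, typical, below)  # 11.35 3.0 17
```

Seventeen of the twenty articles sit below the journal's "average" citation rate, which is exactly why reporting the mean of such a distribution would fail a student assignment.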

Even if the citation metrics and impact factors were not dodgy, there is still a huge problem facing the assessing committees: they are called on to compare apples with oranges.  Not all metrics are created equal.  ResearchGate, Google Scholar, Scopus and Web of Science all count citations and report h-indices.  No two are the same.  A cursory glance at some of my own papers shows more than 20% variation in counts between them.  I've even a paper with citation counts of 37, 42, 0 and 0.  Some journals are included, some are not, depending on how each company has set up its algorithms. Book chapters are not included by some, but are by others. There are also multiple sites for ranking journals, using differing metrics.  Expecting assessing committees to work with multiple metrics which all mean something different is like expecting engineers to build a rocket but not allowing them to use a standard metre rule.

To sum up, PBRF evidence portfolio assessment is a waste of resources, and it encourages the use of integrity-busting metrics that should not be used to rank individual academic impact.


2 thoughts on "More on the PBRF's new clothes"

  1. Michael MacAskill

    The first step to changing behaviour is measuring that behaviour. Only then can incentives be applied accordingly. To my mind, PBRF has been a success inasmuch as the research output of NZ universities has apparently increased markedly since it was introduced.

    Having said that, yes, the measures employed are imperfect and the exercise is immensely inefficient. When looking at the distribution of funding awarded to the eligible institutions over time, there has been very little real change between them, so it is not clear that the exercise actually changes the funding allocation much (I don’t have those figures to hand, but my recollection is that the only substantial change was for AUT, increasing from a very low baseline).

    As researchers, we tend to focus on the quality assessment, as it is so labour intensive for us at an individual level. But 45% of the allocation formula is based on simple administrative data gathered at an institutional level (currently 25% from research degree completions and 20% from external funding gained). One can debate the Matthew effect in operation here (the organisations already getting external income or producing research graduates get extra recognition for that, rather than funding being directed to develop those that are struggling to do so). But at least these metrics are very simple to assess and are directly comparable across organisations. That leaves the 55% that arises from the immensely labour-intensive quality assessment. Given that the formula seldom leads to substantial changes in the actual funding allocation, I believe that the flawed measures adopted for the quality assessment could simply be gathered at a global organisational level, saving the immense efforts of individual academics to produce portfolios, all of which need to be assessed multiple times by other academics. That is, presumably simply counting all of the publications and citations for an institution could be a rough-enough shorthand for the individual assessments, given that that work doesn't seem to directly alter much in terms of the final outcome in the funding formula.

    Individual researchers are still monitored through performance reviews and promotion applications, so it is not like we escape accountability, and those are done more frequently than the 6-yearly PBRF cycle.

    1. Michael MacAskill

      Ahh, it must have been your previous post where I saw those institutional figures collated over the previous PBRF rounds.

      As you so nicely showed, there is very little change from round to round in the proportional allocation (especially since the second round, by which time everyone had learned lessons from the various games that needed to be played). This really does emphasise that the total effort placed into the PBRF process should be reduced as much as possible, since the massive effort put into the "quality" component is not justified by the useful variability in the output (the proportional funding allocation) from round to round.

