Defending the Unified Peer-to-Peer Review Platform Proposal Part 3
This is part 3 of my response to David Crotty's additional critique of our working paper. Part 1 of our exchange is here (note that it uses a different blog-title style; apologies for any confusion), and part 2 is here.
If you want to skip this and just read about our proposed model from the beginning, you can either go to the first part of my blog series summarizing the model (and wait for updates) or read our paper in its entirety right here.
Citation: you can imagine other factors being added in, but from my recall, citation was the factor that was mentioned over and over again as being used to score a reviewer’s performance.
Not quite. The paper citation count is actually "just" another element of the Reviewer Impact. The grades that the other peer reviewers give a scholar's reviews are far more important. We neither designed nor intended the paper citation count to be the dominant influence on a scholar's Reviewer Impact.
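The weighting described above could be sketched roughly as follows. This is a minimal illustration, not the paper's actual formula: the weights (0.8/0.2) and the citation dampening are invented here purely to show review grades dominating citation counts.

```python
# Hypothetical sketch: Reviewer Impact as a weighted blend of the grades
# other reviewers gave this scholar's reviews and a dampened citation
# signal from the papers they reviewed. All weights are illustrative.

def reviewer_impact(review_grades, cited_counts,
                    grade_weight=0.8, citation_weight=0.2):
    """review_grades: grades (0-10) received for this scholar's reviews.
    cited_counts: citation counts of the papers they reviewed."""
    if not review_grades:
        return 0.0
    avg_grade = sum(review_grades) / len(review_grades)  # 0..10
    # Dampen citations so a few highly cited papers cannot dominate.
    citation_signal = sum(c / (c + 10) for c in cited_counts)
    citation_signal = 10 * citation_signal / max(len(cited_counts), 1)
    return grade_weight * avg_grade + citation_weight * citation_signal

print(round(reviewer_impact([8, 9, 7], [5, 120, 0]), 2))  # → 7.24
```

Under weights like these, even a reviewed paper with 120 citations moves the score far less than the grades from fellow reviewers do.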
I agree that citation is an incredibly important metric, but it’s a flawed one as well. It’s impossible to separate out a citation in a subsequent paper that lauds an earlier discovery versus one that proves it to be untrue. Fraudulent and incorrect papers get cited lots.
No disagreement here. As mentioned in part 2 (and in the paper, of course), the paper citation count does play a substantial role in the default ranking of the eprints. But that's not really different from how citations are used now. I do wonder how feasible it would be to attach a +/- value to citations in the model.
Citation is a very slow metric as well, as you note in your proposal. If Reviewer Impact is indeed important to a career, it may not fit into the necessary timeline for someone up for tenure or funding.
Reviewer Impact can help one's career a bit, indirectly, by improving the visibility of one's papers, though the actual effect on citations is rather modest in this context. A high Reviewer Impact will do very little, if anything at all, for authors whose papers aren't interesting to scholars to begin with; it will likely do something for authors whose papers are.
And citation is certainly an area where one could game the system, by deliberately citing all of the papers one reviewed. If the Reviewer Impact score is somehow decided to be important, you could choose papers relevant to your own work, give them a good review then cite them, thus pumping up your own score for judging science.
The Reviewer Impact was neither designed nor ever intended to be significantly influenced by the paper citation count, which is why I consider your scenario rather far-fetched. However, you are correct that it is still a factor and still exploitable, so I should take it seriously. Thank you for raising the point; I've spent a good part of an afternoon thinking of ways to tackle this issue.
Here's one fairly surefire way to tackle the issue: if the Reviewer Impact becomes so important to scholars that they are willing to "unjustly" inflate the citation counts of papers they have reviewed and scored favorably (on "significance/impact") just to gain that small advantage, I think it's safe to assume that by then the peer-to-peer review model is very well established. Likely established enough that we no longer need the "personal" incentives to encourage scholars to participate. We could then remove those incentives in favor of improving the effectiveness and efficiency of all our scholarly communication functions, while at the same time eliminating many of the personal motives for exploiting the system. I'd really rather not do this, though, unless there's absolutely no other way of preventing people from exploiting the paper citation count and possibly other exploits.
A more technologically challenging approach is to develop a function that automatically tracks which authors have cited which papers, and then whether they have peer reviewed those papers (positively), without naming the actual papers so that the identities of the peer reviewers are not exposed. To allow a higher degree of accountability, this information can then be publicly displayed, e.g.: "This author has a total of x citations divided over x papers that he/she has also peer reviewed (and rated positively)". After deciding what counts as an unreasonable level of such incidents, we can nullify the impact this measure has on their Reviewer Impact. Of course, scholars could still avoid this measure by having "friends" cite those papers for them, so they're not directly connected.
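The tracking function described above could look something like this. It is a hypothetical sketch: the data shapes and names are invented, and the real platform would work against its own citation and review databases. The key property it demonstrates is that the public summary reveals only aggregate counts, never the papers themselves.

```python
# Hypothetical sketch: count, per author, how many distinct papers they
# both reviewed favorably and later cited, without naming those papers.

def self_review_citation_summary(citations, favorable_reviews):
    """citations: iterable of (author, cited_paper) pairs.
    favorable_reviews: set of (reviewer, paper) pairs (positive reviews).
    Returns {author: number_of_overlapping_papers}, overlaps only."""
    overlaps = {}
    for author, paper in citations:
        if (author, paper) in favorable_reviews:
            overlaps.setdefault(author, set()).add(paper)
    # Public display: aggregate counts only, paper identities withheld.
    return {author: len(papers) for author, papers in overlaps.items()}

cites = [("alice", "p1"), ("alice", "p2"), ("bob", "p1")]
reviews = {("alice", "p1"), ("bob", "p3")}
print(self_review_citation_summary(cites, reviews))  # → {'alice': 1}
```

A moderation layer could then compare these counts against whatever threshold is deemed unreasonable before nullifying the citation element of an author's Reviewer Impact.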
The easiest solution is simply to remove the paper citation count as an element that can influence a scholar's Reviewer Impact. Reviewers must still assess the significance of a paper, but only to the benefit of other scholars, not themselves. If we use this solution, we must think of another way to reflect a peer reviewer's ability to accurately assess the significance of a paper in their Reviewer Impact; after all, it is a valuable skill and reflects the understanding scholars have of their disciplines. Perhaps we should make the model, and these specific elements, depend (more) on the expert reviews of registered scholars. For example, based on just the abstract, introduction, discussion and conclusion of a paper, we could give scholars the means to rate, comment on and provide additional evidence for its significance. They would have to be qualified to judge the paper's research topic, and since their focus should only be on papers they find significant and can say something positive about, their real identity would also be attached.
Another example of a place for gaming is in reviewing the other reviewer on your paper. As I understand it, there are a certain number of “reviewer credits” given for each peer review session. If those are divided among the paper’s reviewers based on their performance, isn’t there an advantage in always ranking the other reviewer poorly so you garner more of the credits?
Authors and peer reviewers must unanimously consent to "finalize" a peer review session, and they will have access to the other peer reviewers' assessments before they receive the option to end that session. If this doesn't lead to a satisfying result, the assessments can be made publicly visible, which gives a larger group of peers the opportunity to share their thoughts and finalize the session for them. The peer reviewers will have the opportunity to adjust their assessments before that happens. Either way, no Reviewer Credits are credited until a peer review session is finalized.
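The rule above, credits released only after unanimous consent, can be stated in a few lines. This is a minimal sketch with invented names; the per-review credit amount and participant roles are assumptions, not values from the paper.

```python
# Hypothetical sketch of the finalization rule: Reviewer Credits are only
# released once every author and every peer reviewer has consented.

def can_finalize(consents):
    """consents: {participant: bool}. Unanimous consent is required."""
    return bool(consents) and all(consents.values())

def award_credits(consents, credits_per_reviewer, reviewers):
    """Release credits only for finalized sessions; otherwise nothing."""
    if not can_finalize(consents):
        return {}  # session still open (or escalated to public review)
    return {r: credits_per_reviewer for r in reviewers}

# One holdout blocks the payout entirely:
print(award_credits({"author": True, "rev1": True, "rev2": False},
                    2, ["rev1", "rev2"]))  # → {}
```

This removes the incentive to trash the other reviewer for a larger share of the credits: nothing is paid out until everyone, including that reviewer, agrees the session is done.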
Delays: one month is a lot longer than my current employer gives reviewers (2 weeks).
The one month is just an example and can be changed; I expect it will also vary per discipline.
Furthermore, as you note, the time limit can be changed by the reviewers.
Oops, that's supposed to be a consensus of authors and peer reviewers. We got it right for the "new deadlines", but not for the default time limits. Good catch! And while we're at it, it's probably a good idea to cap each extension at two weeks. That way authors (and peer reviewers) get two weeks at a time to consider whether a peer review session is going somewhere or whether they should just cancel it.
If a paper gets a bad reviewer who unfairly trashes it, should that paper be permanently tarnished by having that review read by every subsequent reviewer? Wouldn’t it be better if they gave the paper a fair chance, a blank slate? Clearly I’m not alone in thinking this, as the uptake levels for systems like the Neuroscience Peer Review Consortium are microscopic (1-2% of authors).
Better for the authors, yes. But what if the reviews are fair and the authors unfair in (1) their treatment of the reviews and (2) their assessment of their manuscripts? On what basis do the authors/papers deserve another "fair" chance then? Aren't you the one now relying too heavily on authors being completely truthful and objective about their own manuscripts?
This seems to be an interesting contradiction: if editors are really as valuable as we think they are when it comes to selecting the best peer reviewers, then shouldn’t we also expect that, in general, peer reviewers will objectively, proficiently and constructively review manuscripts? What is the problem then with sharing the reviews with other valuable journal editors who will choose the best peer reviewers? Other than the competitive reasons? If the “consensus” is that these peer reviews shouldn’t be shared with other journal editors and their selected peer reviewers, doesn’t that imply that something is wrong with the ability of journal editors and their selected peer reviewers to carry out their tasks in an objective, constructive and proficient manner? And that we should encourage a higher degree of accountability, both for the authors and for the peer reviewers?
One good reason I can think of why peer reviews shouldn't be shared, with respect to improving scholarly communication in general, is that the authors have since revised their manuscripts and the peer reviews have become irrelevant, which means sharing them would waste the new peer reviewers' time. This can be partly addressed with a "Track Changes" function that neatly displays the changes to the manuscripts alongside the specific points of feedback from the corresponding peer reviewers.
Additional work: there’s a huge difference between reading through the other reviewer comments on a paper and in writing up and doing a formal review of the quality of their work. If one is to take such a task seriously, then it’s a timesink.
I think it will depend on the design and use of the quality assessment instruments. It doesn't have to be a big write-up of the other peer review report. It could be based on ratings for each of the important characteristics of a good peer review, with a "highlight" tool to support each rating: each highlighted part of the other review represents something that review lacks or adds compared to one's own review report. The process doesn't have to be complicated or time-consuming to still be constructive.
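To make the lightweight instrument concrete, here is one possible shape for it. The characteristic names and the 1-5 scale are invented for illustration; the paper does not prescribe them.

```python
# Hypothetical sketch of a rating-based assessment of another reviewer's
# report: per-characteristic scores plus optional highlighted excerpts,
# instead of a full written meta-review. Characteristic names are invented.

CHARACTERISTICS = ("thoroughness", "constructiveness", "accuracy")

def assess_review(ratings, highlights=None):
    """ratings: {characteristic: score 1..5}.
    highlights: {characteristic: [excerpts supporting that rating]}."""
    missing = [c for c in CHARACTERISTICS if c not in ratings]
    if missing:
        raise ValueError(f"unrated characteristics: {missing}")
    overall = sum(ratings[c] for c in CHARACTERISTICS) / len(CHARACTERISTICS)
    return {"overall": overall, "highlights": highlights or {}}

result = assess_review(
    {"thoroughness": 4, "constructiveness": 5, "accuracy": 3},
    highlights={"accuracy": ["misreads Table 2 in section 4"]})
print(result["overall"])  # → 4.0
```

Filling in three ratings and dragging a few highlights is a far smaller time sink than composing a formal review of a review, while still producing a structured, comparable assessment.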
There seems to be a whole raft of negotiations involved and extra duties, extra rounds of review. The proposal itself is highly complicated, filled with all sorts of if/then sorts of contingencies. You’ve certainly put a lot of thought into it, but it’s way too complicated, too hard to explain to the participants.
I expect that the model will be easier to explain once it's out of the design/brainstorming phase. The current phase is about presenting as many solutions as possible to address every significant issue we can think of. If the model weren't so "complex", I'd likely have a much harder time replying to your criticism, for example. And if we can actually build the tools, it will be easier for participants to simply use them rather than read about them.
The ideal improvement to the system would be a streamlining, not an adding in more tasks, more negotiation, more hoops through which one must jump. Time is the most valuable commodity that most scientists are having to ration. Saving time and effort should be a major focus of any improved system.
First of all, the proposed model is designed to reduce the time and effort required of scientists, among other things. Secondly, many critics dismiss new models for lacking personal incentives, much as you have been doing, and I'd like to think this is one of the few proposals that actually focuses on providing those personal incentives to participants. It's difficult to provide personal incentives, whatever they are, without a way to assess the exact contribution of each participant so they can be rewarded appropriately. One thing led to another, and this proposed model is the result of wanting to make scholarly communication better while still providing personal incentives.
I do think you have some interesting ideas here, and I look forward to seeing future iterations.
Thanks. Some of your feedback has helped me come up with some new ideas to improve the model a bit.