A Proposal To Improve Peer Review: A Unified Peer-to-Peer Review Platform (part 1.5)
Late again. As I was working on part 2 of this blog series, where I present our proposal to improve scholarly communication through the peer review element, I came across this rather scathing review of our proposal by David Crotty.
Since I don’t see the point of working on part 2 while someone has criticized some elements of our proposal, I’m going to take a short break and respond to the criticism first.
First things first: the co-author of our working paper no longer works at Erasmus University Rotterdam; he hasn’t updated his information yet. As for me, I currently don’t have any affiliations (relevant to this working paper, anyway). So that’s that. I wouldn’t exactly classify myself as mysterious, as I do have a LinkedIn page where I’ve listed my educational background. But let’s focus on the actual comments.
Their system is designed to begin in open access preprint repositories and then potentially spread into use in traditional journals.
The design should, by default, allow journal publishers/editors to take advantage of the system. But that’s pretty much it. This part doesn’t change at all whether the peer-to-peer review model grows or not.
The proposal is full of gaping holes, including a need for a magical automated mechanism that will somehow select qualified reviewers for papers while eliminating conflicts of interest,
Okay. First of all, I don’t consider the idea of a recommendation system that can match manuscripts with suitable peer reviewers as magical. Now, in the Discussion & Conclusion section of our paper we go over the potential strengths and weaknesses of our peer-to-peer review model. In the “Potential Weaknesses” section of it, we’ve stated the following:
A key requirement of the peer-to-peer review model is that the automated manuscript assignment system has to be effective. Since it is essentially a type of recommendation algorithm, it should be technically and functionally feasible to find suitable manuscripts for scholars available for peer review. We identify two issues that remain for now. The first is how to verify whether there is a conflict of interest without making the real identities public. The ability to verify this would improve the answerability of this model significantly. Technically and functionally, filtering certain matches should be feasible, but it would significantly rely on the information that scholars provide. Perhaps allowing authors to indicate manually which authors (edit: scholars is probably a better term to use here) they do not want for peer review might help address this issue. The manual element can be done anonymously, making it only accessible to the automated manuscript selection algorithms. Ideally, we would be able to rely on the automated selection algorithms for this issue as much as possible. Creating a system that can compare paper abstracts, keywords, scholarly affiliations and future research projects to determine whether there is reason to believe there is a conflict of interest is a critical success factor.
To imply that we completely (and magically!) depend on the manuscript selection element, including the ability to find and reject matches with conflicts of interest, being fully automated and working perfectly is highly inaccurate. In fact, on page 15, in the “On Peer-to-Peer Answerability” section of our paper, we spent five paragraphs addressing this exact issue, with the second paragraph starting as follows:
Manual approaches should additionally be implemented in the event the recommendation algorithms are unable to detect conflicts of interest. For example, scholars can manually prevent certain scholars from peer reviewing their manuscripts. The number of scholars they can prevent from peer reviewing can be based on the total number of suitable scholars. Furthermore, scholars should be given the opportunity and encouragement to mark both manuscripts and papers for which they are “Proficient” to peer review. If this statement is checked, additional statements are presented, such as the “No conflict of interest” statement and whether the scholars are “Interested”, “Very Interested” or “Not Interested” in peer reviewing the respective manuscripts and papers.
(SNIP: To the next paragraph)
After a manuscript has a certain amount of such “compatibility” statements checked by a number of scholars, a short overview with the titles and abstracts of the respective manuscripts can be added to the real profile pages of these scholars.
The rest you can read for yourself. I won’t do it justice unless I quote the entire thing, and I still have other things to handle in this post. One can certainly question how relatively efficient this model can be with these manual measures (an issue that we’ve also acknowledged and discussed), but to suggest a magical reliance on automated manuscript selection is highly inaccurate.
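To make the idea concrete, here is a rough Python sketch of how such a conflict-of-interest filter could combine automated signals (shared affiliations, prior co-authorship) with the anonymous manual block list described above. All names and rules here are my own illustration, not part of the working paper:

```python
from dataclasses import dataclass, field

@dataclass
class Scholar:
    name: str
    affiliations: set
    coauthors: set = field(default_factory=set)

def has_conflict(author, reviewer, blocklist):
    """Flag a potential conflict of interest between author and reviewer."""
    if reviewer.name in blocklist.get(author.name, set()):
        return True  # manual, anonymous block set by the author
    if author.affiliations & reviewer.affiliations:
        return True  # shared institutional affiliation
    if reviewer.name in author.coauthors:
        return True  # prior co-authorship
    return False

def eligible_reviewers(author, candidates, blocklist):
    """Keep only candidates with no detected conflict of interest."""
    return [r for r in candidates if not has_conflict(author, r, blocklist)]
```

As the quoted passage notes, a filter like this can only be as good as the affiliation and co-authorship data scholars provide, which is exactly why the manual block list exists as a fallback.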
an over-reliance on citation as the only metric for measuring impact,
Not entirely sure what he’s referring to here. It’s true that we consider the paper citation count an important factor in determining the impact of a paper. And? I can imagine the number of views, downloads, ratings, comments, blog posts and such to be significant as well in determining the impact of a paper. Actually, we have factored in comments and ratings as something that can influence the impact of a manuscript. I’m sure we can consider the others as well later.
and a wide set of means that one could readily use to game the system.
Well, we’ve spent a lot of the paper addressing such issues. Did we identify all exploits? I doubt it. Did we create perfect measures to close the potential exploits? I doubt that. I’d like to think that at the design phase, which is where we are, we can (openly) discuss such issues. I, for one, am very interested in hearing about these ‘means that one could readily use to game the system’.
The proposal doesn’t seem to solve any of the noted problems with traditional peer-review
“Solved” is a big word. I think our message has been to try and “improve on the current situation”. Few to no incentives, for one; accountability, for another. Insight into the (relative) quality of peer reviews. A higher utility for a single peer review by making it accessible to the relevant parties, such as other journals and peer reviewers of the same manuscript.
as it seems just as open to as much bias and subjectivity as what we have now.
Well, we do provide tools that allow scholars to at least track and (publicly) call out such offenses, in very extreme cases. In other cases they will simply not have their work “count” towards their “Reviewer Impact”, which is publicly visible. How is that as open as what we have now?
It’s filled with potential waste and delays as reviewers can apparently endlessly stall the process
What? No. Pages 8 and 9:
Each peer review assignment is constrained by predetermined time limits. The default time limit for an entire process is one month after two peer reviewers have accepted the peer review assignment. Peer reviewers can agree to change the default time limit during the acceptance phase. Any reviewer who has not “signed off” by then will have Reviewer Credits deducted until they sign off on their reports or the peer review session is terminated. This measure is to prevent a process from going on far longer than agreed to beforehand, which is not desirable for any party. An example of how termination can work: a termination can happen when no new deadline, agreed to by the authors and peer reviewers in question, has been set two weeks after the original deadline has passed. In the case of termination, one or more new peer reviewers will have to be assigned to the peer review session to achieve the minimum of two peer reviews per manuscript.
Not exactly what I’d call the ability to stall endlessly.
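For the skeptical reader, the quoted deadline rules are mechanical enough to sketch in a few lines of Python. The function name, the status labels and the exact day counts are my own hypothetical rendering of the one-month default limit and the two-week termination window, not code from the paper:

```python
from datetime import date, timedelta

DEFAULT_LIMIT = timedelta(days=30)  # default: one month after acceptance
GRACE = timedelta(days=14)          # two weeks to agree on a new deadline

def session_status(accepted_on, today, new_deadline=None):
    """Classify a peer review session under the quoted timing rules."""
    deadline = new_deadline or accepted_on + DEFAULT_LIMIT
    if today <= deadline:
        return "open"
    if new_deadline is None and today > accepted_on + DEFAULT_LIMIT + GRACE:
        return "terminated"  # new reviewers must be assigned
    return "overdue"  # Reviewer Credits are being deducted
```

The point the sketch makes is that every path out of “overdue” is bounded: either a new deadline is agreed, or the session is terminated and reassigned.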
and authors can repeatedly demand new reviews if they’re unhappy with the ones they’ve received.
Like they already can now? Actually, we have something a little different in mind. See page 9:
When authors are not content after having gone through a peer review process, they can leave manuscripts “open” for other peer reviewers to start a new peer review session. The newer peer reviewers will have access to the peer review reports of previous sessions, creating an additional layer of accountability. Concerning the consequences of multiple peer review sessions for the same manuscript: in the traditional system, the latest peer reviews before a manuscript is accepted for publication are the ones that count. In our peer-to-peer review model, the manuscript score is based on the grades that the peer reviewers of the newest session have assigned, regardless of whether those scores are higher or lower than the previous manuscript scores. A possible alternative is to let the authors decide which results to attach to the manuscript rating. A disadvantage of letting authors select which set of grades to use is that it would likely weaken the importance of the earlier peer review sessions. To improve accountability and efficiency, previous reviews are not hidden from any future peer reviewers. The reviews will still count, and the peer reviewers who submitted them keep the Reviewer Credits awarded to them. Regardless of how and which sets of grades are utilized, those specific grades are to be reflected in the rankings and returned search results.
So, yes, authors can request new reviews if they’re unhappy with the ones they’ve received. And scholars can see how many times they’ve already done this from the existing peer reviews of those manuscripts (and their grades), and decide for themselves whether it’s worth their time to peer review them again. Again, you can question the effectiveness of this added level of accountability, but you cannot say authors can “abuse” the concept of requesting peer reviews as many times as they want. They can’t, certainly not compared to what they already can (and generally do) with the current publishing system. Also, the section Crediting Reviewer Impact (starting on page 11) covers additional “penalties” for authors who repeatedly request new peer reviews.
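The quoted scoring rule, that the newest completed session always sets the manuscript score, is simple enough to state as code. This is my own minimal rendering of it; the data structure (a chronological list of sessions, each a list of grades) is a hypothetical illustration:

```python
def manuscript_score(sessions):
    """Return the manuscript score under the newest-session rule.

    sessions: completed peer review sessions in chronological order,
    each a list of reviewer grades. Earlier sessions stay on record
    (and their Reviewer Credits are kept), but only the newest session
    determines the score, whether it is higher or lower than before.
    """
    if not sessions:
        return None  # no completed session yet
    latest = sessions[-1]
    return sum(latest) / len(latest)
```

Because a new session overwrites the score in either direction, requesting another round is a gamble for the author, not a free retry, which is the accountability point made above.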
Reviewers are asked to do a tremendous amount of additional work beyond their current responsibilities, including reviewing the reviews of other reviewers, and taking on jobs normally done by editors. If one of the problems of the current system is the difficulty in finding reviewers with time to do a thorough job, then massively increasing that workload is not a solution.
A legitimate concern. But here’s the thing: we’re not sure this is going to be true. Sure, we ask peer reviewers to additionally evaluate and score the peer reviews of others; we’ll classify that as a chore. Though not entirely substantiated, we’ve more than once heard the sentiment that scholars actually enjoy having access to the other peer reviews of the manuscripts they themselves have peer reviewed, out of curiosity or to learn something from them. And are they not evaluating the other peer reviews by doing that? We’re just proposing to give scholars who want to do that the tools to do so effectively. But fine, we’ll consider it a chore.
But what if we achieve our intended objectives? What if by doing this the average quality of a peer review(er) goes up? What if the average number of peer reviews per manuscript goes down (because of the instruments that can hold peer reviewers and authors alike more accountable for untimely or low-quality work)? And if you have to peer review a manuscript that has been peer reviewed before (but hasn’t been revised), what if you can save time by having access to the previous peer reviews? And what if your own manuscripts have greater odds of being noticed, read, reviewed and cited more often because you peer review well (more on this later, or you can just read it in the working paper)? A more efficient allocation (on a global platform) of the available peer reviewers, peer reviews, authors and manuscripts? A more objective understanding of (the impact of) your peer review proficiency (relative to other scholars)? Open Access to scrutinized research literature? Would it still just be a tremendous waste of your time? Or might the benefits actually be worth it? Focusing on just the “chores” without pondering the potential benefits, both perspectives we’ve written about extensively, is not a very accurate way of evaluating proposals, IMO.
There’s a reason that editors are paid to do their jobs — it’s because scientists don’t want to spend their time doing those things. Scientists are more interested in doing actual research.
And they can do that better when they don’t have to keep peer reviewing (unrevised) manuscripts that have already been peer reviewed, and when they have more access to scrutinized research literature.
Like the PubCred proposal, it fails to address the uneven availability of expertise, and assumes all reviewers are equally qualified.
Actually, the whole point of creating a metric for peer review proficiency is to more objectively measure the differences in peer review proficiency among scholars, and by providing them with the instruments to do so systematically, I’d like to think that we can get that kind of information. As for the former issue, I’m not entirely sure what he means. Scholars aren’t being punished for not peer reviewing. They can still submit their papers, and if those are interesting enough then surely some scholars will want to peer review them, which they can do with no penalty.
Also like PubCred, the authors’ suggestions for paying for the system seem unrealistic. In this case, they’re suggesting a subscription model, which seems to argue against the very open access nature of the repositories themselves, limiting functionality and access to tools for those unwilling to pay.
The nature of Open Access preprint repositories is to provide access to preprints, and that doesn’t change at all: everybody can still submit to and access preprints in OA preprint repositories. What users might pay for are more advanced search instruments for peer-to-peer reviewed manuscripts (“postprints”). What we propose here can no longer be classified as an “open access repository”: it’s a peer-to-peer review model with its own database for peer reviews, and possibly its own database for revised papers if the repositories “providing” manuscripts can’t accommodate that, providing open access to scrutinized preprints (“postprints”). Paying for the scrutiny of manuscripts surely doesn’t go against the nature of scholarly communication.
The authors spend several pages going into fetishistic detail about every aspect of the measurement, but just as in the proposed Scientific Reputation Ranking program suggested in The Scientist, they fail to answer key questions:
Who cares? To whom will these metrics matter? What is being measured and why should that have an impact on the things that really matter to a career in science? Why would a funding agency or a hiring committee accept these metrics as meaningful?
If you’re hoping to provide a powerful incentive toward participation, you must offer some real world benefit, something meaningful toward career advancement.
And with this, my failure to come up with a good title for this blog post is exposed: it’s not just about peer review. As the title of our working paper already suggests, it’s also about scholarly communication. Who cares? Scholars who want to read scrutinized research literature might care. Authors who want to see their papers scrutinized might care. People who care about Open Access might care. Scholars who want to peer review properly might care. Scholars who peer review properly and want to be rewarded with higher odds of having their own work noticed, read, reviewed and cited might care. Scholars who care about a more efficient allocation of peer reviewers and their peer reviews might care. You can argue against the validity of these “incentives”, but you can’t just disregard them completely without considering them and telling people “there’s nothing for them to gain”. I find that a very incomplete way of evaluating proposals.
Look, there are plenty of legitimate concerns with our proposed model. We spent quite a bit of time addressing how those concerns can be tackled. We could use advice on how to improve our proposed solutions or even criticism of why they don’t work. What we don’t need are people completely ignoring our proposed solutions when they review our proposal. It doesn’t help us, and I don’t see how it can help you. And that’s all I have to say for now. Back to working on part 2.