Defending the Unified Peer-to-Peer Review Platform Proposal Part 2
I really hate writing long comments in (most) comment sections, because the readability is usually terrible. So I’m going to use another blog post to reply to David Crotty’s additional critique. The topic of our exchange is our working paper. Part 1 of our exchange is here (it has a different style of blog title, though; I apologize for the confusion). This is part 2, and there will be a part 3, because unfortunately too many things in the critique require more detailed responses than fit in one blog post.
On a related note: I can’t shake the feeling that I’m totally destroying the proper flow of presenting our model with these exchanges, though. We’re jumping all over the place with this. If you want to skip this and just read from the beginning, you can either go to the first part of my blog series that summarizes our proposed model (and wait for updates) or read our paper in its entirety right here.
The point there was to talk about the futility of social reputation scoring systems as motivators for participation in a professional market. That’s where the question “who cares” is an important one.
Yes, I know what your point is. And I’m saying that just because there haven’t been many success stories, if any, it doesn’t mean it’s impossible. It just means that it is, at best, very difficult and the odds are very high that our model isn’t feasible. Consider me fully aware of this reality. On that note, I thank you for investing time and effort into reading our working paper and providing feedback for it. I truly appreciate that (well, not necessarily for your first entry, but certainly for this one).
No one questions the desire to improve the peer review process, improve science publishing, improve communication and speed progress. If your system accomplishes those sorts of things, then isn’t that motivation enough?
It sure is, but what if we don’t see a way of doing that unless we can somehow measure the contribution of scholars in the role of peer reviewer and provide them with proportional “personal” incentives? To you, a scoring system may just be a useless shiny number generator, but for us it’s an integral part of getting this model to function properly and realizing the benefits that come with it. Actually, if we can motivate enough scholars to peer review with our approach (or similar methods) and verify that such approaches truly improve scholarly communication in general, I’m not ruling out the idea of removing these “personal” incentives. But to me, they are more than just personal incentives; I genuinely believe these incentives can additionally provide important value to scholars, such as an indirect “rating” of manuscripts (more on this later).
Adding a scoring system does not seem to be a big motivator, particularly, as noted in the blog, because it’s irrelevant outside of your system.
Our “system” is actually the entire environment of Open Access preprint repositories including additional databases/archives with (peer-to-)peer reviews of those preprints. It’s relevant to all the scholars who are interested in platforms with scrutinized research literature. That’s not a small area of relevancy.
If I’m up for tenure and I haven’t published any papers or secured any grants, will having a good Reviewer Impact score make any difference to my institution? If I’m a grant officer for the American Heart Association, I’m looking to fund researchers who can come up with results that will help cure disease, not researchers who are good at interpreting the work of other researchers or who are popular in the community. Why would I care about a grant applicant’s Reviewer Impact score?
I notice I missed this part in your original blog post as well. I’m going to skip commenting on this right now because I don’t think you fully understand our intentions with Reviewer Impact as an incentive for scholars to peer review. Instead, I’ll talk a little bit more about that in reply to your following statements:
For any system to be adopted, it has to have clear utility and superiority on its own. An artificial ranking system does not add any motivation for participation. The one benefit offered by your Reviewer Impact score is more visibility for one’s own papers.
Actually, that’s not the only benefit. The Reviewer Impact’s most significant role is to help enforce accountability. And we’re hoping that a more accountable system leads to peer reviews of a higher quality, which leads to more incentives for scholars. Anyway…
That seems to be the opposite of what you’d want out of any paper filtering system. You want to highlight the best papers, the most meaningful results, not the papers from the best reviewers. If a scientist does spectacular work but is a bad reviewer, that work will be buried by your system in favor of mediocre work by a good reviewer.
A legitimate concern. In fact, it’s so legitimate that we’ve thought of it as well. Page 18:
When scholars are searching and browsing for manuscripts, the order of the peer review offers of the manuscripts are prioritized based on relevance. However, if the degree of suitability and relevance of the manuscripts is largely the same, the manuscripts of the scholars with a higher Reviewer Impact will be listed higher on search results, browse results and their lists of peer review offers.
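The ordering rule in that excerpt can be read as "relevance first, Reviewer Impact only as a tie-breaker". Here is a minimal sketch in Python of that reading. The paper does not define "largely the same", so the relevance scale (0 to 100), the `band_width` parameter and the field names are all assumptions made up for illustration.

```python
def rank_manuscripts(manuscripts, band_width=10):
    """Order by relevance band first, then by the author's Reviewer Impact.

    Assumption: relevance is an integer from 0 to 100, and two manuscripts
    count as "largely the same" in relevance when they fall in the same band.
    """
    def sort_key(m):
        band = m["relevance"] // band_width  # hypothetical notion of "largely the same"
        # Negate both values so higher bands and higher impact sort first.
        return (-band, -m["reviewer_impact"])

    return sorted(manuscripts, key=sort_key)


results = rank_manuscripts([
    {"title": "A", "relevance": 95, "reviewer_impact": 0.2},
    {"title": "B", "relevance": 92, "reviewer_impact": 0.8},
    {"title": "C", "relevance": 60, "reviewer_impact": 0.99},
])
print([m["title"] for m in results])  # ['B', 'A', 'C']
```

B moves ahead of A because the two are in the same relevance band, while C stays last despite its author having the highest Reviewer Impact. Fixed bands are a crude choice (scores of 89 and 90 land in different bands), so a real implementation would probably need something smarter.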
So while I think I did say papers in the last reply to you, I meant “manuscripts” or “preprints”. I agree absolutely that the good papers should be at the top of the heap, but the rankings of preprints are (more) fair game. Once someone can establish the quality of the preprints, this personal incentive is going to be far less effective.
Page 18 again:
The paper citation count alone is widely considered as one of the best, if not the best, quality indicator of a paper that scholars have. It is therefore more reasonable to attach far less weight to the Reviewer Impact in determining the priority, if at all, and more weight to the other quality indicators when it comes to postprints.
And page 19:
In order to avoid such conflicts, we could apply the following conditions: the priority level for preprints in search and browse results are for 50% determined by the Reviewer Impact of the authors. 30% is based on more established quality indicators such as paper citation counts. The remaining 20% is based on the “informal” manuscript screenings and other quality indicators. For postprints, 60% is based on quality indicators such as paper citation counts and the peer review grades. 20% is based on their Reviewer Impact, and the remaining 20% is based on the informal manuscript screenings and other quality indicators. These conditions can be revaluated once they have been put in effect and more insight is available on their effectiveness. For example, if there is evidence that the Reviewer Impact of scholars is more accurate in reflecting the quality of the manuscripts of the respective authors than previously assumed.
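To make the split in that excerpt concrete, here is a minimal sketch in Python. Only the percentages come from the paper. The function name, the component names and the assumption that every component is already normalized to a value between 0 and 1 are hypothetical.

```python
# Weights from the page 19 excerpt; the component names are hypothetical labels.
WEIGHTS = {
    "preprint": {"reviewer_impact": 0.50, "established": 0.30, "screening": 0.20},
    "postprint": {"reviewer_impact": 0.20, "established": 0.60, "screening": 0.20},
}


def priority(kind, reviewer_impact, established, screening):
    """Weighted priority; all three inputs are assumed to be normalized to [0, 1].

    established: citation counts (plus peer review grades for postprints)
    screening:   informal manuscript screenings and other quality indicators
    """
    w = WEIGHTS[kind]
    return (w["reviewer_impact"] * reviewer_impact
            + w["established"] * established
            + w["screening"] * screening)


# Made-up scores: strong work by a weak reviewer vs. mediocre work by a good reviewer.
strong_work = {"reviewer_impact": 0.1, "established": 0.9, "screening": 0.8}
good_reviewer = {"reviewer_impact": 0.9, "established": 0.4, "screening": 0.5}

for kind in ("preprint", "postprint"):
    print(kind,
          round(priority(kind, **strong_work), 2),
          round(priority(kind, **good_reviewer), 2))
```

With these invented numbers, the good reviewer's manuscript ranks higher as a preprint (0.67 against 0.48), but the stronger work ranks higher once both are postprints (0.72 against 0.52), which is the behavior the excerpt is aiming for.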
That said, I’m happy to expand my comments on your proposal. As a working paper, it deserves scrutiny and hopefully constructive criticism to improve the proposal.
I call the automated selection program “magical” because it does not exist, and I don’t think it’s technologically capable of existing, at least if it’s expected to perform as well as the current editor-driven system.
A legitimate concern. As I mentioned before, the effectiveness of our manuscript selection function and the efficiency (in time and effort) of peer reviewing the peer reviews are the biggest hurdles for our proposal. But even if our manuscript selection function cannot be optimized to eliminate as many conflicts of interest as journal editors do now, that doesn’t mean that peer reviews produced through our method will, on the whole, be of lower quality than those produced through traditional journal peer review. In exchange for the lack of journal editors, the system provides a far higher level of (public) accountability. Even journal editors cannot track the degree of professionalism scholars exert when peer reviewing. That is possible in a unified (peer-to-)peer review system.
Your conflict of interest prevention system relies entirely on reviewers being completely fair and honest.
No, this is actually not accurate. One suggestion we have to improve this conflict of interest prevention system is to have scholars publicly “declare” that there is no conflict of interest. Some excerpts from page 15:
Furthermore, scholars should be given the opportunity and encouragement to mark both manuscripts and papers for which they are “Proficient” to peer review. If this statement is checked, additional statements are presented, such as the “No conflict of interest” statement and whether the scholars are “Interested”, “Very Interested” or “Not Interested” in peer reviewing the respective manuscripts and papers.
After a manuscript has a certain amount of such “compatibility” statements checked by a number of scholars, a short overview with the titles and abstracts of the respective manuscripts can be added to the real profile pages of these scholars. This allows for a way to validate the accuracy of their claims of proficiency and objectiveness.
So we see this as a method to connect manuscripts with scholars without giving away the identities of the peer reviewers. Which means that…
One of the common complaints about the current system is that reviewers with conflicts deliberately delay, or spike a qualified publication. If those reviewers are so unethical that they’re willing to accept a review request from an editor, despite knowing their conflicts, why do you think they’d recuse themselves in your system?
…our proposed system allows practically anybody to verify (and call attention to) potential conflicts of interest. This allows for a higher degree of accountability and provides the means to stop “repeat offenders”. With the traditional publishing system, this is extremely difficult to achieve, if not impossible. Of course, efficiency is a legitimate concern even if we can enforce standards to minimize the “additional” workload. Again, we certainly don’t deny this is going to take some serious effort to get it as good as with real journal editors, if that is possible at all.
Isn’t landing a big grant or being the first to publish a big result going to be more important to them then scoring higher on an artificial metric?
Absolutely. But I think what you’re forgetting is that we’re dealing with OA preprints here, which is both a strength and a weakness. The “strength” is that “stalling a publication” is actually less meaningful in “our” OA preprint environment than in the current scholarly publishing environment. Granted, “ruining” a review/manuscript and delaying a proper “grading” of the preprint is still going to be detrimental to the authors of those preprints. And a bigger “weakness” is that by limiting ourselves to scrutinizing OA preprints, we can never truly compensate for an important service that journal publishers provide: preventing manuscripts from being publicly accessible until they have been accepted for publication (optionally after revisions), i.e. “closed” peer review, if you will.
Again, with the ability to confirm and track such offenses globally, we can, in theory, discourage such incidents from happening (again) better than the current system can. And I don’t think we should just disregard the potential of the other measures that we have in mind (and have written about) to improve this system. But I can definitely see how optimizing this function is going to take the most time. I’m actually secretly hoping that experts in recommendation systems and encryption can share their thoughts on how best to optimize this process through automation and/or by allowing people to manually verify which papers scholars have tagged “no conflict of interest” without revealing their identities.
But there’s much more to selecting good reviewers than just avoiding conflicts of interest. Your system relies on reviewers accurately portraying their own level of expertise and accurately selecting only papers that they are qualified to review. One of the other big complaints about the current system is when reviewers don’t have the correct understanding of a field or a technique to do a fair review. A skilled editor finds the right reviewers for a paper, not just random people who are in the same field.
Do journal editors, on average, really know enough about the manuscripts and the scholars to judge whether scholars can properly review a manuscript better than the scholars themselves can? I find that quite difficult to believe. And scholars certainly don’t have an incentive to take that risk in our system, because all their peer reviews get evaluated systematically. And depending on how they’ve done their job (with very high or very low scores for peer reviews), their reviews could be partly made public.
When an editor fails to do their job properly, you get unqualified reviewers. In your system, this would be massively multiplied as there’s a seemingly random selection of who would be invited to review once you get into a particular field.
This depends on how effectively the manuscript selection function works. And I don’t expect scholars’ inability to judge whether they can properly review manuscripts to be that big of an issue, to be totally honest. The stories that I’ve heard are usually the other way around: scholars complaining about journal editors repeatedly sending them manuscripts that are way outside of their expertise, simply because the journal editors don’t have anybody else or because it is part of their stressful routine to get as many peer reviews as quickly as they can. The scholars who cave and do the peer reviews do so because #1. it’s not their responsibility if the peer reviews are of low quality, since it’s the journal editors who asked them to do that and, more importantly, #2. they feel they can still contribute to improving the manuscripts, even if it’s not their (main) expertise. And they realize their (weaker) contribution could very well be the only thing these manuscripts will have (before they either get rejected or accepted for publication). And something is better than nothing.
With our system, we’re basically saying: “We’re giving you the option to choose now, ladies and gentlemen. So make sure you get it right, because there’s nobody but yourself to blame if you do a poor job of it. And that doesn’t help you and it doesn’t help the authors that you’re trying to help”. Optionally, we could provide them with the opportunity to simply “screen” (“light” peer review) instead of doing a “real” review if they feel they have something to contribute to the manuscript but are not fully confident that they can do a good job of it.
Your system seems to have a mechanism built in where a reviewer can only reject a limited amount of peer review offers. After that, he must peer review manuscripts to remain part of the system. That puts pressure on reviewers to accept papers where they may not be qualified.
A fair point. To reduce the impact of this issue, scholars can earn the right to refuse peer reviews through activities that require far less time and effort, from “screening” papers to validating the no-conflict statements of authors and other useful but less time-consuming activities. Page 17:
The third measure is a limit system: a mechanism that enforces a limited amount of times a scholar can perform an activity. An example of such an activity is rejecting a batch of peer review offers. While rejecting manuscripts they do not wish to peer review is acceptable, they cannot reject unlimitedly. They have to peer review or screen manuscripts to regain the option again to reject again.
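One way to picture the limit system from that excerpt is as a small quota that is spent by rejecting and refilled by contributing. The sketch below is only an illustration in Python: the class name, the default limit and the credit values per activity are assumptions, since the paper does not fix any of these numbers.

```python
class RejectionQuota:
    """Tracks how many batches of peer review offers a scholar may still reject."""

    # Hypothetical credits: how many rejections each activity earns back.
    CREDITS = {"review": 2, "screening": 1}

    def __init__(self, max_rejections=3):
        self.max_rejections = max_rejections
        self.remaining = max_rejections

    def reject_batch(self):
        """Spend one rejection; return False if the scholar has none left."""
        if self.remaining == 0:
            return False
        self.remaining -= 1
        return True

    def record_contribution(self, kind):
        """Restore rejections after a review or a screening, capped at the limit."""
        self.remaining = min(self.max_rejections,
                             self.remaining + self.CREDITS[kind])


quota = RejectionQuota(max_rejections=2)
print(quota.reject_batch())   # True
print(quota.reject_batch())   # True
print(quota.reject_batch())   # False: the limit has been reached
quota.record_contribution("screening")
print(quota.reject_batch())   # True: screening earned one rejection back
```

Giving a lighter activity such as screening some credit is what keeps the limit from pushing scholars into full reviews they are not qualified for.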
Still, I imagine the solutions should probably be better. It definitely is food for thought.
Expertise is not democratically distributed. You want papers reviewed by the most qualified reviewers possible, not just someone who saw the title and abstract and thought it might be interesting or because they ran out of rejection opportunities allowed by the system.
All fair points, but as I’ve already commented on them earlier in this reply, I’m going to skip this. In fact, I think this is a good place to end part 2 of my defense. Part 3 coming soon…