Reviewer has chosen not to be Anonymous
Overall Impression:
Accept
Technical Quality of the paper:
Limited novelty
Data availability:
All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript:
This manuscript is too long for what it presents and should therefore be considerably shortened (below the general length limit)
Summary of paper in a few sentences (summary of changes and improvements for second round reviews):
The paper reviews existing persistent identifiers, their use and usefulness in the context of the FAIR principles. Based on those findings, it presents a new format that seeks to offer the features of the most successful systems.
Reasons to accept:
Overall, as my answers to the specific questions show, I'd say this is a weak accept. There's some good material in the paper. For example, the text includes comparisons of the longevity of different PIDs and the accuracy with which they lead to the identified thing. That's useful information, although the paper would benefit from presenting those details as tables, charts, etc.
Reasons to reject:
The paper would benefit from better layout, diagrams, etc., and none of the references appear in the text due to poor HTML.
More substantive, I'd suggest, is that there's a big dollop of open access/paywall discussion that, for this paper, is irrelevant. In my opinion, this should be removed. The paper is about the discoverability, resolvability and persistence of IDs - stick to that topic and don't go into a discussion of paywalls/access to research etc.
I'd also suggest a switch around in the order of presentation. The suggested new PID format comes at the end rather than at the beginning: "this paper proposes a new PID structure that addresses a series of issues identified with existing structures" or some such.
Due to my current work at GS1, I found the comment on ISBNs very interesting. They are indeed persistent and pre-date the Web by some decades. Most importantly - they are used throughout the industry that runs the system (the publishing, supply chain and retail industries). The paper refers several times to the - correct - notion that persistence comes from usage rather than design. The check digit is included so that if the point of sale scan fails, the number can be entered into the till manually and there's a good degree of checking that it was entered correctly.
However, sadly, it is not true that ISBNs provide a 1-1 mapping. ISBNs are just a special case of the even more widely used UPC/EAN numbering system used on all manner of goods. At the end of the day, it's just a number - and such numbers are cloned/re-used around the world. It's being addressed, sure, and it's not a massive problem, but it's not as perfect as the paper suggests.
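To make the check digit point concrete, here is a minimal sketch of the modulo-10 check that ISBN-13 shares with EAN-13 (weights alternating 1 and 3 from the left). The function names `ean13_check_digit` and `is_valid_isbn13` are mine, chosen for illustration, and the sample number is a commonly quoted example ISBN rather than anything taken from the paper.

```python
def ean13_check_digit(first12: str) -> int:
    """Return the modulo-10 check digit for the first 12 digits of an EAN-13 / ISBN-13."""
    if len(first12) != 12 or not (first12.isascii() and first12.isdigit()):
        raise ValueError("expected exactly 12 ASCII digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(code: str) -> bool:
    """Check that a manually keyed ISBN-13 has a consistent check digit."""
    digits = code.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    return ean13_check_digit(digits[:12]) == int(digits[12])


if __name__ == "__main__":
    print(is_valid_isbn13("978-0-306-40615-7"))  # number keyed correctly
    print(is_valid_isbn13("978-0-306-40515-7"))  # one digit mistyped at the till
```

A passing check only shows that the number was keyed consistently. It says nothing about whether the same number has been cloned or re-used elsewhere, which is why the check digit does not rescue the 1-1 mapping claim.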
One thing - and this I must admit is a personal hobby horse - persistence is a matter of policy, not technical design. Link rot is a real problem because people allow it to happen, not because of an innate property of the Web. I'd have loved to have seen that point included in the paper.