PIDs, please play FAIR and identify yourselves!

Tracking #: 558-1538

Authors:

Joakim Philipson (ORCID: https://orcid.org/0000-0001-5699-994X)


Responsible editor: 

Alejandra Gonzalez-Beltran

Submission Type: 

Position Paper

Abstract: 

This is an extended, revised version of [37]. It explores the findability and interoperability of some PIDs and their compliance with the FAIR data principles; ARKs are added in this version. It is suggested that wide distribution and findability on the internet (e.g. by simple 'googling') may be as important for the usefulness of PIDs as the resolvability of PID URIs. This version also includes new reasoning about the failure to use PIDs such as DOIs for citation. The prevalence of phenomena such as link rot implies that the persistence of URIs cannot be trusted. By contrast, the well-distributed but seldom directly resolvable ISBN identifier has proved remarkably resilient, with far-reaching persistence, inherent structural meaning and good validatability by means of a fixed string length, pattern recognition, a restricted character set and a check digit. Examples of regular expressions used for validation of PIDs are supplied or referenced. The suggestion to add context and meaning to PIDs, making them "identify themselves" through namespace prefixes and object types, is elaborated further in this version. Meaning can also be conferred by structural elements, such as well-defined, restricted string patterns, which at the same time make PIDs more "validatable". This version concludes with a generic, refined model for a PID with these properties, in which namespaces are instrumental as custodians, meaning-givers and validation schema providers. A draft example of a Schematron schema for validation of "new" PIDs in accordance with the proposed model is provided.
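
As an illustration of the validatability properties the abstract attributes to the ISBN (fixed string length, pattern, restricted character set, check digit), the following is a minimal sketch for ISBN-13. It is not taken from the manuscript: the function names and the regular expression are assumptions chosen for this example, and the sample value is a commonly cited valid ISBN-13.

```python
import re

# Assumed pattern for illustration: exactly 13 ASCII digits with a 978/979 prefix.
# This covers fixed length, restricted character set and pattern in one step.
ISBN13_PATTERN = re.compile(r"97[89][0-9]{10}")


def isbn13_check_digit(first12: str) -> int:
    """Compute the ISBN-13 check digit from the first 12 digits.

    Digits are weighted alternately by 1 and 3, starting with 1.
    """
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return (10 - total % 10) % 10


def is_valid_isbn13(candidate: str) -> bool:
    """Return True if the candidate passes the pattern and check-digit tests."""
    # Hyphens and spaces are presentation only; drop them before validating.
    compact = candidate.replace("-", "").replace(" ", "")
    if not ISBN13_PATTERN.fullmatch(compact):
        return False
    return isbn13_check_digit(compact[:12]) == int(compact[12])


if __name__ == "__main__":
    print(is_valid_isbn13("978-0-306-40615-7"))
    print(is_valid_isbn13("978-0-306-40615-8"))
```

The check is purely syntactic: it shows that a string is a well-formed ISBN-13 without any network lookup, but says nothing about whether that ISBN has actually been assigned.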

Tags: 

  • Under Review

Data repository URLs: 

None

Date of Submission: 

Thursday, February 28, 2019