Reviewer has chosen not to be Anonymous
Overall Impression: Accept
Technical Quality of the paper: Clear novelty
Data availability: All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript: The length of this manuscript is about right
Summary of paper in a few sentences:
This article presents Hobbit, a platform for benchmarking systems that solve various tasks related to Linked Data, in both local and distributed settings. The platform architecture was defined based on a set of requirements gathered with the help of numerous experts, as well as the FAIR principles, such that the platform supports FAIR data processing (in some respects requiring the benchmarked systems themselves to adhere to FAIR principles). The paper clearly explains the reasoning behind certain design choices (such as decoupling into containers for platform stability). The platform's functionality is evaluated on two LD tasks; moreover, it has demonstrated real-world usefulness through its use in several challenges.
Reasons to accept:
A1 Sufficient novelty to previous publications relating to the platform.
A2 The platform and its source code are openly available to test and view.
A3 The authors define an ontology that both allows for clear functionality on the platform and ensures FAIR principles.
A4 The motivation behind the components of the platform, as well as the workflow of using it as a resource for further research, are clearly and understandably described in this article.
A5 The authors evaluate their platform on two common LD tasks with different configurations, showing that their platform works as expected.
A6 The quality and functionality of the platform have been established through use in numerous real-world challenges, and it already has a large number of registered users.
Reasons to reject:
R1 The comparison with related work, i.e., the existing frameworks on pp. 2-3, is somewhat weak, and Table 1 could be explained in more detail.
R2 It is unclear where the limitations lie, i.e., what cannot be done on the platform.
R3 As stated in section 6, the platform controller being a single node creates a bottleneck under realistic loads.
- In Table 1, why are only 4 frameworks compared when the text mentions others? I understand that some, such as Peel, are more difficult to compare given their specific limitations, yet I think it would be nice to mention them anyway. Also, the inclusion of "Manual Revision" at this stage seems odd, given that it is not supported by any of the compared frameworks.
- U5 is not addressed explicitly in section 4 but I believe it to be implicitly fulfilled.
- Section 4.3.2: are Data generators responsible for synthetic data sets? How are real data sets loaded into the benchmarking system? I suppose this is trivial, but it would be nice to point out.
- The direction of messages in Figure 4 is nearly impossible to recognize; please enlarge the arrowheads slightly.
- With respect to the conclusion, what further extensions are intended in the future?
Typos and minor errors spotted:
- p2 l7: missing closing parenthesis
- p2 l8: "can be installed deployed locally"
- p3 Table 1: capitalize FAIR
- p4 l5: I suppose Ux should be U?
- p4 l34: was build -> built
- p7 l7: "implemented a system" unclear
- p7 l9 and l20: benchmark's, l12 and p10 l8: system's, l23: experiment's, p8 l38: browser's
- p9 l22: two triple stores
- p9 l28: R1.1
- p9 l40: cannot
- p11 l20: "computation the KPIs"
- p13 l42: "the average length... has a length of..."
- p14 l22: than on the single
- p15 l27: per document
Meta-Review by Editor
Submitted by Tobias Kuhn on
The reviewers agree that this is a solid resource contribution and that the paper itself shows how evaluations can also be made fair.
Paul Groth (https://orcid.org/0000-0003-0183-6910)