Title: The Knowledge Graph as the Default Data Model for Machine Learning on Heterogeneous Knowledge
Authors: Xander Wilcke (*), Peter Bloem, and Victor de Boer
Journal: Data Science
Year: 2017
Type: Position paper
Version: Final v3
Concerns: cover letter addressing the points raised by the three reviewers (2nd review round).
================================================
We would like to thank the three reviewers for their careful and useful reviews. Based on the points raised by each of
the reviewers, we have updated our paper. For each point, we first list the main issue (>>>ISSUE:), followed by our
response and adaptations.
================================================
>>>ISSUE:
Editor: "Remove the word 'default' from the title"
Reviewer 1: "The title might have been too optimistic. [...] to claim that it should be the default data model is perhaps too strong a statement."
Reviewer 2: "The title is too broad and does not accurately reflect the position in the paper."
We agree with the concerns about the original title. However, a title without the word 'default' would, we feel, reduce the title to a vacuous truth: it will be no surprise to anyone that one can perform machine learning on knowledge graphs. We specifically argue for a _broader_ use of knowledge graphs in machine learning, and the title should reflect that. Our solution is as follows:
- We have included "heterogeneous knowledge" in the title to make it more specific to our message.
- We have included a paragraph in the introduction that defines what we mean by _default_: not a solution for all use cases, but a good first choice.
We realize that we are explicitly ignoring the editor's request in this draft. If this title is not acceptable, we suggest the following instead:
"On the Knowledge Graph as the Data Model for Learning on Heterogeneous Knowledge"
------------------------------------------------
>>>ISSUE:
Editor: "Extend the discussion to further address contextual limitations of knowledge graphs."
Reviewer 3: "[...] more discussion on when to use a knowledge graph and when not to [...]"
Our position is that for the majority of use cases, knowledge graphs can be used, with two important caveats:
- The granularity with which knowledge is encoded into a graph must be carefully chosen.
- The benefits of knowledge graphs only become apparent when dealing with heterogeneous knowledge. For homogeneous knowledge there may be no performance benefits, but there are no disadvantages either.
We have adapted the introduction and section 3.4 to highlight these caveats.
------------------------------------------------
>>>ISSUE:
Reviewer 3: "[...] more thorough discussion of the limitations w.r.t. the challenges. [...] distinguish what current approaches [...] are already capable of doing, what they might be extended to, and what might be the hard challenges for which no straightforward solution exists.[...]"
We have added a subsection (5.4) with a brief description of how current methods address the challenges described in section 4. This section also provides some indication of which challenges are simple to overcome and which are more fundamental.
------------------------------------------------
>>>ISSUE:
Reviewer 3: "[...] I have my doubts that simply using a deep neural net will solve the issues. For example, for data with different modeling paradigms [...]. Without a significant overlap of pairs of instances that use *both* properties simultaneously (also indirectly by interlinking instances in both datasets), it will be difficult to learn that they refer to the same property.[...]"
We agree that some challenges, specifically the issue of differently modeled knowledge, are deep problems, and may in some cases be insurmountable. We have made this more explicit in section 4.4 and referenced active learning as a potential middle ground between integrating data by hand and learning the integration with end-to-end models where necessary. Crucially, such solutions are still helped by a simple low-level integration of data from different sources, which the knowledge graph model provides.
================================================