Application of Concepts of Neighbours to Knowledge Graph Completion

Tracking #: 633-1613

Authors:

NameORCID
Sébastien FerréORCID logo https://orcid.org/0000-0002-6302-2333


Responsible editor: 

Tobias Kuhn

Submission Type: 

Research Paper

Abstract: 

The open nature of Knowledge Graphs (KG) often implies that they are incomplete. Knowledge graph completion (aka. link prediction) consists in inferring new relationships between the entities of a KG based on existing relationships. Most existing approaches rely on the learning of latent feature vectors for the encoding of entities and relations. In general however, latent features cannot be easily interpreted. Rule-based approaches offer interpretability but a distinct ruleset must be learned for each relation. In both latent- and rule-based approaches, the training phase has to be run again when the KG is updated. We propose a new approach that does not need a training phase, and that can provide interpretable explanations for each inference. It relies on the computation of Concepts of Nearest Neighbours (C-NN) to identify clusters of similar entities based on common graph patterns. Different rules are then derived from those graph patterns, and combined to predict new relationships. We evaluate our approach on standard benchmarks for link prediction, where it gets competitive performance compared to existing approaches.

Manuscript: 

Supplementary Files (optional): 

Previous Version: 

Tags: 

  • Reviewed

Data repository URLs: 

Date of Submission: 

Monday, May 4, 2020

Date of Decision: 

Wednesday, June 3, 2020


Nanopublication URLs:

Decision: 

Accept

Solicited Reviews:


1 Comment

Meta-Review by Editor

The reviewers agree that the paper has much improved in terms of its presentation and structure. Some considerable issues remain, however, with respect to how scalability and explainability are discussed, as pointed out by reviewer 2.

This acceptance is therefore conditional on the following issues being addressed:

- The issue of scalability needs to be clarified and quantified, as pointed out by reviewer 2. This doesn't need to be a full-blown study, but at least some performance numbers on larger KGs should be shown in order for the reader to get an impression of at which orders of magnitude, if any, the approach stops working efficiently in practice.

- The explainability of the approach needs to be better discussed. I agree with reviewer 2 that a user study would be preferable, but I don't consider this a mandatory addition for acceptance. However, this issue should at least be better underscored with convincing examples and it should be made clear that the "explainability in principle" that this approach provides does not necessarily translate into "practical explainability" that reaches end users. The sub-section that was supposed to address this issue (6.3) seems to be empty (see also below).

- The issues with rules 3 and 4 from the comparison with AnyBURL, as reported by reviewer 2, need to be addressed.

Moreover and in addition to the more minor points mentioned by the reviewers, some layout and language problems remain:

- There are a number of empty elements that seem to point to formatting problems: Definition 3; 4. Overview of the Approach; 6.3. Inference Algorithm and Explanations

- The language and style of the paper should be checked and improved further. For example, what follows after "The main question we answer here is:" is grammatically not a question and therefore hard to read. As another example, a (null) hypothesis is normally said to be "rejected" and not "contradicted".

Tobias Kuhn (http://orcid.org/0000-0002-1267-0234)