Application of Concepts of Neighbours to Knowledge Graph Completion

Tracking #: 633-1613

Authors:

	Name	ORCID
	Sébastien Ferré	https://orcid.org/0000-0002-6302-2333

Responsible editor:

Tobias Kuhn

Submission Type:

Research Paper

Abstract:

The open nature of Knowledge Graphs (KG) often implies that they are incomplete. Knowledge graph completion (a.k.a. link prediction) consists in inferring new relationships between the entities of a KG based on existing relationships. Most existing approaches rely on the learning of latent feature vectors for the encoding of entities and relations. In general, however, latent features cannot be easily interpreted. Rule-based approaches offer interpretability, but a distinct ruleset must be learned for each relation. In both latent- and rule-based approaches, the training phase has to be run again when the KG is updated. We propose a new approach that does not need a training phase, and that can provide interpretable explanations for each inference. It relies on the computation of Concepts of Nearest Neighbours (C-NN) to identify clusters of similar entities based on common graph patterns. Different rules are then derived from those graph patterns, and combined to predict new relationships. We evaluate our approach on standard benchmarks for link prediction, where it achieves performance competitive with existing approaches.

Manuscript:

ds-paper-633.pdf

Supplementary Files (optional):

ds-supplementary-633-984.pdf

ds-supplementary-633-985.txt

Previous Version:

Application of Concepts of Neighbours to Knowledge Graph Completion

Data repository URLs:

http://www.irisa.fr/LIS/ferre/pub/link_prediction2020/

Date of Submission:

Monday, May 4, 2020

Date of Decision:

Wednesday, June 3, 2020

Nanopublication URLs:

Decision:

Solicited Reviews:

Review #1 submitted on 12/May/2020

Review Details

Reviewer has chosen to be Anonymous

Overall Impression: Good
Suggested Decision: Accept
Technical Quality of the paper: Good
Presentation: Good
Reviewer's confidence: High
Significance: Moderate significance
Background: Reasonable
Novelty: Limited novelty
Data availability: All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript: The length of this manuscript is about right

Summary of paper in a few sentences (summary of changes and improvements for second round reviews):

The authors propose a new approach for knowledge graph completion (mainly, link prediction) that does not need a training phase, and that can provide interpretable explanations for each inference. It relies on the computation of Concepts of Nearest Neighbours (C-NN) to identify clusters of similar entities based on common graph patterns. A key advantage of the method seems to be its interpretability.

Reasons to accept:

The paper is fairly well written, and the authors have addressed the concerns the reviewers raised in the previous round. Previous reasons for accepting still hold.

Reasons to reject:

Considering that many knowledge graph completion methods have already been presented in the literature, the article would be published in a relatively saturated field. The interpretability of the approach would be an advantage, but I surmise it will still have limited impact. However, this is an important area of research and I believe that the paper deserves to be published.

Nanopublication comments:

Further comments:

Review #2 submitted on 29/May/2020

By Heiko Paulheim

https://orcid.org/0000-0003-4386-8195

Review Details

Reviewer has chosen not to be Anonymous

Overall Impression: Average
Suggested Decision: Reject
Technical Quality of the paper: Weak
Presentation: Average
Reviewer's confidence: Medium
Significance: Moderate significance
Background: Reasonable
Novelty: Limited novelty
Data availability: All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript: The length of this manuscript is about right

Summary of paper in a few sentences (summary of changes and improvements for second round reviews):

The author presents an approach for knowledge graph completion, called "concepts of nearest neighbors". It is an anytime learning approach which learns patterns of neighbors and uses those for knowledge graph completion.

The paper is an extended version of an ESWC 2019 paper, as acknowledged by the author. Compared to that paper, there are no novel ideas, but a more extensive evaluation on more datasets and baselines, as well as a more detailed investigation of the impact of various parameters.

Reasons to accept:

In their revision, the writing style has been improved significantly. The paper now is much easier to follow.

Reasons to reject:

I am not very happy with the approach by which some of the criticism was addressed.

Regarding scalability, I would have expected a more thorough study. There are several ways of doing this, e.g., by systematically creating smaller samples of the graph, by using artificial benchmark graphs, or by carrying out a demonstration on a really large KG, e.g., DBpedia or Wikidata. None of this has been done by the authors. I suggest that for a publication in a journal, a proper scalability study is undertaken.
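The first option mentioned above, systematically creating smaller samples of the graph, could be set up roughly as follows. This is only a minimal sketch: the function names `sample_triples` and `time_on_samples`, the entity-based sampling scheme, and the `run` callable (a stand-in for whatever inference procedure is being measured) are all assumptions made for illustration, not part of the paper or of any existing tool.

```python
import random
import time

def sample_triples(triples, fraction, seed=0):
    """Keep a random fraction of the entities and the triples among them."""
    rng = random.Random(seed)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    keep = set(rng.sample(entities, max(1, int(len(entities) * fraction))))
    # A triple survives only if both of its endpoints were kept.
    return [(h, r, t) for h, r, t in triples if h in keep and t in keep]

def time_on_samples(triples, run, fractions=(0.25, 0.5, 1.0)):
    """Time the hypothetical `run` callable on samples of growing size."""
    results = []
    for f in fractions:
        sub = sample_triples(triples, f)
        start = time.perf_counter()
        run(sub)
        results.append((f, len(sub), time.perf_counter() - start))
    return results

# Toy chain graph; `len` stands in for the procedure under test.
toy_triples = [(f"e{i}", "r", f"e{i + 1}") for i in range(20)]
timings = time_on_samples(toy_triples, len)
```

Plotting the recorded times against the sample sizes would give a first impression of how runtime grows with the size of the KG.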

For the explanations, I would have liked to see some examples, rather than a statement like "the top rules are used". A small user study would be a good way to justify the claim that explainable inference is actually produced.

Looking at the four rules used in the comparison to AnyBURL, I was also wondering how valid they are. While the first and second seem valid, the third looks odd (actually, with a support of 1, this looks like overfitting - the corresponding rule would be something like "all story producers of Superman II live in Cleveland"?), while the fourth is just capturing some general pattern of movies (most movies have a make-up artist and a special effects supervisor, which, however, has little to do with the fact that they win an award for original screenplay - in fact, make-up and special effects seem to be rather orthogonal to screenplay). This is in fact another reason why a user study would be very important in this case, to show that the rules are also meaningful explanations for humans, not just capturing some patterns in the graph which can be used for link prediction, but are not useful otherwise.
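To make the concern about a support of 1 concrete, the sketch below computes support and confidence for a simple rule over a toy set of triples. It assumes the usual rule-mining reading of these measures (support: entities matching both body and head; confidence: support divided by the number of entities matching the body). The rule representation, the function name `rule_stats`, and the toy film data are hypothetical and are not taken from the paper or from AnyBURL.

```python
def rule_stats(triples, body, head):
    """Return (support, confidence) of the rule `body => head`.

    body: set of (relation, tail) pairs an entity must have.
    head: a single (relation, tail) pair predicted for that entity.
    """
    desc = {}
    for h, r, t in triples:
        desc.setdefault(h, set()).add((r, t))
    body_matches = [e for e, d in desc.items() if body <= d]
    support = sum(1 for e in body_matches if head in desc[e])
    confidence = support / len(body_matches) if body_matches else 0.0
    return support, confidence

# Hypothetical toy data, for illustration only.
films = [
    ("f1", "genre", "drama"), ("f1", "award", "screenplay"),
    ("f2", "genre", "drama"), ("f2", "award", "screenplay"),
    ("f3", "genre", "drama"),
    ("f4", "producer", "p1"), ("f4", "award", "screenplay"),
]
general = rule_stats(films, {("genre", "drama")}, ("award", "screenplay"))
specific = rule_stats(films, {("producer", "p1")}, ("award", "screenplay"))
```

In this toy data the very specific rule reaches a confidence of 1.0 with a support of only 1, which is the kind of pattern that may indicate overfitting rather than a meaningful explanation.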

Nanopublication comments:

Further comments:

Review #3 submitted on 29/May/2020

Review Details

Reviewer has chosen to be Anonymous

Overall Impression: Average
Suggested Decision: Undecided
Technical Quality of the paper: Average
Presentation: Average
Reviewer's confidence: Medium
Significance: Moderate significance
Background: Reasonable
Novelty: Limited novelty
Data availability: All used and produced data (if any) are FAIR and openly available in established data repositories
Length of the manuscript: The length of this manuscript is about right

Summary of paper in a few sentences (summary of changes and improvements for second round reviews):

This paper presents a novel symbolic approach for Knowledge Graph Completion, which is both reasonable in theory and competitive in experiments. The advantages of this approach can be summarized as follows: 1. Easy to interpret, which is common for all symbolic approaches. 2. No training is needed for evaluation, which can be applied to dynamic knowledge graph or newly added triples.

The author proposes to use two rules to predict the tail entities for the head entities and relations. The first one can be understood as “a tail entity tends to have a relation with a head entity if it has the relation with other similar head entities.” The second one can be interpreted as “a tail entity tends to have a relation with a head entity, if another tail entity has the same relation with another head entity, and the tail/head entities appear in the same position in the graph patterns.” These assumptions are generally reasonable. Although the computation of similar entities adopts an existing approach (C-NN), the proposed approach is novel.
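The intuition behind the first rule can be illustrated with a deliberately simplified sketch. It is not the paper's method: similarity here is just the number of shared (relation, tail) pairs, a crude stand-in for the graph patterns that C-NN computes, and the function name `predict_tails`, the parameter `k`, and the toy data are assumptions made for illustration.

```python
from collections import Counter

def predict_tails(triples, head, relation, k=2):
    """Rank candidate tails for (head, relation, ?) by votes from similar heads."""
    # Describe each entity by the set of (relation, tail) pairs it has.
    desc = {}
    for h, r, t in triples:
        desc.setdefault(h, set()).add((r, t))
    target = desc.get(head, set())
    # Similarity = number of shared pairs (a stand-in for C-NN similarity).
    sims = {e: len(target & d) for e, d in desc.items() if e != head}
    neighbours = sorted((e for e in sims if sims[e] > 0),
                        key=lambda e: (-sims[e], e))[:k]
    votes = Counter()
    for e in neighbours:
        for r, t in desc[e]:
            # Only vote for tails the head does not already have.
            if r == relation and (r, t) not in target:
                votes[t] += sims[e]
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical toy data, for illustration only.
people = [
    ("alice", "livesIn", "paris"), ("alice", "worksFor", "acme"),
    ("alice", "speaks", "french"),
    ("bob", "livesIn", "paris"), ("bob", "worksFor", "acme"),
    ("carol", "livesIn", "berlin"), ("carol", "speaks", "german"),
]
ranking = predict_tails(people, "bob", "speaks")
```

Here only alice shares properties with bob, so her language is the single candidate proposed for bob, weighted by the number of properties they share.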

The method is properly evaluated on benchmark datasets. The author conducts detailed experiments of the proposed approach, especially in the analysis of the positive and negative predictions.

A concern is that since the rules are not learned from the training set but human-designed, it may introduce bias or false deductions to the testing. And the concrete advantages in terms of improving the prediction performance over other rule-learning approaches are not explicitly pointed out. However, it’s also interesting to see the two human-designed rules compare favorably to other learned rules with the help of Concepts of Nearest Neighbors.

In general, the paper is interesting in theory and competitive in experiments.

Reasons to accept:

1. This paper provides a novel and reasonable approach for Knowledge Graph Completion.

2. The paper is presented in a rigorous and clear way, which makes it appeal to a broader audience.

3. The experiments are conducted using benchmark datasets, which allows fair comparisons.

Reasons to reject:

1. Though reasonable in most cases, the rules used are not guaranteed to be true, which is shown by the negative predictions in the experiments.

2. The advantages in terms of improving the performance of predictions over other symbolic approaches are not detailed.

Nanopublication comments:

Further comments:

1. Section 4 (Overview of the Approach) seems missing.

2. The writing can be more formal to make it clearer. For example, the last three lines of page 7 can be improved.

3. It’s better to give a visualization (graph) of the running example.

RESPONSE TO REVIEWERS

I kindly thank the reviewers for their careful reading, and their
valuable suggestions for improvement.

The parts of the paper that have been added or significantly changed
are shown in bold in the supplementary file of the new manuscript.

Below, I answer each point raised by the reviewers by describing
the changes that have been made.

# Approach name

> review 1
> I am not sure it is wise to name the approach CNN, since the acronym
> has become synonymous with Convolutional Neural Networks. It may be
> difficult for people to find this paper, especially if all they
> remember is that it is called CNN.

True. Given that this name has already been used in former
publications, it is difficult to make a radical change. We replaced
CNN by C-NN, which has a nice analogy with k-NN for the classical
nearest neighbour approach.

# The big picture

> review 2
> Overall, the paper is written in a very confusing style. Although a
> running example is used, it is often hard to grasp the idea and the
> example per se. What would help is a big picture explaining how the
> different pieces (queries, answers, extensions, intensions, neighbors,
> inference, ...) fit together and how they are combined in order to
> produce a prediction. For the actual prediction part, I miss a
> pseudocode explaining how the approach actually computes an inference.

Thank you for the suggestion of "a big picture"; we agree that this
was missing. We added a short section "Overview of the Approach",
before the technical sections, supported by a new figure showing the
whole workflow of the approach in a schematic way. The new section
summarizes the whole approach along with a small running example, and
with references to the relevant sections. We hope that it will
facilitate the reading of the paper.

A pseudocode algorithm (Algorithm 2) was added to explain the
inference and ranking process.

# Formal terminology

> review 1
> I think the terminology and use of definitions and symbols can be
> simplified, and I encourage you to do so to make your theory more
> accessible.

We strove to simplify and clarify the technical sections. In
particular, in Section "Concepts of Nearest Neighbours", we added
before each formal definition a short paragraph to provide context and
intuition about what is defined and why.

We also significantly simplified the very definition of C-NNs
(Definition 3). Table 1 has been changed accordingly.

# Scalability

> review 1
> I think the paper makes a valid and different contribution to the
> field; hence, it should be accepted, even though it's not without some
> limitations, mainly that I am not confident that it can be applied to
> large-scale knowledge graphs due to its reliance on
> anytime computation. In contrast, embeddings are only linear in the
> total numbers of relations and entities.

> review 2
> It would be good to address the scalability limitations in the
> beginning of the paper. It seems to me that storing the KG in memory
> is important for the practical merits of this approach. Either way,
> this needs to be mentioned somewhere in the introduction.

We added the following paragraph in the introduction:

«The combinatorial aspect of graph patterns over knowledge graphs is
tackled by delaying their computation to inference time, which makes it
possible to bound the number of C-NNs by the number of entities; in
practice, it is generally much smaller. The intensive knowledge-based
computation at inference time is managed with in-memory KG indices (as
is common in RDF stores), and an any-time algorithm for both C-NN
computation and inference.»

And the following paragraph at the end of the section on C-NNs computation:

«A last point to discuss is the scalability of C-NNs
computation w.r.t. the KG size, i.e. what is the impact of doubling
the number of entities. If we make the reasonable assumption that
the new knowledge follows the same schema as the old knowledge
(e.g., adding film descriptions to a film KG), then the impacts are
that: a) the query answers are two times larger, and b) the number
of C-NNs is somewhere between the same and the double, depending on
the novelty of the new knowledge. As a consequence, if our objective
is to compute a constant number of C-NNs for further inference, then
it suffices to increase timeout linearly with KG size.»

# Explainability

> review 2
> One major claim of the paper is that the inference is
> explainable. However, I find that claim problematic, in particular
> since the paper does not show examples of how such an explanation
> would be presented to a user. Just because the patterns learned are
> explicit, it does not mean that the inference itself is
> explainable. In fact, as the paper claims, many rules are combined
> into a final inference. Which one is then chosen as an explanation? Is
> there a difference whether the inference is created by one strong or
> by 100 weak rules?

This is discussed along with the description of Algorithm 2, in the
new subsection 6.3.

1 Comment

Meta-Review by Editor

Submitted by Tobias Kuhn on Wed, 06/03/2020 - 08:20

The reviewers agree that the paper has much improved in terms of its presentation and structure. Some considerable issues remain, however, with respect to how scalability and explainability are discussed, as pointed out by reviewer 2.

This acceptance is therefore conditional on the following issues being addressed:

- The issue of scalability needs to be clarified and quantified, as pointed out by reviewer 2. This doesn't need to be a full-blown study, but at least some performance numbers on larger KGs should be shown in order for the reader to get an impression of at which orders of magnitude, if any, the approach stops working efficiently in practice.

- The explainability of the approach needs to be better discussed. I agree with reviewer 2 that a user study would be preferable, but I don't consider this a mandatory addition for acceptance. However, this issue should at least be better underscored with convincing examples and it should be made clear that the "explainability in principle" that this approach provides does not necessarily translate into "practical explainability" that reaches end users. The sub-section that was supposed to address this issue (6.3) seems to be empty (see also below).

- The issues with rules 3 and 4 from the comparison with AnyBURL, as reported by reviewer 2, need to be addressed.

Moreover and in addition to the more minor points mentioned by the reviewers, some layout and language problems remain:

- There are a number of empty elements that seem to point to formatting problems: Definition 3; 4. Overview of the Approach; 6.3. Inference Algorithm and Explanations

- The language and style of the paper should be checked and improved further. For example, what follows after "The main question we answer here is:" is grammatically not a question and therefore hard to read. As another example, a (null) hypothesis is normally said to be "rejected" and not "contradicted".

Tobias Kuhn (http://orcid.org/0000-0002-1267-0234)

Data Science
