TriVec: Knowledge Graph Embeddings for Accurate and Efficient Link Prediction in Real World Application Scenarios

Tracking #: 620-1600


Responsible editor: 

Frank van Harmelen

Submission Type: 

Research Paper

Abstract: 

Knowledge graph embedding models are widely used to provide scalable and efficient link prediction for knowledge graphs. They use different techniques to model embedding interactions, and their tensor-factorisation-based versions are known to provide state-of-the-art results. In recent works, developments on factorisation-based knowledge graph embedding models were mostly limited to enhancing the ComplEx and the DistMult models, as they can efficiently provide predictions within linear time and space complexity. The evaluation of these models was also limited to general knowledge benchmarks and did not include applications in specialised domains. In this work, we aim to extend the ComplEx and DistMult models by proposing a new factorisation model, TriVec, which uses three-part embeddings to model a combination of symmetric and asymmetric interactions between embeddings. We perform an empirical evaluation of the TriVec model against other tensor factorisation models under different training configurations (loss functions and regularisation terms), and we show that the TriVec model provides state-of-the-art results in all configurations. In our experiments, we use standard benchmarking datasets (WN18, WN18RR, FB15k, FB15k-237, YAGO10) along with a new NELL-based benchmarking dataset (NELL239) that we have developed. To complement the evaluation of our method on standard, but rather artificial, datasets, we also present a more realistic benchmark based on the real-world problem of predicting the effects of chemical-protein interactions. More specifically, we build a knowledge graph benchmark of chemicals, proteins and the effects of their interactions, and we design an evaluation pipeline that uses knowledge graph embeddings to predict new chemical-protein interactions and their effects. We then show by experimental evaluation that our model provides the best results, in terms of the area under the ROC and precision-recall curves, in predicting the effects of chemical-protein interactions compared to other knowledge graph embedding models.

Keywords: Knowledge Graph Embedding, Link Prediction, Bioinformatics

Tags: 

  • Reviewed

Date of Submission: 

Monday, January 27, 2020

Date of Decision: 

Thursday, April 9, 2020


Decision: 

Reject

Solicited Reviews:


2 Comments

Note from the editor-in-chief

First of all, we apologize for the delay in reaching this decision. The two reviewers raise a number of important points that need to be resolved before this manuscript can be accepted. The authors should also consult the "Extended Versions" section of the Guidelines for Authors (https://datasciencehub.net/content/guidelines-authors) and make sure its conditions are fulfilled.

Tobias Kuhn (http://orcid.org/0000-0002-1267-0234)