## Learning interpretable entity embeddings

Entity embeddings are vector space representations of objects (e.g. people, movies, places) and concept names. In our work, we learn such embeddings from a combination of textual descriptions (extracted from Wikipedia) and structured knowledge (obtained from WikiData). Our main focus is on learning embeddings that can be viewed as approximations of conceptual space representations. To this end, we have focused on learning embeddings in which (i) semantic types are represented as low-dimensional subspaces of the full embedding and (ii) salient features correspond to directions in the space. The resulting representations allow us to describe or compare entities in terms of interpretable features, to find entities that satisfy given criteria, and to implement cognitive models of categorisation, among other applications. Intended applications include entity retrieval, knowledge base completion, and zero-shot learning.
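
The two ingredients above can be illustrated with a minimal sketch. This is not our actual learning method; it only assumes that, given pre-trained embeddings, a type subspace can be approximated from the instances of that type (here via SVD) and that a feature direction allows entities to be ranked. All vectors and names below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 10-dimensional embeddings for 50 entities of one semantic
# type (e.g. "film"); real embeddings would be learned from Wikipedia
# descriptions and WikiData facts.
film_vectors = rng.normal(size=(50, 10))

def type_subspace(vectors, k):
    """Approximate a semantic type as the k-dimensional subspace spanned
    by the top principal directions of its instances."""
    centred = vectors - vectors.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return vt[:k]                      # (k, dim) orthonormal basis

def project(entity, basis):
    """Coordinates of an entity within the type subspace."""
    return basis @ entity

basis = type_subspace(film_vectors, k=3)
coords = project(film_vectors[0], basis)   # low-dimensional representation

# A salient feature (e.g. "scary") corresponds to a direction; entities
# can then be ranked by their scalar projection onto that direction.
feature_direction = basis[0]
scores = film_vectors @ feature_direction
ranking = np.argsort(-scores)
```

Describing entities in terms of interpretable features then amounts to reading off their projections onto such learned directions.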

## Learning word region embeddings

Word embedding models such as SkipGram and GloVe represent each word as a vector. This is not always satisfactory, for two reasons. First, for rare words, the available information typically does not allow us to choose a specific vector in a principled way. Second, standard vector representations do not allow us to model how broadly the meaning of a word should be interpreted, making such representations less suitable for modelling properties of objects or for identifying hypernym relations, among others. Both issues can be addressed by representing words as regions or densities. In the first case, we intuitively obtain a probability distribution over standard vector-based embeddings, whereas in the second case the regions/densities model the word meanings themselves. Surprisingly, even on standard word embedding benchmark tasks, the resulting models can outperform state-of-the-art word embedding methods such as SkipGram and GloVe.
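
As a toy illustration of the density view, the sketch below represents each word as a diagonal Gaussian, where the variance reflects how broadly the meaning should be interpreted. The parameter values are made up rather than learned; the point is only that the asymmetry of the KL divergence between such densities can signal hypernymy, which plain vectors cannot express.

```python
import numpy as np

# Illustrative Gaussian word representations: (mean, diagonal variance).
# The broad term "animal" gets a large variance, the specific "dog" a
# small one; in practice these parameters would be learned from corpora.
words = {
    "animal": (np.zeros(4), np.full(4, 2.0)),
    "dog":    (np.array([0.5, 0.1, -0.2, 0.3]), np.full(4, 0.5)),
}

def kl_gaussian(p, q):
    """KL(p || q) for diagonal Gaussians given as (mean, variance)."""
    mu_p, var_p = p
    mu_q, var_q = q
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

# The specific word "dog" is cheap to encode under the broad "animal"
# density, but not the other way around: this asymmetry suggests a
# hypernym direction.
kl_dog_animal = kl_gaussian(words["dog"], words["animal"])
kl_animal_dog = kl_gaussian(words["animal"], words["dog"])
print(kl_dog_animal < kl_animal_dog)  # -> True
```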

## Induction of description logic concepts

Each semantic type from WikiData corresponds to a subspace in our entity embeddings, and these subspaces can be viewed as approximations of conceptual spaces in the sense of Gärdenfors. Based on this view of entity embeddings as conceptual spaces, we have proposed a model that learns concept representations in a cognitively plausible way. In particular, for each concept, we learn a (scaled) Gaussian distribution over the relevant subspace of the embedding, where we can view the mean of that distribution as a prototype for the concept, while the covariance matrix intuitively reflects the relative importance of each of the underlying features. The resulting approach can be seen as a generalization of a number of commonsense reasoning strategies. In particular, if only one example of the concept is available, it behaves like similarity-based reasoning; if two examples are available, it behaves like interpolative reasoning. However, in contrast to these commonsense reasoning methods, when more than two instances are given it is better able to exploit their distribution.
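
The prototype-plus-covariance idea can be sketched as follows. This is a simplified stand-in for our model (a diagonal Gaussian fitted by maximum likelihood, with an assumed default variance for the single-example case), using made-up two-dimensional coordinates rather than learned subspace representations.

```python
import numpy as np

def concept_model(examples, default_var=1.0):
    """Fit a diagonal-Gaussian concept representation: the mean acts as
    a prototype, the per-dimension variance reflects how much each
    feature may vary. With a single example the variance falls back to
    a default, so scoring reduces to similarity to that one prototype."""
    examples = np.asarray(examples, dtype=float)
    mean = examples.mean(axis=0)
    if len(examples) >= 2:
        var = examples.var(axis=0) + 1e-6   # avoid zero variance
    else:
        var = np.full(examples.shape[1], default_var)
    return mean, var

def membership(entity, model):
    """Log-density of an entity under the concept's Gaussian."""
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (entity - mean) ** 2 / var)

# Instances vary a lot in dimension 0 (an unimportant feature) but very
# little in dimension 1 (important for the concept).
examples = [[0.0, 1.0], [2.0, 1.1], [4.0, 0.9]]
model = concept_model(examples)

# Deviating along the low-variance dimension is penalised far more.
print(membership(np.array([2.0, 1.0]), model) >
      membership(np.array([2.0, 3.0]), model))  # -> True
```

With three or more examples, the fitted variances capture the distribution of instances, which is exactly where this approach goes beyond similarity-based and interpolative reasoning.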

## Relational marginals

We consider the problem of learning weighted logical theories from a given set of relational facts (e.g. a knowledge graph). One important question in this context is what the weights in these learned theories should represent. While several frameworks (e.g. Markov logic networks) have been proposed to link these weights to probabilities, which statistics of the training data these estimated probabilities actually relate to is usually left implicit. To address this issue, we propose the notion of a relational marginal. Among other things, we show how this view allows us to provide statistical bounds on the estimated parameters of learned models, and how it enables us to transfer learned models to domains of different sizes. As a practical application, we have developed a method that learns interpretable logical representations of these relational marginals, using a relational variant of possibilistic logic. Compared to Markov logic networks, these learned possibilistic logic theories were found to be more accurate in answering MAP queries with small amounts of evidence, in addition to being more interpretable.
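
To make the underlying statistic concrete, the toy sketch below estimates one possible relational marginal: the fraction of tuples of distinct domain elements that satisfy a formula in a given knowledge graph. The facts, names, and formula are invented for illustration, and this brute-force count is only a stand-in for the formal framework.

```python
import itertools

# Toy knowledge graph as a set of ground facts (illustrative names).
facts = {
    ("friends", "ann", "bob"),
    ("friends", "bob", "carl"),
    ("smokes", "ann"),
    ("smokes", "bob"),
}
domain = ["ann", "bob", "carl"]

def relational_marginal(formula, arity):
    """Fraction of ordered tuples of distinct domain elements that
    satisfy the formula: the kind of training-data statistic that the
    weights of a learned theory can be explicitly tied to."""
    tuples = list(itertools.permutations(domain, arity))
    satisfied = sum(1 for t in tuples if formula(*t))
    return satisfied / len(tuples)

# Marginal of "friends(x, y) and smokes(x)": 2 of the 6 ordered pairs
# over this domain satisfy it, giving 1/3.
phi = lambda x, y: ("friends", x, y) in facts and ("smokes", x) in facts
print(relational_marginal(phi, 2))
```

Grounding weights in such explicit statistics is what permits, for instance, bounding estimation error and rescaling a learned model to a domain of a different size.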