Foreword

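Since I am about to start working on knowledge graph projects as a graduate student, I am recording my paper-reading process here. This post also draws on some reading notes published on Zhihu.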

Preface

These are my notes from reading the Knowledge Graphs cookbook paper.

Introduction notes

The modern incarnation of the phrase “Knowledge Graph” dates from the 2012 announcement of the Google Knowledge Graph. Graphs can provide an abstraction for a variety of domains.

Some Definitions

$\text{Node}$: The nodes of the graph represent related entities.
$\text{Edge}$: The edges capture the (potentially cyclical) relations between those entities.

Graphs allow the data to evolve in a more flexible manner than is typically possible in a relational setting, particularly for capturing incomplete knowledge.

Knowledge may be accumulated from external sources, or extracted from the knowledge graph itself.

$\text{Simple statements}$: e.g., “Santiago is the capital of Chile”. Simple statements can be accumulated as edges in the data graph.
$\text{Quantified statements}$: e.g., “All capitals are cities”. To accumulate quantified statements, ontologies or rules are required.
$\text{Deductive methods}$: Used to entail and accumulate further knowledge (e.g., “Santiago is a city”).
$\text{Inductive methods}$: Additional knowledge, based on simple or quantified statements, can also be extracted from and accumulated by the knowledge graph.
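The interplay between simple statements, a quantified statement, and deduction can be sketched in a few lines of Python. This is only an illustration: the triple representation, the predicate names `"capital of"` and `"type"`, and the function `deduce_cities` are assumptions made for this example, not notation from the paper.

```python
# Simple statements stored as (subject, predicate, object) edges.
# The predicate names are hypothetical, chosen only for this sketch.
edges = {("Santiago", "capital of", "Chile")}


def deduce_cities(edges):
    """Apply the rule "all capitals are cities" to a set of edges.

    Every subject of a "capital of" edge is a capital, so the rule lets us
    add an edge stating that the subject is a city.
    """
    inferred = {(s, "type", "City") for (s, p, o) in edges if p == "capital of"}
    return edges | inferred


graph = deduce_cities(edges)
print(sorted(graph))
```

The original edge is kept and the entailed edge “Santiago is a city” is added next to it, which is the sense in which deductive methods accumulate further knowledge.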

Knowledge graphs are often assembled from numerous sources, and as a result, can be highly diverse in terms of structure and granularity.

$\text{Schema}$: Defines a high-level structure for the knowledge graph.
$\text{Identity}$: Denotes which nodes in the graph (or in external sources) refer to the same real-world entity.
$\text{Context}$: Indicates a specific setting in which some unit of knowledge is held to be true.

$\textbf{Two Types of Knowledge Graphs}$

$\text{Open knowledge graphs}$: Open knowledge graphs are published online, making their content accessible for the public good. Open knowledge graphs have also been published within specific domains.
$\text{Enterprise knowledge graphs}$: Enterprise knowledge graphs are typically internal to a company and applied for commercial use-cases.

Running Example

The running example is set in the context of a hypothetical knowledge graph relating to tourism in Chile.

Data Graphs Notes

Relational Database Model

This example is used as the running example throughout. Suppose the tourism board has not yet decided how to model its data and first uses a tabular structure (a relational database) to represent it.

$\textbf{Event table}$

$\text{Event(\underline{name}, venue, type, \underline{start}, end)}$

where $\text{\underline{name}}$ and $\text{\underline{start}}$ together form the primary key of the table in order to uniquely identify recurring events.
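A minimal sketch of this table using Python's built-in `sqlite3` module shows how the composite key behaves. The column types and the sample rows are assumptions made for illustration; the notes only specify the column names and the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Event(name, venue, type, start, end) with (name, start) as the primary key.
# "end" is quoted because END is a keyword in SQLite.
conn.execute(
    """
    CREATE TABLE Event (
        name  TEXT NOT NULL,
        venue TEXT,
        type  TEXT,
        start TEXT NOT NULL,
        "end" TEXT,
        PRIMARY KEY (name, start)
    )
    """
)

# Hypothetical rows: the same event name recurring with different start dates.
rows = [
    ("Food Festival", "Central Park", "Food", "2018-03-22", "2018-03-29"),
    ("Food Festival", "Central Park", "Food", "2019-03-28", "2019-04-04"),
]
conn.executemany("INSERT INTO Event VALUES (?, ?, ?, ?, ?)", rows)

# Re-inserting an existing (name, start) pair violates the primary key.
try:
    conn.execute("INSERT INTO Event VALUES (?, ?, ?, ?, ?)", rows[0])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

event_count = conn.execute("SELECT COUNT(*) FROM Event").fetchone()[0]
```

Two rows with the same name are accepted because their start dates differ, while an exact repeat of a `(name, start)` pair is rejected.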

While organizing data in this table, the tourism board runs into serious problems. As the data becomes more diverse, the board has no choice but to keep revising the relational schema.

$\textbf{Disadvantages}$: For data that comes from multiple sources and changes frequently, relational modelling may incur the high cost of repeated schema revisions.

$\textbf{What graph models can do}$: Instead of maintaining an ever-expanding relational table, we could model a set of binary relations between entities, which can be viewed as modelling a graph. The board could then abandon the need for an upfront schema and define any (binary) relation between any pair of entities at any time.

Graph Data Models

Directed Edge-labelled Graphs

A directed edge-labelled graph (also known as a multi-relational graph) is defined as a set of nodes and a set of directed labelled edges between those nodes. In the case of knowledge graphs, nodes are used to represent entities and edges are used to represent (binary) relations between those entities.
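The definition above can be sketched as a small Python class that stores a set of nodes and a set of `(source, label, target)` edges. The class name `EdgeLabelledGraph`, its methods, and the sample identifiers such as `"EID15"` are hypothetical choices for this illustration.

```python
class EdgeLabelledGraph:
    """A directed edge-labelled graph: a set of nodes plus a set of labelled edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # each edge is a (source, label, target) triple

    def add_edge(self, source, label, target):
        # Adding an edge implicitly adds both endpoints as nodes.
        self.nodes.update((source, target))
        self.edges.add((source, label, target))

    def outgoing(self, node):
        # All (label, target) pairs leaving the given node, sorted for stable output.
        return sorted((l, t) for (s, l, t) in self.edges if s == node)


g = EdgeLabelledGraph()
g.add_edge("EID15", "name", "Food Festival")
g.add_edge("EID15", "venue", "Central Park")
# A new kind of relation can be added at any time without changing a schema.
g.add_edge("Central Park", "city", "Santiago")

print(g.outgoing("EID15"))
```

Compared with the relational table, adding the `city` relation here needs no change to any existing structure, which is the flexibility the notes describe.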

Heterogeneous graphs

A heterogeneous graph (or heterogeneous information network) is a graph where each node and edge is assigned one type. The difference between directed edge-labelled graphs and heterogeneous graphs is shown in Figure 1.
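One way to read the distinction, sketched here under the assumption that a directed edge-labelled graph expresses a node's type through a special `"type"` edge while a heterogeneous graph stores the type as part of the node itself. The function `to_heterogeneous` and the sample data are hypothetical and only illustrate that reading.

```python
def to_heterogeneous(edges, type_label="type"):
    """Split edge-labelled triples into node types and typed edges.

    Edges labelled `type_label` become entries in a node -> type mapping;
    all other edges are kept, with their label acting as the edge type.
    """
    node_types = {}
    typed_edges = []
    for source, label, target in edges:
        if label == type_label:
            # A heterogeneous graph assigns exactly one type per node.
            if node_types.get(source, target) != target:
                raise ValueError(f"node {source!r} has more than one type")
            node_types[source] = target
        else:
            typed_edges.append((source, label, target))
    return node_types, typed_edges


edge_labelled = [
    ("Santiago", "type", "City"),
    ("Chile", "type", "Country"),
    ("Santiago", "capital", "Chile"),
]
node_types, typed_edges = to_heterogeneous(edge_labelled)
```

After the conversion, `City` and `Country` are no longer nodes in the graph; they are attributes of the nodes `Santiago` and `Chile`.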

