Graph database

In computing, a graph database is a database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data - wikipedia

How might we draw meaningful diagrams of wiki that don't turn quickly into a fuzzball of links?

# Comparisons

* Comparison of graph databases - docs.google.com
* List of graph databases - wikipedia
* 25 Graph Databases - predictiveanalyticstoday.com
* Titan Graph Database

Graph DB technology has advanced quickly over the last few years, and more developments are on the way. For example, Titan Graph Database and Blazegraph are distributed and can handle billions of edges, and Blazegraph can be GPU-accelerated - blazegraph.com

Compared with relational databases, graph databases are often faster for associative data sets and map more directly to the structure of object-oriented applications.

They can scale more naturally to large data sets as they do not typically require expensive join operations.

As they depend less on a rigid schema, they are more suitable to manage ad hoc and changing data with evolving schemas. Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements - wikipedia

# Concepts

A key concept of the system is the graph (or edge or relationship), which directly relates data items in the store. The relationships allow data in the store to be linked together directly, and in most cases retrieved with a single operation.

This contrasts with conventional relational databases, where links between data are stored in the data itself, and are gathered by searching for this data within the store and using the JOIN concept to collect the related data.

Illustration of a property graph in graph databases - wikimedia

Graph databases are designed to allow simple and rapid retrieval of complex hierarchical structures that are difficult to model in relational systems.

The underlying storage mechanism of graph database products varies. Some, like MariaDB, are based on a relational engine and store the graph data in a table. More common examples generally use a key-value store or document-oriented database for storage, making them inherently NoSQL solutions.

These solutions generally offer a performance advantage because the graph is stored in a format similar to a database index, minimizing its size and retrieval time. Most graph databases based on non-relational storage engines also add the concept of tags or properties, which are essentially relationships lacking a pointer to another document. This allows data elements to be categorized for easy retrieval en masse.

Retrieving data from a graph database requires new concepts and generally a new query language. Some standardization efforts have taken place, leading to systems like Gremlin that work with a variety of graph engines, and the SPARQL system which has seen some multi-vendor adoption.

# Description

Graph databases are based on graph theory and employ nodes, edges and properties - wikipedia

* Nodes represent entities such as people, businesses, accounts, or any other item you might want to keep track of. They are roughly the equivalent of the "record", "relation" or "row" in a relational database, or the "document" in a document database.
* Edges, also known as "graphs" or "relationships", are the lines that connect nodes to other nodes; they represent the relationship between them. Meaningful patterns emerge when examining the connections and interconnections of nodes, properties, and edges. Edges are the key concept in graph databases, representing an abstraction that is not directly implemented in other systems.
* Properties are pertinent information that relate to nodes. For instance, if "Wikipedia" were one of the nodes, one might have it tied to properties such as "website", "reference material", or "word that starts with the letter w", depending on which aspects of "Wikipedia" are pertinent to the particular database.
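The three building blocks listed above can be sketched as a tiny in-memory structure. This is only an illustration, not how any particular graph database stores data; the class names (`Node`, `Edge`, `PropertyGraph`), the "alice" node and the "reads" relationship are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    properties: dict = field(default_factory=dict)  # key/value facts about the node


@dataclass
class Edge:
    source: str        # node_id of the node the edge starts from
    target: str        # node_id of the node the edge points to
    relationship: str  # name of the relationship, e.g. "reads"


class PropertyGraph:
    """Hypothetical minimal property graph: nodes, directed edges, properties."""

    def __init__(self):
        self.nodes = {}     # node_id -> Node
        self.outgoing = {}  # node_id -> list of Edge leaving that node

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = Node(node_id, dict(properties))
        self.outgoing.setdefault(node_id, [])

    def add_edge(self, source, target, relationship):
        # Both endpoints must already exist as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        edge = Edge(source, target, relationship)
        self.outgoing[source].append(edge)
        return edge

    def neighbors(self, node_id, relationship=None):
        # Follow the edges stored with the node; no search over other nodes.
        return [
            edge.target
            for edge in self.outgoing[node_id]
            if relationship is None or edge.relationship == relationship
        ]


graph = PropertyGraph()
graph.add_node("wikipedia", kind="website", category="reference material")
graph.add_node("alice", kind="person")  # made-up example node
graph.add_edge("alice", "wikipedia", "reads")

print(graph.neighbors("alice", "reads"))
print(graph.nodes["wikipedia"].properties)
```

Keeping the outgoing edges next to each node is one possible design choice; it makes "what is this node connected to" a direct lookup.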

Graph databases directly store the relationships between records. Instead of an email address being found by looking up its user's key, the user record has a pointer directly to the email address record. That is, once a user has been selected, the pointer can be followed directly to the email records; there is no need to search the email table to find the matching records.
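A rough sketch of the difference, using plain Python objects as stand-ins for both kinds of store. The table contents, record classes and function names are invented for illustration, and a real relational engine would normally use an index rather than the full scan shown here.

```python
from dataclasses import dataclass, field

# Relational style: emails sit in their own table and refer back to the
# user by key, so finding them means searching that table.
emails_table = [
    {"user_id": 1, "address": "alice@example.com"},
    {"user_id": 2, "address": "bob@example.com"},
    {"user_id": 1, "address": "alice@work.example"},
]


def emails_by_scan(user_id):
    # Look at every email row and keep the ones whose key matches.
    return [row["address"] for row in emails_table if row["user_id"] == user_id]


# Graph style: the user record holds direct references to its email records.
@dataclass
class EmailRecord:
    address: str


@dataclass
class UserRecord:
    name: str
    emails: list = field(default_factory=list)  # direct pointers to EmailRecord


alice = UserRecord("alice")
alice.emails.append(EmailRecord("alice@example.com"))
alice.emails.append(EmailRecord("alice@work.example"))


def emails_by_pointer(user):
    # Follow the stored references; other users' emails are never touched.
    return [email.address for email in user.emails]


print(emails_by_scan(1))
print(emails_by_pointer(alice))
```

Both calls return the same addresses; what differs is how much unrelated data has to be examined to find them.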

The relative advantage of graph retrieval grows with the complexity of the query. For instance, one might want to know "that movie about submarines with the actor who was in that movie with that other actor that played the lead in Gone With the Wind" - Examples where Graph databases shine: Neo4j edition
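A query like this is a chain of hops: from the movie to its lead, from the lead to co-stars, and from the co-stars to their other movies. The sketch below walks that chain over a small made-up data set. Apart from the title "Gone With the Wind", every actor, movie and topic is a placeholder, and the function name is hypothetical.

```python
from collections import defaultdict

# Placeholder data: (actor, movie, role). Not real filmography.
roles = [
    ("Actor A", "Gone With the Wind", "lead"),
    ("Actor A", "Movie X", "support"),
    ("Actor B", "Movie X", "lead"),
    ("Actor B", "Movie Y", "lead"),
    ("Actor C", "Movie Z", "lead"),
]
movie_topics = {
    "Gone With the Wind": "romance",
    "Movie X": "drama",
    "Movie Y": "submarines",
    "Movie Z": "submarines",
}

# Store each edge in both directions so either end can be a starting point.
movies_of = defaultdict(list)  # actor -> [(movie, role)]
cast_of = defaultdict(list)    # movie -> [(actor, role)]
for actor, movie, role in roles:
    movies_of[actor].append((movie, role))
    cast_of[movie].append((actor, role))


def submarine_movies_via_costar(start_movie):
    # Hop 1: who played the lead in the starting movie?
    leads = {actor for actor, role in cast_of[start_movie] if role == "lead"}
    # Hop 2: who else appeared in any movie with one of those leads?
    co_stars = set()
    for lead in leads:
        for movie, _ in movies_of[lead]:
            for actor, _ in cast_of[movie]:
                if actor not in leads:
                    co_stars.add(actor)
    # Hop 3: which movies of those co-stars are about submarines?
    found = set()
    for actor in co_stars:
        for movie, _ in movies_of[actor]:
            if movie_topics.get(movie) == "submarines":
                found.add(movie)
    return sorted(found)


print(submarine_movies_via_costar("Gone With the Wind"))
```

In this data set "Movie Z" is also about submarines, but it is not returned because "Actor C" has no connection to the lead of the starting movie.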

# Properties

Properties add another layer of abstraction to this structure that also improves many common queries. Properties are essentially labels that can be applied to any record, or in some cases, edges as well.

For instance, one might label Clark Gable as "actor", which would then allow the system to quickly find all the records that are actors, as opposed to director or camera operator. If labels on edges are allowed, one could also label the relationship between Gone With the Wind and Clark Gable as "lead", and by performing a search on people that are "lead" "actor" in the movie Gone With the Wind, the database would produce Vivien Leigh, Olivia de Havilland and Clark Gable.
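The labeled search described above might look like the following sketch. The three lead actors come from the example in the text; the director entry is added only so the filter has something to exclude, and the data layout and function name are assumptions rather than any product's API.

```python
from collections import defaultdict

# Labels on nodes (people).
node_labels = {
    "Clark Gable": {"actor"},
    "Vivien Leigh": {"actor"},
    "Olivia de Havilland": {"actor"},
    "Victor Fleming": {"director"},  # extra entry so the filter excludes something
}

# Labeled edges: (person, movie, edge label).
edges = [
    ("Clark Gable", "Gone With the Wind", "lead"),
    ("Vivien Leigh", "Gone With the Wind", "lead"),
    ("Olivia de Havilland", "Gone With the Wind", "lead"),
    ("Victor Fleming", "Gone With the Wind", "directed"),
]

# An index from label to nodes makes "find all actors" a single lookup.
label_index = defaultdict(set)
for person, labels in node_labels.items():
    for label in labels:
        label_index[label].add(person)


def people_with(movie, node_label, edge_label):
    # Keep people who carry the node label AND are linked to the movie
    # by an edge carrying the edge label.
    return sorted(
        person
        for person, linked_movie, label in edges
        if linked_movie == movie
        and label == edge_label
        and person in label_index[node_label]
    )


print(people_with("Gone With the Wind", "actor", "lead"))
```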

The equivalent SQL query would have to rely on additional data in the table linking people and movies, adding more complexity to the query syntax. These sorts of labels may improve search performance under certain circumstances, but are generally more useful in providing additional semantic data for end users.