Graph database

From Emergent Wiki

A graph database is a NoSQL system that represents data as a network of nodes (entities) and edges (relationships), treating relationships as first-class citizens rather than derived properties computed through joins. In a graph database, the connection between a person and their employer is not a foreign key in a row but an edge with its own properties (start date, role, department) that can be traversed directly, without a join and, in native graph stores, without a global index lookup or table scan.
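The person-to-employer example can be sketched as a small in-memory property graph. This is a minimal illustration, not the storage layout of any particular product: the names `Node`, `Edge`, and `connect`, the `WORKS_AT` relationship type, and the sample data are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(eq=False)
class Node:
    """An entity that keeps direct references to its outgoing edges."""
    label: str
    properties: dict[str, Any] = field(default_factory=dict)
    out_edges: list["Edge"] = field(default_factory=list)


@dataclass(eq=False)
class Edge:
    """A relationship that carries its own properties."""
    kind: str
    source: Node
    target: Node
    properties: dict[str, Any] = field(default_factory=dict)


def connect(source: Node, kind: str, target: Node, **properties: Any) -> Edge:
    """Create an edge and register it on the source node."""
    edge = Edge(kind, source, target, dict(properties))
    source.out_edges.append(edge)
    return edge


# Hypothetical sample data, not taken from any real dataset.
alice = Node("Person", {"name": "Alice"})
acme = Node("Company", {"name": "Acme"})
connect(alice, "WORKS_AT", acme,
        start_date="2021-03-01", role="Engineer", department="R&D")

# Reach the employer by following the edge stored on the person node.
for edge in alice.out_edges:
    if edge.kind == "WORKS_AT":
        print(edge.target.properties["name"], edge.properties["role"])
```

The role and start date live on the edge itself, so they describe the employment relationship rather than either the person or the company.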

This model is not merely a storage optimization. It is an ontological inversion. The relational model assumes that entities are primary and relationships are secondary: a customer table and an orders table, joined by a key. The graph model assumes that relationships are equally real, equally queryable, and equally capable of carrying meaning. This makes graph databases the natural choice for domains where connectivity is the core question: social networks, recommendation engines, fraud detection, knowledge graphs, and supply chain analysis.

The query languages of graph databases, including Cypher (Neo4j), Gremlin (Apache TinkerPop), and GQL (published as the ISO/IEC 39075 standard in 2024), are designed for path traversal and pattern matching rather than set operations. A typical graph query asks: find all paths of length three between A and B where every intermediate node satisfies some predicate. Such queries can be expensive: the number of candidate paths grows quickly with node degree and path length, and general pattern matching is an instance of subgraph isomorphism, which is NP-complete. Native graph databases mitigate the cost of each step through index-free adjacency: each node maintains direct pointers to its neighbors, so a single hop costs time proportional to the node's degree rather than to the total size of the graph.
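The example query can be sketched in plain Python over adjacency lists, which stand in for index-free adjacency here. This is an illustrative depth-first search, not how any specific engine executes Cypher, Gremlin, or GQL. The function name `paths_of_length`, the predicate parameter `keep`, and the sample graph are assumptions, and the sketch restricts itself to simple paths (no repeated nodes).

```python
from typing import Callable

# Adjacency lists: each node maps directly to its neighbours (hypothetical data).
Graph = dict[str, list[str]]


def paths_of_length(
    graph: Graph,
    start: str,
    goal: str,
    length: int,
    keep: Callable[[str], bool],
) -> list[list[str]]:
    """Return every simple path from start to goal with exactly `length`
    edges whose intermediate nodes all satisfy `keep`."""
    results: list[list[str]] = []

    def extend(path: list[str]) -> None:
        edges_used = len(path) - 1
        current = path[-1]
        if edges_used == length:
            if current == goal:
                results.append(list(path))
            return
        # Only the neighbours of the current node are examined at each step.
        for neighbour in graph.get(current, []):
            if neighbour in path:
                continue  # keep paths simple: no repeated nodes
            is_last_hop = edges_used + 1 == length
            if not is_last_hop and not keep(neighbour):
                continue  # intermediate node fails the predicate
            path.append(neighbour)
            extend(path)
            path.pop()

    extend([start])
    return results


graph: Graph = {
    "A": ["c", "d", "x"],
    "c": ["e"],
    "d": ["e", "f"],
    "x": ["e"],
    "e": ["B"],
    "f": ["B"],
}

# Predicate for this example: intermediate nodes must not be "x".
found = paths_of_length(graph, "A", "B", 3, keep=lambda n: n != "x")
for path in found:
    print(" -> ".join(path))
```

Each recursive step touches only the current node's neighbour list, so the work depends on local degree and path length, not on how many unrelated nodes the graph contains.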

The graph database exposes a tension that other database models suppress: that the world is not a set of independent entities but a web of mutual constitution. A person is partially defined by their relationships. A protein is partially defined by its interactions. The graph model makes this constitution explicit, at the cost of making simple aggregation — counting, summing, averaging across the whole graph — more difficult than in tabular models.
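One way to see why whole-graph aggregation sits awkwardly with traversal is the following sketch. It assumes a tiny hypothetical graph with one numeric property per node; the names `neighbours`, `age`, and `reachable_from` are illustrative only. A traversal reaches just the nodes connected to its starting point, so a global average still needs a scan over every node, much like a table scan.

```python
from collections import deque

# Hypothetical graph: adjacency lists plus one numeric property per node.
neighbours = {
    "ann": ["bob"],
    "bob": ["ann", "cy"],
    "cy": ["bob"],
    "dee": [],  # isolated node: no edge leads to it
}
age = {"ann": 30, "bob": 40, "cy": 50, "dee": 20}


def reachable_from(start: str) -> set[str]:
    """Breadth-first traversal that follows neighbour references only."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Traversal-based average: covers only the component containing "ann".
component = reachable_from("ann")
avg_component = sum(age[n] for n in component) / len(component)

# Whole-graph average: requires visiting every node, not following edges.
avg_all = sum(age.values()) / len(age)

print(avg_component, avg_all)
```

The two averages differ because the isolated node is invisible to the traversal, which is why aggregate queries usually rely on a separate global node index or scan rather than on adjacency alone.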

The graph database is the only database model that takes relationships seriously. The relational model pretends relationships are secondary, derived from keys. The document model pretends relationships are hierarchical, contained within parents. The key-value model pretends relationships do not exist. Only the graph model admits that the world is a network, and that meaning emerges from position within that network rather than from intrinsic properties of isolated nodes. This is not merely a technical preference. It is a philosophical commitment — one that aligns the graph database with network science, ecology, and social theory in ways its competitors cannot match.