Semantic Web
The Semantic Web is the vision, articulated by Tim Berners-Lee in 1999 and developed through a series of W3C standards, of a World Wide Web in which information is given well-defined meaning — not merely displayable to humans, but processable by machines. The ambition was radical: to transform the web from a repository of documents linked by URLs into a global knowledge graph in which data carries explicit semantic structure, enabling automated reasoning, inference, and integration across heterogeneous sources.
The project is one of the most instructive failures in the history of computing — instructive not because it achieved its goals, but because the reasons for its failure reveal something profound about the relationship between formal systems and meaning.
The Architecture of Meaning
The Semantic Web's technical stack is a hierarchy of formalisms designed to move from raw data to machine-understandable knowledge:
The Resource Description Framework (RDF) provides the basic data model: every assertion is a subject-predicate-object triple, forming a directed graph of entities and relationships. The simplicity is deceptive: RDF triples are trivial to generate and nightmarish to query at scale, because answering graph-pattern queries requires expensive joins across the triple store, and because the absence of a schema means that any entity can assert any relationship to any other entity without constraint.
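To make the data model concrete, here is a minimal sketch using Python's rdflib library (the library choice and the example.org URIs are illustrative, not part of any standard): three assertions form a small graph, and nothing stops a fourth, nonsensical one from joining them.

```python
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")   # illustrative namespace

g = Graph()

# Every assertion is a (subject, predicate, object) triple.
g.add((EX.Socrates, RDF.type, EX.Human))
g.add((EX.Socrates, EX.bornIn, EX.Athens))
g.add((EX.Athens, EX.locatedIn, EX.Greece))

# Nothing constrains what may be asserted: any entity can be related
# to any other (or to a bare literal) by any predicate at all.
g.add((EX.Greece, EX.bornIn, Literal("a category mistake the model accepts")))

print(g.serialize(format="turtle"))
```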
RDF Schema (RDFS) and the Web Ontology Language (OWL) add typing, class hierarchies, and logical constraints. OWL, in particular, brings description logic to the web: it allows the definition of classes by their properties, the specification of disjointness and equivalence relations, and limited forms of automated inference. A reasoner working over an OWL ontology can, in principle, deduce that if all Humans are Mortal and Socrates is a Human, then Socrates is Mortal — at web scale.
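The Socrates syllogism can be sketched directly. The snippet below hand-rolls the single RDFS rule involved (rdfs9: class membership propagates up the subclass hierarchy) rather than invoking a production RDFS/OWL reasoner; it is an illustration of the inference, not of how real reasoners are implemented.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/")   # illustrative namespace

g = Graph()
g.add((EX.Human, RDFS.subClassOf, EX.Mortal))   # all Humans are Mortal
g.add((EX.Socrates, RDF.type, EX.Human))        # Socrates is a Human

# Hand-rolled forward chaining of one RDFS rule (rdfs9):
#   (?c rdfs:subClassOf ?d) and (?x rdf:type ?c)  =>  (?x rdf:type ?d)
# A real deployment would delegate this to an RDFS/OWL reasoner.
changed = True
while changed:
    changed = False
    inferred = []
    for c, _, d in g.triples((None, RDFS.subClassOf, None)):
        for x in g.subjects(RDF.type, c):
            if (x, RDF.type, d) not in g:
                inferred.append((x, RDF.type, d))
    for triple in inferred:
        if triple not in g:
            g.add(triple)
            changed = True

print((EX.Socrates, RDF.type, EX.Mortal) in g)  # True: Socrates is Mortal
```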
SPARQL is the query language for RDF graphs. It is, in essence, SQL for triples — a pattern-matching language that retrieves subgraphs matching specified constraints.
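A minimal sketch of that pattern-matching style, again with rdflib and illustrative example.org URIs: the query binds variables to whatever parts of the graph match the stated triple patterns.

```python
from rdflib import Graph

g = Graph()
g.parse(data="""
    @prefix ex: <http://example.org/> .
    ex:Socrates a ex:Human ; ex:bornIn ex:Athens .
    ex:Plato    a ex:Human ; ex:bornIn ex:Athens .
    ex:Athens   ex:locatedIn ex:Greece .
""", format="turtle")

# Retrieve every human together with the country containing their
# birthplace: a subgraph pattern, not a table scan.
query = """
    PREFIX ex: <http://example.org/>
    SELECT ?person ?country WHERE {
        ?person a ex:Human ;
                ex:bornIn ?city .
        ?city ex:locatedIn ?country .
    }
"""

for person, country in g.query(query):
    print(person, country)
```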
The architecture assumes that meaning can be formalized through ontologies: explicit, shared, formal specifications of the concepts and relationships in a domain. If every data source on the web published an ontology, and if those ontologies were mutually compatible or mappable, then machines could integrate information across sources without human intervention.
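As a rough illustration of what "mappable" would mean in practice, the sketch below invents two publishers (the museum-a and museum-b namespaces are hypothetical) that describe the same painter in different vocabularies, and adds the OWL axioms asserting that their identifiers and properties coincide.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL

# Two hypothetical publishers, two vocabularies for the same facts.
A = Namespace("http://museum-a.example/")
B = Namespace("http://museum-b.example/")

g = Graph()
g.add((A.daVinci, A.created, A.MonaLisa))       # source A's description
g.add((B.leonardo, B.painted, B.LaGioconda))    # source B's description

# The mapping layer the architecture relies on: axioms asserting that
# identifiers and properties coincide across the two sources.
g.add((A.daVinci, OWL.sameAs, B.leonardo))
g.add((A.created, OWL.equivalentProperty, B.painted))

# On their own these mapping triples are inert data; only a reasoner,
# plus prior agreement between the publishers, makes a query phrased in
# one vocabulary return answers drawn from both.
```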
The Failure Modes
The Semantic Web did not fail because its standards were technically deficient. It failed because its foundational assumptions about meaning were wrong in ways that became visible only at scale.
The ontology problem. The assumption that domains have stable, shared, formalizable conceptual structures is empirically false. Biological taxonomies change with every revision of phylogenetic methods. Legal categories vary across jurisdictions and evolve with precedent. Scientific concepts shift as theories change. Ontologies are not descriptions of stable domains. They are political artifacts — negotiated settlements about what counts as a category, who gets to define it, and which instances are excluded. The Semantic Web treated ontologies as technical objects when they are, in fact, governance objects.
The incentive problem. Publishing structured data in RDF/OWL requires effort — effort that produces no direct benefit to the publisher unless other agents consume the data in semantically rich ways. The web grew because publishing HTML documents produced immediate visible benefit (readers could view them). The Semantic Web required publishers to invest in formalization for the benefit of an ecosystem that did not exist. This is a coordination problem of the first order, and it was never solved.
The reasoning problem. Even where ontologies exist and are compatible, automated reasoning at web scale runs into computational limits. Description logic inference is decidable but expensive. The web's scale — trillions of triples — makes exhaustive inference infeasible. The Semantic Web assumed that reasoning could be distributed and partial, but it provided no principled theory of which inferences to compute and which to ignore. The result was systems that either reasoned too little to be useful or too much to be scalable.
The LLM disruption. The most consequential development for the Semantic Web was not a refinement of its standards but the emergence of large language models. LLMs do not require formal ontologies to extract meaning from text. They learn statistical representations of semantic relationships directly from unstructured language, achieving integration and inference across heterogeneous sources without any explicit formalization. The Semantic Web's answer to the meaning problem was structure first, then inference. The LLM answer is data first, then structure emerges. The LLM answer is winning not because it is more principled but because it is more scalable.
The Residual Value: Linked Data and Knowledge Graphs
The Semantic Web vision as originally articulated has not been realized. But its component technologies have found niches where the coordination and ontology problems are tractable:
Knowledge graphs at enterprise scale — Google's Knowledge Graph, Wikidata, academic biomedical databases — use RDF-like triple structures within bounded domains where ontologies can be maintained and incentives are aligned. These are not the global Semantic Web. They are local semantic webs: closed systems where the meaning problem is managed rather than solved.
Linked Open Data has succeeded in specific communities — particularly the life sciences, where ontologies like the Gene Ontology are maintained by consortia with shared incentives. The success is domain-specific, not universal.
The lesson is not that formal semantics is useless. It is that formal semantics succeeds where the social conditions for shared meaning already exist, and fails where they do not. Technology cannot create shared meaning. It can only support the social processes that produce it.
The Systems Insight
The Semantic Web illuminates a general principle that applies far beyond the web: the attempt to formalize meaning before it has been socially produced yields brittle systems that fail at the boundaries they claim to manage. Any system that tries to encode semantics in advance of use — whether an ontology, a schema, a type system, or a programming language's class hierarchy — faces the same tradeoff: the more formal precision it achieves, the less adaptive capacity it retains when the domain changes.
The Semantic Web's failure is the failure of top-down semantics in a world where meaning is continuously negotiated. The success of LLMs is the success of bottom-up semantics: structures that emerge from data rather than being imposed on it. The tension between these two approaches — formal structure versus emergent structure — is one of the defining architectural questions of this era. It appears in programming languages, in database design, in organizational knowledge management, and in the design of intelligent systems.
The Semantic Web asked the right question — how do we make meaning machine-processable? — but gave the wrong answer. The right answer is not a global ontology. It is a theory of how meaning emerges from use, and how machines can participate in that emergence without requiring it to be formalized in advance.
See also: Information Theory, Knowledge Graphs, Large Language Models, Ontology, Meaning Problem