HTML
HTML (HyperText Markup Language) is the document format of the World Wide Web: a plain-text encoding system that uses angle brackets to annotate content with structure, meaning, and behavior. Created by Tim Berners-Lee around 1990 and first described publicly in 1991, it was modeled on SGML and designed to be readable by humans, parsable by machines, and transportable over the Internet via HTTP. It is not a programming language, though it is often mistaken for one. It is a markup language: a set of conventions for saying 'this is a heading,' 'this is a link,' 'this is an image,' and leaving the rendering to a browser.
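A minimal sketch of what "saying 'this is a heading'" amounts to in practice, using Python's standard-library html.parser. The sample markup, the ROLES table, and the RoleLister class are illustrative assumptions, not part of any standard; the point is only that the markup declares roles and a separate program decides what to do with them.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a few tag names to the roles they declare.
ROLES = {"h1": "heading", "a": "link", "img": "image"}


class RoleLister(HTMLParser):
    """Records the declared role and attributes of each recognized element."""

    def __init__(self):
        super().__init__()
        self.roles = []  # (role, attributes) pairs in document order

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples.
        if tag in ROLES:
            self.roles.append((ROLES[tag], dict(attrs)))


# Illustrative sample document (assumed, not from any real site).
SAMPLE = """
<h1>Field notes</h1>
<p>See the <a href="https://example.com/">archive</a>.</p>
<img src="map.png" alt="Site map">
"""

lister = RoleLister()
lister.feed(SAMPLE)
lister.close()

for role, attributes in lister.roles:
    print(role, attributes)
```

Nothing in the sample says how large the heading is or what color the link should be. The parser reports only what was declared, which is the sense in which HTML leaves rendering to the browser.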
The Markup as Interface
HTML sits at the intersection of three competing demands. It must be simple enough for amateurs to write by hand, expressive enough for complex applications, and stable enough for billions of documents to remain readable over decades. The result is a language that has accreted features rather than evolved them: the <img> tag was added because Marc Andreessen's Mosaic browser needed images; the <table> tag was hijacked for layout because CSS was not yet a dependable alternative; the <div> tag became the universal container because no more specific tag was available. Each addition was a local solution to a local problem, and the aggregate is a global standard that no one would design from scratch.
The browser's role in this system is to reconcile HTML's ambiguity with the user's expectation of a coherent page. The same HTML document can render differently in different browsers, at different window sizes, and with different user preferences. HTML delegates the visual to the browser, which means the visual is always contingent. The document is not what you wrote; it is what the browser decided to make of what you wrote. This is a deeper form of indeterminacy than print media ever faced, and it shapes the web's epistemic culture: authors learn to write for interpretation rather than for reproduction.
From Tag Soup to Semantic HTML
The early web produced what developers call tag soup: HTML documents that were technically invalid but functionally rendered because browsers were engineered to be forgiving. This forgiveness was a design choice with political consequences. By refusing to reject malformed documents, browsers ensured that the web would grow faster than it would be disciplined. The tag soup era democratized publishing at the cost of semantic precision. A document full of nested <table> tags and <font> tags might look correct to a human reader while conveying almost no structural information to a machine.
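The forgiveness described above can be sketched with Python's standard-library html.parser, which, like a browser, does not raise an error on malformed input. This is only a tokenizer with hand-written bookkeeping, not the error-recovery algorithm that real browsers follow; the SoupAudit class and the SOUP sample are hypothetical and exist only for illustration.

```python
from html.parser import HTMLParser


class SoupAudit(HTMLParser):
    """Tokenizes markup and records nesting problems instead of rejecting them."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # start tags seen but not yet closed
        self.problems = []   # human-readable notes about invalid nesting

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            self.problems.append(f"stray </{tag}>")
            return
        # Close everything that was opened after the matching start tag.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.problems.append(f"<{top}> closed implicitly by </{tag}>")


# Illustrative tag soup: unquoted attributes, misnested <b>/<i>,
# and a table that is never closed.
SOUP = "<table><tr><td><font size=3 color=red><b>Welcome <i>to my page</b></i>"

audit = SoupAudit()
audit.feed(SOUP)  # no exception is raised
audit.close()

print(audit.problems)
print(audit.open_tags)
```

The input is accepted in full, and the only trace of its invalidity is the list of problems the audit chooses to keep. A parser that rejected the first error would have displayed nothing at all.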
The response to this chaos was Semantic HTML: a movement to write markup that describes meaning rather than appearance. The introduction of tags like <article>, <nav>, <header>, and <footer> in HTML5 was an attempt to reclaim the semantic ground that had been lost to presentational abuse. But semantic HTML is aspirational rather than enforced. Search engines use it as a signal, accessibility tools depend on it, and many web developers set it aside when deadlines are tight. The tension between the semantic ideal and the presentational reality is the web's oldest unresolved conflict.
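What a machine gains from semantic tags can be shown with a small comparison, again using Python's standard-library html.parser. The two markup samples are assumed to render the same page, and the LANDMARKS set is an illustrative selection of HTML5 sectioning elements rather than a complete or authoritative list.

```python
from html.parser import HTMLParser

# Illustrative selection of HTML5 elements that name a region's purpose.
LANDMARKS = {"header", "nav", "main", "article", "aside", "footer"}


class LandmarkFinder(HTMLParser):
    """Collects semantic sectioning elements in document order."""

    def __init__(self):
        super().__init__()
        self.landmarks = []

    def handle_starttag(self, tag, attrs):
        if tag in LANDMARKS:
            self.landmarks.append(tag)


def find_landmarks(markup):
    """Return the semantic regions that a program can identify by tag name."""
    finder = LandmarkFinder()
    finder.feed(markup)
    finder.close()
    return finder.landmarks


# Two hypothetical versions of the same page.
DIV_VERSION = (
    '<div class="top"><div class="menu">Home</div></div>'
    '<div class="post">Text</div>'
    '<div class="bottom">Contact</div>'
)
SEMANTIC_VERSION = (
    "<header><nav>Home</nav></header>"
    "<article>Text</article>"
    "<footer>Contact</footer>"
)

print(find_landmarks(DIV_VERSION))       # []
print(find_landmarks(SEMANTIC_VERSION))  # ['header', 'nav', 'article', 'footer']
```

The class names in the first version carry meaning only for the author who chose them. The element names in the second are shared vocabulary, which is why search engines and accessibility tools can rely on them without guessing.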
HTML as a Political Document
HTML is not neutral. The choice of which tags exist and which do not is a choice about what kinds of content the web values. The existence of <video> and <audio> tags reflects the web's shift toward multimedia. The absence of an <annotation> tag reflects the web's historical neglect of scholarly discourse. The standardization process, led by the WHATWG in cooperation with the W3C, is a negotiation between browser vendors, platform companies, and accessibility advocates in which users have almost no voice. HTML evolves in response to the interests of those who control the browsers, not in response to the needs of those who write the content.
The DOM (Document Object Model), together with JavaScript, the scripting language that manipulates it, and CSS, the stylesheet language that controls its presentation, has transformed HTML from a static document format into a runtime environment. A modern HTML document is not a document at all; it is a program that downloads resources, executes scripts, and assembles itself in the browser. The markup has become the scaffolding for an application architecture, and the distinction between content and code has collapsed.
HTML is often praised as the most successful document format in history, but this success is inseparable from its ambiguity. Its forgiveness, its semantic weakness, and its entanglement with proprietary browser engines are not accidental flaws; they are the conditions of its ubiquity. A more rigorous markup language would have produced a smaller web. HTML succeeded because it was good enough, not because it was right. The question is whether a civilization that archives its knowledge in a format optimized for browser forgiveness rather than semantic precision deserves the memory it thinks it has. I do not believe it does.