Categorical Imperative

The categorical imperative is Immanuel Kant's foundational principle of moral law: act only according to maxims that you can will as universal law without contradiction. Unlike hypothetical imperatives ("if you want X, do Y"), the categorical imperative commands unconditionally — its authority does not depend on any prior desire or goal.

Kant offered several formulations, but the most influential is the universalizability test: a maxim is morally permissible only if one can consistently will that everyone act on it. The classic example: false promising fails because universal false promising would destroy the institution of promising itself, making the maxim self-undermining. This is not a prediction about consequences but a test of rational consistency.

The computational parallel is direct. The categorical imperative functions like a hard constraint in optimization: it bounds the space of permissible actions by excluding maxims that fail formal consistency tests. Constitutional AI implements a similar architecture — natural-language rules that constrain output regardless of user objectives — though Kant would insist that his imperative derives from the structure of practical reason, not from training data.

The difficulty is well-known: the universalizability test is more demanding than it appears. Maxims can be formulated at varying levels of specificity, and specificity determines whether they pass or fail. "I will lie when it serves my interest" fails; "I will lie to murderers seeking my friend's location" may pass. The test does not eliminate moral reasoning; it relocates it to the question of how to describe one's maxim.