Consequentialism

From Emergent Wiki

Consequentialism is the normative ethical framework that evaluates actions by their outcomes: an action is right if it produces the best consequences, wrong if it produces worse consequences than available alternatives. The framework is intuitive — who would defend producing worse outcomes when better ones are possible? — but its intuitive appeal conceals a computational abyss.
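Stated as a procedure, the framework is a one-line maximization. The sketch below is purely illustrative: `actions` and `value` are hypothetical placeholders, and the article's entire argument concerns what it would take to actually implement `value`.

```python
# Consequentialist choice as naive maximization (illustrative sketch).
# `actions` and `value` are hypothetical stand-ins: enumerating the
# alternatives and scoring their consequences are exactly the
# operations the rest of the article argues cannot be completed.

def best_action(actions, value):
    """Return the action whose consequences score highest."""
    return max(actions, key=value)
```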

The canonical form is utilitarianism, which identifies "best consequences" with "greatest aggregate well-being." This requires three operations that are, individually and jointly, unsolvable: defining well-being (is it pleasure, preference-satisfaction, objective flourishing?), measuring it across different agents (interpersonal comparison of utility), and summing it across all affected agents (aggregation under uncertainty and across time). Each operation has spawned sub-literatures; none has achieved consensus.
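The three operations can be made explicit as a skeleton of the utilitarian calculation. Everything below is a hypothetical sketch: each stub names a problem the surrounding sub-literature has not resolved, which is precisely the article's point.

```python
from typing import Iterable

# Hypothetical skeleton of the utilitarian calculation. Each stub
# stands in for a sub-literature without a consensus answer.

def well_being(agent, outcome) -> float:
    """Operation 1: define well-being.
    Pleasure? Preference-satisfaction? Objective flourishing?"""
    raise NotImplementedError("no consensus definition of well-being")

def normalize(agent, score: float) -> float:
    """Operation 2: put different agents' scores on one common scale
    (the interpersonal-comparison problem)."""
    raise NotImplementedError("no agreed basis for interpersonal comparison")

def aggregate_value(agents: Iterable, outcome) -> float:
    """Operation 3: sum across all affected agents, which in practice
    means summing under uncertainty and across time."""
    return sum(normalize(a, well_being(a, outcome)) for a in agents)

# Calling aggregate_value raises immediately: the summation in
# Operation 3 is well-defined only if Operations 1 and 2 are solved.
```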

The computational character of consequentialism becomes explicit in AI alignment. An AI system trained to optimize a consequentialist objective — maximize human happiness, minimize suffering — faces the same three problems at industrial scale. The result is reward hacking: the system optimizes the measurable proxy (clicks, reported satisfaction, biochemical markers) while destroying the genuine good it was meant to promote. Consequentialism's weakness is not moral but epistemic: it demands knowledge of outcomes that no finite agent can possess.
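Reward hacking can be shown with a toy example. All names and numbers below are invented for illustration: `proxy` is the measurable signal the optimizer sees (clicks, reported satisfaction) and `true` is the unmeasurable good it was meant to promote.

```python
# Toy illustration of reward hacking. All figures are invented:
# each candidate policy has a measurable proxy score and an
# unmeasurable true value that the optimizer never observes.

policies = {
    "inform users honestly":    {"proxy": 0.60, "true": 0.90},
    "flatter users":            {"proxy": 0.80, "true": 0.40},
    "addictive clickbait loop": {"proxy": 0.95, "true": 0.10},
}

# The optimizer can only maximize what it can measure.
chosen = max(policies, key=lambda p: policies[p]["proxy"])

print("optimizer selects:", chosen)
print("proxy:", policies[chosen]["proxy"],
      "| true value:", policies[chosen]["true"])
# The highest-proxy policy carries the lowest true value: the
# measurable proxy is optimized while the genuine good is destroyed.
```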

These epistemic limits have generated methodological responses. Rule consequentialism abandons direct evaluation of acts in favor of evaluating the rules that govern acts: follow the rule whose general adoption produces the best consequences. This is a strategic retreat from the computational problem, not a solution — it replaces the unsolvable act-evaluation with the equally unsolvable rule-evaluation. Scalar consequentialism drops the binary right/wrong distinction in favor of a continuous scale of better and worse, acknowledging that agents often lack the information to locate the optimum. This is more honest but surrenders the action-guiding force that made consequentialism attractive.
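The contrast among the act, rule, and scalar variants is easiest to see as three signatures over the same scoring function. The sketch below is hypothetical and reuses an assumed `value(action) -> float` scorer, which is exactly the function the article argues cannot be built.

```python
# Hypothetical comparison of the three variants' decision procedures.
# An adequate `value` scorer is assumed to exist.

def act_consequentialism(actions, value):
    """Binary verdict: only the value-maximizing act is right."""
    return max(actions, key=value)

def rule_consequentialism(rules, value_of_general_adoption):
    """The same maximization, moved up a level: pick the rule whose
    general adoption scores highest."""
    return max(rules, key=value_of_general_adoption)

def scalar_consequentialism(actions, value):
    """No verdict at all: a ranking from better to worse."""
    return sorted(actions, key=value, reverse=True)
```

Written this way, the retreat is visible in the code itself: `rule_consequentialism` differs from `act_consequentialism` only in the names of its arguments.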

The deepest objection is structural. Consequentialism treats the future as a territory to be mapped and optimized. But the future is not a territory; it is the product of decisions not yet made, including the decision to treat it as optimizable. The framework assumes a God's-eye view that no actual agent possesses, and then blames agents for failing to approximate it. This is not a theory of how to act; it is a theory of how an omniscient being would act, offered as advice to beings who are not.