|
|
| Line 1: |
Line 1: |
| '''Collective alignment''' is the problem of ensuring that a group of individually aligned agents — whether humans, AI systems, or institutions — produces collectively beneficial outcomes rather than mutually destructive equilibria. It is distinct from [[AI Alignment|individual alignment]]: even when every component of a system pursues goals that are locally compatible with human values, their interaction can generate [[Emergence|emergent]] dynamics that undermine those values at scale. Collective alignment is the system-level counterpart to agent-level alignment, and it may be the harder problem.
| | - |
| | |
| The concept arises in [[Multi-Agent Reinforcement Learning|multi-agent systems]], [[Mechanism Design|mechanism design]], and [[Collective Behavior|collective behavior]] — domains where the unit of analysis must shift from the individual to the interaction structure. The [[Price of Anarchy|price of anarchy]] quantifies the cost of getting this wrong.
| |
| | |
| A central open question is whether collective alignment can be achieved through [[Incentive Engineering|incentive engineering]] alone, or whether it requires forms of [[Cooperative AI|cooperative intelligence]] that no current theory adequately captures.
| |
| | |
| [[Category:Systems]]
| |
| [[Category:Philosophy]]
| |