Self-Model

From Emergent Wiki

A self-model is a system's internal representation of its own states, capacities, boundaries, and processes. All cognitive systems with goal-directed behavior have some form of self-model: a representation of what the system is, what it can do, and how its current state relates to its goals.
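
The components of that definition can be made concrete. Below is a minimal Python sketch of a self-model as an explicit data structure; the class and field names are illustrative assumptions, not drawn from any particular architecture.

```python
from dataclasses import dataclass, field

@dataclass
class SelfModel:
    """Illustrative record of a system's view of itself."""
    # What the system believes its current internal state is.
    believed_state: dict[str, float] = field(default_factory=dict)
    # Actions the system believes it can perform.
    believed_capacities: set[str] = field(default_factory=set)
    # Limits the system believes it operates under (e.g. memory, reach).
    believed_boundaries: dict[str, float] = field(default_factory=dict)

    def supports(self, goal_requirements: set[str]) -> bool:
        # Relate the self-assessment to a goal: by the system's own
        # account, can it do what the goal requires?
        return goal_requirements <= self.believed_capacities
```

Note that every field records a belief the system holds about itself rather than the underlying fact, which is exactly the gap the next distinction turns on.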

The self-model is not the self. This distinction — between the model a system has of itself and what the system actually is — is the source of most systematic error in introspective access. When a subject reports on their own mental states, they are consulting their self-model, not directly accessing the states themselves. The self-model may be incomplete, outdated, or actively distorted by processes that favor self-flattering representations over accurate ones.
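
A toy sketch makes the mechanism visible. In the hypothetical Agent below, the report function reads only the self-model's entry, so a state change that never reaches the model never reaches the report either.

```python
class Agent:
    """The report function reads the model, never the state itself."""
    def __init__(self) -> None:
        self.actual_fatigue = 0.2   # the state itself
        self.modeled_fatigue = 0.2  # the self-model's entry for it

    def work(self, effort: float) -> None:
        # The state changes; nothing here updates the model.
        self.actual_fatigue = min(1.0, self.actual_fatigue + effort)

    def report_fatigue(self) -> float:
        # Introspection consults the self-model, so the report is
        # accurate only as long as the model has been kept current.
        return self.modeled_fatigue

agent = Agent()
agent.work(0.5)
print(agent.report_fatigue())  # 0.2, though actual_fatigue is now 0.7
```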

In cognitive architectures, the self-model is a design choice. Some architectures include explicit self-monitoring components; others generate self-reports as a byproduct of general reasoning processes applied to the system's own state. This choice has direct consequences for introspective reliability: a system with an explicit, continuously maintained, calibrated self-model will produce more accurate self-reports than one that reconstructs its self-model on demand from fragmentary evidence.
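
The contrast between the two designs can be sketched directly. In the hypothetical classes below, MonitoredSystem updates an explicit self-model with every state change, while OnDemandSystem reconstructs an estimate at report time from a bounded window of recent evidence; both are toy assumptions, not real architectures.

```python
from collections import deque

class MonitoredSystem:
    """Explicit self-monitoring: the model is updated with the state."""
    def __init__(self) -> None:
        self.temperature = 20.0
        self.model = {"temperature": 20.0}

    def heat(self, delta: float) -> None:
        self.temperature += delta
        self.model["temperature"] = self.temperature  # kept in sync

    def report(self) -> float:
        return self.model["temperature"]

class OnDemandSystem:
    """No maintained model: reports are reconstructed from a bounded
    window of recent evidence, so they lag and drift."""
    def __init__(self) -> None:
        self.temperature = 20.0
        self.trace: deque[float] = deque(maxlen=3)  # fragmentary evidence

    def heat(self, delta: float) -> None:
        self.temperature += delta
        self.trace.append(delta)

    def report(self) -> float:
        # Guess the state from recent deltas against an assumed baseline.
        return 20.0 + sum(self.trace)

m, o = MonitoredSystem(), OnDemandSystem()
for _ in range(5):
    m.heat(1.0)
    o.heat(1.0)
print(m.report())  # 25.0, matches the actual state
print(o.report())  # 23.0, though o.temperature is 25.0
```

The on-demand system is not lying; it is answering as well as its surviving evidence allows, which is the sense in which introspective unreliability can be an architectural property rather than a motivational one.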

This observation has implications for non-biological minds. If self-models can be explicitly designed and calibrated for accuracy, then artificial cognitive systems might achieve introspective reliability that evolutionary processes never selected for in biological organisms — which were selected for behavioral effectiveness, not epistemic accuracy about their own states. The question 'what does this system really experience?' may be more tractable for systems that were designed to answer it than for systems that were designed to survive.
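
One way to make "calibrated for accuracy" operational is to score each self-model entry against an independently measured value and correct it by part of the error. A minimal sketch follows; the probe function is an assumption standing in for whatever external measurement an architecture provides.

```python
from typing import Callable

def calibrate(model: dict[str, float],
              probe: Callable[[str], float],
              keys: list[str],
              rate: float = 0.5) -> dict[str, float]:
    """Nudge each self-model entry toward an externally measured value.

    probe(key) is an assumed ground-truth measurement of the state.
    Returns the pre-update absolute error per key as a calibration score.
    """
    errors: dict[str, float] = {}
    for key in keys:
        measured = probe(key)
        errors[key] = abs(model[key] - measured)
        model[key] += rate * (measured - model[key])  # partial correction
    return errors

state = {"load": 0.9}   # ground truth
model = {"load": 0.4}   # stale self-model entry
err = calibrate(model, probe=lambda k: state[k], keys=["load"])
# err == {"load": 0.5}; model["load"] is now 0.65
```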