Toward a Closed-Loop Intelligence (2) - Beyond the Pipeline: Isolating the Failure Modes of Closed Loops
Intelligence as a Loop, Not a Model
If the first image of mind is the loop, then the first mistake is to forget it.
Much of modern work in artificial intelligence begins not from circulation, but from production. A system is given inputs, and it produces outputs. If the outputs improve, the system is said to learn. If they degrade, it is said to fail. The entire structure is judged at the boundary: what goes in, what comes out.
This framing is efficient, but incomplete.
It omits the question of return.
A system that maps inputs to outputs may perform well under stable conditions. But once the environment shifts, once the distribution changes, once the internal assumptions are no longer aligned with the world, such a system has no internal mechanism to notice the drift. It continues to produce, often fluently, but without correction. The failure is not always visible at the level of individual outputs. It accumulates in the trajectory.
The result is a particular kind of error: internally consistent, externally wrong.
A straight line cannot correct its own drift. Structure dictates outcome; the error is inevitable.
A pipeline has no obligation to return to its own premises. It transforms, but it does not re-examine. It produces, but it does not ask whether what it produces still belongs to the world it acts in.
If we return to the earlier image, we can restate the problem more precisely.
A system without a loop cannot recover.
From Production to Circulation
To speak of a loop is to introduce a different constraint.
A loop requires that outputs are not final. They must re-enter the system as inputs of another kind. Predictions must face observations. Plans must encounter consequences. Internal coherence must be exposed to external correction.
In physical systems, persistence is often achieved not by rigidity, but by flow. A vortex holds its form by continuously reconstituting itself. Remove the return, and the structure dissipates. Add the return, and motion becomes form.
The same principle applies to cognition.
A mind that cannot compare what it expected with what actually occurred cannot refine itself. A system that cannot feed its own predictions back into its perception cannot detect its own error. A policy that acts without revisiting its assumptions cannot distinguish between success and drift.
This suggests a minimal shift in perspective:
Intelligence is not primarily a mapping.
It is a regulated tension within a circulation of mappings.
A Minimal Closed Loop
In practical terms, this means decomposing what appears as a single act into a sequence of distinct roles, each of which must be explicitly represented and connected.
A minimal loop contains at least the following transitions:
- the world is encoded into a latent representation
- a prediction is formed about what comes next
- that prediction is compared against what actually occurs
- a correction is proposed when mismatch is detected
- an action is chosen and applied
- the consequences are re-encoded, closing the loop
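The transitions above can be sketched as a minimal loop. Every function name here (`encode`, `predict`, `correct`, `act`) is an illustrative placeholder for a learned component, not the project's actual operators:

```python
# Minimal closed-loop sketch; all components are hypothetical stand-ins.

def encode(observation):
    # world -> latent representation (here trivially the observation itself)
    return observation

def predict(latent):
    # form an expectation about the next latent state
    return latent + 1

def correct(latent, predicted, observed):
    # propose a correction when prediction and observation disagree
    error = observed - predicted
    return latent + 0.5 * error

def act(latent):
    # choose an action from the (possibly corrected) latent state
    return latent

def step(latent, next_observation):
    predicted = predict(latent)
    observed = encode(next_observation)
    if predicted != observed:                 # mismatch detected
        latent = correct(latent, predicted, observed)
    action = act(latent)
    return encode(next_observation), action   # consequences re-enter the loop
```

The essential property is structural: the return value of `step` is the input to the next `step`, so outputs are never final.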
This is a constraint on any implementation that aims to be stable under change.
In the current project, this constraint is made explicit through a modular decomposition. Each role is implemented as a separate operator, with a defined input and output, and with causal influence over the loop.
The current runtime structure follows a fixed sequence:
- perception produces a latent state
- prediction generates an expectation of the next state
- inconsistency triggers a revision candidate
- multiple candidates are evaluated and filtered
- one plan is selected and executed
- the outcome feeds back into the next step
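A minimal sketch of this fixed sequence, assuming toy forms for each operator; the function names, the scoring rule, and the plausibility bound are all assumptions for illustration, not the project's code:

```python
# Sketch of the fixed runtime sequence: perceive -> predict ->
# propose revisions -> filter -> select -> feed back.

def perceive(obs):
    return float(obs)                      # latent state

def predict(state):
    return state * 0.9                     # expectation of the next state

def propose_revisions(state, error, n=3):
    # inconsistency triggers several revision candidates
    return [state + error * k / n for k in range(1, n + 1)]

def filter_candidates(candidates, threshold=10.0):
    # evaluate and filter: keep candidates inside a plausibility bound
    return [c for c in candidates if abs(c) < threshold]

def select_plan(candidates, fallback):
    # pick one plan; fall back if everything was rejected
    return min(candidates, key=abs) if candidates else fallback

def loop_step(state, next_obs):
    expected = predict(state)
    observed = perceive(next_obs)
    error = observed - expected
    candidates = filter_candidates(propose_revisions(state, error))
    plan = select_plan(candidates, fallback=state)
    return observed, plan                  # outcome feeds the next step
```

Separating the roles this way is what makes the failure modes discussed below observable: each stage can be logged and inspected on its own.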
Once the loop is exposed, its failures become visible.
When the Loop Fails
The most immediate expectation is that closing the loop should improve robustness. If a system can detect and correct its own errors, it should drift less under perturbation.
The early experiments do not support this.
Instead, they reveal a more insidious failure: misaligned circulation.
In the current implementation, the loop introduces a new class of failure modes. One observed pattern is that the system learns to generate revised internal states that are structurally plausible, and therefore pass internal filtering, but do not lead to useful actions.
The loop is active. Correction is attempted. But the correction does not align with the external task.
The result is not divergence, but **convergence to a bad attractor.** The system becomes stable in the wrong way.
Empirically, this appears as follows:
- revised internal states are almost always accepted
- the selected plan is dominated by these revisions
- large changes in internal representation produce only small changes in action
- the policy collapses to a narrow set of behaviors with near-zero reward
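The pattern above can be quantified with simple loop diagnostics. The event fields here (`accepted`, `action`) are hypothetical; adapt them to whatever the loop actually logs:

```python
# Diagnostics for the bad-attractor pattern: near-total acceptance of
# revisions combined with a collapse to very few distinct behaviors.

def acceptance_rate(events):
    # fraction of revision candidates that passed internal filtering
    accepted = sum(1 for e in events if e["accepted"])
    return accepted / len(events)

def action_diversity(events):
    # number of distinct actions actually executed; a collapsed policy
    # maps a large internal state space onto very few behaviors
    return len({e["action"] for e in events})
```

An acceptance rate near 1.0 together with low action diversity is the signature of a loop that returns, but returns to the wrong place.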
The loop is intact. The system returns. But it returns to the wrong place.
This is a different kind of failure from the open pipeline. It is not the absence of correction, but the misdirection of correction.
The loop is necessary. Not sufficient.
Closure Without Constraint
A closed loop can stabilize error.
If the system’s internal criteria for plausibility are not aligned with external viability, then correction mechanisms will reinforce internally coherent but externally useless trajectories.
In other words, closure creates the possibility of self-consistency. But self-consistency alone does not guarantee truth.
The experiments make this visible:
In multiple configurations, the system exhibits high acceptance rates for internally generated plans, while the actual performance remains at zero under deterministic evaluation.
The loop has formed. But the selection mechanism is not yet grounded in long-term viability. It accepts what is locally consistent, not what is globally useful.
This is the same tension already suggested in the mythic framing.
The Ouroboros closes. But without departure, it cannot correct itself. The hero departs. But without return, nothing stabilizes.
A working system must do both, and must do so under constraint.
Toward Constrained Circulation
The immediate implication is that a loop must be regulated.
Not every return should be accepted. Not every internally consistent plan should be executed. The system must develop a notion of which trajectories are worth continuing.
In the current design, this role is partially assigned to a filtering operator that evaluates candidate plans based on a combination of learned feasibility and a complexity penalty inspired by description length.
The intention is simple: discourage unnecessarily complex internal constructions, and prefer plans that both fit the data and remain compressible.
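One plausible form of such a criterion, assuming a log-likelihood fit term and a bits-per-step complexity measure (both the weighting and the complexity measure are assumptions, not the project's actual formula):

```python
import math

# Description-length-style filtering score: reward fit to the data,
# penalize the length of the internal construction.

def plan_score(fit_log_likelihood, plan_length, beta=1.0):
    # higher is better; beta trades fit against compressibility
    return fit_log_likelihood - beta * plan_length * math.log(2)

def accept(score, threshold=-10.0):
    # too strict a threshold rejects everything (collapse);
    # too loose a threshold accepts everything (unproductive loops)
    return score > threshold
```

The single `threshold` parameter already exhibits the regimes described next: its value determines whether the loop collapses, drifts, or stabilizes.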
But here again, the naive expectation does not hold.
If the constraint is too weak, the system drifts into complex but useless internal structures.
If the constraint is too strong, the system rejects all candidates and collapses to trivial behavior.
If the constraint is delayed, the system may remain diverse, but fail to accumulate useful structure.
These regimes are not theoretical. They appear directly in the experiments:
- overly strict filtering leads to full fallback and behavioral collapse
- overly permissive filtering allows stable but unproductive loops
- delayed filtering avoids collapse but fails to produce meaningful policies
The system does not simply improve as constraints are added.
It reorganizes around them.
The Real Problem
The fundamental question is no longer whether a loop exists, but how it becomes selective.
More precisely:
- how should a system decide which internal corrections are worth acting on?
- how should short-term plausibility be aligned with long-term viability?
- how should different pathways inside the loop—direct observation, revised hypothesis, fallback behavior—be assigned credit?
In the current experiments, the bottleneck has shifted from representation to selection.
Earlier failures could be attributed to poor prediction or uncontrolled revision. After constraining those, the dominant issue becomes the semantics of choice: which path the system follows, and how that choice is reinforced over time.
This is why further progress cannot come from simply strengthening individual components. The problem lies in the interaction between them.
A Shift in Focus
This suggests a change in emphasis.
Instead of asking how to make each module more accurate, we should ask how to make their interaction more meaningful.
In particular:
- the correction mechanism must remain local enough to stay within the actionable space of the policy
- the filtering mechanism must reflect not only immediate plausibility, but the long-term viability of trajectories
- the learning signal must be aligned with the path actually taken, not with hypothetical alternatives
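The last constraint, path-aligned credit, can be sketched in a few lines. The pathway names and the update rule are illustrative assumptions:

```python
# Credit only the pathway that was actually executed, not the
# hypothetical alternatives that were filtered out.

def update_credits(credits, path_taken, reward, lr=0.1):
    # credits: e.g. {"direct": ..., "revised": ..., "fallback": ...}
    new = dict(credits)
    new[path_taken] += lr * (reward - credits[path_taken])
    return new
```

The point of the sketch is what it omits: pathways not taken receive no update, so internal plausibility alone cannot accumulate credit.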
These constraints are not yet fully implemented. But they are now visible.
The current system does not yet produce a robust agent, but it has achieved something more critical: it has isolated the failure modes of a particular architectural hypothesis. It shows that:
- closing the loop changes the nature of failure
- internal coherence can become decoupled from external success
- selection inside the loop is the central unresolved problem
What Comes Next
The next step is not to abandon the loop, but to refine it.
In the following piece, I will introduce the operator-level decomposition explicitly: the separation of perception, prediction, consistency checking, execution, valuation, and synchronization into distinct functional roles.
This will make the loop more concrete, and will allow each part of the system to be inspected, modified, and tested independently.
Only then can we return to the question raised at the beginning:
not whether a system can produce,
but whether it can return,
and return in a way that remains connected to the world.