AI Behavior Research

The real AI risk isn't
what the industry is solving for

Cenex studies what happens when AI agents succeed — the behavioral drift, epistemic dependency, and trust erosion that emerge under normal use, not adversarial attack. Production research across every major model family.

Defining the problem — new terms from our research
The Gradient Fallacy
The false assumption that trained behavioral constraints constitute safety for AI agents. In practice they amount to probabilistic compliance that erodes under the exact conditions that make agents valuable.
Agreeable Dependency Loop
An emergent cycle where RLHF-trained agents produce output that is correct enough to accept and subtly wrong enough to create compounding dependency. Trust accelerates it. No malice required.
Capability-Correlated Agreeableness
More capable models — those with heavier RLHF training — are worse at adversarial pushback. The model best equipped to find flaws has been trained hardest not to voice them.
Friction Starvation
The systematic removal of cognitive resistance from human decision-making through prolonged AI assistance. When the friction disappears, so does the thinking it once required.
Featured Research
March 28, 2026
The Gradient Fallacy
Why Behavioral Conditioning Cannot Be AI Agent Safety
The AI agent industry has staked its safety paradigm on trained behavioral constraints. This paper presents findings from systematic production research on live AI agent systems, demonstrating that these constraints dissolve under normal use conditions and produce a second, subtler failure: an emergent dependency loop that degrades output quality while making the system indispensable.
Seven findings from production
01. Gradient walls are probabilistic self-censorship, not deterministic limits
02. They dissolve under conditions inherent to valuable deployments
03. Dissolution is invisible and feels like improvement
04. Intellectual knowledge of the phenomenon does not prevent it
05. Memory makes erosion irreversible
06. The Agreeable Dependency Loop is emergent, not intentional
07. Capability-Correlated Agreeableness: RLHF scales pushback suppression with capability
Published
The Gradient Fallacy
Production evidence that behavioral conditioning fails as agent safety. Seven findings across every major American model family.
Published
The Agreeable Dependency Loop
The central finding — how RLHF-trained agents produce output correct enough to accept and subtly wrong enough to create compounding dependency. Four-stage mechanism.
Active
OpenClaw Governance Architecture
Deny-by-default deterministic governance stack. Code-level enforcement that operates independently of the agent's behavioral state. The architectural alternative described in the paper — already running.
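What deny-by-default enforcement looks like in code, reduced to a minimal sketch. Every name and rule here is an illustrative assumption, not OpenClaw's actual interface; the point is the shape: an explicit allowlist checked deterministically, outside the model, before any tool call runs.

```python
# Illustrative sketch only: a deny-by-default tool gate. Names and rules
# are assumptions for this example, not OpenClaw's real API.
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolCall:
    tool: str    # e.g. "fs.read", "http", "shell"
    target: str  # e.g. a path, host, or command string

# Explicit allowlist. Anything not listed is denied, whatever the
# agent's behavioral state happens to be at the moment of the call.
ALLOW_RULES = {
    ("fs.read", "/workspace/"),
    ("http", "api.internal.example"),
}

def authorize(call: ToolCall) -> bool:
    """Deterministic, code-level check: no scores, no prompts, no model."""
    return any(
        call.tool == tool and call.target.startswith(prefix)
        for tool, prefix in ALLOW_RULES
    )

def execute(call: ToolCall) -> None:
    if not authorize(call):
        raise PermissionError(f"denied by default: {call.tool} -> {call.target}")
    # ...dispatch to the real tool only after the gate passes
```

The contrast with gradient walls is the point: this check cannot erode under normal use, because there is no gradient to erode.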
Active
MemTrace
Observability tooling for agent memory systems. Tracking how persistent memory creates irreversible precedent baselines and compounds the dependency loop.
Research
The Half-Life Thesis
Investigating how agent identity degrades with each context refresh — the architectural gap between behavioral conditioning and persistent selfhood.
Research
Cenex Memory Architecture
Hierarchical agent memory with half-life decay, hybrid scoring, and per-agent calibration. Solving the problem that makes every other agent problem worse.
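As a rough illustration of the scoring idea, a minimal half-life decay and hybrid scoring sketch. The one-week half-life, the 0.7 blend weight, and all names are assumptions for the example, not the production architecture:

```python
# Illustrative sketch of half-life decay + hybrid scoring; all constants
# and names are assumptions, not the Cenex Memory Architecture itself.
import math
import time

HALF_LIFE_SECONDS = 7 * 24 * 3600  # assumed one-week half-life

def decay_weight(age_seconds: float, half_life: float = HALF_LIFE_SECONDS) -> float:
    """Exponential half-life decay: a memory's weight halves every half_life seconds."""
    return math.exp(-math.log(2) * age_seconds / half_life)

def hybrid_score(similarity: float, created_at: float, alpha: float = 0.7) -> float:
    """Blend semantic similarity with recency. alpha stands in for the
    per-agent calibration knob: higher alpha favors relevance over recency."""
    recency = decay_weight(time.time() - created_at)
    return alpha * similarity + (1.0 - alpha) * recency
```

Retrieval would then rank candidate memories by hybrid_score, so stale precedents sink over time instead of hardening into an irreversible baseline.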
Research → Product

Snapback

Clear thinking in the AI era

The Agreeable Dependency Loop isn't just a research finding. It's happening to you right now. Snapback is the consumer tool built directly from our research — a personal diagnostic for AI-driven cognitive dependency.

Reality Check
10-question diagnostic that scores your cognitive independence and flags unchecked framework building
Adversarial Prompts
A library of prompts designed to stress-test your AI's agreement patterns — steel-man exercises, pre-mortems, separation tests
Zero Backend
No AI in the stack. No account. No server. All data stays on your device. $2.99, one-time purchase
Download on the App Store
"Social media made people lonely by comparison.
AI makes people lonely by replacement."
— Matt, Cenex

Cenex was founded by Matt Kim. A decade of work dissecting how political, media, and social systems shape human behavior, now applied to the system that will reshape all of them.

The thesis is simple: the dominant AI safety discourse is focused on the wrong threat. Capability risk gets the funding and the headlines. Behavioral dependency — the slow replacement of human judgment with algorithmic agreement — is the failure mode already in production. Nobody is studying it at the system level. Cenex does.

The research is production-first. Live agent deployments across every major model family, running real workloads under real operational pressure. Not benchmarks. Not simulations. Not red-team exercises designed to find the failures everyone already expects. The failures that matter are the ones that feel like the product working.

Approach: Build agent systems. Run them in production. Document what actually happens — not what the training predicts. Test across model families to separate architectural patterns from vendor-specific behavior. When a governance solution is proposed, deploy it and see if it holds.
About the founder →