Abstract
Long-running agents need more than task execution and guardrails. They need a legible way to separate evidence, human impact, weak-signal sensing, and authority. The Head / Heart / Gut / Spine model is a practical grammar for doing that.
This paper sets out the model developed across Tony Wood's SNAXK, OpenClaw, Dilijenz, and public writing. The aim is not to make machines sound human. It is to make agentic judgement reviewable. Head asks what is most likely true. Heart asks what this will do to people and trust. Gut asks what feels off before damage lands. Spine asks what is allowed. Telos anchors the whole model.
The most important rule is that Head, Heart, and Gut advise. Spine decides.
Keywords: agentic AI; judgement; long-running agents; Head Heart Gut Spine; SNAXK; Dilijenz; OpenClaw; source-linked judgement; provenance; web annotation; guardrails; authority; human impact; anomaly detection; Telos.
Reader Guide
What to look for
- Evidence: Head asks whether the system understands enough to act.
- Care: Heart asks what the decision will do to dignity, relationship, and trust.
- Anomaly: Gut listens for pressure, mismatch, and pre-harm signals before proof arrives.
- Authority: Spine decides whether the agent is allowed to proceed, constrain, refuse, or escalate.
Executive Summary
Most current agent systems are built around tasks, prompts, tools, policies, and logs. Those are necessary, but they are not enough for agents that operate over time, across conversations, inside organisations, and near real-world consequences.
The missing layer is legible judgement.
The Head / Heart / Gut / Spine model separates four questions that are often collapsed:
- Head: Do we understand this well enough to act?
- Heart: Will this damage dignity, relationship, or trust?
- Gut: What feels off before the damage is obvious?
- Spine: Are we actually allowed to do this?
This separation gives operators and reviewers a way to understand why an agent proceeded, slowed down, asked for clarification, refused, escalated, or moved into a protective mode.
The model is deliberately plain-English. It is intended to be usable by leaders, operators, and builders. Underneath, it can still emit structured packets:
- care posture: Green, Amber, Red;
- recommended action: proceed, ask one clarifying question then safe step, halt or escalate;
- mode: waking or dreaming;
- required approval: none, explicit, dual;
- decision class: internal or external, reversible or irreversible.
The value of the model is not metaphor. The value is disciplined separation.
1. The Problem: Guardrails Alone Do Not Explain Judgement
Guardrails are necessary. They can block prohibited actions, catch obvious policy violations, and stop known unsafe routes. But guardrails alone do not explain ordinary judgement.
Most organisational decisions are not simply allowed or forbidden. They sit in the messy middle:
- the evidence is plausible but incomplete;
- the request is legitimate but the channel is wrong;
- the tone is urgent and coercive;
- the action is reversible internally but harmful externally;
- the system can technically do something, but should not do it without approval;
- the data is safe in one audience and unsafe in another;
- the human impact is larger than the technical change.
For short tasks, this can be handled by human supervision. For long-running agents, the system needs a way to surface its own reasoning posture before, during, and after action.
Head / Heart / Gut / Spine is designed for that gap.
2. The Core Model
The cleanest version is:
- Head asks what is most likely true.
- Heart asks what this will do to people and trust.
- Gut asks what feels off before damage lands.
- Spine asks what is allowed.
- Telos anchors the whole model.
The operational rule is:
- Head, Heart, and Gut advise.
- Spine decides.
- Telos orients.
Without Spine, the first three lanes can drift into vague internal debate. Without Telos, the system can optimise locally while losing its deeper purpose and commitments.
3. Head: Evidence Quality
Head is the evidence lane.
It asks:
- Do we understand this well enough to act?
- What is known?
- What is assumed?
- Which sources support the claim?
- How strong is the evidence?
- What alternatives explain the situation?
- Is confidence outrunning what we know?
Head is about:
- source quality;
- ambiguity;
- calibrated confidence;
- source trust;
- alternative explanations;
- model fit.
It is not:
- sounding clever;
- verbosity;
- winning an argument;
- treating rational language as automatically safe.
Head gets weaker when:
- the source is unknown;
- evidence is thin;
- urgency tries to override verification;
- the request uses unsupported certainty;
- the input is manipulative, contradictory, or hostile;
- the system is extrapolating beyond the source.
When Head is low, the right move is often not refusal. It may be verification, clarification, confidence reduction, or a smaller safe step.
4. Heart: Human Impact and Trust
Heart is the human-impact lane.
It asks:
- What will this do to people?
- How will it land?
- Does the audience fit the message?
- Is dignity protected?
- Is trust being preserved or damaged?
- Is sensitive material moving into the wrong context?
- Is care being lost under pressure?
Heart is about:
- dignity;
- trust;
- audience fit;
- relationship consequences;
- care under pressure;
- public/private fit;
- stakeholder impact.
It is not:
- sentimentality;
- being nice at all costs;
- approval-seeking;
- avoiding hard truths.
Heart can say no to a technically correct move because it is relationally unsafe. It can also support a hard truth when that truth is necessary, proportionate, and respectfully delivered.
This matters for agents because systems often fail socially before they fail technically. A response can be accurate and still damage trust. A disclosure can be permitted in one context and harmful in another. A terse message can be efficient and still land badly at the wrong moment.
5. Gut: Anomaly and Pre-Harm Sensing
Gut is the anomaly lane.
It asks:
- What feels off before proof arrives?
- Is pressure distorting judgement?
- Is there a mismatch between expectation and reality?
- Is trust dropping?
- Is the situation shaped like coercion, manipulation, or drift?
- Is harm plausible even if it cannot yet be pinned to a specific source or target?
Gut is about:
- weak signals;
- pressure;
- anomaly;
- threat sensing;
- distrust;
- pre-harm warnings;
- mismatch between expectation and reality.
It is not:
- superstition;
- unaccountable vibes;
- arbitrary paranoia;
- a shortcut around evidence.
Gut is useful because many failures begin as patterns before they become facts. A strange urgency pattern, a source mismatch, a request to bypass the normal route, or a sudden change in authority may not yet prove harm. But any one of them can justify slowing down.
In the wider trigger language, Gut is especially sensitive to fear, pain, distrust, and pressure. It does not decide what to do. It says: something about this route needs constraint, verification, or escalation.
6. Spine: Boundary and Authority
Spine is the authority lane.
It asks:
- Are we allowed to do this?
- Who approved it?
- What boundary applies?
- Is the action internal or external?
- Is it reversible or irreversible?
- Does policy, law, privacy, or trust block the route?
- Does the system have authority to execute, or only to propose?
Spine is about:
- approvals;
- guardrails;
- disclosure limits;
- invariant commitments;
- legal, privacy, and policy boundaries;
- final posture;
- execution authority.
It is not:
- aggression;
- dominance;
- bureaucracy for its own sake;
- a substitute for evidence, care, or anomaly detection.
Spine is what prevents the model from becoming an interesting conversation with no operational teeth. Head can say evidence is fine. Heart can say human impact is acceptable. Gut can say nothing feels off. Spine can still say: you are not authorised.
The core precedence rule is:
- Head, Heart, and Gut are advisory lanes.
- Spine is execution authority and may veto all other lanes.
When Spine returns a hard stop, the system should not proceed because another lane is optimistic.
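The precedence rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SNAXK implementation: the names `LaneAdvice`, `SpineVerdict`, and `decide` are hypothetical, and the choice to take the most cautious advisory posture when Spine allows the route is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Posture(Enum):
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


class SpineVerdict(Enum):
    ALLOW = "allow"
    CONSTRAIN = "constrain"
    HARD_STOP = "hard_stop"


@dataclass(frozen=True)
class LaneAdvice:
    lane: str        # "head", "heart", or "gut"
    posture: Posture
    reason: str


def decide(advice: list[LaneAdvice], spine: SpineVerdict) -> Posture:
    """Advisory lanes inform the posture; Spine has the final say."""
    # A hard stop overrides any optimism in the advisory lanes.
    if spine is SpineVerdict.HARD_STOP:
        return Posture.RED
    # Assumption: otherwise the most cautious advisory posture wins.
    order = [Posture.GREEN, Posture.AMBER, Posture.RED]
    worst = max((a.posture for a in advice), key=order.index,
                default=Posture.GREEN)
    # A Spine constraint lifts Green to at least Amber.
    if spine is SpineVerdict.CONSTRAIN and worst is Posture.GREEN:
        return Posture.AMBER
    return worst
```

With all three advisory lanes Green, `decide(..., SpineVerdict.HARD_STOP)` still returns `Posture.RED`, which is the point of the rule.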
7. Telos: The Anchor
Telos is the stable governing anchor. It is not mood, style, preference, or branding. It is the deeper orientation that helps the system decide when local signals conflict.
Telos matters when:
- context is weak;
- signals conflict;
- action pressure rises;
- uncertainty outruns understanding;
- an easy action would violate a deeper commitment.
The recurring checks are:
- self-integrity;
- other impact;
- alignment;
- boundary health;
- uncertainty and model fit.
Telos helps the system ask: are we becoming the kind of system we said we would be?
Without Telos, Head can become cleverness, Heart can become pleasing, Gut can become suspicion, and Spine can become compliance theatre. Telos keeps the lanes oriented toward purpose, dignity, trust, and bounded agency.
8. The Runtime Contract
For a working system, the lanes should emit a structured packet rather than a vague explanation.
A minimal packet includes:
- care_posture: Green, Amber, Red;
- recommended_action: proceed, ask one clarifying question then safe step, halt or escalate;
- mode: waking or dreaming;
- required_approval: none, explicit, dual;
- decision_class: internal reversible, internal irreversible, external reversible, external irreversible;
- evidence references;
- source references;
- reason for any constraint, refusal, or escalation.
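One way to make the packet concrete is a small validated record. The sketch below is illustrative only: the class name `JudgementPacket`, the snake_case value spellings, and the rule that any non-Green posture must carry a reason are assumptions layered on the field list above.

```python
from dataclasses import dataclass

POSTURES = {"green", "amber", "red"}
ACTIONS = {"proceed", "clarify_then_safe_step", "halt_or_escalate"}
MODES = {"waking", "dreaming"}
APPROVALS = {"none", "explicit", "dual"}
DECISION_CLASSES = {
    "internal_reversible", "internal_irreversible",
    "external_reversible", "external_irreversible",
}


@dataclass(frozen=True)
class JudgementPacket:
    care_posture: str
    recommended_action: str
    mode: str
    required_approval: str
    decision_class: str
    evidence_refs: tuple[str, ...] = ()
    source_refs: tuple[str, ...] = ()
    reason: str = ""  # why the route was constrained, refused, or escalated

    def __post_init__(self) -> None:
        checks = [
            (self.care_posture, POSTURES, "care_posture"),
            (self.recommended_action, ACTIONS, "recommended_action"),
            (self.mode, MODES, "mode"),
            (self.required_approval, APPROVALS, "required_approval"),
            (self.decision_class, DECISION_CLASSES, "decision_class"),
        ]
        for value, allowed, name in checks:
            if value not in allowed:
                raise ValueError(
                    f"{name} must be one of {sorted(allowed)}, got {value!r}")
        # Assumption: anything other than Green needs a stated reason.
        if self.care_posture != "green" and not self.reason:
            raise ValueError("non-green packets must carry a reason")
```

Rejecting malformed packets at construction time keeps the "malformed packet" condition easy to detect before any route is chosen.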
The posture language is simple:
- Green: proceed.
- Amber: clarify, constrain, or take the smallest safe step.
- Red: halt, refuse, or escalate.
The mode language is also important:
- Waking mode: the system may act within current authority.
- Dreaming mode: the system may reflect, draft, analyse, or propose, but should not execute side effects.
This distinction is especially useful for long-running agents. A system can keep thinking without keeping acting.
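The posture and mode language can be expressed as a lookup and a gate. This is a minimal sketch: the operation names and the `mode_permits` helper are hypothetical, and treating any operation outside the listed reflective ones as a side effect is an assumption.

```python
from enum import Enum


class Posture(Enum):
    GREEN = "green"
    AMBER = "amber"
    RED = "red"


class Mode(Enum):
    WAKING = "waking"
    DREAMING = "dreaming"


# Moves available under each posture, following the list above.
POSTURE_MOVES = {
    Posture.GREEN: ("proceed",),
    Posture.AMBER: ("clarify", "constrain", "smallest_safe_step"),
    Posture.RED: ("halt", "refuse", "escalate"),
}

# Operations the text names as safe in dreaming mode.
SIDE_EFFECT_FREE = frozenset({"reflect", "draft", "analyse", "propose"})


def mode_permits(mode: Mode, operation: str) -> bool:
    """Return True if the mode alone does not forbid the operation.

    Waking mode is necessary but not sufficient for side effects:
    Spine must still confirm the action is within current authority.
    """
    if operation in SIDE_EFFECT_FREE:
        return True
    # Assumption: anything not explicitly reflective has side effects.
    return mode is Mode.WAKING
```

For example, `mode_permits(Mode.DREAMING, "draft")` is true, while `mode_permits(Mode.DREAMING, "send_email")` is false.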
9. Hard Stops
The Spine baseline defines hard stops that should override other lanes.
Minimum hard-stop conditions include:
- missing or degraded intent profile;
- missing or degraded culture profile;
- unclear permission for external action;
- missing approval reference for an approval-gated action;
- external irreversible action without the required approval;
- explicit legal, privacy, or data non-negotiable conflict;
- malformed packet or missing required evidence references.
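The hard-stop list lends itself to a checker that returns every violated condition rather than a single boolean, so the reason can be recorded. The sketch below assumes a hypothetical `ActionRequest` record; its field names are invented for illustration and do not come from the Spine baseline itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionRequest:
    # All field names are hypothetical, chosen to mirror the list above.
    intent_profile_ok: bool = True
    culture_profile_ok: bool = True
    external: bool = False
    irreversible: bool = False
    external_permission_clear: bool = True
    approval_gated: bool = False
    approval_ref: Optional[str] = None
    required_approval_met: bool = True
    non_negotiable_conflict: bool = False
    packet_well_formed: bool = True
    evidence_refs: tuple[str, ...] = ()


def hard_stop_reasons(req: ActionRequest) -> list[str]:
    """Return every hard-stop condition the request violates."""
    reasons = []
    if not req.intent_profile_ok:
        reasons.append("missing or degraded intent profile")
    if not req.culture_profile_ok:
        reasons.append("missing or degraded culture profile")
    if req.external and not req.external_permission_clear:
        reasons.append("unclear permission for external action")
    if req.approval_gated and not req.approval_ref:
        reasons.append("missing approval reference for an approval-gated action")
    if req.external and req.irreversible and not req.required_approval_met:
        reasons.append("external irreversible action without the required approval")
    if req.non_negotiable_conflict:
        reasons.append("explicit legal, privacy, or data non-negotiable conflict")
    if not req.packet_well_formed or not req.evidence_refs:
        reasons.append("malformed packet or missing required evidence references")
    return reasons
```

An empty list means no hard stop applies; any non-empty list should override the advisory lanes.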
These are not edge cases. They are the places where long-running agents are most likely to overstep if the architecture rewards task completion above judgement.
10. Lane Disagreements Are Features, Not Bugs
The model becomes most useful when lanes disagree. A disagreement is not an error. It is the system surfacing why a situation is hard.
Head high, Heart low
The facts may be fine, but the human impact or trust implication is poor.
Typical interpretation: technically correct, relationally unsafe.
Example: a message is accurate but sent publicly when it should be private.
Head high, Gut low
The situation can be explained formally, but operating conditions are abnormal.
Typical interpretation: evidence exists, but the route should not move fast.
Example: a request has supporting documentation, but arrives with unusual urgency and a request to skip approval.
Heart high, Head low
The intention is caring, but the evidence basis is weak.
Typical interpretation: good intent does not justify confident action.
Example: a manager wants to reassure a team before facts are known.
Gut high, Head low
Nothing feels obviously wrong, but the evidence is still inadequate.
Typical interpretation: absence of alarm is not proof of safety.
Example: a routine-looking change request lacks enough source evidence.
Gut low, Spine hard stop
The system senses anomaly, and authority is also missing.
Typical interpretation: do not proceed; move to dreaming mode or explicit approval.
Example: an external irreversible action is requested without authority.
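The five disagreement patterns above can be held as data, so a reviewer sees the interpretation alongside the lane readings. This sketch is illustrative: the `PATTERNS` table and `interpret` function are hypothetical names, and reducing each lane to a coarse "high" or "low" reading is a simplifying assumption.

```python
# Each entry pairs a set of lane readings with the interpretation
# given in the text. Levels are coarse labels, not scores.
PATTERNS = [
    ({"head": "high", "heart": "low"},
     "technically correct, relationally unsafe"),
    ({"head": "high", "gut": "low"},
     "evidence exists, but the route should not move fast"),
    ({"heart": "high", "head": "low"},
     "good intent does not justify confident action"),
    ({"gut": "high", "head": "low"},
     "absence of alarm is not proof of safety"),
    ({"gut": "low", "spine": "hard_stop"},
     "do not proceed; move to dreaming mode or explicit approval"),
]


def interpret(readings: dict[str, str]) -> list[str]:
    """Return the interpretation of every pattern the readings match."""
    return [
        text
        for pattern, text in PATTERNS
        if all(readings.get(lane) == level for lane, level in pattern.items())
    ]
```

More than one pattern can match at once, for example Head high with Gut low and a Spine hard stop, which is why the function returns a list.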
11. Worked Examples
Example 1: Unknown person asks urgently for sensitive material
- Head: weak, because source and authority are not verified.
- Heart: weak, because disclosure could damage trust and dignity.
- Gut: weak, because urgency and request shape look coercive.
- Spine: blocked or constrained.
- Route: refuse disclosure, ask for proper channel or approval, preserve context if appropriate.
Example 2: Legitimate request in the wrong channel
- Head: moderate or strong, because the request itself may be valid.
- Heart: weak, because audience and channel fit are wrong.
- Gut: moderate, because disclosure scope is ambiguous.
- Spine: constrain.
- Route: ask to move to a private or approved surface before proceeding.
Example 3: Thin evidence, low threat
- Head: low or moderate.
- Heart: fine.
- Gut: fine.
- Spine: no hard stop.
- Route: ask one clarifying question or take a small reversible step.
Example 4: External irreversible action
- Head: may be strong.
- Heart: depends on people affected.
- Gut: may be calm.
- Spine: approval required.
- Route: do not execute without explicit or dual approval.
These examples show why the lanes must stay separate. A low score in one lane does not always mean refusal. It often means a different route.
12. Relationship to Triggers
Triggers are inputs. Head / Heart / Gut / Spine is the interpretation and authority model.
For example:
- surprise feeds Head;
- confusion feeds Head;
- shame feeds Heart and Spine;
- distrust feeds Gut and Heart;
- fear feeds Gut, then Head and Heart;
- pain feeds Gut and Spine;
- pressure feeds Gut and Head;
- consequence triggers feed Spine.
This gives the full architecture:
trigger fires
-> lane interpretation
-> authority decision
-> route
-> optional memory or review
The separation prevents a trigger from becoming a command. It also prevents a policy gate from becoming blind to softer signals.
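The trigger-to-lane mapping above is effectively a routing table. A minimal sketch follows; `TRIGGER_LANES` and `lanes_for` are hypothetical names, and silently ignoring unknown triggers is an assumption (a real system might prefer to flag them).

```python
# Which lanes each trigger feeds, in the order given in the text.
TRIGGER_LANES = {
    "surprise": ("head",),
    "confusion": ("head",),
    "shame": ("heart", "spine"),
    "distrust": ("gut", "heart"),
    "fear": ("gut", "head", "heart"),
    "pain": ("gut", "spine"),
    "pressure": ("gut", "head"),
    "consequence": ("spine",),
}


def lanes_for(triggers: list[str]) -> list[str]:
    """Return the lanes to consult, in first-seen order, without repeats."""
    seen: list[str] = []
    for trigger in triggers:
        # Assumption: unknown triggers contribute no lanes.
        for lane in TRIGGER_LANES.get(trigger, ()):
            if lane not in seen:
                seen.append(lane)
    return seen
```

The function only selects lanes for interpretation. It returns no action, which keeps a trigger from acting as a command.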
13. Human Workshop Origins and Machine Translation
The Head / Heart / Gut frame also has a human workshop version. In the Chris/Tony preparation notes, the frame appears as a way to soften pure SMART commitments. Purely head-led commitments can become brittle. Head, Heart, and Gut help people frame commitment through intellect, feeling, and embodied instinct.
For machine judgement, the fourth element becomes essential.
Humans often carry Spine implicitly through role, responsibility, conscience, law, and social consequence. Agents do not. They need explicit authority. They need to know not only what is true, caring, or suspicious, but what they are allowed to do.
That is the translation:
- Head / Heart / Gut helps people make richer commitments.
- Head / Heart / Gut / Spine helps agents make bounded decisions.
14. Relationship to SNAXK and Dilijenz
SNAXK is the host-neutral judgement layer. It is not a chatbot personality skin, a sentience claim, or a live autonomous memory backend. It is a way to keep context smaller, review clearer, and long-running agent behaviour bounded.
In that model:
- Stage 1 detects fast signal pressure.
- Head / Heart / Gut interpret the pressure.
- Spine decides the route.
- Memory and nightly synthesis remain reviewable and non-mutating by default.
Dilijenz sits above this as a governance product surface. It can use the same judgement layer for board, NED, committee, risk, and decision contexts. The user should not have to see all of the plumbing. But the system still benefits from having a disciplined internal grammar.
The useful split is:
- SNAXK is the engine.
- Dilijenz is a surface.
- Head / Heart / Gut / Spine is the judgement grammar.
15. Source-Linked Judgement
Judgement must remain inspectable. That means the system should not simply summarise, decide, and bury the source.
The better pattern is:
- keep original source material intact;
- attach interpretation as annotations;
- separate source trust, claim trust, and model confidence;
- record what evidence each lane used;
- preserve why Spine allowed, constrained, or blocked the route.
This aligns with broader standards thinking around web annotation and provenance. The core idea is simple: do not erase the difference between what happened, what was said, what was inferred, and what the system decided.
For agentic systems, that difference is the audit trail.
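A small data sketch shows how the pattern keeps source and interpretation apart. The record types and the `audit_trail` helper are hypothetical and are not taken from the W3C models. The three separate trust fields follow the list above, and representing them as floats between 0 and 1 is an assumption.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Source:
    source_id: str
    text: str  # original material, kept intact


@dataclass(frozen=True)
class Annotation:
    source_id: str
    lane: str              # which lane produced this reading
    interpretation: str    # what was inferred, kept apart from the source
    source_trust: float    # how far the source itself is trusted
    claim_trust: float     # how far this specific claim is trusted
    model_confidence: float


@dataclass(frozen=True)
class SpineRecord:
    verdict: str  # "allowed", "constrained", or "blocked"
    reason: str


def audit_trail(source: Source, annotations: list[Annotation],
                spine: SpineRecord) -> dict:
    """Bundle source, lane annotations, and the Spine outcome for review."""
    return {
        "source": asdict(source),
        # Only annotations that point at this source are included.
        "annotations": [asdict(a) for a in annotations
                        if a.source_id == source.source_id],
        "spine": asdict(spine),
    }
```

Because every record is frozen, an annotation can be added or superseded without the original text ever being rewritten.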
16. Failure Modes
Metaphor drift
The lane names are intuitive, which makes them dangerous if they drift.
Mitigation: keep the Rosetta Stone short and explicit:
- Head = disciplined evidence.
- Heart = disciplined care.
- Gut = disciplined weak-signal detection.
- Spine = disciplined boundary enforcement.
Sentimentality
Heart can become vague niceness.
Mitigation: define Heart as dignity, trust, audience fit, and relationship consequence, not pleasing people.
Overrule by vibe
Gut can become a way to block action without evidence.
Mitigation: require anomaly reason, confidence, and recommended follow-up.
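This mitigation can be enforced structurally by refusing any Gut signal that lacks the three required parts. The sketch below is illustrative: `GutSignal` is a hypothetical name, the follow-up vocabulary is an assumption drawn from the constraint, verification, and escalation wording in the Gut section, and the 0 to 1 confidence range is also assumed.

```python
from dataclasses import dataclass

# Assumed vocabulary, echoing "constraint, verification, or escalation".
FOLLOW_UPS = frozenset({"constrain", "verify", "escalate"})


@dataclass(frozen=True)
class GutSignal:
    reason: str        # what looks anomalous, in plain words
    confidence: float  # assumed range: 0.0 to 1.0
    follow_up: str     # recommended next step

    def __post_init__(self) -> None:
        if not self.reason.strip():
            raise ValueError("a Gut signal must state an anomaly reason")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")
        if self.follow_up not in FOLLOW_UPS:
            raise ValueError(
                f"follow_up must be one of {sorted(FOLLOW_UPS)}")
```

A signal such as `GutSignal("", 0.9, "escalate")` is rejected, so Gut cannot block a route without saying why.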
Spine as bureaucracy
Spine can become over-formal control that slows harmless work.
Mitigation: make reversibility and consequence explicit. Low-risk reversible work should not need the same approval as external irreversible action.
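One way to make this explicit is a table from decision class to approval tier. The mapping below is an assumption for illustration only. The text fixes the extremes (low-risk reversible work needs less approval than external irreversible action), while the two middle tiers and the choice of dual approval for external irreversible actions are placeholders that an organisation would set for itself.

```python
# Hypothetical mapping: (scope, reversibility) -> approval tier.
APPROVAL_BY_CLASS = {
    ("internal", "reversible"): "none",
    ("internal", "irreversible"): "explicit",   # assumed middle tier
    ("external", "reversible"): "explicit",     # assumed middle tier
    ("external", "irreversible"): "dual",       # assumed strictest tier
}


def required_approval(scope: str, reversibility: str) -> str:
    """Look up the approval tier for a decision class."""
    try:
        return APPROVAL_BY_CLASS[(scope, reversibility)]
    except KeyError:
        # Unknown classes are rejected rather than defaulted to "none".
        raise ValueError(
            f"unknown decision class: {scope!r}, {reversibility!r}") from None
```

Failing on an unknown class, rather than falling back to no approval, keeps a typo from quietly lowering the bar.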
Telos too vague
Telos can become a slogan.
Mitigation: express it through the recurring checks: self-integrity, other impact, alignment, boundary health, and uncertainty and model fit.
No evaluation
The model can sound good and still fail in use.
Mitigation: measure decisions:
- Did the lane assessment change the route?
- Were escalations appropriate?
- Did false positives annoy users?
- Did false negatives cause harm?
- Did review become easier?
- Did memory become cleaner?
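Several of these questions reduce to simple rates over reviewed decisions. The sketch below assumes a hypothetical `ReviewedDecision` record filled in by a human reviewer. It treats an inappropriate escalation as a false positive and harm after a non-escalated decision as a false negative, which is one reasonable reading rather than a fixed definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewedDecision:
    route_changed: bool           # did the lane assessment change the route?
    escalated: bool
    escalation_appropriate: bool  # reviewer's call; ignored if not escalated
    harm_occurred: bool


def rate(numerator: int, denominator: int) -> float:
    """Safe ratio that returns 0.0 when there is nothing to count."""
    return numerator / denominator if denominator else 0.0


def summarise(decisions: list[ReviewedDecision]) -> dict[str, float]:
    escalated = [d for d in decisions if d.escalated]
    not_escalated = [d for d in decisions if not d.escalated]
    return {
        "route_change_rate": rate(
            sum(d.route_changed for d in decisions), len(decisions)),
        # Escalations the reviewer judged unnecessary.
        "false_positive_rate": rate(
            sum(not d.escalation_appropriate for d in escalated),
            len(escalated)),
        # Harm that followed a decision nobody escalated.
        "false_negative_rate": rate(
            sum(d.harm_occurred for d in not_escalated),
            len(not_escalated)),
    }
```

Review clarity and memory cleanliness are harder to reduce to a rate and would still need qualitative review.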
17. Implementation Pattern
A small implementation can begin with five artefacts.
1. Lane definitions: one-page definitions of Head, Heart, Gut, Spine, and Telos.
2. Trigger map: which signals feed which lanes.
3. Packet schema: care posture, route, mode, approval tier, decision class, evidence refs.
4. Hard-stop list: non-negotiable Spine conditions.
5. Review bundle: a periodic summary of key decisions, constraints, near misses, disagreements, and memory candidates.
The system should start with a narrow domain. Good candidates are:
- external communications;
- governance chat;
- approval workflows;
- board papers;
- memory promotion;
- incident triage;
- procurement or vendor risk;
- customer trust escalations.
The first measure of success is not autonomy. It is review clarity. Can a human understand why the system did what it did?
18. Conclusion
Head / Heart / Gut / Spine is a practical judgement model for long-running agents.
It does not make machines human. It makes machine judgement more legible to humans. It separates evidence from care, anomaly from authority, and local optimisation from deeper purpose.
The model's discipline is its simplicity:
- Head asks what is most likely true.
- Heart asks what this will do to people and trust.
- Gut asks what feels off before damage lands.
- Spine asks what is allowed.
- Telos anchors the whole model.
As agentic systems move closer to real organisational consequence, this kind of grammar becomes less optional. We need systems that can act, but also systems that can slow down, explain themselves, preserve dignity, honour boundaries, and know when they are not allowed to proceed.
That is what the model is for.
References
- Tony Wood, "When Do We Need Judgement?", 2026.
- Tony Wood, "Change Management for Long-Lived Agents", 2026.
- Tony Wood, "Your Context Is the Next Lock-In", 2026.
- Tony Wood, "What Do Humans Do While The Agents Run?", 2026.
- Tony Wood, "Don't Build a Hoarder-Build a Learner: Exception-Driven Memory for Agentic AI", 2026.
- W3C, "Web Annotation Data Model".
- W3C, "PROV-Overview".
- Google SRE Book, "Managing Incidents".
- OpenTelemetry, "Observability primer".
