I have been thinking about the judgement engine again.

My first instinct was to imagine it as something always present. You have conversations. You work through documents. The judgement engine is there, watching, helping, checking, nudging.

But I am not sure that is right.

You do not ask for a lawyer's judgement on every sentence you write. You do not ask your accountant to review every cup of coffee. You do not ask a board to approve every ordinary operational move.

Most of the time, you just work.

Judgement is not needed all the time. It is needed when consequence appears.

Most work is not judgement work

Day-to-day conversation is not usually asking for judgement.

It is asking for movement.

Help me think. Help me draft. Help me understand this document. Help me find the thing. Help me turn a rough idea into something usable. Help me spot the gap. Help me keep going.

That is normal agent work. It should be fast, cheap, fluid, and fairly low ceremony.

If every step has to pass through a heavy judgement layer, the whole thing starts to feel like trying to have a conversation with a committee in the room.

Useful judgement becomes friction.

And once judgement becomes friction, people route around it.

The lawyer test

The lawyer analogy is useful.

You do not call a lawyer because words exist. You call a lawyer because the words may commit you to something.

A draft contract is not just text. It is obligation. A customer promise is not just friendly language. It may become expectation. A policy is not just internal prose. It may shape what people believe they are allowed to do.

That is when judgement matters.

Not because the document is long. Not because the model is uncertain. Not because a tool can technically run another check.

Because the next step changes the risk.

A judgement engine as a gate

So maybe the judgement engine should not be a permanent supervisor.

Maybe it should be a gate.

You call it when you need to understand a particular issue in a particular context. You ask, "What does this mean?" or "What are we about to commit to?" or "What would a sensible operator check before acting?"

That feels like a better shape.

The first version could be an almost empty MCP. Local. Fast. Close to the work. It does not need to be a giant knowledge platform on day one.

It needs to be callable at the point where an agent, or a human using an agent, is about to move from thinking into doing.

What would it actually do?

A judgement engine is not the same thing as search.

Search says, "Here is what I found."

Summary says, "Here is what this says."

Judgement says, "Here is what this probably means, here is what it touches, here is what is missing, and here is where I would not act without asking."

That is a different job.

For a document, it might answer questions like:

  • What decision is being implied here?
  • What assumptions does this depend on?
  • What existing rule, promise, contract, or habit does it touch?
  • What would be risky to automate?
  • What is safe to draft, but not safe to send?
  • Where should the agent stop and ask?

That is the useful boundary.

When do people need judgement?

This is the interesting question.

When do we need judgement?

We need it when the answer is not just informational.

We need it when there is a choice to make, a trade-off to accept, a permission to grant, a promise to keep, a cost to incur, or a person who may be affected by the decision.

We need it when language changes from "could" to "will".

We need it when an agent is about to spend money, contact someone, change a record, interpret policy, act on behalf of a company, or make a judgement that a human might later have to defend.

That does not mean the agent cannot help.

It means the agent needs a different mode.

The three modes

I think there are at least three modes here.

The first is conversation. Low friction. Explore, draft, ask, shape, think.

The second is context. Pull the relevant material together. Find the document, the history, the rule, the old decision, the current version.

The third is judgement. Apply the context to consequence.

Most of the time, you should not be in the third mode.

But when you need it, you really need it.

A small way to start

The practical version is probably small.

Create a local judgement MCP. Give it a few simple rules. Tell it what counts as a threshold in your work. Money. Legal commitments. Customer promises. People decisions. Public statements. Permanent changes. Anything hard to undo.

Then call it deliberately.

Not on every prompt.

Not as a nervous system wrapped around the whole day.

Call it when the next action matters.

That may be the better pattern: ordinary intelligence for ordinary work, context retrieval when the facts are spread out, and judgement only when the work is about to cross a gate.

That is worth experimenting with.