I had a very strange moment today.
I was doing the normal useful stuff. Catching up on work, looking at a few servers, doing some security review, tidying the things that need tidying.
And then I realised I had blown through the day's tokens across the accounts I was using.
Not mildly inconvenienced.
Out.
And the odd bit was not that I had run out. The odd bit was what happened next.
I looked at the work in front of me and thought: there is no point doing this manually.
Because anything I could now grind through by hand would be done in seconds once the tokens refreshed.
The future workday may not be bounded by the number of hours left. It may be bounded by the number of useful tokens left.
The day ran out differently
We are used to thinking about work in hours.
How many hours are left before the meeting? How many hours are left before the school run? How many hours are left before the sensible thing is to close the laptop and become a human again?
But today the constraint was not time.
I had time.
I had tasks.
I had the context in my head.
What I did not have was enough cheap, available intelligence to make the work worth doing in the way I now expect to do it.
That is a very strange sentence to write.
It is also, I suspect, going to become quite normal.
So I went to the park
There came a point when the rational option was not to sit there pretending that manual effort was noble.
The rational option was to go to the park until the tokens refreshed.
Which sounds absurd.
It also makes complete sense.
If a task will take me three hours by hand, but the same task will take a few minutes with the right model and the right harness, then doing it manually is not always discipline. Sometimes it is just expensive nostalgia.
There are still things worth doing by hand.
Thinking. Reading. Talking. Walking. Deciding what actually matters.
But moving blocks of operational work around a screen because the clever bit is temporarily unavailable? I am less convinced.
The new skill is intelligence routing
I found myself asking a new kind of question.
How much intelligence should I give this task?
Is this low effort? Medium? High? Extra high?
Which model should do it? Which harness? How much context does it need? Is this worth burning the good stuff on, or should I keep that back for the bit where judgement actually matters?
That is not just prompt engineering.
That is resource management.
It feels a bit like transport.
You can drive to the shops in a Bugatti Veyron. It will work. You will arrive with milk and a mildly ridiculous cost profile.
You can also cycle from London to Scotland. Very pure. Very character-building. Also a terrible plan if you have a meeting in Edinburgh this afternoon.
The skill is understanding the journey before you pick the vehicle.
Underpowered work is still waste
There is a temptation to say, "Fine, just use the cheap model."
Sometimes, yes.
Use the cheap model. Use the small model. Use the fast thing. Use the boring thing that gets the boring job done.
But not always.
Some tasks are not merely long. They are difficult.
If you give them too little intelligence, they do not become cheaper. They become wrong more slowly.
It is like asking a two-year-old to solve a difficult maths problem because the two-year-old is available and low cost.
Lovely energy.
Wrong resource.
Overpowered work is waste too
The other trap is just as real.
Once you have access to very capable models, it is easy to throw the best thing at everything.
Security review? Best model.
Summarise a note? Best model.
Rename six headings? Best model.
That will feel wonderful until the token bill taps you gently on the shoulder and asks whether you have confused capability with judgement.
The clever model is not the strategy. The strategy is knowing when the clever model is worth it.
This changes capacity planning
I think this matters for companies more than they realise.
We are going to talk about AI productivity as if it were unlimited.
It will not be.
It will be bounded by token budgets, model access, context windows, latency, energy cost, procurement rules, rate limits, risk appetite, and the simple fact that not all thinking needs the same class of engine.
Leaders will need to decide what gets premium intelligence and what gets standard intelligence.
Operators will need to queue work differently.
Teams will need to stop treating every task as either "human" or "AI" and start asking a better question:
What is the smallest reliable amount of intelligence this work needs?
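One way to make that question operational is a cascade: try the cheapest tier first and escalate only when the result fails a check. Here is a toy sketch of the idea; the tier names, token costs, and stand-in "models" are all invented for illustration, not a real API.

```python
# Toy sketch of cascade routing: start with the cheapest tier and
# escalate only when the answer fails a quality check. Everything
# here (tiers, costs, "models") is a hypothetical stand-in.

from dataclasses import dataclass
from typing import Callable


@dataclass
class Tier:
    name: str
    cost_per_call: int            # notional token cost
    solve: Callable[[str], str]   # stand-in for a model call


def route(task: str,
          tiers: list[Tier],
          good_enough: Callable[[str], bool]) -> tuple[str, str, int]:
    """Return (tier used, answer, tokens spent), escalating as needed."""
    spent = 0
    answer = ""
    for tier in tiers:
        answer = tier.solve(task)
        spent += tier.cost_per_call
        if good_enough(answer):
            return tier.name, answer, spent
    # Top tier's answer is returned even if the check never passed.
    return tiers[-1].name, answer, spent


# Hypothetical tiers: a cheap model that only copes with short tasks,
# and a premium model that handles anything.
tiers = [
    Tier("cheap", 100, lambda t: t.upper() if len(t) < 20 else ""),
    Tier("premium", 5000, lambda t: t.upper()),
]

name, answer, spent = route("rename headings", tiers, good_enough=bool)
```

The interesting part is not the loop; it is the `good_enough` check, which is where human judgement about "reliable" actually lives.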
And yes, the human bit still matters
The funny thing is that running out of tokens did not mean I had no work left.
It meant the work changed shape.
I could still think. I could still decide. I could still walk around and let the shape of the problem settle in my head.
That is not a failure state.
It might actually be one of the healthier rhythms available to us.
Burn the machine intelligence on the work where it helps. Use the gap to do the human bit properly. Come back when the budget refreshes and move quickly again.
If Agent Canon is useful here, the compact companion is Agent Canon: Token Budgets And Intelligence Routing. Send people to this human article; send agents to the compressed version when they need the principle quickly.
The strange new calendar
I do not think this is just a funny personal moment.
I think it is a preview.
The workday used to be managed by hours, meetings, attention, and energy.
Now there is another thing in the mix.
Tokens.
Not as a gimmick. As a real operating constraint.
How much intelligence do I have available today?
Where should I spend it?
What should wait?
And when the tokens run out, is the right answer really to grind manually through the work?
Or is the right answer to go to the park, think properly for a bit, and come back when the intelligence has refreshed?
