Playable argument · Episode 02
One Token at a Time.
Here, you are the language model. You will write one sentence twice. You may choose only from what is offered, and you will feel what the choosing is like.
A language model writes by ranking every token it knows by probability, taking one, and doing it again. That is the entire mechanism. This piece hands you the same view and the same constraint. The question it asks arrives at the end, once you have felt the procedure from inside.
Round 1 of 2 · Routine completion
The weather tomorrow will be
Next token · choose one of five
What you wrote
Your output
You have just produced language the way a model produces it: one token at a time, each choice conditioned on everything already written, the future invisible. Is this what you do when you speak? The model cannot tell either.
What this argues
What you just did is the whole of it. The model does not understand the sentence and then phrase it. It ranks the candidates for the next token, takes one, and does that again until the sentence ends. Fluency is the output of that loop, not a sign of anything standing behind it.
The uncomfortable part is how little you needed anything behind it either. Most of your sentence came from the high-probability words, the way most of mine does. We reach for the unlikely word rarely, and a model reaches there only as often as its sampling settings allow. None of this makes you a machine. It means we were looking for the difference in the wrong place. Fluency was never going to carry it.
The stakes sit outside this toy. A system that produces the form of an answer is increasingly read as having the substance of one, in a court filing or a policy brief, because the form is all that reaches the desk. The mechanism you just ran by hand is why the form is cheap now. Whether it is also worth anything is a different question, and it does not answer itself.
A note on honesty
The tokens here are whole words, whereas real models slice text into finer, sub-word pieces. The probabilities are authored approximations, modeled on published model behavior, and this prototype says so instead of pretending. A version with measured model probabilities is in preparation. The mechanism you just operated (ranked candidates, one choice, repeat) is not approximated. That is the real thing.
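For readers who would rather see the loop than feel it, here is a minimal sketch in Python, under the same caveats as the prototype. Everything named here is an assumption made for illustration: the TABLE of candidates and its probabilities are authored, not measured; tokens are whole words; and each step looks only at the last word, where a real model conditions on everything already written. The functions rank, choose, and generate are hypothetical names, not any library's API. The temperature parameter stands in for the setting that lets a model reach for the unlikely word.

```python
import random

# Authored toy distributions, keyed by the last word written.
# Assumption: a real model conditions on the full context and on sub-word tokens.
TABLE = {
    "be": [("cloudy", 0.25), ("sunny", 0.40), ("rainy", 0.20),
           ("mild", 0.10), ("strange", 0.05)],
    "sunny": [("and", 0.50), (".", 0.30), ("with", 0.15),
              ("but", 0.04), ("again", 0.01)],
    "and": [("warm", 0.60), ("bright", 0.30), ("loud", 0.10)],
}
END = "."


def rank(candidates):
    """Order candidates from most to least probable."""
    return sorted(candidates, key=lambda pair: -pair[1])


def choose(ranked, temperature, rng):
    """Take one token. Temperature 0 always takes the top-ranked one."""
    if temperature == 0:
        return ranked[0][0]
    tokens = [tok for tok, _ in ranked]
    # Rescaling p to p ** (1 / T) flattens the ranking when T > 1
    # and sharpens it when T < 1; choices() normalizes the weights.
    weights = [p ** (1.0 / temperature) for _, p in ranked]
    return rng.choices(tokens, weights=weights, k=1)[0]


def generate(prompt, temperature=0.0, seed=None, max_tokens=12):
    """Rank, take one, repeat, until the sentence ends or the cap is hit."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(max_tokens):
        # Words with no authored entry can only be followed by the end token.
        ranked = rank(TABLE.get(words[-1], [(END, 1.0)]))
        token = choose(ranked, temperature, rng)
        words.append(token)
        if token == END:
            break
    return " ".join(words)


print(generate("The weather tomorrow will be"))
# The weather tomorrow will be sunny and warm .
print(generate("The weather tomorrow will be", temperature=1.5, seed=7))
```

At temperature 0 the sketch writes the same sentence every time, because it always takes the top-ranked word. Raising the temperature is the only way an unlikely word such as "strange" gets onto the page, and nothing in the loop changes when it does.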