Step 1/5

User Prompt

All text generation begins with a prompt. The user gives the AI a prompt or a question.

The AI (language model) does not try to "understand" the question the way a human would; instead, it reads the text and prepares to continue it in the most statistically likely way.

"Tell me briefly how large language models work?"

Step 2/5

Tokenization and Embeddings

Computers only understand numbers. That is why words are chopped up into smaller pieces (tokens) and converted into massive arrays of numbers (embeddings).

Words with similar meanings (such as "dog" and "cat") are assigned numerical values that are close to each other in this mathematical space!
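The idea of "closeness in a mathematical space" can be sketched with cosine similarity. The vocabulary, token ids, and 3-dimensional vectors below are made up for illustration; real models learn embeddings with hundreds or thousands of dimensions during training.

```python
import math

# Toy vocabulary: each word maps to a token id (real tokenizers
# split text into sub-word pieces, not whole words).
vocab = {"dog": 0, "cat": 1, "car": 2}

# Hand-made 3-dimensional embeddings: "dog" and "cat" are placed
# near each other, "car" far from both.
embeddings = [
    [0.90, 0.80, 0.10],  # dog
    [0.85, 0.75, 0.15],  # cat
    [0.10, 0.20, 0.95],  # car
]

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

dog, cat, car = (embeddings[vocab[w]] for w in ("dog", "cat", "car"))
print(cosine(dog, cat))  # high: similar meanings
print(cosine(dog, car))  # low: different meanings
```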

Step 3/5

Neural Network and Attention Mechanism

These numbers are fed into an enormous neural network with up to hundreds of billions of parameters (often a Transformer network).

Inside, the network learns the context. A word can have multiple meanings – the Attention Mechanism scans the preceding words so the model can "focus" on the correct interpretation!
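That "focusing" step can be sketched as scaled dot-product attention, the core operation of a Transformer. The 2-dimensional query, key, and value vectors below are invented toy data; in a real model they are produced by learned weight matrices.

```python
import math

def softmax(xs):
    """Turn raw scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    d = len(query)
    # Score each key against the query (higher = more relevant).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to those weights.
    dim = len(values[0])
    blended = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return blended, weights

# The query points in the same direction as the second key,
# so most of the attention weight lands on the second value.
query = [1.0, 0.0]
keys = [[0.0, 1.0], [1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out, weights = attention(query, keys, values)
```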

Step 4/5

Calculating Probabilities (Output Layer)

As a result of the neural network's calculations, it outputs an estimate (a probability) for every token in its vocabulary (often on the order of 100,000 entries), indicating what should come next.

The model then samples the next token from among the highest-probability candidates (rather than always making the single most likely, purely deterministic choice, which would sound repetitive and robotic).

Next word probability:

  • Large 82%
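This step can be sketched as a softmax over raw scores followed by weighted random sampling. The candidate words and their scores below are made up; a real model produces one score per vocabulary token.

```python
import math
import random

# Invented raw scores (logits) for a few candidate next words.
logits = {"Large": 4.0, "A": 1.5, "The": 1.0, "Neural": 0.5}

def to_probabilities(logits, temperature=1.0):
    """Softmax: convert raw scores into probabilities that sum to 1.
    Higher temperature flattens the distribution (more randomness)."""
    scaled = {w: s / temperature for w, s in logits.items()}
    m = max(scaled.values())
    exps = {w: math.exp(s - m) for w, s in scaled.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

probs = to_probabilities(logits)
# Sample instead of always taking the argmax: this is why the
# same prompt can produce different continuations on each run.
next_word = random.choices(list(probs), weights=list(probs.values()))[0]
```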
Step 5/5

Autoregressive Output

The chosen word is appended to the original input, and the entire process (Steps 1-4) starts again.

Yes, you read that right. When a language model writes a page-long essay, it runs through this massive computation for every single word fragment, syllable, and comma individually!
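The whole loop (Steps 1-4 repeated) can be sketched with a toy stand-in for the network. The bigram lookup table below is invented; in a real model, each next-word distribution comes from the full neural-network forward pass described above.

```python
import random

# Toy "model": a lookup table of next-word probabilities,
# standing in for the billions-of-parameters network.
bigrams = {
    "Large": {"language": 0.9, "scale": 0.1},
    "language": {"models": 1.0},
    "models": {"predict": 0.7, "generate": 0.3},
    "predict": {"tokens": 1.0},
    "generate": {"text": 1.0},
}

def generate(prompt_word, max_tokens=4):
    out = [prompt_word]
    for _ in range(max_tokens):
        dist = bigrams.get(out[-1])
        if dist is None:  # no known continuation: stop generating
            break
        words, probs = zip(*dist.items())
        # Step 4: sample the next word from the distribution.
        out.append(random.choices(words, weights=probs)[0])
        # The loop then repeats on the longer sequence (Steps 1-4 again).
    return " ".join(out)

print(generate("Large"))
```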

Generated explanation: