Introduction to Generative AI and How LLMs Work
- pradnyanarkhede
- Mar 12, 2025
- 2 min read
Understanding Generative AI
Generative AI is a groundbreaking technology that allows machines to create new content, such as text, images, music, and even videos. Unlike traditional AI systems that analyze and classify data, generative AI can produce something entirely new based on patterns it has learned.
This technology powers applications like ChatGPT (which generates text), DALL·E (which creates images from descriptions), and AI music composers. The most prominent form of generative AI for text-based applications is the Large Language Model (LLM), a type of deep learning model capable of understanding and generating human-like text.
But how do these LLMs work? Let’s break it down step by step.

What is an LLM?
A Large Language Model (LLM) is a type of AI trained to understand and generate natural language. It learns from massive datasets containing books, articles, and web pages, allowing it to predict and generate coherent text.
Think of an LLM as a supercharged autocomplete system—when you type a message, your phone predicts the next word based on what you've written so far. LLMs work similarly but on a much larger scale, considering context, meaning, and long-range dependencies in text.
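The autocomplete analogy can be made concrete with a toy bigram model: count which word follows which in a tiny corpus, then predict the most frequent continuation. This is a minimal sketch with made-up training text, not how a real LLM is built, but the "predict the next word from what came before" idea is the same:

```python
from collections import Counter, defaultdict

# A tiny corpus standing in for the massive datasets an LLM learns from.
corpus = "i love eating pizza . i love eating burgers . i love eating pizza".split()

# Count which word follows which (a bigram model: a tiny ancestor of an LLM).
next_word_counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    next_word_counts[current][following] += 1

def predict_next(word):
    """Return the continuation seen most often in training."""
    return next_word_counts[word].most_common(1)[0][0]

print(predict_next("eating"))  # "pizza" follows "eating" most often here
```

A real LLM replaces these raw counts with a neural network that can weigh the entire preceding context, not just the previous word.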
How LLMs Work: A Step-by-Step Breakdown
1. Learning from Text: Pretraining the Model
Before an LLM can generate meaningful text, it must be trained on vast amounts of written content. This process, known as pretraining, helps the model understand:
Word relationships – which words frequently appear together
Grammar and syntax – how sentences are structured
Contextual meaning – how a word's meaning shifts depending on usage
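Pretraining frames all of this as one simple task: at every position in the text, predict the next token from the tokens before it. A sketch of how such training examples are carved out of raw text (the sentence and context size here are illustrative):

```python
# Every position in the text becomes a "predict the next token" example.
text = "the cat sat on the mat".split()
context_size = 3  # how many previous tokens the model may look at (toy value)

examples = []
for i in range(len(text) - 1):
    context = text[max(0, i - context_size + 1): i + 1]
    target = text[i + 1]
    examples.append((context, target))

for ctx, tgt in examples:
    print(ctx, "->", tgt)
```

Training on billions of such examples is what forces the model to absorb word relationships, grammar, and contextual meaning, since all three help it guess the next token.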
2. The Transformer Architecture: Understanding Context
A key breakthrough behind LLMs is the Transformer architecture, introduced in the 2017 research paper “Attention Is All You Need”. The Transformer uses a mechanism called self-attention, which lets the model weigh how relevant every word in a sequence is to every other word, so it can focus on the context that matters for each prediction.
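The core of self-attention can be sketched in plain Python: score each position's query against every key, turn the scores into weights with a softmax, and average the value vectors by those weights. This is scaled dot-product attention on toy 2-d vectors; real Transformers add learned projection matrices, multiple heads, and many layers:

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence of vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score this position's query against every key (every position).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # non-negative attention weights, sum to 1
        # The output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Toy 2-d "embeddings" for a 3-token sequence; in a real Transformer,
# queries, keys, and values are learned linear projections of the input.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(x, x, x)
```

Because the weights come from a softmax, no word is fully "ignored"; less relevant words simply receive smaller weights.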
3. Fine-Tuning: Specializing in a Task
Once the model has completed its general training, it undergoes fine-tuning—a process where it's trained on more specific datasets to specialize in tasks like:
Answering questions
Writing in different tones or styles
Understanding medical or legal terminology
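Mechanically, fine-tuning reuses the same next-token objective, just on smaller, task-specific data. A rough sketch of how supervised fine-tuning examples might be prepared (the prompt/completion format and the example pairs are illustrative, not a specific library's API):

```python
# Illustrative question-answering pairs for fine-tuning.
finetune_data = [
    {"prompt": "Q: What is the capital of France?\nA:",
     "completion": " Paris"},
    {"prompt": "Q: Rewrite formally: 'hey, what's up?'\nA:",
     "completion": " Hello, how are you?"},
]

# Each pair is concatenated; the model is then trained to predict the
# completion tokens given the prompt -- the same mechanics as pretraining.
training_texts = [ex["prompt"] + ex["completion"] for ex in finetune_data]
```

Swapping in medical Q&A, legal documents, or style examples is what steers the general model toward a specialty.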
4. Generating Text: How an LLM Predicts Words
When you type a question into ChatGPT, how does it generate a response? LLMs don’t "think" like humans—they rely on probability to predict the most likely next word.
LLMs analyze your input and calculate the most likely next word (or token) based on what they’ve learned during training. This is done using a softmax function, which converts the model’s raw scores (logits) into a probability distribution over possible next tokens.
For example, given the phrase "I love eating...", an LLM might predict:
"pizza" (75% probability)
"burgers" (20% probability)
"rocks" (0.01% probability )
Since "pizza" has the highest probability, the model selects it as the next word.

