Chain of Thought Prompting

Here’s a puzzle. You ask an LLM a multi-step math problem and it gets it wrong. You ask the exact same question but add “think step by step” — and it gets it right. Why?

This is chain-of-thought (CoT) prompting, and understanding why it works helps you apply it correctly and know when to use it.


The Core Idea

Standard prompting asks the model to go directly from question to answer. Chain-of-thought prompting encourages the model to show its intermediate reasoning first, then arrive at a final answer.

Standard prompting:
Q: If a shirt costs $45 and is 30% off, then you apply a $5 coupon, what do you pay?
A: $26.50 ← no work shown, so a wrong answer would be hard to spot

Chain-of-thought prompting:
Q: If a shirt costs $45 and is 30% off, then you apply a $5 coupon, what do you pay?
Let's think step by step.
A: Step 1: 30% off $45 = 0.30 × 45 = $13.50 discount
Step 2: Discounted price = $45 - $13.50 = $31.50
Step 3: Apply $5 coupon = $31.50 - $5.00 = $26.50
Final answer: $26.50 ← same answer, but every step is visible and checkable

For this simple example, both might work. But for genuinely complex reasoning chains, CoT dramatically improves accuracy.
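The three steps in the example can be checked with a few lines of Python. This is a minimal sketch; the function name `final_price` is ours for illustration and does not come from any library.

```python
def final_price(list_price: float, percent_off: float, coupon: float) -> float:
    """Apply a percentage discount first, then subtract a flat coupon."""
    discount = list_price * percent_off / 100  # Step 1: size of the discount
    discounted = list_price - discount         # Step 2: price after the discount
    return round(discounted - coupon, 2)       # Step 3: price after the coupon


print(final_price(45.00, 30, 5.00))  # 26.5
```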


Why Thinking Out Loud Works

The reason CoT helps isn’t that the model is “trying harder.” It’s structural.

When you force the model to generate intermediate reasoning steps, you’re exploiting two properties of autoregressive generation:

  1. Each generated token is visible to subsequent generation: When the model writes “Step 1: 30% of $45 = $13.50”, that intermediate result is now in the context for all subsequent generation. The model “sees” the correct intermediate value when computing Step 2.

  2. Working memory through the context: Language models don’t have working memory beyond what’s in the context. CoT effectively creates working memory by externalizing the reasoning process into the output.

Without CoT, the model must solve a multi-step problem “in one shot” using only its internal representations. With CoT, each step builds on the explicitly written results of previous steps.
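The mechanism can be pictured with a toy generation loop. This is only an illustration under loud assumptions: `generate` and `add_last_two` are hypothetical names, and the "model" here is a trivial function, not a real LLM. The point is just that each output is appended to the context and is therefore available to the next step.

```python
from typing import Callable, List


def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             max_new_tokens: int) -> List[str]:
    """Toy autoregressive loop: every new token is appended to the context."""
    context = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(context)  # the "model" reads everything written so far
        context.append(token)        # its output becomes input for the next step
    return context


def add_last_two(context: List[str]) -> str:
    """Stand-in "model": it can only add the two most recent numbers it can see."""
    return str(int(context[-1]) + int(context[-2]))


print(generate(["1", "1"], add_last_two, max_new_tokens=4))
# ['1', '1', '2', '3', '5', '8']
```

The stand-in function has no memory of its own. It gets further only because each intermediate result was written into the context first.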


Zero-Shot CoT: Just Say “Think Step by Step”

The simplest approach requires no examples. Just append a reasoning trigger phrase:

"Think step by step."
"Let's work through this carefully."
"Break this down systematically."
"Reason through each part before giving your final answer."

The phrase “Let’s think step by step” was identified in a 2022 paper (Kojima et al., “Large Language Models are Zero-Shot Reasoners”) as surprisingly effective: almost a magic incantation for activating reasoning behavior in large models.
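In code, zero-shot CoT amounts to string concatenation. A minimal sketch, assuming the Q/A layout used in the shirt example; the helper name `zero_shot_cot` is hypothetical.

```python
def zero_shot_cot(question: str, trigger: str = "Let's think step by step.") -> str:
    """Append a reasoning trigger to a question, leaving the answer slot open."""
    return f"Q: {question.strip()}\n{trigger}\nA:"


print(zero_shot_cot(
    "If a shirt costs $45 and is 30% off, then you apply a $5 coupon, what do you pay?"
))
```

Keeping the trigger as a parameter makes it easy to compare the phrasings listed above on your own task.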

Works best for:


Few-Shot CoT: Demonstrating Reasoning Patterns

For more reliable results, show examples of the full reasoning process:

Q: A train travels 150 miles in 2.5 hours. At the same speed,
how long to travel 390 miles?
A: First, I need the speed: 150 miles / 2.5 hours = 60 mph.
Then time for 390 miles: 390 / 60 = 6.5 hours.
Answer: 6.5 hours

Q: Maria has 3 times as many stamps as Tom. Together they have
120 stamps. How many does Maria have?
A: Let Tom's stamps = t. Then Maria's = 3t.
Total: t + 3t = 4t = 120
So t = 30, Maria has 3 × 30 = 90 stamps.
Answer: 90 stamps

Q: [your actual problem here]
A:

The model will follow the same pattern of showing work before giving a final answer.
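A few-shot CoT prompt can be assembled from a list of worked examples. The sketch below assumes the Q / A / Answer layout shown above; `Example` and `few_shot_cot` are illustrative names, not part of any SDK.

```python
from typing import List, NamedTuple


class Example(NamedTuple):
    question: str
    reasoning: str
    answer: str


def few_shot_cot(examples: List[Example], question: str) -> str:
    """Render worked examples, then the new question with an open answer slot."""
    blocks = [
        f"Q: {ex.question}\nA: {ex.reasoning}\nAnswer: {ex.answer}"
        for ex in examples
    ]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


train = Example(
    question="A train travels 150 miles in 2.5 hours. At the same speed, "
             "how long to travel 390 miles?",
    reasoning="First, I need the speed: 150 miles / 2.5 hours = 60 mph.\n"
              "Then time for 390 miles: 390 / 60 = 6.5 hours.",
    answer="6.5 hours",
)
print(few_shot_cot([train], "A car travels 120 miles in 2 hours. How far in 5 hours?"))
```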


When CoT Helps (and When It Doesn’t)

CoT helps significantly:

CoT has modest impact:

CoT can hurt:


Self-Consistency: Multiple Reasoning Paths

One of the most powerful extensions of CoT is self-consistency sampling. Instead of taking the first answer, you generate multiple reasoning chains (with temperature > 0) and take the majority vote.

Generate 5 solutions to the same problem at temperature 0.7:
Chain 1 → $26.50 ✓
Chain 2 → $26.50 ✓
Chain 3 → $31.50 ✗ (forgot the coupon)
Chain 4 → $26.50 ✓
Chain 5 → $26.50 ✓
Majority vote: $26.50 → much more reliable than single-sample

Self-consistency adds cost (five samples use roughly 5× the output tokens) but can dramatically improve accuracy on hard reasoning tasks. It is used in production by systems that need high reliability on math or logic.
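The voting step is simple to sketch. The code below assumes each chain ends with a line of the form "Final answer: ...". The `sample` argument is a hypothetical callable standing in for one model call at temperature > 0, and the canned chains replace real model output so the example runs offline.

```python
import re
from collections import Counter
from typing import Callable, Optional


def extract_answer(chain: str) -> Optional[str]:
    """Return the text after the last 'Final answer:' marker, if there is one."""
    matches = re.findall(r"Final answer:\s*(.+)", chain)
    return matches[-1].strip() if matches else None


def self_consistent_answer(sample: Callable[[], str], n: int = 5) -> Optional[str]:
    """Sample n reasoning chains and return the most common final answer."""
    answers = [extract_answer(sample()) for _ in range(n)]
    answers = [a for a in answers if a is not None]  # drop chains with no answer
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Canned chains in place of real model calls (assumption for illustration).
canned = iter([
    "Step 1 ... Step 3 ...\nFinal answer: $26.50",
    "Step 1 ... Step 3 ...\nFinal answer: $26.50",
    "Step 1 ... Step 2 ...\nFinal answer: $31.50",  # forgot the coupon
    "Step 1 ... Step 3 ...\nFinal answer: $26.50",
    "Step 1 ... Step 3 ...\nFinal answer: $26.50",
])
print(self_consistent_answer(lambda: next(canned), n=5))  # $26.50
```

Exact string matching is the weak point of this sketch: "$26.50" and "26.5 dollars" would count as different votes, so real use usually needs answer normalization first.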


Tree of Thought (ToT)

Tree of Thought extends CoT by building a search tree of reasoning paths rather than a single linear chain. The model proposes multiple reasoning steps at each decision point, evaluates them, and expands the most promising ones.

Problem
├── Approach A → [evaluate: promising]
│   ├── Step A.1 → [evaluate: dead end] ✗
│   └── Step A.2 → [evaluate: promising] ✓
│       └── Final answer (from branch A.2)
└── Approach B → [evaluate: weaker]

ToT is computationally expensive (many model calls per problem), but it can substantially outperform linear CoT on problems that require strategic search, such as the Game of 24 or constrained creative writing tasks.
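The propose / evaluate / expand loop can be sketched as a small beam search. This is one possible reading of the idea, not the reference implementation. In a real system `propose` and `evaluate` would be model calls; here they are plain functions on a toy problem (pick steps from 1, 3, 4 that sum to 10), and every name is hypothetical.

```python
from typing import Callable, List, Optional, Tuple

State = Tuple[int, ...]  # a partial "reasoning path": the steps chosen so far


def tree_of_thought(root: State,
                    propose: Callable[[State], List[State]],
                    evaluate: Callable[[State], float],
                    is_solution: Callable[[State], bool],
                    beam_width: int = 2,
                    max_depth: int = 4) -> Optional[State]:
    """Breadth-first search that keeps only the best-scoring states per level."""
    frontier = [root]
    for _ in range(max_depth):
        # Expand every state on the frontier into its candidate next steps.
        candidates = [child for state in frontier for child in propose(state)]
        if not candidates:
            return None
        for candidate in candidates:
            if is_solution(candidate):
                return candidate
        # Keep the most promising candidates and prune the rest.
        candidates.sort(key=evaluate, reverse=True)
        frontier = candidates[:beam_width]
    return None


TARGET = 10
result = tree_of_thought(
    root=(),
    propose=lambda s: [s + (step,) for step in (1, 3, 4)],
    evaluate=lambda s: -abs(TARGET - sum(s)),  # closer to the target scores higher
    is_solution=lambda s: sum(s) == TARGET,
)
print(result, sum(result))
```

The cost shows up in the loop structure: every level calls `propose` once per frontier state and `evaluate` once per candidate, which is where the "many calls per problem" comes from.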


Reasoning Models: CoT Built In

A significant shift in 2024–2025: the emergence of reasoning models that do chain-of-thought internally, before generating their response.

For these models, you don’t need to add “think step by step”; they reason internally by default. For earlier-generation models (GPT-4, Claude 3 Sonnet), CoT prompting remains highly effective.


Practical Implementation

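# Using the Anthropic Python SDK (the same prompt works with other chat APIs)
import anthropic

# Reads the API key from the ANTHROPIC_API_KEY environment variable
client = anthropic.Anthropic()

prompt = """
Solve this problem step by step. Show each calculation.
At the end, state your final answer clearly.

Problem: A company's revenue grew 15% in Q1, declined 8% in Q2,
and grew 20% in Q3. Starting from $1,000,000 in revenue,
what is the Q3 ending revenue?
"""

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=512,
    messages=[{"role": "user", "content": prompt}],
)

# The reply text, including the intermediate steps, is in the first content block
print(response.content[0].text)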
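For reference when reading the model's steps, the expected figures for the problem in the prompt can be computed directly. A small sketch; the helper name `compound` is ours, and it assumes each percentage applies to the previous quarter's figure.

```python
from typing import List


def compound(start: float, percent_changes: List[float]) -> List[float]:
    """Apply successive percentage changes and return the value after each one."""
    values = []
    current = start
    for pct in percent_changes:
        current = current * (100 + pct) / 100
        values.append(current)
    return values


for label, value in zip(["Q1", "Q2", "Q3"], compound(1_000_000, [15, -8, 20])):
    print(f"{label}: ${value:,.2f}")
# Q1: $1,150,000.00
# Q2: $1,058,000.00
# Q3: $1,269,600.00
```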

The model’s reasoning process is now part of its output, making it easier to: