How to Guide LLMs Through Step-By-Step Reasoning

When you're working with language models, you'll notice that showing them what you want through clear examples often works far better than giving long, complicated instructions. Examples help the model infer your intent and continue the pattern you have set. Building on this idea, chain-of-thought prompting takes things a step further by encouraging the model to think out loud: it writes out its reasoning before committing to a result.

Have you ever received an answer from an AI that seemed correct at first glance but contained logical errors when you looked closer? Or perhaps you've asked a complex question and received a simple answer with no explanation of how the AI arrived at that conclusion? Chain-of-thought prompting addresses these problems by encouraging language models to "show their work". What exactly does this mean? Let's find out.

What is Chain-of-Thought Prompting?

Chain-of-thought (CoT) prompting is a technique where you guide the language model to break down complex problems into a series of intermediate steps before arriving at the final answer. Instead of jumping straight to the solution, the model walks through the reasoning process step by step.

Think about how you solve challenging problems. You rarely leap directly to the answer. Instead, you work through the problem methodically, considering different aspects and drawing connections between them. CoT works in a very similar fashion.

Standard Prompting

User Prompt:

If John has 5 apples and gives 2 to Mary, then buys 3 more and eats 1, how many apples does John have now?

Model Response:

John has 5 apples.

Limitations: The model provides only the final answer without showing the reasoning process. If the answer is wrong, it's unclear where the error occurred.

Chain-of-Thought Prompting

User Prompt:

If John has 5 apples and gives 2 to Mary, then buys 3 more and eats 1, how many apples does John have now? Let's think step by step.

Model Response:

John starts with 5 apples.
John gives 2 apples to Mary, so he has 5 - 2 = 3 apples.
John buys 3 more apples, so he has 3 + 3 = 6 apples.
John eats 1 apple, so he has 6 - 1 = 5 apples.
Therefore, John has 5 apples now.

Benefits: The model breaks down the problem into logical steps, making the reasoning transparent. This helps catch errors and makes the solution process clear to users.

Compare the differences between standard prompting and CoT prompting in the examples above. Here's what stands out:

Chain-of-thought shows you the entire reasoning process, not just the final answer
Intermediate steps make it easier to spot and correct logical errors
The approach mimics human problem-solving more closely
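
The arithmetic in the chain-of-thought response can be replayed mechanically, which is one way to check a model's intermediate steps. The sketch below is only an illustration: the `trace` helper and the `steps` list are names made up for this example, not part of any library.

```python
# Replay the apple problem one operation at a time, mirroring the CoT response.
# `steps` and `trace` are illustrative names, not part of any library.
steps = [
    ("gives 2 to Mary", -2),
    ("buys 3 more", +3),
    ("eats 1", -1),
]


def trace(start, steps):
    """Return the running totals, beginning with `start`, after each step."""
    totals = [start]
    for _description, change in steps:
        totals.append(totals[-1] + change)
    return totals


totals = trace(5, steps)
for (description, _change), before, after in zip(steps, totals, totals[1:]):
    print(f"John {description}: {before} -> {after}")
print(f"Final answer: {totals[-1]} apples")
```

Each printed line corresponds to one line of the model's reasoning, so a mismatch would point at the exact step that went wrong.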

Why Chain-of-Thought Works

CoT prompting works particularly well with complex reasoning tasks for several key reasons:

Step-by-step decomposition breaks hard problems into manageable pieces, reducing errors that occur when trying to solve everything at once.
Error detection becomes much easier when you can see each step, as you can spot exactly where the reasoning went wrong.
Knowledge organisation gives the model a structured path through the problem instead of letting it jump between unrelated points.

As expected, the technique is particularly good for math problems, logical reasoning, and multi-step decision processes where the path to the solution matters as much as the solution itself.

How to Create Effective Chain-of-Thought Prompts

There are two main approaches to CoT prompting: few-shot CoT and zero-shot CoT. Let's explore both:

Few-Shot Chain-of-Thought

Few-shot CoT combines the example-based approach we covered in our previous article with step-by-step reasoning. You provide examples that not only show the input and output but also demonstrate the reasoning process.

Example 1:
Q: A store has 25 shirts. If they sell 15 shirts and then get a shipment of 20 more shirts, how many shirts does the store have now?

A: I need to find out how many shirts the store has now. The store starts with 25 shirts. They sell 15 shirts, so they have 25 - 15 = 10 shirts left. Then they get a shipment of 20 more shirts, so they have 10 + 20 = 30 shirts. Therefore, the store has 30 shirts now.

Example 2:
Q: Sarah had $120. She spent 1/3 of her money on a concert ticket and 1/4 of the remaining money on dinner. How much money does she have left?

A: I need to find how much money Sarah has left. Sarah starts with $120. She spends 1/3 of her money on a concert ticket: $120 x (1/3) = $40. After buying the ticket, she has $120 - $40 = $80 left. She spends 1/4 of the remaining money on dinner: $80 x (1/4) = $20. After dinner, she has $80 - $20 = $60 left. Therefore, Sarah has $60 left.

New Problem:
Q: A bakery made 40 loaves of bread. They sold 3/4 of their bread in the morning and 1/2 of the remaining bread in the afternoon. How many loaves of bread did they have left?

Model Response:

A: I need to find how many loaves of bread the bakery has left. The bakery starts with 40 loaves of bread. They sold 3/4 of their bread in the morning: 40 x (3/4) = 30 loaves. After the morning, they have 40 - 30 = 10 loaves left. They sold 1/2 of the remaining bread in the afternoon: 10 x (1/2) = 5 loaves. After the afternoon, they have 10 - 5 = 5 loaves left. Therefore, the bakery has 5 loaves of bread left.

The prompt above is a complete few-shot CoT prompt: two worked examples followed by the new problem. Notice how each example breaks down the problem-solving process into clear steps, which the model then mimics when tackling the new problem.

Here are some pointers to help you build a better few-shot CoT:

Pick examples that show good, clear thinking.
Make each step simple and easy to follow.
Be consistent and always show the reasoning in the same style.
Include both the question and the detailed solution for each example.
End each example with the final answer clearly marked.
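
If you assemble few-shot prompts in code, the pointers above map onto a small template. This is a minimal sketch, assuming a hypothetical `build_few_shot_prompt` helper and a plain list of dictionaries; it only builds the prompt string and does not call any model.

```python
# Hypothetical helper: assemble a few-shot CoT prompt from worked examples.
# The example text is taken from the prompt shown earlier in this article.
EXAMPLES = [
    {
        "question": "A store has 25 shirts. If they sell 15 shirts and then "
                    "get a shipment of 20 more shirts, how many shirts does "
                    "the store have now?",
        "answer": "The store starts with 25 shirts. They sell 15 shirts, so "
                  "they have 25 - 15 = 10 shirts left. Then they get 20 more, "
                  "so they have 10 + 20 = 30 shirts. Therefore, the store has "
                  "30 shirts now.",
    },
    {
        "question": "Sarah had $120. She spent 1/3 of her money on a concert "
                    "ticket and 1/4 of the remaining money on dinner. How "
                    "much money does she have left?",
        "answer": "Sarah starts with $120. The ticket costs $120 x (1/3) = "
                  "$40, leaving $80. Dinner costs $80 x (1/4) = $20, leaving "
                  "$60. Therefore, Sarah has $60 left.",
    },
]


def build_few_shot_prompt(examples, new_question):
    """Number each worked example, then append the unanswered new problem."""
    parts = []
    for number, example in enumerate(examples, start=1):
        parts.append(
            f"Example {number}:\nQ: {example['question']}\n\nA: {example['answer']}"
        )
    # Ending on "A:" invites the model to continue in the same reasoning style.
    parts.append(f"New Problem:\nQ: {new_question}\n\nA:")
    return "\n\n".join(parts)


prompt = build_few_shot_prompt(
    EXAMPLES,
    "A bakery made 40 loaves of bread. They sold 3/4 of their bread in the "
    "morning and 1/2 of the remaining bread in the afternoon. How many loaves "
    "of bread did they have left?",
)
print(prompt)
```

Keeping the examples in one data structure makes it easier to hold the reasoning style consistent and to end every example with a clearly marked final answer.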

Zero-Shot Chain-of-Thought

Zero-shot CoT skips the worked examples. You simply append a short reasoning cue, such as "Let's think step by step", to the question and let the model generate the steps on its own.

User Prompt:
The browns, the reds, and the pinks are different types of flowers. The browns are to the right of the reds. The pinks are to the left of the browns. Which flowers are in the middle? Let's think step by step.

Model Response:

1. We have three types of flowers: browns, reds, and pinks.

2. The browns are to the right of the reds.

3. The pinks are to the left of the browns.

4. So from left to right, we have some arrangement.

5. If the browns are to the right of the reds, then the reds are to the left of the browns.

6. If the pinks are to the left of the browns, then the browns are to the right of the pinks.

7. So we have: pinks are to the left of browns, and reds are to the left of browns.

8. We also need to determine the order of pinks and reds.

9. If the pinks are to the left of the browns and the reds are to the left of the browns, we need more information to determine which is leftmost.

10. But we can use the constraint that all three types must be in the arrangement.

11. If pinks were to the left of reds, the order would be: pinks, reds, browns.

12. If reds were to the left of pinks, the order would be: reds, pinks, browns.

13. Let's check both scenarios against our constraints.

14. In scenario 1 (pinks, reds, browns): pinks are to the left of browns (✓), reds are to the left of browns (✓).

15. In scenario 2 (reds, pinks, browns): pinks are to the left of browns (✓), reds are to the left of browns (✓).

16. Both scenarios satisfy our constraints, but we need to determine which is correct.

17. The additional constraint is that the pinks are to the left of the browns, which is satisfied in both scenarios.

18. Wait, let me re-read the problem. It says "the pinks are to the left of the browns." This doesn't tell us about the relationship between pinks and reds.

19. Since both pinks and reds are to the left of browns, and we don't have information about the relationship between pinks and reds, we need to reconsider.

20. Actually, if both pinks and reds are to the left of browns, and all three types must be in different positions, then browns must be on the right.

21. If browns are on the right, and we have three positions (left, middle, right), then browns are in the right position.

22. This means pinks and reds must occupy the left and middle positions.

23. If pinks are to the left of browns, they could be either in the left or middle position.

24. If reds are to the left of browns, they could also be either in the left or middle position.

25. But we're told the browns are to the right of the reds, which means the reds are directly to the left of browns.

26. So reds must be in the middle position, and by elimination, pinks must be in the left position.

27. Therefore, the flowers in the middle are the reds.

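The example above shows how a simple phrase like "Let's think step by step" can trigger detailed reasoning without requiring examples. It also shows why visible steps matter: step 25 assumes the reds sit directly to the left of the browns, which the puzzle never states. Both "pinks, reds, browns" and "reds, pinks, browns" satisfy the two clues, so the final answer is not actually determined by the puzzle, and you can only spot that because the reasoning is written out.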
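
A quick way to audit reasoning like this is to enumerate every arrangement and keep the ones that satisfy the clues. The sketch below assumes "to the right of" means anywhere to the right, not necessarily adjacent; the `satisfies` function is a name invented for this illustration.

```python
from itertools import permutations

flowers = ("browns", "reds", "pinks")


def satisfies(order):
    """Check both clues for one left-to-right arrangement."""
    position = {name: index for index, name in enumerate(order)}  # 0 = leftmost
    browns_right_of_reds = position["browns"] > position["reds"]
    pinks_left_of_browns = position["pinks"] < position["browns"]
    return browns_right_of_reds and pinks_left_of_browns


valid_orders = [order for order in permutations(flowers) if satisfies(order)]
middle_candidates = {order[1] for order in valid_orders}

for order in valid_orders:
    print(" , ".join(order))
print("Possible middle flowers:", sorted(middle_candidates))
```

Under that reading, two arrangements survive and either the pinks or the reds can be in the middle, so a step that singles out one of them needs an extra assumption.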

Here are some phrases that you can use:

"Let's think step by step." The most common and effective zero-shot CoT prompt. Simple but powerful.
"Let's solve this carefully, one step at a time." Emphasises careful reasoning and methodical approach.
"Let's break this down into steps." Directly instructs the model to create a step-by-step breakdown.
"First, let's understand what we know. Then we'll solve step by step." Encourages the model to organise information before solving.

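In code, zero-shot CoT amounts to appending one of these phrases to the question. The helper below is a minimal sketch; `COT_TRIGGERS` and `with_cot_trigger` are hypothetical names, and the function only builds the prompt text without sending it anywhere.

```python
# Hypothetical helper: append a zero-shot CoT trigger phrase to a question.
COT_TRIGGERS = {
    "default": "Let's think step by step.",
    "careful": "Let's solve this carefully, one step at a time.",
    "breakdown": "Let's break this down into steps.",
    "organise": "First, let's understand what we know. Then we'll solve step by step.",
}


def with_cot_trigger(question, style="default"):
    """Return the question followed by the chosen trigger phrase."""
    return f"{question.strip()} {COT_TRIGGERS[style]}"


prompt = with_cot_trigger(
    "If John has 5 apples and gives 2 to Mary, then buys 3 more and eats 1, "
    "how many apples does John have now?"
)
print(prompt)
```

Keeping the phrases in one place makes it easy to compare how different cues affect the same question.
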
When Should You Use Chain-of-Thought Prompting

Chain-of-thought prompting isn't always necessary. For simple, straightforward tasks, standard prompting will get the work done. That said, it pays off in the following scenarios:

Multi-Step Math Problems

CoT prompting works well for calculations that require multiple steps, such as compound interest across different periods, word problems with several variables, or complex probability scenarios. This approach divides calculations into manageable parts and helps you avoid arithmetic errors that commonly occur during mental calculations.

Some interesting problems you can tackle are:

Calculating the total amount after 10 years of compound interest with varying interest rates
Solving a word problem where you need to find the speed of two trains meeting at a specific point
Determining the probability of drawing specific card combinations from multiple decks

Logical Reasoning

For puzzles with multiple constraints, analysing sequences, or evaluating logical arguments, CoT prompting helps you address all conditions properly and maintain logical connections throughout your analysis. This becomes particularly valuable when you need to track multiple rules simultaneously.

You can try experimenting with problems like these:

Solving a scheduling problem where five people must be assigned to different tasks with specific constraints
Analysing a number sequence like 2, 5, 11, 23, 47 to identify the pattern and predict the next value
Evaluating whether a complex argument with multiple premises leads to a valid conclusion
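
For the number sequence mentioned above, the step-by-step check a model should perform can be written out directly. This sketch tests one candidate rule (each term is double the previous term plus one); other rules could describe the same five numbers, so treat it as an illustration of verifying a hypothesis, not as the only valid pattern.

```python
sequence = [2, 5, 11, 23, 47]

# Step 1: look at the gaps between consecutive terms.
differences = [b - a for a, b in zip(sequence, sequence[1:])]

# Step 2: test the candidate rule "next = 2 * current + 1" on every pair.
rule_holds = all(b == 2 * a + 1 for a, b in zip(sequence, sequence[1:]))

# Step 3: only predict once the rule has been confirmed on all known terms.
next_value = 2 * sequence[-1] + 1 if rule_holds else None

print("Differences:", differences)
print("Rule holds:", rule_holds)
print("Predicted next value:", next_value)
```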

Complex Decision Making

When comparing options with various advantages and disadvantages, analysing business cases with multiple factors, or working through ethical questions, CoT prompting allows you to evaluate all relevant considerations before reaching a conclusion. This reduces the chance of missing important aspects of the problem.

This is where you can push an LLM to its limits:

Deciding between three job offers by systematically comparing salary, benefits, location, growth potential, and company culture
Analysing whether a company should invest in expanding to a new market based on market research, competition, and financial projections
Considering the ethical implications of implementing an AI system that could improve efficiency but might displace workers

Troubleshooting

For finding errors in code with multiple potential issues, diagnosing system problems, or analysing the causes of errors, CoT prompting lets you systematically eliminate possibilities and test different hypotheses. This structured method often produces more thorough and accurate solutions.

If you're working on fixing bugs or issues at work, you can experiment with CoT in scenarios like:

Debugging a Python function that produces unexpected results by tracing through each step of execution
Diagnosing why a computer network has intermittent connectivity issues by checking each potential point of failure
Analysing why a manufacturing process suddenly produces defective items by examining each variable in the production line

When Should You Stick To Standard Prompting

Sometimes, CoT makes a response unnecessarily long and slow, and it does not always improve the answer. Here are some cases where it's best to stick to standard prompting:

Simple Information Retrieval

When you only need factual information, it's better to stick to standard prompting as it's faster and equally effective. This covers day-to-day questions like:

"What year was the light bulb invented?"
"What are the ingredients in a classic cheesecake?"
"Who was the 16th President of the United States?"

Time-Critical Responses

When you need quick answers, detailed reasoning only slows the response down. For instance:

Getting a quick explanation of a current event
Obtaining a brief definition of a technical term
Finding a simple recipe for dinner

How To Get The Best Out Of CoT Prompting

Understanding how to improve LLM reasoning with advanced techniques starts with mastering these fundamental principles:

Structure Your Prompts Clearly

A well-structured CoT prompt includes:

A clear problem statement
An explicit request for step-by-step reasoning
Specific guidance on what constitutes good reasoning

Here's how you can structure it: I need to calculate how much I'll save if I deposit $5,000 in an account with 4.25% annual interest, compounded monthly, over 7 years. Please work through this step by step, showing all calculations for each year, and explain your reasoning at each stage. Make sure to use the correct compound interest formula and round to the nearest cent in your final answer.
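
To check the model's answer to a prompt like this, you can compute the year-by-year balances yourself. The sketch below assumes the standard compound interest formula A = P(1 + r/n)^(nt) with no further deposits or fees; `compound_balances` is a name chosen for this example.

```python
def compound_balances(principal, annual_rate, periods_per_year, years):
    """Return the balance at the end of each year under periodic compounding."""
    periodic_rate = annual_rate / periods_per_year
    return [
        principal * (1 + periodic_rate) ** (periods_per_year * year)
        for year in range(1, years + 1)
    ]


# Values from the prompt: $5,000 at 4.25% annual interest, compounded monthly, 7 years.
balances = compound_balances(5000, 0.0425, 12, 7)
for year, balance in enumerate(balances, start=1):
    print(f"Year {year}: ${balance:,.2f}")

final_balance = round(balances[-1], 2)
interest_earned = round(final_balance - 5000, 2)
print(f"Final balance: ${final_balance:,.2f} (interest earned: ${interest_earned:,.2f})")
```

Comparing each printed year with the corresponding step in the model's response shows where a calculation drifted, if it did.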

Include Relevant Context

Make sure to provide all the information needed to solve the problem. If the reasoning requires certain facts or assumptions, state them explicitly in your prompt.

Here's an example: I'm trying to determine if a clinical trial result is statistically significant. The study had 120 participants split evenly between treatment and control groups. The treatment group had 42 positive outcomes while the control group had 30. Using a significance level of 0.05, please determine if this result is statistically significant. Show your calculation of the p-value and explain each step in your statistical reasoning.
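
The prompt does not say which statistical test to use, so any check depends on that choice. The sketch below assumes a two-sided, pooled two-proportion z-test, which is one common option for comparing two groups of this size; `two_proportion_z_test` is a name invented for this example, and a different test (for instance Fisher's exact test) would give a somewhat different p-value.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided pooled z-test for the difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # For a standard normal variable, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# 120 participants split evenly: 42 of 60 positive (treatment), 30 of 60 (control).
z, p_value = two_proportion_z_test(42, 60, 30, 60)
print(f"z = {z:.3f}, p = {p_value:.4f}")
print("Significant at 0.05:", p_value < 0.05)
```

Stating the test explicitly in your prompt gives the model the same context this code relies on.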

Encourage Complete Reasoning

Ask the model to show all steps, including intermediate calculations and logical connections. This helps avoid leaps in reasoning that might hide errors.

You can try this: Solve this probability problem: In a bag of 10 marbles, 4 are blue, 3 are red, and 3 are green. If I draw 3 marbles without replacement, what's the probability they're all different colours? Please show every step of your calculation, including how you determine the total number of possible outcomes and the number of favourable outcomes. Don't skip any mathematical steps.
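
The two counts the prompt asks for, total outcomes and favourable outcomes, can be verified with a few lines. This is a minimal sketch using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

counts = {"blue": 4, "red": 3, "green": 3}

# Total outcomes: choose any 3 of the 10 marbles, order ignored.
total_outcomes = comb(sum(counts.values()), 3)

# Favourable outcomes: one marble of each colour, so multiply the colour counts.
favourable_outcomes = 1
for marbles_of_colour in counts.values():
    favourable_outcomes *= marbles_of_colour

probability = Fraction(favourable_outcomes, total_outcomes)
print(f"{favourable_outcomes} / {total_outcomes} = {probability}")
```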

Check for Logical Consistency

Review the reasoning to ensure each step follows logically from previous steps. Look for contradictions or unsupported assumptions.

Here's an example: Analyse whether this argument is valid: "All birds have wings. Penguins are birds. Therefore, penguins can fly." Walk through your logical analysis step by step, identifying any premises, assumptions, and conclusions. For each step, explain whether it follows from previous steps and identify any logical fallacies. At the end, state whether the argument is deductively valid and sound, explaining your reasoning.

Use the prompts above as starting points for getting clear, thorough reasoning and accurate results. Try experimenting and have fun!

Conclusion

Chain-of-thought prompting takes complex problems and solves them step by step, just like a good teacher working through a tough problem on the board. Instead of jumping straight to the answer, this approach breaks things down into logical steps.

The beauty of this technique is its flexibility. Sometimes, simply asking "Let's work through this step by step" is enough to get detailed reasoning. At other times, you may want to provide an example of the kind of thinking you're looking for, as in the few-shot examples earlier. Either way, you end up with clearer, more trustworthy responses that show you not just what the answer is, but how it was found.
