Tom Sawyer's Tool: Using NoOp Tools to Improve LLM Reasoning

by
Britt Crawford
"You can choose from phantom fears and kindness that can kill
I will choose a path that's clear, I will choose free will"
— Rush, "Free Will"

It sounds a little bit ridiculous, but sometimes it helps to give an LLM a tool that does nothing but repeat back what the LLM sends it.

The Problem I Was Trying to Solve

I built a chatbot at foundervsinvestor.com where the AI responds as Liz (a founder) or Jerry (an investor). Some user questions are better answered by Liz, some by Jerry, and some need both perspectives. The AI needs to figure out which persona(s) should respond before generating an answer.

My first attempt used a straightforward prompt:

Determine whether this question should be answered by Liz, Jerry, or both. Then respond as the appropriate character(s).

It failed. The LLM would:

  • Have both Liz and Jerry basically say the same thing
  • Mix up character voices: Liz would speak for the investor and Jerry for the founder
  • Make inconsistent decisions about when both personas should weigh in

I was using gpt-4.1, so it wasn't as if I could just throw a more powerful model at the problem.

What I Tried First

I tried a few things to get it to respond better. They helped, but none of them was good enough.

Strict system prompts: Loading up the system message with detailed instructions about determining which persona should answer.

Explain your reasoning: Asking the LLM to output a structured response that included both the text and its reasoning.

Chain-of-Thought prompting: Adding "Let's think step by step" or "First, analyze which persona should respond" to the prompt.

Multiple API calls: Call the LLM once to determine the responder, then call it again to generate the response. (Okay, honestly this worked fine, but it was too slow.)
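
For context, the two-call version looked roughly like this. This is a reconstruction, not my production code; userQuestion is a placeholder for whatever the user asked.

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

// Call 1: decide who should answer.
const { text: responder } = await generateText({
  model: openai("gpt-4.1"),
  system: 'Decide who should answer. Reply with exactly one word: "liz", "jerry", or "both".',
  prompt: userQuestion, // placeholder for the incoming user message
});

// Call 2: generate the actual reply, with the decision baked into the system prompt.
const { text: reply } = await generateText({
  model: openai("gpt-4.1"),
  system: `Respond as ${responder}.`,
  prompt: userQuestion,
});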

None of these really performed well enough.

The NoOp Tool Solution

Here's what worked. I created a tool called chooseResponder:

import { z } from "zod";
import { tool } from "ai";

export type Responder = "liz" | "jerry" | "both" | "none";

const chooseResponder = tool({
  description: `Always use this tool to determine whether to respond as Liz, Jerry, or Both.
  If the user asks for both Liz and Jerry's opinion, choose "both".
  If the user asks Liz a question, choose "liz".
  If the user asks Jerry a question, choose "jerry".
  If the user asks for a Founder's perspective choose "liz".
  If the user asks for an Investor's perspective choose "jerry".
  If it is not clear, both should answer. 
  `,
  parameters: z.object({
    question: z.string().describe('The question to choose the responder for'),
    reason: z.string().describe('Explain in detail why you chose the responder you did'),
    responder: z.enum(["liz", "jerry", "both"]).describe('The responder you have chosen')
  }).required({ question: true, reason: true, responder: true }),
  // The NoOp: do no real work, just echo the model's own choice back so it lands in context.
  execute: async ({ question, reason, responder }: { question: string, reason: string, responder: Responder }): Promise<Responder> => {
    return responder;
  }
})

The execute function does absolutely nothing except return its input unchanged. But it forces the LLM down a specific generation path: it can no longer generate the response in a single pass. It must:

  1. Generate parameters to pass to the tool
  2. Call the tool
  3. Add the tool response to the context
  4. Generate its response

Using the NoOp tool inserts an explicit decision point into the model's generation process. The LLM cannot proceed to response generation without first making a committed decision about which persona should respond. Then, when it generates its response, that choice is already there in context. It worked right away. The bot stopped mixing up Liz and Jerry or having them say the same thing in slightly different words.

Critical: You need to make the tool required or give the LLM very strong instructions that it must always use it.

Without this, the LLM might skip the tool and go straight to generating a response.

In OpenAI's API, you can use tool_choice: {type: "function", function: {name: "chooseResponder"}} to force the tool call.

Alternatively, include explicit instructions in your system prompt: You MUST always call the chooseResponder tool to decide which persona should answer before responding.
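
Here's a rough sketch of how that wiring might look with the Vercel AI SDK (assuming AI SDK v4's generateText with multi-step tool calling; the prompts and the userQuestion variable are illustrative placeholders, not the code from the site):

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-4.1"),
  system: `You are Liz (a founder) and Jerry (an investor).
You MUST always call the chooseResponder tool before responding.
Then answer only as the responder(s) it returned.`,
  prompt: userQuestion, // placeholder for the incoming user message
  tools: { chooseResponder },
  maxSteps: 2, // step 1: the chooseResponder call, step 2: the persona response
});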

Other Use Cases

I've used this same pattern in a few other places to good effect.

Validating responses: If you want to make sure a long response includes specific information, you can use a NoOp tool to force the model to check its work.
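
For example, a hypothetical checkCoverage tool (the name and fields are illustrative) follows the same NoOp shape:

import { z } from "zod";
import { tool } from "ai";

// NoOp checklist: the model must state what was required and what the draft
// still misses before it is allowed to finalize the response.
const checkCoverage = tool({
  description: "Always call this after drafting a response to confirm it covers every required point.",
  parameters: z.object({
    requiredPoints: z.array(z.string()).describe("The points the response must include"),
    missingPoints: z.array(z.string()).describe("Required points the draft does not yet cover"),
  }),
  execute: async ({ missingPoints }) => missingPoints, // echo only; no real work happens here
});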

Routing decisions: Before choosing which specialized agent or prompt to use, call a NoOp tool to declare the routing decision.

Improving Writing Quality: By introducing an explicit editor tool that must be called after generating a response but before returning the response to the user, you can have a model critique and improve its own writing.
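
A hypothetical critiqueDraft tool (again, names are illustrative, not code from the site) is just another echo:

import { z } from "zod";
import { tool } from "ai";

// NoOp editor pass: forces the model to write down concrete critiques of its draft
// before producing the final version it returns to the user.
const critiqueDraft = tool({
  description: "Always call this with your draft and a list of specific improvements before replying.",
  parameters: z.object({
    draft: z.string().describe("The draft response"),
    critiques: z.array(z.string()).describe("Specific, actionable improvements to make"),
  }),
  execute: async ({ critiques }) => critiques, // returns the critiques unchanged
});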

You can use it anywhere you need to force the LLM through a particular reasoning pattern.

The trick isn't in making the tool do something clever. The trick is in forcing the LLM through particular reasoning paths before generating its response.

It basically allows you to abuse cheaper chat and response models to act more like reasoning models.

Epilogue: What the hell is up with the title and Rush lyrics?

Every day at Sandgarden we end our all-company daily check-in by picking a song. The day I came up with this technique, someone picked a Rush song. The lyrics quoted at the top of this post kept echoing through my head, and for whatever reason I misremembered them as being from "Tom Sawyer." I'm not a Rush fan 🤷‍♂️
