Prevent LLM Hallucinations with Deterministic MCP Tools

LLMs are powerful, but they have a critical flaw: they hallucinate. When asked to count, calculate, or validate, AI models often produce confident but incorrect answers. This guide explains why hallucinations happen and how deterministic MCP tools eliminate them.

What Are LLM Hallucinations?

LLM hallucinations are outputs that are incorrect, fabricated, or inconsistent with reality. While the term originally referred to AI models making up facts that don't exist, hallucinations in the context of utilities are more subtle: the AI produces a reasonable-looking answer that happens to be wrong.

Consider this interaction:

Example Hallucination
User: How many R's are in the word "strawberry"?

AI: The word "strawberry" contains 2 R's.
    s-t-R-a-w-b-e-R-R-y
        ^         ^ ^
    Wait, let me recount... there are 3 R's.

This is a classic hallucination example. The AI initially gives a confident wrong answer, and even when it tries to verify, its counting is inconsistent. The correct answer (3 R's) is only sometimes reached, and the process reveals the AI isn't actually counting - it's guessing.

Why LLMs Fail at Deterministic Tasks

Understanding why hallucinations occur helps explain why tools are the solution.

LLMs Are Pattern Matchers, Not Computers

LLMs generate text by predicting the most likely next token based on the input and their training data. When you ask "What is 17 x 23?", the model isn't performing multiplication - it's predicting what number typically appears after similar questions in its training data.

Key Insight: An LLM answering "17 x 23 = 391" is doing pattern matching, not arithmetic. It's essentially saying "in my training data, when people wrote 17 x 23 =, the next tokens were usually 391." This works for common calculations but fails for unusual ones.

Tokenization Breaks Character Counting

LLMs process text as tokens, not characters. The word "strawberry" might be tokenized as ["straw", "berry"] or ["str", "aw", "berry"]. The model never "sees" individual letters, making character counting inherently unreliable.
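You can see this yourself with a minimal sketch using the tiktoken library (the exact split depends on the tokenizer and model, so treat the output as illustrative):

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("strawberry")
# The model sees integer token IDs, not letters; the split printed here is
# tokenizer-dependent and only illustrative (e.g. ['straw', 'berry']).
print([enc.decode([t]) for t in tokens])
# Counting actual characters, by contrast, is trivial and deterministic:
print("strawberry".count("r"))   # 3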

Confidence Doesn't Indicate Correctness

LLMs express confidence based on how common the pattern is, not how correct the answer is. A model might be very confident about a math answer because similar calculations appeared frequently in training data, even if this specific calculation is wrong.

Real Examples of Hallucinations

Here are real categories of hallucinations that affect AI agents in production:

Counting Errors

Character Counting
Prompt: "How many vowels are in 'entrepreneurship'?"

Common AI Answers:
- "5 vowels" (wrong)
- "6 vowels" (correct: e-e-u-i-a)
- "4 vowels" (wrong)

The AI often miscounts because it can't reliably
process individual characters.
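The ground truth takes one line of real code to establish (a minimal Python check):

word = "entrepreneurship"
print(sum(c in "aeiou" for c in word))   # 6  (e, e, e, e, u, i)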

Mathematical Errors

Decimal Arithmetic
Prompt: "What is 0.1 + 0.2?"

Correct Answer: 0.3

Common AI Issues:
- Returns "0.30000000000000004" (floating point)
- Returns "0.3" (correct but lucky)
- Explains floating point when not asked

The AI conflates the mathematical answer with
computer science trivia about floating point.
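This example also shows why a tool should use decimal rather than binary floating-point arithmetic internally. A minimal Python sketch (not TinyFn's actual implementation):

from decimal import Decimal

# Binary floating point carries representation error into the answer
print(0.1 + 0.2)                          # 0.30000000000000004
# Decimal arithmetic returns the exact answer the user expects
print(Decimal("0.1") + Decimal("0.2"))    # 0.3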

Conversion Errors

Temperature Conversion
Prompt: "Convert 98.6 degrees Fahrenheit to Celsius"

Correct Answer: 37.0 C (exactly)

Common AI Answers:
- "37 degrees Celsius" (rounded, loses precision info)
- "36.9 degrees Celsius" (calculation error)
- "37.1 degrees Celsius" (rounding error)
- "About 37 degrees" (vague)

The AI applies the formula incorrectly or
rounds at the wrong step.
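The conversion itself is one line of exact arithmetic: C = (F - 32) x 5/9. A minimal sketch (not TinyFn's implementation) using decimal arithmetic so no precision is lost mid-formula:

from decimal import Decimal

def fahrenheit_to_celsius(fahrenheit: str) -> Decimal:
    # C = (F - 32) * 5 / 9, computed with exact decimal arithmetic
    return (Decimal(fahrenheit) - 32) * 5 / 9

print(fahrenheit_to_celsius("98.6"))   # 37.0, exactly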

Validation Errors

Email Validation
Prompt: "Is 'user+tag@example.com' a valid email?"

Correct Answer: Yes (per RFC 5321)

Common AI Errors:
- "No, the + makes it invalid" (wrong)
- "It might work with some providers" (vague)
- "Technically valid but..." (unnecessary hedging)

The AI guesses based on common patterns
instead of applying RFC specifications.

Hash Hallucinations

SHA256 Hashing
Prompt: "What is the SHA256 hash of 'test'?"

Correct Answer: 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08

Common AI Behavior:
- Returns a plausible-looking but wrong hash
- Admits it cannot compute hashes
- Returns a hash from training data that
  doesn't match the input

The AI cannot compute cryptographic functions.
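Unlike an LLM, a few lines of real code compute the digest exactly. Python's standard hashlib, shown here only as a reference point, reproduces the correct answer above:

import hashlib

print(hashlib.sha256(b"test").hexdigest())
# 9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08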

The Deterministic Tool Solution

Deterministic tools solve hallucinations by performing actual computation instead of prediction. When an AI agent uses a tool, it delegates the calculation to code that produces the same correct result every time.

How Tool Calling Works

Without Tools (Hallucination Prone)
User: "What is 17 x 23?"

AI Processing:
1. Parse the question
2. Look up similar patterns in training
3. Predict the most likely answer
4. Output: "391" (correct by luck, not computation)
With Tools (Deterministic)
User: "What is 17 x 23?"

AI Processing:
1. Parse the question
2. Recognize this is a calculation
3. Call tool: math/multiply(17, 23)
4. Tool returns: 391
5. Output: "391" (always correct, computed)

Why Deterministic Tools Work

  • Actual Computation: Tools run real code that computes results mathematically
  • Consistent Results: Same inputs always produce same outputs
  • No Training Bias: Results aren't influenced by what appeared in training data
  • Edge Case Handling: Tools are tested for edge cases that LLMs might miss
  • Specification Compliance: Validation tools implement actual RFC/ISO specifications

Hallucination Prevention in Action

Here's how TinyFn MCP tools prevent each type of hallucination:

Preventing Counting Errors

Character Counting with TinyFn
User: "How many R's are in 'strawberry'?"

AI Action: Call string/count-char tool
Tool Input: { "text": "strawberry", "char": "r" }
Tool Output: { "count": 3, "positions": [2, 8, 9] }

AI Response: "There are 3 R's in 'strawberry',
at positions 2, 7, and 8 (0-indexed)."

Result: Always correct, no guessing
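Under the hood, a count-char tool does nothing more exotic than scanning the string. A rough sketch of the computation (not TinyFn's actual code):

def count_char(text: str, char: str) -> dict:
    # Scan actual characters, not tokens; case-insensitive match
    positions = [i for i, c in enumerate(text.lower()) if c == char.lower()]
    return {"count": len(positions), "positions": positions}

print(count_char("strawberry", "r"))
# {'count': 3, 'positions': [2, 7, 8]}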

Preventing Math Errors

Multiplication with TinyFn
User: "Calculate 17.5 x 23.8"

AI Action: Call math/multiply tool
Tool Input: { "a": 17.5, "b": 23.8 }
Tool Output: { "result": 416.5 }

AI Response: "17.5 x 23.8 = 416.5"

Result: Precise decimal arithmetic

Preventing Conversion Errors

Temperature Conversion with TinyFn
User: "Convert 98.6F to Celsius"

AI Action: Call convert/fahrenheit-to-celsius tool
Tool Input: { "fahrenheit": 98.6 }
Tool Output: { "celsius": 37.0 }

AI Response: "98.6°F is exactly 37.0°C"

Result: Correct formula, correct precision

Preventing Validation Errors

Email Validation with TinyFn
User: "Is 'user+tag@example.com' valid?"

AI Action: Call validate/email tool
Tool Input: { "email": "user+tag@example.com" }
Tool Output: {
  "is_valid": true,
  "local_part": "user+tag",
  "domain": "example.com",
  "has_plus_addressing": true
}

AI Response: "Yes, 'user+tag@example.com' is a valid
email address. The + is allowed per RFC 5321."

Result: RFC-compliant validation
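For reference, even a deliberately simplified check (nowhere near full RFC 5321, shown only to make the point) confirms that '+' is an ordinary, allowed character in the local part:

import re

# Simplified sketch: real RFC 5321/5322 validation has many more rules
ATEXT = re.compile(r"^[A-Za-z0-9.!#$%&'*+/=?^_`{|}~-]+$")

def rough_email_check(email: str) -> bool:
    local, sep, domain = email.partition("@")
    return bool(sep) and bool(ATEXT.match(local)) and "." in domain

print(rough_email_check("user+tag@example.com"))   # True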

Preventing Hash Hallucinations

SHA256 Hashing with TinyFn
User: "SHA256 hash of 'hello world'"

AI Action: Call hash/sha256 tool
Tool Input: { "text": "hello world" }
Tool Output: {
  "hash": "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"
}

AI Response: "The SHA256 hash of 'hello world' is:
b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9"

Result: Cryptographically correct hash

Implementing Hallucination Prevention

Setting up TinyFn to prevent hallucinations is straightforward:

Step 1: Get Your API Key

Sign up at tinyfn.io for a free API key (100 requests/month).

Step 2: Configure MCP

mcp.json
{
  "mcpServers": {
    "tinyfn": {
      "url": "https://api.tinyfn.io/mcp/all/",
      "headers": {
        "X-API-Key": "your-api-key"
      }
    }
  }
}

Step 3: Test

Ask your AI assistant questions that typically cause hallucinations:

  • "How many E's are in 'refrigerator'?"
  • "What is 19.99 x 7?"
  • "Convert 451 Fahrenheit to Celsius"
  • "Is 'name@sub.domain.co.uk' a valid email?"
  • "What's the MD5 hash of my password 'secret123'?"

When to Use Tools vs. Native LLM

Not every task needs a tool. Here's a guide:

Use Tools For:

  • Any calculation: Math, statistics, percentages
  • Counting: Characters, words, occurrences
  • Conversions: Units, temperatures, currencies
  • Validation: Emails, URLs, formats
  • Encoding/Hashing: Base64, SHA256, etc.
  • Date/Time: Timezone conversion, date math

Native LLM Is Fine For:

  • Creative writing and ideation
  • Summarization and explanation
  • Code generation (logic, not calculations)
  • Conversation and discussion
  • Analysis and recommendations

Rule of Thumb: If there's exactly one correct answer, use a tool. If the task is creative or analytical, let the LLM handle it.

Stop the Hallucinations Today

Get your free TinyFn API key and give your AI agents tools that don't hallucinate.

Get Free API Key
