So you’ve learned about Toon Format and its impressive token savings. But here’s the real question: when should you actually use it?
Not every project needs TOON. Sometimes JSON works just fine. But if you’re dealing with structured data and LLMs, there are specific scenarios where switching to TOON can save you serious time, money, and headaches.
TL;DR: When TOON Shines
- RAG Systems: Feeding large datasets to LLMs for retrieval
- API Cost Optimization: High-volume LLM API calls with structured data
- Agent Systems: Multi-step workflows with data passing between agents
- Prompt Engineering: Few-shot examples with structured data
- Batch Processing: Converting datasets for LLM analysis
Use Case #1: RAG Systems with Large Datasets
Problem: You’re building a Retrieval-Augmented Generation system. Your vector search returns 20 product records, and you need to stuff them into the LLM’s context for the user’s question.
With JSON, those 20 products might consume 3,000+ tokens before you even add the user’s question or system prompt.
Solution: Convert your retrieved data to TOON before sending it to the LLM.
Before (JSON):
[
  {
    "id": "prod-001",
    "name": "Wireless Mouse",
    "price": 29.99,
    "stock": 145,
    "category": "Electronics"
  },
  {
    "id": "prod-002",
    "name": "USB-C Cable",
    "price": 12.99,
    "stock": 89,
    "category": "Electronics"
  }
  // ... 18 more products
]
After (TOON):
products[20]{id,name,price,stock,category}:
  prod-001,Wireless Mouse,29.99,145,Electronics
  prod-002,USB-C Cable,12.99,89,Electronics
  ...
Impact: With 20 products, you save approximately 1,200 tokens. That’s enough space to include more context, examples, or retrieved chunks.
Implementation Guide:
from toon import to_toon
import anthropic

# Your RAG retrieval
search_results = vector_db.search(query, limit=20)

# Convert to TOON
products_toon = to_toon(search_results)

# Build prompt
prompt = f"""Here are the available products:

{products_toon}

User question: {user_query}

Please recommend the best option and explain why."""

# Send to LLM (max_tokens is required by the Messages API)
client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": prompt}],
)
Use Case #2: Reducing API Costs in Production
Problem: Your app makes 1 million LLM API calls per month. Each call includes a structured dataset (user profile, purchase history, preferences). Your monthly token bill is $4,500.
Solution: Switch the structured data portions to TOON format.
Let’s say each request looks like this:
Before (JSON - 450 tokens):
{
  "user": {
    "id": "u-12345",
    "tier": "premium",
    "joinDate": "2024-03-15"
  },
  "purchases": [
    {"date": "2024-12-01", "item": "Widget A", "amount": 49.99},
    {"date": "2024-12-15", "item": "Gadget B", "amount": 89.99},
    {"date": "2025-01-10", "item": "Tool C", "amount": 129.99}
  ],
  "preferences": {
    "notifications": true,
    "newsletter": false
  }
}
After (TOON - ~270 tokens):
user:
  id: u-12345
  tier: premium
  joinDate: 2024-03-15
purchases[3]{date,item,amount}:
  2024-12-01,Widget A,49.99
  2024-12-15,Gadget B,89.99
  2025-01-10,Tool C,129.99
preferences:
  notifications: true
  newsletter: false
Cost Impact:
- Token reduction per request: 180 tokens (40%)
- Monthly token savings: 180M tokens
- Cost savings: ~$1,800/month (at typical API pricing)
That’s a 40% reduction in your data transfer costs, without changing any application logic.
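If you're curious what the conversion is actually doing, here is a minimal, illustrative sketch of the tabular encoding shown above. `encode_tabular` is a hypothetical helper, not part of any library; the real `to_toon` also handles nesting, quoting, and edge cases.

```python
# Minimal sketch of TOON's tabular encoding for a uniform list of flat dicts.
def encode_tabular(name, rows):
    """Emit a TOON tabular block: header with length + fields, then CSV-style rows."""
    fields = list(rows[0].keys())
    header = f"{name}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(r[f]) for f in fields) for r in rows]
    return "\n".join([header] + lines)

purchases = [
    {"date": "2024-12-01", "item": "Widget A", "amount": 49.99},
    {"date": "2024-12-15", "item": "Gadget B", "amount": 89.99},
    {"date": "2025-01-10", "item": "Tool C", "amount": 129.99},
]
print(encode_tabular("purchases", purchases))
```

Because the field names appear once in the header instead of being repeated in every row, the token count drops roughly in proportion to the number of rows.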
Use Case #3: Multi-Agent Systems
Problem: You’re building an agent system where Agent A retrieves data, Agent B processes it, and Agent C generates a report. Each handoff involves passing structured data through the LLM.
With JSON, each agent consumes extra tokens just for syntax overhead.
Solution: Use TOON as the “data wire format” between agents.
Workflow Example:
from toon import to_toon

# Agent A: Data Retrieval
def agent_a_retrieve(query):
    results = database.query(query)
    return to_toon(results)  # Convert to TOON

# Agent B: Data Processing
def agent_b_process(data_toon):
    # Note: literal braces in an f-string must be doubled ({{ }})
    prompt = f"""Analyze this data and extract key insights:

{data_toon}

Provide insights in TOON format:
insights[N]{{category,description,importance}}:
..."""
    response = llm.generate(prompt)
    return response  # Already in TOON

# Agent C: Report Generation
def agent_c_report(insights_toon):
    prompt = f"""Generate a summary report from these insights:

{insights_toon}"""
    return llm.generate(prompt)
Benefits:
- Less token waste between agent handoffs
- Cleaner, more readable intermediate data
- Explicit schemas help agents understand data structure
Use Case #4: Prompt Engineering with Examples
Problem: You’re doing few-shot prompting and need to include 3-5 examples in your prompt. Each example has structured input/output. The examples alone consume 2,000+ tokens.
Solution: Format your examples in TOON to save tokens for the actual instruction and response.
Before:
Example 1:
Input: {"userId": "u-001", "items": [{"sku": "A1", "qty": 2}, {"sku": "B2", "qty": 1}]}
Output: {"total": 145.97, "shipping": 12.00, ...}
Example 2:
Input: {"userId": "u-002", "items": [{"sku": "C3", "qty": 5}]}
Output: {"total": 89.95, "shipping": 0, ...}
Token count: ~600 tokens
After:
Example 1:
Input:
userId: u-001
items[2]{sku,qty}:
  A1,2
  B2,1
Output:
total: 145.97
shipping: 12.00

Example 2:
Input:
userId: u-002
items[1]{sku,qty}:
  C3,5
Output:
total: 89.95
shipping: 0
Token count: ~360 tokens (40% savings)
This gives you room to include MORE examples, which often improves model performance more than verbose formatting.
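Assembling a few-shot prompt from TOON examples is mostly string plumbing. A sketch, assuming your examples have already been converted to TOON strings (e.g. via `to_toon`); `build_few_shot_prompt` is a hypothetical helper:

```python
# Sketch: build a few-shot prompt from (input, output) pairs of TOON strings.
def build_few_shot_prompt(instruction, examples, query_toon):
    parts = [instruction, ""]
    for i, (inp_toon, out_toon) in enumerate(examples, 1):
        parts += [f"Example {i}:", "Input:", inp_toon, "Output:", out_toon, ""]
    parts += ["Now process this input:", "Input:", query_toon]
    return "\n".join(parts)

examples = [
    ("userId: u-001\nitems[2]{sku,qty}:\n  A1,2\n  B2,1",
     "total: 145.97\nshipping: 12.00"),
]
prompt = build_few_shot_prompt(
    "Compute the order total from the input.",
    examples,
    "userId: u-003\nitems[1]{sku,qty}:\n  D4,2",
)
print(prompt)
```

The token budget you free up per example is what lets you add a third, fourth, or fifth example without crowding out the instruction.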
Use Case #5: Data Export & Batch Processing
Problem: You have 10,000 database records to analyze with an LLM. You can’t fit them all in one prompt, so you batch them into groups of 100.
Solution: Export batches to TOON format, maximizing the records per batch.
Scenario: Customer feedback analysis
Script:
import pandas as pd
from toon import to_toon

# Load customer feedback
df = pd.read_csv('feedback.csv')

# Process in batches
batch_size = 100
results = []

for i in range(0, len(df), batch_size):
    batch = df[i:i+batch_size].to_dict('records')
    batch_toon = to_toon({'feedback': batch})

    # Literal braces in an f-string must be doubled ({{ }})
    prompt = f"""Analyze this customer feedback and categorize by sentiment:

{batch_toon}

Return categories in TOON format:
categories[N]{{id,sentiment,theme}}:
..."""

    response = llm.generate(prompt)  # llm: your LLM client wrapper
    results.append(response)
Impact: By using TOON, you can fit 100 records instead of ~70 with JSON. That’s 30% fewer API calls needed.
When NOT to Use TOON
Let’s be honest: TOON isn’t always the answer. Here are scenarios where you should stick with JSON:
- Browser-based applications: If you’re sending data directly to a web frontend, JSON is native. The conversion overhead isn’t worth it.
- Small data payloads: If you’re only sending 3-4 fields, the token savings are minimal (maybe 10-20 tokens). Not worth the complexity.
- Non-LLM APIs: If your data is going to a REST API or traditional service, they expect JSON. Don’t convert unnecessarily.
- Complex nested structures: TOON excels with tabular data (arrays of objects). If your data is heavily nested with irregular shapes, JSON might be clearer.
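One way to operationalize that last point is a quick shape check before converting. This is an illustrative heuristic, not official guidance; `is_toon_friendly` and the `min_rows` threshold are assumptions:

```python
# Heuristic: TOON pays off for reasonably large, uniform arrays of flat records.
def is_toon_friendly(rows, min_rows=5):
    if not isinstance(rows, list) or len(rows) < min_rows:
        return False  # too small: savings won't cover the complexity
    if not all(isinstance(r, dict) for r in rows):
        return False
    first_keys = set(rows[0])
    # All rows share one schema and no values are nested containers
    return all(
        set(r) == first_keys
        and all(not isinstance(v, (dict, list)) for v in r.values())
        for r in rows
    )
```

Anything that fails the check can simply stay as JSON in the same prompt; mixing the two formats is fine as long as you label them.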
Getting Started Checklist
Ready to try TOON in your project? Here’s a practical checklist:
✓ Identify high-volume structured data flows
- Where are you sending arrays of objects to LLMs?
- Which prompts have repeated data patterns?
✓ Measure current token usage
- Use your LLM provider’s API to check token counts
- Calculate the cost per request
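For a first pass you can compare formats with a crude character-based estimate; for real numbers, use your provider's tokenizer. A sketch, where `rough_token_count` and the sample data are illustrative:

```python
import json

def rough_token_count(text):
    # Very rough estimate (~4 characters per token for English-ish text).
    # Use your provider's tokenizer for billing-grade measurements.
    return max(1, len(text) // 4)

records = [{"id": f"prod-{i:03d}", "price": 9.99} for i in range(1, 21)]
as_json = json.dumps(records, indent=2)
as_toon = "products[20]{id,price}:\n" + "\n".join(
    f"  {r['id']},{r['price']}" for r in records
)
print("JSON est.:", rough_token_count(as_json))
print("TOON est.:", rough_token_count(as_toon))
```

Even this crude estimate makes the gap visible; the exact percentage will vary with your tokenizer and data shape.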
✓ Run a pilot conversion
- Pick one endpoint or workflow
- Convert the data portion to TOON
- Measure token reduction and accuracy
✓ Add conversion functions
// Add to your codebase
import { toToon, fromToon } from '@toon-format/toon';

function prepareForLLM(data) {
  return toToon(data);
}

function parseFromLLM(toonString) {
  return fromToon(toonString);
}
✓ Update prompts
- Add clear instructions that data is in TOON format
- Include example TOON structures for output generation
✓ Monitor and iterate
- Track token savings
- Monitor parsing errors (should decrease)
- Measure response quality
Conclusion
Toon Format isn’t a replacement for JSON in your application layer. It’s a specialized tool for the LLM interface layer—the boundary where structured data meets natural language processing.
The sweet spot for TOON is anywhere you have:
- High volume (many API calls or large datasets)
- Structured data (arrays of objects with consistent schemas)
- Token constraints (cost concerns or context limits)
If you hit even two of those three, TOON is worth testing. Start with one high-impact use case, measure the results, and expand from there.
The 40% token savings are real, but the bigger win is often the improved reliability in structured output generation. Less syntax means fewer ways for the LLM to mess up.
Ready to optimize your LLM workflows? Check out the official Toon Format documentation or try our JSON to TOON converter to see your own data transformed.