patternfetch · explainer

Why is feeding raw OHLCV candles to an LLM expensive?

Because a candle array is hundreds of numbers per ticker, it gets re-sent on every turn, and the model still has to re-derive the structure itself — exactly where LLMs are least reliable. Here's where the tokens go, why it's also error-prone, and the token-compact fix.

Answer-first · honest · for agent builders

TL;DR.

A raw OHLCV array is hundreds of numbers per ticker: a 200-candle 4h window is about 1,200 numbers (200 candles × 6 fields), which serializes to the low thousands of tokens. That array is re-sent on every turn the agent reasons, and multiplied by every ticker it watches. Worse, the model still has to re-derive the structure (patterns, levels, indicators) itself by doing arithmetic on a long flat list of numbers, which is precisely where LLMs hallucinate: miscounted bars, misread closes, invented RSI values. The fix is to precompute the structure server-side and send a token-compact brief: a few hundred tokens of interpreted state plus a one-line summary the agent can act on, with the raw candles still available compactly when you actually need them.

Where the tokens go

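One OHLCV candle is six numbers: a timestamp and the open, high, low, close, and volume. That's the irreducible unit. Now scale it: a 200-candle window is about 1,200 numbers for one ticker, the whole array is re-sent on every turn the agent reasons, and the total is multiplied again by every ticker the agent watches.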

Back-of-envelope: a 200-candle window serialized as conventional JSON lands in the low thousands of tokens for a single ticker on a single turn. A digested brief covering that same window — patterns, levels, regime, interpreted indicators, and a summary line — is a few hundred tokens. Multiply both by turns and tickers and the gap is the difference between a context window you can afford to keep open and one that blows your budget.

Exact token counts depend on the tokenizer, decimal precision, and whether you use array-of-arrays vs array-of-objects — but the order of magnitude (thousands vs hundreds) holds across models.
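To sanity-check the back-of-envelope figures on your own data, a rough sketch like the one below counts the numbers in a candle window and compares the serialized size of the two JSON layouts. Everything here is an assumption made for illustration: the candles are synthetic, the short field names (t, o, h, l, c, v) and the turns/tickers workload are made up, and the 4-characters-per-token ratio is a crude rule of thumb that likely undercounts digit-heavy text. Swap in your model's real tokenizer for actual counts.

```python
import json
import random


def synthetic_candles(n, start_ms=1718000000000, step_ms=4 * 60 * 60 * 1000, seed=7):
    """Generate n fake [ts, open, high, low, close, volume] rows (illustration only)."""
    rng = random.Random(seed)
    rows, price = [], 60000.0
    for i in range(n):
        open_ = price
        close = round(open_ + rng.uniform(-400, 400), 1)
        high = round(max(open_, close) + rng.uniform(0, 200), 1)
        low = round(min(open_, close) - rng.uniform(0, 200), 1)
        volume = round(rng.uniform(800, 1500), 1)
        rows.append([start_ms + i * step_ms, round(open_, 1), high, low, close, volume])
        price = close
    return rows


def rough_tokens(text, chars_per_token=4.0):
    """Crude size estimate; chars_per_token is an assumption, not a tokenizer."""
    return int(len(text) / chars_per_token)


candles = synthetic_candles(200)
number_count = sum(len(row) for row in candles)  # 200 rows x 6 fields

# Same data, two common layouts: array-of-arrays vs array-of-objects.
as_arrays = json.dumps({"candles": candles}, separators=(",", ":"))
keys = ["t", "o", "h", "l", "c", "v"]  # hypothetical field names
as_objects = json.dumps(
    {"candles": [dict(zip(keys, row)) for row in candles]}, separators=(",", ":")
)

print("numbers per window:", number_count)
print("array-of-arrays  ~tokens:", rough_tokens(as_arrays))
print("array-of-objects ~tokens:", rough_tokens(as_objects))

# Per-turn cost compounds: every reasoning turn re-sends every ticker's window.
turns, tickers = 10, 5  # hypothetical agent workload
print("array-of-arrays ~tokens per session:", rough_tokens(as_arrays) * turns * tickers)
```

The array-of-objects layout always comes out larger because every row repeats its field names, which is one reason the exact count varies so much between APIs.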

Why it's also error-prone

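Cost is only half the problem. Long flat arrays of numbers are the worst-case input for a language model. LLMs don't iterate an index and count; they pattern-match and approximate. On a raw candle array that means miscounted bars, misreading which row is the latest close, and invented indicator values, such as an RSI that was never in the data.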

Interpreted state sidesteps all of this. If the structure is computed deterministically server-side and handed to the model as labels and a few key numbers — "rsi":{"v":58.3,"state":"neutral"}, "trend":"up", "double_bottom" (conf 0.86) — there is no long-array arithmetic left for the model to get wrong. It reasons over a clean, small, already-correct summary.
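As an illustration of what "computed deterministically server-side" can look like, here is a minimal sketch that derives an RSI with Wilder's smoothing and attaches a state label in the same {v, state} shape as the example above. It is not patternfetch's actual implementation: the function names, the 14-period default, and the 30/70 thresholds are assumptions, chosen only because they are the conventional defaults.

```python
def wilder_rsi(closes, period=14):
    """Return the latest RSI value using Wilder's smoothing."""
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closes")
    changes = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(c, 0.0) for c in changes]
    losses = [max(-c, 0.0) for c in changes]

    # Seed with a simple average over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    # Then fold each later change into the running averages.
    for gain, loss in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + gain) / period
        avg_loss = (avg_loss * (period - 1) + loss) / period

    if avg_loss == 0:
        return 100.0  # no losses in the smoothed window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def interpret_rsi(value, oversold=30.0, overbought=70.0):
    """Map a raw RSI value to a compact {v, state} record (thresholds are assumptions)."""
    if value >= overbought:
        state = "overbought"
    elif value <= oversold:
        state = "oversold"
    else:
        state = "neutral"
    return {"v": round(value, 1), "state": state}
```

The point of the design is that the loop over bars runs in ordinary code, where indexing and counting are exact, and the model only ever sees the two-field result.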

The token-compact alternative

One /v1/brief call to patternfetch returns the digested market state for a ticker and timeframe: detected chart & candlestick patterns, support/resistance levels, a regime label, interpreted RSI/EMA/ATR (value + state, not a raw series), and a one-line nl summary the agent can act on directly. It's a few hundred tokens — and the raw candles are still there, compactly, in codec.rows, for when you genuinely need to compute something yourself.

Before — raw OHLCV (truncated)
{ "candles": [
  [1718000000000,60125.4,60480.0,59890.1,60310.7,1284.3],
  [1718014400000,60310.7,60790.2,60180.0,60655.9,1102.8],
  [1718028800000,60655.9,60980.4,60410.3,60540.1, 998.1],
  [1718043200000,60540.1,60600.0,59980.7,60120.4,1411.6],
  … 196 more rows …
] }
# ~1,200 numbers · re-sent every turn · the model
# still has to find the close, count bars, derive RSI
After — token-compact brief
{
  "header": { "sym":"BTC/USDT","tf":"4h","n":200 },
  "codec": { "rows":"60125.4,60480,59890.1,60310.7,1284;…",
             "sax":"dcefdcbe", "precision":1 },
  "analysis": {
    "patterns":[{"name":"double_bottom","confidence":0.86}],
    "levels":{ "support":[{"price":59820.4}],
               "resistance":[{"price":63450.8}] },
    "regime":{ "trend":"up","strength":0.42,"volPct":2.13 },
    "indicators":{ "rsi":{"v":58.3,"state":"neutral"},
                   "ema":{"v":61240.8,"state":"above_20_50"} },
    "nl":"BTC/USDT: uptrend (moderate), +1.94% last 4h,
          RSI 58.3 (neutral), double_bottom (conf 0.86)."
  }
}
# few hundred tokens · raw rows still in codec.rows

The nl line is a ready-to-reason summary, so the model never has to do arithmetic on raw numbers. The codec.rows field keeps the actual candles available in a compact, comma-packed form if you do need them. See it live →
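For the cases where you do need the raw candles, a decoder along these lines would turn codec.rows back into typed rows. The layout is inferred from the sample brief above and is an assumption, not a documented format: rows separated by semicolons, five comma-separated fields per row in open, high, low, close, volume order, and no timestamp. The Candle type and decode_rows function are hypothetical names.

```python
from typing import NamedTuple


class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float
    volume: float


def decode_rows(packed: str) -> list[Candle]:
    """Split a ';'-separated, ','-packed string into Candle tuples.

    Assumed field order (not confirmed by the API docs): open,high,low,close,volume.
    """
    candles = []
    for chunk in packed.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate a trailing separator
        fields = [float(x) for x in chunk.split(",")]
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields, got {len(fields)}: {chunk!r}")
        candles.append(Candle(*fields))
    return candles


# Illustrative input: first row copied from the sample brief, second row made up to match.
sample = "60125.4,60480,59890.1,60310.7,1284;60310.7,60790.2,60180,60655.9,1103"
rows = decode_rows(sample)
latest_close = rows[-1].close  # read by index in code, not counted by the model
print(len(rows), latest_close)
```

Decoding in code rather than in the prompt keeps the "which row is the latest close" question out of the model's hands entirely.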

How to use it

POST a ticker and timeframe to /v1/brief:

curl -X POST https://patternfetch.com/v1/brief \
  -H "Authorization: Bearer pf_…" \
  -H "Content-Type: application/json" \
  -d '{ "ticker": "BTC/USDT", "timeframe": "4h" }'
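The same call from Python, as a minimal sketch using only the standard library. The URL, headers, and body mirror the curl command; the function names and the 10-second timeout are assumptions, and build_brief_request is split out so the request can be inspected without touching the network.

```python
import json
import urllib.request

BRIEF_URL = "https://patternfetch.com/v1/brief"


def build_brief_request(api_key: str, ticker: str, timeframe: str) -> urllib.request.Request:
    """Build the same POST as the curl command, without sending it."""
    body = json.dumps({"ticker": ticker, "timeframe": timeframe}).encode("utf-8")
    return urllib.request.Request(
        BRIEF_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def fetch_brief(api_key: str, ticker: str, timeframe: str, timeout: float = 10.0) -> dict:
    """Send the request and parse the JSON response (urlopen raises on HTTP errors)."""
    request = build_brief_request(api_key, ticker, timeframe)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Assuming the response matches the sample brief shown earlier, fetch_brief(key, "BTC/USDT", "4h")["analysis"]["nl"] would be the one-line summary to hand to the agent.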
Free to start — no card:

① No-signup demo: POST /v1/demo returns a real brief with no key.
② Free MCP discovery: agents can call tools/list with no key.
③ $0.01 per brief with a free key ($0.05 starter credit), one-click OAuth for MCP (Smithery, Claude.ai, Cursor, Claude Desktop), and pay-per-call via x402 (USDC, no account) or Stripe. Pricing →

FAQ

How many tokens does a candle array use?

One candle is 6 numbers (timestamp, open, high, low, close, volume). A 200-candle window is ~1,200 numbers; serialized as JSON with field names, brackets and commas — and with long decimals split across multiple tokens — that lands in the low thousands of tokens per ticker, re-sent every turn and multiplied by every ticker. A digested brief over the same window is a few hundred tokens.

Why do LLMs miscount candles?

Because long flat number arrays are the input they're worst at. Models approximate rather than iterate an index, so they miscount bars, misread which row is the latest close, and sometimes invent indicator values (like an RSI) that were never in the data. Sending interpreted state instead of raw numbers removes the arithmetic step where those errors happen.

Can I still get the raw numbers?

Yes. A brief still includes the candles compactly in codec.rows, and POST /v1/candles returns the compact candle codec on its own. You get digested state for reasoning and the raw rows when you need to compute something yourself — without paying full-array token cost on every turn.

Is this investment advice?

No. It's impersonal market data and algorithmic signals, for informational purposes only — not advice, not personalized, non-executing. See the disclaimer.
