Because a candle array is hundreds of numbers per ticker, it gets re-sent on every turn, and the model still has to re-derive the structure itself — exactly where LLMs are least reliable. Here's where the tokens go, why it's also error-prone, and the token-compact fix.
Answer-first · honest · for agent builders
A raw OHLCV array is hundreds of numbers per ticker: a 200-candle 4h window is about 1,200 numbers (200 candles × 6 fields), which serializes to a few thousand tokens. That array is re-sent every turn the agent reasons, and multiplied by every ticker it watches. Worse, the model still has to re-derive the structure (patterns, levels, indicators) itself by doing arithmetic on a long flat list of numbers, which is precisely where LLMs hallucinate (miscounted bars, misread closes, invented RSI values). The fix: precompute the structure server-side and send a token-compact brief, meaning a few hundred tokens of interpreted state plus a one-line summary the agent can act on, with the raw candles still available compactly when you actually need them.
One OHLCV candle is six numbers: a timestamp and the open, high, low, close, and volume. That's the irreducible unit. Now scale it:
Serialize each candle as a JSON object such as {"t":...,"o":...,"h":...,"l":...,"c":...,"v":...} and you add field names, braces, colons and commas to every single row. Tokenizers split long decimals like 60480.17 into multiple tokens, so a "number" is rarely one token.
Back-of-envelope: a 200-candle window serialized as conventional JSON lands in the low thousands of tokens for a single ticker on a single turn. A digested brief covering that same window (patterns, levels, regime, interpreted indicators, and a summary line) is a few hundred tokens. Multiply both by turns and tickers and the gap is the difference between a context window you can afford to keep open and one that blows your budget.
Exact token counts depend on the tokenizer, decimal precision, and whether you use array-of-arrays vs array-of-objects — but the order of magnitude (thousands vs hundreds) holds across models.
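To make the back-of-envelope arithmetic concrete, here is a minimal Python sketch that sizes both JSON layouts for a 200-candle window. The candles are synthetic (made-up prices, used only for sizing), and the 4-characters-per-token ratio is a crude assumption rather than a real tokenizer, so treat the output as an order-of-magnitude check. The names make_candles, approx_tokens and context_cost are illustrative helpers, not part of any API.

```python
import json
import random

def make_candles(n, start_ts=1718000000000, step_ms=14_400_000, seed=0):
    """Generate n synthetic [t, o, h, l, c, v] rows (made-up prices, sizing only)."""
    rng = random.Random(seed)
    rows, price = [], 60000.0
    for i in range(n):
        o = price
        c = round(o + rng.uniform(-400, 400), 1)
        h = round(max(o, c) + rng.uniform(0, 200), 1)
        l = round(min(o, c) - rng.uniform(0, 200), 1)
        v = round(rng.uniform(900, 1500), 1)
        rows.append([start_ts + i * step_ms, o, h, l, c, v])
        price = c
    return rows

def approx_tokens(text, chars_per_token=4):
    """Crude estimate (assumption); real tokenizers differ, especially on digits."""
    return len(text) // chars_per_token

def context_cost(tokens_per_payload, turns, tickers):
    """Total tokens spent if the payload is re-sent every turn for every ticker."""
    return tokens_per_payload * turns * tickers

candles = make_candles(200)
keys = ("t", "o", "h", "l", "c", "v")

# Same data, two layouts: array-of-arrays vs array-of-objects.
as_arrays = json.dumps({"candles": candles}, separators=(",", ":"))
as_objects = json.dumps(
    {"candles": [dict(zip(keys, row)) for row in candles]},
    separators=(",", ":"),
)

numbers = sum(len(row) for row in candles)
print("numbers in window:", numbers)
print("array-of-arrays  ~tokens:", approx_tokens(as_arrays))
print("array-of-objects ~tokens:", approx_tokens(as_objects))

# Illustrative multiplication: 20 turns, 5 tickers, raw payload vs a
# placeholder 300-token brief ("a few hundred tokens" in the text above).
print("raw, 20 turns x 5 tickers:  ", context_cost(approx_tokens(as_arrays), 20, 5))
print("brief, 20 turns x 5 tickers:", context_cost(300, 20, 5))
```

Swapping approx_tokens for the tokenizer of the model you actually use would give exact counts; the comparison between the two layouts and the turns × tickers multiplication stay the same.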
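Cost is only half the problem. Long flat arrays of numbers are the worst-case input for a language model. LLMs don't iterate an index and count; they pattern-match and approximate. On a raw candle array that means miscounted bars, misreading which row holds the latest close, and indicator values (such as an RSI) that were never in the data.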
Interpreted state sidesteps all of this. If the structure is computed deterministically server-side and handed to the model as labels and a few key numbers — "rsi":{"v":58.3,"state":"neutral"}, "trend":"up", "double_bottom" (conf 0.86) — there is no long-array arithmetic left for the model to get wrong. It reasons over a clean, small, already-correct summary.
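As an illustration of what "computed deterministically and handed over as a label plus a key number" can look like, here is a small Python sketch that derives an RSI value and attaches a state label. It uses the conventional Wilder-smoothed RSI with a 14-bar period and 30/70 thresholds. Those choices, the "oversold"/"overbought" labels, and the function names are assumptions for this example; the text above only shows the {"v", "state"} shape and the "neutral" label, not how patternfetch computes them.

```python
def wilder_rsi(closes, period=14):
    """Wilder-smoothed RSI over a list of closes (conventional formulation)."""
    if len(closes) <= period:
        raise ValueError("need more than `period` closes")
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]

    # Seed with simple averages over the first `period` changes.
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period

    # Wilder smoothing for the remaining changes.
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period

    if avg_gain == 0 and avg_loss == 0:
        return 50.0  # flat series: treated as neutral (a choice made for this sketch)
    if avg_loss == 0:
        return 100.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)

def rsi_state(value, low=30.0, high=70.0):
    """Map an RSI value to a label; thresholds are the common 30/70 convention."""
    if value < low:
        return "oversold"
    if value > high:
        return "overbought"
    return "neutral"

def interpret_rsi(closes, period=14):
    """Return the compact {"v": ..., "state": ...} shape used in the text above."""
    value = wilder_rsi(closes, period)
    return {"v": round(value, 1), "state": rsi_state(value)}

# A strictly rising series has no losses, so RSI saturates at 100.
print(interpret_rsi(list(range(100, 130))))
```

The point of the design is that the loop over bars runs in ordinary code, where counting is exact, and the model only ever sees the two-field result.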
One /v1/brief call to patternfetch returns the digested market state for a ticker and timeframe: detected chart & candlestick patterns, support/resistance levels, a regime label, interpreted RSI/EMA/ATR (value + state, not a raw series), and a one-line nl summary the agent can act on directly. It's a few hundred tokens — and the raw candles are still there, compactly, in codec.rows, for when you genuinely need to compute something yourself.
{ "candles": [
[1718000000000,60125.4,60480.0,59890.1,60310.7,1284.3],
[1718014400000,60310.7,60790.2,60180.0,60655.9,1102.8],
[1718028800000,60655.9,60980.4,60410.3,60540.1, 998.1],
[1718043200000,60540.1,60600.0,59980.7,60120.4,1411.6],
… 196 more rows …
] }
# ~1,200 numbers · re-sent every turn · the model
# still has to find the close, count bars, derive RSI
{
"header": { "sym":"BTC/USDT","tf":"4h","n":200 },
"codec": { "rows":"60125.4,60480,59890.1,60310.7,1284;…",
"sax":"dcefdcbe", "precision":1 },
"analysis": {
"patterns":[{"name":"double_bottom","confidence":0.86}],
"levels":{ "support":[{"price":59820.4}],
"resistance":[{"price":63450.8}] },
"regime":{ "trend":"up","strength":0.42,"volPct":2.13 },
"indicators":{ "rsi":{"v":58.3,"state":"neutral"},
"ema":{"v":61240.8,"state":"above_20_50"} },
"nl":"BTC/USDT: uptrend (moderate), +1.94% last 4h,
RSI 58.3 (neutral), double_bottom (conf 0.86)."
}
}
# few hundred tokens · raw rows still in codec.rows
The nl line is a ready-to-reason summary, so the model never has to do arithmetic on raw numbers. The codec.rows field keeps the actual candles available in a compact, comma-packed form if you do need them. See it live →
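If you do need the raw candles, unpacking a comma-packed string takes only a few lines of Python. The sketch below assumes the layout suggested by the sample above: rows separated by ";", five comma-separated values per row, in open, high, low, close, volume order. That layout is inferred from the single sample row, not from a published specification, and the Candle type and parse_rows name are hypothetical.

```python
from typing import NamedTuple

class Candle(NamedTuple):
    open: float
    high: float
    low: float
    close: float
    volume: float

def parse_rows(rows: str) -> list[Candle]:
    """Unpack a ';'-separated, comma-packed row string into Candle tuples.

    Assumed field order per row: open, high, low, close, volume.
    """
    candles = []
    for chunk in rows.split(";"):
        chunk = chunk.strip()
        if not chunk or chunk == "…":
            continue  # skip empty pieces and the literal ellipsis from the sample
        o, h, l, c, v = (float(x) for x in chunk.split(","))
        candles.append(Candle(o, h, l, c, v))
    return candles

# The truncated sample string from the brief shown above.
sample = "60125.4,60480,59890.1,60310.7,1284;…"
parsed = parse_rows(sample)
print(parsed[-1].close)
```

Doing this in code rather than in the prompt keeps the arithmetic out of the model: the agent can call a tool that parses the rows and returns only the number it asked for.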
POST a ticker and timeframe to /v1/brief:
curl -X POST https://patternfetch.com/v1/brief \
-H "Authorization: Bearer pf_…" \
-H "Content-Type: application/json" \
-d '{ "ticker": "BTC/USDT", "timeframe": "4h" }'
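The same call can be sketched in Python with only the standard library. The URL, headers and body fields mirror the curl example above; the function names, the timeout, and reading the summary from analysis.nl (following the sample brief shown earlier) are assumptions for this example, and the API key is passed in by the caller rather than hard-coded.

```python
import json
import urllib.request

BRIEF_URL = "https://patternfetch.com/v1/brief"  # endpoint from the curl example

def build_brief_request(api_key, ticker, timeframe):
    """Build the POST request without sending it."""
    body = json.dumps({"ticker": ticker, "timeframe": timeframe}).encode("utf-8")
    return urllib.request.Request(
        BRIEF_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

def fetch_brief(api_key, ticker, timeframe, timeout=10.0):
    """Send the request and decode the JSON response body."""
    req = build_brief_request(api_key, ticker, timeframe)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example usage (performs a network call, so it is left commented out):
# brief = fetch_brief(my_api_key, "BTC/USDT", "4h")
# print(brief["analysis"]["nl"])  # field path assumed from the sample brief
```

Splitting request construction from sending makes the payload easy to inspect or unit-test without touching the network.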
① No-signup demo: POST /v1/demo returns a real brief with no key.
② Free MCP discovery: agents can tools/list with no key.
③ $0.01 per brief with a free key ($0.05 starter credit), one-click OAuth for MCP (Smithery, Claude.ai, Cursor, Claude Desktop), and pay-per-call via x402 (USDC, no account) or Stripe.
Pricing →
One candle is 6 numbers (timestamp, open, high, low, close, volume). A 200-candle window is ~1,200 numbers; serialized as JSON with field names, brackets and commas — and with long decimals split across multiple tokens — that lands in the low thousands of tokens per ticker, re-sent every turn and multiplied by every ticker. A digested brief over the same window is a few hundred tokens.
LLMs hallucinate on candle data because long flat number arrays are the input they're worst at. Models approximate rather than iterate an index, so they miscount bars, misread which row is the latest close, and sometimes invent indicator values (like an RSI) that were never in the data. Sending interpreted state instead of raw numbers removes the arithmetic step where those errors happen.
Raw candles stay available. A brief still includes the candles compactly in codec.rows, and POST /v1/candles returns the compact candle codec on its own. You get digested state for reasoning and the raw rows when you need to compute something yourself, without paying full-array token cost on every turn.
This is not advice. It's impersonal market data and algorithmic signals, for informational purposes only: not personalized, non-executing. See the disclaimer.