What Claude looks like running 32 specialised trading agents.
A single Claude API call is a chat. 32 specialised Claude calls organised as a hedge fund's departments is a different category. iQntX is the production version of that idea — Strategist, Risk Gate, FactChecker, DoubleChecker, Macro Officer, all running on Claude as the reasoning substrate, routed subscription-first to keep cost retail-class.
What a real Claude API trading bot looks like
A weekend Claude trading bot is one API call in a cron loop. It's a tutorial, not a system. The production version of "Claude API trading bot" is a multi-agent architecture where Claude is the reasoning substrate for 20+ specialised agents — each with its own skill prompt, its own veto authority, its own journal entries.
iQntX is the production version. The same 32-agent fund engine the rest of this site describes is, at the reasoning layer, a coordinated set of Claude calls.
The architecture, from the Claude angle
                 ┌────────────────────────┐
                 │       LLM Router       │
                 │       (4 routes)       │
                 └────────────────────────┘
                             │
         ┌───────────────────┼───────────────────┐
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│   Claude CLI    │ │    Codex CLI    │ │  Anthropic API  │
│ (subscription)  │ │ (subscription)  │ │   (fallback)    │
│   Priority 1    │ │   Priority 2    │ │   Priority 3    │
└─────────────────┘ └─────────────────┘ └─────────────────┘
                             │
                             ▼
                    ┌─────────────────┐
                    │   OpenAI API    │
                    │   (fallback)    │
                    │   Priority 4    │
                    └─────────────────┘
Per call: circuit-broken, cost-logged, latency-tracked.
20+ of the 32 iQntX agents call Claude (either via CLI or API depending on route availability). The remaining agents either run pure logic (no LLM) or call Codex for code-shaped tasks.
Read the full LLM-routing architecture →
Why subscription-first matters
Subscription-first routing is the single most important cost-control mechanism for a 32-agent system at retail scale.
The Claude Max x20 subscription (around $200/month for the unlimited-fair-use tier) carries the overwhelming majority of calls in a healthy deployment. The Anthropic API kicks in as fallback — typical fallback bill is $40-80/month even on active accounts. Compared to pure-API economics, the savings make multi-agent retail trading practical for the first time.
What each Claude agent does
A representative subset of the 20+ Claude-driven agents:
| Agent | Model tier | Cycle frequency | Role |
|---|---|---|---|
| Regime Classifier | Sonnet | Every 1-5 min | Pick active stance based on charts |
| Strategist | Opus | On regime change or setup signal | Propose specific trades |
| StrategySwitcher | Sonnet | Hourly | Rotate active strategies for new regime |
| Risk Gate | Sonnet | Per proposed trade | Veto based on numerical state |
| FactChecker | Sonnet | Per proposed trade | Re-verify inputs at signing |
| DoubleChecker | Sonnet | Per proposed trade | Blind second opinion |
| Macro Officer | Opus | 4x daily + on Tier-1 events | Independent macro stance |
| CEO Agent | Opus | On stance trigger | Set fund-level posture |
| Journal Writer | Sonnet | Per decision (signed or vetoed) | Write postmortem-ready entry |
| Self-Optimizer | Opus | Nightly | Read day's journal, queue experiments |
Sonnet for routine reasoning, Opus for ambiguous/high-stakes decisions. The router picks the right tier per task.
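Tier selection per task can be as simple as a lookup with a conservative default. The sketch below mirrors the table above; the `TIER_BY_AGENT` mapping and the `pick_tier` function are hypothetical names, and defaulting unknown agents to Sonnet is an assumption rather than documented router behaviour.

```python
# Agent -> model tier, copied from the table above.
TIER_BY_AGENT = {
    "Regime Classifier": "sonnet",
    "Strategist": "opus",
    "StrategySwitcher": "sonnet",
    "Risk Gate": "sonnet",
    "FactChecker": "sonnet",
    "DoubleChecker": "sonnet",
    "Macro Officer": "opus",
    "CEO Agent": "opus",
    "Journal Writer": "sonnet",
    "Self-Optimizer": "opus",
}


def pick_tier(agent: str, default: str = "sonnet") -> str:
    """Return the model tier for an agent (assumed default: Sonnet)."""
    return TIER_BY_AGENT.get(agent, default)


print(pick_tier("Strategist"))  # opus
print(pick_tier("Risk Gate"))   # sonnet
```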
Circuit breakers: the production detail most get wrong
A circuit breaker tracks consecutive failures per route. After N consecutive failures the breaker opens for that route, and requests are short-circuited until a cooldown elapses. After the cooldown, a single trial request tests recovery (the half-open state). If it succeeds, the breaker closes and normal operation resumes. If it fails, the breaker reopens and the cooldown doubles.
Three details most implementations get wrong:
- Half-open allows exactly one trial. Some implementations let several requests through during half-open; this defeats the purpose. The correct design is a single canary call.
- Counter resets on success. Reset the failure counter to zero on success rather than decrementing it. Otherwise intermittent failures leave the breaker permanently one failure away from tripping.
- Cooldown grows exponentially. Each failed trial doubles the open-state duration (1 minute → 2 → 4 → 8, capped at 1 hour) until the route recovers or a human intervenes.
iQntX's router handles all three correctly because it follows the established circuit-breaker pattern from distributed systems, popularised by libraries such as Netflix's Hystrix. Without the right breaker behavior, an outage on one Claude route cascades into 32 agents hammering the broken endpoint simultaneously, turning a 10-minute degradation into a 6-hour outage.
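The three rules above translate into a small state machine. The following is a minimal sketch under stated assumptions, not iQntX's implementation: the `CircuitBreaker` class, its parameter names, the default threshold of 3, and the injectable `clock` (used here so the example runs without waiting) are all illustrative choices.

```python
import time
from typing import Callable


class CircuitBreaker:
    """Per-route breaker: closed -> open -> half-open, with exponential cooldown."""

    def __init__(self, threshold: int = 3, base_cooldown: float = 60.0,
                 max_cooldown: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic):
        self.threshold = threshold          # N consecutive failures to open
        self.base_cooldown = base_cooldown  # first open period, in seconds
        self.max_cooldown = max_cooldown    # cap on the open period
        self.clock = clock
        self.failures = 0                   # consecutive failure count
        self.cooldown = base_cooldown
        self.opened_at = None               # None means closed
        self.trial_in_flight = False        # True while the canary is out

    def allow(self) -> bool:
        """Return True if a request may be sent on this route."""
        if self.opened_at is None:
            return True                     # closed: normal operation
        if self.clock() - self.opened_at < self.cooldown:
            return False                    # open: short-circuit
        if self.trial_in_flight:
            return False                    # half-open: only one canary
        self.trial_in_flight = True
        return True

    def record_success(self) -> None:
        self.failures = 0                   # reset, do not decrement
        self.cooldown = self.base_cooldown
        self.opened_at = None
        self.trial_in_flight = False

    def record_failure(self) -> None:
        if self.trial_in_flight:            # canary failed: reopen, double cooldown
            self.cooldown = min(self.cooldown * 2, self.max_cooldown)
            self.opened_at = self.clock()
            self.trial_in_flight = False
            return
        self.failures += 1
        if self.failures >= self.threshold and self.opened_at is None:
            self.opened_at = self.clock()   # trip the breaker


now = [0.0]                                 # fake clock for the demo
breaker = CircuitBreaker(threshold=3, clock=lambda: now[0])
for _ in range(3):
    breaker.record_failure()
print(breaker.allow())    # False: breaker is open
now[0] = 61.0             # cooldown of 60 s has elapsed
print(breaker.allow())    # True: the single canary
print(breaker.allow())    # False: a second caller is refused
breaker.record_failure()  # canary fails
print(breaker.cooldown)   # 120.0
```

A router would call `allow()` before each request and report the outcome with `record_success()` or `record_failure()`, moving to the next route whenever `allow()` returns False.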
The cost ledger
Every LLM call writes one row to llm_cost_ledger:
| Column | Purpose |
|---|---|
| id | UUID primary key |
| agent | Which agent made the call |
| route | claude_cli / codex_cli / anthropic_api / openai_api |
| model | Specific model used |
| input_tokens | Prompt tokens |
| output_tokens | Completion tokens |
| cost_usd | Computed cost (0 for subscription routes) |
| latency_ms | Round-trip time |
| success | Boolean |
| created_at | Timestamp |
Query the ledger to see exactly where the month's spend went. For a healthy subscription-first deployment, typical daily figures per route are:
- claude_cli: 4,000+ calls/day, $0.00, avg 2-3 seconds.
- codex_cli: 1,200+ calls/day, $0.00, avg 2-3 seconds.
- anthropic_api: 80-200 calls/day, $1-3, avg 1-2 seconds.
- openai_api: 5-20 calls/day, $0.04-0.20, avg 1-2 seconds.
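A per-route breakdown like the one above can be produced with a single aggregate query. The sketch below uses Python's built-in `sqlite3` with an in-memory database. The column names come from the table above, but the column types, the choice of SQLite, and the sample rows are assumptions made for illustration.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE llm_cost_ledger (
        id TEXT PRIMARY KEY,
        agent TEXT NOT NULL,
        route TEXT NOT NULL,
        model TEXT NOT NULL,
        input_tokens INTEGER NOT NULL,
        output_tokens INTEGER NOT NULL,
        cost_usd REAL NOT NULL,
        latency_ms INTEGER NOT NULL,
        success INTEGER NOT NULL,          -- SQLite has no boolean type
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

# Illustrative rows only; the values are made up for the example.
sample = [
    ("Risk Gate", "claude_cli", "sonnet", 1200, 150, 0.0, 2400, 1),
    ("Strategist", "claude_cli", "opus", 3000, 600, 0.0, 3100, 1),
    ("Risk Gate", "anthropic_api", "sonnet", 1200, 150, 0.0059, 1500, 1),
]
conn.executemany(
    "INSERT INTO llm_cost_ledger (id, agent, route, model, input_tokens, "
    "output_tokens, cost_usd, latency_ms, success) VALUES (?,?,?,?,?,?,?,?,?)",
    [(str(uuid.uuid4()), *row) for row in sample],
)

# Per-route call count, total spend, and average latency.
summary = conn.execute("""
    SELECT route, COUNT(*), ROUND(SUM(cost_usd), 4), AVG(latency_ms)
    FROM llm_cost_ledger
    GROUP BY route
    ORDER BY route
""").fetchall()

for route, calls, cost, avg_ms in summary:
    print(f"{route}: {calls} calls, ${cost:.4f}, avg {avg_ms:.0f} ms")
```

Adding a `WHERE created_at >= ...` clause would restrict the same query to the current day or month.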
The cost ledger is also the basis for the monthly cap. Operators can configure a hard ceiling on API spend; the router refuses to use API routes once the ceiling is hit, falling back to extended CLI usage even at higher latency.
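The cap check itself is a small piece of logic. The following sketch assumes a running monthly total is available (in practice it would be summed from `cost_usd` in the ledger); the `SpendCap` class and the $80 ceiling are hypothetical.

```python
from dataclasses import dataclass

API_ROUTES = {"anthropic_api", "openai_api"}   # per-token routes
CLI_ROUTES = {"claude_cli", "codex_cli"}       # subscription routes


@dataclass
class SpendCap:
    monthly_cap_usd: float
    spent_usd: float = 0.0   # assumed to track API spend for the current month

    def record(self, route: str, cost_usd: float) -> None:
        if route in API_ROUTES:          # subscription calls cost nothing per call
            self.spent_usd += cost_usd

    def allowed_routes(self) -> set[str]:
        """Subscription routes are always allowed; API routes only under the cap."""
        if self.spent_usd >= self.monthly_cap_usd:
            return set(CLI_ROUTES)
        return CLI_ROUTES | API_ROUTES


cap = SpendCap(monthly_cap_usd=80.0)
cap.record("anthropic_api", 79.5)
print(sorted(cap.allowed_routes()))  # all four routes
cap.record("anthropic_api", 0.75)
print(sorted(cap.allowed_routes()))  # ['claude_cli', 'codex_cli']
```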
How to evaluate any "Claude trading bot" product
Three questions to ask any vendor selling a Claude-based trading product:
1. Multi-agent or single-call?
If the answer is "a single Claude call decides everything," you're looking at a weekend project sold as a product. If the answer is "20+ specialised Claude calls with independent veto authority," you're looking at production architecture.
2. Subscription-first or API-only?
If the answer is "we use the Anthropic API," ask about the monthly bill. If the bill is reasonable, the call volume is probably low — meaning the agent count is low. If the call volume is high enough to support a multi-agent system, the bill on pure-API is dramatically higher than subscription-first economics would produce.
3. Circuit-broken or not?
If the answer is "we just retry on failures," you're looking at an architecture that has not yet experienced its first major Anthropic outage. The product will be down for hours when it does. The right answer involves a router, circuit breakers, and multiple fallback routes.
A vendor who can answer all three cleanly is operating at production-grade architecture. A vendor who dodges any of the three is at the weekend-project level.
Who this is for
- Engineering buyers who are evaluating "Claude trading bot" products and want to understand the production architecture.
- Prop firm operators with the time to evaluate the engineering rigor of a product before deploying.
- Family offices running discretionary accounts who want institutional-grade reasoning at retail cost.
- Quant teams looking for a multi-agent template they can study (not necessarily replicate — the prompts and skills directory are not public).
See the production system
iQntX is the production Claude API trading bot described above — 32 specialised agents, 4-route LLM router, circuit-broken, cost-ledgered, with the subscription-first economics that make it retail-affordable.
Join the waitlist for early-access pricing. Cohorts open in waves so the system never gets ahead of itself.
Frequently asked questions
Don't see your question? Email hello@iqntx.com — we'll add it.
Does iQntX run on the Claude API directly?
It can — the Anthropic API is one of four routes in the LLM Router. The preferred route for most Claude calls is the Claude CLI (running against a Max x20 subscription) because the per-call cost is effectively zero. The Anthropic API serves as the fallback when the CLI is unavailable or fair-use is exceeded. Either way, every Claude call is logged in a cost ledger.
Why subscription-first instead of pure API?
Pure-API economics make multi-agent retail trading impractical. A 32-agent system running 7,000 Claude calls per day at Sonnet API rates costs roughly $1,500-2,400/month. The same workload routed through the Claude Max x20 subscription is essentially free within fair-use. Subscription-first routing is what makes institutional-grade architecture retail-affordable.
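The order of magnitude is easy to check with back-of-envelope arithmetic. The token counts and per-million-token prices below are assumptions chosen for illustration (the source does not state them); substitute current list prices and your own prompt sizes.

```python
def monthly_api_cost(calls_per_day: int, input_tokens: int, output_tokens: int,
                     usd_per_m_input: float, usd_per_m_output: float,
                     days: int = 30) -> float:
    """Estimated monthly spend in USD for a uniform per-call token profile."""
    per_call = (input_tokens * usd_per_m_input
                + output_tokens * usd_per_m_output) / 1_000_000
    return calls_per_day * days * per_call


# Assumptions: 1,500 input and 300 output tokens per call, priced at
# $3 and $15 per million tokens. None of these figures come from the source.
estimate = monthly_api_cost(7_000, 1_500, 300, 3.0, 15.0)
print(f"${estimate:,.0f}/month")  # $1,890/month
```

Under these assumptions the estimate falls inside the $1,500-2,400 range quoted above; longer prompts or Opus-tier calls would push it higher.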
Which Claude models does iQntX use?
Different agents call different model tiers. Routine reasoning (regime classification, fact-checking, journal writing) runs on Sonnet. Heavier reasoning (trade decisions in ambiguous regimes, macro stance setting, postmortems) escalates to Opus. The router picks the right model per task type, with the cost ledger recording every choice.
What about Codex CLI and OpenAI?
Both are part of the same 4-route stack. Codex CLI handles code-shaped tasks (parsing JSON, validating regex, generating MQL5 fragments) — it's faster and cheaper than Claude for these. OpenAI API serves as the fallback for Codex. The result: Claude does the reasoning, Codex does the engineering, and per-token API only fires when both subscription routes are down.
What's the latency of a Claude trading decision?
A complete decision cycle (regime → setup → risk gate → fact-check → double-check → execute) typically runs 10-30 seconds end-to-end. Each Claude call takes 1-5 seconds, and the multi-agent coordination layer adds incremental time. This is not HFT; it is suited to swing and day trading. For setups that close in hours or days, the latency is in the noise.
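The sequential cycle, with any stage able to veto, can be sketched as follows. The stage functions, their fixed latencies, and the 1% risk rule are hypothetical stand-ins; real stages would be LLM calls, and the final execute step is omitted.

```python
from typing import Callable, NamedTuple


class StageResult(NamedTuple):
    approved: bool
    latency_s: float
    note: str = ""


Stage = Callable[[dict], StageResult]


def run_cycle(trade: dict, stages: list[tuple[str, Stage]]) -> tuple[bool, float, str]:
    """Run stages in order and stop at the first veto.

    Returns (approved, total_latency_seconds, last_stage_reached).
    """
    total = 0.0
    for name, stage in stages:
        result = stage(trade)
        total += result.latency_s
        if not result.approved:
            return False, total, name     # vetoed: later stages never run
    return True, total, stages[-1][0]


# Stand-in stages with fixed latencies (seconds).
stages = [
    ("regime", lambda t: StageResult(True, 2.0)),
    ("setup", lambda t: StageResult(True, 4.0)),
    ("risk_gate", lambda t: StageResult(t["risk_pct"] <= 1.0, 1.5, "max 1% risk")),
    ("fact_check", lambda t: StageResult(True, 2.5)),
    ("double_check", lambda t: StageResult(True, 3.0)),
]

print(run_cycle({"risk_pct": 0.5}, stages))  # (True, 13.0, 'double_check')
print(run_cycle({"risk_pct": 2.5}, stages))  # (False, 7.5, 'risk_gate')
```

Because the stages run one after another, end-to-end latency is the sum of the per-call latencies, which is why five or six calls of 1-5 seconds each land in the range quoted above.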
Are circuit breakers really necessary?
Yes. Without them, an outage on one route (API rate limit, subprocess hang, network blip) cascades into 32 agents simultaneously hammering the broken endpoint. The circuit breaker stops calls after N consecutive failures and moves traffic to the next route. After a cooldown, a single trial request tests recovery. This is the difference between a 10-minute degraded experience and a 6-hour outage.
Can I see the prompts iQntX uses?
Not publicly. The exact prompts (and the skills directory that contains them) are core IP. What we will say is that the prompt structure follows Anthropic's Skills convention — persona, inputs, output schema — and that each skill is a markdown file under version control. The structure is public; the contents are not.
Get on the waitlist.
Be first in line when private access opens. We onboard in waves so the system never gets in front of itself.