
OSS inference · week-one
Open weights.
Served fast.
One OpenAI-compatible endpoint for every OSS model worth shipping — Llama, Qwen, DeepSeek, Mixtral, Gemma — integrated within a week of release. No closed clouds. No vendor lock-in.
Open weights, zero lock-in
Every OSS model worth shipping.
Llama, Qwen, DeepSeek, Mixtral, Gemma — served on fast inference we operate. Swap models in one line. No closed-cloud tax.
Week-one on release
New weights drop, they're live.
When a new OSS model is released, it lands in your catalog within seven days — benchmarked, priced, routable. No waiting on the hyperscalers.
Pricing you can defend
Flat $199, or per-token.
Jungle Club for heavy daily use. Metered per-million tokens for everything else. Margins shown on the page, not buried in a sales call.
Integration
One line, not a migration.
If your app already speaks OpenAI, it already speaks Junglething. Change the base URL, pick an OSS model, ship.
import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.JUNGLETHING_API_KEY,
baseURL: "https://api.junglething.com/v1",
});
await client.chat.completions.create({
model: "llama-4-scout",
messages: [{ role: "user", content: "Ship it." }],
});import OpenAI from "openai";
const client = new OpenAI({
apiKey: process.env.JUNGLETHING_API_KEY,
baseURL: "https://api.junglething.com/v1",
});
await client.chat.completions.create({
model: "llama-4-scout",
messages: [{ role: "user", content: "Ship it." }],
});Who it's for
Built for the teams shipping OSS this quarter.
Indie AI builders
Side projects that need to feel fast.
Shipping an app, agent, or Chrome extension on the side? Plug into one endpoint, pay only for what you use, and stop rewriting your code every time a new model drops.
Pre-seed & seed founders
Runway-friendly inference from day one.
OSS economics cut your cost-per-request against closed clouds. Start on per-usage, switch to Jungle Club the day your usage justifies it — no sales call, no commitment.
Agent & dev-tool teams
Route pinned, fast, steady, or auto.
Coding agents, RAG stacks, and dev tools pick the model per task. Pin production, experiment in dev, and let auto-route hold the floor on latency when a model hiccups.
OSS maintainers
Your model, served the day you publish.
Release an open-weight model? We benchmark it, price it, and list it inside a week — with attribution back to your repo and a direct line to our catalog team.
Teams leaving closed clouds
Portability is the feature.
Keep the OpenAI SDK you already wrote. Swap vendors at the URL level. Export logs and usage whenever you want — the gateway is the thin layer, not the lock-in.
Catalog
The OSS roster, curated.
Every model listed is production-ready, benchmarked, and priced on arrival.
Meta
llama-4-scout
Generalist · open
FastMeta
llama-4-maverick
Reasoning · open
SteadyAlibaba
qwen3-235b-instruct
Flagship · open
SteadyAlibaba
qwen3-coder-32b
Code · open
FastDeepSeek
deepseek-v3.1
Reasoning · open
SteadyDeepSeek
deepseek-r1-distill
Reasoning · open
FastMistral
mixtral-8x22b
MoE · open
FastGoogle
gemma-3-27b
Compact · open
Fast
Pricing
Club or per-usage. You pick.
Heavy daily use? Flat monthly beats metered. Bursty traffic? Per-token scales down to zero.
Free
$0
/mo
Evaluate every OSS model in the catalog before you commit a cent.
- 1 API key
- Full catalog, shared throughput
- Request logs + token ledger
- OpenAI-compatible endpoint
Jungle Club
$199
CAD /mo
Heavy-daily-use envelope on OSS, priced at Claude Max parity. Flat, predictable, no per-token math.
- Unlimited-feel monthly envelope
- Priority throughput pool
- Every OSS release, week-one
- Fair-use caps, disclosed not hidden
Per-usage
Pay
as you go
Metered per-million tokens, transparent gateway fee on top. Scales down to zero, up to production.
- Per-model token pricing
- Daily spend ledger + alerts
- Budget caps per workspace
- No monthly minimum, ever
FAQ
Quick answers.
- Why OSS only?
- Because portability is the whole point. Open weights mean no vendor deprecation drama, no rate-limit lotteries, and prices that track silicon instead of closed-cloud margins. If a model goes away, the weights don't.
- How fast do new models land?
- Week-one on every major OSS release. Our catalog is curated — we don't list everything, we list what's worth shipping — with benchmarks and pricing clarified on arrival.
- What is Jungle Club?
- $199 CAD/month for a heavy-daily-use envelope across the OSS roster. Positioned at Claude Max parity, but OSS-only. Fair-use soft caps are written on the pricing page, not hidden in a contract.
- Is it really OpenAI-compatible?
- Yes. Chat-completions shape, streaming, tool calls. Point your existing OpenAI SDK at our base URL and change the model name. That's the migration.
- Who's behind it?
- A small team operating out of Canada. Privacy and terms written for Canadian software, billing in CAD, and support that actually answers.