OSS inference · week-one

Open weights.
Served fast.

One OpenAI-compatible endpoint for every OSS model worth shipping — Llama, Qwen, DeepSeek, Mixtral, Gemma — integrated within a week of release. No closed clouds. No vendor lock-in.

Read quickstartBrowse models

Open weights, zero lock-in
Every OSS model worth shipping.
Llama, Qwen, DeepSeek, Mixtral, Gemma — served on fast inference we operate. Swap models in one line. No closed-cloud tax.
Week-one on release
New weights drop, they're live.
When a new OSS model is released, it lands in your catalog within seven days — benchmarked, priced, routable. No waiting on the hyperscalers.
Pricing you can defend
Flat $199, or per-token.
Jungle Club for heavy daily use. Metered per-million tokens for everything else. Margins shown on the page, not buried in a sales call.

Integration

One line, not a migration.

If your app already speaks OpenAI, it already speaks Junglething. Change the base URL, pick an OSS model, ship.

quickstart.ts
TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.JUNGLETHING_API_KEY,
  baseURL: "https://api.junglething.com/v1",
});

await client.chat.completions.create({
  model: "llama-4-scout",
  messages: [{ role: "user", content: "Ship it." }],
});
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.JUNGLETHING_API_KEY,
  baseURL: "https://api.junglething.com/v1",
});

await client.chat.completions.create({
  model: "llama-4-scout",
  messages: [{ role: "user", content: "Ship it." }],
});

Who it's for

Built for the teams shipping OSS this quarter.

Indie AI builders
Side projects that need to feel fast.
Shipping an app, agent, or Chrome extension on the side? Plug into one endpoint, pay only for what you use, and stop rewriting your code every time a new model drops.
Pre-seed & seed founders
Runway-friendly inference from day one.
OSS economics cut your cost-per-request against closed clouds. Start on per-usage, switch to Jungle Club the day your usage justifies it — no sales call, no commitment.
Agent & dev-tool teams
Route pinned, fast, steady, or auto.
Coding agents, RAG stacks, and dev tools pick the model per task. Pin production, experiment in dev, and let auto-route hold the floor on latency when a model hiccups.
OSS maintainers
Your model, served the day you publish.
Release an open-weight model? We benchmark it, price it, and list it inside a week — with attribution back to your repo and a direct line to our catalog team.
Teams leaving closed clouds
Portability is the feature.
Keep the OpenAI SDK you already wrote. Swap vendors at the URL level. Export logs and usage whenever you want — the gateway is the thin layer, not the lock-in.

Catalog

The OSS roster, curated.

Every model listed is production-ready, benchmarked, and priced on arrival.

See all →

Meta
llama-4-scout
Generalist · open
Fast
Meta
llama-4-maverick
Reasoning · open
Steady
Alibaba
qwen3-235b-instruct
Flagship · open
Steady
Alibaba
qwen3-coder-32b
Code · open
Fast
DeepSeek
deepseek-v3.1
Reasoning · open
Steady
DeepSeek
deepseek-r1-distill
Reasoning · open
Fast
Mistral
mixtral-8x22b
MoE · open
Fast
Google
gemma-3-27b
Compact · open
Fast

Pricing

Club or per-usage. You pick.

Heavy daily use? Flat monthly beats metered. Bursty traffic? Per-token scales down to zero.

Free

/mo

Evaluate every OSS model in the catalog before you commit a cent.

1 API key
Full catalog, shared throughput
Request logs + token ledger
OpenAI-compatible endpoint

Jungle Club

$199

CAD /mo

Heavy-daily-use envelope on OSS, priced at Claude Max parity. Flat, predictable, no per-token math.

Unlimited-feel monthly envelope
Priority throughput pool
Every OSS release, week-one
Fair-use caps, disclosed not hidden

Per-usage

Pay

as you go

Metered per-million tokens, transparent gateway fee on top. Scales down to zero, up to production.

Per-model token pricing
Daily spend ledger + alerts
Budget caps per workspace
No monthly minimum, ever

FAQ

Quick answers.

Why OSS only?: Because portability is the whole point. Open weights mean no vendor deprecation drama, no rate-limit lotteries, and prices that track silicon instead of closed-cloud margins. If a model goes away, the weights don't.
How fast do new models land?: Week-one on every major OSS release. Our catalog is curated — we don't list everything, we list what's worth shipping — with benchmarks and pricing clarified on arrival.
What is Jungle Club?: $199 CAD/month for a heavy-daily-use envelope across the OSS roster. Positioned at Claude Max parity, but OSS-only. Fair-use soft caps are written on the pricing page, not hidden in a contract.
Is it really OpenAI-compatible?: Yes. Chat-completions shape, streaming, tool calls. Point your existing OpenAI SDK at our base URL and change the model name. That's the migration.
Who's behind it?: A small team operating out of Canada. Privacy and terms written for Canadian software, billing in CAD, and support that actually answers.

First token in under a minute.

Start building Read the docs

Open weights.Served fast.

Every OSS model worth shipping.

New weights drop, they're live.

Flat $199, or per-token.

One line, not a migration.

Built for the teams shipping OSS this quarter.

Side projects that need to feel fast.

Runway-friendly inference from day one.

Route pinned, fast, steady, or auto.

Your model, served the day you publish.

Portability is the feature.

The OSS roster, curated.

Club or per-usage. You pick.

Quick answers.

First token in under a minute.

Open weights.
Served fast.