OSS inference · week-one

Open weights.
Served fast.

One OpenAI-compatible endpoint for every OSS model worth shipping — Llama, Qwen, DeepSeek, Mixtral, Gemma — integrated within a week of release. No closed clouds. No vendor lock-in.

  • Open weights, zero lock-in

    Every OSS model worth shipping.

    Llama, Qwen, DeepSeek, Mixtral, Gemma — served on fast inference we operate. Swap models in one line. No closed-cloud tax.

  • Week-one on release

    New weights drop, they're live.

    When a new OSS model is released, it lands in your catalog within seven days — benchmarked, priced, routable. No waiting on the hyperscalers.

  • Pricing you can defend

    Flat $199, or per-token.

    Jungle Club for heavy daily use. Metered per-million tokens for everything else. Margins shown on the page, not buried in a sales call.

Integration

One line, not a migration.

If your app already speaks OpenAI, it already speaks Junglething. Change the base URL, pick an OSS model, ship.

quickstart.ts
TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.JUNGLETHING_API_KEY,
  baseURL: "https://api.junglething.com/v1",
});

await client.chat.completions.create({
  model: "llama-4-scout",
  messages: [{ role: "user", content: "Ship it." }],
});
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.JUNGLETHING_API_KEY,
  baseURL: "https://api.junglething.com/v1",
});

await client.chat.completions.create({
  model: "llama-4-scout",
  messages: [{ role: "user", content: "Ship it." }],
});

Who it's for

Built for the teams shipping OSS this quarter.

  • Indie AI builders

    Side projects that need to feel fast.

    Shipping an app, agent, or Chrome extension on the side? Plug into one endpoint, pay only for what you use, and stop rewriting your code every time a new model drops.

  • Pre-seed & seed founders

    Runway-friendly inference from day one.

    OSS economics cut your cost-per-request against closed clouds. Start on per-usage, switch to Jungle Club the day your usage justifies it — no sales call, no commitment.

  • Agent & dev-tool teams

    Route pinned, fast, steady, or auto.

    Coding agents, RAG stacks, and dev tools pick the model per task. Pin production, experiment in dev, and let auto-route hold the floor on latency when a model hiccups.

  • OSS maintainers

    Your model, served the day you publish.

    Release an open-weight model? We benchmark it, price it, and list it inside a week — with attribution back to your repo and a direct line to our catalog team.

  • Teams leaving closed clouds

    Portability is the feature.

    Keep the OpenAI SDK you already wrote. Swap vendors at the URL level. Export logs and usage whenever you want — the gateway is the thin layer, not the lock-in.

Catalog

The OSS roster, curated.

Every model listed is production-ready, benchmarked, and priced on arrival.

See all →
  • Meta

    llama-4-scout

    Fast
  • Meta

    llama-4-maverick

    Steady
  • Alibaba

    qwen3-235b-instruct

    Steady
  • Alibaba

    qwen3-coder-32b

    Fast
  • DeepSeek

    deepseek-v3.1

    Steady
  • DeepSeek

    deepseek-r1-distill

    Fast
  • Mistral

    mixtral-8x22b

    Fast
  • Google

    gemma-3-27b

    Fast

Pricing

Club or per-usage. You pick.

Heavy daily use? Flat monthly beats metered. Bursty traffic? Per-token scales down to zero.

Free

$0

/mo

Evaluate every OSS model in the catalog before you commit a cent.

  • 1 API key
  • Full catalog, shared throughput
  • Request logs + token ledger
  • OpenAI-compatible endpoint

Jungle Club

$199

CAD /mo

Heavy-daily-use envelope on OSS, priced at Claude Max parity. Flat, predictable, no per-token math.

  • Unlimited-feel monthly envelope
  • Priority throughput pool
  • Every OSS release, week-one
  • Fair-use caps, disclosed not hidden

Per-usage

Pay

as you go

Metered per-million tokens, transparent gateway fee on top. Scales down to zero, up to production.

  • Per-model token pricing
  • Daily spend ledger + alerts
  • Budget caps per workspace
  • No monthly minimum, ever

FAQ

Quick answers.

Why OSS only?
Because portability is the whole point. Open weights mean no vendor deprecation drama, no rate-limit lotteries, and prices that track silicon instead of closed-cloud margins. If a model goes away, the weights don't.
How fast do new models land?
Week-one on every major OSS release. Our catalog is curated — we don't list everything, we list what's worth shipping — with benchmarks and pricing clarified on arrival.
What is Jungle Club?
$199 CAD/month for a heavy-daily-use envelope across the OSS roster. Positioned at Claude Max parity, but OSS-only. Fair-use soft caps are written on the pricing page, not hidden in a contract.
Is it really OpenAI-compatible?
Yes. Chat-completions shape, streaming, tool calls. Point your existing OpenAI SDK at our base URL and change the model name. That's the migration.
Who's behind it?
A small team operating out of Canada. Privacy and terms written for Canadian software, billing in CAD, and support that actually answers.

First token in under a minute.