Fast, reliable inference for open-weight LLMs.

Keln serves popular open models such as GPT OSS 120B, GLM 5.1, and Kimi K2.5 on the latest NVIDIA Blackwell hardware.
Contact us for early access.

Open models, production-ready.

Open-weight models, benchmarked weekly.

See all models

GPT OSS 120B

OpenAI

Available

TTFT 0.27s

TPS 34 t/s

$0.039 / $0.19 per 1M

Kimi K2.5

Moonshot

Coming soon

TTFT 0.52s

TPS 118 t/s

$0.50 / $2.80 per 1M

MiniMax M2.7

MiniMax

Coming soon

TTFT 0.34s

TPS 182 t/s

$0.30 / $1.20 per 1M

Qwen 3.5 397B

Alibaba

Coming soon

TTFT 0.61s

TPS 98 t/s

$0.20 / $0.60 per 1M

Step 3.5 Flash

StepFun

Coming soon

TTFT 0.21s

TPS 210 t/s

$0.15 / $0.45 per 1M
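
As a rough illustration of how the card figures combine, the sketch below estimates the cost and response time of a single request. It assumes the two prices on each card are USD per 1M input tokens and per 1M output tokens, and that TPS is a steady output rate after the first token; neither is stated on the page. The function and field names are made up for this example.

```javascript
// Figures copied from the GPT OSS 120B card above.
// Assumption: prices are USD per 1M input tokens / per 1M output tokens.
const gptOss120b = {
  inputPricePerMillion: 0.039,
  outputPricePerMillion: 0.19,
  ttftSeconds: 0.27, // time to first token
  tokensPerSecond: 34, // output speed after the first token
};

// Estimated cost in USD for one request.
function estimateCostUsd(model, inputTokens, outputTokens) {
  const inputCost = inputTokens * model.inputPricePerMillion;
  const outputCost = outputTokens * model.outputPricePerMillion;
  return (inputCost + outputCost) / 1e6;
}

// Estimated wall-clock time in seconds until the full response has arrived.
function estimateLatencySeconds(model, outputTokens) {
  return model.ttftSeconds + outputTokens / model.tokensPerSecond;
}
```

Under these assumptions, a request with 10,000 input tokens and 1,000 output tokens comes to about $0.00058 and takes roughly 29.7 seconds to finish streaming.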

Working with a small number of early partners.

One app. Every kind of AI task.

Keln

Claude 3.5 Sonnet · Auto

$0.024 · 3.1s

Compare solar vs wind vs geothermal for home energy in the Pacific Northwest

Solar — High summer efficiency, limited winter output. ~$12–18k installed.

Wind — Requires permits, best in coastal/rural areas.

Geothermal — Highest upfront ($20–30k), lowest operating costs. Ideal for PNW.

Research & analysis

For complex questions, comparisons, and deeper research. Keln chooses the best model for long, thoughtful answers without limits or interruptions.

Keln

GPT-4o · Full Memory ON

$0.031 · 2.8s

Based on the 47-page report I uploaded last week — what were the key risks in section 3?

From your report (Section 3, p.18–24): Three risks — currency exposure in LATAM, supplier concentration (top 3 = 61%), and pending EU regulatory review expected Q2.

Deep work with full memory

For complex projects, long documents, and detailed conversations. Keln can keep the full context in memory, so you do not need to repeat yourself or lose important details.

Keln

DALL-E 3 · Auto

$0.040 · 4.2s

Generate a minimal geometric logo for a fintech startup called "Arco"

Generated image

Image & video generation

Generate images, edit photos, remove backgrounds, and create video. Keln picks the best visual model for each task.

Keln

Gemini 2.0 Flash · Auto

$0.001 · 0.8s

Summarize this article in 3 bullet points

• New battery tech achieves 40% higher energy density
• Manufacturing cost reduced ~30% with earth-abundant materials
• Commercial scaling expected by 2027, targeting EVs first

Fast everyday tasks

Quick writing, summaries, brainstorming, and simple questions. Keln uses faster, lower-cost models when you do not need the most powerful one.

Keln API

Llama 3.3 70B · OpenClaw

// One API key for all models
const keln = new KelnClient({
  apiKey: "keln_sk_...",
});

await keln.chat("Analyze this CSV", {
  model: "auto", // or pick one
});

Keln

Response · 2,341 tokens

$0.003 · 1.2s

Analysis complete. Found 3 anomalies in column "revenue_Q3":

Row 47: $0 value (likely null)
Row 83: 10× above mean (outlier)
Row 201: Negative value detected

Learn more ↗

Also for developers

Use Keln in OpenClaw and other tools with one API key. Save up to 10× with lower-cost models that match the quality of the top ones. Data is private and securely stored in the U.S.

Frequently asked

Everything you need to know about Keln. Still have questions? Reach out anytime.

FAQ page ↗

Keln is an LLM inference provider. We serve large open-weight models — starting with OpenAI's GPT OSS 120B, with GLM 5.1, Kimi K2.5, MiniMax M2.7 and others coming soon — on NVIDIA Blackwell hardware, at prices below the official API. Our optimized infrastructure lets us serve them for less.

We run on NVIDIA B200 and B300 GPUs, the Blackwell generation. These are the fastest GPUs available for inference today, which is what lets us host the largest open-weight models.

Pricing is per token and usually below the official API price for each model. Our optimized infrastructure lets us offer lower rates than the model's own provider.

Any OpenAI-compatible client works — no Keln-specific SDK required. Reach out to kris@keln.ai for early access details.
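
For readers who want to see what "OpenAI-compatible" means in practice, here is a minimal sketch of a chat request built by hand with the standard `fetch` API. The `/chat/completions` path, the `Authorization: Bearer` header, and the `choices[0].message.content` response field follow the OpenAI chat completions convention. The base URL and model ID are placeholders invented for this example, not documented Keln values; the real ones would come with early access.

```javascript
// Hypothetical placeholders: not documented Keln endpoints or model identifiers.
const EXAMPLE_BASE_URL = "https://api.keln.example/v1";
const EXAMPLE_MODEL_ID = "gpt-oss-120b";

// Build the URL and fetch() options for an OpenAI-style chat completion call.
function buildChatRequest({ baseUrl, apiKey, model, messages }) {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Send the request with the built-in fetch (Node 18+ or a browser)
// and return the text of the first choice.
async function chat(options) {
  const { url, init } = buildChatRequest(options);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}

// Example usage (makes a real network call, so it is left commented out).
// KELN_API_KEY is an assumed environment variable name.
// chat({
//   baseUrl: EXAMPLE_BASE_URL,
//   apiKey: process.env.KELN_API_KEY,
//   model: EXAMPLE_MODEL_ID,
//   messages: [{ role: "user", content: "Analyze this CSV" }],
// }).then(console.log);
```

Because the request shape is the standard one, an existing OpenAI client library should work the same way once its base URL and API key are swapped.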

Yes. Point any of these tools at a Keln-served model and it works out of the box. We tune our setup specifically for the long-context, cache-heavy usage these tools produce.

No. Keln doesn't store your prompts or completions beyond what's needed to serve each request, and we never train on user data. The only thing we keep is basic operational metadata (latency, token counts, status) for billing and reliability.

Public benchmarks

Modern hardware

Multi-region

Built for agents

Lower prices

Security, built in.

U.S.-based infrastructure

End-to-end encryption

Private mode

Permanent deletion

Never used for training

Ready to try Keln?
Contact us for early access.
