Fast, reliable inference for open-weight LLMs.
Keln serves popular open models like GPT OSS 120B, GLM 5.1, Kimi K2.5 on the latest NVIDIA Blackwell hardware.
Available through OpenRouter.
Open models,
production-ready.
Open-weight models, benchmarked weekly.
Available through OpenRouter.
No separate signup needed.
Frequently
asked
Everything you need to know about Keln. Still have questions? Reach out anytime.
FAQ page ↗Keln is an LLM inference provider. We serve large open-weight models — GLM 5.1, Kimi K2.5, MiniMax M2.7, and others — on NVIDIA Blackwell hardware, and make them available through OpenRouter at prices below the official API. Our optimized infrastructure lets us serve them for less.
NVIDIA B200 and B300 GPUs — the Blackwell generation. These are the fastest GPUs available for inference today, which is what lets us host the largest open-weight models.
Per-token, usually below the official API prices for each model. Our optimized infrastructure lets us offer lower rates than the model's own provider.
Call the OpenRouter API and pick a Keln-served model. Any OpenAI-compatible client works — no Keln-specific SDK or separate signup required.
Yes. All of these tools work with OpenRouter, so pointing them at a Keln-served model works out of the box. We tune our setup specifically for the long-context, cache-heavy usage these tools produce.
No. Keln doesn't store your prompts or completions beyond what's needed to serve each request, and we never train on user data. The only thing we keep is basic operational metadata (latency, token counts, status) for billing and reliability.