Open Source LLMs, One API — Start Free with Kyma

April 12, 2026 · 3 min read

The Problem: LLMs Are Fragmented

You want to use open-source LLMs in production. But your options are painful.

You could hit each provider individually — Groq for speed, Google for quality, DeepSeek for reasoning. But that means juggling API keys, understanding each provider's rate limits, and switching your code when one is down.

You could use OpenRouter or Together.ai. But then you're paying their markup on top of provider costs, and you're locked into whatever models they decide to support.

Or you bite the bullet and use closed models from OpenAI or Anthropic. Fine if you have the budget. But if you're building a startup, demo, or side project, every token counts.

What Kyma Does

Kyma API gives you a single endpoint for 21 open-source models. Sign up, get $0.50 in free credits (around 1,000 requests), and start using models like:

  • DeepSeek V3 — GPT-5 class reasoning, $0.81 per 1M input tokens
  • Qwen 3.6 Plus — Most popular model on Kyma, best all-around quality
  • Llama 3.3 70B — Open-weight champion, available instantly
  • Gemini 2.5 Flash — 1M context window for processing entire books at once

Every request is OpenAI-compatible. Drop-in replacement for existing code. No vendor lock-in.

How It Works

  1. Sign up at kymaapi.com — takes 30 seconds
  2. Get $0.50 in free credits
  3. Use any model through the same API endpoint
  4. Pay per token after credits run out

No hidden fees. No monthly minimums. No commitments.

Three Quick Examples

**Python**

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://kymaapi.com/v1",
    api_key="your-key-here"
)

response = client.chat.completions.create(
    model="deepseek-v3",
    messages=[{"role": "user", "content": "Explain quantum computing in 2 sentences"}]
)
print(response.choices[0].message.content)
```

**JavaScript**

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://kymaapi.com/v1",
  apiKey: "your-key-here",
});

const response = await client.chat.completions.create({
  model: "qwen-3.6-plus",
  messages: [{ role: "user", content: "Hello world" }],
});
console.log(response.choices[0].message.content);
```

**cURL**

```bash
curl https://kymaapi.com/v1/chat/completions \
  -H "Authorization: Bearer your-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```

Why Developers Choose Kyma

|                 | Kyma                | OpenRouter     | Direct API     |
|-----------------|---------------------|----------------|----------------|
| Cost            | $0.50 free to start | $0.001+/token  | Free (limited) |
| Models          | 21 curated models   | 200+ (curated) | 1 per provider |
| Auto-failover   | ✅                  |                |                |
| Setup time      | 30 seconds          | 5 minutes      | Per-provider   |
| One endpoint    | ✅                  |                |                |
| Prompt caching  | ✅ (90% discount)   | Sometimes      |                |

Tip: New to Kyma? Start with qwen-3.6-plus (most popular) or deepseek-v3 (best value for quality).

What's Included

  • 21 open-source models across quality tiers — from fast inference to GPT-5 class reasoning
  • Multi-provider redundancy — if one provider is down, your request automatically retries on another
  • OpenAI SDK compatibility — works with Python openai, JavaScript OpenAI library, and any tool that speaks OpenAI format
  • Prompt caching — 90% discount on cached tokens (compatible with Google Gemini, OpenRouter)
  • Agent & tool support — call functions, use structured output, agentic workflows
  • 1M context models — Gemini 2.5 Flash for processing books, codebases, datasets
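Because Kyma speaks the OpenAI chat-completions format, tool calling presumably uses the same `tools` field as the OpenAI API. A hedged sketch of what a tool-calling request body would look like — the `get_weather` tool and its schema are illustrative, not from the Kyma docs:

```python
import json

# Illustrative request body for OpenAI-style tool calling.
# Only the endpoint and model name come from the post; the
# get_weather tool is a made-up example.
request_body = {
    "model": "deepseek-v3",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
}

# POST this to https://kymaapi.com/v1/chat/completions with your API key;
# in the OpenAI format, a tool-using reply carries the chosen function
# name and its JSON arguments under choices[0].message.tool_calls.
print(json.dumps(request_body, indent=2))
```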

Pricing

  • Free credits: $0.50 on signup (about 1000 typical requests)
  • Pay as you go: once credits run out, you pay transparent per-token rates that vary by model
  • No markup over provider costs: Competitive pricing, transparent billing
  • No hidden fees: Only pay for what you use

See full pricing at kymaapi.com/models.

Who's Using Kyma

  • Code agents: OpenClaw, Roo Code, Cline, Claude Code
  • Startups: Shipping fast without big cloud budgets
  • Individual developers: Building projects, experimenting with LLMs
  • Enterprises: Evaluating open models before committing to closed APIs

Get Started

Head to kymaapi.com — sign up, get your API key, and start using open-source LLMs instantly.

Read the quickstart guide for a step-by-step walkthrough (takes 5 minutes).

Questions? Check out model recommendations or the API reference.