How to Build an AI Agent Backend with Open-Source Models
What Makes an Agent Different from a Chatbot
A chatbot responds to messages. An agent takes actions.
The key difference is the tool loop: the model calls functions, gets results, and decides what to do next. This lets you build systems that browse the web, query databases, execute code, and make decisions autonomously.
The Architecture
```
User Request
      ↓
LLM (with tools defined)
      ↓
┌─ If tool_call → Execute tool → Feed result back → Loop
└─ If stop → Return final response
```
Every agent needs:
- An LLM that supports function calling
- A set of tools (functions the LLM can call)
- A loop that handles tool calls until the LLM is done
Choosing the Right Model
| Model | Strength | Cost/1K reqs | Best For |
|---|---|---|---|
| kimi-k2.5 | Best tool calling | ~$1.09 | Complex multi-step agents |
| deepseek-v3 | Great value | ~$0.86 | Simple agents, budget builds |
| qwen-3.6-plus | Best overall | ~$0.75 | General agents |
| qwen-3-32b | Fastest | ~$0.36 | Speed-critical agents |
We recommend kimi-k2.5 for agents that need reliable function calling.
Building the Agent (Python)
```python
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://kymaapi.com/v1",
    api_key="ky-your-key"
)

# Define tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search for information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Evaluate a math expression",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {"type": "string"}
                },
                "required": ["expression"]
            }
        }
    }
]

# Tool implementations
def execute_tool(name: str, args: dict) -> str:
    if name == "search":
        return f"Results for '{args['query']}': [your search results here]"
    if name == "calculate":
        return str(eval(args["expression"]))  # use a safe math parser in production
    return "Unknown tool"

# The agent loop
def run_agent(task: str, max_steps: int = 5) -> str:
    messages = [
        {"role": "system", "content": "You are a helpful agent. Use tools when needed. Think step by step."},
        {"role": "user", "content": task}
    ]
    for step in range(max_steps):
        response = client.chat.completions.create(
            model="kimi-k2.5",
            messages=messages,
            tools=tools,
        )
        choice = response.choices[0]
        messages.append(choice.message)

        # Agent is done
        if choice.finish_reason == "stop":
            return choice.message.content

        # Process tool calls
        if choice.message.tool_calls:
            for call in choice.message.tool_calls:
                args = json.loads(call.function.arguments)
                result = execute_tool(call.function.name, args)
                messages.append({
                    "role": "tool",
                    "tool_call_id": call.id,
                    "content": result
                })

    # Loop exhausted without a final answer (the last message is a
    # tool-result dict, not an assistant reply, so don't return it)
    return "Max steps reached"

# Run it
answer = run_agent("What is the square root of 144 plus 25?")
print(answer)
```
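The `eval()` call in `execute_tool` is only a placeholder. One way to replace it, sketched here with Python's standard `ast` module, is to walk the parsed expression and allow nothing but numeric literals and basic arithmetic operators (the function name `safe_eval` is our own, not part of any library):

```python
import ast
import operator

# Whitelist of permitted arithmetic operators.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expression: str) -> float:
    """Evaluate a plain arithmetic expression; reject everything else."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. are all rejected here.
        raise ValueError(f"Disallowed expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))
```

Anything outside the whitelist, such as function calls or attribute access, raises `ValueError` instead of executing.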
Production Tips
Error handling. Wrap tool execution in try/except. If a tool fails, return the error message as the tool result so the LLM can adapt.
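A minimal sketch of that pattern, where a toy `execute_tool` stands in for the real dispatcher shown earlier:

```python
def execute_tool(name: str, args: dict) -> str:
    # Stand-in for the real tool dispatcher; here "calculate" always fails.
    if name == "calculate":
        raise ValueError("bad expression")
    return "ok"

def execute_tool_safely(name: str, args: dict) -> str:
    # Surface failures as tool output so the LLM can retry or switch tools,
    # instead of letting the exception kill the agent loop.
    try:
        return execute_tool(name, args)
    except Exception as e:
        return f"Error executing {name}: {type(e).__name__}: {e}"
```

In the agent loop, call `execute_tool_safely` wherever `execute_tool` was called, and feed the string back as the tool message either way.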
Max steps. Always set a limit. Without it, a confused model can loop forever. 5-10 steps covers most use cases.
Conversation history. Store messages in a database for multi-turn agent sessions. Use gemini-2.5-flash (1M context) if history gets long.
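One minimal way to persist sessions, sketched with stdlib `sqlite3` and one JSON blob per session (the table name and schema are illustrative, not prescribed; serialize any SDK message objects to plain dicts, e.g. via `model_dump()`, before storing):

```python
import json
import sqlite3

# In-memory DB for the sketch; use a file path or a real DB in production.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS sessions (session_id TEXT PRIMARY KEY, messages TEXT)"
)

def save_session(session_id: str, messages: list) -> None:
    # Overwrite the session's full message list as a JSON blob.
    conn.execute(
        "INSERT OR REPLACE INTO sessions VALUES (?, ?)",
        (session_id, json.dumps(messages)),
    )
    conn.commit()

def load_session(session_id: str) -> list:
    row = conn.execute(
        "SELECT messages FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

On each turn, load the session, append the new user message, run the agent loop, then save the updated list.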
Cost control. Each step is a new API call. A 5-step agent run with kimi-k2.5 costs about $0.03 — roughly 33 agent runs per dollar.
Streaming. For real-time UIs, stream the final response. Use stream=True on the last call when finish_reason == "stop".
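One way to wire that up, assuming the OpenAI-compatible streaming API used above (`stream_final_response` is our own helper name): once the loop decides no more tools are needed, re-issue the completion with `stream=True` and relay the text deltas.

```python
def stream_final_response(client, messages, model="kimi-k2.5"):
    """Yield text chunks of the final answer as they arrive."""
    stream = client.chat.completions.create(
        model=model, messages=messages, stream=True
    )
    for chunk in stream:
        # Each streamed chunk carries an incremental delta; content
        # can be None on role/finish chunks, so skip those.
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta
```

Forward each yielded chunk to the UI (e.g. over server-sent events or a websocket) instead of waiting for the full response.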
Cost Comparison
| Provider | Model | Cost for 1K agent runs (5 steps each) |
|---|---|---|
| Kyma | kimi-k2.5 | ~$30 |
| Kyma | deepseek-v3 | ~$25 |
| OpenAI | gpt-4o | ~$150 |
| Anthropic | claude-3.5-sonnet | ~$120 |
Open-source models via Kyma are 4-5x cheaper for agent workloads.
Next Steps
- Agent Backend Guide — full code examples
- Tool Calling Reference — detailed function calling docs
- Coding Agent — build a code editing agent
- Get your API key — $0.50 free credits to start