GLM 4.7 Flash

Zhipu AI · glm-4.7-flash

Ultra cheap. 203K context. Fast.

Best for: cheap long-context and bulk workloads

At a glance

Context window: 203K tokens
Input: $0.08/1M ($0.01/1M cached)
Output: $0.54/1M
Speed: fast
Quality tier: efficient

Per 1M tokens. All-in pricing, no hidden fees.
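To make the per-1M pricing concrete, here is a minimal sketch of a cost estimate for a single request. The prices are the ones listed above; the `estimate_cost` helper and the token counts are illustrative assumptions, not part of any official tooling.

```shell
# Prices in USD per 1M tokens, copied from the figures listed above.
INPUT_PRICE=0.08
CACHED_PRICE=0.01
OUTPUT_PRICE=0.54

# estimate_cost UNCACHED_INPUT_TOKENS CACHED_INPUT_TOKENS OUTPUT_TOKENS
# Hypothetical helper: prints the estimated cost of one request in USD.
estimate_cost() {
  awk -v u="$1" -v c="$2" -v o="$3" \
      -v p_in="$INPUT_PRICE" -v p_cached="$CACHED_PRICE" -v p_out="$OUTPUT_PRICE" \
      'BEGIN { printf "%.6f\n", (u * p_in + c * p_cached + o * p_out) / 1000000 }'
}

# 150K fresh input tokens, 2K output tokens.
estimate_cost 150000 0 2000        # prints 0.013080

# Same request, but 100K of the input is served from the prompt cache.
estimate_cost 50000 100000 2000    # prints 0.006080
```

In this example, caching two thirds of the prompt roughly halves the cost of the request, because input tokens dominate the bill for long-context calls.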

Capabilities

Tool calling

Connect to external tools and APIs

Reasoning

Extended thinking for complex problems

Prompt caching

Cache repeated prefixes for roughly 90% off the input price

Streaming

Real-time token-by-token output

Structured outputs

JSON mode and function calling
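As an illustration of tool calling, the sketch below sends a request that declares one tool. It assumes the endpoint accepts an OpenAI-style `tools` array, which this page does not confirm; `get_weather` and its parameters are hypothetical and exist only for the example.

```shell
# Assumption: the API accepts an OpenAI-style "tools" field.
# get_weather is a hypothetical tool defined only for this example.
tool_payload() {
  cat <<'JSON'
{
  "model": "glm-4.7-flash",
  "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Look up current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }]
}
JSON
}

# Send the request only when an API key is configured.
if [ -n "${KYMA_KEY:-}" ]; then
  tool_payload | curl https://kymaapi.com/v1/chat/completions \
    -H "Authorization: Bearer $KYMA_KEY" \
    -H "Content-Type: application/json" \
    -d @-
fi
```

If the model decides to use the tool, the response would be expected to carry the tool name and arguments rather than plain text; your code then runs the tool and sends the result back in a follow-up message.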

Quick start

curl https://kymaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KYMA_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.7-flash",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
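A streaming variant of the same request can be sketched as follows. It assumes the endpoint honors an OpenAI-style `"stream": true` flag and returns chunks as server-sent events; neither detail is stated on this page, so treat both as assumptions.

```shell
# Assumption: "stream": true enables token-by-token output, OpenAI-style.
stream_payload() {
  cat <<'JSON'
{
  "model": "glm-4.7-flash",
  "stream": true,
  "messages": [{"role": "user", "content": "Hello"}]
}
JSON
}

# Send the request only when an API key is configured.
if [ -n "${KYMA_KEY:-}" ]; then
  # -N turns off curl's output buffering so chunks print as they arrive.
  stream_payload | curl -N https://kymaapi.com/v1/chat/completions \
    -H "Authorization: Bearer $KYMA_KEY" \
    -H "Content-Type: application/json" \
    -d @-
fi
```

Reading the body from stdin with `-d @-` keeps the JSON in one place, which makes it easier to edit than a long inline string.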

Details

Creator: Zhipu AI
Model ID: glm-4.7-flash
Quality tier: efficient
Cost tier: cheap
Input modality: Text
Output modality: Text
Prompt caching: Supported

Try GLM 4.7 Flash now

$0.50 free credits on signup. No credit card required.