Z.AI provides the GLM model family — strong, cost-effective models with first-class tool calling and prompt caching. Rumus has a dedicated provider so you can use both the general endpoint and the Coding-specialized one.

## Documentation Index

Fetch the complete documentation index at: https://www.rumus.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
## Before you start
You need:

- A Z.AI account at z.ai (or the international portal).
- An API key from your Z.AI dashboard.
- Sufficient credit on the Z.AI account.
## Add Z.AI in Rumus
### Paste your API key

Paste the key into the API Key field. It is stored encrypted in your local vault.
### (Optional) Custom base URL

Two endpoints are common:

- General: https://api.z.ai/api/paas/v4 (default — leave blank).
- Coding: https://api.z.ai/api/coding/paas/v4 — better routing for code-heavy workloads.
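If you script against the provider yourself, the choice between the two endpoints can be captured in a small helper. This is a sketch using the URLs above; the helper name is ours, not part of Rumus or Z.AI:

```python
# The two documented Z.AI base URLs (see the list above).
GENERAL_BASE_URL = "https://api.z.ai/api/paas/v4"
CODING_BASE_URL = "https://api.z.ai/api/coding/paas/v4"


def zai_base_url(coding: bool = False) -> str:
    """Return the Coding endpoint for code-heavy workloads, else the default."""
    return CODING_BASE_URL if coding else GENERAL_BASE_URL
```

Leaving the field blank in Rumus is equivalent to the general URL, so only code-heavy setups need to change anything.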
### Pick a model
Choose from the list (GLM-4.7, GLM-4.6, GLM-4.5, GLM-4.5 Air, GLM-4.5 Flash, etc.) or toggle Enter custom ID to type a model ID manually.
## Capabilities
On the Capabilities tab:
- Tool Calling — supported on GLM-4.5 and newer, including GLM-4.7.
- Vision — supported on multimodal GLM versions.
- Prompt Cache — supported on GLM-4.7; cache reads are ~50% of input price.
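To see what the ~50% cache-read discount means in practice, here is a sketch of the arithmetic. The price per million tokens is a placeholder, not Z.AI's published rate:

```python
def input_cost(total_tokens: int, cached_tokens: int, price_per_mtok: float) -> float:
    """Input cost when cache reads bill at ~50% of the input price."""
    fresh = total_tokens - cached_tokens
    return (fresh * price_per_mtok + cached_tokens * price_per_mtok * 0.5) / 1_000_000


# A 100K-token prompt with 80K served from cache, at a placeholder $1/MTok:
# (20_000 * 1.0 + 80_000 * 0.5) / 1e6 = $0.06 instead of $0.10 uncached.
```

The bigger the stable prefix (system prompt, rules, file context), the closer the effective input price gets to half.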
## Recommended models
| Model | Good for |
|---|---|
| GLM-4.7 | Default — long context (128K), strong tools, prompt caching |
| GLM-4.5 Air | Lightweight, faster, lower cost — great for autocomplete |
| GLM-4.5 Flash | High-volume, latency-sensitive |
## Tips
- The default temperature/top_p for Z.AI (0.7 / 0.9) differs from OpenAI’s defaults. Rumus respects whatever you set per thread.
- Coding endpoint vs general: if you’re mostly using the agent for code tasks, configure the Coding endpoint — routing and rate limits are tuned for it.
- Prompt cache kicks in automatically on supported models. Long, stable system prompts (skills, rules, long file context) benefit the most.
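Assuming the endpoint speaks an OpenAI-style chat completions API (Z.AI exposes a compatible interface), pinning the sampling defaults per request looks like this; the model ID and message are just examples:

```python
import json

# Z.AI's defaults are temperature=0.7, top_p=0.9 — set them explicitly
# per request if you want deterministic, portable behavior across providers.
payload = {
    "model": "glm-4.7",
    "messages": [{"role": "user", "content": "Summarize this diff."}],
    "temperature": 0.7,
    "top_p": 0.9,
}
body = json.dumps(payload)  # request body for POST .../chat/completions
```

In Rumus itself you don't build this payload by hand — the per-thread settings fill in the same fields.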
## Troubleshooting
### 401 Unauthorized

The API key is missing, mistyped, or has been revoked. Re-copy it from your Z.AI dashboard and save it in the provider settings again.
### Slow first responses
Streaming first-token latency varies by region. If you’re far from the endpoint, switch to a closer region (Z.AI publishes regional URLs).
### Model not in the list

Toggle Enter custom ID and paste the exact model ID (e.g. `glm-4.7`).

Hit a snag we didn't cover? Ask in the Rumus community.
## Next steps
- Other providers: Anthropic, OpenAI, Google, DeepSeek, Kimi, Ollama, OpenAI-compatible.
- Built-in models: use Rumus's bundled GLM-4.7 without managing a key yourself.