Bring your own Moonshot AI key to use Kimi in Rumus: K2.5 with vision and tools, K2 Thinking for harder reasoning, and the Moonshot V1 family.
## Before you start

You need:

- A Moonshot AI account.
- An API key from the dashboard:
  - International: platform.moonshot.ai, base URL `https://api.moonshot.ai/v1`
  - China: platform.moonshot.cn, base URL `https://api.moonshot.cn/v1`
- Sufficient credit on your Moonshot account.
Keys from the international portal and the China portal are not interchangeable. Pick the one that matches the base URL you use.
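The portal-to-base-URL pairing can be sketched as a small helper. This is illustrative only: the dict and function names are made up for this example, while the URLs are the ones listed above.

```python
# Map each Moonshot portal to the base URL Rumus needs.
# Names here are illustrative; the URLs match the list above.
MOONSHOT_BASE_URLS = {
    "international": "https://api.moonshot.ai/v1",  # keys from platform.moonshot.ai
    "china": "https://api.moonshot.cn/v1",          # keys from platform.moonshot.cn
}

def base_url_for(portal: str) -> str:
    """Return the base URL matching the portal the key was issued on."""
    try:
        return MOONSHOT_BASE_URLS[portal]
    except KeyError:
        raise ValueError(f"unknown portal: {portal!r}") from None
```

Using a key from one portal against the other portal's base URL fails authentication, which is why the pairing matters.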
## Add Kimi in Rumus

### Paste your API key

Paste the key into API Key. It’s stored encrypted in your local vault.

### Base URL

Default is `https://api.moonshot.ai/v1` (international). If you’re using a China-portal key, change it to `https://api.moonshot.cn/v1`.

## Capabilities
Capability flags are pre-set for the built-in models. For a custom model, mirror what Moonshot’s docs say it supports.
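To see what the key and base URL combine into on the wire, here is a sketch that builds (but does not send) a chat-completions request, assuming Moonshot's OpenAI-compatible request shape. The function name is made up for this example.

```python
import json
import urllib.request

def build_chat_request(api_key: str, base_url: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request against base_url."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # the key you pasted into Rumus
            "Content-Type": "application/json",
        },
    )

# To actually send it (requires a valid key and credit):
#   with urllib.request.urlopen(build_chat_request(...)) as resp:
#       print(json.load(resp))
```

Swapping `base_url` between the international and China endpoints is the only change needed to target the other portal.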
## Recommended models
| Model | ID | Good for |
|---|---|---|
| Kimi K2.5 | kimi-k2.5 | Default daily driver — vision, tools, and thinking, 256K context |
| Kimi K2 Thinking | kimi-k2-thinking | Longer thinking budget, no vision, 256K context |
| Moonshot V1 128K / 32K / 8K | moonshot-v1-128k, moonshot-v1-32k, moonshot-v1-8k | Earlier-generation general-purpose models in different context sizes |
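For the V1 family, a simple rule is to pick the smallest context window that fits the prompt. The sketch below assumes 8K/32K/128K mean 8,192/32,768/131,072 tokens; the helper name is made up for this example.

```python
# Moonshot V1 variants ordered by context size, smallest first.
# Assumes 8K/32K/128K = 8192/32768/131072 tokens (illustrative).
MOONSHOT_V1 = [
    ("moonshot-v1-8k", 8_192),
    ("moonshot-v1-32k", 32_768),
    ("moonshot-v1-128k", 131_072),
]

def smallest_v1_for(prompt_tokens: int) -> str:
    """Return the smallest V1 model ID whose context fits prompt_tokens."""
    for model_id, context in MOONSHOT_V1:
        if prompt_tokens <= context:
            return model_id
    raise ValueError(f"{prompt_tokens} tokens exceeds every V1 context window")
```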
## Tips
- Pick K2.5 first. It’s the most capable everyday model — strong at tool use, supports vision, and handles long context.
- K2 Thinking drops vision but spends more on reasoning. Reach for it on tricky multi-step problems where extra deliberation helps.
- Moonshot V1 is older. Useful if a workflow specifically depends on it; otherwise prefer K2.5.
## Troubleshooting

### 401 Unauthorized

Check that your key matches the base URL: keys from the international portal only work with `https://api.moonshot.ai/v1`, and China-portal keys only with `https://api.moonshot.cn/v1`.

### Model not in the list

Toggle Enter custom ID and paste the exact model ID (e.g. `kimi-k2.5`).

### High latency on first response
Streaming first-token latency depends on region. Pick the portal that’s closer to you.
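One way to compare portals is to time how long the first streamed chunk takes to arrive. This harness works on any iterator (such as a streaming response); the function name is made up for this sketch.

```python
import time
from typing import Any, Iterable, Tuple

def time_to_first_token(stream: Iterable[Any]) -> Tuple[Any, float]:
    """Return the first item from an iterator and the seconds spent waiting for it."""
    start = time.perf_counter()
    first = next(iter(stream))  # blocks until the first chunk arrives
    return first, time.perf_counter() - start
```

Wrap the streaming response from each base URL with this and compare the timings to decide which portal is closer for you.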
Hit a snag we didn’t cover? Ask in the Rumus community.
## Next steps

### Other providers
Anthropic, OpenAI, Google, Z.AI, DeepSeek, Ollama, OpenAI-compatible.
### Built-in models
Use Rumus’s bundled Kimi without managing a key yourself.