Rumus is provider-agnostic. You can use the built-in models that come with your subscription, bring your own keys for any of the major providers, run local models through Ollama, or point at any OpenAI-compatible endpoint. All configured models appear in the same model picker.
Pick the right path
Just use the built-in models
Easiest. Sign in, pick a Pro plan, done. Pricing, usage, and billing all live in this guide.
Bring your own key
Pay your provider directly. Useful if you have credits, need a model Rumus doesn’t bundle, or your org requires it.
Run a local model
Keep everything on your machine via Ollama. No network, no per-token cost.
Connect a custom endpoint
For OpenRouter, vLLM, LiteLLM, LocalAI, an internal gateway — anything that speaks the OpenAI API.
Bring your own key
Rumus has first-class support for these providers. Each one has its own setup guide:
Anthropic
Claude Opus, Sonnet, and Haiku.
OpenAI
GPT-5, GPT-4.1, o-series reasoning models.
Google AI
Gemini Pro and Flash.
Z.AI
GLM family, including GLM-4.7 and the Coding variant.
DeepSeek
DeepSeek V3.2 and the V3.2 Reasoner.
Kimi (Moonshot AI)
Kimi K2.5, K2 Thinking, and Moonshot V1.
Ollama
Local models — Llama, Mistral, Qwen, and anything else you’ve pulled.
OpenAI-compatible
OpenRouter, vLLM, LiteLLM, LocalAI, internal gateways.
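All of these speak the same wire protocol, so you can sanity-check an endpoint before registering it. Below is a minimal sketch using the official `openai` Python package; the base URL, API key, and model name are illustrative placeholders, not Rumus defaults (Ollama exposes an OpenAI-compatible API at `http://localhost:11434/v1` out of the box, and vLLM typically serves one at `http://localhost:8000/v1`).

```python
# Sanity-check an OpenAI-compatible endpoint before registering it.
# Assumptions: the `openai` package is installed (pip install openai) and a
# server is running locally. Swap base_url, api_key, and model for your setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # your endpoint's /v1 root
    api_key="ollama",  # Ollama ignores the key; real gateways will check it
)

# List the models the endpoint advertises; these are the names you'd register.
for model in client.models.list():
    print(model.id)

# One round trip to confirm chat completions work end to end.
response = client.chat.completions.create(
    model="llama3.1",  # example model name; use one from the list above
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
)
print(response.choices[0].message.content)
```

If both calls succeed, the endpoint speaks enough of the protocol for chat; capabilities such as tool calling still need their own check (see the sketch under the capability table below).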
Where models are configured
All model setup happens in Settings → AI → Models. The page is split into two sections:
- Built-in Models — read-only list of what your subscription includes. Greyed out if you’re not signed in or not on a paid plan.
- Custom Models — anything you’ve added with your own key. Click Add Model to create one.
Capabilities you flag per model
When adding a custom model, you tell Rumus what the model can do. The agent uses these flags to decide whether to send images, request tool calls, or use prompt caching (a quick way to verify the Tool Calling flag follows the table):
| Capability | Meaning |
|---|---|
| Tool Calling | Model supports function/tool calls — required for agentic mode |
| Vision | Model accepts images as input |
| Prompt Cache | Model supports prompt caching (reduces cost on repeated prompts) |
| Privacy Protection | Provider commits to not storing or training on your data — shown as a green shield in the UI |
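Of these, Tool Calling is the one that gates agentic mode, so it is worth seeing what the flag asserts in practice. The sketch below, written against the OpenAI-style chat completions API that custom endpoints emulate, sends a single function definition and checks whether the model responds with a structured tool call instead of plain text. The endpoint, model, and `get_weather` tool are hypothetical placeholders, not part of Rumus.

```python
# Probe whether a model actually supports tool calling, which is what the
# Tool Calling capability flag asserts. Endpoint and model are placeholders.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, defined only for this probe
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="llama3.1",  # placeholder; use the model you are flagging
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # A tool-capable model returns structured calls instead of prose.
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    # No tool_calls: the model answered in plain text, so don't set the flag.
    print("No tool call returned:", message.content)
```

Vision and prompt caching can be probed the same way: send an image part, or repeat a long identical prefix and check whether your provider reports cached tokens in the response's usage block (where the provider exposes that detail).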
Privacy
- Custom models — requests go directly from your machine to the provider you configured. Rumus never sees your prompts, completions, or API keys. Keys are stored encrypted in your local vault.
- Built-in models — requests are routed through Rumus’s API to the underlying provider. We log token counts for billing but not the message contents. See Vault & encryption for how local secrets are stored.
Mixing providers
You can register multiple providers at once and switch between them per-thread. Common patterns:
- Built-in for the heavy lifting, Ollama for offline drafting.
- Anthropic Sonnet for fast iteration, Opus for tricky bugs.
- OpenAI for vision, OpenRouter for the long tail.
Next steps
Built-in models, pricing & billing
Everything about the included plan: model list, pricing, usage, top-ups.
AI assistant overview
What the agent can actually do once a model is connected.