Documentation Index

Fetch the complete documentation index at: https://www.rumus.ai/docs/llms.txt

Use this file to discover all available pages before exploring further.

Rumus is provider-agnostic. You can use the built-in models that come with your subscription, bring your own keys for any of the major providers, run local models through Ollama, or point at any OpenAI-compatible endpoint. All configured models appear in the same model picker.

Pick the right path

Just use the built-in models

Easiest. Sign in, pick a Pro plan, done. Pricing, usage, and billing all live in this guide.

Bring your own key

Pay your provider directly. Useful if you have credits, need a model Rumus doesn’t bundle, or your org requires it.

Run a local model

Keep everything on your machine via Ollama. No network, no per-token cost.

Connect a custom endpoint

For OpenRouter, vLLM, LiteLLM, LocalAI, an internal gateway — anything that speaks the OpenAI API.

Bring your own key

Rumus has first-class support for these providers. Each one has its own setup guide:

Anthropic

Claude Opus, Sonnet, and Haiku.

OpenAI

GPT-5, GPT-4.1, o-series reasoning models.

Google AI

Gemini Pro and Flash.

Z.AI

GLM family, including GLM-4.7 and the Coding variant.

DeepSeek

DeepSeek V3.2 and the V3.2 Reasoner.

Kimi (Moonshot AI)

Kimi K2.5, K2 Thinking, and Moonshot V1.

Ollama

Local models — Llama, Mistral, Qwen, and anything else you’ve pulled.
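Before registering an Ollama model in Rumus, it helps to confirm Ollama is running and see which models you have pulled. Ollama exposes a local REST API (by default on port 11434); a minimal sketch, assuming the default host and port:

```python
import json
import urllib.request
import urllib.error

def list_ollama_models(host="http://localhost:11434"):
    """Return the names of locally pulled Ollama models via its /api/tags
    endpoint, or None if Ollama isn't reachable on that host."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=2) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = list_ollama_models()
if models is None:
    print("Ollama is not reachable -- is it running?")
else:
    print(models)
```

If this returns a non-empty list, those are the model names you'd enter when adding the Ollama model in Rumus.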

OpenAI-compatible

OpenRouter, vLLM, LiteLLM, LocalAI, internal gateways.
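"Speaks the OpenAI API" concretely means the endpoint accepts a `POST` to `/v1/chat/completions` with a bearer token and a JSON body. A sketch of the request shape, with placeholder values for the base URL, key, and model name (use whatever your gateway expects):

```python
def build_chat_request(base_url, api_key, model, user_message):
    """Assemble the URL, headers, and JSON body for an OpenAI-compatible
    chat completions call -- the shape any custom endpoint must accept."""
    return {
        "url": f"{base_url.rstrip('/')}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        },
    }

# Placeholder values -- substitute your own base URL, key, and model name.
req = build_chat_request(
    "https://openrouter.ai/api/v1",
    "YOUR_API_KEY",
    "meta-llama/llama-3-70b-instruct",
    "Hello!",
)
print(req["url"])  # https://openrouter.ai/api/v1/chat/completions
```

If a tool like this returns a well-formed completion from your gateway, the same base URL and model name will work when you add it as a custom model.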

Where models are configured

All model setup happens in Settings → AI → Models. The page is split into two sections:
  • Built-in Models — read-only list of what your subscription includes. Greyed out if you’re not signed in or not on a paid plan.
  • Custom Models — anything you’ve added with your own key. Click Add Model to create one.
Custom models you add appear in the chat model picker right next to built-in ones, with a clear label so you know which is which.

Capabilities you flag per model

When adding a custom model, you tell Rumus what the model can do. The agent uses these flags to decide whether to send images, request tool calls, or use prompt caching:
  • Tool Calling: model supports function/tool calls (required for agentic mode)
  • Vision: model accepts images as input
  • Prompt Cache: model supports prompt caching (reduces cost on repeated prompts)
  • Privacy Protection: provider commits to not storing or training on your data (shown as a green shield in the UI)
If you’re unsure, leave them off; you can edit a model anytime.
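A hypothetical sketch of how flags like these gate what the agent sends. The field names mirror the flags above, but the actual Rumus schema and logic are internal and may differ:

```python
from dataclasses import dataclass

@dataclass
class ModelCapabilities:
    # Hypothetical mirror of the per-model flags described above.
    tool_calling: bool = False
    vision: bool = False
    prompt_cache: bool = False
    privacy_protection: bool = False

def plan_request(caps, wants_tools, has_images):
    """Decide what a request may include, given the model's declared flags."""
    if wants_tools and not caps.tool_calling:
        raise ValueError("agentic mode needs a model flagged for Tool Calling")
    return {
        "tools": wants_tools and caps.tool_calling,
        "images": has_images and caps.vision,  # images are withheld otherwise
        "cache_prompt": caps.prompt_cache,
    }

caps = ModelCapabilities(tool_calling=True, prompt_cache=True)
print(plan_request(caps, wants_tools=True, has_images=True))
# "images" comes back False because Vision wasn't flagged
```

This is also why leaving a flag off is safe: the worst case is an unused feature, whereas flagging a capability the model lacks can produce provider errors.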

Privacy

  • Custom models — requests go directly from your machine to the provider you configured. Rumus never sees your prompts, completions, or API keys. Keys are stored encrypted in your local vault.
  • Built-in models — requests are routed through Rumus’s API to the underlying provider. We log token counts for billing but not the message contents. See Vault & encryption for how local secrets are stored.

Mixing providers

You can register multiple providers at once and switch between them per-thread. Common patterns:
  • Built-in for the heavy lifting, Ollama for offline drafting.
  • Anthropic Sonnet for fast iteration, Opus for tricky bugs.
  • OpenAI for vision, OpenRouter for the long tail.
The model picker remembers the last model used per thread, so you can keep parallel conversations on different models without thinking about it.
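The observable behavior of that per-thread memory can be pictured as a simple mapping with a fallback default. This is an illustrative sketch only, not Rumus's actual implementation, and the model names are placeholders:

```python
class ModelPicker:
    """Toy model of "remembers the last model used per thread"."""

    def __init__(self, default_model):
        self.default = default_model
        self._last = {}  # thread id -> last model selected in that thread

    def model_for(self, thread_id):
        # A thread with no history falls back to the default model.
        return self._last.get(thread_id, self.default)

    def select(self, thread_id, model):
        self._last[thread_id] = model

picker = ModelPicker("built-in/default")
picker.select("bugfix-thread", "ollama/llama3")
print(picker.model_for("bugfix-thread"))  # ollama/llama3
print(picker.model_for("brand-new-thread"))  # built-in/default
```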

Next steps

Built-in models, pricing & billing

Everything about the included plan: model list, pricing, usage, top-ups.

AI assistant overview

What the agent can actually do once a model is connected.