A long agentic conversation can pile up a lot of context — tool calls, intermediate results, half-finished plans. Rumus has a few quiet behaviors that keep things readable and within the model's context window.
## Auto-naming
When you start a new conversation, the AI generates a short 3–7 word title from the first exchange. This is what shows up in the chat history list in the AI sidebar.

| Setting | Where | Default |
|---|---|---|
| Auto-name conversations | Settings → AI → General | On |
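As a mental model, title generation amounts to one model call plus cleanup. The sketch below is purely illustrative — the prompt wording and the `clean_title` helper are assumptions, not documented Rumus internals:

```python
def title_prompt(first_user_message: str) -> str:
    """Build a prompt asking the model for a 3-7 word title.

    Hypothetical prompt wording; Rumus's actual prompt is not documented.
    """
    return (
        "Summarize this opening message as a conversation title "
        "of 3-7 words. Reply with the title only.\n\n"
        f"{first_user_message}"
    )

def clean_title(raw: str, max_words: int = 7) -> str:
    """Normalize the model's reply: strip quotes/whitespace, cap word count."""
    words = raw.strip().strip('"\'').split()
    return " ".join(words[:max_words])
```

This is also why a descriptive first message matters (see Tips below): the title is derived entirely from that first exchange.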
## Automatic summarization
Long conversations would eventually exceed the model's context window. Rumus heads that off by summarizing older parts of the conversation automatically: the agent compresses the older portion into a 150–300 word summary that preserves decisions, context, actions, and pending tasks, and keeps the most recent few messages verbatim. You don't trigger this manually; it happens whenever the conversation crosses an internal threshold (currently around 10+ new messages and rising). When it does, you'll see a Summary block in the conversation marking where the compaction happened. The summary preserves:

- Decisions made — "we chose option A because of X."
- Context discovered — “the host was running Ubuntu 22.04, not 24.04 as assumed.”
- Actions taken — “ran the migration script; it succeeded.”
- Pending tasks — “still need to deploy to staging.”
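The compaction step described above can be sketched in a few lines. The `keep_recent` count and message shape are assumptions, and `summarize` stands in for the model call that writes the 150–300 word summary:

```python
def compact(messages, keep_recent=5, summarize=None):
    """Compress older messages into a single summary message, keeping the
    most recent `keep_recent` messages verbatim.

    `summarize(older)` stands in for the model call that produces the
    150-300 word summary of decisions, context, actions, and pending tasks.
    """
    if len(messages) <= keep_recent:
        return messages  # nothing old enough to compact yet
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = summarize(older)
    # The summary replaces the older portion; recent messages stay verbatim.
    return [{"role": "system", "content": f"Summary: {summary}"}] + recent
```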
## Task lists
For multi-step jobs, the agent can render a Todo list inside the conversation — a checkbox list of tasks it's tracking. This is separate from plan mode; it's a lighter-weight progress tracker that shows up alongside ordinary tool calls.

| Setting | Where | Default |
|---|---|---|
| Todo list | Settings → AI → Conversation → Behavior | On |
## Picking a model per conversation
The model picker remembers your last choice per conversation. Switching models on one thread doesn't affect any other thread. This makes it cheap to maintain parallel conversations on different models — say, a long debugging thread on a strong reasoning model and a high-volume routine ops thread on a faster, cheaper one.

## Token info per message
Click the small info icon on any AI message to see:

- Input tokens — what was sent to the model (prompt + history + tool results).
- Output tokens — what the model generated.
- Cached input tokens — input served from prompt cache, billed cheaper.
- Reasoning tokens — internal “thinking” tokens, when the model exposes them.
- Cost — dollar value charged for built-in models.
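The cost figure is simple arithmetic over those token counts, with cached input billed at a cheaper rate. A sketch with made-up per-million-token prices — real rates depend on the model:

```python
def message_cost(input_tokens, cached_tokens, output_tokens,
                 price_in, price_cached, price_out):
    """Dollar cost of one AI message. Prices are per million tokens.

    Cached input tokens are a subset of input tokens and are billed at the
    cheaper cached rate. The split shown here is an assumption about how
    the numbers combine, not a documented Rumus formula.
    """
    uncached = input_tokens - cached_tokens
    return (uncached * price_in
            + cached_tokens * price_cached
            + output_tokens * price_out) / 1_000_000
```

For example, 10,000 input tokens of which 4,000 were cached, plus 500 output tokens, at hypothetical rates of $3.00 in / $0.30 cached / $15.00 out per million tokens, comes to about $0.027.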
## Other related toggles
A few smaller behaviors you can tweak:

- Auto-collapse tool calls — keeps the conversation tidy by collapsing tool blocks once they finish. On by default.
- Max mode — a per-conversation toggle that uses the model’s full context length rather than a default safety margin. Useful for long-context work; off by default to avoid the cost spikes from filling a 1M-token window.
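The effect of Max mode can be sketched as a budget calculation. The 20% safety margin below is an assumed placeholder, not a documented Rumus value:

```python
def context_budget(model_window: int, max_mode: bool,
                   safety_margin: float = 0.2) -> int:
    """Token budget for a conversation.

    With Max mode off, a slice of the window is held back as a safety
    margin; with it on, the model's full context length is available.
    The 20% margin is a hypothetical figure for illustration.
    """
    if max_mode:
        return model_window
    return int(model_window * (1 - safety_margin))
```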
## Tips
- Start a fresh conversation when the topic changes. Summarization preserves a lot, but completely new topics are easier to follow as a clean thread.
- Star important conversations. The history view’s favorites filter makes them easy to find later — especially handy for long-running debugging where you keep coming back.
- Auto-naming + good first messages. A descriptive first message produces a descriptive title. “Why is nginx returning 502 for /api?” gives a title that means something later.
## Next steps
- Plan mode — multi-step plans rendered as a checklist with status icons.
- Rules & skills — define reusable procedures the agent invokes by name.