docs: canonicalize docs paths and align zh navigation (#11428)

* docs(navigation): canonicalize paths and align zh nav

* chore(docs): remove stray .DS_Store

* docs(scripts): add non-mint docs link audit

* docs(nav): fix zh source paths and preserve legacy redirects (#11428) (thanks @sebslight)

* chore(docs): satisfy lint for docs link audit script (#11428) (thanks @sebslight)
Seb Slight
2026-02-07 15:40:35 -05:00
committed by GitHub
parent cde29fef71
commit 929a3725d3
148 changed files with 607 additions and 687 deletions

View File

@@ -29,7 +29,7 @@ OpenClaw features that can generate provider usage or paid API calls.
- `openclaw status --usage` and `openclaw channels list` show provider **usage windows**
(quota snapshots, not per-message costs).
-See [Token use & costs](/token-use) for details and examples.
+See [Token use & costs](/reference/token-use) for details and examples.
## How keys are discovered
@@ -48,7 +48,7 @@ OpenClaw can pick up credentials from:
Every reply or tool call uses the **current model provider** (OpenAI, Anthropic, etc). This is the
primary source of usage and cost.
-See [Models](/providers/models) for pricing config and [Token use & costs](/token-use) for display.
+See [Models](/providers/models) for pricing config and [Token use & costs](/reference/token-use) for display.
### 2) Media understanding (audio/image/video)

View File

@@ -154,7 +154,7 @@ If you're tuning limits:
- The context window comes from the model catalog (and can be overridden via config).
- `contextTokens` in the store is a runtime estimate/reporting value; don't treat it as a strict guarantee.
-For more, see [/token-use](/token-use).
+For more, see [/token-use](/reference/token-use).
---

View File

@@ -7,7 +7,7 @@ title: "Tests"
# Tests
-- Full testing kit (suites, live, Docker): [Testing](/testing)
+- Full testing kit (suites, live, Docker): [Testing](/help/testing)
- `pnpm test:force`: Kills any lingering gateway process holding the default control port, then runs the full Vitest suite with an isolated gateway port so server tests don't collide with a running instance. Use this when a prior gateway run left port 18789 occupied.
- `pnpm test:coverage`: Runs Vitest with V8 coverage. Global thresholds are 70% lines/branches/functions/statements. Coverage excludes integration-heavy entrypoints (CLI wiring, gateway/telegram bridges, webchat static server) to keep the target focused on unit-testable logic.

docs/reference/token-use.md (new file, 112 lines added)
View File

@@ -0,0 +1,112 @@
---
summary: "How OpenClaw builds prompt context and reports token usage + costs"
read_when:
- Explaining token usage, costs, or context windows
- Debugging context growth or compaction behavior
title: "Token Use and Costs"
---
# Token use & costs
OpenClaw tracks **tokens**, not characters. Tokens are model-specific, but most
OpenAI-style models average ~4 characters per token for English text.
## How the system prompt is built
OpenClaw assembles its own system prompt on every run. It includes:
- Tool list + short descriptions
- Skills list (only metadata; instructions are loaded on demand with `read`)
- Self-update instructions
- Workspace + bootstrap files (`AGENTS.md`, `SOUL.md`, `TOOLS.md`, `IDENTITY.md`, `USER.md`, `HEARTBEAT.md`, `BOOTSTRAP.md` when new). Large files are truncated by `agents.defaults.bootstrapMaxChars` (default: 20000).
- Time (UTC + user timezone)
- Reply tags + heartbeat behavior
- Runtime metadata (host/OS/model/thinking)
See the full breakdown in [System Prompt](/concepts/system-prompt).
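The bootstrap truncation above is the main knob when workspace files grow large. A minimal sketch, assuming only the documented `agents.defaults.bootstrapMaxChars` key (the value shown is illustrative, not a recommendation):

```yaml
agents:
  defaults:
    # Injected workspace/bootstrap files are truncated after this many characters.
    # Default is 20000; raising it spends more of the context window on every run.
    bootstrapMaxChars: 30000
```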
## What counts in the context window
Everything the model receives counts toward the context limit:
- System prompt (all sections listed above)
- Conversation history (user + assistant messages)
- Tool calls and tool results
- Attachments/transcripts (images, audio, files)
- Compaction summaries and pruning artifacts
- Provider wrappers or safety headers (not visible, but still counted)
For a practical breakdown (per injected file, tools, skills, and system prompt size), use `/context list` or `/context detail`. See [Context](/concepts/context).
## How to see current token usage
Use these in chat:
- `/status` → **emoji-rich status card** with the session model, context usage,
last response input/output tokens, and **estimated cost** (API key only).
- `/usage off|tokens|full` → appends a **per-response usage footer** to every reply.
- Persists per session (stored as `responseUsage`).
- OAuth auth **hides cost** (tokens only).
- `/usage cost` → shows a local cost summary from OpenClaw session logs.
Other surfaces:
- **TUI/Web TUI:** `/status` + `/usage` are supported.
- **CLI:** `openclaw status --usage` and `openclaw channels list` show
provider quota windows (not per-response costs).
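For scripting or quick checks outside chat, the CLI commands above cover the same ground:

```
# Provider usage windows (quota snapshots), not per-response costs
openclaw status --usage
openclaw channels list
```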
## Cost estimation (when shown)
Costs are estimated from your model pricing config:
```
models.providers.<provider>.models[].cost
```
These are **USD per 1M tokens** for `input`, `output`, `cacheRead`, and
`cacheWrite`. If pricing is missing, OpenClaw shows tokens only. OAuth tokens
never show dollar cost.
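As a sketch of what that pricing block can look like: the config path and the four cost fields come from above, while the list layout, the `id` field, and the dollar values are illustrative assumptions rather than real rates.

```yaml
models:
  providers:
    anthropic:
      models:
        - id: "claude-opus-4-6"   # hypothetical field name/model id, for illustration only
          cost:
            input: 15.00          # USD per 1M input tokens (illustrative)
            output: 75.00         # USD per 1M output tokens (illustrative)
            cacheRead: 1.50       # USD per 1M cache-read tokens (illustrative)
            cacheWrite: 18.75     # USD per 1M cache-write tokens (illustrative)
```

With rates like these, a reply that used 20,000 input tokens and produced 1,000 output tokens would be estimated at roughly 0.02 × $15 + 0.001 × $75 ≈ $0.38.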
## Cache TTL and pruning impact
Provider prompt caching only applies within the cache TTL window. OpenClaw can
optionally run **cache-ttl pruning**: it prunes the session once the cache TTL
has expired, then resets the cache window so subsequent requests can re-use the
freshly cached context instead of re-caching the full history. This keeps cache
write costs lower when a session goes idle past the TTL.
Configure it in [Gateway configuration](/gateway/configuration) and see the
behavior details in [Session pruning](/concepts/session-pruning).
Heartbeat can keep the cache **warm** across idle gaps. If your model cache TTL
is `1h`, setting the heartbeat interval just under that (e.g., `55m`) can avoid
re-caching the full prompt, reducing cache write costs.
For Anthropic API pricing, cache reads are significantly cheaper than input
tokens, while cache writes are billed at a higher multiplier. See Anthropic's
prompt caching pricing for the latest rates and TTL multipliers:
[https://docs.anthropic.com/docs/build-with-claude/prompt-caching](https://docs.anthropic.com/docs/build-with-claude/prompt-caching)
### Example: keep 1h cache warm with heartbeat
```yaml
agents:
  defaults:
    model:
      primary: "anthropic/claude-opus-4-6"
    models:
      "anthropic/claude-opus-4-6":
        params:
          cacheRetention: "long"
    heartbeat:
      every: "55m"
```
## Tips for reducing token pressure
- Use `/compact` to summarize long sessions.
- Trim large tool outputs in your workflows.
- Keep skill descriptions short (skill list is injected into the prompt).
- Prefer smaller models for verbose, exploratory work.
See [Skills](/tools/skills) for the exact skill list overhead formula.