AnythingLLM — a doc-lookup copilot for agents
Inline KB queries inside an agent's workstation roughly halve handling time. The goal isn't to replace humans — it's to make existing agents 2-3× faster.
- Scenario
- KB exists but agents can't / won't / don't have time to browse it
- Monthly cost
- $10 - $30
- Difficulty
- Easy
What this combo solves#
Most support teams have capable agents whose real bottleneck under customer pressure is not having time to browse docs:
- 1,000+ articles across Notion / Confluence / Lark
- SOPs, product manuals, refund rules, technical FAQs — all there
- In practice, agents either rely on memory or open 3-5 tabs and grep
- Average non-standard ticket eats 2-5 minutes of doc lookup
- New-hire ramp is 1-2 months
AnythingLLM provides an agent-facing AI workstation. It does not replace agents — it makes them 2-3× faster. Positioning is completely distinct from a customer-facing AI like Chatwoot + Dify.
Architecture#
Versus customer-facing AI#
| Dimension | AnythingLLM (internal copilot) | Chatwoot + Dify (customer-facing) |
|---|---|---|
| User | Agents | Customers |
| Tolerance for mistakes | High (agent judges) | Low (goes straight out) |
| Prompt strictness | Can be loose | Must be tight |
| Risk | Wrong answer only affects agent | Wrong answer → complaint / legal |
| Deploy complexity | Single container | Multi-component |
| Starting cost | $10-30 / mo | $40-150 / mo |
Many teams run both — AnythingLLM for agents, Chatwoot + Dify for customers.
When to pick this#
| Situation | Fits? |
|---|---|
| KB exists, agents don’t use it | ✓ |
| Lots of new hires, long ramp | ✓ |
| Data sensitive — don’t want AI auto-replying customers | ✓ |
| Want to dip a toe into AI but not customer-facing | ✓ |
| Customer questions are highly repetitive | ✗ Use customer-facing AI |
Deployment (15 min)#
Option 1 — cloud models (easiest)#
docker run -d --name anythingllm \
-p 3001:3001 \
--restart unless-stopped \
-v anythingllm_storage:/app/server/storage \
-e STORAGE_DIR="/app/server/storage" \
mintplexlabs/anythingllm
Open http://YOUR_IP:3001, register admin, Settings → LLM Provider → OpenAI / Claude / DeepSeek.
Option 2 — local inference (air-gapped)#
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b-instruct-q4_K_M
docker run -d --name anythingllm \
-p 3001:3001 \
--add-host=host.docker.internal:host-gateway \
-v anythingllm_storage:/app/server/storage \
-e LLM_PROVIDER=ollama \
-e OLLAMA_BASE_PATH=http://host.docker.internal:11434 \
mintplexlabs/anythingllm
Workspace model#
AnythingLLM’s core concept is the Workspace — each has its own:
- KB (uploaded docs)
- Prompt
- Conversation history
Split by department or product line:
- “Support - General” — SOPs + company policy
- “Support - Product A” — product A docs
- “Support - Product B” — product B docs
- “Support - Refund flow” — complex refund judgment
Users and permissions#
| Role | Permissions |
|---|---|
| Admin | All |
| Manager | Create Workspaces, invite members |
| Default | Use assigned Workspaces |
Each agent gets a Default account bound to their relevant Workspaces.
Real workflow#
Agent receives “My X-200 has no sound, bought last week”:
- Open the “Support - Product A” Workspace
- Ask: “X-200 no sound troubleshooting”
- AnythingLLM returns (from product manual):
X-200 no sound — checklist: 1. Check volume switch (left side) 2. Verify Bluetooth pairing 3. Hold power 10s to reboot 4. Still nothing → contact RMA (manual p.34) - Agent copies into Chatwoot, customer answered in 3 s
Cost#
| Mode | Monthly |
|---|---|
| AnythingLLM + cloud DeepSeek | $10-20 |
| AnythingLLM + local Qwen (own GPU) | $0 (sunk cost) |
| AnythingLLM + local Qwen (rented A100) | $700-1500 |
Outcomes#
A 30-agent team’s before/after:
| Metric | Before | After |
|---|---|---|
| Avg handling time | 8 min | 3.5 min |
| New hire to independent | 6 weeks | 2 weeks |
| KB usage rate | 18% | 92% |
| 2nd-touch on same issue | 22% | 9% |
Bottom line: you don’t need to replace agents — just making lookup easy delivers real gains.
Advanced: customer-facing via API#
AnythingLLM exposes an OpenAI-compatible API, so it can be Chatwoot’s Agent Bot directly:
# Chatwoot Agent Bot Outgoing URL
https://anythingllm.example.com/api/v1/openai/chat/completions
# Auth header
Authorization: Bearer <AnythingLLM API Key>
But prompts here aren’t strict — don’t customer-face from AnythingLLM. Use Dify for that.
Pitfalls#
- Desktop + server clash — running both fragments the vector store
- Large uploads time out — default 50 MB; raise it in nginx + env
- Shared model rate limits — 30 agents on a free DeepSeek key get throttled; pay for capacity
- Documents scattered across too many Workspaces — duplicates the same info N times; keep “general SOPs” in one place and reference from others