AnythingLLM — a doc-lookup copilot for agents

Inline KB queries inside an agent's workstation roughly halve handling time. The goal isn't to replace humans — it's to make existing agents 2-3× faster.

Scenario: KB exists but agents can't / won't / don't have time to browse it
Monthly cost: $10 - $30
Difficulty: Easy

AnythingLLMOllama (optional)PostgreSQL (optional)

What this combo solves#

Most support teams have capable agents whose real bottleneck under customer pressure is not having time to browse docs:

1,000+ articles across Notion / Confluence / Lark
SOPs, product manuals, refund rules, technical FAQs — all there
In practice, agents either rely on memory or open 3-5 tabs and grep
Average non-standard ticket eats 2-5 minutes of doc lookup
New-hire ramp is 1-2 months

AnythingLLM provides an agent-facing AI workstation. It does not replace agents — it makes them 2-3× faster. Positioning is completely distinct from a customer-facing AI like Chatwoot + Dify.

Architecture#

Versus customer-facing AI#

Dimension	AnythingLLM (internal copilot)	Chatwoot + Dify (customer-facing)
User	Agents	Customers
Tolerance for mistakes	High (agent judges)	Low (goes straight out)
Prompt strictness	Can be loose	Must be tight
Risk	Wrong answer only affects agent	Wrong answer → complaint / legal
Deploy complexity	Single container	Multi-component
Starting cost	$10-30 / mo	$40-150 / mo

Many teams run both — AnythingLLM for agents, Chatwoot + Dify for customers.

When to pick this#

Situation	Fits?
KB exists, agents don’t use it	✓
Lots of new hires, long ramp	✓
Data sensitive — don’t want AI auto-replying customers	✓
Want to dip a toe into AI but not customer-facing	✓
Customer questions are highly repetitive	✗ Use customer-facing AI

Deployment (15 min)#

Option 1 — cloud models (easiest)#

docker run -d --name anythingllm \
  -p 3001:3001 \
  --restart unless-stopped \
  -v anythingllm_storage:/app/server/storage \
  -e STORAGE_DIR="/app/server/storage" \
  mintplexlabs/anythingllm

Open http://YOUR_IP:3001, register admin, Settings → LLM Provider → OpenAI / Claude / DeepSeek.

Option 2 — local inference (air-gapped)#

curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:14b-instruct-q4_K_M

docker run -d --name anythingllm \
  -p 3001:3001 \
  --add-host=host.docker.internal:host-gateway \
  -v anythingllm_storage:/app/server/storage \
  -e LLM_PROVIDER=ollama \
  -e OLLAMA_BASE_PATH=http://host.docker.internal:11434 \
  mintplexlabs/anythingllm

Workspace model#

AnythingLLM’s core concept is the Workspace — each has its own:

KB (uploaded docs)
Prompt
Conversation history

Split by department or product line:

“Support - General” — SOPs + company policy
“Support - Product A” — product A docs
“Support - Product B” — product B docs
“Support - Refund flow” — complex refund judgment

Users and permissions#

Role	Permissions
Admin	All
Manager	Create Workspaces, invite members
Default	Use assigned Workspaces

Each agent gets a Default account bound to their relevant Workspaces.

Real workflow#

Agent receives “My X-200 has no sound, bought last week”:

Open the “Support - Product A” Workspace
Ask: “X-200 no sound troubleshooting”

AnythingLLM returns (from product manual):

X-200 no sound — checklist:
1. Check volume switch (left side)
2. Verify Bluetooth pairing
3. Hold power 10s to reboot
4. Still nothing → contact RMA (manual p.34)

Agent copies into Chatwoot, customer answered in 3 s

Cost#

Mode	Monthly
AnythingLLM + cloud DeepSeek	$10-20
AnythingLLM + local Qwen (own GPU)	$0 (sunk cost)
AnythingLLM + local Qwen (rented A100)	$700-1500

Outcomes#

A 30-agent team’s before/after:

Metric	Before	After
Avg handling time	8 min	3.5 min
New hire to independent	6 weeks	2 weeks
KB usage rate	18%	92%
2nd-touch on same issue	22%	9%

Bottom line: you don’t need to replace agents — just making lookup easy delivers real gains.

Advanced: customer-facing via API#

AnythingLLM exposes an OpenAI-compatible API, so it can be Chatwoot’s Agent Bot directly:

# Chatwoot Agent Bot Outgoing URL
https://anythingllm.example.com/api/v1/openai/chat/completions

# Auth header
Authorization: Bearer <AnythingLLM API Key>

But prompts here aren’t strict — don’t customer-face from AnythingLLM. Use Dify for that.

Pitfalls#

Desktop + server clash — running both fragments the vector store
Large uploads time out — default 50 MB; raise it in nginx + env
Shared model rate limits — 30 agents on a free DeepSeek key get throttled; pay for capacity
Documents scattered across too many Workspaces — duplicates the same info N times; keep “general SOPs” in one place and reference from others