Dify + RAGFlow — two-layer stack for complex documents
Dify owns the app layer and workflow; RAGFlow handles retrieval over tables, scans and complex layouts that defeat simple chunking.
- Scenario: KB is dominated by complex PDFs (product manuals, contracts, medical guidelines) where naive chunking fails
- Monthly cost: $150 - $500
- Difficulty: Hard
What this combo solves
Dify’s built-in KB is fine for Markdown / Notion exports, but it stumbles on:
| Document type | Result with Dify default KB |
|---|---|
| Product manual with 12 tables | Tables fragmented; 30-60% spec hit rate |
| Two-column legal / policy docs | Text streams out of order |
| Scanned / legacy manuals | 0 hits (no OCR) |
| Tech docs with formulas / figures | Heavy information loss |
RAGFlow benchmarks show its DeepDoc engine lifts retrieval accuracy on these documents from ~60% to 90%+. RAGFlow is a retrieval engine, not an app platform — best paired as:
Dify for the app layer (prompts + workflow + multi-LLM) + RAGFlow as retrieval backend
Architecture
Components
| Component | Role | Sizing |
|---|---|---|
| RAGFlow | Document parsing + retrieval | 8C/16G + 100GB |
| Elasticsearch / Infinity | Backing search | 4C/8G + 50GB |
| Dify | Workflow, prompts, LLM governance | 8C/16G |
| Chatwoot | Channels + agents | 4C/8G |
| PostgreSQL | Metadata | 2C/4G |
Deployment
1. Bring up RAGFlow
git clone https://github.com/infiniflow/ragflow
cd ragflow/docker
docker compose -f docker-compose.yml up -d
Open http://YOUR_IP:80, register an account, and create the first KB with these settings:
- Language: Chinese / mixed
- Chunking: General to start, or Paper / Manual for table-heavy documents
- Embedding: bge-m3
- Upload docs, wait for DeepDoc to finish parsing (5-30s per page)
2. Test retrieval
In RAGFlow’s Retrieval Testing pane, run real questions:
- Tables: “max output power of model X-200”
- Two-column: “temperature limit mentioned in chapter 2”
- Scans: “safety rules in appendix B”
If top-1 accuracy lags:
- Switch chunking strategy
- Enable Auto-keyword and Auto-question preprocessing
- Add a reranker (default bge-reranker-v2-m3)
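To see whether a chunking or reranker change actually helps, the same question set can be scored before and after each change. The sketch below is one way to do that, assuming you supply a `retrieve` function that returns chunk texts in rank order. The expected phrases and the `fake_retrieve` stand-in are hypothetical and only illustrate the shape of the check.

```python
from __future__ import annotations

from typing import Callable, Sequence

# Questions come from the examples above; the expected phrases are hypothetical.
TEST_CASES = [
    ("max output power of model X-200", "output power"),
    ("temperature limit mentioned in chapter 2", "temperature"),
    ("safety rules in appendix B", "safety"),
]


def top1_hit_rate(
    retrieve: Callable[[str], Sequence[str]],
    cases: Sequence[tuple[str, str]],
) -> float:
    """Return the fraction of questions whose first chunk contains the expected phrase."""
    if not cases:
        return 0.0
    hits = 0
    for question, expected in cases:
        chunks = retrieve(question)
        if chunks and expected.lower() in chunks[0].lower():
            hits += 1
    return hits / len(cases)


def fake_retrieve(question: str) -> list[str]:
    """Stand-in retriever for a dry run; replace with a call to your RAGFlow KB."""
    canned = {
        "max output power of model X-200": ["X-200 rated output power: see spec table"],
        "temperature limit mentioned in chapter 2": ["Chapter 3 covers installation"],
    }
    return canned.get(question, [])


if __name__ == "__main__":
    print(f"top-1 hit rate: {top1_hit_rate(fake_retrieve, TEST_CASES):.2f}")
```

Substring matching is a crude proxy for correctness. It is enough to compare two configurations on the same questions, but not to report absolute accuracy.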
3. Use RAGFlow as Dify’s retriever
In a Dify Workflow, add an HTTP Request node:
Node: HTTP Request
  URL: https://ragflow.example.com/api/v1/retrieval
  Method: POST
  Headers:
    Authorization: Bearer <RAGFlow API Key>
  Body:
    {
      "question": "{{user_question}}",
      "dataset_ids": ["kb_uuid"],
      "top_k": 5,
      "similarity_threshold": 0.5,
      "rerank": true
    }
Map the returned chunk list, shown here as [{chunk_id, content, score}, ...], to the retrieved_chunks variable used by the LLM node prompt in the next step.
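Before wiring the node into a workflow, it can help to send the same request from a script and inspect the raw response. The sketch below mirrors the node config above using only the Python standard library. The URL, dataset ID, and API key are placeholders, and the body fields are copied from the config as written, so check them and the response shape against the API reference for your RAGFlow version.

```python
import json
import urllib.request

# Placeholder endpoint copied from the node config; replace with your own host.
RAGFLOW_URL = "https://ragflow.example.com/api/v1/retrieval"


def build_retrieval_request(
    question: str,
    dataset_ids: list,
    api_key: str,
    url: str = RAGFLOW_URL,
) -> urllib.request.Request:
    """Build a POST request with the same headers and body as the HTTP Request node."""
    body = {
        "question": question,
        "dataset_ids": dataset_ids,
        "top_k": 5,
        "similarity_threshold": 0.5,
        "rerank": True,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def retrieve(question: str, dataset_ids: list, api_key: str, timeout: float = 10.0):
    """Send the request and return the decoded JSON response as-is."""
    req = build_retrieval_request(question, dataset_ids, api_key)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `retrieve("max output power of model X-200", ["kb_uuid"], "<RAGFlow API Key>")` returns whatever JSON the server sends. The function deliberately does not assume any field names in the response.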
4. Strict prompt
You are [Brand]'s support assistant. Answer only from "Reference"
and cite chunk_id after each fact.
Reference:
{% for chunk in retrieved_chunks %}
[{{chunk.chunk_id}}] {{chunk.content}}
{% endfor %}
Question: {{question}}
Answer (with chunk_id):
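The template above can also be rendered and checked outside Dify, which is handy for testing prompts offline. This is a minimal sketch, assuming each chunk is a dict with `chunk_id` and `content` keys as in the template. The `unknown_citations` helper is a hypothetical addition that flags bracketed IDs in an answer that do not match any retrieved chunk.

```python
import re


def render_prompt(brand: str, question: str, chunks: list) -> str:
    """Render the strict prompt with one '[chunk_id] content' line per chunk."""
    reference = "\n".join(f"[{c['chunk_id']}] {c['content']}" for c in chunks)
    return (
        f"You are {brand}'s support assistant. Answer only from \"Reference\"\n"
        "and cite chunk_id after each fact.\n"
        f"Reference:\n{reference}\n"
        f"Question: {question}\n"
        "Answer (with chunk_id):"
    )


def unknown_citations(answer: str, chunks: list) -> list:
    """Return bracketed IDs cited in the answer that are not among the retrieved chunks."""
    known = {str(c["chunk_id"]) for c in chunks}
    cited = re.findall(r"\[([^\[\]]+)\]", answer)
    return [cid for cid in cited if cid not in known]


if __name__ == "__main__":
    chunks = [{"chunk_id": "c1", "content": "X-200 max output power: 5 W"}]
    print(render_prompt("Acme", "What is the max output power of X-200?", chunks))
    print(unknown_citations("Max output power is 5 W [c1]. Weight is 2 kg [c9].", chunks))
```

An answer that cites an ID outside the reference set is a cheap signal that the model stepped beyond the retrieved context.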
5. Chatwoot integration
Same as in the Chatwoot + Dify setup: wire the Dify app as an Agent Bot.
Tuning
RAGFlow chunking
| Document type | Chunking |
|---|---|
| Generic Markdown / Word | General |
| Product manual (table-heavy) | Paper / Manual |
| Legal / contracts | Laws |
| Academic papers | Paper |
| Resumes / card-style | Resume |
| Books | Book |
Enable GraphRAG
For entity-rich documents (interconnected specs, medical terminology), enable RAGFlow’s Knowledge Graph:
- Check “Knowledge Graph” in KB settings
- Retrieval returns graph relations alongside chunks
- Both accuracy and explainability rise noticeably
Multi-KB routing
When docs span multiple domains, classify intent in Dify first, then route:
Question → LLM classify(domain: A|B|policy) →
├─ A → RAGFlow KB_A
├─ B → RAGFlow KB_B
└─ policy → RAGFlow KB_policy
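The routing step after classification amounts to a lookup from the classifier's label to a dataset ID. The sketch below shows that lookup, assuming the LLM returns one of the three labels from the diagram. The dataset IDs are hypothetical, and falling back to the policy KB on an unrecognized label is an assumption, not something the diagram specifies.

```python
# Hypothetical dataset IDs; use the UUIDs of your own RAGFlow knowledge bases.
KB_BY_DOMAIN = {
    "a": "kb_a_uuid",
    "b": "kb_b_uuid",
    "policy": "kb_policy_uuid",
}


def route_to_kb(label: str, fallback: str = "policy") -> str:
    """Map a classifier label to a dataset ID, tolerating case and stray whitespace."""
    key = label.strip().lower()
    return KB_BY_DOMAIN.get(key, KB_BY_DOMAIN[fallback])


if __name__ == "__main__":
    for raw in ["A", " b\n", "policy", "billing"]:
        print(repr(raw), "->", route_to_kb(raw))
```

Normalizing the label matters because LLM classifiers often return extra whitespace or different casing. The returned ID is what goes into `dataset_ids` in the retrieval request.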
Cost
| Resource | Monthly (self-host) |
|---|---|
| 8C/16G (RAGFlow + Dify) | $60-80 |
| 4C/8G (Chatwoot + Postgres) | $30 |
| LLM tokens (5k conv/mo, Qwen 72B) | $50-150 |
| Embedding (local GPU = free) | $0 |
| Domain + email | $15 |
| Total | $155-275 / mo |
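The total is the sum of the low and high ends of each row. A short sketch that reproduces it and derives a per-conversation figure from the 5k conversations per month in the LLM row (the per-conversation number is derived here and does not appear in the table):

```python
# (low, high) monthly cost in USD, copied from the table above.
COSTS = {
    "8C/16G (RAGFlow + Dify)": (60, 80),
    "4C/8G (Chatwoot + Postgres)": (30, 30),
    "LLM tokens (5k conv/mo, Qwen 72B)": (50, 150),
    "Embedding (local GPU)": (0, 0),
    "Domain + email": (15, 15),
}
CONVERSATIONS_PER_MONTH = 5000


def total_range(costs: dict) -> tuple:
    """Sum the low ends and the high ends separately."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


if __name__ == "__main__":
    low, high = total_range(COSTS)
    print(f"Total: ${low}-{high} / mo")  # Total: $155-275 / mo
    print(f"Per conversation: ${low / CONVERSATIONS_PER_MONTH:.3f}"
          f"-{high / CONVERSATIONS_PER_MONTH:.3f}")
```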
Effect (same product manual)
30 real questions covering tables, two-column layouts, and scanned pages:
| Setup | Top-1 hits | Faithfulness |
|---|---|---|
| Dify default KB | 17/30 | 0.78 |
| Dify tuned KB | 23/30 | 0.85 |
| Dify + RAGFlow (default) | 28/30 | 0.87 |
| Dify + RAGFlow (tuned + GraphRAG) | 30/30 | 0.94 |
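Converting the hit counts to percentages makes them easier to compare with the roughly 60% to 90%+ range quoted earlier. A small sketch using only the numbers in the table:

```python
# Top-1 hits out of 30 questions, copied from the table above.
TOTAL_QUESTIONS = 30
TOP1_HITS = {
    "Dify default KB": 17,
    "Dify tuned KB": 23,
    "Dify + RAGFlow (default)": 28,
    "Dify + RAGFlow (tuned + GraphRAG)": 30,
}


def hit_rate_percent(hits: int, total: int = TOTAL_QUESTIONS) -> float:
    """Return the top-1 hit rate as a percentage rounded to one decimal."""
    return round(100 * hits / total, 1)


if __name__ == "__main__":
    for setup, hits in TOP1_HITS.items():
        print(f"{setup}: {hit_rate_percent(hits)}%")
```

This gives 56.7%, 76.7%, 93.3%, and 100.0% respectively. With only 30 questions, each question moves the rate by about 3.3 points, so small differences between setups should be read cautiously.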
When NOT to pick this
| Situation | Use instead |
|---|---|
| Docs are mostly Markdown / Notion | Chatwoot + Dify |
| Tight budget, single 4C/8G host | Chatwoot + Dify first |
| < 100 simple docs | AnythingLLM copilot + Chatwoot |
Pitfalls
- Parse speed — scanned PDF ~30s/page; 100 pages = 50 minutes; batch overnight
- Elasticsearch memory — default 1G is not enough; bump to 4G+ in production
- Dify HTTP timeout — RAGFlow queries sometimes take 2s; set node timeout to 10s
- Chinese OCR weakness — RAGFlow’s built-in OCR struggles with handwriting / blurred scans; preprocess with PaddleOCR
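The parse-speed figure above is simple arithmetic and can be turned into a quick planning check before a large upload. The sketch below uses the ~30 s/page rate for scanned PDFs. The 8-hour overnight window is an assumption for illustration.

```python
def parse_minutes(pages: int, seconds_per_page: float = 30.0) -> float:
    """Estimate parse time in minutes at a fixed per-page rate."""
    return pages * seconds_per_page / 60


def max_pages(window_hours: float, seconds_per_page: float = 30.0) -> int:
    """Return how many pages fit in a batch window at a fixed per-page rate."""
    return int(window_hours * 3600 // seconds_per_page)


if __name__ == "__main__":
    print(parse_minutes(100))  # 50.0 minutes for 100 scanned pages
    print(max_pages(8))        # pages that fit in an assumed 8-hour window
```

The estimate assumes pages are parsed one after another at a constant rate. Actual throughput depends on hardware and on how many parsing tasks run concurrently.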