RAGFlow
Open-source RAG engine powered by deep document understanding
- License: Apache-2.0
- Deploy difficulty: Medium
- GitHub stars: 80.7k
- Forks: 9.2k
- Open issues: 3023
- Latest: v0.25.4
- Last commit: 2d ago
- Synced: 2026-05-17
What RAGFlow is
RAGFlow is an open-source RAG engine from InfiniFlow. Its headline differentiator is DeepDoc parsing: instead of chunking naively by page or by length, it identifies tables, multi-column layouts, and scanned pages (via OCR) inside PDFs. It is especially effective on complex Chinese documents.
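To make the contrast with length-based chunking concrete, here is a minimal sketch. It is not DeepDoc's implementation: the `Block` type, the two functions, and the sample text are all hypothetical, and the sketch assumes a parser has already split the page into layout blocks. It only shows why cutting every N characters can slice a table in half, while packing whole blocks cannot.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """One layout element, as a layout-aware parser might emit it (hypothetical shape)."""
    kind: str  # e.g. "paragraph" or "table"
    text: str


def naive_chunks(text: str, size: int) -> list[str]:
    """Fixed-length chunking: cut every `size` characters, ignoring structure."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def layout_chunks(blocks: list[Block], size: int) -> list[str]:
    """Pack whole blocks into chunks of up to `size` characters.

    A block is never split, so an oversized block becomes its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for block in blocks:
        candidate = block.text if not current else current + "\n" + block.text
        if current and len(candidate) > size:
            # Adding this block would overflow: close the current chunk first.
            chunks.append(current)
            current = block.text
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Made-up sample page: a paragraph, a small table, another paragraph.
blocks = [
    Block("paragraph", "Fees for 2024 are listed below."),
    Block("table", "Item | Fee\nFiling | 120\nAppeal | 300"),
    Block("paragraph", "Fees are reviewed annually."),
]
flat = "\n".join(b.text for b in blocks)

naive = naive_chunks(flat, 40)      # the table straddles two chunks
aware = layout_chunks(blocks, 40)   # the table stays in one chunk
```

With this sample, no chunk in `naive` contains the full table, while `aware` keeps it intact as a single chunk, which is the property a retrieval step needs when a question targets one table row.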
Versus Dify / FastGPT
- Dify and FastGPT are application platforms; RAGFlow is a retrieval engine
- RAGFlow emphasizes citability: every answer links back to its source passage
- Works best deployed behind Dify or FastGPT, serving as their upstream knowledge layer over its API
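As a rough sketch of the "upstream knowledge layer via API" pattern, the snippet below builds, but does not send, an HTTP request that an application platform could issue to fetch chunks. The `/api/v1/retrieval` path, the `question` and `dataset_ids` fields, and the Bearer-token header are assumptions based on RAGFlow's HTTP API reference and should be checked against the deployed version. The base URL, API key, and dataset ID are placeholders.

```python
import json
import urllib.request


def build_retrieval_request(base_url: str, api_key: str, question: str,
                            dataset_ids: list[str]) -> urllib.request.Request:
    """Build (but do not send) a chunk-retrieval request.

    ASSUMPTION: the endpoint path and body fields below mirror RAGFlow's
    HTTP API reference; verify them against your deployed version.
    """
    body = json.dumps({"question": question, "dataset_ids": dataset_ids}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/retrieval",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values for illustration only.
req = build_retrieval_request(
    "http://localhost:9380",
    "YOUR_API_KEY",
    "What is the filing fee?",
    ["dataset-id-placeholder"],
)
```

Sending it would be a matter of `urllib.request.urlopen(req)` against a running instance. The calling platform would then pass the returned chunks, along with their source references, to its own LLM step.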
Strengths
- Best-in-class document parsing — PDF tables, scans, complex layouts
- DeepDoc engine handles Chinese documents with intricate formatting
- Answers ship with clickable evidence snippets
- Built-in GraphRAG and multi-route recall
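Multi-route recall means running several retrievers (for example keyword and vector search) and merging their ranked results. One common merging method is reciprocal rank fusion (RRF). The sketch below illustrates that general technique and is not RAGFlow's actual fusion logic. The route names, chunk IDs, and the constant `k = 60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(routes: dict[str, list[str]], k: int = 60) -> list[str]:
    """Merge ranked ID lists from several recall routes with RRF.

    Each route contributes 1 / (k + rank) to every ID it returns;
    IDs are then sorted by total score (ties broken by ID).
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in routes.values():
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from two recall routes, best match first.
routes = {
    "keyword": ["c3", "c1", "c7"],
    "vector": ["c1", "c4", "c3"],
}
fused = reciprocal_rank_fusion(routes)
print(fused)  # ['c1', 'c3', 'c4', 'c7']
```

Chunks that appear in both routes (`c1`, `c3`) rise above chunks found by only one, which is the main appeal of fusing routes rather than trusting a single retriever.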
Trade-offs
- Heavy resource footprint (parser is CPU-intensive)
- UI is aimed at engineers rather than end users
- Requires a backing search/storage engine such as Elasticsearch or Infinity
Best for
- Government, legal, healthcare — serious documents with serious layouts
- Compliance contexts where every answer must cite its source