Published Tue May 12 2026 08:00:00 GMT+0800 (China Standard Time)
benchmark, embedding, RAG
2026 Chinese embedding benchmark — bge-m3, Conan, m3e, bce, OpenAI
Same Chinese KB, same real support questions — five embedding models compared on retrieval accuracy, speed and cost.
Setup
- Corpus: 3,000 FAQ entries + 500 product docs (e-commerce)
- Test set: 120 real user questions with annotated top-1 ground truth
- Retrieval: pure vector, top_k=5
- Metrics: MRR@5, Recall@5, per-query latency, cost per 1M tokens
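With one annotated ground-truth document per question, both accuracy metrics reduce to simple counting over the top 5 results. The sketch below shows one way to compute them; the function names and the toy document IDs are illustrative assumptions, not the harness used for this benchmark.

```python
from typing import Sequence


def reciprocal_rank_at_k(ranked_ids: Sequence[str], gold_id: str, k: int = 5) -> float:
    """Return 1/rank of the gold document if it is in the top k, else 0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id == gold_id:
            return 1.0 / rank
    return 0.0


def evaluate(runs: Sequence[tuple[Sequence[str], str]], k: int = 5) -> dict[str, float]:
    """runs holds one (ranked doc ids, gold doc id) pair per test question."""
    rr = [reciprocal_rank_at_k(ranked, gold, k) for ranked, gold in runs]
    return {
        f"MRR@{k}": sum(rr) / len(rr),
        # With a single gold document, Recall@k is the share of questions with a hit.
        f"Recall@{k}": sum(1 for r in rr if r > 0) / len(rr),
    }


# Toy data (hypothetical IDs): hits at rank 1, rank 2, rank 4, and one miss.
runs = [
    (["faq_12", "faq_7", "doc_3", "faq_1", "faq_9"], "faq_12"),
    (["faq_4", "faq_8", "doc_1", "faq_2", "faq_5"], "faq_8"),
    (["doc_9", "faq_3", "faq_6", "doc_2", "faq_0"], "doc_2"),
    (["faq_20", "faq_21", "faq_22", "faq_23", "faq_24"], "faq_99"),
]
metrics = evaluate(runs)
print(metrics)
```

On the toy data the reciprocal ranks are 1, 0.5, 0.25 and 0, so MRR@5 is 0.4375 and Recall@5 is 0.75.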
Models
| Model | Dims | Open | Notes |
|---|---|---|---|
| bge-m3 | 1024 | ✓ | BAAI; de-facto baseline for Chinese |
| Conan-embedding-v1 | 1792 | ✓ | Tencent, 2026 |
| m3e-large | 1024 | ✓ | moka-ai |
| bce-embedding-base_v1 | 768 | ✓ | NetEase Youdao |
| text-embedding-3-large | 3072 | ✗ | OpenAI |
Results
| Model | MRR@5 | Recall@5 | Avg latency (ms) | Cost / 1M tokens |
|---|---|---|---|---|
| bge-m3 | 0.87 | 0.94 | 22 | self-host GPU |
| Conan-embedding-v1 | 0.89 | 0.95 | 31 | self-host GPU |
| m3e-large | 0.81 | 0.89 | 24 | self-host GPU |
| bce-embedding-base_v1 | 0.79 | 0.87 | 18 | self-host GPU |
| text-embedding-3-large | 0.85 | 0.93 | 180-400 | ~$0.13 |
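To put the ~$0.13 per 1M tokens figure in context, here is a back-of-the-envelope estimate for the hosted option. The average token counts and the monthly query volume are made-up assumptions for illustration; only the price and the corpus size (3,000 FAQ entries + 500 product docs) come from this post.

```python
def embedding_cost_usd(
    num_texts: int,
    avg_tokens_per_text: float,
    usd_per_million_tokens: float = 0.13,  # price quoted in the results table
) -> float:
    """Estimate API cost for embedding num_texts texts of a given average length."""
    total_tokens = num_texts * avg_tokens_per_text
    return total_tokens / 1_000_000 * usd_per_million_tokens


# One-off indexing of the 3,500-document corpus, ASSUMING ~400 tokens per document.
corpus_cost = embedding_cost_usd(3_500, 400)

# Recurring query cost, ASSUMING 100,000 questions per month at ~30 tokens each.
monthly_query_cost = embedding_cost_usd(100_000, 30)

print(f"index once: ${corpus_cost:.2f}, queries per month: ${monthly_query_cost:.2f}")
```

Under these assumptions both numbers stay well below one dollar, which suggests that at this scale the hosted model's trade-off is latency rather than price.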
Recommendations
| Scenario | Pick |
|---|---|
| Default Chinese | bge-m3 — most balanced |
| Max accuracy | Conan-embedding-v1 — edges out bge-m3 |
| Tight budget | bce-embedding-base_v1 — light and fast |
| Mixed languages | bge-m3 or text-embedding-3-large |
| No self-hosting | OpenAI is easiest |
Findings
- bce-embedding + the matching reranker closes the gap: alone it ranks 4th, but pairing it with bce-reranker-base_v1 lifts the combined score to 0.88, just behind bge-m3
- OpenAI shines on language mixing: when English appears in Chinese context, OpenAI stays steady while open models can dip
- Dimensions aren’t everything: 3072-dim text-embedding-3-large scores below 1024-dim bge-m3 on both MRR@5 (0.85 vs 0.87) and Recall@5 (0.93 vs 0.94)
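The embedding-plus-reranker pairing mentioned above follows the usual two-stage pattern: a fast first pass collects candidates, then a slower scorer reorders them. The sketch below shows only the control flow. The two toy scoring functions are stand-ins for bce-embedding-base_v1 and bce-reranker-base_v1, and `recall_k` is a hypothetical parameter, not a setting from this benchmark.

```python
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]  # (query, document) -> relevance score


def retrieve_then_rerank(
    query: str,
    docs: Sequence[str],
    embed_score: Scorer,
    rerank_score: Scorer,
    recall_k: int = 20,
    top_k: int = 5,
) -> list[str]:
    """Stage 1: keep the recall_k best docs by embed_score. Stage 2: reorder by rerank_score."""
    candidates = sorted(docs, key=lambda d: embed_score(query, d), reverse=True)[:recall_k]
    return sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)[:top_k]


# Toy stand-ins: in practice these would call the embedding model and the reranker.
def toy_embed_score(query: str, doc: str) -> float:
    """Count how often each query word occurs in the document."""
    words = doc.split()
    return float(sum(words.count(w) for w in query.split()))


def toy_rerank_score(query: str, doc: str) -> float:
    """Reward documents that contain the query as an exact phrase."""
    return 1.0 if query in doc else 0.0


docs = [
    "refund policy refund rules refund form",
    "check refund status",
    "shipping fees",
]
results = retrieve_then_rerank(
    "refund status", docs, toy_embed_score, toy_rerank_score, recall_k=2, top_k=2
)
print(results)
```

In the toy run the first stage ranks the keyword-heavy policy document first, and the second stage moves "check refund status" to the top, which is the kind of reordering a reranker is meant to provide.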
How to plug into Dify
# Dify → Settings → Model Provider → Add
# Choose Hugging Face or SiliconFlow
# Model: BAAI/bge-m3
# Endpoint: https://api.siliconflow.cn/v1
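Before wiring the provider into Dify, it can help to confirm that the endpoint and model name respond as expected. The sketch below assumes the endpoint exposes an OpenAI-compatible `/embeddings` route and response shape, and that the API key lives in an environment variable named `SILICONFLOW_API_KEY`; both are assumptions to check against the provider's documentation.

```python
import json
import os
import urllib.request


def build_embedding_request(
    texts: list[str],
    api_key: str,
    base_url: str = "https://api.siliconflow.cn/v1",
    model: str = "BAAI/bge-m3",
) -> urllib.request.Request:
    """Build a POST request in the OpenAI-style embeddings format (assumed route)."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # SILICONFLOW_API_KEY is a hypothetical variable name; nothing is sent without it.
    api_key = os.environ.get("SILICONFLOW_API_KEY")
    if api_key:
        request = build_embedding_request(["如何申请退款?"], api_key)
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
        # Assumes the OpenAI-style shape: {"data": [{"embedding": [...]}, ...]}
        print(len(body["data"][0]["embedding"]))
```

If the call succeeds, the printed vector length should match the dimension listed for the model in the table above (1024 for bge-m3).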