flag92
Published Tue May 12 2026 08:00:00 GMT+0800 (China Standard Time)
Tags: benchmark, embedding, RAG

2026 Chinese embedding benchmark — bge-m3, Conan, m3e, bce, OpenAI

Five embedding models compared on the same Chinese knowledge base and the same real support questions, measured on retrieval accuracy, speed, and cost.

Setup

  • Corpus: 3,000 FAQ entries + 500 product docs (e-commerce)
  • Test set: 120 real user questions with annotated top-1 ground truth
  • Retrieval: pure vector, top_k=5
  • Metrics: MRR@5, Recall@5, per-query latency, cost per 1M tokens
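
The retrieval and scoring steps listed above can be sketched in a few lines. This is a minimal illustration, not the harness used for this benchmark: it assumes embeddings are already computed as plain lists of floats, that similarity is cosine, and that each question has exactly one gold document id. The function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_indices(query_vec, doc_vecs, k=5):
    """Indices of the k documents most similar to the query, best first."""
    order = sorted(range(len(doc_vecs)),
                   key=lambda i: cosine(query_vec, doc_vecs[i]),
                   reverse=True)
    return order[:k]

def mrr_at_k(ranked_lists, gold_ids, k=5):
    """Mean reciprocal rank of the gold id within the top k (0 if absent)."""
    total = 0.0
    for ranked, gold in zip(ranked_lists, gold_ids):
        top = ranked[:k]
        if gold in top:
            total += 1.0 / (top.index(gold) + 1)
    return total / len(gold_ids)

def recall_at_k(ranked_lists, gold_ids, k=5):
    """Fraction of queries whose gold id appears anywhere in the top k."""
    hits = sum(1 for ranked, gold in zip(ranked_lists, gold_ids)
               if gold in ranked[:k])
    return hits / len(gold_ids)
```

With one gold answer per question, Recall@5 is simply the hit rate in the top 5, while MRR@5 also rewards placing the gold answer higher in the list.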

Models

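| Model | Dims | Open | Notes |
| --- | --- | --- | --- |
| bge-m3 | 1024 | Yes | BAAI; de-facto baseline for Chinese |
| Conan-embedding-v1 | 1792 | Yes | Tencent, 2026 |
| m3e-large | 1024 | Yes | moka-ai |
| bce-embedding-base_v1 | 768 | Yes | NetEase Youdao |
| text-embedding-3-large | 3072 | No | OpenAI |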

Results

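| Model | MRR@5 | Recall@5 | Avg latency (ms) | Cost / 1M tokens |
| --- | --- | --- | --- | --- |
| bge-m3 | 0.87 | 0.94 | 22 | self-host GPU |
| Conan-embedding-v1 | 0.89 | 0.95 | 31 | self-host GPU |
| m3e-large | 0.81 | 0.89 | 24 | self-host GPU |
| bce-embedding-base_v1 | 0.79 | 0.87 | 18 | self-host GPU |
| text-embedding-3-large | 0.85 | 0.93 | 180-400 | ~$0.13 |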
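
For the hosted option, the per-token price turns into a bill with simple arithmetic. The sketch below uses the ~$0.13 per 1M tokens figure from the table; the average of 300 tokens per entry is a made-up placeholder, not a measurement from this benchmark, and self-hosted GPU cost is not modelled at all.

```python
def embedding_cost(num_tokens, price_per_million=0.13):
    """Estimated API cost in USD for embedding num_tokens tokens."""
    return num_tokens / 1_000_000 * price_per_million

# Corpus size from the setup: 3,000 FAQ entries + 500 product docs.
NUM_DOCS = 3_000 + 500
# Hypothetical average length; replace with a real tokenizer count.
ASSUMED_TOKENS_PER_DOC = 300

corpus_tokens = NUM_DOCS * ASSUMED_TOKENS_PER_DOC
print(f"Indexing the corpus once: about ${embedding_cost(corpus_tokens):.2f}")
```

Under these assumptions a one-off index of the corpus costs well under a dollar, so for a knowledge base of this size the recurring query volume matters more than the initial indexing.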

Recommendations

| Scenario | Pick |
| --- | --- |
| Default Chinese | bge-m3: most balanced |
| Max accuracy | Conan-embedding-v1: edges out bge-m3 |
| Tight budget | bce-embedding-base_v1: light and fast |
| Mixed languages | bge-m3 or text-embedding-3-large |
| No self-hosting | OpenAI is easiest |

Findings

  1. bce-embedding + the matching reranker closes the gap: alone it has the lowest MRR@5 of the five (0.79), but pairing it with bce-reranker-base_v1 lifts the combined score to 0.88, on par with bge-m3 (0.87) and just behind Conan-embedding-v1 (0.89)
  2. OpenAI shines on language mixing: when English appears inside Chinese text, OpenAI stays steady while the open models can dip
  3. Dimensions aren’t everything: 3072-dim text-embedding-3-large (MRR@5 0.85) loses to 1024-dim bge-m3 (0.87)
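
The embedding-plus-reranker pairing in finding 1 is a two-stage pipeline: a cheap first pass narrows the corpus to a shortlist, then a slower reranker reorders only that shortlist. A minimal sketch, assuming both stages are exposed as scoring functions of the form `score(query, doc) -> float`. The function name, the parameter names, and the `recall_k=20` default are illustrative and not taken from this benchmark or from any bce library API.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[str, str], float]

def retrieve_then_rerank(query: str,
                         docs: Sequence[str],
                         embed_score: Scorer,
                         rerank_score: Scorer,
                         recall_k: int = 20,
                         final_k: int = 5) -> List[int]:
    """Return indices of the final_k best docs after a two-stage ranking."""
    # Stage 1: rank the whole corpus with the cheap embedding similarity.
    shortlist = sorted(range(len(docs)),
                       key=lambda i: embed_score(query, docs[i]),
                       reverse=True)[:recall_k]
    # Stage 2: reorder only the shortlist with the more expensive reranker.
    reranked = sorted(shortlist,
                      key=lambda i: rerank_score(query, docs[i]),
                      reverse=True)
    return reranked[:final_k]
```

The reranker can only promote documents that survive stage 1, so `recall_k` should be comfortably larger than `final_k`.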

How to plug into Dify

# Dify → Settings → Model Provider → Add
# Choose Hugging Face or SiliconFlow
# Model: BAAI/bge-m3
# Endpoint: https://api.siliconflow.cn/v1
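
Before wiring the model into Dify, it can help to check the endpoint directly. The sketch below assumes the SiliconFlow endpoint accepts an OpenAI-style `POST /embeddings` request (JSON body with `model` and `input`, bearer-token auth) and returns vectors under `data[i].embedding`; confirm that against the provider's documentation. `SILICONFLOW_API_KEY` is a hypothetical environment variable name.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.siliconflow.cn/v1"
MODEL = "BAAI/bge-m3"

def build_embedding_request(texts, api_key, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) an OpenAI-style embeddings request."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def embed(texts, api_key):
    """Send the request and return one vector per input text.

    Assumes the response follows the OpenAI embeddings schema.
    """
    with urllib.request.urlopen(build_embedding_request(texts, api_key)) as resp:
        payload = json.load(resp)
    return [item["embedding"] for item in payload["data"]]

if __name__ == "__main__":
    key = os.environ.get("SILICONFLOW_API_KEY")  # hypothetical variable name
    if key:
        vectors = embed(["如何申请退款？"], key)
        print(len(vectors[0]))  # vector dimension reported by the API
```

If the call succeeds, the same model name and endpoint should be usable in the Dify provider form.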
