Tag: benchmark
3 posts tagged "benchmark".
Wed Apr 08 2026 08:00:00 GMT+0800 (China Standard Time)
How to evaluate your RAG quality — 5 metrics, 3 toolkits
"Feels better than last week" is not measurement. Five quantitative RAG metrics and three open-source evaluators.
Tue May 12 2026 08:00:00 GMT+0800 (China Standard Time)
2026 Chinese embedding benchmark — bge-m3, Conan, m3e, bce, OpenAI
Same Chinese KB, same real support questions — five embedding models compared on retrieval accuracy, speed and cost.
Fri May 08 2026 08:00:00 GMT+0800 (China Standard Time)
2026 Chinese-support LLM bake-off — Qwen, DeepSeek, GLM, Doubao, ERNIE
Same support prompt and knowledge base — which of the five China-trained LLMs ships the best AI support? 200 real questions decide.