29 core terms

AI support glossary

Core terms in the open-source AI support space — RAG, Embedding, MCP, Reranker, Faithfulness and more.

Core concepts 10

RAG Retrieval-Augmented Generation: Retrieval-Augmented Generation. Recall relevant chunks from a knowledge base via vector or keyword search, then have the LLM answer using those chunks. Nearly all AI support stacks use it.
Embedding 词向量 / 文本嵌入: Mapping text into a high-dimensional vector space where semantically similar texts are close. For Chinese, bge-m3 and Conan-embedding are common.
Vector DB 向量数据库: A database optimized to store vectors and search by similarity. Common: Weaviate, Milvus, Qdrant, LanceDB, pgvector.
Reranker 重排序模型: Model that re-scores initial retrieval results. Significantly boosts top-1 accuracy. For Chinese, bge-reranker and bce-reranker are common.
MCP Model Context Protocol: A protocol from Anthropic (2024) standardizing how LLMs interact with external tools and data sources. Supported by Dify, Open WebUI, LibreChat.
Function Calling 函数调用: The LLM's ability to decide which external function to call based on user input and return structured arguments.
Agent 智能体: An LLM application pattern where the model plans, calls tools, and reflects on results autonomously — versus simple Q&A. Dify, LangGraph, AutoGPT are Agent frameworks.
Workflow 工作流: A graph-style orchestration of multi-step logic (branches, loops, HTTP calls). Dify, FastGPT and n8n all ship workflow features.
LLM Large Language Model: Large Language Model. Common LLMs discussed here: GPT, Claude, Qwen, DeepSeek, GLM, ERNIE, Doubao.
Prompt 提示词: Instruction text sent to an LLM. Support prompts typically include role, style, constraints, and KB context.

RAG 3

Chunking 分块: Splitting long documents into LLM-friendly chunks. Strategies include character count, paragraph, heading, QA-split, parent-child.
Top-k: Returning the top-k most relevant results during retrieval. Typically k=3-5 for support.
GraphRAG: A RAG variant that extracts knowledge into a graph before retrieval. Strong on entity-rich domains like finance and healthcare. Built into RAGFlow.

Evaluation 4

MRR Mean Reciprocal Rank: Mean Reciprocal Rank. Evaluates retrieval — the earlier the first correct answer appears, the higher the score.
Recall@k: The fraction of all relevant items found within the top-k results.
Faithfulness 忠实度: Whether the answer is grounded entirely in retrieved content (no hallucination). The key generation-side metric.
Hallucination 幻觉: LLM-generated content that is factually wrong or unsupported. Support AI must be tightly constrained to reduce hallucinations.

Operational KPIs 5

Deflection Rate 自助解决率: Share of conversations resolved by AI without human handoff. A core AI-support KPI, typically targeted 45-70%.
CSAT Customer Satisfaction: Customer satisfaction rating (1-5). Collected post-conversation; track AI leg and human leg separately.
AHT Average Handling Time: Average handling time per conversation. Drops noticeably after AI is introduced.
FRT First Response Time: First Response Time. The interval from a user's first message to the first reply. AI support can bring it to seconds.
SLA Service Level Agreement: Service Level Agreement — committed FRT, resolution times etc. Chatwoot supports tiered SLAs per customer.

Platform features 4

Inbox: Chatwoot's channel abstraction. Every channel (web, email, WhatsApp) is an Inbox with team / auto-assignment / SLA config.
Captain: Chatwoot's built-in AI module. Replies from a KB. Officially pushed since 2024.
Agent Bot: Chatwoot's external AI hook. Sends messages to an external service (like Dify) via webhook for replies.
Pipelines: Open WebUI's Python middleware. Lets you inject business logic before / after the LLM call.

Infrastructure 3

Ollama: Local LLM inference engine — the most popular open-source OpenAI-API drop-in. Mac / Linux / Windows.
vLLM: High-throughput LLM inference framework. Production-grade; much higher throughput than Ollama.
OpenAI-compatible API: API compatible with OpenAI's schema. Many open-source models / providers offer it (DeepSeek, SiliconFlow, Ollama) — trivial to swap.