Published Fri May 01 2026 08:00:00 GMT+0800 (China Standard Time)
case study · finance · compliance · on-prem

Case · A securities firm shipped fully local AI support: the compliance path

No data egress, regulator approval, internal + external audit, DR plan. The real compliance journey for a mid-sized securities brokerage's local AI support.

Background#

  • Business: mid-sized securities brokerage
  • AUM: ~¥120B
  • Onboarded customers: ~1.2M
  • Support agents: 85 (3 cities)
  • Regulator: CSRC, with front-line supervision by the local CSRC office
  • Previous: human-only support + traditional call center vendor

Why AI now#

Four drivers:

  1. Support pressure: market-open peak hits 800+ inquiries / minute
  2. Audit burden: existing trails are too coarse; annual external audit takes 8-12 weeks
  3. Training cost: new agents take 3 months to ramp; 70% retention
  4. Peer pressure: top brokerages launched AI; customers notice

Blocker: no cloud LLM allowed. CSRC requires that customer dialogue stay on-shore.

Selection#

Evaluated 5 directions:

| Option | Decision |
|---|---|
| Cloud LLM | × Compliance dead end |
| Outsource to a top-3 brokerage’s IT | × Data crosses firm boundaries + opacity |
| Build a dialogue model in-house | × 12-18 months, capability gap |
| On-prem open-source LLM + framework | ✓ Chosen |
| On-prem commercial LLM | ✓ but 3-5× more expensive |

Final stack: Rasa CALM + Chatwoot + RAGFlow + Qwen 2.5-72B on-prem (a finance variant of the fully local solution).

Architecture#

Figure (request flow):

  • Authenticated customer → Web / App → SSO · corporate IDP → Chatwoot Inbox (authn / anon split) → Rasa CALM
  • Procedural (Balance / Holdings / Close) → strict state machine, no LLM
  • Free-form Q&A → RAGFlow retrieval (public KB only) → Qwen 2.5-72B generation (8 × A100 80GB · vLLM)
  • Complex / complaint → Human agent
  • All three paths → audit → WORM storage + blockchain attestation

Infrastructure: inference on 8 × A100 80GB (vLLM); Postgres HA + off-site backup; logs to WORM storage + blockchain attestation.
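To make the "WORM storage + attestation" idea concrete, here is a minimal sketch of a hash-chained audit log. It is an illustration only, not the firm's implementation: the record layout, `GENESIS`, `append_record`, and `verify_chain` are all hypothetical names, and a real deployment would anchor the latest hash in whatever attestation service is actually used.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the chain (hypothetical)


def _digest(prev_hash, event):
    """Hash the previous record's hash together with a canonical JSON payload."""
    payload = json.dumps(event, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain, event):
    """Append an event, linking it to the hash of the previous record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"event": event, "prev_hash": prev_hash, "hash": _digest(prev_hash, event)}
    chain.append(record)
    return record


def verify_chain(chain):
    """Recompute every hash; False means a record was altered, dropped, or reordered."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["event"]):
            return False
        prev_hash = record["hash"]
    return True


log = []
append_record(log, {"session": "s1", "route": "procedural"})
append_record(log, {"session": "s1", "route": "human"})
print(verify_chain(log))  # True
```

Because each record commits to the one before it, an auditor only needs the most recent hash to detect edits anywhere earlier in the log.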

Compliance journey (11 months)#

Month 1-2 — kickoff#

  • IT presents proposal
  • Risk, compliance, legal review
  • CRO signs off

Month 3-4 — architecture justification#

  • Model selection rationale (why Qwen)
  • Security architecture (traffic, permissions, encryption)
  • DR architecture (3 sites, 5 replicas)

Month 5 — pre-regulator chat#

  • Informal contact with the local CSRC office
  • Submitted “AI System Plan,” “Data Security Statement,” and “Compliance Undertaking”
  • Received a “no objection” response

Month 6-7 — build#

  • Hardware in place (A100 × 8, PG HA, network)
  • Base deploy
  • Fine-tune Qwen on internal corpus (~3 weeks)

Month 8 — internal UAT + audit#

  • Risk team observes a run of 1,000 test cases
  • 12 edge issues found (e.g. “recommend a stock”); all gated by Rasa flows
  • Internal audit clears

Month 9 — regulator on-site#

  • CSRC office on-site for 3 days
  • Focus: data egress, audit log completeness, kill switch
  • 2 findings: log retention < 5 years, no quarterly pen test
  • 4 weeks of remediation

Month 10 — external audit + 3rd-party security#

  • MLPS Level 3 (China's Multi-Level Protection Scheme, grade 3) passed
  • Third-party pen test passed
  • ISO 27001 recert passed

Month 11 — go-live#

  • Canary: 1% authenticated customers
  • Week 2: 5%
  • Week 3: 20%
  • Week 4: 100%
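A staged rollout like this is usually driven by a stable per-customer bucket, so that a customer who enters the canary stays in it as the percentage grows. The sketch below assumes hash-based bucketing and treats the 1% canary as week 1; `ROLLOUT_SCHEDULE`, `bucket`, and `in_canary` are hypothetical names, not taken from the project.

```python
import hashlib

# Week -> percent of authenticated customers routed to the AI.
# Percentages come from the list above; treating the 1% canary as week 1 is an assumption.
ROLLOUT_SCHEDULE = {1: 1, 2: 5, 3: 20, 4: 100}


def bucket(customer_id):
    """Map a customer ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def in_canary(customer_id, week):
    """True if this customer is routed to the AI in the given rollout week."""
    return bucket(customer_id) < ROLLOUT_SCHEDULE[week]


print(in_canary("C000123", 4))  # True: week 4 covers every bucket
```

Since the bucket never changes, widening the percentage only adds customers; nobody drops out of the canary between weeks.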

6-month post-launch data#

| Metric | Before | After |
|---|---|---|
| Monthly conversations | 480k | 520k (slight rise) |
| AI deflection | 0% | 58% |
| First response | 47 s avg | 2.3 s avg |
| Human agent hours / mo | 13,600 | 7,200 |
| Agent headcount | 85 | 60 (25 rotated) |
| CSAT | 4.1 | 4.3 |
| Complaint rate | 0.21% | 0.18% |
| External audit duration | 8-12 weeks | 4 weeks (structured logs) |

Technical decisions#

1. Why Rasa, not Dify#

In financial conversations the bot can’t guess. Answering “Can I enable margin trading?” requires checking:

  • Customer tier sufficient?
  • Risk assessment sufficient?
  • Risk disclosure signed?
  • Funds account compliant?
  • All yes → guide to “enable” flow

Rasa’s Flows + LLM-for-NLU fits much better than Dify’s pure-LLM workflows.
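The margin-trading checklist above is a deterministic gate, which is why it suits a flow rather than free generation. The following is a plain-Python sketch of that gate, not Rasa flow syntax; `CustomerProfile`, `CHECKS`, and `margin_trading_next_step` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerProfile:
    """Hypothetical snapshot of the four preconditions from the checklist."""
    tier_sufficient: bool
    risk_assessment_sufficient: bool
    risk_disclosure_signed: bool
    funds_account_compliant: bool


# Checks run in a fixed order so every conversation takes the same path.
CHECKS = [
    ("tier_sufficient", "customer tier"),
    ("risk_assessment_sufficient", "risk assessment"),
    ("risk_disclosure_signed", "risk disclosure"),
    ("funds_account_compliant", "funds account"),
]


def margin_trading_next_step(profile):
    """Return ('start_enable_flow', None) or ('blocked', first failing check)."""
    for field_name, label in CHECKS:
        if not getattr(profile, field_name):
            return ("blocked", label)
    return ("start_enable_flow", None)


print(margin_trading_next_step(CustomerProfile(True, True, False, True)))
# ('blocked', 'risk disclosure')
```

The point of the design is that no model output can skip a check: the LLM may help understand the question, but the answer comes from the gate.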

2. Why local Qwen vs cloud GPT#

CSRC’s data-egress red line is strict:

  • OpenAI / Anthropic completely off-limits
  • Azure OpenAI China-edition is theoretically possible but compliance overhead is heavy
  • Domestic cloud LLM APIs (Alibaba, Tencent) are viable but data flows still need review

Local Qwen: zero egress, audit passes straight through.

3. Why no direct business-system access#

Too risky. An LLM that reads accounts, orders, or funds directly risks privilege escalation. Design:

  • LLM only sees “public KB” (rules, policy, flow)
  • Account data requires a business interface (with permission check)
  • Interface returns structured data → Rasa flow assembles the reply
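The three design points above can be sketched as a permission-checked interface that returns structured data, with a template (not the LLM) assembling the reply. Everything here is hypothetical: `Session`, `get_balance`, `render_balance_reply`, and the in-memory `_FAKE_BALANCES` stand in for the real business system.

```python
from dataclasses import dataclass


class PermissionDenied(Exception):
    """Raised when the caller may not read the requested account."""


@dataclass(frozen=True)
class Session:
    customer_id: str
    authenticated: bool


# Stand-in for the real business system (hypothetical data).
_FAKE_BALANCES = {"C001": "1250.00"}


def get_balance(session, account_owner_id):
    """Business interface: check permission first, then return structured data."""
    if not session.authenticated or session.customer_id != account_owner_id:
        raise PermissionDenied("caller may only read their own account")
    return {"customer_id": account_owner_id, "balance_cny": _FAKE_BALANCES[account_owner_id]}


def render_balance_reply(data):
    """Flow-side template: the reply is built from fields, never generated by the LLM."""
    return f"Your available balance is ¥{data['balance_cny']}."


session = Session(customer_id="C001", authenticated=True)
print(render_balance_reply(get_balance(session, "C001")))
# Your available balance is ¥1250.00.
```

The model never holds credentials or sees raw account rows, so a prompt injection has nothing to escalate with.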

Hardware + cost#

| Item | Spec | Monthly |
|---|---|---|
| A100 80GB × 8 | $120k upfront / 36-month amortization | ¥21k |
| GPU host × 2 | 64C/256G | ¥8k |
| Storage | 100TB (incl. WORM audit) | ¥6k |
| Network / security | Firewall + MLPS + monitoring | ¥5k |
| Software licenses | Rasa Pro + commercial support | ¥10k |
| 4-person ops team | | ¥80k |
| Total | | ¥130k / mo |

Sounds expensive, but 25 agents × ¥15k/mo = ¥375k/mo saved. Net ¥245k/mo saving.
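For anyone who wants to rerun the numbers with their own figures, the arithmetic is spelled out below. All values are the ones quoted in this section; only the variable names are made up.

```python
# Monthly cost lines from the table above, in CNY.
monthly_costs_cny = {
    "A100 80GB x 8 (amortized)": 21_000,
    "GPU host x 2": 8_000,
    "Storage": 6_000,
    "Network / security": 5_000,
    "Software licenses": 10_000,
    "4-person ops team": 80_000,
}

total_cost = sum(monthly_costs_cny.values())

# 25 rotated agents at ¥15k per month each.
agent_savings = 25 * 15_000

net_saving = agent_savings - total_cost

print(total_cost)     # 130000
print(agent_savings)  # 375000
print(net_saving)     # 245000
```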

Intangible wins#

  • Audit efficiency: external audit 8-12 weeks → 4 weeks, ~600 hours saved annually
  • Training: agents ramp 3 months → 6 weeks (AI catches misses)
  • Brand: cited by regulator as “digitization exemplar”
  • New business support: launching ETF options support went from 2 weeks of training to 3 days

Scars#

  1. Compliance review rejected the first prompt: it lacked a “not investment advice” disclaimer, so every prompt was rewritten
  2. Qwen 72B errs on compound interest math — switched to Rasa calling an external compute service
  3. Rasa Flows are heavier to maintain than expected — when business changes, Flow edits cost more than prompt edits; team stabilized at month 4
  4. A100 utilization is low — 30-40% daytime, idle at night; considering training jobs to soak it up
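On scar 2, the kind of calculation that is better handed to a deterministic service looks like the sketch below. It assumes the standard compound-interest formula with decimal arithmetic; `compound_amount` and its parameters are hypothetical and say nothing about how the firm's actual compute service is built.

```python
from decimal import Decimal, ROUND_HALF_UP


def compound_amount(principal, annual_rate, periods_per_year, years):
    """Future value under periodic compounding, rounded to 0.01.

    principal and annual_rate are passed as strings (for example "10000", "0.05")
    so that no binary floating-point error enters the calculation.
    """
    rate_per_period = Decimal(annual_rate) / Decimal(periods_per_year)
    periods = periods_per_year * years
    amount = Decimal(principal) * (Decimal(1) + rate_per_period) ** periods
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(compound_amount("10000", "0.05", 1, 2))  # 11025.00
```

A flow can call a function like this and drop the result into a reply template, so the model never does the arithmetic itself.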

Advice for regulated peers#

If you’re in a regulated industry, 4 cautions:

  1. Talk to the regulator early — start the conversation 2 months before build
  2. Design compliance architecture first — data flows, permission matrix, audit needs before any code
  3. Don’t chase the newest model — pick stable + commercially supported (Qwen + Rasa Pro)
  4. Budget compliance time — 2-3 months of approvals after technical readiness; build it into the plan
