CodePay · AI Customer Support · 0→1
I find workflow friction
and ship AI tools that remove it.
A 0→1 AI support initiative at a B2B in-person-payments company. I reframed an open-ended “use AI to cut support load” brief into a staged, de-risked system: scope what the AI may answer, give humans a knowledge base the AI cites, and grow from an internal tool toward a partner-facing agent.
Role
Product Builder + PM
Scope
Problem reframing · Workflow tooling · AI product judgment
Built with
Claude · Claude Code · Markdown · Lark Bitable
Status
Shipped assets · system piloting
Piloting, with a couple of small shipped steps that already cut load and seed the knowledge base. No invented metrics.
I reframed ‘build a support bot’ into a question-intelligence system.
“The structured record of what partners struggle with was as valuable as the deflection itself.”
The natural framing was “build an AI that answers customer questions.” My first useful move was noticing I’d bundled three different products into one — an internal tool (now), a partner-facing product (later), and a fully-automated agent (someday) — with completely different users, risks, and success criteria. I focused everything on the internal tool and treated the rest as extensions, not requirements. Then I flipped the approach from knowledge-first (dump every doc into the AI) to problem-first: build the knowledge base from the few dozen questions partners actually ask.
Layer 1 · Now
Internal AI support tool
Our own ops team. Low risk, shippable now. The deliberate focus.
Layer 2 · Later
Productized, partner-facing
Sell it to other companies. A natural extension — not a requirement.
Layer 3 · Someday
Action-taking agent
Opens & checks configs inside the platform. Vision, not scope.
Leverage · Scope to Layer 1 (the internal tool) only; treat the rest as extensions, not requirements.
Trade-off · Less ambitious on day one — but Layer 1 builds the knowledge asset every later stage depends on, so it's never throwaway.
Automate 80%. Never automate the one step where a wrong answer costs money.
“I ran the loop by hand once to learn its shape.”
I ran the loop by hand once to learn its shape. About 80% is automatable: collect (Lark + WeChat), cluster, dedupe, diff against the current FAQ, flag, and route to the right SME. The remaining ~20%, the authoritative answer (e.g. “Does EBT work on this processor?”), is a human knowledge gate the AI routes to but never invents.
~80% automatable
the loop
- 01 Collect (Lark + WeChat)
- 02 Cluster & dedupe
- 03 Diff vs. current FAQ
- 04 Flag new / changed / conflicting
- 05 Route each flag to the right SME
~20% human gate
by design
The authoritative answer.
Does EBT work on this processor? When does the warranty clock start? A confidently wrong answer here costs a partner real money — so the AI routes to a human, and never invents.
regenerate
Verified answer flows back into the deliverable. Re-run, ship.
Leverage · Keep the human gate by design — automate the 80%, make the human step faster and unskippable.
Trade-off · The tool can't dissolve the gate; accepting that is exactly what stops the system shipping confident, wrong answers.
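The automatable part of the loop can be sketched in a few lines. This is a minimal illustration, not the tool itself: the `Question` shape, the text-normalization key used for deduping, and the `SME_BY_TOPIC` routing table are all assumptions invented for the example. It covers only the "new" flag; spotting changed or conflicting entries would also need the answers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    source: str  # "lark" or "wechat"
    topic: str   # hypothetical topic tag, e.g. "ebt"
    text: str


# Hypothetical routing table: topic -> subject-matter expert.
SME_BY_TOPIC = {"ebt": "payments-sme", "warranty": "hardware-sme"}


def normalize(text: str) -> str:
    """Crude dedupe key: lowercase, drop punctuation, collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def run_loop(collected: list[Question], faq: dict[str, str]) -> list[dict]:
    """Cluster and dedupe, diff against the FAQ, flag, and route.

    Returns one flag per question not yet in the FAQ. Nothing here
    writes an answer: that stays with the human gate.
    """
    faq_keys = {normalize(q) for q in faq}
    clusters: dict[str, list[Question]] = {}
    for q in collected:
        clusters.setdefault(normalize(q.text), []).append(q)

    flags = []
    for key, members in clusters.items():
        if key in faq_keys:
            continue  # already covered by the current FAQ
        first = members[0]
        flags.append({
            "question": first.text,
            "count": len(members),
            "status": "new",
            "route_to": SME_BY_TOPIC.get(first.topic, "support-lead"),
            "answer": None,  # filled in by a human, never by the AI
        })
    return flags
```

The `answer` field is deliberately left empty: the sketch stops at routing, which is where the human gate begins.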
Human-in-the-loop here isn’t a limitation to engineer away. It’s the design.
The system wins by knowing when not to answer.
“A confidently wrong answer about payment setup costs a partner real money.”
I classified every question into four types and let the AI confidently answer only Type-A (safe, high-volume how-to); bugs route to an owner, account-specific questions get collected and routed, feedback becomes product signal. So I stopped optimizing for “accuracy” and started optimizing for deflection + wrong-answer rate. Underneath sits a three-layer model: raw material → a structured, human-maintained table → AI working knowledge that cites its source entry on every reply (I reasoned my way to RAG before I knew the term). When an answer is wrong, I ask which entry it cited, fix that row, reload.
Config / How-to
AI answers
The safe, high-volume sweet spot.
Bug / error
Route to owner
AI doesn't fix — it hands off.
Account-specific
Collect & route to human
AI never guesses; gathers info.
Feedback / request
Capture as product signal
Becomes intelligence, not noise.
Leverage · Type-A only, with citations as the maintenance handle.
Trade-off · Narrower coverage day one, but a wrong answer about payment setup costs real money — so the boundary is the point.
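The four-type boundary can be written down as a small routing table. This sketch assumes the question has already been classified; the enum names and action strings are hypothetical, and the extra rule that a Type-A question without a backing knowledge-base entry gets escalated is an assumption of the example.

```python
from enum import Enum


class QuestionType(Enum):
    HOW_TO = "config / how-to"      # Type-A: the only type the AI answers
    BUG = "bug / error"
    ACCOUNT = "account-specific"
    FEEDBACK = "feedback / request"


# One action per type, mirroring the four types described above.
ACTION_BY_TYPE = {
    QuestionType.HOW_TO: "ai_answers",
    QuestionType.BUG: "route_to_owner",
    QuestionType.ACCOUNT: "collect_and_route_to_human",
    QuestionType.FEEDBACK: "capture_as_product_signal",
}


def decide(question_type: QuestionType, has_cited_entry: bool) -> str:
    """Return the action for a classified question.

    Assumption of this sketch: no citation, no answer. A Type-A question
    with no knowledge-base entry behind it goes to a human instead.
    """
    action = ACTION_BY_TYPE[question_type]
    if action == "ai_answers" and not has_cited_entry:
        return "collect_and_route_to_human"
    return action
```

Keeping the mapping in a plain table makes the boundary easy to audit: widening what the AI may answer is a one-line, reviewable change.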
Raw material
Chats, screenshots, machine-screen photos — multimodal input.
Structured, human-maintained table
The only layer humans maintain. Version, last-verified date, owner. Fix the table, not the AI.
AI working knowledge
Reads from the table to answer — and cites the source entry on every reply.
Closed correction loop
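A minimal sketch of the table layer and the correction loop, assuming a row shape built from the fields named above (version, last-verified date, owner). The `KBEntry` and `Reply` types, the ID format, and the function names are hypothetical, not the actual Lark Bitable schema.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class KBEntry:
    entry_id: str        # hypothetical ID format, e.g. "KB-001"
    question: str
    answer: str
    version: int
    last_verified: date
    owner: str


@dataclass
class Reply:
    text: str
    cited_entry_id: str  # every reply carries its source entry


def reply_from(entry: KBEntry) -> Reply:
    """Answer only from a table row, and cite that row."""
    return Reply(text=entry.answer, cited_entry_id=entry.entry_id)


def correct(table: dict[str, KBEntry], reply: Reply,
            fixed_answer: str, verified_on: date) -> KBEntry:
    """Correction loop: find the cited row, fix it, bump the version."""
    entry = table[reply.cited_entry_id]
    entry.answer = fixed_answer
    entry.version += 1
    entry.last_verified = verified_on
    return entry
```

Because the reply carries the entry ID, a wrong answer points straight at the row to fix; the next reply built from that row picks up the correction without touching the AI.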
Deliberate constraints
Build vs buy
Start far smaller than the market says
Intercom Fin, Sierra, Decagon, and Ada are built for thousands of customers across industries. We serve a few hundred in one vertical — most of that complexity would be a burden, not an asset.
Channel = context
Answer on our own surface
In-platform, the AI already knows the partner, their device, and recent activity — the biggest accuracy multiplier. The same question gets a precise answer in-platform and a vague one in a group chat.
Tooling
Workflow before agents
A deterministic retrieve → answer → escalate flow is enough for Type-A. Full autonomous agents are for later, where genuine open-ended tool use is actually needed.
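The retrieve → answer → escalate flow is small enough to show whole. In this sketch, keyword matching stands in for whatever retrieval is actually used, and the `KB` contents, entry ID, and return shape are placeholders made up for illustration.

```python
from typing import Optional

# Hypothetical knowledge base: entry ID -> (required keywords, answer text).
KB = {
    "KB-001": ({"reprint", "receipt"},
               "Placeholder answer for reprinting a receipt."),
}


def retrieve(question: str) -> Optional[str]:
    """Step 1: return the ID of the first entry whose keywords all appear."""
    words = {w.strip("?.,!").lower() for w in question.split()}
    for entry_id, (keywords, _answer) in KB.items():
        if keywords <= words:
            return entry_id
    return None


def handle(question: str) -> dict:
    """Steps 2 and 3: answer with a citation, or escalate to a human."""
    entry_id = retrieve(question)
    if entry_id is None:
        return {"action": "escalate", "answer": None, "cited": None}
    return {"action": "answer", "answer": KB[entry_id][1], "cited": entry_id}
```

There is no loop, planner, or tool call here: the same question always takes the same path, which is what makes the flow easy to test and to trust for Type-A questions.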
I can’t build the whole service in a day — but I can ship something that cuts load now.
“Each step relieves a little support pressure today, and feeds the knowledge asset every later stage depends on.”
Two deliberately small first steps. (1) A client-facing FAQ: I consolidated scattered internal Q&A into one clean, re-runnable FAQ and got it to partners; it deflects repetitive questions immediately, and its structured content becomes the first rows of the knowledge base. (2) A KB-template draft I’m actively using to collect raw support material into structured rows, the layer the architecture reads from. The wider system is piloting; the defensible output so far is the thinking made concrete, with no invented metrics.
Before · manual
- Two source FAQs, maintained separately
- Merge & dedupe by hand, every time
- Conflicts found by chance, if at all
- One-off output — stale the moment a doc changes
After · the FAQ accelerator
- One structured content model as the source
- Re-run on demand to regenerate the deliverable
- Conflicts surfaced & flagged automatically
- Client-ready HTML + PDF, re-runnable on every change
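To make the before/after concrete, here is a rough sketch of the two moves the accelerator automates: merging two source FAQs while surfacing conflicts, and regenerating the HTML from the merged model. It is an illustration under assumptions, not the actual generator: exact-match question keys, the function names, and the `<details>`-based markup are all invented for the example.

```python
from html import escape


def merge_faqs(source_a: dict[str, str], source_b: dict[str, str]):
    """Merge two question -> answer maps and list conflicting questions.

    On a conflict the merged model keeps source A's text until a human
    resolves it; the conflict is flagged, never auto-resolved.
    """
    merged, conflicts = dict(source_a), []
    for question, answer in source_b.items():
        if question in merged and merged[question] != answer:
            conflicts.append(question)
        else:
            merged[question] = answer
    return merged, conflicts


def render_html(faq: dict[str, str]) -> str:
    """Render collapsible Q&A, one <details> element per question."""
    items = [
        f"<details><summary>{escape(q)}</summary><p>{escape(a)}</p></details>"
        for q, a in faq.items()
    ]
    return "\n".join(items)
```

Re-running both functions after any source change regenerates the deliverable, so the output is never a one-off.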
The shipped output
A client-ready FAQ — search, categories, collapsible Q&A
Generated from the structured model, not hand-built. This is the actual artifact — open it.
The raw partner questions these steps organize — and the first-draft table that structures them.
/images/codepay-support/wechat-questions.png
/images/codepay-support/lark-questions.png
/images/codepay-support/kb-template.png
The call I’m deciding now: meet partners in chat, or centralize in the platform?
“I’d rather reduce the friction of the right architecture than accept the wrong one because it’s frictionless.”
Where should support actually live? Meeting partners in the chat groups they already use is the lowest-friction path — but it’s costly to operate, hard to scale, and the data is messy and ToS-risky to capture. Centralizing in our ops platform gives clean, owned, structured data, real assignment and accountability, and the knowledge base every later stage depends on — at the cost of some login friction. My team leans platform; leadership leans chat. I’m holding it open on purpose — because it should be settled by users, not by whoever argues hardest.
Option A · meet them in chat
Answer inside the chat groups partners already use
- + Lowest friction — partners are already there
- − High ops cost; hard to scale across groups
- − Messy, ToS-risky capture (no clean WeChat API)
Option B · centralize in the platform
my lean
One front door inside our ops platform
- + Owned, structured, accurate data
- + Assignment, accountability — and it feeds the KB
- − Login friction, mitigated by mobile-first, one-tap auth
My lean · Make the platform the system of record — the compounding, owned asset — and kill the friction objection with mobile-first, one-tap auth, rather than concede to scattered chats.
What resolves it · Deliberately deferred — I’d settle the front-door question with customer interviews on the friction hypothesis and real adoption + ops-cost data, not a meeting-room opinion.
Takeaways
Scope discipline is the skill
Type-A only, no enterprise-platform copying, a no-engineering first version. Shrinking the MVP is what keeps AI projects from dying in over-design.
Ship the tool, then defend the gate
I built a re-runnable accelerator and then refused to automate the one step where a wrong answer costs money.
The byproduct can be the product
Reframing 'support bot' into 'question-intelligence system' surfaced a stream of product signal that may be the more valuable half early on.
How this was built. The knowledge docs, the FAQ generator, and this case-study page were all built with Claude Code and Markdown-based workflows — the same AI-augmented method the support tool itself is designed around.
Thanks for reading