
How We Cut Voice AI Onboarding From 10 Hours to 90 Minutes Per Client (Template Inside)

By Alfredo Romero, CEO, Hermes · May 16, 2026 · 13 min read

Voice AI agencies lose an average of 14 hours per client just collecting information and configuring agents, per the 2026 onboarding research published by GrowwStacks. Most of that time is not skill work. It is form-filling, re-uploading documents, retyping FAQs into a fresh agent, switching browser tabs between five tools, and writing the same system prompt for the seventh time. The 90-minute template compresses all of that into one workspace clone, one structured intake form, one website-scrape ingest, one prompt personalization pass, and one live validation call. The compression comes from removing duplicate work, not from cutting corners. A2P 10DLC submission and any number port-in still run on their own calendar windows, but they happen in parallel and do not block the agent going live. This post is the exact template, the order of operations, the intake-form schema, the parallel-track A2P plan, and the audit we run on the first month of live calls. Pull it once. Use it every client. Onboard four in a day.

I have run this template on every voice AI client I have touched in the last 90 days, including the five operators currently on the Founders' Beta. The pattern holds. The first client takes 3 hours because the canonical snapshot does not exist yet. Clients 2 through 50 take 90 minutes. The two-week break-even on the snapshot build is the single highest-ROI operator move in an agency's first 90 days.

By builders, for builders. Everything below is the working version of what we run inside Hermes, plus the upstream references we used to validate that the time savings are real and not just a single-operator anecdote.

Where do voice AI agencies actually lose 10 hours per client?

Time-tracking data from the last 12 voice-AI onboardings I ran with operators on the beta lands within 10% of the industry numbers. Trillet's 2026 knowledge-base training research puts manual knowledge base entry alone at "2 to 4 hours per client" when it is typed from scratch, and at 4 to 6 hours when the client also lacks structured FAQs. Cekura's 2026 Retell vs Vapi build-time analysis measures a first-time Vapi build at "20 to 60 hours" and a first-time Retell build at "8 to 20 hours" before the agent hits a single live call. Subsequent builds compress, but only if the operator templatizes.

"82% of voice AI agencies report losing clients during onboarding due to delays and miscommunication. Missing information creates back-and-forth that delays launch and frustrates clients." [GrowwStacks, Voice AI Client Onboarding 2026]

The 14 hours of average operator time break out roughly like this across the books I have audited.

Onboarding step	Unstructured time	Template time
Workspace setup, billing, branding	60 to 90 minutes	8 minutes (snapshot clone)
Discovery and information collection	2 to 3 hours	10 minutes (intake form review)
Knowledge base load	2 to 4 hours	15 minutes (website scrape)
System prompt and conversation states	2 to 3 hours	20 minutes (personalization pass)
Integrations and tool calls	1 to 2 hours	0 minutes (pre-wired in snapshot)
Sandbox call validation	45 to 60 minutes	12 minutes (5 scripted calls)
Production cutover and first live call	45 to 60 minutes	15 minutes (3-step flip)

Summing the rows above, the unstructured workflow totals 9.5 to 15.5 hours. The template workflow totals 80 minutes plus a 10-minute buffer. The compression is not magic. It is the disappearance of repetitive work that did not need to be done a second time.
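If you want to check the totals yourself, summing the table rows is a two-minute exercise. The sketch below copies the minute ranges from the table as written. The dictionary and function names are mine for illustration, not part of any Hermes tooling.

```python
# Minutes per step, copied from the table above as
# (unstructured_low, unstructured_high, template).
STEPS = {
    "Workspace setup, billing, branding": (60, 90, 8),
    "Discovery and information collection": (120, 180, 10),
    "Knowledge base load": (120, 240, 15),
    "System prompt and conversation states": (120, 180, 20),
    "Integrations and tool calls": (60, 120, 0),
    "Sandbox call validation": (45, 60, 12),
    "Production cutover and first live call": (45, 60, 15),
}
BUFFER_MINUTES = 10  # the buffer quoted in the text, not a table row


def totals(steps):
    """Return (unstructured_low, unstructured_high, template) in minutes."""
    low = sum(row[0] for row in steps.values())
    high = sum(row[1] for row in steps.values())
    template = sum(row[2] for row in steps.values())
    return low, high, template


low, high, template = totals(STEPS)
print(f"unstructured: {low / 60:.1f} to {high / 60:.1f} hours")
print(f"template: {template} min + {BUFFER_MINUTES} min buffer "
      f"= {template + BUFFER_MINUTES} min")
```

Swap in your own time-tracking numbers per row to see where your workflow actually leaks hours.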

What does the 90-minute template actually look like?

The template has five files and one workspace primitive. You build them once, version-control them, and clone the workspace for every new client. The whole point of the template is that the cognitive load is zero on client number two onward.

1. The canonical agency workspace

The canonical workspace holds the agent skeleton, the system tool registry, the call-review queue presets, the post-call webhook payload, the billing template, the white-label brand fields wired to CSS variables, and the affiliate-aware payout splits. It does not hold any client-specific content. It is the empty kitchen, not the meal. The closest analog is the HighLevel SaaS snapshot model, where "the entire setup deploys automatically in under 10 minutes" because the snapshot already encodes the canonical configuration. Hermes workspaces work the same way. The clone takes 8 minutes end to end including the welcome email auto-firing to the new client.

2. The structured intake form

Eighteen questions, no more. Business legal name, DBA, EIN, service address, primary URL, three competitor URLs, top five FAQs, the three calls per week that the client most wants to capture and the three they want to deflect, the booking link if one exists, the existing CRM, the existing telephony, the brand voice description in three sentences, and the must-not-say list. The intake form gates everything downstream. Without it, the template breaks because the prompt-personalization pass has nothing to consume. The intake form also doubles as the SOW reference. Five minutes for the client to fill out, ten minutes for the operator to review.
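Because the form is structured data, the completeness check can be mechanical. Here is a minimal sketch of the intake record and a "what is missing" check. The field names, the grouping of the items listed above into fields, and the minimum counts are my assumptions for illustration. The text does not spell out how those items map onto the eighteen questions, so this does not try to.

```python
from dataclasses import dataclass, field, fields


@dataclass
class IntakeForm:
    # Hypothetical field names; one per item listed in the paragraph above.
    legal_name: str = ""
    dba: str = ""
    ein: str = ""
    service_address: str = ""
    primary_url: str = ""
    competitor_urls: list[str] = field(default_factory=list)
    top_faqs: list[str] = field(default_factory=list)
    calls_to_capture: list[str] = field(default_factory=list)
    calls_to_deflect: list[str] = field(default_factory=list)
    booking_link: str = ""  # "if one exists", so treated as optional here
    crm: str = ""
    telephony: str = ""
    brand_voice: str = ""
    must_not_say: list[str] = field(default_factory=list)


OPTIONAL = {"booking_link"}
# Minimum list lengths implied by "three competitor URLs", "top five FAQs", etc.
MIN_ITEMS = {
    "competitor_urls": 3,
    "top_faqs": 5,
    "calls_to_capture": 3,
    "calls_to_deflect": 3,
}


def missing_fields(form):
    """Return the names of required fields that are empty or too short."""
    missing = []
    for f in fields(form):
        if f.name in OPTIONAL:
            continue
        value = getattr(form, f.name)
        if isinstance(value, list):
            if len(value) < MIN_ITEMS.get(f.name, 1):
                missing.append(f.name)
        elif not value.strip():
            missing.append(f.name)
    return missing


form = IntakeForm(legal_name="Example Plumbing LLC", primary_url="https://example.com")
print(missing_fields(form))  # names to put in the follow-up email
```

An empty list from `missing_fields` is the signal that the 90-minute clock can start.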

3. The website-scrape knowledge-base ingest

Hermes ingests the client's primary URL plus a depth-3 crawl, strips boilerplate, deduplicates near-duplicate paragraphs, and writes the resulting chunks into the workspace knowledge base. Coverage is usually 85 to 95% of the questions the agent will actually field on the first day. The remaining 5 to 15% gets backfilled from the intake form's top-five-FAQs field. Knowledge base loading drops from the industry baseline of 2 to 4 hours of typing to a 15-minute ingest plus review. Trillet's research notes the same effect: "Website scraping is much faster, taking only 5-10 minutes with minimal client involvement."
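The near-duplicate step is the part operators most often ask about. The sketch below shows one simple way to do it with the Python standard library, using `difflib.SequenceMatcher` on normalized paragraphs. This is an illustration under my own assumptions (the 0.9 threshold, the function names), not a description of how the Hermes ingest is implemented.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Collapse whitespace and lowercase so trivial differences do not count."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe_paragraphs(paragraphs, threshold=0.9):
    """Keep the first occurrence of each paragraph; drop near-duplicates.

    Two paragraphs count as near-duplicates when their similarity ratio
    is at or above `threshold` (an assumed value, tune it on real pages).
    """
    kept, kept_normalized = [], []
    for paragraph in paragraphs:
        candidate = normalize(paragraph)
        if not candidate:
            continue  # skip empty paragraphs
        is_duplicate = any(
            SequenceMatcher(None, candidate, seen).ratio() >= threshold
            for seen in kept_normalized
        )
        if not is_duplicate:
            kept.append(paragraph.strip())
            kept_normalized.append(candidate)
    return kept


pages = [
    "We repair furnaces in Austin.",
    "We  repair furnaces in Austin!",  # same sentence from a second page
    "Open Monday to Friday.",
    "",
]
print(dedupe_paragraphs(pages))
```

The pairwise comparison is quadratic in the number of paragraphs, which is tolerable for a single small-business site. A larger crawl would likely want a shingling or hashing approach instead.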

4. The prompt-personalization pass

The canonical prompt is a Jinja-style template with named variables for business name, voice persona, must-not-say list, booking flow, and escalation policy. The personalization pass is a 20-minute operator block that maps intake-form answers to prompt variables, runs the substitution, and previews the first three turns of dialogue against the new agent in the sandbox. The prompt itself never gets rewritten from scratch. We have rebuilt the canonical prompt three times in 90 days. We have not rewritten a client-specific prompt once.
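To make the "edit the variables, not the body" rule concrete, here is a small stand-in for the substitution step. It is a regex renderer for `{{ name }}` placeholders written with the standard library so it runs anywhere. It is not Jinja itself, and the prompt text and variable values are invented for illustration. The variable names follow the list in the paragraph above.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, variables):
    """Fill {{ name }} placeholders; refuse to render if any are unfilled."""
    missing = sorted(
        {name for name in PLACEHOLDER.findall(template) if name not in variables}
    )
    if missing:
        raise KeyError(f"unfilled prompt variables: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), template)


# Hypothetical canonical prompt body. It stays fixed across clients.
CANONICAL_PROMPT = (
    "You are {{ voice_persona }}, answering calls for {{ business_name }}.\n"
    "Never say: {{ must_not_say }}.\n"
    "To book, follow: {{ booking_flow }}.\n"
    "Escalate when: {{ escalation_policy }}."
)

# Per-client values mapped from the intake form (example data).
client_variables = {
    "business_name": "Example Plumbing",
    "voice_persona": "a calm, friendly dispatcher",
    "must_not_say": "exact prices; guaranteed arrival times",
    "booking_flow": "offer the next two open slots, then confirm by text",
    "escalation_policy": "the caller reports an active leak or gas smell",
}

print(render(CANONICAL_PROMPT, client_variables))
```

Failing loudly on an unfilled variable is the design choice that matters. A vague intake answer shows up as an error during the personalization pass instead of as a blank in the live prompt.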

5. The sandbox validation checklist

Five scripted calls against the workspace sandbox number cover the happy path, two common edge cases, the must-deflect path, and a knowledge base hit on a question the client specifically flagged in intake. The checklist takes 12 minutes including listening to playback. The validation pass either fires the green light for production cutover or kicks back to the prompt-personalization step with notes. The checklist itself is a Hermes call-review queue preset, so the same operator workflow that handles live calls after launch is what the operator is already using on the validation pass.
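The green-light-or-kick-back decision can be written down as a few lines of logic. The sketch below assumes simple pass/fail scoring per scenario, which is my simplification. The text does not specify the scoring scale, and the scenario identifiers are hypothetical labels for the five calls described above.

```python
from dataclasses import dataclass

# Hypothetical labels for the five scripted calls described above.
SCENARIOS = (
    "happy_path",
    "edge_case_1",
    "edge_case_2",
    "must_deflect",
    "kb_flagged_question",
)


@dataclass
class CallResult:
    scenario: str
    passed: bool
    note: str = ""


def review(results):
    """Return ("green_light", []) or ("kick_back", notes) for a validation pass."""
    scored = {result.scenario for result in results}
    unscored = [name for name in SCENARIOS if name not in scored]
    if unscored:
        raise ValueError(f"scenarios not yet scored: {', '.join(unscored)}")
    notes = [f"{r.scenario}: {r.note}" for r in results if not r.passed]
    if notes:
        return "kick_back", notes  # back to the personalization pass, with notes
    return "green_light", []


results = [
    CallResult("happy_path", True),
    CallResult("edge_case_1", True),
    CallResult("edge_case_2", True),
    CallResult("must_deflect", False, "agent offered a quote it should deflect"),
    CallResult("kb_flagged_question", True),
]
decision, notes = review(results)
print(decision, notes)
```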

How do you actually run a 90-minute onboarding, step by step?

Open the workspace, open the intake form response, open the client's website in a side tab. Set a timer if you are measuring. Here is the order.

1. Clone the canonical snapshot. Workspace settings, new workspace, snapshot source = canonical. Fill in client legal name, DBA, billing email, white-label brand color. 8 minutes, including the auto-fired welcome email to the client.
2. Review the intake form. Read top to bottom once. Flag anything that contradicts what you can see on the client's website. Three of every ten intake forms have at least one inconsistency that would have surfaced as a back-and-forth later. Resolve them now with one short email. 10 minutes.
3. Trigger the website scrape. Paste the primary URL into the knowledge base ingest. The crawl runs for 8 to 12 minutes in the background. Skim the resulting chunks for obvious junk (cookie banners, careers pages, blog comment threads). Delete junk. 15 minutes elapsed including the wait.
4. Run the prompt-personalization pass. Open the canonical prompt template. Substitute intake-form fields into variables. Run the three-turn dialogue preview. If anything sounds off-brand, edit the named variables, not the prompt body. 20 minutes.
5. Validate on the sandbox number. Dial the sandbox number from a personal cell. Walk through the five scripted scenarios. Score each on the validation checklist. 12 minutes.
6. Cut over to production. Repoint the client's production number's Voice URL to the Hermes SIP endpoint, exactly the same pattern documented in our Retell to Hermes migration playbook. Dial the production number. Confirm the agent answers live. 15 minutes.
7. Send the client the white-label portal link. The portal already has the brand colors and the workspace name. The client logs in, sees their agent listed, and can run their own test call from inside the portal. 10 minutes including the short Loom you record on first run.

Total operator time: 90 minutes. If anything blows the budget, it is almost always step 4 (a vague intake-form answer requiring a follow-up) or step 5 (an STT misfire on an unusual proper noun). Both are recoverable inside the window.

What about A2P 10DLC and number provisioning?

Carrier-side compliance has to be handled. It just does not have to be handled serially. Under current US carrier rules, Conduit's 2026 A2P 10DLC step-by-step documents brand registration at 1 to 3 business days and campaign registration at 3 to 7 business days. The Tuco AI A2P 10DLC Registration Guide 2026 notes the same window. The trick is to file the A2P bundle on day zero, before the onboarding session, using the intake form's EIN and service address. The voice path runs over Twilio Elastic SIP Trunking and does not gate on A2P. Only outbound SMS gates on A2P. So you ship the voice agent on day zero. The SMS fallback turns on whenever the carrier returns approval.

"As of February 2025, enforcement is fully live, and all major US carriers now block unregistered A2P traffic outright." Treat A2P as a parallel calendar item, not a voice-AI blocker. [Conduit, A2P 10DLC Registration 2026]

If the client is bringing a number that already exists on their Twilio sub-account, the cutover is the SIP-URL flip described above. If the client needs a fresh number, Hermes provisions a workspace number from inventory in under 60 seconds. Either way, the number is live on the voice path before A2P approval lands.
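To set client expectations on when SMS turns on, it helps to turn the registration windows quoted above into calendar dates. The sketch below assumes brand and campaign registration run back to back, and it skips weekends only. Public holidays and carrier-side delays are not modeled, and the function names are mine.

```python
from datetime import date, timedelta


def add_business_days(start, days):
    """Return the date `days` business days after `start` (weekends skipped)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def a2p_window(day_zero, brand=(1, 3), campaign=(3, 7)):
    """Earliest and latest SMS go-live dates if the bundle is filed on day zero.

    Assumes campaign registration starts only after brand approval.
    """
    earliest = add_business_days(day_zero, brand[0] + campaign[0])
    latest = add_business_days(day_zero, brand[1] + campaign[1])
    return earliest, latest


day_zero = date(2026, 6, 1)  # example filing date; voice goes live this day
earliest, latest = a2p_window(day_zero)
print(f"voice live: {day_zero}, SMS expected between {earliest} and {latest}")
```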

Why the template works (and where similar templates fail)

Most onboarding templates published in the agency space are SOP documents, not workspace primitives. A 14-page Notion doc titled "Voice AI Client Onboarding Checklist" is not a template. It is a memory aid. The reason the 90-minute number is reachable is that the snapshot itself encodes the configuration. The operator does not consult a checklist for each of the 60 fields the workspace needs set. The snapshot sets them on clone.

The second reason is that the intake form is structured data, not a discovery call. Discovery calls average 45 to 60 minutes and end with a transcript that the operator then has to re-type into the workspace anyway. The structured intake form is the same information in a form the operator can paste into the prompt variables in seconds. OnboardMap's 2026 onboarding benchmark ties the same pattern to retention directly: "Businesses with a defined onboarding timeline retain 34% more clients in the first year than those without one."

The third reason is that knowledge base ingest is a system, not a task. A scrape returns 500 to 2,000 chunks in 10 minutes. A human typing the same content in returns 200 chunks in 4 hours and forgets 12 of them. The system runs while the operator is reviewing the intake form. Time compounds because two steps run at once.

Five common failure modes and how to handle them

The client's website is a Wix one-pager. The scrape returns 30 chunks. Compensate with a 30-minute intake-form follow-up call focused on the top 20 FAQ topics the agent will hit. Total onboarding lands at 2 hours, still 5x faster than the 10-hour unstructured baseline.
The intake form arrives 70% complete. Do not start the 90-minute clock until the form is full. Send a one-paragraph email naming the three missing fields. Wait. Starting incomplete is the single most reliable way to lose the time budget.
The personalization pass produces a brittle-sounding agent. Almost always a must-not-say list problem. Move that list earlier in the system prompt and re-run the three-turn preview. The variable position in the prompt matters more than the content.
The sandbox call hits an STT misfire on a brand-specific proper noun. Add the noun to the workspace pronunciation dictionary. The fix takes 90 seconds and survives every future call. Do not edit the prompt for this; edit the dictionary.
A2P 10DLC submission gets kicked back. The kickback is almost always missing or mismatched EIN/service-address data, per the same carrier patterns documented across every 2026 guide. Resubmit within 24 hours with corrected fields. The voice agent stays live the entire time.

The action steps (do these this week)

If you onboard your next client at 10 hours instead of 90 minutes, that is 8.5 hours you will not get back. If you onboard four clients at 10 hours, that is a full work-week. The template pays for itself the second time you run it. Here is the minimum viable version you can stand up in one afternoon.

Build the canonical workspace once. Two hours of focused work to lock the system prompt scaffolding, the tool registry, the call-review queue presets, and the post-call webhook payload.
Write the 18-question intake form. Use a single form tool. Make every field required. The hardest part is resisting the urge to add a 19th question.
Configure website-scrape ingest as the default knowledge base source. Type-from-scratch is the fallback, not the default. The default has to be the system.
File the A2P 10DLC submission on day zero. Use the intake form's EIN and service address. Voice goes live in parallel.
Run the validation checklist as a call-review preset, not a Notion doc. The same surface that handles live calls after launch handles the validation pass before launch. One workflow, not two.
Audit the first month of live calls against the validation checklist. Any score the agent missed in sandbox that also misfires in production is a snapshot bug, not a client-specific problem. Fix the snapshot. The next client inherits the fix.
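The audit rule in the last step is a set intersection. Here is a minimal sketch, assuming each checklist item has a stable identifier that is logged both in sandbox and in production call review. The identifiers and the function name are hypothetical. Only the "missed in both" bucket is classified by the rule above. The other two buckets are left for manual review.

```python
def classify_misses(sandbox_misses, production_misfires):
    """Split checklist items by where they failed."""
    sandbox = set(sandbox_misses)
    production = set(production_misfires)
    return {
        # Missed in sandbox and again in production: fix the snapshot.
        "snapshot_bugs": sorted(sandbox & production),
        # Not covered by the rule above; review these by hand.
        "production_only": sorted(production - sandbox),
        "sandbox_only": sorted(sandbox - production),
    }


report = classify_misses(
    sandbox_misses={"kb_flagged_question", "edge_case_2"},
    production_misfires={"edge_case_2", "happy_path"},
)
print(report["snapshot_bugs"])
```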

What does this look like as the agency scales?

At 10 clients, the snapshot has been hardened against most of the patterns that show up in real call traffic. At 25 clients, the snapshot is stable and the variation between clients lives entirely in the named prompt variables and the knowledge base content. At 50 clients, the operator time per onboarding drops below 60 minutes because the intake-form follow-up rate falls (clients increasingly come from referrals and arrive better-prepared) and the sandbox validation passes on the first run more often. The marginal cost of the 51st client is a clone, an ingest, and a single call. That is the entire reason this category gets called infrastructure rather than tooling.

If you want to see the canonical snapshot, the intake-form schema, the call-review presets, and the A2P submission template all wired together in a live workspace, apply to the Founders' Beta. Workspaces ship pre-loaded with everything above. You can fork the snapshot, edit it to your agency's conventions, and start cloning per client the same day. The cost of running the audit on your current onboarding workflow is zero. Run it against the voice bill audit if you also want a side-by-side cost comparison against your current stack while you are at it.

Frequently asked questions

How long does it actually take to onboard a new voice AI client in 2026?

Independent industry data puts the average at 14 hours per client across information collection, prompt writing, knowledge base loading, integrations, and live-call validation. The slowest first-time builds on developer-grade platforms can stretch to 20-60 hours. A snapshot-driven template collapses the same work into 60 to 90 minutes of active operator time, with A2P 10DLC and any port-ins running in parallel on a separate clock.

What is the single biggest time sink in voice AI client onboarding?

Manual knowledge base creation. Typing out business information, services, pricing, and FAQs into a fresh agent takes 2 to 4 hours per client when it is done from scratch, and 4 to 6 hours when the client also lacks structured FAQs. Replacing that step with a website-scrape ingest pulls the same knowledge base together in 5 to 10 minutes. Knowledge base loading alone usually accounts for a third to half of the total onboarding hours an agency spends per client.

Can I use a workspace snapshot to onboard voice AI clients the same way GoHighLevel agencies use sub-account snapshots?

Yes. The snapshot pattern is exactly the same. You build one canonical agency workspace with the prompt scaffolding, the system tools, the call-review queue, the post-call webhook structure, and the billing template, then clone it per client and swap in the business-specific fields. Snapshot-driven onboarding inside HighLevel deploys a new sub-account in under 10 minutes. The Hermes workspace clone runs on the same primitive.

Where does the 90-minute number actually come from?

It is the sum of an 8-minute snapshot clone, a 10-minute intake form review, a 15-minute knowledge base ingest from the client website, a 20-minute prompt-personalization pass against the intake, a 12-minute sandbox call validation, and a 15-minute go-live cutover with the first live test call. The remaining 10 minutes is a buffer for the one inevitable thing that goes wrong. Anything above that 90 minutes belongs to A2P 10DLC submission and the port-in window, both of which run on parallel calendar time, not on operator time.

Does this work for clients who do not have an organized website or documentation?

Yes, but the time budget shifts. Without a usable website to scrape, the knowledge base step moves from 15 minutes to 45 to 60 minutes of structured intake-form processing plus a single 15-minute follow-up call with the client. Total onboarding lands at 2 to 2.5 hours instead of 90 minutes. It is still 4 to 5x faster than the 10-hour baseline. The biggest unlock is the intake form itself, which forces the missing information into a single round trip rather than five.

What about A2P 10DLC registration? Does that not blow up the timeline?

Only if you treat it as serial work. Brand registration takes 1 to 3 business days and campaign registration takes 3 to 7 business days under current US carrier rules, but neither requires you to wait before standing up the agent. Submit the A2P bundle on day zero, validate the agent on the Hermes sandbox number in parallel, and the only thing the A2P window gates is outbound SMS. The voice agent is live the day you ship the snapshot.

How many clients can one operator onboard per day using the 90-minute template?

Four to five, comfortably. The hard ceiling is not operator time. It is the client's responsiveness on the intake form and the first sandbox call review. A solo operator running the template end to end can onboard four clients on a Tuesday and still have two hours left for support. The same operator on the unstructured 10-hour workflow caps out at one client per day on a clean week and zero on a rough one.

Is the template publicly available?

Yes. The Hermes Founders' Beta workspace ships pre-loaded with the canonical snapshot, the intake form schema, the call-review queue presets, the post-call webhook payload, and the A2P submission template. Apply at /beta. The template is also exportable as JSON for operators who want to fork it into their own workspace conventions before importing.

Where this leaves you

The 10-hour onboarding is not a skill problem. It is a tooling problem. The operators who solve it look the same from the outside as everyone else, but they ship four clients on a Tuesday and read intake forms with a coffee in hand. The snapshot is the leverage. The intake form is the gating mechanism. The website scrape is the compounding effect. The 90 minutes is the result. Build the template once. Use it every client. The first month-end you do not stay up until 2am wiring up a new workspace is the proof.

By builders, for builders. We rebuilt this template three times before it stopped breaking on the fifth client. The version you get inside the Hermes workspace is the version that survived contact with operators who do not have time to debug their own onboarding kit.

next step

Run the 90-minute template on your next client

Apply to the Founders' Beta. Workspaces ship pre-loaded with the canonical snapshot, the intake-form schema, the call-review presets, and the A2P submission template. Onboard your next voice client in a single working block.


Alfredo Romero is CEO of Hermes, the voice infrastructure platform for AI agencies.

written by

Alfredo Romero

CEO and Co-Founder, Hermes

Alfredo runs sales, operations, and strategy at Hermes. Before founding Hermes he ran agencies for nine years and spent the last three building the AI voice operations side. He writes the operator playbook from real builds, not theory.
