
From Zapier+Make+GHL Spaghetti to Native Workflows: A Refactor Diary

By Alfredo Romero, CEO, Hermes · May 31, 2026 · 14 min read

The typical AI voice agency stack looks like this: Retell or Vapi for the voice engine, GoHighLevel for CRM and pipeline, Zapier to connect them, Make for the workflows that Zapier got too expensive for, Twilio for phone numbers, and Stripe for billing. Six tools, five different dashboards, and somewhere between three and seven separate invoices every month. When everything works, the stack is invisible. When one webhook goes silent at 2am, you spend the next morning reconstructing what happened across four different logs.

This is the refactor diary. Not a comparison post, not a pitch. The actual breakdown of what a fragmented agency automation stack costs, what fails first when you start scaling, and the specific steps to consolidate it without taking a client offline in the process. By builders, for builders.

Every agency owner who has gotten past their fifth client has lived a version of this story. The stack that felt fine at two clients turns into a support burden at five. At ten clients, it becomes a full-time maintenance job.

What does a typical AI voice agency automation stack actually look like?

Before the refactor, document what you are actually running. Most agency owners do not have a complete inventory. They added tools one at a time as each client demanded something new, and the stack grew by accumulation rather than design.

The canonical stack, drawn from the Trillet platform comparison and echoed repeatedly in GHL Agency Owners Facebook group threads over the past 12 months:

Voice engine: Retell AI or Vapi, connected to your agents via API or a webhook layer.
CRM and pipeline: GoHighLevel, managing contacts, opportunities, and appointment calendars.
Primary connector: Zapier, catching post-call webhook events and routing data into GHL contact fields, triggering follow-up sequences.
Secondary connector: Make, handling the more complex multi-step flows or the workflows you moved off Zapier when the task count got expensive.
Telephony: Twilio, managing the phone number pool and SIP routing.
Billing: Stripe, invoicing clients directly, separate from your platform costs.

That is six tools minimum. Per Trillet's 2026 platform teardown, this configuration generates up to five separate invoices per deployment. Each invoice is billed on a different cycle, in a different currency format, with a different usage model. At month end, reconciling what you actually spent versus what you billed clients is an hour-long spreadsheet exercise at minimum.

Why does the Zapier+Make+GHL stack get so messy so fast?

The answer is failure surface. Every integration between tools is a potential failure point that requires its own monitoring and its own debugging path. When a client call does not create a CRM contact, the problem could be in the voice platform, the Zapier webhook, the Make scenario, or the GHL workflow. Diagnosing it means checking four dashboards sequentially.

"More maintenance, more failure points, more debugging, and more time spent keeping the plumbing alive." [Sympana, Best GHL AI Voice Agent Integrations 2026]

The compounding effect is real. According to Corcava's 2026 analysis of tool sprawl for service businesses, employees who manage multi-tool stacks switch applications approximately 1,200 times daily, with a context switch costing an average of 9.5 minutes of focused work. Corcava puts the total at 70–85 hours of lost productivity per month per operator, which is roughly two full work weeks every month spent on tool overhead rather than client delivery.

For a solo agency owner managing 5 clients, that overhead is not abstract. It is the hour you spent on a Tuesday morning figuring out why Make stopped passing call disposition data to GHL, instead of spending that hour on a client call or closing a new account.

The problem compounds specifically with Zapier because its pricing model punishes the behavior that voice AI automations require. Voice AI workflows are trigger-dense: every completed call fires a webhook, every appointment confirmed triggers a sequence, every contact update cascades to a follow-up. In 2026, Zapier's free plan was reduced from 750 to 100 tasks per month, and AI-heavy usage inflates task counts by 4x according to internal Zapier benchmarks from 500+ enterprise audits. An agency running 500 calls per month with post-call CRM updates burns through 2,000+ Zapier tasks before they have processed a single follow-up sequence.
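
The arithmetic behind that last figure is worth making explicit. Here is a minimal sketch, assuming each completed call fans out into about four billable Zapier tasks (the 4x figure quoted above). The function names and the 750-task plan limit in the usage lines are illustrative, not anything from Zapier's API.

```python
def estimate_monthly_tasks(calls_per_month: int, tasks_per_call: int = 4) -> int:
    """Rough Zapier task burn for post-call automations.

    tasks_per_call is an assumption: count every billable action step
    your post-call Zap runs (the 4x figure above is used as the default).
    """
    if calls_per_month < 0 or tasks_per_call < 0:
        raise ValueError("counts must be non-negative")
    return calls_per_month * tasks_per_call


def overage_tasks(tasks_used: int, plan_limit: int) -> int:
    """Tasks above the plan's included limit (zero if under the ceiling)."""
    return max(0, tasks_used - plan_limit)


if __name__ == "__main__":
    used = estimate_monthly_tasks(500)             # 500 calls x 4 tasks each
    print(used)                                    # 2000
    print(overage_tasks(used, plan_limit=750))     # 1250
```

Swap in your own call volume and the real step count of your post-call Zap. The point is that the ceiling arrives from call volume alone, before any follow-up sequence has run.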

What does this stack actually cost per month?

The number most agency owners quote is their GHL plan. The number that matters is the all-in monthly burn across every middleware tool. Here is a realistic breakdown for a mid-tier agency running 5–10 clients with active post-call automations.

Tool	Purpose	Typical cost	Replaceable natively?
Zapier Pro	Post-call webhook routing to GHL	$49–$74/mo	Yes — GHL native workflows
Make Core/Pro	Complex multi-step flows, GHL overflow	$10–$19/mo	Mostly — keep for external triggers
GoHighLevel	CRM, pipeline, sequences	$97–$297/mo	N/A — the platform itself
Retell / Vapi	Voice engine	$0.07–$0.33/min	Replace with native platform
Twilio	Telephony / phone numbers	$15–$60/mo	Bundled on some platforms
Stripe	Client billing	2.9% + $0.30/txn	N/A — keep for client payments

The middleware layer (Zapier + Make) alone runs $59–$93/month for a typical 5–10 client agency. That is $708–$1,116 per year for tools whose sole purpose is routing data between other tools. They produce no client value directly. They exist because the underlying platforms do not talk to each other natively.
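
If you want to rerun that math with your own plan prices, the calculation is just range addition. A small sketch using the Zapier and Make ranges from the table above. The helper names are made up for illustration.

```python
# Monthly price ranges (low, high) in USD, taken from the table above.
ZAPIER_PRO = (49, 74)
MAKE_CORE_PRO = (10, 19)


def add_ranges(*ranges: tuple[int, int]) -> tuple[int, int]:
    """Sum several (low, high) ranges into one combined (low, high) range."""
    return (sum(r[0] for r in ranges), sum(r[1] for r in ranges))


def annualize(monthly: tuple[int, int]) -> tuple[int, int]:
    """Convert a monthly (low, high) range into a yearly one."""
    return (monthly[0] * 12, monthly[1] * 12)


middleware_monthly = add_ranges(ZAPIER_PRO, MAKE_CORE_PRO)
middleware_yearly = annualize(middleware_monthly)

print(middleware_monthly)  # (59, 93)
print(middleware_yearly)   # (708, 1116)
```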

Per Viirtue's 2026 MSP billing guide, at 50 clients an agency managing a fragmented stack faces "5x more vendor management overhead" compared to a consolidated platform, plus $500–$3,000/month in lost margin from the compounding cost of per-minute overages, middleware subscriptions, and developer time spent on integration maintenance.

This is the five-invoice problem in operational form. The invoices are the symptom. The actual cost is founder-hours diverted from growth to maintenance.

What breaks first when you try to scale with this stack?

Based on the patterns we see repeatedly, three things fail first when a spaghetti-stack agency tries to add its fifth, sixth, or seventh client.

Zapier task count hits plan ceiling. You get the email from Zapier at the worst possible time: mid-month, when a client campaign is live and the automations that trigger follow-up sequences after calls have gone silent. You either upgrade ($49 to $74 to $103/month in rapid succession) or you triage which Zaps to pause. Neither option is good in the middle of a client week. Zapier's 2026 pricing structure means Professional plan overages cost $0.02 per task above the included limit, and AI-integrated workflows inflate task counts unpredictably.

A Make scenario breaks silently. Make's error handling defaults to stopping the scenario without alerting you unless you have explicitly set up error routes or email notifications. The most common failure is a data mapping error when a field in your voice platform response payload changes slightly after an update. The scenario stops. The CRM stops updating. You find out when a client asks why their follow-up sequence has been quiet for three days.
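
One guard against this failure mode that does not depend on either tool's own alerting is a staleness check: if calls are still completing but the CRM has not been written to recently, something in between has stopped. A minimal sketch follows. How you obtain the two timestamps (voice platform logs, a CRM export) is left out, and the 6-hour threshold is an assumption to tune per client.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def pipeline_is_stale(
    last_call_completed: datetime,
    last_crm_update: Optional[datetime],
    max_lag: timedelta = timedelta(hours=6),
) -> bool:
    """Return True when calls are finishing but the CRM is not keeping up.

    last_call_completed: time of the most recent completed call.
    last_crm_update: time of the most recent CRM write, or None if never.
    max_lag: allowed gap between the two (assumed value, tune per client).
    """
    if last_crm_update is None:
        return True
    return last_call_completed - last_crm_update > max_lag


if __name__ == "__main__":
    now = datetime(2026, 5, 31, 9, 0, tzinfo=timezone.utc)
    # Calls completed an hour ago, CRM last updated three days ago: stale.
    print(pipeline_is_stale(now - timedelta(hours=1), now - timedelta(days=3)))      # True
    # CRM updated ten minutes after the last call: healthy.
    print(pipeline_is_stale(now - timedelta(hours=1), now - timedelta(minutes=50)))  # False
```

Run something like this on a schedule and you hear about a stalled scenario within hours instead of from the client three days later.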

Client isolation disappears. When all your clients share the same Zapier account, the same Make scenarios, and the same GHL sub-accounts under a single agency account, offboarding one client requires surgical extraction. Shared automations have to be cloned, modified, and tested before you can remove the departing client's data without breaking everyone else. The scaling wall post covers this in detail, but the short version is: a stack built for one client does not isolate cleanly for five.

"At 50 clients, agencies are looking at $500–$3,000/month in lost margin plus 5x more vendor management overhead from a fragmented stack." [Viirtue, AI Voice Agent Billing Guide 2026]

How does the refactor to native workflows actually work?

The refactor is a four-phase process. Do not attempt to do all four phases simultaneously. The agencies that run into trouble during a stack refactor are the ones that tried to cut over everything in a weekend.

Phase 1: Audit and map (half a day). List every active Zapier Zap and Make scenario. For each one, answer three questions: what triggers it, what does it do, and what downstream system does it feed. Most agency owners discover 10–15% of their Zaps are either disabled or no longer used by any live client. Delete those first. You are not refactoring dead automations, you are eliminating them.

Phase 2: Classify (1–2 hours). For each remaining Zap and scenario, classify it as GHL-replaceable or external-only. GHL natively handles: post-call webhook receivers (via inbound webhooks in GHL workflows), contact field updates, tag assignments, sequence enrollment, appointment confirmations, SMS/email follow-ups, and pipeline stage transitions. It cannot natively replace: Shopify order triggers, Google Sheets row events, Slack channel notifications, and most non-GHL e-commerce events. GrowwStacks' 2026 audit found that for a typical GHL marketing agency, 70–80% of Zapier usage is GHL-replaceable. Voice agencies tend to land slightly higher, around 80–90%, because the primary trigger (call completed webhook) routes natively inside GHL.
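
The audit and classification steps fit in a spreadsheet, but writing them down as a data structure makes the rules explicit. A rough sketch, assuming you record each automation by hand. The `Automation` fields, the source labels such as "voice_webhook", and the rule that anything not on the external list is GHL-replaceable are all simplifications for illustration, so verify each candidate against what GHL actually supports.

```python
from dataclasses import dataclass

# Trigger sources listed above as external-only (not replaceable in GHL).
EXTERNAL_ONLY_SOURCES = {"shopify", "google_sheets", "slack"}


@dataclass
class Automation:
    name: str             # label from your Zapier or Make dashboard
    trigger_source: str   # where the trigger originates, e.g. "ghl"
    action: str           # what it does
    downstream: str       # the system it feeds
    active: bool = True   # False for disabled or unused automations


def classify(automation: Automation) -> str:
    """Return 'delete', 'external-only', or 'ghl-replaceable'."""
    if not automation.active:
        return "delete"            # Phase 1: eliminate dead automations
    if automation.trigger_source.lower() in EXTERNAL_ONLY_SOURCES:
        return "external-only"     # stays on a connector such as Make
    return "ghl-replaceable"       # candidate for a native GHL workflow


if __name__ == "__main__":
    inventory = [
        Automation("Post-call contact update", "voice_webhook", "update fields", "GHL"),
        Automation("New order tag", "shopify", "tag contact", "GHL"),
        Automation("Old promo sequence", "ghl", "send SMS", "GHL", active=False),
    ]
    for item in inventory:
        print(f"{item.name}: {classify(item)}")
```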

Phase 3: Rebuild in parallel (2–5 days). Rebuild each GHL-replaceable workflow inside the GHL Workflow Builder. Do not disable the Zapier version first. Both should run simultaneously against a low-volume test pipeline or a non-critical client for at least 72 hours. Compare outputs. When the GHL workflow produces the same result as the Zapier Zap on 20 consecutive trigger events, turn off the Zap.
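
The "20 consecutive matching events" rule is easy to check mechanically if you log what each workflow wrote for every trigger. A minimal sketch, assuming you can export both results per event as dictionaries of CRM fields. The export step itself and the field names are hypothetical.

```python
from typing import Any, Iterable, Mapping, Tuple

REQUIRED_STREAK = 20  # consecutive matching events, per the rule above


def ready_to_cut_over(
    paired_outputs: Iterable[Tuple[Mapping[str, Any], Mapping[str, Any]]],
    required_streak: int = REQUIRED_STREAK,
) -> bool:
    """Check whether the most recent events all match.

    paired_outputs: (zapier_result, ghl_result) per trigger event, oldest
    first. Each result is a dict of the CRM fields that workflow wrote.
    Returns True when the trailing run of identical results is at least
    required_streak long; any mismatch resets the count to zero.
    """
    streak = 0
    for zapier_result, ghl_result in paired_outputs:
        if dict(zapier_result) == dict(ghl_result):
            streak += 1
        else:
            streak = 0
    return streak >= required_streak


if __name__ == "__main__":
    match = ({"disposition": "booked"}, {"disposition": "booked"})
    miss = ({"disposition": "booked"}, {"disposition": ""})
    print(ready_to_cut_over([match] * 20))           # True
    print(ready_to_cut_over([match] * 19 + [miss]))  # False
```

Resetting the streak on any mismatch is deliberate: one bad field mapping late in the test window should send you back to the builder, not get averaged away.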

Phase 4: Migrate the remainder to Make (1 day). For the workflows that genuinely need an external connector, move them to Make if you are not already using it. Make's Core plan at $9/month handles 10,000 operations, and its per-operation cost is 3–5x lower than Zapier's per-task cost at equivalent workloads. The remaining external-only workflows almost always fit comfortably under Make's free or Core tier.

What does the refactored stack actually look like?

After the refactor, the stack collapses from six tools and five invoices to a smaller, more debuggable configuration.

Layer	Before	After
Automation routing	Zapier + Make (2 tools, 2 invoices)	GHL native + Make free tier
CRM	GHL (same)	GHL (same)
Voice engine	Retell / Vapi (separate API + billing)	Same or native platform
Telephony	Twilio (separate invoice)	Bundled or same
Monthly middleware cost	$59–$93/mo	$0–$10/mo

The practical result: one fewer login to check when a client workflow breaks. One fewer invoice to reconcile at month-end. And crucially, one fewer failure surface during a live client campaign.

What the refactor does not fix is the deeper isolation problem. GHL sub-accounts provide some client separation, but the billing still runs through a single agency GHL plan. Phone numbers require individual Twilio management. Offboarding a client cleanly still requires manual extraction. If you are at 5+ clients and growing, the GHL + Make refactor buys you 12–18 months before the scaling wall returns in a different form. See the 50-client margin math breakdown for what that second wall looks like in numbers.

What concrete steps should I take this week?

Here is the action sequence in priority order.

1. Pull your last 3 Zapier invoices. Find the actual task consumption per Zap. Most agency owners have at least two Zaps consuming 40%+ of their monthly task budget on workflows that are completely replaceable with GHL native triggers. Identify those first.
2. Map every Zap to a GHL workflow trigger. If the Zap triggers on a webhook, form submission, or appointment event inside GHL, it is replaceable. If it triggers on an external platform event (Shopify, Google Sheets, Slack), it is not.
3. Rebuild the top two replaceable Zaps in GHL. Run them in parallel against a test contact or a low-volume client for 72 hours minimum. Verify the outputs match. Then disable the Zapier version.
4. Move remaining external Zaps to Make Core. At $9/month for 10,000 operations, Make handles what you need at roughly one-fifth the cost of Zapier Professional. The migration from Zapier to Make is straightforward for most webhook-based automations.
5. Set up Make error notifications. Make's default behavior is to stop silently on error. Go to your scenario settings and configure email alerts on scenario errors before you depend on any Make workflow for a live client.
6. If you are at 5+ clients, audit whether the real fix is a platform switch, not a refactor. The GHL + Make cleanup is the right move for agencies at 2–4 clients. At 5+, the isolation and billing problems that drove the Zapier sprawl in the first place resurface at the CRM and telephony layer. A platform built natively for multi-client voice agency operations, where each client is an isolated workspace with its own numbers, billing, and agents, solves the structural problem rather than the symptom. That is what the Hermes vs Vapi+GHL stack comparison covers in detail.
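
For the first step in the sequence above, ranking Zaps by their share of total tasks takes a few lines once you have the per-Zap counts. A small sketch, assuming you have copied those counts out of your Zapier task history by hand. The Zap names and numbers below are hypothetical.

```python
from typing import Dict, List, Tuple


def task_share(tasks_by_zap: Dict[str, int]) -> List[Tuple[str, int, float]]:
    """Rank Zaps by share of total tasks, largest first.

    tasks_by_zap maps a Zap name to the tasks it consumed in the period.
    Returns (name, tasks, share) tuples where share is a fraction of 1.0.
    """
    total = sum(tasks_by_zap.values())
    if total == 0:
        return []
    ranked = sorted(tasks_by_zap.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, tasks, tasks / total) for name, tasks in ranked]


if __name__ == "__main__":
    # Hypothetical numbers for one billing period.
    usage = {
        "Post-call CRM update": 900,
        "Appointment confirmation": 600,
        "Shopify order tag": 300,
        "Weekly report": 200,
    }
    for name, tasks, share in task_share(usage):
        print(f"{name}: {tasks} tasks ({share:.0%})")
```

The Zaps at the top of that list are the ones to rebuild first, provided their triggers live inside GHL.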

Frequently asked questions

Should I cancel Zapier if I move to GoHighLevel native workflows?

For most AI voice agency workflows, yes. GoHighLevel native workflows handle the triggers, conditions, and actions that most agencies route through Zapier at no extra cost on any GHL plan. The cases where you still need Zapier are narrow: external triggers from tools that lack a native GHL connector (specific Shopify events, certain Slack channels, Google Sheets row updates). If you audit your active Zaps, most agencies find 70–80% can be replaced with GHL-native workflows immediately. The remaining 20–30% often run fine on Make's free or Core tier, which costs $9/month and handles 10,000 operations. Zapier's Professional tier, by comparison, starts at $19.99/month for 750 tasks.

How long does it take to refactor a Zapier+Make+GHL stack to native workflows?

Budget 2–3 days of focused work for a single-operator agency managing 3–5 clients. The audit and mapping phase takes a few hours. Rebuilding each workflow inside GHL natively takes 20–45 minutes per workflow on average once you know what you are replacing. The main time sink is testing, which you should do against a real (but low-volume) client setup before flipping the switch. For agencies with 10+ clients and complex multi-step automations, budget 5–7 business days and run the new stack in parallel for at least 3 days before deprecating the old one.

What Zapier automations can GoHighLevel replace natively?

GHL natively replaces: new contact created triggers, form submission responses, appointment booking confirmations and reminders, post-call data capture (if you use GHL's voice AI or a connected voice platform), pipeline stage change actions, SMS and email follow-up sequences, CRM field updates from webhooks, and tag-based branching logic. It does not natively replace: Shopify order triggers, Google Sheets row creation triggers, Slack-based notifications, Airtable updates, and most non-GHL e-commerce events. Those are the ones you keep on Make.

What is the real monthly cost of a Zapier+Make+GHL automation stack for a voice agency?

The direct cost for a mid-tier agency (5–10 clients, active automations) typically runs $49–$74/month for Zapier, plus $10–$19/month for Make, plus whatever GHL tier you are on. That is $59–$93/month just for the middleware layer, separate from your voice platform, Twilio, and Stripe. After the refactor, most agencies land at $0–$10/month for Make (keeping it only for the workflows GHL cannot replace natively), a reduction of roughly $50–$85/month. Over a year, that is $600–$1,020 back in margin.

Does switching to native workflows affect call quality or latency?

Native workflows do not affect call audio quality or latency directly, since those are determined by your voice AI platform (Retell, Vapi, Hermes) and telephony stack (Twilio). What switching does affect is post-call data routing speed. Routing through Zapier typically adds 1–5 seconds of delay on webhook-triggered Zaps, and polling-based triggers wait longer still, until the next polling interval on your plan. Native GHL webhooks and workflow triggers execute in under 500ms. For contact data capture, CRM updates, and follow-up sequences triggered by call completion events, native execution is measurably faster.

What is the biggest risk of refactoring your automation stack mid-client?

The biggest risk is breaking a workflow that runs silently — one that is not obviously client-facing but that a client depends on. Post-call CRM updates, appointment confirmations, and billing triggers are all high-risk categories. Before you deprecate any Zapier or Make workflow, confirm its output destination (which CRM field, which email, which Stripe event it feeds) and trace any client-facing touchpoint downstream. The safest approach: run both stacks in parallel for 72 hours minimum, verify output equivalence, then turn off the old one.

When should I not refactor and just add a native platform instead?

If your stack has more than 30 active Zaps and more than 5 clients, the refactor cost (founder-hours) may exceed the 12-month cost savings from canceling Zapier. In that case, the better ROI is switching to a platform that eliminates the middleware layer entirely — one with native CRM, native campaign orchestration, and native billing built in. That is what Hermes was built to be: the platform where the automation lives alongside the voice agents, inside the same workspace, with no external connectors required for the core loop. See the comparison at the bottom of this post.
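
That break-even test is simple enough to run on your own numbers. A minimal sketch, assuming you put a dollar value on a founder-hour. The function name, the hourly rate, and the hour counts are placeholders, not benchmarks.

```python
def refactor_pays_off(
    refactor_hours: float,
    hourly_rate: float,
    monthly_savings: float,
    horizon_months: int = 12,
) -> bool:
    """True if middleware savings over the horizon exceed the refactor cost.

    hourly_rate is what you value a founder-hour at (an assumption you set).
    monthly_savings is the middleware spend the refactor removes.
    """
    refactor_cost = refactor_hours * hourly_rate
    total_savings = monthly_savings * horizon_months
    return total_savings > refactor_cost


if __name__ == "__main__":
    # Hypothetical: 40 hours of work at $100/hour vs $85/month saved.
    print(refactor_pays_off(40, 100, 85))  # False: $1,020 saved < $4,000 spent
    # Hypothetical: 8 hours of work at $100/hour vs $85/month saved.
    print(refactor_pays_off(8, 100, 85))   # True: $1,020 saved > $800 spent
```

This only counts subscription savings. If debugging time is your bigger cost, add the hours you expect to stop losing each month to `monthly_savings` at the same hourly rate.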

The refactor in summary

The spaghetti stack is not a character flaw. It is what happens when you solve real client problems with the tools you had available at the time. The refactor is the exercise of looking at what you built with fresh eyes and asking: which of these connections could be shorter?

For most agencies, the answer is: cancel Zapier, move 80% of those workflows into GHL native, park the remaining 20% on Make for $9/month, and reduce your monthly middleware burn by $50–$85. That is the fast win. It takes a week of focused work and requires zero client downtime if you run parallel testing first.

The deeper win is clarity. When something breaks, you know exactly where to look. One workflow, one log, one fix. That is the real value of the refactor. Not the cost savings, though those are real. The hour you stop losing to debugging spaghetti is the hour you get back for building.

next step

Skip the refactor. Start with a native stack.

Hermes is the platform where the voice engine, CRM, campaign orchestration, and client billing all live natively, inside isolated workspaces per client. No Zapier. No Make. No five invoices. From $149/month. First agent live in 72 hours.

Apply for the Founders' Beta · Hermes vs Vapi+GHL stack

Alfredo Romero is CEO of Hermes, the voice infrastructure platform for AI agencies.

written by

Alfredo Romero

CEO and Co-Founder, Hermes

Alfredo runs sales, operations, and strategy at Hermes. Before founding Hermes he ran agencies for nine years and spent the last three building the AI voice operations side. He writes the operator playbook from real builds, not theory.
