AI Voice Agent for Postpaid Collections at a Gulf Telecom Operator — 58% Cost Per Contact Reduction
An Arabic-and-English voice agent fronting the operator's outbound collections workflow: natural conversation, payment-link orchestration, full compliance recording.
The challenge
The operator — a Gulf-region postpaid-heavy mobile operator with approximately six million postpaid subscribers — was operating one of the largest outbound collections call centres in the region. Every month, the operator was dialling roughly four hundred thousand customers in early-stage delinquency (one to thirty days past due) with the simple objective of either collecting payment or capturing a credible promise-to-pay date.
The collections call centre employed approximately 180 outbound agents working in shifts to cover the regulator-permitted calling window (typically 10am to 8pm local time). Cost per contact had risen 41% in three years, driven by agent wage inflation, telephony costs and a stubbornly low contact rate: only about 38% of dials resulted in a live conversation. The operator's CFO had set a target of cutting cost per contact by 40% without reducing collections yield.
The operator had evaluated outsourcing the collections work but had concluded that the regulatory and brand risks were unacceptable — early-stage collections conversations frequently touched on sensitive territory (job loss, family illness, fraud concerns) and the operator's compliance team had repeatedly intervened in agent training to enforce conversational standards. A previous attempt with a basic IVR-based collections workflow had achieved 12% promise-to-pay versus 38% for live agents and had been discontinued.
The approach
We deployed an AI Voice Agent (Vb, our voice-bot product) as the front-end for the operator's outbound collections workflow. The brief was specific: not to replace the entire collections team, but to handle the largest segment — early-stage, low-amount, repeat-late-payer cases — at meaningfully lower cost per contact, while preserving conversation quality.
Phase one focused on segment design. We worked with the operator's credit team to identify the customer segments that the voice agent could handle effectively: customers in the one-to-thirty-day delinquency band, with outstanding balances below a defined threshold (a band covering approximately 60% of total dial volume), with no active dispute or fraud flag, and with at least one prior payment on record. This segment was assigned to the voice agent; everything else continued to go to live agents.
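The eligibility rules above can be sketched as a simple routing predicate. This is a minimal illustration, not the operator's actual rule engine: the `Account` fields, the `route` function and the `BALANCE_THRESHOLD` value are all hypothetical, since the case study does not state the real threshold.

```python
from dataclasses import dataclass

# Hypothetical threshold; the case study does not state the real value.
BALANCE_THRESHOLD = 500.0


@dataclass
class Account:
    days_past_due: int
    balance: float
    has_dispute_or_fraud_flag: bool
    prior_payments: int


def route(account: Account) -> str:
    """Return "voice_agent" when every eligibility rule holds, else "live_agent"."""
    eligible = (
        1 <= account.days_past_due <= 30
        and account.balance < BALANCE_THRESHOLD
        and not account.has_dispute_or_fraud_flag
        and account.prior_payments >= 1
    )
    return "voice_agent" if eligible else "live_agent"
```

Under these assumptions, an account 10 days past due with a small balance, no flags and a payment history goes to the voice agent; failing any single rule sends it to a live agent.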
Phase two was the voice-agent build. The agent conducts the conversation in either Arabic or English, detected automatically from the customer's first response. The agent's conversation flow includes account validation ("can I confirm I'm speaking with the registered subscriber"), context-setting ("I'm calling about your account balance of..."), outcome capture (full payment, partial payment, promise to pay, dispute, hardship), payment-link delivery (the agent sends an SMS payment link mid-call), and compliance disclosures (recording notice, regulatory script).
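The case study does not say how the language is detected. One simple text-side heuristic, assuming a transcript of the customer's first response is available, is to look at the script of the transcribed characters. The function name `detect_language` and the Arabic default are assumptions for illustration; a production system would more likely rely on the speech model's own language identification.

```python
def detect_language(first_response: str) -> str:
    """Guess "ar" or "en" from the script of the transcribed first response."""
    letters = [ch for ch in first_response if ch.isalpha()]
    if not letters:
        return "ar"  # assumed default when nothing usable was transcribed
    # U+0600 to U+06FF is the main Unicode Arabic block.
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06ff")
    return "ar" if arabic / len(letters) >= 0.5 else "en"
```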
The conversation model is built on top of a fine-tuned Llama 3 model with a specialised speech-to-text and text-to-speech layer trained on Gulf Arabic dialect. The model is constrained to a finite conversation graph — it cannot 'hallucinate' new offers or commitments — but is free to vary phrasing, handle interruptions, manage objections and pick up where it left off if the customer pauses or asks for clarification. Calls that go off-script (legal escalations, fraud reports, hardship cases) are seamlessly transferred to a live agent with full conversation context.
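A rough sketch of what "constrained to a finite conversation graph" can mean in practice: the model proposes the next step, and the step is accepted only if the graph contains that edge. The state names and transitions below are hypothetical, loosely following the flow described above, and are not the operator's actual graph.

```python
# Hypothetical states and transitions, loosely following the described flow.
GRAPH = {
    "disclosure": {"validate_identity"},
    "validate_identity": {"state_balance", "end_call"},
    "state_balance": {"capture_outcome", "handoff_live_agent"},
    "capture_outcome": {"send_payment_link", "handoff_live_agent", "end_call"},
    "send_payment_link": {"end_call"},
    "handoff_live_agent": set(),
    "end_call": set(),
}


def next_state(current: str, proposed: str) -> str:
    """Accept the model's proposed move only if the graph allows it.

    Anything off-graph is routed to a live agent rather than improvised.
    """
    if proposed in GRAPH.get(current, set()):
        return proposed
    return "handoff_live_agent"
```

With this shape, phrasing can vary freely inside a state, while a proposed move such as an invented discount offer has no edge in the graph and falls through to a live-agent handoff.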
The pre-built building blocks
Rather than commission a ground-up build, the engagement leaned on MindMap's pre-built accelerator library — production-tested components that compress what would otherwise be a six-to-nine-month build into weeks.
Voice Bot
Outbound voice agent — Arabic + English, in-tenant inference
Guardrail System
Compliance-script enforcement and content filtering
Sentiment Analyzer
Real-time tone detection for escalation routing
Compliance Monitor
Call recording, transcription, compliance scoring
The architecture
The Voice Bot stack runs in the operator's private cloud tenant inside Azure UAE, with full data residency in-region. Voice is processed end-to-end inside the operator's environment — no audio is sent to any external provider.
The telephony layer uses a SIP integration into the operator's existing Genesys outbound dialler. The dialler continues to manage the dial campaign (compliance windows, contact attempts, hold patterns) and hands each connected call to the Voice Bot via a real-time SIP bridge.
The speech-to-text layer uses a fine-tuned Whisper Large v3 model with a custom Arabic dialect adapter trained on roughly 1,200 hours of in-domain audio — collections conversations from the operator's own call recordings, with full PII redaction. Latency is critical: the speech-to-text round trip stays below 250ms to preserve conversational flow.
The reasoning layer uses Llama 3 70B for the conversation graph traversal, prompted with the customer's account context (balance, payment history, prior commitments, dispute flags) and constrained by a JSON-schema-validated output that enforces what the agent can and cannot say. A separate guardrail layer monitors every model output before TTS and blocks any utterance that includes prohibited content (specific commitments outside the operator's collections policy, references to legal action, or any prohibited language patterns the compliance team has defined).
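The two checks described above, structural validation of the model output and a content guardrail before TTS, might look roughly like this. It is a minimal sketch using only the standard library: the hand-rolled field checks stand in for a real JSON Schema validator, and the intent names, field names and prohibited patterns are hypothetical placeholders for whatever the compliance team actually defines.

```python
import json
import re
from typing import Optional

# Hypothetical schema: the intents and field names are illustrative only.
ALLOWED_INTENTS = {
    "confirm_identity", "state_balance", "capture_promise",
    "send_payment_link", "handoff",
}
REQUIRED_FIELDS = {"intent": str, "utterance": str}

# Hypothetical block list standing in for the compliance team's patterns.
PROHIBITED = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\blegal action\b", r"\bcourt\b", r"\bwaive\b")
]


def check_output(raw: str) -> Optional[dict]:
    """Return the parsed turn if it passes both checks, otherwise None."""
    try:
        turn = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(turn, dict) or set(turn) != set(REQUIRED_FIELDS):
        return None
    if not all(isinstance(turn[k], t) for k, t in REQUIRED_FIELDS.items()):
        return None
    if turn["intent"] not in ALLOWED_INTENTS:
        return None
    if any(p.search(turn["utterance"]) for p in PROHIBITED):
        return None  # blocked before text-to-speech
    return turn
```

A caller would treat `None` as "do not speak this": regenerate the turn or hand the call to a live agent.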
Text-to-speech uses a custom voice trained on recordings of a professional voice actor, with Arabic and English variants both available. The voice is intentionally not designed to be indistinguishable from a human; the operator's compliance team requires that the agent identify itself as an automated assistant during the opening disclosure.
Every call is recorded in full, transcribed, scored against the compliance script, and stored with seven-year retention. The compliance team can query and audit any conversation; conversations that breach any of 24 defined compliance rules are flagged for review within minutes of call completion.
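Scoring a transcript against a rule set can be sketched as below. The three rules are hypothetical stand-ins: the operator's 24 compliance rules are not listed in this case study, and real rules would be more robust than substring matching.

```python
from typing import Callable, Dict, List, Tuple

# Three hypothetical rules; the operator's actual 24 rules are not listed here.
RULES: Dict[str, Callable[[str], bool]] = {
    "recording_notice": lambda t: "this call is recorded" in t,
    "automated_disclosure": lambda t: "automated assistant" in t,
    "no_legal_threat": lambda t: "legal action" not in t,
}


def score_call(transcript: str) -> Tuple[float, List[str]]:
    """Return the share of rules passed and the names of any breached rules."""
    text = transcript.lower()
    breaches = [name for name, passes in RULES.items() if not passes(text)]
    score = (len(RULES) - len(breaches)) / len(RULES)
    return score, breaches
```

Any call with a non-empty breach list would be the kind of conversation flagged for compliance review.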
The numbers behind the story
Cost per contact has dropped 58% in the segments handled by the voice agent. Total outbound contact volume has increased from approximately 400,000 to 720,000 calls per month, because the voice agent can run during shoulder hours when staffing a human team would be uneconomic. Live-agent capacity has been redirected to higher-value mid-stage and late-stage collections cases.
Promise-to-pay rate on voice-agent-handled calls is 44%, versus 38% on the previous live-agent baseline for the same customer segment. Same-call payment capture (where the customer pays via the SMS link during the call) is 19%, contributing meaningfully to first-call resolution.
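For readers who want the relative changes, the volume and promise-to-pay figures quoted above work out as follows. All inputs come from this case study; the helper name `relative_change` is just for illustration.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from before to after, as a fraction."""
    return (after - before) / before


volume_growth = relative_change(400_000, 720_000)  # monthly outbound calls
ptp_uplift_points = 44 - 38                        # percentage points
ptp_uplift_relative = relative_change(38, 44)      # versus live-agent baseline

print(f"Volume growth: {volume_growth:.0%}")
print(f"Promise-to-pay uplift: {ptp_uplift_points} points "
      f"({ptp_uplift_relative:.1%} relative)")
# Volume growth: 80%
# Promise-to-pay uplift: 6 points (15.8% relative)
```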
Compliance metrics have improved. The voice agent's adherence to the compliance script is, by construction, 100%; the previous live-agent compliance score was 91% on the operator's quality-assurance sample. Regulator-led audits over the first nine months of operation have not raised a single compliance finding against voice-agent-handled calls.
An unexpected outcome: customer-reported experience on the voice-agent calls has been more favourable than on human-agent calls, on the operator's post-call CSAT survey. The team's working hypothesis is that the voice agent's emotional neutrality — it doesn't get impatient, doesn't escalate vocally, and doesn't carry over fatigue from prior calls — works better for what is, fundamentally, an unpleasant conversation for the customer.
“We had clear regulatory red lines about audio leaving our environment, and we needed a voice-AI that actually spoke Gulf Arabic rather than a textbook version of it. MindMap delivered both. The voice agent now handles more outbound calls than our entire live-agent team, and the customer satisfaction scores are, surprisingly, better than for the human calls.”
Director, Customer Care, Gulf Telecom Operator
Why MindMap was chosen
The operator had received proposals from two global contact-centre AI vendors, both for cloud-hosted voice-AI platforms that would have required customer audio to leave the operator's regulated environment. The compliance team rejected both on data-residency grounds.
MindMap proposed a fully in-tenant deployment, with the entire voice stack — speech-to-text, reasoning, text-to-speech, and call recording — running inside the operator's Azure UAE environment. We had a comparable deployment running at a regional bank that the operator's compliance team was able to reference.
The Gulf-Arabic dialect handling was the second factor. The global vendors' Arabic models were trained primarily on Modern Standard Arabic and Egyptian dialect — neither of which is what the operator's customers actually speak. Our fine-tuning approach, with in-domain audio from the operator's own call corpus, produced materially better speech-to-text accuracy on the dialect.
Third, our willingness to operate inside the operator's existing Genesys dialler infrastructure — rather than requiring the operator to migrate to a new contact-centre platform — meant the operator did not have to disrupt its existing collections operation during the rollout.
Related deployments
Telecom WhatsApp Self-Service
ChatNext deployed across SIM, billing, and bundle management — bilingual and integrated with the carrier's BSS stack — deflecting 44% of inbound contact.
Voice-First African Self-Service
Voice Bot + ChatNext reached 22M subscribers in 6 local African languages — 48% of voice-based customer interaction now AI-handled.
Network Anomaly Detection
Anomaly Detector + AI Ops platform let the NOC detect customer-impacting incidents four hours earlier, cutting MTTR by 56%.
Want an outcome like this?
Start with a 2-week AI Readiness Sprint. We deliver a prioritised use-case backlog and business case grounded in what's actually buildable with our accelerator library.