Groq and PlayAI have announced a collaboration to introduce Dialog, an advanced text-to-speech model, through Groq’s high-speed inference platform.
The partnership combines PlayAI’s voice AI expertise with Groq’s specialized processing infrastructure, producing what the companies describe as one of the most natural-sounding and responsive text-to-speech systems on the market.
“Groq offers a comprehensive, low-latency system for automatic speech recognition (ASR), GenAI, and text-to-speech, all in one place,” stated Ian Andrews, Chief Revenue Officer at Groq, in an exclusive interview with VentureBeat. “With Dialog now operating on GroqCloud, customers no longer need multiple providers for a single use case — Groq provides a one-stop solution.”
Groq powers first Arabic voice AI, expanding Middle East tech presence
Dialog is noteworthy for its availability in both English and Arabic, with the Arabic version being the first voice AI specifically designed for the Middle East region. The inclusion of Arabic as one of the initial offerings was strategic for both companies.
“Arabic is the fourth most spoken language globally; by partnering with PlayAI to offer an Arabic TTS model, Groq is unlocking a key global market and enabling broader access to fast AI inference,” Andrews told VentureBeat.
The companies claim that their solution addresses key deficiencies in existing voice AI technologies, particularly in natural speech patterns and response speed. According to benchmark testing by third-party evaluator Podonos, Dialog was preferred by users at a rate of 10:1 compared to ElevenLabs v2.5 Turbo and over 3:1 against ElevenLabs Multilingual v2.0.
Innovative ‘adaptive speech contextualizer’ transforms conversational AI
Dialog stands out for its sophisticated approach to context. Instead of treating each vocalization as an isolated event, the system maintains awareness of the entire conversation flow.
“We developed a novel architecture known as an ‘adaptive speech contextualizer’ (ASC), which enables the model to leverage the full context and history of a conversation,” explained Mahmoud Felfel, co-founder and CEO of PlayAI, in an interview with VentureBeat. “This ensures that each response isn’t just a standalone output; it is enriched with appropriate prosody, tone, and emotion reflecting the conversation flow.”
For enterprises looking to implement conversational AI, latency has been a persistent challenge. Groq’s specialized Language Processing Units (LPUs) appear to offer a significant advantage in this area.
“Based on initial internal testing, Groq is achieving up to 140 characters per second on PlayAI’s Dialog model, a substantial improvement compared to the same model running on GPUs at 86 characters per second,” Andrews explained. “This means Dialog generates text up to 10 times faster than real-time.”
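To put those figures in perspective, the short Python sketch below works through the arithmetic. The two throughput numbers come from Andrews’ quote; the assumption that the “10 times faster than real-time” claim refers to the 140 characters-per-second peak is ours, as are the function names, and the result should be read as a rough illustration and not a benchmark.

```python
# Throughput figures quoted in the article (characters synthesized per second).
LPU_CHARS_PER_SEC = 140
GPU_CHARS_PER_SEC = 86

# Assumption: the "10 times faster than real-time" claim refers to the LPU peak.
REALTIME_MULTIPLE = 10


def speedup(fast: float, slow: float) -> float:
    """Return how many times faster `fast` is than `slow`."""
    return fast / slow


def synthesis_seconds(num_chars: int, chars_per_sec: float) -> float:
    """Return the time needed to synthesize `num_chars` at a given rate."""
    return num_chars / chars_per_sec


# Relative advantage of the LPU figure over the GPU figure (about 1.63x).
lpu_vs_gpu = speedup(LPU_CHARS_PER_SEC, GPU_CHARS_PER_SEC)

# Speaking rate implied by the real-time claim, under the assumption above.
implied_realtime_rate = LPU_CHARS_PER_SEC / REALTIME_MULTIPLE

if __name__ == "__main__":
    print(f"LPU vs. GPU speedup: {lpu_vs_gpu:.2f}x")
    print(f"Implied real-time rate: {implied_realtime_rate:.1f} chars/sec")
    print(f"1,000 chars on LPU: {synthesis_seconds(1000, LPU_CHARS_PER_SEC):.1f} s")
    print(f"1,000 chars on GPU: {synthesis_seconds(1000, GPU_CHARS_PER_SEC):.1f} s")
```

On these numbers, the LPU figure is roughly 1.6 times the GPU figure, and the real-time claim implies a playback rate of about 14 characters per second.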
Groq secures $1.5 billion Saudi investment to build world-class AI infrastructure
The partnership comes at a time of significant growth for Groq, which recently received a $1.5 billion commitment from Saudi Arabia to support additional infrastructure. The company has established a data center in Dammam, described as “the region’s largest inference cluster.”
“Teaming up with Groq was an obvious choice; they are the industry leader in advanced AI inference infrastructure,” Felfel stated. “Low latency is crucial for TTS and agents. While we have already optimized Dialog for real-time applications, partnering with Groq enables us to deliver the lowest latency voice model in the market.”
The voice AI market has witnessed rapid expansion as businesses seek to automate customer interactions while providing a natural, human-like experience. Applications include customer service, sales automation, voice-overs, and accessibility features for the visually impaired.
Enterprise applications extend beyond traditional customer service use cases
“In addition to customer service, other enterprise use cases involve automating sales and appointment scheduling, onboarding and personal assistants, creating voice-overs for existing content, translating English audio and video content into Arabic, improving website and static content accessibility for the visually impaired, and more,” Andrews added.
For PlayAI, founded by entrepreneurs from the Middle East and North Africa region, the inclusion of Arabic language capabilities holds significant importance.
“As MENA founders, we understand the region’s heavy investment in AI capabilities and infrastructure, as evidenced by investments like Groq, along with leading adoption,” Felfel noted. “Arabic is a global business language and one that we grew up speaking, making it a natural choice as one of our core languages.”
The companies have made Dialog technology accessible through GroqCloud’s tiered service model, offering both free and paid options. This approach allows developers to experiment with the technology before committing to larger implementations.
“GroqCloud provides free and paid plans. Anyone can create an account and generate an API code for free,” Andrews explained. “Our paid Developer Tier is self-serve, allowing individuals with a credit card to sign up themselves.”
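For developers who want a sense of what a first experiment might look like, here is a minimal Python sketch of a text-to-speech request. The article does not give API details, so the endpoint URL, the model name `playai-tts`, the voice name, the request fields, and the `GROQ_API_KEY` environment variable are all assumptions based on GroqCloud’s OpenAI-style API conventions; check them against Groq’s current documentation before relying on them.

```python
import json
import os
import urllib.request

# Assumed values, not stated in the article; verify against GroqCloud's docs.
API_URL = "https://api.groq.com/openai/v1/audio/speech"
MODEL = "playai-tts"
VOICE = "Fritz-PlayAI"


def build_tts_request(text: str, api_key: str,
                      model: str = MODEL, voice: str = VOICE) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for synthesized speech."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "wav",  # assumed field name and value
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text: str, out_path: str = "speech.wav") -> None:
    """Send the request and write the returned audio bytes to `out_path`."""
    api_key = os.environ["GROQ_API_KEY"]  # hypothetical variable name
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Calling `synthesize("Hello from GroqCloud")` with a valid key would, under these assumptions, save the audio to `speech.wav`. Building the request in its own function makes it possible to inspect the payload without making a network call.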
As voice interfaces become increasingly important for AI systems, the partnership positions both companies to capitalize on growing demand for more natural and responsive conversational experiences. By tackling technical challenges such as latency and natural speech patterns, Groq and PlayAI may have removed significant barriers to wider adoption of voice AI in enterprise environments.