voice-ai tools

Deepgram

asr | freemium | Featured

Speech recognition API focused on fast, accurate transcription of real-time audio streams.

Whisper (OpenAI API)

asr | paid | Featured

OpenAI's high-quality transcription and translation API, excellent for batch processing and complex audio.

WellSaid Labs

tts | paid | Featured

AI voice synthesis tool focused on brand consistency and high-quality, clean voiceovers for enterprise use.

ElevenLabs

tts | freemium | Featured

4.9

Leading Text-to-Speech and Voice Cloning platform with high emotional range and fidelity.

Retell AI

agent | freemium | Featured

Real-time voice agent platform optimized for low latency and smooth conversational flow over phone lines and the web.

All tools (57)

Rasa

agent | freemium

4.2

Open-source framework for building contextual AI assistants with sophisticated dialogue management.

Play.ht

AI Voice Generator and voice cloning tool, focusing on high-quality synthetic media and audio articles.

Speechmatics

Global ASR platform known for high accuracy across a vast array of accents and challenging audio.

Murf AI

A popular text-to-speech studio used for professional voiceovers, e-learning, and commercial content.

5 Languages

Deepdub

dubbing | paid

AI-powered localization service for film, TV, and gaming, maintaining the original actor's voice tone and cadence.

5 Languages

Lyrebird (Descript)

cloning | freemium

Voice cloning technology integrated within the Descript editor, allowing text-to-speech editing of recorded audio.

Amazon Polly

AWS's Text-to-Speech service with high-quality neural voices and SSML control.

Vonage Voice API

A versatile communications API for making and receiving voice calls, focusing on call control and webhooks.

LiveKit

agent | freemium

Open-source WebRTC platform for real-time video/audio, often used to build custom low-latency voice agents.

Gladia

4.2

Fast ASR service focused on real-time, low-latency multilingual transcription.

iSpeech

3.9

A simple, clean API for text-to-speech, popular for accessibility and content reading apps.

Voicify

Platform for creating and deploying voice experiences across platforms like Alexa, Google Assistant, and custom apps.

Meta Voicebox

cloning | free

4.9

State-of-the-art generative AI model for speech synthesis and voice editing (a research model without a full public API).

Lovo AI (Genny)

AI voice generator specializing in realistic voiceovers for marketing, video, and e-learning.

Acapela Group

3.8

Long-standing TTS provider focusing on accessibility, branded voices, and embedded solutions.

Twilio Voice API

4.8

Programmable telephony API that connects software to the PSTN (phone lines) and SIP endpoints, essential for voice agents.

Coqui TTS

tts | free

Open-source toolkit for Text-to-Speech (TTS) and voice cloning, focused on research and customization.

Google Dialogflow CX

Google's advanced conversational AI platform for designing complex, multi-turn virtual agents.

Dubbing AI

dubbing | freemium

Simple, fast AI dubbing tool for content creators and small businesses, focused on quick turnaround.

Amazon Transcribe

AWS's scalable, managed transcription service for both real-time and batch processing of audio/video.

Voicemaker

3.7

Web-based TTS tool with a focus on speed and ease of use for quick voice generation.

IBM Watson Speech to Text

3.9

IBM's cognitive service for converting speech to text, with strong customization for domain-specific vocabulary.

Pindrop

Voice biometrics and fraud detection platform, crucial for securing voice transactions in contact centers.

Loqui.tech

dubbing | freemium

3.9

AI-powered dubbing service focused on e-learning and internal corporate communication video content.

Krisp

AI-powered noise, voice, and echo cancellation technology, often integrated into real-time voice apps to improve ASR input.

Synapse AI

3.8

Agent platform for building virtual assistants optimized for field service and technical support via voice.

Voicera

Meeting transcription and note-taking platform, focused on extracting insights and action items from conversations.

Voice AI (Custom)

Represents the option of building your own full-stack voice agent from open-source components (e.g., Llama or Mistral as the language model, plus a voice activity detection (VAD) stage).

Synthesia

AI video generation platform where TTS voices are used with human avatars, often for corporate training and explainer videos.

Vidnoz AI Voice Changer

cloning | freemium

3.5

Tool for voice cloning and celebrity/character voice conversion, often used for content creation and fun projects.

OpenAI TTS

OpenAI's high-quality Text-to-Speech API, offering natural and expressive voices optimized for conversation.

Amazon Lex

AWS service for building conversational interfaces for voice and text, using the same technology as Alexa.

Speechify

Leading text-to-speech platform focused on content consumption, education, and accessibility.

Respeecher

cloning | paid

4.8

Studio-grade voice cloning and voice-to-voice conversion, used in major film productions and for deepfake prevention.

2 Languages

CereVoice

3.6

Text-to-speech technology with a focus on custom voice creation and embedding TTS into various devices and apps.

Google Translate API

dubbing | paid

Google's core translation engine, widely used as the translation step in basic dubbing pipelines.

Microsoft Translator

dubbing | paid

Microsoft's translation service, offering real-time translation capabilities for integration into communication apps.

iCloner (Third Party)

cloning | freemium

3.8

A hypothetical third-party tool focusing purely on low-cost, quick voice cloning via a simple API.

2 Languages

Speechelo

3.5

Popular online TTS tool marketed to video creators for generating voiceovers easily without API integration.

Amazon Connect

AWS's contact center service, which provides a framework for integrating AI services (Lex, Polly, Transcribe) into customer calls.

Custom Fine-Tuned Whisper

Represents fine-tuning the open-source Whisper model on domain-specific data for higher ASR accuracy in niche domains.

2 Languages

Microsoft Azure Speech to Text

Microsoft's cloud ASR service, offering strong real-time performance and deep integration with Azure enterprise tools.

Voicely

3.7

TTS tool focused on creating audio for books, articles, and blog content with an emphasis on natural, reading-friendly tones.

Dubverse

dubbing | freemium

Collaborative video dubbing platform that makes the translation and voiceover process simple for teams.

AssemblyAI

Transcription and Intelligence API, providing sentiment, summarization, and topic detection alongside ASR.

Coqui Studio

A web-based studio offering high-quality text-to-speech generation and voice conversion services.

7 Languages

Google Cloud Speech-to-Text

Google's powerful, scalable cloud ASR service, integrated with the wider GCP ecosystem.

Microsoft Azure Text-to-Speech

Enterprise-grade TTS with highly natural voices, supporting custom voice creation and wide language support.

Resemble AI

cloning | paid

4.8

Hyper-realistic voice cloning and synthesis, with a 'Resemble Fill' feature for real-time error correction.

Whisper.cpp

asr | free

High-performance C++ port of OpenAI's Whisper model, optimized for fast, on-device transcription.

Vogent

AI voice agent platform specializing in phone interactions, IVR navigation, and outbound calling campaigns.