Speech Recognition and Translation

Transcribe MP3, WAV, and WhatsApp audio files with OpenAI's Whisper or the AI4Bharat / Bhashini ASR models. Optionally, translate the transcript into another language.

Audio Files

Submit Links in Bulk

ASR Model

Whisper Large v2 (openai)

Spoken Language

English | en
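
The three inputs above (audio files, ASR model, spoken language) can be pictured as a small settings record. The sketch below is illustrative only: the class name `AsrRunSettings` and its field names are hypothetical and are not taken from Gooey.AI's API. The defaults simply mirror the values shown in the form, and the language label is assumed to follow the "Name | code" pattern seen in "English | en".

```python
from dataclasses import dataclass, field


@dataclass
class AsrRunSettings:
    """Hypothetical container for the form inputs; not Gooey.AI's API."""

    # Links or paths to the audio files to transcribe.
    audio_files: list[str] = field(default_factory=list)
    # Model label as displayed in the form.
    asr_model: str = "Whisper Large v2 (openai)"
    # Language label as displayed in the form: "<name> | <code>".
    spoken_language: str = "English | en"

    def language_code(self) -> str:
        """Return the short code after the '|' separator, e.g. 'en'."""
        return self.spoken_language.split("|")[-1].strip()


if __name__ == "__main__":
    settings = AsrRunSettings(audio_files=["sample.mp3"])
    print(settings.asr_model, settings.language_code())
```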

⚙️ Settings

Run cost = 2 credits (1 credit per 12.5 words ≈ 0.08 credits per word)
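
The quoted rate can be checked with a few lines of arithmetic. This is a minimal sketch that assumes cost scales linearly with word count at 1 credit per 12.5 words. Whether the platform rounds the result, applies a minimum charge, or counts words differently is not stated on the page, so the function names and the linear model here are assumptions.

```python
WORDS_PER_CREDIT = 12.5  # rate quoted on the page: 1 credit per 12.5 words


def credits_per_word(words_per_credit: float = WORDS_PER_CREDIT) -> float:
    """Credits charged for a single word at the given rate."""
    return 1.0 / words_per_credit


def estimate_credits(word_count: int,
                     words_per_credit: float = WORDS_PER_CREDIT) -> float:
    """Linear estimate of run cost in credits (no rounding or minimum applied)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return word_count / words_per_credit


if __name__ == "__main__":
    print(credits_per_word())    # per-word rate in credits
    print(estimate_credits(25))  # estimated cost of a 25-word transcript
```

Under this linear assumption, the 2-credit run cost shown above corresponds to a transcript of 25 words.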

Gooey.AI's Copilot is the best chatbot builder anywhere, combining your choice of LLMs (GPT4o, Gemini, Claude3, Mixtral or LLaMA3), knowledge docs from any link or doc/PDF (with table support!), [speech …

Lipsync with Text-to-Speech

Create realistic lipsync videos with custom voices. Just upload a video or image, choose a voice from Google, OpenAI or bring your own voice from Eleven Labs to generate amazing videos with the Gooey.AI …

Compare AI Voice Generators

Input your text, pick a voice & a Text-to-Speech AI engine to create audio. Compare the best voice generators from Bark/Suno, …

Compare LLMs: GPT4o, Claude3, Gemini 1.5 Pro, LLaMA3 vs Mixtral

Which language model works best for your prompt? What are the biases inherent in each? Compare LLaMA2, Gemini, Mistral, OpenAI GPT-4 engines with more LLMs being added each month.

🧩 Functions

Translation Model

Output Format

Related Workflows

Copilot Builder
