English Audio to Text Benchmark | Gemini 3 Pro, GPT‑5.2, Llama 4, DeepSeek v3.2

(Updated Dec 2025)
This workflow benchmarks multiple English speech‑to‑text pipelines on the same audio dataset. It compares transcription workflows based on GPT‑4o Audio, GPT Realtime, GPT‑5.2, Gemini 3 Pro, Llama 4, and DeepSeek v3.2.

Workflows covered
Each row of your dataset (Google Sheet) with an input_audio URL is sent to all of these Gooey workflows:

GPT‑4o Audio – English ASR
URL: https://gooey.ai/copilot/0-gpt-4oaudio-english-a2t-cumo9m8mbssd/
Direct audio→text via GPT‑4o’s native audio understanding.

GPT Realtime – Streaming English Transcription
URL: https://gooey.ai/copilot/1-gpt-realtime-english-a2t-xxisb52q5i8l/
Simulates realtime / streaming ASR for latency‑sensitive use cases.

GPT‑5.2 – English Audio→Text Pipeline
URL: https://gooey.ai/copilot/2-gpt-52-english-a2t-j48rl7w31egx/
Uses GPT‑5.2 as the primary model for transcription and light cleanup.

Gemini 3 Pro – English Audio→Text
URL: https://gooey.ai/copilot/3-gemini3pro-english-a2t-22ehqxbjujdn/
Google Gemini 3 Pro based transcription workflow for English audio.

Llama 4 – English ASR + LLM Post‑Processing
URL: https://gooey.ai/copilot/4-llama4-english-a2t-i2wytf132u11/
Llama 4 is used for transcription and/or normalization of English speech.

DeepSeek v3.2 – English Audio Transcription
URL: https://gooey.ai/copilot/5-deepseek32-english-a2t-iy8blj05sfgr/
DeepSeek v3.2 model pipeline for English audio→text.
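
The fan-out described above (every row with an input_audio URL goes to every workflow) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Transcriber` callables are hypothetical stand-ins for whatever actually invokes each Gooey workflow, and `fan_out` is not part of any Gooey API.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in: a function that takes an audio URL and returns text.
# How each Gooey workflow is actually invoked is not shown here.
Transcriber = Callable[[str], str]


def fan_out(rows: List[dict], workflows: Dict[str, Transcriber]) -> List[dict]:
    """Send every row that has an input_audio URL to every workflow.

    Returns one output row per input row, with one extra column per workflow
    holding that workflow's transcript.
    """
    results = []
    for row in rows:
        audio_url = row.get("input_audio")
        if not audio_url:
            continue  # rows without an input_audio URL are skipped
        out = dict(row)  # copy so the input rows are left untouched
        for name, transcribe in workflows.items():
            out[name] = transcribe(audio_url)
        results.append(out)
    return results
```

Keeping one output column per workflow makes the later side-by-side comparison straightforward, since every transcript for a given audio file sits in the same row.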

Evaluation Workflows
Compare Output Text (from input_audio)
Aggregate: Mean
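
One possible way to compare output text and aggregate by mean is sketched below. The page does not say which comparison metric the evaluation uses, so word error rate (WER) is an assumption here, as are the `reference_text` column and the function names; treat this as an illustration rather than the evaluation's actual logic.

```python
from statistics import mean
from typing import Dict, List


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    # Rolling-row Levenshtein distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def mean_wer_by_workflow(rows: List[dict],
                         workflow_columns: List[str],
                         reference_column: str = "reference_text") -> Dict[str, float]:
    """Average WER per workflow column (assumes at least one row).

    `reference_column` is a hypothetical column holding a ground-truth transcript.
    """
    return {
        col: mean(word_error_rate(row[reference_column], row[col]) for row in rows)
        for col in workflow_columns
    }
```

A lower mean WER indicates transcripts closer to the reference. Real evaluations usually also normalize punctuation and numerals before comparing, which this sketch leaves out.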