Compare V2T Raw Output Text

This workflow evaluates and compares the performance of voice-to-voice (V2V) bots that communicate in non-English languages. The input CSV must include the following columns:

'question': the question from the audio file, translated into English
'reference_answer': the reference answer, translated into English

The workflow compares AI-generated answers (copilot responses) against the golden reference answers, scoring them for accuracy and penalizing hallucinations. It then aggregates the results into an overall performance metric for your V2V AI bots.
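
As a rough illustration of the input format described above, the sketch below checks that a CSV carries the two required columns before it is uploaded. Only the column names 'question' and 'reference_answer' come from this page; the helper name missing_columns and the sample row are hypothetical, and this is not Gooey.AI's own validation logic.

```python
import csv
import io

# Column names taken from the workflow description above.
REQUIRED_COLUMNS = ("question", "reference_answer")


def missing_columns(csv_text: str) -> list[str]:
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [col for col in REQUIRED_COLUMNS if col not in header]


# Build a tiny example file in memory (hypothetical data, for illustration only).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=REQUIRED_COLUMNS)
writer.writeheader()
writer.writerow({
    "question": "When should I plant maize?",
    "reference_answer": "At the start of the rainy season.",
})
sample_csv = buffer.getvalue()

print(missing_columns(sample_csv))  # prints []
```

An empty list means the header has both columns; any names returned would need to be added to the spreadsheet before running the workflow.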

9mo ago

Input Data Spreadsheet

Input Data Preview

Language Model

Evaluation Prompts

Lower values are better

Aggregations

Run cost = 4 credits

Aggregate:Mean
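
A minimal sketch of what a mean aggregation over per-row scores could look like. The score values, the use of None for rows without a score, and the function name aggregate_mean are assumptions made for illustration; the page only states that results are aggregated with a mean and that lower values are better.

```python
from statistics import mean


def aggregate_mean(scores):
    """Average the per-row scores, skipping rows that have no score (None)."""
    valid = [s for s in scores if s is not None]
    if not valid:
        raise ValueError("no scores to aggregate")
    return mean(valid)


# Hypothetical per-row scores; None marks a row that could not be scored.
row_scores = [0.0, 1.0, None, 2.0]

print(aggregate_mean(row_scores))  # prints 1.0
```

Skipping unscored rows rather than counting them as zero is a design choice in this sketch: since lower values are better, treating a missing score as 0 would make a failed row look like a perfect one.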

Bulk Runner and Evaluator

Which AI model actually works best for your needs? Upload your own data and evaluate any Gooey.AI workflow, LLM or AI model against any other. Great for large data sets, AI model evaluation, task automation, …

Copilot Builder

Gooey.AI's base AI workflow with built-in RAG, web search, voice understanding of 1000+ languages, code creation + execution, API connections & integrations to create your own WhatsApp, Web, FB and voice AI …

Speech Recognition and Translation

Transcribe mp3s, WhatsApp voice notes and YouTube videos in 1000+ languages with Meta's MMS/SeamlessM4T, OpenAI's GPT-4o Audio LLM, Whisper v2/v3, Azure, Google, GhanaNLP, AI4Bharat & Bhashini ASR models. Optionally …

RAG in the Cloud: Search any document with AI

We've built the best Retrieval Augmented Generation (RAG) as-a-Service anywhere - now with page-level citations! Absorb tables, PDFs, docs, links, videos or audio clips and use our synthetic data maker to …
