Compare V2T Raw Output Text

This workflow evaluates and compares the performance of voice-to-voice (V2V) bots that communicate in non-English languages. The input CSV must include the following columns:

  • 'question' (the question from the audio file, translated into English)
  • 'reference_answer' (the reference answer, translated into English)

The workflow compares AI-generated answers (copilot responses) against the golden reference answers, scores them for accuracy, and penalizes hallucinations. It then aggregates the results into an overall performance metric for your V2V AI bots.
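
Before uploading, it can help to confirm that the CSV header contains both required columns. The following is a minimal sketch using only Python's standard library. The helper name `missing_columns` and the sample row are made up for illustration and are not part of the workflow itself.

```python
import csv
import io

# Column names taken from the workflow description above.
REQUIRED_COLUMNS = {"question", "reference_answer"}

def missing_columns(csv_text):
    """Return the required columns absent from the CSV header, sorted."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])  # fieldnames is None for empty input
    return sorted(REQUIRED_COLUMNS - header)

# Illustrative sample row (invented data, not from a real evaluation set).
sample = 'question,reference_answer\n"What is the capital of Kenya?","Nairobi"\n'
print(missing_columns(sample))
```

An empty list means both columns are present. Any names returned are the ones to add before running the workflow.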
Input Data Spreadsheet
Evaluation Prompts

Aggregations

Run cost = 4 credits



Aggregate:Mean
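
As a rough illustration of what a mean aggregation does, the sketch below averages a per-row score column. How the workflow computes this internally is not stated here. The column name `score` and the function `aggregate_mean` are assumptions for the example only.

```python
from statistics import mean

def aggregate_mean(rows, column="score"):
    """Average the numeric values in `column`, skipping blank or missing cells.

    `column="score"` is a hypothetical name; use whatever column your
    evaluation output actually contains.
    """
    values = [float(row[column]) for row in rows if row.get(column) not in (None, "")]
    return mean(values) if values else None

# Invented scores for illustration.
rows = [{"score": "1"}, {"score": "0.5"}, {"score": ""}]
print(aggregate_mean(rows))
```

Blank cells are skipped rather than counted as zero, so unscored rows do not pull the average down. That is a design choice of this sketch, not a documented behavior of the workflow.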

