Examples: Eval

Computational Mama aka Ambika

Copilot Evaluator

140 runs

Our general bulk evaluator, which compares AI-generated copilot answers against a collection of golden answers.

Documents

🔗bulk-runner-0-1-8.csv

Dev Aggarwal

Low Resource ASR Evaluator

96 runs

Documents

🔗bulk-runner-0-4-0.csv

Gooey.AI

Speech Recognition Model Evaluator

151 runs

This recipe is used with https://gooey.ai/bulk to evaluate the latest private and open-source speech recognition models (from Google, Meta, OpenAI and others). It takes a CSV file of golden (i.e. human-provided) translations and compares them against a set of AI-generated translations, producing a score from 0 to 1 for each comparison. It then takes the mean of each model's scores to determine which model performed best.
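The score-then-average idea described above can be sketched in a few lines of Python. This is only an illustration, not the recipe's actual implementation: the column names (`golden`, `model_a`, `model_b`), the sample rows, and the `similarity` function are all assumptions, and `difflib.SequenceMatcher` stands in for whatever scoring method the recipe really uses.

```python
import csv
import io
from collections import defaultdict
from difflib import SequenceMatcher
from statistics import mean


def similarity(golden: str, candidate: str) -> float:
    """Stand-in score in [0, 1]; the recipe's real scoring method is not stated."""
    a = golden.lower().strip()
    b = candidate.lower().strip()
    return SequenceMatcher(None, a, b).ratio()


def rank_models(csv_text: str, golden_col: str = "golden") -> list[tuple[str, float]]:
    """Score every non-golden column against the golden column, then average per model."""
    reader = csv.DictReader(io.StringIO(csv_text))
    scores = defaultdict(list)
    for row in reader:
        golden = row[golden_col]
        for col, value in row.items():
            if col != golden_col:
                scores[col].append(similarity(golden, value))
    # Highest mean score first: the "best" model under this stand-in metric.
    return sorted(
        ((model, mean(vals)) for model, vals in scores.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )


# Hypothetical sample data: one golden column and one column per model.
SAMPLE = """golden,model_a,model_b
hello world,hello world,hello word
good morning,good morning,good mourning
"""

ranking = rank_models(SAMPLE)
for model, score in ranking:
    print(f"{model}: {score:.3f}")
```

In this sketch each model is one CSV column, so adding a model to the comparison means adding a column. A word-error-rate style metric could replace `similarity` without changing the averaging step.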

Documents

🔗bulk-runner-0-7-0.csv

Gooey.AI