In this bulk run example, we compare 6 different Gooey.AI speech recognition + translation workflows (https://gooey.ai/speech), each of which uses a different AI speech model:
- Whisper Telugu Bhashini
- Whisper v2
- Google Chirp
- Azure
- Meta SeamlessM4T
- AI4Bharat.org Conformer Hindi
We then evaluate them at https://gooey.ai/eval/ (where you can see what works best).
Provide one or more Gooey.AI workflow runs. You can add multiple runs from the same recipe (e.g. two versions of your copilot) and we'll run the inputs over both of them.
Search any document with GPT4
Why is the transformer architecture expressive in the …
➕ Add a Workflow
Upload or link to a CSV or Google Sheet that contains your sample input data. For example, for Copilot, this would be sample questions; for Art QR Code, it would be pairs of image descriptions and URLs. Remember to include header names in your CSV too.
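For instance, a minimal input CSV for a Copilot-style run could be built like this. The column name `question` is purely illustrative; use whatever header matches your workflow's input field:

```python
import csv

# Sample input rows for a bulk run. The header name "question" is
# an illustrative placeholder; match it to your workflow's input field.
rows = [
    {"question": "Why is the transformer architecture expressive?"},
    {"question": "How do I cite a PDF page in my answer?"},
]

with open("bulk_inputs.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["question"])
    writer.writeheader()  # remember to include the header row
    writer.writerows(rows)
```

The resulting file has one header line followed by one line per sample input, which is the shape the bulk runner expects.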
Submit Links in Bulk
Please select which CSV columns correspond to your workflow's input fields. For the outputs, select the fields that should be included in the output CSV. To understand what each field represents, check out our API docs.
Search Query
question
Output Text
Run URL
🤲 Show All Columns
Keyword Query
———
Documents
Max References
Max Context Words
Scroll Jump
Doc Extract Url
Embedding Model
Dense Embeddings Weightage
Task Instructions
Query Instructions
Selected Model
Avoid Repetition
Num Outputs
Quality
Max Tokens
Sampling Temperature
Citation Style
Variables
Price
Run Time
Error Msg
References
Final Prompt
Final Search Query
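Once the bulk run finishes, the output CSV contains one row per (input, workflow run) pair. A sketch of how you might group outputs by run for side-by-side comparison, assuming the column headers match the field names shown above ("question", "Output Text", "Run URL"):

```python
import csv
from collections import defaultdict

def outputs_by_run(path):
    """Group (question, output) pairs by the workflow run that produced them.

    Assumes the output CSV uses the headers "question",
    "Output Text" and "Run URL" (adjust to your column mapping).
    """
    grouped = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            grouped[row["Run URL"]].append((row["question"], row["Output Text"]))
    return dict(grouped)
```

With the results keyed by run URL, you can eyeball (or score) how each workflow variant answered the same set of questions.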
(optional) Add one or more Gooey.AI Evaluator Workflows to evaluate the results of your runs.
➕ Add an Eval
⚙️ Settings
Run cost = 1 credit
🏃 Submit
By submitting, you agree to Gooey.AI's terms & privacy policy.
https://storage.googleapis.com/dara-c1b52.appspot.com/daras_ai/media/584a47be-aa26-11ee-8ddd-02420a000148/bulk-runner-0-0-4.csv
Generated in 48.4s
ℹ️ Details
Building complex AI workflows (like a copilot) and then evaluating each iteration is hard. Workflows are affected by the particular LLM used (GPT4 vs PaLM 2), their vector DB knowledge sets (e.g. your Google Docs), how synthetic data was created (e.g. how you transformed your video transcript or PDF into structured data), which translation or speech engine you used, and your LLM prompts. Every change can affect the quality of your outputs.
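The variations listed above (LLM, speech engine, prompt) multiply quickly, which is why bulk runs help. A small sketch of enumerating the combinations you might want to evaluate; every name here is an illustrative placeholder, not a Gooey.AI identifier:

```python
from itertools import product

# Illustrative placeholder values only.
llms = ["gpt4", "palm2"]
speech_models = ["whisper_v2", "google_chirp", "azure"]
prompts = ["prompt_v1", "prompt_v2"]

# Every combination is one workflow variant worth evaluating.
variants = [
    {"llm": llm, "speech": speech, "prompt": prompt}
    for llm, speech, prompt in product(llms, speech_models, prompts)
]
print(len(variants))  # 2 * 3 * 2 = 12 combinations
```

Even three small knobs yield a dozen variants, so running each one by hand quickly stops scaling; a bulk run plus an eval workflow automates the comparison.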
To get started:
🙋🏽♀️ Need more help? Join our Discord