Gemini Audio Analysis — Model Comparison
Test each Gemini model's ability to analyze music structure, detect BPM, and identify sections. (v12 - saved results preserve raw JSON + heatmap)
What we're testing
- Audio perception — can the model "hear" and understand the audio waveform?
- BPM accuracy — how close is the detected BPM to the local beat detector baseline?
- Section detection — does it identify meaningful musical sections (intro, verse, chorus, drop, etc.)?
- JSON reliability — does it return valid, parseable JSON without extra text?
- Latency — how fast does each model respond?
- Cost — estimated cost per analysis based on token usage
- Reasoning quality — do beat_interval and energy values make musical sense?
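Two of the criteria above, JSON reliability and BPM accuracy, can be scored mechanically. The following is a minimal sketch, not the tool's actual implementation: the `bpm` field name, the function names, and the result keys are all assumptions made for illustration. It treats a response as valid only if the whole string parses as a JSON object with no surrounding text, and reports the BPM deviation from the local baseline in absolute and percentage terms.

```python
import json


def parse_strict_json(raw):
    """Return the parsed value if `raw` is JSON with no extra text, else None."""
    try:
        return json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        # Extra prose around the JSON, truncated output, or a non-string input.
        return None


def bpm_error(detected, baseline):
    """Return (absolute error, percentage error) relative to the baseline BPM."""
    abs_err = abs(detected - baseline)
    return abs_err, 100.0 * abs_err / baseline


def score_response(raw, baseline_bpm):
    """Score one model response; the 'bpm' key is a hypothetical field name."""
    parsed = parse_strict_json(raw)
    result = {
        "json_valid": isinstance(parsed, dict),
        "bpm_abs_error": None,
        "bpm_pct_error": None,
    }
    if result["json_valid"]:
        bpm = parsed.get("bpm")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(bpm, (int, float)) and not isinstance(bpm, bool):
            abs_err, pct_err = bpm_error(bpm, baseline_bpm)
            result["bpm_abs_error"] = abs_err
            result["bpm_pct_error"] = pct_err
    return result


if __name__ == "__main__":
    clean = '{"bpm": 126, "sections": []}'
    chatty = 'Sure! Here is the analysis: {"bpm": 120}'
    print(score_response(clean, 120.0))
    print(score_response(chatty, 120.0))
```

A strict parse is used deliberately: a response that wraps the JSON in prose counts as a failure, which matches the "without extra text" requirement. A more lenient scorer could also credit half-time or double-time BPM readings, but that is a separate design decision not made here.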
| Model | Status | Latency | Cost Est. | BPM | Sections | Key Moments | JSON Valid | Raw Response |