Gemini 3 Flash (Audio)
Google
gemini-3-flash-audio
Audio understanding. Hears tone, music, SFX, language, and speaker emotion, going beyond pure transcription. Inline payload up to 30 min.
Best for: Audio scene understanding, mood/emotion detection, music recognition
At a glance
Price: $0.0006 / min
Speed: fast
Quality tier: frontier open
Billing: per minute of audio processed. Minimum billable: 1 minute.
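As a rough illustration of the billing rule, the sketch below estimates the cost of a clip from its duration, using the listed price and the one-minute minimum. The page does not say whether partial minutes above the minimum are rounded up, so the function assumes they are billed pro rata; the name estimate_cost_usd is hypothetical.

```python
PRICE_PER_MINUTE_USD = 0.0006  # listed price per minute of audio
MIN_BILLABLE_MINUTES = 1.0     # listed minimum billable duration


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the charge for one clip.

    ASSUMPTION: durations above the minimum are billed pro rata,
    not rounded up to whole minutes. The page does not specify this.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = duration_seconds / 60.0
    # Clips shorter than the minimum are charged as one full minute.
    billable_minutes = max(minutes, MIN_BILLABLE_MINUTES)
    return billable_minutes * PRICE_PER_MINUTE_USD
```

Under these assumptions a 20-second clip is charged as one minute ($0.0006), and a clip at the 30-minute inline limit comes to about $0.018.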
Capabilities
Audio input
Accepts audio uploads for transcription or scene understanding
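The quick-start request on this page sends only a text message, so it does not show how audio is attached. The sketch below builds a request body that carries base64-encoded audio alongside a text prompt. The "input_audio" content-part shape is an assumption borrowed from OpenAI-style chat completions APIs and is not confirmed by this page; build_audio_request is a hypothetical helper, and the provider's API reference is the place to check the real field names.

```python
import base64

MODEL_ID = "gemini-3-flash-audio"  # model ID as listed on this page


def build_audio_request(audio_bytes: bytes, audio_format: str, prompt: str) -> dict:
    """Build a chat-completions body with one text part and one audio part.

    ASSUMPTION: the "input_audio" part follows the OpenAI-style convention;
    this endpoint may expect a different shape.
    """
    if not audio_bytes:
        raise ValueError("audio_bytes must not be empty")
    # JSON cannot carry raw bytes, so the audio is base64-encoded inline.
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": MODEL_ID,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": encoded, "format": audio_format},
                    },
                ],
            }
        ],
    }
```

The returned dictionary can be serialized with json.dumps and sent as the body of a POST to the chat completions endpoint, with the same Authorization and Content-Type headers as the text-only request.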
Quick start
curl https://kymaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KYMA_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-audio",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
Details
Creator: Google
Model ID: gemini-3-flash-audio
Quality tier: frontier open
Cost tier: cheap
Input modality: Audio
Output modality: Text
Try Gemini 3 Flash (Audio) now
$0.50 free credits on signup. No credit card required.