Google

Gemini 3 Flash (Audio)

Google / gemini-3-flash-audio

Audio understanding that goes beyond pure transcription: the model hears tone, music, sound effects (SFX), language, and speaker emotion. Inline audio payloads can be up to 30 minutes long.

Best for: Audio scene understanding, mood/emotion detection, music recognition

At a glance

Price: $0.0006 / min

Speed: fast

Quality tier: frontier open

Per minute of audio processed. Min billable: 1 minute.
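
As a rough illustration of the pricing rule above, the following Python sketch estimates the cost of a single clip. The rate and the one-minute minimum come from this page; the function name and the pro rata treatment of fractional minutes above the minimum are assumptions, since the page does not say whether durations are rounded up to whole minutes.

```python
PRICE_PER_MINUTE_USD = 0.0006  # listed price per minute of audio processed
MIN_BILLABLE_MINUTES = 1.0     # listed minimum billable duration


def estimate_cost_usd(duration_seconds: float) -> float:
    """Estimate the cost of processing one audio clip.

    Assumption: time beyond the one-minute minimum is billed pro rata,
    not rounded up to the next whole minute.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    billable_minutes = max(MIN_BILLABLE_MINUTES, duration_seconds / 60.0)
    return billable_minutes * PRICE_PER_MINUTE_USD
```

Under these assumptions a 10-second clip is billed as one full minute ($0.0006), and a 30-minute clip, the largest inline payload, comes to about $0.018.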

Capabilities

Audio input

Accepts audio uploads for transcription or scene understanding

Quick start

curl https://kymaapi.com/v1/chat/completions \
  -H "Authorization: Bearer $KYMA_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3-flash-audio",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
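
The request above sends only text. A minimal Python sketch of how an audio clip might be attached is shown below. The endpoint, model ID, and KYMA_KEY variable come from the quick start; the `input_audio` content part follows the OpenAI-style chat completions convention, and whether this endpoint accepts that exact shape is an assumption, so check the provider's documentation before relying on it. The helper names `build_audio_request` and `send_request` are hypothetical.

```python
import base64
import json
import os
import urllib.request

API_URL = "https://kymaapi.com/v1/chat/completions"  # from the quick start
MODEL_ID = "gemini-3-flash-audio"                     # from the quick start


def build_audio_request(audio_bytes: bytes, prompt: str, audio_format: str = "wav") -> dict:
    """Build a chat payload with a text prompt and one inline audio clip.

    Assumption: the endpoint accepts OpenAI-style "input_audio" parts
    carrying base64-encoded audio data.
    """
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {
        "model": MODEL_ID,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "input_audio",
                 "input_audio": {"data": encoded, "format": audio_format}},
            ],
        }],
    }


def send_request(payload: dict) -> dict:
    """POST the payload using the API key stored in the KYMA_KEY variable."""
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['KYMA_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call might look like `send_request(build_audio_request(open("clip.wav", "rb").read(), "Describe the mood of this recording."))`, where `clip.wav` is a placeholder file name.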

Details

Creator: Google
Model ID: gemini-3-flash-audio
Quality tier: frontier open
Cost tier: cheap
Input modality: Audio
Output modality: Text

Try Gemini 3 Flash (Audio) now

$0.50 free credits on signup. No credit card required.