Token Usage & Costs
instatollm tracks token consumption and estimated cost for every analysis call.
How Gemini charges for video
Gemini tokenizes video content at approximately:
| Content type | Tokens per second |
|---|---|
| Video (visual frames) | ~263 tokens/sec |
| Audio | ~32 tokens/sec |
| Total (video + audio) | ~295 tokens/sec |
Each call also includes the text prompt (~400 tokens) and the model's JSON response (~800–1,500 tokens).
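As a rough sketch, the rates above can be turned into a small estimator. The function name and the default prompt size are illustrative assumptions, not part of instatollm's API; the per-second rates are the approximate figures from the table.

```python
# Approximate per-second rates from the table above (not exact billing figures).
VIDEO_TOKENS_PER_SEC = 263
AUDIO_TOKENS_PER_SEC = 32


def estimate_input_tokens(duration_sec: float, prompt_tokens: int = 400) -> int:
    """Estimate input tokens for one analysis call (hypothetical helper)."""
    if duration_sec < 0:
        raise ValueError("duration_sec must be non-negative")
    media_tokens = duration_sec * (VIDEO_TOKENS_PER_SEC + AUDIO_TOKENS_PER_SEC)
    return round(media_tokens + prompt_tokens)


print(estimate_input_tokens(30))  # 9250
```

The output side is not modelled here because the response length depends on the analysis, not on the video duration.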
Example: 30-second Reel
| Component | Tokens |
|---|---|
| Video content (30s × 295) | 8,850 |
| Text prompt | ~400 |
| Total input (prompt_tokens) | ~9,250 |
| JSON response (output_tokens) | ~1,000 |
| Grand total | ~10,250 |
Model pricing
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gemini-2.5-pro | $1.25 | $10.00 |
| gemini-2.5-flash | $0.30 | $2.50 |
| gemini-2.5-flash-lite | $0.10 | $0.40 |
| gemini-3.5-flash | $1.50 | $9.00 |
Cost per reel (30s video)
| Model | Input cost | Output cost | Total |
|---|---|---|---|
| gemini-2.5-pro | $0.0116 | $0.0100 | ~$0.022 |
| gemini-2.5-flash | $0.0028 | $0.0025 | ~$0.005 |
| gemini-2.5-flash-lite | $0.0009 | $0.0004 | ~$0.001 |
Estimates only
These are estimates based on published pricing. Actual charges from Google may vary. Always verify with Google's pricing page.
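The per-reel figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the helper name and the dictionary layout are assumptions, and the rates are copied from the pricing table on this page rather than fetched from Google.

```python
# USD per 1M tokens as (input, output), copied from the pricing table above.
# These may be out of date; verify against Google's pricing page.
PRICING_PER_MILLION = {
    "gemini-2.5-pro": (1.25, 10.00),
    "gemini-2.5-flash": (0.30, 2.50),
    "gemini-2.5-flash-lite": (0.10, 0.40),
}


def estimate_cost_usd(model: str, prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one call in USD (hypothetical helper)."""
    input_rate, output_rate = PRICING_PER_MILLION[model]
    return (prompt_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# 30-second Reel: ~9,250 input tokens and ~1,000 output tokens.
for model in PRICING_PER_MILLION:
    print(f"{model}: ${estimate_cost_usd(model, 9_250, 1_000):.4f}")
```

Rounded to three decimals, the printed values should line up with the totals in the cost table.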
Viewing your usage
Your token usage and cost data is available in the dashboard at app.instatollm.com/usage.
The page shows:
- Total spend (USD)
- Total API calls
- Token breakdown (input vs output)
- Per-model breakdown with cost percentage
API usage endpoint
{
  "total_cost_usd": 0.0843,
  "total_prompt_tokens": 312400,
  "total_output_tokens": 28600,
  "total_tokens": 341000,
  "total_calls": 34,
  "by_model": [
    {
      "model": "gemini-2.5-pro",
      "calls": 34,
      "prompt_tokens": 312400,
      "output_tokens": 28600,
      "total_tokens": 341000,
      "cost_usd": 0.0843
    }
  ]
}
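A response in this shape can be summarized client-side, for example to reproduce the per-model cost percentage shown in the dashboard. The sketch below starts from the JSON text because the endpoint path and authentication are not specified here; `cost_share_by_model` is a hypothetical helper, not part of instatollm.

```python
import json

# Sample payload in the shape shown above. Fetching it is left out because
# the endpoint path and authentication are not specified on this page.
SAMPLE_RESPONSE = """
{
  "total_cost_usd": 0.0843,
  "total_prompt_tokens": 312400,
  "total_output_tokens": 28600,
  "total_tokens": 341000,
  "total_calls": 34,
  "by_model": [
    {"model": "gemini-2.5-pro", "calls": 34, "prompt_tokens": 312400,
     "output_tokens": 28600, "total_tokens": 341000, "cost_usd": 0.0843}
  ]
}
"""


def cost_share_by_model(usage: dict) -> dict[str, float]:
    """Return each model's share of total spend as a percentage (hypothetical helper)."""
    total = usage["total_cost_usd"]
    if total <= 0:
        # Avoid dividing by zero when nothing has been spent yet.
        return {entry["model"]: 0.0 for entry in usage["by_model"]}
    return {
        entry["model"]: 100.0 * entry["cost_usd"] / total
        for entry in usage["by_model"]
    }


usage = json.loads(SAMPLE_RESPONSE)
for model, share in cost_share_by_model(usage).items():
    print(f"{model}: {share:.1f}% of ${usage['total_cost_usd']:.4f}")
```

With a single model in `by_model`, as in the sample, that model accounts for the whole spend.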
Optimizing costs
Use gemini-2.5-flash instead of gemini-2.5-pro
For most use cases, Flash gives equivalent quality at roughly 4× lower cost (about $0.005 versus $0.022 for a 30-second Reel). Switch the model in the dashboard settings.
Shorter videos cost less
A 15-second Reel uses roughly half the input tokens of a 30-second Reel (about 4,825 versus 9,250, including the ~400-token prompt).
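A quick check of that ratio, assuming the approximate rates quoted on this page (~295 tokens per second of video plus a ~400-token prompt); the helper is purely illustrative:

```python
TOKENS_PER_SEC = 295   # approximate video + audio rate from this page
PROMPT_TOKENS = 400    # approximate text prompt size from this page


def input_tokens(duration_sec: float) -> float:
    """Approximate input tokens for a Reel of the given length (hypothetical helper)."""
    return duration_sec * TOKENS_PER_SEC + PROMPT_TOKENS


ratio = input_tokens(15) / input_tokens(30)
print(f"15s vs 30s input tokens: {ratio:.0%}")  # 52%
```

The ratio lands slightly above one half because the prompt overhead is fixed regardless of video length. Output tokens are not part of this comparison.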
Re-query the same Reel
Once analyzed, a Reel stays available for follow-up questions for 48 hours. Future re-query features will charge only for the new question, not the full video analysis again.