How It Works
Overview
When you submit a Reel URL, instatollm automatically:
- Downloads the video from Instagram
- Analyzes it using Google Gemini AI — reading video, audio, and on-screen text in a single pass
- Returns a structured JSON description ready to use in any LLM or workflow
The entire process takes 20–90 seconds depending on video length.
Async processing
Analysis is asynchronous. When you submit a URL, you get a job ID immediately. You then poll to check when it's done.
Submit URL → { id: 42, status: "pending" }
Poll every 5s → status: "processing"
Poll again → status: "done", result: { ... }
This design means:
- Your request returns instantly — no waiting for the full analysis
- You can submit multiple Reels simultaneously
- Long analyses never hold an HTTP connection open, so request timeouts are not a concern
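The submit-then-poll flow above could be sketched as a small helper. This is a minimal sketch under stated assumptions, not the official client: `fetch_job` is a hypothetical callable you supply that returns the job as a dict with a `status` key (the actual endpoint is not shown on this page). The 180-second timeout and the handling of unexpected statuses are illustrative choices.

```python
import time
from typing import Any, Callable, Dict

Job = Dict[str, Any]


def wait_for_job(
    fetch_job: Callable[[int], Job],  # hypothetical: returns the job dict for an id
    job_id: int,
    interval: float = 5.0,    # poll every 5 seconds, as in the flow above
    timeout: float = 180.0,   # assumption: give up after 3 minutes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Job:
    """Poll a job until its status is "done", then return the job dict."""
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        status = job.get("status")
        if status == "done":
            return job
        # Assumption: any status other than the three shown above is an error.
        if status not in ("pending", "processing"):
            raise RuntimeError(f"unexpected job status: {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not done after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the helper testable without real waiting. In normal use you would pass only `fetch_job` and the job ID.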
What gets extracted
Gemini AI reads the full video in one pass — visual frames, audio, and any text on screen:
| What | Where in response |
|---|---|
| What's happening visually | visual.description |
| Where it takes place | visual.setting |
| Objects and people | visual.subjects |
| All text visible on screen | visual.text_on_screen |
| Verbatim speech | audio.transcript |
| Spoken language | audio.language |
| Background music | audio.has_music, audio.music_description |
| Content topics | content.topics |
| Opening hook | content.hooks |
| Call to action | content.call_to_action |
| Key moments with timestamps | content.key_moments |
| Video format & tone | metadata.format, metadata.sentiment |
| Target audience | metadata.target_audience |
| Engagement metrics | platform_meta.like_count, platform_meta.view_count |
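The dotted names in the table can be read as paths into the result JSON. As a rough illustration, the hypothetical helper below (`get_path` is not part of instatollm) walks such a path through nested dicts and returns a default when a field is absent. The sample values are invented for the example, and the exact field types are an assumption.

```python
from collections.abc import Mapping
from typing import Any


def get_path(data: Mapping[str, Any], dotted: str, default: Any = None) -> Any:
    """Follow a dotted path such as "audio.transcript" through nested dicts."""
    current: Any = data
    for key in dotted.split("."):
        if not isinstance(current, Mapping) or key not in current:
            return default
        current = current[key]
    return current


# Invented sample shaped like the table above; real responses may differ.
sample_result = {
    "visual": {"setting": "kitchen"},
    "audio": {"transcript": "Welcome back!", "has_music": True},
    "platform_meta": {"like_count": 120},
}

transcript = get_path(sample_result, "audio.transcript")
topics = get_path(sample_result, "content.topics", default=[])
```

Returning a default instead of raising keeps downstream code simple when a Reel has, for example, no speech or no on-screen text.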
Re-queryable
Once a Reel is analyzed, the video stays available for 48 hours so you can ask follow-up questions without resubmitting it.
This feature is available in the dashboard — full API support coming soon.