Auto-tag Camera Shots
Automatically categorizing shots is vital for streamlining the editing process and improving content discoverability.
eyepop.find-events.identify-shot-type:latest
Input
Video
Output
wide, medium, close_up
Image size
640x640
Model type
QWEN3 - Better Accuracy
FPS
4
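The settings above can be captured in a single configuration object. Below is a minimal sketch in Python; the key names are illustrative only and do not reflect the official EyePop schema:

```python
# Hypothetical configuration mirroring the UI settings above.
# Key names are illustrative; they are not the official EyePop schema.
ability_config = {
    "ability": "eyepop.find-events.identify-shot-type:latest",
    "input": "video",
    "event_names": ["wide", "medium", "close_up"],  # expected output labels
    "image_size": (640, 640),
    "model": "QWEN3",  # the "Better Accuracy" option in the UI
    "fps": 4,          # frames sampled per second of video
}
```

At 4 FPS, a one-minute clip yields roughly 240 sampled frames, which keeps evaluation fast while still catching shot changes.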
How It Works
Prepping and organizing video footage for post-production requires being able to quickly identify specific shots and angles. Manually reviewing and tagging hours of raw footage to locate specific shot types, however, is inefficient and time-consuming.
Being able to automatically categorize shots is vital for streamlining the editing process and improving content discoverability. The Find Events task on the Abilities tab can act as an automated cinematography assistant, identifying whether a video segment is a wide, medium, or close-up shot and locating those occurrences throughout the timeline.
For example, a specific segment of film footage should be flagged with the label wide if it captures the entire scenery or environment to establish the setting. In contrast, a segment should be flagged as close_up if it focuses tightly on one particular subject or detail, such as a person's face, to convey emotion or intricate detail.

Our expected inputs are videos, and the expected output is the set of timestamps identifying exactly when each shot type (wide, medium, or close_up) occurs throughout the footage.
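To make that output concrete, here is a sketch that collapses per-timestamp shot labels into contiguous segments on the timeline. The record shape is an assumption for illustration, not the documented Find Events response format:

```python
def group_events(events):
    """Collapse a time-ordered list of {"time": seconds, "label": shot_type}
    records into contiguous segments per shot type.

    The input shape is hypothetical -- adapt it to the actual Find Events output.
    """
    segments = []
    for event in events:
        if segments and segments[-1]["label"] == event["label"]:
            segments[-1]["end"] = event["time"]  # extend the current segment
        else:
            segments.append({"label": event["label"],
                             "start": event["time"],
                             "end": event["time"]})
    return segments

# Example: labels sampled at 4 FPS (0.25 s apart).
timeline = [
    {"time": 0.00, "label": "wide"},
    {"time": 0.25, "label": "wide"},
    {"time": 0.50, "label": "close_up"},
    {"time": 0.75, "label": "close_up"},
    {"time": 1.00, "label": "medium"},
]
print(group_events(timeline))  # three segments: wide, close_up, medium
```

Grouping like this turns raw detections into edit-friendly ranges an editor can jump to directly.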
UI Tutorial
Step 1: Create an Ability
Go to the Abilities tab and select the button Create Ability. Get early access to Abilities here >

Fill out basic information about the ability such as its name and the description of the task itself. Since we are classifying events in a video, select the Task Type as Find Events.

Step 2: Task Configuration
To configure the task, we need to select a dataset. If you have already uploaded your videos to a dataset, simply select its name. If you haven't, select <New Dataset>, upload your videos, label them by identifying each shot type in the videos, and create the labels wide, medium, and close_up in Event Names.

Step 3: Configuration
Our next step is to configure the prompt and select the model and image size. For this use case, we recommend the prompt and settings below for the highest accuracy and best results.
Prompt:
Analyze the provided image/video frame and categorize the camera shot type as wide OR medium OR close_up.
Read the following definitions carefully before making your decision:
wide: This shot captures the entire...

Step 4: Run Evaluation
To check how well the prompt performs against the dataset, our next step is to run the evaluation. If needed, review the examples in your dataset to ensure all necessary examples can be used in the evaluation.

Step 5: Check Evaluation
All evaluations can be reviewed in the Abilities tab by clicking the dropdown arrow next to the associated ability alias. Evaluations can take around 15-20 minutes to complete, depending on the size of the dataset.

In addition to the performance, recall, and precision percentages on the Abilities tab, you can see a visualization of the model's predictions by revisiting the dataset. Click the three dots and select "Go to reference dataset".

Select one of the videos in the dataset and click the review button.

After running the evaluation, you can see which segments the model labeled as wide, medium, or close_up and compare them to your own labels. Use this comparison to refine your prompt and improve your accuracy.
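One way to quantify that comparison yourself is per-label precision and recall over flagged segments. A minimal sketch, assuming you have exported both your ground-truth labels and the model's predictions as (label, start, end) tuples; the any-overlap matching rule here is a simplification, not EyePop's scoring method:

```python
def overlaps(a, b):
    """True if two (start, end) intervals overlap at all."""
    return a[0] < b[1] and b[0] < a[1]

def precision_recall(predicted, ground_truth, label):
    """Per-label precision/recall using any-overlap matching.

    predicted / ground_truth: lists of (label, start_sec, end_sec) tuples.
    This simple matching rule is an illustration, not EyePop's scoring method.
    """
    preds = [(s, e) for l, s, e in predicted if l == label]
    truths = [(s, e) for l, s, e in ground_truth if l == label]
    tp_pred = sum(1 for p in preds if any(overlaps(p, t) for t in truths))
    tp_truth = sum(1 for t in truths if any(overlaps(t, p) for p in preds))
    precision = tp_pred / len(preds) if preds else 0.0
    recall = tp_truth / len(truths) if truths else 0.0
    return precision, recall

predicted = [("wide", 0.0, 3.0), ("close_up", 3.0, 5.0), ("wide", 8.0, 9.0)]
truth = [("wide", 0.0, 2.5), ("medium", 3.0, 5.0)]
print(precision_recall(predicted, truth, "wide"))  # (0.5, 1.0)
```

Low precision on a label usually means its definition in the prompt is too broad; low recall usually means the definition is too narrow or missing an edge case.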

Tips for Accuracy
- Define "Edge Cases": The key to high accuracy is a deep understanding of your specific acceptance criteria. The line between one shot type and the next (for example, medium versus close_up) can be thin, so be explicitly clear in the prompt about where that line is drawn.
- Be Descriptive with Labels: When you have multiple labels describing different things, make each label's definition as descriptive and as specific as possible.
Get early access
Want to move faster with visual automation? Request early access to Abilities and get notified as new vision capabilities roll out.