Auto-tag Camera Shots
Automatically categorizing shots is vital for streamlining the editing process and improving content discoverability.
eyepop.find-events.identify-shot-type:latest
Input
Video
Output
wide, medium, close_up
Image size
640x640
Model type
QWEN3 - Better Accuracy
FPS
4
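The settings above can be captured in a single configuration object. Below is a minimal sketch in Python; the key names are illustrative only and do not reflect the official EyePop schema:

```python
# Hypothetical configuration mirroring the UI settings above.
# Key names are illustrative; they are not the official EyePop schema.
ability_config = {
    "ability": "eyepop.find-events.identify-shot-type:latest",
    "input": "video",
    "event_names": ["wide", "medium", "close_up"],  # expected output labels
    "image_size": (640, 640),
    "model": "QWEN3",  # the "Better Accuracy" option in the UI
    "fps": 4,          # frames sampled per second of video
}
```

At 4 FPS, a one-minute clip yields roughly 240 sampled frames, which keeps evaluation fast while still catching shot changes.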
How It Works
Prepping and organizing video footage for post-production requires being able to quickly identify specific shots and angles. Manually reviewing and tagging hours of raw footage to locate specific shot types, however, is inefficient and time-consuming.
Being able to automatically categorize shots is vital for streamlining the editing process and improving content discoverability. The Find Events task on the Abilities tab can act as an automated cinematography assistant, identifying whether a video segment is a wide, medium, or close-up shot and locating those occurrences throughout the timeline.
For example, a specific segment of film footage should be flagged with the label wide if it captures the entire scenery or environment to establish the setting. In contrast, a segment should be flagged as close_up if it focuses tightly on one particular subject or detail, such as a person's face, to convey emotion or intricate detail.

Our expected inputs are videos, and the expected output is the set of timestamps identifying exactly when each shot type (wide, medium, or close_up) occurs throughout the footage.
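To make that output concrete, here is a sketch that collapses per-timestamp shot labels into contiguous segments on the timeline. The record shape is an assumption for illustration, not the documented Find Events response format:

```python
def group_events(events):
    """Collapse a time-ordered list of {"time": seconds, "label": shot_type}
    records into contiguous segments per shot type.

    The input shape is hypothetical -- adapt it to the actual Find Events output.
    """
    segments = []
    for event in events:
        if segments and segments[-1]["label"] == event["label"]:
            segments[-1]["end"] = event["time"]  # extend the current segment
        else:
            segments.append({"label": event["label"],
                             "start": event["time"],
                             "end": event["time"]})
    return segments

# Example: labels sampled at 4 FPS (0.25 s apart).
timeline = [
    {"time": 0.00, "label": "wide"},
    {"time": 0.25, "label": "wide"},
    {"time": 0.50, "label": "close_up"},
    {"time": 0.75, "label": "close_up"},
    {"time": 1.00, "label": "medium"},
]
print(group_events(timeline))  # three segments: wide, close_up, medium
```

Grouping like this turns raw detections into edit-friendly ranges an editor can jump to directly.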
UI Tutorial
Step 1: Create an Ability
Go to the Abilities tab and select the button Create Ability. Get early access to Abilities here >

Fill out basic information about the ability such as its name and the description of the task itself. Since we are classifying events in a video, select the Task Type as Find Events.

Step 2: Task Configuration
To configure the task, we need to select a dataset. If you have already uploaded your videos to a dataset, simply select its name. If you haven't, select <New Dataset>, upload your videos, label them by identifying each shot type in the videos, and create the labels wide, medium, and close_up in Event Names.

Step 3: Configuration
Our next step is to configure the prompt and select the model and image size. For this use case, we recommend the prompt and settings below for the highest accuracy and best results.
Prompt:
Analyze the provided image/video frame and categorize the camera shot type as wide OR medium OR close_up.
Read the following definitions carefully before making your decision:
wide: This shot captures the entire...

Step 4: Run Evaluation
To check how well the prompt performs against the dataset, our next step is to run the evaluation. If needed, review the examples in your dataset to ensure all necessary examples can be used in the evaluation.

Step 5: Check Evaluation
All evaluations can be reviewed in the Abilities tab by clicking the dropdown arrow next to the associated ability alias. Evaluations can take around 15-20 minutes to complete, depending on the size of the dataset.

In addition to the performance, recall, and precision percentages on the Abilities tab, you can see a visualization of the model's predictions by revisiting the dataset. Click the three dots and select "Go to reference dataset".

Select one of the videos in the dataset and click the review button.

After running the evaluation, you can see which segments the model labeled as wide, medium, or close_up and compare them to your own labels. Use this comparison to refine your prompt and improve your accuracy.
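One way to quantify that comparison yourself is per-label precision and recall over flagged segments. A minimal sketch, assuming you have exported both your ground-truth labels and the model's predictions as (label, start, end) tuples; the any-overlap matching rule here is a simplification, not EyePop's scoring method:

```python
def overlaps(a, b):
    """True if two (start, end) intervals overlap at all."""
    return a[0] < b[1] and b[0] < a[1]

def precision_recall(predicted, ground_truth, label):
    """Per-label precision/recall using any-overlap matching.

    predicted / ground_truth: lists of (label, start_sec, end_sec) tuples.
    This simple matching rule is an illustration, not EyePop's scoring method.
    """
    preds = [(s, e) for l, s, e in predicted if l == label]
    truths = [(s, e) for l, s, e in ground_truth if l == label]
    tp_pred = sum(1 for p in preds if any(overlaps(p, t) for t in truths))
    tp_truth = sum(1 for t in truths if any(overlaps(t, p) for p in preds))
    precision = tp_pred / len(preds) if preds else 0.0
    recall = tp_truth / len(truths) if truths else 0.0
    return precision, recall

predicted = [("wide", 0.0, 3.0), ("close_up", 3.0, 5.0), ("wide", 8.0, 9.0)]
truth = [("wide", 0.0, 2.5), ("medium", 3.0, 5.0)]
print(precision_recall(predicted, truth, "wide"))  # (0.5, 1.0)
```

Low precision on a label usually means its definition in the prompt is too broad; low recall usually means the definition is too narrow or missing an edge case.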

Tips for Accuracy
- Define "Edge Cases": The key to high accuracy is a deep understanding of your specific acceptance criteria. The line between one shot type and the next (for example, medium versus close_up) can be thin, so be explicitly clear in the prompt about where that line is drawn.
- Be Descriptive with Labels: When you have multiple labels describing different things, make each label's definition as descriptive and as specific as possible.
Get early access
Want to move faster with visual automation? Request early access to Abilities and get notified as new vision capabilities roll out.