People + Animals

Detect people and animals in the same frame (images, video, or live streams)

eyepop.animals:latest

Pre-Trained

Model type

Pre-trained Model

How It Works

Expand your detection capabilities to include pets and wildlife alongside humans.

This model detects people plus common animal classes and returns structured bounding box coordinates with confidence scores and labels—so you can add context to mixed-use environments, outdoor monitoring, and immersive entertainment experiences.

Use it on images, recorded video, or live streams. No custom training required.

Optimized for:

  • Multi-class detection (people + animals)
  • Mixed indoor/outdoor scenes
  • Frame-by-frame results for video
  • Cloud or On-Prem deployment
  • Fast setup for prototype → production

Why This Model Exists

“Person detection” is a strong baseline. But many real-world scenes aren’t human-only.

If you’re monitoring a property, park, trail, facility, or mixed-use space, you need to know:

  • Is that movement a person or an animal?
  • Is there a pet on-site where it shouldn’t be?
  • Is there wildlife entering a restricted area?
  • Is your system reacting correctly to non-human motion?

Teams typically solve this by combining separate models (person detection + animal detection), which creates friction:

  • Conflicting labels and confidence behavior
  • More infrastructure and more failure points
  • Harder alert logic (“is it a person, a dog, or both?”)
  • Slower iteration when you just need reliable context

This model exists to provide a single, unified output:
people + animals in one pass, with a consistent schema—so your automation, alerts, and analytics can make better decisions immediately.

Key Capabilities

Input Types

  • Single images
  • Video files
  • RTSP / livestream feeds
  • Webcam / IP camera streams

Output

  • JSON with bounding boxes
  • Confidence scores
  • Class labels (person + animal classes)
  • Frame-level detections (for video/streams)

Setup

  • Create account
  • Get API key
  • Send media
  • Receive detections instantly

No training. No labeling. No tuning.

Example Output

{
  "objects": [
    {
      "category": "person",
      "classLabel": "person",
      "confidence": 0.957,
      "x": 862.3,
      "y": 248.6,
      "width": 312.9,
      "height": 684.2
    },
    {
      "category": "animal",
      "classLabel": "dog",
      "confidence": 0.931,
      "x": 512.4,
      "y": 742.1,
      "width": 124.7,
      "height": 138.5
    }
  ],
  "source_width": 1920,
  "source_height": 1080
}

(Update the class labels to match your model’s actual taxonomy—pets only, or pets + wildlife classes.)
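The payload above can be consumed with a few lines of standard JSON handling. The sketch below parses a response shaped like the example and splits detections into people and animals; the field names come from the sample output, and the confidence threshold is an illustrative value, not an API default.

```python
import json

# Parse a detection response like the example above and split results by
# category. Field names (objects, category, classLabel, confidence, x, y,
# width, height) follow the sample payload shown in this doc.
response = json.loads("""
{
  "objects": [
    {"category": "person", "classLabel": "person", "confidence": 0.957,
     "x": 862.3, "y": 248.6, "width": 312.9, "height": 684.2},
    {"category": "animal", "classLabel": "dog", "confidence": 0.931,
     "x": 512.4, "y": 742.1, "width": 124.7, "height": 138.5}
  ],
  "source_width": 1920,
  "source_height": 1080
}
""")

MIN_CONFIDENCE = 0.5  # illustrative; tune per deployment

people = [o for o in response["objects"]
          if o["category"] == "person" and o["confidence"] >= MIN_CONFIDENCE]
animals = [o for o in response["objects"]
           if o["category"] == "animal" and o["confidence"] >= MIN_CONFIDENCE]

# Coordinates are in source pixels; divide by source_width/source_height
# if your overlay or zone logic expects normalized 0..1 coordinates.
print(f"people={len(people)}, animals={[a['classLabel'] for a in animals]}")
```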

Practical Use Cases

Environmental & Wildlife Monitoring

  • Wildlife presence detection near trails, waterways, or protected areas
  • Animal activity monitoring over time
  • Filtering footage to “only frames with animals” for faster review

Security in Mixed-Use Spaces

  • Reduce false alerts caused by pets/wildlife
  • Trigger different workflows for “human vs animal”
  • Perimeter monitoring for rural properties and outdoor facilities

Property & Facility Operations

  • Pet policy monitoring in shared spaces (with appropriate consent/workflows)
  • Animal presence detection near equipment or restricted zones
  • Outdoor camera context for safety teams

Entertainment & Storytelling

  • Automatically tag scenes with people + animals
  • Drive interactive overlays and scene indexing
  • Faster content search and clip selection

Why This Output Matters

People + animal detection adds decision-grade context to motion.

It helps you build logic like:

  • Alert only when a person is present (ignore animals)
  • Notify when animal enters a specific zone
  • Track co-presence (person + dog together)
  • Filter or index footage by who/what appears in-frame

All from bounding boxes—without building a complex vision pipeline first.
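The rules above can be sketched directly against the bounding-box schema from "Example Output". This is a minimal illustration, not EyePop API code: the zone rectangle, helper names, and threshold are all hypothetical values chosen for the example.

```python
def center(obj):
    """Center point of a detection's bounding box, in source pixels."""
    return (obj["x"] + obj["width"] / 2, obj["y"] + obj["height"] / 2)

def in_zone(obj, zone):
    """True if the detection's center falls inside zone = (x1, y1, x2, y2)."""
    cx, cy = center(obj)
    x1, y1, x2, y2 = zone
    return x1 <= cx <= x2 and y1 <= cy <= y2

def decide(objects, restricted_zone, min_conf=0.5):
    """Map one frame's detections to a list of alert strings."""
    alerts = []
    confident = [o for o in objects if o["confidence"] >= min_conf]
    # Rule 1: alert on people, ignore animal-only motion.
    if any(o["category"] == "person" for o in confident):
        alerts.append("person_present")
    # Rule 2: separately notify when an animal enters a specific zone.
    for o in confident:
        if o["category"] == "animal" and in_zone(o, restricted_zone):
            alerts.append(f"animal_in_zone:{o['classLabel']}")
    return alerts

frame = [
    {"category": "person", "classLabel": "person", "confidence": 0.957,
     "x": 862.3, "y": 248.6, "width": 312.9, "height": 684.2},
    {"category": "animal", "classLabel": "dog", "confidence": 0.931,
     "x": 512.4, "y": 742.1, "width": 124.7, "height": 138.5},
]

# Illustrative zone covering the lower-left quarter of a 1920x1080 frame.
print(decide(frame, restricted_zone=(0, 540, 960, 1080)))
# → ['person_present', 'animal_in_zone:dog']
```

Co-presence ("person + dog together") and footage filtering follow the same pattern: one pass over the unified detection list per frame, no second model or label reconciliation needed.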

Deployment Options

EyePop Cloud

  • Scalable
  • Managed infrastructure
  • Best for web apps + fast iteration

On-Premise Runtime

  • Keep video inside your network
  • Lower latency options
  • Works with GPU or CPU environments
  • Ideal for regulated or sensitive environments

Who This Is For

  • Teams monitoring outdoor spaces and mixed-use environments
  • Developers building smarter alerting logic for camera systems
  • Product teams that need pets/wildlife context without custom training
  • Anyone who needs reliable “human vs animal” differentiation fast

Get early access

Want to move faster with visual automation? Request early access to Abilities and get notified as new vision capabilities roll out.

View CDN documentation →