
Person w/2D Body Points

Pinpoint 2D body points from images, video, and live streams

Model type: Pre-trained Model

Description

Pinpoint 2D skeletal positions (such as head, shoulders, elbows, hips, knees, and ankles) directly from video, images, or live streams.

Return structured keypoint coordinates with a per-point confidence score, so you can understand posture, infer motion patterns, and build interactive experiences without collecting training data or tuning a custom model.

This model is optimized for:

  • Reliable 2D pose keypoints across common camera angles
  • Frame-by-frame keypoint output for video
  • Cloud or On-Prem deployment
  • Fast setup for prototype → production

Why This Model Exists

If your product depends on posture, motion, or gestures, you eventually hit a wall: bounding boxes aren’t enough.

You need keypoints—but most pose workflows are slow to stand up:

  • Too many model choices, formats, and skeleton standards
  • Too much engineering just to get stable outputs
  • Too many “works in demo, breaks in production” moments

This model exists to make pose a reliable building block: clean 2D keypoints, confidence per point, consistent output per frame, ready for downstream logic.

Key Capabilities

Input Types

  • Single images
  • Video files
  • RTSP / livestream feeds
  • Webcam / IP camera streams

Output

  • JSON with 2D keypoints (x, y) per person
  • Confidence per keypoint
  • Person-level grouping (keypoints mapped to the same individual)
  • Frame-level results (for video)

Deployment

  • EyePop Cloud
  • On-Premise AI Application Runtime
  • Edge devices with GPU or CPU

Setup

  • Create account
  • Get API key
  • Send media
  • Receive keypoints instantly

No training. No labeling. No model configuration.
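The steps above come down to a few lines of code. Below is a minimal sketch, assuming the EyePop Python SDK (`eyepop` package) with credentials set in the environment; the `flatten_keypoints` helper is hypothetical, not part of the SDK, and simply reshapes a result like the one in the Example Output section into `(label, x, y, confidence)` tuples:

```python
def flatten_keypoints(result: dict) -> list[tuple[str, float, float, float]]:
    """Reshape a keypoints result dict into (label, x, y, confidence) tuples."""
    flat = []
    for person in result.get("keyPoints", []):
        for point in person.get("points", []):
            flat.append(
                (point["classLabel"], point["x"], point["y"], point["confidence"])
            )
    return flat


if __name__ == "__main__":
    # Hedged sketch: assumes the EyePop Python SDK is installed and the
    # API key / Pop ID are configured in the environment.
    from eyepop import EyePopSdk

    with EyePopSdk.workerEndpoint() as endpoint:
        result = endpoint.upload("person.jpg").predict()
        for label, x, y, conf in flatten_keypoints(result):
            print(f"{label}: ({x:.1f}, {y:.1f}) conf={conf:.2f}")
```

The helper works the same whether the result came from a single image, a video frame, or a live-stream frame, since each frame's payload shares the shape shown below.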

Example Output

{
  "keyPoints": [
    {
      "id": 18,
      "points": [
        {
          "classId": 0,
          "classLabel": "nose",
          "confidence": 0.7293,
          "id": 1,
          "x": 205.602,
          "y": 745.589
        },
        {
          "classId": 1,
          "classLabel": "left eye",
          "confidence": 0.6715,
          "id": 2,
          "x": 308.452,
          "y": 611.822
        },
        {
          "classId": 2,
          "classLabel": "right eye",
          "confidence": 0.6993,
          "id": 3,
          "x": 91.684,
          "y": 642.804
        },
        {
          "classId": 3,
          "classLabel": "left ear",
          "confidence": 0.7081,
          "id": 4,
          "x": 481.471,
          "y": 705.346
        },
        {
          "classId": 4,
          "classLabel": "right ear",
          "confidence": 0.5889,
          "id": 5,
          "x": 3.399,
          "y": 769.542
        },
        {
          "classId": 5,
          "classLabel": "left shoulder",
          "confidence": 0.3109,
          "id": 6,
          "x": 816.562,
          "y": 1190.927
        },
        {
          "classId": 6,
          "classLabel": "right shoulder",
          "confidence": 0.4752,
          "id": 7,
          "x": 17.168,
          "y": 1365.633
        },
        {
          "classId": 7,
          "classLabel": "left elbow",
          "confidence": 0.0467,
          "id": 8,
          "x": 925.019,
          "y": 1460.158
        },
        {
          "classId": 8,
          "classLabel": "right elbow",
          "confidence": 0.1132,
          "id": 9,
          "x": 8.168,
          "y": 1842.982
        },
        {
          "classId": 9,
          "classLabel": "left wrist",
          "confidence": 0.0626,
          "id": 10,
          "x": 908.479,
          "y": 1581.804
        },
        {
          "classId": 10,
          "classLabel": "right wrist",
          "confidence": 0.0456,
          "id": 11,
          "x": 34.365,
          "y": 1831.564
        },
        {
          "classId": 11,
          "classLabel": "left hip",
          "confidence": 0.0346,
          "id": 12,
          "x": 918.864,
          "y": 1851.533
        },
        {
          "classId": 12,
          "classLabel": "right hip",
          "confidence": 0.0083,
          "id": 13,
          "x": 435.725,
          "y": 1891.3
        },
        {
          "classId": 13,
          "classLabel": "left knee",
          "confidence": 0.0307,
          "id": 14,
          "x": 873.457,
          "y": 1191.49
        },
        {
          "classId": 14,
          "classLabel": "right knee",
          "confidence": 0.0042,
          "id": 15,
          "x": 32.792,
          "y": 1340.378
        },
        {
          "classId": 15,
          "classLabel": "left ankle",
          "confidence": 0.0242,
          "id": 16,
          "x": 1034.746,
          "y": 1645.421
        },
        {
          "classId": 16,
          "classLabel": "right ankle",
          "confidence": 0.0497,
          "id": 17,
          "x": 79.388,
          "y": 615.969
        }
      ]
    }
  ],
  "seconds": 0,
  "source_height": 1882,
  "source_id": "317375bd-18c6-11f1-b631-8e1aed86f95b",
  "source_width": 1094,
  "system_timestamp": 1772737494427405000,
  "timestamp": 0
}

(Names + keypoint count should be adjusted to match the exact skeleton your model returns.)
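Notice the low-confidence points in the example above (hips, knees, and wrists below 0.1, likely occluded); these are usually filtered out before any downstream logic. A minimal sketch, assuming the JSON shape shown above; the 0.3 threshold is an illustrative choice to tune for your footage, not an SDK default:

```python
CONFIDENCE_THRESHOLD = 0.3  # illustrative cutoff; tune per camera / use case


def reliable_points(result: dict, threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Map keypoint label -> (x, y) for the first detected person,
    dropping points whose confidence falls below the threshold."""
    people = result.get("keyPoints", [])
    if not people:
        return {}
    return {
        p["classLabel"]: (p["x"], p["y"])
        for p in people[0]["points"]
        if p["confidence"] >= threshold
    }
```

Applied to the example response above, this would keep the head and shoulder points and drop the occluded lower body.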

Practical Use Cases

Fitness & Sports

  • Rep counting and movement consistency
  • Form checks and posture detection inputs
  • Exercise classification (with downstream logic)
  • Interactive training overlays

Entertainment & Interactive Experiences

  • Gesture-based interactions
  • Motion-driven effects and AR overlays
  • Avatar / character control inputs (2D keypoint source)

Workplace & Safety (General)

  • Basic posture awareness signals (with downstream rules)
  • Restricted-zone activity cues (paired with ROIs)
  • Ergonomic monitoring inputs (where appropriate)

Analytics & Research

  • Movement patterns over time
  • Comparing motion sequences
  • Filtering or indexing footage by activity style (with classifiers)
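Comparing motion sequences typically starts with a per-frame pose vector that is translation- and scale-normalized, so two poses compare by shape rather than by camera framing. A minimal sketch in plain Python; the normalization scheme here (centering on the centroid, dividing by RMS spread) is one common choice among several, not something the API performs for you:

```python
import math


def normalize_pose(points: list) -> list:
    """Center a pose on its centroid and scale it to unit RMS spread,
    so distances reflect body shape, not position or camera zoom."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    centered = [(x - cx, y - cy) for x, y in points]
    scale = math.sqrt(sum(x * x + y * y for x, y in centered) / len(centered)) or 1.0
    return [(x / scale, y / scale) for x, y in centered]


def pose_distance(p1: list, p2: list) -> float:
    """Mean per-keypoint Euclidean distance between two normalized poses
    (keypoints must be in the same skeleton order in both lists)."""
    a, b = normalize_pose(p1), normalize_pose(p2)
    return sum(math.hypot(x1 - x2, y1 - y2) for (x1, y1), (x2, y2) in zip(a, b)) / len(a)
```

Frame-by-frame distances like this can feed sequence comparison (e.g. dynamic time warping) or simple thresholds for indexing footage by activity style.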

Why Keypoints Matter

Bounding boxes tell you where a person is. Keypoints tell you how they’re moving.

With 2D body points, you can infer:

  • Posture and pose state
  • Limb angles and joint ranges
  • Repetition cycles and motion signatures
  • Gesture cues (wave, reach, squat, jump, etc.)

It’s the simplest output that unlocks higher-level understanding.
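For instance, a limb angle falls out of three keypoints with basic trigonometry. A minimal sketch in plain Python (no SDK required) computing the interior angle at a joint, such as shoulder-elbow-wrist for elbow flexion:

```python
import math


def joint_angle(a: tuple, joint: tuple, b: tuple) -> float:
    """Interior angle in degrees at `joint`, formed by the segments
    joint->a and joint->b (e.g. a=shoulder, joint=elbow, b=wrist)."""
    v1 = (a[0] - joint[0], a[1] - joint[1])
    v2 = (b[0] - joint[0], b[1] - joint[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate: two keypoints coincide
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))
```

Tracking an angle like this across frames is the basis for rep counting (e.g. count a rep each time elbow flexion crosses a threshold and returns).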

Deployment Options

EyePop Cloud

  • Scalable
  • Managed infrastructure
  • Best for web apps + fast iteration

On-Premise Runtime

  • Keep video inside your network
  • Lower latency options
  • Works with GPU or CPU environments
  • Ideal for regulated or sensitive footage

Who This Is For

  • Developers building fitness, sports, or motion analytics tools
  • Teams creating interactive camera experiences
  • Product teams who need “pose data” without hiring ML specialists
  • Anyone who wants structured body-point outputs from video—fast

Get early access

Want to move faster with visual automation? Request early access to Abilities and get notified as new vision capabilities roll out.
