FEATURE: New `Video` API by Ashp116 · Pull Request #1924 · roboflow/supervision

Ashp116 · 2025-07-30T19:29:49Z

Description

This PR introduces a new Video API that streamlines video processing and rendering workflows. It addresses both issues #1923 and #1929 by enabling more flexible backend support and improved audio-video synchronization.

With this update, the video processing function now supports multiple backends, including PyAV and OpenCV. Notably, PyAV is currently the only backend that supports audio rendering, so output written through it can keep its audio track.

This PR adds PyAV as an optional dependency; it is required only for the PyAV rendering backend.

Tags:
Fixes #1923
Fixes #1929

Type of change

Bug fix (non-breaking change which fixes an issue)

How has this change been tested, please provide a testcase or example of how you tested the change?

Please refer to #1923 and #1929

Any specific deployment considerations

Ensure that PyAV is installed in the environment to test the PyAV backend.
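
A minimal sketch of how a caller might check for the optional dependency before asking for the PyAV backend. It relies on PyAV being imported as `av` and reuses the backend names `"pyav"` and `"opencv"` that appear later in this thread. The helpers `module_available` and `resolve_backend`, and the fallback policy itself, are hypothetical and not part of this PR.

```python
from importlib.util import find_spec
from typing import Callable


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported here."""
    return find_spec(name) is not None


def resolve_backend(
    requested: str,
    is_available: Callable[[str], bool] = module_available,
) -> str:
    """Pick a backend name, falling back to OpenCV when PyAV is missing.

    Hypothetical helper: the fallback policy is an assumption, not the PR's behaviour.
    """
    # Module each backend needs (PyAV is imported as `av`, OpenCV as `cv2`).
    required_module = {"pyav": "av", "opencv": "cv2"}
    if requested not in required_module:
        raise ValueError(f"unknown backend: {requested!r}")
    if requested == "pyav" and not is_available("av"):
        # Audio would be lost on this path, since only PyAV renders audio.
        return "opencv"
    return requested
```

Passing `is_available` explicitly keeps the check testable without installing or uninstalling anything.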

Docs

Docs updated? What were the changes


SkalskiP · 2025-07-31T15:32:54Z

Hi @Ashp116 👋🏻 Another great idea! Video processing is probably the oldest part of supervision, written over two years ago, and I’ve been wanting to update its API for a while. Would you be open to not only adding audio support but also helping me with the update?

Ashp116 · 2025-07-31T18:23:34Z

Hi @SkalskiP, yeah, I would like to help update the API. I was thinking of changing how videos are written in process_video: the original compression is lost when annotations are added and the file is written to target_path. But yeah, I would like to help out with the update.

SkalskiP · 2025-08-01T09:54:27Z

Hi @Ashp116 I'm really glad you want to help me! Let's goooo! 🔥 🔥 🔥

I want the functionalities currently found in supervision.utils.video to be reorganized around a new Video class. Importantly, all features previously available in the old API must still be supported in the new one. Ideally, the new API should be more consistent and expressive.

get video info (works for files, RTSP, webcams)

import supervision as sv
 
# static video
sv.Video("source.mp4").info

# video stream
sv.Video("rtsp://...").info

# webcam
sv.Video(0).info

simple frame iteration (object is iterable)

import supervision as sv

video = sv.Video("source.mp4")
for frame in video:
    ...

advanced frame iteration (stride, sub-clip, on-the-fly resize)

import supervision as sv

for frame in sv.Video("source.mp4").frames(stride=5, start=100, end=500, resolution_wh=(1280, 720)):
    ...
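
A minimal sketch of the stride and sub-clip selection that `frames(stride=5, start=100, end=500)` suggests, written against any iterable of frames. Whether `end` is inclusive or exclusive in the proposed API is not stated above; this sketch assumes exclusive. `select_frames` is a hypothetical name, and the on-the-fly resize is left out.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

Frame = TypeVar("Frame")


def select_frames(
    frames: Iterable[Frame],
    stride: int = 1,
    start: int = 0,
    end: Optional[int] = None,
) -> Iterator[Frame]:
    """Yield every `stride`-th frame whose index is in [start, end)."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    # islice skips to `start`, stops before `end`, and steps by `stride`.
    return islice(frames, start, end, stride)


# Integers stand in for decoded frames here.
picked = list(select_frames(range(1000), stride=5, start=100, end=500))
```

Under these assumptions `picked` holds indices 100, 105, and so on up to 495, which is 80 frames.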

process the video

import cv2
import supervision as sv

def blur(frame, i):
    return cv2.GaussianBlur(frame, (11, 11), 0)

sv.Video("source.mp4").save(
    "blurred.mp4",
    callback=blur,
    show_progress=True
)

overwrite target video parameters

import supervision as sv

sv.Video("source.mp4").save(
    "timelapse.mp4",
    fps=60,
    callback=lambda f, i: f,
    show_progress=True
)
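
The file name `timelapse.mp4` hints at why overriding `fps` matters: the same frames written at a higher frame rate play back faster. A small worked example, assuming the override only changes the output frame rate and no frames are dropped or duplicated (the thread does not say which). The frame count and source rate below are illustrative, not taken from the PR.

```python
def playback_seconds(frame_count: int, fps: float) -> float:
    """Duration of a clip that shows `frame_count` frames at `fps`."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_count / fps


# Illustrative numbers: a 30 fps source with 1800 frames.
source_seconds = playback_seconds(1800, 30)  # 60.0
target_seconds = playback_seconds(1800, 60)  # 30.0
speedup = source_seconds / target_seconds    # 2.0
```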

complete manual control with explicit VideoInfo

from supervision import Video, VideoInfo

source = Video("source.mp4")
target_info = VideoInfo(width=800, height=800, fps=24)

with src.sink("square.mp4", info=target_info) as sink:
    for f in src.frames():
        f = cv2.resize(f, target_info.resolution_wh)
        sink.write(f)

multi-backend support decode/encode

import supervision as sv

video = sv.Video("source.mkv", backend="pyav")

video = sv.Video("source.mkv", backend="opencv")

suggested minimal protocol

from __future__ import annotations  # lets Backend refer to Writer before it is defined

from typing import Any, Protocol

import numpy as np
from supervision import VideoInfo

class Backend(Protocol):
    # decoding side
    def open(self, path: str) -> Any: ...
    def info(self, handle: Any) -> VideoInfo: ...

    def read(self, handle: Any) -> tuple[bool, np.ndarray]: ...
    def grab(self, handle: Any) -> bool: ...
    def seek(self, handle: Any, frame_idx: int) -> None: ...

    # encoding side
    def writer(self, path: str, info: VideoInfo, codec: str) -> Writer: ...

class Writer(Protocol):
    def write(self, frame: np.ndarray) -> None: ...
    def close(self) -> None: ...
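
To make the shape of the protocol concrete, here is a dependency-free sketch: a toy in-memory backend covering a subset of the suggested methods (`open`, `read`, `seek`), a toy writer, and the read, callback, write loop that `save(..., callback=...)` implies. `ListBackend`, `ListWriter`, and `process` are hypothetical names, and plain lists stand in for `np.ndarray` frames. This illustrates the idea and is not the PR's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Frame = List[int]  # stand-in for np.ndarray so the sketch needs no dependencies


@dataclass
class ListWriter:
    """Writer-shaped object that collects frames in memory."""

    frames: List[Frame] = field(default_factory=list)
    closed: bool = False

    def write(self, frame: Frame) -> None:
        if self.closed:
            raise RuntimeError("write after close")
        self.frames.append(frame)

    def close(self) -> None:
        self.closed = True


class ListBackend:
    """Backend-shaped object that 'decodes' from a list of frames."""

    def __init__(self, frames: List[Frame]) -> None:
        self._frames = frames

    def open(self, path: str) -> dict:
        # The handle is just a cursor; `path` is ignored in this toy backend.
        return {"pos": 0}

    def read(self, handle: dict) -> Tuple[bool, Frame]:
        if handle["pos"] >= len(self._frames):
            return False, []
        frame = self._frames[handle["pos"]]
        handle["pos"] += 1
        return True, frame

    def seek(self, handle: dict, frame_idx: int) -> None:
        handle["pos"] = frame_idx


def process(
    backend: ListBackend,
    writer: ListWriter,
    callback: Callable[[Frame, int], Frame],
) -> int:
    """Read every frame, apply `callback(frame, index)`, write the result."""
    handle = backend.open("ignored.mp4")
    index = 0
    try:
        while True:
            ok, frame = backend.read(handle)
            if not ok:
                break
            writer.write(callback(frame, index))
            index += 1
    finally:
        writer.close()  # always close, even if the callback raises
    return index


writer = ListWriter()
count = process(
    ListBackend([[1, 2], [3, 4], [5, 6]]),
    writer,
    callback=lambda f, i: [v * 10 for v in f],
)
```

Keeping the loop outside the backend means a PyAV or OpenCV backend would only need to supply the read and write primitives.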


Ashp116 · 2025-08-02T06:52:16Z

Hi @SkalskiP,

I’ve addressed most of the features you mentioned, but I have some thoughts on a few aspects of the implementation:

1. .save functionality: How would you handle .save for a video feed coming from a webcam or an RTSP stream? Currently, only video files can be saved.
2. Writer and Backend classes: This is just my personal opinion, but should these classes be moved to separate scripts/modules? If we add more writers and backends in the future, keeping everything inside the main video script might become cluttered.
3. “Complete manual control with explicit VideoInfo” functionality:
```
from supervision import Video, VideoInfo

source = Video("source.mp4")
target_info = VideoInfo(width=800, height=800, fps=24)

with src.sink("square.mp4", info=target_info) as sink:
    for f in src.frames():
        f = cv2.resize(f, target_info.resolution_wh)
        sink.write(f)
```
I’m not fully clear on what this feature is intended to do. In this snippet, the Video instance source is created but never used afterward. Is src supposed to be source? Also, is the goal to create sinks for each backend? Could you please clarify the purpose and expected usage here?
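
One possible reading of the snippet, offered as an assumption since the thread has no answer yet: `sink(...)` returns a context manager that wraps a writer, so `close()` runs when the `with` block exits, even on error. The sketch below shows that pattern with the standard library only. `RecordingWriter` and `open_sink` are hypothetical names, and strings stand in for frames.

```python
from contextlib import contextmanager
from typing import Iterator, List


class RecordingWriter:
    """Hypothetical Writer stand-in that records what it was asked to do."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.written: List[str] = []
        self.closed = False

    def write(self, frame: str) -> None:
        self.written.append(frame)

    def close(self) -> None:
        self.closed = True


@contextmanager
def open_sink(path: str) -> Iterator[RecordingWriter]:
    """Yield a writer and guarantee close() on exit, even on error."""
    writer = RecordingWriter(path)
    try:
        yield writer
    finally:
        writer.close()


with open_sink("square.mp4") as sink:
    for frame in ["frame-0", "frame-1"]:
        sink.write(frame)
```

Under this reading the sink is independent of how frames are produced, which is what would let the caller resize or otherwise alter each frame before writing it.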

ADD: Added audio stream for process_video

7fba113

Ashp116 requested a review from SkalskiP as a code owner July 30, 2025 19:29

pre-commit-ci bot and others added 3 commits July 30, 2025 19:30

fix(pre_commit): 🎨 auto format pre-commit hooks

8947f77

REMOVE: Removed ffprobe

73b5836

Merge branch 'bug/process-video-audio' of https://github.com/Ashp116/…

e02d298

…supervision into bug/process-video-audio

Ashp116 changed the title ~~ADD: Added audio stream for process_video~~ BUG: Added audio stream for process_video Jul 30, 2025

Ashp116 and others added 16 commits August 1, 2025 22:51

UPDATE: Added a new Video class with OpenCV writer and backend

5e07794

Merge pull request #1 from Ashp116/update/video-core

46ec693

fix(pre_commit): 🎨 auto format pre-commit hooks

b2096d0

Precommit

9fb7098

fix(pre_commit): 🎨 auto format pre-commit hooks

850a2c6

Precommit

46900f8

Merge branch 'bug/process-video-audio' of https://github.com/Ashp116/…

34cb9a1

…supervision into bug/process-video-audio

fix(pre_commit): 🎨 auto format pre-commit hooks

c700394

UPDATE: Fixed incomplete write closing

fce8ade

ADD: Docstrings

f86f4f2

fix(pre_commit): 🎨 auto format pre-commit hooks

2265977

UPDATE: Allow for ffmpeg error passthrough

bf67bfa

UPDATE: Writer and Backend abstract class

ec4bd01

Precommit

b9e7968

fix(pre_commit): 🎨 auto format pre-commit hooks

a96c3f0

Precommit

a6c91bc

Ashp116 changed the title ~~BUG: Added audio stream for process_video~~ FEATURE: Versatile Video class Aug 2, 2025

Ashp116 mentioned this pull request Aug 5, 2025

Reimplement video utils #1929

Open

Ashp116 added 2 commits August 6, 2025 16:21

UPDATE: Added manual control

d075e03

ADD: Added docstrings

7f078ff

github-actions bot added has conflicts and removed has conflicts labels Jan 16, 2026

Ashp116 wants to merge 139 commits into roboflow:develop from Ashp116:bug/process-video-audio


3 participants
