FFmpeg Python: A Complete Overview of Video Processing Capabilities in Python


What is ffmpeg‑python and why is it needed

In today’s world of multimedia processing, the number of tasks keeps growing: from simple format conversion to complex streaming, filtering, and video analytics. The ffmpeg‑python library (the official package name) allows developers to harness the power of the FFmpeg command‑line tool directly inside Python scripts while preserving the flexibility and readability of Python code.

ffmpeg‑python is a Python wrapper around the popular FFmpeg library that provides a convenient interface for working with multimedia files. It enables virtually any operation on video, audio, and images: converting formats, applying filters, changing resolution, adding effects, creating streams, and much more.

Benefits of ffmpeg‑python

The library offers a number of undeniable advantages:

Pythonic syntax: A familiar, method‑chaining style for Python developers with clear parameters.

Full access to FFmpeg features: All capabilities of the powerful FFmpeg library are available through the Python API.

Flexibility in command building: Ability to create complex processing graphs with multiple inputs and outputs.

Integration with the Python ecosystem: Works seamlessly with NumPy, OpenCV, Pillow, and other popular libraries.

Excellent performance: Leverages FFmpeg’s optimized algorithms for fast media processing.

Installation and basic setup

System requirements

Before you start, make sure you have installed:

  • FFmpeg itself (version 4.0 or newer is recommended)
  • Python 3.6+ (compatible with Python 3.10/3.11)

Installing the library

# Install the Python wrapper ffmpeg-python
pip install ffmpeg-python

# Install FFmpeg on Ubuntu/Debian
sudo apt update && sudo apt install ffmpeg

# Install FFmpeg on macOS via Homebrew
brew install ffmpeg

# Verify the installation
ffmpeg -version
python -c "import ffmpeg; print('ffmpeg-python installed successfully')"

Note: For Windows, download the FFmpeg build from the official site and add its path to the system PATH variable.

Verification

After installation, it’s recommended to check that everything works:

import ffmpeg
import subprocess

# Check FFmpeg version
result = subprocess.run(['ffmpeg', '-version'], capture_output=True, text=True)
print(result.stdout.split('\n')[0])

# Simple library test: building a stream is lazy and never opens the
# file, so run it against a null output to verify end to end
try:
    (
        ffmpeg
        .input('test.mp4')
        .output('-', format='null')
        .run(capture_stdout=True, capture_stderr=True)
    )
    print("ffmpeg-python is working correctly")
except ffmpeg.Error as e:
    print(f"Error: {e.stderr.decode()}")

Architecture and core concepts

Streams

The core concept of ffmpeg‑python is streams. Each stream represents a source or destination of multimedia data. Streams can be:

  • Input streams — read data from files, URLs, or devices
  • Output streams — write processed data
  • Intermediate streams — results of filter application

Filters

Filters are operations that transform multimedia data. They can modify video (scaling, cropping, effects), audio (volume, equalizer), or work with metadata.

Processing graph

FFmpeg builds a processing graph where nodes represent filters and edges represent data streams between them. This allows the creation of sophisticated transformation chains.

Complete description of library methods and functions

Main stream‑creation functions

ffmpeg.input(source, **kwargs) — creates an input stream for reading data from various sources. It can read from files, URLs, devices, or standard input.

ffmpeg.output(stream, target, **kwargs) — defines an output file or stream for writing processed data.

ffmpeg.probe(source, **kwargs) — inspects a media file and returns detailed information about its characteristics.

Functions for handling multiple streams

ffmpeg.concat(*streams, v=1, a=1) — concatenates several video and/or audio streams into one (v and a give the number of video and audio streams per segment).

ffmpeg.join(streams) — joins streams for parallel processing.

ffmpeg.merge_outputs(outputs) — merges several output streams into a single command.

Global settings

ffmpeg.global_args(*args) — sets global FFmpeg options such as log level or file‑overwrite mode.

Stream methods

Each stream provides a set of methods for modification:

.filter(name, *args, **kwargs) — applies the specified filter to the stream.

.overlay(overlay_stream, x, y) — overlays one video stream onto another at the given coordinates.

.audio — selects only the audio track from the stream.

.video — selects only the video track from the stream.

.run(cmd='ffmpeg', capture_stdout=False, capture_stderr=False, **kwargs) — executes the constructed FFmpeg command.

.compile(cmd='ffmpeg', overwrite_output=False) — compiles the stream into a list of command‑line arguments without execution.

.get_args() — returns the argument list that will be passed to FFmpeg.

Table of ffmpeg‑python methods and functions

Category        | Method / Function      | Description               | Main parameters
----------------|------------------------|---------------------------|-------------------------------------------------
Stream creation | ffmpeg.input()         | Creates an input stream   | source, ss, t, format, framerate
                | ffmpeg.output()        | Creates an output stream  | target, vcodec, acodec, format
                | ffmpeg.probe()         | Analyzes a media file     | source, select_streams
Stream merging  | ffmpeg.concat()        | Concatenates streams      | v (video), a (audio), unsafe
                | ffmpeg.join()          | Joins streams             | streams to join
                | ffmpeg.merge_outputs() | Merges outputs            | multiple output streams
Filtering       | .filter()              | Applies a filter          | name, positional and named arguments
                | .video                 | Selects video stream      | none
                | .audio                 | Selects audio stream      | none
Overlay         | .overlay()             | Overlays video            | overlay_stream, x, y
Execution       | .run()                 | Runs the command          | capture_stdout, capture_stderr, overwrite_output
                | .compile()             | Compiles to a command     | cmd, overwrite_output
                | .get_args()            | Gets arguments            | none
Settings        | ffmpeg.global_args()   | Global parameters         | FFmpeg flags

Detailed review of key methods

ffmpeg.input() — creating an input stream

The input() method supports many parameters for precise data‑reading configuration:

import ffmpeg

# Basic usage
stream = ffmpeg.input('video.mp4')

# With time range
stream = ffmpeg.input('video.mp4', ss=10, t=30)  # from 10 s to 40 s

# Reading from a URL
stream = ffmpeg.input('http://example.com/stream.m3u8')

# Capture from a webcam (Linux)
stream = ffmpeg.input('/dev/video0', format='v4l2', framerate=30, video_size='1280x720')

# Reading images by pattern
stream = ffmpeg.input('img%03d.png', format='image2', framerate=1)

Key input() parameters:

  • ss — start position (seconds or HH:MM:SS)
  • t — duration to read
  • format — force a specific format
  • framerate — input frame rate
  • video_size — video size for capture devices
  • loop — loop the input

ffmpeg.output() — configuring output

The output() method offers extensive options for the resulting file:

# Simple output
ffmpeg.output(stream, 'output.mp4')

# With explicit codecs
ffmpeg.output(stream, 'output.mp4', vcodec='libx264', acodec='aac')

# Bitrate and quality settings
ffmpeg.output(stream, 'output.mp4',
              video_bitrate='2M',
              audio_bitrate='128k',
              crf=23)

# Output to multiple formats simultaneously
ffmpeg.output(stream, 'output.mp4', vcodec='libx264')
ffmpeg.output(stream, 'output.webm', vcodec='libvpx-vp9')

Important output() parameters:

  • vcodec, acodec — video and audio codecs
  • video_bitrate, audio_bitrate — stream bitrates
  • crf — constant‑rate‑factor (0‑51, lower = better quality)
  • preset — encoding speed vs. quality trade‑off
  • format — container format

Filters — the heart of processing

The filter system lets you perform virtually any transformation:

# Resize
stream = ffmpeg.input('input.mp4').filter('scale', 1920, 1080)

# Crop video
stream = ffmpeg.input('input.mp4').filter('crop', 640, 480, 100, 50)

# Rotate 90 degrees
stream = ffmpeg.input('input.mp4').filter('transpose', 1)

# Change playback speed
stream = ffmpeg.input('input.mp4').filter('setpts', '0.5*PTS')

# Add text overlay
stream = ffmpeg.input('input.mp4').filter('drawtext',
                                          text='Sample Text',
                                          x=10, y=10,
                                          fontsize=24,
                                          fontcolor='white')

Advanced usage examples

Real‑time streaming

Modern streaming platforms require high‑quality live content:

import ffmpeg

# Core streaming configuration
def setup_streaming():
    return (
        ffmpeg
        .input('/dev/video0', format='v4l2', framerate=30, video_size='1920x1080')
        .output(
            'rtmp://live.twitch.tv/app/YOUR_STREAM_KEY',
            vcodec='libx264',
            preset='veryfast',
            tune='zerolatency',
            maxrate='3000k',
            bufsize='6000k',
            pix_fmt='yuv420p',
            g=50,
            acodec='aac',
            audio_bitrate='160k',
            ar=44100,
            format='flv'
        )
        .global_args('-re')  # real‑time mode
        .run()
    )

# Stream webcam with microphone
def stream_webcam_with_audio():
    video = ffmpeg.input('/dev/video0', format='v4l2', framerate=25, video_size='1280x720')
    audio = ffmpeg.input('default', format='pulse')  # Linux PulseAudio default source
    
    return (
        ffmpeg
        .output(
            video, audio,
            'rtmp://live.youtube.com/live2/YOUR_STREAM_KEY',
            vcodec='libx264',
            acodec='aac',
            preset='fast',
            crf=28,
            format='flv'
        )
        .global_args('-re')
        .run()
    )

Key streaming parameters:

  • -re — read input in real time
  • preset='veryfast' — fast encoding
  • tune='zerolatency' — minimal delay
  • maxrate, bufsize — bitrate control

Batch processing and parallel tasks

When handling large numbers of files, efficiency is critical:

import os
import ffmpeg
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
from pathlib import Path
import logging

# Logging configuration
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def convert_video(input_path, output_dir, target_format='mp4', quality='medium'):
    """Convert a single video file"""
    input_file = Path(input_path)
    output_file = Path(output_dir) / f"{input_file.stem}.{target_format}"
    
    quality_settings = {
        'high': {'crf': 18, 'preset': 'slow'},
        'medium': {'crf': 23, 'preset': 'medium'},
        'low': {'crf': 28, 'preset': 'fast'}
    }
    
    settings = quality_settings.get(quality, quality_settings['medium'])
    
    try:
        (
            ffmpeg
            .input(str(input_file))
            .output(
                str(output_file),
                vcodec='libx264',
                acodec='aac',
                **settings
            )
            .run(overwrite_output=True, quiet=True)
        )
        logger.info(f"Successfully converted: {input_file.name}")
        return True
    except ffmpeg.Error as e:
        logger.error(f"Conversion error {input_file.name}: {e}")
        return False

def batch_convert(input_dir, output_dir, max_workers=4, target_format='mp4'):
    """Batch conversion with parallel processing"""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(exist_ok=True)
    
    video_extensions = ['.mp4', '.avi', '.mkv', '.mov', '.wmv', '.flv', '.webm']
    video_files = [
        f for f in input_path.iterdir() 
        if f.suffix.lower() in video_extensions
    ]
    
    logger.info(f"Found {len(video_files)} video files for processing")
    
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [
            executor.submit(convert_video, str(video_file), str(output_path), target_format)
            for video_file in video_files
        ]
        
        results = [future.result() for future in futures]
        
    successful = sum(results)
    logger.info(f"Processed {successful} out of {len(video_files)} files successfully")

# Usage
if __name__ == "__main__":
    batch_convert("input_videos", "output_videos", max_workers=4)

Complex processing graphs with multiple filters

Professional effects often require chaining several filters:

import ffmpeg

def create_professional_video_effect(input_path, output_path):
    """Create a professional video effect"""
    
    # Base video stream
    main_video = ffmpeg.input(input_path)
    
    # Apply a sequence of filters
    processed = (
        main_video
        .filter('scale', 1920, 1080)          # Standardize size
        .filter('fps', fps=24, round='up')   # Set 24 fps
        .filter('eq', contrast=1.2, brightness=0.1, saturation=1.3)  # Color correction
        .filter('unsharp', 5, 5, 0.8, 3, 3, 0.4)  # Sharpen
        .filter('drawtext',
                text='Professional Edit',
                x='(w-text_w)/2',
                y='h-th-20',
                fontsize=48,
                fontcolor='white',
                shadowcolor='black',
                shadowx=3,
                shadowy=3,
                enable='between(t,0,5)')   # Show text for first 5 s
    )
    
    # Add fade‑in/out
    with_fades = (
        processed
        .filter('fade', type='in', duration=1)   # Fade‑in
        .filter('fade', type='out', start_time=25, duration=2)  # Fade‑out
    )
    
    # Final output
    (
        ffmpeg
        .output(with_fades, output_path,
                vcodec='libx264',
                acodec='aac',
                crf=18,
                preset='slow',
                pix_fmt='yuv420p')
        .run(overwrite_output=True)
    )

def create_video_montage(clips, output_path):
    """Create a montage from several clips"""
    
    # Normalize all clips to the same size and fps
    normalized_clips = []
    for clip_path in clips:
        clip = (
            ffmpeg
            .input(clip_path)
            .filter('scale', 1920, 1080)
            .filter('fps', fps=30)
            .filter('setsar', 1)  # Set pixel aspect ratio
        )
        normalized_clips.append(clip)
    
    # Concatenate clips (video only -- the filters above carry no audio,
    # so a=1 would make concat expect audio streams that aren't there)
    concatenated = ffmpeg.concat(*normalized_clips, v=1, a=0)
    
    # Add transitions between clips
    with_transitions = (
        concatenated
        .filter('fade', type='in', duration=0.5)
        .filter('fade', type='out', start_time=28, duration=0.5)
    )
    
    # Output file
    (
        ffmpeg
        .output(with_transitions, output_path,
                vcodec='libx264',
                acodec='aac',
                crf=20,
                preset='medium')
        .run(overwrite_output=True)
    )

# Example usage
create_professional_video_effect('raw_video.mp4', 'processed_video.mp4')
create_video_montage(['clip1.mp4', 'clip2.mp4', 'clip3.mp4'], 'montage.mp4')

Generating thumbnails and preview grids

Creating attractive previews is a vital part of video content workflows:

import ffmpeg
import os
from pathlib import Path

def generate_thumbnails(video_path, output_dir, count=9, quality='high'):
    """Generate a grid of preview frames"""
    
    # Get video duration from the container metadata
    probe = ffmpeg.probe(video_path)
    duration = float(probe['format']['duration'])
    
    # Create output directory
    thumb_dir = Path(output_dir)
    thumb_dir.mkdir(exist_ok=True)
    
    # Generate frames at equal intervals
    interval = duration / (count + 1)
    
    for i in range(1, count + 1):
        timestamp = interval * i
        thumb_path = thumb_dir / f'thumb_{i:02d}.jpg'
        
        (
            ffmpeg
            .input(video_path, ss=timestamp)
            .output(str(thumb_path), 
                    vframes=1,
                    vf='scale=320:240',
                    q=2 if quality == 'high' else 5)
            .run(overwrite_output=True, quiet=True)
        )
    
    print(f"Created {count} thumbnails in {output_dir}")

def create_video_grid(video_path, output_path, rows=3, cols=3):
    """Create a thumbnail grid from a single video"""
    
    probe = ffmpeg.probe(video_path)
    duration = float(probe['format']['duration'])
    total_thumbs = rows * cols
    interval = duration / (total_thumbs + 1)
    
    # Capture temporary frames
    temp_frames = []
    for i in range(1, total_thumbs + 1):
        timestamp = interval * i
        temp_path = f'temp_frame_{i:02d}.jpg'
        
        (
            ffmpeg
            .input(video_path, ss=timestamp)
            .output(temp_path, vframes=1, vf='scale=320:240')
            .run(overwrite_output=True, quiet=True)
        )
        temp_frames.append(temp_path)
    
    # Build grid
    rows_streams = []
    for row in range(rows):
        row_frames = temp_frames[row*cols:(row+1)*cols]
        row_inputs = [ffmpeg.input(frame) for frame in row_frames]
        row_stream = ffmpeg.filter(row_inputs, 'hstack', inputs=cols)
        rows_streams.append(row_stream)
    
    # Stack rows vertically
    grid = ffmpeg.filter(rows_streams, 'vstack', inputs=rows)
    
    (
        ffmpeg
        .output(grid, output_path, q=2)
        .run(overwrite_output=True)
    )
    
    # Clean up temporary files
    for temp_file in temp_frames:
        os.remove(temp_file)
    
    print(f"Created thumbnail grid: {output_path}")

def create_animated_gif(video_path, output_path, start_time=0, duration=3, fps=10, width=480):
    """Create an animated GIF from a video segment"""
    
    # First pass: generate palette
    palette_path = output_path.replace('.gif', '_palette.png')
    
    (
        ffmpeg
        .input(video_path, ss=start_time, t=duration)
        .filter('fps', fps)
        .filter('scale', width, -1, flags='lanczos')
        .filter('palettegen')
        .output(palette_path)
        .run(overwrite_output=True, quiet=True)
    )
    
    # Second pass: create GIF using the palette (paletteuse takes the
    # palette as a second input stream, not as a parameter)
    video = (
        ffmpeg
        .input(video_path, ss=start_time, t=duration)
        .filter('fps', fps)
        .filter('scale', width, -1, flags='lanczos')
    )
    palette = ffmpeg.input(palette_path)
    (
        ffmpeg
        .filter([video, palette], 'paletteuse')
        .output(output_path)
        .run(overwrite_output=True)
    )
    
    os.remove(palette_path)
    print(f"Created animated GIF: {output_path}")

# Example usage
generate_thumbnails('video.mp4', 'thumbnails', count=12)
create_video_grid('video.mp4', 'grid_preview.jpg', rows=3, cols=4)
create_animated_gif('video.mp4', 'preview.gif', start_time=10, duration=5)

Working with subtitles and metadata

Extracting and processing subtitles

Subtitles are an important part of modern video content:

import ffmpeg
import json
from pathlib import Path

def extract_subtitles(video_path, output_dir=None):
    """Extract all subtitle streams from a video file"""
    
    if output_dir is None:
        output_dir = Path(video_path).parent
    else:
        output_dir = Path(output_dir)
    
    output_dir.mkdir(exist_ok=True)
    
    # Probe the file for subtitle streams
    probe = ffmpeg.probe(video_path)
    subtitle_streams = [
        stream for stream in probe['streams'] 
        if stream['codec_type'] == 'subtitle'
    ]
    
    if not subtitle_streams:
        print("No subtitles found in the file")
        return []
    
    extracted_files = []
    
    for i, stream in enumerate(subtitle_streams):
        codec = stream.get('codec_name', 'unknown')
        language = stream.get('tags', {}).get('language', 'unknown')
        
        extension_map = {
            'subrip': 'srt',
            'ass': 'ass',
            'ssa': 'ssa',
            'webvtt': 'vtt',
            'mov_text': 'srt'
        }
        extension = extension_map.get(codec, 'srt')
        
        base_name = Path(video_path).stem
        subtitle_file = output_dir / f"{base_name}_{language}_{i}.{extension}"
        
        try:
            (
                ffmpeg
                .input(video_path)
                .output(str(subtitle_file), 
                        map=f'0:s:{i}',
                        codec='copy' if extension != 'srt' else 'srt')
                .run(overwrite_output=True, quiet=True)
            )
            extracted_files.append(str(subtitle_file))
            print(f"Extracted subtitles: {subtitle_file}")
        except ffmpeg.Error as e:
            print(f"Error extracting subtitle {i}: {e}")
    
    return extracted_files

def embed_subtitles(video_path, subtitle_path, output_path, language='eng'):
    """Embed external subtitles into a video file"""
    
    video = ffmpeg.input(video_path)
    subtitles = ffmpeg.input(subtitle_path)
    
    # Both inputs passed to output() are mapped automatically; note that
    # a dict literal cannot repeat the 'map' key, so the original explicit
    # maps silently collapsed to one.
    (
        ffmpeg
        .output(video, subtitles, output_path,
                codec='copy',
                **{
                    'c:s': 'mov_text',
                    'metadata:s:s:0': f'language={language}'
                })
        .run(overwrite_output=True)
    )
    
    print(f"Subtitles embedded into {output_path}")

def burn_subtitles(video_path, subtitle_path, output_path, font_size=24, font_color='FFFFFF'):
    """Burn subtitles onto the video (hard subtitles); font_color is a BBGGRR hex string as used in ASS styles (FFFFFF = white)"""
    
    (
        ffmpeg
        .input(video_path)
        .output(output_path,
                vf=f"subtitles='{subtitle_path}':force_style='FontSize={font_size},PrimaryColour=&H00{font_color}'",
                codec='libx264',
                acodec='copy')
        .run(overwrite_output=True)
    )
    
    print(f"Subtitles burned onto video: {output_path}")

Working with metadata

Metadata contains important information about media files:

import ffmpeg
import json

def extract_metadata(video_path, output_file=None):
    """Extract detailed metadata"""
    
    probe_data = ffmpeg.probe(video_path)
    
    metadata = {
        'format': probe_data.get('format', {}),
        'streams': []
    }
    
    for stream in probe_data.get('streams', []):
        stream_info = {
            'index': stream.get('index'),
            'codec_type': stream.get('codec_type'),
            'codec_name': stream.get('codec_name'),
            'duration': stream.get('duration'),
            'tags': stream.get('tags', {})
        }
        
        if stream['codec_type'] == 'video':
            stream_info.update({
                'width': stream.get('width'),
                'height': stream.get('height'),
                'avg_frame_rate': stream.get('avg_frame_rate'),
                'pix_fmt': stream.get('pix_fmt')
            })
        elif stream['codec_type'] == 'audio':
            stream_info.update({
                'sample_rate': stream.get('sample_rate'),
                'channels': stream.get('channels'),
                'channel_layout': stream.get('channel_layout')
            })
        
        metadata['streams'].append(stream_info)
    
    if output_file:
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump(metadata, f, indent=2, ensure_ascii=False)
        print(f"Metadata saved to {output_file}")
    
    return metadata

def update_metadata(video_path, output_path, metadata_dict):
    """Update video file metadata"""
    
    metadata_args = {}
    for key, value in metadata_dict.items():
        metadata_args[f'metadata:{key}'] = str(value)
    
    (
        ffmpeg
        .input(video_path)
        .output(output_path, codec='copy', **metadata_args)
        .run(overwrite_output=True)
    )
    
    print(f"Metadata updated in {output_path}")

def set_video_metadata(video_path, output_path, title=None, artist=None, album=None, year=None):
    """Set basic video metadata"""
    
    metadata = {}
    if title:
        metadata['title'] = title
    if artist:
        metadata['artist'] = artist
    if album:
        metadata['album'] = album
    if year:
        metadata['year'] = year
    
    update_metadata(video_path, output_path, metadata)

# Example usage
extract_subtitles('movie.mkv', 'subtitles')
embed_subtitles('video.mp4', 'subtitles.srt', 'video_with_subs.mp4', 'rus')
metadata = extract_metadata('video.mp4', 'video_metadata.json')
set_video_metadata('input.mp4', 'output.mp4', 
                   title='My Video', artist='Creator', year='2024')

Performance optimization and debugging

Progress monitoring

When processing large files, tracking progress is essential:

import ffmpeg
import subprocess
import re
import threading
from datetime import timedelta

def run_with_progress(stream, duration=None):
    """Run FFmpeg while displaying progress"""
    
    cmd = stream.compile()
    
    # Get duration if not provided
    if duration is None and len(cmd) > 1:
        try:
            probe = ffmpeg.probe(cmd[cmd.index('-i') + 1])
            duration = float(probe['format']['duration'])
        except Exception:
            duration = None
    
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True
    )
    
    def monitor_progress():
        while True:
            output = process.stderr.readline()
            if output == '' and process.poll() is not None:
                break
            
            if output and duration:
                time_match = re.search(r'time=(\d+):(\d+):(\d+)\.(\d+)', output)
                if time_match:
                    hours, minutes, seconds, ms = map(int, time_match.groups())
                    current_time = hours * 3600 + minutes * 60 + seconds + ms / 100
                    progress = (current_time / duration) * 100
                    print(f"\rProgress: {progress:.1f}% ({current_time:.1f}s / {duration:.1f}s)", end='')
    
    monitor_thread = threading.Thread(target=monitor_progress)
    monitor_thread.daemon = True
    monitor_thread.start()
    
    process.wait()
    print("\nProcessing completed")
    
    return process.returncode

# Example usage
stream = (
    ffmpeg
    .input('large_video.mp4')
    .output('compressed.mp4', vcodec='libx264', crf=28)
)

run_with_progress(stream)

Settings for optimal performance

import subprocess
import ffmpeg

def optimize_for_speed(input_path, output_path):
    """Optimize for maximum processing speed"""
    
    (
        ffmpeg
        .input(input_path)
        .output(output_path,
                vcodec='libx264',
                preset='ultrafast',
                crf=28,
                threads=0,
                acodec='copy')
        .run(overwrite_output=True)
    )

def optimize_for_quality(input_path, output_path):
    """Optimize for highest quality"""
    
    (
        ffmpeg
        .input(input_path)
        .output(output_path,
                vcodec='libx264',
                preset='veryslow',
                crf=18,
                pix_fmt='yuv420p',
                acodec='aac',
                audio_bitrate='320k',
                ar=48000)
        .run(overwrite_output=True)
    )

def get_hardware_acceleration():
    """Check for available hardware acceleration"""
    
    try:
        result = subprocess.run(['ffmpeg', '-encoders'], 
                              capture_output=True, text=True)
        accelerations = []
        if 'h264_nvenc' in result.stdout:
            accelerations.append('nvenc')
        if 'h264_vaapi' in result.stdout:
            accelerations.append('vaapi')
        if 'h264_videotoolbox' in result.stdout:
            accelerations.append('videotoolbox')
        return accelerations
    except Exception:
        return []

def encode_with_gpu(input_path, output_path):
    """Encode using GPU acceleration if available"""
    
    accelerations = get_hardware_acceleration()
    
    if 'nvenc' in accelerations:
        (
            ffmpeg
            .input(input_path)
            .output(output_path,
                    vcodec='h264_nvenc',
                    preset='fast',
                    cq=23,
                    acodec='copy')
            .run(overwrite_output=True)
        )
        print("Used NVIDIA GPU acceleration")
    
    elif 'vaapi' in accelerations:
        (
            ffmpeg
            .input(input_path, hwaccel='vaapi', 
                   hwaccel_device='/dev/dri/renderD128')
            .output(output_path,
                    vcodec='h264_vaapi',
                    vf='format=nv12,hwupload',
                    cq=23,
                    acodec='copy')
            .run(overwrite_output=True)
        )
        print("Used VAAPI acceleration")
    
    else:
        print("No hardware acceleration available, falling back to CPU")
        optimize_for_speed(input_path, output_path)

Error handling and debugging

import ffmpeg
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def safe_conversion(input_path, output_path, max_retries=3):
    """Robust conversion with retries and error handling"""
    
    for attempt in range(max_retries):
        try:
            logger.info(f"Attempt {attempt + 1} converting {input_path}")
            
            # Verify input file
            probe = ffmpeg.probe(input_path)
            logger.debug(f"Input file info: {probe['format']}")
            
            # Perform conversion
            (
                ffmpeg
                .input(input_path)
                .output(output_path, vcodec='libx264', acodec='aac')
                .global_args('-loglevel', 'verbose')
                .run(overwrite_output=True)
            )
            
            logger.info(f"Successfully converted: {output_path}")
            return True
            
        except ffmpeg.Error as e:
            error_message = e.stderr.decode() if e.stderr else str(e)
            logger.error(f"FFmpeg error (attempt {attempt + 1}): {error_message}")
            
            if attempt == max_retries - 1:
                logger.error("All attempts exhausted")
                return False
            
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
            return False
    
    return False

def debug_command(stream):
    """Print the FFmpeg command for debugging"""
    
    cmd = stream.compile()
    logger.debug("FFmpeg command:")
    logger.debug(" ".join(cmd))
    
    with open('debug_command.txt', 'w') as f:
        f.write(" ".join(cmd))
    
    return cmd

# Debug example
stream = (
    ffmpeg
    .input('input.mp4')
    .output('output.mp4', vcodec='libx264')
)

debug_command(stream)
safe_conversion('input.mp4', 'output.mp4')

Integration with other Python libraries

Working with OpenCV

Combining ffmpeg‑python with OpenCV opens a wide range of possibilities:

import ffmpeg
import cv2
import numpy as np
from pathlib import Path

def extract_frames_to_opencv(video_path, start_time=0, duration=None):
    """Extract video frames for OpenCV processing"""
    
    input_args = {'ss': start_time}
    if duration:
        input_args['t'] = duration
    
    # Find the video stream (it is not guaranteed to be streams[0])
    probe = ffmpeg.probe(video_path)
    video_stream = next(s for s in probe['streams'] if s['codec_type'] == 'video')
    width = int(video_stream['width'])
    height = int(video_stream['height'])
    
    process = (
        ffmpeg
        .input(video_path, **input_args)
        .output('pipe:', format='rawvideo', pix_fmt='bgr24')
        .run_async(pipe_stdout=True, pipe_stderr=True)
    )
    
    frames = []
    frame_size = width * height * 3
    
    while True:
        in_bytes = process.stdout.read(frame_size)
        if not in_bytes:
            break
        frame = np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3])
        frames.append(frame)
    
    process.wait()
    return frames

def apply_opencv_processing(input_path, output_path, processing_func):
    """Apply OpenCV processing to a video via ffmpeg"""
    
    probe = ffmpeg.probe(input_path)
    video_stream = next(s for s in probe['streams'] if s['codec_type'] == 'video')
    width = int(video_stream['width'])
    height = int(video_stream['height'])
    num, den = video_stream['avg_frame_rate'].split('/')
    fps = int(num) / int(den)  # avoid eval() on probe output
    
    input_process = (
        ffmpeg
        .input(input_path)
        .output('pipe:', format='rawvideo', pix_fmt='bgr24')
        .run_async(pipe_stdout=True)
    )
    
    output_process = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='bgr24', 
               s=f'{width}x{height}', r=f'{fps}')
        .output(output_path, vcodec='libx264', pix_fmt='yuv420p')
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    
    frame_size = width * height * 3
    
    while True:
        in_bytes = input_process.stdout.read(frame_size)
        if not in_bytes:
            break
        
        frame = np.frombuffer(in_bytes, np.uint8).reshape([height, width, 3])
        processed_frame = processing_func(frame)
        output_process.stdin.write(processed_frame.tobytes())
    
    input_process.wait()
    output_process.stdin.close()
    output_process.wait()

def add_edge_detection(frame):
    """Edge detection overlay"""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    edges_colored = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
    return cv2.addWeighted(frame, 0.7, edges_colored, 0.3, 0)

# Example usage
apply_opencv_processing('input.mp4', 'output_edges.mp4', add_edge_detection)
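The probe data returns avg_frame_rate as a string fraction such as '30000/1001', so it deserves careful parsing rather than eval(). A minimal helper (a sketch, not part of ffmpeg-python's API) handles fractional rates, plain integers, and the '0/0' some streams report:

```python
from fractions import Fraction

def parse_frame_rate(rate_str, default=25.0):
    """Parse an FFmpeg rate string like '30000/1001' or '25' without eval()."""
    try:
        rate = Fraction(rate_str)  # Fraction accepts 'num/den' and plain integers
    except (ValueError, ZeroDivisionError):  # e.g. streams reporting '0/0'
        return default
    return float(rate) if rate else default

print(round(parse_frame_rate('30000/1001'), 2))  # → 29.97
print(parse_frame_rate('0/0'))                   # → 25.0
```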

Integration with NumPy and scientific libraries

import ffmpeg
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
import librosa

def analyze_video_audio(video_path):
    """Analyze a video's audio track with scientific libraries"""
    
    # Ask FFmpeg for raw 16-bit mono PCM so there is no WAV header to strip from the buffer
    out, _ = (
        ffmpeg
        .input(video_path)
        .output('pipe:', format='s16le', acodec='pcm_s16le', ac=1, ar=44100)
        .run(capture_stdout=True)
    )
    
    audio_data = np.frombuffer(out, np.int16)
    y = audio_data.astype(np.float32) / 32768.0
    sr = 44100  # matches the ar= rate requested above
    
    tempo, beats = librosa.beat.beat_track(y=y, sr=sr)
    spectral_centroids = librosa.feature.spectral_centroid(y=y, sr=sr)[0]
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    
    return {
        'tempo': tempo,
        'beats': beats,
        'spectral_centroids': spectral_centroids,
        'mfcc': mfcc,
        'raw_audio': y
    }

def create_visualization_video(audio_features, output_path, duration):
    """Create a video visualizing audio analysis results"""
    
    fps = 30
    total_frames = int(duration * fps)
    frames = []
    
    for frame_idx in range(total_frames):
        fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6))
        time_idx = int(frame_idx * len(audio_features['spectral_centroids']) / total_frames)
        ax1.plot(audio_features['spectral_centroids'][:time_idx])
        ax1.set_title('Spectral Centroid')
        ax1.set_xlim(0, len(audio_features['spectral_centroids']))
        ax2.imshow(audio_features['mfcc'][:, :time_idx], aspect='auto', origin='lower')
        ax2.set_title('MFCC')
        plt.tight_layout()
        fig.canvas.draw()
        # tostring_rgb() was removed in recent Matplotlib; buffer_rgba() works across versions
        frame = np.asarray(fig.canvas.buffer_rgba())[:, :, :3].copy()
        frames.append(frame)
        plt.close(fig)
    
    height, width, _ = frames[0].shape
    
    process = (
        ffmpeg
        .input('pipe:', format='rawvideo', pix_fmt='rgb24', 
               s=f'{width}x{height}', r=fps)
        .output(output_path, vcodec='libx264', pix_fmt='yuv420p')
        .overwrite_output()
        .run_async(pipe_stdin=True)
    )
    
    for frame in frames:
        process.stdin.write(frame.tobytes())
    
    process.stdin.close()
    process.wait()

# Example usage
audio_features = analyze_video_audio('music_video.mp4')
create_visualization_video(audio_features, 'visualization.mp4', 30)
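The decoding step in analyze_video_audio normalizes raw 16-bit PCM into floats for librosa. That conversion is worth isolating and checking on its own; the helper below is a standalone sketch of the same math:

```python
import numpy as np

def pcm16_to_float(raw_bytes):
    """Convert little-endian signed 16-bit PCM bytes to float32 in [-1.0, 1.0)."""
    samples = np.frombuffer(raw_bytes, dtype='<i2')
    return samples.astype(np.float32) / 32768.0

silence = bytes(8)  # four zero samples
full_scale = np.array([32767, -32768], dtype='<i2').tobytes()
print(pcm16_to_float(silence))     # four zeros
print(pcm16_to_float(full_scale))  # ≈ [0.99997, -1.0]
```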

Frequently asked questions

How to convert a video to another format while preserving quality?

Use the CRF (Constant Rate Factor) parameter for maximum quality retention:

(
    ffmpeg
    .input('input.avi')
    .output('output.mp4', vcodec='libx264', crf=18, preset='slow')
    .run(overwrite_output=True)
)

CRF values between 18 and 23 give visually high quality (lower values mean better quality but larger files), while preset='slow' trades encoding time for better compression efficiency.

How to reduce file size without noticeable quality loss?

Use two‑pass encoding with a target bitrate:

# First pass: analyze only, discard the output (use 'NUL' instead of '/dev/null' on Windows)
(
    ffmpeg
    .input('input.mp4')
    .output('/dev/null', vcodec='libx264', video_bitrate='2M', format='null', an=None,
            **{'pass': 1, 'passlogfile': 'pass'})  # 'pass' is a Python keyword, hence the dict
    .run(overwrite_output=True)
)

# Second pass: encode for real, using the statistics collected in the first pass
(
    ffmpeg
    .input('input.mp4')
    .output('output.mp4', vcodec='libx264', video_bitrate='2M', acodec='aac',
            **{'pass': 2, 'passlogfile': 'pass'})
    .run(overwrite_output=True)
)
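Two-pass encoding is usually driven by a target file size, so it helps to derive the bitrate from that size. The helper below is an illustrative sketch of the arithmetic (the function name and defaults are my own, not part of any library):

```python
def target_video_bitrate(target_mb, duration_s, audio_kbps=128):
    """Estimate the video bitrate (kbit/s) that hits a target file size.

    target_mb  - desired file size in megabytes
    duration_s - clip length in seconds
    audio_kbps - bitrate reserved for the audio track
    """
    total_kbits = target_mb * 8 * 1024             # MB -> kbit
    video_kbps = total_kbits / duration_s - audio_kbps
    return max(int(video_kbps), 0)                 # never negative

# 100 MB target for a 10-minute clip with 128 kbit/s audio
print(target_video_bitrate(100, 600))  # → 1237
```

Pass the result as the video bitrate in both passes. Real files also carry container overhead, so shaving a few percent off the target is prudent.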

How to concatenate several videos into one file?

There are multiple approaches depending on requirements:

# Simple concatenation (all files must share the same parameters)
video1 = ffmpeg.input('video1.mp4')
video2 = ffmpeg.input('video2.mp4')
video3 = ffmpeg.input('video3.mp4')

(
    ffmpeg
    .concat(video1, video2, video3, v=1, a=1)
    .output('combined.mp4')
    .run(overwrite_output=True)
)

# Concatenation with parameter normalization
# (split video and audio explicitly; filtering an input selects only its video stream,
# so the audio must be passed alongside for concat with a=1 to work)
streams = []
for video_file in ['video1.mp4', 'video2.mp4', 'video3.mp4']:
    source = ffmpeg.input(video_file)
    video = source.video.filter('scale', 1920, 1080).filter('fps', fps=30)
    streams.extend([video, source.audio])

(
    ffmpeg
    .concat(*streams, v=1, a=1)
    .output('normalized_combined.mp4')
    .run(overwrite_output=True)
)
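A third approach, useful when the clips already share codecs and parameters, is FFmpeg's concat demuxer, which joins files without re-encoding. It needs a small text file listing the inputs; the helper below sketches that (the list format is FFmpeg's, the helper name is my own):

```python
from pathlib import Path

def build_concat_list(files, list_path='concat_list.txt'):
    """Write the text file expected by FFmpeg's concat demuxer."""
    lines = [f"file '{Path(f).as_posix()}'" for f in files]
    Path(list_path).write_text('\n'.join(lines) + '\n')
    return list_path

# Then join by stream copy (no re-encoding), roughly:
# ffmpeg.input(build_concat_list(['video1.mp4', 'video2.mp4']),
#              format='concat', safe=0).output('combined.mp4', c='copy').run()
```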

How to extract audio from a video file?

(
    ffmpeg
    .input('video.mp4')
    .output('audio.mp3', acodec='mp3', audio_bitrate='320k')
    .run(overwrite_output=True)
)

# Or without re‑encoding (if the audio is already in the desired format)
(
    ffmpeg
    .input('video.mp4')
    .output('audio.aac', vn=None, acodec='copy')
    .run(overwrite_output=True)
)

How to create a video from a sequence of images?

(
    ffmpeg
    .input('image_%03d.jpg', framerate=24)
    .output('slideshow.mp4', vcodec='libx264', pix_fmt='yuv420p')
    .run(overwrite_output=True)
)

# With added audio
video = ffmpeg.input('image_%03d.jpg', framerate=24)
audio = ffmpeg.input('background_music.mp3')

(
    ffmpeg
    .output(video, audio, 'slideshow_with_music.mp4',
            vcodec='libx264', acodec='aac', shortest=None)
    .run(overwrite_output=True)
)

How to change video resolution?

# Scale while preserving aspect ratio
(
    ffmpeg
    .input('input.mp4')
    .filter('scale', 1920, -2)  # -2 auto-calculates the height, rounded to an even value (required by libx264)
    .output('output_1920.mp4')
    .run(overwrite_output=True)
)

# Force a specific size (may distort aspect ratio)
(
    ffmpeg
    .input('input.mp4')
    .filter('scale', 1280, 720)
    .output('output_720p.mp4')
    .run(overwrite_output=True)
)

# Smart scaling with padding (letterboxing)
(
    ffmpeg
    .input('input.mp4')
    .filter('scale', 1920, 1080, force_original_aspect_ratio='decrease')
    .filter('pad', 1920, 1080, '(ow-iw)/2', '(oh-ih)/2', color='black')
    .output('output_letterbox.mp4')
    .run(overwrite_output=True)
)
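It is easy to predict what the scale + pad combination will produce for a given source. The helper below mirrors force_original_aspect_ratio='decrease' plus centered padding in plain Python (a sketch for sanity-checking dimensions, not an ffmpeg-python call):

```python
def letterbox_scale(src_w, src_h, dst_w, dst_h):
    """Mimic scale=...:force_original_aspect_ratio=decrease followed by centered pad."""
    ratio = min(dst_w / src_w, dst_h / src_h)
    new_w = int(src_w * ratio) // 2 * 2   # keep dimensions even for yuv420p
    new_h = int(src_h * ratio) // 2 * 2
    pad_x = (dst_w - new_w) // 2          # left offset, i.e. (ow-iw)/2
    pad_y = (dst_h - new_h) // 2          # top offset, i.e. (oh-ih)/2
    return new_w, new_h, pad_x, pad_y

# 4:3 source into a 1080p frame -> pillarbox bars of 240 px on each side
print(letterbox_scale(1440, 1080, 1920, 1080))  # → (1440, 1080, 240, 0)
```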

How to trim a video by time?

# Trim with start time and duration
(
    ffmpeg
    .input('input.mp4', ss=30, t=60)  # from 30 s, length 60 s
    .output('trimmed.mp4', codec='copy')  # stream copy: fast and lossless, but cuts snap to the nearest keyframes
    .run(overwrite_output=True)
)

# Trim with start and end timestamps
(
    ffmpeg
    .input('input.mp4', ss='00:01:30', to='00:03:45')
    .output('trimmed2.mp4', codec='copy')
    .run(overwrite_output=True)
)
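When working with ss/to timestamps it is handy to convert them to seconds, for example to compute the resulting clip length. A small parser (a sketch; FFmpeg itself accepts both notations directly):

```python
def timestamp_to_seconds(ts):
    """Convert 'HH:MM:SS', 'MM:SS' or plain seconds into float seconds."""
    seconds = 0.0
    for part in str(ts).split(':'):
        seconds = seconds * 60 + float(part)
    return seconds

start = timestamp_to_seconds('00:01:30')
end = timestamp_to_seconds('00:03:45')
print(end - start)  # → 135.0 — the clip length for ss='00:01:30', to='00:03:45'
```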

How to add a watermark to a video?

main_video = ffmpeg.input('video.mp4')
watermark = ffmpeg.input('watermark.png')

(
    ffmpeg
    .overlay(main_video, watermark,
             x='main_w-overlay_w-10',  # 10 px from right edge
             y='10')                   # 10 px from top edge
    .output('watermarked.mp4')
    .run(overwrite_output=True)
)

# Semi‑transparent watermark
watermark_with_alpha = (
    ffmpeg
    .input('watermark.png')
    .filter('format', 'rgba')
    .filter('colorchannelmixer', aa=0.5)  # 50 % opacity
)

(
    ffmpeg
    .overlay(main_video, watermark_with_alpha, x=10, y=10)
    .output('transparent_watermark.mp4')
    .run(overwrite_output=True)
)

Conclusion

The ffmpeg‑python library is a powerful tool for working with multimedia data within the Python ecosystem. It successfully combines the capabilities of the professional FFmpeg utility with the convenience and readability of Python code.

Key advantages of ffmpeg‑python:

Universality — support for all major video, audio, and image formats, plus real‑time stream handling.

Performance — optimized FFmpeg algorithms deliver high processing speed even for large files.

Flexibility — ability to build complex processing graphs with multiple inputs, outputs, and filter chains.

Integration — seamless work with popular Python libraries such as NumPy, OpenCV, and Matplotlib.

Scalability — support for parallel and batch operations suitable for industrial use.

Typical use cases:

  • Media production — editing, color grading, adding effects
  • Streaming and live broadcasting — creating live streams and real‑time processing
  • Automation — batch processing of large content volumes
  • Web development — generating previews, converting formats for various devices
  • Scientific research — analyzing multimedia data, computer vision
  • Archiving — conversion and compression for long‑term storage

By mastering the principles described in this article, you’ll have a professional toolkit capable of handling virtually any multimedia processing task—from simple conversions to sophisticated, automated pipelines that operate at industrial scale.

Remember that effective use of the library requires an understanding of both FFmpeg fundamentals and Python‑specific best practices. Experiment with different filters and parameters, study the FFmpeg documentation for deeper insight, and always prioritize testing and performance optimization in your projects.
