
I built a tool on Google Colab to easily create demo videos and convert them into GIFs or MP4s.

If this site helped you, please support us with a star! 🌟
Star on GitHub

Changelog

2026/03/29: Added BGM feature (MP4 output only).
  • Download audio from a YouTube URL and merge it into the final MP4 with configurable start position, speed, and volume
  • Advanced license auto-detection via metadata analysis
  • Slider changes reflected in the audio player in real time

2026/03/08: Applied # @title / # @markdown form mode to all cells. Overhauled the zoom UI:
  • Added "🏁 Record end position" button to allow direct end-time input
  • Hold duration is now calculated automatically
  • Added collapsible event panels and selection highlighting

2026/02/24: Overhauled the zoom feature.
  • Start times can now be recorded automatically during preview playback
  • Overlapping zoom events are detected and prevented

When you share a tool or project on GitHub or social media, attaching a demo GIF or video makes a real difference in how well it lands. But the moment you try to convert a screen recording, you run into the usual friction: SaaS tools need a paid plan for fine-grained quality settings, and local GUI apps have a habit of breaking whenever the OS updates.

So I built a tool that runs ffmpeg inside Google Colab's free environment and converts screen recordings directly to GIF or MP4. Everything runs in the browser — no environment setup, no software to install.

What I Built

🚀 Try It Now

No setup required. Open the link below and run it directly in your browser.

Why I Built This

Adding a GIF to a GitHub README noticeably improves first impressions. Seeing something in motion immediately tells visitors what a project actually does, and I think it meaningfully increases the chance they keep reading.

That said, converting a screen recording to a GIF has always been quietly annoying.

SaaS tools are convenient because they run in the browser, but fine-grained quality control tends to sit behind a paid plan. Local GUI apps work until they don't — an OS update often breaks them, and keeping them working takes ongoing effort.

As an engineer, spending time learning a video conversion tool isn't the point, and paying a recurring subscription for it feels like overkill. I figured I might as well build something I could reuse however I want.

Google Colab can run ffmpeg for free, and building a UI directly in the notebook means settings are handled through a GUI without ever touching the command line. That was the idea behind this tool.

How to Use It

Run cells from top to bottom and interact with the UI as prompted. No Python code required. Each cell has # @title form mode applied, so running in Colab's form view gives a clean, focused interface.
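As a rough illustration of what that form mode looks like (the parameter names and ranges here are mine, not necessarily the tool's), a Colab cell annotated this way renders as a titled panel with dropdowns and a slider instead of raw code:

```python
# @title Configure Format & Quality { display-mode: "form" }
# In Colab's form view, the # @param annotations below render as
# two dropdowns and a slider; the code itself stays hidden.
output_format = "GIF"  # @param ["GIF", "MP4 (H.264)"]
quality_preset = "GitHub README"  # @param ["GitHub README", "SNS / Lightweight", "High Quality", "Custom"]
playback_speed = 1.5  # @param {type:"slider", min:1.0, max:2.0, step:0.1}

print(output_format, quality_preset, playback_speed)
```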

1. Setup

Running the first cell installs ffmpeg and yt-dlp automatically. A _ffmpeg helper function is also defined here to make error output easier to handle. This only needs to be run once.

2. Upload Videos

A file selection dialog opens. Select the video file(s) you want to convert. Selecting multiple files enables merging.

Supported input formats: .mov .mp4 .avi .mkv .webm

3. Set Merge Order (Multiple Files Only)

If you uploaded multiple videos, use the dropdowns to specify the merge order. This step is skipped automatically for a single file.

4. Configure Format, Quality, and Speed

Use the radio buttons to choose an output format and quality level.

Choosing an Output Format

GIF
  Characteristics: auto-plays, loops, works in any Markdown renderer; 256 colors, larger files
  Best for: GitHub README, Zenn / Qiita
  Not ideal for: size-constrained environments

MP4 (H.264)
  Characteristics: high quality, small file size, full color; requires a media player
  Best for: X / Slack / GitHub README (click-to-play)
  Not ideal for: scenarios requiring auto-play or looping

GIF Quality Presets

Preset             Width    FPS    Colors  Use Case
GitHub README      960px    15fps  256     README embedding (recommended)
SNS / Lightweight  640px    10fps  128     Minimize file size
High Quality       1280px   20fps  256     Quality-first output
Custom             any      any    any     Fine-grained control

MP4 Quality Presets

Preset        CRF   Notes
High Quality  18    Larger file size
Standard      23    Balanced (recommended)
Lightweight   28    Smaller file size
Custom        0–51  Fine-grained control

Playback speed is also adjustable. Speeding up a long recording to 1.5x or 2.0x makes for a more compact demo.

5. Generate Preview

Run the cell to generate a preview. The preview is rendered at the same resolution and frame rate as the final output, so you can use it to identify exact timestamps for zoom configuration. If you change the format or quality settings, just re-run this cell to regenerate the preview; there's no need to start over from the beginning.

The preview is generated using the fast preset with no audio. Playback speed changes (via the setpts filter) are already applied at this stage.

6. Review Preview

The preview video appears inside the notebook with playback controls. Play and pause to find the exact timestamps where you want to apply zoom effects.

7. Zoom Settings (Optional)

You can add zoom effects to highlight specific moments in your demo.

How to use:

  1. Check "Add zoom" — the first zoom event is added automatically
  2. Play the embedded preview video and pause at the scene you want to zoom
  3. Click "📍 Record start position" to apply the current playback position to the selected event's start time
  4. Pause at the scene where you want the zoom to end, then click "🏁 Record end position" to set the end time. The hold duration (time spent at peak zoom) is calculated automatically from the in and out durations
  5. Each event's header can be collapsed by clicking it. Use the "Set as target event" button or the dropdown to switch which event you're recording to — the active event is highlighted with a green border
  6. Click "+ Add Zoom Event" to add more zoom points. Adding a new event collapses existing ones automatically, and the next event's start time is calculated from the end of the previous one
  7. Configure each event:
    • Zoom area: select from the 3×3 grid (top-left, center, bottom-right, etc.)
    • Max zoom level: peak magnification (1.1x – 5.0x)
    • Start (sec): when the zoom begins (can be set via the 📍 button)
    • In (sec): time to ramp from 1x to peak zoom
    • Out (sec): time to ramp back down to 1x
    • End (sec): when the zoom effect fully ends (can be set via the 🏁 button). Hold duration is auto-calculated as End − Start − In − Out

Timeline example: with Start=5.0, In=0.3, Out=0.3, End=6.6:

  • Zoom begins at 5.0 seconds
  • Ramps from 1x to peak over 0.3 seconds (5.0 → 5.3 sec)
  • Holds at peak for 1.0 second (5.3 → 6.3 sec) ← auto-calculated
  • Ramps back to 1x over 0.3 seconds (6.3 → 6.6 sec)
  • Normal playback resumes after 6.6 seconds
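The auto-calculation is simple arithmetic; a minimal sketch (the function name is mine, not the tool's):

```python
def hold_duration(start: float, end: float, ramp_in: float, ramp_out: float) -> float:
    """Time held at peak zoom: End - Start - In - Out."""
    hold = end - start - ramp_in - ramp_out
    if hold < 0:
        raise ValueError("End is too early to fit the in/out ramps")
    return round(hold, 2)

# The timeline above: Start=5.0, In=0.3, Out=0.3, End=6.6
print(hold_duration(5.0, 6.6, 0.3, 0.3))  # → 1.0
```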

Tips:

  • If zoom events overlap in time, an error is reported when the final output runs
  • The 3×3 grid makes it straightforward to zoom in on a specific UI element
  • Unchecking "Add zoom" disables all zoom effects without deleting your settings

Use cases:

  • Draw attention to a button click in a UI walkthrough
  • Highlight a code change in an editor
  • Zoom in on specific data points in a dashboard demo
  • Focus on a form field during an input demo

8. BGM Settings (Optional — MP4 Output Only)

When exporting as MP4, you can add background music sourced from a YouTube video.

How to use:

  1. Check "Add BGM" — the option is disabled when GIF is selected
  2. Enter a YouTube URL and click "Download". The tool fetches metadata first, then downloads the audio
  3. Preview the downloaded audio and adjust the following settings:
    • Start position (sec): which point in the audio to begin playback from. Click "📍 Record start position" while the audio is playing to apply the current position
    • ⚡ Speed: playback speed of the BGM itself (0.5x – 2.0x)
    • 🔊 Volume: adjust from 0.0 to 1.0 — slider changes are reflected in the audio player in real time
  4. Run step 9 to generate the final output

Auto loop / cut: If the audio remaining from the start position (after speed adjustment) is shorter than the video, it loops automatically. If it's longer, it is cut at the point the video ends.
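Whether a given clip will loop or be cut follows from the durations alone; an illustrative sketch (the names are mine):

```python
def bgm_plan(video_dur: float, audio_dur: float, start_pos: float, speed: float) -> str:
    """Decide whether the BGM will loop or be cut to fit the video."""
    # Audio remaining after the start offset; playing faster shortens
    # its wall-clock duration, hence the division by speed.
    remaining = (audio_dur - start_pos) / speed
    return "loop" if remaining < video_dur else "cut"

print(bgm_plan(video_dur=30.0, audio_dur=120.0, start_pos=100.0, speed=1.0))  # → loop
print(bgm_plan(video_dur=30.0, audio_dur=120.0, start_pos=10.0, speed=2.0))   # → cut
```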

License detection: Metadata is analyzed at download time and a badge is displayed to indicate the likely copyright status:

  • 🚫 Commercial track detected: artist / track metadata fields are present
  • 🚫 Official license notice found: description contains "Licensed to YouTube by"
  • ✅ Likely royalty-free: title or uploader name contains keywords like NCS, No Copyright, etc.
  • ✅ Creative Commons: license field contains a CC identifier
  • ⚠️ License unknown: none of the above conditions match

In all cases, please verify usage rights yourself before publishing.

9. Final Output

Run the cell to generate the final output with all settings applied, including zoom effects and BGM. To adjust zoom and re-export, modify the zoom settings and re-run this cell.

If the resolution settings were changed after generating the preview, an error is raised here prompting you to regenerate the preview. This prevents mismatches between the preview and final output.

Final output behavior:

  • GIF: full palette generation using the selected dithering method
  • MP4 (no BGM): re-encoded with the slow preset for maximum compression efficiency (no audio)
  • MP4 (with BGM): audio with speed and volume applied is merged in. Automatically loops or cuts to match the video length

10. Select Save Destination

Select a destination using the radio buttons.

  • Download locally: saved via the browser download dialog
  • Save to Google Drive: auto-copied to the specified path (includes a mount step)
  • Both: executes both options simultaneously

11. Save

Run this cell to write the file to the selected destination. Separating destination selection and the save action makes it easier to review your choice before committing.

Technical Notes

A few implementation details worth highlighting.

  • High-quality GIF conversion

    palettegen / paletteuse are FFmpeg's dedicated two-pass GIF filters. palettegen analyzes the entire video to generate an optimal 256-color palette, and paletteuse applies that palette when rendering each frame. Compared to single-pass conversion, color fidelity improves significantly.

    GIF output quality depends heavily on ffmpeg filter configuration. This tool uses a two-pass approach combining palettegen and paletteuse.

    [0:v] fps=15,scale=960:-1:flags=lanczos,split [a][b];
    [a] palettegen=max_colors=256:stats_mode=full [p];
    [b][p] paletteuse=dither=floyd_steinberg:diff_mode=rectangle

    The lanczos filter improves resize quality, and floyd_steinberg dithering produces smooth color gradients. Custom mode also supports bayer dithering for a different file size vs. quality tradeoff.
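For custom presets, the same graph just needs different numbers swapped in; a sketch of a builder (my naming, not the tool's) that reproduces the filter above:

```python
def gif_filter(width: int, fps: int, colors: int, dither: str = "floyd_steinberg") -> str:
    """Build the two-pass palettegen/paletteuse filter_complex string."""
    return (
        f"[0:v] fps={fps},scale={width}:-1:flags=lanczos,split [a][b];"
        f"[a] palettegen=max_colors={colors}:stats_mode=full [p];"
        f"[b][p] paletteuse=dither={dither}:diff_mode=rectangle"
    )

# The "GitHub README" preset: 960px / 15fps / 256 colors
print(gif_filter(960, 15, 256))
```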

  • Automatic file size warning

    os.path.getsize() is a Python standard library function (in the os module) that returns the size of a given file in bytes. It requires no external dependencies, which makes it well suited to a quick size check immediately after conversion.

    If a GIF exceeds 15 MB, the notebook displays a warning automatically with guidance on how to reduce it. GitHub's GIF size limit is 10 MB, so catching this before you try to embed it in a README is useful.

  • Playback speed adjustment

    setpts (Set Presentation Timestamps) is an FFmpeg video filter that changes playback speed by rewriting the timestamp of each frame. Dividing by a factor greater than 1 speeds up playback — for example, PTS/1.5 produces 1.5x speed. When audio is present, the atempo filter must also be applied separately to match.

    Speed is controlled via the setpts filter. Bumping a long recording to 1.5x or 2.0x noticeably reduces file size. Since speed is applied during preview generation, the final output can reuse the preview footage directly.

    speed_filter = f'setpts=PTS/{speed}'  # 1.5x speed → setpts=PTS/1.5
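On the MP4-with-BGM path, the audio has to track the same factor via atempo; a sketch (the function name is mine) pairing the two, assuming the tool's 0.5x–2.0x range:

```python
def speed_filters(speed: float) -> tuple:
    """Matching video (setpts) and audio (atempo) filter strings.

    atempo accepts 0.5-2.0 per instance, which matches the UI slider range,
    so a single instance suffices here.
    """
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be within atempo's single-pass range")
    return f"setpts=PTS/{speed}", f"atempo={speed}"

print(speed_filters(1.5))  # → ('setpts=PTS/1.5', 'atempo=1.5')
```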
  • Merging multiple videos

    FFmpeg's concat demuxer joins multiple video files sequentially. Combined with the -c copy option, it copies video and audio streams without re-encoding, resulting in fast, lossless merges.

    ffmpeg's concat demuxer handles merging with -c copy, skipping re-encoding entirely for fast results. The dropdown UI lets you specify merge order freely, and filenames containing special characters are escaped correctly in the generated list file.

  • Separated preview and final output

    FFmpeg's -preset option controls the tradeoff between encoding speed and compression efficiency. fast encodes quickly at the cost of slightly larger files, while slow takes more time to achieve the best compression ratio. Using the appropriate preset at each stage means fast iteration during review and high-quality output at the end.

    The workflow is split into preview generation and final output. This lets you confirm exact timestamps on a full-quality preview before configuring zoom, and iterate on zoom settings without regenerating the base video. The final output step also checks that resolution settings haven't changed since the preview was generated, preventing silent configuration mismatches.

  • Zoom via FFmpeg's zoompan filter

    zoompan is an FFmpeg video filter that produces dynamic pan-and-zoom effects by specifying zoom level (z), x position, and y position per frame using mathematical expressions. Built-in variables like on (frame number) and iw / ih (input width / height) can be used inside these expressions, enabling complex timeline-driven zoom behavior.

    Zoom is implemented using ffmpeg's zoompan filter, which generates smooth, frame-accurate zoom animations. Each zoom event is translated into expressions that control zoom level, x position, and y position on a per-frame basis.

    One constraint with zoompan is that the z variable cannot be referenced inside the x and y expressions. This was addressed by inlining z_expr directly into the coordinate expressions, which fixes a zoom-center drift issue present in earlier versions.

    # Example: zoom to center (area 5) at 2.0x peak
    # Timeline: Start=5.0, In=0.3, Out=0.3, End=6.6 (Hold=1.0 auto-calculated)
    # Frame numbers = time × 15 fps (rounded): start 75, peak 80, hold until 94, end 99

    zoompan=z='if(gte(on,75),if(lt(on,80),(1+(2.0-1)*(on-75)/5),\
    if(lt(on,94),2.0,\
    if(lt(on,99),(2.0-(2.0-1)*(on-94)/5),1))),1)':
    x='if(between(on,75,99),floor(max(0,min(iw-iw/(z_expr),480-iw/(2*(z_expr))))),0)':
    y='if(between(on,75,99),floor(max(0,min(ih-ih/(z_expr),270-ih/(2*(z_expr))))),0)':
    d=1:s=960x540:fps=15

    The tool automatically:

    • Converts timestamps to frame numbers based on the video's FPS
    • Calculates zoom center coordinates from the 3×3 grid position
    • Generates smooth interpolation curves for zoom-in and zoom-out transitions
    • Sorts zoom events by start time and reports overlaps as errors before any output is generated
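The timestamp-to-frame conversion and the overlap check can be sketched together (a simplified stand-in for the tool's logic, not its actual code):

```python
def to_frames(events, fps):
    """Convert (start_sec, end_sec) zoom events to sorted frame ranges,
    rejecting any pair that overlaps in time."""
    frames = sorted((round(s * fps), round(e * fps)) for s, e in events)
    for (_, prev_end), (next_start, _) in zip(frames, frames[1:]):
        if next_start < prev_end:
            raise ValueError(f"zoom events overlap at frame {next_start}")
    return frames

print(to_frames([(5.0, 6.6), (8.0, 9.0)], fps=15))  # → [(75, 99), (120, 135)]
```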
  • Start and end position recording buttons

    google.colab.kernel.invokeFunction is a Google Colab-specific API that allows JavaScript running in cell output to call Python functions. Register a function with colab_output.register_callback('name', fn), then invoke it from an HTML onclick handler using google.colab.kernel.invokeFunction('name', [args], {}).

    A preview video is embedded directly in the zoom settings cell. Pausing it at any point and clicking a button immediately applies the current playback position to the selected event. Both "📍 Record start position" and "🏁 Record end position" use google.colab.kernel.invokeFunction to invoke Python callbacks from JavaScript.

    def _record_position(time):
        if _zoom_start_widgets and record_target_sel.value is not None:
            _zoom_start_widgets[record_target_sel.value].value = round(float(time), 2)

    def _record_end_position(time):
        if _zoom_end_widgets and record_target_sel.value is not None:
            _zoom_end_widgets[record_target_sel.value].value = round(float(time), 2)

    colab_output.register_callback('record_position', _record_position)
    colab_output.register_callback('record_end_position', _record_end_position)
  • Collapsible event panels with highlight

    ipywidgets is a library for rendering interactive UI widgets inside Jupyter notebooks and Google Colab. VBox and HBox handle layout composition, and Layout objects apply CSS-equivalent styling. Properties can be updated dynamically in Python, and the UI reflects changes immediately without any page reload.

    To keep multiple zoom events manageable, each event is wrapped in a collapsible container. Adding a new event automatically collapses existing ones, and the currently targeted event is highlighted with a green border. The "Set as target event" button switches the recording target to any event at any time.
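Stripped of the widget layer, the bookkeeping reduces to simple state logic; an illustrative sketch (the actual cell applies the same idea through ipywidgets Layout properties):

```python
def panel_states(n_events: int, target: int):
    """Per panel: (collapsed, highlighted). Only the target event stays
    expanded and carries the green-border highlight."""
    return [(i != target, i == target) for i in range(n_events)]

print(panel_states(3, target=1))  # → [(True, False), (False, True), (True, False)]
```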

  • BGM synthesis: atempo filter and stream looping

    atempo is an FFmpeg audio filter that changes playback speed without altering pitch (time-stretching). Its accepted range is 0.5x to 2.0x in a single pass; larger changes require chaining, e.g. atempo=2.0,atempo=2.0. -stream_loop -1 repeats an input stream indefinitely, and combined with -t to set a duration limit, it handles "loop if too short, cut if too long" in a single command without any branching logic.

    The BGM feature works in two stages. First, speed and volume adjustments are applied to the downloaded audio to produce an intermediate file. That file is then merged with the video.

    # Apply speed and volume, write intermediate file
    _ffmpeg(
        '-ss', start_pos,
        '-i', bgm_raw_path,
        '-vn', '-af', f'atempo={speed},volume={vol}',
        '-c:a', 'aac', '-b:a', '128k', '-ac', '2',
        'bgm_segment.aac'
    )

    # -stream_loop -1 + -t {dur}: loop if short, cut if long — one command, no branching
    _ffmpeg(
        '-i', 'preview.mp4',
        '-stream_loop', '-1', '-i', 'bgm_segment.aac',
        '-t', vid_dur,
        *VIDEO_ARGS, *AUDIO_ARGS,
        OUTPUT_NAME
    )

    Speed adjustment uses ffmpeg's atempo filter. Because atempo only accepts values between 0.5 and 2.0, the UI slider range is set to match. Looping and cutting are handled together using -stream_loop -1 (infinite loop) and -t {video duration} (cut at that point), with no conditional logic needed.
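The tool never needs more than one atempo instance, but the chaining trick mentioned above generalizes to arbitrary factors; a sketch:

```python
def atempo_chain(factor: float) -> str:
    """Express a tempo factor as chained atempo instances, each kept
    within the filter's 0.5-2.0 single-pass range."""
    if factor <= 0:
        raise ValueError("tempo factor must be positive")
    parts = []
    while factor > 2.0:       # too fast: peel off 2.0x steps
        parts.append("atempo=2.0")
        factor /= 2.0
    while factor < 0.5:       # too slow: peel off 0.5x steps
        parts.append("atempo=0.5")
        factor /= 0.5
    parts.append(f"atempo={factor}")
    return ",".join(parts)

print(atempo_chain(4.0))  # → atempo=2.0,atempo=2.0
print(atempo_chain(1.5))  # → atempo=1.5
```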

  • Real-time slider preview

    Widget.observe() is an event listener provided by ipywidgets that monitors property changes on a widget and fires a callback when they occur. Specifying names='value' restricts it to value changes only. Calling display(Javascript(...)) inside a callback injects JavaScript into the Colab cell output, enabling direct DOM manipulation.

    Moving the speed or volume slider is immediately reflected in the notebook's audio player. The observe callback fires display(Javascript(...)) to update the player. A guard condition ensures JavaScript is only injected when the player actually exists, preventing unnecessary accumulation of cell output.

    def _on_bgm_speed_change(change):
        _update_duration_hint()
        if bgm_audio_ready:
            display(Javascript(
                f"var a=document.getElementById('bgm_audio_player');"
                f"if(a) a.playbackRate={change['new']};"
            ))

    bgm_speed_slider.observe(_on_bgm_speed_change, names='value')

    The audio player's initial values are applied using the oncanplay event, which fires once the audio is ready to play. An _init flag prevents the initialization block from running more than once.

    <audio id="bgm_audio_player" controls
           oncanplay="if(!this._init){
               this.volume={vol};
               this.playbackRate={speed};
               this._init=true;
           }">
  • License auto-detection via metadata

    yt-dlp --dump-json writes a video's metadata as JSON to standard output without downloading the actual file. It's fast and returns a rich set of fields including title, uploader, artist, track, license, and description.

    Metadata fetched via yt-dlp's --dump-json is analyzed to assess copyright risk in order of specificity: presence of artist / track fields (set by YouTube for commercially distributed music), the string "Licensed to YouTube by" in the description, and keyword matching against the title and uploader name for royalty-free signals (NCS, nocopyright, etc.). Since metadata alone isn't a reliable indicator, the result is displayed as a reference badge rather than a definitive determination.
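That ordering might be sketched like this (the field names follow yt-dlp's JSON output; the badge strings come from the table earlier, and the keyword list is illustrative):

```python
FREE_KEYWORDS = ("ncs", "no copyright", "nocopyright", "royalty free", "royalty-free")

def license_badge(meta: dict) -> str:
    """Heuristic copyright assessment from yt-dlp --dump-json metadata.
    Checks run in order of specificity; the result is a hint, not a verdict."""
    if meta.get("artist") or meta.get("track"):
        return "🚫 Commercial track detected"
    if "licensed to youtube by" in meta.get("description", "").lower():
        return "🚫 Official license notice found"
    haystack = (meta.get("title", "") + " " + meta.get("uploader", "")).lower()
    if any(k in haystack for k in FREE_KEYWORDS):
        return "✅ Likely royalty-free"
    if "creative commons" in meta.get("license", "").lower():
        return "✅ Creative Commons"
    return "⚠️ License unknown"
```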

  • Efficient preview workflow

    FFmpeg's -preset option (libx264) offers a range from ultrafast to veryslow, trading encoding speed for compression efficiency. Faster presets use less CPU time but produce somewhat larger files; slower presets spend more time to achieve higher compression. The difference in file size between slow and fast at the same CRF value is noticeable.

    Preview generation uses the fast preset, keeping iteration quick. The final output uses the slow preset for maximum compression efficiency. When no zoom is configured, the final output re-encodes the preview footage directly, so it completes quickly.

Platform Size Limits

Limits vary by platform. Zenn's 3 MB cap is particularly tight — the SNS / Lightweight preset (640px / 10fps / 128 colors) combined with 1.5x playback speed is a realistic target for clearing it.

Platform       Supported Formats         Size Limit
GitHub README  GIF / MP4 / MOV           100 MB (video) / 10 MB (GIF)
Zenn           GIF                       3 MB
Qiita          GIF / MP4                 100 MB
X (standard)   MP4 / MOV                 512 MB
Slack          GIF / MP4 / MOV and more  1 GB

Closing

Not wanting to pay a subscription for a conversion tool, and not wanting to install extra software locally — those two things together were enough motivation to build this myself.

Packaging it as a Google Colab notebook means it runs anywhere with a browser, with all settings handled through a GUI. This update adds BGM support: paste a YouTube URL, download the audio, and adjust start position, speed, and volume with sliders before merging it into the final MP4. Slider changes reflect in the audio player in real time, so you can hear the balance while you're setting it. Personally, being able to add a bit of music to a demo video without leaving the notebook has made the whole workflow feel a lot more self-contained.

If you've ever found yourself thinking "converting demo videos is such a chore," "I wish I could highlight that one interaction more clearly," or "I want to add some background music to make this demo more engaging," I hope this is useful.

