
fix(video): add duration, width and height metadata to VideoMessage #265

Open
dumilani wants to merge 1 commit into asternic:main from dumilani:fix/video-metadata

@dumilani

Problem

Videos sent via /chat/send/video are missing the Seconds, Width, and Height fields in the VideoMessage proto. As a result, recipients see a 0-second duration and the video fails to play on WhatsApp Desktop and mobile clients.

The current SendVideo handler constructs the VideoMessage without these fields:

msg := &waE2E.Message{VideoMessage: &waE2E.VideoMessage{
    // ... other fields set correctly
    // Seconds, Width, Height are missing
}}

Solution

  1. Auto-detect metadata via ffprobe — Added a getVideoMetadata() helper function that extracts duration, width, and height from the video file data using ffprobe (already available in the Docker image via the ffmpeg package). It follows the same temp-file + exec.Command pattern already used by runFFmpegConversion for sticker conversion.

  2. Accept optional metadata in the API payload — The SendVideo endpoint now accepts optional Seconds, Width, and Height fields in the JSON payload. If provided, they take priority over auto-detected values. This gives API callers flexibility while maintaining backward compatibility.

  3. Graceful degradation — If ffprobe fails or is unavailable, the function returns zero values and the video is sent as before (same as current behavior).
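The precedence rule in step 2 can be sketched as follows. This is a minimal illustration, not the PR's code: mergeVideoMetadata is a hypothetical helper, and it assumes that a zero value means "not provided", which is also what a failed ffprobe run returns under step 3.

```go
package main

import "fmt"

// VideoMetadata uses the field names that appear in the review snippets;
// the uint32 types are an assumption.
type VideoMetadata struct {
	DurationSeconds uint32
	Width           uint32
	Height          uint32
}

// mergeVideoMetadata is a hypothetical helper: any non-zero value supplied
// by the API caller wins over the value detected by ffprobe.
func mergeVideoMetadata(detected, provided VideoMetadata) VideoMetadata {
	out := detected
	if provided.DurationSeconds > 0 {
		out.DurationSeconds = provided.DurationSeconds
	}
	if provided.Width > 0 {
		out.Width = provided.Width
	}
	if provided.Height > 0 {
		out.Height = provided.Height
	}
	return out
}

func main() {
	detected := VideoMetadata{DurationSeconds: 12, Width: 1280, Height: 720}
	provided := VideoMetadata{DurationSeconds: 30} // caller overrides only the duration
	fmt.Printf("%+v\n", mergeVideoMetadata(detected, provided))
}
```

If both sources are zero, the merged struct stays zero and the message goes out as it does today.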

Changes

  • helpers.go: Added VideoMetadata struct and getVideoMetadata() function
  • handlers.go: Modified SendVideo() to extract metadata and set Seconds, Width, Height on the VideoMessage proto

API Payload (backward compatible)

Existing payloads continue to work unchanged. New optional fields:

{
  "Phone": "5519999999999",
  "Video": "data:video/mp4;base64,...",
  "Caption": "my video",
  "Seconds": 30,
  "Width": 1920,
  "Height": 1080
}
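To show why older payloads keep working, here is a rough sketch of how such a request could be decoded in Go. The struct name sendVideoRequest, the helper decodeSendVideo, and the uint32 field types are assumptions for illustration; the actual struct in handlers.go may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sendVideoRequest is a hypothetical stand-in for the handler's payload struct.
type sendVideoRequest struct {
	Phone   string `json:"Phone"`
	Video   string `json:"Video"`
	Caption string `json:"Caption,omitempty"`
	Seconds uint32 `json:"Seconds,omitempty"`
	Width   uint32 `json:"Width,omitempty"`
	Height  uint32 `json:"Height,omitempty"`
}

// decodeSendVideo parses a request body; absent optional fields stay zero.
func decodeSendVideo(body []byte) (sendVideoRequest, error) {
	var req sendVideoRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

func main() {
	// Old-style payload without the new fields.
	legacy, err := decodeSendVideo([]byte(`{"Phone":"5519999999999","Video":"data:video/mp4;base64,..."}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(legacy.Seconds, legacy.Width, legacy.Height)

	// Payload that supplies all three optional fields.
	full, err := decodeSendVideo([]byte(`{"Phone":"5519999999999","Video":"data:video/mp4;base64,...","Seconds":30,"Width":1920,"Height":1080}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(full.Seconds, full.Width, full.Height)
}
```

With plain integer fields, a caller cannot distinguish "omitted" from an explicit 0; in this sketch both mean "fall back to auto-detection".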

@gemini-code-assist
Contributor

Summary of Changes

Hello @dumilani, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where videos sent via the /chat/send/video endpoint were missing crucial metadata, leading to incorrect duration displays and playback failures on various clients. The solution introduces automatic video metadata extraction using ffprobe and enhances the API to allow optional manual specification of these properties, ensuring videos are sent with complete and accurate information.

Highlights

  • Video Metadata Extraction: Implemented a getVideoMetadata() helper function that uses ffprobe to automatically extract video duration, width, and height from video file data. This addresses the issue of videos appearing with 0-second duration and failing to play on WhatsApp clients due to missing metadata.
  • API Payload Enhancement: The SendVideo API endpoint now accepts optional Seconds, Width, and Height fields in its JSON payload. These fields take precedence over auto-detected values, offering flexibility to API callers while maintaining backward compatibility.
  • Graceful Degradation: The metadata extraction process is designed for graceful degradation; if ffprobe fails or is unavailable, the function returns zero values, and the video is sent without metadata, mirroring the previous behavior.
  • Code Organization: A new VideoMetadata struct was introduced in helpers.go to encapsulate video properties, and the getVideoMetadata() function was added to this file for modularity.
  • Integration into SendVideo Handler: The SendVideo() function in handlers.go was modified to call getVideoMetadata() and populate the Seconds, Width, and Height fields of the VideoMessage proto before sending.


Changelog
  • handlers.go
    • Added Seconds, Width, and Height fields to the SendVideo request payload struct, marked as omitempty.
    • Integrated a call to getVideoMetadata to extract video properties from the uploaded file data.
    • Implemented logic to prioritize user-provided metadata from the API payload over auto-detected values.
    • Populated the Seconds, Width, and Height fields of the waE2E.VideoMessage proto with the determined metadata.
  • helpers.go
    • Imported the strconv package for parsing string representations of numbers.
    • Defined a new VideoMetadata struct to hold video duration, width, and height.
    • Created the getVideoMetadata function, which utilizes ffprobe to parse video stream and format information from a temporary file, returning a VideoMetadata struct.
    • Added error handling and logging for temporary file operations and ffprobe execution/output parsing.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request correctly adds video metadata (duration, width, height) to VideoMessage, addressing a client-side playback issue. The implementation uses ffprobe for auto-detection and allows for API overrides, which is a solid approach. My review includes a couple of suggestions for the new getVideoMetadata helper function in helpers.go. One is a high-severity recommendation to improve security and performance by piping data to ffprobe via stdin instead of using temporary files. The other is a medium-severity suggestion to refactor the duration parsing logic to reduce code duplication and improve maintainability. Overall, this is a good fix that solves the described problem.

helpers.go Outdated
Comment on lines 788 to 809
tmpFile, err := os.CreateTemp("", "video-probe-*.mp4")
if err != nil {
	log.Warn().Err(err).Msg("getVideoMetadata: failed to create temp file")
	return meta
}
defer os.Remove(tmpFile.Name())

if _, err := tmpFile.Write(filedata); err != nil {
	tmpFile.Close()
	log.Warn().Err(err).Msg("getVideoMetadata: failed to write temp file")
	return meta
}
tmpFile.Close()

cmd := exec.Command("ffprobe",
	"-v", "quiet",
	"-print_format", "json",
	"-show_format",
	"-show_streams",
	"-select_streams", "v:0",
	tmpFile.Name(),
)
Contributor

security-high high

Using temporary files for passing data to an external process like ffprobe can introduce security vulnerabilities (TOCTOU - Time-of-check to time-of-use) and is less efficient due to disk I/O. A more secure and performant approach is to pipe the data directly to ffprobe's standard input. ffprobe supports this with the -i - argument. This avoids creating temporary files on disk altogether.

This pattern is also used in runFFmpegConversion, and a similar refactoring could be applied there for consistency and security.

cmd := exec.Command("ffprobe",
	"-v", "quiet",
	"-print_format", "json",
	"-show_format",
	"-show_streams",
	"-select_streams", "v:0",
	"-i", "-",
)
cmd.Stdin = bytes.NewReader(filedata)

Author

Done! Switched to piping data via stdin using bytes.NewReader(filedata) with -i -. No more temp files.
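However the bytes reach ffprobe, the JSON it prints with -print_format json, -show_format and -show_streams still has to be decoded. The sketch below shows one way to do that. The names probeOutput and parseProbeOutput are hypothetical, only the fields used in this PR are modeled, and the sample input is hand-written in the shape ffprobe emits rather than captured from a real run.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// probeOutput models only the ffprobe JSON fields this PR reads.
// Durations arrive as strings, width and height as integers.
type probeOutput struct {
	Streams []struct {
		Width    int    `json:"width"`
		Height   int    `json:"height"`
		Duration string `json:"duration"`
	} `json:"streams"`
	Format struct {
		Duration string `json:"duration"`
	} `json:"format"`
}

// parseProbeOutput decodes the bytes captured from ffprobe's stdout.
func parseProbeOutput(raw []byte) (probeOutput, error) {
	var out probeOutput
	err := json.Unmarshal(raw, &out)
	return out, err
}

func main() {
	// Hand-written sample, not real ffprobe output.
	sample := []byte(`{"streams":[{"width":1920,"height":1080,"duration":"29.96"}],"format":{"duration":"30.02"}}`)
	out, err := parseProbeOutput(sample)
	if err != nil {
		panic(err)
	}
	if len(out.Streams) > 0 {
		s := out.Streams[0]
		fmt.Println(s.Width, s.Height, s.Duration, out.Format.Duration)
	}
}
```

One caveat that may be worth testing with the stdin approach: ffprobe cannot seek on a pipe, so an MP4 whose moov atom sits at the end of the file may not probe cleanly. In that case the graceful-degradation path would return zero values without any visible error to the API caller.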

Comment on lines 836 to 851
if len(probeResult.Streams) > 0 {
	meta.Width = uint32(probeResult.Streams[0].Width)
	meta.Height = uint32(probeResult.Streams[0].Height)
	if probeResult.Streams[0].Duration != "" {
		if dur, err := strconv.ParseFloat(probeResult.Streams[0].Duration, 64); err == nil && dur > 0 {
			meta.DurationSeconds = uint32(dur + 0.5)
		}
	}
}

// Fallback to format-level duration if stream duration was not available
if meta.DurationSeconds == 0 && probeResult.Format.Duration != "" {
	if dur, err := strconv.ParseFloat(probeResult.Format.Duration, 64); err == nil && dur > 0 {
		meta.DurationSeconds = uint32(dur + 0.5)
	}
}
Contributor

medium

There's some code duplication in how the video duration is parsed from the stream and format information. This can be refactored to be more concise and maintainable by determining the correct duration string first, and then parsing it once.

var durationStr string
if len(probeResult.Streams) > 0 {
	stream := probeResult.Streams[0]
	meta.Width = uint32(stream.Width)
	meta.Height = uint32(stream.Height)
	durationStr = stream.Duration
}

// Fallback to format-level duration if stream duration was not available
if durationStr == "" && probeResult.Format.Duration != "" {
	durationStr = probeResult.Format.Duration
}

if durationStr != "" {
	if dur, err := strconv.ParseFloat(durationStr, 64); err == nil && dur > 0 {
		meta.DurationSeconds = uint32(dur + 0.5)
	}
}

Author

Done! Refactored to extract durationStr first (stream → format fallback) and parse it once.
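A small difference between the two snippets in this thread: the original falls back to the format-level duration whenever the stream duration yields nothing usable (empty, unparsable, or zero), while the refactored version falls back only when the stream duration string is empty. If the wider fallback matters, one possible shape is a helper that tries each candidate in order. This is a sketch under that assumption, and firstValidDuration is a hypothetical name, not something in the PR.

```go
package main

import (
	"fmt"
	"strconv"
)

// firstValidDuration is a hypothetical helper. It returns the first candidate
// that parses to a positive number of seconds, rounded to the nearest whole
// second, or 0 if none qualifies.
func firstValidDuration(candidates ...string) uint32 {
	for _, s := range candidates {
		if s == "" {
			continue
		}
		if dur, err := strconv.ParseFloat(s, 64); err == nil && dur > 0 {
			return uint32(dur + 0.5)
		}
	}
	return 0
}

func main() {
	fmt.Println(firstValidDuration("29.6", "30.02")) // stream value is used: 30
	fmt.Println(firstValidDuration("N/A", "12.4"))   // stream value unparsable, format value is used: 12
	fmt.Println(firstValidDuration("", ""))          // nothing usable: 0
}
```

A call such as firstValidDuration(stream.Duration, probeResult.Format.Duration) would keep the single parse site the review asked for while preserving the original fallback behavior.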

Videos sent via /chat/send/video were missing the Seconds, Width, and
Height fields in the VideoMessage proto, so recipients saw a 0-second
duration and could not play the video on WhatsApp Desktop and mobile
clients.

This change:
- Adds getVideoMetadata() helper that uses ffprobe to extract video
  duration and dimensions from the file data
- Sets Seconds, Width, and Height on the VideoMessage proto
- Accepts optional Seconds/Width/Height in the API payload, which take
  priority over auto-detected values
- Gracefully degrades to current behavior if ffprobe is unavailable