1

I have a video clip that has a frame rate of 25 fps (MJPEG compression) and an audio clip with a sample rate of 48kHz. Both of them play for the same duration - 27 seconds.

Is there any standard way to sync the two signals of different sampling rates if I wish to write them to file?

If this is not doable cleanly in MATLAB, I am happy with a software tool on Linux or macOS that can do the syncing.

For example, the VideoFileWriter object can take both an audio sample and a video frame in a single step; however, I have about 1917 audio samples corresponding to each video frame.

Some details:

I have read in the audio signal using [signal, fs] = audioread(filename). signal is a 1336321 x 1 array.

All the 697 frames in the video are saved as a cell array of RGB images.

AruniRC
  • Welcome to SE.DSP. What do you mean by "write them to file"? – Laurent Duval Aug 12 '16 at 16:32
  • Maybe duplicate every frame 1917 times, and feed that to VideoFileWriter? – MBaz Aug 12 '16 at 19:33
  • @LaurentDuval: by "write to file" I mean generate a video that has the frames as the video component and the audio file as the sound component and write this resultant file to disk. – AruniRC Aug 13 '16 at 19:50
  • I did not try to do something similar, but I suspect there are simpler ways outside Matlab. Did you check solutions like http://stackoverflow.com/questions/17013363/can-i-add-a-compressed-audio-track-to-an-mjpeg – Laurent Duval Aug 13 '16 at 19:57
  • @MBaz: and then set the frame rate of the resultant video clip to be the same as the audio clip (i.e. 48 kHz)? – AruniRC Aug 14 '16 at 09:06
  • @AruniRC, Probably, yes. – MBaz Aug 14 '16 at 16:13
  • @MBaz I ended up doing the reverse of what you suggested, because of file size issues: VideoFileWriter() allows associating multiple audio samples (1917 in this case) with a single video frame (this was glaringly missing in the MATLAB documentation). thanks for the initial idea! – AruniRC Aug 15 '16 at 00:46
  • @AruniRC I'm glad that the idea was helpful and that you solved the problem. – MBaz Aug 15 '16 at 02:05
  • The answer still seems more of a programming issue; leaving closed for now. @MBaz Do you think the solution posted in the question should be here as an answer? Would it help future readers? – Peter K. Aug 16 '16 at 11:20
  • @PeterK. I think that posting the solution as an answer instead of an edit on the question would fit our format better. It's appropriate to leave the question closed, because it's offtopic, but it could certainly be useful to people facing the same problem. – MBaz Aug 16 '16 at 22:31
  • @MBaz ok! I've reopened the question. If the op can include their solution as an answer, that'd be good! – Peter K. Aug 16 '16 at 23:09
  • @AruniRC Now that your question is re-opened, please consider adding your solution as an answer and accepting it. – MBaz Aug 17 '16 at 00:43
  • @MBaz I am still not sure why this is off topic? Does this forum deal with only the theoretical aspects of DSP and not with things like matlab functions? Because then I will shift this to SO. Following your guidelines this seemed to be a good fit. – AruniRC Aug 18 '16 at 06:33
  • @AruniRC It's subjective, and your question is IMO very close to on-topic, but to me the problem is a bit more about a programming issue than about DSP. Having said that, allow me to insist in my suggestion to turn your edit into an answer and accept it :) – MBaz Aug 18 '16 at 13:16
  • @MBaz done. Might I then request that you migrate this to SO, where programmers would find this useful? I do not have the reputation points at DSP to do this. – AruniRC Aug 19 '16 at 16:49

2 Answers

2

Thanks to helpful comments from @MBaz, I managed to come up with a solution:

we can associate multiple audio samples with a single frame using the VideoFileWriter object. This fact and use-case is missing in the documentation.

First, some stats about the audio and video files. The stereo audio samples are in an N-by-2 array, signal (one row per sample, one column per channel). The video frames are in a cell array of RGB images, frames.

%% Write audio and video to file
%   Write both audio and video samples into a single video file. 
%   Multiple audio samples are matched with one video frame.
%   MATLAB problem: Compression is not possible when audio is included.

% Display A/V stats
fprintf('\nVIDEO STATS:');
fprintf('\nNum. of frames: %d', length(frames));
fprintf('\nFrame rate: %.02f fps', annot.video.frame_rate);
fprintf('\n\nAUDIO STATS:');
fprintf('\nNum. of samples: %d', size(signal,1));
fprintf('\nSampling rate: %.02f Hz\n', infoAudio.SampleRate);
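
As a quick sanity check on those stats, the figures quoted in this thread (697 frames at 25 fps, 1336321 samples at 48 kHz) can be compared directly. This is only a sketch; the variable names are illustrative and are not part of the script above.

```matlab
% Sanity check using the numbers quoted in the question (illustrative names).
numFrames  = 697;       % video frames
frameRate  = 25;        % frames per second
numSamples = 1336321;   % audio samples
fs         = 48000;     % audio sampling rate in Hz

videoDur = numFrames / frameRate;          % video duration in seconds
audioDur = numSamples / fs;                % audio duration in seconds
nominalPerFrame = fs / frameRate;          % samples per frame implied by the two rates
actualPerFrame  = numSamples / numFrames;  % samples per frame actually available

fprintf('Video: %.3f s, audio: %.3f s\n', videoDur, audioDur);
fprintf('Samples per frame: nominal %.0f, available %.2f\n', ...
        nominalPerFrame, actualPerFrame);
```

The two durations differ by roughly 40 ms, about one frame period at 25 fps, which is why floor(numSamples/numFrames) comes out as 1917 rather than the nominal 1920.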

Here's the portion that matches multiple audio samples to a single video frame. There are 697 video frames and 1336321 audio samples, so the sample count is not an exact multiple of the frame count. To give every frame the same number of samples, the audio is subsampled slightly. The effect of this subsampling was imperceptible in the output AVI files that I tried.

numAudio = size(signal,1);
numRep = floor(numAudio/length(frames));    % audio samples per video frame
numDiff = numAudio - numRep*length(frames); % mismatch (leftover samples)

if numDiff
    % if length(frames) does not evenly divide numAudio, then 
    % subsample audio to match numRep*length(frames)
    selector = round(linspace(1, numAudio, numRep*length(frames))); 
    subSignal = signal(selector, :);
else
    % already an exact multiple: use the audio unchanged
    subSignal = signal;
end
assert(numRep*length(frames) == size(subSignal,1));
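
The index-based subsampling above drops samples at roughly regular intervals. One alternative, shown here only as a sketch and not what was used for the files in this answer, is to linearly interpolate the audio onto the shorter grid with interp1. The data below is synthetic and the sizes are stand-ins chosen so that the frame count does not divide the sample count.

```matlab
% Sketch: fit an N-by-C audio array to a whole number of samples per frame
% by linear interpolation instead of dropping samples. Synthetic data only.
numFrames = 10;                               % stand-in for length(frames)
fs        = 1000;                             % stand-in sampling rate in Hz
t         = (0:1023)' / fs;                   % 1024 samples, not divisible by 10
signal    = [sin(2*pi*5*t), cos(2*pi*5*t)];   % N-by-2 "stereo" test signal

numAudio  = size(signal, 1);
numRep    = floor(numAudio / numFrames);      % samples per frame (102 here)
numTarget = numRep * numFrames;               % 1020 samples after fitting

% Query positions spread evenly over the original sample indices.
queryPos  = linspace(1, numAudio, numTarget)';
% interp1 interpolates each column independently when given a matrix.
subSignal = interp1((1:numAudio)', signal, queryPos, 'linear');
```

With the numbers from the question, either approach removes 172 of 1336321 samples, about 0.013% of the audio, so the difference between the two is likely to be small in practice.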

Finally, we use the VideoFileWriter object available in the Computer Vision System Toolbox of MATLAB to write audio and video to file.

videoFWriter = vision.VideoFileWriter(fullfile(shotPath, 'avclip', ...
                                        [num2str(shotNum) '_av_clip.avi']), ...
                                      'AudioInputPort', true, ...
                                      'FrameRate',  annot.video.frame_rate);
for i = 1:length(frames)
   fprintf('Frame: %d/%d\n', i, length(frames));
   step(videoFWriter, frames{i}, subSignal(numRep*(i-1)+1:numRep*i,:)); 
end
release(videoFWriter);
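
The slice numRep*(i-1)+1 : numRep*i used in the loop should cover every row of subSignal exactly once. A small check of that indexing, using the frame and sample counts from the question and no toolbox calls (the counter variable is illustrative):

```matlab
% Check that the per-frame slices tile the audio without gaps or overlaps.
numFrames = 697;                            % frame count from the question
numRep    = 1917;                           % floor(1336321 / 697)
covered   = zeros(numRep * numFrames, 1);   % hit counter per audio row

for i = 1:numFrames
    rows = numRep*(i-1)+1 : numRep*i;       % same indexing as the write loop
    covered(rows) = covered(rows) + 1;
end

assert(all(covered == 1));                  % every row used exactly once
fprintf('Audio rows per frame: %d, total rows: %d\n', numRep, numel(covered));
```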
AruniRC
-1

You can use Simulink. We can associate multiple audio samples with a single frame using the VideoFileWriter object. This use case is missing from the documentation.

First, some stats about the audio and video files. The stereo audio samples are in an N-by-2 array, signal. The video frames are in a cell array of RGB images, frames.