[Ardour-Users] "Extract LTC from audio and align video" menu option missing from "Transcode/Import Video file" popup

Chris Caudle chris at chriscaudle.org
Wed Apr 4 12:50:00 PDT 2018

On Tue, April 3, 2018 9:07 pm, robertlazarski . wrote:
> The default of auto mute is off on Free Run and RTC mode and I didn't
> change it.

Then assuming the behavior works like it seems to be documented (which if
I understand correctly is that the time code is continuously output as
long as the Zoom F8 is powered on) it seems that the video recorder should
not have incorrect time code (i.e. not matching the audio).

> The main issue might be some missing data in the first part of the LTC on
> CH1 of the video.

Missing code could be because the phone has trouble adjusting to the time
code audio.  As far as I know most phones would have some type of
automatic gain control since you don't have microphone gain knobs on the
phone, so perhaps it takes the phone AGC a few seconds to determine an
appropriate gain setting for LTC.  The sound of LTC is definitely not like
typical audio signals that the AGC behavior would be optimized to handle.

> I am looking at getting a hardware slate like a Denecke TS-3 - to keep my
> sanity I only use software in post :-) .

You could get a hardware slate, but that seems expensive.  If you were
responding to my suggestion for a clap board, I meant old school,
literally knock two sticks of wood together.  You should be able to step
frame by frame in the video to see the frame where the pieces hit
together, and in the audio editor it should be pretty obvious where the
impact occurs as long as the room is otherwise relatively quiet.  It lets
you line up by hand in the editor if you want, but also gives you a
reference point to check the timecode in that video frame, and check the
time code where Ardour lined up that point in the audio track.

On Tue, April 3, 2018 9:43 pm, robertlazarski . wrote:
>> That sounds like both recorders are free-running.
> Might be, I tried using time of day timestamps and it didn't help.

No, by free running we were referring to the case where for example the
camera generates its own time code track, and the Zoom generates its own
time code track, and even if they start out together they could drift
apart over time due to inaccuracies in the individual oscillators used to
track the time.  You have previously clarified that the camera does not
generate any time code, you just cable the time code output of the Zoom
into one channel of the camera audio track.

>> Note that both audio and video may import at some late position in the
>> timeline. e.g. at 14:00:00:00 and may not be visible with default
>> zoom-level.
> I tried zooming in and out and didn't see an obvious misalignment.

It wouldn't be a misalignment per se, just that timecode is defined as
starting at midnight (00:00:00) and covering a full 24 hours until
23:59:59 (hours:min:sec).  I don't remember if Ardour has a way to specify
what hour and minute your project timeline starts, but by default it will
just assume 00:00:00 (or maybe 00:00:01, I forget), so if you have your
timecode set to follow wall clock time, and you record at say early
evening at 19:22:13, Ardour will happily import your audio into a project
starting at 00:00:00 and place the audio at 19:22:13 on the timeline so
that you end up with a project that will play back silence for over 19
hours and 22 minutes if you place the playhead at the project beginning
and hit play.
Just a detail to be aware of.
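
To put a number on that, here is a rough Python sketch of where a
time-of-day stamp lands on a timeline that starts at 00:00:00.  The 48 kHz
sample rate and the function name are assumptions for illustration only,
this is not how Ardour computes it internally.

```python
def timecode_to_seconds(tc: str) -> int:
    """Convert an hh:mm:ss time-of-day timecode to seconds since 00:00:00."""
    hours, minutes, seconds = (int(part) for part in tc.split(":"))
    return hours * 3600 + minutes * 60 + seconds


SAMPLE_RATE = 48000  # assumed sample rate in Hz

# The early-evening example from the text above.
start_seconds = timecode_to_seconds("19:22:13")
start_sample = start_seconds * SAMPLE_RATE

print("seconds of silence before the audio:", start_seconds)
print("timeline position in samples:", start_sample)
```

So the region would sit a bit over 19 hours and 22 minutes from the start
of the session.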

> On my latest files I am attempting to use "time of day" timestamps in the
> timecode, using the F8 "Int RTC" mode.
> The sync error pattern I am noticing, is the first entries of LTC in the
> MP4 file seem to be missing or corrupted. The POS from ltcdump never
> starts at zero. When I open the raw MP4 I see no such obvious problem.

What do you use to open the raw MP4?
Speaking of MP4, it just occurred to me that the phone is probably going
to be encoding the audio tracks with AAC (or some other lossy codec).  I
have no idea what effect that has on LTC decoding, and I could not find
any reference to the effect of AAC on LTC recording with a quick google
search.  Maybe Robin knows, or I can ask around to see if anyone else has
experience with that.  Maybe the AAC encoding is causing some problems for
the LTC decoder.
I'll see if I can figure out a way to script up some tests for that.  Can
you check to see if your phone is using AAC for sure, and what bit rate?
You should be able to tell pretty easily by running ffmpeg -i on one of
the phone video files; that should dump the codec and bit rate detected.
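
If you would rather script the check, something like the Python sketch
below should work.  Note that it uses ffprobe (which ships alongside
ffmpeg) rather than ffmpeg -i, because its key=value output is easier to
parse.  The helper names and the example file name are made up for
illustration, and it assumes ffprobe is on your PATH.

```python
import subprocess


def ffprobe_audio_command(path):
    """Build an ffprobe command that reports the first audio stream's codec."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "a:0",
        "-show_entries", "stream=codec_name,sample_rate,bit_rate",
        "-of", "default=noprint_wrappers=1",
        path,
    ]


def parse_key_values(text):
    """Turn ffprobe's key=value lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            info[key.strip()] = value.strip()
    return info


def probe_audio(path):
    """Run ffprobe on a file and return the parsed audio stream details."""
    result = subprocess.run(
        ffprobe_audio_command(path),
        capture_output=True, text=True, check=True,
    )
    return parse_key_values(result.stdout)


# Example use (hypothetical file name):
#   print(probe_audio("phone_video.mp4"))
```

A codec_name of aac would confirm the lossy encoding, and bit_rate is
reported in bits per second when the container provides it.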

I see later that the audio you extracted from the phone video is wav.  Was
that recorded uncompressed native, or was the original audio compressed
and you converted to wav when you extracted from the video file?

> [linux-7cab(iksrazal)]
>  /home/iksrazal> sndfile-info -b F8.wav
> Version : libsndfile-1.0.25-exp
> Description      : SPEED=29.970D
> TAKE=004
> UBITS=00000000
> SCENE=180403
> TAPE=180403
> TR1=Tr1
> TR2=Tr2
> TR7=Tr7
> TR8=Tr8
> Originator       : ZOOM F8
> Origination ref  :
> Origination date : 2018-04-03
> Origination time : 18:31:09
> Time ref         : 0x0bebdd5f1 (66669.002354 seconds)
> BWF version      : 1
> UMID             :
> Coding history   : A=PCM,F=48000,W=24,M=multi,T=F8;VERSION=1.10;1:1 0 0 R
> 1   00;2:1 0 0 CNTR   00;3:0 0 0 CNTR   00;4:0 0 0 CNTR   00;5:0 0 0 CNTR
> 00;6:0 0 0 CNTR   00;7:1 0 0 CNTR   00;8:1 0 0 CNTR   00;L:0 1 0 CNTR
> 00;R:0 1 0 CNTR   00;
> [linux-7cab(iksrazal)]
>  /home/iksrazal> ltcdump phone_video_with_ltc.wav | head
> Note: This is not a mono audio file - using channel 1
> #User bits  Timecode   |    Pos. (samples)
> 00000000   18:31:21.29 |     2639     4239
> 00000000   18:31:22.00 |     4240     5841
> 00000000   18:31:22.01 |     5842     7442
> 00000000   18:31:22.02 |     7443     9044
> 00000000   18:31:22.03 |     9045    10646
> 00000000   18:31:22.04 |    10647    12247
> 00000000   18:31:22.05 |    12248    13849

So the audio file begins at 18:31:09.
The video seems to be missing the beginning, but position 4240 audio
samples is decoded as 18:31:22.00.
4240 samples is 4240/48000 s, or about 88 ms (assuming the phone audio is
also at 48 kHz).

13 seconds should be 624,000 samples, so it does seem that the video
starts later than the F8 audio, but the question is whether that part of
the video still lines up with the appropriate part of the audio recording,
or if they are offset by that many (or some other non-zero) seconds.  That
is where some common reference point, like wood blocks knocking together,
would come in useful.
What is on the second channel of the phone/camera audio?  You could also
send a copy of the microphone feed with the clap board audio to the F8 and
the camera second channel simultaneously.  That would give you another
reference point for comparison.
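
For reference, here is the same arithmetic as a small Python sketch, using
only numbers from the output quoted above.  It assumes the phone WAV is
also at 48 kHz, that the first position column from ltcdump marks the
start of the decoded frame, and that timecode seconds map directly onto
real seconds (the frame number is 00 here, so the drop-frame counting does
not enter into it).  The variable names are mine.

```python
SAMPLE_RATE = 48000  # F=48000 in the F8 coding history; assumed for the phone WAV


def tc_to_samples(hours, minutes, seconds, rate=SAMPLE_RATE):
    """Samples since midnight for a whole-second timecode hh:mm:ss.00."""
    return (hours * 3600 + minutes * 60 + seconds) * rate


# BWF "Time ref" from sndfile-info: samples since midnight at the file start.
f8_start = 0x0BEBDD5F1

# ltcdump decoded 18:31:22.00 starting at sample 4240 of the phone audio.
ltc_pos = 4240
phone_start = tc_to_samples(18, 31, 22) - ltc_pos

offset_samples = phone_start - f8_start
offset_seconds = offset_samples / SAMPLE_RATE

print("F8 start, seconds since midnight:", f8_start / SAMPLE_RATE)
print("LTC frame position in ms:", ltc_pos / SAMPLE_RATE * 1000)
print("phone starts after F8 by", offset_samples, "samples =", offset_seconds, "s")
```

That works out to roughly 12.9 seconds rather than a full 13, because the
F8 file starts a couple of milliseconds after 18:31:09 and the phone file
starts about 88 ms before 18:31:22.  If Ardour places the two files that
far apart, the time code is doing its job.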

Chris Caudle
