[Ardour-Dev] Hello, some tech talk, etc.

Fons Adriaensen fons at linuxaudio.org
Sat Jul 7 04:15:53 PDT 2012


On Sat, Jul 07, 2012 at 11:56:36AM +0200, Florian Paul Schmidt wrote:
 
> Did you try profiling the code? Maybe that holds some instructive
> surprises :D

There isn't much profiling to do on a single loop
over all samples in a buffer.

The architecture of the app is very Jack-like,
and this is the complete 'process ()', minus some
trivial code at the start and end. The buffer size
(na) is 16384 samples. ia, na, k are ints, all the
other variables are 32-bit floats.

    for (ia = 0; ia < na; ia++)
    {
    // Get sample
        A->samp (ia, &x1, &y1);
    // Get local oscillator state
        k = _sincos->ind (ph);
        c = fcos [k];
        s = fsin [k];
    // Complex multiplication
        x2 = c * x1 + s * y1;
        y2 = c * y1 - s * x1;
    // Low pass filter the real part
        zm += _lf_c1 * (x2 - zm);
    // Cheap atan2 (y2, zm)
        e = y2 / (fabsf (zm) + 1.0f);
    // Loop filter
        fr += _lf_c2 * e;
        ph += _lf_c1 * e + fr;
    }   

A->samp() is an accessor that gets a complex sample
from the buffer; it's inlined, of course. _sincos->ind()
converts the float phase (ph) to an integer index into
the sine/cosine tables, and is also inlined. It makes no
difference at all when those are written out locally. The
sine/cosine lookup and complex multiplication are exactly
the same as in the old code.
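
For readers who want something concrete to look at, here is a
minimal sketch of what a sine/cosine table with a phase-to-index
function could look like. The class name SinCosTable, the table
size of 4096, and the convention that the phase is in cycles
(1.0 = one full turn) are all assumptions made for illustration;
the post does not show the real _sincos object.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for the table object behind _sincos, fcos and fsin.
// The size and the phase convention are assumptions, not taken from the post.
class SinCosTable
{
public:
    enum { SIZE = 4096, MASK = SIZE - 1 };   // assumed power-of-two size

    SinCosTable () : fcos (SIZE), fsin (SIZE)
    {
        const double two_pi = 6.283185307179586;
        for (int k = 0; k < SIZE; k++)
        {
            fcos [k] = (float) std::cos (two_pi * k / SIZE);
            fsin [k] = (float) std::sin (two_pi * k / SIZE);
        }
    }

    // Convert a phase in cycles to a table index.
    // floor() plus the mask wraps negative and >1 phases into 0..SIZE-1.
    int ind (float ph) const
    {
        return (int) std::floor (ph * SIZE) & MASK;
    }

    std::vector<float> fcos;
    std::vector<float> fsin;
};
```

With this layout a quarter turn, ind (0.25f), lands on index
SIZE / 4, where the cosine entry is close to zero and the sine
entry is close to one.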

The main difference is that in the old code the phase
measurement and filtering are done on the entire buffer
(and it's *a lot* more complicated than it is here, 
also using 'future' samples), and only some time later
the rotations are performed over the entire buffer.
The latter happens in a separate thread, and is also
much more complicated than it is here since the old
code also performs diversity recombination and some
other extra functions.

Separating the two was possible because the old code
used 'open loop' algorithms: there's no feedback at
all. The PLL, of course, requires feedback with only
a single sample of delay.
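
The single-sample feedback can be made explicit in a self-contained
version of the loop. This is only a sketch under stated assumptions:
the PllState struct, the pll_run name, the phase being in cycles,
and the use of std::cos / std::sin in place of the table lookup are
all illustrative choices, not the original code. The point is that
ph and fr written at the end of one iteration are read at the start
of the next, so the iterations cannot be reordered or split across
threads the way the open-loop code could.

```cpp
#include <cmath>

// Hypothetical loop state; field names follow the variables in the post.
struct PllState
{
    float ph;   // oscillator phase, assumed to be in cycles
    float fr;   // frequency estimate, in cycles per sample
    float zm;   // low-passed real part
};

// Run the loop over n complex samples (re [i], im [i]).
// c1 and c2 play the role of _lf_c1 and _lf_c2; values are up to the caller.
inline void pll_run (PllState& st, const float *re, const float *im,
                     int n, float c1, float c2)
{
    const float two_pi = 6.2831853f;
    for (int i = 0; i < n; i++)
    {
        // Local oscillator. std::cos / std::sin replace the table lookup
        // here purely to keep the sketch short.
        float c = std::cos (two_pi * st.ph);
        float s = std::sin (two_pi * st.ph);
        // Complex multiplication by the conjugate of the oscillator
        float x2 = c * re [i] + s * im [i];
        float y2 = c * im [i] - s * re [i];
        // Low pass filter the real part
        st.zm += c1 * (x2 - st.zm);
        // Cheap phase detector
        float e = y2 / (std::fabs (st.zm) + 1.0f);
        // Loop filter: the next iteration's c and s depend on these updates
        st.fr += c2 * e;
        st.ph += c1 * e + st.fr;
    }
}
```

Fed with a clean complex tone at a small frequency offset and
modest gains (for example c1 = 0.05, c2 = 0.001), fr should settle
near the tone frequency in cycles per sample.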

Most of the time seems to go into the float-to-integer
conversion that is part of _sincos->ind(). There
is no way to avoid that, except by replacing the
sine/cosine lookup tables with real calls to sinf()
and cosf(), but that's even worse.
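
For reference, the conversion itself can be written in more than
one way. The sketch below shows two variants, assuming a table of
4096 entries and a phase in cycles; the names ind_cast and
ind_lrint are made up here, and neither is claimed to be what
_sincos->ind() actually does. A plain cast truncates toward zero,
while std::lrint rounds using the current rounding mode. Which one
is cheaper depends on the compiler, the flags, and whether the
target uses x87 or SSE code, so this is something to measure
rather than assume.

```cpp
#include <cmath>

// Assumed table geometry, for illustration only.
enum { TABLE_SIZE = 4096, TABLE_MASK = TABLE_SIZE - 1 };

// Variant 1: C-style cast, truncates toward zero.
inline int ind_cast (float ph)
{
    return (int) (ph * TABLE_SIZE) & TABLE_MASK;
}

// Variant 2: std::lrint, rounds according to the current rounding mode
// (round to nearest by default).
inline int ind_lrint (float ph)
{
    return (int) (std::lrint (ph * TABLE_SIZE) & TABLE_MASK);
}
```

The two variants can differ by one table entry for the same
phase, which may or may not matter for the loop.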


Any ideas welcome !

-- 
FA

A world of exhaustive, reliable metadata would be an utopia.
It's also a pipe-dream, founded on self-delusion, nerd hubris
and hysterically inflated market opportunities. (Cory Doctorow)



