[ardour-dev] AMD64 SSE optimisation
Sampo Savolainen
v2 at iki.fi
Mon Jan 2 03:06:16 PST 2006
Quoting John Rigg <ardev at sound-man.co.uk>:
> I've worked out what's happening. The variables get optimised out,
> even with only -O (this is with gcc-4.0), and are stored in registers.
>
> For example, in x86_compute_peak in the sandbox tester:
> %rdi &buf[0] (also stored at 32(%rbp))
> %rsi nframes
> %rbx ITERATIONS
> %xmm0 peak
That sounds bad. Really bad. It means the ABI on x86-64 is completely
different from 32-bit x86, so I need to fully rewrite the asm code using
something more portable than pure assembler.
I've tried one library which wraps the underlying vector calls in
nice-looking C, but the code it produced was not as fast as the asm I wrote,
nor did it support unaligned buffers, which is essential because Ardour's
process cycle can contain events mid-buffer. This means there is no
guarantee that the buffers are 16-byte aligned (which the aligned SSE load
and store instructions require).
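
To illustrate the alignment problem, here is a minimal sketch of how code
could work out how many leading samples sit before the next 16-byte
boundary, so that they can be handled one at a time before any aligned SSE
loads are issued. The helper name frames_until_aligned is hypothetical and
does not come from the Ardour sources.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from Ardour: returns how many floats at the
 * start of buf must be processed with scalar code before the pointer
 * reaches a 16-byte boundary. The result never exceeds nframes. */
static size_t frames_until_aligned(const float *buf, size_t nframes)
{
    uintptr_t misalign = (uintptr_t)buf & 15u;  /* byte offset within a 16-byte block */
    size_t head;

    if (misalign == 0) {
        return 0;                               /* already aligned */
    }
    if (misalign % sizeof(float) != 0) {
        return nframes;                         /* stepping by 4 bytes never reaches a boundary */
    }
    head = (16u - misalign) / sizeof(float);    /* 1, 2 or 3 floats */
    return head < nframes ? head : nframes;
}
```

A buffer that starts one float past an aligned address would need three
scalar iterations first; the rest can then use aligned loads, with a short
scalar tail for whatever does not fill a full group of four.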
Does anyone on the list have experience with xmmintrin.h? This is a header
file which exposes the SSE instructions as compiler intrinsics that can be
called from normal C. This is what I've considered using to rewrite the SSE
code in Ardour.
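
As a rough idea of what a peak computation could look like with
xmmintrin.h, here is a minimal sketch. It assumes a compiler with SSE
enabled (the default on x86-64) and uses unaligned loads throughout, so it
accepts any buffer address at some possible cost in speed. The function
name compute_peak_sse and its signature are made up for this example; this
is not the existing x86_compute_peak code.

```c
#include <stddef.h>
#include <xmmintrin.h>

/* Hypothetical example, not Ardour's implementation: returns the larger of
 * `current` and the largest absolute sample value in buf[0..nframes). */
float compute_peak_sse(const float *buf, size_t nframes, float current)
{
    /* -0.0f has only the sign bit set; and-not with it clears the sign. */
    const __m128 sign_mask = _mm_set1_ps(-0.0f);
    __m128 vpeak = _mm_set1_ps(current);
    float lanes[4];
    float peak = current;
    size_t i = 0;
    int k;

    /* Four samples per iteration; _mm_loadu_ps has no alignment requirement. */
    for (; i + 4 <= nframes; i += 4) {
        __m128 v = _mm_loadu_ps(buf + i);
        v = _mm_andnot_ps(sign_mask, v);    /* (~mask) & v == |v| */
        vpeak = _mm_max_ps(vpeak, v);
    }

    /* Reduce the four lanes to a single value. */
    _mm_storeu_ps(lanes, vpeak);
    for (k = 0; k < 4; k++) {
        if (lanes[k] > peak) {
            peak = lanes[k];
        }
    }

    /* Scalar tail for the remaining 0 to 3 samples. */
    for (; i < nframes; i++) {
        float a = buf[i] < 0.0f ? -buf[i] : buf[i];
        if (a > peak) {
            peak = a;
        }
    }
    return peak;
}
```

Because the compiler does the register allocation, the same source should
build for both 32-bit x86 and x86-64 without depending on either calling
convention. Whether it matches the speed of hand-written asm would need to
be measured.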
Sampo