Opened 16 years ago

Closed 11 years ago

# yinfft can be further optimised

Reported by: Owned by: ciacciax@yahoo.com Paul Brossier normal 0.3.3 corelib 0.3.2 normal

Simplifying everything, your YinFFT algorithm can be summarized by five steps:

1. windowing
2. compute the FFT of the windowed signal
3. compute the magnitude of the spectrum of step 2
4. compute the FFT of the array obtained in step 3
5. computing yinFFT using this formula: yinFFT[tau] = sum(above) - radius[tau] * cos(phase[tau]);

(point 5 is not very explanatory, but you should be able to understand it)

I noticed that in your implementation, you handle complex points using (radius, phase) notation instead of the "standard" (real,imaginary) one, and this slows down the execution of your algorithm. In point 2, FFTW returns an array of (half)complex numbers, where each point has a real and a imaginary part. Then, you convert the outcome of FFTW in the following way:

• radius = sqrt(re * re + im * im)
• phase = tan2(im, re)

In the point 3, when you compute the weighted magnitude of the spectrum, you square the radius of the complex number. This means that in point 2, you use sqrt() to get the square-root of the magnitude and in point 3 you square the radius, operations that could be avoid.

In point 5, you subtract radius[tau] * cos(phase[tau]) to the sum computed above. The expression "radius[tau] * cos(phase[tau])" simplifies to "real part of the complex number at tau". This means that in the point 4 you convert the FFT outcome to polar coordinates, and then in point 5 you try to extract the real part that before the conversion was already available.

What I am trying to say is that if you didn't convert the outcome of FFTW to polar coordinates, you could avoid a lot of unnecessary computations

• in point 2 the sqrt could be skipped
• in point 2 there is no sense in computing the phase (which requires trigonometric functions) because it is not needed
• (same as point 2 in point 4)
• in point 5 "radius[tau] * cos(phase[tau])" could be replaced by "real part of complex at index tau"

I attach to this ticket the way I would rewrite pitchyinfft.c

To see the real changes, diff my file with the one in aubio 0.3.2

Best regards Andrea

### Changed 16 years ago by ciacciax@yahoo.com

improved version of pitchyinfft without polar coordinates

### comment:1 follow-up:  2 Changed 16 years ago by Paul Brossier

Description: modified (diff) new → assigned The pitchfft computation in aubio 0.3.2 wastes lots of CPU cycles unnecessarily → yinfft can be further optimised

Thank you for your report Andrea. I need to investigate this carefully to make sure we find the exact same results, but it sounds very promising.

(simplified title, fixed typo in the first sentence)

### comment:2 in reply to:  1 Changed 16 years ago by anonymous

Thank you for your report Andrea. I need to investigate this carefully to make sure we find the exact same results, but it sounds very promising.

(simplified title, fixed typo in the first sentence)

Hi Piem, I use this algorithm in a Java application I am developing and it works fine. I guess the optimization is fine, otherwise my application would not get "meaningful" pitches...

Just another thing: there is an error at line 102 of my file. The loop for the variable l should start at index 1 and not 0 (the case for 0 and N/2 are handled above).

So this line:

102 for (l=0; l < p->bufsize/2; l++){

should become:

102 for (l=1; l < p->bufsize/2; l++){

If you notice something is wrong, please inform me so that I can fix my own code (even if I repeat it works fine). In my application I don't use FFTW, I use a Radix-2 FFT implementation real-2-halfcomplex (taken from the GNU Scientific Library). My function returns the data as following:

If FFTW returns the data in a different format, you should compute the magnitude in a different way.

Bye Andrea

### comment:3 Changed 11 years ago by Paul Brossier

Resolution: → fixed assigned → closed

The latest changes in 4a95f83758b4fe4f3f8ddce88447701f50705eef include these optimisations.

The yinfft function is slightly different when computed this way, probably avoiding noise that was introduced by the real/imag->norm/phas->real/imag conversion.

First tests showed a speed improvements by about ... drum roll ... 30%. Great!

Note: See TracTickets for help on using tickets.