.NET Framework - Fast Fourier Transform (FFT) in VB .Net - Please Help

Asked By IdlePhaedrus on 11-Oct-07 09:25 AM
Hi, I have a FFT routine that I converted from C++ to VB in a module
as follows:

Const M_PI = 3.1415926535897931

' Fast Fourier Transform
Public Sub FFT(ByRef rex() As Single, ByRef imx() As Single, ByVal
N As UShort)

Dim nm1 As UShort = CUShort(N - 1)
Dim nd2 As UShort = CUShort(N \ 2)
Dim m As UShort = Math.Log(N) \ Math.Log(2)
Dim j As UShort = nd2
Dim k As UShort
Dim i As UShort

' bit reversal sorting
For i = 1 To N - 2
If i < j Then
swap(rex(i), rex(j))
swap(imx(i), imx(j))
End If
k = nd2
Do While k <= j
j = CUShort(j - k)
k = CUShort(k \ 2)
Loop
j += k
Next i

' loop for each stage
Dim le As UShort
Dim le2 As UShort
Dim jm1 As UShort
Dim ip As UShort
Dim ur As Single
Dim ui As Single
Dim sr As Single
Dim si As Single
Dim tr As Single
Dim ti As Single
Dim el As UShort

For el = 1 To m
'le = Round(pow(2,el),0);
le = CUShort(1 << el)
le2 = CUShort(le >> 1)
ur = 1
ui = 0
' sine & cosine
sr = Math.Cos(M_PI / le2)
si = -Math.Sin(M_PI / le2)
' loop for each sub DFT
For j = 1 To le2
jm1 = CUShort(j - 1)
' loop for each butterfly
For i = jm1 To nm1 Step le
ip = CUShort(i + le2)
tr = (rex(ip) * ur) - (imx(ip) * ui)
ti = (rex(ip) * ui) + (imx(ip) * ur)
rex(ip) = rex(i) - tr
imx(ip) = imx(i) - ti
' TODO: there is cuck going on here, don't know
what.
rex(i) += tr
imx(i) += ti
Next i
tr = ur
ur = tr * sr - ui * si
ui = tr * si + ui * sr
Next j
Next el

End Sub

Public Sub swap(Of T)(ByRef a As T, ByRef b As T)

Dim temp As T
temp = a
a = b
b = temp

End Sub

I am using this routine to process an array of 256 singles in array
rex() and imx() from an audio buffer to produce a visual
representation of the audio, however, it is very slow, so the
application 'lags' behind the sound.

The problem appears to be these two lines, if they are commented out
the program is very responsive, but obviously the algorithm fails to
produce the desired results:
rex(i) += tr
imx(i) += ti

I have tried using a temporary variable and modifying the code as
such: rex(i) = rex(i) + tr, but it is still terribly slow and the lag
on a few seconds of audio buffer could be nearly 30 seconds.

Alternatively I have created a C++ DLL from the original code, but
because it uses pointers, I cannot use it from managed code from
VB.Net unless I use old-style function declarations and I would prefer
to use managed code, and my C++ skills are somewhat lacking.  If
someone could point me in the direction of converting the following
from pointered code to managed code that may also work and solve the
problem using c++.  Any help with issue would be very much
appreciated....

Phaedrus

Header file:

using namespace System;

namespace dm {

public ref class clsMaths
{
// TODO: Add your methods for this class here.
public:
void FFT(float* rex, float* imx, unsigned short){}
};
}

cpp file:

// This is the main DLL file.



template<class T>
void swap( T% t1, T% t2 )
{ T tmp( t1 ); t1 = t2; t2 = tmp; }

// Fast Fourier Transform
void FFT(float* rex, float* imx, unsigned short N)
{
unsigned short nm1 = (unsigned short)(N-1);
unsigned short nd2 = (unsigned short)(N/2);
unsigned short m = Math::Round(log((float)N)/log((float)2),0);
unsigned short j = nd2;
unsigned short k;

// Windowing
//Hanning(rex);

// bit reversal sorting
for (unsigned short i=1; i<=N-2; i++)
{
if (i < j)
{
swap(rex+i, rex+j);
swap(imx+i, imx+j);
}
k = nd2;
while (k <= j)
{
j = (unsigned short)(j-k);
k = (unsigned short)(k/2);
}
j += k;
}
// loop for each stage
unsigned short le, le2, jm1, ip;
float ur, ui, sr, si, tr, ti;
for (unsigned short el=1; el<=m; el++)
{
//le = Round(pow(2,el),0);
le = (unsigned short)(1<<el);
le2 = (unsigned short)(le>>1);
ur = 1;
ui = 0;
// sine & cosine
sr = cos(M_PI/le2);
si = -sin(M_PI/le2);
// loop for each sub DFT
for (j=1; j<=le2; j++)
{
jm1 = (unsigned short)(j-1);
// loop for each butterfly
for (int i=jm1; i<=nm1; i+=le)
{
ip = (unsigned short)(i + le2);
tr = rex[ip]*ur - imx[ip]*ui;
ti = rex[ip]*ui + imx[ip]*ur;
rex[ip] = rex[i] - tr;
imx[ip] = imx[i] - ti;
rex[i] += tr;
imx[i] += ti;
}
tr = ur;
ur = tr*sr - ui*si;
ui = tr*si + ui*sr;
}
}
}




Armin Zingler replied on 11-Oct-07 10:11 AM
I think you will never get it as fast as in C++ (though, maybe close), but
at least you should enable Option Strict On in order to see the otherwise
implicitly, maybe unwanted, conversions done.
Be aware that in VB there is the "/" and "\" operator. The former does
floating point division, the latter integer division.
(I didn't review the whole code)


Armin
IdlePhaedrus replied on 11-Oct-07 11:05 AM
Thanks Armin for your quick response, I have turned on Option Strict
but it seems to make not difference, and I will fiddle around with the
\ & / operators, but the problems seems to be specifically with
updating a value in an array with its own value and some other value
held in a variable, so for instance "rex(i) = tr + 2" and "rex(i) =
rex(i) + 2" is fast, but "rex(i) += tr" or "rex(i) = rex(i) + tr" is
slow.  It is very odd as the tr variable is the same type as rex(i).

Regards
Phaedrus
Mindgeek replied on 11-Oct-07 12:16 PM
This is a BIG slowdown. Use lookup tables instead.
' sine & cosine
sr = Math.Cos(M_PI / le2)---->>>SLOW
si = -Math.Sin(M_PI / le2))---->>>SLOW
IdlePhaedrus replied on 12-Oct-07 04:05 AM
Thanks Mindgeek, but that does not seem to be where the bottleneck is
when the code is running.  If I comment out these two lines:

rex(i) += tr
imx(i) += ti

Then the code is almost instantaneous.

Phaedrus
Armin Zingler replied on 12-Oct-07 08:33 AM
I slight improvement might be if you declare the arrays ByVal. ByRef is not
necessary because you do not create new array, you only change the content of
the array.


Armin
Anderson replied on 16-Oct-07 01:37 PM
Hi

Also try to use Integer and Double arrays...
From MSDN: "For variables that never contain fractions, the integral
data types are more efficient than the nonintegral types. In Visual Basic,
Integer and UInteger are the most efficient numeric types. For fractional
numbers, Double is the most efficient data type, because the processors on
current platforms perform floating-point operations in double precision.
However, operations with Double are not as fast as with the integral types
such as Integer."


Anderson