I'm trying to figure out which FFT implementations are fastest, best, etc.
First, it is difficult to get fftw3 + pyfftw installed into python.
For pyfftw to install, you need to build fftw3 separately for each precision (thanks to the Linux Toolkits blog). Apparently you can't pass --enable-long-double and --enable-float to the same build, but pyfftw requires both, so you have to install twice or thrice? Nope, four times!! (That's the default double-precision build, plus float, quad precision, and long double.) Also, my own post indicates that I should install to a different path to avoid weird conflicts with other libraries.
    ./configure --enable-threads --enable-openmp --enable-mpi --enable-shared \
        --enable-fortran --enable-avx \
        CFLAGS="-O3 -fno-common -fomit-frame-pointer -fstrict-aliasing"
    make -j 4
    sudo make install

    ./configure --enable-float --enable-threads --enable-openmp --enable-mpi --enable-shared \
        --enable-fortran --enable-avx \
        CFLAGS="-O3 -fno-common -fomit-frame-pointer -fstrict-aliasing"
    make -j 4
    sudo make install

    # quad precision is not supported in mpi
    ./configure --enable-quad-precision --enable-threads --enable-openmp --enable-shared \
        --enable-fortran --enable-avx \
        CFLAGS="-O3 -fno-common -fomit-frame-pointer -fstrict-aliasing"
    make -j 4
    sudo make install

    ./configure --enable-long-double --enable-threads --enable-openmp --enable-mpi --enable-shared \
        --enable-fortran --enable-avx \
        CFLAGS="-O3 -fno-common -fomit-frame-pointer -fstrict-aliasing"
    make -j 4
    sudo make install
On to speed test comparisons:
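For anyone who wants to rerun this kind of comparison, here is a minimal timing sketch. It is not the benchmark behind any numbers in this post: the helper names `collect_ffts` and `time_ffts`, the array size, and the repeat count are all made up for illustration. It times `fft2` from numpy, and from scipy and pyfftw if they happen to import.

```python
import timeit

import numpy as np


def collect_ffts():
    """Return {name: fft2 callable} for every backend that can be imported."""
    backends = {"numpy": np.fft.fft2}
    try:
        import scipy.fft
        backends["scipy"] = scipy.fft.fft2
    except ImportError:
        pass
    try:
        import pyfftw.interfaces.cache
        import pyfftw.interfaces.numpy_fft as fftw_np
        # Keep FFTW plans around between calls instead of re-planning each time.
        pyfftw.interfaces.cache.enable()
        backends["pyfftw"] = fftw_np.fft2
    except ImportError:
        pass
    return backends


def time_ffts(shape=(256, 256), repeats=5):
    """Best-of-`repeats` wall time in seconds for one fft2 call per backend."""
    rng = np.random.default_rng(0)
    data = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    timings = {}
    for name, fft2 in collect_ffts().items():
        fft2(data)  # warm-up call, so one-off planning cost is not timed
        timings[name] = min(
            timeit.repeat(lambda: fft2(data), number=1, repeat=repeats)
        )
    return timings


if __name__ == "__main__":
    for name, seconds in sorted(time_ffts().items(), key=lambda kv: kv[1]):
        print(f"{name:8s} {seconds * 1e3:8.3f} ms")
```

Taking the minimum over a few repeats is a deliberate choice here: it is less sensitive to whatever else the machine is doing than the mean would be.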
If you want to use fftw3 with astropy's convolve_fft, use this example.
On to some comparisons from astropy issue 4374: