From octave-maintainers-request at bevo dot che dot wisc dot edu Thu Jan 23 18:14:08 2003
Subject: RFC: Threaded ATLAS/FFTW in Octave
From: Mumit Khan
To:
Date: Thu, 23 Jan 2003 18:13:50 -0600 (CST)

Has anyone done any benchmarking on Octave's use of threaded ATLAS and
FFTW? In my own codes, I'm often surprised by the performance boost in
large matrix-matrix multiplications using ATLAS, even on a 2-way machine.

However, my few experiments with threaded FFTW pretty much told me that
the user has to tell it how many threads to use; there is no magic number
that follows from, say, the number of CPUs on the machine. I believe the
FFTW doc says the same thing. There is a thread tester program to help
with this process.

The advantage of the FFTW approach is that you can tell it to use a single
thread at runtime, which is equivalent to what's currently implemented in
Octave. ATLAS, on the other hand, uses every CPU it finds if built with the
threaded option, which may or may not be what the user wants, even on a
multi-CPU machine. In my case, I build both threaded and non-threaded
versions, and I just use the one that I feel is appropriate at the time of
use.

Is there any interest in getting this into Octave? The ATLAS part is
trivial -- if ATLAS is found by configure, we can use a flag such as
--enable-threads to enable the threaded support (i.e., link with the right
libraries), and that's all there is to it. For FFTW, the fft module needs
to add an init call before first use; what's left is an interface to
specify the number of threads to use in subsequent calls to FFTW routines,
subject to change at any time.

Regards,
Mumit
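
The FFTW part of the proposal (an init call before first use, plus a thread count that can be changed at any time) could be sketched roughly as follows. This is only an illustration under stated assumptions: fft_set_num_threads, fft_get_num_threads and fft_prepare are hypothetical names, not existing Octave functions, and the backend_* functions are stubs standing in for the threaded library's own entry points (with FFTW 3 those would be fftw_init_threads() and fftw_plan_with_nthreads()).

```c
#include <assert.h>

/* Hypothetical stand-ins for the threaded FFT library. They only record
   what was called, so the sketch builds without the library installed. */
static int backend_init_calls = 0;
static int backend_nthreads = 1;

static int backend_init_threads(void)
{
  backend_init_calls++;
  return 1;                     /* nonzero means success */
}

static void backend_plan_with_nthreads(int n)
{
  backend_nthreads = n;
}

/* State owned by the (hypothetical) fft module. */
static int fft_initialized = 0;
static int fft_nthreads = 1;    /* default: one thread, the current behavior */

/* User-facing setter; may be called at any time.
   Returns 0 on success, -1 if n is not a positive count. */
int fft_set_num_threads(int n)
{
  if (n < 1)
    return -1;
  fft_nthreads = n;
  return 0;
}

int fft_get_num_threads(void)
{
  return fft_nthreads;
}

/* Called by the fft module before each transform is planned.
   Runs the one-time init on first use, then applies the current count. */
int fft_prepare(void)
{
  if (!fft_initialized)
    {
      if (!backend_init_threads())
        return -1;
      fft_initialized = 1;
    }
  backend_plan_with_nthreads(fft_nthreads);
  assert(backend_nthreads == fft_nthreads);
  return 0;
}
```

Keeping the default at one thread means nothing changes for users who never call the setter, and doing the init lazily inside fft_prepare keeps the one-time call out of the startup path.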