From help-octave-request at bevo dot che dot wisc dot edu Sat Jan 17 18:55:50 2004
Subject: Re: Octave and threaded ATLAS and FFTW
From: "Henry F. Mollet"
To: "Dmitri A. Sergatskov", "John W. Eaton"
CC: Octave_post
Date: Sat, 17 Jan 2004 16:52:21 -0800

With a 400 MHz PowerPC G3 the two items tested take considerably longer.
I can understand a factor of 2000/400 = 5. Not that I need the speed, but
does it imply that vecLib (with BLAS and LAPACK) in Mac OS X Dev Tools is
about 4-10 times slower compared to ATLAS 2.6.0?
Henry

octave:1> tic; t1=cputime ; a=rand(2000); t2=cputime ; toc
ans = 5.0177
octave:2> t2-t1
ans = 3.1200 % ca. 20 times longer

octave:3> tic; t1=cputime ; b=inv(a); t2=cputime ; toc
ans = 87.499
octave:4> t2-t1
ans = 72.130 % ca. 39 times longer

on 1/16/04 4:03 PM, Dmitri A. Sergatskov at dmitri at unm dot edu wrote:

> John W. Eaton wrote:
>
>> I think it would be worth an attempt just to see what happens, but I
>> don't have time (or the need) to do it myself. If you do it, please
>> post your results to the list.
>
> I did compile octave 2.1.50 with pthreaded ATLAS. It mostly works,
> and some benchmarks show improvements. E.g. from the (in)famous Octave2
> benchmark (which was discussed on octave-maint. list)
>
> "normal" ATLAS:
> 700x700 cross-product matrix (b = a' * a)___________ (sec): 0.2713
>
> "pthreaded" ATLAS:
> 700x700 cross-product matrix (b = a' * a)___________ (sec): 0.1727
>
> I am not quite sure yet that those are real numbers, since I discovered
> that "cputime" does not work as expected any more (at least not as I
> expected):
>
> octave:2> tic; t1=cputime ; a=rand(2000); t2=cputime ; toc
> ans = 0.16871
> octave:3> t2-t1
> ans = 0.16000
> (That is probably close enough)
>
> octave:4> tic; t1=cputime ; b=inv(a); t2=cputime ; toc
> ans = 7.3011
> octave:5> t2-t1
> ans = 1.8600
> (7.3 seems more real here)
>
> octave:9> tic; t1=cputime ; c=b*a; t2=cputime ; toc
> ans = 2.9955
> octave:10> t2-t1
> ans = 0
> (Well...)
>
> octave:12> tic; t1=cputime ; ifft(fft(a)); t2=cputime ; toc
> ans = 1.5286
> octave:13> t2-t1
> ans = 1.5000
> (Looks OK again)
>
> So all BLAS/LAPACK benchmarks which use "cputime" are greatly improved :)
> (Octave2 uses the tic; toc method.)
>
> Frankly, from reading the ATLAS docs, I do not understand why I would get
> any improvements, since ATLAS by itself will not spawn separate tasks...
>
>> Thanks,
>>
>> jwe
>
> Sincerely,
> Dmitri.
>
> p.s.
>
> The tests were done on an AthlonMPx2 2000 MHz. Octave2 is from
> http://www.sciviews.org/other/benchmark.htm
> ATLAS 2.6.0, which I compiled myself.
>
> -------------------------------------------------------------
> Octave is freely available under the terms of the GNU GPL.
>
> Octave's home on the web: http://www.octave.org
> How to fund new projects: http://www.octave.org/funding.html
> Subscription information: http://www.octave.org/archive.html
> -------------------------------------------------------------
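
The "ca. 20 times" and "ca. 39 times" figures in the reply are ratios of the cputime differences. Since the quoted message reports that cputime seems to under-count with the pthreaded ATLAS build, it may be worth looking at the tic/toc ratios as well. A minimal sketch that uses only the timings quoted in this thread (the variable names are made up for illustration):

```octave
% Timings quoted in this thread, in seconds: [rand(2000), inv(a)].
% Variable names are illustrative only.
g3_wall     = [5.0177, 87.499];    % 400 MHz G3, tic/toc
g3_cpu      = [3.1200, 72.130];    % 400 MHz G3, cputime difference
athlon_wall = [0.16871, 7.3011];   % AthlonMPx2 2000 MHz, tic/toc
athlon_cpu  = [0.16000, 1.8600];   % AthlonMPx2 2000 MHz, cputime difference

cpu_ratio  = g3_cpu  ./ athlon_cpu;    % the comparison made in the reply
wall_ratio = g3_wall ./ athlon_wall;   % the same comparison on elapsed time

printf ("rand(2000): cputime ratio %.1f, tic/toc ratio %.1f\n", cpu_ratio(1), wall_ratio(1));
printf ("inv(a):     cputime ratio %.1f, tic/toc ratio %.1f\n", cpu_ratio(2), wall_ratio(2));
```

With these inputs the cputime ratios come out near 19.5 and 38.8, while the tic/toc ratios are near 29.7 and 12.0, so the inv() gap looks considerably smaller when measured by elapsed time.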