From octave-maintainers-request at bevo dot che dot wisc dot edu Sat Jan 17 19:12:27 2004 Subject: Re: benchmarks - sort From: Schloegl Alois To: David Bateman Cc: Paul Kienzle , octave-maintainers@bevo.che.wisc.edu Date: Sun, 18 Jan 2004 02:12:08 +0100 Zitat von David Bateman : > > Well, for the hell of it I programmed a simple sort oct-file to test > what would be faster (IEEE754 integer compare or floating point compare > with checking for Inf and NaN), you can check the attached oct-file. > One complication of IEEE754 is the symmetry of negative and postive numbers. > This means that positive and negative tests need to be treated seperately, as > opposed to a 2's complement respresentation > > The code at the moment is only for a row vector and is attached the example > I used was > > a=randn(1,100000); a(1) = NaN; a(2) = -Inf; a(3) = Inf; tic; mysort(a); t = > toc; tic; mysort(a,1); t1= toc; t, t1 > > And I got the result > > t = 0.27054 > t1 = 0.23212 > > showing that IEEE754 integer compares are about 20% faster.... Well > IEEE754 is faster, but it is certainly not a factor of 10 faster... > Seems like we need to look elsewhere for this order of magnitude > increase... Good job. I did not think about the sign problem first. Your result shows that, handling of NaN's as exceptions cost 20% of total computational effort. A comment in your source says, this is the bubble sort algorithm. Bubble-Sort has O(n^2) a computational effort. An algorithm with an effort O(n*log(n)) like heap sort, merge sort or quick sort could provide a much larger performance increase. Still, there is another idea: The major effort in sorting are the comparison operation. In your version, the comparison contains the sign-comparison and the (long long) comparison. It would be more efficient, separating both. In the first step, long long comparisons are are used to sort the data. This provides the sorted absolute values and requires O(n.log(n)) steps of the simpler integer comparison. The second step performs the sign test on the sorted output of the first stop with only O(n) effort. step 1: % DO_SORT data with (long long) comparisions and neglecting the sign bit. s = sort (abs(data)) % step 2 l = 0; m = N; for k = 1:N, if s(k)>0, o(m)=s(k); m = m - 1; else o(l) = s(k); l = l+1; end; end; The basic difference is that only O(n) sign tests (instead of O(n.log(n)) ) are required. I agree that this is not the ultimate solution. Alois > > Regards > D. > > > According to Schloegl Alois (on 01/06/04): > > Zitat von Paul Kienzle : > > > > > > > > Paul Thomas pointed me to these benchmarks. Anybody want to do > > > something about them? > > > > > > http://www.sciviews.org/other/benchmark.htm > > > > > > > ... > > > > > > > > I.C Matlab 0.89 - Octave 7.77 > > > b = sort(a); > > > > > > Octave's sort is surprisingly slow. 3x worse than any other package > > > mentioned. Anyone know a fast stable sort algorithm? > > > > > > > > > One reason, for the bad performance of the sort algorithm might be the > exception > > handling of NaNs. See also > > http://www.octave.org/octave-lists/archive/bug-octave.2001/msg00047.html > > > > Sorting on the binary level might be helpful, because the bit patterns of > the > > IEEE754 numbers provide the correct sorting order. This is > > -inf < -1 < 0 < 1 < inf < NaN > > > > Just my few thougths. > > > > > > Alois > > > > -- > David Bateman David dot Bateman at motorola dot com > Motorola CRM +33 1 69 35 48 04 (Ph) > Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) > 91193 Gif-Sur-Yvette FRANCE > > The information contained in this communication has been classified as: > > [x] General Business Information > [ ] Motorola Internal Use Only > [ ] Motorola Confidential Proprietary >