From octave-maintainers-request at bevo dot che dot wisc dot edu Thu Jan 22 04:47:57 2004 Subject: Re: benchmarks - sort From: David Bateman To: Paul Kienzle Cc: David Bateman , THOMAS Paul Richard , octave-maintainers@bevo.che.wisc.edu Date: Thu, 22 Jan 2004 11:43:47 +0100 According to Paul Kienzle (on 01/22/04): > > For the indexed case, maybe you could create a vector > of pointers to doubles. That leads to an extra dereference > in the code which will surely slow things down a bit, but it > keeps the size of data you are shifting down to a word. > You can recover the index by pointer subtraction. Not sure I see what you mean. The code as it stands treats "*a = *b", where at the moment "a" is "double *a" for example. For the indexed case are you suggesting I have a pointer to a pointer to a structure containing the value and the index? So that only the pointers are moved around. For example OCTAVE_LOCAL_BUFFER (*vec_index, vi, elements); /* Flip the data in the vector so that int compares on * IEE754 give the correct ordering */ for (int i = 0; i < elements; i++) { vi[i]->vec = FloatFlip (p[i]); vi[i]->indx = i + 1; } Yeah this might be faster. I should try it and see. I've started but getting seg-faults.... > sortrows would be faster if it were implemented directly > rather than stable sorts on each column from right to left. As I've implemented the sort as a generic class, it should be quite easy it instantiate it for strings in sortrows and sort them directly as you suggest. > >The basic result is that the radix sort compares well against Matlab > >for randomly distributed values (1.10 times the speed of Matlab), but > >as this sort takes no advantage of partial ordering, it breaks down in > >these cases. The mergesort with double values is faster than Matlab > >for partailly ordered lists, but about 2.5 to 3 times slower than > >Matlab for random lists. Interestingly, although merge_double is > >significantly slower with indexing, it maintains the same relationship > >with Matlab. > > Is it possible that matlab is using quicksort when > you aren't indexing? Does sort need to be stable if you > are not preserving the index? If not, then choose > the fastest sort that works reasonably well on mostly > ordered data when you aren't keeping the index. IIRC, > basic quicksort doesn't perform very well on mostly > ordered data. This would explain the speed differences with Matlab I observed. Do we care enough to gain another 20% to 40% in speed at the cost of lots of code duplciation? > Did you happen to try on both big and little endian machines? > Don't you have to reverse the order of the integer comparisons > in the two cases? Isn't IEEE754 independent of big and little endian issues? An integer compare on the bytes of a big endian number gives the same results as on a little endian machine. IE. short a = 1; 0100 on little endian and 0001 on big endian short b = 2; 0200 on little endian and 0002 on big endian if (b > a) printf("This test is true on both big and little endian\n"); So it isn't necessary to check big/little endian issues. It is however, necessary to check at configure whether the machine respects the IEEE754 or IEEE854 formats. How to do this? > Thanks for all your efforts. Sort is used pretty often in some > unexpected ways to avoid loops in the script, so it's nice that > it will be fast. It would be nice to have it in octave-forge this weekend though :-).... D. -- David Bateman David dot Bateman at motorola dot com Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary