From octave-maintainers-request at bevo dot che dot wisc dot edu Mon Jan 19 10:40:14 2004 Subject: Re: benchmarks - sort From: David Bateman To: Paul Kienzle Cc: David Bateman , octave-maintainers@bevo.che.wisc.edu Date: Mon, 19 Jan 2004 17:34:37 +0100 According to Paul Kienzle (on 01/19/04): > If you are going to implement it, keep compatibility > for complex sort, since any code that uses it is going > to depend on the order. > > Complex sort could be used to match complex pairs. > Chances are, though, that it is not going to be very > reliable since the complex pairs may have differing > magnitudes due to machine precision issues. Has > anyone actually seen complex sorts in the wild? > > It is tempting to just punt on complex sort and tell > the users to use the following instead: > > [junk,idx] = sortrows([abs(x(:)),arg(x(:))]) > y = x(idx); > > That way they will know they are doing a meaningless > operation. Yes, but matlab has it so, if we do this we introduce another annoying incompatiability.... > A couple of other points: > > (1) Octave's existing sort is stable (i.e., equal values > retain their order). Is, I thought the divide and conquer algorithm it uses makes it inevetiable that there are misorderings. In any case, radix sorts are also stable, so the order is also maintained. The complex ordering in octave doesn't respect the values phase however. > (2) Make sure to check the case of a sorted or almost > sorted array. These occur quite a bit in existing code > (e.g., lookup and hist). This is not so easy to do in radix sort, as there is no comparisons in the code per-se. Its all done with table, so you tested for almost sorted arrays would just significantly slow things down. Radix-sorts also loose out in cases where there are lots of identical values in the array. But hey, I've gained a factor of 8 in speed, so I think you still win with my version of the code. > BTW, the python folk are quite proud of their sort. A description of > the algorithm is here: > http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/ > listsort.txt?rev=1.1&content-type=text/vnd.viewcvs-markup > > And the code is here: > http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/ > Objects/listobject.c?rev=2.131&content-type=text/vnd.viewcvs-markup Well, except for the case of partially sorted lists the Radix sort should be faster as it breaks the O(lg(n!)) lower bound on comparison based sorts. In fact it runs in linear time, and so it will be increasing better than a comparison based sort as n increases. Where the radix sort will loose out is small partially sorted lists. The question is how small is small and what structure in the original array we can exploit. It will also be determined by the cost of the compare operation. It seems that what is really costing us is the check to for NaN's... Anyway, need more work to find the best solution. It might be a case of find the break point where a radix sort is faster than a comparison sort and using the faster of the two algorithms... Regards David -- David Bateman David dot Bateman at motorola dot com Motorola CRM +33 1 69 35 48 04 (Ph) Parc Les Algorithmes, Commune de St Aubin +33 1 69 35 77 01 (Fax) 91193 Gif-Sur-Yvette FRANCE The information contained in this communication has been classified as: [x] General Business Information [ ] Motorola Internal Use Only [ ] Motorola Confidential Proprietary