From octave-maintainers-request at bevo dot che dot wisc dot edu Wed Nov 15 15:52:53 2000 Subject: Re: new set functions From: Paul Kienzle To: "Lippert, Ross A." Cc: "'octave-maintainers at octave dot org'" Date: Wed, 15 Nov 2000 20:50:22 +0000 The run-time complexity of the set functions should be as follows: create_set = O (n log(n)) since you have to look up each new element in the existing set before inserting it, and lookup is O(log n). So the sort algorithm is a fine way to handle this. union = O(n+m) since in the worst case, you need to copy both sets. intersection, complement = O(max(n,m)) if the set sizes are similar, or O(m log(n)) if m is much smaller than n, since in the worst case you need to copy the whole set (m~n) or just look up a few elements in the larger set (m< > > > I have made a modification to create_set because I found myself > often wanting not just the set of unique elements but also the > counts of those elements in the original input. Perhaps that is > just my own eccentricity. > > However, if one modifies create_set in this way (returning a > second output of counts), then I believe that the runtime complexity > of intersection and complement can be reduced to n log n instead of > n^2 (which they appear to be now, but I am no expert on complexity). > > Anyhow, as a quick example I tried > x = fix(20000*rand(1000,1)); y = fix(20000*rand(1000,1)); > > and found a big difference between the timings of > intersection(x,y), so I think I am on the right track. > > I'd be sending this out in the form of a patch, but for the moment, > I'd really like feedback to tell me if I have done something stupid > so I don't waste time updating my CVS, installing the new scripts > and then diffing all to find out I missed something. > > > > -r > >