From help-request at octave dot org Fri Jan 21 11:17:24 2005 Subject: Re: Very slow filter.cc From: "John W. Eaton" To: Miroslaw Kwasniak Cc: help at octave dot org Date: Fri, 21 Jan 2005 12:20:10 -0500 On 21-Jan-2005, Miroslaw Kwasniak wrote: | On Wed, Jan 19, 2005 at 09:45:37AM +0100, David Bateman wrote: | > | > t = cputime; y = filter(b,1,x); cputime - t | > | > rather than using tic/toc and see if there is better consistency between | > 2.0.17 and the 2.1.x versions... | | $ sh run.sh | OCTAVE_VERSION = 2.0.17 | ans = 2.0600 | OCTAVE_VERSION = 2.1.50 | ans = 15.380 | OCTAVE_VERSION = 2.1.57 | ans = 15.690 | OCTAVE_VERSION = 2.1.63 | ans = 14.060 | | 2.1.63 is a little better because it's compiled for 686 - rest are debian | 386 packages. Please try the following patch. | I tried with LD_PROFILE but octave has PROF signal disabled :( What is LD_PROFILE? | I would supprised if 2.0 is compilable with current tools. Only a few changes need to be made to filter.cc from 2.0.x so that it will compile with current versions of GCC. jwe src/ChangeLog: 2005-01-21 John W. Eaton * DLD-FUNCTIONS/filter.cc (filter): Avoid slow Marray indexing ops. Index: src/DLD-FUNCTIONS/filter.cc =================================================================== RCS file: /usr/local/cvsroot/octave/src/DLD-FUNCTIONS/filter.cc,v retrieving revision 1.16 diff -u -r1.16 filter.cc --- src/DLD-FUNCTIONS/filter.cc 2 Nov 2004 02:42:25 -0000 1.16 +++ src/DLD-FUNCTIONS/filter.cc 21 Jan 2005 16:59:05 -0000 at @ -146,35 +146,51 @@ if (a_len > 1) { - for (int i = 0; i < x_len; i++) + T *py = y.fortran_vec (); + T *psi = si.fortran_vec (); + + const T *pa = a.data (); + const T *pb = b.data (); + const T *px = x.data (); + + psi += si_offset; + + for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride) { - int idx = i * x_stride + x_offset; - y (idx) = si (si_offset) + b (0) * x (idx); + py[idx] = psi[0] + pb[0] * px[idx]; - if (si_len > 1) + if (si_len > 0) { for (int j = 0; j < si_len - 1; j++) { OCTAVE_QUIT; - si (j + si_offset) = si (j + 1 + si_offset) - - a (j+1) * y (idx) + b (j+1) * x (idx); + psi[j] = psi[j+1] - pa[j+1] * py[idx] + pb[j+1] * px[idx]; } - si (si_len - 1 + si_offset) = b (si_len) * x (idx) - - a (si_len) * y (idx); + psi[si_len-1] = pb[si_len] * px[idx] - pa[si_len] * py[idx]; } else - si (si_offset) = b (si_len) * x (idx) - - a (si_len) * y (idx); + { + OCTAVE_QUIT; + + psi[0] = pb[si_len] * px[idx] - pa[si_len] * py[idx]; + } } } else if (si_len > 0) { - for (int i = 0; i < x_len; i++) + T *py = y.fortran_vec (); + T *psi = si.fortran_vec (); + + const T *pb = b.data (); + const T *px = x.data (); + + psi += si_offset; + + for (int i = 0, idx = x_offset; i < x_len; i++, idx += x_stride) { - int idx = i * x_stride + x_offset; - y (idx) = si (si_offset) + b (0) * x (idx); + py[idx] = psi[0] + pb[0] * px[idx]; if (si_len > 1) { at @ -182,14 +198,17 @@ { OCTAVE_QUIT; - si (j + si_offset) = si (j + 1 + si_offset) + - b (j+1) * x (idx); + psi[j] = psi[j+1] + pb[j+1] * px[idx]; } - si (si_len - 1 + si_offset) = b (si_len) * x (idx); + psi[si_len-1] = pb[si_len] * px[idx]; } else - si (si_offset) = b (1) * x (idx); + { + OCTAVE_QUIT; + + psi[0] = pb[1] * px[idx]; + } } } } ------------------------------------------------------------- Octave is freely available under the terms of the GNU GPL. Octave's home on the web: http://www.octave.org How to fund new projects: http://www.octave.org/funding.html Subscription information: http://www.octave.org/archive.html -------------------------------------------------------------