From help-octave-request at bevo dot che dot wisc dot edu Thu Jan 23 01:00:29 2003
Subject: Re: How to input tabular data through a form? (fwd)
From: Paul Kienzle
To: "Dmitri A. Sergatskov"
Cc: help-octave
Date: Thu, 23 Jan 2003 02:00:43 -0500

Dmitri A. Sergatskov wrote:

>Oops, forgot to CC to the list.
>
>---------- Forwarded message ----------
>Date: Wed, 22 Jan 2003 23:12:17 -0700 (MST)
>From: Dmitri A. Sergatskov
>To: Paul Kienzle
>Subject: Re: How to input tabular data through a form?
>
>Paul,
>
>Just for fun I timed your dlmread vs straight fscanf on a rather
>large file (did it 3 times to see if there were cache-related effects):
>
>--------
>GNU Octave, version 2.1.42 (i686-pc-linux-gnu).
>....
>gnuplot_binary = /usr/local/bin/gnuplot
>octave:1> tic ; a=dlmread('tmp2.dat') ; toc
>ans = 125.80
>octave:2> tic ; a=dlmread('tmp2.dat') ; toc
>ans = 126.94
>octave:3> tic ; a=dlmread('tmp2.dat') ; toc
>ans = 126.73
>octave:4> tic; fid=fopen('tmp2.dat'); a=fscanf(fid,'%f,%f,%f,%f,%f,%f,%f,%f,%f,%f',[10,inf]); fclose(fid); toc
>ans = 23.869
>octave:5> tic; fid=fopen('tmp2.dat'); a=fscanf(fid,'%f,%f,%f,%f,%f,%f,%f,%f,%f,%f',[10,inf]); fclose(fid); toc
>ans = 23.518
>octave:6> tic; fid=fopen('tmp2.dat'); a=fscanf(fid,'%f,%f,%f,%f,%f,%f,%f,%f,%f,%f',[10,inf]); fclose(fid); toc
>ans = 23.485
>
>[dima at psyche t4c-8]$ ls -lh tmp2.dat
>-rw-rw-r-- 1 dima dima 63M Jan 22 22:48 tmp2.dat
>[dima at psyche t4c-8]$
>----------------
>
>I understand that dlmread does quite a few things, but frankly I was surprised
>that the difference is so large. I suspect it is mostly due to 'reshape',
>but have not done any actual profiling.
>

There are several reasons that dlmread is slower:

1) I suck in the whole file. Even though I specify bytes, fread
   converts these to doubles, and I then convert them back to bytes.
   This means 8xN memory, where N is the file size.
2) I traverse the whole file three times doing search and replace.
3) I reshape the result at the end.
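
Point 1 can be checked with a small experiment. The sketch below is not taken from dlmread itself; it only shows that fread returns doubles by default even when the precision is a byte type. The 'uint8=>uint8' form and the variable names are assumptions: that form is documented for current Octave releases and may not exist in 2.1.42.

```octave
% Write a tiny sample file to a temporary location.
fname = tempname ();
fid = fopen (fname, 'w');
fwrite (fid, '1,2,3');
fclose (fid);

fid = fopen (fname, 'r');
as_double = fread (fid, Inf, 'uchar');        % bytes read, but stored as double
frewind (fid);
as_bytes = fread (fid, Inf, 'uint8=>uint8');  % assumption: recent Octave only
fclose (fid);
delete (fname);

% A double takes 8 bytes per element, a uint8 takes 1.
printf ('%s: %d elements, %s: %d elements\n', ...
        class (as_double), numel (as_double), ...
        class (as_bytes), numel (as_bytes));
```

Both arrays hold the same five character codes; only the storage class differs, which is where the 8xN figure comes from.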

BTW, are you sure your data is in the right order at the end? I would
guess you need to do a transpose, because fscanf reads the data in by
rows but puts it into the matrix by columns.

You could perhaps improve upon dlmread by first reading one line to find
the number of columns, then building the appropriate format string to
read the rest. Line terminators will still be an issue, since IIRC
nobody has patched the appropriate bits of Octave to handle Mac-style
line endings.

This won't solve dlmread's outstanding compatibility issue:

    ,,,3,4,5

should be read as

    0 0 0 3 4 5

instead of

    3 4 5

since sscanf('3,,4','%g,%g,%g') returns 3 rather than 3 0 4.

Paul Kienzle
pkienzle at users dot sf dot net
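
The two suggestions above (count the columns from the first line, then transpose what fscanf returns) can be sketched as follows. This is a minimal illustration, not a replacement for dlmread: read_numeric_csv and parse_row_with_empty_fields are hypothetical names, and the code assumes a comma delimiter, purely numeric fields, and LF or CRLF line endings. The second function shows one possible way to treat empty fields as zeros on a single line. It relies on strsplit and str2double as found in current Octave, which were not necessarily available in 2.1.42.

```octave
1;  % script-file marker: lets this file define more than one function

% Hypothetical helper: read a comma-separated numeric file by counting
% the columns on the first line and building a matching format string.
function a = read_numeric_csv (filename)
  fid = fopen (filename, 'r');
  if (fid < 0)
    error ('read_numeric_csv: cannot open %s', filename);
  end
  first = fgetl (fid);              % first line, without its terminator
  if (~ischar (first))              % fgetl returns -1 on an empty file
    fclose (fid);
    a = [];
    return;
  end
  ncols = sum (first == ',') + 1;   % one more field than delimiters
  fmt = ['%f', repmat(',%f', 1, ncols - 1)];
  frewind (fid);                    % start over so the first row is kept
  % fscanf fills the matrix column by column, so read ncols-by-N
  % and transpose to get one file row per matrix row.
  a = fscanf (fid, fmt, [ncols, Inf])';
  fclose (fid);
end

% Hypothetical helper: parse one line, mapping empty fields to 0.
function vals = parse_row_with_empty_fields (line)
  fields = strsplit (line, ',', 'collapsedelimiters', false);
  vals = str2double (fields);       % empty fields come back as NaN
  vals(cellfun (@isempty, fields)) = 0;
end
```

For example, parse_row_with_empty_fields(',,,3,4,5') gives 0 0 0 3 4 5. The per-line parser is much slower than a single fscanf call, so it only makes sense as a fallback for files that actually contain empty fields.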