From bug-octave-request at bevo dot che dot wisc dot edu Mon Jan 19 16:39:48 2004
Subject: STR2NUM and nasty side effects
From: Schloegl Alois
To: bug-octave at bevo dot che dot wisc dot edu
Date: Mon, 19 Jan 2004 16:28:24 -0600

Description:
-----------

I was analyzing some ASCII files which contain configuration
information. Unexpectedly, the working directory suddenly changed.
I tracked the problem down to STR2NUM, which uses EVAL.

octave:15> cd data
octave:16> pwd   % show working directory
/home/schloegl/data
octave:17> line ='200,300,400,cd,yes,no,999,maybe,do_something_bad';
octave:18> x=str2num(line)
x =

  200  300  400

octave:19> pwd   % working directory has changed.
/home/schloegl

It took quite some time to figure out the problem. In this example,
the first three numbers were converted correctly. Then the string 'cd'
is evaluated, which changes the working directory.

THIS IS A SIDE EFFECT, WHICH IS NOT INTENDED AND NOT EXPECTED.

Please note that STR2NUM could be used for really nasty things, like

   str2num('unix(''rm *.*'')')
   % do not use the example unless you know what you are doing.
   % You have been warned.

IT IS ALSO A SECURITY FLAW WHICH CAN BE EXPLOITED, SIMPLY BY FEEDING
A NASTY ASCII FILE THROUGH STR2NUM.

I hope you got the idea. I do not want to elaborate on that, so as not
to give somebody further ideas. Anyway, STR2NUM must not use EVAL;
otherwise it is a security flaw and you cannot use STR2NUM for
analyzing arbitrary ASCII files.

Repeat-By:
---------

see example above.

Fix:
---

The attached implementation might not conform to all style guidelines
for Octave, but it works. I wrote it also having in mind the articles
by D. A.
Wheeler on secure programming
http://www-106.ibm.com/developerworks/library/l-sp3.html

The main features of the attached STR2NUM are:
- it does not use EVAL and is therefore safe.
- it is much faster, noticeably so in Octave4Windows.
- it can deal with matrices.
- it is compatible with the previous version.

However, str2num('cd') can no longer be used to change into the
home directory ;-)

   Alois

Content-Disposition: attachment; filename="str2num.m"

function [num,status,strarray] = str2num(s,cdelim,rdelim)
## STR2NUM converts strings into numeric values
##  [NUM, status] = STR2NUM(STR)
##
##    STR can be the form '[+-]d[.]dd[[eE][+-]ddd]'
##       d can be any digit from 0 to 9
##       [.] indicates optional elements
##    NUM is the corresponding numeric value.
##       in case STR is not a valid pattern, NUM is NaN;
##    status = 0: conversion was successful
##    status = -1: could not convert string into numeric value
##
##    STR can also contain multiple elements.
##    Then, NUM and status return matrices of appropriate size.
##    Row-delimiters are:
##        NEWLINE, CARRIAGE RETURN and SEMICOLON i.e. ASCII 10, 13 and 59.
##    Column-delimiters are:
##        TAB, SPACE and COMMA i.e. ASCII 9, 32, and 44.
##    Elements which are not defined or not valid return NaN and
##        the status flag becomes -1
##
##  For compatibility with previous versions,
##    NUM = STR2NUM(STR)
##    returns empty matrix [] if conversion fails.
##
##  Examples:
##    str2num('-.1e-5')
##       ans = -1.0000e-006
##
##    str2num('.314e1, 44.44e-1, .7; -1e+1')
##       ans =
##          3.1400    4.4440    0.7000
##        -10.0000       NaN       NaN
##
##    line ='200,300,400,cd,yes,no,999,maybe,do_something_bad';
##    [x,status]=str2num(line)
##       x =
##          200   300   400   NaN   NaN   NaN   999   NaN   NaN
##       status =
##            0     0     0    -1    -1    -1     0    -1    -1
##
##    x=str2num(line)
##       x = [](0x0)

## This program is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License
## as published by the Free Software Foundation; either version 2
## of the License, or (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; if not, write to the Free Software
## Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
## Copyright (C) 2004 by Alois Schloegl
## a dot schloegl at ieee dot org

%%valid_char='0123456789eE+-.nNaAiIfF';   % digits, sign, exponent,NaN,Inf
cdelim = char([9,32,abs(',')]);    % column delimiter
rdelim = char([10,13,abs(';')]);   % row delimiter

num = [];
status = 0;
strarray = {};

k1 = 0;   % current row
nc = 0;   % number of columns
while ~isempty(s),
  [u,s] = strtok(s,rdelim);   %% get next row
  if isempty(u), return; end;
  k1 = k1 + 1;
  num(k1,:) = nan;            %% add row to output matrix
  status(k1,:) = 0;
  k2 = 0;
  while ~isempty(u),
    [t,u] = strtok(u,cdelim);   %% get next element
    k2 = k2 + 1;
    if k2 > nc,                 %% add column if necessary
      num(:,k2) = nan;
      status(:,k2) = 0;
      nc = k2;
    end;
    strarray{k1,k2} = t;
    if ~isempty(t),
      epos = find((t=='e') | (t=='E'));   %% position of E
      if (length(epos)>1),      %% if more than one E is found
        status(k1,k2) = -1;     %% return error code
      else
        if length(epos)==0,     %% no E found
          e = 0;
          epos = length(t)+1;
        elseif (length(epos)==1),   %% one E found
          l = epos+1;
          if ~any(t(length(t))=='0123456789');   % last character must be a digit
            e = nan;
            status(k1,k2) = -1;
          else                  % get exponent
            v = 0;
            g = 1;
            if t(l)=='-',
              g = -1;
              l = l+1;
            elseif t(l)=='+',
              l = l+1;
            end;
            %% skip leading zeros: compare against the character '0'
            %% (not the number 0) and keep at least one digit
            while (l<length(t)) && (t(l)=='0'), l = l+1; end;
            while l<=length(t),
              if any(t(l)=='0123456789');
                v = v*10+t(l)-48;   %% 48 is the ASCII code of '0'
              else
                v = nan;
                status(k1,k2) = -1;
              end;
              l = l+1;
            end;
            e = g*v;
          end;
        end;
        %% get mantissa
        g = 0;
        v = 1;
        if t(1)=='-',
          v = -1;
          l = 2;
        elseif t(1)=='+',
          l = 2;
        else
          l = 1;
        end;
        if strcmpi(t(l:epos-1),'inf')
          v = v*inf;
        elseif strcmpi(t,'NaN');
          v = NaN;
        elseif all(sum(t(1:epos-1)=='.')<[2,epos-1]),   % at most one dot, and at least one digit is needed
          %% skip leading zeros of the mantissa, keeping at least one character
          while (l<epos-1) && (t(l)=='0'), l = l+1; end;
          p = 0;
          while l