From bug-octave-request at bevo dot che dot wisc dot edu Mon Jan 5 18:44:20 2004
Subject: Re: Bad octave norms
From: Glenn Golden
To: Vic Norton
Cc: bug-octave at bevo dot che dot wisc dot edu
Date: Mon, 05 Jan 2004 17:44:02 -0700

Vic Norton writes:
>
> Sorry to disagree with you, Glen.
>

Not sure what you're disagreeing with; it appears to be with something
I didn't say. I stated the documented Octave/Matlab behavior of
norm(A, p) when A is a vector, that the norm computed in this case is
the vector p-norm, and that that computation is performed consistent
with the usual definition of vector p-norm. I did not say that
norm(A, p) computes the same answer that you'd get if you asked a
mathematician -- or a subroutine -- what the _matrix p-norm_ of a row
vector is, which seems to be the point you're making, and there is no
disagreement on that: It does not, and the documentation of norm()
reflects this.

Why did Mathworks decide to implement it this way? That would be an
interesting question to post on the Mathworks web forum. It's
reasonably active, sometimes even Moler Himself fields questions, and I
think this is an interesting one, from a historical perspective.

My 2c-worth guess is that it was probably an intentional tradeoff in
favor of the primary intended users of Matlab, that is, engineers and
others solving numerical problems rather than mathematicians concerned
about compliance with theoretical linear algebra.

The tradeoff is between two mutually exclusive types of egregious
behavior, one being egregious to one user group, and the other
egregious to the other. Type M egregious behavior is the one you're
referring to. Type E egregious behavior would result from fixing Type
M: If this were done, then the norm()s of row and column vectors for
p != 2 would differ.
But in many problem-solving situations, a row vector is simply an
ordered set of data, not a linear functional in a dual space, and in
that context, one really wants it to be treated as an n-vector, not as
a 1xn matrix. My guess is that the Mathworkers probably figured there
were more users who'd be annoyed/confused by Type E behavior than by
Type M, so they took the route that minimized expected user annoyance.

As for renaming, you could bring that up in the Mathworks forum too,
but my guess is that you're not going to get much sympathy, even aside
from the obvious legacy issue. The function behaves in a consistent way
in view of its documentation. There is no violation of
submultiplicativity as long as one is aware of its behavior. It can
even be used to compute the _matrix p-norm_ of row vectors for
p = 1, inf, by invoking it with p = inf, 1 respectively. And besides,
computer implementations of many other math functions -- including
float arithmetic itself -- do not behave strictly in accordance with
their formal mathematical definitions, yet use the same names.

Another possibility, on the Octave side, would be for you to contribute
some new functions, say vnorm() and mnorm(), which would be _defined_
as the vector and matrix p-norms. If this were done, and also
cross-documented with norm(), then both Type E and Type M users would
be satisfied, and aware of the other's interpretation, too. IMHO, that
would be a nice touch of technical providence for Octave, just one more
reason for people to appreciate it. The octave-maintainers list would
probably be the place to poll on this, see what they think.

Glenn

> Octave's definition may be in agreement with MATLAB, but it is not in
> agreement with the accepted mathematical definition. Reread Chapter
> 2.3, Matrix Norms, of Golub & Van Loan's Matrix Computations.
>
> The p-norm of an (m x n)-matrix A is, by definition,
>    norm(A, p) = sup { norm(A * x, p) : norm(x, p) = 1 },
> where x is an n x 1 vector.
> It follows that, if the 1-norm of an n x 1 vector x is
>    sum { abs(x(i)) : i = 1,..,n },
> then the 1-norm of an m x n matrix A must be
>    max { sum { abs(A(i, j)) : i = 1,..,m } : j = 1,..,n }.
> Likewise, if the infinity-norm of an n x 1 vector x is
>    max { abs(x(i)) : i = 1,..,n },
> then the infinity-norm of an m x n matrix A must be
>    max { sum { abs(A(i, j)) : j = 1,..,n } : i = 1,..,m }.
>
> Octave and MATLAB have got their definitions right for every m other
> than 1. Why in the world would they think they should change things
> when m = 1? Is this just the computer way of doing things? Anything
> for expediency? But we are talking mathematics here. Mathematics
> doesn't work this way. Mathematics insists on elegance, on logical
> consistency. Mathematics is not just a bag of tricks -- no matter
> what MATLAB thinks!
>
> I suggest that Octave rename its function morn(A, p). That
> way everyone will know that it is just an ad hoc creation, not what
> mathematicians call the p-norm.
>
> And then there is duality and the relation (1/p) + (1/q) = 1. Start
> with R^n. The dual space of R^n, (R^n)*, looks just like R^n. So what
> has this got to do with p's and q's? Just this: the p-norm on (R^n)*
> is the same as the q-norm on R^n. Column vectors are in R^n; row
> vectors are in (R^n)*. The only time that R^n and (R^n)* are the same
> normed spaces is when p = q = 2 -- no matter what MATLAB and Octave
> say!
>
> BTW, if I recall correctly, complete normed algebras that obey the rule
>    norm(A * B) <= norm(A) * norm(B)
> are called Banach algebras, and Banach was doing this stuff way before
> MATLAB.
>
>
> Regards,
>
> Vic
>
> At 12:57 PM -0700 1/1/04, Glenn Golden wrote:
> >Vic,
> >
> >> The 1 and inf octave norms are not defined correctly for row vectors.
> >> The definitions should be
> >>    norm(A, 1) = max(abs(A))
> >>    norm(A, inf) = sum(abs(A))
> >> when rows(A) = 1.
> >>
> >
> >When rows(A) == 1 or columns(A) == 1, then norm(A, p) are vector p-norms
> >(as opposed to matrix p-norms). Octave's implementation of them agrees
> >with Matlab's, and both behave in accordance with the definition of
> >vector p-norm,
> >
> >    norm(A, p)  =def=  ( SUM ( |A(k)| ** p) ) ** (1/p)
> >                          k
> >
> >in which case
> >
> >    norm(A, 1)   = sum(abs(A))
> >    norm(A, inf) = max(abs(A)) .
> >
> >
> >>
> >> A reasonable matrix norm should satisfy
> >>    norm(A * B) <= norm(A) * norm(B).
> >>
> >
> >Octave's matrix p-norms satisfy this.
> >
> >
> >>
> >> The 1 and inf octave norms can violate this condition when A is a row
> >> vector. For example, set A = [1 0] and B = [1 1; 0 0]. Then
> >>    2 = norm(A * B, 1) > norm(A, 1) * norm(B, 1) = 1.
> >> Again, set A = [1 1] and B = [1 0; 1 0]. Then
> >>    2 = norm(A * B, inf) > norm(A, inf) * norm(B, inf) = 1.
> >>
> >
> >
> >In your examples, A is a vector, B is a matrix. In this case, the
> >requirement for "reasonable" norm behavior (where "reasonable" amounts
> >to subordinate behavior of matrix p-norms w.r.t. their corresponding
> >vector p-norms) is
> >
> >    norm(B * A', p) <= norm(B, p) * norm(A', p)
> >
> >and Octave's (and Matlab's) behavior satisfy this.
> >
> >
> >
> >Ref: Golub & Van Loan, 1/e, Sec. 2.1, 2.2.
> >
> >Glenn Golden
>
> --
> *---* mailto:vic at norton dot name
> |     Victor Thane Norton, Jr.
> |     Mathematician and Motorcyclist
> |     phone: 419-353-3399
> *---* http://vic.norton.name

-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------
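
The row-vector examples discussed in the thread can be checked without
calling norm() at all. What follows is a minimal sketch in plain Python,
not Octave code and not something either correspondent posted. It
assumes the behavior Glenn describes: norm_as_described() is a
hypothetical model that applies the vector p-norm whenever its argument
has a single row or a single column, and the induced matrix norm
otherwise. All function names here are made up for illustration, and
only p = 1 and p = inf are handled in the matrix case.

```python
import math


def vector_p_norm(x, p):
    """Vector p-norm of a flat list x; p may be math.inf."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def matrix_1_norm(A):
    """Induced matrix 1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))


def matrix_inf_norm(A):
    """Induced matrix infinity-norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in A)


def norm_as_described(A, p):
    """Hypothetical model of norm(A, p) as described in the thread:
    vector p-norm for a single row or column, induced norm otherwise."""
    if len(A) == 1 or len(A[0]) == 1:
        return vector_p_norm([v for row in A for v in row], p)
    if p == 1:
        return matrix_1_norm(A)
    if p == math.inf:
        return matrix_inf_norm(A)
    raise ValueError("this sketch only handles p = 1 and p = inf")


def matmul(A, B):
    """Plain nested-list matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


# Vic's first example: A is 1 x 2, B is 2 x 2.
A = [[1, 0]]
B = [[1, 1], [0, 0]]
AB = matmul(A, B)  # [[1, 1]]

# Under the described behavior, the product rule fails: 2 > 1 * 1.
print(norm_as_described(AB, 1),
      norm_as_described(A, 1) * norm_as_described(B, 1))

# Treating A and A*B as 1 x n matrices restores it: 1 <= 1 * 1.
print(matrix_1_norm(AB), matrix_1_norm(A) * matrix_1_norm(B))

# Vic's second example, with the infinity-norm.
A2 = [[1, 1]]
B2 = [[1, 0], [1, 0]]
A2B2 = matmul(A2, B2)  # [[2, 0]]
print(norm_as_described(A2B2, math.inf),
      norm_as_described(A2, math.inf) * norm_as_described(B2, math.inf))
print(matrix_inf_norm(A2B2), matrix_inf_norm(A2) * matrix_inf_norm(B2))
```

The same helpers also show the swap Glenn mentions: for a row vector,
the induced 1-norm equals the vector inf-norm, and the induced inf-norm
equals the vector 1-norm, so the matrix values can be obtained from a
vector-style norm by exchanging p = 1 and p = inf.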