From bug-octave-request at bevo dot che dot wisc dot edu Fri Dec 15 10:55:32 2000
Subject: Unidentified subject!
From: "John W. Eaton"
To: bug-octave at bevo dot che dot wisc dot edu
cc: Ikonen Teemu
Date: Fri, 15 Dec 2000 10:55:22 -0600

On 14-Dec-2000, John W. Eaton wrote:

| On 18-May-2000, Ikonen Teemu wrote:
|
| | To: bug-octave at bevo dot che dot wisc dot edu
| | Cc: teemu
| | Subject: Interpreter crash with the abuse of \ operator
| |
| | Bug report for Octave 2.1.30 configured for %OCTAVE_CONF_CANONICAL_HOST_TYPE%
| |
| | Description:
| | -----------
| |
| | Using the left-division operator with two longish row-vectors results
| | in a segfault and the crash of the interpreter.
| |
| | Repeat-By:
| | ---------
| |
| | octave:1> x = linspace(0,2, 60);
| | octave:2> A = sin(x);
| | octave:3> y = A;
| | octave:4> a = A\y;
| |
| | Program received signal SIGSEGV, Segmentation fault.
| | 0x831c139 in dgemm_ () at oct-time.cc:292
| | oct-time.cc:292: No such file or directory.
| | (gdb) backtrace
| | #0 0x831c139 in dgemm_ () at oct-time.cc:292
| | #1 0x832bee7 in dgelss_ () at oct-time.cc:292
|
| Sorry for the long delay.
|
| I think this problem should now be fixed in the current CVS sources.

I should have included a bit more info here. The bug was really in
DGELSS from Lapack, which I think I've fixed, so the problem will go
away if you link to the copy of Lapack that is distributed with
Octave, but not if you use a vendor version of Lapack, or some other
Lapack library on your system that has the bug.

Here is a copy of the message I sent to the Lapack group about it,
including the patch:

Subject: bug in dgelss and zgelss (with patch)
From: "John W. Eaton"
To: lapack at cs dot utk dot edu
cc: jwe at bevo dot che dot wisc dot edu
Date: Thu, 14 Dec 2000 02:12:36 -0600
Message-ID: <14904 dot 33012 dot 50917 dot 969769 at foobar dot bogus dot domain>

The following bug was reported for Octave (www.octave.org).

  From: "John W. Eaton"
  To: john
  Cc: octave-maintainers at bevo dot che dot wisc dot edu
  Subject: Unidentified subject!
  Date: Thu, 14 Dec 2000 01:55:52 -0600

  Bug report for Octave 2.1.31 configured for %OCTAVE_CANONICAL_HOST_TYPE%
  (CVS 9 December 2000)

  Description:
  -----------

  The following not at all useful code causes a Segmentation fault:

  arrow2 [544] octave
  GNU Octave, version 2.1.31 (i586-pc-linux-gnulibc1).
  Copyright (C) 1996, 1997, 1998, 1999, 2000 John W. Eaton.
  This is free software with ABSOLUTELY NO WARRANTY.
  For details, type `warranty'.

  [...]

  octave:1> xx = linspace(-3,10)';
  octave:2> xx / (pi+xx)
  panic: Segmentation fault -- stopping myself...
  attempting to save variables to `octave-core'...
  save to `octave-core' complete
  Segmentation fault

For cases like this, Octave eventually calls dgelss from Lapack. The
crash was happening inside dgemm, but it looks like a bug in Lapack
that can cause the bounds of a work array to be exceeded.

I checked netlib and the development version of Octave seems to have
the latest Lapack routines, so I don't think this problem has been
fixed in the Lapack code that is currently being distributed on netlib.

Anyway, my best guess at a fix is appended below, followed by some
diagnosis.

Can you please let me know if my fix for dgelss and zgelss is OK? It
seems to fix the problem, but I'm not sure I follow all that is going
on with the work array.

Thanks,

jwe


2000-12-14  John W. Eaton

        * lapack/dgelss.f (DGELSS): Use correct leading dimension
        for workspace array passed to dgemm and dlacpy.
        (ZGELSS): Likewise, for calls to zgemm and zlacpy.

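The effect of a wrong leading dimension can be sketched with a little
index arithmetic. This is an illustration only, not code from Lapack
or Octave: the helper name column_major_span is hypothetical, and the
chunk width BL = 100 is an assumed value chosen to match the M = 1,
N = 100, LWORK = 3205 case used by the Fortran test program later in
this message. In column-major storage, an M-by-BL block with leading
dimension LD reaches (BL-1)*LD + M elements past its first element.

```python
def column_major_span(rows, cols, ld):
    """Elements spanned by a rows-by-cols column-major block
    stored with leading dimension ld (hypothetical helper)."""
    if ld < rows:
        raise ValueError("leading dimension must be at least the row count")
    # Element (i, j), 1-based, sits at offset (i-1) + (j-1)*ld,
    # so the last element (rows, cols) is at (rows-1) + (cols-1)*ld.
    return (cols - 1) * ld + rows


# Assumed values for illustration: M, N, LWORK as in the test program,
# BL taken to be the full set of 100 right-hand sides.
m, n, bl = 1, 100, 100
lwork = 3205

span_old = column_major_span(m, bl, n)  # leading dimension N (before the patch)
span_new = column_major_span(m, bl, m)  # leading dimension M (after the patch)

print("span with ldc = N:", span_old)
print("span with ldc = M:", span_new)
print("whole work array :", lwork)
```

Under these assumptions the unpatched call spans 9901 elements, well
past the 3205 available in the whole work array (and the block starts
at WORK(IWORK), not WORK(1), so even less is actually free), while the
patched call spans only 100.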
Index: dgelss.f
===================================================================
RCS file: /usr/local/cvsroot/octave/libcruft/lapack/dgelss.f,v
retrieving revision 1.3
diff -u -r1.3 dgelss.f
--- dgelss.f 2000/02/10 09:26:48 1.3
+++ dgelss.f 2000/12/14 07:45:56
@@ -491,8 +491,8 @@
             DO 40 I = 1, NRHS, CHUNK
                BL = MIN( NRHS-I+1, CHUNK )
                CALL DGEMM( 'T', 'N', M, BL, M, ONE, WORK( IL ), LDWORK,
-     $                     B( 1, I ), LDB, ZERO, WORK( IWORK ), N )
-               CALL DLACPY( 'G', M, BL, WORK( IWORK ), N, B( 1, I ),
+     $                     B( 1, I ), LDB, ZERO, WORK( IWORK ), M )
+               CALL DLACPY( 'G', M, BL, WORK( IWORK ), M, B( 1, I ),
      $                      LDB )
    40       CONTINUE
          ELSE
Index: zgelss.f
===================================================================
RCS file: /usr/local/cvsroot/octave/libcruft/lapack/zgelss.f,v
retrieving revision 1.3
diff -u -r1.3 zgelss.f
--- zgelss.f 2000/02/10 09:26:50 1.3
+++ zgelss.f 2000/12/14 07:46:02
@@ -512,8 +512,8 @@
             DO 40 I = 1, NRHS, CHUNK
                BL = MIN( NRHS-I+1, CHUNK )
                CALL ZGEMM( 'C', 'N', M, BL, M, CONE, WORK( IL ), LDWORK,
-     $                     B( 1, I ), LDB, CZERO, WORK( IWORK ), N )
-               CALL ZLACPY( 'G', M, BL, WORK( IWORK ), N, B( 1, I ),
+     $                     B( 1, I ), LDB, CZERO, WORK( IWORK ), M )
+               CALL ZLACPY( 'G', M, BL, WORK( IWORK ), M, B( 1, I ),
      $                      LDB )
    40       CONTINUE
          ELSE


Here's a Fortran program that demonstrates the problem.

      program foo

      integer nn, m, n, nrhs, nrr, lwork, info
      integer i

      parameter (nn = 100)
      parameter (m = 1, n = nn, nrhs = n, nrr = nn)
      parameter (ls = 1, lwork = 3205)

      double precision a(m,n), result(nrr,nrhs), s(ls), work(lwork)
      double precision rcond, rank

      do 20 j = 1, nrhs
        do 10 i = 1, nrr
          result(i,j) = 0.0
 10     continue
 20   continue

      do 30 i = 1, n
        a(1,i) = i
 30   continue

      do 40 j = 1, nrhs
        result(1,j) = j
 40   continue

      rcond = -1.0

      call dgelss (m, n, nrhs, a, m, result, nrr, s, rcond, rank,
     $     work, -1, info)

      print *, 'dgelss recommends lwork = ', work(1)
      print *, 'we used lwork = ', lwork
      print *, 'calling dgelss'

      call dgelss (m, n, nrhs, a, m, result, nrr, s, rcond, rank,
     $     work, lwork, info)

      print *, 'done with info = ', info

      end

Compiling this code with g77 on a Linux system (linked with the
current Fortran Lapack from netlib) and then running it results in a
segmentation fault. Here is a traceback from gdb:

foobar:3> gdb ./a.out
GNU gdb 5.0
Copyright 2000 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
(gdb) r
Starting program: /home/jwe/src/octave/liboctave/dgelss-bug/./a.out
dgelss recommends lwork = 3205.
we used lwork = 3205
calling dgelss

Program received signal SIGSEGV, Segmentation fault.
0x805000a in dgemm_ (transa=0x8073f33, transb=0x8073f37, m=0x8074458,
    n=0xbfff92d0, k=0x8074458, alpha=0x8073f50, a=0xbfff935c, lda=0xbfff92a4,
    b=0x8087340, ldb=0x8074454, beta=0x8073f48, c=0xbfff9364, ldc=0x8074454,
    __g77_length_transa=1, __g77_length_transb=1) at dgemm.f:258
258 C( I, J ) = ALPHA*TEMP
Current language: auto; currently fortran
(gdb) where
#0 0x805000a in dgemm_ (transa=0x8073f33, transb=0x8073f37, m=0x8074458,
    n=0xbfff92d0, k=0x8074458, alpha=0x8073f50, a=0xbfff935c, lda=0xbfff92a4,
    b=0x8087340, ldb=0x8074454, beta=0x8073f48, c=0xbfff9364, ldc=0x8074454,
    __g77_length_transa=1, __g77_length_transb=1) at dgemm.f:258
#1 0x804ee5c in dgelss_ (m=0x8074458, n=0x8074454, nrhs=0x8074454,
    a=0xbffff784, lda=0x8074458, b=0x8087340, ldb=0x8074454, s=0xbffff77c,
    rcond=0xbfff934c, rank=0xbfff9344, work=0xbfff9354, lwork=0x8074464,
    info=0xbffffaa8) at dgelss.f:493
#2 0x8069138 in MAIN__ () at foo.f:37
#3 0x806f408 in main ()
#4 0x40059c1c in __libc_start_main () from /lib/libc.so.6

-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web: http://www.octave.org
How to fund new projects: http://www.octave.org/funding.html
Subscription information: http://www.octave.org/archive.html
-------------------------------------------------------------