From maintainers-request at octave dot org Wed Feb 9 17:09:45 2005 Subject: Re: min function very slow? From: "John W. Eaton" To: maintainers at octave dot org Date: Wed, 9 Feb 2005 18:14:27 -0500 On 9-Feb-2005, I wrote: | On 30-Jan-2005, I wrote: | | | If you have min and max in .oct files, then they are really defined in | | the same file (minmax.oct) and the min.oct and max.oct files are just | | additional links in the filesystem. It seems that the second one | | loaded is much slower. I think this must be some accidental overhead | | in checking whether the symbol needs to be reloaded. The problem goes | | away for me if I define min and max in separate files (i.e., create | | min.cc and max.cc, each with only one DEFUN_DLD macro) or if I set | | ignore_function_time_stamp = "all" (I'm not sure why setting it to | | "system" fails, even when the min/max functions are installed in a | | "system" directory). | | | | I'll try to check it out more thoroughly. I'm traveling, so it may | | be a week or so before I can get to it, so if someone else can pin it | | down more completely (point out the code that is wrong), that would | | help. | | I've done some more checking. The problem is that when trying to | determine whether a symbol is out of date and should be reloaded from | a file, we get the wrong answer for functions that have been found in | .oct files that have already been loaded. Here's why: | | For the first function loaded from a .oct file (say min, from | min.oct, which is a link to minmax.oct), we save a pointer to an | object that represents the .oct file. That object contains | information about the name of the file that was opened to load the | .oct file (in this case, /some/directory/min.oct) | | For the second function loaded from the .oct file (say max, from | max.oct, which is a link ot minmax.oct), we also save a pointer to | the object that represents the .oct file. This object will contain | information about the name of the file that was initially opened to | load the .oct file (in this case, /some/directory/min.oct, NOT | /some/directory/max.oct) | | When we check to see if a file is up to date, we look in the | filesystem again for it and try to decide whether the name we find | is the same file. Currently, the code to do that looks something | like this: | | if (file_name_we_looked_up != file_name_cached_in_function_object) | return true; // symbol is out of date and should be reloaded | else | { | if (time stamp of file_name_cached_in_function_object | is more recent than time stamp of file_name_we_looked_up) | return true; // file has changed since last loaded, so it | } | | Instead of just checking to see whether the names are the same, we | should be checking to see if the names refer to the same file. That | can be a bit tricky, but it's not impossible. There is code to do it | in the coreutils package, but it needs some work to be adapted for | Octave. OK, please try the following patch. It should work if your system has canonicalize_file_name (Linux and other glibc systems) or resolvepath (Solaris and possibly others). For other systems, we need to implement the guts of canonicalize_file_name. As noted above, there is code to do it in the coreutils package. I think it would be best to adapt that to C++ directly in liboctave rather than using the files from coreutils (doing that drags in a lot of extra baggage). jwe ChangeLog: 2005-02-09 John W. Eaton * configure.in: Check for canonicalize_file_name and resolvepath. liboctave/ChangeLog: 2005-02-09 John W. Eaton * file-ops.cc (file_ops::canonicalize_file_name): New functions. * file-ops.h: Provide decls. src/ChangeLog: 2005-02-09 John W. Eaton * variables.cc (same_file): New static function. (symbol_out_of_date): Use it. * syscalls.cc (Fcanonicalize_file_name): New function. Index: configure.in =================================================================== RCS file: /usr/local/cvsroot/octave/configure.in,v retrieving revision 1.458.2.1 diff -u -r1.458.2.1 configure.in --- configure.in 18 Jan 2005 18:52:26 -0000 1.458.2.1 +++ configure.in 9 Feb 2005 23:01:27 -0000 at @ -981,15 +981,16 @@ ### Checks for functions and variables. -AC_CHECK_FUNCS(atexit basename bcopy bzero dup2 endgrent endpwent execvp \ - fcntl fork getcwd getegid geteuid getgid getgrent getgrgid \ - getgrnam getpgrp getpid getppid getpwent getpwuid gettimeofday \ - getuid getwd _kbhit kill link localtime_r lstat memmove mkdir \ - mkfifo mkstemp on_exit pipe poll putenv raise readlink rename \ - rindex rmdir round select setgrent setpwent setvbuf sigaction \ - siglongjmp sigpending sigprocmask sigsuspend stat strcasecmp strdup \ - strerror strftime stricmp strncasecmp strnicmp strptime symlink \ - tempnam umask unlink usleep vfprintf vsprintf vsnprintf waitpid) +AC_CHECK_FUNCS(atexit basename bcopy bzero canonicalize_file_name \ + dup2 endgrent endpwent execvp fcntl fork getcwd getegid geteuid \ + getgid getgrent getgrgid getgrnam getpgrp getpid getppid getpwent \ + getpwuid gettimeofday getuid getwd _kbhit kill link localtime_r \ + lstat memmove mkdir mkfifo mkstemp on_exit pipe poll putenv raise \ + readlink rename resolvepath rindex rmdir round select setgrent \ + setpwent setvbuf sigaction siglongjmp sigpending sigprocmask \ + sigsuspend stat strcasecmp strdup strerror strftime stricmp \ + strncasecmp strnicmp strptime symlink tempnam umask unlink usleep \ + vfprintf vsprintf vsnprintf waitpid) OCTAVE_SMART_PUTENV Index: liboctave/file-ops.cc =================================================================== RCS file: /usr/local/cvsroot/octave/liboctave/file-ops.cc,v retrieving revision 1.36 diff -u -r1.36 file-ops.cc --- liboctave/file-ops.cc 23 Jan 2004 20:04:36 -0000 1.36 +++ liboctave/file-ops.cc 9 Feb 2005 23:01:29 -0000 at @ -311,6 +311,88 @@ return status; } +std::string +file_ops::canonicalize_file_name (const std::string& name) +{ + std::string msg; + return canonicalize_file_name (name, msg); +} + +std::string +file_ops::canonicalize_file_name (const std::string& name, std::string& msg) +{ + msg = std::string (); + + std::string retval; + +#if defined (HAVE_CANONICALIZE_FILE_NAME) + + char *tmp = ::canonicalize_file_name (name.c_str ()); + + if (tmp) + { + retval = tmp; + ::free (tmp); + } + +#elif defined (HAVE_RESOLVEPATH) + +#if !defined (errno) +extern int errno; +#endif + +#if !defined (__set_errno) +# define __set_errno(Val) errno = (Val) +#endif + + if (name.empty ()) + { + __set_errno (ENOENT); + return retval; + } + + // All known hosts with resolvepath (e.g. Solaris 7) don't turn + // relative names into absolute ones, so prepend the working + // directory if the path is not absolute. + + name = octave_env::make_absolute (name); + + size_t resolved_size = name.length (); + + while (1) + { + resolved_size = 2 * resolved_size + 1; + + OCTAVE_LOCAL_BUFFER (char, resolved, resolved_size); + + resolved_len = resolvepath (name, resolved, resolved_size); + + if (resolved_len < 0) + break; + + if (resolved_len < resolved_size) + { + retval = resolved; + break; + } + } + +#else + + // XXX FIXME XXX -- provide replacement here... + retval = name; + +#endif + + if (retval.empty ()) + { + using namespace std; + msg = ::strerror (errno); + } + + return retval; +} + // We provide a replacement for tempnam(). std::string Index: liboctave/file-ops.h =================================================================== RCS file: /usr/local/cvsroot/octave/liboctave/file-ops.h,v retrieving revision 1.20 diff -u -r1.20 file-ops.h --- liboctave/file-ops.h 11 Oct 2002 20:57:21 -0000 1.20 +++ liboctave/file-ops.h 9 Feb 2005 23:01:29 -0000 at @ -55,6 +55,12 @@ static int rmdir (const std::string&); static int rmdir (const std::string&, std::string&); + static int resolvepath (const std::string&, std::string&); + static int resolvepath (const std::string&, std::string&, std::string&); + + static std::string canonicalize_file_name (const std::string&); + static std::string canonicalize_file_name (const std::string&, std::string&); + static std::string tempnam (const std::string&, const std::string&); static std::string tempnam (const std::string&, const std::string&, std::string&); Index: src/syscalls.cc =================================================================== RCS file: /usr/local/cvsroot/octave/src/syscalls.cc,v retrieving revision 1.50 diff -u -r1.50 syscalls.cc --- src/syscalls.cc 27 Sep 2004 13:50:01 -0000 1.50 +++ src/syscalls.cc 9 Feb 2005 23:01:34 -0000 at @ -190,7 +190,7 @@ exec_args[i+1] = tmp[i]; } else - error ("exec: arguments must be strings"); + error ("exec: arguments must be character strings"); } else { at @ -942,6 +942,37 @@ return retval; } +DEFUN (canonicalize_file_name, args, , + "-*- texinfo -*-\n\ + at deftypefn {Built-in Function} {[@var{cname}, @var{status}, @var{msg}]} canonicalize_file_name (@var{name})\n\ +Return the canonical name of file at var{name} dot \n\ + at end deftypefn") +{ + octave_value_list retval; + + if (args.length () == 1) + { + std::string name = args(0).string_value (); + + if (! error_state) + { + std::string msg; + + std::string result = file_ops::canonicalize_file_name (name, msg); + + retval(2) = msg; + retval(1) = msg.empty () ? 0 : -1; + retval(0) = result; + } + else + error ("canonicalize_file_name: argument must be a character string"); + } + else + print_usage ("canonicalize_file_name"); + + return retval; +} + #if !defined (O_NONBLOCK) && defined (O_NDELAY) #define O_NONBLOCK O_NDELAY #endif Index: src/variables.cc =================================================================== RCS file: /usr/local/cvsroot/octave/src/variables.cc,v retrieving revision 1.263 diff -u -r1.263 variables.cc --- src/variables.cc 1 Nov 2004 21:06:44 -0000 1.263 +++ src/variables.cc 9 Feb 2005 23:01:35 -0000 at @ -746,6 +746,20 @@ return retval; } +// Return TRUE if F and G are both names for the same file. + +static bool +same_file (const std::string& f, const std::string& g) +{ + std::string c_f = file_ops::canonicalize_file_name (f); + std::string c_g = file_ops::canonicalize_file_name (g); + + file_stat f_fs (c_f); + file_stat g_fs (c_g); + + return (f_fs.ino () == g_fs.ino () && f_fs.dev () == g_fs.dev ()); +} + // Is there a corresponding function file that is newer than the // symbol definition? at @ -783,9 +797,7 @@ (Vload_path_dir_path.find_first_of (names), octave_env::getcwd ()); - if (file != ff) - retval = true; - else + if (same_file (file, ff)) { tmp->mark_fcn_file_up_to_date (octave_time ()); at @ -794,6 +806,8 @@ if (fs && fs.is_newer (tp)) retval = true; } + else + retval = true; } } }