From bug-octave-request at bevo dot che dot wisc dot edu Tue Feb 10 07:29:30 2004
Subject: anova.m
From: toni saarela
To: bug-octave at bevo dot che dot wisc dot edu
Date: Tue, 10 Feb 2004 15:22:51 +0200

Version: Octave 2.1.50 (i686-pc-linux-gnu)

Description:

I think there's a small bug in anova.m (which performs one-way analysis of variance). It only occurs when using anova with two input arguments, as in

  octave:1> anova (y, g)

where y is a vector containing the data and g is a vector defining the groups, and only with unequal group sizes.

The total mean is calculated as the unweighted mean of the group means (see below). This works fine if the group sizes are equal. However, if they are not, it gives too much weight to smaller groups in the calculation of the total mean, sometimes leading to overestimates of the between-groups variance (and of the total variance), and thus to F-values that are too high and p-values that are too small.

Example:

  octave:1> y = [1 3 4 2 1 5 3 5 6 7 4 5 7 10 11 3]';
  octave:2> g = [1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3]';
  octave:3> anova (y, g)

gives

  F=7.4067, p=0.0071 (ssq between groups = 71.5600)

should be (please correct me if I'm wrong):

  F=6.3797, p=0.0117 (ssq between groups = 61.6375)

Fix:
----

Simply replacing the vector group_mean with y (the input vector containing all the data) in the calculation of total_mean on line 83 should fix it:

line 83:
  total_mean = mean (group_mean);
to:
  total_mean = mean (y);

Now the SSQs produce the right result (lines 84-86):

  SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
  SST = sumsq (reshape (y, n, 1) - total_mean);
  SSW = SST - SSB;

(Or, if group_mean is to be used, it should be weighted by the relative group sizes.)

Best regards,
Toni Saarela
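The figures quoted in the report can be re-checked with a short standalone script. This is only a sketch and not the code in anova.m: the names naive_mean and weighted_mean are introduced here for illustration, the group statistics are rebuilt with a plain loop, and the p-value is taken from the regularized incomplete beta function (betainc) as the upper tail of the F distribution, which is an assumption about how one might compute it without the statistics functions anova.m itself uses.

```octave
% Data and group labels from the report.
y = [1 3 4 2 1 5 3 5 6 7 4 5 7 10 11 3]';
g = [1 1 1 1 1 1 1 1 2 2 2 2 2 3 3 3]';

n = numel (y);
groups = unique (g);
k = numel (groups);

% Per-group sizes and means.
group_count = zeros (k, 1);
group_mean = zeros (k, 1);
for i = 1:k
  members = y(g == groups(i));
  group_count(i) = numel (members);
  group_mean(i) = mean (members);
end

% Three candidate "total means" (naive_mean and weighted_mean are
% illustrative names, not identifiers from anova.m).
naive_mean = mean (group_mean);                        % unweighted: the reported bug
total_mean = mean (y);                                 % proposed fix
weighted_mean = sum (group_count .* group_mean) / n;   % size-weighted alternative

% Between-groups sum of squares with the unweighted and the corrected mean.
SSB_naive = sum (group_count .* (group_mean - naive_mean) .^ 2);
SSB = sum (group_count .* (group_mean - total_mean) .^ 2);

% Remaining sums of squares and the F statistic, using the corrected mean.
SST = sumsq (y - total_mean);
SSW = SST - SSB;
df_b = k - 1;
df_w = n - k;
F = (SSB / df_b) / (SSW / df_w);

% Upper tail of F(df_b, df_w) via the regularized incomplete beta function.
p = betainc (df_w / (df_w + df_b * F), df_w / 2, df_b / 2);

printf ("SSB (unweighted mean) = %.4f\n", SSB_naive);
printf ("SSB (mean of y)       = %.4f\n", SSB);
printf ("F = %.4f, p = %.4f\n", F, p);
```

With the data above, weighted_mean equals mean (y), so weighting the group means by group size and averaging the raw data are equivalent ways of getting the corrected total mean.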