From bug-request at octave dot org Wed Dec 7 15:48:48 2005
Subject: bugfix for anova.m
From: D Goel
To: KH, bug@octave.org
Date: Wed, 7 Dec 2005 15:43:48 -0600

--=-=-=

I am a newbie to anova, so please pardon me if I screwed up.

The attached diff and ChangeLog entry fix what I believe is a bug in
anova.m. (The diff is against the anova.m from 2.1.69.)

When calculating the total mean, it is not valid to simply take the mean
of the group means, because the groups may have different lengths. With
unequal group sizes, the mean of the group means differs from the mean
of all observations.

This example illustrates the point:

octave:88> y=[-2 -1 0 1 2 7 8 9]'; g = [ 0 0 0 0 0 1 1 1]'
octave:90> anova(y,g)

One-way ANOVA Table:

Source of Variation   Sum of Squares    df  Empirical Var
*********************************************************
Between Groups            128.0000       1      128.0000
Within Groups              12.0000       6        2.0000
---------------------------------------------------------
Total                     140.0000       7

Test Statistic f           64.0000
p-value                     0.0002

ans = 0.000203464502079087

As you see above, the stock anova.m overestimates the between-groups SS
(128 instead of 120), and hence the total variation (140), whose true
value is:

octave:91> sumsq(y - mean(y))
ans = 132

As a check, here's what octave-forge's anovan yields:

octave:93> anovan(y,g)

1-way ANOVA Table (Factors A,):

Source of Variation        Sum Sqr   df      MeanSS    Fval   p-value
*********************************************************************
Error                        12.00    6        2.00
Factor A                    120.00    1      120.00  60.000  0.000243

This is what the fixed anova also yields.

Another check: the anova function in M$ "Excel" also reports the same
results.

Sincerely,
deego
--

--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=ChangeLog

2005-12-07  Deepak Goel

        * anova.m: Total mean should be the actual mean, not the mean of
        the group means: the latter may not be the same as the total
        mean because of varying group lengths.
--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=anovadiff

--- anova.m 2005-12-07 16:38:38.322759384 -0500
+++ tmpanovamy.m 2005-12-07 16:40:15.681958536 -0500
@@ -80,7 +80,7 @@
   endif
-  total_mean = mean (group_mean);
+  total_mean = mean(y(:));
   SSB = sum (group_count .* (group_mean - total_mean) .^ 2);
   SST = sumsq (reshape (y, n, 1) - total_mean);
   SSW = SST - SSB;

--=-=-=--

-------------------------------------------------------------
Octave is freely available under the terms of the GNU GPL.

Octave's home on the web:  http://www.octave.org
How to fund new projects:  http://www.octave.org/funding.html
Subscription information:  http://www.octave.org/archive.html
-------------------------------------------------------------
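
The arithmetic in the report can be reproduced by hand with a short
sketch like the one below. It does not call anova.m; the loop that
builds group_mean and group_count, and the names levels, idx,
old_total_mean, new_total_mean, ssb_old and ssb_new, are assumptions
made for illustration only. Only the SSB, SST and SSW formulas are taken
from the diff.

```octave
% Data from the report: two groups of unequal size (5 and 3 observations).
y = [-2 -1 0 1 2 7 8 9]';
g = [ 0  0 0 0 0 1 1 1]';

% Per-group means and counts, computed directly (illustrative only;
% this is not how anova.m builds them internally).
levels = unique (g);
group_mean  = zeros (size (levels));
group_count = zeros (size (levels));
for i = 1:numel (levels)
  idx = (g == levels(i));
  group_mean(i)  = mean (y(idx));
  group_count(i) = sum (idx);
endfor

% Old behaviour: unweighted mean of the group means.
old_total_mean = mean (group_mean);   % (0 + 8) / 2 = 4
% Patched behaviour: mean of all observations.
new_total_mean = mean (y(:));         % 24 / 8 = 3

% Between-groups sum of squares under each choice of total mean.
ssb_old = sum (group_count .* (group_mean - old_total_mean) .^ 2);  % 128
ssb_new = sum (group_count .* (group_mean - new_total_mean) .^ 2);  % 120

% Total and within-groups sums of squares with the corrected mean.
sst = sumsq (y - new_total_mean);     % 132
ssw = sst - ssb_new;                  % 12

printf ("SSB old = %g, SSB new = %g, SST = %g, SSW = %g\n", ...
        ssb_old, ssb_new, sst, ssw);
```

With the corrected mean, SSB + SSW = 120 + 12 = 132, which matches
sumsq(y - mean(y)). With the mean of the group means, the same data give
SSB = 128 and SST = 140, the figures in the first table above.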