From help-request at octave dot org Tue Sep 6 17:14:32 2005
Subject: Re: statistical function example
From: Joe Koski
To: Dean Allen Provins, Matti Picus
CC: Octave Help
Date: Tue, 06 Sep 2005 16:13:12 -0600

Dean,

There is also some online, textbook-like info on the KS test if you
Google for it. I looked it up a while back when I was contemplating a use
for it, but found an alternative approach.

Joe

on 9/6/05 1:18 PM, Dean Allen Provins at provinsd at telusplanet dot net wrote:

> Matti:
>
> On Thu, Aug 25, 2005 at 09:12:59PM +0000, Matti Picus wrote:
>> Dean Allen Provins <provinsd at telusplanet dot net> writes:
>>
>>>> On Tue, 23 Aug 2005, Dean Allen Provins wrote:
>>>>
>>>>> I have been trying to make some sense out of the "kolmogorov_smirnov_test"
>>>>> function result. Given a sample of 8 data points, for which Swan and
>>>>> Sandilands, "Introduction to Geological Data Analysis", give a clear
>>>>> answer, I cannot get an answer from the KS test that has any meaning
>>>>> for me.
>>>>>
>>>>> S&S obtain the maximum deviation (about 0.22) and compare that value to
>>>>> that which would be exceeded with probability 0.05 (their table indicates
>>>>> about 0.46). The second return value from the Octave KS test is much
>>>>> larger:
>>>>>
>>>>> p = 0.053223
>>>>> k = 1.3466
>>>>>
>>>>> I presume the "p" value is the probability of rejecting H0, but what is
>>>>> "k"? No such value appears in the one-sided test tables that I located
>>>>> on the 'net.
>>>>>
>>>>> The input data X and the cumulative frequency used (i/(n+1)) are:
>>>>>
>>>>>        X        CF
>>>>>  0.07000   0.11111
>>>>>  0.12000   0.22222
>>>>> -0.06000   0.33333
>>>>> -0.04000   0.44444
>>>>> -0.05000   0.55556
>>>>>  0.08000   0.66667
>>>>>  0.04000   0.77778
>>>>>  0.00000   0.88889
>>>>>
>>>>> Would any readers with some insight care to enlighten me?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Dean
>>
>> Background: just so we are talking about the same thing...
>> The test works like this: given two sampled "cumulative frequencies" F1
>> and F2 (btw they are more commonly referred to as "cumulative
>> distribution functions"), calculate a value k based on the number of
>> samples in each of F1 and F2 and the maximum distance between them.
>> Maximum distance is defined as follows: plot the two distributions using
>> the sampled values on the x axis and their associated probabilities on
>> the y axis. The maximum distance is the length of the longest vertical
>> line joining the two plots. Then use the value k to look up a
>> probability p under H0.
>>
>> A small p is evidence against H0: at p = 0.05 you would reject H0 at the
>> 5% significance level, i.e. conclude that the two distributions are
>> different. There are different methods for calculating p from k; some
>> authors are a little careless for k values that result in such a clear
>> rejection of the null hypothesis, since those cases are not interesting
>> to most of us.
>>
>> The call to the Octave implementation of the test assumes that you have
>> x    - a set of raw observations,
>>        e.g. [0, 0.4, -0.1, 0.7, 0.3, 0.4, -0.9]
>> dist - a text string naming the distribution; evaluating
>>        feval([dist "_cdf"], y) will yield the CDF of the chosen
>>        distribution at the value y
>>
>> so a call to the function like
>> [p,k]=kolmogorov_smirnov_test(x, "uniform", 0, 1)
>> would give the p-value for the hypothesis that the sample x is drawn
>> from a uniform distribution over 0 to 1.
>> The value k would be an intermediate value calculated from the length of
>> x and the maximum difference between the sampled CDF of x and the uniform
>> CDF, used to look up p.
>>
>> The strength of the test is that the value of k directly determines the
>> probability, with no assumptions about either distribution.
>>
>> Did this help?
>> Matti
>
> Thanks for the assistance, and I apologize for not responding sooner.
>
> I have examined the code, and tried to make some sense of it in the light
> of the only text (Swan and Sandilands, 1995) that I have that mentions
> a KS test.
>
> I think that with your explanation and my code study, I'll be able to
> make use of the test with some confidence.
>
> Thanks again,
>
> Dean
>
> --
> Dean Provins, P. Geoph.
> 50.95033N, 114.03791E
> provinsd at telusplanet dot net
> dprovins at alumni dot ucalgary dot ca
> KeyID at at pgpkeys.mit.edu:11371: 0x9643AE65
> Fingerprint: 9B79 75FB 5C2B 22D0 6C8C 5A87 D579 9BE5 9643 AE65
>
> -------------------------------------------------------------
> Octave is freely available under the terms of the GNU GPL.
>
> Octave's home on the web: http://www.octave.org
> How to fund new projects: http://www.octave.org/funding.html
> Subscription information: http://www.octave.org/archive.html
> -------------------------------------------------------------
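
As a rough illustration of the calculation Matti describes in the thread, here is a minimal Python sketch (Python rather than Octave, and not the source of kolmogorov_smirnov_test itself). It computes the maximum distance D between the empirical CDF of a sample and a reference CDF, scales it to k = sqrt(n) * D, and converts k to p with the asymptotic Kolmogorov series. The names ks_statistic, kolmogorov_sf and uniform_cdf are hypothetical, and the sqrt(n) scaling of k is an assumption about what the "intermediate value" is, not something the thread states.

```python
import math


def ks_statistic(sample, cdf):
    """Return (D, k): the maximum vertical distance between the empirical
    CDF of `sample` and the reference `cdf`, and k = sqrt(n) * D.
    The sqrt(n) scaling is an assumption, see the text above."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        z = cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x, so the
        # reference CDF is compared against both sides of the step.
        d = max(d, abs(z - i / n), abs(z - (i + 1) / n))
    return d, math.sqrt(n) * d


def kolmogorov_sf(k, terms=100):
    """Asymptotic two-sided p-value for k:
    2 * sum_{j>=1} (-1)^(j-1) * exp(-2 j^2 k^2).
    The truncated series is only reliable when k is not very small."""
    if k <= 0:
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * k * k)
    return min(1.0, max(0.0, 2.0 * total))


def uniform_cdf(x, a=0.0, b=1.0):
    """CDF of the uniform distribution on [a, b]."""
    return min(1.0, max(0.0, (x - a) / (b - a)))


# Matti's example observations, tested against uniform(0, 1).
x = [0, 0.4, -0.1, 0.7, 0.3, 0.4, -0.9]
d, k = ks_statistic(x, uniform_cdf)
p = kolmogorov_sf(k)
print("D = %.4f, k = %.4f, p = %.4f" % (d, k, p))

# p implied by the k value reported in the thread (k = 1.3466).
p_thread = kolmogorov_sf(1.3466)
print("p for k = 1.3466: %.5f" % p_thread)
```

Under these assumptions, the k = 1.3466 reported in the thread maps to a p of about 0.0532, in line with the reported p = 0.053223. For the 8-point sample it would correspond to a maximum distance of k / sqrt(8), about 0.476, which is the quantity comparable to the tabulated critical value of about 0.46.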