Why is Fisher's test called an exact test?

Question:

apple guava

2009-05-21 06:23:16 UTC

What's so exact about it?

I understand when numbers in a fourfold table are too small for x^2 test, we can use exact probability test devised by Fisher, Irwin and Yates.

What I don't get is how it is supposed to be "exact" when using just small amount of data? I would have thought using less data you'll be less exact in terms making claims, inferences or generalisations about a bigger population...

Three answers:

Puzzling

2009-05-24 23:43:35 UTC

The Fisher test is a diagnostic procedure used to confirm suspected normal pressure hydrocephalus (NPH). The Fisher test confirms NPH if the removal of 30 mL of cerebrospinal fluid (CSF) results in clinical improvement of NPH symptoms.

Diagnosis of NPH is usually first led by a lumbar puncture, followed by the evaluation of clinical response to removal of CSF. This can be followed by a CT, MRI, and continuous external lumbar CSF drainage during 3 or 4 days.

Lumbar puncture is usually the first step in diagnosis of NPH and the opening pressure measured carefully. In most cases, CSF pressure is usually above 155 mmH2O. Clinical improvement after removal of CSF (30 mL or more) has a high predictive value for subsequent success with shunting. This is called the "lumbar tap test" or "Fisher test". A "negative" test has a very low predictive accuracy, as many patients may improve after a shunt in spite of lack of improvement after CSF removal.

Retrieved from "http://en.wikipedia.org/wiki/Fisher_test"

Anish

2009-05-29 02:09:19 UTC

Fisher's exact test is a statistical significance test used in the analysis of contingency tables where sample sizes are small. It is named after its inventor, R. A. Fisher, and is one of a class of exact tests, so called because the significance of the deviation from a null hypothesis can be calculated exactly rather than by relying on a test statistic having a distribution that is approximately that of a known theoretical distribution. Fisher is said to have devised the test following a comment from Muriel Bristol, who claimed to be able to detect whether the tea or the milk was added first to her cup.

The test is useful for categorical data that result from classifying objects in two different ways; it is used to examine the significance of the association (contingency) between the two kinds of classification. So in Fisher's original example, one criterion of classification could be whether milk or tea was put in the cup first; the other could be whether Ms Bristol thinks that the milk or tea was put in first. We want to know whether these two classifications are associated - that is, whether Ms Bristol really can tell whether milk or tea was poured in first. Most uses of the Fisher test involve, like this example, a 2 x 2 contingency table. The p-value from the test is computed as if the margins of the table are fixed, i.e. as if, in the tea-tasting example, Ms. Bristol knows the number of cups with each treatment (milk or tea first) and will therefore provide guesses with the correct number in each category. As pointed out by Fisher, this leads under a null hypothesis of independence to a hypergeometric distribution of the numbers in the cells of the table.

With large samples, a chi-square test can be used in this situation. The usual rule of thumb is that the chi-square test is not suitable when the expected values in any of the cells of the table, given the margins, is below 10: the sampling distribution of the test statistic that is calculated is only approximately equal to the theoretical chi-squared distribution, and the approximation is inadequate in these conditions (which arise when sample sizes are small, or the data are very unequally distributed among the cells of the table). In fact, for small, sparse, or unbalanced data, the exact and asymptotic p-values can be quite different and may lead to opposite conclusions concerning the hypothesis of interest.[1][2] The Fisher test is, as its name states, exact, and it can therefore be used regardless of the sample characteristics. It becomes difficult to calculate with large samples or well-balanced tables, but fortunately these are exactly the conditions where the chi-square test is appropriate.

For hand calculations, the test is only feasible in the case of a 2 x 2 contingency table. However the principle of the test can be extended to the general case of an m x n table[3], and some statistical packages provide a calculation (sometimes using a Monte Carlo method to obtain an approximation) for the more general case.

[edit] Example

For example, a sample of teenagers might be divided into male and female on the one hand, and those that are and are not currently dieting on the other. We hypothesize, perhaps, that the proportion of dieting individuals is higher among the women than among the men, and we want to test whether any difference of proportions that we observe is significant. The data might look like this:

men women total

dieting 1 9 10

not dieting 11 3 14

totals 12 12 24

These data would not be suitable for analysis by a chi-squared test, because the expected values in the table are all below 10, and in a 2 × 2 contingency table, the number of degrees of freedom is always 1.

The question we ask about these data is: knowing that 10 of these 24 teenagers are dieters, and that 12 of the 24 are female, what is the probability that these 10 dieters would be so unevenly distributed between the women and the men? If we were to choose 10 of the teenagers at random, what is the probability that 9 of them would be among the 12 women, and only 1 from among the 12 men?

Before we proceed with the Fisher test, we first introduce some notation. We represent the cells by the letters a, b, c and d, call the totals across rows and columns marginal totals, and represent the grand total by n. So the table now looks like this:

men women total

dieting a b a + b

not dieting c d c + d

totals a + c b + d n

Fisher showed that the probability of obtaining any such set of values was given by the hypergeometric distribution:

p = {{{a+b}\choose{a}}{{c+d}\choose{c}}}\left/{{{n}\choose{a+c}}}\right. =\frac{(a+b)!(c+d)!(a+c)!(b+d)!}{n!a!b!c!d!}

where \tbinom nk is the binomial coefficient and the symbol ! indicates the factorial operator.

This formula gives the exact probability of observing this particular arrangement of the data, assuming the given marginal totals, on the null hypothesis that men and wo

2016-12-24 15:40:31 UTC

Copied from Statistica 7.0: "For small n, this probability may well be computed precisely by counting all a probability tables that is built based on the marginal frequencies" (information in Crosstabulations - Fisher top attempt) This application purely does it for 2x2, however the actual concern is the dimensions of n with the aid of fact that "counting all a probability tables" grows extra effective than exponentially. (n is the sum of all cells) be conscious which you need to learn your table with all different a probability tables. you need to do it for different table sizes, yet purely for extremely small n, with the aid of fact combos certainly surpass any notebook's skill. you will come across an occasion in: Statistical tactics for expenses and Proportions: Joseph L. Fleiss

ⓘ

This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.

about - legalese