Question:
Two samples. Only know sample size and range limits. Significantly different?
anonymous
1970-01-01 00:00:00 UTC
Two samples. Only know sample size and range limits. Significantly different?
Four answers:
bobizoid
2006-06-30 13:19:53 UTC
Though I am not a statistician, I am an epidemiologist and have studied statistics.



Unfortunately, taking the mid-range as the mean is a very bad assumption. In fact, you have no idea what the distribution of values is between the limits. It could be highly skewed. Assumptions about the SD are also misleading.



For instance, in your n=10 example, the sample mean could be anything from 2.2 to 3.8, depending on where the eight values between the limits fall. And in your n=12 example, it could be anything from 3.25 to 5.75. You just don't know.
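
The bounds above follow from simple arithmetic: one observation sits at each limit, and the remaining n - 2 can pile up at either end. Here is a minimal sketch of that calculation. The function name is made up for illustration, and the limits 2 to 4 and 3 to 6 are an assumption: they are not stated in this thread, only inferred because they reproduce the figures quoted above.

```python
def sample_mean_bounds(n, lo, hi):
    """Smallest and largest sample mean possible for n observations
    whose minimum is lo and whose maximum is hi."""
    if n < 2:
        raise ValueError("need at least two observations")
    # One value is fixed at lo and one at hi; the other n - 2 are free.
    smallest = ((n - 1) * lo + hi) / n  # all free values sit at lo
    largest = (lo + (n - 1) * hi) / n   # all free values sit at hi
    return smallest, largest


# Assumed limits (inferred, not given in the thread).
print(sample_mean_bounds(10, 2, 4))
print(sample_mean_bounds(12, 3, 6))
```

The two intervals overlap heavily, which is one way to see why the range limits alone cannot settle whether the samples differ.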



I don't know of any test for such limited data. I agree that you should not accept conclusions drawn from insufficient data such as you have described.
Angie B
2006-06-30 13:06:01 UTC
!!



I'm sorry but two years since my last business statistics class is enough for me to go "huh?"



But I still don't see how you can just assume the standard deviation. Wouldn't a test based on assumption give unreliable results?



Sorry if I don't know what I'm talking about.
2feEThigh
2006-06-30 13:00:23 UTC
You got me there. I can't remember anything from my statistics class; I took two semesters and remember nothing. Shame on me.
itsverystrange
2006-07-05 16:24:12 UTC
As to part one, I cannot recommend any acceptable, known test for such data (though what I've written below might work).



As to part two, this is a horrid approximation, since taking the midrange as the mean is a major assumption (even more far-reaching than assuming something about the populations' probability distributions).



As to part three, and also part one:



The first problem you run into is that you do not know anything about the nature of the population distribution (normal, uniform, etc.). Thus, a nonparametric test would be required. I know of no nonparametric tests for any of the statistics you have (minimum, maximum or range).



If it is possible to obtain the actual values for each of the samples in each set, the Kruskal-Wallis test should suffice (with only two samples it is equivalent to the Mann-Whitney rank-sum test). It is nonparametric, so it requires no assumptions about the form of the population distributions.



Further, its job is to determine if two samples are far enough apart such that the null hypothesis that the population distributions are the same can be rejected. This would give you what you desire.
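
If the raw values were available, the test could be run along these lines. This is only a sketch under stated assumptions: the function names are made up, the data values are invented purely for illustration (the thread gives no raw data), and the p-value uses the usual chi-square approximation with 1 degree of freedom, which is rough for samples this small.

```python
import math


def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_two_samples(a, b):
    """Return (H, approximate p-value) for two samples."""
    pooled = list(a) + list(b)
    n = len(pooled)
    ranks = average_ranks(pooled)
    r_a = sum(ranks[:len(a)])
    r_b = sum(ranks[len(a):])
    h = 12 / (n * (n + 1)) * (r_a ** 2 / len(a) + r_b ** 2 / len(b)) - 3 * (n + 1)
    # Standard correction for tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical")
    h = max(h / correction, 0.0)  # guard against tiny negative rounding error
    # Chi-square with 1 degree of freedom: P(X > h) = erfc(sqrt(h / 2)).
    return h, math.erfc(math.sqrt(h / 2))


# Invented data, for illustration only.
sample_a = [2.0, 2.4, 2.7, 3.0, 3.1, 3.3, 3.5, 3.6, 3.8, 4.0]
sample_b = [3.0, 3.9, 4.2, 4.5, 4.7, 5.0, 5.2, 5.4, 5.6, 5.8, 5.9, 6.0]
h_stat, p_value = kruskal_wallis_two_samples(sample_a, sample_b)
print(h_stat, p_value)
```

A small p-value would suggest rejecting the null hypothesis that the two population distributions are the same.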



As far as your problem is concerned, the only statistics you have are minimum, maximum and range. Perhaps one could derive a test to compare the minima or maxima of each set, but one still runs into the problem that nothing is known about the form or parameters of the population distribution.



If you can make an assumption about the distributions of the populations of interest (normal or whatnot), then it would be possible to derive such a test. This would be a tedious task, however, as it would require thorough knowledge of order statistics and distribution functions.



One would need a test statistic based on the difference between the two statistics that you use (I would use the maxima, since they are further apart than the minima, and probability distributions for ranges are uncharted territory unless you're a Ph.D.). Thus, your test statistic is the difference between the maxima, Y(n) - Z(n), where the (n) are subscripts; it is 2 in this case.



To perform this test, you would need to derive the probability distribution of the difference of the two maxima and use that to find the probability of the difference being two or greater. This is quite a task for anyone and requires a large amount of knowledge in mathematical statistics. I believe I could perform such a task, but it would likely take some time to complete.



Remember, though, that you must make a critical assumption about the population before you can even do this. I would recommend using the assumption of uniformity if you are unsure of the populations' distributions. This is because the derived test would then be least likely to give a type I error, where you reject the null hypothesis improperly, because the test is less likely to reject.
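
Short of deriving the distribution by hand, the probability described above can be approximated by simulation. The following is a rough sketch, not a validated test. It assumes both samples come from the same uniform population, and the function name, the population limits 2 and 6, and the pairing of n=12 with the larger maximum are all assumptions made for illustration; none of them are given in this thread.

```python
import random


def max_difference_p_value(n_a, n_b, observed_diff, low, high,
                           trials=20_000, seed=0):
    """Estimate P(max(A) - max(B) >= observed_diff) when A (size n_a) and
    B (size n_b) are both drawn from the same Uniform(low, high)."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        max_a = max(rng.uniform(low, high) for _ in range(n_a))
        max_b = max(rng.uniform(low, high) for _ in range(n_b))
        if max_a - max_b >= observed_diff:
            hits += 1
    return hits / trials


# Assumed setup: sizes 12 and 10, maxima differing by 2, and a
# hypothetical common population on the interval from 2 to 6.
p = max_difference_p_value(n_a=12, n_b=10, observed_diff=2, low=2, high=6)
print(p)
```

The estimate depends heavily on the assumed population limits, which is exactly the critical assumption mentioned above; changing low and high changes the answer.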


This content was originally posted on Y! Answers, a Q&A website that shut down in 2021.