The Importance and Effect of Sample Size - Select Statistical Consultants
The power of any statistical test is 1 - β. It depends on the level of significance (alpha), n (sample size), and the effect size (ES). Statistical tests look for evidence that you can reject the null hypothesis. The larger the sample size, the more information we have, and increasing the sample size can also give us greater power to detect differences.
There are lots of things that can affect how well our sample reflects the population and therefore how valid and reliable our conclusions will be. In this blog, we introduce some of the key concepts that should be considered when conducting a survey, including confidence levels and margins of error, power and effect sizes.
See the glossary below for some handy definitions of these terms. The size of our sample dictates the amount of information we have and therefore, in part, determines our precision or level of confidence that we have in our sample estimates.
An estimate always has an associated level of uncertainty, which depends upon the underlying variability of the data as well as the sample size. The more variable the population, the greater the uncertainty in our estimate.
Similarly, the larger the sample size the more information we have and so our uncertainty reduces. Suppose that we want to estimate the proportion of adults who own a smartphone in the UK.
We could take a sample of people and ask them. We can also construct an interval around our point estimate to express our uncertainty in it, i.e. a confidence interval.
In other words, if we were to collect many different samples from the population and construct a 95% interval from each, the interval would contain the true proportion approximately 95 out of 100 times. What would happen if we were to increase our sample size by going out and asking more people?
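As a rough illustration of such an interval, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. The counts used, 59 owners out of 100 people, are taken from the men/women breakdown given later in the text (25 + 34 owners among 50 + 50 people). The function name `wald_interval` and the choice of the Wald method are assumptions made for this example, not necessarily the method used for the figures in the original article.

```python
import math

def wald_interval(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z = 1.96 corresponds to a 95% confidence level.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# 59 of 100: the 25 men plus 34 women who own a smartphone in the
# breakdown given later in the text.
p_hat, low, high = wald_interval(59, 100)
print(f"estimate {p_hat:.2%}, 95% CI {low:.2%} to {high:.2%}")
```

The Wald interval is the simplest choice; for small samples or proportions near 0 or 1, other intervals (for example Wilson's) are usually preferred.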
Power and Sample Size
Suppose we ask more people and recompute our estimate from the enlarged sample. Our confidence interval for the estimate has now narrowed considerably. Because we have more data and therefore more information, our estimate is more precise.
Figure 1: As our sample size increases, the confidence in our estimate increases, our uncertainty decreases and we have greater precision.
This is clearly demonstrated by the narrowing of the confidence intervals in the figure above. If we took this to the limit and sampled our whole population of interest then we would obtain the true value that we are trying to estimate — the actual proportion of adults who own a smartphone in the UK and we would have no uncertainty in our estimate.
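The narrowing can be sketched numerically. The example below holds the sample proportion fixed at 59% (purely for illustration) and shows how the 95% margin of error shrinks as the sample grows. The sample sizes are hypothetical and are not the ones used in the original figure.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Half-width of a normal-approximation confidence interval."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.59                           # held fixed purely for illustration
sample_sizes = [100, 400, 1600, 6400]  # hypothetical sample sizes
margins = [margin_of_error(p_hat, n) for n in sample_sizes]

for n, m in zip(sample_sizes, margins):
    print(f"n = {n:>5}: estimate +/- {m:.2%}")
```

Because the margin is proportional to 1/sqrt(n), quadrupling the sample size only halves the width of the interval, so each extra gain in precision costs more than the last.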
Power Analysis, Statistical Significance, & Effect Size
Power and Effect Size
Increasing our sample size can also give us greater power to detect differences. Suppose in the example above that we were also interested in whether there is a difference in the proportion of men and women who own a smartphone. We can estimate the sample proportions for men and women separately and then calculate the difference.
When we sampled 100 people originally, suppose that these were made up of 50 men and 50 women, 25 and 34 of whom own a smartphone, respectively.
Sample size estimation and power analysis for clinical research studies
The difference between these two proportions (68% - 50% = 18 percentage points) is known as the observed effect size. Is this observed effect significant, given such a small sample from the population, or might the proportions for men and women be the same and the observed effect due merely to chance? Note that large sample sizes are needed in some cases.
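One common way to approach this question is a two-proportion z-test. The sketch below applies it to the counts above (25 of 50 men, 34 of 50 women). The function name `two_proportion_z_test` is hypothetical, and the pooled normal approximation is an assumed choice of test, not necessarily the one used in the original analysis.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for a difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # common proportion under the null
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p2 - p1, z, p_value

# Men: 25 of 50 own a smartphone; women: 34 of 50.
effect, z, p_value = two_proportion_z_test(25, 50, 34, 50)
print(f"observed effect = {effect:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out a little above 0.05, so with only 50 people per group an 18-point difference would not quite reach the conventional 5% significance threshold.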
An example comparing two means
A vet wants to compare the effect on blood pressure of two anesthetics for dogs under clinical conditions.
He has published some preliminary data. The dogs were unsexed healthy animals weighing 3. Mean systolic blood pressure was mm Hg with a standard deviation of 36 mm Hg (the noise). Assume: A difference in blood pressure of 20 mm Hg (the signal) or more would be of clinical importance (a clinical, not a statistical, decision).
A significance level of 0.
Note that great accuracy is not needed, as there are uncertainties in the estimates of the standard deviation and of the effect size of clinical importance. However, many statistical software packages will do the calculations.
Statistical power & sample size- Principles
The output below was produced using the R statistical package for this set of data. Note that the sample size needs to be rounded up to a whole number. Sixty-eight dogs per group (136 in total) is a lot of dogs, and using such animals would be time-consuming.
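As a rough cross-check of this kind of calculation, the sketch below uses the standard normal-approximation formula for comparing two means, n per group = 2 * ((z_alpha + z_beta) * sd / delta)^2. The signal (20 mm Hg) and noise (36 mm Hg) come from the text. The significance level and power are not recoverable from the text, so the values 0.05 (two-sided) and 0.90 are assumptions, and the function name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.90):
    """Approximate sample size per group for a two-sided, two-sample
    comparison of means (normal approximation).

    alpha and power default to assumed conventional values.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * ((z_alpha + z_beta) * sd / delta) ** 2

raw = n_per_group(delta=20, sd=36)   # signal and noise from the text
print(f"unrounded: {raw:.1f}, rounded up: {math.ceil(raw)} per group")
```

With these assumed inputs the approximation gives roughly 68 dogs per group before rounding up; a calculation based on the t-distribution gives a slightly larger figure. Because the required number scales with (sd / delta)^2, halving the standard deviation cuts the sample size to about a quarter, which is why less variable animals need far fewer subjects.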
An alternative
In the same journal an investigator was working with male Beagles weighing kg.
These had a mean BP of mm Hg. Assume a 20mm difference between groups would be of clinical importance as before. The table below summarises the situation. This poses a problem.
And is there ever any case for using genetically heterogeneous animals if all it does is increase noise and reduce the power of the experiment, leading to false negative results?
Alternative approaches
It would make no sense to go ahead and do the experiment simply using the heterogeneous dogs. But there are some obvious alternatives.
If each dog could be given both anaesthetics (say, in random order on different days), then it would be possible to use small numbers of even quite heterogeneous dogs, assuming that there are no important breed differences in response. Technically, this would be a randomised block design (discussed later).
As far as possible there should be equal numbers in each group. This would indicate whether the two anesthetics differ overall and whether breed differences need to be taken into account when choosing one of these anesthetics.
If lots of characters are being measured, it may not be clear which one is the most important. There may be no estimate of the standard deviation if the character has not previously been measured. In fundamental research it may be impossible to specify an effect size likely to be of scientific importance. A power analysis is difficult with complex experiments involving many treatment groups and possible interactions. This depends on the law of diminishing returns.