CHAPTER 10 Hypothesis Testing




We know that the values of many variables that occur in nature have a Gaussian or bell-shaped frequency distribution. The height of the curve is lowest at the tail ends, so those values do not occur very frequently. If we were to pick a value at random from this distribution, it would be very unlikely that the value would be from one of the ends of the distribution; it is much more likely to be from somewhere toward the middle.


In the same way, if an event takes place that has multiple possible outcomes, and the outcome variable has a normal frequency distribution, certain outcomes are more likely to occur than others. The probability that a specific outcome will occur by chance depends on where in the normal frequency distribution it is located. The more likely outcomes are toward the middle, with higher values on the ordinate. Those less likely to occur will be at the tail ends of the distribution. A value picked at random will be very likely to be from the center portion of the graph.
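To put rough numbers on "the middle" versus "the tail ends," here is a minimal sketch using Python's standard library. The standard normal distribution (mean 0, standard deviation 1) and the cutoffs of one and two standard deviations are assumptions chosen only for illustration; they do not come from the text above.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation (SD) 1.
# These parameters are illustrative assumptions, not values from the chapter.
z = NormalDist(mu=0.0, sigma=1.0)


def prob_between(dist, low, high):
    """Probability that a randomly drawn value lies between low and high."""
    return dist.cdf(high) - dist.cdf(low)


# Chance of picking a value from the middle of the curve (within 1 SD of the mean).
middle = prob_between(z, -1.0, 1.0)

# Chance of picking a value from either tail (more than 2 SD from the mean).
tails = 1.0 - prob_between(z, -2.0, 2.0)

print(f"P(within 1 SD of the mean)      = {middle:.3f}")
print(f"P(more than 2 SD from the mean) = {tails:.3f}")
```

A value from the center is many times more probable than a value from the tails, which is the property the rest of the chapter relies on.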


Let us see how we can apply the logic of probability using the normal distribution to decide how likely it is that a medical intervention makes a real difference. Inferential statistics gives us the means to answer questions such as: Is one pathway better than another, on average? More specifically, we can answer these types of questions:



These are representative of the types of questions posed in research studies. The questions look at different problems, but the overall approach to getting an answer is the same. What differs is the statistical test that is used, as shown by the flow diagrams in Appendices A and B. The specific test depends on the type of variable and the question being asked, but the logic behind every test follows the same standard approach. We will look at each of these situations in more detail by focusing on the type of statistical test. Before we delve into the individual situations, however, we first need to understand the format that is used when performing statistical tests.


When a researcher sets up a clinical trial, he or she follows a basic protocol. The steps are common to every study, even though the particular type of study may vary. They include the procedural steps as well as the steps of the statistical process that focus on hypothesis testing, which are in italics in the following list. We have discussed many of these concepts already, and are now ready to look at the statistical methodology.



A virtually infinite number of topics could potentially be studied. Not all research questions are answerable, however. Many questions are just too broad to be considered as a research hypothesis. Most major investigators will choose a topic in their field of expertise, so that they can pose questions that are particularly appropriate given the prior research leading up to what is currently known about a subject. They are able to identify the next logical step in the process that leads to a usable understanding of the knowledge. They are also aware of time and cost limitations when determining the feasibility of a research proposal. Ethical issues need to be considered, too. Many studies cannot be justified because they could be construed to take advantage of a certain group of people, especially those in lower socioeconomic classes or those who cannot give consent, such as a fetus.


The first step in research, then, is choosing an answerable question (keeping in mind the above constraints). Once that is done, the population to be studied is identified, as are the variables pertinent to the research hypothesis. The next step is to declare the null hypothesis, to see whether the data collected will support it. We cannot reach a conclusion unless we have gone this route.



THE NULL HYPOTHESIS AND ALTERNATIVE HYPOTHESIS


All statistical tests start out with the premise that the data are a result of chance variation. This hypothesis is called the null hypothesis, or H0. This is different from the research hypothesis, which is the question that the experiment was designed to answer. The null hypothesis is the first step in the actual process that is used in the statistical analysis.


In the first two types of research questions that were posed above, we are actually asking whether there is a difference between two or more groups with respect to the variable of interest, such as outcome. Another way of stating this is “Is it likely that the groups are different with respect to the intervention, or is the difference more likely due to chance?” The null hypothesis in this case would state that the treatment had no effect.


There will always be a difference when comparing the outcome variables of groups within a sample, because that is the nature of variables. They vary. So what we are really asking is “Does the difference of the outcome variable in the sample fall outside of the expected range when no difference really exists in the population?”
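The "expected range when no difference really exists" can be illustrated with a small simulation. The sketch below draws two groups from the same population many times and records the difference between the group means. The population mean (100), standard deviation (15), group size (30), number of trials (2000), and the choice of the middle 95% as the "expected range" are all hypothetical values chosen for illustration, not figures from the text.

```python
import random
from statistics import mean


def null_differences(n_per_group, n_trials, seed=0):
    """Simulate differences in group means when both groups come from
    the SAME population, so any difference is due to chance alone."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_trials):
        # Hypothetical population: mean 100, standard deviation 15.
        group_a = [rng.gauss(100.0, 15.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(100.0, 15.0) for _ in range(n_per_group)]
        diffs.append(mean(group_a) - mean(group_b))
    return diffs


diffs = sorted(null_differences(n_per_group=30, n_trials=2000))

# The middle 95% of the simulated differences: an assumed working
# definition of the "expected range" under no true difference.
low = diffs[int(0.025 * len(diffs))]
high = diffs[int(0.975 * len(diffs)) - 1]

print(f"Smallest and largest simulated difference: {diffs[0]:.2f}, {diffs[-1]:.2f}")
print(f"Middle 95% of chance differences: {low:.2f} to {high:.2f}")
```

None of the simulated differences is exactly zero, even though the two groups were drawn from one population. A difference observed in a real sample becomes interesting only when it falls outside the range that chance alone produces.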


Besides testing for differences between groups exposed to alternate pathways, we can also use biostatistics to test for relationships between variables. The questions that would be posed in these circumstances would be similar to the above questions 3 and 4. The way to answer this type of question is to ask “How probable is it that the relationship we observed was due solely to chance variation in the value of the variables?” In this case, the null hypothesis (H0) would state that there is no relationship between the variables in the population, and the statistical test used would test the likelihood that any apparent relationship was due to chance occurrence in the sample.
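One way to ask how probable an observed relationship is under chance alone is a permutation approach: shuffle one variable so that any real link to the other is broken, and count how often the shuffled data show a relationship at least as strong as the one observed. This is a sketch of one possible technique, not necessarily the test used later in this book. The age and blood pressure values are made up for illustration, and `pearson_r` and `permutation_p_value` are hypothetical helper names.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y, n_shuffles=5000, seed=0):
    """Fraction of random shuffles of y whose correlation with x is at
    least as strong (in absolute value) as the observed correlation."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks any real pairing between x and y
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles


# Made-up data for illustration only: age (years) and systolic pressure (mm Hg).
age = [25, 32, 41, 47, 53, 58, 64, 70]
pressure = [118, 121, 119, 130, 128, 135, 139, 142]

r = pearson_r(age, pressure)
p = permutation_p_value(age, pressure)
print(f"Observed correlation r = {r:.2f}")
print(f"Share of shuffles with a relationship at least this strong: {p:.4f}")
```

If only a tiny share of the shuffles reproduces a relationship as strong as the observed one, chance alone is a poor explanation for it.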



In theory, any result we get could be due to chance variation. Assuming that the result we observe is truly due to chance, we would like to know how probable it is that this could happen. If the result we obtained would be highly improbable when there is no true difference (or relationship) between groups, we will reject the null hypothesis as being too outlandish. Statistical methods used to test the null hypothesis are termed tests of significance, which we will talk about more in the next chapter.
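As a minimal sketch of "how probable it is that this could happen," suppose that chance differences between two groups follow a normal distribution centered on zero with a known standard error. The probability of a difference at least as large as the one observed, in either direction, can then be read from the normal curve. The observed difference (9.0) and standard error (3.9) below are hypothetical numbers, and `two_sided_p_value` is an illustrative helper name, not a function from any statistics package.

```python
from statistics import NormalDist


def two_sided_p_value(observed_diff, standard_error):
    """Probability, if chance alone is at work, of a difference at least
    as far from zero as the observed one (in either direction).

    Assumes chance differences are normally distributed around zero
    with the given standard error.
    """
    z = observed_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical values for illustration only.
observed_diff = 9.0
standard_error = 3.9

p = two_sided_p_value(observed_diff, standard_error)
print(f"Probability of a difference this large by chance: {p:.4f}")
```

The smaller this probability, the harder it is to keep believing that no true difference exists. How small it must be before the null hypothesis is rejected is a matter of convention.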


Why do we use a null hypothesis? It may seem like backward logic to assume that the outcomes are not dependent upon an intervention, and then to try to prove yourself wrong. Why not assume that there is a true difference and try to prove yourself right? The philosophical reasoning behind this approach was proposed by Fisher, who developed the concept of hypothesis testing in the early 1900s. Simply put, his argument states that it is easier to reject a statement as false, by finding data that do not support it, than to prove it true. To accept that something is always true, one must account for every instance in which the statement could be true. We would have to test multiple hypotheses, looking at every possible numerical relationship that could exist, which is an impossible task. It is therefore much more productive to form a statistical hypothesis of no difference or no relationship and then see if the data support or go against this hypothesis. Without delving into the specifics of the precise argument in favor of the null hypothesis, it is important to know that it is a starting point for virtually any statistical test. The logic of applied biostatistics is based on the null hypothesis, and it is widely accepted as the foundation of statistical testing.


It is customary to state an alternative hypothesis, which would be accepted if the null hypothesis were shown to be highly unlikely. In this case, we would reject the null hypothesis of no difference (or relationship) and adopt the alternative hypothesis that a treatment difference (or relationship among variables) does indeed exist. The typical symbol for the alternative hypothesis is Ha. Occasionally the symbol H1 will be used. The specific alternative hypothesis depends on the question being asked. We will see examples of this in the following chapters.

Jun 18, 2016 | Posted in BIOCHEMISTRY