Categorical data: a single proportion


The Problem


We have a single sample of n individuals; each individual either ‘possesses’ a characteristic of interest (e.g. is male, is pregnant, has died) or does not possess that characteristic (e.g. is female, is not pregnant, is still alive). A useful summary of the data is provided by the proportion of individuals with the characteristic. We are interested in determining whether the true proportion in the population of interest takes a particular value.


The Test of a Single Proportion


Assumptions


Our sample of individuals is selected from the population of interest. Each individual either has or does not have the particular characteristic.


Notation


r individuals in our sample of size n have the characteristic. The estimated proportion with the characteristic is p = r/n. The proportion of individuals with the characteristic in the population is π. We are interested in determining whether π takes a particular value, π1.


Rationale


The number of individuals with the characteristic follows the Binomial distribution (Chapter 8), but this can be approximated by the Normal distribution, provided that np and n(1 − p) are each greater than 5.


Then p is approximately Normally distributed with:


an estimated mean = p and

an estimated standard deviation = $\sqrt{p(1 - p)/n}$


Therefore, our test statistic, which is based on p, also follows the Normal distribution.
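The rationale above can be illustrated with a minimal sketch. It checks the rule of thumb that np and n(1 − p) each exceed 5, and computes the usual estimated standard deviation of p, sqrt(p(1 − p)/n). The function names and the example counts (r = 30, n = 100) are hypothetical and are not taken from the chapter.

```python
import math


def normal_approx_ok(r, n):
    """Return True if np and n(1 - p) are each greater than 5."""
    p = r / n  # estimated proportion with the characteristic
    return n * p > 5 and n * (1 - p) > 5


def estimated_sd(r, n):
    """Estimated standard deviation of p: sqrt(p(1 - p) / n)."""
    p = r / n
    return math.sqrt(p * (1 - p) / n)


# Hypothetical data: 30 of 100 individuals have the characteristic.
r, n = 30, 100
print("Normal approximation reasonable:", normal_approx_ok(r, n))
print("Estimated SD of p:", round(estimated_sd(r, n), 4))
```

With very few individuals in either category (for example r = 2 with n = 100), the check fails and the Normal approximation should not be relied upon.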









1 Define the null and alternative hypotheses under study
H0: the population proportion, π, is equal to a particular value, π1

H1: the population proportion, π, is not equal to π1.

2 Collect relevant data from a sample of individuals

3 Calculate the value of the test statistic specific to H0

\[ z = \frac{|p - \pi_1| - \frac{1}{2n}}{\sqrt{\frac{\pi_1(1 - \pi_1)}{n}}} \]

which follows a Normal distribution.
The 1/(2n) term in the numerator is a continuity correction: it makes an allowance for the fact that we are approximating the discrete Binomial distribution by the continuous Normal distribution.

4 Compare the value of the test statistic to values from a known probability distribution
Refer z to Appendix A1.

5 Interpret the P-value and results
Interpret the P-value and calculate a confidence interval for the true population proportion, π. The 95% confidence interval for π is approximated by

\[ p \pm 1.96 \sqrt{\frac{p(1 - p)}{n}} \]


We can use this confidence interval to assess the clinical or biological importance of the results. A wide confidence interval is an indication that our estimate has poor precision.
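Steps 3 to 5 can be sketched in code as follows. This is an illustration under stated assumptions rather than a definitive implementation: the test statistic is taken to be the continuity-corrected form, (|p − π1| − 1/(2n)) divided by sqrt(π1(1 − π1)/n); the two-sided P-value is obtained from the standard Normal distribution instead of a printed table; and the 95% confidence interval is p ± 1.96 × sqrt(p(1 − p)/n). The function name and the example data (r = 30, n = 100, π1 = 0.5) are hypothetical.

```python
import math


def single_proportion_test(r, n, pi1, z_crit=1.96):
    """Continuity-corrected z-test of H0: pi = pi1, with an approximate CI.

    Returns (z, two-sided P-value, (lower, upper) confidence limits).
    """
    p = r / n  # estimated proportion

    # Standard deviation of p under the null hypothesis (uses pi1, not p).
    sd_null = math.sqrt(pi1 * (1 - pi1) / n)

    # Test statistic with the 1/(2n) continuity correction.
    z = (abs(p - pi1) - 1 / (2 * n)) / sd_null

    # Two-sided P-value from the standard Normal: P(|Z| > z) = erfc(z / sqrt(2)).
    # Assumption: capped at 1 for the case where the correction makes z negative.
    p_value = min(1.0, math.erfc(z / math.sqrt(2)))

    # Approximate confidence interval based on the estimated SD of p.
    sd_est = math.sqrt(p * (1 - p) / n)
    ci = (p - z_crit * sd_est, p + z_crit * sd_est)
    return z, p_value, ci


# Hypothetical data: 30 of 100 individuals have the characteristic; H0: pi = 0.5.
z, p_value, ci = single_proportion_test(r=30, n=100, pi1=0.5)
print(f"z = {z:.2f}, P = {p_value:.5f}")
print(f"95% CI for pi: {ci[0]:.3f} to {ci[1]:.3f}")
```

Note that the sketch does not truncate the confidence limits to the range 0 to 1; with proportions close to 0 or 1 the Normal approximation is itself questionable, as the rationale indicates.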

May 9, 2017 | Posted in GENERAL & FAMILY MEDICINE