24. Biostatistics

“Lies, damned lies, and statistics with biases.”

24.1 Incidence and Prevalence

Incidence
• Definition: number of “new” cases/number of people at risk of acquiring that disease, during a specified period of time
• Use: acute cases, e.g., incidence of Zika virus infection in August
• Gives an idea of: risk of getting the disease

Prevalence
• Definition: total number of cases (new + old)/whole population
  Over a specified period of time = period prevalence
  At any particular point in time = point prevalence
• Use: chronic cases only,a e.g., prevalence of diabetes in United States
• Gives an idea of: how widespread the disease is

a It does not make sense to calculate prevalence of acute disease.

Attack rate

• Attack rate is a special type of incidence rate helpful in outbreaks.

• Attack rate = number of new cases in the population at risk/number of people at risk.

• For example, if 100 people ate at a Chinese buffet and 25 of them developed diarrhea, the attack rate is 25/100 = 25%.

24.1.2 What May Affect this Pool?

• Cause: improved diagnostic tool
  Effect: increases prevalence as well as incidence
• Cause: a new treatment that controls the disease better, or improved care for chronic diseases
  Effect: over time, prevalence increases with an increase in survival
• Cause: a new treatment that cures the disease
  Effect: decreases prevalence by increasing cure rate

Clinical Case Scenarios

At the beginning of 2016, 20 out of 100 people in a small town had diabetes. By the end of 2016, 10 more people were diagnosed with diabetes.

1. What is the incidence of DM in 2016 in the small town?

2. What is the prevalence at the end of 2016?
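The definitions above can be applied to this scenario with a few lines of arithmetic. The following is a minimal sketch, not a definitive solution: it assumes that nobody died or moved in or out of the town during 2016, and that the 20 people who already had diabetes are not at risk of becoming "new" cases. The function and variable names are illustrative only.

```python
def incidence(new_cases: int, population_at_risk: int) -> float:
    """New cases divided by the number of people at risk during the period."""
    return new_cases / population_at_risk


def prevalence(total_cases: int, population: int) -> float:
    """All cases (new + old) divided by the whole population."""
    return total_cases / population


population = 100       # whole town
existing_cases = 20    # diabetes at the beginning of 2016
new_cases = 10         # diagnosed during 2016

# Assumption: people who already have diabetes cannot become new cases,
# so they are removed from the denominator for incidence.
at_risk = population - existing_cases

incidence_2016 = incidence(new_cases, at_risk)
prevalence_end_2016 = prevalence(existing_cases + new_cases, population)

print(f"Incidence in 2016: {incidence_2016:.1%}")
print(f"Prevalence at end of 2016: {prevalence_end_2016:.1%}")
```

Under these assumptions the incidence denominator is 80 rather than 100, which is the usual trap in this type of question.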

24.2 Defining Characteristics of a New Diagnostic Test

Reliability, validity, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy.

Reliability (how reliable is this test?)
• Test should give the same values “REPEATEDLY.”
• For example, a new finger-stick device, created to measure low-density lipoprotein (LDL), gives values of 100, 123, and 145, when tested repeatedly on the same patient on the same day. This test is not reliable.

Validity (how valid is this test?)
• Test result is compared with a gold standard test.
• After some adjustment, the test reliably gives LDL values of 123 mg/dL in the same patient. However, when LDL is checked with the “gold standard method of LDL testing” it comes back as 150 mg/dL for the same patient. This test has an issue with validity.

24.2.1 Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV)

Results of a new diagnostic test

                  Patients with true diseasea    Patients without diseasea
Test positive     True positive (TP)             False positive (FP)
Test negative     False negative (FN)            True negative (TN)

• PPVb = TP/(TP + FP) = TP/(total number of patients with positive result)
• NPVc = TN/(TN + FN) = TN/(total number of patients with negative result)
• Sensitivityd = TP/(TP + FN) = TP/(total number of patients with disease)
• Specificitye = TN/(TN + FP) = TN/(total number of patients without disease)

a Typically, new diagnostic tests are compared with gold standard tests, which have the highest sensitivity and specificity.
b Test questions commonly ask for PPV in this manner: “If the test is positive, how likely is it that the patient truly has the disease?”
c If the test is negative, how likely is it that the patient does not have the disease?
d MRS: Truly sick people are sensitive. So, sensitivity is a ratio that involves only the patients with the disease (true positive and false negative).
e MRS: Truly healthy minds are very specific in what they want to do.

PPV and NPV: statistical measures of a test’s ability to identify truly positive (PPV) and truly negative (NPV) states.
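The four ratios in the two-by-two table can be sketched as a small helper. This is an illustration only: the class name `TwoByTwo` and the example counts are hypothetical, not taken from any study. Accuracy is not defined in the table; the sketch uses the common definition (TP + TN)/(all patients), which is an assumption about what the chapter intends.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from comparing a new test with a gold standard."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    def sensitivity(self) -> float:
        # Only patients WITH disease: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Only patients WITHOUT disease: TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        # Only patients with a POSITIVE result: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        # Only patients with a NEGATIVE result: TN / (TN + FN)
        return self.tn / (self.tn + self.fn)

    def accuracy(self) -> float:
        # Assumed definition: all correct results over all patients
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


# Hypothetical counts, chosen only to illustrate the arithmetic
table = TwoByTwo(tp=80, fp=30, fn=20, tn=270)

print(f"Sensitivity: {table.sensitivity():.1%}")
print(f"Specificity: {table.specificity():.1%}")
print(f"PPV:         {table.ppv():.1%}")
print(f"NPV:         {table.npv():.1%}")
print(f"Accuracy:    {table.accuracy():.1%}")
```

In every ratio the numerator is a "true" count and the denominator adds the matching "false" count, which is the pattern the mnemonic for these calculations relies on.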

Tests with high sensitivity
• Identifies all patients with the disease, and if negative, can virtually rule out the diagnosis.
• Good for: screening purposes; ruling out the diagnosis
• Example test: Antinuclear antibody (ANA) has high sensitivity for SLE (systemic lupus erythematosus), so if ANA is negative SLE is very unlikely. But, if ANA is positive it does not confirm the dx, as healthy old people might have positive ANA, but may not have underlying connective tissue disease.

Tests with high specificity
• If the test is positive, it is likely to be truly positive.
• Good for: confirming the diagnosis
• Example test: Anti-dsDNA or anti-Smith antibodies are tests with high specificity for SLE, which means if they are positive, SLE is very likely. However, if negative there is still a possibility of SLE, as these tests have low sensitivity.

MRS

Reliably repeatedly gives the same value.

MRS

Validity is checked by comparing it with a gold standard.

MRS

For all the following calculations, the “TRUE” value is always the numerator, and the denominator is the sum of “TRUE” and “FALSE” values.

Clinical Case Scenarios

To answer most of the following questions, we recommend drawing the two-by-two table shown earlier in this section. Practice creating the table again and again until you are very good at it.

1. A test developed for diagnosing lung cancer has a sensitivity of 80%. There are 100 patients who actually have lung cancer in the study and 300 who do not. Calculate the total number of false negatives.

2. A 60-year-old man requests fecal occult blood (FOB) testing for colon cancer screening as he does not want to do colonoscopy. He reports that he has been eating barbecued meat, processed food, and bacon at least two times per day all his life. A detailed artificial intelligence software analysis reveals that patients with similar baseline characteristics have a 10% prevalence of colon cancer. FOB comes back positive. Studies have reported that FOB has a sensitivity of 80% and a specificity of 70%. What is the likelihood that this patient really has colon cancer?

1. 82%

2. 46%

3. 23%

4. 55%

5. 96%

6. 81%

1. If FOB had come back negative, what is the likelihood that the patient does not have colon cancer?

2. What is the accuracy of FOB in patients with these baseline characteristics?

3. What happens to NPV of a test as prevalence increases?

4. When a test with 12% NPV comes back negative, what are the chances that the patient truly has the disease?
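Questions of this kind can be worked by filling the two-by-two table from prevalence, sensitivity, and specificity. The sketch below does this for the lung cancer and FOB scenarios above. It is one way of setting up the arithmetic, not an official answer key. The function name `predictive_values` and the list of prevalences used in the final loop are illustrative assumptions, and accuracy is taken to mean (TP + TN)/(all patients).

```python
def predictive_values(prevalence: float, sensitivity: float, specificity: float):
    """Return (PPV, NPV, accuracy) for a test applied at a given prevalence.

    Each cell is expressed as a fraction of the whole tested population.
    """
    tp = prevalence * sensitivity              # diseased, test positive
    fn = prevalence * (1 - sensitivity)        # diseased, test negative
    tn = (1 - prevalence) * specificity        # healthy, test negative
    fp = (1 - prevalence) * (1 - specificity)  # healthy, test positive

    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    accuracy = tp + tn  # assumed definition: all correct results / everyone
    return ppv, npv, accuracy


# Lung cancer scenario: 100 diseased patients, sensitivity 80%
false_negatives = round(100 * (1 - 0.80))
print(f"False negatives: {false_negatives}")

# FOB scenario: prevalence 10%, sensitivity 80%, specificity 70%
ppv, npv, accuracy = predictive_values(0.10, 0.80, 0.70)
print(f"PPV: {ppv:.1%}  NPV: {npv:.1%}  Accuracy: {accuracy:.1%}")

# Same test at several hypothetical prevalences: NPV falls, PPV rises
npv_by_prevalence = []
for p in (0.05, 0.10, 0.30, 0.50):
    p_ppv, p_npv, _ = predictive_values(p, 0.80, 0.70)
    npv_by_prevalence.append(p_npv)
    print(f"prevalence {p:.0%}: PPV {p_ppv:.1%}, NPV {p_npv:.1%}")
```

For the last question, the chance that a patient with a negative result truly has the disease is simply 1 − NPV, because NPV is the fraction of negative results that are truly negative.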

24.2.2 Effects of Using Different Cut-Off Values

Let’s say that we have created a diagnostic test whose result is a numerical value. When we plot the frequency distribution of results obtained from the general population, it will generally produce a bell-shaped (symmetric) distribution curve. Examples of such tests include fasting blood glucose, HbA1C, serum cholesterol, etc.

In the following graphs, the fasting blood sugar cut-off is set at 126 mg/dL: ≥ 126 mg/dL is diabetes and < 126 mg/dL is not diabetes.

We can use different cut-offs for the test to serve different purposes, but we need to know what happens to test characteristics when we change the cut-off:

In Fig. 24.1, what happens if the cut-off is shifted?

From X to A (left shift): e.g., the cut-off value of fasting blood sugar to diagnose diabetes is decreased from 126 to 100 mg/dL
• Sensitivity and negative predictive value: increase
• Specificity and positive predictive value: decrease

From X to B (right shift): e.g., the cut-off value of fasting blood sugar to diagnose diabetes is increased from 126 to 140 mg/dL
• Sensitivity and negative predictive value: decrease
• Specificity and positive predictive value: increase

24.2.3 The ROC Curve (Receiver Operating Characteristic Curve)

• In this graph, the true positive rate (TP/(TP + FN) = sensitivity) is plotted against the false positive rate (100 − specificity).

• The best test is plotted in the upper left corner (100% sensitivity and 100% specificity). Note that the horizontal axis is not specificity but 100 − specificity.

• Accuracy of the test is measured by the area under the curve. The closer the ROC curve is to the upper left corner, the higher is the overall accuracy of the test.

• In the above example the red-line test has better accuracy than the dotted-line test.
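An ROC curve can be traced by sweeping the cut-off across the range of test values and recording one (false positive rate, true positive rate) point per cut-off. The sketch below does this for two hypothetical tests and estimates the area under each curve with the trapezoid rule. The distributions are invented for illustration, rates are expressed as fractions (0 to 1) rather than percentages, and the function names are not from any library.

```python
from statistics import NormalDist


def roc_points(healthy: NormalDist, diseased: NormalDist, cutoffs) -> list:
    """One (false positive rate, true positive rate) pair per cut-off."""
    points = []
    for cutoff in cutoffs:
        tpr = 1 - diseased.cdf(cutoff)  # sensitivity
        fpr = 1 - healthy.cdf(cutoff)   # 1 - specificity
        points.append((fpr, tpr))
    return sorted(points)               # left to right along the x-axis


def area_under_curve(points: list) -> float:
    """Trapezoid-rule estimate of the area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


cutoffs = range(0, 401, 2)  # sweep wide enough to reach both corners

# Hypothetical tests: the first separates the two groups well, the second poorly
healthy = NormalDist(mu=95, sigma=12)
good_test = roc_points(healthy, NormalDist(mu=150, sigma=25), cutoffs)
weak_test = roc_points(healthy, NormalDist(mu=110, sigma=25), cutoffs)

auc_good = area_under_curve(good_test)
auc_weak = area_under_curve(weak_test)

print(f"AUC, well-separated test:   {auc_good:.3f}")
print(f"AUC, poorly separated test: {auc_weak:.3f}")
```

The test whose curve hugs the upper left corner has the larger area. A test no better than chance would follow the diagonal and give an area of about 0.5.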

Clinical Case Scenarios

9. What is the specificity/sensitivity of points A, B, and C in the following ROC curve?

MRS

SEN = Sensitivity moves with the Negative predictive value. When you lower the cut-off point, you include more patients with the disease, so the false negative rate decreases. So, when the test is negative, it is more likely to be truly negative. But as we try to increase the sensitivity of the test, we lose specificity.

SPE = Specificity moves with the Positive predictive value. When you increase the cut-off point, you make sure that you only get patients with the disease, so the false positive rate decreases. So, when the test is positive, it is more likely to be truly positive. But as we try to increase the specificity of the test, we lose sensitivity.

24.3 Different Types of Statistical Studies

Study designs

Observational studies: case report, cross-sectional, cohort, case-control
Higher standard studies: controlled trial, systematic review, meta-analysis

Timing and/or type
• Case report: anecdotal
• Cross-sectional: usually surveys; snapshot of present time
• Cohort: prospective or retrospective
• Case-control: retrospective
• Controlled trial: interventional
• Systematic review and meta-analysis: overview of literature to date

Study design and purpose
• Case report: rare disease reporting; case history with illustrative images, treatment, and follow-up
• Cross-sectional: thorough surveys to identify current health problems; typically measures prevalence and can also identify associations
• Cohort: starts with risk factor and identifies diseases. Two cohorts: one group that is exposed and the other group that is not (both cohorts are considered to be free of a given disease). Identifies various effects due to risk-factor exposure and natural history
• Case-control: starts with disease and identifies risk factors. Case = with disease; control = without disease. Identifies risk factor/s, or source/s of exposure
• Controlled trial: comparison of interventions. Is the intervention effective and what are adverse effects?
• Systematic review: compilation of studies
• Meta-analysis: statistical analyses of data gathered from systematic review. (All meta-analyses are derived from systematic review but not all systematic reviews are meta-analyses.)

Question posed
• Case report: What happened to this person?
• Cross-sectional: What is happening now?
• Cohort: What will happen when you have certain risk factors or exposure? (prospective cohort study)
• Case-control: What happened?
• Controlled trial: Which is better?
• Systematic review: What has been published so far?
• Meta-analysis: How can we merge all the data (gathered from previous studies) into a single data set?

Examples
• Case report: a case report of a newly reported autoimmune condition that does not fit into other known diagnosis
• Cross-sectional: How many people have diabetes and how many of them are obese and eat refined carbohydrates?
• Cohort: Framingham Trial; hormonal replacement study of Women’s Health Initiative
• Case-control: a cholera outbreak in United States was identified to be due to mangoes from Mexico
• Controlled trial: randomized, double-blind placebo-controlled study (this design is considered to be the highest quality of evidence)

Weaknesses (see below in the bias section of this chapter)
• Case report: it is not conclusive, but helps in hypothesis generation
• Cross-sectional: cannot determine cause and effect relationship
• Cohort: attrition (loss of study participants) or migration; selection bias is built in; confounding bias
• Case-control: recall bias; confounding bias
• Controlled trial: usually expensive and time-consuming
• Systematic review and meta-analysis: publication bias (studies have shown that statistically significant papers are more likely to be put up for publishing than nonsignificant ones); selection bias (publishing only the outcomes with statistically significant results)