24. Biostatistics

“Lies, damned lies, and statistics with biases.”

24.1 Incidence and Prevalence

	Incidence	Prevalence
Definition	Number of “new” cases/number of people at risk of acquiring the disease, during a specified period of time	Total number of cases (new + old)/whole population. Over a specified period of time = period prevalence; at a particular point in time = point prevalence
Use	Acute cases, e.g., incidence of Zika virus infection in August	Chronic cases only,^a e.g., prevalence of diabetes in the United States
Gives an idea of	Risk of getting the disease	How widespread the disease is
^aIt does not make sense to calculate the prevalence of an acute disease.

Attack rate

Attack rate is a special type of incidence rate that is helpful in outbreaks.
Attack rate = number of new cases in the population at risk/number of people at risk.
For example, if 100 people ate at a Chinese buffet and 25 of them got diarrhea, the attack rate is 25/100 = 25%.
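The buffet example can be checked with a few lines of Python. This is a minimal sketch; the function name `attack_rate` and its parameter names are chosen here for illustration only.

```python
def attack_rate(new_cases: int, population_at_risk: int) -> float:
    """Return the attack rate as a fraction of the population at risk."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases / population_at_risk


# Buffet example from the text: 25 of 100 diners got diarrhea.
buffet_rate = attack_rate(new_cases=25, population_at_risk=100)
print(f"Attack rate: {buffet_rate:.0%}")  # Attack rate: 25%
```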

24.1.1 The Incidence to Prevalence Pool for Chronic Diseases

24.1.2 What May Affect this Pool?

Cause	Effect
Improved diagnostic tool	Increases prevalence as well as incidence
A new treatment that controls the disease better, or improved care for chronic diseases	Prevalence increases over time as survival increases
A new treatment that cures the disease	Decreases prevalence by increasing cure rate

Clinical Case Scenarios

In the beginning of 2016, 20 out of 100 people in a small town had diabetes. By the end of 2016, 10 more people were diagnosed with diabetes.

What is the incidence of DM in 2016 in the small town?
What is the prevalence at the end of 2016?
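One way to work through this scenario is sketched below, using the definitions from the table in 24.1. It assumes that people who already have diabetes are no longer at risk of a new diagnosis, and that nobody died, was cured, or moved during 2016; neither assumption is stated in the scenario. Variable names are illustrative.

```python
# Small-town diabetes scenario, taken from the text.
population = 100
existing_cases = 20   # had diabetes at the beginning of 2016
new_cases = 10        # diagnosed during 2016

# Assumption: only people without diabetes at the start are at risk.
at_risk = population - existing_cases

# Incidence = new cases / people at risk during the period.
incidence_2016 = new_cases / at_risk

# Prevalence = all cases (old + new) / whole population.
# Assumption: the population is still 100 at the end of the year.
prevalence_end_2016 = (existing_cases + new_cases) / population

print(f"Incidence in 2016: {incidence_2016:.1%}")               # 12.5%
print(f"Prevalence at end of 2016: {prevalence_end_2016:.0%}")  # 30%
```

Note that the denominator for incidence is 80, not 100, because the 20 people who already have the disease cannot become "new" cases.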

24.2 Defining Characteristics of a New Diagnostic Test

¹Reliability, validity, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), accuracy.

Reliability

(how reliable is this test?)

Test should give the same values “REPEATEDLY.”

For example, a new finger-stick device, created to measure low-density lipoprotein (LDL), gives values of 100, 123, and 145, when tested repeatedly on the same patient on the same day. This test is not reliable.

Validity

(how valid is this test?)

Test result is compared with a gold standard test.

After some adjustment, the test reliably gives LDL values of 123 mg/dL in the same patient. However, when LDL is checked with the “gold standard method of LDL testing” it comes back as 150 mg/dL for the same patient. This test has an issue with validity.
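The two LDL examples can be put into numbers with a short sketch. It uses the standard deviation of repeated readings as a rough stand-in for reliability, and the difference from the gold standard value as a rough stand-in for validity. These two summary measures are our choice for illustration; the text does not prescribe a particular statistic. The three repeated readings of 123 mg/dL after adjustment are also an assumption.

```python
from statistics import mean, stdev

# Before adjustment: repeated readings on the same patient, same day.
readings_before = [100, 123, 145]
# After adjustment: assumed three repeats, all reading 123 mg/dL.
readings_after = [123, 123, 123]
gold_standard_ldl = 150  # mg/dL, from the gold standard method

# Reliability: how much do repeated readings scatter?
spread_before = stdev(readings_before)  # large spread: not reliable
spread_after = stdev(readings_after)    # no spread: reliable

# Validity: how far is the (now reliable) result from the gold standard?
offset_after = mean(readings_after) - gold_standard_ldl

print(f"Spread before adjustment: {spread_before:.1f} mg/dL")
print(f"Spread after adjustment:  {spread_after:.1f} mg/dL")
print(f"Offset from gold standard: {offset_after} mg/dL")
```

The adjusted device is reliable (no spread) but still reads 27 mg/dL below the gold standard, which is the validity problem described above.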

24.2.1 Sensitivity, Specificity, Positive Predictive Value (PPV), and Negative Predictive Value (NPV)

		Patients with true disease^a	Patients without disease^a
Results of a new diagnostic test	Positive	True positive (TP)	False positive (FP)	TP/(TP + FP) = TP/(total number of patients with positive result) = PPV^b
Results of a new diagnostic test	Negative	False negative (FN)	True negative (TN)	TN/(TN + FN) = TN/(total number of patients with negative result) = NPV^c
		TP/(TP + FN) = TP/(total number of patients with disease) = sensitivity^d	TN/(TN + FP) = TN/(total number of patients without disease) = specificity^e
^aTypically, new diagnostic tests are compared with gold standard tests, which have the highest sensitivity and specificity.
^bTest questions commonly ask about PPV in this manner: “If the test is positive, how likely is it that the patient truly has the disease?”²
^cTest questions commonly ask about NPV in this manner: “If the test is negative, how likely is it that the patient does not have the disease?”²
²PPV and NPV: statistical measures of a test’s ability to identify truly positive (PPV) and truly negative (NPV) states.
^dMRS: Truly sick people are sensitive. So, sensitivity is a ratio that involves only the patients with the disease (true positive and false negative).
^eMRS: Truly healthy minds are very specific in what they want to do.
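The four formulas in the two-by-two table can be written out as a small Python sketch. The class name `TwoByTwo` and the example counts are hypothetical and chosen only to illustrate the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a new test against the gold standard."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives

    @property
    def sensitivity(self) -> float:
        # Only patients WITH the disease: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Only patients WITHOUT the disease: TN / (TN + FP)
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        # Only patients with a POSITIVE result: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self) -> float:
        # Only patients with a NEGATIVE result: TN / (TN + FN)
        return self.tn / (self.tn + self.fn)


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=90, fp=30, fn=10, tn=170)
print(f"Sensitivity: {table.sensitivity:.1%}")
print(f"Specificity: {table.specificity:.1%}")
print(f"PPV: {table.ppv:.1%}")
print(f"NPV: {table.npv:.1%}")
```

In every formula the "true" count is the numerator, and the denominator is that count plus the "false" count from the same column (sensitivity, specificity) or the same row (PPV, NPV).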

Tests with	High sensitivity	High specificity
	Identifies nearly all patients with the disease; if negative, it can virtually rule out the diagnosis.	If the test is positive, it is likely to be truly positive.
Good for	Screening purposes; ruling out the diagnosis	Confirming the diagnosis
Example test	Antinuclear antibody (ANA) has high sensitivity for SLE (systemic lupus erythematosus), so if ANA is negative, SLE is very unlikely. However, a positive ANA does not confirm the diagnosis, as healthy older people may have a positive ANA without any underlying connective tissue disease.	Anti-dsDNA or anti-Smith antibodies are tests with high specificity for SLE, which means that if they are positive, SLE is very likely. However, if they are negative there is still a possibility of SLE, as these tests have low sensitivity.

MRS

A reliable test repeatedly gives the same value.

MRS

Validity is checked by comparing the test with a gold standard.

MRS

For all the following calculations, the “TRUE” value is always the numerator, and the denominator is the sum of “TRUE” and “FALSE” values.

Clinical Case Scenarios

To answer most of the following questions, we recommend drawing the two-by-two table given above. Practice creating the table again and again until you are very good at it.

A test developed for diagnosing lung cancer has a sensitivity of 80%. There are 100 patients who actually have lung cancer in the study and 300 who do not. Calculate the total number of false negatives.
A 60-year-old man requests fecal occult blood (FOB) testing for colon cancer screening as he does not want to do colonoscopy. He reports that he has been eating barbecued meat, processed food, and bacon at least two times per day all his life. A detailed artificial intelligence software analysis reveals that patients with similar baseline characteristics have a 10% prevalence of colon cancer. FOB comes back positive. Studies have reported that FOB has a sensitivity of 80% and a specificity of 70%. What is the likelihood that this patient really has colon cancer?

If FOB had come back negative, what is the likelihood that the patient does not have colon cancer?
What is the accuracy of FOB in patients with these baseline characteristics?
What happens to NPV of a test as prevalence increases?
When a test with 12% NPV comes back negative, what are the chances that the patient truly has the disease?
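The scenarios above can be worked through by filling in the two-by-two table for a hypothetical group of patients, as sketched below. The group size of 1,000 and the helper names `expected_counts` and `npv_at` are assumptions for illustration. Accuracy is taken here as (TP + TN)/all patients, the usual definition, which this excerpt does not spell out.

```python
def expected_counts(prevalence, sensitivity, specificity, n=1000):
    """Fill a two-by-two table for a hypothetical group of n patients."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sensitivity  # diseased patients who test positive
    fn = diseased - tp           # diseased patients who test negative
    tn = healthy * specificity   # healthy patients who test negative
    fp = healthy - tn            # healthy patients who test positive
    return tp, fp, fn, tn


def npv_at(prevalence, sensitivity=0.80, specificity=0.70):
    """NPV of the same test at a given prevalence."""
    _, _, fn, tn = expected_counts(prevalence, sensitivity, specificity)
    return tn / (tn + fn)


# Lung cancer test: 100 patients with cancer, sensitivity 80%.
lung_tp = 100 * 0.80
lung_fn = 100 - lung_tp
print(f"Lung cancer false negatives: {lung_fn:.0f}")

# FOB: prevalence 10%, sensitivity 80%, specificity 70%.
tp, fp, fn, tn = expected_counts(0.10, 0.80, 0.70)
fob_ppv = tp / (tp + fp)
fob_npv = tn / (tn + fn)
fob_accuracy = (tp + tn) / (tp + fp + fn + tn)
print(f"FOB PPV: {fob_ppv:.1%}")
print(f"FOB NPV: {fob_npv:.1%}")
print(f"FOB accuracy: {fob_accuracy:.1%}")

# Same test, rising prevalence: watch what happens to NPV.
for prev in (0.10, 0.30, 0.50):
    print(f"Prevalence {prev:.0%}: NPV = {npv_at(prev):.1%}")

# Negative result from a test with NPV of 12%:
# chance that the patient truly has the disease = 1 - NPV.
chance_of_disease = 1 - 0.12
print(f"Chance of disease despite negative test: {chance_of_disease:.0%}")
```

With a 10% prevalence, most positive FOB results are false positives (270 false positives versus 80 true positives per 1,000 patients), which is why the PPV is low even though the sensitivity is 80%.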

24.2.2 Effects of Using Different Cut-Off Values

Let’s say that we have created a diagnostic test whose result is a numerical value. When we plot the frequency distribution of results obtained from the general population, this will generally produce a bell-shaped (symmetric) distribution curve. Examples of such tests include fasting blood glucose, HbA1C, serum cholesterol, etc.

In the following graphs, the fasting blood sugar cut-off is set at 126 mg/dL: ≥ 126 mg/dL is diabetes and < 126 mg/dL is not diabetes.

We can use different cut-offs for the test to serve different purposes, but we need to know what happens to test characteristics when we change the cut-off:

In Fig. 24.1, what happens if the cut-off is shifted:	From X to A (left shift): e.g., the cut-off value of fasting blood sugar to diagnose diabetes is decreased from 126 to 100 mg/dL	From X to B (right shift): e.g., the cut-off value of fasting blood sugar to diagnose diabetes is increased from 126 to 140 mg/dL
Sensitivity and negative predictive value	Increase	Decrease
Specificity and positive predictive value	Decrease	Increase
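The effect of moving the cut-off can be illustrated with a small simulation. The sketch below assumes that fasting blood sugar is normally distributed in both groups, with made-up means and standard deviations and a made-up prevalence of 10%; none of these numbers come from the text, and only the three cut-offs (100, 126, and 140 mg/dL) do.

```python
from statistics import NormalDist

# Hypothetical distributions of fasting blood sugar (mg/dL).
healthy = NormalDist(mu=95, sigma=12)
diabetic = NormalDist(mu=150, sigma=25)
prevalence = 0.10  # assumed share of diabetics in the tested group


def characteristics(cutoff):
    """Return (sensitivity, specificity, PPV, NPV) for a given cut-off."""
    sens = 1 - diabetic.cdf(cutoff)  # diabetics at or above the cut-off
    spec = healthy.cdf(cutoff)       # healthy people below the cut-off
    tp = prevalence * sens
    fn = prevalence * (1 - sens)
    tn = (1 - prevalence) * spec
    fp = (1 - prevalence) * (1 - spec)
    return sens, spec, tp / (tp + fp), tn / (tn + fn)


results = {cutoff: characteristics(cutoff) for cutoff in (100, 126, 140)}
for cutoff, (sens, spec, ppv, npv) in results.items():
    print(f"cut-off {cutoff}: sensitivity {sens:.1%}, specificity {spec:.1%}, "
          f"PPV {ppv:.1%}, NPV {npv:.1%}")
```

In this example, lowering the cut-off to 100 raises sensitivity and NPV while lowering specificity and PPV, and raising it to 140 does the opposite, matching the table above.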

24.2.3 The ROC Curve (Receiver Operating Characteristic Curve)

In this graph, the true positive rate (TP/(TP + FN) = sensitivity) is plotted against the false positive rate (100 − specificity).

A perfect test plots in the upper left corner (100% sensitivity and 100% specificity). Note that the horizontal axis is not specificity but 100 − specificity.
The overall accuracy of the test is measured by the area under the curve. The closer the ROC curve is to the upper left corner, the higher the overall accuracy of the test.
In the above example the red-line test has better accuracy than the dotted-line test.
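An ROC curve and its area under the curve can be sketched by sweeping the cut-off across the range of test values. The example below again assumes normally distributed test values with made-up parameters for a "better" and a "poorer" test; these are illustrative and are not the tests in the figure. Rates are expressed as fractions (0 to 1) rather than percentages.

```python
from statistics import NormalDist


def roc_points(healthy, diseased, cutoffs):
    """Return (false positive rate, true positive rate) pairs, sorted."""
    points = [(0.0, 0.0), (1.0, 1.0)]  # the two ends of every ROC curve
    for cutoff in cutoffs:
        tpr = 1 - diseased.cdf(cutoff)  # sensitivity
        fpr = 1 - healthy.cdf(cutoff)   # 1 - specificity
        points.append((fpr, tpr))
    return sorted(points)


def area_under_curve(points):
    """Trapezoidal area under the sorted ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


cutoffs = range(0, 301)  # sweep the cut-off from 0 to 300
healthy = NormalDist(mu=95, sigma=12)

# Hypothetical tests: the better one separates the two groups more widely.
better_test = roc_points(healthy, NormalDist(mu=150, sigma=25), cutoffs)
poorer_test = roc_points(healthy, NormalDist(mu=110, sigma=25), cutoffs)

auc_better = area_under_curve(better_test)
auc_poorer = area_under_curve(poorer_test)
print(f"AUC of better test: {auc_better:.3f}")
print(f"AUC of poorer test: {auc_poorer:.3f}")
```

An area close to 1 means the curve hugs the upper left corner, while an area of 0.5 corresponds to the diagonal line of a test that performs no better than chance.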

Clinical Case Scenarios

What is the specificity/sensitivity of points A, B, and C in the following ROC curve?

MRS

SEN = Sensitivity moves with the Negative predictive value. When you lower the cut-off point, you include more patients with the disease, so the false negative rate decreases. So, when the test is negative, it is more likely to be truly negative. But as we try to increase the sensitivity of the test, we lose specificity.

SPE = Specificity moves with the Positive predictive value. When you increase the cut-off point, you make sure that you only get patients with the disease, so the false positive rate decreases. So, when the test is positive, it is more likely to be truly positive. But as we try to increase the specificity of the test, we lose sensitivity.

24.3 Different Types of Statistical Studies

Study designs	Observational studies				Higher standard studies
Study designs	Case report	Cross-sectional	Cohort	Case-control	Controlled trial	Systematic review	Meta-analysis
Timing and/or type	Anecdotal	Usually surveys; snapshot of the present time	Prospective or retrospective	Retrospective	Interventional	Overview of literature to date
Study design and purpose	Rare disease reporting; case history with illustrative images, treatment, and follow-up	Thorough surveys to identify current health problems; typically measures prevalence and can also identify associations	Starts with risk factor and identifies diseases. Two cohorts: one group that is exposed and one that is not (both cohorts are considered to be free of the given disease). Identifies various effects of risk-factor exposure and natural history	Starts with disease and identifies risk factors. Case = with disease; control = without disease. Identifies risk factor(s) or source(s) of exposure	Comparison of interventions: is the intervention effective, and what are the adverse effects?	Compilation of studies	Statistical analysis of data gathered from a systematic review. (All meta-analyses are derived from systematic reviews, but not all systematic reviews are meta-analyses.)
Question posed	What happened to this person?	What is happening now?	What will happen when you have certain risk factors or exposures? (prospective cohort study)	What happened?	Which is better?	What has been published so far?	How can we merge all the data (gathered from previous studies) into a single data set?
Examples	A case report of a newly reported autoimmune condition that does not fit into any other known diagnosis	How many people have diabetes, and how many of them are obese and eat refined carbohydrates?	Framingham Heart Study; hormone replacement study of the Women’s Health Initiative	A cholera outbreak in the United States was traced to mangoes from Mexico	Randomized, double-blind, placebo-controlled study (this design is considered to be the highest quality of evidence)
Weaknesses (see the bias section of this chapter)	Not conclusive, but helps in hypothesis generation	Cannot determine cause-and-effect relationship	Attrition (loss of study participants) or migration; selection bias is built in; confounding bias	Recall bias; confounding bias	Usually expensive and time-consuming	Publication bias (studies have shown that statistically significant papers are more likely to be submitted for publication than nonsignificant ones); selection bias (publishing only the outcomes with statistically significant results)

Tags: Manoj Gurung, Yayra Musabek; Thieme Review for the USMLE®: A WIN for Step 2 and 3 CK

Dec 11, 2021 | Posted by drzezo in GENERAL & FAMILY MEDICINE | Comments Off
