TYPES OF LABORATORY TESTS

Diagnostic tests

Diagnostic tests are those that are made on a sample from a patient, the result allocating the case to a diagnostic grouping; an example would be a needle core biopsy of a lesion of the breast which is sent for histopathological examination and classified into a benign or malignant (i.e. cancer) category. Quantitative measurements, such as haemoglobin concentration or arterial blood oxygen tension, may be used in the clinician’s diagnostic process but they do not by themselves assign a patient to a particular diagnostic category. A diagnostic test may be based on:

• quantitative measurement, such as the level of beta-human chorionic gonadotrophin in the diagnosis of trophoblastic disease

• subjective assessment, based on past experience such as a histopathologist’s assessment of a needle core biopsy or fine needle aspirate of the breast.

The ideal diagnostic test would produce complete separation between two diagnostic categories, but most real tests show some overlap. This problem can be illustrated by a screening test for colorectal carcinoma that makes measurements on a sample of faeces (many attempts have been made to devise such a test using measurements of blood contained in the faeces and other parameters). An ideal test would completely separate patients with colorectal carcinoma from those without (Fig. 4.1); a more realistic test leaves a range of values within which patients with and without the disease overlap (Fig. 4.2).

Fig. 4.1 Distribution graph for an ideal diagnostic test. There is complete separation of the population into those with colorectal carcinoma (shaded area) and those without. In this example a measurement of above 70 units would indicate that the subject had colorectal carcinoma and a measurement below 60 units would indicate that the subject did not have colorectal carcinoma.

Fig. 4.2 Distribution graph of a more realistic diagnostic test. In this example there is a range of values between 60 and 80 units where there are subjects with and without colorectal carcinoma.

The effectiveness of a diagnostic test can be expressed using a number of different parameters:

• A true positive result (TP) is a positive result from the test under consideration which is confirmed by the real outcome of the situation (e.g. a needle core biopsy (NCB) of the breast that is reported as malignant, where the subsequently excised breast tissue contains invasive carcinoma; Table 4.1).

• A true negative result (TN) is a negative test result confirmed by a negative real outcome.

• A false positive result (FP) is a positive test result that has a negative real outcome (e.g. an NCB that is reported as malignant but the subsequently excised breast tissue shows no evidence of malignancy).

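• A false negative result (FN) is a negative test result that has a positive real outcome (e.g. an NCB that is reported as benign but the subsequently excised breast tissue contains invasive carcinoma).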

Table 4.1 True and false test results in needle core biopsy of the breast (NCB)

Actual outcome	Test result from NCB: benign	Test result from NCB: malignant
Benign	True negative	False positive
Malignant	False negative	True positive

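These can be combined into the following measures:

• sensitivity = TP / (TP + FN) × 100%, the proportion of cases with the condition that the test reports as positive

• specificity = TN / (TN + FP) × 100%, the proportion of cases without the condition that the test reports as negative

• predictive value of a positive result = TP / (TP + FP) × 100%, the proportion of positive results that are confirmed by the real outcome

• predictive value of a negative result = TN / (TN + FN) × 100%, the proportion of negative results that are confirmed by the real outcome.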

The desired values of these for a particular test will vary according to the action taken on the result. A malignant NCB result can lead to a surgeon excising the breast (mastectomy), so the specificity and predictive value of a positive result must be as close to 100% as possible. In contrast, if a disease has a relatively safe, non-toxic treatment (such as a course of antibiotics) but the consequences of not detecting the disease can be fatal (e.g. bacterial meningitis), the sensitivity and predictive value of a negative result should be as high as possible. In most situations there is a direct ‘trade-off’ between sensitivity and specificity and a suitable threshold has to be set that will give the best overall performance (Fig. 4.3).
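As a worked illustration, sensitivity, specificity and the two predictive values can be calculated from the counts of true and false results. The following is a minimal sketch in Python; the function name and the audit counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def diagnostic_measures(tp, fp, tn, fn):
    """Return the four measures as percentages, using the standard definitions."""
    return {
        # proportion of cases with the condition that test positive
        "sensitivity": 100 * tp / (tp + fn),
        # proportion of cases without the condition that test negative
        "specificity": 100 * tn / (tn + fp),
        # proportion of positive results confirmed by the real outcome
        "predictive value of a positive result": 100 * tp / (tp + fp),
        # proportion of negative results confirmed by the real outcome
        "predictive value of a negative result": 100 * tn / (tn + fn),
    }


# Hypothetical audit of 200 needle core biopsies (illustrative counts only)
measures = diagnostic_measures(tp=90, fp=1, tn=99, fn=10)
for name, value in measures.items():
    print(f"{name}: {value:.1f}%")
```

With these assumed counts the specificity and the predictive value of a positive result are both close to 100%, which is the pattern wanted when a positive result leads to major surgery, while the sensitivity is lower at 90%.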

Fig. 4.3 A graph showing the effect of moving the threshold value for a test on its sensitivity and specificity. If the threshold is set at A then there are no false positives so the specificity is 100% but the sensitivity is low at about 60%. If the threshold is moved down to C there are no false negatives so the sensitivity is 100% but the rise in false positives has led to a reduction in the specificity to about 30%. At threshold B the test gives the greatest overall accuracy with three false positives and two false negatives.
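The trade-off can also be shown numerically. The sketch below uses invented measurements for ten subjects with the disease and ten without (they are not the data behind Fig. 4.3) and treats any value at or above the threshold as a positive result.

```python
# Hypothetical measurements in arbitrary units (not taken from Fig. 4.3)
WITH_DISEASE = [61, 65, 70, 74, 76, 80, 84, 88, 90, 95]
WITHOUT_DISEASE = [35, 40, 45, 50, 52, 55, 58, 62, 66, 72]


def sensitivity_specificity(diseased, healthy, threshold):
    """Return (sensitivity, specificity) in percent for a given threshold."""
    tp = sum(v >= threshold for v in diseased)   # diseased, test positive
    fn = len(diseased) - tp                      # diseased, test negative
    tn = sum(v < threshold for v in healthy)     # healthy, test negative
    fp = len(healthy) - tn                       # healthy, test positive
    return 100 * tp / (tp + fn), 100 * tn / (tn + fp)


results = {
    threshold: sensitivity_specificity(WITH_DISEASE, WITHOUT_DISEASE, threshold)
    for threshold in (75, 68, 60)
}
for threshold, (sens, spec) in results.items():
    print(f"threshold {threshold}: sensitivity {sens:.0f}%, specificity {spec:.0f}%")
```

In this invented data set the highest threshold removes all false positives at the cost of missing four of the ten cases, and the lowest threshold detects every case but misclassifies three of the ten healthy subjects.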

In many medical situations a continuous biological spectrum is arbitrarily divided into a number of discrete categories. This will always lead to some apparent misclassification, but it is necessary to give information on which clinicians can base their management decisions (e.g. division of intra-epithelial neoplasia of the uterine cervix into three categories, see Ch. 19).

A laboratory’s performance in diagnostic tests should be monitored by a formal audit process and by use of appropriate positive and negative controls in tests.

Quantitative measurements

Many tests in pathology do not categorise results into discrete groups but give a quantitative result which is interpreted in relation to a ‘normal’ range of values. Examples of such tests include measurement of haemoglobin concentration, electrolyte concentrations, and blood oxygen and carbon dioxide levels.

The measures of performance for such tests differ from those for tests that assign diagnostic groupings. In quantitative tests the important parameters are the accuracy of the measurement (how close the measured value is to the ‘true’ value determined by a more accurate or absolute method) and its reproducibility (how much variation there is when the same sample is measured many times). These can be assessed by putting reference samples with ‘known’ values through the measurement system at regular intervals. Most laboratories have their own reference samples, which are used frequently (internal quality assurance), and graphs of single measurements and running mean values are used to ensure that the test is performing within expected limits and not showing ‘drift’ away from the central expected value (Fig. 4.4). Many countries also have external quality assurance schemes in which reference samples are sent to all participating laboratories to ensure acceptable analytical performance.

Fig. 4.4 Internal quality assurance graph for a quantitative pathological test. A reference sample is used for each test; tests A and B lie outside the acceptable range and the process of the test would have to be investigated for sources of error (e.g. out of date reagents, contamination, etc.).
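The kind of check plotted on an internal quality assurance graph can be sketched in a few lines of Python. The assigned value, the standard deviation, the choice of two standard deviations as the acceptable range, the five-run window and the run results are all assumptions made for illustration; a real laboratory would set its own limits.

```python
from statistics import mean

TARGET = 100.0  # assigned value of the reference sample (hypothetical)
SD = 2.0        # established analytical standard deviation (hypothetical)
LOWER, UPPER = TARGET - 2 * SD, TARGET + 2 * SD  # assumed acceptable range


def out_of_range(results):
    """Return the 1-based run numbers whose result lies outside the limits."""
    return [i for i, v in enumerate(results, start=1) if not LOWER <= v <= UPPER]


def running_means(results, window=5):
    """Return the mean of each consecutive block of `window` results."""
    return [mean(results[i - window:i]) for i in range(window, len(results) + 1)]


# Hypothetical results for the reference sample over ten runs
runs = [100.5, 99.2, 101.1, 98.7, 105.3, 100.2, 99.8, 94.9, 100.9, 99.5]

print("runs to investigate:", out_of_range(runs))
print("running means:", [round(m, 2) for m in running_means(runs)])
```

Single results outside the limits would prompt investigation of that run, whereas a running mean that moves steadily away from the assigned value would suggest drift even if every single result stayed within range.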

When a laboratory gives a quantitative result for a parameter that is under physiological control, a reference range is often given to facilitate interpretation of the result. If a parameter shows a normal (Gaussian) distribution in the local population, the ‘normal’ range is often given as two standard deviations below the mean to two standard deviations above the mean. A value outside this range lies outside about 95% of the results for that population (Fig. 4.5) and may be regarded as abnormal, but about 2.5% of the healthy population will have values beyond each end of the range. Thus, all the details of the individual case must be considered, including other measurements, as a number of results at the top end of the ‘normal’ range could be more significant than a single result just above the ‘normal’ range. If the distribution is not Gaussian it may require normalisation by transformation, or non-parametric methods must be used.

Fig. 4.5 Quantitative measurement with a normal (Gaussian) distribution in the population. The result at A lies more than two standard deviations away from the mean and so may be regarded as abnormal, but 2.5% of the normal population will have values in this area.
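A reference range of this kind can be sketched with the Python standard library. The haemoglobin values below are invented, the sample is far smaller than would be used in practice, and a Gaussian distribution is assumed.

```python
from statistics import NormalDist, mean, stdev


def reference_range(values):
    """Return (mean - 2 SD, mean + 2 SD) for a sample assumed to be Gaussian."""
    m, s = mean(values), stdev(values)
    return m - 2 * s, m + 2 * s


# Hypothetical haemoglobin concentrations (g/L) from ten healthy subjects
healthy_sample = [128, 135, 141, 146, 150, 152, 155, 158, 163, 172]
low, high = reference_range(healthy_sample)
print(f"reference range: {low:.0f} to {high:.0f} g/L")

# Fraction of a Gaussian population lying more than 2 SD from the mean
standard_normal = NormalDist()
fraction_outside = 2 * standard_normal.cdf(-2)
print(f"outside 2 SD: {100 * fraction_outside:.2f}% in total")

# Number of SDs that encloses exactly 95% of a Gaussian population
z_for_95 = standard_normal.inv_cdf(0.975)
print(f"95% of values lie within {z_for_95:.2f} SD of the mean")
```

The calculation shows that the figures of 95% and 2.5% are convenient roundings: exactly two standard deviations enclose slightly more than 95% of a Gaussian population, and exactly 95% is enclosed by about 1.96 standard deviations.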

Prognostic tests

In many tumours, assignment to a diagnostic category (e.g. adenoma or carcinoma) gives an indication of the prognosis for the individual patient, but within such groupings (e.g. colorectal carcinoma) there may be wide variation in the biological behaviour of the tumour. In order to plan appropriate treatment and to be able to give useful information and counselling to individual patients, many prognostic pathological tests have been developed.

In tumour pathology one of the most predictive prognostic tests is staging of the tumour (extent of spread), which is always assessed in the histopathological examination of specimens. One of the best examples of this is Dukes’ staging of colorectal carcinoma (Ch. 11 and Ch. 15). The histological type of tumour has important prognostic implications, particularly in some organs; subjects with papillary thyroid carcinoma have a life expectancy that is the same as for the rest of the general population without the tumour, whereas subjects with anaplastic thyroid carcinoma have a median survival of a few months. The grade of the tumour, an assessment of its degree of differentiation and proliferative activity, also has predictive value; well-differentiated tumours (closely resembling parent tissue) with few mitoses have a better prognosis.

In tumours that produce substances that enter the blood or urine (e.g. alpha-fetoprotein produced by testicular teratomas, see Ch. 20), measurement of the levels of these at the time of diagnosis may be predictive of prognosis (and can be used in follow-up). As more becomes known of the molecular abnormalities of tumours, the possibilities for specific molecular tests that will have prognostic value increase, but the translation of an apparently significant research result into a routinely used prognostic test is not straightforward. When evaluating any new prognostic test the significance for the individual patient has to be considered; a test that shows a statistically significant difference between two large groups of patients may not assign individual cases to a prognostic category with a sufficient degree of certainty to be useful in management decisions or patient information. One recently developed test that has found usage is the detection of expression of the transmembrane receptor tyrosine kinase KIT, which is defined by the CD117 antigen and is the product of the c-kit proto-oncogene, in stromal tumours of the gastrointestinal tract. Expression can be detected by immunohistochemistry (Fig. 4.6); a positive result predicts that the patient’s tumour will respond to treatment with a specific tyrosine kinase inhibitor, imatinib mesylate.
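The earlier point about group significance and individual usefulness can be illustrated with a small calculation. In this sketch a hypothetical marker is assumed to be Gaussian with the same known standard deviation in a good-prognosis and a poor-prognosis group; the means, standard deviation and group sizes are invented, and a simple z-test stands in for whatever analysis a real study would use.

```python
from math import sqrt
from statistics import NormalDist

# All values are hypothetical and chosen only for illustration
MEAN_GOOD = 100.0    # mean marker level, good-prognosis group
MEAN_POOR = 103.0    # mean marker level, poor-prognosis group
SD = 15.0            # standard deviation assumed equal in both groups
N_PER_GROUP = 2000   # patients in each group

# z-test for the difference between the two group means
standard_error = SD * sqrt(2 / N_PER_GROUP)
z = (MEAN_POOR - MEAN_GOOD) / standard_error
p_value = 2 * (1 - NormalDist().cdf(z))

# Assign individuals using a threshold midway between the two means
threshold = (MEAN_GOOD + MEAN_POOR) / 2
correct_good = NormalDist(MEAN_GOOD, SD).cdf(threshold)      # below threshold
correct_poor = 1 - NormalDist(MEAN_POOR, SD).cdf(threshold)  # at or above it
correct = (correct_good + correct_poor) / 2

print(f"z = {z:.2f}, p = {p_value:.1e}")
print(f"individuals correctly assigned: {100 * correct:.0f}%")
```

With these assumed numbers the difference between the groups is highly significant, yet only slightly more than half of the individual patients would be assigned to the correct prognostic group, which is little better than chance.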