Procedures and Strategies
The problems with scene segmentation depend on the target. Under ideal circumstances, the microscopic image is based on a monolayer of single cells against a clean background, free of debris. At the other extreme, scene segmentation has to deal with complex imagery such as is seen in histologic sections of glands.
There are two distinctly different approaches to scene segmentation. The first is image-oriented. Here, the image in its entirety is processed by a segmentation algorithm. All objects in the scene are segmented by the same procedure.
The second approach is object-oriented.
First, a search for “objects” is conducted, usually by applying an image processing algorithm, such as threshold-setting to the image. Then, each recognized object is outlined by a chaincode (see below) and stored (Freeman, 1961
). Next, each of the stored objects is categorized and processed by a suitable segmentation algorithm. The object-oriented approach thus employs a flexible strategy, where different objects in the scene may be segmented by different procedures.
The image-oriented approach has the advantage of simplicity and speed. However, the selected algorithm may work well on some targets but fail with others. The object-oriented approach has a much higher software requirement and may be somewhat slower, but can better handle complex imagery. The result in both approaches is an outline of objects of interest, usually represented as a chaincode.
A chaincode is a list of values that begins with an x,y
image coordinate for the first or start-pixel in the display, followed by a set of directions to the next pixel, etc. until the code returns to the start-pixel. One may define “the next pixel” depending on the system used. The definition is important because the approach to the pixelation of a small object may affect features such as perimeter length, object roundness, and object area (Bartels and Thompson, 1994
; Neal et al, 1998).
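The structure of a chaincode can be illustrated with a short sketch. The eight-direction convention used here (code 0 pointing east, codes counted counterclockwise in image coordinates) is one common choice; as noted above, the exact definition of "the next pixel" depends on the system used.

```python
import math

# Freeman 8-direction chain code: a start pixel plus a list of direction
# codes 0-7, each giving the step to the next boundary pixel.
# (dx, dy) offsets for the eight direction codes; 0 = east, counted
# counterclockwise in image coordinates (y increasing downward).
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, -1),
              (-1, 0), (-1, 1), (0, 1), (1, 1)]

def decode_chaincode(start, code):
    """Expand a chain code back into the list of boundary pixel coordinates."""
    x, y = start
    pixels = [(x, y)]
    for d in code:
        dx, dy = DIRECTIONS[d]
        x, y = x + dx, y + dy
        pixels.append((x, y))
    return pixels

def perimeter_length(code):
    """Boundary length: axial steps count 1, diagonal steps sqrt(2)."""
    return sum(math.sqrt(2) if d % 2 else 1.0 for d in code)

# A closed chain code tracing a small square object
square = [0, 0, 6, 6, 4, 4, 2, 2]
path = decode_chaincode((0, 0), square)
assert path[0] == path[-1]   # the code returns to the start pixel
```

The `perimeter_length` function also illustrates the point made above: whether a diagonal step is counted as 1 or as the square root of 2 directly changes the measured perimeter, and hence derived features such as roundness.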
Many segmentation algorithms are available.
Some algorithms are based strictly on the optical density (OD) of the object, that is, on pixel OD thresholding. These may be used interactively by an observer who adjusts the threshold up and down until an object’s outline agrees with the visual boundary. However, a fixed threshold may interfere with, or “bleed” into, the interior of the object to be outlined, as shown in Figure 46-3A
, or be diverted from the desired outline by image background, as seen in Figure 46-3B
. Simple thresholding frequently requires interactive corrections. A threshold may also be calculated from a histogram of pixel OD values by selecting a point of separation between two image regions. Various algorithms have been introduced to determine the difference in pixel OD value between an object and its background. Variations on these themes are algorithms that minimize some function to track the best boundary separating different objects in an image (Lester et al, 1978).
Figure 46-3 A. Segmentation by pixel OD thresholding. There is a risk of the threshold “bleeding” into the interior of the object. B. Failure of the segmentation algorithm to adhere to the object contour.
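A well-known instance of the histogram-based threshold calculation described above is Otsu's method, which selects the gray value that maximizes the between-class variance of the two pixel populations. The sketch below is a generic illustration of this principle, not the specific algorithm of any system cited here.

```python
# Otsu's method (sketch): pick the threshold on a gray-value histogram
# that best separates two populations, e.g. dark nuclei and a light
# background. Pixels with value <= t are treated as one class.

def otsu_threshold(histogram):
    """histogram[v] = number of pixels with gray value v (0..len-1)."""
    total = sum(histogram)
    total_sum = sum(v * n for v, n in enumerate(histogram))
    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0.0
    for t, n in enumerate(histogram):
        weight_bg += n                    # pixels at or below threshold t
        if weight_bg == 0:
            continue
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break
        sum_bg += t * n
        mean_bg = sum_bg / weight_bg
        mean_fg = (total_sum - sum_bg) / weight_fg
        between = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:            # maximize between-class variance
            best_var, best_t = between, t
    return best_t

# Bimodal toy histogram: dark nuclei around value 2, background around 7;
# the chosen threshold falls in the valley between the two modes.
hist = [0, 5, 20, 5, 0, 0, 4, 30, 6, 0]
threshold = otsu_threshold(hist)
assert 2 < threshold < 7
```

As the text cautions, such an automatically computed threshold is only a starting point: a single global value may still bleed into object interiors or wander into the background, so interactive correction often remains necessary.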
When objects touch or overlap, one may employ a “shrink and blow” algorithm. Here, as the pixel OD value threshold is gradually increased, an indentation appears between two overlapping objects. The indentation deepens with the increasing threshold until the objects are finally separated. The algorithm then defines a segmentation line between them and expands the two new objects to the contours of the original single object. The sequence of processing is shown in Figure 46-4
. Similar algorithms find cusps—sharp concavities in the outline enclosing two touching objects. The algorithm determines which two cusps best correspond to each other and positions a segmentation line accordingly. One may segment an image by enclosing or outlining areas of particular color hue.
Figure 46-4 Finding of a segmentation line by a “shrink-and-blow” algorithm.
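The shrink-and-blow idea described above can be sketched in a few lines. For clarity, the sketch represents the binary image as a set of pixel coordinates, uses repeated erosion for the “shrink” phase, and grows the separated pieces back inside the original mask for the “blow” phase. Note that the published algorithm raises the OD threshold rather than eroding, so this is an analogous illustration, not a reimplementation.

```python
from collections import deque

NEIGHBORS4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def erode(pixels):
    """One binary erosion step: keep pixels whose 4-neighbors are all set."""
    return {(x, y) for (x, y) in pixels
            if all((x + dx, y + dy) in pixels for dx, dy in NEIGHBORS4)}

def components(pixels):
    """Connected components (4-connectivity) via flood fill."""
    remaining, comps = set(pixels), []
    while remaining:
        seed = remaining.pop()
        comp, queue = {seed}, deque([seed])
        while queue:
            x, y = queue.popleft()
            for dx, dy in NEIGHBORS4:
                q = (x + dx, y + dy)
                if q in remaining:
                    remaining.remove(q)
                    comp.add(q)
                    queue.append(q)
        comps.append(comp)
    return comps

def shrink_and_blow(mask):
    """Shrink until the object splits in two, then grow both pieces back
    inside the original mask (multi-source flood fill); the boundary where
    the two labels meet is the segmentation line."""
    shrunk = set(mask)
    while len(components(shrunk)) < 2 and shrunk:
        shrunk = erode(shrunk)
    label, queue = {}, deque()
    for i, comp in enumerate(components(shrunk)):
        for p in comp:
            label[p] = i
            queue.append(p)
    while queue:                  # "blow": expand within the original mask
        x, y = queue.popleft()
        for dx, dy in NEIGHBORS4:
            q = (x + dx, y + dy)
            if q in mask and q not in label:
                label[q] = label[(x, y)]
                queue.append(q)
    return label

# Two 5x5 blocks joined by a one-pixel-wide neck ("touching nuclei")
mask = {(x, y) for x in range(5) for y in range(5)}
mask |= {(x, y) for x in range(8, 13) for y in range(5)}
mask |= {(x, 2) for x in range(5, 8)}
labels = shrink_and_blow(mask)
assert len(set(labels.values())) == 2   # the single blob was split in two
```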
Of particular value are processing sequences based on mathematical morphologic image processing operations (Dougherty, 1992; Soille, 1998). Such operations allow all objects below a certain size to be eliminated, and the outlines of other objects to be corrected by smoothing and in-fill. The processing sequence developed by Juetting et al (1983) for the quantitative evaluation of thyroid follicles in fine needle aspirates is a good example, as shown in Figure 46-5A-E
. The original image of a follicle is transformed into a histogram of pixel OD values. A threshold is set to outline nuclei. The resulting image is transformed to binary form. An erosion is performed to eliminate small particles of cellular debris and to correct small protuberances on the nuclear boundaries. By definition, a follicle includes a center or an interior region. A search for such interior regions is conducted within the thresholded image. The original binary image is segmented. Only the nuclei bordering the interior region are retained and stored as a “follicle” for further analysis.
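The debris-elimination step in such a sequence corresponds to the morphologic “opening” operation, an erosion followed by a dilation. The following is a minimal sketch using a 4-neighbor structuring element on a set-of-pixels representation; real systems typically use larger or differently shaped elements.

```python
NEIGHBORS4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def erode(pixels):
    """Keep only pixels whose four neighbors are all set."""
    return {(x, y) for (x, y) in pixels
            if all((x + dx, y + dy) in pixels for dx, dy in NEIGHBORS4)}

def dilate(pixels):
    """Add the four neighbors of every set pixel."""
    return pixels | {(x + dx, y + dy)
                     for (x, y) in pixels for dx, dy in NEIGHBORS4}

def opening(pixels):
    """Erosion then dilation: objects smaller than the structuring element
    (isolated pixels, one-pixel-wide lines) are eliminated entirely, while
    larger objects survive with slightly rounded corners."""
    return dilate(erode(pixels))

# A 6x6 "nucleus", a thin line of debris, and an isolated debris pixel
nucleus = {(x, y) for x in range(6) for y in range(6)}
scene = nucleus | {(9, y) for y in range(3)} | {(12, 12)}
cleaned = opening(scene)
assert (12, 12) not in cleaned     # isolated debris removed
assert (2, 2) in cleaned           # the nucleus itself survives
```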
All these algorithms have a high success rate, but rarely does one single algorithm segment all objects correctly.
An interactive correction is then required. This may be feasible when the number of objects is modest. For large images, however, full automation is mandatory, and the demanding requirements for a machine vision system have to be satisfied. The problem of scene segmentation in cytologic preparations proved to be one of the most challenging tasks in designing automated primary screening devices for cervical cancer.
It is encountered with an equal degree of difficulty in karyometry of histopathologic sections. Representative regions of a lesion may extend over several square millimeters. With objectives of high numerical aperture, such a region translates into hundreds of video frames, one per visual field. Interactive correction of segmentation is no longer practical.
Figure 46-5 Processing of a scene by mathematical-morphologic operations. A. The binary image of a thyroid follicle; note the small debris. B. The image after erosion, which removes the small debris and smooths the nuclear outlines. C,D. The search for the interior region of the follicle. E. Only the nuclei forming the follicle are retained, and segmentation lines are shown.
Histologic material increases the degree of difficulty. One is no longer dealing with a processing task concerned with separate, well-defined objects, such as nuclei. Instead, the histologic section is composed of adjacent components that represent a variety of structures. Processing steps to find a segmentation line for one component may adversely affect correct segmentation of other structures. The entire scene is combined or “coupled” in its processing requirements. Scene segmentation for histopathologic sections has finally become tractable with the development of knowledge-guided procedures
(Liedtke et al, 1987
; Thompson et al, 1993
). In a knowledge-guided process, information, not offered by the image itself, is used to control the scene segmentation. This principle was first employed by Liedtke et al (1987)
in the segmentation of cervical cytologic materials. Knowledge-guided process control has since been applied to the automated segmentation of prostate lesions, colonic tissues, and breast lesions (Anderson et al, 1997).
Knowledge-Guided Scene Segmentation
Knowledge-guided scene segmentation provides the means for autonomous processing of imagery from a specific visual target. The knowledge-guided system should be able to segment any scene from the target in a fully automated fashion and without errors.
The principal difference between human visual perception of a microscopic image and the evaluation of that information by a machine vision system is that human vision perceives the entire scene simultaneously. Every object and structure is revealed in its relative position. A machine vision system acquires the data sequentially. It is “pixel-bound” in processing the image, i.e., tied to the pixel currently being processed and its immediate neighbors. Human image assessment is supported by information not offered by the image itself, such as the professional experience of a diagnostician; knowledge of anatomy, histology, and pathology; and knowledge of the relationships between structure and function. If one expects a machine vision system to perform at a comparable level, then such information must be made available to its control software. This information may be offered to the machine vision system in the form of a knowledge file
(Bartels et al, 1992
). A knowledge file
will hold a large amount of generally applicable information, as well as very specific processing instructions for a particular and narrow diagnostic target or domain, for example, prostatic intraepithelial neoplastic lesions or a poorly differentiated prostatic carcinoma.
Figure 46-7 Tile merging based on cross-correlation techniques, showing the exact match of structures from both tiles down to the pixel level.
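The cross-correlation technique mentioned in the caption of Figure 46-7 can be illustrated in one dimension: slide one intensity profile over the other and keep the displacement with the highest correlation score. This is a generic sketch of the principle only; an actual tile-merging procedure operates on two-dimensional overlap regions of adjacent image tiles.

```python
# Find the displacement between two 1D intensity profiles by maximizing
# their (unnormalized) cross-correlation over the overlapping samples.

def displacement(reference, tile, max_shift):
    """Return d maximizing sum(tile[i] * reference[i - d]), i.e. the shift
    at which the tile's content best matches the reference."""
    best_d, best_score = 0, float("-inf")
    for d in range(-max_shift, max_shift + 1):
        score = sum(tile[i] * reference[i - d]
                    for i in range(len(tile))
                    if 0 <= i - d < len(reference))
        if score > best_score:
            best_score, best_d = score, d
    return best_d

# An intensity profile and a copy of it shifted right by 3 samples
profile = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
shifted = [0, 0, 0, 0, 0, 1, 4, 9, 4, 1]
assert displacement(profile, shifted, 5) == 3   # recovers the 3-sample shift
```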
A knowledge file consists of two major sections. The first, declarative section lists all entities pertaining to a given domain, with their properties, the computer operations required for their evaluation, and the sequencing of these operations. The entities entered into the knowledge file fall into two logical groups. The first group consists of all entities recognized by the human eye, such as the nucleus, the cytoplasm, various cell types, etc. The second group consists of entities that relate solely to image processing, including constructs known as “intermediate segmentation products.” Examples of such entities are the red image, a pixel, a pixel OD value threshold, and an area, as well as subroutines, algorithms, and functions constructed to define a segmentation procedure, for example, a function that finds objects of interest.
Intermediate segmentation products
are objects produced by machine vision segmentation during the initial processing phase aimed at detecting image regions that can be separated from the background. Some of these outlined objects may be well-defined morphologic components such as a single nucleus. Others may be objects that require further segmentation, e.g. a cluster of overlapping nuclei. Others may be fragments of one or more histologic components, such as secretory epithelium or portions of two different glands that appear in machine vision to be a single object. Figure 46-8
shows a brief processing sequence of segmentation of glandular epithelium. In Figure 46-8A
, a portion of the glandular epithelium has been correctly recognized and segmented, but the gray-shaded glandular epithelium on the left belongs to three different glands. It is an intermediate segmentation product. In Figure 46-8B
, the section of epithelium belonging to the gland on the lower left has been recognized, and so has the remaining segment of epithelium for the gland on the right. But, an intermediate segmentation product, now comprising segments from two different glands, still remains in need of further segmentation. Figure 46-8C
indicates the segmentation line that the system found. In the next step, both of the remaining epithelial segments would be correctly assigned to their glands.
Human vision immediately identifies these intermediate segmentation products. A machine vision system needs to be given explicit instructions on how to recognize the objects and how to process them. Fortunately, only a limited number of different intermediate segmentation products occur for scenes from a given domain. They must be specified by name as separate entities in the knowledge file, to allow the control software to call on the appropriate next processing sequence.
Figure 46-8 Processing sequence of a knowledge-guided segmentation. In A, the glandular epithelium from three glands remains connected, as an intermediate segmentation product. In B, the segment belonging to the gland in the right center is correctly segmented and added to that gland. In C, a portion of the epithelium belonging to the gland at the center left is correctly recognized and segmented. The glandular epithelium from the large gland at the top is assigned to that gland and a segmentation line has been drawn.
The second section of the knowledge file contains definition statements. These give specific processing instructions to find the required entities and can also logically relate the entities from the declarative section to each other. The definition statements are written in a subset of English, e.g., a nucleus is an object with size (limited by constraints) and with shape (limited by constraints) and with total optical density (limited by constraints).
The knowledge file is a text file and may readily be amended or modified. It is read by the control software and consulted at run time. As the software interprets the definition statements, it sets up a node and processing sequence for each entity. The results obtained during the scene processing must satisfy the constraints at all the nodes.
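How such definition statements might be represented and consulted at run time can be suggested by a small sketch. The entity names, features, and constraint values below are purely illustrative and are not taken from any actual knowledge file described here.

```python
# Hypothetical rendering of definition statements such as "a nucleus is an
# object with size (limited by constraints)". Each entity becomes a node
# holding feature constraints; an object is accepted only if its measured
# values satisfy every constraint. All numbers are invented for illustration.

KNOWLEDGE = {
    "nucleus": {"area": (20.0, 120.0),      # assumed units and ranges
                "roundness": (0.6, 1.0),
                "total_od": (5.0, 50.0)},
    "debris":  {"area": (0.0, 20.0)},
}

def satisfies(entity, measurements):
    """True if every measured feature falls within the entity's constraints.
    A missing measurement counts as a failed constraint."""
    constraints = KNOWLEDGE[entity]
    return all(lo <= measurements.get(feature, lo - 1) <= hi
               for feature, (lo, hi) in constraints.items())

def classify(measurements):
    """All entities whose constraints the measured object satisfies."""
    return [name for name in KNOWLEDGE if satisfies(name, measurements)]

candidate = {"area": 64.0, "roundness": 0.85, "total_od": 18.0}
assert classify(candidate) == ["nucleus"]
```

Because the knowledge file is plain text, amending a constraint or adding an entity changes the run-time behavior without any modification of the control software itself, which is the point of the arrangement described above.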
Relatively simple targets or scenes, such as tissue from well-differentiated adenocarcinoma, require approximately 100 entities to allow successful segmentation of 90% to 95% of all scenes. More complex scenes have required approximately 250 to 300 entities to provide segmentation at the same rate of success (Thompson et al, 1995).
The complete segmentation process involves two major phases. In the first, the scene is segmented until every object is either recognized as a correctly outlined histologic entity, as listed in the knowledge file, or as a fragment of such an entity. In the second phase, the scene is reconstructed from the objects on file. In this operation, the control software has to check for logical consistency of the reconstruction. The extraction of histometric diagnostic information is begun only after scene segmentation, reconstruction and consistency checks have been completed.
A knowledge-guided scene segmentation system constitutes a major software development. The machine vision system at the Optical Sciences Center, University of Arizona, comprises some 20,000 lines of control code and image processing code. Structurally, the control software is an expert system
implemented as an associative network with frames at each node (Jackson, 1986
). There the specific results from the processing of each object are accumulated for later use.
Systems capable of segmenting, analyzing, and interpreting imagery in a fully autonomous fashion, guided by a source of external knowledge of the domain and of the processes required, have become known as image understanding systems
(Bartels et al, 1989). Figure 46-9 shows the result of an automated segmentation of a tissue section from the prostate. Sample fields showing segmentation of lumen, stroma, secretory and basal cell nuclei, and basal cells only are shown as a sidebar.
Figure 46-9 Automated segmentation of a tissue section from the prostate. The regions identified by the machine vision system as lumen, stroma, nuclei of the glandular epithelium, and basal cell nuclei alone are shown in black in the small demonstration windows. Actually, the entire image is segmented automatically.