Chapter 21

Title: Image Mining Extension for RapidMiner (Advanced)


Chapter 20 introduces the RapidMiner Image Mining (IMMI) Extension and presents some introductory image processing and image mining use cases. Chapter 21 provides more advanced image mining applications.

Chapter 21 presents advanced image mining applications using the RapidMiner Image Mining (IMMI) Extension introduced in the previous chapter. It demonstrates the use of the IMMI Extension for image processing, image segmentation, feature extraction, pattern detection, and image classification through three example applications. The first extracts global features from multiple images to enable automated image classification. The second demonstrates the Viola-Jones algorithm for pattern detection. The third illustrates image segmentation and mask processing.

Image classification identifies which group of images a particular image belongs to. An automated image classifier could, for example, distinguish between scene types such as nature versus urban environment, exterior versus interior, or images with and without people. Global features, which are calculated from the whole image, are usually used for this purpose. The key to correct classification is to find features that differentiate one class from the others, for example the dominant color in the image. These features can be calculated from the original image or from the image after pre-processing such as Gaussian blur or edge detection.
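As an illustration of a global feature, the following minimal Python sketch computes a dominant color and a mean color over a whole image. It is not the IMMI Extension's implementation. The nested-list image representation, the function names `dominant_color`, `mean_color`, and `global_features`, and the `levels` quantization parameter are all assumptions made for this example.

```python
from collections import Counter


def dominant_color(image, levels=4):
    """Return the most frequent quantized RGB color of an image.

    `image` is a list of rows, each row a list of (r, g, b) tuples with
    channel values in 0..255. `levels` is the number of bins per channel
    and is assumed to divide 256 evenly (hypothetical parameter).
    """
    step = 256 // levels
    counts = Counter(
        (r // step, g // step, b // step)
        for row in image
        for (r, g, b) in row
    )
    (rb, gb, bb), _ = counts.most_common(1)[0]
    # Report the center of the winning bin as a representative color.
    half = step // 2
    return (rb * step + half, gb * step + half, bb * step + half)


def mean_color(image):
    """Return the per-channel mean over all pixels of the image."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def global_features(image):
    """Collect whole-image features into one flat record per image."""
    dr, dg, db = dominant_color(image)
    mr, mg, mb = mean_color(image)
    return {
        "dominant_r": dr, "dominant_g": dg, "dominant_b": db,
        "mean_r": mr, "mean_g": mg, "mean_b": mb,
    }
```

One such record per image, together with a class label, would form a row of the training table for a classifier. The same functions could be applied to a blurred or edge-filtered copy of the image to obtain further features.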

Pattern detection searches for known patterns in images, where an approximate fit of the pattern may be sufficient. A good detection algorithm should not be sensitive to the size, position, or rotation of the pattern in the image. One possible approach uses histograms: the histogram of the pattern is compared with the histogram of a selected area of the image. The algorithm passes step by step through the whole image, and wherever the histogram match exceeds a certain threshold, the area is declared to be the sought pattern. Another approach, which is described in this chapter, is the Viola-Jones algorithm. Its classifier is trained with positive and negative image examples, and appropriate features are selected using the AdaBoost algorithm. During pattern detection, the image is scanned with a window of increasing size, and each positive detection is marked with a square area of the same size as the window. The provided example application uses this process to detect the cross-section of an artery in an ultrasound image.

Once the artery has been detected, the images can be used to measure the patient’s pulse, provided they come from a video or a stream of time-stamped images.

The third example application demonstrates image segmentation and feature extraction. Image segmentation is often used to detect different objects in an image. Its task is to split the image into parts so that the individual segments correspond to objects in the image. In this example, the identified segments are combined with masks to remove the background and focus on the object found.
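Segmentation followed by masking can be illustrated with a minimal Python sketch. The text does not state which segmentation algorithm the example process uses, so this sketch substitutes a simple one: thresholding followed by 4-connected region labeling. The nested-list grayscale image, the function names, and the `threshold` and `background` parameters are assumptions made for this illustration.

```python
from collections import deque


def label_segments(image, threshold=128):
    """Label 4-connected regions of pixels >= threshold (0 = background).

    Returns (labels, count), where labels has the same shape as the image.
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] < threshold or labels[y][x]:
                continue
            # Start a new segment and grow it with a breadth-first search.
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] >= threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


def segment_mask(labels, label):
    """Boolean mask that is True only inside the chosen segment."""
    return [[v == label for v in row] for row in labels]


def apply_mask(image, mask, background=0):
    """Keep pixels under the mask and replace the rest with background."""
    return [[p if m else background for p, m in zip(irow, mrow)]
            for irow, mrow in zip(image, mask)]


def segment_area(mask):
    """A simple per-segment feature: the number of pixels in the mask."""
    return sum(sum(row) for row in mask)
```

Once a segment has been isolated with `apply_mask`, features such as its area can be computed on the object alone, without interference from the background.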

Table of Contents

21.1 Introduction
21.2 Image Classification
21.2.1 Load Images and Assign Labels
21.2.2 Global Feature Extraction
21.3 Pattern Detection
21.3.1 Process Creation
21.4 Image Segmentation and Feature Extraction
21.5 Summary
Bibliography
