Journal of Imaging (May 2018)
A Review of Supervised Edge Detection Evaluation Methods and an Objective Comparison of Filtering Gradient Computations Using Hysteresis Thresholds
Abstract
Useful for human visual perception, edge detection remains a crucial stage in numerous image processing applications. One of the most challenging goals in contour detection is to design algorithms that process visual information as humans require. To ensure that an edge detection technique is reliable, it needs to be rigorously assessed before being used in a computer vision tool. This assessment is a supervised evaluation process: a performance measure (criterion) computes a score that quantifies the differences between a ground truth (reference) edge map and a candidate edge map. This paper presents a survey of supervised edge detection evaluation methods. Given a ground truth edge map, various methods have been developed to assess a desired contour. Several techniques are based on the number of false positive, false negative, true positive and/or true negative points. Other methods strongly penalize misplaced points when they are outside a window centered on a true or false point. In addition, many approaches compute the distance from the position where a contour point should be located. Most of these edge detection assessment methods are detailed, and their drawbacks are highlighted using several examples. In this study, a new supervised edge map quality measure is proposed. The new measure provides an overall evaluation of the quality of a contour map by taking into account the number of false positives, the number of false negatives, and the degree of shifting. Numerous examples and experiments show the importance of penalizing false negative points differently from false positive points: some false positive points may not disturb the visibility of the desired objects, whereas false negative points can significantly change the appearance of an object.
Finally, an objective assessment is performed by varying the hysteresis thresholds on contours of real images obtained by filtering techniques. Theoretically, when the hysteresis thresholds applied to the thin edges obtained by filtering gradient computations are varied, the minimum score of a measure corresponds to the best edge map with respect to the ground truth. Twenty-eight measures are compared using different edge detectors, some robust to noise and some not. The scores of the different measures and edge detectors are recorded and plotted as a function of the noise level in the original image. The curve of a reliable edge detection measure must increase monotonically with the noise level, and a reliable edge detector must be penalized less than a poor one. In addition, the edge map tied to the minimum score of a given measure indicates how reliable that measure is, according to whether this edge map is visually close to the ground truth. The experiments illustrate that the desired objects are not always completely visible when an ill-suited evaluation measure is used.
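The threshold-selection procedure described above can be illustrated with a minimal sketch in pure Python. It is not the measure proposed in this paper: `toy_score` is a hypothetical error score, assumed here only to show a false negative being weighted more heavily than a false positive (the weight `fn_weight=2.0` is arbitrary). The helper names, the 8-connected hysteresis, and the small gradient image are likewise assumptions made for illustration.

```python
from collections import deque


def hysteresis(grad, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected to them."""
    rows, cols = len(grad), len(grad[0])
    edge = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed with the strong pixels.
    for r in range(rows):
        for c in range(cols):
            if grad[r][c] >= high:
                edge[r][c] = True
                queue.append((r, c))
    # Grow the seeds through weak pixels (breadth-first search).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edge[nr][nc] and grad[nr][nc] >= low:
                    edge[nr][nc] = True
                    queue.append((nr, nc))
    return edge


def confusion_counts(candidate, truth):
    """Return (true positives, false positives, false negatives)."""
    tp = fp = fn = 0
    for cand_row, truth_row in zip(candidate, truth):
        for cand, ref in zip(cand_row, truth_row):
            if cand and ref:
                tp += 1
            elif cand:
                fp += 1
            elif ref:
                fn += 1
    return tp, fp, fn


def toy_score(candidate, truth, fn_weight=2.0):
    """Hypothetical error score (0 is best); NOT the paper's measure."""
    tp, fp, fn = confusion_counts(candidate, truth)
    return (fp + fn_weight * fn) / max(tp + fn, 1)


def best_thresholds(grad, truth, levels):
    """Sweep (low, high) pairs and return (score, low, high) with minimum score."""
    best = None
    for high in levels:
        for low in levels:
            if low > high:
                continue
            score = toy_score(hysteresis(grad, low, high), truth)
            if best is None or score < best[0]:
                best = (score, low, high)
    return best


# Illustrative thinned gradient magnitudes: a vertical edge in column 1
# plus an isolated strong response at (2, 3) that is not in the ground truth.
grad = [
    [0.1, 0.9, 0.2, 0.0],
    [0.0, 0.5, 0.1, 0.0],
    [0.3, 0.4, 0.0, 0.6],
    [0.0, 0.5, 0.0, 0.0],
]
truth = [[False, True, False, False] for _ in range(4)]
levels = [0.2, 0.4, 0.6, 0.8]

print(best_thresholds(grad, truth, levels))
```

In this toy case the sweep favors a high threshold above the isolated response and a low threshold that still links the weaker pixels of the true edge. Replacing `toy_score` with another measure changes which threshold pair, and therefore which edge map, is selected.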
Keywords